From: Jann Horn
Date: Wed, 28 Oct 2020 10:43:14 +0100
Subject: Re: For review: seccomp_user_notif(2) manual page
To: Sargun Dhillon
Cc: "Michael Kerrisk (man-pages)", Tycho Andersen, Kees Cook,
    Christian Brauner, linux-man, lkml, Aleksa Sarai, Alexei Starovoitov,
    Will Drewry, bpf, Song Liu, Daniel Borkmann, Andy Lutomirski,
    Linux Containers, Giuseppe Scrivano, Robert Sesek
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 28, 2020 at 7:32 AM Sargun Dhillon wrote:
> On Tue, Oct 27, 2020 at 3:28 AM Jann Horn wrote:
> > On Tue, Oct 27, 2020 at 7:14 AM Michael Kerrisk (man-pages)
> > wrote:
> > > On 10/26/20 4:54 PM, Jann Horn wrote:
> > > > I'm a bit on the fence now on whether non-blocking mode should use
> > > > ENOTCONN or not... I guess if we returned ENOENT even when there are
> > > > no more listeners, you'd have to disambiguate through the poll()
> > > > revents, which would be kinda ugly?
> > > >
> > > I must confess, I'm not quite clear on which two cases you
> > > are trying to distinguish. Can you elaborate?
> >
> > Let's say someone writes a program whose responsibilities are just to
> > handle seccomp events and to listen on some other fd for commands. And
> > this is implemented with an event loop. Then once all the target
> > processes are gone (including zombie reaping), we'll start getting
> > EPOLLERR.
> >
> > If NOTIF_RECV starts returning -ENOTCONN at this point, the event loop
> > can just call into the seccomp logic without any arguments; it can
> > just call NOTIF_RECV one more time, see the -ENOTCONN, and terminate.
> > The downside is that there's one more error code userspace has to
> > special-case.
> > This would be more consistent with what we'd be doing in the blocking case.
> >
> > If NOTIF_RECV keeps returning -ENOENT, the event loop has to also tell
> > the seccomp logic what the revents are.
> >
> > I guess it probably doesn't really matter much.
>
> So, in practice, if you're emulating a blocking syscall (such as open,
> perf_event_open, or any of a number of other syscalls), you probably
> have to do it on a separate thread in the supervisor because you want
> to continue to be able to receive new notifications if any other process
> generates a seccomp notification event that you need to handle.
>
> In addition to that, some of these syscalls are preemptible, so you need
> to poll SECCOMP_IOCTL_NOTIF_ID_VALID to make sure that the program
> under supervision hasn't left the syscall.
>
> If we're to implement a mechanism that makes the seccomp ioctl receive
> non-blocking, it would be valuable to address this problem as well (getting
> a notification when the supervisor is processing a syscall and needs to
> preempt it). In the best case, this can be a minor inconvenience, and
> in the worst case this can result in weird errors where you're keeping
> resources open that the container expects to be closed.

Does "a notification" mean signals? Or would you want to have a second
thread in userspace that poll()s for cancellation events on the
seccomp fd and then somehow takes care of interrupting the first
thread, or something like that?

Either way, I think your proposal goes beyond the scope of patching
the existing weirdness, and should be a separate patch.
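
For reference, the ENOTCONN versus ENOENT trade-off quoted above can be
modelled as a small decision function. This is only a rough sketch of the
two behaviours under discussion, not kernel or supervisor code: the name
listener_finished() and its parameters are made up for illustration, and
the ENOTCONN return from a non-blocking NOTIF_RECV is the behaviour being
debated in this thread, not something assumed to exist.

```python
import errno
import select

# EPOLLERR and POLLERR share the value 0x008 on Linux; fall back to that
# value if this platform's select module does not export POLLERR.
POLLERR = getattr(select, "POLLERR", 0x008)


def listener_finished(recv_errno, revents, enotconn_on_no_targets):
    """Decide whether the event loop should stop watching the seccomp fd.

    recv_errno: errno from a failed non-blocking NOTIF_RECV attempt.
    revents: poll() revents reported for the seccomp fd.
    enotconn_on_no_targets: True models NOTIF_RECV returning ENOTCONN once
        all targets are gone (hypothetical); False models NOTIF_RECV
        returning ENOENT in both situations.
    """
    if enotconn_on_no_targets:
        # The error code alone tells "nothing pending" apart from
        # "no targets left", so revents is not consulted.
        return recv_errno == errno.ENOTCONN
    # ENOENT is ambiguous on its own, so the caller has to hand the
    # revents from the event loop to the seccomp logic as well.
    return recv_errno == errno.ENOENT and bool(revents & POLLERR)
```

With the ENOTCONN variant the seccomp handler can ignore revents
entirely; with the ENOENT variant the same errno means "terminate" only
when the error bit is also set in revents.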