Date: Thu, 28 May 2020 00:45:01 +0200
From: Christian Brauner
To: Kees Cook
Cc: linux-kernel@vger.kernel.org, Andy Lutomirski, Tycho Andersen,
    Matt Denton, Sargun Dhillon, Jann Horn, Chris Palmer, Aleksa Sarai,
    Robert Sesek, Jeffrey Vander Stoep, Linux Containers
Subject: Re: [PATCH 1/2] seccomp: notify user trap about unused filter
Message-ID: <20200527224501.jddwcmvtvjtjsmsx@wittgenstein>
References: <20200527111902.163213-1-christian.brauner@ubuntu.com>
    <202005271408.58F806514@keescook>
    <20200527220532.jplypougn3qzwrms@wittgenstein>
    <202005271537.75548B6@keescook>
In-Reply-To: <202005271537.75548B6@keescook>

On Wed, May 27, 2020 at 03:37:58PM -0700, Kees Cook wrote:
> On Thu, May 28, 2020 at 12:05:32AM +0200, Christian Brauner wrote:
> > The main question also is, is there precedent where the kernel just
> > closes the file descriptor for userspace behind its back? I'm not sure
> > I've heard of this before. That's not how that works afaict; it's also
> > not how we do pidfds. We don't just close the fd when the task
> > associated with it goes away, we notify and then userspace can close.
>
> But there's a mapping between pidfd and task struct that is separate
> from task struct itself, yes? I.e. keeping a pidfd open doesn't pin
> struct task in memory forever, right?

No, but that's an implementation detail and we discussed that. A pidfd
pins struct pid instead of task_struct. Once the process is fully gone
you just get ESRCH. For example, fds to /proc/// aren't just closed
once the task has gone away; userspace will just get ESRCH when it
tries to open files under there, but the fd remains valid until close()
is called.

In addition, of all the anon inode fds, none of them have the "close
the file behind userspace's back" behavior: io_uring, signalfd, timerfd,
btf, perf_event, bpf-prog, bpf-link, bpf-map, pidfd, userfaultfd,
fanotify, inotify, eventpoll, fscontext, eventfd. These are just the
core kernel ones.

I'm pretty sure that it'd be very odd behavior if we did that. I'd
rather just notify userspace and leave the close to them. But maybe I'm
missing something.

Christian
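
A minimal sketch of the /proc behavior described above, assuming Linux with /proc mounted. The helper name probe_dead_task_fd is made up for illustration and is not part of the patch. It holds an fd to /proc/<pid>/status across the child's exit and reaping, then reads from that already-open fd (rather than opening a new file). On the kernels I'm aware of, the read should fail with ESRCH while the fd itself stays valid until close():

```python
import errno
import os


def probe_dead_task_fd():
    """Hold a /proc/<pid>/status fd across the child's exit and reaping.

    Returns (read_errno, fd_still_valid). read_errno is None if the
    read unexpectedly succeeded.
    """
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: block until the parent closes its write end, then exit.
        os.close(w)
        os.read(r, 1)
        os._exit(0)

    os.close(r)
    # Open the file while the child is still alive.
    fd = os.open(f"/proc/{pid}/status", os.O_RDONLY)

    os.close(w)         # lets the child's read() return EOF
    os.waitpid(pid, 0)  # reap the child so the task is fully gone

    read_errno = None
    try:
        os.read(fd, 4096)
    except OSError as e:
        read_errno = e.errno

    # The kernel did not close the fd behind our back: fstat() still works.
    fd_still_valid = True
    try:
        os.fstat(fd)
    except OSError:
        fd_still_valid = False

    os.close(fd)  # closing is left to userspace
    return read_errno, fd_still_valid


if __name__ == "__main__":
    err, valid = probe_dead_task_fd()
    name = errno.errorcode.get(err, "none") if err is not None else "none"
    print(f"read errno: {name}, fd still valid: {valid}")
```

The point of the sketch is only the split of responsibilities: the kernel reports that the task is gone, and userspace decides when to close().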