From: Tycho Andersen
To: Christian Brauner
Cc: Oleg Nesterov,
    "Eric W. Biederman", linux-kernel@vger.kernel.org,
    linux-api@vger.kernel.org, Tycho Andersen, Tycho Andersen
Subject: [PATCH v2 1/3] pidfd: allow pidfd_open() on non-thread-group leaders
Date: Thu, 7 Dec 2023 10:09:44 -0700
Message-Id: <20231207170946.130823-1-tycho@tycho.pizza>
X-Mailer: git-send-email 2.34.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

From: Tycho Andersen

We are using the pidfd family of syscalls with the seccomp userspace
notifier. When some thread triggers a seccomp notification, we want to do
some things to its context (munge fd tables via pidfd_getfd(), maybe write
to its memory, etc.). However, for threads created with ~CLONE_FILES or
~CLONE_VM we can't use the pidfd family of syscalls for this purpose: their
fd table or mm is distinct from the thread group leader's, and a pidfd can
currently only refer to a thread group leader.

In this patch, we relax this restriction for pidfd_open().

In order to avoid dangling poll() users we need to notify pidfd waiters
when individual threads die, but once we do that all the other machinery
seems to work ok, at least according to the tests. I suppose there are more
cases than just this one, though.

Signed-off-by: Tycho Andersen
--
v2: unify pidfd notification so that it all goes through do_notify_pidfd()
    inside of __exit_signal(), as suggested by Oleg.
Link to v1: https://lore.kernel.org/all/20231130163946.277502-1-tycho@tycho.pizza/
---
 include/linux/sched/signal.h |  1 +
 kernel/exit.c                |  3 +++
 kernel/fork.c                |  4 +---
 kernel/pid.c                 | 11 +----------
 kernel/signal.c              |  5 +----
 5 files changed, 7 insertions(+), 17 deletions(-)

diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index 3499c1a8b929..37d6b4e4ab70 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -332,6 +332,7 @@ extern int kill_pid_usb_asyncio(int sig, int errno, sigval_t addr, struct pid *,
 extern int kill_pgrp(struct pid *pid, int sig, int priv);
 extern int kill_pid(struct pid *pid, int sig, int priv);
 extern __must_check bool do_notify_parent(struct task_struct *, int);
+extern void do_notify_pidfd(struct task_struct *task);
 extern void __wake_up_parent(struct task_struct *p, struct task_struct *parent);
 extern void force_sig(int);
 extern void force_fatal_sig(int);
diff --git a/kernel/exit.c b/kernel/exit.c
index ee9f43bed49a..7bb6488ebd79 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -149,6 +149,9 @@ static void __exit_signal(struct task_struct *tsk)
 	struct tty_struct *tty;
 	u64 utime, stime;
 
+	/* Wake up all pidfd waiters */
+	do_notify_pidfd(tsk);
+
 	sighand = rcu_dereference_check(tsk->sighand,
 					lockdep_tasklist_lock_is_held());
 	spin_lock(&sighand->siglock);
diff --git a/kernel/fork.c b/kernel/fork.c
index 10917c3e1f03..eef15c93f6cf 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2163,8 +2163,6 @@ static int __pidfd_prepare(struct pid *pid, unsigned int flags, struct file **re
  * Allocate a new file that stashes @pid and reserve a new pidfd number in the
  * caller's file descriptor table. The pidfd is reserved but not installed yet.
  *
- * The helper verifies that @pid is used as a thread group leader.
- *
  * If this function returns successfully the caller is responsible to either
  * call fd_install() passing the returned pidfd and pidfd file as arguments in
  * order to install the pidfd into its file descriptor table or they must use
@@ -2182,7 +2180,7 @@ static int __pidfd_prepare(struct pid *pid, unsigned int flags, struct file **re
  */
 int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret)
 {
-	if (!pid || !pid_has_task(pid, PIDTYPE_TGID))
+	if (!pid)
 		return -EINVAL;
 
 	return __pidfd_prepare(pid, flags, ret);
diff --git a/kernel/pid.c b/kernel/pid.c
index 6500ef956f2f..4806798022d9 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -552,11 +552,6 @@ struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags)
  * Return the task associated with @pidfd. The function takes a reference on
  * the returned task. The caller is responsible for releasing that reference.
  *
- * Currently, the process identified by @pidfd is always a thread-group leader.
- * This restriction currently exists for all aspects of pidfds including pidfd
- * creation (CLONE_PIDFD cannot be used with CLONE_THREAD) and pidfd polling
- * (only supports thread group leaders).
- *
  * Return: On success, the task_struct associated with the pidfd.
  *         On error, a negative errno number will be returned.
  */
@@ -615,11 +610,7 @@ int pidfd_create(struct pid *pid, unsigned int flags)
  * @flags: flags to pass
  *
  * This creates a new pid file descriptor with the O_CLOEXEC flag set for
- * the process identified by @pid. Currently, the process identified by
- * @pid must be a thread-group leader. This restriction currently exists
- * for all aspects of pidfds including pidfd creation (CLONE_PIDFD cannot
- * be used with CLONE_THREAD) and pidfd polling (only supports thread group
- * leaders).
+ * the process identified by @pid.
  *
  * Return: On success, a cloexec pidfd is returned.
  *         On error, a negative errno number will be returned.
diff --git a/kernel/signal.c b/kernel/signal.c
index 47a7602dfe8d..7b3a1e147225 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2028,7 +2028,7 @@ int send_sigqueue(struct sigqueue *q, struct pid *pid, enum pid_type type)
 	return ret;
 }
 
-static void do_notify_pidfd(struct task_struct *task)
+void do_notify_pidfd(struct task_struct *task)
 {
 	struct pid *pid;
 
@@ -2060,9 +2060,6 @@ bool do_notify_parent(struct task_struct *tsk, int sig)
 	WARN_ON_ONCE(!tsk->ptrace &&
 	       (tsk->group_leader != tsk || !thread_group_empty(tsk)));
 
-	/* Wake up all pidfd waiters */
-	do_notify_pidfd(tsk);
-
 	if (sig != SIGCHLD) {
 		/*
 		 * This is only possible if parent == real_parent.

base-commit: bee0e7762ad2c6025b9f5245c040fcc36ef2bde8
-- 
2.34.1