Date: Fri, 26 Jan 2024 10:42:49 +0100
From: Christian Brauner
To: Oleg Nesterov
Cc: Tycho Andersen, linux-kernel@vger.kernel.org,
 linux-api@vger.kernel.org, Tycho Andersen, "Eric W.
 Biederman"
Subject: Re: [PATCH v3 1/3] pidfd: allow pidfd_open() on non-thread-group leaders
Message-ID: <20240126-lokal-aktualisieren-fef41d9bce9f@brauner>
References: <20240123153452.170866-1-tycho@tycho.pizza>
 <20240123153452.170866-2-tycho@tycho.pizza>
 <20240123195608.GB9978@redhat.com>
 <20240125140830.GA5513@redhat.com>
 <20240125-tricksen-baugrube-3f78c487a23a@brauner>
 <20240125175113.GC5513@redhat.com>
In-Reply-To: <20240125175113.GC5513@redhat.com>

On Thu, Jan 25, 2024 at 06:51:14PM +0100, Oleg Nesterov wrote:
> On 01/25, Christian Brauner wrote:
> >
> > > > When it is reaped is "mostly unrelated".
> > >
> > > Then why pidfd_poll() can't simply check !task || task->exit_state ?
> > >
> > > Nevermind. So, currently pidfd_poll() succeeds when the leader can be
> >
> > Hm, the comment right above mentions:
> >
> > /*
> >  * Inform pollers only when the whole thread group exits.
> >  * If the thread group leader exits before all other threads in the
> >  * group, then poll(2) should block, similar to the wait(2) family.
> >  */
> > > reaped, iow the whole thread group has exited.
>
> Yes, but the comment doesn't contradict with what I have said?

No, it doesn't. I'm trying to understand what you are suggesting though.
Are you saying !task || task->exit_state is enough and we shouldn't use
the helper that was added in commit 38fd525a4c61 ("exit: Factor
thread_group_exited out of pidfd_poll")? If so, what does open-coding
the check buy us over using that helper? Is there an actual bug here?

>
> > > But even if you are the
> > > parent, you can't expect that wait(WNOHANG) must succeed, the leader
> > > can be traced. I guess it is too late to change this behaviour.
> >
> > Hm, why is that an issue though?
>
> Well, I didn't say this is a problem.
> I simply do not know how/why people
> use pidfd_poll().

Sorry, I just have a hard time understanding what you wanted then. :)

"I guess it is too late to change this behavior." made it sound like a)
there's a problem and b) that you would prefer to change behavior. Thus,
it seems that wait(WNOHANG) hanging when a traced leader of an empty
thread-group has exited is a problem in your eyes.

>
> I mostly tried to explain why do I think that do_notify_pidfd() should
> be always called from exit_notify() path, not by release_task(), even
> if the task is not a leader.
>
> > Because a program would rely on WNOHANG to hang on
> > a ptraced leader? That seems esoteric imho.
>
> To me it would be usefule, but lets not discuss this now. The "patch"

Ok, that's good then. I would expect that at least stuff like rr makes
use of pidfd and they might rely on this behavior - although I haven't
checked their code.

> I sent doesn't change the current behaviour.

Yeah, I got that but it would still be useful to understand the wider
context you were addressing.

>
> > > What if we add the new PIDFD_THREAD flag? With this flag
> > >
> > >   - sys_pidfd_open() doesn't require the must be a group leader
> >
> > Yes.
> >
> > >
> > >   - pidfd_poll() succeeds when the task passes exit_notify() and
> > >     becomes a zombie, even if it is a leader and has other threads.
> >
> > Iiuc, if an existing user creates a pidfd for a thread-group leader and
> > then polls that pidfd they would currently only get notified if the
> > thread-group is empty and the leader has exited.
> >
> > If we now start notifying when the thread-group leader exits but the
> > thread-group isn't empty then this would be a fairly big api change
>
> Hmm... again, this patch doesn't (shouldn't) change the current behavior.
>
> Please note "with this flag" above.
> If sys_pidfd_open() was called
> without PIDFD_THREAD, then sys_pidfd_open() still requires that the
> target task must be a group leader, and pidfd_poll() won't succeed
> until the leader exits and thread_group_empty() is true.

Yeah, I missed the PIDFD_THREAD flag suggestion. Sorry about that.

Btw, I'm not sure whether you remember, but when we originally did the
pidfd work you and I discussed thread support and already decided back
then that having a flag like PIDFD_THREAD would likely be the way to go.

The PIDFD_THREAD flag would be interesting because we could make
pidfd_send_signal() support this flag as well to allow sending a signal
to a specific thread. That's something that I had also wanted to
support. And I've been asked for this a few times already.

What do you think?
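
For reference, the poll semantics discussed in this thread can be
modeled in userspace roughly as below. This is a toy sketch, not kernel
code: struct model_task, model_pidfd_poll_ready() and the value of
MODEL_PIDFD_THREAD are all hypothetical names invented for illustration,
and the real check lives behind the thread_group_exited() helper
mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <assert.h>

/* Hypothetical flag value, for illustration only. */
#define MODEL_PIDFD_THREAD 0x1u

/* Toy stand-in for the task state the discussion cares about. */
struct model_task {
	bool exited;             /* task passed exit_notify() (exit_state set) */
	bool thread_group_empty; /* no other threads left in the group */
};

/* Returns true when a poller on the pidfd would be notified. */
static bool model_pidfd_poll_ready(const struct model_task *task,
				   unsigned int flags)
{
	/* A task that is already gone counts as exited. */
	if (!task)
		return true;

	/* With the flag: the task itself becoming a zombie is enough. */
	if (flags & MODEL_PIDFD_THREAD)
		return task->exited;

	/* Without the flag: only once the whole thread group has exited. */
	return task->exited && task->thread_group_empty;
}
```

For a leader that has exited while another thread is still running,
this model reports "not ready" without the flag and "ready" with it,
which is the behavioral difference being discussed.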