Date: Tue, 24 Sep 2019 21:57:04 +0200
From: Christian Brauner
To: "Michael Kerrisk (man-pages)"
Cc: Florian Weimer, Oleg Nesterov, Jann Horn, "Eric W. Biederman",
 Daniel Colascione, Joel Fernandes, linux-man, Linux API, lkml
Subject: Re: For review: pidfd_send_signal(2) manual page
Message-ID: <20190924195701.7pw2olbviieqsg5q@wittgenstein>
References: <87pnjr9rth.fsf@mid.deneb.enyo.de>
 <20190923142325.jowzbnwjw7g7si7j@wittgenstein>
 <90dd38d5-34b3-b72f-8e5a-b51f944f22fb@gmail.com>
In-Reply-To: <90dd38d5-34b3-b72f-8e5a-b51f944f22fb@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Sep 24, 2019 at 09:44:49PM +0200, Michael Kerrisk (man-pages) wrote:
> Hello Christian,
>
> On 9/23/19 4:23 PM, Christian Brauner wrote:
> > On Mon, Sep 23, 2019 at 01:26:34PM +0200, Florian Weimer wrote:
> >> * Michael Kerrisk:
> >>
> >>> SYNOPSIS
> >>>        int pidfd_send_signal(int pidfd, int sig, siginfo_t info,
> >>>                              unsigned int flags);
> >>
> >> This probably should reference a header for siginfo_t.
> >
> > Agreed.
> >
> >>
> >>> ESRCH  The target process does not exist.
> >>
> >> If the descriptor is valid, does this mean the process has been waited
> >> for? Maybe this can be made more explicit.
> >
> > If by valid you mean "refers to a process/thread-group leader" aka is a
> > pidfd then yes: Getting ESRCH means that the process has exited and has
> > already been waited upon.
> > If it had only exited but not waited upon aka is a zombie, then sending
> > a signal will just work because that's currently how sending signals to
> > zombies works, i.e. if you only send a signal and don't do any
> > additional checks you won't notice a difference between a process being
> > alive and a process being a zombie. The userspace visible behavior in
> > terms of signaling them is identical.
>
> (Thanks for the clarification. I added the text "(i.e., it has
> terminated and been waited on)" to the ESRCH error.)
>
> >>> The pidfd_send_signal() system call allows the avoidance of race
> >>> conditions that occur when using traditional interfaces (such as
> >>> kill(2)) to signal a process. The problem is that the traditional
> >>> interfaces specify the target process via a process ID (PID), with
> >>> the result that the sender may accidentally send a signal to the
> >>> wrong process if the originally intended target process has
> >>> terminated and its PID has been recycled for another process. By
> >>> contrast, a PID file descriptor is a stable reference to a specific
> >>> process; if that process terminates, then the file descriptor
> >>> ceases to be valid and the caller of pidfd_send_signal() is
> >>> informed of this fact via an ESRCH error.
> >>
> >> It would be nice to explain somewhere how you can avoid the race using
> >> a PID descriptor. Is there anything else besides CLONE_PIDFD?
> >
> > If you're the parent of the process you can do this without CLONE_PIDFD:
> > pid = fork();
> > pidfd = pidfd_open();
> > ret = pidfd_send_signal(pidfd, 0, NULL, 0);
> > if (ret < 0 && errno == ESRCH)
> > 	/* pidfd refers to another, recycled process */
>
> Although there is still the race between the fork() and the
> pidfd_open(), right?

Actually no and my code is even too complex.
If you are the parent, and this is really a sequence that obeys the
ordering pidfd_open() before waiting:

pid = fork();
if (pid == 0)
	exit(EXIT_SUCCESS);
pidfd = pidfd_open(pid, 0);
waitid(pid, ...);

Then you are guaranteed that pidfd will refer to pid. No recycling can
happen since the process has not been waited upon yet (That is,
excluding special cases such as where you have a mainloop where a
callback reacts to a SIGCHLD event and waits on the child behind your
back and your next callback in the mainloop calls pidfd_open() while the
pid has been recycled etc.).
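
For anyone who wants to try the ordering above, here is a minimal, compilable sketch. It makes several assumptions: the wrapper names sys_pidfd_open() and sys_pidfd_send_signal() and the helper check_pidfd_tracks_child() are hypothetical, the fallback syscall numbers are the generic ones (some architectures, e.g. alpha, differ), and waitid() is spelled out with P_PID and WEXITED because the sequence above only abbreviates it. EPERM is treated like ENOSYS on the guess that a sandbox may filter the syscalls.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumption: generic syscall numbers, used only if the headers lack them. */
#ifndef __NR_pidfd_send_signal
#define __NR_pidfd_send_signal 424
#endif
#ifndef __NR_pidfd_open
#define __NR_pidfd_open 434
#endif

/* Hypothetical wrapper names, chosen so they cannot clash with libc symbols. */
static int sys_pidfd_open(pid_t pid, unsigned int flags)
{
	return (int)syscall(__NR_pidfd_open, pid, flags);
}

static int sys_pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
				 unsigned int flags)
{
	return (int)syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
}

/* Assumption: ENOSYS (old kernel) or EPERM (filtered) means "cannot test". */
static int unsupported(int err)
{
	return err == ENOSYS || err == EPERM;
}

/*
 * Returns 0 if the pidfd behaved as described, 1 if the syscalls are
 * unavailable here, and -1 on any other outcome.
 */
int check_pidfd_tracks_child(void)
{
	siginfo_t info;
	int pidfd, before, after, err;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(EXIT_SUCCESS);

	/* The child has not been waited on, so its PID cannot be recycled. */
	pidfd = sys_pidfd_open(pid, 0);
	if (pidfd < 0) {
		err = errno;
		waitid(P_PID, (id_t)pid, &info, WEXITED);
		return unsupported(err) ? 1 : -1;
	}

	/* Signal 0 only performs checks; alive or zombie, it should succeed. */
	before = sys_pidfd_send_signal(pidfd, 0, NULL, 0);
	err = errno;
	if (before < 0 && unsupported(err)) {
		waitid(P_PID, (id_t)pid, &info, WEXITED);
		close(pidfd);
		return 1;
	}

	/* Reap the child; only from here on could the PID be reused. */
	if (waitid(P_PID, (id_t)pid, &info, WEXITED) < 0) {
		close(pidfd);
		return -1;
	}

	/* The pidfd still names the reaped child, so ESRCH is expected. */
	after = sys_pidfd_send_signal(pidfd, 0, NULL, 0);
	err = errno;
	close(pidfd);

	return (before == 0 && after < 0 && err == ESRCH) ? 0 : -1;
}
```

Calling check_pidfd_tracks_child() from a main() and treating a return of 1 as "skipped" should be enough to exercise it. The sketch deliberately leaves out the mainloop caveat: it assumes nothing else in the process reaps the child between fork() and pidfd_open().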

A race could only appear in sequences where waiting happens before
pidfd_open():

pid = fork();
if (pid == 0)
	exit(EXIT_SUCCESS);
waitid(pid, ...);
pidfd = pidfd_open(pid, 0);

which honestly simply doesn't make any sense. So if you're the parent and
you combine fork() + pidfd_open() correctly, things should be fine
without even having to verify via pidfd_send_signal() (I missed that in
my first mail).

(Now, it gets more hairy when one considers clone(CLONE_PARENT) but that
would be wildly esoteric because at that point you're using clone()
already and then you should simply pass clone(CLONE_PARENT |
CLONE_PIDFD).)

If you're _not_ the parent then CLONE_PIDFD and sending around the pidfd
are your only option for avoiding the race imho.

>
> >>> static
> >>> int pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
> >>>                       unsigned int flags)
> >>> {
> >>>     return syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
> >>> }
> >>
> >> Please use a different function name. Thanks.
>
> Covered in another thread. I await some further feedback from Florian.

Right, that wasn't my suggestion anyway. :)

Thanks!
Christian