From: Christian Brauner
Date: Mon, 25 Mar 2019 21:45:44 +0100
To: Linus Torvalds
Cc: Michael Tirado, Alexey Dobriyan, LKML
Subject: Re: pidfd design
Message-ID: <20190325204543.rfpy2cbcfnmd5hst@gmail.com>
References: <20190320200702.GA27111@avx2>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Mar 25, 2019 at 10:45:29AM -0700, Linus Torvalds wrote:
> On Fri, Mar 22, 2019 at 11:34 AM Michael Tirado wrote:
> >
> > On Wed, Mar 20, 2019 at 8:08 PM Alexey Dobriyan wrote:
> > >
> > > pidfd code should be backed out immediately. Forget about /proc.
> >
> > Seems like Torvalds just merges this sort of "stuff" without reading
> > it now, or there's something that auto accepted pull request to RC tree?
>
> There is no auto-accept.
>
> But there also didn't seem to be any valid arguments against it, and
> the android people had arguments for it.
>
> Arguing against it based on "I don't like /proc" is pointless. The
> fact is, /proc is our system interface for a lot of things.
>
> Arguing against it based on "I worry about the _other_
> non-signal-sending things that could be done with this" is also
> pointless. What other things? The only thing that got merged was the
> signalling.

To back up Linus's defense with a glimpse into the future: we will not
have to rely on dirfds from /proc for general process management. Even
the commit message for the pidfd_send_signal() syscall says that we
intend to decouple this from procfs, i.e. decouple process management
from reading process metadata.

We have an ongoing discussion, and what a lot of people agree on is that
pidfds will be anon inode file descriptors that stash a reference to
struct pid in their private_data member. They can be made pollable if
need be, they are conceptually cleaner and much simpler, and they mirror
what will happen in the new mount API as well.

The idea is to translate these pidfds, e.g. via a simple ioctl()
interface that takes a pidfd and gives back - with the same permission
checks that apply today - a corresponding /proc/ fd that can be used to
read metadata of a process (see the suggestion by Andy and Jann [1]).
The advantage is that pidfd_clone() or something similar can simply
return a pidfd without caring which procfs instance the process is
supposed to be located in or referenced from, which is in general much
safer.

But there is absolutely nothing wrong with allowing users to use /proc/
to signal processes. One of the reasons I did this is that it is so
intuitive that non-kernel people have requested it over and over. As
mentioned in the original patchset, the plan was always to decouple this
from procfs (see the references in there), and this is what the new
pidctl() syscall is for: it transparently translates between the
pid-based API and the pidfd-based API.

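For illustration, here is a minimal userspace sketch of what signalling
through a /proc dirfd can look like with the syscall as merged. It
assumes a kernel that provides pidfd_send_signal(), procfs mounted at
/proc, and syscall number 424 where the libc headers do not define it
yet. The helper names proc_pid_path() and signal_via_proc() are made up
for this example and are not part of any patch.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Assumption: older libc headers may not know the syscall yet. 424 is
 * the number in the unified syscall table used by most architectures.
 */
#ifndef __NR_pidfd_send_signal
#define __NR_pidfd_send_signal 424
#endif

/* Hypothetical helper: write "/proc/<pid>" into buf.
 * Returns 0 on success, -ENAMETOOLONG if buf is too small. */
int proc_pid_path(pid_t pid, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/proc/%ld", (long)pid);

	return (n < 0 || (size_t)n >= len) ? -ENAMETOOLONG : 0;
}

/* Hypothetical helper: open the /proc directory of pid and send sig
 * through that fd. Returns 0 on success, a negative errno otherwise. */
int signal_via_proc(pid_t pid, int sig)
{
	char path[64];
	int fd, ret;

	ret = proc_pid_path(pid, path, sizeof(path));
	if (ret < 0)
		return ret;

	fd = open(path, O_DIRECTORY | O_CLOEXEC);
	if (fd < 0)
		return -errno;

	/* info == NULL and flags == 0: kill()-like semantics */
	ret = syscall(__NR_pidfd_send_signal, fd, sig, NULL, 0);
	if (ret < 0)
		ret = -errno;

	close(fd);
	return ret;
}
```

Calling signal_via_proc(getpid(), 0) only performs the existence and
permission check, as kill(pid, 0) would. The point of going through the
fd is that, once open() has succeeded, the fd keeps referring to that
process even if the pid number is recycled later.
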
[1]: https://lore.kernel.org/lkml/CAG48ez3VMjLJBC_F3BxC2sc2s-28NdsrUduR=jX66XH0w2O-Qg@mail.gmail.com/

>
> Now, arguing that signalling should use the open-time credentials
> might make sense, but this isn't read/write. You can't fool some suid
> program to do magic random system calls for you, and if you can, then
> arguing about pidfd is kind of pointless.
>
> So the model of using a file descriptor instead of a 'pid' for signal
> handling is actually very unix-like. Maybe that's how pid's should
> have worked to begin with. Remember that whole "everything is a file"
> thing?
>
> Now, the fact that fork() and clone() return a pid obviously means
> that pidfd isn't the primary model (not to decades of just history),
> but that doesn't make pidfd wrong.
>
> And namespace issues etc are all also kind of irrelevant. If you open
> random files in /proc and randomly do pidfd_send_signal() on those,
> you get random results. If that worries you, then DON'T DO THAT THEN,
> for chrissake! That's not a sane model to begin with, but it's not the
> usage model for this, so it's another completely specious argument.
>
> So yes, I thought about the pidfd pull (which was why it happened at
> the very end of the merge window), and I found the arguments against
> it bad.
>
> Linus