References: <20190329155425.26059-1-christian@brauner.io> <20190330171215.3yrfxwodstmgzmxy@brauner.io>
From: Jann Horn
Date: Sat, 30 Mar 2019 19:00:03 +0100
Subject: Re: [PATCH v2 0/5] pid: add pidfd_open()
To: Linus Torvalds
Cc: Christian Brauner, Daniel Colascione, Andrew Lutomirski, David Howells,
    "Serge E. Hallyn", Linux API, Linux List Kernel Mailing, Arnd Bergmann,
    "Eric W. Biederman", Konstantin Khlebnikov, Kees Cook, Alexey Dobriyan,
    Thomas Gleixner, Michael Kerrisk-manpages, Jonathan Kowalski,
    "Dmitry V. Levin", Andrew Morton, Oleg Nesterov, Nagarathnam Muthusamy,
    Aleksa Sarai, Al Viro, Joel Fernandes
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Mar 30, 2019 at 6:24 PM Linus Torvalds wrote:
> On Sat, Mar 30, 2019 at 10:12 AM Christian Brauner wrote:
> > To clarify, what the Android guys really wanted to be part of the api is
> > a way to get race-free access to metadata associated with a given pidfd.
> > And the idea was that *if and only if procfs is mounted* you could do:
> >
> > int pidfd = pidfd_open(1234, 0);
> >
> > int procfd = open("/proc", O_RDONLY | O_CLOEXEC);
> > int procpidfd = ioctl(pidfd, PIDFD_TO_PROCFD, procfd);
>
> And my claim is that this is three system calls - one of them very
> hacky - to just do
>
>     int pidfd = open("/proc/%d", O_PATH);
>
> and you're done. It acts as the pidfd _and_ the way to get the
> associated status files etc.
>
> So there is absolutely zero advantage to going through pidfd_open().
>
> No. No. No.
>
> So the *only* reason for "pidfd_open()" is if you don't have /proc in
> the first place. In which case the whole PIDFD_TO_PROCFD is bogus.

So if, in the future, there is some sort of "create a new task and
return an fd to it" syscall, do you think it should always return
pidfds, or do you think it should return fds to /proc if procfs is
available? And if it should return fds to /proc, does that mean that
this "create a task" API should take an extra argument with a file
descriptor to the procfs instance you want to use?

(This can't always be implemented easily in userspace on top of normal
clone(), because if you create a task without a termination signal -
like a thread -, its PID can be recycled under you.)

An API like this would have less complexity stuffed into a single
syscall if it always returns pidfds, and if you then actually want an
fd to procfs, you can do the conversion that requires specifying a
procfs instance separately.

Of course, if you think that we shouldn't add an API for
pidfd-to-procfs conversion before we have an API for
clone()-with-an-fd-retval, that's understandable.
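
For reference, the single-open() approach quoted above might look roughly
like the following in C. This is only a sketch, not code from the patch
series: the helper names open_proc_pid() and read_proc_file() are made up
for illustration, procfs is assumed to be mounted at /proc, and the
"/proc/%d" in the quote is taken as shorthand for formatting the path
first. Note that opening by PID number is still subject to PID reuse
between learning the PID and the open() call.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open /proc/<pid> as an O_PATH directory handle.
 * Assumes procfs is mounted at /proc. Returns an fd, or -1 on error.
 */
static int open_proc_pid(pid_t pid)
{
    char path[32];
    int n = snprintf(path, sizeof(path), "/proc/%d", (int)pid);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    return open(path, O_PATH | O_DIRECTORY | O_CLOEXEC);
}

/*
 * Hypothetical helper: read a metadata file (e.g. "status") relative to
 * the handle returned by open_proc_pid(). The result is NUL-terminated
 * and truncated to len - 1 bytes. Returns bytes read, or -1 on error.
 */
static ssize_t read_proc_file(int procpidfd, const char *name,
                              char *buf, size_t len)
{
    ssize_t total = 0;
    int fd;

    if (len == 0)
        return -1;

    /* An O_PATH directory fd is usable as the dirfd of openat(). */
    fd = openat(procpidfd, name, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    while ((size_t)total < len - 1) {
        ssize_t r = read(fd, buf + total, len - 1 - (size_t)total);

        if (r < 0) {
            close(fd);
            return -1;
        }
        if (r == 0)
            break;
        total += r;
    }
    close(fd);
    buf[total] = '\0';
    return total;
}
```

A caller would do something like open_proc_pid(getpid()) followed by
read_proc_file(fd, "status", buf, sizeof(buf)). The same fd serves as the
handle for the process and as the base for looking up its metadata files,
which is the point being made in the quoted text.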