From: Daniel Colascione
Date: Mon, 1 Apr 2019 08:55:00 -0700
Subject: Re: [PATCH v2 0/5] pid: add pidfd_open()
To: Linus Torvalds
Cc: Aleksa Sarai, Andy Lutomirski, Christian Brauner, Jann Horn, Andrew Lutomirski, David Howells, "Serge E. Hallyn", Linux API, Linux List Kernel Mailing, Arnd Bergmann, "Eric W. Biederman", Konstantin Khlebnikov, Kees Cook, Alexey Dobriyan, Thomas Gleixner, Michael Kerrisk-manpages, Jonathan Kowalski, "Dmitry V. Levin", Andrew Morton, Oleg Nesterov, Nagarathnam Muthusamy, Al Viro, Joel Fernandes
References: <20190330171215.3yrfxwodstmgzmxy@brauner.io> <132107F4-F56B-4D6E-9E00-A6F7C092E6BD@amacapital.net> <20190331211041.vht7dnqg4e4bilr2@brauner.io> <18C7FCB9-2CBA-4237-94BB-9C4395A2106B@amacapital.net> <20190401114059.7gdsvcqyoz2o5bbz@yavin>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Apr 1, 2019 at 8:36 AM Linus Torvalds wrote:
>
> On Mon, Apr 1, 2019 at 4:41 AM Aleksa Sarai wrote:
> >
> > Eric pitched a procfs2 which would *just* be the PIDs some time ago (in
> > an attempt to make it possible one day to mount /proc inside a container
> > without adding a bunch of masked paths), though it was just an idea and
> > I don't know if he ever had a patch for it.

Couldn't this mode just be a relatively simple procfs mount option
instead of a whole new filesystem? It'd be a bit like hidepid, right?
The internal bind mount option and the no-dotdot-traversal options
also look good to me.

> I wonder if we really want a full procfs2, or maybe we could just make
> the pidfd readable (yes, it's a directory file descriptor, but we
> could allow reading).

What would read(2) read?

> What are the *actual* use cases for opening /proc files through it? If
> it's really just for a small subset that android wants to do this
> (getting basic process state like "running" etc), rather than anything
> else, then we could skip the whole /proc linking entirely and go the
> other way instead (ie open_pidfd() would get that limited IO model,
> and we could make the /proc directory node get the same limited IO
> model).

We do a lot of process state inspection and manipulation, including
reading and writing the oom killer adjustment score, reading smaps,
and the occasional cgroup manipulation. I'd also like to be able to
write a race-free pkill(1). Doing this work via pidfd would be
convenient.

More generally, we can't enumerate the specific use cases, because
what we want to do with processes isn't bounded in advance, and we
regularly find new things in /proc/pid that we want to read and write.
I'd rather not prematurely limit the applicability of the pidfd
interface, especially when there's a simple option (the procfs
directory file descriptor approach) that requires neither enumerating
the supported inspection and manipulation actions in advance nor
adding a separate pidfd equivalent for each one. I very much want a
general-purpose API that reuses the metadata interfaces the kernel
already exposes. It's not clear to me how this rich interface could be
matched by read(2) on a pidfd.
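
To make the procfs directory file descriptor approach concrete, here is a minimal sketch of what userspace can already do with a plain /proc/<pid> directory fd. It is not the API proposed in this series: the helper names open_proc_dir() and read_proc_file() are made up for illustration, and the example inspects the calling process so that it does not depend on any other PID existing. There is still a window between learning a PID and opening /proc/<pid>, which is the part pidfd_open() is meant to close.

```python
import os


def open_proc_dir(pid):
    """Open /proc/<pid> as a directory file descriptor.

    Hypothetical helper for illustration only. The returned fd stays
    bound to the process that owned the PID when the directory was
    opened, not to whoever holds that PID number later.
    """
    flags = os.O_RDONLY | os.O_DIRECTORY | os.O_CLOEXEC
    return os.open(f"/proc/{pid}", flags)


def read_proc_file(proc_dirfd, name):
    """Read a file relative to the directory fd (openat(2) semantics)."""
    fd = os.open(name, os.O_RDONLY | os.O_CLOEXEC, dir_fd=proc_dirfd)
    # open() on an integer fd takes ownership and closes it on exit.
    with open(fd, "r") as f:
        return f.read()


if __name__ == "__main__":
    dirfd = open_proc_dir(os.getpid())
    try:
        # Any file under /proc/<pid> is reachable the same way, with no
        # per-file API: oom_score_adj, smaps, cgroup, status, ...
        print("oom_score_adj:", read_proc_file(dirfd, "oom_score_adj").strip())
        print("cgroup:", read_proc_file(dirfd, "cgroup").strip())
    finally:
        os.close(dirfd)
```

Writes follow the same pattern with O_WRONLY. Once the target process has exited and been reaped, lookups through the stale directory fd should fail with an error rather than silently resolve to an unrelated process that reused the PID, which is the property a race-free pkill(1) needs.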