Date: Sun, 31 Mar 2019 23:19:31 +0200
From: Christian Brauner
To: Andy Lutomirski
Cc: Linus Torvalds, Daniel Colascione, Jann Horn, Andrew Lutomirski,
 David Howells, "Serge E. Hallyn", Linux API, Linux List Kernel Mailing,
 Arnd Bergmann, "Eric W. Biederman", Konstantin Khlebnikov, Kees Cook,
 Alexey Dobriyan, Thomas Gleixner, Michael Kerrisk-manpages,
 Jonathan Kowalski, "Dmitry V. Levin", Andrew Morton, Oleg Nesterov,
 Nagarathnam Muthusamy, Aleksa Sarai, Al Viro, Joel Fernandes
Subject: Re: [PATCH v2 0/5] pid: add pidfd_open()
Message-ID: <20190331211930.wxkqhfvexdupfem6@brauner.io>
References: <20190329155425.26059-1-christian@brauner.io>
 <20190330171215.3yrfxwodstmgzmxy@brauner.io>
 <132107F4-F56B-4D6E-9E00-A6F7C092E6BD@amacapital.net>
In-Reply-To: <132107F4-F56B-4D6E-9E00-A6F7C092E6BD@amacapital.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Mar 31, 2019 at 02:09:03PM -0600, Andy Lutomirski wrote:
>
>
> > On Mar 30, 2019, at 11:24 AM, Linus Torvalds wrote:
> >
> >> On Sat, Mar 30, 2019 at 10:12 AM Christian Brauner wrote:
> >>
> >>
> >> To clarify, what the Android guys really wanted to be part of the api is
> >> a way to get race-free access to metadata associated with a given pidfd.
> >> And the idea was that *if and only if procfs is mounted* you could do:
> >>
> >> int pidfd = pidfd_open(1234, 0);
> >>
> >> int procfd = open("/proc", O_RDONLY | O_CLOEXEC);
> >> int procpidfd = ioctl(pidfd, PIDFD_TO_PROCFD, procfd);
> >
> > And my claim is that this is three system calls - one of them very
> > hacky - to just do
> >
> >    int pidfd = open("/proc/%d", O_PATH);
>
> Hi Linus-
>
> I want to re-check this because I think Christian’s example was bad. I proposed these ioctls, but that wasn’t the intended use. The real point is:

Getting metadata access was pushed as essential originally which is why
this ioctl() came up in the first place. The concerns about CLONE_PIDFD
were not relevant when this came up [1]:

> And how do you propose, given one of these handle objects, getting a
> process's current priority, or its current oom score, or its list of
> memory maps? As I mentioned in my original email, and which nobody has
> addressed, if you don't use a dirfd as your process handle or you
> don't provide an easy way to get one of these proc directory FDs, you
> need to duplicate a lot of metadata access interfaces.

An API that takes a process handle object and an fd pointing at /proc
(the root of the proc fs) and gives you back a proc dirfd would do the
trick. You could do this with no new kernel features at all if you're
willing to read the pid, call openat(2), and handle the races in user
code.

[1]: https://lore.kernel.org/lkml/CALCETrUFrFKC2YTLH7ViM_7XPYk3LNmNiaz6s8wtWo1pmJQXzg@mail.gmail.com/

>
> int pidfd = new_improved_clone(...);
>
> To be useful, this type of API *must* work without proc mounted.
>
> And, later:
>
> openat(fd to pidfd’s proc directory, "status", ...);
>
> And we want a non-utterly-crappy way to do this. The ioctl is certainly ugly, but it *works*.
>
> Another approach is:
>
> pid_t pid = pidfd_get_pid(pidfd);
> sprintf(buf, "/proc/%d", pid);
> int procfd = open(buf, O_PATH);
> if (pidfd_get_pid(pidfd) != pid) {
>   we lose;
> }
>
> But this is clunky.
>
> Do you think the clunky version is okay, or do you have a suggestion for making it better?
>
> --Andy
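
For illustration only, here is a rough C sketch of the "clunky" userspace variant quoted above. It is not what the patch set implements, and it rests on several assumptions: the kernel provides pidfd_send_signal(2) (Linux 5.1 and later), procfs is mounted at /proc, and the caller already knows the PID its pidfd was opened for. The pidfd_get_pid() call in the quoted pseudocode is not an existing interface, so the sketch swaps in a different check: it sends signal 0 through the pidfd after opening the directory. The helper name proc_dirfd_from_pidfd() is hypothetical, and the fallback syscall numbers assume an architecture with the unified numbering, such as x86-64 or arm64.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback numbers for older libc headers. Assumption: unified syscall
 * numbering (x86-64, arm64, ...); other architectures may differ. */
#ifndef SYS_pidfd_send_signal
#define SYS_pidfd_send_signal 424
#endif
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif

/*
 * Hypothetical helper, not a kernel or libc API: given a pidfd and the PID
 * it was opened for, return an fd for /proc/<pid>, or -1 with errno set.
 */
int proc_dirfd_from_pidfd(int pidfd, pid_t pid)
{
    char path[32];
    int procfd;

    snprintf(path, sizeof(path), "/proc/%d", (int)pid);
    procfd = open(path, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
    if (procfd < 0)
        return -1;

    /*
     * Signal 0 delivers nothing; it only checks that the process behind
     * the pidfd still exists. If it does, the PID cannot have been
     * recycled before the open() above, so procfd names the same process.
     */
    if (syscall(SYS_pidfd_send_signal, pidfd, 0, NULL, 0) < 0) {
        int saved = errno;

        close(procfd);
        errno = saved;
        return -1;
    }
    return procfd;
}
```

A caller would then read metadata relative to the returned descriptor, for example openat(procfd, "status", O_RDONLY | O_CLOEXEC). If the helper fails with ESRCH, the process has already exited and been reaped, which corresponds to the "we lose" branch in the quoted pseudocode.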