Date: Mon, 25 Mar 2019 21:40:21 +0100
From: Christian Brauner
To: Jann Horn
Cc: Daniel Colascione, Jonathan Kowalski, Joel Fernandes,
    Konstantin Khlebnikov, Andy Lutomirski, David Howells,
    "Serge E. Hallyn", "Eric W. Biederman", Linux API, linux-kernel,
    Arnd Bergmann, Kees Cook, Alexey Dobriyan, Thomas Gleixner,
    Michael Kerrisk-manpages, "Dmitry V. Levin", Andrew Morton,
    Oleg Nesterov, Nagarathnam Muthusamy, Aleksa Sarai, Al Viro
Subject: Re: [PATCH 0/4] pid: add pidctl()
Message-ID: <20190325204021.iknfkdvwykqlgzm4@brauner.io>
References: <20190325162052.28987-1-christian@brauner.io> <20190325173614.GB25975@google.com>

On Mon, Mar 25, 2019 at 09:34:00PM +0100, Jann Horn wrote:
> On Mon, Mar 25, 2019 at 9:15 PM Daniel Colascione wrote:
> > On Mon, Mar 25, 2019 at 12:42 PM Jonathan Kowalski wrote:
> > > On Mon, Mar 25, 2019 at 6:57 PM Daniel Colascione wrote:
> [...]
> > > Yes, but everything in /proc is not equivalent to an attribute, or an
> > > option, and depending on its configuration, you may not want to allow
> > > processes to even be able to see /proc for any PIDs other than those
> > > running as their own user (hidepid). This means, even if this new
> > > system call is added, to respect hidepid, it must, depending on if
> > > /proc is mounted (and what hidepid is set to, and what gid= is set
> > > to), return EPERM, because then there is a discrepancy between how the
> > > two entrypoints to acquire a process handle do access control.
> >
> > That's why I proposed that this translation mechanism accept a procfs
> > root directory --- so you'd specify *which* procfs you want and let
> > the kernel apply whatever hidepid access restrictions it wants.
> [...]
> > > > and 2) it's
> > > > "fail unsafe": IMHO, most users in practice will skip the line marked
> > > > "LIVENESS CHECK", and as a result, their code will appear to work but
> > > > contain subtle race conditions. An explicit interface to translate
> > > > from a (PIDFD, PROCFS_ROOT) tuple to a /proc/pid directory file
> > > > descriptor would be both more efficient and fail-safe.
> > > >
> > > > [1] as a separate matter, it'd be nice to have a batch version of close(2).
> > >
> > > Since /proc is full of gunk,
> >
> > People keep saying /proc is bad, but I haven't seen any serious
> > proposals for a clean replacement. :-)
> >
> > > how about adding more to it and making
> > > the magic symlink of /proc/self/fd for the pidfd to lead to the dirfd
> > > of the /proc entry of the process it maps to, when one uses
> > > O_DIRECTORY while opening it? Otherwise, it behaves as it does today.
> > > It would be equivalent to opening the proc entry with usual access
> > > restrictions (and hidepid made to work) but without the races, and
> > > because for processes outside your and children pid ns, it shouldn't
> > > work anyway, and since they wouldn't have their entry on this procfs
> > > instance, it would all just fit in nicely?
> >
> > Thanks. That'll work. It's a bit magical, but /proc/self/fd is magical
> > anyway, so that's okay.
>
> Please don't do that. /proc/$pid/fd refers to the set of file
> descriptors the process has open, and semantically doesn't have much
> to do with the identity of the process. If you want to have a procfs
> directory entry for getting a pidfd, please add a new entry. (Although
> I don't see the point in adding a new procfs entry for this when you
> could instead have an ioctl or syscall operating on the procfs
> directory fd.)

Very much agreed!
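
For readers following along, here is a rough userspace sketch of what "operating on the procfs directory fd" looks like with interfaces that already exist. It is not the pidctl() interface from this series and not anyone's proposed API. The helper names (open_proc_dir, read_proc_file, still_alive) and the procfs_root parameter are made up for illustration; the only real calls are open(2)/openat(2) as exposed by Python's os module. The idea is that once a /proc/<pid> directory is held open, lookups relative to that fd should fail after the process has exited and been reaped, instead of resolving to some unrelated process that reused the PID.

```python
import os


def open_proc_dir(pid, procfs_root="/proc"):
    """Open <procfs_root>/<pid> as a directory fd.

    procfs_root is a hypothetical parameter: it lets the caller pick
    which procfs mount (and therefore which hidepid policy) applies.
    """
    path = os.path.join(procfs_root, str(pid))
    return os.open(path, os.O_RDONLY | os.O_DIRECTORY)


def read_proc_file(proc_dir_fd, name):
    """Read a file relative to an already-open /proc/<pid> directory fd."""
    fd = os.open(name, os.O_RDONLY, dir_fd=proc_dir_fd)  # openat(2)
    with os.fdopen(fd, "rb") as f:  # fdopen takes ownership of fd
        return f.read()


def still_alive(proc_dir_fd):
    """Liveness check: lookups under the dirfd fail once the task is gone."""
    try:
        read_proc_file(proc_dir_fd, "stat")
        return True
    except OSError:
        return False


if __name__ == "__main__":
    dirfd = open_proc_dir(os.getpid())
    try:
        print(read_proc_file(dirfd, "comm").decode().strip())
        print("alive:", still_alive(dirfd))
    finally:
        os.close(dirfd)
```

Note that the initial open by numeric PID is still racy; the sketch only shows that the directory fd is a stable handle once obtained, which is why the thread is about getting one from a pidfd in the first place.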