Date: Sat, 12 Oct 2019 12:21:20 +0200
From: Christian Brauner
To: jannh@google.com
Cc: aarcange@redhat.com, akpm@linux-foundation.org, christian@kellner.me,
    ckellner@redhat.com, cyphar@cyphar.com, elena.reshetova@intel.com,
    guro@fb.com, ldv@altlinux.org, linux-api@vger.kernel.org,
    linux-kernel@vger.kernel.org, mhocko@suse.com, mingo@kernel.org,
    peterz@infradead.org, tglx@linutronix.de, viro@zeniv.linux.org.uk
Subject: Re: [PATCH] pidfd: add NSpid entries to fdinfo
Message-ID: <20191012102119.qq2adlnxjxrkslca@wittgenstein>
In-Reply-To: <20191012101922.24168-1-christian.brauner@ubuntu.com>

On Sat,
Oct 12, 2019 at 12:19:22PM +0200, Christian Brauner wrote:
> Currently, the fdinfo file of a pidfd contains the field Pid:
> It contains the pid a given pidfd refers to in the pid namespace of the
> opener's procfs instance.
> If the pid namespace of the process is not a descendant of the pid
> namespace of the procfs instance, 0 will be shown as its pid. This is
> similar to calling getppid() on a process whose parent is out of its
> pid namespace (e.g. when moving a process into a sibling pid namespace
> via setns()).
>
> Add an NSpid field for easy retrieval of the pid in all descendant pid
> namespaces:
> If pid namespaces are supported this field will contain the pid a given
> pidfd refers to for all descendant pid namespaces starting from the
> current pid namespace of the opener's procfs instance, i.e. the first
> pid entry for Pid and NSpid will be identical.
> If the pid namespace of the process is not a descendant of the pid
> namespace of the procfs instance, 0 will be shown as its first NSpid and
> no other NSpid entries will be shown.
> Note that this differs from the Pid and NSpid fields in
> /proc//status where Pid and NSpid are always shown relative to the
> pid namespace of the opener's procfs instance. The difference becomes
> obvious when sending around a pidfd between pid namespaces from
> different trees, i.e. where no ancestral relation is present between
> the pid namespaces:
> 1. sending around pidfd:
>    - create two new pid namespaces ns1 and ns2 in the initial pid namespace
>      (Also take care to create new mount namespaces in the new pid
>      namespace and mount procfs.)
>    - create a process with a pidfd in ns1
>    - send pidfd from ns1 to ns2
>    - read /proc/self/fdinfo/ and observe that the Pid and NSpid entries
>      are 0
>    - create a process with a pidfd in
>    - open a pidfd for a process in the initial pid namespace
> 2. sending around /proc//status fd:
>    - create two new pid namespaces ns1 and ns2 in the initial pid namespace
>      (Also take care to create new mount namespaces in the new pid
>      namespace and mount procfs.)
>    - create a process in ns1
>    - open /proc//status in the initial pid namespace for the process
>      you created in ns1
>    - send statusfd from initial pid namespace to ns2
>    - read statusfd and observe:
>      - that Pid will contain the pid of the process as seen from the init
>      - that NSpid will contain the pids of the process for all descendant
>        pid namespaces starting from the initial pid namespace
>
> Cc: Jann Horn
> Cc: linux-api@vger.kernel.org
> Co-Developed-by: Christian Kellner
> Signed-off-by: Christian Kellner
> Signed-off-by: Christian Brauner

I think this might be more what we want. I tried to think of cases where
the first entry of Pid is not identical to the first entry of NSpid, but I
came up with none. Maybe you can, Jann?

Christian, this is just a quick stab I took. Feel free to pick this up as
a template.

Thanks!
Christian

> ---
>  kernel/fork.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 72 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 1f6c45f6a734..b155bad92d9c 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1695,12 +1695,83 @@ static int pidfd_release(struct inode *inode, struct file *file)
>  }
>
>  #ifdef CONFIG_PROC_FS
> +/**
> + * pidfd_show_fdinfo - print information about a pidfd
> + * @m: proc fdinfo file
> + * @f: file referencing a pidfd
> + *
> + * Pid:
> + * This function will print the pid a given pidfd refers to in the pid
> + * namespace of the opener's procfs instance.
> + * If the pid namespace of the process is not a descendant of the pid
> + * namespace of the procfs instance, 0 will be shown as its pid. This is
> + * similar to calling getppid() on a process whose parent is out of its
> + * pid namespace (e.g. when moving a process into a sibling pid namespace
> + * via setns()).
> + *
> + * NSpid:
> + * If pid namespaces are supported then this function will also print the
> + * pid a given pidfd refers to for all descendant pid namespaces starting
> + * from the current pid namespace of the opener's procfs instance, i.e. the
> + * first pid entry for Pid and NSpid will be identical.
> + * If the pid namespace of the process is not a descendant of the pid
> + * namespace of the procfs instance, 0 will be shown as its first NSpid and
> + * no other NSpid entries will be shown.
> + * Note that this differs from the Pid and NSpid fields in
> + * /proc//status where Pid and NSpid are always shown relative to the
> + * pid namespace of the opener's procfs instance. The difference becomes
> + * obvious when sending around a pidfd between pid namespaces from
> + * different trees, i.e. where no ancestral relation is present between
> + * the pid namespaces:
> + * 1. sending around pidfd:
> + *    - create two new pid namespaces ns1 and ns2 in the initial pid namespace
> + *      (Also take care to create new mount namespaces in the new pid
> + *      namespace and mount procfs.)
> + *    - create a process with a pidfd in ns1
> + *    - send pidfd from ns1 to ns2
> + *    - read /proc/self/fdinfo/ and observe that the Pid and NSpid entries
> + *      are 0
> + *    - create a process with a pidfd in
> + *    - open a pidfd for a process in the initial pid namespace
> + * 2. sending around /proc//status fd:
> + *    - create two new pid namespaces ns1 and ns2 in the initial pid namespace
> + *      (Also take care to create new mount namespaces in the new pid
> + *      namespace and mount procfs.)
> + *    - create a process in ns1
> + *    - open /proc//status in the initial pid namespace for the process
> + *      you created in ns1
> + *    - send statusfd from initial pid namespace to ns2
> + *    - read statusfd and observe:
> + *      - that Pid will contain the pid of the process as seen from the init
> + *      - that NSpid will contain the pids of the process for all descendant
> + *        pid namespaces starting from the initial pid namespace
> + */
>  static void pidfd_show_fdinfo(struct seq_file *m, struct file *f)
>  {
>  	struct pid_namespace *ns = proc_pid_ns(file_inode(m->file));
>  	struct pid *pid = f->private_data;
> +	pid_t nr = pid_nr_ns(pid, ns);
> +
> +	seq_put_decimal_ull(m, "Pid:\t", nr);
> +
> +#ifdef CONFIG_PID_NS
> +	seq_puts(m, "\nNSpid:");
> +	if (nr == 0) {
> +		/*
> +		 * If nr is zero the pid namespace of the procfs and the
> +		 * pid namespace of the pidfd are neither the same pid
> +		 * namespace nor are they ancestors. Since NSpid and Pid
> +		 * are always identical in their first entry shortcut it
> +		 * and simply print 0.
> +		 */
> +		seq_put_decimal_ull(m, "\t", nr);
> +	} else {
> +		int i;
> +		for (i = ns->level; i <= pid->level; i++)
> +			seq_put_decimal_ull(m, "\t", pid_nr_ns(pid, pid->numbers[i].ns));
> +	}
> +#endif
>
> -	seq_put_decimal_ull(m, "Pid:\t", pid_nr_ns(pid, ns));
>  	seq_putc(m, '\n');
>  }
>  #endif
> --
> 2.23.0
>
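
For anyone who wants to poke at the output from userspace, here is a rough
sketch of how the two fields could be consumed. It is not part of the patch.
It assumes the layout the patch prints, i.e. a "Pid:" line followed by an
"NSpid:" line of tab-separated entries. The names parse_pidfd_fdinfo(),
struct pidfd_info and MAX_NSPID are made up for this example, and the
256-byte line buffer is an arbitrary choice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Arbitrary cap for this sketch, not a kernel constant. */
#define MAX_NSPID 32

/* Hypothetical container for the two fields discussed in this thread. */
struct pidfd_info {
	long pid;               /* value of "Pid:", -1 if the field is absent */
	long nspid[MAX_NSPID];  /* entries of "NSpid:", outermost first */
	size_t nspid_count;     /* number of valid entries in nspid[] */
};

/*
 * Parse the text of an fdinfo file held in a NUL-terminated buffer.
 * Returns 0 if a "Pid:" line with a number was found, -1 otherwise.
 * Lines that are neither "Pid:" nor "NSpid:" are ignored.
 */
static int parse_pidfd_fdinfo(const char *text, struct pidfd_info *out)
{
	const char *line = text;
	int found = 0;

	assert(text && out);
	out->pid = -1;
	out->nspid_count = 0;

	while (*line) {
		size_t len = strcspn(line, "\n");
		char buf[256];

		/* Work on a private copy so strtol() cannot run into the next line. */
		if (len < sizeof(buf)) {
			memcpy(buf, line, len);
			buf[len] = '\0';

			if (strncmp(buf, "Pid:", 4) == 0) {
				char *end;
				long v = strtol(buf + 4, &end, 10);

				if (end != buf + 4) {
					out->pid = v;
					found = 1;
				}
			} else if (strncmp(buf, "NSpid:", 6) == 0) {
				char *p = buf + 6;

				while (out->nspid_count < MAX_NSPID) {
					char *next;
					long v = strtol(p, &next, 10);

					if (next == p)
						break;	/* no more numbers on this line */
					out->nspid[out->nspid_count++] = v;
					p = next;
				}
			}
		}

		line += len;
		if (*line == '\n')
			line++;
	}

	return found ? 0 : -1;
}
```

A caller would read /proc/self/fdinfo/<fd> for the pidfd into a buffer and
hand it to the parser. With this patch, a Pid of 0 together with a single
NSpid entry of 0 would indicate that the process is not visible from the pid
namespace of the procfs instance.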