From: Christian Brauner
To: jannh@google.com
Cc: aarcange@redhat.com, akpm@linux-foundation.org,
    christian.brauner@ubuntu.com, christian@kellner.me, ckellner@redhat.com,
    cyphar@cyphar.com, elena.reshetova@intel.com, guro@fb.com,
    ldv@altlinux.org, linux-api@vger.kernel.org,
    linux-kernel@vger.kernel.org, mhocko@suse.com, mingo@kernel.org,
    peterz@infradead.org, tglx@linutronix.de, viro@zeniv.linux.org.uk
Subject: [PATCH] pidfd: add NSpid entries to fdinfo
Date: Sat, 12 Oct 2019 12:19:22 +0200
Message-Id: <20191012101922.24168-1-christian.brauner@ubuntu.com>
X-Mailer: git-send-email 2.23.0
X-Mailing-List: linux-kernel@vger.kernel.org

Currently, the fdinfo file of a pidfd contains the field

Pid:

It contains the pid a given pidfd refers to in the pid
namespace of the opener's procfs instance.
If the pid namespace of the process is not a descendant of the pid
namespace of the procfs instance, 0 will be shown as its pid. This is
similar to calling getppid() on a process whose parent is out of its
pid namespace (e.g. when moving a process into a sibling pid namespace
via setns()).

Add an NSpid field for easy retrieval of the pid in all descendant pid
namespaces.

If pid namespaces are supported, this field will contain the pid a given
pidfd refers to for all descendant pid namespaces starting from the
current pid namespace of the opener's procfs instance, i.e. the first pid
entry for Pid and NSpid will be identical.
If the pid namespace of the process is not a descendant of the pid
namespace of the procfs instance, 0 will be shown as its first NSpid and
no other NSpid entries will be shown.
Note that this differs from the Pid and NSpid fields in /proc//status,
where Pid and NSpid are always shown relative to the pid namespace of the
opener's procfs instance. The difference becomes obvious when sending
around a pidfd between pid namespaces from different trees, i.e. where no
ancestral relation is present between the pid namespaces:

1. sending around pidfd:
   - create two new pid namespaces ns1 and ns2 in the initial pid namespace
     (Also take care to create new mount namespaces in the new pid
     namespace and mount procfs.)
   - create a process with a pidfd in ns1
   - send pidfd from ns1 to ns2
   - read /proc/self/fdinfo/ and observe that Pid and NSpid entries are 0
   - create a process with a pidfd in
   - open a pidfd for a process in the initial pid namespace
2. sending around /proc//status fd:
   - create two new pid namespaces ns1 and ns2 in the initial pid namespace
     (Also take care to create new mount namespaces in the new pid
     namespace and mount procfs.)
   - create a process in ns1
   - open /proc//status in the initial pid namespace for the process you
     created in ns1
   - send statusfd from initial pid namespace to ns2
   - read statusfd and observe:
     - that Pid will contain the pid of the process as seen from the init
     - that NSpid will contain the pids of the process for all descendant
       pid namespaces starting from the initial pid namespace

Cc: Jann Horn
Cc: linux-api@vger.kernel.org
Co-Developed-by: Christian Kellner
Signed-off-by: Christian Kellner
Signed-off-by: Christian Brauner
---
 kernel/fork.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 72 insertions(+), 1 deletion(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 1f6c45f6a734..b155bad92d9c 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1695,12 +1695,83 @@ static int pidfd_release(struct inode *inode, struct file *file)
 }
 
 #ifdef CONFIG_PROC_FS
+/**
+ * pidfd_show_fdinfo - print information about a pidfd
+ * @m: proc fdinfo file
+ * @f: file referencing a pidfd
+ *
+ * Pid:
+ * This function will print the pid a given pidfd refers to in the pid
+ * namespace of the opener's procfs instance.
+ * If the pid namespace of the process is not a descendant of the pid
+ * namespace of the procfs instance, 0 will be shown as its pid. This is
+ * similar to calling getppid() on a process whose parent is out of its
+ * pid namespace (e.g. when moving a process into a sibling pid namespace
+ * via setns()).
+ *
+ * NSpid:
+ * If pid namespaces are supported then this function will also print the
+ * pid a given pidfd refers to for all descendant pid namespaces starting
+ * from the current pid namespace of the opener's procfs instance, i.e. the
+ * first pid entry for Pid and NSpid will be identical.
+ * If the pid namespace of the process is not a descendant of the pid
+ * namespace of the procfs instance, 0 will be shown as its first NSpid and
+ * no other NSpid entries will be shown.
+ * Note that this differs from the Pid and NSpid fields in
+ * /proc//status where Pid and NSpid are always shown relative to the
+ * pid namespace of the opener's procfs instance. The difference becomes
+ * obvious when sending around a pidfd between pid namespaces from
+ * different trees, i.e. where no ancestral relation is present between
+ * the pid namespaces:
+ * 1. sending around pidfd:
+ *    - create two new pid namespaces ns1 and ns2 in the initial pid namespace
+ *      (Also take care to create new mount namespaces in the new pid
+ *      namespace and mount procfs.)
+ *    - create a process with a pidfd in ns1
+ *    - send pidfd from ns1 to ns2
+ *    - read /proc/self/fdinfo/ and observe that Pid and NSpid entries
+ *      are 0
+ *    - create a process with a pidfd in
+ *    - open a pidfd for a process in the initial pid namespace
+ * 2. sending around /proc//status fd:
+ *    - create two new pid namespaces ns1 and ns2 in the initial pid namespace
+ *      (Also take care to create new mount namespaces in the new pid
+ *      namespace and mount procfs.)
+ *    - create a process in ns1
+ *    - open /proc//status in the initial pid namespace for the process
+ *      you created in ns1
+ *    - send statusfd from initial pid namespace to ns2
+ *    - read statusfd and observe:
+ *      - that Pid will contain the pid of the process as seen from the init
+ *      - that NSpid will contain the pids of the process for all descendant
+ *        pid namespaces starting from the initial pid namespace
+ */
 static void pidfd_show_fdinfo(struct seq_file *m, struct file *f)
 {
 	struct pid_namespace *ns = proc_pid_ns(file_inode(m->file));
 	struct pid *pid = f->private_data;
+	pid_t nr = pid_nr_ns(pid, ns);
+
+	seq_put_decimal_ull(m, "Pid:\t", nr);
+
+#ifdef CONFIG_PID_NS
+	seq_puts(m, "\nNSpid:");
+	if (nr == 0) {
+		/*
+		 * If nr is zero the pid namespace of the procfs and the
+		 * pid namespace of the pidfd are neither the same pid
+		 * namespace nor are they ancestors. Since NSpid and Pid
+		 * are always identical in their first entry shortcut it
+		 * and simply print 0.
+		 */
+		seq_put_decimal_ull(m, "\t", nr);
+	} else {
+		int i;
+		for (i = ns->level; i <= pid->level; i++)
+			seq_put_decimal_ull(m, "\t", pid_nr_ns(pid, pid->numbers[i].ns));
+	}
+#endif
 
-	seq_put_decimal_ull(m, "Pid:\t", pid_nr_ns(pid, ns));
 	seq_putc(m, '\n');
 }
 #endif
-- 
2.23.0
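
For anyone who wants to consume the new fields from userspace, here is a
rough sketch of a parser for the fdinfo text. It is not part of the patch.
The names parse_pidfd_fdinfo, struct pidfd_info and MAX_NSPIDS are made up
for illustration, and the sketch assumes the layout emitted above: a
"Pid:" line with one number and an "NSpid:" line with one or more
tab-separated numbers. Other fdinfo lines are skipped.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

#define MAX_NSPIDS 32 /* assumption: arbitrary cap chosen for this sketch */

/* Hypothetical container for the parsed fields. */
struct pidfd_info {
	pid_t pid;               /* value of "Pid:", -1 if the field is absent */
	pid_t nspid[MAX_NSPIDS]; /* values of "NSpid:", in the order printed */
	int nr_nspid;            /* number of NSpid entries parsed */
};

/*
 * Parse fdinfo text such as "Pid:\t1234\nNSpid:\t1234\t1\n".
 * Returns 0 if a Pid field was found, -1 otherwise.
 */
static int parse_pidfd_fdinfo(const char *text, struct pidfd_info *info)
{
	char *copy, *line, *save_line;

	assert(text && info);

	info->pid = -1;
	info->nr_nspid = 0;

	/* strtok_r() modifies its input, so work on a private copy. */
	copy = strdup(text);
	if (!copy)
		return -1;

	for (line = strtok_r(copy, "\n", &save_line); line;
	     line = strtok_r(NULL, "\n", &save_line)) {
		if (!strncmp(line, "Pid:", 4)) {
			info->pid = (pid_t)strtol(line + 4, NULL, 10);
		} else if (!strncmp(line, "NSpid:", 6)) {
			char *tok, *save_tok;

			/* One entry per tab-separated number. */
			for (tok = strtok_r(line + 6, " \t", &save_tok);
			     tok && info->nr_nspid < MAX_NSPIDS;
			     tok = strtok_r(NULL, " \t", &save_tok))
				info->nspid[info->nr_nspid++] =
					(pid_t)strtol(tok, NULL, 10);
		}
	}

	free(copy);
	return info->pid < 0 ? -1 : 0;
}
```

A caller would read the fdinfo file for its pidfd into a buffer and pass
the buffer to the function. With this patch, a Pid of 0 together with a
single NSpid entry of 0 means the process is not visible from the pid
namespace of the procfs instance.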