From: Christian Brauner
To: torvalds@linux-foundation.org, viro@zeniv.linux.org.uk, jannh@google.com,
    dhowells@redhat.com, linux-api@vger.kernel.org,
    linux-kernel@vger.kernel.org
Cc: serge@hallyn.com, luto@kernel.org, arnd@arndb.de, ebiederm@xmission.com,
    keescook@chromium.org, tglx@linutronix.de, mtk.manpages@gmail.com,
    akpm@linux-foundation.org, oleg@redhat.com, cyphar@cyphar.com,
    joel@joelfernandes.org, dancol@google.com, Christian Brauner
Subject: [PATCH 4/4] samples: show race-free pidfd metadata access
Date: Sun, 14 Apr 2019 22:14:36 +0200
Message-Id: <20190414201436.19502-5-christian@brauner.io>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20190414201436.19502-1-christian@brauner.io>
References: <20190414201436.19502-1-christian@brauner.io>
X-Mailing-List: linux-kernel@vger.kernel.org

This is a sample program showing userspace how to get race-free access
to process metadata from a pidfd. It is rather easy to do and userspace
can actually simply reuse code that currently parses a process's status
file in procfs.
The program can easily be extended into a generic helper suitable for
inclusion in a libc to make it even easier for userspace to gain metadata
access.

Signed-off-by: Christian Brauner
Signed-off-by: Jann Horn
Cc: Arnd Bergmann
Cc: "Eric W. Biederman"
Cc: Kees Cook
Cc: Thomas Gleixner
Cc: David Howells
Cc: "Michael Kerrisk (man-pages)"
Cc: Andy Lutomirski
Cc: Andrew Morton
Cc: Oleg Nesterov
Cc: Aleksa Sarai
Cc: Linus Torvalds
Cc: Al Viro
---
 samples/Makefile               |   2 +-
 samples/pidfd/Makefile         |   6 ++
 samples/pidfd/pidfd-metadata.c | 172 +++++++++++++++++++++++++++++++++
 3 files changed, 179 insertions(+), 1 deletion(-)
 create mode 100644 samples/pidfd/Makefile
 create mode 100644 samples/pidfd/pidfd-metadata.c

diff --git a/samples/Makefile b/samples/Makefile
index b1142a958811..fadadb1c3b05 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -3,4 +3,4 @@
 obj-$(CONFIG_SAMPLES)	+= kobject/ kprobes/ trace_events/ livepatch/ \
 			   hw_breakpoint/ kfifo/ kdb/ hidraw/ rpmsg/ seccomp/ \
 			   configfs/ connector/ v4l/ trace_printk/ \
-			   vfio-mdev/ statx/ qmi/ binderfs/
+			   vfio-mdev/ statx/ qmi/ binderfs/ pidfd/
diff --git a/samples/pidfd/Makefile b/samples/pidfd/Makefile
new file mode 100644
index 000000000000..0ff97784177a
--- /dev/null
+++ b/samples/pidfd/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0
+
+hostprogs-y := pidfd-metadata
+always := $(hostprogs-y)
+HOSTCFLAGS_pidfd-metadata.o += -I$(objtree)/usr/include
+all: pidfd-metadata
diff --git a/samples/pidfd/pidfd-metadata.c b/samples/pidfd/pidfd-metadata.c
new file mode 100644
index 000000000000..23a44e582ccb
--- /dev/null
+++ b/samples/pidfd/pidfd-metadata.c
@@ -0,0 +1,172 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define _GNU_SOURCE
+#include <err.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <limits.h>
+#include <sched.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <sys/syscall.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <unistd.h>
+
+#ifndef CLONE_PIDFD
+#define CLONE_PIDFD 0x00001000
+#endif
+
+static int raw_clone_pidfd(void)
+{
+	unsigned long flags = CLONE_PIDFD | SIGCHLD;
+
+#if defined(__s390x__) || defined(__s390__) || defined(__CRIS__)
+	/*
+	 * On s390/s390x and cris the order of the first and second arguments
+	 * of the system call is reversed.
+	 */
+	return (int)syscall(__NR_clone, NULL, flags);
+#elif defined(__sparc__) && defined(__arch64__)
+	{
+		/*
+		 * sparc64 always returns the other process id in %o0, and a
+		 * boolean flag whether this is the child or the parent in %o1.
+		 * Inline assembly is needed to get the flag returned in %o1.
+		 */
+		int child_pid, in_child;
+
+		asm volatile("mov %2, %%g1\n\t"
+			     "mov %3, %%o0\n\t"
+			     "mov 0 , %%o1\n\t"
+			     "t 0x6d\n\t"
+			     "mov %%o1, %0\n\t"
+			     "mov %%o0, %1"
+			     : "=r"(in_child), "=r"(child_pid)
+			     : "i"(__NR_clone), "r"(flags)
+			     : "%o1", "%o0", "%g1");
+
+		if (in_child)
+			return 0;
+		else
+			return child_pid;
+	}
+#elif defined(__ia64__)
+	/* On ia64 stack and stack size are passed as separate arguments. */
+	return (int)syscall(__NR_clone, flags, NULL, 0UL);
+#else
+	return (int)syscall(__NR_clone, flags, NULL);
+#endif
+}
+
+static inline int sys_pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
+					unsigned int flags)
+{
+	return syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
+}
+
+static int pidfd_metadata_fd(int pidfd)
+{
+	int procfd, ret;
+	char path[100];
+	FILE *f;
+	size_t n = 0;
+	char *line = NULL;
+
+	snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", pidfd);
+
+	f = fopen(path, "re");
+	if (!f)
+		return -1;
+
+	ret = 0;
+	while (getline(&line, &n, f) != -1) {
+		char *numstr;
+		size_t len;
+
+		if (strncmp(line, "Pid:\t", 5))
+			continue;
+
+		numstr = line + 5;
+		len = strlen(numstr);
+		if (len > 0 && numstr[len - 1] == '\n')
+			numstr[len - 1] = '\0';
+		ret = snprintf(path, sizeof(path), "/proc/%s", numstr);
+		break;
+	}
+	free(line);
+	fclose(f);
+
+	if (!ret) {
+		errno = ENOENT;
+		warn("Failed to parse pid from fdinfo\n");
+		return -1;
+	}
+
+	procfd = open(path, O_DIRECTORY | O_RDONLY | O_CLOEXEC);
+	if (procfd < 0) {
+		warn("Failed to open %s\n", path);
+		return -1;
+	}
+
+	/*
+	 * Verify that the pid has not been recycled and our /proc/<pid> handle
+	 * is still valid.
+	 */
+	ret = sys_pidfd_send_signal(pidfd, 0, NULL, 0);
+	if (ret < 0) {
+		switch (errno) {
+		case EPERM:
+			/* Process exists, just not allowed to signal it. */
+			break;
+		default:
+			warn("Failed to signal process\n");
+			close(procfd);
+			procfd = -1;
+		}
+	}
+
+	return procfd;
+}
+
+int main(int argc, char *argv[])
+{
+	int ret = EXIT_FAILURE;
+	char buf[4096] = { 0 };
+	int pidfd, procfd, statusfd;
+	ssize_t bytes;
+
+	pidfd = raw_clone_pidfd();
+	if (pidfd < 0)
+		exit(ret);
+
+	if (pidfd == 0) {
+		printf("%d\n", getpid());
+		exit(EXIT_SUCCESS);
+	}
+
+	procfd = pidfd_metadata_fd(pidfd);
+	close(pidfd);
+	if (procfd < 0)
+		goto out;
+
+	statusfd = openat(procfd, "status", O_RDONLY | O_CLOEXEC);
+	close(procfd);
+	if (statusfd < 0)
+		goto out;
+
+	bytes = read(statusfd, buf, sizeof(buf));
+	if (bytes > 0)
+		bytes = write(STDOUT_FILENO, buf, bytes);
+	close(statusfd);
+	ret = EXIT_SUCCESS;
+
+out:
+	(void)wait(NULL);
+
+	exit(ret);
+}
-- 
2.21.0
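
The commit message suggests the sample could grow into a generic helper.
As a rough illustration of one piece of that, the fdinfo parsing loop in
pidfd_metadata_fd() could be split out so that it works on any stream and
returns the pid as a number. This is only a sketch under assumptions: the
name pidfd_parse_fdinfo_pid() is hypothetical and not part of the patch,
and rejecting non-positive values is a choice made here, not something the
patch does.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not part of the patch: scan an fdinfo-style stream
 * for the "Pid:\t" line and convert the field to a number.
 *
 * Returns 0 and stores the pid in *pid on success. Returns -1 with errno
 * set to ENOENT if no usable "Pid:" line is found.
 */
static int pidfd_parse_fdinfo_pid(FILE *f, pid_t *pid)
{
	char *line = NULL;
	size_t n = 0;
	int ret = -1;

	assert(f != NULL && pid != NULL);

	while (getline(&line, &n, f) != -1) {
		char *end;
		long val;

		/* Same prefix match as the sample program uses. */
		if (strncmp(line, "Pid:\t", 5))
			continue;

		errno = 0;
		val = strtol(line + 5, &end, 10);
		/*
		 * Assumption of this sketch: treat an empty field, trailing
		 * garbage, or a value outside (0, INT_MAX] as a parse failure.
		 */
		if (errno || end == line + 5 ||
		    (*end != '\n' && *end != '\0') ||
		    val <= 0 || val > INT_MAX)
			break;

		*pid = (pid_t)val;
		ret = 0;
		break;
	}
	free(line);

	if (ret < 0)
		errno = ENOENT;
	return ret;
}
```

A caller would open /proc/self/fdinfo/<pidfd> with fopen() as the sample
does, pass the stream to this function, build the /proc/<pid> path from the
result, and still verify the handle with pidfd_send_signal() afterwards,
since parsing alone does not protect against pid recycling.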