From: Christian Brauner
To: jannh@google.com, khlebnikov@yandex-team.ru, luto@kernel.org,
    dhowells@redhat.com, serge@hallyn.com, ebiederm@xmission.com,
    linux-api@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: arnd@arndb.de, keescook@chromium.org, adobriyan@gmail.com,
    tglx@linutronix.de, mtk.manpages@gmail.com, bl0pbl33p@gmail.com,
    ldv@altlinux.org, akpm@linux-foundation.org, oleg@redhat.com,
    nagarathnam.muthusamy@oracle.com, cyphar@cyphar.com,
    viro@zeniv.linux.org.uk, joel@joelfernandes.org, dancol@google.com,
    Christian Brauner
Subject: [PATCH 0/4] pid: add pidctl()
Date: Mon, 25 Mar 2019 17:20:48 +0100
Message-Id: <20190325162052.28987-1-christian@brauner.io>
X-Mailer: git-send-email 2.21.0
X-Mailing-List: linux-kernel@vger.kernel.org

The pidctl() syscall builds on, extends, and improves translate_pid() [4].
I refer to Konstantin's original patchset first, which has already been
acked and picked up by Eric before and whose functionality is preserved in
this syscall. Multiple people have asked when this patchset will be sent in
for merging (cf. [1], [2]). It has recently been revived by Nagarathnam
Muthusamy from Oracle [3].

The intention of the original translate_pid() syscall was twofold:

1. Provide translation of pids between pid namespaces
2. Provide implicit pid namespace introspection

Both functionalities are preserved. The latter has been improved upon,
though. In the original version of the patchset, passing 1 as the pid
allowed the caller to determine the relationship between two pid
namespaces. This is inherently racy: if pid 1 inside the target pid
namespace has already died, the lookup cannot find the pid in the target
pid namespace. The syscall then reports that the target pid namespace
cannot be reached from the source pid namespace and thus falsely tells the
user that the two pid namespaces are not related.

This problem is simple to avoid. In the new version we simply walk the list
of ancestors and check whether the namespaces are related to each other.
This way we can reliably report what the relationship between two pid
namespace file descriptors looks like.

Additionally, this syscall has been extended to allow the retrieval of
pidfds independent of procfs. These pidfds can e.g. be used with the new
pidfd_send_signal() syscall we recently merged. The ability to retrieve
pidfds independent of procfs had already been requested in the
pidfd_send_signal patchset by e.g. Andrew [4] and later again by Alexey [5].
A use-case where a kernel is compiled without procfs but where pidfds are
still useful has been outlined by Andy in [6]. Regular anon-inode based file
descriptors are used that stash a reference to struct pid in
file->private_data and drop that reference on close.

With this, translate_pid() has three closely related but still distinct
functionalities.
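
For illustration only, the ancestor walk mentioned above for pid namespace
introspection can be sketched in plain userspace C. This is not the code
from the patch. struct pidns, pidns_is_ancestor(), pidns_relation() and the
PIDNS_* values are hypothetical names, loosely modeled on the level and
parent fields of the kernel's struct pid_namespace. The point is that the
answer depends only on the namespace hierarchy, not on whether pid 1 in the
target namespace is still alive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a pid namespace (not kernel code). */
struct pidns {
	unsigned int level;	/* depth in the hierarchy, root is 0 */
	struct pidns *parent;	/* NULL for the root namespace */
};

/* Hypothetical result codes, not the values used by the patch. */
enum pidns_relation {
	PIDNS_UNRELATED,
	PIDNS_EQUAL,
	PIDNS_SOURCE_IS_ANCESTOR,
	PIDNS_TARGET_IS_ANCESTOR,
};

/* Return true if 'ancestor' is 'ns' itself or one of its ancestors. */
static bool pidns_is_ancestor(const struct pidns *ns,
			      const struct pidns *ancestor)
{
	assert(ns && ancestor);

	/* An ancestor is never deeper than its descendant, so walk up
	 * the parent chain until both are at the same level. */
	while (ns && ns->level > ancestor->level)
		ns = ns->parent;

	return ns == ancestor;
}

/* Classify the relationship purely from the hierarchy. */
static enum pidns_relation pidns_relation(const struct pidns *source,
					  const struct pidns *target)
{
	if (source == target)
		return PIDNS_EQUAL;
	if (pidns_is_ancestor(target, source))
		return PIDNS_SOURCE_IS_ANCESTOR;
	if (pidns_is_ancestor(source, target))
		return PIDNS_TARGET_IS_ANCESTOR;
	return PIDNS_UNRELATED;
}
```

Two sibling namespaces that share a parent come out as PIDNS_UNRELATED in
this sketch, while a namespace and its grandchild come out as related, no
matter which processes are alive in either of them.
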

To clarify the semantics and to make it easier for userspace to use the
syscall it has:
- gained a command argument and three commands clearly reflecting the
  distinct functionalities (PIDCMD_QUERY_PID, PIDCMD_QUERY_PIDNS,
  PIDCMD_GET_PIDFD).
- been renamed to pidctl()

By gaining support for cleanly retrieving pidfds this syscall connects the
traditional pid-based and the newer pidfd-based process API in a natural
and clean way. Another advantage is that embedding this functionality into
pidctl() lets us avoid adding another syscall that serves the single
purpose of retrieving a pidfd. The flag argument allows the close-on-exec
flag to be set atomically when retrieving pidfds.

Note that this patchset also includes Al's and David's commit to make anon
inodes unconditional. The original intention is to make it possible to use
anon inodes in core vfs functions. pidctl() has the same requirement, so
David suggested I send this in alongside this patch. Both are informed of
this.

The syscall comes with extensive testing for all functionalities.

/* References */
[1]: https://lore.kernel.org/lkml/37b17950-b130-7933-99a1-4846c61c8555@oracle.com/
[2]: https://lore.kernel.org/lkml/20181109034919.GA21681@altlinux.org/
[3]: https://lore.kernel.org/lkml/37b17950-b130-7933-99a1-4846c61c8555@oracle.com/
[4]: 3eb39f47934f9d5a3027fe00d906a45fe3a15fad
[5]: https://lore.kernel.org/lkml/20190320203910.GA2842@avx2/
[6]: https://lore.kernel.org/lkml/CALCETrXO=V=+qEdLDVPf8eCgLZiB9bOTrUfe0V-U-tUZoeoRDA@mail.gmail.com/

Thanks!
Christian

Christian Brauner (3):
  pid: add pidctl()
  signal: support pidctl() with pidfd_send_signal()
  tests: add pidctl() tests

David Howells (1):
  Make anon_inodes unconditional

 arch/arm/kvm/Kconfig                        |   1 -
 arch/arm64/kvm/Kconfig                      |   1 -
 arch/mips/kvm/Kconfig                       |   1 -
 arch/powerpc/kvm/Kconfig                    |   1 -
 arch/s390/kvm/Kconfig                       |   1 -
 arch/x86/Kconfig                            |   1 -
 arch/x86/entry/syscalls/syscall_32.tbl      |   1 +
 arch/x86/entry/syscalls/syscall_64.tbl      |   1 +
 arch/x86/kvm/Kconfig                        |   1 -
 drivers/base/Kconfig                        |   1 -
 drivers/char/tpm/Kconfig                    |   1 -
 drivers/dma-buf/Kconfig                     |   1 -
 drivers/gpio/Kconfig                        |   1 -
 drivers/iio/Kconfig                         |   1 -
 drivers/infiniband/Kconfig                  |   1 -
 drivers/vfio/Kconfig                        |   1 -
 fs/Makefile                                 |   2 +-
 fs/notify/fanotify/Kconfig                  |   1 -
 fs/notify/inotify/Kconfig                   |   1 -
 include/linux/pid.h                         |   2 +
 include/linux/pid_namespace.h               |   8 +
 include/linux/syscalls.h                    |   2 +
 include/uapi/linux/wait.h                   |  17 +
 init/Kconfig                                |  10 -
 kernel/pid.c                                | 162 ++++++
 kernel/pid_namespace.c                      |  25 +
 kernel/signal.c                             |  20 +-
 kernel/sys_ni.c                             |   3 -
 tools/testing/selftests/pidfd/Makefile      |   2 +-
 tools/testing/selftests/pidfd/pidctl_test.c | 553 ++++++++++++++++++++
 30 files changed, 782 insertions(+), 42 deletions(-)
 create mode 100644 tools/testing/selftests/pidfd/pidctl_test.c

-- 
2.21.0