2023-04-11 10:47:12

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: [PATCH net-next v3 0/4] Add SCM_PIDFD and SO_PEERPIDFD

1. Implement SCM_PIDFD, a new type of CMSG type analogical to SCM_CREDENTIALS,
but it contains pidfd instead of plain pid, which allows programmers not
to care about PID reuse problem.

2. Add SO_PEERPIDFD which allows to get pidfd of peer socket holder pidfd.
This thing is direct analog of SO_PEERCRED which allows to get plain PID.

3. Add SCM_PIDFD / SO_PEERPIDFD kselftest

Idea comes from UAPI kernel group:
https://uapi-group.org/kernel-features/

Big thanks to Christian Brauner and Lennart Poettering for productive
discussions about this and Luca Boccassi for testing and reviewing this.

=== Motivation behind this patchset

Eric Dumazet raised a question:
> It seems that we already can use pidfd_open() (since linux-5.3), and
> pass the resulting fd in af_unix SCM_RIGHTS message ?

Yes, it's possible, but it means that from the receiver side we need
to trust the sent pidfd (in SCM_RIGHTS),
or always use combination of SCM_RIGHTS+SCM_CREDENTIALS, then we can
extract pidfd from SCM_RIGHTS,
then acquire plain pid from pidfd and after compare it with the pid
from SCM_CREDENTIALS.

A few comments from other folks regarding this.

Christian Brauner wrote:

>Let me try and provide some of the missing background.

>There are a range of use-cases where we would like to authenticate a
>client through sockets without being susceptible to PID recycling
>attacks. Currently, we can't do this as the race isn't fully fixable.
>We can only apply mitigations.

>What this patchset will allows us to do is to get a pidfd without the
>client having to send us an fd explicitly via SCM_RIGHTS. As that's
>already possibly as you correctly point out.

>But for protocols like polkit this is quite important. Every message is
>standalone and we would need to force a complete protocol change where
>we would need to require that every client allocate and send a pidfd via
>SCM_RIGHTS. That would also mean patching through all polkit users.

>For something like systemd-journald where we provide logging facilities
>and want to add metadata to the log we would also immensely benefit from
>being able to get a receiver-side controlled pidfd.

>With the message type we envisioned we don't need to change the sender
>at all and can be safe against pid recycling.

>Link: https://gitlab.freedesktop.org/polkit/polkit/-/merge_requests/154
>Link: https://uapi-group.org/kernel-features

Lennart Poettering wrote:

>So yes, this is of course possible, but it would mean the pidfd would
>have to be transported as part of the user protocol, explicitly sent
>by the sender. (Moreover, the receiver after receiving the pidfd would
>then still have to somehow be able to prove that the pidfd it just
>received actually refers to the peer's process and not some random
>process. – this part is actually solvable in userspace, but ugly)

>The big thing is simply that we want that the pidfd is associated
>*implicity* with each AF_UNIX connection, not explicitly. A lot of
>userspace already relies on this, both in the authentication area
>(polkit) as well as in the logging area (systemd-journald). Right now
>using the PID field from SO_PEERCREDS/SCM_CREDENTIALS is racy though
>and very hard to get right. Making this available as pidfd too, would
>solve this raciness, without otherwise changing semantics of it all:
>receivers can still enable the creds stuff as they wish, and the data
>is then implicitly appended to the connections/datagrams the sender
>initiates.

>Or to turn this around: things like polkit are typically used to
>authenticate arbitrary dbus methods calls: some service implements a
>dbus method call, and when an unprivileged client then issues that
>call, it will take the client's info, go to polkit and ask it if this
>is ok. If we wanted to send the pidfd as part of the protocol we
>basically would have to extend every single method call to contain the
>client's pidfd along with it as an additional argument, which would be
>a massive undertaking: it would change the prototypes of basically
>*all* methods a service defines… And that's just ugly.

>Note that Alex' patch set doesn't expose anything that wasn't exposed
>before, or attach, propagate what wasn't before. All it does, is make
>the field already available anyway (the struct ucred .pid field)
>available also in a better way (as a pidfd), to solve a variety of
>races, with no effect on the protocol actually spoken within the
>AF_UNIX transport. It's a seamless improvement of the status quo.

===

This patch series is on top of net-next tree with pidfd.file.api.v6.4
tag (from git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git) merged in.

Git tree:
https://github.com/mihalicyn/linux/tree/scm_pidfd

Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Leon Romanovsky <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Kuniyuki Iwashima <[email protected]>
Cc: Lennart Poettering <[email protected]>
Cc: Luca Boccassi <[email protected]>

Tested-by: Luca Boccassi <[email protected]>

Alexander Mikhalitsyn (4):
scm: add SO_PASSPIDFD and SCM_PIDFD
net: socket: add sockopts blacklist for BPF cgroup hook
net: core: add getsockopt SO_PEERPIDFD
selftests: net: add SCM_PIDFD / SO_PEERPIDFD test

arch/alpha/include/uapi/asm/socket.h | 3 +
arch/mips/include/uapi/asm/socket.h | 3 +
arch/parisc/include/uapi/asm/socket.h | 3 +
arch/sparc/include/uapi/asm/socket.h | 3 +
include/linux/net.h | 1 +
include/linux/socket.h | 1 +
include/net/scm.h | 14 +-
include/uapi/asm-generic/socket.h | 3 +
net/core/sock.c | 44 ++
net/mptcp/sockopt.c | 1 +
net/socket.c | 45 +-
net/unix/af_unix.c | 18 +-
tools/include/uapi/asm-generic/socket.h | 3 +
tools/testing/selftests/net/.gitignore | 1 +
tools/testing/selftests/net/af_unix/Makefile | 2 +-
.../testing/selftests/net/af_unix/scm_pidfd.c | 430 ++++++++++++++++++
16 files changed, 564 insertions(+), 11 deletions(-)
create mode 100644 tools/testing/selftests/net/af_unix/scm_pidfd.c

--
2.34.1


2023-04-11 10:47:35

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: [PATCH net-next v3 2/4] net: socket: add sockopts blacklist for BPF cgroup hook

During work on SO_PEERPIDFD, it was discovered (thanks to Christian),
that bpf cgroup hook can cause FD leaks when used with sockopts which
install FDs into the process fdtable.

After some offlist discussion it was proposed to add a blacklist of
socket options those can cause troubles when BPF cgroup hook is enabled.

Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Leon Romanovsky <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Kuniyuki Iwashima <[email protected]>
Cc: Lennart Poettering <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Suggested-by: Daniel Borkmann <[email protected]>
Suggested-by: Christian Brauner <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
---
net/socket.c | 38 +++++++++++++++++++++++++++++++++++---
1 file changed, 35 insertions(+), 3 deletions(-)

diff --git a/net/socket.c b/net/socket.c
index 73e493da4589..9c1ef11de23f 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -108,6 +108,8 @@
#include <linux/ptp_clock_kernel.h>
#include <trace/events/sock.h>

+#include <linux/sctp.h>
+
#ifdef CONFIG_NET_RX_BUSY_POLL
unsigned int sysctl_net_busy_read __read_mostly;
unsigned int sysctl_net_busy_poll __read_mostly;
@@ -2227,6 +2229,36 @@ static bool sock_use_custom_sol_socket(const struct socket *sock)
return test_bit(SOCK_CUSTOM_SOCKOPT, &sock->flags);
}

+#ifdef CONFIG_CGROUP_BPF
+static bool sockopt_installs_fd(int level, int optname)
+{
+ /*
+ * These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT
+ * hook returns an error after success of the original handler
+ * sctp_getsockopt(...), userspace will receive an error from getsockopt
+ * syscall and will be not aware that fd was successfully installed into fdtable.
+ *
+ * Let's prevent bpf cgroup hook from running on them.
+ */
+ if (level == SOL_SCTP) {
+ switch (optname) {
+ case SCTP_SOCKOPT_PEELOFF:
+ case SCTP_SOCKOPT_PEELOFF_FLAGS:
+ return true;
+ default:
+ return false;
+ }
+ }
+
+ return false;
+}
+#else /* CONFIG_CGROUP_BPF */
+static inline bool sockopt_installs_fd(int level, int optname)
+{
+ return false;
+}
+#endif /* CONFIG_CGROUP_BPF */
+
/*
* Set a socket option. Because we don't know the option lengths we have
* to pass the user mode parameter for the protocols to sort out.
@@ -2250,7 +2282,7 @@ int __sys_setsockopt(int fd, int level, int optname, char __user *user_optval,
if (err)
goto out_put;

- if (!in_compat_syscall())
+ if (!in_compat_syscall() && !sockopt_installs_fd(level, optname))
err = BPF_CGROUP_RUN_PROG_SETSOCKOPT(sock->sk, &level, &optname,
user_optval, &optlen,
&kernel_optval);
@@ -2304,7 +2336,7 @@ int __sys_getsockopt(int fd, int level, int optname, char __user *optval,
if (err)
goto out_put;

- if (!in_compat_syscall())
+ if (!in_compat_syscall() && !sockopt_installs_fd(level, optname))
max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen);

if (level == SOL_SOCKET)
@@ -2315,7 +2347,7 @@ int __sys_getsockopt(int fd, int level, int optname, char __user *optval,
err = sock->ops->getsockopt(sock, level, optname, optval,
optlen);

- if (!in_compat_syscall())
+ if (!in_compat_syscall() && !sockopt_installs_fd(level, optname))
err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, optname,
optval, optlen, max_optlen,
err);
--
2.34.1

2023-04-11 10:48:01

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: [PATCH net-next v3 4/4] selftests: net: add SCM_PIDFD / SO_PEERPIDFD test

Basic test to check consistency between:
- SCM_CREDENTIALS and SCM_PIDFD
- SO_PEERCRED and SO_PEERPIDFD

Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Leon Romanovsky <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Kuniyuki Iwashima <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
---
v3:
- started using kselftest lib (thanks to Kuniyuki Iwashima for suggestion/review)
- now test covers abstract sockets too and SOCK_DGRAM sockets
---
tools/testing/selftests/net/.gitignore | 1 +
tools/testing/selftests/net/af_unix/Makefile | 2 +-
.../testing/selftests/net/af_unix/scm_pidfd.c | 430 ++++++++++++++++++
3 files changed, 432 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/net/af_unix/scm_pidfd.c

diff --git a/tools/testing/selftests/net/.gitignore b/tools/testing/selftests/net/.gitignore
index 80f06aa62034..83fd1ebd34ec 100644
--- a/tools/testing/selftests/net/.gitignore
+++ b/tools/testing/selftests/net/.gitignore
@@ -26,6 +26,7 @@ reuseport_bpf_cpu
reuseport_bpf_numa
reuseport_dualstack
rxtimestamp
+scm_pidfd
sk_bind_sendto_listen
sk_connect_zero_addr
socket
diff --git a/tools/testing/selftests/net/af_unix/Makefile b/tools/testing/selftests/net/af_unix/Makefile
index 1e4b397cece6..f5ca9da8c4d5 100644
--- a/tools/testing/selftests/net/af_unix/Makefile
+++ b/tools/testing/selftests/net/af_unix/Makefile
@@ -1,3 +1,3 @@
-TEST_GEN_PROGS := diag_uid test_unix_oob unix_connect
+TEST_GEN_PROGS := diag_uid test_unix_oob unix_connect scm_pidfd

include ../../lib.mk
diff --git a/tools/testing/selftests/net/af_unix/scm_pidfd.c b/tools/testing/selftests/net/af_unix/scm_pidfd.c
new file mode 100644
index 000000000000..a86222143d79
--- /dev/null
+++ b/tools/testing/selftests/net/af_unix/scm_pidfd.c
@@ -0,0 +1,430 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+#define _GNU_SOURCE
+#include <error.h>
+#include <limits.h>
+#include <stddef.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/socket.h>
+#include <linux/socket.h>
+#include <unistd.h>
+#include <string.h>
+#include <errno.h>
+#include <sys/un.h>
+#include <sys/signal.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+
+#include "../../kselftest_harness.h"
+
+#define clean_errno() (errno == 0 ? "None" : strerror(errno))
+#define log_err(MSG, ...) \
+ fprintf(stderr, "(%s:%d: errno: %s) " MSG "\n", __FILE__, __LINE__, \
+ clean_errno(), ##__VA_ARGS__)
+
+#ifndef SCM_PIDFD
+#define SCM_PIDFD 0x04
+#endif
+
+static void child_die()
+{
+ exit(1);
+}
+
+static int safe_int(const char *numstr, int *converted)
+{
+ char *err = NULL;
+ long sli;
+
+ errno = 0;
+ sli = strtol(numstr, &err, 0);
+ if (errno == ERANGE && (sli == LONG_MAX || sli == LONG_MIN))
+ return -ERANGE;
+
+ if (errno != 0 && sli == 0)
+ return -EINVAL;
+
+ if (err == numstr || *err != '\0')
+ return -EINVAL;
+
+ if (sli > INT_MAX || sli < INT_MIN)
+ return -ERANGE;
+
+ *converted = (int)sli;
+ return 0;
+}
+
+static int char_left_gc(const char *buffer, size_t len)
+{
+ size_t i;
+
+ for (i = 0; i < len; i++) {
+ if (buffer[i] == ' ' || buffer[i] == '\t')
+ continue;
+
+ return i;
+ }
+
+ return 0;
+}
+
+static int char_right_gc(const char *buffer, size_t len)
+{
+ int i;
+
+ for (i = len - 1; i >= 0; i--) {
+ if (buffer[i] == ' ' || buffer[i] == '\t' ||
+ buffer[i] == '\n' || buffer[i] == '\0')
+ continue;
+
+ return i + 1;
+ }
+
+ return 0;
+}
+
+static char *trim_whitespace_in_place(char *buffer)
+{
+ buffer += char_left_gc(buffer, strlen(buffer));
+ buffer[char_right_gc(buffer, strlen(buffer))] = '\0';
+ return buffer;
+}
+
+/* borrowed (with all helpers) from pidfd/pidfd_open_test.c */
+static pid_t get_pid_from_fdinfo_file(int pidfd, const char *key, size_t keylen)
+{
+ int ret;
+ char path[512];
+ FILE *f;
+ size_t n = 0;
+ pid_t result = -1;
+ char *line = NULL;
+
+ snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", pidfd);
+
+ f = fopen(path, "re");
+ if (!f)
+ return -1;
+
+ while (getline(&line, &n, f) != -1) {
+ char *numstr;
+
+ if (strncmp(line, key, keylen))
+ continue;
+
+ numstr = trim_whitespace_in_place(line + 4);
+ ret = safe_int(numstr, &result);
+ if (ret < 0)
+ goto out;
+
+ break;
+ }
+
+out:
+ free(line);
+ fclose(f);
+ return result;
+}
+
+static int cmsg_check(int fd)
+{
+ struct msghdr msg = { 0 };
+ struct cmsghdr *cmsg;
+ struct iovec iov;
+ struct ucred *ucred = NULL;
+ int data = 0;
+ char control[CMSG_SPACE(sizeof(struct ucred)) +
+ CMSG_SPACE(sizeof(int))] = { 0 };
+ int *pidfd = NULL;
+ pid_t parent_pid;
+ int err;
+
+ iov.iov_base = &data;
+ iov.iov_len = sizeof(data);
+
+ msg.msg_iov = &iov;
+ msg.msg_iovlen = 1;
+ msg.msg_control = control;
+ msg.msg_controllen = sizeof(control);
+
+ err = recvmsg(fd, &msg, 0);
+ if (err < 0) {
+ log_err("recvmsg");
+ return 1;
+ }
+
+ if (msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC)) {
+ log_err("recvmsg: truncated");
+ return 1;
+ }
+
+ for (cmsg = CMSG_FIRSTHDR(&msg); cmsg != NULL;
+ cmsg = CMSG_NXTHDR(&msg, cmsg)) {
+ if (cmsg->cmsg_level == SOL_SOCKET &&
+ cmsg->cmsg_type == SCM_PIDFD) {
+ if (cmsg->cmsg_len < sizeof(*pidfd)) {
+ log_err("CMSG parse: SCM_PIDFD wrong len");
+ return 1;
+ }
+
+ pidfd = (void *)CMSG_DATA(cmsg);
+ }
+
+ if (cmsg->cmsg_level == SOL_SOCKET &&
+ cmsg->cmsg_type == SCM_CREDENTIALS) {
+ if (cmsg->cmsg_len < sizeof(*ucred)) {
+ log_err("CMSG parse: SCM_CREDENTIALS wrong len");
+ return 1;
+ }
+
+ ucred = (void *)CMSG_DATA(cmsg);
+ }
+ }
+
+ /* send(pfd, "x", sizeof(char), 0) */
+ if (data != 'x') {
+ log_err("recvmsg: data corruption");
+ return 1;
+ }
+
+ if (!pidfd) {
+ log_err("CMSG parse: SCM_PIDFD not found");
+ return 1;
+ }
+
+ if (!ucred) {
+ log_err("CMSG parse: SCM_CREDENTIALS not found");
+ return 1;
+ }
+
+ /* pidfd from SCM_PIDFD should point to the parent process PID */
+ parent_pid =
+ get_pid_from_fdinfo_file(*pidfd, "Pid:", sizeof("Pid:") - 1);
+ if (parent_pid != getppid()) {
+ log_err("wrong SCM_PIDFD %d != %d", parent_pid, getppid());
+ return 1;
+ }
+
+ return 0;
+}
+
+struct sock_addr {
+ char sock_name[32];
+ struct sockaddr_un listen_addr;
+ socklen_t addrlen;
+};
+
+FIXTURE(scm_pidfd)
+{
+ int server;
+ pid_t client_pid;
+ int startup_pipe[2];
+ struct sock_addr server_addr;
+ struct sock_addr *client_addr;
+};
+
+FIXTURE_VARIANT(scm_pidfd)
+{
+ int type;
+ bool abstract;
+};
+
+FIXTURE_VARIANT_ADD(scm_pidfd, stream_pathname)
+{
+ .type = SOCK_STREAM,
+ .abstract = 0,
+};
+
+FIXTURE_VARIANT_ADD(scm_pidfd, stream_abstract)
+{
+ .type = SOCK_STREAM,
+ .abstract = 1,
+};
+
+FIXTURE_VARIANT_ADD(scm_pidfd, dgram_pathname)
+{
+ .type = SOCK_DGRAM,
+ .abstract = 0,
+};
+
+FIXTURE_VARIANT_ADD(scm_pidfd, dgram_abstract)
+{
+ .type = SOCK_DGRAM,
+ .abstract = 1,
+};
+
+FIXTURE_SETUP(scm_pidfd)
+{
+ self->client_addr = mmap(NULL, sizeof(*self->client_addr), PROT_READ | PROT_WRITE,
+ MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+ ASSERT_NE(MAP_FAILED, self->client_addr);
+}
+
+FIXTURE_TEARDOWN(scm_pidfd)
+{
+ close(self->server);
+
+ kill(self->client_pid, SIGKILL);
+ waitpid(self->client_pid, NULL, 0);
+
+ if (!variant->abstract) {
+ unlink(self->server_addr.sock_name);
+ unlink(self->client_addr->sock_name);
+ }
+}
+
+static void fill_sockaddr(struct sock_addr *addr, bool abstract)
+{
+ char *sun_path_buf = (char *)&addr->listen_addr.sun_path;
+
+ addr->listen_addr.sun_family = AF_UNIX;
+ addr->addrlen = offsetof(struct sockaddr_un, sun_path);
+ snprintf(addr->sock_name, sizeof(addr->sock_name), "scm_pidfd_%d", getpid());
+ addr->addrlen += strlen(addr->sock_name);
+ if (abstract) {
+ *sun_path_buf = '\0';
+ addr->addrlen++;
+ sun_path_buf++;
+ } else {
+ unlink(addr->sock_name);
+ }
+ memcpy(sun_path_buf, addr->sock_name, strlen(addr->sock_name));
+}
+
+static void client(FIXTURE_DATA(scm_pidfd) *self,
+ const FIXTURE_VARIANT(scm_pidfd) *variant)
+{
+ int err;
+ int cfd;
+ socklen_t len;
+ struct ucred peer_cred;
+ int peer_pidfd;
+ pid_t peer_pid;
+ int on = 0;
+
+ cfd = socket(AF_UNIX, variant->type, 0);
+ if (cfd < 0) {
+ log_err("socket");
+ child_die();
+ }
+
+ if (variant->type == SOCK_DGRAM) {
+ fill_sockaddr(self->client_addr, variant->abstract);
+
+ if (bind(cfd, (struct sockaddr *)&self->client_addr->listen_addr, self->client_addr->addrlen)) {
+ log_err("bind");
+ child_die();
+ }
+ }
+
+ if (connect(cfd, (struct sockaddr *)&self->server_addr.listen_addr,
+ self->server_addr.addrlen) != 0) {
+ log_err("connect");
+ child_die();
+ }
+
+ on = 1;
+ if (setsockopt(cfd, SOL_SOCKET, SO_PASSCRED, &on, sizeof(on))) {
+ log_err("Failed to set SO_PASSCRED");
+ child_die();
+ }
+
+ if (setsockopt(cfd, SOL_SOCKET, SO_PASSPIDFD, &on, sizeof(on))) {
+ log_err("Failed to set SO_PASSPIDFD");
+ child_die();
+ }
+
+ close(self->startup_pipe[1]);
+
+ if (cmsg_check(cfd)) {
+ log_err("cmsg_check failed");
+ child_die();
+ }
+
+ /* skip further for SOCK_DGRAM as it's not applicable */
+ if (variant->type == SOCK_DGRAM)
+ return;
+
+ len = sizeof(peer_cred);
+ if (getsockopt(cfd, SOL_SOCKET, SO_PEERCRED, &peer_cred, &len)) {
+ log_err("Failed to get SO_PEERCRED");
+ child_die();
+ }
+
+ len = sizeof(peer_pidfd);
+ if (getsockopt(cfd, SOL_SOCKET, SO_PEERPIDFD, &peer_pidfd, &len)) {
+ log_err("Failed to get SO_PEERPIDFD");
+ child_die();
+ }
+
+ /* pid from SO_PEERCRED should point to the parent process PID */
+ if (peer_cred.pid != getppid()) {
+ log_err("peer_cred.pid != getppid(): %d != %d", peer_cred.pid, getppid());
+ child_die();
+ }
+
+ peer_pid = get_pid_from_fdinfo_file(peer_pidfd,
+ "Pid:", sizeof("Pid:") - 1);
+ if (peer_pid != peer_cred.pid) {
+ log_err("peer_pid != peer_cred.pid: %d != %d", peer_pid, peer_cred.pid);
+ child_die();
+ }
+}
+
+TEST_F(scm_pidfd, test)
+{
+ int err;
+ int pfd;
+ int child_status = 0;
+
+ self->server = socket(AF_UNIX, variant->type, 0);
+ ASSERT_NE(-1, self->server);
+
+ fill_sockaddr(&self->server_addr, variant->abstract);
+
+ err = bind(self->server, (struct sockaddr *)&self->server_addr.listen_addr, self->server_addr.addrlen);
+ ASSERT_EQ(0, err);
+
+ if (variant->type == SOCK_STREAM) {
+ err = listen(self->server, 1);
+ ASSERT_EQ(0, err);
+ }
+
+ err = pipe(self->startup_pipe);
+ ASSERT_NE(-1, err);
+
+ self->client_pid = fork();
+ ASSERT_NE(-1, self->client_pid);
+ if (self->client_pid == 0) {
+ close(self->server);
+ close(self->startup_pipe[0]);
+ client(self, variant);
+ exit(0);
+ }
+ close(self->startup_pipe[1]);
+
+ if (variant->type == SOCK_STREAM) {
+ pfd = accept(self->server, NULL, NULL);
+ ASSERT_NE(-1, pfd);
+ } else {
+ pfd = self->server;
+ }
+
+ /* wait until the child arrives at checkpoint */
+ read(self->startup_pipe[0], &err, sizeof(int));
+ close(self->startup_pipe[0]);
+
+ if (variant->type == SOCK_DGRAM) {
+ err = sendto(pfd, "x", sizeof(char), 0, (struct sockaddr *)&self->client_addr->listen_addr, self->client_addr->addrlen);
+ ASSERT_NE(-1, err);
+ } else {
+ err = send(pfd, "x", sizeof(char), 0);
+ ASSERT_NE(-1, err);
+ }
+
+ close(pfd);
+ waitpid(self->client_pid, &child_status, 0);
+ ASSERT_EQ(0, WIFEXITED(child_status) ? WEXITSTATUS(child_status) : 1);
+}
+
+TEST_HARNESS_MAIN
--
2.34.1

2023-04-11 10:48:11

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: [PATCH net-next v3 3/4] net: core: add getsockopt SO_PEERPIDFD

Add SO_PEERPIDFD which allows to get pidfd of peer socket holder pidfd.
This thing is direct analog of SO_PEERCRED which allows to get plain PID.

Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Leon Romanovsky <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Kuniyuki Iwashima <[email protected]>
Cc: Lennart Poettering <[email protected]>
Cc: Luca Boccassi <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Tested-by: Luca Boccassi <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
---
v3:
- fixed possible fd leak (thanks to Christian Brauner)
v2:
According to review comments from Kuniyuki Iwashima and Christian Brauner:
- use pidfd_create(..) retval as a result
- whitespace change
---
arch/alpha/include/uapi/asm/socket.h | 1 +
arch/mips/include/uapi/asm/socket.h | 1 +
arch/parisc/include/uapi/asm/socket.h | 1 +
arch/sparc/include/uapi/asm/socket.h | 1 +
include/uapi/asm-generic/socket.h | 1 +
net/core/sock.c | 33 +++++++++++++++++++++++++
net/socket.c | 7 ++++++
tools/include/uapi/asm-generic/socket.h | 1 +
8 files changed, 46 insertions(+)

diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
index ff310613ae64..e94f621903fe 100644
--- a/arch/alpha/include/uapi/asm/socket.h
+++ b/arch/alpha/include/uapi/asm/socket.h
@@ -138,6 +138,7 @@
#define SO_RCVMARK 75

#define SO_PASSPIDFD 76
+#define SO_PEERPIDFD 77

#if !defined(__KERNEL__)

diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
index 762dcb80e4ec..60ebaed28a4c 100644
--- a/arch/mips/include/uapi/asm/socket.h
+++ b/arch/mips/include/uapi/asm/socket.h
@@ -149,6 +149,7 @@
#define SO_RCVMARK 75

#define SO_PASSPIDFD 76
+#define SO_PEERPIDFD 77

#if !defined(__KERNEL__)

diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
index df16a3e16d64..be264c2b1a11 100644
--- a/arch/parisc/include/uapi/asm/socket.h
+++ b/arch/parisc/include/uapi/asm/socket.h
@@ -130,6 +130,7 @@
#define SO_RCVMARK 0x4049

#define SO_PASSPIDFD 0x404A
+#define SO_PEERPIDFD 0x404B

#if !defined(__KERNEL__)

diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
index 6e2847804fea..682da3714686 100644
--- a/arch/sparc/include/uapi/asm/socket.h
+++ b/arch/sparc/include/uapi/asm/socket.h
@@ -131,6 +131,7 @@
#define SO_RCVMARK 0x0054

#define SO_PASSPIDFD 0x0055
+#define SO_PEERPIDFD 0x0056

#if !defined(__KERNEL__)

diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index b76169fdb80b..8ce8a39a1e5f 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -133,6 +133,7 @@
#define SO_RCVMARK 75

#define SO_PASSPIDFD 76
+#define SO_PEERPIDFD 77

#if !defined(__KERNEL__)

diff --git a/net/core/sock.c b/net/core/sock.c
index 3f974246ba3e..2b040a69e355 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1763,6 +1763,39 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
goto lenout;
}

+ case SO_PEERPIDFD:
+ {
+ struct pid *peer_pid;
+ struct file *pidfd_file = NULL;
+ int pidfd;
+
+ if (len > sizeof(pidfd))
+ len = sizeof(pidfd);
+
+ spin_lock(&sk->sk_peer_lock);
+ peer_pid = get_pid(sk->sk_peer_pid);
+ spin_unlock(&sk->sk_peer_lock);
+
+ pidfd = pidfd_prepare(peer_pid, 0, &pidfd_file);
+
+ put_pid(peer_pid);
+
+ if (copy_to_sockptr(optval, &pidfd, len) ||
+ copy_to_sockptr(optlen, &len, sizeof(int))) {
+ if (pidfd >= 0) {
+ put_unused_fd(pidfd);
+ fput(pidfd_file);
+ }
+
+ return -EFAULT;
+ }
+
+ if (pidfd_file)
+ fd_install(pidfd, pidfd_file);
+
+ return 0;
+ }
+
case SO_PEERGROUPS:
{
const struct cred *cred;
diff --git a/net/socket.c b/net/socket.c
index 9c1ef11de23f..505b85489354 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -2248,6 +2248,13 @@ static bool sockopt_installs_fd(int level, int optname)
default:
return false;
}
+ } else if (level == SOL_SOCKET) {
+ switch (optname) {
+ case SO_PEERPIDFD:
+ return true;
+ default:
+ return false;
+ }
}

return false;
diff --git a/tools/include/uapi/asm-generic/socket.h b/tools/include/uapi/asm-generic/socket.h
index fbbc4bf53ee3..54d9c8bf7c55 100644
--- a/tools/include/uapi/asm-generic/socket.h
+++ b/tools/include/uapi/asm-generic/socket.h
@@ -122,6 +122,7 @@
#define SO_RCVMARK 75

#define SO_PASSPIDFD 76
+#define SO_PEERPIDFD 77

#if !defined(__KERNEL__)

--
2.34.1

2023-04-11 10:48:18

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: [PATCH net-next v3 1/4] scm: add SO_PASSPIDFD and SCM_PIDFD

Implement SCM_PIDFD, a new type of CMSG type analogical to SCM_CREDENTIALS,
but it contains pidfd instead of plain pid, which allows programmers not
to care about PID reuse problem.

Idea comes from UAPI kernel group:
https://uapi-group.org/kernel-features/

Big thanks to Christian Brauner and Lennart Poettering for productive
discussions about this.

Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Leon Romanovsky <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Kuniyuki Iwashima <[email protected]>
Cc: Lennart Poettering <[email protected]>
Cc: Luca Boccassi <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Tested-by: Luca Boccassi <[email protected]>
Reviewed-by: Kuniyuki Iwashima <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
---
v2:
According to review comments from Kuniyuki Iwashima and Christian Brauner:
- use pidfd_create(..) retval as a result
- whitespace change
---
arch/alpha/include/uapi/asm/socket.h | 2 ++
arch/mips/include/uapi/asm/socket.h | 2 ++
arch/parisc/include/uapi/asm/socket.h | 2 ++
arch/sparc/include/uapi/asm/socket.h | 2 ++
include/linux/net.h | 1 +
include/linux/socket.h | 1 +
include/net/scm.h | 14 ++++++++++++--
include/uapi/asm-generic/socket.h | 2 ++
net/core/sock.c | 11 +++++++++++
net/mptcp/sockopt.c | 1 +
net/unix/af_unix.c | 18 +++++++++++++-----
tools/include/uapi/asm-generic/socket.h | 2 ++
12 files changed, 51 insertions(+), 7 deletions(-)

diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
index 739891b94136..ff310613ae64 100644
--- a/arch/alpha/include/uapi/asm/socket.h
+++ b/arch/alpha/include/uapi/asm/socket.h
@@ -137,6 +137,8 @@

#define SO_RCVMARK 75

+#define SO_PASSPIDFD 76
+
#if !defined(__KERNEL__)

#if __BITS_PER_LONG == 64
diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
index 18f3d95ecfec..762dcb80e4ec 100644
--- a/arch/mips/include/uapi/asm/socket.h
+++ b/arch/mips/include/uapi/asm/socket.h
@@ -148,6 +148,8 @@

#define SO_RCVMARK 75

+#define SO_PASSPIDFD 76
+
#if !defined(__KERNEL__)

#if __BITS_PER_LONG == 64
diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
index f486d3dfb6bb..df16a3e16d64 100644
--- a/arch/parisc/include/uapi/asm/socket.h
+++ b/arch/parisc/include/uapi/asm/socket.h
@@ -129,6 +129,8 @@

#define SO_RCVMARK 0x4049

+#define SO_PASSPIDFD 0x404A
+
#if !defined(__KERNEL__)

#if __BITS_PER_LONG == 64
diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
index 2fda57a3ea86..6e2847804fea 100644
--- a/arch/sparc/include/uapi/asm/socket.h
+++ b/arch/sparc/include/uapi/asm/socket.h
@@ -130,6 +130,8 @@

#define SO_RCVMARK 0x0054

+#define SO_PASSPIDFD 0x0055
+
#if !defined(__KERNEL__)


diff --git a/include/linux/net.h b/include/linux/net.h
index b73ad8e3c212..c234dfbe7a30 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -43,6 +43,7 @@ struct net;
#define SOCK_PASSSEC 4
#define SOCK_SUPPORT_ZC 5
#define SOCK_CUSTOM_SOCKOPT 6
+#define SOCK_PASSPIDFD 7

#ifndef ARCH_HAS_SOCKET_TYPES
/**
diff --git a/include/linux/socket.h b/include/linux/socket.h
index 13c3a237b9c9..6bf90f251910 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -177,6 +177,7 @@ static inline size_t msg_data_left(struct msghdr *msg)
#define SCM_RIGHTS 0x01 /* rw: access rights (array of int) */
#define SCM_CREDENTIALS 0x02 /* rw: struct ucred */
#define SCM_SECURITY 0x03 /* rw: security label */
+#define SCM_PIDFD 0x04 /* ro: pidfd (int) */

struct ucred {
__u32 pid;
diff --git a/include/net/scm.h b/include/net/scm.h
index 585adc1346bd..0c717ae9c8db 100644
--- a/include/net/scm.h
+++ b/include/net/scm.h
@@ -124,8 +124,9 @@ static __inline__ void scm_recv(struct socket *sock, struct msghdr *msg,
struct scm_cookie *scm, int flags)
{
if (!msg->msg_control) {
- if (test_bit(SOCK_PASSCRED, &sock->flags) || scm->fp ||
- scm_has_secdata(sock))
+ if (test_bit(SOCK_PASSCRED, &sock->flags) ||
+ test_bit(SOCK_PASSPIDFD, &sock->flags) ||
+ scm->fp || scm_has_secdata(sock))
msg->msg_flags |= MSG_CTRUNC;
scm_destroy(scm);
return;
@@ -141,6 +142,15 @@ static __inline__ void scm_recv(struct socket *sock, struct msghdr *msg,
put_cmsg(msg, SOL_SOCKET, SCM_CREDENTIALS, sizeof(ucreds), &ucreds);
}

+ if (test_bit(SOCK_PASSPIDFD, &sock->flags)) {
+ int pidfd;
+
+ WARN_ON_ONCE(!scm->pid);
+ pidfd = pidfd_create(scm->pid, 0);
+
+ put_cmsg(msg, SOL_SOCKET, SCM_PIDFD, sizeof(int), &pidfd);
+ }
+
scm_destroy_cred(scm);

scm_passec(sock, msg, scm);
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index 638230899e98..b76169fdb80b 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -132,6 +132,8 @@

#define SO_RCVMARK 75

+#define SO_PASSPIDFD 76
+
#if !defined(__KERNEL__)

#if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
diff --git a/net/core/sock.c b/net/core/sock.c
index c25888795390..3f974246ba3e 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1246,6 +1246,13 @@ int sk_setsockopt(struct sock *sk, int level, int optname,
clear_bit(SOCK_PASSCRED, &sock->flags);
break;

+ case SO_PASSPIDFD:
+ if (valbool)
+ set_bit(SOCK_PASSPIDFD, &sock->flags);
+ else
+ clear_bit(SOCK_PASSPIDFD, &sock->flags);
+ break;
+
case SO_TIMESTAMP_OLD:
case SO_TIMESTAMP_NEW:
case SO_TIMESTAMPNS_OLD:
@@ -1737,6 +1744,10 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
v.val = !!test_bit(SOCK_PASSCRED, &sock->flags);
break;

+ case SO_PASSPIDFD:
+ v.val = !!test_bit(SOCK_PASSPIDFD, &sock->flags);
+ break;
+
case SO_PEERCRED:
{
struct ucred peercred;
diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
index b655cebda0f3..67be0558862f 100644
--- a/net/mptcp/sockopt.c
+++ b/net/mptcp/sockopt.c
@@ -355,6 +355,7 @@ static int mptcp_setsockopt_sol_socket(struct mptcp_sock *msk, int optname,
case SO_BROADCAST:
case SO_BSDCOMPAT:
case SO_PASSCRED:
+ case SO_PASSPIDFD:
case SO_PASSSEC:
case SO_RXQ_OVFL:
case SO_WIFI_STATUS:
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index fb31e8a4409e..6d5dff4dfe83 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1361,7 +1361,8 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr,
if (err)
goto out;

- if (test_bit(SOCK_PASSCRED, &sock->flags) &&
+ if ((test_bit(SOCK_PASSCRED, &sock->flags) ||
+ test_bit(SOCK_PASSPIDFD, &sock->flags)) &&
!unix_sk(sk)->addr) {
err = unix_autobind(sk);
if (err)
@@ -1469,7 +1470,8 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
if (err)
goto out;

- if (test_bit(SOCK_PASSCRED, &sock->flags) && !u->addr) {
+ if ((test_bit(SOCK_PASSCRED, &sock->flags) ||
+ test_bit(SOCK_PASSPIDFD, &sock->flags)) && !u->addr) {
err = unix_autobind(sk);
if (err)
goto out;
@@ -1670,6 +1672,8 @@ static void unix_sock_inherit_flags(const struct socket *old,
{
if (test_bit(SOCK_PASSCRED, &old->flags))
set_bit(SOCK_PASSCRED, &new->flags);
+ if (test_bit(SOCK_PASSPIDFD, &old->flags))
+ set_bit(SOCK_PASSPIDFD, &new->flags);
if (test_bit(SOCK_PASSSEC, &old->flags))
set_bit(SOCK_PASSSEC, &new->flags);
}
@@ -1819,8 +1823,10 @@ static bool unix_passcred_enabled(const struct socket *sock,
const struct sock *other)
{
return test_bit(SOCK_PASSCRED, &sock->flags) ||
+ test_bit(SOCK_PASSPIDFD, &sock->flags) ||
!other->sk_socket ||
- test_bit(SOCK_PASSCRED, &other->sk_socket->flags);
+ test_bit(SOCK_PASSCRED, &other->sk_socket->flags) ||
+ test_bit(SOCK_PASSPIDFD, &other->sk_socket->flags);
}

/*
@@ -1922,7 +1928,8 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
goto out;
}

- if (test_bit(SOCK_PASSCRED, &sock->flags) && !u->addr) {
+ if ((test_bit(SOCK_PASSCRED, &sock->flags) ||
+ test_bit(SOCK_PASSPIDFD, &sock->flags)) && !u->addr) {
err = unix_autobind(sk);
if (err)
goto out;
@@ -2824,7 +2831,8 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state,
/* Never glue messages from different writers */
if (!unix_skb_scm_eq(skb, &scm))
break;
- } else if (test_bit(SOCK_PASSCRED, &sock->flags)) {
+ } else if (test_bit(SOCK_PASSCRED, &sock->flags) ||
+ test_bit(SOCK_PASSPIDFD, &sock->flags)) {
/* Copy credentials */
scm_set_cred(&scm, UNIXCB(skb).pid, UNIXCB(skb).uid, UNIXCB(skb).gid);
unix_set_secdata(&scm, skb);
diff --git a/tools/include/uapi/asm-generic/socket.h b/tools/include/uapi/asm-generic/socket.h
index 8756df13be50..fbbc4bf53ee3 100644
--- a/tools/include/uapi/asm-generic/socket.h
+++ b/tools/include/uapi/asm-generic/socket.h
@@ -121,6 +121,8 @@

#define SO_RCVMARK 75

+#define SO_PASSPIDFD 76
+
#if !defined(__KERNEL__)

#if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
--
2.34.1

2023-04-11 15:48:58

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH net-next v3 1/4] scm: add SO_PASSPIDFD and SCM_PIDFD

On Tue, Apr 11, 2023 at 12:42:28PM +0200, Alexander Mikhalitsyn wrote:
> Implement SCM_PIDFD, a new type of CMSG type analogical to SCM_CREDENTIALS,
> but it contains pidfd instead of plain pid, which allows programmers not
> to care about PID reuse problem.
>
> Idea comes from UAPI kernel group:
> https://uapi-group.org/kernel-features/
>
> Big thanks to Christian Brauner and Lennart Poettering for productive
> discussions about this.
>
> Cc: "David S. Miller" <[email protected]>
> Cc: Eric Dumazet <[email protected]>
> Cc: Jakub Kicinski <[email protected]>
> Cc: Paolo Abeni <[email protected]>
> Cc: Leon Romanovsky <[email protected]>
> Cc: David Ahern <[email protected]>
> Cc: Arnd Bergmann <[email protected]>
> Cc: Kees Cook <[email protected]>
> Cc: Christian Brauner <[email protected]>
> Cc: Kuniyuki Iwashima <[email protected]>
> Cc: Lennart Poettering <[email protected]>
> Cc: Luca Boccassi <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Tested-by: Luca Boccassi <[email protected]>
> Reviewed-by: Kuniyuki Iwashima <[email protected]>
> Signed-off-by: Alexander Mikhalitsyn <[email protected]>
> ---
> v2:
> According to review comments from Kuniyuki Iwashima and Christian Brauner:
> - use pidfd_create(..) retval as a result
> - whitespace change
> ---
> arch/alpha/include/uapi/asm/socket.h | 2 ++
> arch/mips/include/uapi/asm/socket.h | 2 ++
> arch/parisc/include/uapi/asm/socket.h | 2 ++
> arch/sparc/include/uapi/asm/socket.h | 2 ++
> include/linux/net.h | 1 +
> include/linux/socket.h | 1 +
> include/net/scm.h | 14 ++++++++++++--
> include/uapi/asm-generic/socket.h | 2 ++
> net/core/sock.c | 11 +++++++++++
> net/mptcp/sockopt.c | 1 +
> net/unix/af_unix.c | 18 +++++++++++++-----
> tools/include/uapi/asm-generic/socket.h | 2 ++
> 12 files changed, 51 insertions(+), 7 deletions(-)
>
> diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
> index 739891b94136..ff310613ae64 100644
> --- a/arch/alpha/include/uapi/asm/socket.h
> +++ b/arch/alpha/include/uapi/asm/socket.h
> @@ -137,6 +137,8 @@
>
> #define SO_RCVMARK 75
>
> +#define SO_PASSPIDFD 76
> +
> #if !defined(__KERNEL__)
>
> #if __BITS_PER_LONG == 64
> diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
> index 18f3d95ecfec..762dcb80e4ec 100644
> --- a/arch/mips/include/uapi/asm/socket.h
> +++ b/arch/mips/include/uapi/asm/socket.h
> @@ -148,6 +148,8 @@
>
> #define SO_RCVMARK 75
>
> +#define SO_PASSPIDFD 76
> +
> #if !defined(__KERNEL__)
>
> #if __BITS_PER_LONG == 64
> diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
> index f486d3dfb6bb..df16a3e16d64 100644
> --- a/arch/parisc/include/uapi/asm/socket.h
> +++ b/arch/parisc/include/uapi/asm/socket.h
> @@ -129,6 +129,8 @@
>
> #define SO_RCVMARK 0x4049
>
> +#define SO_PASSPIDFD 0x404A
> +
> #if !defined(__KERNEL__)
>
> #if __BITS_PER_LONG == 64
> diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
> index 2fda57a3ea86..6e2847804fea 100644
> --- a/arch/sparc/include/uapi/asm/socket.h
> +++ b/arch/sparc/include/uapi/asm/socket.h
> @@ -130,6 +130,8 @@
>
> #define SO_RCVMARK 0x0054
>
> +#define SO_PASSPIDFD 0x0055
> +
> #if !defined(__KERNEL__)
>
>
> diff --git a/include/linux/net.h b/include/linux/net.h
> index b73ad8e3c212..c234dfbe7a30 100644
> --- a/include/linux/net.h
> +++ b/include/linux/net.h
> @@ -43,6 +43,7 @@ struct net;
> #define SOCK_PASSSEC 4
> #define SOCK_SUPPORT_ZC 5
> #define SOCK_CUSTOM_SOCKOPT 6
> +#define SOCK_PASSPIDFD 7
>
> #ifndef ARCH_HAS_SOCKET_TYPES
> /**
> diff --git a/include/linux/socket.h b/include/linux/socket.h
> index 13c3a237b9c9..6bf90f251910 100644
> --- a/include/linux/socket.h
> +++ b/include/linux/socket.h
> @@ -177,6 +177,7 @@ static inline size_t msg_data_left(struct msghdr *msg)
> #define SCM_RIGHTS 0x01 /* rw: access rights (array of int) */
> #define SCM_CREDENTIALS 0x02 /* rw: struct ucred */
> #define SCM_SECURITY 0x03 /* rw: security label */
> +#define SCM_PIDFD 0x04 /* ro: pidfd (int) */
>
> struct ucred {
> __u32 pid;
> diff --git a/include/net/scm.h b/include/net/scm.h
> index 585adc1346bd..0c717ae9c8db 100644
> --- a/include/net/scm.h
> +++ b/include/net/scm.h
> @@ -124,8 +124,9 @@ static __inline__ void scm_recv(struct socket *sock, struct msghdr *msg,
> struct scm_cookie *scm, int flags)
> {
> if (!msg->msg_control) {
> - if (test_bit(SOCK_PASSCRED, &sock->flags) || scm->fp ||
> - scm_has_secdata(sock))
> + if (test_bit(SOCK_PASSCRED, &sock->flags) ||
> + test_bit(SOCK_PASSPIDFD, &sock->flags) ||
> + scm->fp || scm_has_secdata(sock))
> msg->msg_flags |= MSG_CTRUNC;
> scm_destroy(scm);
> return;
> @@ -141,6 +142,15 @@ static __inline__ void scm_recv(struct socket *sock, struct msghdr *msg,
> put_cmsg(msg, SOL_SOCKET, SCM_CREDENTIALS, sizeof(ucreds), &ucreds);
> }
>
> + if (test_bit(SOCK_PASSPIDFD, &sock->flags)) {
> + int pidfd;
> +
> + WARN_ON_ONCE(!scm->pid);
> + pidfd = pidfd_create(scm->pid, 0);
> +
> + put_cmsg(msg, SOL_SOCKET, SCM_PIDFD, sizeof(int), &pidfd);

I know you already mentioned that you accidently missed to change this
to not leak an fd. But just so we keep track of it see the comment to v2
https://lore.kernel.org/netdev/20230322154817.c6qasnixow452e6x@wittgenstein/#t

2023-04-11 15:55:42

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH net-next v3 2/4] net: socket: add sockopts blacklist for BPF cgroup hook

On Tue, Apr 11, 2023 at 12:42:29PM +0200, Alexander Mikhalitsyn wrote:
> During work on SO_PEERPIDFD, it was discovered (thanks to Christian),
> that bpf cgroup hook can cause FD leaks when used with sockopts which
> install FDs into the process fdtable.
>
> After some offlist discussion it was proposed to add a blacklist of
> socket options those can cause troubles when BPF cgroup hook is enabled.
>
> Cc: "David S. Miller" <[email protected]>
> Cc: Eric Dumazet <[email protected]>
> Cc: Jakub Kicinski <[email protected]>
> Cc: Paolo Abeni <[email protected]>
> Cc: Leon Romanovsky <[email protected]>
> Cc: David Ahern <[email protected]>
> Cc: Arnd Bergmann <[email protected]>
> Cc: Kees Cook <[email protected]>
> Cc: Christian Brauner <[email protected]>
> Cc: Kuniyuki Iwashima <[email protected]>
> Cc: Lennart Poettering <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Suggested-by: Daniel Borkmann <[email protected]>
> Suggested-by: Christian Brauner <[email protected]>
> Signed-off-by: Alexander Mikhalitsyn <[email protected]>
> ---

Just some background for kicks

A crucial point for SO_PEERPIDFD is the allocation of a pidfd and to
place it into the optval buffer for userspace to retrieve.

The way we orginally envisioned this working is by splitting this into
an fd and file allocation phase, then get past the point of failure in
sockopt processing and then call fd_install(fd, pidfd_file).

While looking at this I realized that there's a generic problem in this
code:

if (level == SOL_SOCKET)
err = sock_getsockopt(sock, level, optname, optval, optlen);
else if (unlikely(!sock->ops->getsockopt))
err = -EOPNOTSUPP;
else
err = sock->ops->getsockopt(sock, level, optname, optval, optlen);

if (!in_compat_syscall())
err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, optname, optval, optlen, max_optlen, err);

That BPF_CGROUP_RUN_PROG_GETSOCKOPT hook can fail after getsockopt
itself succeeded. So anything that places an fd into optval risks
leaking an fd into the caller's fdtable if the bpf hook fails.

If we do a pidfd_create() and place the pidfd into the optval buffer the
the bpf hook could reasonably interact with the fd but if it fails the
fd would be leaked. It could clean this up calling close_fd() but it
would be ugly since the fd has already been visible in the caller's
fdtable. Someone might've snatched it already even.

If we delay installing the fd and file past the bpf hook then the fd is
meaningless for the bpf hook but we wouldn't risk leaking the fd.

It should also be noted that the hook does a copy_from_user() on the
optval right after the prior getsockopt did a copy_to_user() into that
optval. This is not just racy it's also a bit wasteful. Userspace could
try to retrieve the optval and then copy another value over it so that
the bpf hook operates on another value than getsockopt originally placed
into optval. If the bpf hook wants to care about fd resources in the
future it should probably be passed the allocated struct file.

This should be addressed separately though. The solution here works for
me,

Acked-by: Christian Brauner <[email protected]>

> net/socket.c | 38 +++++++++++++++++++++++++++++++++++---
> 1 file changed, 35 insertions(+), 3 deletions(-)
>
> diff --git a/net/socket.c b/net/socket.c
> index 73e493da4589..9c1ef11de23f 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -108,6 +108,8 @@
> #include <linux/ptp_clock_kernel.h>
> #include <trace/events/sock.h>
>
> +#include <linux/sctp.h>
> +
> #ifdef CONFIG_NET_RX_BUSY_POLL
> unsigned int sysctl_net_busy_read __read_mostly;
> unsigned int sysctl_net_busy_poll __read_mostly;
> @@ -2227,6 +2229,36 @@ static bool sock_use_custom_sol_socket(const struct socket *sock)
> return test_bit(SOCK_CUSTOM_SOCKOPT, &sock->flags);
> }
>
> +#ifdef CONFIG_CGROUP_BPF
> +static bool sockopt_installs_fd(int level, int optname)
> +{
> + /*
> + * These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT
> + * hook returns an error after success of the original handler
> + * sctp_getsockopt(...), userspace will receive an error from getsockopt
> + * syscall and will be not aware that fd was successfully installed into fdtable.
> + *
> + * Let's prevent bpf cgroup hook from running on them.
> + */
> + if (level == SOL_SCTP) {
> + switch (optname) {
> + case SCTP_SOCKOPT_PEELOFF:
> + case SCTP_SOCKOPT_PEELOFF_FLAGS:
> + return true;
> + default:
> + return false;
> + }
> + }
> +
> + return false;
> +}
> +#else /* CONFIG_CGROUP_BPF */
> +static inline bool sockopt_installs_fd(int level, int optname)
> +{
> + return false;
> +}
> +#endif /* CONFIG_CGROUP_BPF */
> +
> /*
> * Set a socket option. Because we don't know the option lengths we have
> * to pass the user mode parameter for the protocols to sort out.
> @@ -2250,7 +2282,7 @@ int __sys_setsockopt(int fd, int level, int optname, char __user *user_optval,
> if (err)
> goto out_put;
>
> - if (!in_compat_syscall())
> + if (!in_compat_syscall() && !sockopt_installs_fd(level, optname))
> err = BPF_CGROUP_RUN_PROG_SETSOCKOPT(sock->sk, &level, &optname,
> user_optval, &optlen,
> &kernel_optval);
> @@ -2304,7 +2336,7 @@ int __sys_getsockopt(int fd, int level, int optname, char __user *optval,
> if (err)
> goto out_put;
>
> - if (!in_compat_syscall())
> + if (!in_compat_syscall() && !sockopt_installs_fd(level, optname))
> max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen);
>
> if (level == SOL_SOCKET)
> @@ -2315,7 +2347,7 @@ int __sys_getsockopt(int fd, int level, int optname, char __user *optval,
> err = sock->ops->getsockopt(sock, level, optname, optval,
> optlen);
>
> - if (!in_compat_syscall())
> + if (!in_compat_syscall() && !sockopt_installs_fd(level, optname))
> err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, optname,
> optval, optlen, max_optlen,
> err);
> --
> 2.34.1
>

2023-04-11 16:04:34

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH net-next v3 3/4] net: core: add getsockopt SO_PEERPIDFD

On Tue, Apr 11, 2023 at 12:42:30PM +0200, Alexander Mikhalitsyn wrote:
> Add SO_PEERPIDFD which allows to get pidfd of peer socket holder pidfd.
> This thing is direct analog of SO_PEERCRED which allows to get plain PID.
>
> Cc: "David S. Miller" <[email protected]>
> Cc: Eric Dumazet <[email protected]>
> Cc: Jakub Kicinski <[email protected]>
> Cc: Paolo Abeni <[email protected]>
> Cc: Leon Romanovsky <[email protected]>
> Cc: David Ahern <[email protected]>
> Cc: Arnd Bergmann <[email protected]>
> Cc: Kees Cook <[email protected]>
> Cc: Christian Brauner <[email protected]>
> Cc: Kuniyuki Iwashima <[email protected]>
> Cc: Lennart Poettering <[email protected]>
> Cc: Luca Boccassi <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Tested-by: Luca Boccassi <[email protected]>
> Signed-off-by: Alexander Mikhalitsyn <[email protected]>
> ---
> v3:
> - fixed possible fd leak (thanks to Christian Brauner)
> v2:
> According to review comments from Kuniyuki Iwashima and Christian Brauner:
> - use pidfd_create(..) retval as a result
> - whitespace change
> ---
> arch/alpha/include/uapi/asm/socket.h | 1 +
> arch/mips/include/uapi/asm/socket.h | 1 +
> arch/parisc/include/uapi/asm/socket.h | 1 +
> arch/sparc/include/uapi/asm/socket.h | 1 +
> include/uapi/asm-generic/socket.h | 1 +
> net/core/sock.c | 33 +++++++++++++++++++++++++
> net/socket.c | 7 ++++++
> tools/include/uapi/asm-generic/socket.h | 1 +
> 8 files changed, 46 insertions(+)
>
> diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
> index ff310613ae64..e94f621903fe 100644
> --- a/arch/alpha/include/uapi/asm/socket.h
> +++ b/arch/alpha/include/uapi/asm/socket.h
> @@ -138,6 +138,7 @@
> #define SO_RCVMARK 75
>
> #define SO_PASSPIDFD 76
> +#define SO_PEERPIDFD 77
>
> #if !defined(__KERNEL__)
>
> diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
> index 762dcb80e4ec..60ebaed28a4c 100644
> --- a/arch/mips/include/uapi/asm/socket.h
> +++ b/arch/mips/include/uapi/asm/socket.h
> @@ -149,6 +149,7 @@
> #define SO_RCVMARK 75
>
> #define SO_PASSPIDFD 76
> +#define SO_PEERPIDFD 77
>
> #if !defined(__KERNEL__)
>
> diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
> index df16a3e16d64..be264c2b1a11 100644
> --- a/arch/parisc/include/uapi/asm/socket.h
> +++ b/arch/parisc/include/uapi/asm/socket.h
> @@ -130,6 +130,7 @@
> #define SO_RCVMARK 0x4049
>
> #define SO_PASSPIDFD 0x404A
> +#define SO_PEERPIDFD 0x404B
>
> #if !defined(__KERNEL__)
>
> diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
> index 6e2847804fea..682da3714686 100644
> --- a/arch/sparc/include/uapi/asm/socket.h
> +++ b/arch/sparc/include/uapi/asm/socket.h
> @@ -131,6 +131,7 @@
> #define SO_RCVMARK 0x0054
>
> #define SO_PASSPIDFD 0x0055
> +#define SO_PEERPIDFD 0x0056
>
> #if !defined(__KERNEL__)
>
> diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
> index b76169fdb80b..8ce8a39a1e5f 100644
> --- a/include/uapi/asm-generic/socket.h
> +++ b/include/uapi/asm-generic/socket.h
> @@ -133,6 +133,7 @@
> #define SO_RCVMARK 75
>
> #define SO_PASSPIDFD 76
> +#define SO_PEERPIDFD 77
>
> #if !defined(__KERNEL__)
>
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 3f974246ba3e..2b040a69e355 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1763,6 +1763,39 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
> goto lenout;
> }
>
> + case SO_PEERPIDFD:
> + {
> + struct pid *peer_pid;
> + struct file *pidfd_file = NULL;
> + int pidfd;
> +
> + if (len > sizeof(pidfd))
> + len = sizeof(pidfd);
> +
> + spin_lock(&sk->sk_peer_lock);
> + peer_pid = get_pid(sk->sk_peer_pid);
> + spin_unlock(&sk->sk_peer_lock);
> +
> + pidfd = pidfd_prepare(peer_pid, 0, &pidfd_file);
> +
> + put_pid(peer_pid);

Would be a bit nicer if this would be:

pidfd = pidfd_prepare(peer_pid, 0, &pidfd_file);
put_pid(peer_pid);
if (pidfd < 0)
return pidfd;
if (copy_to_sockptr(optval, &pidfd, len) ||
copy_to_sockptr(optlen, &len, sizeof(int)))
return -EFAULT;

fd_install(pidfd, pidfd_file);
return 0;

Otherwise seems good enough to me.

> +
> + if (copy_to_sockptr(optval, &pidfd, len) ||
> + copy_to_sockptr(optlen, &len, sizeof(int))) {
> + if (pidfd >= 0) {
> + put_unused_fd(pidfd);
> + fput(pidfd_file);
> + }
> +
> + return -EFAULT;
> + }
> +
> + if (pidfd_file)
> + fd_install(pidfd, pidfd_file);
> +
> + return 0;
> + }
> +
> case SO_PEERGROUPS:
> {
> const struct cred *cred;
> diff --git a/net/socket.c b/net/socket.c
> index 9c1ef11de23f..505b85489354 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -2248,6 +2248,13 @@ static bool sockopt_installs_fd(int level, int optname)
> default:
> return false;
> }
> + } else if (level == SOL_SOCKET) {
> + switch (optname) {
> + case SO_PEERPIDFD:
> + return true;
> + default:
> + return false;
> + }
> }
>
> return false;
> diff --git a/tools/include/uapi/asm-generic/socket.h b/tools/include/uapi/asm-generic/socket.h
> index fbbc4bf53ee3..54d9c8bf7c55 100644
> --- a/tools/include/uapi/asm-generic/socket.h
> +++ b/tools/include/uapi/asm-generic/socket.h
> @@ -122,6 +122,7 @@
> #define SO_RCVMARK 75
>
> #define SO_PASSPIDFD 76
> +#define SO_PEERPIDFD 77
>
> #if !defined(__KERNEL__)
>
> --
> 2.34.1
>

2023-04-11 16:06:12

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: Re: [PATCH net-next v3 3/4] net: core: add getsockopt SO_PEERPIDFD

On Tue, Apr 11, 2023 at 5:57 PM Christian Brauner <[email protected]> wrote:
>
> On Tue, Apr 11, 2023 at 12:42:30PM +0200, Alexander Mikhalitsyn wrote:
> > Add SO_PEERPIDFD which allows to get pidfd of peer socket holder pidfd.
> > This thing is direct analog of SO_PEERCRED which allows to get plain PID.
> >
> > Cc: "David S. Miller" <[email protected]>
> > Cc: Eric Dumazet <[email protected]>
> > Cc: Jakub Kicinski <[email protected]>
> > Cc: Paolo Abeni <[email protected]>
> > Cc: Leon Romanovsky <[email protected]>
> > Cc: David Ahern <[email protected]>
> > Cc: Arnd Bergmann <[email protected]>
> > Cc: Kees Cook <[email protected]>
> > Cc: Christian Brauner <[email protected]>
> > Cc: Kuniyuki Iwashima <[email protected]>
> > Cc: Lennart Poettering <[email protected]>
> > Cc: Luca Boccassi <[email protected]>
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Tested-by: Luca Boccassi <[email protected]>
> > Signed-off-by: Alexander Mikhalitsyn <[email protected]>
> > ---
> > v3:
> > - fixed possible fd leak (thanks to Christian Brauner)
> > v2:
> > According to review comments from Kuniyuki Iwashima and Christian Brauner:
> > - use pidfd_create(..) retval as a result
> > - whitespace change
> > ---
> > arch/alpha/include/uapi/asm/socket.h | 1 +
> > arch/mips/include/uapi/asm/socket.h | 1 +
> > arch/parisc/include/uapi/asm/socket.h | 1 +
> > arch/sparc/include/uapi/asm/socket.h | 1 +
> > include/uapi/asm-generic/socket.h | 1 +
> > net/core/sock.c | 33 +++++++++++++++++++++++++
> > net/socket.c | 7 ++++++
> > tools/include/uapi/asm-generic/socket.h | 1 +
> > 8 files changed, 46 insertions(+)
> >
> > diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
> > index ff310613ae64..e94f621903fe 100644
> > --- a/arch/alpha/include/uapi/asm/socket.h
> > +++ b/arch/alpha/include/uapi/asm/socket.h
> > @@ -138,6 +138,7 @@
> > #define SO_RCVMARK 75
> >
> > #define SO_PASSPIDFD 76
> > +#define SO_PEERPIDFD 77
> >
> > #if !defined(__KERNEL__)
> >
> > diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
> > index 762dcb80e4ec..60ebaed28a4c 100644
> > --- a/arch/mips/include/uapi/asm/socket.h
> > +++ b/arch/mips/include/uapi/asm/socket.h
> > @@ -149,6 +149,7 @@
> > #define SO_RCVMARK 75
> >
> > #define SO_PASSPIDFD 76
> > +#define SO_PEERPIDFD 77
> >
> > #if !defined(__KERNEL__)
> >
> > diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
> > index df16a3e16d64..be264c2b1a11 100644
> > --- a/arch/parisc/include/uapi/asm/socket.h
> > +++ b/arch/parisc/include/uapi/asm/socket.h
> > @@ -130,6 +130,7 @@
> > #define SO_RCVMARK 0x4049
> >
> > #define SO_PASSPIDFD 0x404A
> > +#define SO_PEERPIDFD 0x404B
> >
> > #if !defined(__KERNEL__)
> >
> > diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
> > index 6e2847804fea..682da3714686 100644
> > --- a/arch/sparc/include/uapi/asm/socket.h
> > +++ b/arch/sparc/include/uapi/asm/socket.h
> > @@ -131,6 +131,7 @@
> > #define SO_RCVMARK 0x0054
> >
> > #define SO_PASSPIDFD 0x0055
> > +#define SO_PEERPIDFD 0x0056
> >
> > #if !defined(__KERNEL__)
> >
> > diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
> > index b76169fdb80b..8ce8a39a1e5f 100644
> > --- a/include/uapi/asm-generic/socket.h
> > +++ b/include/uapi/asm-generic/socket.h
> > @@ -133,6 +133,7 @@
> > #define SO_RCVMARK 75
> >
> > #define SO_PASSPIDFD 76
> > +#define SO_PEERPIDFD 77
> >
> > #if !defined(__KERNEL__)
> >
> > diff --git a/net/core/sock.c b/net/core/sock.c
> > index 3f974246ba3e..2b040a69e355 100644
> > --- a/net/core/sock.c
> > +++ b/net/core/sock.c
> > @@ -1763,6 +1763,39 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
> > goto lenout;
> > }
> >
> > + case SO_PEERPIDFD:
> > + {
> > + struct pid *peer_pid;
> > + struct file *pidfd_file = NULL;
> > + int pidfd;
> > +
> > + if (len > sizeof(pidfd))
> > + len = sizeof(pidfd);
> > +
> > + spin_lock(&sk->sk_peer_lock);
> > + peer_pid = get_pid(sk->sk_peer_pid);
> > + spin_unlock(&sk->sk_peer_lock);
> > +
> > + pidfd = pidfd_prepare(peer_pid, 0, &pidfd_file);
> > +
> > + put_pid(peer_pid);
>
> Would be a bit nicer if this would be:
>
> pidfd = pidfd_prepare(peer_pid, 0, &pidfd_file);
> put_pid(peer_pid);
> if (pidfd < 0)
> return pidfd;
> if (copy_to_sockptr(optval, &pidfd, len) ||
> copy_to_sockptr(optlen, &len, sizeof(int)))
{
put_unused_fd(pidfd);
fput(pidfd_file);

are still needed there, right?

> return -EFAULT;
}

>
> fd_install(pidfd, pidfd_file);
> return 0;
>
> Otherwise seems good enough to me.

Will do.

>
> > +
> > + if (copy_to_sockptr(optval, &pidfd, len) ||
> > + copy_to_sockptr(optlen, &len, sizeof(int))) {
> > + if (pidfd >= 0) {
> > + put_unused_fd(pidfd);
> > + fput(pidfd_file);
> > + }
> > +
> > + return -EFAULT;
> > + }
> > +
> > + if (pidfd_file)
> > + fd_install(pidfd, pidfd_file);
> > +
> > + return 0;
> > + }
> > +
> > case SO_PEERGROUPS:
> > {
> > const struct cred *cred;
> > diff --git a/net/socket.c b/net/socket.c
> > index 9c1ef11de23f..505b85489354 100644
> > --- a/net/socket.c
> > +++ b/net/socket.c
> > @@ -2248,6 +2248,13 @@ static bool sockopt_installs_fd(int level, int optname)
> > default:
> > return false;
> > }
> > + } else if (level == SOL_SOCKET) {
> > + switch (optname) {
> > + case SO_PEERPIDFD:
> > + return true;
> > + default:
> > + return false;
> > + }
> > }
> >
> > return false;
> > diff --git a/tools/include/uapi/asm-generic/socket.h b/tools/include/uapi/asm-generic/socket.h
> > index fbbc4bf53ee3..54d9c8bf7c55 100644
> > --- a/tools/include/uapi/asm-generic/socket.h
> > +++ b/tools/include/uapi/asm-generic/socket.h
> > @@ -122,6 +122,7 @@
> > #define SO_RCVMARK 75
> >
> > #define SO_PASSPIDFD 76
> > +#define SO_PEERPIDFD 77
> >
> > #if !defined(__KERNEL__)
> >
> > --
> > 2.34.1
> >

2023-04-13 13:51:12

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: Re: [PATCH net-next v3 1/4] scm: add SO_PASSPIDFD and SCM_PIDFD

Fixed in -v4

https://lore.kernel.org/netdev/[email protected]/T/#u

Kind regards,
Alex


On Tue, Apr 11, 2023 at 5:37 PM Christian Brauner <[email protected]> wrote:
>
> On Tue, Apr 11, 2023 at 12:42:28PM +0200, Alexander Mikhalitsyn wrote:
> > Implement SCM_PIDFD, a new type of CMSG type analogical to SCM_CREDENTIALS,
> > but it contains pidfd instead of plain pid, which allows programmers not
> > to care about PID reuse problem.
> >
> > Idea comes from UAPI kernel group:
> > https://uapi-group.org/kernel-features/
> >
> > Big thanks to Christian Brauner and Lennart Poettering for productive
> > discussions about this.
> >
> > Cc: "David S. Miller" <[email protected]>
> > Cc: Eric Dumazet <[email protected]>
> > Cc: Jakub Kicinski <[email protected]>
> > Cc: Paolo Abeni <[email protected]>
> > Cc: Leon Romanovsky <[email protected]>
> > Cc: David Ahern <[email protected]>
> > Cc: Arnd Bergmann <[email protected]>
> > Cc: Kees Cook <[email protected]>
> > Cc: Christian Brauner <[email protected]>
> > Cc: Kuniyuki Iwashima <[email protected]>
> > Cc: Lennart Poettering <[email protected]>
> > Cc: Luca Boccassi <[email protected]>
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Tested-by: Luca Boccassi <[email protected]>
> > Reviewed-by: Kuniyuki Iwashima <[email protected]>
> > Signed-off-by: Alexander Mikhalitsyn <[email protected]>
> > ---
> > v2:
> > According to review comments from Kuniyuki Iwashima and Christian Brauner:
> > - use pidfd_create(..) retval as a result
> > - whitespace change
> > ---
> > arch/alpha/include/uapi/asm/socket.h | 2 ++
> > arch/mips/include/uapi/asm/socket.h | 2 ++
> > arch/parisc/include/uapi/asm/socket.h | 2 ++
> > arch/sparc/include/uapi/asm/socket.h | 2 ++
> > include/linux/net.h | 1 +
> > include/linux/socket.h | 1 +
> > include/net/scm.h | 14 ++++++++++++--
> > include/uapi/asm-generic/socket.h | 2 ++
> > net/core/sock.c | 11 +++++++++++
> > net/mptcp/sockopt.c | 1 +
> > net/unix/af_unix.c | 18 +++++++++++++-----
> > tools/include/uapi/asm-generic/socket.h | 2 ++
> > 12 files changed, 51 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
> > index 739891b94136..ff310613ae64 100644
> > --- a/arch/alpha/include/uapi/asm/socket.h
> > +++ b/arch/alpha/include/uapi/asm/socket.h
> > @@ -137,6 +137,8 @@
> >
> > #define SO_RCVMARK 75
> >
> > +#define SO_PASSPIDFD 76
> > +
> > #if !defined(__KERNEL__)
> >
> > #if __BITS_PER_LONG == 64
> > diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
> > index 18f3d95ecfec..762dcb80e4ec 100644
> > --- a/arch/mips/include/uapi/asm/socket.h
> > +++ b/arch/mips/include/uapi/asm/socket.h
> > @@ -148,6 +148,8 @@
> >
> > #define SO_RCVMARK 75
> >
> > +#define SO_PASSPIDFD 76
> > +
> > #if !defined(__KERNEL__)
> >
> > #if __BITS_PER_LONG == 64
> > diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
> > index f486d3dfb6bb..df16a3e16d64 100644
> > --- a/arch/parisc/include/uapi/asm/socket.h
> > +++ b/arch/parisc/include/uapi/asm/socket.h
> > @@ -129,6 +129,8 @@
> >
> > #define SO_RCVMARK 0x4049
> >
> > +#define SO_PASSPIDFD 0x404A
> > +
> > #if !defined(__KERNEL__)
> >
> > #if __BITS_PER_LONG == 64
> > diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
> > index 2fda57a3ea86..6e2847804fea 100644
> > --- a/arch/sparc/include/uapi/asm/socket.h
> > +++ b/arch/sparc/include/uapi/asm/socket.h
> > @@ -130,6 +130,8 @@
> >
> > #define SO_RCVMARK 0x0054
> >
> > +#define SO_PASSPIDFD 0x0055
> > +
> > #if !defined(__KERNEL__)
> >
> >
> > diff --git a/include/linux/net.h b/include/linux/net.h
> > index b73ad8e3c212..c234dfbe7a30 100644
> > --- a/include/linux/net.h
> > +++ b/include/linux/net.h
> > @@ -43,6 +43,7 @@ struct net;
> > #define SOCK_PASSSEC 4
> > #define SOCK_SUPPORT_ZC 5
> > #define SOCK_CUSTOM_SOCKOPT 6
> > +#define SOCK_PASSPIDFD 7
> >
> > #ifndef ARCH_HAS_SOCKET_TYPES
> > /**
> > diff --git a/include/linux/socket.h b/include/linux/socket.h
> > index 13c3a237b9c9..6bf90f251910 100644
> > --- a/include/linux/socket.h
> > +++ b/include/linux/socket.h
> > @@ -177,6 +177,7 @@ static inline size_t msg_data_left(struct msghdr *msg)
> > #define SCM_RIGHTS 0x01 /* rw: access rights (array of int) */
> > #define SCM_CREDENTIALS 0x02 /* rw: struct ucred */
> > #define SCM_SECURITY 0x03 /* rw: security label */
> > +#define SCM_PIDFD 0x04 /* ro: pidfd (int) */
> >
> > struct ucred {
> > __u32 pid;
> > diff --git a/include/net/scm.h b/include/net/scm.h
> > index 585adc1346bd..0c717ae9c8db 100644
> > --- a/include/net/scm.h
> > +++ b/include/net/scm.h
> > @@ -124,8 +124,9 @@ static __inline__ void scm_recv(struct socket *sock, struct msghdr *msg,
> > struct scm_cookie *scm, int flags)
> > {
> > if (!msg->msg_control) {
> > - if (test_bit(SOCK_PASSCRED, &sock->flags) || scm->fp ||
> > - scm_has_secdata(sock))
> > + if (test_bit(SOCK_PASSCRED, &sock->flags) ||
> > + test_bit(SOCK_PASSPIDFD, &sock->flags) ||
> > + scm->fp || scm_has_secdata(sock))
> > msg->msg_flags |= MSG_CTRUNC;
> > scm_destroy(scm);
> > return;
> > @@ -141,6 +142,15 @@ static __inline__ void scm_recv(struct socket *sock, struct msghdr *msg,
> > put_cmsg(msg, SOL_SOCKET, SCM_CREDENTIALS, sizeof(ucreds), &ucreds);
> > }
> >
> > + if (test_bit(SOCK_PASSPIDFD, &sock->flags)) {
> > + int pidfd;
> > +
> > + WARN_ON_ONCE(!scm->pid);
> > + pidfd = pidfd_create(scm->pid, 0);
> > +
> > + put_cmsg(msg, SOL_SOCKET, SCM_PIDFD, sizeof(int), &pidfd);
>
> I know you already mentioned that you accidently missed to change this
> to not leak an fd. But just so we keep track of it see the comment to v2
> https://lore.kernel.org/netdev/20230322154817.c6qasnixow452e6x@wittgenstein/#t