2023-12-07 17:11:03

by Tycho Andersen

[permalink] [raw]
Subject: [PATCH v2 1/3] pidfd: allow pidfd_open() on non-thread-group leaders

From: Tycho Andersen <[email protected]>

We are using the pidfd family of syscalls with the seccomp userspace
notifier. When some thread triggers a seccomp notification, we want to do
some things to its context (munge fd tables via pidfd_getfd(), maybe write
to its memory, etc.). However, threads created with ~CLONE_FILES or
~CLONE_VM mean that we can't use the pidfd family of syscalls for this
purpose, since their fd table or mm are distinct from the thread group
leader's. In this patch, we relax this restriction for pidfd_open().

In order to avoid dangling poll() users we need to notify pidfd waiters
when individual threads die, but once we do that all the other machinery
seems to work ok viz. the tests. But I suppose there are more cases than
just this one.

Signed-off-by: Tycho Andersen <[email protected]>
--
v2: unify pidfd notification to all go through do_notify_pidfd() inside of
__exit_signals() suggested by Oleg.
Link to v1: https://lore.kernel.org/all/[email protected]/
---
include/linux/sched/signal.h | 1 +
kernel/exit.c | 3 +++
kernel/fork.c | 4 +---
kernel/pid.c | 11 +----------
kernel/signal.c | 5 +----
5 files changed, 7 insertions(+), 17 deletions(-)

diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index 3499c1a8b929..37d6b4e4ab70 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -332,6 +332,7 @@ extern int kill_pid_usb_asyncio(int sig, int errno, sigval_t addr, struct pid *,
extern int kill_pgrp(struct pid *pid, int sig, int priv);
extern int kill_pid(struct pid *pid, int sig, int priv);
extern __must_check bool do_notify_parent(struct task_struct *, int);
+extern void do_notify_pidfd(struct task_struct *task);
extern void __wake_up_parent(struct task_struct *p, struct task_struct *parent);
extern void force_sig(int);
extern void force_fatal_sig(int);
diff --git a/kernel/exit.c b/kernel/exit.c
index ee9f43bed49a..7bb6488ebd79 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -149,6 +149,9 @@ static void __exit_signal(struct task_struct *tsk)
struct tty_struct *tty;
u64 utime, stime;

+ /* Wake up all pidfd waiters */
+ do_notify_pidfd(tsk);
+
sighand = rcu_dereference_check(tsk->sighand,
lockdep_tasklist_lock_is_held());
spin_lock(&sighand->siglock);
diff --git a/kernel/fork.c b/kernel/fork.c
index 10917c3e1f03..eef15c93f6cf 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2163,8 +2163,6 @@ static int __pidfd_prepare(struct pid *pid, unsigned int flags, struct file **re
* Allocate a new file that stashes @pid and reserve a new pidfd number in the
* caller's file descriptor table. The pidfd is reserved but not installed yet.
*
- * The helper verifies that @pid is used as a thread group leader.
- *
* If this function returns successfully the caller is responsible to either
* call fd_install() passing the returned pidfd and pidfd file as arguments in
* order to install the pidfd into its file descriptor table or they must use
@@ -2182,7 +2180,7 @@ static int __pidfd_prepare(struct pid *pid, unsigned int flags, struct file **re
*/
int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret)
{
- if (!pid || !pid_has_task(pid, PIDTYPE_TGID))
+ if (!pid)
return -EINVAL;

return __pidfd_prepare(pid, flags, ret);
diff --git a/kernel/pid.c b/kernel/pid.c
index 6500ef956f2f..4806798022d9 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -552,11 +552,6 @@ struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags)
* Return the task associated with @pidfd. The function takes a reference on
* the returned task. The caller is responsible for releasing that reference.
*
- * Currently, the process identified by @pidfd is always a thread-group leader.
- * This restriction currently exists for all aspects of pidfds including pidfd
- * creation (CLONE_PIDFD cannot be used with CLONE_THREAD) and pidfd polling
- * (only supports thread group leaders).
- *
* Return: On success, the task_struct associated with the pidfd.
* On error, a negative errno number will be returned.
*/
@@ -615,11 +610,7 @@ int pidfd_create(struct pid *pid, unsigned int flags)
* @flags: flags to pass
*
* This creates a new pid file descriptor with the O_CLOEXEC flag set for
- * the process identified by @pid. Currently, the process identified by
- * @pid must be a thread-group leader. This restriction currently exists
- * for all aspects of pidfds including pidfd creation (CLONE_PIDFD cannot
- * be used with CLONE_THREAD) and pidfd polling (only supports thread group
- * leaders).
+ * the process identified by @pid.
*
* Return: On success, a cloexec pidfd is returned.
* On error, a negative errno number will be returned.
diff --git a/kernel/signal.c b/kernel/signal.c
index 47a7602dfe8d..7b3a1e147225 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2028,7 +2028,7 @@ int send_sigqueue(struct sigqueue *q, struct pid *pid, enum pid_type type)
return ret;
}

-static void do_notify_pidfd(struct task_struct *task)
+void do_notify_pidfd(struct task_struct *task)
{
struct pid *pid;

@@ -2060,9 +2060,6 @@ bool do_notify_parent(struct task_struct *tsk, int sig)
WARN_ON_ONCE(!tsk->ptrace &&
(tsk->group_leader != tsk || !thread_group_empty(tsk)));

- /* Wake up all pidfd waiters */
- do_notify_pidfd(tsk);
-
if (sig != SIGCHLD) {
/*
* This is only possible if parent == real_parent.

base-commit: bee0e7762ad2c6025b9f5245c040fcc36ef2bde8
--
2.34.1


2023-12-07 17:11:08

by Tycho Andersen

[permalink] [raw]
Subject: [PATCH v2 3/3] clone: allow CLONE_THREAD | CLONE_PIDFD together

From: Tycho Andersen <[email protected]>

This removes the restriction of CLONE_THREAD | CLONE_PIDFD being specified
together. Assuming the previous patch sorts out all the thorny issues this
should be safe. I've left it as a separate patch since it is not strictly
necessary as a usecase for us, but might be nice? Perhaps we want to wait
until someone actually needs it though.

Signed-off-by: Tycho Andersen <[email protected]>
---
kernel/fork.c | 3 +--
.../selftests/pidfd/pidfd_non_tgl_test.c | 17 +++++++++++++++++
2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index eef15c93f6cf..ada476f38b56 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2302,9 +2302,8 @@ __latent_entropy struct task_struct *copy_process(
/*
* - CLONE_DETACHED is blocked so that we can potentially
* reuse it later for CLONE_PIDFD.
- * - CLONE_THREAD is blocked until someone really needs it.
*/
- if (clone_flags & (CLONE_DETACHED | CLONE_THREAD))
+ if (clone_flags & CLONE_DETACHED)
return ERR_PTR(-EINVAL);
}

diff --git a/tools/testing/selftests/pidfd/pidfd_non_tgl_test.c b/tools/testing/selftests/pidfd/pidfd_non_tgl_test.c
index e3992f2d88cf..dfd6a2cd85a3 100644
--- a/tools/testing/selftests/pidfd/pidfd_non_tgl_test.c
+++ b/tools/testing/selftests/pidfd/pidfd_non_tgl_test.c
@@ -305,6 +305,22 @@ static int test_non_tgl_exit(void)
return ret;
}

+static int test_clone_thread_pidfd(void)
+{
+ pid_t pid;
+ int flags = CLONE_THREAD | CLONE_VM | CLONE_SIGHAND | CLONE_PIDFD;
+ int pidfd;
+
+ pid = clone(thread_sleep, stack + STACK_SIZE, flags, NULL, &pidfd);
+ if (pid < 0) {
+ perror("clone");
+ return KSFT_FAIL;
+ }
+
+ close(pidfd);
+ return KSFT_PASS;
+}
+
#define T(x) { x, #x }
struct pidfd_non_tgl_test {
int (*fn)();
@@ -313,6 +329,7 @@ struct pidfd_non_tgl_test {
T(test_non_tgl_basic),
T(test_non_tgl_exec),
T(test_non_tgl_exit),
+ T(test_clone_thread_pidfd),
};
#undef T

--
2.34.1

2023-12-07 17:11:14

by Tycho Andersen

[permalink] [raw]
Subject: [PATCH v2 2/3] selftests/pidfd: add non-thread-group leader tests

From: Tycho Andersen <[email protected]>

This adds a family of tests for various behaviors for pidfds of
non-thread-group leaders.

Signed-off-by: Tycho Andersen <[email protected]>
---
tools/testing/selftests/pidfd/.gitignore | 1 +
tools/testing/selftests/pidfd/Makefile | 3 +-
.../selftests/pidfd/pidfd_non_tgl_test.c | 339 ++++++++++++++++++
3 files changed, 342 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/pidfd/.gitignore b/tools/testing/selftests/pidfd/.gitignore
index 973198a3ec3d..e7532e84a34a 100644
--- a/tools/testing/selftests/pidfd/.gitignore
+++ b/tools/testing/selftests/pidfd/.gitignore
@@ -6,3 +6,4 @@ pidfd_wait
pidfd_fdinfo_test
pidfd_getfd_test
pidfd_setns_test
+pidfd_non_tgl_test
diff --git a/tools/testing/selftests/pidfd/Makefile b/tools/testing/selftests/pidfd/Makefile
index d731e3e76d5b..50e3aa9de05a 100644
--- a/tools/testing/selftests/pidfd/Makefile
+++ b/tools/testing/selftests/pidfd/Makefile
@@ -2,7 +2,8 @@
CFLAGS += -g $(KHDR_INCLUDES) -pthread -Wall

TEST_GEN_PROGS := pidfd_test pidfd_fdinfo_test pidfd_open_test \
- pidfd_poll_test pidfd_wait pidfd_getfd_test pidfd_setns_test
+ pidfd_poll_test pidfd_wait pidfd_getfd_test pidfd_setns_test \
+ pidfd_non_tgl_test

include ../lib.mk

diff --git a/tools/testing/selftests/pidfd/pidfd_non_tgl_test.c b/tools/testing/selftests/pidfd/pidfd_non_tgl_test.c
new file mode 100644
index 000000000000..e3992f2d88cf
--- /dev/null
+++ b/tools/testing/selftests/pidfd/pidfd_non_tgl_test.c
@@ -0,0 +1,339 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+#include <sys/socket.h>
+#include <limits.h>
+#include <string.h>
+#include <signal.h>
+#include <syscall.h>
+#include <sched.h>
+#include <poll.h>
+
+#include "../kselftest.h"
+#include "pidfd.h"
+
+// glibc defaults to 8MB stacks
+#define STACK_SIZE (8 * 1024 * 1024)
+static char stack[STACK_SIZE];
+
+static int thread_sleep(void *)
+{
+ while (1)
+ sleep(100);
+ return 1;
+}
+
+static int fork_task_with_thread(int (*fn)(void *), int sk_pair[2],
+ pid_t *tgl, pid_t *thread, int *tgl_pidfd,
+ int *thread_pidfd)
+{
+ *tgl_pidfd = *thread_pidfd = -1;
+
+ *tgl = fork();
+ if (*tgl < 0) {
+ perror("fork");
+ return -1;
+ }
+
+ if (!*tgl) {
+ int flags = CLONE_THREAD | CLONE_VM | CLONE_SIGHAND;
+ pid_t t;
+
+ t = clone(fn, stack + STACK_SIZE, flags, sk_pair);
+ if (t < 0) {
+ perror("clone");
+ exit(1);
+ }
+
+ close(sk_pair[1]);
+
+ if (write(sk_pair[0], &t, sizeof(t)) != sizeof(t)) {
+ perror("read");
+ exit(1);
+ }
+
+ // wait to get killed for various reasons by the tests.
+ while (1)
+ sleep(100);
+ }
+
+ errno = 0;
+ if (read(sk_pair[1], thread, sizeof(*thread)) != sizeof(*thread)) {
+ perror("read");
+ goto cleanup;
+ }
+
+ *tgl_pidfd = sys_pidfd_open(*tgl, 0);
+ if (*tgl_pidfd < 0) {
+ perror("pidfd_open tgl");
+ goto cleanup;
+ }
+
+ *thread_pidfd = sys_pidfd_open(*thread, 0);
+ if (*thread_pidfd < 0) {
+ perror("pidfd");
+ goto cleanup;
+ }
+
+ return 0;
+
+cleanup:
+ kill(*tgl, SIGKILL);
+ if (*tgl_pidfd >= 0)
+ close(*tgl_pidfd);
+ if (*thread_pidfd >= 0)
+ close(*thread_pidfd);
+ return -1;
+}
+
+static int test_non_tgl_basic(void)
+{
+ pid_t tgl, thread;
+ int sk_pair[2], status;
+ int tgl_pidfd = -1, thread_pidfd = -1;
+ int ret = KSFT_FAIL;
+
+ if (socketpair(PF_LOCAL, SOCK_SEQPACKET, 0, sk_pair) < 0) {
+ ksft_print_msg("socketpair failed %s\n", strerror(errno));
+ return KSFT_FAIL;
+ }
+
+ if (fork_task_with_thread(thread_sleep, sk_pair, &tgl, &thread,
+ &tgl_pidfd, &thread_pidfd) < 0) {
+ return KSFT_FAIL;
+ }
+
+ /*
+ * KILL of a thread should still kill everyone
+ */
+ if (sys_pidfd_send_signal(thread_pidfd, SIGKILL, NULL, 0)) {
+ perror("pidfd_send_signal");
+ goto cleanup;
+ }
+
+ errno = 0;
+ if (waitpid(tgl, &status, 0) != tgl) {
+ perror("waitpid tgl");
+ goto cleanup;
+ }
+
+ if (!WIFSIGNALED(status) || WTERMSIG(status) != SIGKILL) {
+ ksft_print_msg("bad exit status %x\n", status);
+ goto cleanup;
+ }
+
+ ret = KSFT_PASS;
+
+cleanup:
+ close(sk_pair[0]);
+ close(sk_pair[1]);
+ close(tgl_pidfd);
+ close(thread_pidfd);
+ return ret;
+}
+
+static int thread_exec(void *arg)
+{
+ int *sk_pair = arg;
+ pid_t thread;
+
+ if (read(sk_pair[0], &thread, sizeof(thread)) != sizeof(thread)) {
+ perror("read");
+ exit(1);
+ }
+
+ execlp("/bin/true", "/bin/true", NULL);
+ return 1;
+}
+
+static int test_non_tgl_exec(void)
+{
+ pid_t tgl, thread;
+ int sk_pair[2];
+ int tgl_pidfd = -1, thread_pidfd = -1;
+ int ret = KSFT_FAIL, ready;
+ struct pollfd pollfd;
+
+ if (socketpair(PF_LOCAL, SOCK_SEQPACKET, 0, sk_pair) < 0) {
+ ksft_print_msg("socketpair failed %s\n", strerror(errno));
+ return KSFT_FAIL;
+ }
+
+ if (fork_task_with_thread(thread_exec, sk_pair, &tgl, &thread,
+ &tgl_pidfd, &thread_pidfd) < 0) {
+ return KSFT_FAIL;
+ }
+
+ if (write(sk_pair[1], &thread, sizeof(thread)) != sizeof(thread)) {
+ perror("read");
+ goto cleanup;
+ }
+
+ // thread will exec(), so this pidfd should eventually be dead (i.e.
+ // poll should return).
+ pollfd.fd = thread_pidfd;
+ pollfd.events = POLLIN;
+
+ ready = poll(&pollfd, 1, -1);
+ if (ready == -1) {
+ perror("poll");
+ goto cleanup;
+ }
+
+ if (ready != 1) {
+ ksft_print_msg("bad poll result %d\n", ready);
+ goto cleanup;
+ }
+
+ if (pollfd.revents != POLLIN) {
+ ksft_print_msg("bad poll revents: %x\n", pollfd.revents);
+ goto cleanup;
+ }
+
+ errno = 0;
+ if (sys_pidfd_getfd(thread_pidfd, 0, 0) > 0) {
+ ksft_print_msg("got a real fd");
+ goto cleanup;
+ }
+
+ if (errno != ESRCH) {
+ ksft_print_msg("polling invalid thread didn't give ESRCH");
+ goto cleanup;
+ }
+
+ ret = KSFT_PASS;
+
+cleanup:
+ close(sk_pair[0]);
+ close(sk_pair[1]);
+ close(tgl_pidfd);
+ close(thread_pidfd);
+ return ret;
+}
+
+int thread_wait_exit(void *arg)
+{
+ int *sk_pair = arg;
+ pid_t thread;
+
+ if (read(sk_pair[0], &thread, sizeof(thread)) != sizeof(thread)) {
+ perror("read");
+ exit(1);
+ }
+
+ return 0;
+}
+
+static int test_non_tgl_exit(void)
+{
+ pid_t tgl, thread;
+ int sk_pair[2], status;
+ int tgl_pidfd = -1, thread_pidfd = -1;
+ int ret = KSFT_FAIL, ready;
+ struct pollfd pollfd;
+
+ if (socketpair(PF_LOCAL, SOCK_SEQPACKET, 0, sk_pair) < 0) {
+ ksft_print_msg("socketpair failed %s\n", strerror(errno));
+ return KSFT_FAIL;
+ }
+
+ if (fork_task_with_thread(thread_wait_exit, sk_pair, &tgl, &thread,
+ &tgl_pidfd, &thread_pidfd) < 0) {
+ return KSFT_FAIL;
+ }
+
+ if (write(sk_pair[1], &thread, sizeof(thread)) != sizeof(thread)) {
+ perror("write");
+ goto cleanup;
+ }
+
+ // thread will exit, so this pidfd should eventually be dead (i.e.
+ // poll should return).
+ pollfd.fd = thread_pidfd;
+ pollfd.events = POLLIN;
+
+ ready = poll(&pollfd, 1, -1);
+ if (ready == -1) {
+ perror("poll");
+ goto cleanup;
+ }
+
+ if (ready != 1) {
+ ksft_print_msg("bad poll result %d\n", ready);
+ goto cleanup;
+ }
+
+ if (pollfd.revents != POLLIN) {
+ ksft_print_msg("bad poll revents: %x\n", pollfd.revents);
+ goto cleanup;
+ }
+
+ errno = 0;
+ if (sys_pidfd_getfd(thread_pidfd, 0, 0) > 0) {
+ ksft_print_msg("got a real pidfd");
+ goto cleanup;
+ }
+
+ if (errno != ESRCH) {
+ ksft_print_msg("polling invalid thread didn't give ESRCH");
+ goto cleanup;
+ }
+
+ close(thread_pidfd);
+
+ // ok, but the thread group *leader* should still be alive
+ pollfd.fd = tgl_pidfd;
+ ready = poll(&pollfd, 1, 1);
+ if (ready == -1) {
+ perror("poll");
+ goto cleanup;
+ }
+
+ if (ready != 0) {
+ ksft_print_msg("polling leader returned something?! %x", pollfd.revents);
+ goto cleanup;
+ }
+
+ ret = KSFT_PASS;
+
+cleanup:
+ kill(tgl, SIGKILL);
+ if (waitpid(tgl, &status, 0) != tgl) {
+ perror("waitpid");
+ return KSFT_FAIL;
+ }
+
+ return ret;
+}
+
+#define T(x) { x, #x }
+struct pidfd_non_tgl_test {
+ int (*fn)();
+ const char *name;
+} tests[] = {
+ T(test_non_tgl_basic),
+ T(test_non_tgl_exec),
+ T(test_non_tgl_exit),
+};
+#undef T
+
+int main(int argc, char *argv[])
+{
+ int i, ret = EXIT_SUCCESS;
+
+ for (i = 0; i < ARRAY_SIZE(tests); i++) {
+ switch (tests[i].fn()) {
+ case KSFT_PASS:
+ ksft_test_result_pass("%s\n", tests[i].name);
+ break;
+ case KSFT_SKIP:
+ ksft_test_result_skip("%s\n", tests[i].name);
+ break;
+ default:
+ ret = EXIT_FAILURE;
+ ksft_test_result_fail("%s\n", tests[i].name);
+ break;
+ }
+ }
+
+ return ret;
+}
--
2.34.1

2023-12-11 07:29:07

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] pidfd: allow pidfd_open() on non-thread-group leaders



Hello,

kernel test robot noticed "kernel-selftests.pidfd.pidfd_test.fail" on:

commit: e6d9be676d2c1fa8332c63c4382b8d3227fca991 ("[PATCH v2 1/3] pidfd: allow pidfd_open() on non-thread-group leaders")
url: https://github.com/intel-lab-lkp/linux/commits/Tycho-Andersen/selftests-pidfd-add-non-thread-group-leader-tests/20231208-011135
patch link: https://lore.kernel.org/all/[email protected]/
patch subject: [PATCH v2 1/3] pidfd: allow pidfd_open() on non-thread-group leaders

in testcase: kernel-selftests
version: kernel-selftests-x86_64-60acb023-1_20230329
with following parameters:

group: pidfd



compiler: gcc-12
test machine: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 32G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-lkp/[email protected]


besides, we also observed kernel-selftests.pidfd.pidfd_poll_test.fail on this
commit, but clean on parent:

bee0e7762ad2c602 e6d9be676d2c1fa8332c63c4382
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:6 100% 6:6 kernel-selftests.pidfd.pidfd_poll_test.fail
:6 100% 6:6 kernel-selftests.pidfd.pidfd_test.fail



TAP version 13
1..7
# timeout set to 300
# selftests: pidfd: pidfd_test
# TAP version 13
# 1..8
# # Parent: pid: 2191
# # Parent: Waiting for Child (2192) to complete.
# # Child (pidfd): starting. pid 2192 tid 2192
# # Child Thread: starting. pid 2192 tid 2193 ; and sleeping
# # Child Thread: doing exec of sleep
# Bail out! pidfd_poll check for premature notification on child thread exec test: Unexpected epoll_wait result (c=0, events=0) (errno 0)
# # Planned tests != run tests (8 != 0)
# # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
not ok 1 selftests: pidfd: pidfd_test # exit=1

...

# timeout set to 300
# selftests: pidfd: pidfd_poll_test
# # running pidfd poll test for 10000 iterations
# Bail out! death notification wait timeout
# # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
not ok 4 selftests: pidfd: pidfd_poll_test # exit=1




The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231211/[email protected]



--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

2023-12-13 12:18:17

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] pidfd: allow pidfd_open() on non-thread-group leaders

On Mon, Dec 11, 2023 at 03:28:09PM +0800, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed "kernel-selftests.pidfd.pidfd_test.fail" on:
>
> commit: e6d9be676d2c1fa8332c63c4382b8d3227fca991 ("[PATCH v2 1/3] pidfd: allow pidfd_open() on non-thread-group leaders")
> url: https://github.com/intel-lab-lkp/linux/commits/Tycho-Andersen/selftests-pidfd-add-non-thread-group-leader-tests/20231208-011135
> patch link: https://lore.kernel.org/all/[email protected]/
> patch subject: [PATCH v2 1/3] pidfd: allow pidfd_open() on non-thread-group leaders
>
> in testcase: kernel-selftests
> version: kernel-selftests-x86_64-60acb023-1_20230329
> with following parameters:
>
> group: pidfd
>
>
>
> compiler: gcc-12
> test machine: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 32G memory
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <[email protected]>
> | Closes: https://lore.kernel.org/oe-lkp/[email protected]
>
>
> besides, we also observed kernel-selftests.pidfd.pidfd_poll_test.fail on this
> commit, but clean on parent:
>
> bee0e7762ad2c602 e6d9be676d2c1fa8332c63c4382
> ---------------- ---------------------------
> fail:runs %reproduction fail:runs
> | | |
> :6 100% 6:6 kernel-selftests.pidfd.pidfd_poll_test.fail
> :6 100% 6:6 kernel-selftests.pidfd.pidfd_test.fail
>
>
>
> TAP version 13
> 1..7
> # timeout set to 300
> # selftests: pidfd: pidfd_test
> # TAP version 13
> # 1..8
> # # Parent: pid: 2191
> # # Parent: Waiting for Child (2192) to complete.
> # # Child (pidfd): starting. pid 2192 tid 2192
> # # Child Thread: starting. pid 2192 tid 2193 ; and sleeping
> # # Child Thread: doing exec of sleep
> # Bail out! pidfd_poll check for premature notification on child thread exec test: Unexpected epoll_wait result (c=0, events=0) (errno 0)

So it seems that this broke multi-threaded exit notifications.

2023-12-13 19:18:43

by Tycho Andersen

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] pidfd: allow pidfd_open() on non-thread-group leaders

On Wed, Dec 13, 2023 at 01:18:03PM +0100, Christian Brauner wrote:
> On Mon, Dec 11, 2023 at 03:28:09PM +0800, kernel test robot wrote:
> >
> >
> > Hello,
> >
> > kernel test robot noticed "kernel-selftests.pidfd.pidfd_test.fail" on:
> >
> > commit: e6d9be676d2c1fa8332c63c4382b8d3227fca991 ("[PATCH v2 1/3] pidfd: allow pidfd_open() on non-thread-group leaders")
> > url: https://github.com/intel-lab-lkp/linux/commits/Tycho-Andersen/selftests-pidfd-add-non-thread-group-leader-tests/20231208-011135
> > patch link: https://lore.kernel.org/all/[email protected]/
> > patch subject: [PATCH v2 1/3] pidfd: allow pidfd_open() on non-thread-group leaders
> >
> > in testcase: kernel-selftests
> > version: kernel-selftests-x86_64-60acb023-1_20230329
> > with following parameters:
> >
> > group: pidfd
> >
> >
> >
> > compiler: gcc-12
> > test machine: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 32G memory
> >
> > (please refer to attached dmesg/kmsg for entire log/backtrace)
> >
> >
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <[email protected]>
> > | Closes: https://lore.kernel.org/oe-lkp/[email protected]
> >
> >
> > besides, we also observed kernel-selftests.pidfd.pidfd_poll_test.fail on this
> > commit, but clean on parent:
> >
> > bee0e7762ad2c602 e6d9be676d2c1fa8332c63c4382
> > ---------------- ---------------------------
> > fail:runs %reproduction fail:runs
> > | | |
> > :6 100% 6:6 kernel-selftests.pidfd.pidfd_poll_test.fail
> > :6 100% 6:6 kernel-selftests.pidfd.pidfd_test.fail
> >
> >
> >
> > TAP version 13
> > 1..7
> > # timeout set to 300
> > # selftests: pidfd: pidfd_test
> > # TAP version 13
> > # 1..8
> > # # Parent: pid: 2191
> > # # Parent: Waiting for Child (2192) to complete.
> > # # Child (pidfd): starting. pid 2192 tid 2192
> > # # Child Thread: starting. pid 2192 tid 2193 ; and sleeping
> > # # Child Thread: doing exec of sleep
> > # Bail out! pidfd_poll check for premature notification on child thread exec test: Unexpected epoll_wait result (c=0, events=0) (errno 0)
>
> So it seems that this broke multi-threaded exit notifications.

Yeah... I've been trying to figure out how to fix it.

de_thread() calls release_task() for the original leader, which I
didn't realize.

Tycho