Hi,
This seven series fix an issue reported by kernel test robot [3].
Shuah, I (as well as Kees and Sean [4]) think this should be in -next
really soon to make sure everything works fine for the v6.9 release,
which is not currently the case. I cannot test against all kselftests
though. I would prefer to let you handle this, but I guess you're not
able to do so and I'll push it on my branch without reply from you.
Even if I push it on my branch, please push it on yours too as soon as
you see this and I'll remove it from mine.
Mark, Jakub, could you please test this series?
As reported by Kernel Test Robot [1] and Sean Christopherson [2], some
tests fail since v6.9-rc1 . This is due to the use of vfork() which
introduced some side effects. Similarly, while making it more generic,
a previous commit made some Landlock file system tests flaky, and
subject to the host's file system mount configuration.
This series fixes all these side effects by replacing vfork() with
clone3() and CLONE_VFORK, which is cleaner (no arbitrary shared memory)
and makes the Kselftest framework more robust.
I tried different approaches and I found this one to be the cleaner and
less invasive for current test cases.
I successfully ran the following tests (using TEST_F and
fork/clone/clone3, and KVM_ONE_VCPU_TEST) with this series:
- kvm:fix_hypercall_test
- kvm:sync_regs_test
- kvm:userspace_msr_exit_test
- kvm:vmx_pmu_caps_test
- landlock:fs_test
- landlock:net_test
- landlock:ptrace_test
- move_mount_set_group:move_mount_set_group_test
- net/af_unix:scm_pidfd
- perf_events:remove_on_exec
- pidfd:pidfd_getfd_test
- pidfd:pidfd_setns_test
- seccomp:seccomp_bpf
- user_events:abi_test
[1] https://lore.kernel.org/oe-lkp/[email protected]
[2] https://lore.kernel.org/r/[email protected]
[3] https://lore.kernel.org/r/[email protected]
[4] https://lore.kernel.org/r/202405061002.01D399877A@keescook
Previous versions:
v1: https://lore.kernel.org/r/[email protected]
v2: https://lore.kernel.org/r/[email protected]
v3: https://lore.kernel.org/r/[email protected]
v4: https://lore.kernel.org/r/[email protected]
v5: https://lore.kernel.org/r/[email protected]
v6: https://lore.kernel.org/r/[email protected]
Regards,
Mickaël Salaün (10):
selftests/pidfd: Fix config for pidfd_setns_test
selftests/landlock: Fix FS tests when run on a private mount point
selftests/harness: Fix fixture teardown
selftests/harness: Fix interleaved scheduling leading to race
conditions
selftests/landlock: Do not allocate memory in fixture data
selftests/harness: Constify fixture variants
selftests/pidfd: Fix wrong expectation
selftests/harness: Share _metadata between forked processes
selftests/harness: Fix vfork() side effects
selftests/harness: Handle TEST_F()'s explicit exit codes
tools/testing/selftests/kselftest_harness.h | 127 +++++++++++++-----
tools/testing/selftests/landlock/fs_test.c | 83 +++++++-----
tools/testing/selftests/pidfd/config | 2 +
.../selftests/pidfd/pidfd_setns_test.c | 2 +-
4 files changed, 147 insertions(+), 67 deletions(-)
base-commit: e67572cd2204894179d89bd7b984072f19313b03
--
2.45.0
Replace a wrong EXPECT_GT(self->child_pid_exited, 0) with EXPECT_GE(),
which will be actually tested on the parent and child sides with a
following commit.
Cc: Shuah Khan <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Christian Brauner <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
Changes since v1:
* Extract change from a bigger patch (suggested by Kees).
---
tools/testing/selftests/pidfd/pidfd_setns_test.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/pidfd/pidfd_setns_test.c b/tools/testing/selftests/pidfd/pidfd_setns_test.c
index 6e2f2cd400ca..47746b0c6acd 100644
--- a/tools/testing/selftests/pidfd/pidfd_setns_test.c
+++ b/tools/testing/selftests/pidfd/pidfd_setns_test.c
@@ -158,7 +158,7 @@ FIXTURE_SETUP(current_nsset)
/* Create task that exits right away. */
self->child_pid_exited = create_child(&self->child_pidfd_exited,
CLONE_NEWUSER | CLONE_NEWNET);
- EXPECT_GT(self->child_pid_exited, 0);
+ EXPECT_GE(self->child_pid_exited, 0);
if (self->child_pid_exited == 0)
_exit(EXIT_SUCCESS);
--
2.45.0
According to the test environment, the mount point of the test's working
directory may be shared or not, which changes the visibility of the
nested "tmp" mount point for the test's parent process calling
umount("tmp").
This was spotted while running tests in containers [1], where mount
points are private.
Cc: Günther Noack <[email protected]>
Cc: Shuah Khan <[email protected]>
Link: https://github.com/landlock-lsm/landlock-test-tools/pull/4 [1]
Fixes: 41cca0542d7c ("selftests/harness: Fix TEST_F()'s vfork handling")
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
Changes since v1:
* Update commit description.
---
tools/testing/selftests/landlock/fs_test.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
index 9a6036fbf289..46b9effd53e4 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -293,7 +293,15 @@ static void prepare_layout(struct __test_metadata *const _metadata)
static void cleanup_layout(struct __test_metadata *const _metadata)
{
set_cap(_metadata, CAP_SYS_ADMIN);
- EXPECT_EQ(0, umount(TMP_DIR));
+ if (umount(TMP_DIR)) {
+ /*
+ * According to the test environment, the mount point of the
+ * current directory may be shared or not, which changes the
+ * visibility of the nested TMP_DIR mount point for the test's
+ * parent process doing this cleanup.
+ */
+ ASSERT_EQ(EINVAL, errno);
+ }
clear_cap(_metadata, CAP_SYS_ADMIN);
EXPECT_EQ(0, remove_path(TMP_DIR));
}
--
2.45.0
Make sure fixture teardowns are run when test cases failed, including
when _metadata->teardown_parent is set to true.
Make sure only one fixture teardown is run per test case, handling the
case where the test child forks.
Cc: Jakub Kicinski <[email protected]>
Cc: Shengyu Li <[email protected]>
Cc: Shuah Khan <[email protected]>
Fixes: 72d7cb5c190b ("selftests/harness: Prevent infinite loop due to Assert in FIXTURE_TEARDOWN")
Fixes: 0710a1a73fb4 ("selftests/harness: Merge TEST_F_FORK() into TEST_F()")
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
tools/testing/selftests/kselftest_harness.h | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h
index d98702b6955d..55699a762c45 100644
--- a/tools/testing/selftests/kselftest_harness.h
+++ b/tools/testing/selftests/kselftest_harness.h
@@ -382,7 +382,10 @@
FIXTURE_DATA(fixture_name) self; \
pid_t child = 1; \
int status = 0; \
- bool jmp = false; \
+ /* Makes sure there is only one teardown, even when child forks again. */ \
+ bool *teardown = mmap(NULL, sizeof(*teardown), \
+ PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0); \
+ *teardown = false; \
memset(&self, 0, sizeof(FIXTURE_DATA(fixture_name))); \
if (setjmp(_metadata->env) == 0) { \
/* Use the same _metadata. */ \
@@ -399,15 +402,16 @@
_metadata->exit_code = KSFT_FAIL; \
} \
} \
- else \
- jmp = true; \
if (child == 0) { \
- if (_metadata->setup_completed && !_metadata->teardown_parent && !jmp) \
+ if (_metadata->setup_completed && !_metadata->teardown_parent && \
+ __sync_bool_compare_and_swap(teardown, false, true)) \
fixture_name##_teardown(_metadata, &self, variant->data); \
_exit(0); \
} \
- if (_metadata->setup_completed && _metadata->teardown_parent) \
+ if (_metadata->setup_completed && _metadata->teardown_parent && \
+ __sync_bool_compare_and_swap(teardown, false, true)) \
fixture_name##_teardown(_metadata, &self, variant->data); \
+ munmap(teardown, sizeof(*teardown)); \
if (!WIFEXITED(status) && WIFSIGNALED(status)) \
/* Forward signal to __wait_for_test(). */ \
kill(getpid(), WTERMSIG(status)); \
--
2.45.0
Fix a race condition when running several FIXTURE_TEARDOWN() managing
the same resource. This fixes a race condition in the Landlock file
system tests when creating or unmounting the same directory.
Using clone3() with CLONE_VFORK guarantees that the child and grandchild
test processes are sequentially scheduled. This is implemented with a
new clone3_vfork() helper replacing the fork() call.
This avoids triggering this error in __wait_for_test():
Test ended in some other way [127]
Cc: Christian Brauner <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Günther Noack <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Will Drewry <[email protected]>
Fixes: 41cca0542d7c ("selftests/harness: Fix TEST_F()'s vfork handling")
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
Changes since v2:
* Replace __attribute__((__unused__)) with inline for clone3_vfork()
(suggested by Kees and Jakub)
---
tools/testing/selftests/kselftest_harness.h | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h
index 55699a762c45..9d7178a71c2c 100644
--- a/tools/testing/selftests/kselftest_harness.h
+++ b/tools/testing/selftests/kselftest_harness.h
@@ -66,6 +66,8 @@
#include <sys/wait.h>
#include <unistd.h>
#include <setjmp.h>
+#include <syscall.h>
+#include <linux/sched.h>
#include "kselftest.h"
@@ -80,6 +82,17 @@
# define TH_LOG_ENABLED 1
#endif
+/* Wait for the child process to end but without sharing memory mapping. */
+static inline pid_t clone3_vfork(void)
+{
+ struct clone_args args = {
+ .flags = CLONE_VFORK,
+ .exit_signal = SIGCHLD,
+ };
+
+ return syscall(__NR_clone3, &args, sizeof(args));
+}
+
/**
* TH_LOG()
*
@@ -1183,7 +1196,7 @@ void __run_test(struct __fixture_metadata *f,
fflush(stdout);
fflush(stderr);
- t->pid = fork();
+ t->pid = clone3_vfork();
if (t->pid < 0) {
ksft_print_msg("ERROR SPAWNING TEST CHILD\n");
t->exit_code = KSFT_FAIL;
--
2.45.0
Do not allocate self->dir_path in the test process because this would
not be visible in the FIXTURE_TEARDOWN() process when relying on
fork()/clone3() instead of vfork().
This change is required for a following commit removing vfork() call to
not break the layout3_fs.* test cases.
Cc: Günther Noack <[email protected]>
Cc: Shuah Khan <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
Changes since v1:
* Extract change from a bigger patch (suggested by Kees).
---
tools/testing/selftests/landlock/fs_test.c | 57 +++++++++++++---------
1 file changed, 35 insertions(+), 22 deletions(-)
diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
index 46b9effd53e4..1e2cffde02b5 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -9,6 +9,7 @@
#define _GNU_SOURCE
#include <fcntl.h>
+#include <libgen.h>
#include <linux/landlock.h>
#include <linux/magic.h>
#include <sched.h>
@@ -4624,7 +4625,6 @@ FIXTURE(layout3_fs)
{
bool has_created_dir;
bool has_created_file;
- char *dir_path;
bool skip_test;
};
@@ -4683,11 +4683,24 @@ FIXTURE_VARIANT_ADD(layout3_fs, hostfs) {
.cwd_fs_magic = HOSTFS_SUPER_MAGIC,
};
+static char *dirname_alloc(const char *path)
+{
+ char *dup;
+
+ if (!path)
+ return NULL;
+
+ dup = strdup(path);
+ if (!dup)
+ return NULL;
+
+ return dirname(dup);
+}
+
FIXTURE_SETUP(layout3_fs)
{
struct stat statbuf;
- const char *slash;
- size_t dir_len;
+ char *dir_path = dirname_alloc(variant->file_path);
if (!supports_filesystem(variant->mnt.type) ||
!cwd_matches_fs(variant->cwd_fs_magic)) {
@@ -4697,25 +4710,15 @@ FIXTURE_SETUP(layout3_fs)
_metadata->teardown_parent = true;
- slash = strrchr(variant->file_path, '/');
- ASSERT_NE(slash, NULL);
- dir_len = (size_t)slash - (size_t)variant->file_path;
- ASSERT_LT(0, dir_len);
- self->dir_path = malloc(dir_len + 1);
- self->dir_path[dir_len] = '\0';
- strncpy(self->dir_path, variant->file_path, dir_len);
-
prepare_layout_opt(_metadata, &variant->mnt);
/* Creates directory when required. */
- if (stat(self->dir_path, &statbuf)) {
+ if (stat(dir_path, &statbuf)) {
set_cap(_metadata, CAP_DAC_OVERRIDE);
- EXPECT_EQ(0, mkdir(self->dir_path, 0700))
+ EXPECT_EQ(0, mkdir(dir_path, 0700))
{
TH_LOG("Failed to create directory \"%s\": %s",
- self->dir_path, strerror(errno));
- free(self->dir_path);
- self->dir_path = NULL;
+ dir_path, strerror(errno));
}
self->has_created_dir = true;
clear_cap(_metadata, CAP_DAC_OVERRIDE);
@@ -4736,6 +4739,8 @@ FIXTURE_SETUP(layout3_fs)
self->has_created_file = true;
clear_cap(_metadata, CAP_DAC_OVERRIDE);
}
+
+ free(dir_path);
}
FIXTURE_TEARDOWN(layout3_fs)
@@ -4754,16 +4759,17 @@ FIXTURE_TEARDOWN(layout3_fs)
}
if (self->has_created_dir) {
+ char *dir_path = dirname_alloc(variant->file_path);
+
set_cap(_metadata, CAP_DAC_OVERRIDE);
/*
* Don't check for error because the directory might already
* have been removed (cf. release_inode test).
*/
- rmdir(self->dir_path);
+ rmdir(dir_path);
clear_cap(_metadata, CAP_DAC_OVERRIDE);
+ free(dir_path);
}
- free(self->dir_path);
- self->dir_path = NULL;
cleanup_layout(_metadata);
}
@@ -4830,7 +4836,10 @@ TEST_F_FORK(layout3_fs, tag_inode_dir_mnt)
TEST_F_FORK(layout3_fs, tag_inode_dir_child)
{
- layer3_fs_tag_inode(_metadata, self, variant, self->dir_path);
+ char *dir_path = dirname_alloc(variant->file_path);
+
+ layer3_fs_tag_inode(_metadata, self, variant, dir_path);
+ free(dir_path);
}
TEST_F_FORK(layout3_fs, tag_inode_file)
@@ -4857,9 +4866,13 @@ TEST_F_FORK(layout3_fs, release_inodes)
if (self->has_created_file)
EXPECT_EQ(0, remove_path(variant->file_path));
- if (self->has_created_dir)
+ if (self->has_created_dir) {
+ char *dir_path = dirname_alloc(variant->file_path);
+
/* Don't check for error because of cgroup specificities. */
- remove_path(self->dir_path);
+ remove_path(dir_path);
+ free(dir_path);
+ }
ruleset_fd =
create_ruleset(_metadata, LANDLOCK_ACCESS_FS_READ_DIR, layer1);
--
2.45.0
Unconditionally share _metadata between all forked processes, which
enables to actually catch errors which were previously ignored.
This is required for a following commit replacing vfork() with clone3()
and CLONE_VFORK (i.e. not sharing the full memory) . It should also be
useful to share _metadata to extend expectations to test process's
forks. For instance, this change identified a wrong expectation in
pidfd_setns_test.
Because this _metadata is used by the new XFAIL_ADD(), use a global
pointer initialized in TEST_F(). This is OK because only XFAIL_ADD()
use it, and XFAIL_ADD() already depends on TEST_F().
Cc: Jakub Kicinski <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Will Drewry <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
Changes since v6:
* Use a global pointer per TEST_F() and complete build of the
__test_xfail object in the xfail constructor to fix XFAIL_ADD():
https://lore.kernel.org/r/[email protected]/
Changes since v4:
* Reset _metadata's aborted and setup_completed fields.
Changes since v1:
* Extract change from a bigger patch (suggested by Kees).
---
tools/testing/selftests/kselftest_harness.h | 26 ++++++++++++---------
1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h
index 201040207c85..28415798fa60 100644
--- a/tools/testing/selftests/kselftest_harness.h
+++ b/tools/testing/selftests/kselftest_harness.h
@@ -430,19 +430,19 @@ static inline pid_t clone3_vfork(void)
kill(getpid(), WTERMSIG(status)); \
__test_check_assert(_metadata); \
} \
- static struct __test_metadata \
- _##fixture_name##_##test_name##_object = { \
- .name = #test_name, \
- .fn = &wrapper_##fixture_name##_##test_name, \
- .fixture = &_##fixture_name##_fixture_object, \
- .termsig = signal, \
- .timeout = tmout, \
- .teardown_parent = false, \
- }; \
+ static struct __test_metadata *_##fixture_name##_##test_name##_object; \
static void __attribute__((constructor)) \
_register_##fixture_name##_##test_name(void) \
{ \
- __register_test(&_##fixture_name##_##test_name##_object); \
+ struct __test_metadata *object = mmap(NULL, sizeof(*object), \
+ PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0); \
+ object->name = #test_name; \
+ object->fn = &wrapper_##fixture_name##_##test_name; \
+ object->fixture = &_##fixture_name##_fixture_object; \
+ object->termsig = signal; \
+ object->timeout = tmout; \
+ _##fixture_name##_##test_name##_object = object; \
+ __register_test(object); \
} \
static void fixture_name##_##test_name( \
struct __test_metadata __attribute__((unused)) *_metadata, \
@@ -850,11 +850,12 @@ struct __test_xfail {
{ \
.fixture = &_##fixture_name##_fixture_object, \
.variant = &_##fixture_name##_##variant_name##_object, \
- .test = &_##fixture_name##_##test_name##_object, \
}; \
static void __attribute__((constructor)) \
_register_##fixture_name##_##variant_name##_##test_name##_xfail(void) \
{ \
+ _##fixture_name##_##variant_name##_##test_name##_xfail.test = \
+ _##fixture_name##_##test_name##_object; \
__register_xfail(&_##fixture_name##_##variant_name##_##test_name##_xfail); \
}
@@ -1181,6 +1182,9 @@ void __run_test(struct __fixture_metadata *f,
/* reset test struct */
t->exit_code = KSFT_PASS;
t->trigger = 0;
+ t->aborted = false;
+ t->setup_completed = false;
+ memset(t->env, 0, sizeof(t->env));
memset(t->results->reason, 0, sizeof(t->results->reason));
if (asprintf(&test_name, "%s%s%s.%s", f->name,
--
2.45.0
Setting the time namespace with CLONE_NEWTIME returns -EUSERS if the
calling thread shares memory with another thread (because of the shared
vDSO), which is the case when it is created with vfork().
Fix pidfd_setns_test by replacing test harness's vfork() call with a
clone3() call with CLONE_VFORK, and an explicit sharing of the
_metadata and self objects.
Replace _metadata->teardown_parent with a new FIXTURE_TEARDOWN_PARENT()
helper that can replace FIXTURE_TEARDOWN(). This is a cleaner approach
and it enables to selectively share the fixture data between the child
process running tests and the parent process running the fixture
teardown. This also avoids updating several tests to not rely on the
self object's copy-on-write property (e.g. storing the returned value of
a fork() call).
Cc: Christian Brauner <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Günther Noack <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Will Drewry <[email protected]>
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-lkp/[email protected]
Fixes: 0710a1a73fb4 ("selftests/harness: Merge TEST_F_FORK() into TEST_F()")
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
Changes since v1:
* Split changes (suggested by Kees).
* Improve documentation.
* Remove the static fixture_name##_teardown_parent initialisation to
false (as suggested by checkpatch.pl).
---
tools/testing/selftests/kselftest_harness.h | 66 ++++++++++++++++-----
tools/testing/selftests/landlock/fs_test.c | 16 ++---
2 files changed, 57 insertions(+), 25 deletions(-)
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h
index 28415798fa60..cbedb4a6cf7b 100644
--- a/tools/testing/selftests/kselftest_harness.h
+++ b/tools/testing/selftests/kselftest_harness.h
@@ -294,6 +294,32 @@ static inline pid_t clone3_vfork(void)
* A bare "return;" statement may be used to return early.
*/
#define FIXTURE_TEARDOWN(fixture_name) \
+ static const bool fixture_name##_teardown_parent; \
+ __FIXTURE_TEARDOWN(fixture_name)
+
+/**
+ * FIXTURE_TEARDOWN_PARENT()
+ * *_metadata* is included so that EXPECT_*, ASSERT_* etc. work correctly.
+ *
+ * @fixture_name: fixture name
+ *
+ * .. code-block:: c
+ *
+ * FIXTURE_TEARDOWN_PARENT(fixture_name) { implementation }
+ *
+ * Same as FIXTURE_TEARDOWN() but run this code in a parent process. This
+ * enables the test process to drop its privileges without impacting the
+ * related FIXTURE_TEARDOWN_PARENT() (e.g. to remove files from a directory
+ * where write access was dropped).
+ *
+ * To make it possible for the parent process to use *self*, share (MAP_SHARED)
+ * the fixture data between all forked processes.
+ */
+#define FIXTURE_TEARDOWN_PARENT(fixture_name) \
+ static const bool fixture_name##_teardown_parent = true; \
+ __FIXTURE_TEARDOWN(fixture_name)
+
+#define __FIXTURE_TEARDOWN(fixture_name) \
void fixture_name##_teardown( \
struct __test_metadata __attribute__((unused)) *_metadata, \
FIXTURE_DATA(fixture_name) __attribute__((unused)) *self, \
@@ -368,10 +394,11 @@ static inline pid_t clone3_vfork(void)
* Very similar to TEST() except that *self* is the setup instance of fixture's
* datatype exposed for use by the implementation.
*
- * The @test_name code is run in a separate process sharing the same memory
- * (i.e. vfork), which means that the test process can update its privileges
- * without impacting the related FIXTURE_TEARDOWN() (e.g. to remove files from
- * a directory where write access was dropped).
+ * The _metadata object is shared (MAP_SHARED) with all the potential forked
+ * processes, which enables them to use EXCEPT_*() and ASSERT_*().
+ *
+ * The *self* object is only shared with the potential forked processes if
+ * FIXTURE_TEARDOWN_PARENT() is used instead of FIXTURE_TEARDOWN().
*/
#define TEST_F(fixture_name, test_name) \
__TEST_F_IMPL(fixture_name, test_name, -1, TEST_TIMEOUT_DEFAULT)
@@ -392,39 +419,49 @@ static inline pid_t clone3_vfork(void)
struct __fixture_variant_metadata *variant) \
{ \
/* fixture data is alloced, setup, and torn down per call. */ \
- FIXTURE_DATA(fixture_name) self; \
+ FIXTURE_DATA(fixture_name) self_private, *self = NULL; \
pid_t child = 1; \
int status = 0; \
/* Makes sure there is only one teardown, even when child forks again. */ \
bool *teardown = mmap(NULL, sizeof(*teardown), \
PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0); \
*teardown = false; \
- memset(&self, 0, sizeof(FIXTURE_DATA(fixture_name))); \
+ if (sizeof(*self) > 0) { \
+ if (fixture_name##_teardown_parent) { \
+ self = mmap(NULL, sizeof(*self), PROT_READ | PROT_WRITE, \
+ MAP_SHARED | MAP_ANONYMOUS, -1, 0); \
+ } else { \
+ memset(&self_private, 0, sizeof(self_private)); \
+ self = &self_private; \
+ } \
+ } \
if (setjmp(_metadata->env) == 0) { \
- /* Use the same _metadata. */ \
- child = vfork(); \
+ /* _metadata and potentially self are shared with all forks. */ \
+ child = clone3_vfork(); \
if (child == 0) { \
- fixture_name##_setup(_metadata, &self, variant->data); \
+ fixture_name##_setup(_metadata, self, variant->data); \
/* Let setup failure terminate early. */ \
if (_metadata->exit_code) \
_exit(0); \
_metadata->setup_completed = true; \
- fixture_name##_##test_name(_metadata, &self, variant->data); \
+ fixture_name##_##test_name(_metadata, self, variant->data); \
} else if (child < 0 || child != waitpid(child, &status, 0)) { \
ksft_print_msg("ERROR SPAWNING TEST GRANDCHILD\n"); \
_metadata->exit_code = KSFT_FAIL; \
} \
} \
if (child == 0) { \
- if (_metadata->setup_completed && !_metadata->teardown_parent && \
+ if (_metadata->setup_completed && !fixture_name##_teardown_parent && \
__sync_bool_compare_and_swap(teardown, false, true)) \
- fixture_name##_teardown(_metadata, &self, variant->data); \
+ fixture_name##_teardown(_metadata, self, variant->data); \
_exit(0); \
} \
- if (_metadata->setup_completed && _metadata->teardown_parent && \
+ if (_metadata->setup_completed && fixture_name##_teardown_parent && \
__sync_bool_compare_and_swap(teardown, false, true)) \
- fixture_name##_teardown(_metadata, &self, variant->data); \
+ fixture_name##_teardown(_metadata, self, variant->data); \
munmap(teardown, sizeof(*teardown)); \
+ if (self && fixture_name##_teardown_parent) \
+ munmap(self, sizeof(*self)); \
if (!WIFEXITED(status) && WIFSIGNALED(status)) \
/* Forward signal to __wait_for_test(). */ \
kill(getpid(), WTERMSIG(status)); \
@@ -898,7 +935,6 @@ struct __test_metadata {
bool timed_out; /* did this test timeout instead of exiting? */
bool aborted; /* stopped test due to failed ASSERT */
bool setup_completed; /* did setup finish? */
- bool teardown_parent; /* run teardown in a parent process */
jmp_buf env; /* for exiting out of test early */
struct __test_results *results;
struct __test_metadata *prev, *next;
diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
index 1e2cffde02b5..27744524df51 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -286,8 +286,6 @@ static void prepare_layout_opt(struct __test_metadata *const _metadata,
static void prepare_layout(struct __test_metadata *const _metadata)
{
- _metadata->teardown_parent = true;
-
prepare_layout_opt(_metadata, &mnt_tmp);
}
@@ -316,7 +314,7 @@ FIXTURE_SETUP(layout0)
prepare_layout(_metadata);
}
-FIXTURE_TEARDOWN(layout0)
+FIXTURE_TEARDOWN_PARENT(layout0)
{
cleanup_layout(_metadata);
}
@@ -379,7 +377,7 @@ FIXTURE_SETUP(layout1)
create_layout1(_metadata);
}
-FIXTURE_TEARDOWN(layout1)
+FIXTURE_TEARDOWN_PARENT(layout1)
{
remove_layout1(_metadata);
@@ -3692,7 +3690,7 @@ FIXTURE_SETUP(ftruncate)
create_file(_metadata, file1_s1d1);
}
-FIXTURE_TEARDOWN(ftruncate)
+FIXTURE_TEARDOWN_PARENT(ftruncate)
{
EXPECT_EQ(0, remove_path(file1_s1d1));
cleanup_layout(_metadata);
@@ -3870,7 +3868,7 @@ FIXTURE_SETUP(layout1_bind)
clear_cap(_metadata, CAP_SYS_ADMIN);
}
-FIXTURE_TEARDOWN(layout1_bind)
+FIXTURE_TEARDOWN_PARENT(layout1_bind)
{
/* umount(dir_s2d2)) is handled by namespace lifetime. */
@@ -4275,7 +4273,7 @@ FIXTURE_SETUP(layout2_overlay)
clear_cap(_metadata, CAP_SYS_ADMIN);
}
-FIXTURE_TEARDOWN(layout2_overlay)
+FIXTURE_TEARDOWN_PARENT(layout2_overlay)
{
if (self->skip_test)
SKIP(return, "overlayfs is not supported (teardown)");
@@ -4708,8 +4706,6 @@ FIXTURE_SETUP(layout3_fs)
SKIP(return, "this filesystem is not supported (setup)");
}
- _metadata->teardown_parent = true;
-
prepare_layout_opt(_metadata, &variant->mnt);
/* Creates directory when required. */
@@ -4743,7 +4739,7 @@ FIXTURE_SETUP(layout3_fs)
free(dir_path);
}
-FIXTURE_TEARDOWN(layout3_fs)
+FIXTURE_TEARDOWN_PARENT(layout3_fs)
{
if (self->skip_test)
SKIP(return, "this filesystem is not supported (teardown)");
--
2.45.0
If TEST_F() explicitly calls exit(code) with code different than 0, then
_metadata->exit_code is set to this code (e.g. KVM_ONE_VCPU_TEST()). We
need to keep in mind that _metadata->exit_code can be KSFT_SKIP while
the process exit code is 0.
Cc: Jakub Kicinski <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Will Drewry <[email protected]>
Reported-by: Sean Christopherson <[email protected]>
Tested-by: Sean Christopherson <[email protected]>
Closes: https://lore.kernel.org/r/[email protected]
Fixes: 0710a1a73fb4 ("selftests/harness: Merge TEST_F_FORK() into TEST_F()")
Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
Changes since v5:
* Update commit message as suggested by Sean.
Changes since v4:
* Check abort status when the grandchild exited.
* Keep the _exit(0) calls because _metadata->exit_code is always
checked.
* Only set _metadata->exit_code to WEXITSTATUS() if it is not zero.
Changes since v3:
* New patch mainly from Sean Christopherson.
---
tools/testing/selftests/kselftest_harness.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h
index cbedb4a6cf7b..3c8f2965c285 100644
--- a/tools/testing/selftests/kselftest_harness.h
+++ b/tools/testing/selftests/kselftest_harness.h
@@ -462,9 +462,13 @@ static inline pid_t clone3_vfork(void)
munmap(teardown, sizeof(*teardown)); \
if (self && fixture_name##_teardown_parent) \
munmap(self, sizeof(*self)); \
- if (!WIFEXITED(status) && WIFSIGNALED(status)) \
+ if (WIFEXITED(status)) { \
+ if (WEXITSTATUS(status)) \
+ _metadata->exit_code = WEXITSTATUS(status); \
+ } else if (WIFSIGNALED(status)) { \
/* Forward signal to __wait_for_test(). */ \
kill(getpid(), WTERMSIG(status)); \
+ } \
__test_check_assert(_metadata); \
} \
static struct __test_metadata *_##fixture_name##_##test_name##_object; \
--
2.45.0
On Sat, May 11, 2024 at 07:14:39PM +0200, Micka?l Sala?n wrote:
> Fix a race condition when running several FIXTURE_TEARDOWN() managing
> the same resource. This fixes a race condition in the Landlock file
> system tests when creating or unmounting the same directory.
> Using clone3() with CLONE_VFORK guarantees that the child and grandchild
> test processes are sequentially scheduled. This is implemented with a
> new clone3_vfork() helper replacing the fork() call.
This is now in mainline and appears to be causing several tests (at
least the ptrace vmaccess global_attach test on arm64, possibly also
some of the epoll tests) that previously were timed out by the harness
to to hang instead. A bisect seems to point at this patch in
particular, there was a bunch of discussion of the fallout of these
patches but I'm afraid I lost track of it, is there something in flight
for this? -next is affected as well from the looks of it.
Log of the ptrace issue (not that it's useful, the job just gets killed
by the test runner):
https://lava.sirena.org.uk/scheduler/job/307984
Bisect log:
git bisect start
# status: waiting for both good and bad commits
# good: [e8f897f4afef0031fe618a8e94127a0934896aba] Linux 6.8
git bisect good e8f897f4afef0031fe618a8e94127a0934896aba
# status: waiting for bad commit, 1 good commit known
# bad: [a38297e3fb012ddfa7ce0321a7e5a8daeb1872b6] Linux 6.9
git bisect bad a38297e3fb012ddfa7ce0321a7e5a8daeb1872b6
# good: [480e035fc4c714fb5536e64ab9db04fedc89e910] Merge tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernel
git bisect good 480e035fc4c714fb5536e64ab9db04fedc89e910
# good: [2ac2b1665d3fbec6ca709dd6ef3ea05f4a51ee4c] Merge tag 'hwlock-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
git bisect good 2ac2b1665d3fbec6ca709dd6ef3ea05f4a51ee4c
# good: [e858beeddfa3a400844c0e22d2118b3b52f1ea5e] bcachefs: If we run merges at a lower watermark, they must be nonblocking
git bisect good e858beeddfa3a400844c0e22d2118b3b52f1ea5e
# good: [e43afae4a335ac0bf54c7a8f23ed65dd55449649] Merge tag 'powerpc-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
git bisect good e43afae4a335ac0bf54c7a8f23ed65dd55449649
# good: [09e10499ee6a5a89fc352f25881276398a49596a] Merge tag 'drm-misc-fixes-2024-05-02' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
git bisect good 09e10499ee6a5a89fc352f25881276398a49596a
# good: [3c15237018eb8a9a56bb49a5dbf4d8eeb154b5cc] Merge tag 'usb-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
git bisect good 3c15237018eb8a9a56bb49a5dbf4d8eeb154b5cc
# good: [62788b0f225da1837ad38101112e2c49123470ee] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux
git bisect good 62788b0f225da1837ad38101112e2c49123470ee
# good: [ed44935c330a2633440e8d2660db3c7538eeaf10] Merge tag 'spi-fix-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
git bisect good ed44935c330a2633440e8d2660db3c7538eeaf10
# good: [c22c3e0753807feee1391a22228b0d5e6ba39b74] Merge tag 'mm-hotfixes-stable-2024-05-10-13-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
git bisect good c22c3e0753807feee1391a22228b0d5e6ba39b74
# good: [b61821bb32c5577272408e1b05e6a0879a64257f] Merge tag 'drm-misc-fixes-2024-05-10' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
git bisect good b61821bb32c5577272408e1b05e6a0879a64257f
# bad: [323feb3bdb67649bfa5614eb24ec9cb92a60cf33] selftests/harness: Handle TEST_F()'s explicit exit codes
git bisect bad 323feb3bdb67649bfa5614eb24ec9cb92a60cf33
# bad: [323feb3bdb67649bfa5614eb24ec9cb92a60cf33] selftests/harness: Handle TEST_F()'s explicit exit codes
git bisect bad 323feb3bdb67649bfa5614eb24ec9cb92a60cf33
# bad: [3656bc23429a4d539c81b5cb8f17ceeeeca8901a] selftests/landlock: Do not allocate memory in fixture data
git bisect bad 3656bc23429a4d539c81b5cb8f17ceeeeca8901a
# good: [7e4042abe2ee7c0977fd8bb049a6991b174a5e6f] selftests/landlock: Fix FS tests when run on a private mount point
git bisect good 7e4042abe2ee7c0977fd8bb049a6991b174a5e6f
# bad: [a86f18903db9211e265cc130b61adb175b7a4c42] selftests/harness: Fix interleaved scheduling leading to race conditions
git bisect bad a86f18903db9211e265cc130b61adb175b7a4c42
# good: [fff37bd32c7605d93bf900c4c318d56d12000048] selftests/harness: Fix fixture teardown
git bisect good fff37bd32c7605d93bf900c4c318d56d12000048
# first bad commit: [a86f18903db9211e265cc130b61adb175b7a4c42] selftests/harness: Fix interleaved scheduling leading to race conditions
On Mon, May 27, 2024 at 08:07:40PM +0100, Mark Brown wrote:
> On Sat, May 11, 2024 at 07:14:39PM +0200, Micka?l Sala?n wrote:
> > Fix a race condition when running several FIXTURE_TEARDOWN() managing
> > the same resource. This fixes a race condition in the Landlock file
> > system tests when creating or unmounting the same directory.
>
> > Using clone3() with CLONE_VFORK guarantees that the child and grandchild
> > test processes are sequentially scheduled. This is implemented with a
> > new clone3_vfork() helper replacing the fork() call.
>
> This is now in mainline and appears to be causing several tests (at
> least the ptrace vmaccess global_attach test on arm64, possibly also
> some of the epoll tests) that previously were timed out by the harness
> to to hang instead. A bisect seems to point at this patch in
> particular, there was a bunch of discussion of the fallout of these
> patches but I'm afraid I lost track of it, is there something in flight
> for this? -next is affected as well from the looks of it.
FWIW I'm still seeing this on -rc2...
On Mon, Jun 03, 2024 at 05:27:52PM +0100, Mark Brown wrote:
> On Mon, May 27, 2024 at 08:07:40PM +0100, Mark Brown wrote:
> > This is now in mainline and appears to be causing several tests (at
> > least the ptrace vmaccess global_attach test on arm64, possibly also
> > some of the epoll tests) that previously were timed out by the harness
> > to to hang instead. A bisect seems to point at this patch in
> > particular, there was a bunch of discussion of the fallout of these
> > patches but I'm afraid I lost track of it, is there something in flight
> > for this? -next is affected as well from the looks of it.
> FWIW I'm still seeing this on -rc2...
AFAICT this is due to the switch to using clone3() with CLONE_VFORK
to start the test which means we never even call alarm() to set up the
timeout for the test, let alone have the signal for it delivered. I'm a
confused about how this could ever work, with clone_vfork() the parent
shouldn't run until the child execs (which won't happen here) or exits.
Since we don't call alarm() until after we started the child we never
actually get that far, but even if we reorder things we'll not get the
signal for the alarm if the child messes up since the parent is
suspended.
I'm not clear what the original race being fixed here was but it seems
like we should revert this since the timeout functionality is pretty
important?
On Mon, Jun 03, 2024 at 06:22:32PM +0100, Mark Brown wrote:
> On Mon, Jun 03, 2024 at 05:27:52PM +0100, Mark Brown wrote:
> > On Mon, May 27, 2024 at 08:07:40PM +0100, Mark Brown wrote:
>
> > > This is now in mainline and appears to be causing several tests (at
> > > least the ptrace vmaccess global_attach test on arm64, possibly also
> > > some of the epoll tests) that previously were timed out by the harness
> > > to to hang instead. A bisect seems to point at this patch in
> > > particular, there was a bunch of discussion of the fallout of these
> > > patches but I'm afraid I lost track of it, is there something in flight
> > > for this? -next is affected as well from the looks of it.
Thanks for the heads up. I warned about not being able to test
everything when fixing kselftest last time, but nobody show up. Is
there an easy way to run most kselftests? We really need a (more
accessible) CI...
>
> > FWIW I'm still seeing this on -rc2...
>
> AFAICT this is due to the switch to using clone3() with CLONE_VFORK
I guess it started with the previous vfork() that was later replaced
with CLONE_VFORK.
> to start the test which means we never even call alarm() to set up the
> timeout for the test, let alone have the signal for it delivered. I'm a
> confused about how this could ever work, with clone_vfork() the parent
> shouldn't run until the child execs (which won't happen here) or exits.
> Since we don't call alarm() until after we started the child we never
> actually get that far, but even if we reorder things we'll not get the
> signal for the alarm if the child messes up since the parent is
> suspended.
>
> I'm not clear what the original race being fixed here was but it seems
> like we should revert this since the timeout functionality is pretty
> important?
It took me a while to fix all the previous issues and it would be much
easier to just fix this issue too.
I'm working on it.
On Tue, Jun 04, 2024 at 06:06:48PM +0200, Micka?l Sala?n wrote:
> Thanks for the heads up. I warned about not being able to test
> everything when fixing kselftest last time, but nobody show up. Is
> there an easy way to run most kselftests? We really need a (more
> accessible) CI...
You can just invoke the top level kselftests Makefile but between things
being flaky and runtime requirements there's a bunch of noise there.
KernelCI covers a bunch of it and would be my go to, I've got a good
chunk of the selftests that actually build and run reliably in my
personal CI but it has no visible UI. Part of the issue here might be
platform specifics, I'm seeing this on arm64.
> > > FWIW I'm still seeing this on -rc2...
> > AFAICT this is due to the switch to using clone3() with CLONE_VFORK
> I guess it started with the previous vfork() that was later replaced
> with CLONE_VFORK.
Bisect did seem to point at this commit FWIW, I've not dug into any API
differences or anything here. The immediate thing being replaced was a
plain fork() though I see it was vfork() at some point before that, and
I'd not have noticed if the individual testcases weren't hanging so the
timeout was needed.
> > I'm not clear what the original race being fixed here was but it seems
> > like we should revert this since the timeout functionality is pretty
> > important?
> It took me a while to fix all the previous issues and it would be much
> easier to just fix this issue too.
> I'm working on it.
Great, thanks.