2024-04-12 02:59:20

by David Gow

[permalink] [raw]
Subject: [PATCH] kunit: Fix race condition in try-catch completion

KUnit's try-catch infrastructure now uses vfork_done, which is always
set to a valid completion when a kthread is created, but which is set to
NULL once the thread terminates. This creates a race condition, where
the kthread exits before we can wait on it.

Keep a copy of vfork_done, which is taken before we wake_up_process()
and so valid, and wait on that instead.

Fixes: 4de2a8e4cca4 ("kunit: Handle test faults")
Reported-by: Linux Kernel Functional Testing <[email protected]>
Closes: https://lore.kernel.org/lkml/[email protected]/
Tested-by: Linux Kernel Functional Testing <[email protected]>
Acked-by: Mickaël Salaün <[email protected]>
Signed-off-by: David Gow <[email protected]>
---
lib/kunit/try-catch.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/lib/kunit/try-catch.c b/lib/kunit/try-catch.c
index fa687278ccc9..6bbe0025b079 100644
--- a/lib/kunit/try-catch.c
+++ b/lib/kunit/try-catch.c
@@ -63,6 +63,7 @@ void kunit_try_catch_run(struct kunit_try_catch *try_catch, void *context)
{
struct kunit *test = try_catch->test;
struct task_struct *task_struct;
+ struct completion *task_done;
int exit_code, time_remaining;

try_catch->context = context;
@@ -75,13 +76,16 @@ void kunit_try_catch_run(struct kunit_try_catch *try_catch, void *context)
return;
}
get_task_struct(task_struct);
- wake_up_process(task_struct);
/*
* As for a vfork(2), task_struct->vfork_done (pointing to the
* underlying kthread->exited) can be used to wait for the end of a
- * kernel thread.
+ * kernel thread. It is set to NULL when the thread exits, so we
+ * keep a copy here.
*/
- time_remaining = wait_for_completion_timeout(task_struct->vfork_done,
+ task_done = task_struct->vfork_done;
+ wake_up_process(task_struct);
+
+ time_remaining = wait_for_completion_timeout(task_done,
kunit_test_timeout());
if (time_remaining == 0) {
try_catch->try_result = -ETIMEDOUT;
--
2.44.0.683.g7961c838ac-goog



2024-04-12 21:32:43

by Rae Moar

[permalink] [raw]
Subject: Re: [PATCH] kunit: Fix race condition in try-catch completion

On Thu, Apr 11, 2024 at 10:59 PM David Gow <[email protected]> wrote:
>
> KUnit's try-catch infrastructure now uses vfork_done, which is always
> set to a valid completion when a kthread is created, but which is set to
> NULL once the thread terminates. This creates a race condition, where
> the kthread exits before we can wait on it.
>
> Keep a copy of vfork_done, which is taken before we wake_up_process()
> and so valid, and wait on that instead.
>
> Fixes: 4de2a8e4cca4 ("kunit: Handle test faults")
> Reported-by: Linux Kernel Functional Testing <[email protected]>
> Closes: https://lore.kernel.org/lkml/[email protected]/
> Tested-by: Linux Kernel Functional Testing <[email protected]>
> Acked-by: Mickaël Salaün <[email protected]>
> Signed-off-by: David Gow <[email protected]>

Hello,

This fix looks good to me. I have tested it and besides the fortify
test error discussed in the previous patch series I am happy.

Thanks!
-Rae

Reviewed-by: Rae Moar <[email protected]>


> ---
> lib/kunit/try-catch.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/lib/kunit/try-catch.c b/lib/kunit/try-catch.c
> index fa687278ccc9..6bbe0025b079 100644
> --- a/lib/kunit/try-catch.c
> +++ b/lib/kunit/try-catch.c
> @@ -63,6 +63,7 @@ void kunit_try_catch_run(struct kunit_try_catch *try_catch, void *context)
> {
> struct kunit *test = try_catch->test;
> struct task_struct *task_struct;
> + struct completion *task_done;
> int exit_code, time_remaining;
>
> try_catch->context = context;
> @@ -75,13 +76,16 @@ void kunit_try_catch_run(struct kunit_try_catch *try_catch, void *context)
> return;
> }
> get_task_struct(task_struct);
> - wake_up_process(task_struct);
> /*
> * As for a vfork(2), task_struct->vfork_done (pointing to the
> * underlying kthread->exited) can be used to wait for the end of a
> - * kernel thread.
> + * kernel thread. It is set to NULL when the thread exits, so we
> + * keep a copy here.
> */
> - time_remaining = wait_for_completion_timeout(task_struct->vfork_done,
> + task_done = task_struct->vfork_done;
> + wake_up_process(task_struct);
> +
> + time_remaining = wait_for_completion_timeout(task_done,
> kunit_test_timeout());
> if (time_remaining == 0) {
> try_catch->try_result = -ETIMEDOUT;
> --
> 2.44.0.683.g7961c838ac-goog
>

2024-04-13 21:05:30

by Miguel Ojeda

[permalink] [raw]
Subject: Re: [PATCH] kunit: Fix race condition in try-catch completion

On Thu, Apr 11, 2024 at 10:59 PM David Gow <[email protected]> wrote:
>
> KUnit's try-catch infrastructure now uses vfork_done, which is always
> set to a valid completion when a kthread is created, but which is set to
> NULL once the thread terminates. This creates a race condition, where
> the kthread exits before we can wait on it.
>
> Keep a copy of vfork_done, which is taken before we wake_up_process()
> and so valid, and wait on that instead.
>
> Fixes: 4de2a8e4cca4 ("kunit: Handle test faults")
> Reported-by: Linux Kernel Functional Testing <[email protected]>
> Closes: https://lore.kernel.org/lkml/[email protected]/
> Tested-by: Linux Kernel Functional Testing <[email protected]>
> Acked-by: Mickaël Salaün <[email protected]>
> Signed-off-by: David Gow <[email protected]>

I noticed it with the Rust tests too, and indeed this fixed it:

Tested-by: Miguel Ojeda <[email protected]>

Thanks!

Cheers,
Miguel