2023-04-06 19:42:39

by Arve Hjønnevåg

[permalink] [raw]
Subject: [PATCH] sched/wait: Fix a kthread_park race with wait_woken()

kthread_park and wait_woken have a similar race that kthread_stop and
wait_woken used to have before it was fixed in
cb6538e740d7543cd989128625cf8cac4b471e0a. Extend that fix to also cover
kthread_park.

Signed-off-by: Arve Hjønnevåg <[email protected]>
---
kernel/sched/wait.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index 133b74730738..a9cf49da884b 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -425,9 +425,9 @@ int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, i
}
EXPORT_SYMBOL(autoremove_wake_function);

-static inline bool is_kthread_should_stop(void)
+static inline bool is_kthread_should_stop_or_park(void)
{
- return (current->flags & PF_KTHREAD) && kthread_should_stop();
+ return (current->flags & PF_KTHREAD) && (kthread_should_stop() || kthread_should_park());
}

/*
@@ -459,7 +459,7 @@ long wait_woken(struct wait_queue_entry *wq_entry, unsigned mode, long timeout)
* or woken_wake_function() sees our store to current->state.
*/
set_current_state(mode); /* A */
- if (!(wq_entry->flags & WQ_FLAG_WOKEN) && !is_kthread_should_stop())
+ if (!(wq_entry->flags & WQ_FLAG_WOKEN) && !is_kthread_should_stop_or_park())
timeout = schedule_timeout(timeout);
__set_current_state(TASK_RUNNING);

--
2.40.0.577.gac1e443424-goog


2023-05-11 22:07:19

by John Stultz

[permalink] [raw]
Subject: [PATCH] sched/wait: Fix a kthread_park race with wait_woken()

From: Arve Hjønnevåg <[email protected]>

kthread_park and wait_woken have a similar race that kthread_stop and
wait_woken used to have before it was fixed in
cb6538e740d7543cd989128625cf8cac4b471e0a. Extend that fix to also cover
kthread_park.

Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Vincent Guittot <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Segall <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Valentin Schneider <[email protected]>
Signed-off-by: Arve Hjønnevåg <[email protected]>
Signed-off-by: John Stultz <[email protected]>
---
This seemingly slipped by, so I wanted to resend it
for review.
---
kernel/sched/wait.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index 133b74730738..a9cf49da884b 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -425,9 +425,9 @@ int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, i
}
EXPORT_SYMBOL(autoremove_wake_function);

-static inline bool is_kthread_should_stop(void)
+static inline bool is_kthread_should_stop_or_park(void)
{
- return (current->flags & PF_KTHREAD) && kthread_should_stop();
+ return (current->flags & PF_KTHREAD) && (kthread_should_stop() || kthread_should_park());
}

/*
@@ -459,7 +459,7 @@ long wait_woken(struct wait_queue_entry *wq_entry, unsigned mode, long timeout)
* or woken_wake_function() sees our store to current->state.
*/
set_current_state(mode); /* A */
- if (!(wq_entry->flags & WQ_FLAG_WOKEN) && !is_kthread_should_stop())
+ if (!(wq_entry->flags & WQ_FLAG_WOKEN) && !is_kthread_should_stop_or_park())
timeout = schedule_timeout(timeout);
__set_current_state(TASK_RUNNING);

--
2.40.1.606.ga4b1b128d6-goog


2023-05-11 22:38:15

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] sched/wait: Fix a kthread_park race with wait_woken()

On Thu, May 11, 2023 at 09:41:30PM +0000, John Stultz wrote:
> From: Arve Hj?nnev?g <[email protected]>
>
> kthread_park and wait_woken have a similar race that kthread_stop and
> wait_woken used to have before it was fixed in
> cb6538e740d7543cd989128625cf8cac4b471e0a. Extend that fix to also cover

cb6538e740d7 ("sched/wait: Fix a kthread race with wait_woken()")

> kthread_park.
>
> Cc: Ingo Molnar <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Juri Lelli <[email protected]>
> Cc: Vincent Guittot <[email protected]>
> Cc: Dietmar Eggemann <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Ben Segall <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Cc: Daniel Bristot de Oliveira <[email protected]>
> Cc: Valentin Schneider <[email protected]>
> Signed-off-by: Arve Hj?nnev?g <[email protected]>
> Signed-off-by: John Stultz <[email protected]>
> ---
> This seemingly slipped by, so I wanted to resend it
> for review.
> ---
> kernel/sched/wait.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
> index 133b74730738..a9cf49da884b 100644
> --- a/kernel/sched/wait.c
> +++ b/kernel/sched/wait.c
> @@ -425,9 +425,9 @@ int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, i
> }
> EXPORT_SYMBOL(autoremove_wake_function);
>
> -static inline bool is_kthread_should_stop(void)
> +static inline bool is_kthread_should_stop_or_park(void)
> {
> - return (current->flags & PF_KTHREAD) && kthread_should_stop();
> + return (current->flags & PF_KTHREAD) && (kthread_should_stop() || kthread_should_park());
> }
>
> /*

That's a bit sad; that two function calls for checking two consecutive
bits in the same word :-(

If we move this to kthread.c and write it like:

kthread = __to_kthread(current);
if (!kthread)
return false;

return test_bit(KTHREAD_SHOULD_STOP, &kthread->flags) ||
test_bit(KTHREAD_SHOULD_PARK, &kthread->flags);

Then the compiler should be able to merge the two bits in a single load
and test due to constant_test_bit() -- do check though.

> @@ -459,7 +459,7 @@ long wait_woken(struct wait_queue_entry *wq_entry, unsigned mode, long timeout)
> * or woken_wake_function() sees our store to current->state.
> */
> set_current_state(mode); /* A */
> - if (!(wq_entry->flags & WQ_FLAG_WOKEN) && !is_kthread_should_stop())
> + if (!(wq_entry->flags & WQ_FLAG_WOKEN) && !is_kthread_should_stop_or_park())
> timeout = schedule_timeout(timeout);
> __set_current_state(TASK_RUNNING);
>
> --
> 2.40.1.606.ga4b1b128d6-goog
>

2023-05-12 01:03:07

by John Stultz

[permalink] [raw]
Subject: Re: [PATCH] sched/wait: Fix a kthread_park race with wait_woken()

On Thu, May 11, 2023 at 3:31 PM Peter Zijlstra <[email protected]> wrote:
>
> On Thu, May 11, 2023 at 09:41:30PM +0000, John Stultz wrote:
> > From: Arve Hjønnevåg <[email protected]>
> >
> > kthread_park and wait_woken have a similar race that kthread_stop and
> > wait_woken used to have before it was fixed in
> > cb6538e740d7543cd989128625cf8cac4b471e0a. Extend that fix to also cover
>
> cb6538e740d7 ("sched/wait: Fix a kthread race with wait_woken()")
>
> > kthread_park.
> >
> > Cc: Ingo Molnar <[email protected]>
> > Cc: Peter Zijlstra <[email protected]>
> > Cc: Juri Lelli <[email protected]>
> > Cc: Vincent Guittot <[email protected]>
> > Cc: Dietmar Eggemann <[email protected]>
> > Cc: Steven Rostedt <[email protected]>
> > Cc: Ben Segall <[email protected]>
> > Cc: Mel Gorman <[email protected]>
> > Cc: Daniel Bristot de Oliveira <[email protected]>
> > Cc: Valentin Schneider <[email protected]>
> > Signed-off-by: Arve Hjønnevåg <[email protected]>
> > Signed-off-by: John Stultz <[email protected]>
> > ---
> > This seemingly slipped by, so I wanted to resend it
> > for review.
> > ---
> > kernel/sched/wait.c | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
> > index 133b74730738..a9cf49da884b 100644
> > --- a/kernel/sched/wait.c
> > +++ b/kernel/sched/wait.c
> > @@ -425,9 +425,9 @@ int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, i
> > }
> > EXPORT_SYMBOL(autoremove_wake_function);
> >
> > -static inline bool is_kthread_should_stop(void)
> > +static inline bool is_kthread_should_stop_or_park(void)
> > {
> > - return (current->flags & PF_KTHREAD) && kthread_should_stop();
> > + return (current->flags & PF_KTHREAD) && (kthread_should_stop() || kthread_should_park());
> > }
> >
> > /*
>
> That's a bit sad; that two function calls for checking two consecutive
> bits in the same word :-(
>
> If we move this to kthread.c and write it like:
>
> kthread = __to_kthread(current);
> if (!kthread)
> return false;
>
> return test_bit(KTHREAD_SHOULD_STOP, &kthread->flags) ||
> test_bit(KTHREAD_SHOULD_PARK, &kthread->flags);
>
> Then the compiler should be able to merge the two bits in a single load
> and test due to constant_test_bit() -- do check though.

Hrm. Apologies, as it's been awhile since I've looked at disassembled
asm, so I may be wrong, but I don't think that's happening here.

With the logic above I'm seeing it build as:
0000000000000a50 <kthread_should_stop_or_park>:
a50: 65 48 8b 14 25 00 00 mov %gs:0x0,%rdx
a57: 00 00
a59: 48 8b 8a 78 0a 00 00 mov 0xa78(%rdx),%rcx
a60: 31 c0 xor %eax,%eax
a62: 48 85 c9 test %rcx,%rcx
a65: 74 11 je a78
<kthread_should_stop_or_park+0x28>
a67: f6 42 2e 20 testb $0x20,0x2e(%rdx)
a6b: 74 0b je a78
<kthread_should_stop_or_park+0x28>
a6d: 48 8b 01 mov (%rcx),%rax
a70: 48 d1 e8 shr %rax
a73: 83 e0 01 and $0x1,%eax
a76: 74 05 je a7d
<kthread_should_stop_or_park+0x2d>
a78: e9 00 00 00 00 jmp a7d
<kthread_should_stop_or_park+0x2d>
a7d: 48 8b 01 mov (%rcx),%rax
a80: 48 c1 e8 02 shr $0x2,%rax
a84: 83 e0 01 and $0x1,%eax
a87: e9 00 00 00 00 jmp a8c
<kthread_should_stop_or_park+0x3c>
a8c: 0f 1f 40 00 nopl 0x0(%rax)


Where as if I drop the second test_bit() to see if it was similar, I see:
0000000000000a50 <kthread_should_stop_or_park>:
a50: 65 48 8b 14 25 00 00 mov %gs:0x0,%rdx
a57: 00 00
a59: 48 8b 8a 78 0a 00 00 mov 0xa78(%rdx),%rcx
a60: 31 c0 xor %eax,%eax
a62: 48 85 c9 test %rcx,%rcx
a65: 74 14 je a7b
<kthread_should_stop_or_park+0x2b>
a67: f6 42 2e 20 testb $0x20,0x2e(%rdx)
a6b: 74 0e je a7b
<kthread_should_stop_or_park+0x2b>
a6d: 48 8b 01 mov (%rcx),%rax
a70: 48 d1 e8 shr %rax
a73: 83 e0 01 and $0x1,%eax
a76: e9 00 00 00 00 jmp a7b
<kthread_should_stop_or_park+0x2b>
a7b: e9 00 00 00 00 jmp a80
<__pfx_kthread_freezable_should_stop>


Despite that, should I still re-submit the change this way?

thanks
-john

2023-05-12 10:58:58

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] sched/wait: Fix a kthread_park race with wait_woken()

On Thu, May 11, 2023 at 05:52:24PM -0700, John Stultz wrote:
> On Thu, May 11, 2023 at 3:31 PM Peter Zijlstra <[email protected]> wrote:
> >
> > On Thu, May 11, 2023 at 09:41:30PM +0000, John Stultz wrote:
> > > From: Arve Hjønnevåg <[email protected]>
> > >
> > > kthread_park and wait_woken have a similar race that kthread_stop and
> > > wait_woken used to have before it was fixed in
> > > cb6538e740d7543cd989128625cf8cac4b471e0a. Extend that fix to also cover
> >
> > cb6538e740d7 ("sched/wait: Fix a kthread race with wait_woken()")
> >
> > > kthread_park.
> > >
> > > Cc: Ingo Molnar <[email protected]>
> > > Cc: Peter Zijlstra <[email protected]>
> > > Cc: Juri Lelli <[email protected]>
> > > Cc: Vincent Guittot <[email protected]>
> > > Cc: Dietmar Eggemann <[email protected]>
> > > Cc: Steven Rostedt <[email protected]>
> > > Cc: Ben Segall <[email protected]>
> > > Cc: Mel Gorman <[email protected]>
> > > Cc: Daniel Bristot de Oliveira <[email protected]>
> > > Cc: Valentin Schneider <[email protected]>
> > > Signed-off-by: Arve Hjønnevåg <[email protected]>
> > > Signed-off-by: John Stultz <[email protected]>
> > > ---
> > > This seemingly slipped by, so I wanted to resend it
> > > for review.
> > > ---
> > > kernel/sched/wait.c | 6 +++---
> > > 1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
> > > index 133b74730738..a9cf49da884b 100644
> > > --- a/kernel/sched/wait.c
> > > +++ b/kernel/sched/wait.c
> > > @@ -425,9 +425,9 @@ int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, i
> > > }
> > > EXPORT_SYMBOL(autoremove_wake_function);
> > >
> > > -static inline bool is_kthread_should_stop(void)
> > > +static inline bool is_kthread_should_stop_or_park(void)
> > > {
> > > - return (current->flags & PF_KTHREAD) && kthread_should_stop();
> > > + return (current->flags & PF_KTHREAD) && (kthread_should_stop() || kthread_should_park());
> > > }
> > >
> > > /*
> >
> > That's a bit sad; that two function calls for checking two consecutive
> > bits in the same word :-(
> >
> > If we move this to kthread.c and write it like:
> >
> > kthread = __to_kthread(current);
> > if (!kthread)
> > return false;
> >
> > return test_bit(KTHREAD_SHOULD_STOP, &kthread->flags) ||
> > test_bit(KTHREAD_SHOULD_PARK, &kthread->flags);
> >
> > Then the compiler should be able to merge the two bits in a single load
> > and test due to constant_test_bit() -- do check though.
>
> Hrm. Apologies, as it's been awhile since I've looked at disassembled
> asm, so I may be wrong, but I don't think that's happening here.
>
> With the logic above I'm seeing it build as:
> 0000000000000a50 <kthread_should_stop_or_park>:
> a50: 65 48 8b 14 25 00 00 mov %gs:0x0,%rdx
> a57: 00 00
> a59: 48 8b 8a 78 0a 00 00 mov 0xa78(%rdx),%rcx
> a60: 31 c0 xor %eax,%eax
> a62: 48 85 c9 test %rcx,%rcx
> a65: 74 11 je a78 <kthread_should_stop_or_park+0x28>
> a67: f6 42 2e 20 testb $0x20,0x2e(%rdx)
> a6b: 74 0b je a78 <kthread_should_stop_or_park+0x28>
> a6d: 48 8b 01 mov (%rcx),%rax
> a70: 48 d1 e8 shr %rax
> a73: 83 e0 01 and $0x1,%eax
> a76: 74 05 je a7d <kthread_should_stop_or_park+0x2d>
> a78: e9 00 00 00 00 jmp a7d <kthread_should_stop_or_park+0x2d>
> a7d: 48 8b 01 mov (%rcx),%rax
> a80: 48 c1 e8 02 shr $0x2,%rax
> a84: 83 e0 01 and $0x1,%eax
> a87: e9 00 00 00 00 jmp a8c <kthread_should_stop_or_park+0x3c>
> a8c: 0f 1f 40 00 nopl 0x0(%rax)

Moo, where is that optimization pass when you need it ;-)

If we forgo test_bit() and write it plainly it seems to work:

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 7e6751b29101..36f94616cae5 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -164,6 +164,15 @@ bool __kthread_should_park(struct task_struct *k)
}
EXPORT_SYMBOL_GPL(__kthread_should_park);

+bool kthread_should_stop_or_park(void)
+{
+ struct kthread *kthread = __to_kthread(current);
+ if (!kthread)
+ return false;
+
+ return kthread->flags & (BIT(KTHREAD_SHOULD_STOP) | BIT(KTHREAD_SHOULD_PARK));
+}
+
/**
* kthread_should_park - should this kthread park now?
*


0000000000001850 <kthread_should_stop_or_park>:
1850: f3 0f 1e fa endbr64
1854: 65 48 8b 04 25 00 00 00 00 mov %gs:0x0,%rax 1859: R_X86_64_32S pcpu_hot
185d: 48 8b 88 08 06 00 00 mov 0x608(%rax),%rcx
1864: 31 d2 xor %edx,%edx
1866: 48 85 c9 test %rcx,%rcx
1869: 74 0c je 1877 <kthread_should_stop_or_park+0x27>
186b: f6 40 2e 20 testb $0x20,0x2e(%rax)
186f: 74 06 je 1877 <kthread_should_stop_or_park+0x27>
1871: f6 01 06 testb $0x6,(%rcx)
1874: 0f 95 c2 setne %dl
1877: 89 d0 mov %edx,%eax
1879: e9 00 00 00 00 jmp 187e <kthread_should_stop_or_park+0x2e> 187a: R_X86_64_PLT32 __x86_return_thunk-0x4