2018-05-22 23:52:39

by Joel Fernandes

Subject: [PATCH RFC] schedutil: Address the r/w ordering race in kthread

Currently there is a race in schedutil code for slow-switch single-CPU
systems. Fix it by enforcing ordering the write to work_in_progress to
happen before the read of next_freq.

Kthread                                       Sched update

sugov_work()                                  sugov_update_single()

      lock();
      // The CPU is free to rearrange below
      // two in any order, so it may clear
      // the flag first and then read next
      // freq. Lets assume it does.
      work_in_progress = false

                                              if (work_in_progress)
                                                    return;

                                              sg_policy->next_freq = 0;
      freq = sg_policy->next_freq;
                                              sg_policy->next_freq = real-freq;
      unlock();

Reported-by: Viresh Kumar <[email protected]>
CC: Rafael J. Wysocki <[email protected]>
CC: Peter Zijlstra <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Patrick Bellasi <[email protected]>
CC: Juri Lelli <[email protected]>
Cc: Luca Abeni <[email protected]>
CC: Todd Kjos <[email protected]>
CC: [email protected]
CC: [email protected]
CC: [email protected]
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
I split this into separate patch, because this race can also happen in
mainline.

kernel/sched/cpufreq_schedutil.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 5c482ec38610..ce7749da7a44 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -401,6 +401,13 @@ static void sugov_work(struct kthread_work *work)
 	 */
 	raw_spin_lock_irqsave(&sg_policy->update_lock, flags);
 	freq = sg_policy->next_freq;
+
+	/*
+	 * sugov_update_single can access work_in_progress without update_lock,
+	 * make sure next_freq is read before work_in_progress is set.
+	 */
+	smp_mb();
+
 	sg_policy->work_in_progress = false;
 	raw_spin_unlock_irqrestore(&sg_policy->update_lock, flags);

--
2.17.0.441.gb46fe60e1d-goog



2018-05-23 00:19:05

by Joel Fernandes

Subject: Re: [PATCH RFC] schedutil: Address the r/w ordering race in kthread

On Tue, May 22, 2018 at 04:50:28PM -0700, Joel Fernandes (Google) wrote:
> Currently there is a race in schedutil code for slow-switch single-CPU
> systems. Fix it by enforcing ordering the write to work_in_progress to
> happen before the read of next_freq.

Aargh, s/before/after/.

The commit log has the above issue, but the code is OK. Should I resend this
patch, or are there any additional comments? Thanks!

- Joel

[..]

2018-05-23 06:48:22

by Juri Lelli

Subject: Re: [PATCH RFC] schedutil: Address the r/w ordering race in kthread

Hi Joel,

On 22/05/18 16:50, Joel Fernandes (Google) wrote:
> Currently there is a race in schedutil code for slow-switch single-CPU
> systems. Fix it by enforcing ordering the write to work_in_progress to
> happen before the read of next_freq.
>
> Kthread                                       Sched update
>
> sugov_work()                                  sugov_update_single()
>
>       lock();
>       // The CPU is free to rearrange below
>       // two in any order, so it may clear
>       // the flag first and then read next
>       // freq. Lets assume it does.
>       work_in_progress = false
>
>                                               if (work_in_progress)
>                                                     return;
>
>                                               sg_policy->next_freq = 0;
>       freq = sg_policy->next_freq;
>                                               sg_policy->next_freq = real-freq;
>       unlock();
>
> Reported-by: Viresh Kumar <[email protected]>
> CC: Rafael J. Wysocki <[email protected]>
> CC: Peter Zijlstra <[email protected]>
> CC: Ingo Molnar <[email protected]>
> CC: Patrick Bellasi <[email protected]>
> CC: Juri Lelli <[email protected]>
> Cc: Luca Abeni <[email protected]>
> CC: Todd Kjos <[email protected]>
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> Signed-off-by: Joel Fernandes (Google) <[email protected]>
> ---
> I split this into separate patch, because this race can also happen in
> mainline.
>
> kernel/sched/cpufreq_schedutil.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 5c482ec38610..ce7749da7a44 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -401,6 +401,13 @@ static void sugov_work(struct kthread_work *work)
> */
> raw_spin_lock_irqsave(&sg_policy->update_lock, flags);
> freq = sg_policy->next_freq;
> +
> + /*
> + * sugov_update_single can access work_in_progress without update_lock,
> + * make sure next_freq is read before work_in_progress is set.

s/set/reset/

> + */
> + smp_mb();
> +

Also, doesn't this need a corresponding barrier (I guess in
sugov_should_update_freq)? That being a wmb and this a rmb?

Best,

- Juri
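
As an illustration of the pairing being asked about, here is a minimal
userspace sketch using C11 atomics rather than the kernel's smp_mb(). It is
not the patch and not the schedutil code: struct policy_model, model_work()
and model_update() are hypothetical stand-ins, and the sketch assumes a
release store on the kthread side paired with an acquire load on the update
side. If the update side observes the cleared flag, the kthread's earlier
read of next_freq is ordered before the update side's later write.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two sg_policy fields discussed above. */
struct policy_model {
	atomic_uint next_freq;
	atomic_bool work_in_progress;
};

/* Kthread side: read next_freq first, then clear the flag. */
static unsigned int model_work(struct policy_model *p)
{
	unsigned int freq;

	/* In this model the worker only runs while work is marked pending. */
	assert(atomic_load_explicit(&p->work_in_progress, memory_order_relaxed));

	freq = atomic_load_explicit(&p->next_freq, memory_order_relaxed);

	/* Release: the load above cannot be reordered after this store. */
	atomic_store_explicit(&p->work_in_progress, false, memory_order_release);
	return freq;
}

/* Update side: touch next_freq only after seeing the flag clear. */
static bool model_update(struct policy_model *p, unsigned int freq)
{
	/* Acquire: pairs with the release store in model_work(). */
	if (atomic_load_explicit(&p->work_in_progress, memory_order_acquire))
		return false;

	atomic_store_explicit(&p->next_freq, freq, memory_order_relaxed);
	atomic_store_explicit(&p->work_in_progress, true, memory_order_relaxed);
	return true;
}
```

Both sides need a load ordered before a later store, which is why the sketch
uses release/acquire on the flag; a barrier on only one side would leave the
other side free to reorder.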

2018-05-23 08:26:06

by Rafael J. Wysocki

Subject: Re: [PATCH RFC] schedutil: Address the r/w ordering race in kthread

On Wed, May 23, 2018 at 1:50 AM, Joel Fernandes (Google)
<[email protected]> wrote:
> Currently there is a race in schedutil code for slow-switch single-CPU
> systems. Fix it by enforcing ordering the write to work_in_progress to
> happen before the read of next_freq.
>
> Kthread                                       Sched update
>
> sugov_work()                                  sugov_update_single()
>
>       lock();
>       // The CPU is free to rearrange below
>       // two in any order, so it may clear
>       // the flag first and then read next
>       // freq. Lets assume it does.
>       work_in_progress = false
>
>                                               if (work_in_progress)
>                                                     return;
>
>                                               sg_policy->next_freq = 0;
>       freq = sg_policy->next_freq;
>                                               sg_policy->next_freq = real-freq;
>       unlock();
>
> Reported-by: Viresh Kumar <[email protected]>
> CC: Rafael J. Wysocki <[email protected]>
> CC: Peter Zijlstra <[email protected]>
> CC: Ingo Molnar <[email protected]>
> CC: Patrick Bellasi <[email protected]>
> CC: Juri Lelli <[email protected]>
> Cc: Luca Abeni <[email protected]>
> CC: Todd Kjos <[email protected]>
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> Signed-off-by: Joel Fernandes (Google) <[email protected]>
> ---
> I split this into separate patch, because this race can also happen in
> mainline.
>
> kernel/sched/cpufreq_schedutil.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 5c482ec38610..ce7749da7a44 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -401,6 +401,13 @@ static void sugov_work(struct kthread_work *work)
> */
> raw_spin_lock_irqsave(&sg_policy->update_lock, flags);
> freq = sg_policy->next_freq;
> +
> + /*
> + * sugov_update_single can access work_in_progress without update_lock,
> + * make sure next_freq is read before work_in_progress is set.
> + */
> + smp_mb();
> +

This requires a corresponding barrier somewhere else.

> sg_policy->work_in_progress = false;
> raw_spin_unlock_irqrestore(&sg_policy->update_lock, flags);
>
> --

Also, as I said I actually would prefer to use the spinlock in the
one-CPU case when the kthread is used.

I'll have a patch for that shortly.
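
For comparison, a lock-based approach along the lines preferred above could
look roughly like the following userspace sketch. It is only an illustration
under stated assumptions, not the forthcoming patch: struct locked_policy,
model_lock(), model_unlock(), locked_work() and locked_update() are
hypothetical names, and a small C11 atomic_flag spinlock stands in for
update_lock. Because both paths read and write the two fields only with the
lock held, no separate barrier is needed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: both fields are only accessed with the lock held. */
struct locked_policy {
	atomic_flag lock;		/* stand-in for update_lock */
	unsigned int next_freq;
	bool work_in_progress;
};

static void model_lock(struct locked_policy *p)
{
	/* Spin until the flag was previously clear; acquire on success. */
	while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
		;
}

static void model_unlock(struct locked_policy *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Kthread side: same two steps as the patch context, under the lock. */
static unsigned int locked_work(struct locked_policy *p)
{
	unsigned int freq;

	model_lock(p);
	assert(p->work_in_progress);	/* worker runs only for pending work */
	freq = p->next_freq;
	p->work_in_progress = false;
	model_unlock(p);
	return freq;
}

/* Update side: check the flag and publish the frequency under the lock. */
static bool locked_update(struct locked_policy *p, unsigned int freq)
{
	bool queued = false;

	model_lock(p);
	if (!p->work_in_progress) {
		p->next_freq = freq;
		p->work_in_progress = true;
		queued = true;
	}
	model_unlock(p);
	return queued;
}
```

The trade-off in this sketch is that the update path now takes the lock on
every call, in exchange for not having to reason about barrier pairing.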