select_idle_sibling() has a special case for tasks woken up by a per-CPU
kthread, where the selected CPU is the previous one. However, the current
condition for this exit path is incomplete. A task can be woken up from
interrupt context (e.g. hrtimer) while a per-CPU kthread is running. Such
a scenario would spuriously trigger the special case described above.
Also, a recent change made the idle task behave like a regular per-CPU
kthread, making that situation more likely to happen
(is_per_cpu_kthread(swapper) being true now).

Checking for task context makes sure select_idle_sibling() will not
interpret a wakeup from any other context as a wakeup by a per-CPU
kthread.

Fixes: 52262ee567ad ("sched/fair: Allow a per-CPU kthread waking a task to stack on the same CPU, to fix XFS performance regression")
Signed-off-by: Vincent Donnefort <[email protected]>
---
v1 -> v2:
* is_idle_thread() -> in_task() to also include spurious detection when
current != swapper. (Vincent Guittot)
---
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 945d987246c5..56db4ae85995 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6399,6 +6399,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
 	 * pattern is IO completions.
 	 */
 	if (is_per_cpu_kthread(current) &&
+	    in_task() &&
 	    prev == smp_processor_id() &&
 	    this_rq()->nr_running <= 1) {
 		return prev;
--
2.25.1
On Wed, 1 Dec 2021 at 15:35, Vincent Donnefort
<[email protected]> wrote:
> Signed-off-by: Vincent Donnefort <[email protected]>
Reviewed-by: Vincent Guittot <[email protected]>
On 01/12/21 14:34, Vincent Donnefort wrote:
> Signed-off-by: Vincent Donnefort <[email protected]>
>
Reviewed-by: Valentin Schneider <[email protected]>
On Wed, Dec 01, 2021 at 02:34:50PM +0000, Vincent Donnefort wrote:
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 945d987246c5..56db4ae85995 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6399,6 +6399,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> * pattern is IO completions.
> */
> if (is_per_cpu_kthread(current) &&
> + in_task() &&
> prev == smp_processor_id() &&
> this_rq()->nr_running <= 1) {
> return prev;
Hurmph, so now I have two 'trivial' patches from you that touch this
same function and they's conflicting. I've fixed it up, but perhaps it
would've been nice to have them combined in a series or somesuch :-)
On Sat, Dec 04, 2021 at 10:53:16AM +0100, Peter Zijlstra wrote:
> Hurmph, so now I have two 'trivial' patches from you that touch this
> same function and they's conflicting. I've fixed it up, but perhaps it
> would've been nice to have them combined in a series or somesuch :-)
>
I definitely should have created a single patchset. Apologies for the
extra work and thanks for taking those two patches!
On another subject, in case you missed them, I also have two tiny fixes,
reviewed by Vincent:
[PATCH v2 1/2] sched/fair: Fix asym_fits_capacity() task_util type
[PATCH v2 2/2] sched/fair: Fix task_fits_capacity() capacity type
The following commit has been merged into the sched/core branch of tip:
Commit-ID: 8b4e74ccb582797f6f0b0a50372ebd9fd2372a27
Gitweb: https://git.kernel.org/tip/8b4e74ccb582797f6f0b0a50372ebd9fd2372a27
Author: Vincent Donnefort <[email protected]>
AuthorDate: Wed, 01 Dec 2021 14:34:50
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Sat, 04 Dec 2021 10:56:20 +01:00
sched/fair: Fix detection of per-CPU kthreads waking a task

select_idle_sibling() has a special case for tasks woken up by a per-CPU
kthread, where the selected CPU is the previous one. However, the current
condition for this exit path is incomplete. A task can be woken up from
interrupt context (e.g. hrtimer) while a per-CPU kthread is running. Such
a scenario would spuriously trigger the special case described above.
Also, a recent change made the idle task behave like a regular per-CPU
kthread, making that situation more likely to happen
(is_per_cpu_kthread(swapper) being true now).

Checking for task context makes sure select_idle_sibling() will not
interpret a wakeup from any other context as a wakeup by a per-CPU
kthread.

Fixes: 52262ee567ad ("sched/fair: Allow a per-CPU kthread waking a task to stack on the same CPU, to fix XFS performance regression")
Signed-off-by: Vincent Donnefort <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Vincent Guittot <[email protected]>
Reviewed-by: Valentin Schneider <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
kernel/sched/fair.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 884f29d..5cd2798 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6398,6 +6398,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
 	 * pattern is IO completions.
 	 */
 	if (is_per_cpu_kthread(current) &&
+	    in_task() &&
 	    prev == smp_processor_id() &&
 	    this_rq()->nr_running <= 1) {
 		return prev;