From: Scott Wood <[email protected]>
This function is concerned with the long-term cpu mask, not the
transitory mask the task might have while migrate disabled. Before
this patch, if a task was migrate disabled at the time
__set_cpus_allowed_ptr() was called, and the new mask happened to be
equal to the cpu that the task was running on, then the mask update
would be lost.
Signed-off-by: Scott Wood <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
---
kernel/sched/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1637,7 +1637,7 @@ static int __set_cpus_allowed_ptr(struct
 		goto out;
 	}
 
-	if (cpumask_equal(p->cpus_ptr, new_mask))
+	if (cpumask_equal(&p->cpus_mask, new_mask))
 		goto out;
 
 	/*
On 17/06/20 13:17, Sebastian Andrzej Siewior wrote:
> From: Scott Wood <[email protected]>
>
> This function is concerned with the long-term cpu mask, not the
> transitory mask the task might have while migrate disabled. Before
> this patch, if a task was migrate disabled at the time
> __set_cpus_allowed_ptr() was called, and the new mask happened to be
> equal to the cpu that the task was running on, then the mask update
> would be lost.
>
> Signed-off-by: Scott Wood <[email protected]>
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> ---
> kernel/sched/core.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1637,7 +1637,7 @@ static int __set_cpus_allowed_ptr(struct
> goto out;
> }
>
> - if (cpumask_equal(p->cpus_ptr, new_mask))
> + if (cpumask_equal(&p->cpus_mask, new_mask))
> goto out;
>
> /*
Makes sense, but what about the rest of the checks? Further down there is
	/* Can the task run on the task's current CPU? If so, we're done */
	if (cpumask_test_cpu(task_cpu(p), new_mask))
		goto out;
If the task is currently migrate disabled and for some stupid reason it
gets affined elsewhere, we could try to move it out - which AFAICT we don't
want to do because migrate disabled. So I suppose you'd want an extra
bailout condition here when the task is migrate disabled.
ISTR in RT you do re-check the affinity and potentially move the task away
when re-enabling migration, so that should work out all fine.
On Wed, 2020-06-17 at 15:15 +0100, Valentin Schneider wrote:
> On 17/06/20 13:17, Sebastian Andrzej Siewior wrote:
> > From: Scott Wood <[email protected]>
> >
> > This function is concerned with the long-term cpu mask, not the
> > transitory mask the task might have while migrate disabled. Before
> > this patch, if a task was migrate disabled at the time
> > __set_cpus_allowed_ptr() was called, and the new mask happened to be
> > equal to the cpu that the task was running on, then the mask update
> > would be lost.
> >
> > Signed-off-by: Scott Wood <[email protected]>
> > Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> > ---
> > kernel/sched/core.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -1637,7 +1637,7 @@ static int __set_cpus_allowed_ptr(struct
> > goto out;
> > }
> >
> > - if (cpumask_equal(p->cpus_ptr, new_mask))
> > + if (cpumask_equal(&p->cpus_mask, new_mask))
> > goto out;
> >
> > /*
>
> Makes sense, but what about the rest of the checks? Further down there is
>
> 	/* Can the task run on the task's current CPU? If so, we're done */
> 	if (cpumask_test_cpu(task_cpu(p), new_mask))
> 		goto out;
>
> If the task is currently migrate disabled and for some stupid reason it
> gets affined elsewhere, we could try to move it out - which AFAICT we don't
> want to do because migrate disabled. So I suppose you'd want an extra
> bailout condition here when the task is migrate disabled.
>
> ISTR in RT you do re-check the affinity and potentially move the task away
> when re-enabling migration, so that should work out all fine.
On RT the above test is:
	/* Can the task run on the task's current CPU? If so, we're done */
	if (cpumask_test_cpu(task_cpu(p), new_mask) ||
	    p->cpus_ptr != &p->cpus_mask)
		goto out;
...so we do bail out if we're migrate disabled.
-Scott
On 2020-06-17 17:49:48 [-0500], Scott Wood wrote:
> > Makes sense, but what about the rest of the checks? Further down there is
> >
> > 	/* Can the task run on the task's current CPU? If so, we're done */
> > 	if (cpumask_test_cpu(task_cpu(p), new_mask))
> > 		goto out;
> >
> > If the task is currently migrate disabled and for some stupid reason it
> > gets affined elsewhere, we could try to move it out - which AFAICT we don't
> > want to do because migrate disabled. So I suppose you'd want an extra
> > bailout condition here when the task is migrate disabled.
> >
> > ISTR in RT you do re-check the affinity and potentially move the task away
> > when re-enabling migration, so that should work out all fine.
>
> On RT the above test is:
>
> 	/* Can the task run on the task's current CPU? If so, we're done */
> 	if (cpumask_test_cpu(task_cpu(p), new_mask) ||
> 	    p->cpus_ptr != &p->cpus_mask)
> 		goto out;
>
> ...so we do bail out if we're migrate disabled.
correct. There is a complete migrate_disable() patch in the RT queue
which has to wait. This patch however looked to be independent of that
and could "fix" the pointer part which is already here so I sent it.
> -Scott
Sebastian
On 18/06/20 09:07, Sebastian Andrzej Siewior wrote:
> On 2020-06-17 17:49:48 [-0500], Scott Wood wrote:
>> > Makes sense, but what about the rest of the checks? Further down there is
>> >
>> > 	/* Can the task run on the task's current CPU? If so, we're done */
>> > 	if (cpumask_test_cpu(task_cpu(p), new_mask))
>> > 		goto out;
>> >
>> > If the task is currently migrate disabled and for some stupid reason it
>> > gets affined elsewhere, we could try to move it out - which AFAICT we don't
>> > want to do because migrate disabled. So I suppose you'd want an extra
>> > bailout condition here when the task is migrate disabled.
>> >
>> > ISTR in RT you do re-check the affinity and potentially move the task away
>> > when re-enabling migration, so that should work out all fine.
>>
>> On RT the above test is:
>>
>> 	/* Can the task run on the task's current CPU? If so, we're done */
>> 	if (cpumask_test_cpu(task_cpu(p), new_mask) ||
>> 	    p->cpus_ptr != &p->cpus_mask)
>> 		goto out;
>>
>> ...so we do bail out if we're migrate disabled.
>
> correct. There is a complete migrate_disable() patch in the RT queue
> which has to wait. This patch however looked to be independent of that
> and could "fix" the pointer part which is already here so I sent it.
>
Okay, thanks. I don't see any harm in including that extra check with the
patch, but either way:
Reviewed-by: Valentin Schneider <[email protected]>
>> -Scott
>
> Sebastian
The following commit has been merged into the sched/urgent branch of tip:
Commit-ID: a51dde512a93959d1df3a915fa3e3df68b1fc94d
Gitweb: https://git.kernel.org/tip/a51dde512a93959d1df3a915fa3e3df68b1fc94d
Author: Scott Wood <[email protected]>
AuthorDate: Wed, 17 Jun 2020 14:17:42 +02:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Mon, 22 Jun 2020 20:51:05 +02:00
sched: __set_cpus_allowed_ptr(): Check cpus_mask, not cpus_ptr
This function is concerned with the long-term cpu mask, not the
transitory mask the task might have while migrate disabled. Before
this patch, if a task was migrate disabled at the time
__set_cpus_allowed_ptr() was called, and the new mask happened to be
equal to the cpu that the task was running on, then the mask update
would be lost.
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Scott Wood <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
kernel/sched/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8298b2c..9f8aa34 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1637,7 +1637,7 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
 		goto out;
 	}
 
-	if (cpumask_equal(p->cpus_ptr, new_mask))
+	if (cpumask_equal(&p->cpus_mask, new_mask))
 		goto out;
 
 	/*
The following commit has been merged into the sched/urgent branch of tip:
Commit-ID: 16568f1f4fd4decee6935751d5ada1f963e5bd5f
Gitweb: https://git.kernel.org/tip/16568f1f4fd4decee6935751d5ada1f963e5bd5f
Author: Scott Wood <[email protected]>
AuthorDate: Wed, 17 Jun 2020 14:17:42 +02:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Tue, 23 Jun 2020 10:42:30 +02:00
sched/core: Check cpus_mask, not cpus_ptr in __set_cpus_allowed_ptr(), to fix mask corruption
This function is concerned with the long-term CPU mask, not the
transitory mask the task might have while migrate disabled. Before
this patch, if a task was migrate-disabled at the time
__set_cpus_allowed_ptr() was called, and the new mask happened to be
equal to the CPU that the task was running on, then the mask update
would be lost.
Signed-off-by: Scott Wood <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
kernel/sched/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8f36032..9eeac94 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1637,7 +1637,7 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
 		goto out;
 	}
 
-	if (cpumask_equal(p->cpus_ptr, new_mask))
+	if (cpumask_equal(&p->cpus_mask, new_mask))
 		goto out;
 
 	/*
/*