2009-11-09 19:35:48

by Gregory Haskins

[permalink] [raw]
Subject: Re: [[email protected]: Bug when changing cpus_allowed of RT tasks?]

Hi Lucas,

Ingo asked me to take a look at the problem you are reporting. Is this a bug you are seeing in the
wild, or was this found by code-inspection? I took a look, and it looks to me that the original code
is correct, but I will keep an open mind.

In the meantime, I will try to provide some details about what is going on here. Comments inline.

>>> On 11/8/2009 at 7:16 AM, in message <[email protected]>, Ingo
Molnar <[email protected]> wrote:

> FYI. If you reply then please do so on-list.
>
> Thanks,
>
> Ingo
>
> ----- Forwarded message from Lucas De Marchi <[email protected]> -----
>
> Date: Mon, 26 Oct 2009 14:11:13 -0200
> From: Lucas De Marchi <[email protected]>
> To: Peter Zijlstra <[email protected]>, Ingo Molnar <[email protected]>
> Cc: [email protected]
> Subject: Bug when changing cpus_allowed of RT tasks?
>
> I think there's a problem when balancing RT tasks: both find_lowest_rq_rt
> and
> select_task_rq_rt check for p->rt.nr_cpus_allowed being greater than 1 to wake
> a task in another cpu:
>
> static int select_task_rq_rt(struct task_struct *p, int sd_flag, int flags)
> {
> [...]
> if (unlikely(rt_task(rq->curr)) &&
> (p->rt.nr_cpus_allowed > 1)) {
> int cpu = find_lowest_rq(p);
>
> return (cpu == -1) ? task_cpu(p) : cpu;
> }
> [...]
> }

So the intent of this logic is to say "if the task is of type RT, and it can move, see if it can move
elsewhere". Otherwise, we do not try to move it at all.

>
> static int find_lowest_rq(struct task_struct *task)
> {
> struct sched_domain *sd;
> struct cpumask *lowest_mask = __get_cpu_var(local_cpu_mask);
> int this_cpu = smp_processor_id();
> int cpu = task_cpu(task);
> cpumask_var_t domain_mask;
>
> if (task->rt.nr_cpus_allowed == 1)
> return -1; /* No other targets possible */
> [...]
> }

And the intent here is to say if the task isn't allowed to move, to not try to compute another
location as task_cpu() is the only valid home.

>
>
> What happens if rt.nr_cpus_allowed continues to be 1, but the allowed cpu is
> not the one in which task is being woken up?

The one case that I can think of that would fall into this category is during a sched_setaffinity()?

If so, what happens is that we ultimately call set_cpus_allowed() -> set_cpus_allowed_ptr(). Inside
this function, the new mask is validated independent of the nr_cpus_allowed value and we will
migrate the task away if it is no longer on a valid core.

Are there other cases?
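
To make that mechanism concrete, here is a toy userspace model of it (this is NOT kernel code: the 8-bit cpumask, the helper names, and the migration loop are all illustrative stand-ins for what set_cpus_allowed_ptr() really does):

```c
#include <stdint.h>

/* Toy model: a cpumask as a bitmask over 8 CPUs. */
struct toy_task {
	uint8_t cpus_allowed;	/* bit i set => CPU i allowed */
	int nr_cpus_allowed;
	int cpu;		/* CPU the task currently runs on */
};

static int popcount8(uint8_t m)
{
	int n = 0;
	while (m) { n += m & 1; m >>= 1; }
	return n;
}

/* Returns 0 on success, -1 if the new mask contains no online CPU. */
static int toy_set_cpus_allowed(struct toy_task *p, uint8_t new_mask,
				uint8_t online_mask)
{
	uint8_t effective = new_mask & online_mask;

	if (!effective)
		return -1;	/* the real code returns -EINVAL */

	p->cpus_allowed = effective;
	p->nr_cpus_allowed = popcount8(effective);

	/* Key point: even when nr_cpus_allowed stays 1, the task is
	 * migrated here if its current CPU is no longer allowed, so
	 * the wakeup path never sees a task stuck on a forbidden CPU. */
	if (!(effective & (1u << p->cpu))) {
		int cpu;
		for (cpu = 0; cpu < 8; cpu++)
			if (effective & (1u << cpu))
				break;
		p->cpu = cpu;	/* "migrate" to a valid CPU */
	}
	return 0;
}
```

So an affinity change from CPU 0 to CPU 2 moves the task at set-time, which is why find_lowest_rq() can safely treat nr_cpus_allowed == 1 as "already home".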

> Since set_cpus_allowed_rt just
> changed the mask and updated the value of nr_cpus_allowed_rt, as far as I
> can
> see, it remains on same CPU even if the new mask does not allow.
>
> A possible fix below, and in case it is right:
> Signed-off-by: Lucas De Marchi <[email protected]>
>
>

Until further evidence is presented, I have to respectfully NAK the patch, as I do not think it's doing the right thing,
nor do I think the current code is actually broken.

I will keep an open mind, though, so if you think there is really a problem, let's discuss.

>
> Lucas De Marchi
>
>
> --
>
> diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
> index a4d790c..531c946 100644
> --- a/kernel/sched_rt.c
> +++ b/kernel/sched_rt.c
> @@ -962,8 +962,7 @@ static int select_task_rq_rt(struct task_struct
> *p, int sd_flag, int flags)
> * that is just being woken and probably will have
> * cold cache anyway.
> */
> - if (unlikely(rt_task(rq->curr)) &&
> - (p->rt.nr_cpus_allowed > 1)) {
> + if (unlikely(rt_task(rq->curr))) {
> int cpu = find_lowest_rq(p);
>
> return (cpu == -1) ? task_cpu(p) : cpu;
> @@ -1177,7 +1176,8 @@ static int find_lowest_rq(struct task_struct *task)
> int cpu = task_cpu(task);
> cpumask_var_t domain_mask;
>
> - if (task->rt.nr_cpus_allowed == 1)
> + if ((task->rt.nr_cpus_allowed == 1) &&
> + likely(cpumask_test_cpu(cpu, &task->cpus_allowed)))
> return -1; /* No other targets possible */
>
> if (!cpupri_find(&task_rq(task)->rd->cpupri, task, lowest_mask))
>
> ----- End forwarded message -----


2009-11-09 21:12:51

by Lucas De Marchi

Subject: Re: [[email protected]: Bug when changing cpus_allowed of RT tasks?]

On Mon, Nov 9, 2009 at 17:35, Gregory Haskins <[email protected]> wrote:
> Hi Lucas,
>
> Ingo asked me to take a look at the problem you are reporting. Is this a bug you are seeing in the
> wild, or was this found by code-inspection? I took a look, and it looks to me that the original code
> is correct, but I will keep an open mind.

I found this "maybe"-bug while making some changes to the balancing of RT tasks. It
was not giving me the expected results, so on inspecting the code I assumed this
was the problem. Applying the patch solved my issue, but I now think the problem
was in my own code. Comments inline.

>>       static int select_task_rq_rt(struct task_struct *p, int sd_flag, int flags)
>>       {
>>               [...]
>>               if (unlikely(rt_task(rq->curr)) &&
>>                   (p->rt.nr_cpus_allowed > 1)) {
>>                       int cpu = find_lowest_rq(p);
>>
>>                       return (cpu == -1) ? task_cpu(p) : cpu;
>>               }
/*
* Otherwise, just let it ride on the affined RQ and the
* post-schedule router will push the preempted task away
*/
return task_cpu(p);

>>       }
I completed the rest of the function to emphasize that it will return task_cpu(p)
afterwards.

>
> So the intent of this logic is to say "if the task is of type RT, and it can move, see if it can move
> elsewhere". ?Otherwise, we do not try to move it at all.

I'd say "if _current_ is of type RT, and _p_ can move, see if _p_ can move
elsewhere". And this check is repeated for p inside find_lowest_rq, so it would
not be needed here. Just let it call find_lowest_rq and -1 will be returned.
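
As a sanity check of that claim, here is a toy model (plain C, purely illustrative; the function names and parameters are stand-ins, not the real scheduler API) showing that with the fast-path check dropped, a pinned task still ends up on task_cpu(p), because find_lowest_rq() returns -1 for nr_cpus_allowed == 1:

```c
/* Toy model of the redundancy: find_lowest_rq() already bails out
 * for pinned tasks, so the caller's nr_cpus_allowed > 1 test only
 * saves the cost of the call, it does not change the result. */

static int toy_find_lowest_rq(int nr_cpus_allowed, int lowest)
{
	if (nr_cpus_allowed == 1)
		return -1;	/* no other targets possible */
	return lowest;
}

static int toy_select_task_rq_rt(int curr_is_rt, int nr_cpus_allowed,
				 int task_cpu, int lowest)
{
	if (curr_is_rt) {	/* fast-path nr_cpus_allowed check dropped */
		int cpu = toy_find_lowest_rq(nr_cpus_allowed, lowest);
		return (cpu == -1) ? task_cpu : cpu;
	}
	return task_cpu;
}
```

With nr_cpus_allowed == 1 the pinned task stays on task_cpu either way; the only difference is one extra function call.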

>>       static int find_lowest_rq(struct task_struct *task)
>>       {
>>               [...]
>>               if (task->rt.nr_cpus_allowed == 1)
>>                       return -1; /* No other targets possible */
>>               [...]
>>       }


>> What happens if rt.nr_cpus_allowed continues to be 1, but the allowed cpu is
>> not the one in which task is being woken up?
>
> The one case that I can think of that would fall into this category is during a sched_setaffinity()?
>
> If so, what happens is that we ultimately call set_cpus_allowed() -> set_cpus_allowed_ptr(). ?Inside
> this function, the new mask is validated independent of the nr_cpus_allowed value and we will
> migrate the task away if it is no longer on a valid core.

This was exactly the case I was running into. But I was setting the affinity
inside the kernel, not going through the setaffinity syscall, and I was wrongly
calling set_cpus_allowed_rt directly instead of set_cpus_allowed, so the task
would not migrate in that case.

>
> Until further evidence is presented, I have to respectfully NAK the patch, as I do not think its doing the right thing
> nor do I think the current code is actually broken.

I see now it's not doing the right thing. IMO only the double check of
rt.nr_cpus_allowed is superfluous, but not wrong.


Thanks for the clarifications


Lucas De Marchi

2009-11-10 01:15:37

by Gregory Haskins

Subject: Re: [[email protected]: Bug when changing cpus_allowed of RT tasks?]

Lucas De Marchi wrote:
> On Mon, Nov 9, 2009 at 17:35, Gregory Haskins <[email protected]> wrote:
>
>>> static int select_task_rq_rt(struct task_struct *p, int sd_flag, int flags)
>>> {
>>> [...]
>>> if (unlikely(rt_task(rq->curr)) &&
>>> (p->rt.nr_cpus_allowed > 1)) {
>>> int cpu = find_lowest_rq(p);
>>>
>>> return (cpu == -1) ? task_cpu(p) : cpu;
>>> }
> /*
> * Otherwise, just let it ride on the affined RQ and the
> * post-schedule router will push the preempted task away
> */
> return task_cpu(p);
>
>>> }
> I completed the rest of function to emphasize it will return task_cpu(p)
> afterwards.
>
>> So the intent of this logic is to say "if the task is of type RT, and it can move, see if it can move
>> elsewhere". Otherwise, we do not try to move it at all.
>
> I'd say "if _current_ is of type RT, and _p_ can move, see if _p_ can move
> elsewhere". And this check is repeated for p inside find_lowest_rq, so it would
> not be needed here. Just let it call find_lowest_rq and -1 will be returned.

Ah, yes, "current" is correct. My bad.


>
>> Until further evidence is presented, I have to respectfully NAK the patch, as I do not think its doing the right thing
>> nor do I think the current code is actually broken.
>
> I see now it's not doing the right thing. IMO only the double check of
> rt.nr_cpus_allowed is superfluous, but not wrong.
>

Right. I have a suspicion that the original code didn't have the
redundant check, but it was patched that way later. I can't recall, tbh.

>
> Thanks for the clarifications

Np.

Kind Regards,
-Greg

