2011-05-10 13:04:17

by Hillf Danton

[permalink] [raw]
Subject: [PATCH] sched: correct how RT task is picked

When picking RT task for given CPU,
[1] if the cpu is invalid for cpumask test, right result could not be
reached even by further checking nr_cpus_allowed,
on the other hand, the input cpu is valid in two cases that
pick_next_highest_task_rt() is called, thus the invalid input cpu
looks over-concern.
[2] if the cpu is valid for cpumask test, further checking
nr_cpus_allowed looks overwork, since it is computed based on
cpus_allowed,
what is more, the combination of cpus_allowed and nr_cpus_allowed
could lead to incorrect result if the input cpu == rq->cpu, as in the
case of next_prio() where no pulling task is concerned.

In this work, invalid cpu is not removed but leads to negative result,
but nr_cpus_allowed is.

Signed-off-by: Hillf Danton <[email protected]>
---

--- a/kernel/sched_rt.c 2011-04-27 11:48:50.000000000 +0800
+++ b/kernel/sched_rt.c 2011-05-10 20:30:38.000000000 +0800
@@ -1149,10 +1149,12 @@ static void deactivate_task(struct rq *r

static int pick_rt_task(struct rq *rq, struct task_struct *p, int cpu)
{
- if (!task_running(rq, p) &&
- (cpu < 0 || cpumask_test_cpu(cpu, &p->cpus_allowed)) &&
- (p->rt.nr_cpus_allowed > 1))
- return 1;
+ if (!task_running(rq, p)) {
+ if (cpu < 0)
+ return 0;
+ if (cpumask_test_cpu(cpu, &p->cpus_allowed))
+ return 1;
+ }
return 0;
}


2011-05-11 15:57:24

by Yong Zhang

[permalink] [raw]
Subject: Re: [PATCH] sched: correct how RT task is picked

On Tue, May 10, 2011 at 9:04 PM, Hillf Danton <[email protected]> wrote:
> When picking RT task for given CPU,
> [1] if the cpu is invalid for cpumask test, right result could not be

'cpu is invalid' means weather we care it or not, it's not real 'invalid'

> reached even by further checking nr_cpus_allowed,
> on the other hand, the input cpu is valid in two cases that
> pick_next_highest_task_rt() is called, thus the invalid input cpu
> looks over-concern.
> [2] if the cpu is valid for cpumask test, further checking
> nr_cpus_allowed looks overwork, since it is computed based on
> cpus_allowed,

No, cpumask_test_cpu(cpu, &p->cpus_allowed) doesn't mean
p->rt.nr_cpus_allowed > 1.

Thanks,
Yong

> what is more, the combination of cpus_allowed and nr_cpus_allowed
> could lead to incorrect result if the input cpu == rq->cpu, as in the
> case of next_prio() where no pulling task is concerned.
>
> In this work, invalid cpu is not removed but leads to negative result,
> but nr_cpus_allowed is.
>
> Signed-off-by: Hillf Danton <[email protected]>
> ---
>
> --- a/kernel/sched_rt.c 2011-04-27 11:48:50.000000000 +0800
> +++ b/kernel/sched_rt.c 2011-05-10 20:30:38.000000000 +0800
> @@ -1149,10 +1149,12 @@ static void deactivate_task(struct rq *r
>
>  static int pick_rt_task(struct rq *rq, struct task_struct *p, int cpu)
>  {
> -       if (!task_running(rq, p) &&
> -           (cpu < 0 || cpumask_test_cpu(cpu, &p->cpus_allowed)) &&
> -           (p->rt.nr_cpus_allowed > 1))
> -               return 1;
> +       if (!task_running(rq, p)) {
> +               if (cpu < 0)
> +                       return 0;
> +               if (cpumask_test_cpu(cpu, &p->cpus_allowed))
> +                       return 1;
> +       }
>        return 0;
>  }
>



--
Only stand for myself

2011-05-11 17:53:53

by Hillf Danton

[permalink] [raw]
Subject: Re: [PATCH] sched: correct how RT task is picked

On Wed, May 11, 2011 at 4:43 PM, Yong Zhang <[email protected]> wrote:
> On Tue, May 10, 2011 at 9:04 PM, Hillf Danton <[email protected]> wrote:
>> When picking RT task for given CPU,
>> [1] if the cpu is invalid for cpumask test, right result could not be
>
> 'cpu is invalid' means weather we care it or not, it's not real 'invalid'
>
If cpu is not cared, how to determine whether it is allowed for task to run?

>> reached even by further checking nr_cpus_allowed,
>> on the other hand, the input cpu is valid in two cases that
>> pick_next_highest_task_rt() is called, thus the invalid input cpu
>> looks over-concern.
>> [2] if the cpu is valid for cpumask test, further checking
>> nr_cpus_allowed looks overwork, since it is computed based on
>> cpus_allowed,
>
> No, cpumask_test_cpu(cpu, &p->cpus_allowed) doesn't mean
> p->rt.nr_cpus_allowed > 1.
>
If cpu is allowed for task to run, then why more cpus are enforced?

thanks
Hillf

2011-05-12 12:06:16

by Yong Zhang

[permalink] [raw]
Subject: Re: [PATCH] sched: correct how RT task is picked

On Wed, May 11, 2011 at 09:44:04PM +0800, Hillf Danton wrote:
> On Wed, May 11, 2011 at 4:43 PM, Yong Zhang <[email protected]> wrote:
> > On Tue, May 10, 2011 at 9:04 PM, Hillf Danton <[email protected]> wrote:
> >> When picking RT task for given CPU,
> >> [1] if the cpu is invalid for cpumask test, right result could not be
> >
> > 'cpu is invalid' means weather we care it or not, it's not real 'invalid'
> >
> If cpu is not cared, how to determine whether it is allowed for task to run?

pick_next_highest_task_rt() can be used to get the next highest pullable
task on a certain rq(regradless on which cpu that task could run). but
currently we have no such kind of caller.

>
> >> reached even by further checking nr_cpus_allowed,
> >> on the other hand, the input cpu is valid in two cases that
> >> pick_next_highest_task_rt() is called, thus the invalid input cpu
> >> looks over-concern.
> >> [2] if the cpu is valid for cpumask test, further checking
> >> nr_cpus_allowed looks overwork, since it is computed based on
> >> cpus_allowed,
> >
> > No, cpumask_test_cpu(cpu, &p->cpus_allowed) doesn't mean
> > p->rt.nr_cpus_allowed > 1.
> >
> If cpu is allowed for task to run, then why more cpus are enforced?

I think you can take a look at next_prio(), it just calculate the
next highest task on the current cpu; in this case,
cpumask_test_cpu(cpu, &p->cpus_allowed) will be true for the most
of time, but maybe that task is bound to this cpu.

Thanks,
Yong
>
> thanks
> Hillf

2011-05-18 01:38:45

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] sched: correct how RT task is picked

On Thu, May 12, 2011 at 08:06:06PM +0800, Yong Zhang wrote:
> On Wed, May 11, 2011 at 09:44:04PM +0800, Hillf Danton wrote:
> > On Wed, May 11, 2011 at 4:43 PM, Yong Zhang <[email protected]> wrote:
> > > On Tue, May 10, 2011 at 9:04 PM, Hillf Danton <[email protected]> wrote:
> > >> When picking RT task for given CPU,
> > >> [1] if the cpu is invalid for cpumask test, right result could not be
> > >
> > > 'cpu is invalid' means weather we care it or not, it's not real 'invalid'
> > >
> > If cpu is not cared, how to determine whether it is allowed for task to run?
>
> pick_next_highest_task_rt() can be used to get the next highest pullable
> task on a certain rq(regradless on which cpu that task could run). but
> currently we have no such kind of caller.
>
> >
> > >> reached even by further checking nr_cpus_allowed,
> > >> on the other hand, the input cpu is valid in two cases that
> > >> pick_next_highest_task_rt() is called, thus the invalid input cpu
> > >> looks over-concern.
> > >> [2] if the cpu is valid for cpumask test, further checking
> > >> nr_cpus_allowed looks overwork, since it is computed based on
> > >> cpus_allowed,
> > >
> > > No, cpumask_test_cpu(cpu, &p->cpus_allowed) doesn't mean
> > > p->rt.nr_cpus_allowed > 1.
> > >
> > If cpu is allowed for task to run, then why more cpus are enforced?
>
> I think you can take a look at next_prio(), it just calculate the
> next highest task on the current cpu; in this case,
> cpumask_test_cpu(cpu, &p->cpus_allowed) will be true for the most
> of time, but maybe that task is bound to this cpu.

I've been looking at the history here, and I think that '-1' is a relic.

If you look at sched_rt.c in f65eda4f789168ba5ff3fa75546c29efeed19f58:

$ git show f65eda4f:kernel/sched_rt.c

You'll see that push_rt_task calls pick_next_highest_task_rt() with a
-1. That code has long been replaced.

I'm a bit nervous about taking Hillf's patch as is. But a little more
reviewing and testing may prove that it is legit.

-- Steve

2011-05-18 02:31:55

by Yong Zhang

[permalink] [raw]
Subject: Re: [PATCH] sched: correct how RT task is picked

On Wed, May 18, 2011 at 9:38 AM, Steven Rostedt <[email protected]> wrote:
>> I think you can take a look at next_prio(), it just calculate the
>> next highest task on the current cpu; in this case,
>> cpumask_test_cpu(cpu, &p->cpus_allowed) will be true for the most
>> of time, but maybe that task is bound to this cpu.
>
> I've been looking at the history here, and I think that '-1' is a relic.
>
> If you look at sched_rt.c in f65eda4f789168ba5ff3fa75546c29efeed19f58:
>
> $ git show f65eda4f:kernel/sched_rt.c
>
> You'll see that push_rt_task calls pick_next_highest_task_rt() with a
> -1. That code has long been replaced.

Yeah, the condition "cpu < 0" could be removed since we have no
that kind of caller.

>
> I'm a bit nervous about taking Hillf's patch as is. But a little more
> reviewing and testing may prove that it is legit.

But another point is like I said before:
'cpumask_test_cpu(cpu, &p->cpus_allowed)' doesn't equal to
'p->rt.nr_cpus_allowed > 1' because we could have bounded
task.
So the condition 'if (cpumask_test_cpu(cpu, &p->cpus_allowed))'
in Hillf's patch is not sufficient.

Thanks,
Yong



--
Only stand for myself

2011-05-18 13:19:47

by Hillf Danton

[permalink] [raw]
Subject: Re: [PATCH] sched: correct how RT task is picked

On Wed, May 18, 2011 at 10:31 AM, Yong Zhang <[email protected]> wrote:
> On Wed, May 18, 2011 at 9:38 AM, Steven Rostedt <[email protected]> wrote:
>>> I think you can take a look at next_prio(), it just calculate the
>>> next highest task on the current cpu; in this case,
>>> cpumask_test_cpu(cpu, &p->cpus_allowed) will be true for the most
>>> of time, but maybe that task is bound to this cpu.
>>
>> I've been looking at the history here, and I think that '-1' is a relic.
>>
>> If you look at sched_rt.c in f65eda4f789168ba5ff3fa75546c29efeed19f58:
>>
>> $ git show f65eda4f:kernel/sched_rt.c
>>
>> You'll see that push_rt_task calls pick_next_highest_task_rt() with a
>> -1. That code has long been replaced.
>
> Yeah, the condition "cpu < 0" could be removed since we have no
> that kind of caller.
>
>>
>> I'm a bit nervous about taking Hillf's patch as is. But a little more
>> reviewing and testing may prove that it is legit.
>
> But another point is like I said before:
> 'cpumask_test_cpu(cpu, &p->cpus_allowed)' doesn't equal to
> 'p->rt.nr_cpus_allowed > 1' because we could have bounded
> task.
> So the condition 'if (cpumask_test_cpu(cpu, &p->cpus_allowed))'
> in Hillf's patch is not sufficient.
>

Hi all

The patch is prepared again, in which tests for both cpu and
nr_cpus_allowed are dropped.

The reason to drop nr_cpus_allowed is to make sure that the
returned value is correct for both case that cpu == rq->cpu and
case that cpu != rq->cpu.

thanks

Hillf
---

--- a/kernel/sched_rt.c 2011-04-27 11:48:50.000000000 +0800
+++ b/kernel/sched_rt.c 2011-05-18 21:16:22.000000000 +0800
@@ -1149,9 +1149,7 @@ static void deactivate_task(struct rq *r

static int pick_rt_task(struct rq *rq, struct task_struct *p, int cpu)
{
- if (!task_running(rq, p) &&
- (cpu < 0 || cpumask_test_cpu(cpu, &p->cpus_allowed)) &&
- (p->rt.nr_cpus_allowed > 1))
+ if (!task_running(rq, p) && cpumask_test_cpu(cpu, &p->cpus_allowed))
return 1;
return 0;
}

2011-05-18 13:24:09

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] sched: correct how RT task is picked

On Wed, 2011-05-18 at 21:19 +0800, Hillf Danton wrote:

> Hi all
>
> The patch is prepared again, in which tests for both cpu and
> nr_cpus_allowed are dropped.
>
> The reason to drop nr_cpus_allowed is to make sure that the
> returned value is correct for both case that cpu == rq->cpu and
> case that cpu != rq->cpu.

-ENOPARSE

Why would we pick a task that can't migrate?

-- Steve

>
> thanks
>
> Hillf
> ---
>
> --- a/kernel/sched_rt.c 2011-04-27 11:48:50.000000000 +0800
> +++ b/kernel/sched_rt.c 2011-05-18 21:16:22.000000000 +0800
> @@ -1149,9 +1149,7 @@ static void deactivate_task(struct rq *r
>
> static int pick_rt_task(struct rq *rq, struct task_struct *p, int cpu)
> {
> - if (!task_running(rq, p) &&
> - (cpu < 0 || cpumask_test_cpu(cpu, &p->cpus_allowed)) &&
> - (p->rt.nr_cpus_allowed > 1))
> + if (!task_running(rq, p) && cpumask_test_cpu(cpu, &p->cpus_allowed))
> return 1;
> return 0;
> }

2011-05-18 14:46:22

by Hillf Danton

[permalink] [raw]
Subject: Re: [PATCH] sched: correct how RT task is picked

On Wed, May 18, 2011 at 9:24 PM, Steven Rostedt <[email protected]> wrote:
> On Wed, 2011-05-18 at 21:19 +0800, Hillf Danton wrote:
>
>> Hi all
>>
>> The patch is prepared again, in which tests for both cpu and
>> nr_cpus_allowed are dropped.
>>
>> The reason to drop nr_cpus_allowed is to make sure that the
>> returned value is correct for both case that cpu == rq->cpu and
>> case that cpu != rq->cpu.
>
> -ENOPARSE
>
> Why would we pick a task that can't migrate?
>
Hi Steven

For migration, it is the case that cpu != rq->cpu, and
if cpu is allowed by task's affinity, it is bug that task
could not goto cpu because of nr_cpus_allowed since
the nr_cpus_allowed is computed based on the cpus_allowed mask.

Hillf

2011-05-18 15:15:58

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] sched: correct how RT task is picked

On Wed, 2011-05-18 at 22:46 +0800, Hillf Danton wrote:

> For migration, it is the case that cpu != rq->cpu, and
> if cpu is allowed by task's affinity, it is bug that task
> could not goto cpu because of nr_cpus_allowed since
> the nr_cpus_allowed is computed based on the cpus_allowed mask.

Right, that nr_cpus_allowed was to be a short cut, so we did not have to
look at the cpu mask. But as that check came after, it was pointless.
Perhaps that too was a relic with the cpu < 0 case.

Acked-by: Steven Rostedt <[email protected]>

-- Steve

2011-05-18 15:21:11

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] sched: correct how RT task is picked

On Wed, 2011-05-18 at 11:15 -0400, Steven Rostedt wrote:
> On Wed, 2011-05-18 at 22:46 +0800, Hillf Danton wrote:
>
> > For migration, it is the case that cpu != rq->cpu, and
> > if cpu is allowed by task's affinity, it is bug that task
> > could not goto cpu because of nr_cpus_allowed since
> > the nr_cpus_allowed is computed based on the cpus_allowed mask.
>
> Right, that nr_cpus_allowed was to be a short cut, so we did not have to
> look at the cpu mask. But as that check came after, it was pointless.
> Perhaps that too was a relic with the cpu < 0 case.
>
> Acked-by: Steven Rostedt <[email protected]>

Hillf could you send a final patch with a proper changelog and Steven's
ack? From what I can see the latest patch is burried somewhere in this
discussion thread and is without changelog.

2011-05-19 01:41:29

by Yong Zhang

[permalink] [raw]
Subject: Re: [PATCH] sched: correct how RT task is picked

On Wed, May 18, 2011 at 10:46 PM, Hillf Danton <[email protected]> wrote:
> On Wed, May 18, 2011 at 9:24 PM, Steven Rostedt <[email protected]> wrote:
>> On Wed, 2011-05-18 at 21:19 +0800, Hillf Danton wrote:
>>
>>> Hi all
>>>
>>> The patch is prepared again, in which tests for both cpu and
>>> nr_cpus_allowed are dropped.
>>>
>>> The reason to drop nr_cpus_allowed is to make sure that the
>>> returned value is correct for both case that cpu == rq->cpu and
>>> case that cpu != rq->cpu.
>>
>> -ENOPARSE
>>
>> Why would we pick a task that can't migrate?
>>
> Hi Steven
>
> For migration, it is the case that cpu != rq->cpu, and
> if cpu is allowed by task's affinity, it is bug that task
> could not goto cpu because of nr_cpus_allowed since
> the nr_cpus_allowed is computed based on the cpus_allowed mask.

But for next_prio(), we just calculate the next highest migratible task
for a rq regardless on which cpu that task will run, say we let cpu=rq->cpu
to be second parameter of pick_next_highest_task_rt(), IOW, it has
the same effect as cpu = -1

Thanks,
Yong



--
Only stand for myself

2011-05-19 01:44:23

by Yong Zhang

[permalink] [raw]
Subject: Re: [PATCH] sched: correct how RT task is picked

On Thu, May 19, 2011 at 9:41 AM, Yong Zhang <[email protected]> wrote:
> On Wed, May 18, 2011 at 10:46 PM, Hillf Danton <[email protected]> wrote:
>> On Wed, May 18, 2011 at 9:24 PM, Steven Rostedt <[email protected]> wrote:
>>> On Wed, 2011-05-18 at 21:19 +0800, Hillf Danton wrote:
>>>
>>>> Hi all
>>>>
>>>> The patch is prepared again, in which tests for both cpu and
>>>> nr_cpus_allowed are dropped.
>>>>
>>>> The reason to drop nr_cpus_allowed is to make sure that the
>>>> returned value is correct for both case that cpu == rq->cpu and
>>>> case that cpu != rq->cpu.
>>>
>>> -ENOPARSE
>>>
>>> Why would we pick a task that can't migrate?
>>>
>> Hi Steven
>>
>> For migration, it is the case that cpu != rq->cpu, and
>> if cpu is allowed by task's affinity, it is bug that task
>> could not goto cpu because of nr_cpus_allowed since
>> the nr_cpus_allowed is computed based on the cpus_allowed mask.
>
> But for next_prio(), we just calculate the next highest migratible task
> for a rq regardless on which cpu that task will run, say we let cpu=rq->cpu
> to be second parameter of pick_next_highest_task_rt(), IOW, it has
> the same effect as cpu = -1
>

That means we could get a bound task's priority for the rt_rq's next highest
priority.


> Thanks,
> Yong
>
>
>
> --
> Only stand for myself
>



--
Only stand for myself

2011-05-19 12:49:08

by Hillf Danton

[permalink] [raw]
Subject: [PATCH resend] sched: correct how RT task is picked

When picking RT task for given runqueue and CPU, if CPU is less than
zero, then the check for task's CPU affinity is bypassed, which
creates incorrect result if the number of CPUs for the task to run is
bigger than one.
And lets check the case that CPU is not less than zero, if CPU is in
the mask of cpus_allowed and if CPU == rq->cpu, incorrect result could
also be reached if the number of CPUs for the task to run is not
bigger than one.
On the other hand, if CPU is in the mask of cpus_allowed and if CPU !=
rq->cpu, checking nr_cpus_allowed is overwork since it is computed
based on cpus_allowed as seen in set_cpus_allowed_rt().

In the patch the checks for both CPU and nr_cpus_allowed are removed
for correct results in the case that CPU is different from rq->cpu and
in the case that CPU is not.

Signed-off-by: Hillf Danton <[email protected]>
Acked-by: Steven Rostedt <[email protected]>
---

--- a/kernel/sched_rt.c 2011-04-27 11:48:50.000000000 +0800
+++ b/kernel/sched_rt.c 2011-05-19 20:05:46.000000000 +0800
@@ -1149,11 +1149,8 @@ static void deactivate_task(struct rq *r

static int pick_rt_task(struct rq *rq, struct task_struct *p, int cpu)
{
- if (!task_running(rq, p) &&
- (cpu < 0 || cpumask_test_cpu(cpu, &p->cpus_allowed)) &&
- (p->rt.nr_cpus_allowed > 1))
- return 1;
- return 0;
+ return !task_running(rq, p) &&
+ cpumask_test_cpu(cpu, &p->cpus_allowed);
}

/* Return the second highest RT task, NULL otherwise */