2020-04-30 14:08:53

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH 2/4] sched: set new prio after checking schedule policy

On 30/04/2020 14:13, Hillf Danton wrote:
>
> On Tue, 28 Apr 2020 17:32:45 Valentin Schneider wrote:
>>
>>> + else if (fair_policy(policy)) {
>>> + if (attr->sched_nice < MIN_NICE ||
>>> + attr->sched_nice > MAX_NICE)
>>> + return -EINVAL;
>>
>> We can't hit this with the syscall route, since we (silently) clamp those
>> values in sched_copy_attr(). setpriority() does the same. There's this
>> comment in sched_copy_attr() that asks whether we should clamp or return an
>> error; seems like the current consensus is on clamping, but then we might
>> want to get rid of that comment :)
>>
> Yes it's quite likely for me to miss the cases covered by that clamp;
> otherwise what is added does not break that consensus.
>
>>> + newprio = NICE_TO_PRIO(attr->sched_nice);
>>
>> This is new, however AFAICT it doesn't change anything for CFS (or about to
>> be) tasks since what matters is calling check_class_changed() further down.
>
> Yes it's only used by rt_effective_prio().
>

Looks like changing a SCHED_NORMAL to a SCHED_BATCH task will create a different
queue_flags value.

# chrt -p $$
pid 2803's current scheduling policy: SCHED_OTHER
pid 2803's current scheduling priority: 0

# chrt -b -p 0 $$

...
[bash 2803] policy=3 oldprio=120 newprio=[99->120] new_effective_prio=[99->120] queue_flags=[0xe->0xa]
[bash 2803] queued=0 running=0
...

But since in this example 'queued=0' it has no further effect here.

Why is SCHED_NORMAL/SCHED_BATCH (fair_policy()) now treated differently than SCHED_IDLE?

# chrt -i -p 0 $$

...
[bash 2803] policy=5 newprio=99 oldprio=120 new_effective_prio=99 queue_flags=0xe
[bash 2803] queued=0 running=0
...


2020-04-30 14:21:37

by Valentin Schneider

[permalink] [raw]
Subject: Re: [PATCH 2/4] sched: set new prio after checking schedule policy


On 30/04/20 15:06, Dietmar Eggemann wrote:
>>>> + newprio = NICE_TO_PRIO(attr->sched_nice);
>>>
>>> This is new, however AFAICT it doesn't change anything for CFS (or about to
>>> be) tasks since what matters is calling check_class_changed() further down.
>>
>> Yes it's only used by rt_effective_prio().
>>
>
> Looks like changing a SCHED_NORMAL to a SCHED_BATCH task will create a different
> queue_flags value.
>
> # chrt -p $$
> pid 2803's current scheduling policy: SCHED_OTHER
> pid 2803's current scheduling priority: 0
>
> # chrt -b -p 0 $$
>
> ...
> [bash 2803] policy=3 oldprio=120 newprio=[99->120] new_effective_prio=[99->120] queue_flags=[0xe->0xa]
> [bash 2803] queued=0 running=0
> ...
>
> But since in this example 'queued=0' it has no further effect here.
>
> Why is SCHED_NORMAL/SCHED_BATCH (fair_policy()) now treated differently than SCHED_IDLE?
>
> # chrt -i -p 0 $$
>
> ...
> [bash 2803] policy=5 newprio=99 oldprio=120 new_effective_prio=99 queue_flags=0xe
> [bash 2803] queued=0 running=0
> ...


Good catch; I suppose we'll want to special case SCHED_IDLE (IIRC should
map to nice 20).

As you pointed out, right now the newprio computation for CFS tasks is
kinda bonkers, so it seems we'll almost always clear DEQUEUE_MOVE from
queue_flags for them.

For CFS, not having DEQUEUE_MOVE here would lead to not calling
update_min_vruntime() on the dequeue. I'm not sure how much it matters in
this one case - I don't expect sched_setscheduler() calls to be *too*
frequent, and that oughta be fixed by the next entity_tick()) - but that is
an actual change.

2020-04-30 15:17:59

by Valentin Schneider

[permalink] [raw]
Subject: Re: [PATCH 2/4] sched: set new prio after checking schedule policy


On 30/04/20 15:18, Valentin Schneider wrote:
> On 30/04/20 15:06, Dietmar Eggemann wrote:
>>>>> + newprio = NICE_TO_PRIO(attr->sched_nice);
>>>>
>>>> This is new, however AFAICT it doesn't change anything for CFS (or about to
>>>> be) tasks since what matters is calling check_class_changed() further down.
>>>
>>> Yes it's only used by rt_effective_prio().
>>>
>>
>> Looks like changing a SCHED_NORMAL to a SCHED_BATCH task will create a different
>> queue_flags value.
>>
>> # chrt -p $$
>> pid 2803's current scheduling policy: SCHED_OTHER
>> pid 2803's current scheduling priority: 0
>>
>> # chrt -b -p 0 $$
>>
>> ...
>> [bash 2803] policy=3 oldprio=120 newprio=[99->120] new_effective_prio=[99->120] queue_flags=[0xe->0xa]
>> [bash 2803] queued=0 running=0
>> ...
>>
>> But since in this example 'queued=0' it has no further effect here.
>>
>> Why is SCHED_NORMAL/SCHED_BATCH (fair_policy()) now treated differently than SCHED_IDLE?
>>
>> # chrt -i -p 0 $$
>>
>> ...
>> [bash 2803] policy=5 newprio=99 oldprio=120 new_effective_prio=99 queue_flags=0xe
>> [bash 2803] queued=0 running=0
>> ...
>
>
> Good catch; I suppose we'll want to special case SCHED_IDLE (IIRC should
> map to nice 20).
>
> As you pointed out, right now the newprio computation for CFS tasks is
> kinda bonkers, so it seems we'll almost always clear DEQUEUE_MOVE from
> queue_flags for them.
>

Of course I misread that, it's the other way around: since newprio is
always 99 for SCHED_OTHER/BATCH/IDLE tasks, we'll never have
new_effective_prio == oldprio (unless pi involves a FIFO 99 task), thus
will never clear DEQUEUE_MOVE.

> For CFS, not having DEQUEUE_MOVE here would lead to not calling
> update_min_vruntime() on the dequeue. I'm not sure how much it matters in
> this one case - I don't expect sched_setscheduler() calls to be *too*
> frequent, and that oughta be fixed by the next entity_tick()) - but that is
> an actual change.