2022-03-14 00:17:37

by chenying

Subject: Re: [External] Re: Subject: [PATCH] sched/fair: prioritize normal task over sched_idle task with vruntime offset

On 2022/3/13 17:02, Peter Zijlstra wrote:
> On Sun, Mar 13, 2022 at 01:37:37PM +0800, chenying wrote:
>> On 2022/3/12 20:03, Peter Zijlstra wrote:
>>> On Fri, Mar 11, 2022 at 03:58:47PM +0800, chenying wrote:
>>>> We add a time offset to the se->vruntime when the idle sched_entity
>>>> is enqueued, so that the idle entity will always be on the right of
>>>> the non-idle in the runqueue. This can allow non-idle tasks to be
>>>> selected and run before the idle.
>>>>
>>>> A use case is running background tasks as sched_idle and foreground
>>>> tasks as non-idle. The foreground tasks are latency sensitive and should
>>>> not be disturbed by the background. It is well known that idle tasks
>>>> can be preempted by non-idle tasks on wakeup, but the scheduler does
>>>> not distinguish between idle and non-idle when picking the next
>>>> entity. This may cause background tasks to disturb the foreground.
>>>>
>>>> Test results as below:
>>>>
>>>> ~$ ./loop.sh &
>>>> [1] 764
>>>> ~$ chrt -i 0 ./loop.sh &
>>>> [2] 765
>>>> ~$ taskset -p 04 764
>>>> ~$ taskset -p 04 765
>>>>
>>>> ~$ top -p 764 -p 765
>>>> top - 13:10:01 up 1 min,  2 users,  load average: 1.30, 0.38, 0.13
>>>> Tasks:   2 total,   2 running,   0 sleeping,   0 stopped,   0 zombie
>>>> %Cpu(s): 12.5 us,  0.0 sy,  0.0 ni, 87.4 id,  0.0 wa,  0.0 hi, 0.0 si,  0.0 st
>>>> KiB Mem : 16393492 total, 16142256 free,   111028 used,   140208 buff/cache
>>>> KiB Swap:   385836 total,   385836 free,        0 used. 16037992 avail Mem
>>>>
>>>>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM TIME+ COMMAND
>>>>   764 chenyin+  20   0   12888   1144   1004 R 100.0  0.0 1:05.12 loop.sh
>>>>   765 chenyin+  20   0   12888   1224   1080 R   0.0  0.0 0:16.21 loop.sh
>>>>
>>>> The non-idle process (764) can run at 100% without being disturbed by
>>>> the idle process (765).
>>>
>>> Did you just do a very complicated true idle time scheduler, with all
>>> the problems that brings?
>>
>> Colocating CPU-intensive jobs with latency-sensitive services can
>> improve CPU utilization, but it makes it difficult to meet the stringent
>> tail-latency requirements of the latency-sensitive services. We use a true
>> idle-time scheduler for CPU-intensive jobs to minimize the impact on
>> latency-sensitive services.
>
> Hard NAK on any true idle-time scheduler until you make the whole kernel
> immune to lock holder starvation issues.

If I set the sched_idle_vruntime_offset to a relatively small value
(e.g. 10 minutes), can this issue be avoided?
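
To make the question concrete, here is a minimal userspace sketch of how
an enqueue-time vruntime offset behaves. It is not the patch itself: the
toy_* names and struct are hypothetical, offset_ns merely stands in for
sched_idle_vruntime_offset, and both entities are assumed to have equal
weight (a real SCHED_IDLE entity has a much lower weight, so its vruntime
advances faster).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a CFS sched_entity; not the kernel's struct. */
struct toy_entity {
	uint64_t vruntime;	/* virtual runtime, in ns */
	bool idle;		/* true for a SCHED_IDLE entity */
};

/* Hypothetical enqueue hook: push idle entities right by offset_ns. */
static void toy_enqueue(struct toy_entity *se, uint64_t offset_ns)
{
	if (se->idle)
		se->vruntime += offset_ns;
}

/*
 * Pick the entity with the smallest vruntime, as the leftmost node of
 * the CFS rbtree would be. The signed delta tolerates wraparound.
 */
static struct toy_entity *toy_pick_next(struct toy_entity *rq, size_t n)
{
	struct toy_entity *best = &rq[0];

	assert(n > 0);
	for (size_t i = 1; i < n; i++)
		if ((int64_t)(rq[i].vruntime - best->vruntime) < 0)
			best = &rq[i];
	return best;
}

/*
 * Run one non-idle and one idle entity for `ticks` slices of slice_ns
 * each and return how many slices the idle entity received.
 */
static unsigned int toy_idle_slices(uint64_t offset_ns, uint64_t slice_ns,
				    unsigned int ticks)
{
	struct toy_entity rq[2] = {
		{ .vruntime = 0, .idle = false },
		{ .vruntime = 0, .idle = true },
	};
	unsigned int idle_ran = 0;

	toy_enqueue(&rq[0], offset_ns);
	toy_enqueue(&rq[1], offset_ns);

	for (unsigned int t = 0; t < ticks; t++) {
		struct toy_entity *cur = toy_pick_next(rq, 2);

		cur->vruntime += slice_ns;	/* equal weights assumed */
		if (cur->idle)
			idle_ran++;
	}
	return idle_ran;
}
```

In this model the idle entity receives no slices until the non-idle
entity has consumed the whole offset, so the offset is also the length
of time for which a lock held by the idle task would stay held.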


2022-03-17 06:22:10

by Josh Don

Subject: Re: [External] Re: Subject: [PATCH] sched/fair: prioritize normal task over sched_idle task with vruntime offset

On Sun, Mar 13, 2022 at 3:07 AM chenying <[email protected]> wrote:
>
> If I set the sched_idle_vruntime_offset to a relatively small value
> (e.g. 10 minutes), can this issue be avoided?

That's still long enough to cause lockups.

Is the issue that you have a large number of sched_idle entities, and
the occasional latency-sensitive thing that wakes up for a short
duration? Have you considered approaching this from the other
direction (i.e., if we have a latency-sensitive thing wake onto a CPU
running only sched_idle stuff, we could change entity placement to
position the latency-sensitive thing further left on the timeline,
akin to !GENTLE_FAIR_SLEEPERS)?
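
One possible reading of that idea, as a rough userspace sketch rather
than a kernel patch: on wakeup, CFS places the entity at min_vruntime
minus a latency credit, and GENTLE_FAIR_SLEEPERS halves that credit.
The sketch below grants the full credit when the CPU is running only
sched_idle entities. The toy_* names and the only_idle_running flag are
hypothetical, and latency_ns stands in for the scheduler latency
tunable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe max of two vruntime values. */
static uint64_t toy_max_vruntime(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) > 0 ? a : b;
}

/*
 * Hypothetical wakeup placement. Returns the vruntime the waking
 * entity should be enqueued with. A larger credit (thresh) places the
 * entity further left on the timeline.
 */
static uint64_t toy_place_waker(uint64_t se_vruntime, uint64_t min_vruntime,
				uint64_t latency_ns, bool only_idle_running)
{
	uint64_t thresh = latency_ns;

	/* Keep the signed-delta comparison below meaningful. */
	assert(latency_ns < (UINT64_C(1) << 62));

	/* Gentle behaviour: only half the credit, unless the CPU is
	 * running nothing but sched_idle entities. */
	if (!only_idle_running)
		thresh >>= 1;

	/* Never move an entity backwards past its own vruntime. */
	return toy_max_vruntime(se_vruntime, min_vruntime - thresh);
}
```

Unlike a fixed offset on the idle side, the credit here is bounded by
latency_ns, so the sched_idle entities are delayed by a bounded amount
rather than starved.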