Qais reported [1] that iterating over all tasks when rebuilding root
domains, to find out which ones are DEADLINE and need their bandwidth
correctly restored on such root domains, can be a costly operation (10+
ms delays on suspend-resume). He proposed we skip rebuilding root
domains for certain operations, but that approach seemed arch-specific
and possibly prone to errors, as paths that ultimately trigger a rebuild
might be quite convoluted (thanks Qais for spending time on this!).
To fix the problem, I would instead propose that we:
1 - Bring back cpuset_mutex (so that we have write access to cpusets
    from scheduler operations - and we also fix some problems
    associated with percpu_cpuset_rwsem)
2 - Keep track of the number of DEADLINE tasks belonging to each cpuset
3 - Use this information to only perform the costly iteration if
DEADLINE tasks are actually present in the cpuset for which a
corresponding root domain is being rebuilt
This set is also available from
https://github.com/jlelli/linux.git deadline/rework-cpusets
Feedback is more than welcome.
Best,
Juri
1 - https://lore.kernel.org/lkml/[email protected]/
Juri Lelli (3):
sched/cpuset: Bring back cpuset_mutex
sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets
cgroup/cpuset: Iterate only if DEADLINE tasks are present
include/linux/cpuset.h | 12 ++-
kernel/cgroup/cgroup.c | 4 +
kernel/cgroup/cpuset.c | 175 +++++++++++++++++++++++------------------
kernel/sched/core.c | 32 ++++++--
4 files changed, 137 insertions(+), 86 deletions(-)
--
2.39.2
Qais reported that iterating over all tasks when rebuilding root
domains, to find out which ones are DEADLINE and need their bandwidth
correctly restored on such root domains, can be a costly operation (10+
ms delays on suspend-resume).
To fix the problem, keep track of the number of DEADLINE tasks belonging
to each cpuset and then use this information (in a followup patch) to
only perform the above iteration if DEADLINE tasks are actually present
in the cpuset for which a corresponding root domain is being rebuilt.
Reported-by: Qais Yousef <[email protected]>
Signed-off-by: Juri Lelli <[email protected]>
---
include/linux/cpuset.h | 4 ++++
kernel/cgroup/cgroup.c | 4 ++++
kernel/cgroup/cpuset.c | 25 +++++++++++++++++++++++++
kernel/sched/core.c | 10 ++++++++++
4 files changed, 43 insertions(+)
diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
index 355f796c5f07..0348dba5680e 100644
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -71,6 +71,8 @@ extern void cpuset_init_smp(void);
extern void cpuset_force_rebuild(void);
extern void cpuset_update_active_cpus(void);
extern void cpuset_wait_for_hotplug(void);
+extern void inc_dl_tasks_cs(struct task_struct *task);
+extern void dec_dl_tasks_cs(struct task_struct *task);
extern void cpuset_lock(void);
extern void cpuset_unlock(void);
extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask);
@@ -196,6 +198,8 @@ static inline void cpuset_update_active_cpus(void)
static inline void cpuset_wait_for_hotplug(void) { }
+static inline void inc_dl_tasks_cs(struct task_struct *task) { }
+static inline void dec_dl_tasks_cs(struct task_struct *task) { }
static inline void cpuset_lock(void) { }
static inline void cpuset_unlock(void) { }
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index c099cf3fa02d..357925e1e4af 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -57,6 +57,7 @@
#include <linux/file.h>
#include <linux/fs_parser.h>
#include <linux/sched/cputime.h>
+#include <linux/sched/deadline.h>
#include <linux/psi.h>
#include <net/sock.h>
@@ -6673,6 +6674,9 @@ void cgroup_exit(struct task_struct *tsk)
list_add_tail(&tsk->cg_list, &cset->dying_tasks);
cset->nr_tasks--;
+ if (dl_task(tsk))
+ dec_dl_tasks_cs(tsk);
+
WARN_ON_ONCE(cgroup_task_frozen(tsk));
if (unlikely(!(tsk->flags & PF_KTHREAD) &&
test_bit(CGRP_FREEZE, &task_dfl_cgroup(tsk)->flags)))
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 8d82d66d432b..57bc60112618 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -193,6 +193,12 @@ struct cpuset {
int use_parent_ecpus;
int child_ecpus_count;
+ /*
+ * number of SCHED_DEADLINE tasks attached to this cpuset, so that we
+ * know when to rebuild associated root domain bandwidth information.
+ */
+ int nr_deadline_tasks;
+
/* Invalid partition error code, not lock protected */
enum prs_errcode prs_err;
@@ -245,6 +251,20 @@ static inline struct cpuset *parent_cs(struct cpuset *cs)
return css_cs(cs->css.parent);
}
+void inc_dl_tasks_cs(struct task_struct *p)
+{
+ struct cpuset *cs = task_cs(p);
+
+ cs->nr_deadline_tasks++;
+}
+
+void dec_dl_tasks_cs(struct task_struct *p)
+{
+ struct cpuset *cs = task_cs(p);
+
+ cs->nr_deadline_tasks--;
+}
+
/* bits in struct cpuset flags field */
typedef enum {
CS_ONLINE,
@@ -2472,6 +2492,11 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
ret = security_task_setscheduler(task);
if (ret)
goto out_unlock;
+
+ if (dl_task(task)) {
+ cs->nr_deadline_tasks++;
+ cpuset_attach_old_cs->nr_deadline_tasks--;
+ }
}
/*
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 5902cbb5e751..d586a8440348 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7683,6 +7683,16 @@ static int __sched_setscheduler(struct task_struct *p,
goto unlock;
}
+ /*
+ * In case a task is setscheduled to SCHED_DEADLINE, or if a task is
+ * moved to a different sched policy, we need to keep track of that on
+ * its cpuset (for correct bandwidth tracking).
+ */
+ if (dl_policy(policy) && !dl_task(p))
+ inc_dl_tasks_cs(p);
+ else if (dl_task(p) && !dl_policy(policy))
+ dec_dl_tasks_cs(p);
+
p->sched_reset_on_fork = reset_on_fork;
oldprio = p->prio;
--
2.39.2
Turns out percpu_cpuset_rwsem - commit 1243dc518c9d ("cgroup/cpuset:
Convert cpuset_mutex to percpu_rwsem") - wasn't such a brilliant idea,
as it has been reported to cause slowdowns in workloads that need to
change cpuset configuration frequently, and it also does not implement
priority inheritance (which causes trouble with realtime workloads).
Convert percpu_cpuset_rwsem back to a regular cpuset_mutex. Also, grab
it only for SCHED_DEADLINE tasks (other policies don't care about
stable cpusets anyway).
Signed-off-by: Juri Lelli <[email protected]>
---
include/linux/cpuset.h | 8 +--
kernel/cgroup/cpuset.c | 147 ++++++++++++++++++++---------------------
kernel/sched/core.c | 22 ++++--
3 files changed, 91 insertions(+), 86 deletions(-)
diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
index d58e0476ee8e..355f796c5f07 100644
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -71,8 +71,8 @@ extern void cpuset_init_smp(void);
extern void cpuset_force_rebuild(void);
extern void cpuset_update_active_cpus(void);
extern void cpuset_wait_for_hotplug(void);
-extern void cpuset_read_lock(void);
-extern void cpuset_read_unlock(void);
+extern void cpuset_lock(void);
+extern void cpuset_unlock(void);
extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask);
extern bool cpuset_cpus_allowed_fallback(struct task_struct *p);
extern nodemask_t cpuset_mems_allowed(struct task_struct *p);
@@ -196,8 +196,8 @@ static inline void cpuset_update_active_cpus(void)
static inline void cpuset_wait_for_hotplug(void) { }
-static inline void cpuset_read_lock(void) { }
-static inline void cpuset_read_unlock(void) { }
+static inline void cpuset_lock(void) { }
+static inline void cpuset_unlock(void) { }
static inline void cpuset_cpus_allowed(struct task_struct *p,
struct cpumask *mask)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index a29c0b13706b..8d82d66d432b 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -366,22 +366,21 @@ static struct cpuset top_cpuset = {
if (is_cpuset_online(((des_cs) = css_cs((pos_css)))))
/*
- * There are two global locks guarding cpuset structures - cpuset_rwsem and
+ * There are two global locks guarding cpuset structures - cpuset_mutex and
* callback_lock. We also require taking task_lock() when dereferencing a
* task's cpuset pointer. See "The task_lock() exception", at the end of this
- * comment. The cpuset code uses only cpuset_rwsem write lock. Other
- * kernel subsystems can use cpuset_read_lock()/cpuset_read_unlock() to
- * prevent change to cpuset structures.
+ * comment. The cpuset code uses only cpuset_mutex. Other kernel subsystems
+ * can use cpuset_lock()/cpuset_unlock() to prevent change to cpuset
+ * structures.
*
* A task must hold both locks to modify cpusets. If a task holds
- * cpuset_rwsem, it blocks others wanting that rwsem, ensuring that it
- * is the only task able to also acquire callback_lock and be able to
- * modify cpusets. It can perform various checks on the cpuset structure
- * first, knowing nothing will change. It can also allocate memory while
- * just holding cpuset_rwsem. While it is performing these checks, various
- * callback routines can briefly acquire callback_lock to query cpusets.
- * Once it is ready to make the changes, it takes callback_lock, blocking
- * everyone else.
+ * cpuset_mutex, it blocks others, ensuring that it is the only task able to
+ * also acquire callback_lock and be able to modify cpusets. It can perform
+ * various checks on the cpuset structure first, knowing nothing will change.
+ * It can also allocate memory while just holding cpuset_mutex. While it is
+ * performing these checks, various callback routines can briefly acquire
+ * callback_lock to query cpusets. Once it is ready to make the changes, it
+ * takes callback_lock, blocking everyone else.
*
* Calls to the kernel memory allocator can not be made while holding
* callback_lock, as that would risk double tripping on callback_lock
@@ -403,16 +402,16 @@ static struct cpuset top_cpuset = {
* guidelines for accessing subsystem state in kernel/cgroup.c
*/
-DEFINE_STATIC_PERCPU_RWSEM(cpuset_rwsem);
+static DEFINE_MUTEX(cpuset_mutex);
-void cpuset_read_lock(void)
+void cpuset_lock(void)
{
- percpu_down_read(&cpuset_rwsem);
+ mutex_lock(&cpuset_mutex);
}
-void cpuset_read_unlock(void)
+void cpuset_unlock(void)
{
- percpu_up_read(&cpuset_rwsem);
+ mutex_unlock(&cpuset_mutex);
}
static DEFINE_SPINLOCK(callback_lock);
@@ -496,7 +495,7 @@ static inline bool partition_is_populated(struct cpuset *cs,
* One way or another, we guarantee to return some non-empty subset
* of cpu_online_mask.
*
- * Call with callback_lock or cpuset_rwsem held.
+ * Call with callback_lock or cpuset_mutex held.
*/
static void guarantee_online_cpus(struct task_struct *tsk,
struct cpumask *pmask)
@@ -538,7 +537,7 @@ static void guarantee_online_cpus(struct task_struct *tsk,
* One way or another, we guarantee to return some non-empty subset
* of node_states[N_MEMORY].
*
- * Call with callback_lock or cpuset_rwsem held.
+ * Call with callback_lock or cpuset_mutex held.
*/
static void guarantee_online_mems(struct cpuset *cs, nodemask_t *pmask)
{
@@ -550,7 +549,7 @@ static void guarantee_online_mems(struct cpuset *cs, nodemask_t *pmask)
/*
* update task's spread flag if cpuset's page/slab spread flag is set
*
- * Call with callback_lock or cpuset_rwsem held. The check can be skipped
+ * Call with callback_lock or cpuset_mutex held. The check can be skipped
* if on default hierarchy.
*/
static void cpuset_update_task_spread_flags(struct cpuset *cs,
@@ -575,7 +574,7 @@ static void cpuset_update_task_spread_flags(struct cpuset *cs,
*
* One cpuset is a subset of another if all its allowed CPUs and
* Memory Nodes are a subset of the other, and its exclusive flags
- * are only set if the other's are set. Call holding cpuset_rwsem.
+ * are only set if the other's are set. Call holding cpuset_mutex.
*/
static int is_cpuset_subset(const struct cpuset *p, const struct cpuset *q)
@@ -713,7 +712,7 @@ static int validate_change_legacy(struct cpuset *cur, struct cpuset *trial)
* If we replaced the flag and mask values of the current cpuset
* (cur) with those values in the trial cpuset (trial), would
* our various subset and exclusive rules still be valid? Presumes
- * cpuset_rwsem held.
+ * cpuset_mutex held.
*
* 'cur' is the address of an actual, in-use cpuset. Operations
* such as list traversal that depend on the actual address of the
@@ -829,7 +828,7 @@ static void update_domain_attr_tree(struct sched_domain_attr *dattr,
rcu_read_unlock();
}
-/* Must be called with cpuset_rwsem held. */
+/* Must be called with cpuset_mutex held. */
static inline int nr_cpusets(void)
{
/* jump label reference count + the top-level cpuset */
@@ -855,7 +854,7 @@ static inline int nr_cpusets(void)
* domains when operating in the severe memory shortage situations
* that could cause allocation failures below.
*
- * Must be called with cpuset_rwsem held.
+ * Must be called with cpuset_mutex held.
*
* The three key local variables below are:
* cp - cpuset pointer, used (together with pos_css) to perform a
@@ -1084,7 +1083,7 @@ static void rebuild_root_domains(void)
struct cpuset *cs = NULL;
struct cgroup_subsys_state *pos_css;
- percpu_rwsem_assert_held(&cpuset_rwsem);
+ lockdep_assert_held(&cpuset_mutex);
lockdep_assert_cpus_held();
lockdep_assert_held(&sched_domains_mutex);
@@ -1134,7 +1133,7 @@ partition_and_rebuild_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
* 'cpus' is removed, then call this routine to rebuild the
* scheduler's dynamic sched domains.
*
- * Call with cpuset_rwsem held. Takes cpus_read_lock().
+ * Call with cpuset_mutex held. Takes cpus_read_lock().
*/
static void rebuild_sched_domains_locked(void)
{
@@ -1145,7 +1144,7 @@ static void rebuild_sched_domains_locked(void)
int ndoms;
lockdep_assert_cpus_held();
- percpu_rwsem_assert_held(&cpuset_rwsem);
+ lockdep_assert_held(&cpuset_mutex);
/*
* If we have raced with CPU hotplug, return early to avoid
@@ -1196,9 +1195,9 @@ static void rebuild_sched_domains_locked(void)
void rebuild_sched_domains(void)
{
cpus_read_lock();
- percpu_down_write(&cpuset_rwsem);
+ mutex_lock(&cpuset_mutex);
rebuild_sched_domains_locked();
- percpu_up_write(&cpuset_rwsem);
+ mutex_unlock(&cpuset_mutex);
cpus_read_unlock();
}
@@ -1207,7 +1206,7 @@ void rebuild_sched_domains(void)
* @cs: the cpuset in which each task's cpus_allowed mask needs to be changed
*
* Iterate through each task of @cs updating its cpus_allowed to the
- * effective cpuset's. As this function is called with cpuset_rwsem held,
+ * effective cpuset's. As this function is called with cpuset_mutex held,
* cpuset membership stays stable.
*/
static void update_tasks_cpumask(struct cpuset *cs)
@@ -1313,7 +1312,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd,
int old_prs, new_prs;
int part_error = PERR_NONE; /* Partition error? */
- percpu_rwsem_assert_held(&cpuset_rwsem);
+ lockdep_assert_held(&cpuset_mutex);
/*
* The parent must be a partition root.
@@ -1536,7 +1535,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd,
*
* On legacy hierarchy, effective_cpus will be the same with cpu_allowed.
*
- * Called with cpuset_rwsem held
+ * Called with cpuset_mutex held
*/
static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp,
bool force)
@@ -1696,7 +1695,7 @@ static void update_sibling_cpumasks(struct cpuset *parent, struct cpuset *cs,
struct cpuset *sibling;
struct cgroup_subsys_state *pos_css;
- percpu_rwsem_assert_held(&cpuset_rwsem);
+ lockdep_assert_held(&cpuset_mutex);
/*
* Check all its siblings and call update_cpumasks_hier()
@@ -1938,12 +1937,12 @@ static void *cpuset_being_rebound;
* @cs: the cpuset in which each task's mems_allowed mask needs to be changed
*
* Iterate through each task of @cs updating its mems_allowed to the
- * effective cpuset's. As this function is called with cpuset_rwsem held,
+ * effective cpuset's. As this function is called with cpuset_mutex held,
* cpuset membership stays stable.
*/
static void update_tasks_nodemask(struct cpuset *cs)
{
- static nodemask_t newmems; /* protected by cpuset_rwsem */
+ static nodemask_t newmems; /* protected by cpuset_mutex */
struct css_task_iter it;
struct task_struct *task;
@@ -1956,7 +1955,7 @@ static void update_tasks_nodemask(struct cpuset *cs)
* take while holding tasklist_lock. Forks can happen - the
* mpol_dup() cpuset_being_rebound check will catch such forks,
* and rebind their vma mempolicies too. Because we still hold
- * the global cpuset_rwsem, we know that no other rebind effort
+ * the global cpuset_mutex, we know that no other rebind effort
* will be contending for the global variable cpuset_being_rebound.
* It's ok if we rebind the same mm twice; mpol_rebind_mm()
* is idempotent. Also migrate pages in each mm to new nodes.
@@ -2002,7 +2001,7 @@ static void update_tasks_nodemask(struct cpuset *cs)
*
* On legacy hierarchy, effective_mems will be the same with mems_allowed.
*
- * Called with cpuset_rwsem held
+ * Called with cpuset_mutex held
*/
static void update_nodemasks_hier(struct cpuset *cs, nodemask_t *new_mems)
{
@@ -2055,7 +2054,7 @@ static void update_nodemasks_hier(struct cpuset *cs, nodemask_t *new_mems)
* mempolicies and if the cpuset is marked 'memory_migrate',
* migrate the tasks pages to the new memory.
*
- * Call with cpuset_rwsem held. May take callback_lock during call.
+ * Call with cpuset_mutex held. May take callback_lock during call.
* Will take tasklist_lock, scan tasklist for tasks in cpuset cs,
* lock each such tasks mm->mmap_lock, scan its vma's and rebind
* their mempolicies to the cpusets new mems_allowed.
@@ -2147,7 +2146,7 @@ static int update_relax_domain_level(struct cpuset *cs, s64 val)
* @cs: the cpuset in which each task's spread flags needs to be changed
*
* Iterate through each task of @cs updating its spread flags. As this
- * function is called with cpuset_rwsem held, cpuset membership stays
+ * function is called with cpuset_mutex held, cpuset membership stays
* stable.
*/
static void update_tasks_flags(struct cpuset *cs)
@@ -2167,7 +2166,7 @@ static void update_tasks_flags(struct cpuset *cs)
* cs: the cpuset to update
* turning_on: whether the flag is being set or cleared
*
- * Call with cpuset_rwsem held.
+ * Call with cpuset_mutex held.
*/
static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
@@ -2217,7 +2216,7 @@ static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
* @new_prs: new partition root state
* Return: 0 if successful, != 0 if error
*
- * Call with cpuset_rwsem held.
+ * Call with cpuset_mutex held.
*/
static int update_prstate(struct cpuset *cs, int new_prs)
{
@@ -2440,7 +2439,7 @@ static int fmeter_getrate(struct fmeter *fmp)
static struct cpuset *cpuset_attach_old_cs;
-/* Called by cgroups to determine if a cpuset is usable; cpuset_rwsem held */
+/* Called by cgroups to determine if a cpuset is usable; cpuset_mutex held */
static int cpuset_can_attach(struct cgroup_taskset *tset)
{
struct cgroup_subsys_state *css;
@@ -2452,7 +2451,7 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
cpuset_attach_old_cs = task_cs(cgroup_taskset_first(tset, &css));
cs = css_cs(css);
- percpu_down_write(&cpuset_rwsem);
+ mutex_lock(&cpuset_mutex);
/* allow moving tasks into an empty cpuset if on default hierarchy */
ret = -ENOSPC;
@@ -2482,7 +2481,7 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
cs->attach_in_progress++;
ret = 0;
out_unlock:
- percpu_up_write(&cpuset_rwsem);
+ mutex_unlock(&cpuset_mutex);
return ret;
}
@@ -2492,13 +2491,13 @@ static void cpuset_cancel_attach(struct cgroup_taskset *tset)
cgroup_taskset_first(tset, &css);
- percpu_down_write(&cpuset_rwsem);
+ mutex_lock(&cpuset_mutex);
css_cs(css)->attach_in_progress--;
- percpu_up_write(&cpuset_rwsem);
+ mutex_unlock(&cpuset_mutex);
}
/*
- * Protected by cpuset_rwsem. cpus_attach is used only by cpuset_attach()
+ * Protected by cpuset_mutex. cpus_attach is used only by cpuset_attach()
* but we can't allocate it dynamically there. Define it global and
* allocate from cpuset_init().
*/
@@ -2506,7 +2505,7 @@ static cpumask_var_t cpus_attach;
static void cpuset_attach(struct cgroup_taskset *tset)
{
- /* static buf protected by cpuset_rwsem */
+ /* static buf protected by cpuset_mutex */
static nodemask_t cpuset_attach_nodemask_to;
struct task_struct *task;
struct task_struct *leader;
@@ -2519,7 +2518,7 @@ static void cpuset_attach(struct cgroup_taskset *tset)
cs = css_cs(css);
lockdep_assert_cpus_held(); /* see cgroup_attach_lock() */
- percpu_down_write(&cpuset_rwsem);
+ mutex_lock(&cpuset_mutex);
cpus_updated = !cpumask_equal(cs->effective_cpus,
oldcs->effective_cpus);
mems_updated = !nodes_equal(cs->effective_mems, oldcs->effective_mems);
@@ -2592,7 +2591,7 @@ static void cpuset_attach(struct cgroup_taskset *tset)
if (!cs->attach_in_progress)
wake_up(&cpuset_attach_wq);
- percpu_up_write(&cpuset_rwsem);
+ mutex_unlock(&cpuset_mutex);
}
/* The various types of files and directories in a cpuset file system */
@@ -2624,7 +2623,7 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
int retval = 0;
cpus_read_lock();
- percpu_down_write(&cpuset_rwsem);
+ mutex_lock(&cpuset_mutex);
if (!is_cpuset_online(cs)) {
retval = -ENODEV;
goto out_unlock;
@@ -2660,7 +2659,7 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
break;
}
out_unlock:
- percpu_up_write(&cpuset_rwsem);
+ mutex_unlock(&cpuset_mutex);
cpus_read_unlock();
return retval;
}
@@ -2673,7 +2672,7 @@ static int cpuset_write_s64(struct cgroup_subsys_state *css, struct cftype *cft,
int retval = -ENODEV;
cpus_read_lock();
- percpu_down_write(&cpuset_rwsem);
+ mutex_lock(&cpuset_mutex);
if (!is_cpuset_online(cs))
goto out_unlock;
@@ -2686,7 +2685,7 @@ static int cpuset_write_s64(struct cgroup_subsys_state *css, struct cftype *cft,
break;
}
out_unlock:
- percpu_up_write(&cpuset_rwsem);
+ mutex_unlock(&cpuset_mutex);
cpus_read_unlock();
return retval;
}
@@ -2719,7 +2718,7 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
* operation like this one can lead to a deadlock through kernfs
* active_ref protection. Let's break the protection. Losing the
* protection is okay as we check whether @cs is online after
- * grabbing cpuset_rwsem anyway. This only happens on the legacy
+ * grabbing cpuset_mutex anyway. This only happens on the legacy
* hierarchies.
*/
css_get(&cs->css);
@@ -2727,7 +2726,7 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
flush_work(&cpuset_hotplug_work);
cpus_read_lock();
- percpu_down_write(&cpuset_rwsem);
+ mutex_lock(&cpuset_mutex);
if (!is_cpuset_online(cs))
goto out_unlock;
@@ -2751,7 +2750,7 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
free_cpuset(trialcs);
out_unlock:
- percpu_up_write(&cpuset_rwsem);
+ mutex_unlock(&cpuset_mutex);
cpus_read_unlock();
kernfs_unbreak_active_protection(of->kn);
css_put(&cs->css);
@@ -2899,13 +2898,13 @@ static ssize_t sched_partition_write(struct kernfs_open_file *of, char *buf,
css_get(&cs->css);
cpus_read_lock();
- percpu_down_write(&cpuset_rwsem);
+ mutex_lock(&cpuset_mutex);
if (!is_cpuset_online(cs))
goto out_unlock;
retval = update_prstate(cs, val);
out_unlock:
- percpu_up_write(&cpuset_rwsem);
+ mutex_unlock(&cpuset_mutex);
cpus_read_unlock();
css_put(&cs->css);
return retval ?: nbytes;
@@ -3122,7 +3121,7 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
return 0;
cpus_read_lock();
- percpu_down_write(&cpuset_rwsem);
+ mutex_lock(&cpuset_mutex);
set_bit(CS_ONLINE, &cs->flags);
if (is_spread_page(parent))
@@ -3173,7 +3172,7 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
cpumask_copy(cs->effective_cpus, parent->cpus_allowed);
spin_unlock_irq(&callback_lock);
out_unlock:
- percpu_up_write(&cpuset_rwsem);
+ mutex_unlock(&cpuset_mutex);
cpus_read_unlock();
return 0;
}
@@ -3194,7 +3193,7 @@ static void cpuset_css_offline(struct cgroup_subsys_state *css)
struct cpuset *cs = css_cs(css);
cpus_read_lock();
- percpu_down_write(&cpuset_rwsem);
+ mutex_lock(&cpuset_mutex);
if (is_partition_valid(cs))
update_prstate(cs, 0);
@@ -3213,7 +3212,7 @@ static void cpuset_css_offline(struct cgroup_subsys_state *css)
cpuset_dec();
clear_bit(CS_ONLINE, &cs->flags);
- percpu_up_write(&cpuset_rwsem);
+ mutex_unlock(&cpuset_mutex);
cpus_read_unlock();
}
@@ -3226,7 +3225,7 @@ static void cpuset_css_free(struct cgroup_subsys_state *css)
static void cpuset_bind(struct cgroup_subsys_state *root_css)
{
- percpu_down_write(&cpuset_rwsem);
+ mutex_lock(&cpuset_mutex);
spin_lock_irq(&callback_lock);
if (is_in_v2_mode()) {
@@ -3239,7 +3238,7 @@ static void cpuset_bind(struct cgroup_subsys_state *root_css)
}
spin_unlock_irq(&callback_lock);
- percpu_up_write(&cpuset_rwsem);
+ mutex_unlock(&cpuset_mutex);
}
/*
@@ -3281,8 +3280,6 @@ struct cgroup_subsys cpuset_cgrp_subsys = {
int __init cpuset_init(void)
{
- BUG_ON(percpu_init_rwsem(&cpuset_rwsem));
-
BUG_ON(!alloc_cpumask_var(&top_cpuset.cpus_allowed, GFP_KERNEL));
BUG_ON(!alloc_cpumask_var(&top_cpuset.effective_cpus, GFP_KERNEL));
BUG_ON(!zalloc_cpumask_var(&top_cpuset.subparts_cpus, GFP_KERNEL));
@@ -3354,7 +3351,7 @@ hotplug_update_tasks_legacy(struct cpuset *cs,
is_empty = cpumask_empty(cs->cpus_allowed) ||
nodes_empty(cs->mems_allowed);
- percpu_up_write(&cpuset_rwsem);
+ mutex_unlock(&cpuset_mutex);
/*
* Move tasks to the nearest ancestor with execution resources,
@@ -3364,7 +3361,7 @@ hotplug_update_tasks_legacy(struct cpuset *cs,
if (is_empty)
remove_tasks_in_empty_cpuset(cs);
- percpu_down_write(&cpuset_rwsem);
+ mutex_lock(&cpuset_mutex);
}
static void
@@ -3415,14 +3412,14 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
retry:
wait_event(cpuset_attach_wq, cs->attach_in_progress == 0);
- percpu_down_write(&cpuset_rwsem);
+ mutex_lock(&cpuset_mutex);
/*
* We have raced with task attaching. We wait until attaching
* is finished, so we won't attach a task to an empty cpuset.
*/
if (cs->attach_in_progress) {
- percpu_up_write(&cpuset_rwsem);
+ mutex_unlock(&cpuset_mutex);
goto retry;
}
@@ -3516,7 +3513,7 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
hotplug_update_tasks_legacy(cs, &new_cpus, &new_mems,
cpus_updated, mems_updated);
- percpu_up_write(&cpuset_rwsem);
+ mutex_unlock(&cpuset_mutex);
}
/**
@@ -3546,7 +3543,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
if (on_dfl && !alloc_cpumasks(NULL, &tmp))
ptmp = &tmp;
- percpu_down_write(&cpuset_rwsem);
+ mutex_lock(&cpuset_mutex);
/* fetch the available cpus/mems and find out which changed how */
cpumask_copy(&new_cpus, cpu_active_mask);
@@ -3603,7 +3600,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
update_tasks_nodemask(&top_cpuset);
}
- percpu_up_write(&cpuset_rwsem);
+ mutex_unlock(&cpuset_mutex);
/* if cpus or mems changed, we need to propagate to descendants */
if (cpus_updated || mems_updated) {
@@ -4008,7 +4005,7 @@ void __cpuset_memory_pressure_bump(void)
* - Used for /proc/<pid>/cpuset.
* - No need to task_lock(tsk) on this tsk->cpuset reference, as it
* doesn't really matter if tsk->cpuset changes after we read it,
- * and we take cpuset_rwsem, keeping cpuset_attach() from changing it
+ * and we take cpuset_mutex, keeping cpuset_attach() from changing it
* anyway.
*/
int proc_cpuset_show(struct seq_file *m, struct pid_namespace *ns,
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4580fe3e1d0c..5902cbb5e751 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7535,6 +7535,7 @@ static int __sched_setscheduler(struct task_struct *p,
int reset_on_fork;
int queue_flags = DEQUEUE_SAVE | DEQUEUE_MOVE | DEQUEUE_NOCLOCK;
struct rq *rq;
+ bool cpuset_locked = false;
/* The pi code expects interrupts enabled */
BUG_ON(pi && in_interrupt());
@@ -7584,8 +7585,14 @@ static int __sched_setscheduler(struct task_struct *p,
return retval;
}
- if (pi)
- cpuset_read_lock();
+ /*
+ * SCHED_DEADLINE bandwidth accounting relies on stable cpusets
+ * information.
+ */
+ if (dl_policy(policy) || dl_policy(p->policy)) {
+ cpuset_locked = true;
+ cpuset_lock();
+ }
/*
* Make sure no PI-waiters arrive (or leave) while we are
@@ -7661,8 +7668,8 @@ static int __sched_setscheduler(struct task_struct *p,
if (unlikely(oldpolicy != -1 && oldpolicy != p->policy)) {
policy = oldpolicy = -1;
task_rq_unlock(rq, p, &rf);
- if (pi)
- cpuset_read_unlock();
+ if (cpuset_locked)
+ cpuset_unlock();
goto recheck;
}
@@ -7729,7 +7736,8 @@ static int __sched_setscheduler(struct task_struct *p,
task_rq_unlock(rq, p, &rf);
if (pi) {
- cpuset_read_unlock();
+ if (cpuset_locked)
+ cpuset_unlock();
rt_mutex_adjust_pi(p);
}
@@ -7741,8 +7749,8 @@ static int __sched_setscheduler(struct task_struct *p,
unlock:
task_rq_unlock(rq, p, &rf);
- if (pi)
- cpuset_read_unlock();
+ if (cpuset_locked)
+ cpuset_unlock();
return retval;
}
--
2.39.2
update_tasks_root_domain() currently iterates over all tasks even if no
DEADLINE task is present in the cpuset/root domain for which bandwidth
accounting is being rebuilt. This has been reported to introduce 10+ ms
delays on suspend-resume operations.
Skip the costly iteration for cpusets that don't contain DEADLINE tasks.
Reported-by: Qais Yousef <[email protected]>
Signed-off-by: Juri Lelli <[email protected]>
---
kernel/cgroup/cpuset.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 57bc60112618..f46192d2e97e 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1090,6 +1090,9 @@ static void update_tasks_root_domain(struct cpuset *cs)
struct css_task_iter it;
struct task_struct *task;
+ if (cs->nr_deadline_tasks == 0)
+ return;
+
css_task_iter_start(&cs->css, 0, &it);
while ((task = css_task_iter_next(&it)))
--
2.39.2
On 03/15/23 12:18, Juri Lelli wrote:
> Qais reported that iterating over all tasks when rebuilding root domains
> for finding out which ones are DEADLINE and need their bandwidth
> correctly restored on such root domains can be a costly operation (10+
> ms delays on suspend-resume).
>
> To fix the problem keep track of the number of DEADLINE tasks belonging
> to each cpuset and then use this information (followup patch) to only
> perform the above iteration if DEADLINE tasks are actually present in
> the cpuset for which a corresponding root domain is being rebuilt.
>
> Reported-by: Qais Yousef <[email protected]>
> Signed-off-by: Juri Lelli <[email protected]>
> ---
> include/linux/cpuset.h | 4 ++++
> kernel/cgroup/cgroup.c | 4 ++++
> kernel/cgroup/cpuset.c | 25 +++++++++++++++++++++++++
> kernel/sched/core.c | 10 ++++++++++
> 4 files changed, 43 insertions(+)
>
> diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
> index 355f796c5f07..0348dba5680e 100644
> --- a/include/linux/cpuset.h
> +++ b/include/linux/cpuset.h
> @@ -71,6 +71,8 @@ extern void cpuset_init_smp(void);
> extern void cpuset_force_rebuild(void);
> extern void cpuset_update_active_cpus(void);
> extern void cpuset_wait_for_hotplug(void);
> +extern void inc_dl_tasks_cs(struct task_struct *task);
> +extern void dec_dl_tasks_cs(struct task_struct *task);
> extern void cpuset_lock(void);
> extern void cpuset_unlock(void);
> extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask);
> @@ -196,6 +198,8 @@ static inline void cpuset_update_active_cpus(void)
>
> static inline void cpuset_wait_for_hotplug(void) { }
>
> +static inline void inc_dl_tasks_cs(struct task_struct *task) { }
> +static inline void dec_dl_tasks_cs(struct task_struct *task) { }
> static inline void cpuset_lock(void) { }
> static inline void cpuset_unlock(void) { }
>
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index c099cf3fa02d..357925e1e4af 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -57,6 +57,7 @@
> #include <linux/file.h>
> #include <linux/fs_parser.h>
> #include <linux/sched/cputime.h>
> +#include <linux/sched/deadline.h>
> #include <linux/psi.h>
> #include <net/sock.h>
>
> @@ -6673,6 +6674,9 @@ void cgroup_exit(struct task_struct *tsk)
> list_add_tail(&tsk->cg_list, &cset->dying_tasks);
> cset->nr_tasks--;
>
> + if (dl_task(tsk))
> + dec_dl_tasks_cs(tsk);
> +
> WARN_ON_ONCE(cgroup_task_frozen(tsk));
> if (unlikely(!(tsk->flags & PF_KTHREAD) &&
> test_bit(CGRP_FREEZE, &task_dfl_cgroup(tsk)->flags)))
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 8d82d66d432b..57bc60112618 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -193,6 +193,12 @@ struct cpuset {
> int use_parent_ecpus;
> int child_ecpus_count;
>
> + /*
> + * number of SCHED_DEADLINE tasks attached to this cpuset, so that we
> + * know when to rebuild associated root domain bandwidth information.
> + */
> + int nr_deadline_tasks;
> +
> /* Invalid partition error code, not lock protected */
> enum prs_errcode prs_err;
>
> @@ -245,6 +251,20 @@ static inline struct cpuset *parent_cs(struct cpuset *cs)
> return css_cs(cs->css.parent);
> }
>
> +void inc_dl_tasks_cs(struct task_struct *p)
> +{
> + struct cpuset *cs = task_cs(p);
nit:
I *think* task_cs() assumes rcu_read_lock() is held, right?
Would it make sense to WARN_ON(!rcu_read_lock_held()) to at least
annotate the deps?
Or maybe task_cs() should do that..
> +
> + cs->nr_deadline_tasks++;
> +}
> +
> +void dec_dl_tasks_cs(struct task_struct *p)
> +{
> + struct cpuset *cs = task_cs(p);
nit: ditto
> +
> + cs->nr_deadline_tasks--;
> +}
> +
> /* bits in struct cpuset flags field */
> typedef enum {
> CS_ONLINE,
> @@ -2472,6 +2492,11 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
> ret = security_task_setscheduler(task);
> if (ret)
> goto out_unlock;
> +
> + if (dl_task(task)) {
> + cs->nr_deadline_tasks++;
> + cpuset_attach_old_cs->nr_deadline_tasks--;
> + }
> }
>
> /*
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 5902cbb5e751..d586a8440348 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7683,6 +7683,16 @@ static int __sched_setscheduler(struct task_struct *p,
> goto unlock;
> }
>
> + /*
> + * In case a task is setscheduled to SCHED_DEADLINE, or if a task is
> + * moved to a different sched policy, we need to keep track of that on
> + * its cpuset (for correct bandwidth tracking).
> + */
> + if (dl_policy(policy) && !dl_task(p))
> + inc_dl_tasks_cs(p);
> + else if (dl_task(p) && !dl_policy(policy))
> + dec_dl_tasks_cs(p);
> +
Would it be better to use switched_to_dl()/switched_from_dl() instead to
inc/dec_dl_tasks_cs()?
Thanks!
--
Qais Yousef
> p->sched_reset_on_fork = reset_on_fork;
> oldprio = p->prio;
>
> --
> 2.39.2
>
On 03/15/23 12:18, Juri Lelli wrote:
> Qais reported [1] that iterating over all tasks when rebuilding root
> domains for finding out which ones are DEADLINE and need their bandwidth
> correctly restored on such root domains can be a costly operation (10+
> ms delays on suspend-resume). He proposed we skip rebuilding root
> domains for certain operations, but that approach seemed arch specific
> and possibly prone to errors, as paths that ultimately trigger a rebuild
> might be quite convoluted (thanks Qais for spending time on this!).
Thanks a lot for this! And sorry I couldn't provide something better.
>
> To fix the problem I instead would propose we
>
> 1 - Bring back cpuset_mutex (so that we have write access to cpusets
> from scheduler operations - and we also fix some problems
> associated to percpu_cpuset_rwsem)
> 2 - Keep track of the number of DEADLINE tasks belonging to each cpuset
> 3 - Use this information to only perform the costly iteration if
> DEADLINE tasks are actually present in the cpuset for which a
> corresponding root domain is being rebuilt
nit:
Would you consider adding another patch to rename the functions?
rebuild_root_domains() and update_tasks_root_domain() are deadline accounting
specific functions and don't actually rebuild root domains.
Thanks!
--
Qais Yousef
>
> This set is also available from
>
> https://github.com/jlelli/linux.git deadline/rework-cpusets
>
> Feedback is more than welcome.
>
> Best,
> Juri
>
> 1 - https://lore.kernel.org/lkml/[email protected]/
>
> Juri Lelli (3):
> sched/cpuset: Bring back cpuset_mutex
> sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets
> cgroup/cpuset: Iterate only if DEADLINE tasks are present
>
> include/linux/cpuset.h | 12 ++-
> kernel/cgroup/cgroup.c | 4 +
> kernel/cgroup/cpuset.c | 175 +++++++++++++++++++++++------------------
> kernel/sched/core.c | 32 ++++++--
> 4 files changed, 137 insertions(+), 86 deletions(-)
>
> --
> 2.39.2
>
On 3/15/23 08:18, Juri Lelli wrote:
> Qais reported that iterating over all tasks when rebuilding root domains
> for finding out which ones are DEADLINE and need their bandwidth
> correctly restored on such root domains can be a costly operation (10+
> ms delays on suspend-resume).
>
> To fix the problem keep track of the number of DEADLINE tasks belonging
> to each cpuset and then use this information (followup patch) to only
> perform the above iteration if DEADLINE tasks are actually present in
> the cpuset for which a corresponding root domain is being rebuilt.
>
> Reported-by: Qais Yousef <[email protected]>
> Signed-off-by: Juri Lelli <[email protected]>
> ---
> include/linux/cpuset.h | 4 ++++
> kernel/cgroup/cgroup.c | 4 ++++
> kernel/cgroup/cpuset.c | 25 +++++++++++++++++++++++++
> kernel/sched/core.c | 10 ++++++++++
> 4 files changed, 43 insertions(+)
>
> diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
> index 355f796c5f07..0348dba5680e 100644
> --- a/include/linux/cpuset.h
> +++ b/include/linux/cpuset.h
> @@ -71,6 +71,8 @@ extern void cpuset_init_smp(void);
> extern void cpuset_force_rebuild(void);
> extern void cpuset_update_active_cpus(void);
> extern void cpuset_wait_for_hotplug(void);
> +extern void inc_dl_tasks_cs(struct task_struct *task);
> +extern void dec_dl_tasks_cs(struct task_struct *task);
> extern void cpuset_lock(void);
> extern void cpuset_unlock(void);
> extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask);
> @@ -196,6 +198,8 @@ static inline void cpuset_update_active_cpus(void)
>
> static inline void cpuset_wait_for_hotplug(void) { }
>
> +static inline void inc_dl_tasks_cs(struct task_struct *task) { }
> +static inline void dec_dl_tasks_cs(struct task_struct *task) { }
> static inline void cpuset_lock(void) { }
> static inline void cpuset_unlock(void) { }
>
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index c099cf3fa02d..357925e1e4af 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -57,6 +57,7 @@
> #include <linux/file.h>
> #include <linux/fs_parser.h>
> #include <linux/sched/cputime.h>
> +#include <linux/sched/deadline.h>
> #include <linux/psi.h>
> #include <net/sock.h>
>
> @@ -6673,6 +6674,9 @@ void cgroup_exit(struct task_struct *tsk)
> list_add_tail(&tsk->cg_list, &cset->dying_tasks);
> cset->nr_tasks--;
>
> + if (dl_task(tsk))
> + dec_dl_tasks_cs(tsk);
> +
> WARN_ON_ONCE(cgroup_task_frozen(tsk));
> if (unlikely(!(tsk->flags & PF_KTHREAD) &&
> test_bit(CGRP_FREEZE, &task_dfl_cgroup(tsk)->flags)))
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 8d82d66d432b..57bc60112618 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -193,6 +193,12 @@ struct cpuset {
> int use_parent_ecpus;
> int child_ecpus_count;
>
> + /*
> + * number of SCHED_DEADLINE tasks attached to this cpuset, so that we
> + * know when to rebuild associated root domain bandwidth information.
> + */
> + int nr_deadline_tasks;
> +
> /* Invalid partition error code, not lock protected */
> enum prs_errcode prs_err;
>
> @@ -245,6 +251,20 @@ static inline struct cpuset *parent_cs(struct cpuset *cs)
> return css_cs(cs->css.parent);
> }
>
> +void inc_dl_tasks_cs(struct task_struct *p)
> +{
> + struct cpuset *cs = task_cs(p);
> +
> + cs->nr_deadline_tasks++;
> +}
> +
> +void dec_dl_tasks_cs(struct task_struct *p)
> +{
> + struct cpuset *cs = task_cs(p);
> +
> + cs->nr_deadline_tasks--;
> +}
> +
> /* bits in struct cpuset flags field */
> typedef enum {
> CS_ONLINE,
> @@ -2472,6 +2492,11 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
> ret = security_task_setscheduler(task);
> if (ret)
> goto out_unlock;
> +
> + if (dl_task(task)) {
> + cs->nr_deadline_tasks++;
> + cpuset_attach_old_cs->nr_deadline_tasks--;
> + }
> }
Any one of the tasks in the cpuset can cause the test to fail and abort
the attachment. I would suggest that you keep a deadline task transfer
count in the loop and then update cs and cpuset_attach_old_cs only
after all the tasks have been iterated successfully.
Cheers,
Longman
On 15/03/23 14:55, Qais Yousef wrote:
> On 03/15/23 12:18, Juri Lelli wrote:
> > Qais reported [1] that iterating over all tasks when rebuilding root
> > domains for finding out which ones are DEADLINE and need their bandwidth
> > correctly restored on such root domains can be a costly operation (10+
> > ms delays on suspend-resume). He proposed we skip rebuilding root
> > domains for certain operations, but that approach seemed arch specific
> > and possibly prone to errors, as paths that ultimately trigger a rebuild
> > might be quite convoluted (thanks Qais for spending time on this!).
>
> Thanks a lot for this! And sorry I couldn't provide something better.
Ah, no worries. Actually, I still have to convince myself that what I have is
actually better. :)
> >
> > To fix the problem I instead would propose we
> >
> > 1 - Bring back cpuset_mutex (so that we have write access to cpusets
> > from scheduler operations - and we also fix some problems
> > associated to percpu_cpuset_rwsem)
> > 2 - Keep track of the number of DEADLINE tasks belonging to each cpuset
> > 3 - Use this information to only perform the costly iteration if
> > DEADLINE tasks are actually present in the cpuset for which a
> > corresponding root domain is being rebuilt
>
> nit:
>
> Would you consider adding another patch to rename the functions?
> rebuild_root_domains() and update_tasks_root_domain() are deadline accounting
> specific functions and don't actually rebuild root domains.
Yep, can do.
Thanks,
Juri
On 15/03/23 11:46, Waiman Long wrote:
>
> On 3/15/23 08:18, Juri Lelli wrote:
> > Qais reported that iterating over all tasks when rebuilding root domains
> > for finding out which ones are DEADLINE and need their bandwidth
> > correctly restored on such root domains can be a costly operation (10+
> > ms delays on suspend-resume).
> >
> > To fix the problem keep track of the number of DEADLINE tasks belonging
> > to each cpuset and then use this information (followup patch) to only
> > perform the above iteration if DEADLINE tasks are actually present in
> > the cpuset for which a corresponding root domain is being rebuilt.
> >
> > Reported-by: Qais Yousef <[email protected]>
> > Signed-off-by: Juri Lelli <[email protected]>
> > ---
> > include/linux/cpuset.h | 4 ++++
> > kernel/cgroup/cgroup.c | 4 ++++
> > kernel/cgroup/cpuset.c | 25 +++++++++++++++++++++++++
> > kernel/sched/core.c | 10 ++++++++++
> > 4 files changed, 43 insertions(+)
> >
> > diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
> > index 355f796c5f07..0348dba5680e 100644
> > --- a/include/linux/cpuset.h
> > +++ b/include/linux/cpuset.h
> > @@ -71,6 +71,8 @@ extern void cpuset_init_smp(void);
> > extern void cpuset_force_rebuild(void);
> > extern void cpuset_update_active_cpus(void);
> > extern void cpuset_wait_for_hotplug(void);
> > +extern void inc_dl_tasks_cs(struct task_struct *task);
> > +extern void dec_dl_tasks_cs(struct task_struct *task);
> > extern void cpuset_lock(void);
> > extern void cpuset_unlock(void);
> > extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask);
> > @@ -196,6 +198,8 @@ static inline void cpuset_update_active_cpus(void)
> > static inline void cpuset_wait_for_hotplug(void) { }
> > +static inline void inc_dl_tasks_cs(struct task_struct *task) { }
> > +static inline void dec_dl_tasks_cs(struct task_struct *task) { }
> > static inline void cpuset_lock(void) { }
> > static inline void cpuset_unlock(void) { }
> > diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> > index c099cf3fa02d..357925e1e4af 100644
> > --- a/kernel/cgroup/cgroup.c
> > +++ b/kernel/cgroup/cgroup.c
> > @@ -57,6 +57,7 @@
> > #include <linux/file.h>
> > #include <linux/fs_parser.h>
> > #include <linux/sched/cputime.h>
> > +#include <linux/sched/deadline.h>
> > #include <linux/psi.h>
> > #include <net/sock.h>
> > @@ -6673,6 +6674,9 @@ void cgroup_exit(struct task_struct *tsk)
> > list_add_tail(&tsk->cg_list, &cset->dying_tasks);
> > cset->nr_tasks--;
> > + if (dl_task(tsk))
> > + dec_dl_tasks_cs(tsk);
> > +
> > WARN_ON_ONCE(cgroup_task_frozen(tsk));
> > if (unlikely(!(tsk->flags & PF_KTHREAD) &&
> > test_bit(CGRP_FREEZE, &task_dfl_cgroup(tsk)->flags)))
> > diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> > index 8d82d66d432b..57bc60112618 100644
> > --- a/kernel/cgroup/cpuset.c
> > +++ b/kernel/cgroup/cpuset.c
> > @@ -193,6 +193,12 @@ struct cpuset {
> > int use_parent_ecpus;
> > int child_ecpus_count;
> > + /*
> > + * number of SCHED_DEADLINE tasks attached to this cpuset, so that we
> > + * know when to rebuild associated root domain bandwidth information.
> > + */
> > + int nr_deadline_tasks;
> > +
> > /* Invalid partition error code, not lock protected */
> > enum prs_errcode prs_err;
> > @@ -245,6 +251,20 @@ static inline struct cpuset *parent_cs(struct cpuset *cs)
> > return css_cs(cs->css.parent);
> > }
> > +void inc_dl_tasks_cs(struct task_struct *p)
> > +{
> > + struct cpuset *cs = task_cs(p);
> > +
> > + cs->nr_deadline_tasks++;
> > +}
> > +
> > +void dec_dl_tasks_cs(struct task_struct *p)
> > +{
> > + struct cpuset *cs = task_cs(p);
> > +
> > + cs->nr_deadline_tasks--;
> > +}
> > +
> > /* bits in struct cpuset flags field */
> > typedef enum {
> > CS_ONLINE,
> > @@ -2472,6 +2492,11 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
> > ret = security_task_setscheduler(task);
> > if (ret)
> > goto out_unlock;
> > +
> > + if (dl_task(task)) {
> > + cs->nr_deadline_tasks++;
> > + cpuset_attach_old_cs->nr_deadline_tasks--;
> > + }
> > }
>
> Any one of the tasks in the cpuset can cause the test to fail and abort the
> attachment. I would suggest that you keep a deadline task transfer count in
> the loop and then update cs and cpuset_attach_old_cs only after all the
> tasks have been iterated successfully.
Right, Dietmar I think commented pointing out something along these
lines. Think though we already have this problem with current
> task_can_attach -> dl_cpu_busy which reserves bandwidth for each task
in the destination cs. Will need to look into that. Do you know which
sort of operation would move multiple tasks at once?
On 15/03/23 14:49, Qais Yousef wrote:
> On 03/15/23 12:18, Juri Lelli wrote:
...
> > +void inc_dl_tasks_cs(struct task_struct *p)
> > +{
> > + struct cpuset *cs = task_cs(p);
>
> nit:
>
> I *think* task_cs() assumes rcu_read_lock() is held, right?
>
> Would it make sense to WARN_ON(!rcu_read_lock_held()) to at least
> annotate the deps?
Think we have that check in task_css_set_check()?
> Or maybe task_cs() should do that..
>
> > +
> > + cs->nr_deadline_tasks++;
> > +}
> > +
> > +void dec_dl_tasks_cs(struct task_struct *p)
> > +{
> > + struct cpuset *cs = task_cs(p);
>
> nit: ditto
>
> > +
> > + cs->nr_deadline_tasks--;
> > +}
> > +
...
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 5902cbb5e751..d586a8440348 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -7683,6 +7683,16 @@ static int __sched_setscheduler(struct task_struct *p,
> > goto unlock;
> > }
> >
> > + /*
> > + * In case a task is setscheduled to SCHED_DEADLINE, or if a task is
> > + * moved to a different sched policy, we need to keep track of that on
> > + * its cpuset (for correct bandwidth tracking).
> > + */
> > + if (dl_policy(policy) && !dl_task(p))
> > + inc_dl_tasks_cs(p);
> > + else if (dl_task(p) && !dl_policy(policy))
> > + dec_dl_tasks_cs(p);
> > +
>
> Would it be better to use switched_to_dl()/switched_from_dl() instead to
> inc/dec_dl_tasks_cs()?
Ah, makes sense. I'll play with this.
Thanks,
Juri
On 3/15/23 13:14, Juri Lelli wrote:
> On 15/03/23 11:46, Waiman Long wrote:
>> On 3/15/23 08:18, Juri Lelli wrote:
>>> Qais reported that iterating over all tasks when rebuilding root domains
>>> for finding out which ones are DEADLINE and need their bandwidth
>>> correctly restored on such root domains can be a costly operation (10+
>>> ms delays on suspend-resume).
>>>
>>> To fix the problem keep track of the number of DEADLINE tasks belonging
>>> to each cpuset and then use this information (followup patch) to only
>>> perform the above iteration if DEADLINE tasks are actually present in
>>> the cpuset for which a corresponding root domain is being rebuilt.
>>>
>>> Reported-by: Qais Yousef <[email protected]>
>>> Signed-off-by: Juri Lelli <[email protected]>
>>> ---
>>> include/linux/cpuset.h | 4 ++++
>>> kernel/cgroup/cgroup.c | 4 ++++
>>> kernel/cgroup/cpuset.c | 25 +++++++++++++++++++++++++
>>> kernel/sched/core.c | 10 ++++++++++
>>> 4 files changed, 43 insertions(+)
>>>
>>> diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
>>> index 355f796c5f07..0348dba5680e 100644
>>> --- a/include/linux/cpuset.h
>>> +++ b/include/linux/cpuset.h
>>> @@ -71,6 +71,8 @@ extern void cpuset_init_smp(void);
>>> extern void cpuset_force_rebuild(void);
>>> extern void cpuset_update_active_cpus(void);
>>> extern void cpuset_wait_for_hotplug(void);
>>> +extern void inc_dl_tasks_cs(struct task_struct *task);
>>> +extern void dec_dl_tasks_cs(struct task_struct *task);
>>> extern void cpuset_lock(void);
>>> extern void cpuset_unlock(void);
>>> extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask);
>>> @@ -196,6 +198,8 @@ static inline void cpuset_update_active_cpus(void)
>>> static inline void cpuset_wait_for_hotplug(void) { }
>>> +static inline void inc_dl_tasks_cs(struct task_struct *task) { }
>>> +static inline void dec_dl_tasks_cs(struct task_struct *task) { }
>>> static inline void cpuset_lock(void) { }
>>> static inline void cpuset_unlock(void) { }
>>> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
>>> index c099cf3fa02d..357925e1e4af 100644
>>> --- a/kernel/cgroup/cgroup.c
>>> +++ b/kernel/cgroup/cgroup.c
>>> @@ -57,6 +57,7 @@
>>> #include <linux/file.h>
>>> #include <linux/fs_parser.h>
>>> #include <linux/sched/cputime.h>
>>> +#include <linux/sched/deadline.h>
>>> #include <linux/psi.h>
>>> #include <net/sock.h>
>>> @@ -6673,6 +6674,9 @@ void cgroup_exit(struct task_struct *tsk)
>>> list_add_tail(&tsk->cg_list, &cset->dying_tasks);
>>> cset->nr_tasks--;
>>> + if (dl_task(tsk))
>>> + dec_dl_tasks_cs(tsk);
>>> +
>>> WARN_ON_ONCE(cgroup_task_frozen(tsk));
>>> if (unlikely(!(tsk->flags & PF_KTHREAD) &&
>>> test_bit(CGRP_FREEZE, &task_dfl_cgroup(tsk)->flags)))
>>> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
>>> index 8d82d66d432b..57bc60112618 100644
>>> --- a/kernel/cgroup/cpuset.c
>>> +++ b/kernel/cgroup/cpuset.c
>>> @@ -193,6 +193,12 @@ struct cpuset {
>>> int use_parent_ecpus;
>>> int child_ecpus_count;
>>> + /*
>>> + * number of SCHED_DEADLINE tasks attached to this cpuset, so that we
>>> + * know when to rebuild associated root domain bandwidth information.
>>> + */
>>> + int nr_deadline_tasks;
>>> +
>>> /* Invalid partition error code, not lock protected */
>>> enum prs_errcode prs_err;
>>> @@ -245,6 +251,20 @@ static inline struct cpuset *parent_cs(struct cpuset *cs)
>>> return css_cs(cs->css.parent);
>>> }
>>> +void inc_dl_tasks_cs(struct task_struct *p)
>>> +{
>>> + struct cpuset *cs = task_cs(p);
>>> +
>>> + cs->nr_deadline_tasks++;
>>> +}
>>> +
>>> +void dec_dl_tasks_cs(struct task_struct *p)
>>> +{
>>> + struct cpuset *cs = task_cs(p);
>>> +
>>> + cs->nr_deadline_tasks--;
>>> +}
>>> +
>>> /* bits in struct cpuset flags field */
>>> typedef enum {
>>> CS_ONLINE,
>>> @@ -2472,6 +2492,11 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
>>> ret = security_task_setscheduler(task);
>>> if (ret)
>>> goto out_unlock;
>>> +
>>> + if (dl_task(task)) {
>>> + cs->nr_deadline_tasks++;
>>> + cpuset_attach_old_cs->nr_deadline_tasks--;
>>> + }
>>> }
>> Any one of the tasks in the cpuset can cause the test to fail and abort the
>> attachment. I would suggest that you keep a deadline task transfer count in
>> the loop and then update cs and cpuset_attach_old_cs only after all the
>> tasks have been iterated successfully.
> Right, Dietmar I think commented pointing out something along these
> lines. Think though we already have this problem with current
> task_can_attach -> dl_cpu_busy which reserves bandwidth for each task
> in the destination cs. Will need to look into that. Do you know which
> sort of operation would move multiple tasks at once?
Actually, what I said previously may not be enough. There can be
multiple controllers attached to a cgroup. If any of their can_attach()
calls fails, the whole transaction is aborted and cancel_attach() will
be called. My new suggestion is to add a new deadline task transfer
count into the cpuset structure and store the information there
temporarily. If cpuset_attach() is called, it means all the can_attach
calls succeed. You can then update the dl task count accordingly and
clear the temporary transfer count.
I guess you may have to do something similar with dl_cpu_busy().
My 2 cents.
Cheers,
Longman
On 3/15/23 14:01, Waiman Long wrote:
>
> On 3/15/23 13:14, Juri Lelli wrote:
>> On 15/03/23 11:46, Waiman Long wrote:
>>> On 3/15/23 08:18, Juri Lelli wrote:
>>>> Qais reported that iterating over all tasks when rebuilding root
>>>> domains
>>>> for finding out which ones are DEADLINE and need their bandwidth
>>>> correctly restored on such root domains can be a costly operation (10+
>>>> ms delays on suspend-resume).
>>>>
>>>> To fix the problem keep track of the number of DEADLINE tasks
>>>> belonging
>>>> to each cpuset and then use this information (followup patch) to only
>>>> perform the above iteration if DEADLINE tasks are actually present in
>>>> the cpuset for which a corresponding root domain is being rebuilt.
>>>>
>>>> Reported-by: Qais Yousef <[email protected]>
>>>> Signed-off-by: Juri Lelli <[email protected]>
>>>> ---
>>>> include/linux/cpuset.h | 4 ++++
>>>> kernel/cgroup/cgroup.c | 4 ++++
>>>> kernel/cgroup/cpuset.c | 25 +++++++++++++++++++++++++
>>>> kernel/sched/core.c | 10 ++++++++++
>>>> 4 files changed, 43 insertions(+)
>>>>
>>>> diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
>>>> index 355f796c5f07..0348dba5680e 100644
>>>> --- a/include/linux/cpuset.h
>>>> +++ b/include/linux/cpuset.h
>>>> @@ -71,6 +71,8 @@ extern void cpuset_init_smp(void);
>>>> extern void cpuset_force_rebuild(void);
>>>> extern void cpuset_update_active_cpus(void);
>>>> extern void cpuset_wait_for_hotplug(void);
>>>> +extern void inc_dl_tasks_cs(struct task_struct *task);
>>>> +extern void dec_dl_tasks_cs(struct task_struct *task);
>>>> extern void cpuset_lock(void);
>>>> extern void cpuset_unlock(void);
>>>> extern void cpuset_cpus_allowed(struct task_struct *p, struct
>>>> cpumask *mask);
>>>> @@ -196,6 +198,8 @@ static inline void cpuset_update_active_cpus(void)
>>>> static inline void cpuset_wait_for_hotplug(void) { }
>>>> +static inline void inc_dl_tasks_cs(struct task_struct *task) { }
>>>> +static inline void dec_dl_tasks_cs(struct task_struct *task) { }
>>>> static inline void cpuset_lock(void) { }
>>>> static inline void cpuset_unlock(void) { }
>>>> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
>>>> index c099cf3fa02d..357925e1e4af 100644
>>>> --- a/kernel/cgroup/cgroup.c
>>>> +++ b/kernel/cgroup/cgroup.c
>>>> @@ -57,6 +57,7 @@
>>>> #include <linux/file.h>
>>>> #include <linux/fs_parser.h>
>>>> #include <linux/sched/cputime.h>
>>>> +#include <linux/sched/deadline.h>
>>>> #include <linux/psi.h>
>>>> #include <net/sock.h>
>>>> @@ -6673,6 +6674,9 @@ void cgroup_exit(struct task_struct *tsk)
>>>> list_add_tail(&tsk->cg_list, &cset->dying_tasks);
>>>> cset->nr_tasks--;
>>>> + if (dl_task(tsk))
>>>> + dec_dl_tasks_cs(tsk);
>>>> +
>>>> WARN_ON_ONCE(cgroup_task_frozen(tsk));
>>>> if (unlikely(!(tsk->flags & PF_KTHREAD) &&
>>>> test_bit(CGRP_FREEZE, &task_dfl_cgroup(tsk)->flags)))
>>>> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
>>>> index 8d82d66d432b..57bc60112618 100644
>>>> --- a/kernel/cgroup/cpuset.c
>>>> +++ b/kernel/cgroup/cpuset.c
>>>> @@ -193,6 +193,12 @@ struct cpuset {
>>>> int use_parent_ecpus;
>>>> int child_ecpus_count;
>>>> + /*
>>>> + * number of SCHED_DEADLINE tasks attached to this cpuset, so
>>>> that we
>>>> + * know when to rebuild associated root domain bandwidth
>>>> information.
>>>> + */
>>>> + int nr_deadline_tasks;
>>>> +
>>>> /* Invalid partition error code, not lock protected */
>>>> enum prs_errcode prs_err;
>>>> @@ -245,6 +251,20 @@ static inline struct cpuset *parent_cs(struct
>>>> cpuset *cs)
>>>> return css_cs(cs->css.parent);
>>>> }
>>>> +void inc_dl_tasks_cs(struct task_struct *p)
>>>> +{
>>>> + struct cpuset *cs = task_cs(p);
>>>> +
>>>> + cs->nr_deadline_tasks++;
>>>> +}
>>>> +
>>>> +void dec_dl_tasks_cs(struct task_struct *p)
>>>> +{
>>>> + struct cpuset *cs = task_cs(p);
>>>> +
>>>> + cs->nr_deadline_tasks--;
>>>> +}
>>>> +
>>>> /* bits in struct cpuset flags field */
>>>> typedef enum {
>>>> CS_ONLINE,
>>>> @@ -2472,6 +2492,11 @@ static int cpuset_can_attach(struct
>>>> cgroup_taskset *tset)
>>>> ret = security_task_setscheduler(task);
>>>> if (ret)
>>>> goto out_unlock;
>>>> +
>>>> + if (dl_task(task)) {
>>>> + cs->nr_deadline_tasks++;
>>>> + cpuset_attach_old_cs->nr_deadline_tasks--;
>>>> + }
>>>> }
>>> Any one of the tasks in the cpuset can cause the test to fail and
>>> abort the
>>> attachment. I would suggest that you keep a deadline task transfer
>>> count in
>>> the loop and then update cs and cpuset_attach_old_cs only after all
>>> the
>>> tasks have been iterated successfully.
>> Right, Dietmar I think commented pointing out something along these
>> lines. Think though we already have this problem with current
>> task_can_attach -> dl_cpu_busy which reserves bandwidth for each task
>> in the destination cs. Will need to look into that. Do you know which
>> sort of operation would move multiple tasks at once?
>
> Actually, what I said previously may not be enough. There can be
> multiple controllers attached to a cgroup. If any of their
> can_attach() calls fails, the whole transaction is aborted and
> cancel_attach() will be called. My new suggestion is to add a new
> deadline task transfer count into the cpuset structure and store the
> information there temporarily. If cpuset_attach() is called, it means
> all the can_attach calls succeed. You can then update the dl task
> count accordingly and clear the temporary transfer count.
>
> I guess you may have to do something similar with dl_cpu_busy().
>
> My 2 cents.
Alternatively, you can do the nr_deadline_tasks update in
cpuset_attach(). However, there is an optimization to skip the task
iteration if the cpus and mems lists haven't changed. You will have to
skip that optimization if there are DL tasks in the cpuset.
Cheers,
Longman
On 03/15/23 17:18, Juri Lelli wrote:
> On 15/03/23 14:49, Qais Yousef wrote:
> > On 03/15/23 12:18, Juri Lelli wrote:
>
> ...
>
> > > +void inc_dl_tasks_cs(struct task_struct *p)
> > > +{
> > > + struct cpuset *cs = task_cs(p);
> >
> > nit:
> >
> > I *think* task_cs() assumes rcu_read_lock() is held, right?
> >
> > Would it make sense to WARN_ON(!rcu_read_lock_held()) to at least
> > annotate the deps?
>
> Think we have that check in task_css_set_check()?
Yes, you're right, I didn't go far enough down the call stack.
It seems to depend on PROVE_RCU, which sounds irrelevant, but I see PROVE_RCU
is actually an alias for PROVE_LOCKING.
Cheers
--
Qais Yousef
On 3/15/23 14:01, Waiman Long wrote:
>
> On 3/15/23 13:14, Juri Lelli wrote:
>> On 15/03/23 11:46, Waiman Long wrote:
>>> On 3/15/23 08:18, Juri Lelli wrote:
>>>> Qais reported that iterating over all tasks when rebuilding root
>>>> domains
>>>> for finding out which ones are DEADLINE and need their bandwidth
>>>> correctly restored on such root domains can be a costly operation (10+
>>>> ms delays on suspend-resume).
>>>>
>>>> To fix the problem keep track of the number of DEADLINE tasks
>>>> belonging
>>>> to each cpuset and then use this information (followup patch) to only
>>>> perform the above iteration if DEADLINE tasks are actually present in
>>>> the cpuset for which a corresponding root domain is being rebuilt.
>>>>
>>>> Reported-by: Qais Yousef <[email protected]>
>>>> Signed-off-by: Juri Lelli <[email protected]>
>>>> ---
>>>> include/linux/cpuset.h | 4 ++++
>>>> kernel/cgroup/cgroup.c | 4 ++++
>>>> kernel/cgroup/cpuset.c | 25 +++++++++++++++++++++++++
>>>> kernel/sched/core.c | 10 ++++++++++
>>>> 4 files changed, 43 insertions(+)
>>>>
>>>> diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
>>>> index 355f796c5f07..0348dba5680e 100644
>>>> --- a/include/linux/cpuset.h
>>>> +++ b/include/linux/cpuset.h
>>>> @@ -71,6 +71,8 @@ extern void cpuset_init_smp(void);
>>>> extern void cpuset_force_rebuild(void);
>>>> extern void cpuset_update_active_cpus(void);
>>>> extern void cpuset_wait_for_hotplug(void);
>>>> +extern void inc_dl_tasks_cs(struct task_struct *task);
>>>> +extern void dec_dl_tasks_cs(struct task_struct *task);
>>>> extern void cpuset_lock(void);
>>>> extern void cpuset_unlock(void);
>>>> extern void cpuset_cpus_allowed(struct task_struct *p, struct
>>>> cpumask *mask);
>>>> @@ -196,6 +198,8 @@ static inline void cpuset_update_active_cpus(void)
>>>> static inline void cpuset_wait_for_hotplug(void) { }
>>>> +static inline void inc_dl_tasks_cs(struct task_struct *task) { }
>>>> +static inline void dec_dl_tasks_cs(struct task_struct *task) { }
>>>> static inline void cpuset_lock(void) { }
>>>> static inline void cpuset_unlock(void) { }
>>>> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
>>>> index c099cf3fa02d..357925e1e4af 100644
>>>> --- a/kernel/cgroup/cgroup.c
>>>> +++ b/kernel/cgroup/cgroup.c
>>>> @@ -57,6 +57,7 @@
>>>> #include <linux/file.h>
>>>> #include <linux/fs_parser.h>
>>>> #include <linux/sched/cputime.h>
>>>> +#include <linux/sched/deadline.h>
>>>> #include <linux/psi.h>
>>>> #include <net/sock.h>
>>>> @@ -6673,6 +6674,9 @@ void cgroup_exit(struct task_struct *tsk)
>>>> list_add_tail(&tsk->cg_list, &cset->dying_tasks);
>>>> cset->nr_tasks--;
>>>> + if (dl_task(tsk))
>>>> + dec_dl_tasks_cs(tsk);
>>>> +
>>>> WARN_ON_ONCE(cgroup_task_frozen(tsk));
>>>> if (unlikely(!(tsk->flags & PF_KTHREAD) &&
>>>> test_bit(CGRP_FREEZE, &task_dfl_cgroup(tsk)->flags)))
>>>> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
>>>> index 8d82d66d432b..57bc60112618 100644
>>>> --- a/kernel/cgroup/cpuset.c
>>>> +++ b/kernel/cgroup/cpuset.c
>>>> @@ -193,6 +193,12 @@ struct cpuset {
>>>> int use_parent_ecpus;
>>>> int child_ecpus_count;
>>>> + /*
>>>> + * number of SCHED_DEADLINE tasks attached to this cpuset, so
>>>> that we
>>>> + * know when to rebuild associated root domain bandwidth
>>>> information.
>>>> + */
>>>> + int nr_deadline_tasks;
>>>> +
>>>> /* Invalid partition error code, not lock protected */
>>>> enum prs_errcode prs_err;
>>>> @@ -245,6 +251,20 @@ static inline struct cpuset *parent_cs(struct
>>>> cpuset *cs)
>>>> return css_cs(cs->css.parent);
>>>> }
>>>> +void inc_dl_tasks_cs(struct task_struct *p)
>>>> +{
>>>> + struct cpuset *cs = task_cs(p);
>>>> +
>>>> + cs->nr_deadline_tasks++;
>>>> +}
>>>> +
>>>> +void dec_dl_tasks_cs(struct task_struct *p)
>>>> +{
>>>> + struct cpuset *cs = task_cs(p);
>>>> +
>>>> + cs->nr_deadline_tasks--;
>>>> +}
>>>> +
>>>> /* bits in struct cpuset flags field */
>>>> typedef enum {
>>>> CS_ONLINE,
>>>> @@ -2472,6 +2492,11 @@ static int cpuset_can_attach(struct
>>>> cgroup_taskset *tset)
>>>> ret = security_task_setscheduler(task);
>>>> if (ret)
>>>> goto out_unlock;
>>>> +
>>>> + if (dl_task(task)) {
>>>> + cs->nr_deadline_tasks++;
>>>> + cpuset_attach_old_cs->nr_deadline_tasks--;
>>>> + }
>>>> }
>>> Any one of the tasks in the cpuset can cause the test to fail and
>>> abort the attachment. I would suggest that you keep a deadline task
>>> transfer count in the loop and then update cs and
>>> cpuset_attach_old_cs only after all the tasks have been iterated
>>> successfully.
>> Right, Dietmar I think commented pointing out something along these
>> lines. Think though we already have this problem with current
>> task_can_attach -> dl_cpu_busy which reserves bandwidth for each task
>> in the destination cs. Will need to look into that. Do you know which
>> sort of operation would move multiple tasks at once?
>
> Actually, what I said previously may not be enough. There can be
> multiple controllers attached to a cgroup. If any of their
> can_attach() calls fails, the whole transaction is aborted and
> cancel_attach() will be called. My new suggestion is to add a new
> deadline task transfer count into the cpuset structure and store the
> information there temporarily. If cpuset_attach() is called, it means
> all the can_attach calls succeeded. You can then update the dl task
> count accordingly and clear the temporary transfer count.
>
> I guess you may have to do something similar with dl_cpu_busy().
Another possibility is that you may record the cpu where the new DL
bandwidth is allocated from in the task_struct. Then in
cpuset_cancel_attach(), you can revert the dl_cpu_busy() change if DL
tasks are in the css_set to be transferred. That will likely require
having a DL task transfer count in the cpuset and iterating all the
tasks to look for ones with a previously recorded cpu # if the transfer
count is non-zero.
Cheers,
Longman
On 15/03/2023 18:14, Juri Lelli wrote:
> On 15/03/23 11:46, Waiman Long wrote:
>>
>> On 3/15/23 08:18, Juri Lelli wrote:
[...]
>>> @@ -2472,6 +2492,11 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
>>> ret = security_task_setscheduler(task);
>>> if (ret)
>>> goto out_unlock;
>>> +
>>> + if (dl_task(task)) {
>>> + cs->nr_deadline_tasks++;
>>> + cpuset_attach_old_cs->nr_deadline_tasks--;
>>> + }
>>> }
>>
>> Any one of the tasks in the cpuset can cause the test to fail and abort the
>> attachment. I would suggest that you keep a deadline task transfer count in
>> the loop and then update cs and cpuset_attach_old_cs only after all the
>> tasks have been iterated successfully.
>
> Right, Dietmar I think commented pointing out something along these
> lines. Think though we already have this problem with current
> task_can_attach -> dl_cpu_busy which reserves bandwidth for each task
> in the destination cs. Will need to look into that. Do you know which
> sort of operation would move multiple tasks at once?
Moving the process instead of the individual tasks makes
cpuset_can_attach() have to deal with multiple tasks.
# ps2 | grep DLN
1614 1615 140 0 - DLN thread0-0
1614 1616 140 0 - DLN thread0-1
1614 1617 140 0 - DLN thread0-2
# echo 1614 > /sys/fs/cgroup/cpuset/cs2/cgroup.procs
On 15/03/2023 19:01, Waiman Long wrote:
>
> On 3/15/23 13:14, Juri Lelli wrote:
>> On 15/03/23 11:46, Waiman Long wrote:
>>> On 3/15/23 08:18, Juri Lelli wrote:
[...]
>>>> @@ -2472,6 +2492,11 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
>>>> ret = security_task_setscheduler(task);
>>>> if (ret)
>>>> goto out_unlock;
>>>> +
>>>> + if (dl_task(task)) {
>>>> + cs->nr_deadline_tasks++;
>>>> + cpuset_attach_old_cs->nr_deadline_tasks--;
>>>> + }
>>>> }
>>> Any one of the tasks in the cpuset can cause the test to fail and
>>> abort the attachment. I would suggest that you keep a deadline task
>>> transfer count in the loop and then update cs and
>>> cpuset_attach_old_cs only after all the tasks have been iterated
>>> successfully.
>> Right, Dietmar I think commented pointing out something along these
>> lines. Think though we already have this problem with current
>> task_can_attach -> dl_cpu_busy which reserves bandwidth for each task
>> in the destination cs. Will need to look into that. Do you know which
>> sort of operation would move multiple tasks at once?
>
> Actually, what I said previously may not be enough. There can be
> multiple controllers attached to a cgroup. If any of their can_attach()
> calls fails, the whole transaction is aborted and cancel_attach() will
> be called. My new suggestion is to add a new deadline task transfer
> count into the cpuset structure and store the information there
> temporarily. If cpuset_attach() is called, it means all the can_attach
> calls succeeded. You can then update the dl task count accordingly and
> clear the temporary transfer count.
>
> I guess you may have to do something similar with dl_cpu_busy().
I gave it a shot:
https://lkml.kernel.org/r/[email protected]