2009-03-03 12:58:52

by Rolando Martins

Subject: Re: cgroup, RT reservation per core(s)?

On Tue, Feb 10, 2009 at 1:06 PM, Peter Zijlstra <[email protected]> wrote:
> On Mon, 2009-02-09 at 20:04 +0000, Rolando Martins wrote:
>
>> I should have elaborated this more:
>>
>>                      root
>>                   ----|----
>>                   |       |
>>         (0.5 mem) 0       1 (100% rt, 0.5 mem)
>>                       ---------
>>                       |   |   |
>>                       2   3   4  (33% rt for each group, 33% mem
>> per group(0.165))
>> Rol
>
>
> Right, i think this can be done.
>
> You would indeed need cpusets and sched-cgroups.
>
> Split the machine in 2 using cpusets.
>
>   ___R___
>  /       \
> A         B
>
> Where R is the root cpuset, and A and B are the siblings.
> Assign A one half the cpus, and B the other half.
> Disable load-balancing on R.
>
> Then using sched cgroups create the hierarchy
>
>   ____1____
>  /    |    \
> 2     3     4
>
> Where 1 can be the root group if you like.
>
> Assign 1 a utilization limit of 100%, and 2,3 and 4 a utilization limit
> of 33% each.
>
> Then place the tasks that get 100% cputime on your 2 cpus in cpuset A
> and sched group 1.
>
> Place your other tasks in B,{2-4} respectively.
>
> The reason this works is that bandwidth distribution is sched domain
> wide, and by disabling load-balancing on R, you split the schedule
> domain.
>
> I've never actually tried anything like this, let me know if it
> works ;-)
>
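
The cpuset and sched-cgroup layout quoted above could be written down roughly as follows. This is only a sketch: it builds the list of file writes and does not touch the system. The mount points, the 4-CPU machine, memory node 0, and the 1 s RT period are assumptions that do not come from the thread; the file names are the cgroup v1 ones (cpuset.cpus, cpuset.mems, cpuset.sched_load_balance, cpu.rt_period_us, cpu.rt_runtime_us). The kernel may refuse runtime values that exceed what the parent group or the global RT limit allows.

```python
# Sketch only: builds the cgroup v1 file writes for the layout quoted above.
# Nothing is written to the system. Assumptions (not from the thread):
# mount points, a 4-CPU machine, memory node 0, and a 1 s RT period.

CPUSET_ROOT = "/dev/cpuset"    # assumed mount point of the cpuset hierarchy
CPU_ROOT = "/dev/cgroup/cpu"   # assumed mount point of the sched (cpu) hierarchy
RT_PERIOD_US = 1_000_000       # assumed value for cpu.rt_period_us


def cpu_list(cpus):
    """Format CPU numbers the way cpuset.cpus expects, e.g. "0,1"."""
    return ",".join(str(c) for c in cpus)


def plan_writes(ncpus=4, children=("2", "3", "4"), child_percent=33):
    """Return (path, value) pairs for cpusets A/B and sched groups 1/{2,3,4}."""
    half = ncpus // 2
    writes = [
        # Split the machine in two; A and B each get half of the CPUs.
        (f"{CPUSET_ROOT}/A/cpuset.cpus", cpu_list(range(0, half))),
        (f"{CPUSET_ROOT}/A/cpuset.mems", "0"),
        (f"{CPUSET_ROOT}/B/cpuset.cpus", cpu_list(range(half, ncpus))),
        (f"{CPUSET_ROOT}/B/cpuset.mems", "0"),
        # Disable load-balancing on the root cpuset R to split the sched domain.
        (f"{CPUSET_ROOT}/cpuset.sched_load_balance", "0"),
        # Group 1 gets a 100% utilization limit (runtime == period).
        (f"{CPU_ROOT}/1/cpu.rt_period_us", str(RT_PERIOD_US)),
        (f"{CPU_ROOT}/1/cpu.rt_runtime_us", str(RT_PERIOD_US)),
    ]
    # Groups 2, 3 and 4 are children of 1 with 33% each.
    child_runtime = RT_PERIOD_US * child_percent // 100
    for name in children:
        base = f"{CPU_ROOT}/1/{name}"
        writes.append((f"{base}/cpu.rt_period_us", str(RT_PERIOD_US)))
        writes.append((f"{base}/cpu.rt_runtime_us", str(child_runtime)))
    return writes


if __name__ == "__main__":
    for path, value in plan_writes():
        print(f"echo {value} > {path}")
```

Tasks would then be attached by writing their PIDs to the tasks file of each group: the ones that need the full bandwidth into cpuset A and sched group 1, the others into cpuset B and sched groups 2-4.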

Just to confirm, cpuset.sched_load_balance doesn't work with RT, right?
That is, tasks in sub-domain 2 cannot use the bandwidth of
sub-domain 3, right?

      __1__
     /     \
    2       3
(50% rt) (50% rt)

For my application domain ;) it would be interesting to have
rt_runtime_us act as a minimum guaranteed RT allocation rather than a maximum.
E.g., if an application in domain 2 needs to go up to 100% and domain 3
is idle, it would be cool to let it use the full bandwidth.
(We could also have a hard upper limit in each sub-domain, like
hard_up=0.8, i.e. even if we could get 100%, we would only use
80%.)
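
To make the idea above concrete, here is a toy model of the policy I have in mind. It is not what the kernel does today; the function name share_rt_bandwidth and the greedy hand-out order are made up for illustration, hard_up is the hypothetical cap mentioned above, and the guarantees are assumed to sum to at most 100%.

```python
# Toy model of the proposal above, not of what the kernel does.
# All values are integer percentages of one sched domain's RT bandwidth.

def share_rt_bandwidth(guaranteed, demand, hard_up, total=100):
    """Give each domain its guaranteed minimum, then lend out idle bandwidth.

    guaranteed: percent each domain is always entitled to (the "min")
    demand:     percent each domain currently wants
    hard_up:    percent each domain may never exceed (hypothetical cap)
    """
    # Step 1: every domain gets what it asks for, up to its guarantee and cap.
    alloc = {d: min(demand[d], guaranteed[d], hard_up[d]) for d in guaranteed}
    spare = total - sum(alloc.values())

    # Step 2: lend the spare bandwidth to domains that still want more,
    # never beyond hard_up. Greedy in name order; a real policy might
    # split the spare proportionally instead.
    for d in sorted(guaranteed):
        want = min(demand[d], hard_up[d]) - alloc[d]
        extra = max(0, min(want, spare))
        alloc[d] += extra
        spare -= extra
    return alloc


guaranteed = {"2": 50, "3": 50}
hard_up = {"2": 80, "3": 80}

# Domain 3 idle: domain 2 borrows, but stops at its 80% hard cap.
print(share_rt_bandwidth(guaranteed, {"2": 100, "3": 0}, hard_up))
# -> {'2': 80, '3': 0}

# Both domains busy: each falls back to its guaranteed 50%.
print(share_rt_bandwidth(guaranteed, {"2": 100, "3": 100}, hard_up))
# -> {'2': 50, '3': 50}
```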

Does this make sense?