2014-06-17 02:45:29

by xiaofeng.yan

Subject: Re: [RFD] sched/deadline: EDF dynamic quota design

On 2014/5/21 20:45, Luca Abeni wrote:
> Hi,
>
> first of all, sorry for the ultra-delayed reply: I've been busy,
> and I did not notice this email... Anyway, some comments are below
>
> On 05/16/2014 09:11 AM, Henrik Austad wrote:
> [...]
>>>>> This can also be implemented in user-space (without modifying the
>>>>> scheduler) by having a daemon that monitors the SCHED_DEADLINE tasks
>>>>> and changes their runtimes based on some kind of feedback (for
>>>>> example, the difference between the scheduling deadline of a task
>>>>> and its actual deadline - if this information is made available by
>>>>> the scheduler). I developed a similar implementation in the past
>>>>> (not based on SCHED_DEADLINE, but on some previous implementation of
>>>>> the CBS algorithm).
>>
>> This sounds like a very slow approach. What if the extra BW given by T2
>> was for one period only? There's no way you could create a userspace
>> daemon to handle that kind of budget-tweaking.
> Right. With "This can also be implemented in user-space..." I was
> referring to the feedback scheduling (adaptive reservation) approach,
> which is designed to "auto-tune" the reservation budget following slower
> variations (it is like a low-pass filter, which can set the budget to
> something between the average used budget and the largest one).
> Basically, it works on a larger time scale. If you want a "per-period"
> runtime donation, you need a reclaiming mechanism like GRUB, CASH, or
> similar, which needs to be implemented in the kernel.
>
>
>> Also, it sounds like a *really* dangerous idea to let some random (tm)
>> userspace daemon adjust the deadline-budget for other tasks in the
>> system based on an observation of spent budget for the last X seconds.
>> It's not military funding we're concerned with here.
>>
>> When you state your WCET, it is not because you need -exactly- that
>> budget, it is because you should *never* exceed that required
>> computational time.
> Exact. But the idea of feedback scheduling was that sometimes you do not
> know the WCET... You can guess it, or measure it over a large number of
> runs (but Murphy's law ensures that you will miss the worst case anyway
> ;-). And there are situations in which you do not need to respect all of
> the deadlines... The daemon I was talking about just monitors the
> difference between the scheduling deadline and the "real job deadline"
> for some tasks, in order to understand if the runtime they have been
> assigned is enough or not... If some task is not receiving enough
> runtime (that is, if the difference between its scheduling deadline and
> the "real deadline" becomes too large), the daemon tries to increase its
> runtime. When the system is overloaded (that is, every one of the
> monitored tasks wants too much runtime, and the admission control
> fails), the daemon decreases the runtimes according to the weights of
> the tasks... Of course, the daemon does not have to monitor all of the
> SCHED_DEADLINE tasks in the system, but only the ones for which adaptive
> reservations are useful (tasks for which the WCET is not known for sure,
> and that can tolerate some missed deadlines). The other SCHED_DEADLINE
> tasks can keep their fixed runtimes unchanged.
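[Editor's note: the feedback loop described above (grow a task's runtime when its scheduling deadline drifts too far past its real deadline, shrink it otherwise, never dropping below a guaranteed minimum) can be sketched roughly as follows. All names and the control law are hypothetical simplifications for illustration, not the actual daemon's code.]

```c
/* One feedback step for a single monitored task.
 * lag      : scheduling deadline minus "real job deadline" (positive
 *            means the task is being served too late)
 * threshold: largest tolerated lag before the budget is adjusted
 * step     : adjustment granularity
 * The result is clamped so a minimum runtime is always guaranteed. */
long adjust_runtime(long runtime, long lag, long threshold, long step,
                    long min_runtime, long max_runtime)
{
    if (lag > threshold)
        runtime += step;        /* task falling behind: grow the budget */
    else if (lag < -threshold)
        runtime -= step;        /* budget overly generous: shrink it */

    if (runtime < min_runtime)
        runtime = min_runtime;  /* never drop below the guaranteed minimum */
    if (runtime > max_runtime)
        runtime = max_runtime;
    return runtime;
}
```

[A real daemon would run this periodically over all monitored tasks and, on admission-control failure, scale the budgets down by task weight.]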
>
>
>> Blindly reducing allocated runtime is defeating that whole purpose.
> Of course, there could be a minimum guaranteed runtime per task.
>
>> Granted, if you use EDF for BW-control only, it could be done - but
>> then the thread itself should do that. Real-time is not about being
>> fair. Heck, it's not even about being optimal, it's about being
>> predictable, and "dynamically adjusting" is not!
> Well, this could lead to a long discussion, in which everyone is right
> and everyone is wrong... Let's say that it depends on the application's
> requirements and constraints.
>
> [...]
>>>> Will EDF have a dynamic quota in the future?
>>>
>>> Well, as Luca said, you can already use SCHED_DEADLINE as the backend
>>> for feedback scheduling (that pertains mostly to user-space).
>>>
>>> And yes, there are already thoughts to modify it a bit to go towards
>>> Lipari et al.'s GRUB algorithm. That would probably be helpful in
>>> situations like yours. But I can't give you any timing for it.
>>
>> Need to read up on GRUB before involving myself in this part of the
>> discussion, but I'm not sure how much I enjoy the idea of some part of
>> userspace (more or less) blindly adjusting deadline-params for other
>> tasks.
> No, GRUB does not involve the user-space adjusting any scheduling
> parameter. GRUB is a reclaiming algorithm, which works in a different
> way with respect to the feedback scheduling approach I described, and
> requires modifications in the scheduler.
> The basic ideas are (warning! This is an over-simplification of the
> algorithm! :)
> - You assign runtime and period to each SCHED_DEADLINE task as usual
> - Each task is guaranteed to receive its runtime every period
> - You can also define a maximum fraction Umax of the CPU time that the
>   SCHED_DEADLINE tasks can use. Note that Umax _must_ be larger than or
>   equal to sum_i runtime_i / period_i
>   (note: in the original GRUB paper, only one CPU is considered, and
>   Umax is set equal to 1)
> - If the tasks are consuming less than Umax, then the scheduling
>   algorithm allows them to use more runtime (but not less than the
>   guaranteed runtime_i) in order to use up to Umax.
> This is achieved by modifying the rule used to decrease the runtime: in
> SCHED_DEADLINE, if a task executes for a time delta, its runtime is
> decreased by delta; using GRUB, it would be decreased by a smaller
> amount of time (computed based on Umax, on the active SCHED_DEADLINE
> tasks, etc...). This requires implementing some kind of state machine
> (the details are in the GRUB paper)
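[Editor's note: the Umax condition in the list above, Umax >= sum_i runtime_i / period_i, amounts to a one-line admission test. A minimal sketch under assumed conventions (fixed-point utilizations, hypothetical names; this is not SCHED_DEADLINE's admission-control code):]

```c
#include <stdbool.h>

#define UNIT (1L << 20)  /* fixed-point scale: UNIT represents utilization 1.0 */

/* Returns true if the task set fits under Umax, i.e. if
 *   sum_i runtime_i / period_i  <=  Umax
 * runtime[] and period[] are in the same time unit; umax is a
 * fixed-point utilization (UNIT == 100% of the CPU). */
bool grub_admit(const long *runtime, const long *period, int n, long umax)
{
    long long u = 0;
    int i;

    for (i = 0; i < n; i++)
        u += (long long)runtime[i] * UNIT / period[i];
    return u <= umax;
}
```

[For example, two tasks reserving 25ms/100ms and 50ms/100ms (total utilization 0.75) would pass with Umax = 0.8 but fail with Umax = 0.7.]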
>
> I also had an implementation of the GRUB algorithm (based on a
> modification of my old CBS scheduler for Linux), but the computational
> complexity of the algorithm was too high. That's why I never proposed to
> merge it in SCHED_DEADLINE. But maybe there can be some trade-off
> between "exact compliance with the GRUB algorithm" and implementation
> efficiency that can make it acceptable...
>
>
Has the code for this implementation been released to some community or not?
Could you tell me where I can get the program from?
I would like to test your solution in our scenario.

Thanks
Yan
>> I think we need to be very careful about whether or not we're talking
>> about EDF as a "CPU BW"-enforcer or a "meet a deadline for a periodic
>> task"-enforcer. Those 2 will react quite differently from having their
>> runtime adjusted on the fly.
> Well, IMHO the nice thing about SCHED_DEADLINE is that it can be both :)
> It depends on how you configure the scheduling parameters.
>
>
> Luca
>


2014-06-17 08:01:36

by Luca Abeni

Subject: Re: [RFD] sched/deadline: EDF dynamic quota design

Hi,

On 06/17/2014 04:43 AM, xiaofeng.yan wrote:
[...]
>> The basic ideas are (warning! This is an over-simplification of the algorithm! :)
>> - You assign runtime and period to each SCHED_DEADLINE task as usual
>> - Each task is guaranteed to receive its runtime every period
>> - You can also define a maximum fraction Umax of the CPU time that the
>> SCHED_DEADLINE tasks can use. Note that Umax _must_ be larger than or
>> equal to sum_i runtime_i / period_i
>> (note: in the original GRUB paper, only one CPU is considered, and Umax is
>> set equal to 1)
>> - If the tasks are consuming less than Umax, then the scheduling algorithm
>> allows them to use more runtime (but not less than the guaranteed runtime_i)
>> in order to use up to Umax.
>> This is achieved by modifying the rule used to decrease the runtime: in
>> SCHED_DEADLINE, if a task executes for a time delta, its runtime is decreased
>> by delta; using GRUB, it would be decreased by a smaller amount of time
>> (computed based on Umax, on the active SCHED_DEADLINE tasks, etc...).
>> This requires implementing some kind of state machine (the details are in
>> the GRUB paper)
>>
>> I also had an implementation of the GRUB algorithm (based on a modification
>> of my old CBS scheduler for Linux), but the computational complexity of the
>> algorithm was too high. That's why I never proposed to merge it in SCHED_DEADLINE.
>> But maybe there can be some trade-off between the "exact compliance with the
>> GRUB algorithm" and implementation efficiency that can make it acceptable...
>>
>>
> Has the code for this implementation been released to some community or not?
The old GRUB scheduler for Linux was used for some experiments published in a paper
at RTLWS 2007, and of course the code was open-source (released under GPL).
It required a patch for the Linux kernel (I used a 2.6.something kernel) which
allowed the scheduler to be loaded as a kernel module (yes, I know this is the
wrong way to go... But implementing it like this was simpler :).
That is very old code... I probably still have it somewhere, but I have to search
for it. If someone is interested, I can try to search for it (the story of the
user-space daemon for adaptive reservations is similar: I released it as
open-source years ago... If anyone is interested I can search for this code too)


Luca

2014-06-18 07:03:24

by xiaofeng.yan

Subject: Re: [RFD] sched/deadline: EDF dynamic quota design

On 2014/6/17 16:01, Luca Abeni wrote:
> Hi,
>
> On 06/17/2014 04:43 AM, xiaofeng.yan wrote:
> [...]
>>> The basic ideas are (warning! This is an over-simplification of the
>>> algorithm! :)
>>> - You assign runtime and period to each SCHED_DEADLINE task as usual
>>> - Each task is guaranteed to receive its runtime every period
>>> - You can also define a maximum fraction Umax of the CPU time that the
>>>   SCHED_DEADLINE tasks can use. Note that Umax _must_ be larger than
>>>   or equal to sum_i runtime_i / period_i
>>>   (note: in the original GRUB paper, only one CPU is considered, and
>>>   Umax is set equal to 1)
>>> - If the tasks are consuming less than Umax, then the scheduling
>>>   algorithm allows them to use more runtime (but not less than the
>>>   guaranteed runtime_i) in order to use up to Umax.
>>> This is achieved by modifying the rule used to decrease the runtime:
>>> in SCHED_DEADLINE, if a task executes for a time delta, its runtime is
>>> decreased by delta; using GRUB, it would be decreased by a smaller
>>> amount of time (computed based on Umax, on the active SCHED_DEADLINE
>>> tasks, etc...). This requires implementing some kind of state machine
>>> (the details are in the GRUB paper)
>>>
>>> I also had an implementation of the GRUB algorithm (based on a
>>> modification of my old CBS scheduler for Linux), but the computational
>>> complexity of the algorithm was too high. That's why I never proposed
>>> to merge it in SCHED_DEADLINE. But maybe there can be some trade-off
>>> between "exact compliance with the GRUB algorithm" and implementation
>>> efficiency that can make it acceptable...
>>>
>>>
>> Has the code for this implementation been released to some community or
>> not?
> The old GRUB scheduler for Linux was used for some experiments published
> in a paper at RTLWS 2007, and of course the code was open-source
> (released under GPL). It required a patch for the Linux kernel (I used a
> 2.6.something kernel) which allowed the scheduler to be loaded as a
> kernel module (yes, I know this is the wrong way to go... But
> implementing it like this was simpler :).
> That is very old code... I probably still have it somewhere, but I have
> to search for it. If someone is interested, I can try to search for it
> (the story of the user-space daemon for adaptive reservations is
> similar: I released it as open-source years ago... If anyone is
> interested I can search for this code too)
>
>
> Luca
>
I'm glad that you replied to this email. Yes, I'm very interested in your
solution. In fact, there are matching scenarios in our product. Could you
send me a link if you have one? I can test your solution in our scenario
if you like.

Thanks
Yan
>

2014-06-19 09:13:48

by Luca Abeni

Subject: Re: [RFD] sched/deadline: EDF dynamic quota design

On 06/18/2014 09:01 AM, xiaofeng.yan wrote:
[...]
>>>> I also had an implementation of the GRUB algorithm (based on a modification
>>>> of my old CBS scheduler for Linux), but the computational complexity of the
>>>> algorithm was too high. That's why I never proposed to merge it in SCHED_DEADLINE.
>>>> But maybe there can be some trade-off between the "exact compliance with the
>>>> GRUB algorithm" and implementation efficiency that can make it acceptable...
>>>>
>>>>
>>> Has the code for this implementation been released to some community or not?
>> The old GRUB scheduler for Linux was used for some experiments published in a paper
>> at RTLWS 2007, and of course the code was open-source (released under GPL).
>> It required a patch for the Linux kernel (I used a 2.6.something kernel) which
>> allowed the scheduler to be loaded as a kernel module (yes, I know this is the wrong way to go...
>> But implementing it like this was simpler :).
>> That is very old code... I probably still have it somewhere, but I have to search
>> for it. If someone is interested, I can try to search (the story of the user-space
>> daemon for adaptive reservations is similar: I released it as open-source years ago...
>> If anyone is interested I can search for this code too)
>>
>>
>> Luca
>>
> I'm glad that you replied to this email. Yes, I'm very interested in your
> solution. In fact, there are matching scenarios in our product. Could you
> send me a link if you have one? I can test your solution in our scenario if you like.
Ok, so I found my old code for the CBS scheduler with GRUB modifications.
You can get it from here: http://disi.unitn.it/~abeni/old-cbs-scheduler.tgz

Please note that:
1) This is old code (for 2.6.x kernels), written before SCHED_DEADLINE
   development was started
2) The scheduler architecture is completely different with respect to the
   current one, but the basic scheduling algorithm implemented by my old
   scheduler is the same one implemented by SCHED_DEADLINE (but I did not
   implement multi-processor support :)
3) You can have a look at the modifications needed to implement GRUB by
   simply grepping for "GRUB" in the source code. Basically, the algorithm
   is implemented by:
   1) Implementing a state machine to keep track of the current state of a
      task (is it using its reserved fraction of CPU time, did it already
      use such a fraction of CPU time, or is it not using any CPU time?).
      This is done by adding a "state" field in "cbs_struct", and properly
      updating it in cbs.c
   2) Keeping track of the total fraction of CPU time used by the active
      tasks. See the "U" variable in cbs.c (in a modern scheduler, it
      should probably become a field in the runqueue structure)
   3) Modifying the rule used to update the runtime. For a "standard" CBS
      without CPU reclaiming (the one implemented by SCHED_DEADLINE), if a
      task executes for an amount of time "delta" its runtime must be
      decreased by delta. For GRUB, it must be decreased by "delta"
      multiplied by U. See "account()" in cbs.c.
      The "trick" is in properly updating U (and this is done using the
      state machine mentioned above)

Summing up, this code is not directly usable, but it shows you what needs to be done in
order to implement the GRUB mechanism for CPU reclaiming in a CBS scheduler...
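[Editor's note: the accounting change in step 3 is small enough to show in isolation. Below is a hedged, self-contained sketch of how the plain CBS rule differs from the GRUB rule; the names and the fixed-point scale are made up for illustration and are not the ones used in cbs.c.]

```c
#define UNIT (1L << 20)  /* fixed-point scale: UNIT represents utilization 1.0 */

/* Plain CBS rule: a task that runs for "delta" consumes "delta" of its
 * runtime. */
long cbs_account(long runtime, long delta)
{
    return runtime - delta;
}

/* GRUB rule: runtime is consumed at rate U (the total utilization of the
 * currently active tasks, tracked by the state machine), so when U < 1
 * budgets drain more slowly and the spare CPU time is reclaimed by
 * whoever is running. With U == 1 this degenerates to the plain rule. */
long grub_account(long runtime, long delta, long u_active)
{
    return runtime - (long)(((long long)delta * u_active) / UNIT);
}
```

[For instance, with U = 0.5 a task charged for 1000ns of execution only loses 500ns of budget, effectively doubling its available runtime.]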



Luca

2014-06-20 02:30:39

by xiaofeng.yan

Subject: Re: [RFD] sched/deadline: EDF dynamic quota design

On 2014/6/19 17:13, Luca Abeni wrote:
> On 06/18/2014 09:01 AM, xiaofeng.yan wrote:
> [...]
>>>>> I also had an implementation of the GRUB algorithm (based on a
>>>>> modification of my old CBS scheduler for Linux), but the
>>>>> computational complexity of the algorithm was too high. That's why I
>>>>> never proposed to merge it in SCHED_DEADLINE. But maybe there can be
>>>>> some trade-off between "exact compliance with the GRUB algorithm"
>>>>> and implementation efficiency that can make it acceptable...
>>>>>
>>>>>
>>>> Has the code for this implementation been released to some community
>>>> or not?
>>> The old GRUB scheduler for Linux was used for some experiments
>>> published in a paper at RTLWS 2007, and of course the code was
>>> open-source (released under GPL). It required a patch for the Linux
>>> kernel (I used a 2.6.something kernel) which allowed the scheduler to
>>> be loaded as a kernel module (yes, I know this is the wrong way to
>>> go... But implementing it like this was simpler :).
>>> That is very old code... I probably still have it somewhere, but I
>>> have to search for it. If someone is interested, I can try to search
>>> for it (the story of the user-space daemon for adaptive reservations
>>> is similar: I released it as open-source years ago... If anyone is
>>> interested I can search for this code too)
>>>
>>>
>>> Luca
>>>
>> I'm glad that you replied to this email. Yes, I'm very interested in
>> your solution. In fact, there are matching scenarios in our product.
>> Could you send me a link if you have one? I can test your solution in
>> our scenario if you like.
> Ok, so I found my old code for the CBS scheduler with GRUB modifications.
> You can get it from here:
> http://disi.unitn.it/~abeni/old-cbs-scheduler.tgz
>
> Please note that:
> 1) This is old code (for 2.6.x kernels), written before SCHED_DEADLINE
>    development was started
> 2) The scheduler architecture is completely different with respect to
>    the current one, but the basic scheduling algorithm implemented by my
>    old scheduler is the same one implemented by SCHED_DEADLINE (but I
>    did not implement multi-processor support :)
> 3) You can have a look at the modifications needed to implement GRUB by
>    simply grepping for "GRUB" in the source code. Basically, the
>    algorithm is implemented by:
>    1) Implementing a state machine to keep track of the current state of
>       a task (is it using its reserved fraction of CPU time, did it
>       already use such a fraction of CPU time, or is it not using any
>       CPU time?). This is done by adding a "state" field in
>       "cbs_struct", and properly updating it in cbs.c
>    2) Keeping track of the total fraction of CPU time used by the active
>       tasks. See the "U" variable in cbs.c (in a modern scheduler, it
>       should probably become a field in the runqueue structure)
>    3) Modifying the rule used to update the runtime. For a "standard"
>       CBS without CPU reclaiming (the one implemented by
>       SCHED_DEADLINE), if a task executes for an amount of time "delta"
>       its runtime must be decreased by delta. For GRUB, it must be
>       decreased by "delta" multiplied by U. See "account()" in cbs.c.
>       The "trick" is in properly updating U (and this is done using the
>       state machine mentioned above)
>
> Summing up, this code is not directly usable, but it shows you what
> needs to be done in order to implement the GRUB mechanism for CPU
> reclaiming in a CBS scheduler...
>
>
Thanks for sharing your solution. I will take a look at it and adapt it
to our scenario later.

Thanks
Yan
>
> Luca
>