hi...
when the rt_runtime budget is exceeded, the kernel silently stops
scheduling RT tasks. there is no way to distinguish this from
a task taking very long to complete.
it would be very nice if the kernel would send some form of notification
when it starts throttling things.
recording the timestamp of the last occurrence of throttling
in a /proc file would be sufficient, if there were no cgroups.
would it be possible to add a read-only property to the cpu subsystem?
there is already a printk_once, but it fires only once, so that's pretty useless :)
from -rt kernel kernel/sched_rt.c:
----------------------------------------------------------------------------------
static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq)
{
	u64 runtime = sched_rt_runtime(rt_rq);

	if (rt_rq->rt_throttled)
		return rt_rq_throttled(rt_rq);

	if (sched_rt_runtime(rt_rq) >= sched_rt_period(rt_rq))
		return 0;

	balance_runtime(rt_rq);
	runtime = sched_rt_runtime(rt_rq);
	if (runtime == RUNTIME_INF)
		return 0;

	if (rt_rq->rt_time > runtime) {
		rt_rq->rt_throttled = 1;
		printk_once(KERN_WARNING "sched: RT throttling activated\n");
+		// send some form of notification.
		if (rt_rq_throttled(rt_rq)) {
			sched_rt_rq_dequeue(rt_rq);
			return 1;
		}
	}

	return 0;
}
-----------------------------------------------------------------------------------
--
torben Hohn
On Thu, 2010-12-23 at 12:39 +0100, torbenh wrote:
> hi...
>
Hi,
> when the rt_runtime budget is exceeded, the kernel silently stops
> scheduling RT tasks. there is no way to distinguish this from
> a task taking very long to complete.
>
Well, this depends on how you do the accounting in your (user-level)
code.
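
A minimal sketch of what such user-level accounting could look like, in
case it helps. It compares elapsed wall-clock time with the CPU time the
thread actually consumed, using the POSIX clocks CLOCK_MONOTONIC and
CLOCK_THREAD_CPUTIME_ID. The struct and function names are made up for
illustration. Note that off-CPU time also includes voluntary blocking
and preemption by higher-priority tasks, so a large value only hints at
throttling, it does not prove it.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper type: one snapshot of wall-clock and thread CPU time. */
struct rt_sample {
	int64_t wall_ns;	/* CLOCK_MONOTONIC reading */
	int64_t cpu_ns;		/* CLOCK_THREAD_CPUTIME_ID reading */
};

static int64_t ts_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/* Take a snapshot for the calling thread; 0 on success, -1 on error. */
static int rt_sample_take(struct rt_sample *s)
{
	struct timespec wall, cpu;

	if (clock_gettime(CLOCK_MONOTONIC, &wall) != 0 ||
	    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &cpu) != 0)
		return -1;

	s->wall_ns = ts_to_ns(&wall);
	s->cpu_ns = ts_to_ns(&cpu);
	return 0;
}

/*
 * Time between two snapshots during which the thread was not running.
 * A job that overran its deadline with little CPU time used but a lot
 * of off-CPU time was held back by something, possibly RT throttling.
 */
static int64_t rt_off_cpu_ns(const struct rt_sample *start,
			     const struct rt_sample *end)
{
	int64_t wall = end->wall_ns - start->wall_ns;
	int64_t cpu = end->cpu_ns - start->cpu_ns;

	return wall > cpu ? wall - cpu : 0;
}
```

The idea is to call rt_sample_take() before and after each job and to
look at rt_off_cpu_ns() whenever the job takes longer than expected.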
> it would be very nice, if the kernel would send some form of notification
> when it starts throttling things.
>
This might be tricky if you mean signals or something similar sent to
the throttled tasks, since you can have one (or more!) runqueue full of
them... Are we signalling them all? Moreover, they'll get the
notification only after resuming, and it's not guaranteed that this
helps in finding out who is "responsible for" the throttling and... By
the way...
> recording the timestamp of the last occurrence of throttling
> in a /proc file would be sufficient, if there were no cgroups.
>
> would it be possible to add a readonly property to the cpu subsystem ?
>
... If you think you're fine with some readable /proc file (and perhaps
a cpuacct one, if cgroups are being used), I can try to come up with
something. I think that the number of times that throttling fired and
the last throttling instant could be exported this way without much
trouble.
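
To make the bookkeeping concrete, here is a rough user-space model, not
a kernel patch. The struct names, the two statistics fields and the
function are hypothetical; none of them exist in the sched_rt.c code
quoted earlier in the thread. The model keeps only the
"rt_time > runtime" check and leaves out runtime balancing, RUNTIME_INF
and priority boosting.

```c
#include <stdint.h>

/* Hypothetical statistics that could be exported per runqueue/cgroup. */
struct rt_throttle_stats {
	uint64_t nr_throttled;		/* times throttling fired */
	uint64_t last_throttled_ns;	/* instant of the last throttling */
};

/* Simplified stand-in for the kernel's struct rt_rq (model only). */
struct rt_rq_model {
	uint64_t rt_time;	/* RT time consumed in this period */
	uint64_t rt_runtime;	/* budget for this period */
	int rt_throttled;
	struct rt_throttle_stats stats;
};

/*
 * Model of the throttling decision: returns 1 if the runqueue is
 * throttled. The statistics are updated only on the transition from
 * "not throttled" to "throttled", so repeated checks during the same
 * throttled period do not inflate the counter.
 */
static int rt_model_runtime_exceeded(struct rt_rq_model *rq, uint64_t now_ns)
{
	if (rq->rt_throttled)
		return 1;

	if (rq->rt_time > rq->rt_runtime) {
		rq->rt_throttled = 1;
		rq->stats.nr_throttled++;
		rq->stats.last_throttled_ns = now_ns;
		return 1;
	}

	return 0;
}
```

Both fields are cheap to update at the point where rt_throttled is set,
which is why exporting them looks feasible.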
Do others have some idea and/or comments about that? This is
ABI/interface, and that really scares me! :-P
Thanks and Regards,
Dario
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
----------------------------------------------------------------------
Dario Faggioli, ReTiS Lab, Scuola Superiore Sant'Anna, Pisa (Italy)
http://retis.sssup.it/people/faggioli -- [email protected]
> ... If you think you're fine with some /proc (and perhaps cpuacct, if
> cgroups are being used) readable, I can try to come up with something.
>
There is no point in putting it in cpuacct since cpuacct can be used
separately from cpu.
> I think that the number of times that throttling fired and the last
> throttling instant could be exported this way without much issues.
>
> Do others have some idea and/or comments about that? This is
> ABI/interface, and that really scares me! :-P
IIRC, your patchset had something like this for getting the
statistics? Starting from there, would schedstats be the right place?
Dhaval
On Thu, 2010-12-23 at 15:04 +0100, Dhaval Giani wrote:
> > ... If you think you're fine with some /proc (and perhaps cpuacct, if
> > cgroups are being used) readable, I can try to come up with something.
> >
>
> There is no point in putting it in cpuacct since cpuacct can be used
> separately from cpu.
>
Which would mean that you'd need both to get such a stat. Anyway, I'm
fine with putting this in 'cpu' as well, just trying to find a consensus
on what the right place is.
> > Do others have some idea and/or comments about that? This is
> > ABI/interface, and that really scares me! :-P
>
> IIRC, your patchset had something like this for getting the
> statistics? Starting from there, would schedstats be the right place?
>
The SCHED_DEADLINE patchset has both signaling capabilities and some
statistics reporting, but it's a different thing.
I think schedstat could be the right place for _this_ thing here, but
since each cgroup could be throttled, we also need something which is
per-cgroup... Don't you agree?
Regards,
Dario
* Dario Faggioli <[email protected]> [2010-12-23 15:44:56]:
> On Thu, 2010-12-23 at 15:04 +0100, Dhaval Giani wrote:
> > > ... If you think you're fine with some /proc (and perhaps cpuacct, if
> > > cgroups are being used) readable, I can try to come up with something.
> > >
> >
> > There is no point in putting it in cpuacct since cpuacct can be used
> > separately from cpu.
> >
> Which would mean that you'd need both for having such stat. Anyway, I'm
> fine with putting this in 'cpu' as well, just trying to find a consensus
> on what the right place is.
>
> > > Do others have some idea and/or comments about that? This is
> > > ABI/interface, and that really scares me! :-P
> >
> > IIRC, your patchset had something like this for getting the
> > statistics? Starting from there, would schedstats be the right place?
> >
> The SCHED_DEADLINE patchset has both signaling capabilities and some
> statistics reporting, but it's a different thing.
>
> I think schedstat could be the right place for _this_ thing here, but
> since each cgroup could be throttled, we also need something which is
> per-cgroup... Don't you agree?
>
You definitely need something per cgroup. Have you looked at the
events framework in cgroups and its implementation in memcgroup?
--
Three Cheers,
Balbir
On Thu, Dec 23, 2010 at 6:47 PM, Dario Faggioli <[email protected]> wrote:
> On Thu, 2010-12-23 at 12:39 +0100, torbenh wrote:
>> hi...
>>
> Hi,
>
>> when the rt_runtime budget is exceeded, the kernel silently stops
>> scheduling RT tasks. there is no way to distinguish this from
>> a task taking very long to complete.
>>
> Well, this depends on how you do the accounting in your (user-level)
> code.
>
>> it would be very nice, if the kernel would send some form of notification
>> when it starts throttling things.
>>
> This might be tricky, if the meaning is signals or something to the
> throttled tasks, since you can have a (or more!) runqueue full of
> them... Are we signalling them all? Moreover, they'll get the
> notification only after resuming, and it's not guaranteed that this
> helps in finding out who is "responsible for" the throttling and... By
> the way...
>
>> recording the timestamp of the last occurrence of throttling
>> in a /proc file would be sufficient, if there were no cgroups.
>>
>> would it be possible to add a readonly property to the cpu subsystem ?
>>
> ... If you think you're fine with some /proc (and perhaps cpuacct, if
> cgroups are being used) readable, I can try to come up with something.
>
> I think that the number of times that throttling fired and the last
> throttling instant could be exported this way without much issues.
>
> Do others have some idea and/or comments about that? This is
> ABI/interface, and that really scares me! :-P
You might want to note that the CFS bandwidth control patches export
such stats via a new per-cgroup file, cpu.stat. It exports 3 statistics:
nr_periods: Number of enforcement intervals that have elapsed.
nr_throttled: Number of times the group has been throttled/limited.
throttled_time: The total time duration (in nanoseconds) for which the group
remained throttled.
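
For anyone wanting to consume these from user space, a small parsing
sketch follows. It assumes the file contains one "name value" pair per
line, which is an assumption about the on-disk format rather than
something stated above; only the three field names come from the list.
The struct and function names are hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* The three statistics listed above. */
struct cpu_stat {
	uint64_t nr_periods;
	uint64_t nr_throttled;
	uint64_t throttled_time;	/* nanoseconds */
};

/*
 * Parse the text of a cpu.stat file, assumed to hold "name value"
 * lines. Unknown names are ignored. Returns the number of recognised
 * fields, i.e. 3 when all statistics were found.
 */
static int cpu_stat_parse(const char *text, struct cpu_stat *out)
{
	const char *line = text;
	int found = 0;

	memset(out, 0, sizeof(*out));

	while (line && *line) {
		char key[32];
		uint64_t val;

		if (sscanf(line, "%31s %" SCNu64, key, &val) == 2) {
			if (strcmp(key, "nr_periods") == 0) {
				out->nr_periods = val;
				found++;
			} else if (strcmp(key, "nr_throttled") == 0) {
				out->nr_throttled = val;
				found++;
			} else if (strcmp(key, "throttled_time") == 0) {
				out->throttled_time = val;
				found++;
			}
		}

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}

	return found;
}
```

A monitoring tool could read the file periodically and report whenever
nr_throttled has grown since the previous reading.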
Regards,
Bharata.
On Mon, 2010-12-27 at 12:13 +0530, Bharata B Rao wrote:
> > Do others have some idea and/or comments about that? This is
> > ABI/interface, and that really scares me! :-P
>
> You might want to note that CFS bandwidth control patches export some
> such stats via a new per-cgroup file cpu.stat. It exports 3 statistics:
>
Yeah, I saw that. What could also be done is to extend schedstats and
then apply it to cgroups as well... I think Dhaval told me he has some
plans about that, or am I wrong?
> nr_periods: Number of enforcement intervals that have elapsed.
> nr_throttled: Number of times the group has been throttled/limited.
> throttled_time: The total time duration (in nanoseconds) for which the group
> remained throttled.
>
More than reasonable reporting, to me at least. :-)
Regards,
Dario