LinuxLists.cc - Re: [PATCH v2 08/11] sched: get CPU's activity statistic

2014-06-03 12:04:00

Subject: Re: [PATCH v2 08/11] sched: get CPU's activity statistic

On Wed, May 28, 2014 at 05:39:10PM +0100, Vincent Guittot wrote:
> On 28 May 2014 17:47, Morten Rasmussen <[email protected]> wrote:
> > On Wed, May 28, 2014 at 02:15:03PM +0100, Vincent Guittot wrote:
> >> On 28 May 2014 14:10, Morten Rasmussen <[email protected]> wrote:
> >> > On Fri, May 23, 2014 at 04:53:02PM +0100, Vincent Guittot wrote:
>
> [snip]
>
> >
> >> This value is linked to the CPU on
> >> which it has run previously because of the time sharing with other
> >> tasks, so the unweighted load of a freshly migrated task will reflect
> >> its load on the previous CPU (with the time sharing with other tasks
> >> on prev CPU).
> >
> > I agree that the task runnable_avg_sum is always affected by the
> > circumstances on the cpu where it is running, and that it takes this
> > history with it. However, I think cfs.runnable_load_avg leads to fewer
> > problems than using the rq runnable_avg_sum. It would work nicely for
> > the two tasks on two cpus example I mentioned earlier. We don't need to add
>
> I would say that nr_running is an even better metric for such a
> situation, as the load doesn't give any additional information.

I fail to understand how nr_running can be used. nr_running doesn't tell
you anything about the utilization of the cpu, just the number of tasks
that happen to be runnable at a point in time on a specific cpu. It
might be two small tasks that just happened to be running while you read
nr_running.

An unweighted version of cfs.runnable_load_avg gives you a metric that
captures cpu utilization to some extent, but not the number of tasks.
And it reflects task migrations immediately unlike the rq
runnable_avg_sum.
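
To illustrate the difference in response time, here is a rough sketch. It is
not kernel code: the function names and the floating-point model are
assumptions made for illustration. It only assumes a geometrically decayed
sum whose decay factor y satisfies y^32 ~= 0.5, with one fully runnable
period contributing 1024.

```c
/* Illustrative model only; names and constants are assumptions, not kernel code. */
#define DECAY_Y        0.97857206  /* per-period decay, chosen so that y^32 ~= 0.5 */
#define PERIOD_CONTRIB 1024.0      /* contribution of one fully runnable period */

/* Advance a geometrically decayed sum by one period. */
static double decay_step(double sum, int runnable)
{
    return sum * DECAY_Y + (runnable ? PERIOD_CONTRIB : 0.0);
}

/* Scale a sum to [0, 1] relative to its saturated value. */
static double as_ratio(double sum)
{
    return sum * (1.0 - DECAY_Y) / PERIOD_CONTRIB;
}

/*
 * rq-level view: a previously idle cpu receives an always-runnable task.
 * The cpu's own sum starts from zero and has to ramp up.
 */
static double rq_ratio_after(int periods)
{
    double sum = 0.0;

    for (int i = 0; i < periods; i++)
        sum = decay_step(sum, 1);
    return as_ratio(sum);
}

/*
 * Per-task view: the destination's total is the sum of the contributions of
 * the tasks on it, so a migrated task's history arrives with it.
 */
static double task_sum_ratio_after_migration(double dst_ratio, double task_ratio)
{
    return dst_ratio + task_ratio;
}
```

In this model rq_ratio_after(32) is still only about 0.5, while the per-task
sum for the same cpu jumps to the task's own ratio at the moment of
migration.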

> Just to point out that we can spend a lot of time listing which use cases
> are better covered by which metrics :-)

Agreed, but I think it is quite important to discuss what we understand
by cpu utilization. It seems to be different depending on what you want
to use it for. I think it is also clear that none of the metrics that
have been proposed are perfect. We therefore have to be careful to only
use metrics in scenarios where they make sense. IMHO, both rq
runnable_avg_sum and unweighted cfs.runnable_load_avg capture cpu
utilization, but in different ways.

We have done experiments internally with rq runnable_avg_sum for
load-balancing decisions in the past and found it unsuitable due to its
slow response to task migrations. That is why I brought it up here.
AFAICT, you use rq runnable_avg_sum more like a flag than a quantitative
measure of cpu utilization. Viewing things from an energy-awareness
point of view I'm more interested in the latter for estimating the
implications of moving tasks around. I don't have any problems with
using rq runnable_avg_sum for other things as long as we are fully aware of
how this metric works.

> > something on top when the cpu is fully utilized by more than one task.
> > It comes more naturally with cfs.runnable_load_avg. If it is much larger
> > than 47742, it should be fairly safe to assume that you shouldn't stick
> > more tasks on that cpu.
> >
> >>
> >> I'm not saying that such a metric is useless, but it's not perfect either.
> >
> > It comes with its own set of problems, agreed. Based on my current
> > understanding (or lack thereof) they just seem smaller :)
>
> I think it's worth using the cpu utilization for some cases because it
> has some information that is not available elsewhere. And the
> replacement of the current capacity computation is one example.
> As explained previously, I'm not against adding other metrics and I'm
> not sure I understand why you oppose these 2 metrics when they
> could be complementary

I think we more or less agree :) I'm fine with both metrics and I agree
that they complement each other. My concern is using the right metric
for the right job. If you choose to use rq runnable_avg_sum you have to
keep its slow reaction time in mind. I think that might be
difficult/not possible for some load-balancing decisions. That is
basically my point :)

Morten

2014-06-03 15:59:56

by Peter Zijlstra

Subject: Re: [PATCH v2 08/11] sched: get CPU's activity statistic

On Tue, Jun 03, 2014 at 01:03:54PM +0100, Morten Rasmussen wrote:
> On Wed, May 28, 2014 at 05:39:10PM +0100, Vincent Guittot wrote:
> > On 28 May 2014 17:47, Morten Rasmussen <[email protected]> wrote:
> > > On Wed, May 28, 2014 at 02:15:03PM +0100, Vincent Guittot wrote:
> > >> On 28 May 2014 14:10, Morten Rasmussen <[email protected]> wrote:
> > >> > On Fri, May 23, 2014 at 04:53:02PM +0100, Vincent Guittot wrote:

> > > I agree that the task runnable_avg_sum is always affected by the
> > > circumstances on the cpu where it is running, and that it takes this
> > > history with it. However, I think cfs.runnable_load_avg leads to fewer
> > > problems than using the rq runnable_avg_sum. It would work nicely for
> > > the two tasks on two cpus example I mentioned earlier. We don't need to add
> >
> > I would say that nr_running is an even better metric for such a
> > situation, as the load doesn't give any additional information.
>
> I fail to understand how nr_running can be used. nr_running doesn't tell
> you anything about the utilization of the cpu, just the number of tasks
> that happen to be runnable at a point in time on a specific cpu. It
> might be two small tasks that just happened to be running while you read
> nr_running.

Agreed, I'm not at all seeing how nr_running is useful here.

> An unweighted version of cfs.runnable_load_avg gives you a metric that
> captures cpu utilization to some extent, but not the number of tasks.
> And it reflects task migrations immediately unlike the rq
> runnable_avg_sum.

So runnable_avg would be equal to the utilization as long as
there's idle time; as soon as we're over-loaded the metric shows how
much extra cpu is required.

That is, runnable_avg - running_avg >= 0 and the amount is the
exact amount of extra cpu required to make all tasks run but not have
idle time.
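
A minimal sketch of that relation, for the simplest case only: nr_tasks
always-runnable tasks sharing one cpu. The struct and function names are
hypothetical, and plain averages over a window of ticks stand in for the
decayed averages, so this is an illustration of the idea, not of the actual
accounting.

```c
/* Illustrative model only; names are hypothetical, not kernel code. */
struct cpu_stats {
    double runnable;  /* summed fraction of time tasks were runnable */
    double running;   /* summed fraction of time tasks actually ran  */
};

/*
 * One cpu, nr_tasks tasks that never sleep. In every tick all tasks are
 * runnable (running or waiting), but at most one of them is running.
 */
static struct cpu_stats simulate_busy_tasks(int nr_tasks, int ticks)
{
    struct cpu_stats s = { 0.0, 0.0 };
    long runnable_ticks = 0, running_ticks = 0;

    if (ticks <= 0 || nr_tasks < 0)
        return s;

    for (int t = 0; t < ticks; t++) {
        runnable_ticks += nr_tasks;
        if (nr_tasks > 0)
            running_ticks += 1;
    }
    s.runnable = (double)runnable_ticks / ticks;
    s.running = (double)running_ticks / ticks;
    return s;
}

/* Extra cpu (in units of one cpu) needed to run everything without waiting. */
static double extra_cpu_needed(struct cpu_stats s)
{
    return s.runnable - s.running;
}
```

With one task the difference is zero; with three always-runnable tasks it is
two cpus' worth. Tasks that are not always runnable are outside this simple
model.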

> Agreed, but I think it is quite important to discuss what we understand
> by cpu utilization. It seems to be different depending on what you want
> to use it for.

I understand utilization to be however much cpu is actually used, so I
would, per the existing naming, call running_avg the avg
utilization of a task/group/cpu, whatever.

> We have done experiments internally with rq runnable_avg_sum for
> load-balancing decisions in the past and found it unsuitable due to its
> slow response to task migrations. That is why I brought it up here.

So I'm not entirely seeing that from the code (I've not traced this),
afaict we actually update the per-cpu values on migration based on the
task values.

old_rq->sum -= p->val;
new_rq->sum += p->val;

like,.. except of course totally obscured.
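
Spelled out as a self-contained sketch, the pseudo-code above could look
like this. The types and the function name are hypothetical stand-ins, not
the kernel's own structures; the point is only that the aggregate follows the
task at the moment it moves.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the per-cpu aggregate and the per-task value. */
struct rq_model {
    long sum;
};

struct task_model {
    long val;
    struct rq_model *rq;  /* runqueue the task is currently accounted on */
};

/* Move the task's contribution from its current rq to new_rq. */
static void migrate_task(struct task_model *p, struct rq_model *new_rq)
{
    struct rq_model *old_rq = p->rq;

    if (new_rq == NULL || old_rq == new_rq)
        return;

    if (old_rq != NULL)
        old_rq->sum -= p->val;
    new_rq->sum += p->val;
    p->rq = new_rq;
}
```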

2014-06-03 17:41:32

by Morten Rasmussen

Subject: Re: [PATCH v2 08/11] sched: get CPU's activity statistic

On Tue, Jun 03, 2014 at 04:59:39PM +0100, Peter Zijlstra wrote:
> On Tue, Jun 03, 2014 at 01:03:54PM +0100, Morten Rasmussen wrote:
> > An unweighted version of cfs.runnable_load_avg gives you a metric that
> > captures cpu utilization to some extent, but not the number of tasks.
> > And it reflects task migrations immediately unlike the rq
> > runnable_avg_sum.
>
> So runnable_avg would be equal to the utilization as long as
> there's idle time; as soon as we're over-loaded the metric shows how
> much extra cpu is required.
>
> That is, runnable_avg - running_avg >= 0 and the amount is the
> exact amount of extra cpu required to make all tasks run but not have
> idle time.

Yes, roughly. runnable_avg goes up quite steeply if you have many tasks
on a fully utilized cpu, so the actual amount of extra cpu required
might be somewhat lower. I can't come up with something better, so I
agree.

>
> > Agreed, but I think it is quite important to discuss what we understand
> > by cpu utilization. It seems to be different depending on what you want
> > to use it for.
>
> I understand utilization to be however much cpu is actually used, so I
> would, per the existing naming, call running_avg the avg
> utilization of a task/group/cpu, whatever.

I see your point, but for load balancing purposes we are more interested
in the runnable_avg as it tells us about the cpu capacity requirements.
I don't like to throw more terms into the mix, but you could call
runnable_avg the potential task/group/cpu utilization. This is an
estimate of how much utilization a task would cause if we moved it to an
idle cpu. That might be quite different from running_avg on an
over-utilized cpu.

>
> > We have done experiments internally with rq runnable_avg_sum for
> > load-balancing decisions in the past and found it unsuitable due to its
> > slow response to task migrations. That is why I brought it up here.
>
> So I'm not entirely seeing that from the code (I've not traced this),
> afaict we actually update the per-cpu values on migration based on the
> task values.
>
> old_rq->sum -= p->val;
> new_rq->sum += p->val;
>
> like,.. except of course totally obscured.

Yes, for cfs.runnable_load_avg; rq->avg.runnable_avg_sum is different.
See the other reply.