2008-02-14 16:04:48

by Peter Zijlstra

[permalink] [raw]
Subject: [RFC][PATCH 0/2] reworking load_balance_monitor

Hi,

Here are the current patches that rework load_balance_monitor.

The main reason for doing this is to eliminate the wakeups the thing generates,
esp. on an idle system. The bonus is that it removes a kernel thread.

Paul, Gregory - the thing that bothers me most atm is the lack of
rd->load_balance. Should I introduce that (-rt ought to make use of that as
well) by way of copying from the top sched_domain when it gets created?

- peterz


2008-02-14 16:16:31

by Gregory Haskins

[permalink] [raw]
Subject: Re: [RFC][PATCH 0/2] reworking load_balance_monitor

>>> On Thu, Feb 14, 2008 at 10:57 AM, in message
<[email protected]>, Peter Zijlstra <[email protected]>
wrote:
> Hi,
>
> Here the current patches that rework load_balance_monitor.
>
> The main reason for doing this is to eliminate the wakeups the thing
> generates,
> esp. on an idle system. The bonus is that it removes a kernel thread.
>
> Paul, Gregory - the thing that bothers me most atm is the lack of
> rd->load_balance. Should I introduce that (-rt ought to make use of that as
> well) by way of copying from the top sched_domain when it gets created?

With the caveat that I currently have not digested your patch series, this sounds like a reasonable approach. The root-domain effectively represents the top sched-domain anyway (with the additional attribute that it's a structure shared among all constituent CPUs).

I'll try to take a look at the series later today and get back to you with feedback.

-Greg

2008-02-14 18:16:00

by Paul Jackson

[permalink] [raw]
Subject: Re: [RFC][PATCH 0/2] reworking load_balance_monitor

Peter wrote of:
> the lack of rd->load_balance.

Could you explain to me a bit what that means?

Does this mean that the existing code would, by default (default being
a single sched domain, covering the entire system's CPUs) load balance
across the entire system, but with your rework, not so load balance
there? That seems unlikely.

In any event, from my rather cpuset-centric perspective, there are only
two common cases to consider.

1. In the default case, build_sched_domains() gets called once,
at init, with a cpu_map of all non-isolated CPUs, and we should
forever after load balance across all those non-isolated CPUs.

2. In some carefully managed systems using the per-cpuset
'sched_load_balance' flags, we tear down that first default
sched domain, by calling detach_destroy_domains() on it, and we
then setup some number of sched_domains (typically in the range
of two to ten, though I suppose we should design to scale to
hundreds of sched domains, on systems with thousands of CPUs)
by additional calls to build_sched_domains(), such that their
CPUs don't overlap (pairwise disjoint) and such that the union
of all their CPUs may, or may not, include all non-isolated CPUs
(some CPUs might be left 'out in the cold', intentionally, as
essentially additional isolated CPUs.) We would then expect load
balancing within each of these pair-wise disjoint sched domains,
but not between one of them and another.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.940.382.4214

2008-02-14 19:23:18

by Gregory Haskins

[permalink] [raw]
Subject: Re: [RFC][PATCH 0/2] reworking load_balance_monitor

>>> On Thu, Feb 14, 2008 at 1:15 PM, in message
<[email protected]>, Paul Jackson <[email protected]> wrote:
> Peter wrote of:
>> the lack of rd->load_balance.
>
> Could you explain to me a bit what that means?
>
> Does this mean that the existing code would, by default (default being
> a single sched domain, covering the entire system's CPUs) load balance
> across the entire system, but with your rework, not so load balance
> there? That seems unlikely.
>
> In any event, from my rather cpuset-centric perspective, there are only
> two common cases to consider.
>
> 1. In the default case, build_sched_domains() gets called once,
> at init, with a cpu_map of all non-isolated CPUs, and we should
> forever after load balance across all those non-isolated CPUs.
>
> 2. In some carefully managed systems using the per-cpuset
> 'sched_load_balance' flags, we tear down that first default
> sched domain, by calling detach_destroy_domains() on it, and we
> then setup some number of sched_domains (typically in the range
> of two to ten, though I suppose we should design to scale to
> hundreds of sched domains, on systems with thousands of CPUs)
> by additional calls to build_sched_domains(), such that their
> CPUs don't overlap (pairwise disjoint) and such that the union
> of all their CPUs may, or may not, include all non-isolated CPUs
> (some CPUs might be left 'out in the cold', intentionally, as
> essentially additional isolated CPUs.) We would then expect load
> balancing within each of these pair-wise disjoint sched domains,
> but not between one of them and another.


Hi Paul,
I think it will still work as you describe. We create a new root-domain object for each pair-wise disjoint sched-domain. In your case (1) above, we would only have one instance of a root-domain which contains (of course) a single instance of the rd->load_balance object. This would, in fact operate like the global variable that Peter is suggesting it replace (IIUC). However, for case (2), we would instantiate a root-domain object per pairwise-disjoint sched-domain, and therefore each one would have its own instance of rd->load_balance.

HTH
-Greg

2008-02-18 08:25:28

by Dhaval Giani

[permalink] [raw]
Subject: Re: [RFC][PATCH 0/2] reworking load_balance_monitor

On Thu, Feb 14, 2008 at 04:57:24PM +0100, Peter Zijlstra wrote:
> Hi,
>
> Here the current patches that rework load_balance_monitor.
>
> The main reason for doing this is to eliminate the wakeups the thing generates,
> esp. on an idle system. The bonus is that it removes a kernel thread.
>

Hi Peter,

The changes look really good to me. I will give it a run sometime and
send some more feedback.

--
regards,
Dhaval