Date: Thu, 25 Mar 2004 22:59:08 +0100
From: Ingo Molnar
To: Andi Kleen
Cc: "Nakajima, Jun", Rick Lindsley, piggin@cyberone.com.au,
	linux-kernel@vger.kernel.org, akpm@osdl.org, kernel@kolivas.org,
	rusty@rustcorp.com.au, anton@samba.org,
	lse-tech@lists.sourceforge.net, mbligh@aracnet.com
Subject: Re: [Lse-tech] [patch] sched-domain cleanups, sched-2.6.5-rc2-mm2-A3
Message-ID: <20040325215908.GA19313@elte.hu>
References: <7F740D512C7C1046AB53446D372001730111990F@scsmsx402.sc.intel.com>
	<20040325154011.GB30175@wotan.suse.de>
In-Reply-To: <20040325154011.GB30175@wotan.suse.de>

* Andi Kleen wrote:

> It doesn't do load balancing in wake_up_forked_process() and is
> relatively non-aggressive in balancing later. This leads to the
> multithreaded OpenMP STREAM running its children first on the same
> node as the original process and allocating memory there. Later they
> run on a different node, once the balancing finally happens, but keep
> generating cross-node traffic to the old node instead of using the
> memory bandwidth of their local nodes.
>
> The difference is very visible: even the 4-thread STREAM only sees
> the bandwidth of a single node. With a more aggressive scheduler you
> get 4 times as much.
>
> Admittedly it's a bit of a stupid benchmark, but it seems to be
> representative of a lot of HPC codes.

There's no way the scheduler can figure out the scheduling and memory
use patterns of the new tasks in advance, but userspace could give
hints - e.g. a syscall that triggers a rebalancing:
sys_sched_load_balance(). This way userspace notifies the scheduler
that it is on 'zero ground' and that the scheduler can move it to the
least loaded CPU/node.

A variant of this is already possible: userspace can use setaffinity to
load-balance manually - but sched_load_balance() would be automatic.

	Ingo
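
For context, the effect Andi describes follows from first-touch page
placement: a page is allocated on the node of the CPU that first writes
it. Below is a minimal STREAM-triad-style sketch (not the actual STREAM
source; array size and names are illustrative, compile with e.g.
gcc -O2 -fopenmp). If all threads are still on the parent's node during
the initialization loop, every page lands there, and the later triad
loop runs at one node's bandwidth even after the threads get balanced
away.

#include <stdio.h>
#include <stdlib.h>

#define N (1 << 22)		/* ~32 MB per array, illustrative size */

int main(void)
{
	double *a = malloc(N * sizeof(double));
	double *b = malloc(N * sizeof(double));
	double *c = malloc(N * sizeof(double));
	if (!a || !b || !c)
		return 1;

	/* First touch: with a first-touch NUMA policy each page is placed
	 * on the node of the CPU that writes it first.  If the scheduler
	 * has not yet spread the threads across nodes at this point, all
	 * pages end up on the parent's node. */
	#pragma omp parallel for
	for (long i = 0; i < N; i++) {
		a[i] = 1.0;
		b[i] = 2.0;
		c[i] = 0.0;
	}

	/* Triad: by now the threads may have been balanced onto other
	 * nodes, but the arrays still live on the original node, so this
	 * loop is limited to that single node's memory bandwidth. */
	#pragma omp parallel for
	for (long i = 0; i < N; i++)
		c[i] = a[i] + 3.0 * b[i];

	printf("c[0] = %f\n", c[0]);
	free(a);
	free(b);
	free(c);
	return 0;
}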
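
The manual variant Ingo mentions is available today via
sched_setaffinity(): the task (or a launcher) places itself before it
first-touches its memory. A minimal sketch follows - the CPU number is
an arbitrary example, and the sys_sched_load_balance() comment refers
to the hypothetical syscall proposed above, which does not exist.

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	cpu_set_t mask;

	/* Manual load balancing: bind this task to CPU 0 before it
	 * allocates and first-touches its memory, so the pages land on
	 * that CPU's node.  A real program would pick a CPU on the node
	 * it actually wants to run on. */
	CPU_ZERO(&mask);
	CPU_SET(0, &mask);
	if (sched_setaffinity(0, sizeof(mask), &mask) == -1) {
		perror("sched_setaffinity");
		return 1;
	}

	/* With the hypothetical sys_sched_load_balance() hint, the two
	 * calls above would become a single "I am on zero ground, move
	 * me to the least loaded CPU/node" notification, with no CPU
	 * number chosen by the application. */

	double *buf = malloc(1 << 20);
	if (!buf)
		return 1;
	buf[0] = 1.0;	/* first touch: this page is allocated locally */
	printf("first-touch done\n");
	free(buf);
	return 0;
}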