commit cb83b629b remove the NODE sched domain and check if the node
distance in SLIT table is farther than REMOTE_DISTANCE, if so, it will
lose the load balance chance at exec/fork/wake_affine points.
But actually, even the node distance is farther than REMOTE_DISTANCE,
Modern CPUs also has QPI like connections, that make memory access is
not too slow between nodes. So above losing on NUMA machine make a
huge performance regression on benchmark: hackbench, tbench, netperf
and oltp etc.
This patch will recover the scheduler behavior to old mode on all my
Intel platforms: NHM EP/EX, WSM EP, SNB EP/EP4S, and so remove the
perfromance regressions. (all of them just has 2 kinds distance, 10 21)
Signed-off-by: Alex Shi <[email protected]>
---
kernel/sched/core.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 39eb601..b2ee41a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6286,7 +6286,7 @@ static int sched_domains_curr_level;
static inline int sd_local_flags(int level)
{
- if (sched_domains_numa_distance[level] > REMOTE_DISTANCE)
+ if (sched_domains_numa_distance[level] > RECLAIM_DISTANCE)
return 0;
return SD_BALANCE_EXEC | SD_BALANCE_FORK | SD_WAKE_AFFINE;
--
1.7.5.4
On Wed, 2012-06-06 at 14:52 +0800, Alex Shi wrote:
> - if (sched_domains_numa_distance[level] > REMOTE_DISTANCE)
> + if (sched_domains_numa_distance[level] > RECLAIM_DISTANCE)
I actually considered this.. I just felt a little uneasy re-purposing
the RECLAIM_DISTANCE for this, but I guess its all the same anyway. Both
mean expensive-away-distance.
So I've taken this.
thanks!
Hello.
On 06-06-2012 10:52, Alex Shi wrote:
> commit cb83b629b
Please also specify that commit's summary in parens.
> remove the NODE sched domain and check if the node
> distance in SLIT table is farther than REMOTE_DISTANCE, if so, it will
> lose the load balance chance at exec/fork/wake_affine points.
> But actually, even the node distance is farther than REMOTE_DISTANCE,
> Modern CPUs also has QPI like connections, that make memory access is
"Is" not needed here.
> not too slow between nodes. So above losing on NUMA machine make a
> huge performance regression on benchmark: hackbench, tbench, netperf
> and oltp etc.
> This patch will recover the scheduler behavior to old mode on all my
> Intel platforms: NHM EP/EX, WSM EP, SNB EP/EP4S, and so remove the
> perfromance regressions. (all of them just has 2 kinds distance, 10 21)
> Signed-off-by: Alex Shi<[email protected]>
WBR, Sergei
Commit-ID: 10717dcde10d09f9fcee53a12a4236af1a82b484
Gitweb: http://git.kernel.org/tip/10717dcde10d09f9fcee53a12a4236af1a82b484
Author: Alex Shi <[email protected]>
AuthorDate: Wed, 6 Jun 2012 14:52:51 +0800
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 6 Jun 2012 16:52:25 +0200
sched/numa: Load balance between remote nodes
Commit cb83b629b ("sched/numa: Rewrite the CONFIG_NUMA sched
domain support") removed the NODE sched domain and started checking
if the node distance in SLIT table is farther than REMOTE_DISTANCE,
if so, it will lose the load balance chance at exec/fork/wake_affine
points.
But actually, even the node distance is farther than REMOTE_DISTANCE.
Modern CPUs also has QPI like connections, which ensures that memory
access is not too slow between nodes. So the above change in behavior
on NUMA machine causes a performance regression on various benchmarks:
hackbench, tbench, netperf, oltp, etc.
This patch will recover the scheduler behavior to old mode on all my
Intel platforms: NHM EP/EX, WSM EP, SNB EP/EP4S, and thus fixes the
perfromance regressions. (all of them just have 2 kinds distance, 10, 21)
Signed-off-by: Alex Shi <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/sched/core.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c46958e..6546083 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6321,7 +6321,7 @@ static int sched_domains_curr_level;
static inline int sd_local_flags(int level)
{
- if (sched_domains_numa_distance[level] > REMOTE_DISTANCE)
+ if (sched_domains_numa_distance[level] > RECLAIM_DISTANCE)
return 0;
return SD_BALANCE_EXEC | SD_BALANCE_FORK | SD_WAKE_AFFINE;
On 06/06/2012 05:01 PM, Peter Zijlstra wrote:
> On Wed, 2012-06-06 at 14:52 +0800, Alex Shi wrote:
>> - if (sched_domains_numa_distance[level] > REMOTE_DISTANCE)
>> + if (sched_domains_numa_distance[level] > RECLAIM_DISTANCE)
>
> I actually considered this.. I just felt a little uneasy re-purposing
> the RECLAIM_DISTANCE for this, but I guess its all the same anyway. Both
> mean expensive-away-distance.
>
I understand you, the BIOS guys don't have a good alignment with us on
this.
> So I've taken this.
>
> thanks!