When do performance testing on 37-rc1 kernel on Core2 machines, we find
the volanomark loopback performance drop about 30%, that due to
commit:fab476228ba37907ad7
Volanomark link: http://www.volano.com/benchmarks.html
Our volanomark testing parameters as following:
"-count 25000 -rooms 10 "
JVM is jrockit-R27.3.1-jre1.5.0_11
java_options is "-Xmx1500m -Xms1500m -Xns750m -XXaggressive -Xlargepages
-XXlazyUnlocking -Xgc:genpar -XXtlasize:min=16k,preferred=64k"
and we set /proc/sys/kernel/sched_compat_yield as "1".
We find if with the following patch, the regression can be recovered.
Signed-off-by:Alex Shi <[email protected]>
diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index f4f6a83..5dca678 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1766,7 +1766,6 @@ static void pull_task(struct rq *src_rq, struct task_struct *p,
check_preempt_curr(this_rq, p, 0);
/* re-arm NEWIDLE balancing when moving tasks */
- src_rq->avg_idle = this_rq->avg_idle = 2*sysctl_sched_migration_cost;
this_rq->idle_stamp = 0;
}
It seems some of load_balance() is not necessary that caused by avg_idle
setting. But do not know more details of the volano running. Anyone like
to give a comments for this issue?
Ncrao, I have no idea of your benchmarks, but just guess removing the
avg_idle setting won't bring much wakeup delay for tasks. Could you like
to show some data of this?
The vmstat output for .36 and .37-rc1 kernel as below:
2.6.36
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
394 0 0 4680728 3168 118972 0 0 0 3 5314 1348711 38 62 0 0 0
396 0 0 4680500 3184 118976 0 0 0 8 5345 1303237 38 62 0 0 0
413 0 0 4680252 3200 118976 0 0 0 3 5082 1326851 38 61 0 0 0
2.6.37-rc1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
329 0 0 4653748 4600 134032 0 0 0 7 11649 918370 35 64 0 0 0
210 0 0 4653492 4608 134032 0 0 0 6 11957 898011 36 64 0 0 0
373 0 0 4653496 4624 134032 0 0 0 4 11736 912468 36 64 0 0 0
Regards
Alex
On 11/16/10, Alex,Shi <[email protected]> wrote:
> When do performance testing on 37-rc1 kernel on Core2 machines, we find
> the volanomark loopback performance drop about 30%, that due to
> commit:fab476228ba37907ad7
>
Was that test was made before and after applying above commit? Would
love to know, how did you find that commit (I mean was it a git
bisection)?
> Volanomark link: http://www.volano.com/benchmarks.html
> Our volanomark testing parameters as following:
> "-count 25000 -rooms 10 "
> JVM is jrockit-R27.3.1-jre1.5.0_11
> java_options is "-Xmx1500m -Xms1500m -Xns750m -XXaggressive -Xlargepages
> -XXlazyUnlocking -Xgc:genpar -XXtlasize:min=16k,preferred=64k"
> and we set /proc/sys/kernel/sched_compat_yield as "1".
>
> We find if with the following patch, the regression can be recovered.
>
What are the VolanoMark test results, after and before applying this patch?
>
> It seems some of load_balance() is not necessary that caused by avg_idle
> setting. But do not know more details of the volano running. Anyone like
> to give a comments for this issue?
>
Does VolanoMark is used for scheduler benchmarking? If I'm not wrong,
I don't think it directly relates to scheduler benchmarking.
> Ncrao, I have no idea of your benchmarks, but just guess removing the
> avg_idle setting won't bring much wakeup delay for tasks. Could you like
> to show some data of this?
>
> The vmstat output for .36 and .37-rc1 kernel as below:
You are showing the output of .36 and .37-rc1. If Ncrao's commit is
guilty for this performance regression, then what are the results of
before and after applied Ncrao's commit. Then, what are the result
after applying your patch. You are showing vmstat output of .36 and
.37-rc1, which really doesn't prove the point of your patch. It needs
to be more clearer.
thanks,
rakib
> Regards
> Alex
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
On Tue, 2010-11-16 at 20:38 +0600, Rakib Mullick wrote:
> Does VolanoMark is used for scheduler benchmarking? If I'm not wrong,
> I don't think it directly relates to scheduler benchmarking.
It's not generally considered to be a wonderful benchmark, but it is a
good indicator, and worth keeping an eye on IMHO.
I don't recall whether that patch works with the idle testcase without
resetting the throttle, or if it's only a bit less effective. If it's
only a little less effective, I'd be inclined to just whack the reset as
Alex did. Whatever is done has to prevent high frequency balancing.
-Mike
On Tue, Nov 16, 2010 at 1:34 AM, Alex,Shi <[email protected]> wrote:
> When do performance testing on 37-rc1 kernel on Core2 machines, we find
> the volanomark loopback performance drop about 30%, that due to
> commit:fab476228ba37907ad7
>
> Volanomark link: http://www.volano.com/benchmarks.html
> Our volanomark testing parameters as following:
> "-count 25000 -rooms 10 "
> JVM is jrockit-R27.3.1-jre1.5.0_11
> java_options is "-Xmx1500m -Xms1500m -Xns750m -XXaggressive -Xlargepages
> -XXlazyUnlocking -Xgc:genpar -XXtlasize:min=16k,preferred=64k"
> and we set /proc/sys/kernel/sched_compat_yield as "1".
>
> We find if with the following patch, the regression can be recovered.
>
> Signed-off-by:Alex Shi <[email protected]>
>
> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> index f4f6a83..5dca678 100644
> --- a/kernel/sched_fair.c
> +++ b/kernel/sched_fair.c
> @@ -1766,7 +1766,6 @@ static void pull_task(struct rq *src_rq, struct task_struct *p,
> check_preempt_curr(this_rq, p, 0);
>
> /* re-arm NEWIDLE balancing when moving tasks */
> - src_rq->avg_idle = this_rq->avg_idle = 2*sysctl_sched_migration_cost;
> this_rq->idle_stamp = 0;
> }
>
>
> It seems some of load_balance() is not necessary that caused by avg_idle
> setting. But do not know more details of the volano running. Anyone like
> to give a comments for this issue?
>
I'm not very familiar with the Volanomark benchmark, but from the
patch you posted it looks like resetting the idle throttle (i.e.
making newidle balance more likely) hurts this benchmark. This is also
evident in the sharp increase in ctx rate in 2.6.37-rc1. Resetting
throttling was a heuristic added to induce more frequent idle
balancing after a migration, and it isn't strictly required to make
the other parts of that patch work. In your patch above, can you
please also remove the comment and idle_stamp reset.
> Ncrao, I have no idea of your benchmarks, but just guess removing the
> avg_idle setting won't bring much wakeup delay for tasks. Could you like
> to show some data of this?
>
The benchmark was pretty simple; run nr_cpu while-1 loops, one of them
niced to -15. I also experimented with some payloads with random
sleep/wakeup times, but nothing with high frequency balancing. I think
removing the reset should be OK for the idle test cases.
-Thanks,
Nikhil
On Tue, Nov 16, 2010 at 7:26 AM, Mike Galbraith <[email protected]> wrote:
> On Tue, 2010-11-16 at 20:38 +0600, Rakib Mullick wrote:
>
>> Does VolanoMark is used for scheduler benchmarking? If I'm not wrong,
>> I don't think it directly relates to scheduler benchmarking.
>
> It's not generally considered to be a wonderful benchmark, but it is a
> good indicator, and worth keeping an eye on IMHO.
>
> I don't recall whether that patch works with the idle testcase without
> resetting the throttle, or if it's only a bit less effective. If it's
> only a little less effective, I'd be inclined to just whack the reset as
> Alex did. Whatever is done has to prevent high frequency balancing.
>
>From what I recall, I think removing the reset makes the original
patch a little less effective. I agree that we can remove the reset if
it hurts high frequency balancing.
-Thanks,
Nikhil
On Tue, 2010-11-16 at 08:31 -0800, Nikhil Rao wrote:
> On Tue, Nov 16, 2010 at 7:26 AM, Mike Galbraith <[email protected]> wrote:
> > On Tue, 2010-11-16 at 20:38 +0600, Rakib Mullick wrote:
> >
> >> Does VolanoMark is used for scheduler benchmarking? If I'm not wrong,
> >> I don't think it directly relates to scheduler benchmarking.
> >
> > It's not generally considered to be a wonderful benchmark, but it is a
> > good indicator, and worth keeping an eye on IMHO.
> >
> > I don't recall whether that patch works with the idle testcase without
> > resetting the throttle, or if it's only a bit less effective. If it's
> > only a little less effective, I'd be inclined to just whack the reset as
> > Alex did. Whatever is done has to prevent high frequency balancing.
> >
>
> >From what I recall, I think removing the reset makes the original
> patch a little less effective. I agree that we can remove the reset if
> it hurts high frequency balancing.
Ok, let's do that. I added your ack, OK?
From: Alex Shi <[email protected]>
Date: Tue, 16 Nov 2010 17:34:02 +0800
sched: volanomark regression fix
Commit fab4762 triggers excessive idle balancing, causing a ~30% loss in
volanomark throughput. Remove idle balancing throttle reset.
Signed-off-by: Mike Galbraith <[email protected]>
Acked-by: Nikhil Rao <[email protected]>
Reported-by: Alex Shi <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
LKML-Reference: <new-submission>
---
kernel/sched_fair.c | 4 ----
1 file changed, 4 deletions(-)
Index: linux-2.6/kernel/sched_fair.c
===================================================================
--- linux-2.6.orig/kernel/sched_fair.c
+++ linux-2.6/kernel/sched_fair.c
@@ -1758,10 +1758,6 @@ static void pull_task(struct rq *src_rq,
set_task_cpu(p, this_cpu);
activate_task(this_rq, p, 0);
check_preempt_curr(this_rq, p, 0);
-
- /* re-arm NEWIDLE balancing when moving tasks */
- src_rq->avg_idle = this_rq->avg_idle = 2*sysctl_sched_migration_cost;
- this_rq->idle_stamp = 0;
}
/*
On Tue, Nov 16, 2010 at 9:32 AM, Mike Galbraith <[email protected]> wrote:
> On Tue, 2010-11-16 at 08:31 -0800, Nikhil Rao wrote:
>> On Tue, Nov 16, 2010 at 7:26 AM, Mike Galbraith <[email protected]> wrote:
>> > On Tue, 2010-11-16 at 20:38 +0600, Rakib Mullick wrote:
>> >
>> >> Does VolanoMark is used for scheduler benchmarking? If I'm not wrong,
>> >> I don't think it directly relates to scheduler benchmarking.
>> >
>> > It's not generally considered to be a wonderful benchmark, but it is a
>> > good indicator, and worth keeping an eye on IMHO.
>> >
>> > I don't recall whether that patch works with the idle testcase without
>> > resetting the throttle, or if it's only a bit less effective. If it's
>> > only a little less effective, I'd be inclined to just whack the reset as
>> > Alex did. Whatever is done has to prevent high frequency balancing.
>> >
>>
>> >From what I recall, I think removing the reset makes the original
>> patch a little less effective. I agree that we can remove the reset if
>> it hurts high frequency balancing.
>
> Ok, let's do that. I added your ack, OK?
>
Yes, thanks for the patch.
> From: Alex Shi <[email protected]>
> Date: Tue, 16 Nov 2010 17:34:02 +0800
>
> sched: volanomark regression fix
>
> Commit fab4762 triggers excessive idle balancing, causing a ~30% loss in
> volanomark throughput. Remove idle balancing throttle reset.
>
> Signed-off-by: Mike Galbraith <[email protected]>
> Acked-by: Nikhil Rao <[email protected]>
> Reported-by: Alex Shi <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> LKML-Reference: <new-submission>
>
> ---
> kernel/sched_fair.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> Index: linux-2.6/kernel/sched_fair.c
> ===================================================================
> --- linux-2.6.orig/kernel/sched_fair.c
> +++ linux-2.6/kernel/sched_fair.c
> @@ -1758,10 +1758,6 @@ static void pull_task(struct rq *src_rq,
> set_task_cpu(p, this_cpu);
> activate_task(this_rq, p, 0);
> check_preempt_curr(this_rq, p, 0);
> -
> - /* re-arm NEWIDLE balancing when moving tasks */
> - src_rq->avg_idle = this_rq->avg_idle = 2*sysctl_sched_migration_cost;
> - this_rq->idle_stamp = 0;
> }
>
> /*
>
>
>
On Tue, 2010-11-16 at 11:27 -0800, Nikhil Rao wrote:
>
> > Ok, let's do that. I added your ack, OK?
> >
>
> Yes, thanks for the patch.
Applied, thanks!
On Tue, 2010-11-16 at 22:38 +0800, Rakib Mullick wrote:
> On 11/16/10, Alex,Shi <[email protected]> wrote:
> > When do performance testing on 37-rc1 kernel on Core2 machines, we find
> > the volanomark loopback performance drop about 30%, that due to
> > commit:fab476228ba37907ad7
> >
> Was that test was made before and after applying above commit? Would
> love to know, how did you find that commit (I mean was it a git
> bisection)?
Yes, git bisect found this commit.
> >
> > It seems some of load_balance() is not necessary that caused by avg_idle
> > setting. But do not know more details of the volano running. Anyone like
> > to give a comments for this issue?
> >
> Does VolanoMark is used for scheduler benchmarking? If I'm not wrong,
> I don't think it directly relates to scheduler benchmarking.
Yes, but lots of benchmarks often find other part kernel issues. like
hackbench/netperf often find VM performance issues. And an our cache
testing tool often find scheduler problem.
>
> > Ncrao, I have no idea of your benchmarks, but just guess removing the
> > avg_idle setting won't bring much wakeup delay for tasks. Could you like
> > to show some data of this?
> >
> > The vmstat output for .36 and .37-rc1 kernel as below:
>
> You are showing the output of .36 and .37-rc1. If Ncrao's commit is
> guilty for this performance regression, then what are the results of
> before and after applied Ncrao's commit. Then, what are the result
> after applying your patch. You are showing vmstat output of .36 and
> .37-rc1, which really doesn't prove the point of your patch. It needs
> to be more clearer.
The vmstat for .36 represents original kernel, .37-rc1 represent with
the Ncrao's patch.
>
>
> thanks,
> rakib
>
>
> > Regards
> > Alex
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at http://www.tux.org/lkml/
> >
> ---
> kernel/sched_fair.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> Index: linux-2.6/kernel/sched_fair.c
> ===================================================================
> --- linux-2.6.orig/kernel/sched_fair.c
> +++ linux-2.6/kernel/sched_fair.c
> @@ -1758,10 +1758,6 @@ static void pull_task(struct rq *src_rq,
> set_task_cpu(p, this_cpu);
> activate_task(this_rq, p, 0);
> check_preempt_curr(this_rq, p, 0);
> -
> - /* re-arm NEWIDLE balancing when moving tasks */
> - src_rq->avg_idle = this_rq->avg_idle = 2*sysctl_sched_migration_cost;
> - this_rq->idle_stamp = 0;
> }
>
> /*
>
>
In the original source (.36 kernel) the rq->idle_stamp is set as zero
after task was pulled to this cpu in load_balance(). Nikhil move this
setting to pull_task(), that has same effect.
I don't know what the details effect of removing idle_stamp setting
instead of recovered it on idle_balance(). :)
My machines are doing rc2 performance testing. I may try this patch
after testing finish.
The following is part of Nikhil's old patch.
===
@@ -3162,10 +3186,8 @@ static void idle_balance(int this_cpu, struct rq
*this_rq)
interval = msecs_to_jiffies(sd->balance_interval);
if (time_after(next_balance, sd->last_balance +
interval))
next_balance = sd->last_balance + interval;
- if (pulled_task) {
- this_rq->idle_stamp = 0;
+ if (pulled_task)
break;
- }
}
Regards
Alex
forgot to add cc-list in previous mail.
On Wed, Nov 17, 2010 at 11:42 AM, Nikhil Rao <[email protected]> wrote:
> On Tue, Nov 16, 2010 at 5:17 PM, Alex,Shi <[email protected]> wrote:
>>
>> In the original source (.36 kernel) the rq->idle_stamp is set as zero
>> after task was pulled to this cpu in load_balance(). Nikhil move this
>> setting to pull_task(), that has same effect.
>> I don't know what the details effect of removing idle_stamp setting
>> instead of recovered it on idle_balance(). :)
>>
>> My machines are doing rc2 performance testing. I may try this patch
>> after testing finish.
>>
>> The following is part of Nikhil's old patch.
>> ===
>> @@ -3162,10 +3186,8 @@ static void idle_balance(int this_cpu, struct rq
>> *this_rq)
>> interval = msecs_to_jiffies(sd->balance_interval);
>> if (time_after(next_balance, sd->last_balance +
>> interval))
>> next_balance = sd->last_balance + interval;
>> - if (pulled_task) {
>> - this_rq->idle_stamp = 0;
>> + if (pulled_task)
>> break;
>> - }
>> }
>>
>> Regards
>> Alex
>>
>
> Ah, should have caught this when reviewing Mike's patch. :-(
>
> Thanks for catching that. We need to reset idle_stamp to 0 or else the avg_idle
> calculations are incorrect. I've attached a patch below that resets idle_stamp
> in the newidle path when we pull.
>
> ---
> sched: volanomark regression fix (part 2)
>
> An earlier commit reverts idle balancing throttling reset to fix a 30%
> regression in volanomark throughput. We still need to reset idle_stamp when we
> pull a task in newidle balance.
>
> Reported-by: Alex Shi <[email protected]>
> Signed-off-by: Nikhil Rao <[email protected]>
> ---
> kernel/sched_fair.c | 4 +++-
> 1 files changed, 3 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> index 83f65dd..e6e7d4b 100644
> --- a/kernel/sched_fair.c
> +++ b/kernel/sched_fair.c
> @@ -3193,8 +3193,10 @@ static void idle_balance(int this_cpu, struct rq *this_rq)
> interval = msecs_to_jiffies(sd->balance_interval);
> if (time_after(next_balance, sd->last_balance + interval))
> next_balance = sd->last_balance + interval;
> - if (pulled_task)
> + if (pulled_task) {
> + this_rq->idle_stamp = 0;
> break;
> + }
> }
>
> raw_spin_lock(&this_rq->lock);
> --
> 1.7.3.1
>
>
Commit-ID: b5482cfa1c95a188b3054fa33274806add91bbe5
Gitweb: http://git.kernel.org/tip/b5482cfa1c95a188b3054fa33274806add91bbe5
Author: Alex Shi <[email protected]>
AuthorDate: Tue, 16 Nov 2010 17:34:02 +0800
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 18 Nov 2010 13:11:43 +0100
sched: Fix volanomark performance regression
Commit fab4762 triggers excessive idle balancing, causing a ~30% loss in
volanomark throughput. Remove idle balancing throttle reset.
Originally-by: Alex Shi <[email protected]>
Signed-off-by: Mike Galbraith <[email protected]>
Acked-by: Nikhil Rao <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/sched_fair.c | 4 ----
1 files changed, 0 insertions(+), 4 deletions(-)
diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index 52ab113..ba0556d 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1758,10 +1758,6 @@ static void pull_task(struct rq *src_rq, struct task_struct *p,
set_task_cpu(p, this_cpu);
activate_task(this_rq, p, 0);
check_preempt_curr(this_rq, p, 0);
-
- /* re-arm NEWIDLE balancing when moving tasks */
- src_rq->avg_idle = this_rq->avg_idle = 2*sysctl_sched_migration_cost;
- this_rq->idle_stamp = 0;
}
/*