2009-09-17 09:30:09

by Yanmin Zhang

Subject: aim7 scalability issue on 4 socket machine

Aim7 results are bad on my new Nehalem machine (4*8*2 logical cpus). Perf
counters show spinlock spinning consumes 70% of cpu time on the machine.
Lock_stat shows anon_vma->lock causes most of the spinlock contention, and the
function tracer shows the call chain below creates the contention.

do_brk => vma_merge => vma_adjust

Aim7 consists of lots of subtests. One test forks lots of processes, and
every process calls sbrk 1000 times to grow/shrink the heap. The heap vmas of
all the sub-processes point to the same anon_vma and so use the same
anon_vma->lock. Every sbrk call goes through do_brk => vma_merge => vma_adjust
and takes anon_vma->lock, which creates the spinlock contention.

There is a comment in front of the spin_lock(&anon_vma->lock) call. It says
the anon_vma lock could be optimized away when only vma->vm_end is changed.
As a matter of fact, anon_vma->lock only needs to protect the list headed by
anon_vma->head when an entry is deleted/inserted or the list is accessed.
There is no such deletion/insertion when only vma->vm_end is changed in
vma_adjust.

The patch below fixes it.

Test results are with kernel 2.6.31-rc8; the improvement on this machine is about 150%.

Signed-off-by: Zhang Yanmin <[email protected]>

---

--- linux-2.6.31-rc8/mm/mmap.c 2009-09-03 10:03:57.000000000 +0800
+++ linux-2.6.31-rc8_aim7/mm/mmap.c 2009-09-17 19:11:20.000000000 +0800
@@ -512,6 +512,7 @@ void vma_adjust(struct vm_area_struct *v
struct anon_vma *anon_vma = NULL;
long adjust_next = 0;
int remove_next = 0;
+ int anon_vma_use_lock;

if (next && !insert) {
if (end >= next->vm_end) {
@@ -568,22 +569,32 @@ again: remove_next = 1 + (end > next->
}
}

- /*
- * When changing only vma->vm_end, we don't really need
- * anon_vma lock: but is that case worth optimizing out?
- */
if (vma->anon_vma)
anon_vma = vma->anon_vma;
+ anon_vma_use_lock = 0;
if (anon_vma) {
- spin_lock(&anon_vma->lock);
/*
- * Easily overlooked: when mprotect shifts the boundary,
- * make sure the expanding vma has anon_vma set if the
- * shrinking vma had, to cover any anon pages imported.
+ * When changing only vma->vm_end, we don't really need
+ * anon_vma lock.
+ * anon_vma->lock is to protect the access to the list
+ * started from anon_vma->head. If we don't remove or
+ * insert a vma to the list, and also don't access
+ * the list, we don't need anon_vma->lock.
*/
- if (importer && !importer->anon_vma) {
- importer->anon_vma = anon_vma;
- __anon_vma_link(importer);
+ if (remove_next ||
+ insert ||
+ (importer && !importer->anon_vma)) {
+ anon_vma_use_lock = 1;
+ spin_lock(&anon_vma->lock);
+ /*
+ * Easily overlooked: when mprotect shifts the boundary,
+ * make sure the expanding vma has anon_vma set if the
+ * shrinking vma had, to cover any anon pages imported.
+ */
+ if (importer && !importer->anon_vma) {
+ importer->anon_vma = anon_vma;
+ __anon_vma_link(importer);
+ }
}
}

@@ -628,7 +639,7 @@ again: remove_next = 1 + (end > next->
__insert_vm_struct(mm, insert);
}

- if (anon_vma)
+ if (anon_vma_use_lock)
spin_unlock(&anon_vma->lock);
if (mapping)
spin_unlock(&mapping->i_mmap_lock);


2009-09-17 09:40:17

by Peter Zijlstra

Subject: Re: aim7 scalability issue on 4 socket machine

On Thu, 2009-09-17 at 17:31 +0800, Zhang, Yanmin wrote:
> Aim7 result is bad on my new Nehalem machines (4*8*2 logical cpu). Perf counter
> shows spinlock consumes 70% cpu time on the machine. Lock_stat shows
> anon_vma->lock causes most of the spinlock contention. Function tracer shows
> below call chain creates the spinlock.
>
> do_brk => vma_merge =>vma_adjust
>
> Aim7 consists of lots of subtests. One test is to fork lots of processes and
> every process calls sbrk for 1000 times to grow/shrink the heap. All the vma of
> the heap of all sub-processes point to the same anon_vma and use the same
> anon_vma->lock. When sbrk is called, kernel calls do_brk => vma_merge =>vma_adjust
> and lock anon_vma->lock to create spinlock contentions.
>
> There is a comment section in front of spin_lock(&anon_vma->lock. It says
> anon_vma lock can be optimized when just changing vma->vm_end. As a matter
> of fact, anon_vma->lock is used to protect anon_vma->list when an entry is
> deleted/inserted or the list is accessed. There is no such deletion/insertion
> if only vma->end is changed in function vma_adjust.
>
> Below patch fixes it.
>
> Test results with kernel 2.6.31-rc8. The improvement on the machine is about 150%.

Did you see Lee's patch?:

http://lkml.org/lkml/2009/9/9/290

Added Lee and Hugh to CC, retained the below patch for them.

> Signed-off-by: Zhang Yanmin <[email protected]>
>
> ---
>
> --- linux-2.6.31-rc8/mm/mmap.c 2009-09-03 10:03:57.000000000 +0800
> +++ linux-2.6.31-rc8_aim7/mm/mmap.c 2009-09-17 19:11:20.000000000 +0800
> @@ -512,6 +512,7 @@ void vma_adjust(struct vm_area_struct *v
> struct anon_vma *anon_vma = NULL;
> long adjust_next = 0;
> int remove_next = 0;
> + int anon_vma_use_lock;
>
> if (next && !insert) {
> if (end >= next->vm_end) {
> @@ -568,22 +569,32 @@ again: remove_next = 1 + (end > next->
> }
> }
>
> - /*
> - * When changing only vma->vm_end, we don't really need
> - * anon_vma lock: but is that case worth optimizing out?
> - */
> if (vma->anon_vma)
> anon_vma = vma->anon_vma;
> + anon_vma_use_lock = 0;
> if (anon_vma) {
> - spin_lock(&anon_vma->lock);
> /*
> - * Easily overlooked: when mprotect shifts the boundary,
> - * make sure the expanding vma has anon_vma set if the
> - * shrinking vma had, to cover any anon pages imported.
> + * When changing only vma->vm_end, we don't really need
> + * anon_vma lock.
> + * anon_vma->lock is to protect the access to the list
> + * started from anon_vma->head. If we don't remove or
> + * insert a vma to the list, and also don't access
> + * the list, we don't need anon_vma->lock.
> */
> - if (importer && !importer->anon_vma) {
> - importer->anon_vma = anon_vma;
> - __anon_vma_link(importer);
> + if (remove_next ||
> + insert ||
> + (importer && !importer->anon_vma)) {
> + anon_vma_use_lock = 1;
> + spin_lock(&anon_vma->lock);
> + /*
> + * Easily overlooked: when mprotect shifts the boundary,
> + * make sure the expanding vma has anon_vma set if the
> + * shrinking vma had, to cover any anon pages imported.
> + */
> + if (importer && !importer->anon_vma) {
> + importer->anon_vma = anon_vma;
> + __anon_vma_link(importer);
> + }
> }
> }
>
> @@ -628,7 +639,7 @@ again: remove_next = 1 + (end > next->
> __insert_vm_struct(mm, insert);
> }
>
> - if (anon_vma)
> + if (anon_vma_use_lock)
> spin_unlock(&anon_vma->lock);
> if (mapping)
> spin_unlock(&mapping->i_mmap_lock);
>
>

2009-09-17 10:35:14

by Ingo Molnar

Subject: Re: aim7 scalability issue on 4 socket machine


* Zhang, Yanmin <[email protected]> wrote:

> Aim7 result is bad on my new Nehalem machines (4*8*2 logical cpu).
> Perf counter shows spinlock consumes 70% cpu time on the machine.
> Lock_stat shows anon_vma->lock causes most of the spinlock contention.
> Function tracer shows below call chain creates the spinlock.
>
> do_brk => vma_merge =>vma_adjust
>
> Aim7 consists of lots of subtests. One test is to fork lots of
> processes and every process calls sbrk for 1000 times to grow/shrink
> the heap. All the vma of the heap of all sub-processes point to the
> same anon_vma and use the same anon_vma->lock. When sbrk is called,
> kernel calls do_brk => vma_merge =>vma_adjust and lock anon_vma->lock
> to create spinlock contentions.
>
> There is a comment section in front of spin_lock(&anon_vma->lock. It
> says anon_vma lock can be optimized when just changing vma->vm_end. As
> a matter of fact, anon_vma->lock is used to protect anon_vma->list
> when an entry is deleted/inserted or the list is accessed. There is no
> such deletion/insertion if only vma->end is changed in function
> vma_adjust.
>
> Below patch fixes it.
>
> Test results with kernel 2.6.31-rc8. The improvement on the machine is
> about 150%.

Impressive speedup!

[ Also, the array of tools you used to debug this is impressive as well
;-) ]

Ingo

2009-09-17 10:36:42

by Hugh Dickins

Subject: Re: aim7 scalability issue on 4 socket machine

On Thu, 17 Sep 2009, Peter Zijlstra wrote:
> On Thu, 2009-09-17 at 17:31 +0800, Zhang, Yanmin wrote:
> > Aim7 result is bad on my new Nehalem machines (4*8*2 logical cpu). Perf counter
> > shows spinlock consumes 70% cpu time on the machine. Lock_stat shows
> > anon_vma->lock causes most of the spinlock contention. Function tracer shows
> > below call chain creates the spinlock.
> >
> > do_brk => vma_merge =>vma_adjust
> >
> > Aim7 consists of lots of subtests. One test is to fork lots of processes and
> > every process calls sbrk for 1000 times to grow/shrink the heap. All the vma of
> > the heap of all sub-processes point to the same anon_vma and use the same
> > anon_vma->lock. When sbrk is called, kernel calls do_brk => vma_merge =>vma_adjust
> > and lock anon_vma->lock to create spinlock contentions.
> >
> > There is a comment section in front of spin_lock(&anon_vma->lock. It says
> > anon_vma lock can be optimized when just changing vma->vm_end. As a matter
> > of fact, anon_vma->lock is used to protect anon_vma->list when an entry is
> > deleted/inserted or the list is accessed. There is no such deletion/insertion
> > if only vma->end is changed in function vma_adjust.
> >
> > Below patch fixes it.
> >
> > Test results with kernel 2.6.31-rc8. The improvement on the machine is about 150%.
>
> Did you see Lee's patch?:
>
> http://lkml.org/lkml/2009/9/9/290
>
> Added Lee and Hugh to CC, retained the below patch for them.

Thanks a lot for the CC, Peter.
See my reply to that mail for the slightly corrected version.

Yes, Yanmin and Lee appear to be fixing exactly the same issue.
I haven't thought through Yanmin's version for correctness, but
it lacks the vm_start check I added to Lee's, and I do prefer
Lee's style - hey, nothing personal!

So, Yanmin, please retest with http://lkml.org/lkml/2009/9/13/25
and let us know if that works as well for you - thanks.

Hugh

2009-09-18 02:01:11

by Yanmin Zhang

Subject: Re: aim7 scalability issue on 4 socket machine

On Thu, 2009-09-17 at 11:35 +0100, Hugh Dickins wrote:
> On Thu, 17 Sep 2009, Peter Zijlstra wrote:
> > On Thu, 2009-09-17 at 17:31 +0800, Zhang, Yanmin wrote:
> > > Aim7 result is bad on my new Nehalem machines (4*8*2 logical cpu). Perf counter
> > > shows spinlock consumes 70% cpu time on the machine. Lock_stat shows
> > > anon_vma->lock causes most of the spinlock contention. Function tracer shows
> > > below call chain creates the spinlock.
> > >
> > > do_brk => vma_merge =>vma_adjust
> > >
> > > Aim7 consists of lots of subtests. One test is to fork lots of processes and
> > > every process calls sbrk for 1000 times to grow/shrink the heap. All the vma of
> > > the heap of all sub-processes point to the same anon_vma and use the same
> > > anon_vma->lock. When sbrk is called, kernel calls do_brk => vma_merge =>vma_adjust
> > > and lock anon_vma->lock to create spinlock contentions.
> > >
> > > There is a comment section in front of spin_lock(&anon_vma->lock. It says
> > > anon_vma lock can be optimized when just changing vma->vm_end. As a matter
> > > of fact, anon_vma->lock is used to protect anon_vma->list when an entry is
> > > deleted/inserted or the list is accessed. There is no such deletion/insertion
> > > if only vma->end is changed in function vma_adjust.
> > >
> > > Below patch fixes it.
> > >
> > > Test results with kernel 2.6.31-rc8. The improvement on the machine is about 150%.
> >
> > Did you see Lee's patch?:
> >
> > http://lkml.org/lkml/2009/9/9/290
> >
> > Added Lee and Hugh to CC, retained the below patch for them.
>
> Thanks a lot for the CC, Peter.
> See my reply to that mail for the slightly corrected version.
>
> Yes, Yanmin and Lee appear to be fixing exactly the same issue.
> I haven't thought through Yanmin's version for correctness, but
> it lacks the vm_start check I added to Lee's, and I do prefer
> Lee's style - hey, nothing personal!
>
> So, Yanmin, please retest with http://lkml.org/lkml/2009/9/13/25
> and let us know if that works as well for you - thanks.
I tested Lee's patch and it does fix the issue.

Yanmin

2009-09-18 03:00:05

by Andrew Morton

Subject: Re: aim7 scalability issue on 4 socket machine

On Fri, 18 Sep 2009 10:02:19 +0800 "Zhang, Yanmin" <[email protected]> wrote:

> > > Did you see Lee's patch?:
> > >
> > > http://lkml.org/lkml/2009/9/9/290
> > >
> > > Added Lee and Hugh to CC, retained the below patch for them.
> >
> > Thanks a lot for the CC, Peter.
> > See my reply to that mail for the slightly corrected version.
> >
> > Yes, Yanmin and Lee appear to be fixing exactly the same issue.
> > I haven't thought through Yanmin's version for correctness, but
> > it lacks the vm_start check I added to Lee's, and I do prefer
> > Lee's style - hey, nothing personal!
> >
> > So, Yanmin, please retest with http://lkml.org/lkml/2009/9/13/25
> > and let us know if that works as well for you - thanks.
> I tested Lee's patch and it does fix the issue.

Do we think we should cook up something for -stable?

Either this is a regression or the workload is particularly obscure.

aim7 is sufficiently non-obscure to make me wonder what's happened here?

2009-09-18 03:16:47

by Yanmin Zhang

Subject: Re: aim7 scalability issue on 4 socket machine

On Thu, 2009-09-17 at 19:59 -0700, Andrew Morton wrote:
> On Fri, 18 Sep 2009 10:02:19 +0800 "Zhang, Yanmin" <[email protected]> wrote:
>
> > > > Did you see Lee's patch?:
> > > >
> > > > http://lkml.org/lkml/2009/9/9/290
> > > >
> > > > Added Lee and Hugh to CC, retained the below patch for them.
> > >
> > > Thanks a lot for the CC, Peter.
> > > See my reply to that mail for the slightly corrected version.
> > >
> > > Yes, Yanmin and Lee appear to be fixing exactly the same issue.
> > > I haven't thought through Yanmin's version for correctness, but
> > > it lacks the vm_start check I added to Lee's, and I do prefer
> > > Lee's style - hey, nothing personal!
> > >
> > > So, Yanmin, please retest with http://lkml.org/lkml/2009/9/13/25
> > > and let us know if that works as well for you - thanks.
> > I tested Lee's patch and it does fix the issue.
>
> Do we think we should cook up something for -stable?
It's better to have a patch for -stable.

>
> Either this is a regression or the workload is particularly obscure.
This issue is not apparent on a dual-socket Nehalem machine (2*4*2 cpu), but
is severe on the large machine (4*8*2 cpu).

>
> aim7 is sufficiently non-obscure to make me wonder what's happened here?
Let me copy the earlier explanation below:

Aim7 consists of lots of subtests. One test is to fork lots of processes and
every process calls sbrk for 1000 times to grow/shrink the heap. All the vma of
the heap of all sub-processes point to the same anon_vma and use the same
anon_vma->lock. When sbrk is called, kernel calls do_brk => vma_merge =>vma_adjust
and lock anon_vma->lock to create spinlock contentions.

2009-09-18 06:54:57

by Hugh Dickins

Subject: Re: aim7 scalability issue on 4 socket machine

On Thu, 17 Sep 2009, Andrew Morton wrote:
> On Fri, 18 Sep 2009 10:02:19 +0800 "Zhang, Yanmin" <[email protected]> wrote:
> > >
> > > So, Yanmin, please retest with http://lkml.org/lkml/2009/9/13/25
> > > and let us know if that works as well for you - thanks.
> > I tested Lee's patch and it does fix the issue.

Thanks for checking and reporting back, Yanmin.

>
> Do we think we should cook up something for -stable?

Gosh, I laughed at Lee (sorry!) for suggesting it for -stable:
is stable really for getting a better number out of a benchmark?

I'd have thought the next release is the right place for that; but
I've no problem if you guys and the stable guys agree it's appropriate.

>
> Either this is a regression or the workload is particularly obscure.

I've not cross-checked descriptions, but assume Lee was actually
testing on exactly the same kind of upcoming Nehalem as Yanmin, and
that machine happens to have characteristics which show up badly here.

>
> aim7 is sufficiently non-obscure to make me wonder what's happened here?

Not a regression, just the onward march of new hardware, I think.
Could easily be other such things in other places with other tests.

Hugh

2009-09-18 07:07:33

by Andrew Morton

Subject: Re: aim7 scalability issue on 4 socket machine

On Fri, 18 Sep 2009 07:53:58 +0100 (BST) Hugh Dickins <[email protected]> wrote:

> On Thu, 17 Sep 2009, Andrew Morton wrote:
> > On Fri, 18 Sep 2009 10:02:19 +0800 "Zhang, Yanmin" <[email protected]> wrote:
> > > >
> > > > So, Yanmin, please retest with http://lkml.org/lkml/2009/9/13/25
> > > > and let us know if that works as well for you - thanks.
> > > I tested Lee's patch and it does fix the issue.
>
> Thanks for checking and reporting back, Yanmin.
>
> >
> > Do we think we should cook up something for -stable?
>
> Gosh, I laughed at Lee (sorry!) for suggesting it for -stable:
> is stable really for getting a better number out of a benchmark?
>
> I'd have thought the next release is the right place for that; but
> I've no problem if you guys and the stable guys agree it's appropriate.
>
> >
> > Either this is a regression or the workload is particularly obscure.
>
> I've not cross-checked descriptions, but assume Lee was actually
> testing on exactly the same kind of upcoming Nehalem as Yanmin, and
> that machine happens to have characteristics which show up badly here.
>
> >
> > aim7 is sufficiently non-obscure to make me wonder what's happened here?
>
> Not a regression, just the onward march of new hardware, I think.
> Could easily be other such things in other places with other tests.
>

Well, it comes down to the question "what is -stable for".

If you take it as "bugfixed version of the 2.6.x kernel" then no,
speedups aren't appropriate.

If you consider -stable to be "something distros, etc will use" then
yes, perhaps we serve those consumers better by including fairly major
efficiency improvements.

I suspect most consumers of -stable would prefer the latter approach,
as long as we don't go nuts.

2009-09-18 07:12:21

by Andi Kleen

Subject: Re: aim7 scalability issue on 4 socket machine

On Fri, Sep 18, 2009 at 07:53:58AM +0100, Hugh Dickins wrote:
> On Thu, 17 Sep 2009, Andrew Morton wrote:
> > On Fri, 18 Sep 2009 10:02:19 +0800 "Zhang, Yanmin" <[email protected]> wrote:
> > > >
> > > > So, Yanmin, please retest with http://lkml.org/lkml/2009/9/13/25
> > > > and let us know if that works as well for you - thanks.
> > > I tested Lee's patch and it does fix the issue.
>
> Thanks for checking and reporting back, Yanmin.
>
> >
> > Do we think we should cook up something for -stable?
>
> Gosh, I laughed at Lee (sorry!) for suggesting it for -stable:
> is stable really for getting a better number out of a benchmark?

When your system is large enough, scalability problems (e.g.
lock contention) can be a serious bug, i.e. when your workload
is 150% slower than expected, that can well be a show stopper.

Admittedly the workload in this case was a benchmark, but it's
not that far fetched to expect the same problem in a real application.

We had a similar problem with the accounting lock some time
ago; I think that patch also went in.

So yes, I think simple, non-intrusive fixes for serious scalability
problems should be -stable candidates.

> > Either this is a regression or the workload is particularly obscure.
>
> I've not cross-checked descriptions, but assume Lee was actually
> testing on exactly the same kind of upcoming Nehalem as Yanmin, and
> that machine happens to have characteristics which show up badly here.

AFAIK Lee usually tests on large IA64 boxes.

> > aim7 is sufficiently non-obscure to make me wonder what's happened here?
>
> Not a regression, just the onward march of new hardware, I think.
> Could easily be other such things in other places with other tests.

Yes, it's just a much larger machine, so old hidden scalability sins now
appear.

-Andi

--
[email protected] -- Speaking for myself only.

2009-09-18 07:29:57

by Hugh Dickins

Subject: Re: aim7 scalability issue on 4 socket machine

On Fri, 18 Sep 2009, Andi Kleen wrote:
>
> So yes I think simple non intrusive fixes for serious scalability
> problems should be stable candidates.

This one is certainly non-intrusive, and you're all in agreement
that it had better go in, so I won't dispute that.

(But I'm a little worried that this easily fixed issue will turn out
to be just the first in a string of not so easily fixed issues.)

Hugh

2009-09-18 13:15:05

by Lee Schermerhorn

Subject: Re: aim7 scalability issue on 4 socket machine

On Fri, 2009-09-18 at 09:12 +0200, Andi Kleen wrote:
> On Fri, Sep 18, 2009 at 07:53:58AM +0100, Hugh Dickins wrote:
> > On Thu, 17 Sep 2009, Andrew Morton wrote:
> > > On Fri, 18 Sep 2009 10:02:19 +0800 "Zhang, Yanmin" <[email protected]> wrote:
> > > > >
> > > > > So, Yanmin, please retest with http://lkml.org/lkml/2009/9/13/25
> > > > > and let us know if that works as well for you - thanks.
> > > > I tested Lee's patch and it does fix the issue.
> >
> > Thanks for checking and reporting back, Yanmin.
> >
> > >
> > > Do we think we should cook up something for -stable?
> >
> > Gosh, I laughed at Lee (sorry!) for suggesting it for -stable:
> > is stable really for getting a better number out of a benchmark?
>
> When your system is large enough scalability problems (e.g.
> lock contention) can be a serious bug. i.e. when your workload
> is 150% slower than expected that can well be a show stopper.
>
> Admittedly the workload in this case was a benchmark, but it's
> not that far fetched to expect the same problem in a real application.
>
> We had a similar problem with the accounting lock some time
> ago, I think that patch also went in.
>
> So yes I think simple non intrusive fixes for serious scalability
> problems should be stable candidates.
>
> > > Either this is a regression or the workload is particularly obscure.
> >
> > I've not cross-checked descriptions, but assume Lee was actually
> > testing on exactly the same kind of upcoming Nehalem as Yanmin, and
> > that machine happens to have characteristics which show up badly here.
>
> AFAIK Lee usually tests on large IA64 boxes.

In this case, it's an x86_64 [DL785] platform--an 8 socket, 4 core
Shanghai in a glueless, "twisted ladder" config.

Lee

2009-09-18 14:33:43

by Andi Kleen

Subject: Re: aim7 scalability issue on 4 socket machine

> >
> > AFAIK Lee usually tests on large IA64 boxes.
>
> In this case, it's an x86_64 [DL785] platform--an 8 socket, 4 core
> Shanghai in a glueless, "twisted ladder" config.

Thanks for the correction. However, it's still quite a different system,
so I think it's clear this happens on multiple different systems.

-Andi

--
[email protected] -- Speaking for myself only.

2009-12-06 20:08:41

by Pavel Machek

Subject: Re: aim7 scalability issue on 4 socket machine

On Fri 2009-09-18 00:05:42, Andrew Morton wrote:
> On Fri, 18 Sep 2009 07:53:58 +0100 (BST) Hugh Dickins <[email protected]> wrote:
>
> > On Thu, 17 Sep 2009, Andrew Morton wrote:
> > > On Fri, 18 Sep 2009 10:02:19 +0800 "Zhang, Yanmin" <[email protected]> wrote:
> > > > >
> > > > > So, Yanmin, please retest with http://lkml.org/lkml/2009/9/13/25
> > > > > and let us know if that works as well for you - thanks.
> > > > I tested Lee's patch and it does fix the issue.
> >
> > Thanks for checking and reporting back, Yanmin.
> >
> > >
> > > Do we think we should cook up something for -stable?
> >
> > Gosh, I laughed at Lee (sorry!) for suggesting it for -stable:
> > is stable really for getting a better number out of a benchmark?
> >
> > I'd have thought the next release is the right place for that; but
> > I've no problem if you guys and the stable guys agree it's appropriate.
> >
> > >
> > > Either this is a regression or the workload is particularly obscure.
> >
> > I've not cross-checked descriptions, but assume Lee was actually
> > testing on exactly the same kind of upcoming Nehalem as Yanmin, and
> > that machine happens to have characteristics which show up badly here.
> >
> > >
> > > aim7 is sufficiently non-obscure to make me wonder what's happened here?
> >
> > Not a regression, just the onward march of new hardware, I think.
> > Could easily be other such things in other places with other tests.
> >
>
> Well, it comes down to the question "what is -stable for".
>
> If you take it as "bugfixed version of the 2.6.x kernel" then no,
> speedups aren't appropriate.
>
> If you consider -stable to be "something distros, etc will use" then
> yes, perhaps we serve those consumers better by including fairly major
> efficiency improvements.

Well, if speedups are ok, then someone should update the stable_rules
file...?

...because I do not think it should be accepted based on what it
currently says:
-------------------

Rules on what kind of patches are accepted, and which ones are not,
into the "-stable" tree:

 - It must be obviously correct and tested.
 - It cannot be bigger than 100 lines, with context.
 - It must fix only one thing.
 - It must fix a real bug that bothers people (not a, "This could be
   a problem..." type thing).
 - It must fix a problem that causes a build error (but not for things
   marked CONFIG_BROKEN), an oops, a hang, data corruption, a real
   security issue, or some "oh, that's not good" issue.  In short,
   something critical.
 - New device IDs and quirks are also accepted.
 - No "theoretical race condition" issues, unless an explanation of
   how the race can be exploited is also provided.
 - It cannot contain any "trivial" fixes in it (spelling changes,
   whitespace cleanups, etc).
 - It must follow the Documentation/SubmittingPatches rules.
 - It or an equivalent fix must already exist in Linus' tree.  Quote
   the respective commit ID in Linus' tree in your patch submission
   to -stable.

Pavel


--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2009-12-06 20:11:32

by Andi Kleen

Subject: Re: aim7 scalability issue on 4 socket machine

> - It must be obviously correct and tested.
> - It cannot be bigger than 100 lines, with context.
> - It must fix only one thing.
> - It must fix a real bug that bothers people (not a, "This could be a
> problem..." type thing).

A significant slowdown in a common situation is a "significant
bug that bothers people".
-Andi

2009-12-06 21:11:11

by Pavel Machek

Subject: Re: aim7 scalability issue on 4 socket machine

On Sun 2009-12-06 21:11:36, Andi Kleen wrote:
> > - It must be obviously correct and tested.
> > - It cannot be bigger than 100 lines, with context.
> > - It must fix only one thing.
> > - It must fix a real bug that bothers people (not a, "This could be a
> > problem..." type thing).
>
> A significant slow down in a common situation is a "significant
> bug that bothers people"

Well, IIRC it was a benchmark that was slowed down, on a huge system,
so it was exactly a "this could be a problem" case.

Anyway, I don't care about the -stable series too much (and sorry for
replying to such an old mail), but perhaps the docs should be updated?

The examples cited there, such as "data corruption" or "oops", are
clearly a different ballpark than "slowdown".

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2009-12-06 22:18:23

by Greg KH

Subject: Re: [stable] aim7 scalability issue on 4 socket machine

On Sun, Dec 06, 2009 at 10:11:05PM +0100, Pavel Machek wrote:
> On Sun 2009-12-06 21:11:36, Andi Kleen wrote:
> > > - It must be obviously correct and tested.
> > > - It cannot be bigger than 100 lines, with context.
> > > - It must fix only one thing.
> > > - It must fix a real bug that bothers people (not a, "This could be a
> > > problem..." type thing).
> >
> > A significant slow down in a common situation is a "significant
> > bug that bothers people"
>
> Well, IIRC it was benchmark that was slowed down, on huge system, so
> it was exactly "this could be a problem".
>
> Anyway, I don't care about -stable series too much (and sorry for
> replying to such an old mail), but perhaps the docs should be updated?
>
> Examples cited there are such as "data corruption" or "oops", which is
> clearly different ballpark then "slowdown".

Why does it really matter, if the intent is the thing that matters here?

Deal with specifics on a case-by-case basis, and if you don't really
care about this, then why dig up a many-month old thread?

strange,

greg k-h

2009-12-06 22:23:42

by Pavel Machek

Subject: Re: [stable] aim7 scalability issue on 4 socket machine


> Deal with specifics on a case-by-case basis, and if you don't really
> care about this, then why dig up a many-month old thread?

Digging it up was a mistake; I was working on the wrong machine. Sorry.
Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html