2008-03-25 21:59:21

by Pekka Enberg

Subject: [PATCH] slab: fix cache_cache bootstrap in kmem_cache_init()

From: Daniel Yeisley <[email protected]>

Commit 556a169dab38b5100df6f4a45b655dddd3db94c1 ("slab: fix bootstrap on
memoryless node") introduced bootstrap-time cache_cache list3s for all nodes
but forgot that initkmem_list3 needs to be accessed by [somevalue + node]. This
patch fixes list_add() corruption in mm/slab.c seen on the ES7000.

Cc: Mel Gorman <[email protected]>
Cc: Olaf Hering <[email protected]>
Cc: Christoph Lameter <[email protected]>
Signed-off-by: Dan Yeisley <[email protected]>
Signed-off-by: Pekka Enberg <[email protected]>
---
Andrew/Christoph, this needs to go into 2.6.25 and probably 2.6.24.x as
well. Hopefully either Mel or Olaf can test this on their machines to
confirm the fix doesn't break their setup.

mm/slab.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux-2.6/mm/slab.c
===================================================================
--- linux-2.6.orig/mm/slab.c
+++ linux-2.6/mm/slab.c
@@ -1481,7 +1481,7 @@ void __init kmem_cache_init(void)
 	list_add(&cache_cache.next, &cache_chain);
 	cache_cache.colour_off = cache_line_size();
 	cache_cache.array[smp_processor_id()] = &initarray_cache.cache;
-	cache_cache.nodelists[node] = &initkmem_list3[CACHE_CACHE];
+	cache_cache.nodelists[node] = &initkmem_list3[CACHE_CACHE + node];

 	/*
 	 * struct kmem_cache size depends on nr_node_ids, which
@@ -1602,7 +1602,7 @@ void __init kmem_cache_init(void)
 		int nid;

 		for_each_online_node(nid) {
-			init_list(&cache_cache, &initkmem_list3[CACHE_CACHE], nid);
+			init_list(&cache_cache, &initkmem_list3[CACHE_CACHE + nid], nid);

 			init_list(malloc_sizes[INDEX_AC].cs_cachep,
 				  &initkmem_list3[SIZE_AC + nid], nid);
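
The [base + node] indexing that the patch restores can be sketched in a few lines of standalone C. This is an illustrative model rather than the kernel code: MAX_NUMNODES is shrunk to 4, the base values are assumed to follow the CACHE_CACHE / SIZE_AC / SIZE_L3 layout used in mm/slab.c of that era, and list3_slot() and list3_slot_buggy() are hypothetical helpers that exist only for this example.

```c
/* Illustrative model only; the constants below are assumptions. */
#define MAX_NUMNODES	4			/* shrunk for the example */
#define CACHE_CACHE	0			/* base for cache_cache's list3s */
#define SIZE_AC		MAX_NUMNODES		/* base for the array-cache kmalloc cache */
#define SIZE_L3		(2 * MAX_NUMNODES)	/* base for the list3 kmalloc cache */
#define NUM_INIT_LISTS	(3 * MAX_NUMNODES)	/* total bootstrap slots */

/* Hypothetical helper: slot in the bootstrap array for a (base, node) pair. */
int list3_slot(int base, int node)
{
	return base + node;
}

/* Model of the pre-patch indexing for cache_cache: the node is ignored. */
int list3_slot_buggy(int base, int node)
{
	(void)node;
	return base;
}
```

Under these assumptions, list3_slot(CACHE_CACHE, 3) picks slot 3, while list3_slot_buggy(CACHE_CACHE, 3) falls back to slot 0, the slot that belongs to node 0.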


2008-03-26 03:29:23

by Christoph Lameter

Subject: Re: [PATCH] slab: fix cache_cache bootstrap in kmem_cache_init()

On Tue, 25 Mar 2008, Pekka J Enberg wrote:

> Andrew/Christoph, this needs to go into 2.6.25 and probably 2.6.24.x as
> well. Hopefully either Mel or Olaf can test this on their machines to
> confirm the fix doesn't break their setup.

Ok. I see the problem in the amazing artwork of NUMA bootstrap in SLAB
that was made even more complex with the fixes late in 2.6.24.
Now we are up for the da capo on this in 2.6.25. Sigh.

Will merge if I get confirmation that this indeed addresses the issue.

2008-03-26 06:23:23

by Pekka Enberg

Subject: Re: [PATCH] slab: fix cache_cache bootstrap in kmem_cache_init()

Christoph Lameter wrote:
> Will merge if I get confirmation that this indeed addresses the issue.

It fixes the problem reported by Daniel:

http://lkml.org/lkml/2008/3/25/249

But what I worry about is breaking setups that have memoryless nodes.

Pekka

2008-03-26 12:59:47

by Daniel Yeisley

Subject: Re: [PATCH] slab: fix cache_cache bootstrap in kmem_cache_init()

On Wed, 2008-03-26 at 08:21 +0200, Pekka Enberg wrote:
> Christoph Lameter wrote:
> > Will merge if I get confirmation that this indeed addresses the issue.
>
> It fixes the problem reported by Daniel:
>
> http://lkml.org/lkml/2008/3/25/249
>
> But what I worry about is breaking setups that have memoryless nodes.
>
> Pekka

The only way I could get the ES7000 to boot with 2.6.24.x was with
memoryless nodes (or numa=off). The list corruption I saw was due to
CPU 0 not being on node 0.
I do see problems with memoryless nodes on 2.6.25-rc. I'll post a patch
for that shortly.
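
The remark about CPU 0 not being on node 0 can be illustrated with a small standalone model. It assumes, as the commit message of the patch suggests, that every node first gets its own bootstrap slot and that the boot node's pointer is then assigned separately; the names setup_nodelists() and count_shared(), the struct list3 stand-in and the reduced MAX_NUMNODES are all hypothetical and not taken from mm/slab.c.

```c
#define MAX_NUMNODES	4	/* shrunk for the example */
#define CACHE_CACHE	0	/* assumed base index for cache_cache */

struct list3 { int dummy; };	/* stand-in for struct kmem_list3 */

static struct list3 init_list3[MAX_NUMNODES];
static struct list3 *nodelists[MAX_NUMNODES];

/*
 * Give every node its own slot, then assign the boot node's pointer
 * again, either with the patched index (base + node) or with the
 * unpatched one (base only).
 */
void setup_nodelists(int boot_node, int patched)
{
	int n;

	for (n = 0; n < MAX_NUMNODES; n++)
		nodelists[n] = &init_list3[CACHE_CACHE + n];

	nodelists[boot_node] =
		&init_list3[CACHE_CACHE + (patched ? boot_node : 0)];
}

/* Count pairs of nodes whose pointers refer to the same list3. */
int count_shared(void)
{
	int a, b, shared = 0;

	for (a = 0; a < MAX_NUMNODES; a++)
		for (b = a + 1; b < MAX_NUMNODES; b++)
			if (nodelists[a] == nodelists[b])
				shared++;
	return shared;
}
```

In this model, booting on node 0 leaves no shared pointers even without the patch, whereas booting on node 1 without the patch makes nodes 0 and 1 share one list3. That is consistent with the corruption only showing up when CPU 0 is not on node 0.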

Dan

2008-03-26 16:30:26

by Christoph Lameter

Subject: Re: [PATCH] slab: fix cache_cache bootstrap in kmem_cache_init()

On Wed, 26 Mar 2008, Daniel Yeisley wrote:

> The only way I could get the ES7000 to boot with 2.6.24.x was with
> memoryless nodes (or numa=off). The list corruption I saw was due to
> CPU 0 not being on node 0.

What is the ES7000?

> I do see problems with memoryless nodes on 2.6.25-rc. I'll post a patch
> for that shortly.

Another patch in addition to the one we are discussing? Or another
revision of the patch?

2008-03-26 17:24:26

by Daniel Yeisley

Subject: RE: [PATCH] slab: fix cache_cache bootstrap in kmem_cache_init()



> -----Original Message-----
> From: Christoph Lameter [mailto:[email protected]]
> Sent: Wednesday, March 26, 2008 12:28
> To: Yeisley, Dan P.
> Cc: Pekka Enberg; [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: Re: [PATCH] slab: fix cache_cache bootstrap in kmem_cache_init()
>
> On Wed, 26 Mar 2008, Daniel Yeisley wrote:
>
> > The only way I could get the ES7000 to boot with 2.6.24.x was with
> > memoryless nodes (or numa=off). The list corruption I saw was due to
> > CPU 0 not being on node 0.
>
> What is the ES7000?
>

The ES7000 is a server built by Unisys that supports up to 32 CPU
sockets (Xeon or Itanium).

> > I do see problems with memoryless nodes on 2.6.25-rc. I'll post a patch
> > for that shortly.
>
> Another patch in addition to the one we are discussing? Or another
> revision of the patch?

I wrote another patch that affects arch/x86.

http://lkml.org/lkml/2008/3/26/165

CPUs on memoryless nodes aren't being moved to nodes with memory.

Dan