2007-06-11 14:19:42

by Haavard Skinnemoen

Subject: kernel BUG at mm/slub.c:3689!

While trying to get SLUB debugging to not break DMA on AVR32, I ran
into this:

(...)
adding slab: name=kmalloc-64 size=64 objsize=64
adding slab: name=kmalloc-32 size=32 objsize=32
adding slab: name=kmalloc-16 size=32 objsize=16
kobject_add failed for :0000032 with -EEXIST, don't try to register things with the same name in the same directory.
Call trace:
[<9000f14e>] dump_stack+0x1a/0x24
[<90093144>] kobject_shadow_add+0xc4/0xe8
[<90093170>] kobject_add+0x8/0xc
[<9003ded8>] sysfs_slab_add+0xc8/0x10c
[<900046c6>] slab_sysfs_init+0x26/0x78
[<900003f0>] kernel_init+0x74/0x1c0
[<90015cd0>] do_exit+0x0/0x52c

------------[ cut here ]------------
kernel BUG at /home/hskinnemoen/git/linux/mm/slub.c:3689!
Oops: Kernel BUG, sig: 9 [#1]
FRAME_POINTER chip: 0x01f:0x1e82 rev 0
Modules linked in:
PC is at slab_sysfs_init+0x28/0x78
LR is at 0x9014e654
pc : [<900046c8>] lr : [<9014e654>] Not tainted
sp : 901cbf98 r12: ffffffef r11: 00000001
r10: 00000001 r9 : 00000000 r8 : 9014e654
r7 : 901cbf98 r6 : 9014a2b4 r5 : 00000000 r4 : 00000000
r3 : 00000000 r2 : 90015cd0 r1 : 9000b4d0 r0 : 00000000
Flags: qvNzc
Mode bits: hrje....g
CPU Mode: Supervisor
Process: swapper [1] (task: 901c9000 thread: 901ca000)
Stack: (0x901cbf98 to 0x901cc000)
bf80: 900003f0 901cbfdc
bfa0: 90009b20 00000000 00000000 00000000 00000000 00000000 00400000 9000ebd0
bfc0: 9000ebd0 901cc000 00000000 00000000 00000000 00000000 00000000 90015cd0
bfe0: 00000000 00000000 00000000 00000000 00000000 90015cd0 9000037c 00000000
Call trace:
[<900003f0>] kernel_init+0x74/0x1c0
[<90015cd0>] do_exit+0x0/0x52c

Kernel panic - not syncing: Attempted to kill init!


The "adding slab" messages were added by me; see the patch below.

I think the combination that triggered this bug was:
* CONFIG_SLUB_DEBUG=y
* ARCH_KMALLOC_MINALIGN=32
* slub_debug not set at the command line

If I disable CONFIG_SLUB_DEBUG or add slub_debug to the command line,
everything is fine.

Please let me know if you need any additional information.

Haavard

diff --git a/include/asm-avr32/cache.h b/include/asm-avr32/cache.h
index dabb955..d3cf35a 100644
--- a/include/asm-avr32/cache.h
+++ b/include/asm-avr32/cache.h
@@ -4,6 +4,15 @@
#define L1_CACHE_SHIFT 5
#define L1_CACHE_BYTES (1 << L1_CACHE_SHIFT)

+/*
+ * Memory returned by kmalloc() may be used for DMA, so we must make
+ * sure that all such allocations are cache aligned. Otherwise,
+ * unrelated code may cause parts of the buffer to be read into the
+ * cache before the transfer is done, causing old data to be seen by
+ * the CPU.
+ */
+#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
+
#ifndef __ASSEMBLER__
struct cache_info {
unsigned int ways;
diff --git a/mm/slub.c b/mm/slub.c
index c9ab688..6c45b4c 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3587,6 +3587,9 @@ static int sysfs_slab_add(struct kmem_cache *s)
/* Defer until later */
return 0;

+ printk(KERN_NOTICE "adding slab: name=%s size=%d objsize=%d\n",
+ s->name, s->size, s->objsize);
+
unmergeable = slab_unmergeable(s);
if (unmergeable) {
/*


2007-06-11 15:39:49

by Paul Mundt

Subject: Re: kernel BUG at mm/slub.c:3689!

On Mon, Jun 11, 2007 at 04:19:26PM +0200, Haavard Skinnemoen wrote:
> I think the combination that triggered this bug was:
> * CONFIG_SLUB_DEBUG=y
> * ARCH_KMALLOC_MINALIGN=32
> * slub_debug not set at the command line
>
ARCH_KMALLOC_MINALIGN and ARCH_SLAB_MINALIGN are both in bytes
(ARCH_KMALLOC_MINALIGN used to be defined as BYTES_PER_WORD) if I recall
correctly, are you sure this is what you want?

2007-06-11 15:58:30

by Håvard Skinnemoen

Subject: Re: kernel BUG at mm/slub.c:3689!

On 6/11/07, Paul Mundt <[email protected]> wrote:
> On Mon, Jun 11, 2007 at 04:19:26PM +0200, Haavard Skinnemoen wrote:
> > I think the combination that triggered this bug was:
> > * CONFIG_SLUB_DEBUG=y
> > * ARCH_KMALLOC_MINALIGN=32
> > * slub_debug not set at the command line
> >
> ARCH_KMALLOC_MINALIGN and ARCH_SLAB_MINALIGN are both in bytes
> (ARCH_KMALLOC_MINALIGN used to be defined as BYTES_PER_WORD) if I recall
> correctly, are you sure this is what you want?

Yeah, I actually defined it as L1_CACHE_BYTES. The idea is to avoid
cacheline sharing between different kmalloc() allocations, which may
mess up DMA transfers. So 32-byte alignment is indeed what I want.

Haavard

2007-06-11 16:19:20

by Christoph Lameter

Subject: Re: kernel BUG at mm/slub.c:3689!

On Mon, 11 Jun 2007, Haavard Skinnemoen wrote:

> While trying to get SLUB debugging to not break DMA on AVR32, I ran
> into this:

This is a known bug in 2.6.22-rc2/rc3. Which version are you running?

2007-06-11 16:43:49

by Håvard Skinnemoen

Subject: Re: kernel BUG at mm/slub.c:3689!

On 6/11/07, Christoph Lameter <[email protected]> wrote:
> On Mon, 11 Jun 2007, Haavard Skinnemoen wrote:
>
> > While trying to get SLUB debugging to not break DMA on AVR32, I ran
> > into this:
>
> This is a known bug in 2.6.22-rc2/rc3. Which version are you running?

2.6.22-rc4. I did a pull from Linus' tree a few hours ago.

Haavard

2007-06-11 16:49:52

by Christoph Lameter

Subject: Re: kernel BUG at mm/slub.c:3689!

On Mon, 11 Jun 2007, Håvard Skinnemoen wrote:

> On 6/11/07, Christoph Lameter <[email protected]> wrote:
> > On Mon, 11 Jun 2007, Haavard Skinnemoen wrote:
> >
> > > While trying to get SLUB debugging to not break DMA on AVR32, I ran
> > > into this:
> >
> > This is a known bug in 2.6.22-rc2/rc3. Which version are you running?
>
> 2.6.22-rc4. I did a pull from Linus' tree a few hours ago.

Ahhh... I see it's the same phenomenon as before but triggered by
a different cause.

If you set the align to 32 then the first kmalloc slabs of size

8
16
32

are all of the same size which leads to duplicate files in sysfs.

Does this patch fix it?

---
mm/slub.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c 2007-06-11 09:48:04.000000000 -0700
+++ linux-2.6/mm/slub.c 2007-06-11 09:48:17.000000000 -0700
@@ -3579,7 +3579,7 @@ static char *create_unique_id(struct kme
*p++ = 'F';
if (p != name + 1)
*p++ = '-';
- p += sprintf(p, "%07d", s->size);
+ p += sprintf(p, "%07d", s->objsize);
BUG_ON(p > name + ID_STR_LENGTH - 1);
return name;
}

2007-06-11 17:11:28

by Håvard Skinnemoen

Subject: Re: kernel BUG at mm/slub.c:3689!

On 6/11/07, Christoph Lameter <[email protected]> wrote:
> Ahhh... I see it's the same phenomenon as before but triggered by
> a different cause.
>
> If you set the align to 32 then the first kmalloc slabs of size
>
> 8
> 16
> 32
>
> are all of the same size which leads to duplicate files in sysfs.

Yes, that seems to be the problem.

> Does this patch fix it?

Unfortunately, no. But I get a different error message; see below...

Also unfortunately, I won't be able to do any more testing until I get
back to work tomorrow -- I'm currently at home and I can't reset my
board remotely (Doh! Why didn't I set panic_timeout...)

Haavard

kobject_add failed for :0000016 with -EEXIST, don't try to register
things with the same name in the same directory.
Call trace:
[<9000f14e>] dump_stack+0x1a/0x24
[<90093128>] kobject_shadow_add+0xc4/0xe8
[<90093154>] kobject_add+0x8/0xc
[<9003dec4>] sysfs_slab_add+0xb0/0xf0
[<900046c6>] slab_sysfs_init+0x26/0x78
[<900003f0>] kernel_init+0x74/0x1c0
[<90015cd4>] do_exit+0x0/0x52c

------------[ cut here ]------------
kernel BUG at /home/hskinnemoen/git/linux/mm/slub.c:3686!
Oops: Kernel BUG, sig: 9 [#1]
FRAME_POINTER chip: 0x01f:0x1e82 rev 0
Modules linked in:
PC is at slab_sysfs_init+0x28/0x78
LR is at 0x9014e654
pc : [<900046c8>] lr : [<9014e654>] Not tainted
sp : 901cbf98 r12: ffffffef r11: 00000001
r10: 00000001 r9 : 00000000 r8 : 9014e654
r7 : 901cbf98 r6 : 9014a2b4 r5 : 00000000 r4 : 00000000
r3 : 00000000 r2 : 90015cd4 r1 : 9000b4d0 r0 : 00000000
Flags: qvNzc
Mode bits: hrje....g
CPU Mode: Supervisor
Process: swapper [1] (task: 901c9000 thread: 901ca000)
Stack: (0x901cbf98 to 0x901cc000)
bf80: 900003f0 901cbfdc
bfa0: 90009b20 00000000 00000000 00000000 00000000 00000000 00400000 9000ebd0
bfc0: 9000ebd0 901cc000 00000000 00000000 00000000 00000000 00000000 90015cd4
bfe0: 00000000 00000000 00000000 00000000 00000000 90015cd4 9000037c 00000000
Call trace:
[<900003f0>] kernel_init+0x74/0x1c0
[<90015cd4>] do_exit+0x0/0x52c

Kernel panic - not syncing: Attempted to kill init!

2007-06-11 17:52:40

by Christoph Lameter

Subject: Re: kernel BUG at mm/slub.c:3689!

On Mon, 11 Jun 2007, Håvard Skinnemoen wrote:

> On 6/11/07, Christoph Lameter <[email protected]> wrote:
> > Ahhh... I see it's the same phenomenon as before but triggered by
> > a different cause.
> >
> > If you set the align to 32 then the first kmalloc slabs of size
> >
> > 8
> > 16
> > 32
> >
> > are all of the same size which leads to duplicate files in sysfs.
>
> Yes, that seems to be the problem.
>
> > Does this patch fix it?
>
> Unfortunately, no. But I get a different error message; see below...

Hmmmm.. Yeah slabs with different user object sizes may coexist.

Argh!

Ok. Drop the patch and use this one instead. This one avoids the use
of smaller slabs that cause the conflict. The first slab will be sized 32
bytes instead of 8.

Note that I do not get why you would be aligning the objects to 32 bytes.
Increasing the smallest cache size wastes a lot of memory. And it is
usually advantageous if multiple related objects are in the same cacheline
unless you have heavy SMP contention.

---
include/linux/slub_def.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

Index: linux-2.6/include/linux/slub_def.h
===================================================================
--- linux-2.6.orig/include/linux/slub_def.h 2007-06-11 10:50:09.000000000 -0700
+++ linux-2.6/include/linux/slub_def.h 2007-06-11 10:50:58.000000000 -0700
@@ -56,7 +56,8 @@ struct kmem_cache {
/*
* Kmalloc subsystem.
*/
-#define KMALLOC_SHIFT_LOW 3
+#define KMALLOC_SHIFT_LOW 5
+

/*
* We keep the general caches in an array of slab caches that are used for
@@ -76,6 +77,9 @@ static inline int kmalloc_index(size_t s
if (size > KMALLOC_MAX_SIZE)
return -1;

+ if (size <= (1 << KMALLOC_SHIFT_LOW))
+ return KMALLOC_SHIFT_LOW;
+
if (size > 64 && size <= 96)
return 1;
if (size > 128 && size <= 192)

2007-06-11 18:22:17

by Håvard Skinnemoen

Subject: Re: kernel BUG at mm/slub.c:3689!

On 6/11/07, Christoph Lameter <[email protected]> wrote:
> Ok. Drop the patch and use this one instead. This one avoids the use
> of smaller slabs that cause the conflict. The first slab will be sized 32
> bytes instead of 8.

Thanks, I'll test it tomorrow.

> Note that I do not get why you would be aligning the objects to 32 bytes.
> Increasing the smallest cache size wastes a lot of memory. And it is
> usually advantageous if multiple related objects are in the same cacheline
> unless you have heavy SMP contention.

It's not about performance at all, it's about DMA buffers allocated
using kmalloc() getting corrupted. Imagine this:

1. A SPI protocol driver allocates a buffer using kmalloc()
2. SPI master driver receives a request and flushes all cachelines
touched by the buffer (using dma_map_single()) before handing it to
the DMA controller.
3. While the transfer is in progress, something else comes along and
reads something from a different buffer which happens to share a
cacheline with the buffer currently being used for DMA.
4. When the transfer is complete, the protocol driver will see stale
data because a part of the buffer was fetched by the dcache before the
received data was stored in RAM by the DMA controller.

Maybe there are other solutions to this problem, but the old SLAB
allocator did guarantee 32-byte alignment as long as SLAB debugging
was turned off, so setting ARCH_KMALLOC_MINALIGN seemed like the
easiest way to get back to the old, known-working behaviour.

It could be that I've underestimated the AVR32 AP cache though; I
think it can do partial writeback of cachelines, so it could be a
solution to writeback+invalidate the parts of the buffer that may have
shared cachelines from dma_unmap_*() and dma_sync_*_for_cpu(). It will
probably hit a few false positives since it might not always see the
whole buffer, but perhaps it's worth it.

Haavard

2007-06-11 18:32:42

by Christoph Lameter

Subject: Re: kernel BUG at mm/slub.c:3689!

On Mon, 11 Jun 2007, Håvard Skinnemoen wrote:

> > Note that I do not get why you would be aligning the objects to 32 bytes.
> > Increasing the smallest cache size wastes a lot of memory. And it is
> > usually advantageous if multiple related objects are in the same cacheline
> > unless you have heavy SMP contention.
>
> It's not about performance at all, it's about DMA buffers allocated
> using kmalloc() getting corrupted. Imagine this:

Uhhh... How about using a separate slab for the DMA buffers?

> Maybe there are other solutions to this problem, but the old SLAB
> allocator did guarantee 32-byte alignment as long as SLAB debugging
> was turned off, so setting ARCH_KMALLOC_MINALIGN seemed like the
> easiest way to get back to the old, known-working behaviour.

SLAB's minimum object size is 32, thus you had no problems. I see. SLAB
does not guarantee 32-byte alignment. It just happened to work. If you
switch on CONFIG_SLAB_DEBUG you will likely get into trouble.

So I'd suggest to set up a special slab for your DMA buffers.

2007-06-11 19:04:41

by Håvard Skinnemoen

Subject: Re: kernel BUG at mm/slub.c:3689!

On 6/11/07, Christoph Lameter <[email protected]> wrote:
On Mon, 11 Jun 2007, Håvard Skinnemoen wrote:
>
> > > Note that I do not get why you would be aligning the objects to 32 bytes.
> > > Increasing the smallest cache size wastes a lot of memory. And it is
> > > usually advantageous if multiple related objects are in the same cacheline
> > > unless you have heavy SMP contention.
> >
> > It's not about performance at all, it's about DMA buffers allocated
> > using kmalloc() getting corrupted. Imagine this:
>
> Uhhh... How about using a separate slab for the DMA buffers?

If there were just a few, known drivers that did this, sure. But as
long as Documentation/DMA-mapping.txt includes this paragraph:

If you acquired your memory via the page allocator
(i.e. __get_free_page*()) or the generic memory allocators
(i.e. kmalloc() or kmem_cache_alloc()) then you may DMA to/from
that memory using the addresses returned from those routines.

I think it's best to ensure that memory returned by kmalloc() actually
can be used for DMA. I used to work around this problem in the SPI
controller driver by using a temporary DMA buffer when possible
misalignment was detected, but David Brownell said it was the wrong
way to do it and pointed at the above paragraph.

But, as I mentioned, perhaps ARCH_KMALLOC_MINALIGN isn't the best way
to solve the problem. I'll look into the flush-caches-from-dma_unmap
approach. However, it looks like other arches set
ARCH_KMALLOC_MINALIGN to various values -- I suspect some of them
might run into the same problem as well?

hskinnemoen@dhcp-255-175:~/git/linux$ grep -r ARCH_KMALLOC_MINALIGN include/asm-*
include/asm-mips/mach-generic/kmalloc.h:#define ARCH_KMALLOC_MINALIGN 128
include/asm-mips/mach-ip27/kmalloc.h: * All happy, no need to define ARCH_KMALLOC_MINALIGN
include/asm-mips/mach-ip32/kmalloc.h:#define ARCH_KMALLOC_MINALIGN 32
include/asm-mips/mach-ip32/kmalloc.h:#define ARCH_KMALLOC_MINALIGN 128
include/asm-s390/cache.h:#define ARCH_KMALLOC_MINALIGN 8
include/asm-sh64/uaccess.h:#define ARCH_KMALLOC_MINALIGN 8

> > Maybe there are other solutions to this problem, but the old SLAB
> > allocator did guarantee 32-byte alignment as long as SLAB debugging
> > was turned off, so setting ARCH_KMALLOC_MINALIGN seemed like the
> > easiest way to get back to the old, known-working behaviour.
>
> SLAB's minimum object size is 32, thus you had no problems. I see. SLAB
> does not guarantee 32-byte alignment. It just happened to work. If you
> switch on CONFIG_SLAB_DEBUG you will likely get into trouble.

Yeah, that's true. CONFIG_SLAB_DEBUG does indeed cause DMA buffer
corruption on avr32, and so does CONFIG_SLOB. I've been wanting to fix
it, but I never understood how. Now that SLUB seems to offer a
solution that doesn't effectively turn off debugging, I thought I'd
finally found it...

Haavard

2007-06-11 19:11:28

by Christoph Lameter

Subject: Re: kernel BUG at mm/slub.c:3689!

On Mon, 11 Jun 2007, Håvard Skinnemoen wrote:

> > >
> > > It's not about performance at all, it's about DMA buffers allocated
> > > using kmalloc() getting corrupted. Imagine this:
> >
> > Uhhh... How about using a separate slab for the DMA buffers?
>
> If there were just a few, known drivers that did this, sure. But as
> long as Documentation/DMA-mapping.txt includes this paragraph:
>
> If you acquired your memory via the page allocator
> (i.e. __get_free_page*()) or the generic memory allocators
> (i.e. kmalloc() or kmem_cache_alloc()) then you may DMA to/from
> that memory using the addresses returned from those routines.
>
> I think it's best to ensure that memory returned by kmalloc() actually
> can be used for DMA. I used to work around this problem in the SPI
> controller driver by using a temporary DMA buffer when possible
> misalignment was detected, but David Brownell said it was the wrong
> way to do it and pointed at the above paragraph.

Well there are various ways of doing DMA. Memory returned can be used for
DMA but it may not be suitable for your DMA device if that device has
issues like alignment or physical address size restrictions.

> But, as I mentioned, perhaps ARCH_KMALLOC_MINALIGN isn't the best way
> to solve the problem. I'll look into the flush-caches-from-dma_unmap
> approach. However, it looks like other arches set
> ARCH_KMALLOC_MINALIGN to various values -- I suspect some of them
> might run into the same problem as well?

Could be. That is why I am looking for a general solution.

> > SLAB's minimum object size is 32, thus you had no problems. I see. SLAB
> > does not guarantee 32-byte alignment. It just happened to work. If you
> > switch on CONFIG_SLAB_DEBUG you will likely get into trouble.
>
> Yeah, that's true. CONFIG_SLAB_DEBUG does indeed cause DMA buffer
> corruption on avr32, and so does CONFIG_SLOB. I've been wanting to fix
> it, but I never understood how. Now that SLUB seems to offer a
> solution that doesn't effectively turn off debugging, I thought I'd
> finally found it...

We should probably make the minimum slab size dependent on
ARCH_KMALLOC_MINALIGN. There is no point in having smaller slabs anyways.
They will all have the same size.

2007-06-11 19:35:49

by Håvard Skinnemoen

Subject: Re: kernel BUG at mm/slub.c:3689!

On 6/11/07, Christoph Lameter <[email protected]> wrote:
On Mon, 11 Jun 2007, Håvard Skinnemoen wrote:
> > I think it's best to ensure that memory returned by kmalloc() actually
> > can be used for DMA. I used to work around this problem in the SPI
> > controller driver by using a temporary DMA buffer when possible
> > misalignment was detected, but David Brownell said it was the wrong
> > way to do it and pointed at the above paragraph.
>
> Well there are various ways of doing DMA. Memory returned can be used for
> DMA but it may not be suitable for your DMA device if that device has
> issues like alignment or physical address size restrictions.

Yes, that's true. If the DMA device has such restrictions, it probably
needs to be addressed elsewhere. My goal here is to make sure that
kmalloc()'ed memory is suitable for DMA as far as the CPU and caches
are concerned.

> We should probably make the minimum slab size dependent on
> ARCH_KMALLOC_MINALIGN. There is no point in having smaller slabs anyways.
> They will all have the same size.

Sounds reasonable to me.

Haavard

2007-06-11 19:40:36

by Christoph Lameter

Subject: Re: kernel BUG at mm/slub.c:3689!

On Mon, 11 Jun 2007, Håvard Skinnemoen wrote:

> > We should probably make the minimum slab size dependent on
> > ARCH_KMALLOC_MINALIGN. There is no point in having smaller slabs anyways.
> > They will all have the same size.
>
> Sounds reasonable to me.

Trouble is that ARCH_KMALLOC_MINALIGN is in bytes whereas we would need a
shift value for KMALLOC_MIN_SHIFT.

Does the latest patch work?

2007-06-11 19:53:43

by Håvard Skinnemoen

Subject: Re: kernel BUG at mm/slub.c:3689!

On 6/11/07, Christoph Lameter <[email protected]> wrote:
> Trouble is that ARCH_KMALLOC_MINALIGN is in bytes whereas we would need a
> shift value for KMALLOC_MIN_SHIFT.

Ah, of course. Hrm...I just thought I had an idea, but it wouldn't work...

> Does the latest patch work?

I'm sorry, but I haven't tested it yet. My board needs a hard reset,
and I can't do that over VPN. I'll test it first thing in the morning.

Håvard

2007-06-12 03:16:41

by Christoph Lameter

Subject: Re: kernel BUG at mm/slub.c:3689!

> and I can't do that over VPN. I'll test it first thing in the morning.

Here is a more general fix



SLUB: minimum alignment fixes

If ARCH_KMALLOC_MIN_ALIGN is set to a value greater than 8 (SLUB's smallest
kmalloc cache) then SLUB may generate duplicate slabs in sysfs (yes again).
No arch sets ARCH_KMALLOC_MINALIGN larger than 8, except mips, which
needs a 128-byte cache.

This patch increases the size of the smallest cache if ARCH_KMALLOC_MINALIGN
is greater than 8. In that case the caches smaller than that size are
disabled.

Signed-off-by: Christoph Lameter <[email protected]>

---
include/linux/slub_def.h | 11 +++++++++--
mm/slub.c | 18 ++++++++++--------
2 files changed, 19 insertions(+), 10 deletions(-)

Index: vps/include/linux/slub_def.h
===================================================================
--- vps.orig/include/linux/slub_def.h 2007-06-11 15:56:37.000000000 -0700
+++ vps/include/linux/slub_def.h 2007-06-11 19:31:15.000000000 -0700
@@ -28,7 +28,7 @@ struct kmem_cache {
int size; /* The size of an object including meta data */
int objsize; /* The size of an object without meta data */
int offset; /* Free pointer offset. */
- unsigned int order;
+ int order;

/*
* Avoid an extra cache line for UP, SMP and for the node local to
@@ -56,7 +56,11 @@ struct kmem_cache {
/*
* Kmalloc subsystem.
*/
-#define KMALLOC_SHIFT_LOW 3
+#ifdef ARCH_KMALLOC_MIN_ALIGN
+#define KMALLOC_MIN_SIZE max(8, ARCH_KMALLOC_MIN_ALIGN)
+#else
+#define KMALLOC_MIN_SIZE 8
+#endif

/*
* We keep the general caches in an array of slab caches that are used for
@@ -76,6 +80,9 @@ static inline int kmalloc_index(size_t s
if (size > KMALLOC_MAX_SIZE)
return -1;

+ if (size <= KMALLOC_MIN_SIZE)
+ return ilog2(KMALLOC_MIN_SIZE);
+
if (size > 64 && size <= 96)
return 1;
if (size > 128 && size <= 192)
Index: vps/mm/slub.c
===================================================================
--- vps.orig/mm/slub.c 2007-06-11 15:56:37.000000000 -0700
+++ vps/mm/slub.c 2007-06-11 19:13:41.000000000 -0700
@@ -2193,11 +2193,11 @@ EXPORT_SYMBOL(kmem_cache_destroy);
* Kmalloc subsystem
*******************************************************************/

-struct kmem_cache kmalloc_caches[KMALLOC_SHIFT_HIGH + 1] __cacheline_aligned;
+struct kmem_cache kmalloc_caches[ilog2(KMALLOC_MAX_SIZE) + 1] __cacheline_aligned;
EXPORT_SYMBOL(kmalloc_caches);

#ifdef CONFIG_ZONE_DMA
-static struct kmem_cache *kmalloc_caches_dma[KMALLOC_SHIFT_HIGH + 1];
+static struct kmem_cache *kmalloc_caches_dma[ilog2(KMALLOC_MAX_SIZE) + 1];
#endif

static int __init setup_slub_min_order(char *str)
@@ -2284,7 +2284,7 @@ static struct kmem_cache *get_slab(size_
if (!x)
panic("Unable to allocate memory for dma cache\n");

- if (index <= KMALLOC_SHIFT_HIGH)
+ if (index <= ilog2(KMALLOC_MAX_SIZE))
realsize = 1 << index;
else {
if (index == 1)
@@ -2529,19 +2529,21 @@ void __init kmem_cache_init(void)
slab_state = PARTIAL;

/* Caches that are not of the two-to-the-power-of size */
- create_kmalloc_cache(&kmalloc_caches[1],
+ if (KMALLOC_MIN_SIZE < 96)
+ create_kmalloc_cache(&kmalloc_caches[1],
"kmalloc-96", 96, GFP_KERNEL);
- create_kmalloc_cache(&kmalloc_caches[2],
+ if (KMALLOC_MIN_SIZE < 192)
+ create_kmalloc_cache(&kmalloc_caches[2],
"kmalloc-192", 192, GFP_KERNEL);

- for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++)
+ for (i = ilog2(KMALLOC_MIN_SIZE); i <= ilog2(KMALLOC_MAX_SIZE); i++)
create_kmalloc_cache(&kmalloc_caches[i],
"kmalloc", 1 << i, GFP_KERNEL);

slab_state = UP;

/* Provide the correct kmalloc names now that the caches are up */
- for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++)
+ for (i = ilog2(KMALLOC_MIN_SIZE); i <= ilog2(KMALLOC_MAX_SIZE); i++)
kmalloc_caches[i]. name =
kasprintf(GFP_KERNEL, "kmalloc-%d", 1 << i);

@@ -2554,7 +2556,7 @@ void __init kmem_cache_init(void)

printk(KERN_INFO "SLUB: Genslabs=%d, HWalign=%d, Order=%d-%d, MinObjects=%d,"
" Processors=%d, Nodes=%d\n",
- KMALLOC_SHIFT_HIGH, cache_line_size(),
+ ilog2(KMALLOC_MAX_SIZE), cache_line_size(),
slub_min_order, slub_max_order, slub_min_objects,
nr_cpu_ids, nr_node_ids);
}

2007-06-12 08:55:29

by Haavard Skinnemoen

Subject: Re: kernel BUG at mm/slub.c:3689!

On Mon, 11 Jun 2007 20:16:31 -0700 (PDT)
Christoph Lameter <[email protected]> wrote:

> > and I can't do that over VPN. I'll test it first thing in the morning.
>
> Here is a more general fix

Sorry, that didn't work either. One problem is that you're using
ARCH_KMALLOC_MIN_ALIGN instead of ARCH_KMALLOC_MINALIGN in slub_def.h
to determine KMALLOC_MIN_SIZE. After fixing that, it fails with
undefined references to __kmalloc_size_too_large. It looks like ilog2()
is behaving strangely; I'll see if I can figure out what's going on.

Haavard

2007-06-12 10:17:18

by Haavard Skinnemoen

Subject: Re: kernel BUG at mm/slub.c:3689!

On Tue, 12 Jun 2007 10:55:12 +0200
Haavard Skinnemoen <[email protected]> wrote:

> On Mon, 11 Jun 2007 20:16:31 -0700 (PDT)
> Christoph Lameter <[email protected]> wrote:
>
> > > and I can't do that over VPN. I'll test it first thing in the morning.
> >
> > Here is a more general fix
>
> Sorry, that didn't work either. One problem is that you're using
> ARCH_KMALLOC_MIN_ALIGN instead of ARCH_KMALLOC_MINALIGN in slub_def.h
> to determine KMALLOC_MIN_SIZE. After fixing that, it fails with
> undefined references to __kmalloc_size_too_large. It looks like ilog2()
> is behaving strangely; I'll see if I can figure out what's going on.

Looks like gcc didn't recognize max(8, ARCH_KMALLOC_MINALIGN) as a
constant expression, so __kmalloc_size_too_large() got referenced
although it never got called as far as I can tell. Works like a charm
after applying this patch on top of yours.

Tested with CONFIG_SLUB_DEBUG=n, CONFIG_SLUB_DEBUG=y and
CONFIG_SLUB_DEBUG=y with slub_debug set.


diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 7ef94d4..7fb785f 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -56,8 +56,8 @@ struct kmem_cache {
/*
* Kmalloc subsystem.
*/
-#ifdef ARCH_KMALLOC_MIN_ALIGN
-#define KMALLOC_MIN_SIZE max(8, ARCH_KMALLOC_MIN_ALIGN)
+#if defined(ARCH_KMALLOC_MINALIGN) && (ARCH_KMALLOC_MINALIGN > 8)
+#define KMALLOC_MIN_SIZE ARCH_KMALLOC_MINALIGN
#else
#define KMALLOC_MIN_SIZE 8
#endif