2013-05-07 06:37:14

by Pekka Enberg

[permalink] [raw]
Subject: [GIT PULL] SLAB changes for v3.10

Hi Linus,

Please pull the latest SLAB tree from:

git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git slab/for-linus

Bulk of the changes are more slab unification from Christoph. There's
also few fixes from Aaron, Glauber, and Joonsoo thrown into the mix.

Pekka

------------------>
The following changes since commit 51a26ae7a14b85c99c9be470c2d28eeeba0f26a3:

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2013-05-06 15:51:10 -0700)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git slab/for-linus

Aaron Tomlin (1):
mm: slab: Verify the nodeid passed to ____cache_alloc_node

Christoph Lameter (19):
slab: Use proper formatting specs for unsigned size_t
slab: Move kmalloc related function defs
slab: Common kmalloc slab index determination
slab: Use common kmalloc_index/kmalloc_size functions
slab: Common name for the per node structures
slab: Rename nodelists to node
slab: Common constants for kmalloc boundaries
slab: Common definition for the array of kmalloc caches
slab: Common function to create the kmalloc array
stat: Use size_t for sizes instead of unsigned
slab: Common Kmalloc cache determination
slab: Rename list3/l3 to node
slab: Common definition for kmem_cache_node
slab: Handle ARCH_DMA_MINALIGN correctly
slab: Fixup CONFIG_PAGE_ALLOC/DEBUG_SLAB_LEAK sections
slub: Do not dereference NULL pointer in node_match
slub: tid must be retrieved from the percpu area of the current processor
slab: Return NULL for oversized allocations
mm, slab_common: Fix bootstrap creation of kmalloc caches

Glauber Costa (1):
slub: correctly bootstrap boot caches

Joonsoo Kim (3):
mm/sl[au]b: correct allocation type check in kmalloc_slab()
slub: correct to calculate num of acquired objects in get_partial_node()
slub: add 'likely' macro to inc_slabs_node()

Pekka Enberg (1):
Merge branch 'slab/next' into slab/for-linus

fs/proc/stat.c | 2 +-
include/linux/kmalloc_sizes.h | 45 ---
include/linux/slab.h | 231 +++++++++----
include/linux/slab_def.h | 54 +--
include/linux/slub_def.h | 136 +-------
mm/slab.c | 790 +++++++++++++++++------------------------
mm/slab.h | 43 +++-
mm/slab_common.c | 174 +++++++++-
mm/slub.c | 221 ++----------
9 files changed, 781 insertions(+), 915 deletions(-)
delete mode 100644 include/linux/kmalloc_sizes.h


2013-05-08 00:30:30

by Tony Lindgren

[permalink] [raw]
Subject: Re: [GIT PULL] SLAB changes for v3.10

* Pekka Enberg <[email protected]> [130506 23:42]:
> Hi Linus,
>
> Please pull the latest SLAB tree from:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git slab/for-linus
...

> mm, slab_common: Fix bootstrap creation of kmalloc caches

This one seems to cause a regression for me on at least arm omap
and vexpress that depends on some kconfig option I have not been
able to figure out yet. Reverting 8a965b3b fixes the issue.

Any ideas?

Regards,

Tony


Without reverting 8a965b3b I'm getting:

[ 0.000000] Unable to handle kernel NULL pointer dereference at virtual address 00000054
[ 0.000000] pgd = c0004000
[ 0.000000] [00000054] *pgd=00000000
[ 0.000000] Internal error: Oops: 5 [#1] SMP ARM
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.9.0-11486-g6d13e11 #689
[ 0.000000] task: c06f77c8 ti: c06ec000 task.ti: c06ec000
[ 0.000000] PC is at kmem_cache_alloc_trace+0x50/0x178
[ 0.000000] LR is at kmem_cache_alloc_trace+0x38/0x178
[ 0.000000] pc : [<c00edab4>] lr : [<c00eda9c>] psr: 600001d3
[ 0.000000] sp : c06edf70 ip : 600001d3 fp : 00000000
[ 0.000000] r10: 000000c0 r9 : c0086958 r8 : 000080d0
[ 0.000000] r7 : 00000000 r6 : 600001d3 r5 : 00000000 r4 : 00008000
[ 0.000000] r3 : 00000050 r2 : c06ec000 r1 : c06f77c8 r0 : c00eda9c
[ 0.000000] Flags: nZCv IRQs off FIQs off Mode SVC_32 ISA ARM Segment kernel
[ 0.000000] Control: 10c5387d Table: 6000406a DAC: 00000015
[ 0.000000] Process swapper/0 (pid: 0, stack limit = 0xc06ec240)
[ 0.000000] Stack: (0xc06edf70 to 0xc06ee000)
[ 0.000000] df60: 00002014 00000010 c0702890 00000000
[ 0.000000] df80: 00000000 00000000 00000000 410fc090 c06f4480 c0086958 c0702890 00000000
[ 0.000000] dfa0: 00000010 c06dc0b0 ffffffff c06b44f4 9fffffff c073834c c0738340 c06a5828
[ 0.000000] dfc0: ffffffff ffffffff c06a5478 00000000 00000000 c06dc0b0 00000000 10c5387d
[ 0.000000] dfe0: c06f44c8 c06dc4b4 c06f8cac 6000406a 00000000 60008074 00000000 00000000
[ 0.000000] [<c00edab4>] (kmem_cache_alloc_trace+0x50/0x178) from [<c0086958>] (alloc_desc+0x24/0xb4)
[ 0.000000] [<c0086958>] (alloc_desc+0x24/0xb4) from [<c06b44f4>] (early_irq_init+0x78/0xec)
[ 0.000000] [<c06b44f4>] (early_irq_init+0x78/0xec) from [<c06a5828>] (start_kernel+0x17c/0x33c)
[ 0.000000] [<c06a5828>] (start_kernel+0x17c/0x33c) from [<60008074>] (0x60008074)
[ 0.000000] Code: e5923014 e2833014 e1a03103 e0873003 (e5931004)
[ 0.000000] ---[ end trace 1b75b31a2719ed1c ]---

2013-05-08 04:24:29

by Tony Lindgren

[permalink] [raw]
Subject: Re: [GIT PULL] SLAB changes for v3.10

* Tony Lindgren <[email protected]> [130507 17:35]:
> * Pekka Enberg <[email protected]> [130506 23:42]:
> > Hi Linus,
> >
> > Please pull the latest SLAB tree from:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git slab/for-linus
> ...
>
> > mm, slab_common: Fix bootstrap creation of kmalloc caches
>
> This one seems to cause a regression for me on at least arm omap
> and vexpress that depends on some kconfig option I have not been
> able to figure out yet. Reverting 8a965b3b fixes the issue.
>
> Any ideas?

OK got it narrowed down to CONFIG_DEBUG_SPINLOCK=y causing the problem
with commit 8a965b3b. Ain't nothing like bisecting and booting and then
diffing .config files on top of that.

> Without reverting 8a965b3b I'm getting:
>
> [ 0.000000] Unable to handle kernel NULL pointer dereference at virtual address 00000054
> [ 0.000000] pgd = c0004000
> [ 0.000000] [00000054] *pgd=00000000
> [ 0.000000] Internal error: Oops: 5 [#1] SMP ARM
> [ 0.000000] Modules linked in:
> [ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.9.0-11486-g6d13e11 #689
> [ 0.000000] task: c06f77c8 ti: c06ec000 task.ti: c06ec000
> [ 0.000000] PC is at kmem_cache_alloc_trace+0x50/0x178
> [ 0.000000] LR is at kmem_cache_alloc_trace+0x38/0x178
> [ 0.000000] pc : [<c00edab4>] lr : [<c00eda9c>] psr: 600001d3
> [ 0.000000] sp : c06edf70 ip : 600001d3 fp : 00000000
> [ 0.000000] r10: 000000c0 r9 : c0086958 r8 : 000080d0
> [ 0.000000] r7 : 00000000 r6 : 600001d3 r5 : 00000000 r4 : 00008000
> [ 0.000000] r3 : 00000050 r2 : c06ec000 r1 : c06f77c8 r0 : c00eda9c
> [ 0.000000] Flags: nZCv IRQs off FIQs off Mode SVC_32 ISA ARM Segment kernel
> [ 0.000000] Control: 10c5387d Table: 6000406a DAC: 00000015
> [ 0.000000] Process swapper/0 (pid: 0, stack limit = 0xc06ec240)
> [ 0.000000] Stack: (0xc06edf70 to 0xc06ee000)
> [ 0.000000] df60: 00002014 00000010 c0702890 00000000
> [ 0.000000] df80: 00000000 00000000 00000000 410fc090 c06f4480 c0086958 c0702890 00000000
> [ 0.000000] dfa0: 00000010 c06dc0b0 ffffffff c06b44f4 9fffffff c073834c c0738340 c06a5828
> [ 0.000000] dfc0: ffffffff ffffffff c06a5478 00000000 00000000 c06dc0b0 00000000 10c5387d
> [ 0.000000] dfe0: c06f44c8 c06dc4b4 c06f8cac 6000406a 00000000 60008074 00000000 00000000
> [ 0.000000] [<c00edab4>] (kmem_cache_alloc_trace+0x50/0x178) from [<c0086958>] (alloc_desc+0x24/0xb4)
> [ 0.000000] [<c0086958>] (alloc_desc+0x24/0xb4) from [<c06b44f4>] (early_irq_init+0x78/0xec)
> [ 0.000000] [<c06b44f4>] (early_irq_init+0x78/0xec) from [<c06a5828>] (start_kernel+0x17c/0x33c)
> [ 0.000000] [<c06a5828>] (start_kernel+0x17c/0x33c) from [<60008074>] (0x60008074)
> [ 0.000000] Code: e5923014 e2833014 e1a03103 e0873003 (e5931004)
> [ 0.000000] ---[ end trace 1b75b31a2719ed1c ]---
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2013-05-08 05:16:10

by Tony Lindgren

[permalink] [raw]
Subject: Re: [GIT PULL] SLAB changes for v3.10

* Tony Lindgren <[email protected]> [130507 21:30]:
> * Tony Lindgren <[email protected]> [130507 17:35]:
> > * Pekka Enberg <[email protected]> [130506 23:42]:
> > > Hi Linus,
> > >
> > > Please pull the latest SLAB tree from:
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git slab/for-linus
> > ...
> >
> > > mm, slab_common: Fix bootstrap creation of kmalloc caches
> >
> > This one seems to cause a regression for me on at least arm omap
> > and vexpress that depends on some kconfig option I have not been
> > able to figure out yet. Reverting 8a965b3b fixes the issue.
> >
> > Any ideas?
>
> OK got it narrowed down to CONFIG_DEBUG_SPINLOCK=y causing the problem
> with commit 8a965b3b. Ain't nothing like bisecting and booting and then
> diffing .config files on top of that.

Hmm it's actually CONFIG_PROVE_LOCKING=y that causes the problem,
not CONFIG_DEBUG_SPINLOCK=y. CONFIG_DEBUG_SPINLOCK=y was just selected
by CONFIG_PROVE_LOCKING=y in my non-booting .config. I can now fix my
non-booting .config by removing PROVE_LOCKING and DEBUG_SPINLOCK,
but I cannot break my other working .config by adding PROVE_LOCKING
and DEBUG_SPINLOCK. Hmm.

> > Without reverting 8a965b3b I'm getting:
> >
> > [ 0.000000] Unable to handle kernel NULL pointer dereference at virtual address 00000054
> > [ 0.000000] pgd = c0004000
> > [ 0.000000] [00000054] *pgd=00000000
> > [ 0.000000] Internal error: Oops: 5 [#1] SMP ARM
> > [ 0.000000] Modules linked in:
> > [ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.9.0-11486-g6d13e11 #689
> > [ 0.000000] task: c06f77c8 ti: c06ec000 task.ti: c06ec000
> > [ 0.000000] PC is at kmem_cache_alloc_trace+0x50/0x178
> > [ 0.000000] LR is at kmem_cache_alloc_trace+0x38/0x178
> > [ 0.000000] pc : [<c00edab4>] lr : [<c00eda9c>] psr: 600001d3
> > [ 0.000000] sp : c06edf70 ip : 600001d3 fp : 00000000
> > [ 0.000000] r10: 000000c0 r9 : c0086958 r8 : 000080d0
> > [ 0.000000] r7 : 00000000 r6 : 600001d3 r5 : 00000000 r4 : 00008000
> > [ 0.000000] r3 : 00000050 r2 : c06ec000 r1 : c06f77c8 r0 : c00eda9c
> > [ 0.000000] Flags: nZCv IRQs off FIQs off Mode SVC_32 ISA ARM Segment kernel
> > [ 0.000000] Control: 10c5387d Table: 6000406a DAC: 00000015
> > [ 0.000000] Process swapper/0 (pid: 0, stack limit = 0xc06ec240)
> > [ 0.000000] Stack: (0xc06edf70 to 0xc06ee000)
> > [ 0.000000] df60: 00002014 00000010 c0702890 00000000
> > [ 0.000000] df80: 00000000 00000000 00000000 410fc090 c06f4480 c0086958 c0702890 00000000
> > [ 0.000000] dfa0: 00000010 c06dc0b0 ffffffff c06b44f4 9fffffff c073834c c0738340 c06a5828
> > [ 0.000000] dfc0: ffffffff ffffffff c06a5478 00000000 00000000 c06dc0b0 00000000 10c5387d
> > [ 0.000000] dfe0: c06f44c8 c06dc4b4 c06f8cac 6000406a 00000000 60008074 00000000 00000000
> > [ 0.000000] [<c00edab4>] (kmem_cache_alloc_trace+0x50/0x178) from [<c0086958>] (alloc_desc+0x24/0xb4)
> > [ 0.000000] [<c0086958>] (alloc_desc+0x24/0xb4) from [<c06b44f4>] (early_irq_init+0x78/0xec)
> > [ 0.000000] [<c06b44f4>] (early_irq_init+0x78/0xec) from [<c06a5828>] (start_kernel+0x17c/0x33c)
> > [ 0.000000] [<c06a5828>] (start_kernel+0x17c/0x33c) from [<60008074>] (0x60008074)
> > [ 0.000000] Code: e5923014 e2833014 e1a03103 e0873003 (e5931004)
> > [ 0.000000] ---[ end trace 1b75b31a2719ed1c ]---
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at http://www.tux.org/lkml/
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2013-05-08 06:20:09

by Pekka Enberg

[permalink] [raw]
Subject: Re: [GIT PULL] SLAB changes for v3.10

Hi Tony,

On Wed, May 8, 2013 at 8:16 AM, Tony Lindgren <[email protected]> wrote:
> * Tony Lindgren <[email protected]> [130507 21:30]:
>> * Tony Lindgren <[email protected]> [130507 17:35]:
>> > * Pekka Enberg <[email protected]> [130506 23:42]:
>> > > Hi Linus,
>> > >
>> > > Please pull the latest SLAB tree from:
>> > >
>> > > git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git slab/for-linus
>> > ...
>> >
>> > > mm, slab_common: Fix bootstrap creation of kmalloc caches
>> >
>> > This one seems to cause a regression for me on at least arm omap
>> > and vexpress that depends on some kconfig option I have not been
>> > able to figure out yet. Reverting 8a965b3b fixes the issue.
>> >
>> > Any ideas?
>>
>> OK got it narrowed down to CONFIG_DEBUG_SPINLOCK=y causing the problem
>> with commit 8a965b3b. Ain't nothing like bisecting and booting and then
>> diffing .config files on top of that.
>
> Hmm it's actually CONFIG_PROVE_LOCKING=y that causes the problem,
> not CONFIG_DEBUG_SPINLOCK=y. CONFIG_DEBUG_SPINLOCK=y was just selected
> by CONFIG_PROVE_LOCKING=y in my non-booting .config. I can now fix my
> non-booting .config by removing PROVE_LOCKING and DEBUG_SPINLOCK,
> but I cannot break my other working .config by adding PROVE_LOCKING
> and DEBUG_SPINLOCK. Hmm.

That's almost certainly a slab bootstrap issue. Lockdep et al just
change struct sizes which is why the problem shows up. Is this with
SLAB or SLUB? Christoph, Glauber, care to take a look?

>> > Without reverting 8a965b3b I'm getting:
>> >
>> > [ 0.000000] Unable to handle kernel NULL pointer dereference at virtual address 00000054
>> > [ 0.000000] pgd = c0004000
>> > [ 0.000000] [00000054] *pgd=00000000
>> > [ 0.000000] Internal error: Oops: 5 [#1] SMP ARM
>> > [ 0.000000] Modules linked in:
>> > [ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.9.0-11486-g6d13e11 #689
>> > [ 0.000000] task: c06f77c8 ti: c06ec000 task.ti: c06ec000
>> > [ 0.000000] PC is at kmem_cache_alloc_trace+0x50/0x178
>> > [ 0.000000] LR is at kmem_cache_alloc_trace+0x38/0x178
>> > [ 0.000000] pc : [<c00edab4>] lr : [<c00eda9c>] psr: 600001d3
>> > [ 0.000000] sp : c06edf70 ip : 600001d3 fp : 00000000
>> > [ 0.000000] r10: 000000c0 r9 : c0086958 r8 : 000080d0
>> > [ 0.000000] r7 : 00000000 r6 : 600001d3 r5 : 00000000 r4 : 00008000
>> > [ 0.000000] r3 : 00000050 r2 : c06ec000 r1 : c06f77c8 r0 : c00eda9c
>> > [ 0.000000] Flags: nZCv IRQs off FIQs off Mode SVC_32 ISA ARM Segment kernel
>> > [ 0.000000] Control: 10c5387d Table: 6000406a DAC: 00000015
>> > [ 0.000000] Process swapper/0 (pid: 0, stack limit = 0xc06ec240)
>> > [ 0.000000] Stack: (0xc06edf70 to 0xc06ee000)
>> > [ 0.000000] df60: 00002014 00000010 c0702890 00000000
>> > [ 0.000000] df80: 00000000 00000000 00000000 410fc090 c06f4480 c0086958 c0702890 00000000
>> > [ 0.000000] dfa0: 00000010 c06dc0b0 ffffffff c06b44f4 9fffffff c073834c c0738340 c06a5828
>> > [ 0.000000] dfc0: ffffffff ffffffff c06a5478 00000000 00000000 c06dc0b0 00000000 10c5387d
>> > [ 0.000000] dfe0: c06f44c8 c06dc4b4 c06f8cac 6000406a 00000000 60008074 00000000 00000000
>> > [ 0.000000] [<c00edab4>] (kmem_cache_alloc_trace+0x50/0x178) from [<c0086958>] (alloc_desc+0x24/0xb4)
>> > [ 0.000000] [<c0086958>] (alloc_desc+0x24/0xb4) from [<c06b44f4>] (early_irq_init+0x78/0xec)
>> > [ 0.000000] [<c06b44f4>] (early_irq_init+0x78/0xec) from [<c06a5828>] (start_kernel+0x17c/0x33c)
>> > [ 0.000000] [<c06a5828>] (start_kernel+0x17c/0x33c) from [<60008074>] (0x60008074)
>> > [ 0.000000] Code: e5923014 e2833014 e1a03103 e0873003 (e5931004)
>> > [ 0.000000] ---[ end trace 1b75b31a2719ed1c ]---

2013-05-08 11:57:33

by Glauber Costa

[permalink] [raw]
Subject: Re: [GIT PULL] SLAB changes for v3.10

On 05/08/2013 10:20 AM, Pekka Enberg wrote:
>> > Hmm it's actually CONFIG_PROVE_LOCKING=y that causes the problem,
>> > not CONFIG_DEBUG_SPINLOCK=y. CONFIG_DEBUG_SPINLOCK=y was just selected
>> > by CONFIG_PROVE_LOCKING=y in my non-booting .config. I can now fix my
>> > non-booting .config by removing PROVE_LOCKING and DEBUG_SPINLOCK,
>> > but I cannot break my other working .config by adding PROVE_LOCKING
>> > and DEBUG_SPINLOCK. Hmm.
> That's almost certainly a slab bootstrap issue. Lockdep et al just
> change struct sizes which is why the problem shows up. Is this with
> SLAB or SLUB? Christoph, Glauber, care to take a look?
>
>>>> >> >

My first guess is that it hit a NULL cache. Being a NULL pointer
dereference, the thing among all that has the biggest chances of being
NULL and accessed unconditionally is the cache pointer itself.

Due to the size being too big. But if that were the case, he would have
hit the WARN_ON recently introduced:

if (WARN_ON_ONCE(size > KMALLOC_MAX_SIZE))
return NULL;


Is this WARN hit ?

2013-05-08 12:26:12

by Pekka Enberg

[permalink] [raw]
Subject: Re: [GIT PULL] SLAB changes for v3.10

On Wed, May 8, 2013 at 2:58 PM, Glauber Costa <[email protected]> wrote:
> My first guess is that it hit a NULL cache. Being a NULL pointer
> dereference, the thing among all that has the biggest chances of being
> NULL and accessed unconditionally is the cache pointer itself.
>
> Due to the size being too big. But if that were the case, he would have
> hit the WARN_ON recently introduced:
>
> if (WARN_ON_ONCE(size > KMALLOC_MAX_SIZE))
> return NULL;
>
>
> Is this WARN hit ?

I doubt it:

[ 0.000000] r7 : 00000000 r6 : 600001d3 r5 : 00000000 r4 : 00008000
[ 0.000000] r3 : 00000050 r2 : c06ec000 r1 : c06f77c8 r0 : c00eda9c

[ 0.000000] [<c00edab4>] (kmem_cache_alloc_trace+0x50/0x178) from
[<c0086958>] (alloc_desc+0x24/0xb4)

It's the kzalloc_node() in kernel/irq/irqdesc.c::alloc_desc() and
AFAICT based on r4 it's a 32 KB allocation. It's more likely that
KMALLOC_SHIFT_HIGH is less than 25 and because kmalloc_index() doesn't
respect it, we return a pointer to an uninitialized kmalloc cache.

Pekka

2013-05-08 12:37:49

by Glauber Costa

[permalink] [raw]
Subject: Re: [GIT PULL] SLAB changes for v3.10

On 05/08/2013 04:26 PM, Pekka Enberg wrote:
> On Wed, May 8, 2013 at 2:58 PM, Glauber Costa <[email protected]> wrote:
>> My first guess is that it hit a NULL cache. Being a NULL pointer
>> dereference, the thing among all that has the biggest chances of being
>> NULL and accessed unconditionally is the cache pointer itself.
>>
>> Due to the size being too big. But if that were the case, he would have
>> hit the WARN_ON recently introduced:
>>
>> if (WARN_ON_ONCE(size > KMALLOC_MAX_SIZE))
>> return NULL;
>>
>>
>> Is this WARN hit ?
>
> I doubt it:
>
> [ 0.000000] r7 : 00000000 r6 : 600001d3 r5 : 00000000 r4 : 00008000
> [ 0.000000] r3 : 00000050 r2 : c06ec000 r1 : c06f77c8 r0 : c00eda9c
>
> [ 0.000000] [<c00edab4>] (kmem_cache_alloc_trace+0x50/0x178) from
> [<c0086958>] (alloc_desc+0x24/0xb4)
>
> It's the kzalloc_node() in kernel/irq/irqdesc.c::alloc_desc() and
> AFAICT based on r4 it's a 32 KB allocation. It's more likely that
> KMALLOC_SHIFT_HIGH is less than 25 and because kmalloc_index() doesn't
> respect it, we return a pointer to an uninitialized kmalloc cache.
>

Exactly, but then the index is calculated from the size. If we are
allocating with a size that would lead to an invalid index, we should
WARN. If this is not happening, that WARN is really really badly placed.

Subject: Re: [GIT PULL] SLAB changes for v3.10

On Tue, 7 May 2013, Tony Lindgren wrote:

> OK got it narrowed down to CONFIG_DEBUG_SPINLOCK=y causing the problem
> with commit 8a965b3b. Ain't nothing like bisecting and booting and then
> diffing .config files on top of that.
>
> > Without reverting 8a965b3b I'm getting:

The patch (commit 8a965b3baa89ffedc73c0fbc750006c631012ced) merely changed the sequence of
slab creation to address an issue in SLAB.

Hmmm.. But if KMALLOC_SHIFT_LOW is higher than 6 or 7 then the creation of
the non-power of two slab could be skipped as a result of the patch. But
we should not need them in those cases.

Can I see the kernel config?

What is the value of KMALLOC_SHIFT_LOW?

Can you figure out which kernel slab the function is trying to access?

2013-05-08 15:45:26

by Tony Lindgren

[permalink] [raw]
Subject: Re: [GIT PULL] SLAB changes for v3.10

* Christoph Lameter <[email protected]> [130508 07:01]:
> On Tue, 7 May 2013, Tony Lindgren wrote:
>
> > OK got it narrowed down to CONFIG_DEBUG_SPINLOCK=y causing the problem
> > with commit 8a965b3b. Ain't nothing like bisecting and booting and then
> > diffing .config files on top of that.
> >
> > > Without reverting 8a965b3b I'm getting:
>
> The patch (commit 8a965b3baa89ffedc73c0fbc750006c631012ced) merely changed the sequence of
> slab creation to address an issue in SLAB.
>
> Hmmm.. But if KMALLOC_SHIFT_LOW is higher than 6 or 7 then the creation of
> the non-power of two slab could be skipped as a result of the patch. But
> we should not need them in those cases.
>
> Can I see the kernel config?

Attached are minimal defconfig-bad and defconfig-good with pretty
much everything disabled. These boot vexpress in qemu using earlyprintk
for the output.

> What is the value of KMALLOC_SHIFT_LOW?

It's 6.

> Can you figure out which kernel slab the function is trying to access?

I can certainly debug it further, but it's also pretty easy to reproduce:

1. Download and build qemu-linaro from:

https://git.linaro.org/gitweb?p=qemu/qemu-linaro.git

The last time I tried the stock qemu I could not get vexpress to
boot with it, don't know if that's still the case. But the above
works for me.

2. Cross compile kernel with the attached defconfig-bad

$ ARCH=arm CROSS_COMPILE=... make zImage

3. Try to boot the kernel in qemu

$ qemu-system-arm -machine vexpress-a9 -m 1024 -curses -net nic \
-net user -serial stdio -append "console=ttyAMA0,115200n8 \
root=/dev/mmcblk0p2 ro rootwait physmap.enabled=0 \
debug earlyprintk" -kernel zImage

Regards,

Tony


Attachments:
(No filename) (1.65 kB)
defconfig-bad (1.88 kB)
defconfig-good (1.88 kB)
Download all attachments

2013-05-08 18:13:59

by Chris Mason

[permalink] [raw]
Subject: Re: [GIT PULL] SLAB changes for v3.10

[ Sorry if I break the threading on this, I had to pull it off gmane ]

On Tue, 7 May 2013, Tony Lindgren wrote:
> OK got it narrowed down to CONFIG_DEBUG_SPINLOCK=y causing the problem
> with commit 8a965b3b. Ain't nothing like bisecting and booting and then
> diffing .config files on top of that.

I'm unable to boot with slab on current Linus -master, and bisected it
down almost as far as Tony did before trying SLUB and then finding this
thread. My box is a standard x86-64, nothing exciting and spinlock
debugging isn't on.

A few printks stuffed into Christoph's code:

cache ffff88047f000080 at index 5
creating slab cache at 6
create special cache #1 (this is kmalloc_caches[1])
cache ffff88047f0001c0 at index 7
creating slab cache at 8
creating slab cache at 9
creating slab cache at 10
... more get created

Pulling this into the code from commit 8a965b3b:

for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) {
if (!kmalloc_caches[i]) {
kmalloc_caches[i] = create_kmalloc_cache(NULL,
1 << i, flags);

/*
* Caches that are not of the two-to-the-power-of size.
* These have to be created immediately after the
* earlier power of two caches
*/
if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6)
kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags);

if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7)
kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags);
}
}

kmalloc_caches[7] was not null, and so kmalloc_caches[2] was never
created.

I get this oops (with early printk on)

------------[ cut here ]------------
kernel BUG at mm/slab.c:1635!
invalid opcode: 0000 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.9.0-josef+ #920
Hardware name: Supermicro X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 1.0a 03/06/2012
task: ffffffff8196a410 ti: ffffffff8195a000 task.ti: ffffffff8195a000
RIP: 0010:[<ffffffff81a49c9d>] [<ffffffff81a49c9d>] kmem_cache_init_late+0x40/0x7d
RSP: 0000:ffffffff8195bf78 EFLAGS: 00010282
RAX: 00000000fffffff4 RBX: ffff88047f006480 RCX: 000000000000ff31
RDX: 000000000000000e RSI: 0000000000000046 RDI: ffffffff81baf238
RBP: ffffffff8195bf80 R08: 0000000000000400 R09: 0000000000000000
R10: 0000000000002fa4 R11: 0000000000000000 R12: ffffffff81ab58d0
R13: ffffffff81abd2c0 R14: ffff88047ffaf0c0 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88047fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff88047ffff000 CR3: 0000000001965000 CR4: 00000000000406b0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
ffffffffffffffff ffffffff8195bfc0 ffffffff81a25b64 ffffffff81a2573d
ffffffff81abd2c0 0000000000003000 0000000000000000 0000000000000000
0000000000000000 ffffffff8195bfd0 ffffffff81a2547f ffffffff8195bfe8
Call Trace:
[<ffffffff81a25b64>] start_kernel+0x235/0x394
[<ffffffff81a2573d>] ? repair_env_string+0x58/0x58
[<ffffffff81a2547f>] x86_64_start_reservations+0x2a/0x2c
[<ffffffff81a25548>] x86_64_start_kernel+0xc7/0xca
Code: 53 e8 40 57 bd ff 48 8b 05 21 f2 f4 ff 48 8d 58 a8 48 8d 43 58 48 3d a0 8e 99 81 74 1a 31 f6 48 89 df e8 ba 7f 6d ff 85 c0 74 02 <0f> 0b 48 8b 5b 58 48 83 eb 58 eb da 48 c7 c7 70 8e 99 81 e8 e6
RIP [<ffffffff81a49c9d>] kmem_cache_init_late+0x40/0x7d
RSP <ffffffff8195bf78>
---[ end trace 2e5587581263f881 ]---

This patch fixes things for me, but to maintain the rules from
Christoph's patch, kmalloc_caches[2] should have been created whenever
kmalloc_caches[7] was done.

diff --git a/mm/slab_common.c b/mm/slab_common.c
index d2517b0..ff3218a 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -446,18 +446,18 @@ void __init create_kmalloc_caches(unsigned long flags)
if (!kmalloc_caches[i]) {
kmalloc_caches[i] = create_kmalloc_cache(NULL,
1 << i, flags);
+ }

- /*
- * Caches that are not of the two-to-the-power-of size.
- * These have to be created immediately after the
- * earlier power of two caches
- */
- if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6)
- kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags);
+ /*
+ * Caches that are not of the two-to-the-power-of size.
+ * These have to be created immediately after the
+ * earlier power of two caches
+ */
+ if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6)
+ kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags);

- if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7)
- kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags);
- }
+ if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7)
+ kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags);
}

/* Kmalloc array is now usable */

Subject: Re: [GIT PULL] SLAB changes for v3.10

> The 1.4.0 verion in ubuntu 13.04 s not good enough?

qemu 1.4.0 reproduces the bug here on arm. And Chris Mason;s patch fixes
it.

Subject: Re: [GIT PULL] SLAB changes for v3.10

On Wed, 8 May 2013, Chris Mason wrote:

> This patch fixes things for me, but to maintain the rules from
> Christoph's patch, kmalloc_caches[2] should have been created whenever
> kmalloc_caches[7] was done.

Not necessary. The early slab bootstrap must create some slab caches of
specific sizes, it will only use those during very early bootstrap.

The later creation of the array must skip those.

You correctly moved the checks out of the if (!kmalloc_cacheS())
condition so that the caches are created properly.

Acked-by: Christoph Lameter <[email protected]>

2013-05-08 18:48:21

by Chris Mason

[permalink] [raw]
Subject: Re: [GIT PULL] SLAB changes for v3.10

Quoting Christoph Lameter (2013-05-08 14:25:49)
> On Wed, 8 May 2013, Chris Mason wrote:
>
> > This patch fixes things for me, but to maintain the rules from
> > Christoph's patch, kmalloc_caches[2] should have been created whenever
> > kmalloc_caches[7] was done.
>
> Not necessary. The early slab bootstrap must create some slab caches of
> specific sizes, it will only use those during very early bootstrap.
>
> The later creation of the array must skip those.
>
> You correctly moved the checks out of the if (!kmalloc_cacheS())
> condition so that the caches are created properly.

But if the ordering is required at all, why is it ok to create cache 2
after cache 6 instead of after cache 7?

IOW if we can safely do cache 2 after cache 6, why can't we just do both
cache 1 and cache 2 after the loop?

-chris

Subject: Re: [GIT PULL] SLAB changes for v3.10

On Wed, 8 May 2013, Chris Mason wrote:

> > You correctly moved the checks out of the if (!kmalloc_cacheS())
> > condition so that the caches are created properly.
>
> But if the ordering is required at all, why is it ok to create cache 2
> after cache 6 instead of after cache 7?

The power of two caches are 2^x beginning with KMALLOC_MIN_SHIFT. The non
power of two caches were folded into number 1 + 2 since they do not fit
into the scheme and they are special cased throughout. This works since
the minimal slab cache size is 8 bytes.

> IOW if we can safely do cache 2 after cache 6, why can't we just do both
> cache 1 and cache 2 after the loop?

Because the cache creation in SLAB can cause the use of a fractional slab
size if kmem_cache_create() thinks its better to put the metadata on a
different slab cache (OFF_SLAB type) because data will align better that
way. Its weird I know but its due to the way that SLAB aligns data in the
page frame.

2013-05-08 19:05:42

by Tony Lindgren

[permalink] [raw]
Subject: Re: [GIT PULL] SLAB changes for v3.10

* Christoph Lameter <[email protected]> [130508 11:31]:
> On Wed, 8 May 2013, Chris Mason wrote:
>
> > This patch fixes things for me, but to maintain the rules from
> > Christoph's patch, kmalloc_caches[2] should have been created whenever
> > kmalloc_caches[7] was done.
>
> Not necessary. The early slab bootstrap must create some slab caches of
> specific sizes, it will only use those during very early bootstrap.
>
> The later creation of the array must skip those.
>
> You correctly moved the checks out of the if (!kmalloc_cacheS())
> condition so that the caches are created properly.
>
> Acked-by: Christoph Lameter <[email protected]>

Great this fixes things for me too:

Acked-by: Tony Lindgren <[email protected]>

2013-05-08 19:11:19

by Tony Lindgren

[permalink] [raw]
Subject: Re: [GIT PULL] SLAB changes for v3.10

* Christoph Lameter <[email protected]> [130508 12:06]:
> On Wed, 8 May 2013, Chris Mason wrote:
>
> > > You correctly moved the checks out of the if (!kmalloc_cacheS())
> > > condition so that the caches are created properly.
> >
> > But if the ordering is required at all, why is it ok to create cache 2
> > after cache 6 instead of after cache 7?
>
> The power of two caches are 2^x beginning with KMALLOC_MIN_SHIFT. The non
> power of two caches were folded into number 1 + 2 since they do not fit
> into the scheme and they are special cased throughout. This works since
> the minimal slab cache size is 8 bytes.
>
> > IOW if we can safely do cache 2 after cache 6, why can't we just do both
> > cache 1 and cache 2 after the loop?
>
> Because the cache creation in SLAB can cause the use of a fractional slab
> size if kmem_cache_create() thinks its better to put the metadata on a
> different slab cache (OFF_SLAB type) because data will align better that
> way. Its weird I know but its due to the way that SLAB aligns data in the
> page frame.

Hmm OK so kmalloc_caches[7] got created earlier with INDEX_AC != INDEX_NODE,
and those are defined as:

#define INDEX_AC kmalloc_index(sizeof(struct arraycache_init))
#define INDEX_NODE kmalloc_index(sizeof(struct kmem_cache_node))

So the different sizes for the structs can trigger it like Pekka was
speculating earlier.

Regards,

Tony

2013-05-08 19:56:32

by Chris Mason

[permalink] [raw]
Subject: [PATCH] Fix crash during slab init

Commit 8a965b3b introduced a regression that caused us to crash early
during boot. The commit was introducing ordering of slab creation,
making sure two odd-sized slabs were created after specific powers of
two sizes.

But, if any of the power of two slabs were created earlier during boot,
slabs at index 1 or 2 might not get created at all. This patch makes
sure none of the slabs get skipped.

Tony Lindgren bisected this down to the offending commit, which really
helped because bisect kept bringing me to almost but not quite this one.

Signed-off-by: Chris Mason <[email protected]>
Acked-by: Christoph Lameter <[email protected]>
Acked-by: Tony Lindgren <[email protected]>

---

v1->v2 reword description

diff --git a/mm/slab_common.c b/mm/slab_common.c
index d2517b0..ff3218a 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -446,18 +446,18 @@ void __init create_kmalloc_caches(unsigned long flags)
if (!kmalloc_caches[i]) {
kmalloc_caches[i] = create_kmalloc_cache(NULL,
1 << i, flags);
+ }

- /*
- * Caches that are not of the two-to-the-power-of size.
- * These have to be created immediately after the
- * earlier power of two caches
- */
- if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6)
- kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags);
+ /*
+ * Caches that are not of the two-to-the-power-of size.
+ * These have to be created immediately after the
+ * earlier power of two caches
+ */
+ if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6)
+ kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags);

- if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7)
- kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags);
- }
+ if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7)
+ kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags);
}

/* Kmalloc array is now usable */

2013-05-08 20:10:13

by Soren Brinkmann

[permalink] [raw]
Subject: Re: [PATCH] Fix crash during slab init

On Wed, May 08, 2013 at 03:56:28PM -0400, Chris Mason wrote:
> Commit 8a965b3b introduced a regression that caused us to crash early
> during boot. The commit was introducing ordering of slab creation,
> making sure two odd-sized slabs were created after specific powers of
> two sizes.
>
> But, if any of the power of two slabs were created earlier during boot,
> slabs at index 1 or 2 might not get created at all. This patch makes
> sure none of the slabs get skipped.
>
> Tony Lindgren bisected this down to the offending commit, which really
> helped because bisect kept bringing me to almost but not quite this one.
>
> Signed-off-by: Chris Mason <[email protected]>
> Acked-by: Christoph Lameter <[email protected]>
> Acked-by: Tony Lindgren <[email protected]>
Acked-by: Soren Brinkmann <[email protected]>

Fixes things for me on Zynq, too.

Sören

2013-05-08 21:01:39

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [GIT PULL] SLAB changes for v3.10

On Wed, May 08, 2013 at 02:13:53PM -0400, Chris Mason wrote:
> [ Sorry if I break the threading on this, I had to pull it off gmane ]
>
> On Tue, 7 May 2013, Tony Lindgren wrote:
> > OK got it narrowed down to CONFIG_DEBUG_SPINLOCK=y causing the problem
> > with commit 8a965b3b. Ain't nothing like bisecting and booting and then
> > diffing .config files on top of that.
>
> I'm unable to boot with slab on current Linus -master, and bisected it
> down almost as far as Tony did before trying SLUB and then finding this
> thread. My box is a standard x86-64, nothing exciting and spinlock
> debugging isn't on.
>
> A few printks stuffed into Christoph's code:
>
> cache ffff88047f000080 at index 5
> creating slab cache at 6
> create special cache #1 (this is kmalloc_caches[1])
> cache ffff88047f0001c0 at index 7
> creating slab cache at 8
> creating slab cache at 9
> creating slab cache at 10
> ... more get created
>
> Pulling this into the code from commit 8a965b3b:
>
> for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) {
> if (!kmalloc_caches[i]) {
> kmalloc_caches[i] = create_kmalloc_cache(NULL,
> 1 << i, flags);
>
> /*
> * Caches that are not of the two-to-the-power-of size.
> * These have to be created immediately after the
> * earlier power of two caches
> */
> if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6)
> kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags);
>
> if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7)
> kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags);
> }
> }
>
> kmalloc_caches[7] was not null, and so kmalloc_caches[2] was never
> created.
>
> I get this oops (with early printk on)
>
> ------------[ cut here ]------------
> kernel BUG at mm/slab.c:1635!
> invalid opcode: 0000 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.9.0-josef+ #920
> Hardware name: Supermicro X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 1.0a 03/06/2012
> task: ffffffff8196a410 ti: ffffffff8195a000 task.ti: ffffffff8195a000
> RIP: 0010:[<ffffffff81a49c9d>] [<ffffffff81a49c9d>] kmem_cache_init_late+0x40/0x7d
> RSP: 0000:ffffffff8195bf78 EFLAGS: 00010282
> RAX: 00000000fffffff4 RBX: ffff88047f006480 RCX: 000000000000ff31
> RDX: 000000000000000e RSI: 0000000000000046 RDI: ffffffff81baf238
> RBP: ffffffff8195bf80 R08: 0000000000000400 R09: 0000000000000000
> R10: 0000000000002fa4 R11: 0000000000000000 R12: ffffffff81ab58d0
> R13: ffffffff81abd2c0 R14: ffff88047ffaf0c0 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff88047fc00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffff88047ffff000 CR3: 0000000001965000 CR4: 00000000000406b0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Stack:
> ffffffffffffffff ffffffff8195bfc0 ffffffff81a25b64 ffffffff81a2573d
> ffffffff81abd2c0 0000000000003000 0000000000000000 0000000000000000
> 0000000000000000 ffffffff8195bfd0 ffffffff81a2547f ffffffff8195bfe8
> Call Trace:
> [<ffffffff81a25b64>] start_kernel+0x235/0x394
> [<ffffffff81a2573d>] ? repair_env_string+0x58/0x58
> [<ffffffff81a2547f>] x86_64_start_reservations+0x2a/0x2c
> [<ffffffff81a25548>] x86_64_start_kernel+0xc7/0xca
> Code: 53 e8 40 57 bd ff 48 8b 05 21 f2 f4 ff 48 8d 58 a8 48 8d 43 58 48 3d a0 8e 99 81 74 1a 31 f6 48 89 df e8 ba 7f 6d ff 85 c0 74 02 <0f> 0b 48 8b 5b 58 48 83 eb 58 eb da 48 c7 c7 70 8e 99 81 e8 e6
> RIP [<ffffffff81a49c9d>] kmem_cache_init_late+0x40/0x7d
> RSP <ffffffff8195bf78>
> ---[ end trace 2e5587581263f881 ]---
>
> This patch fixes things for me, but to maintain the rules from
> Christoph's patch, kmalloc_caches[2] should have been created whenever
> kmalloc_caches[7] was done.

And you might also want to add:

Tested-by: Konrad Rzeszutek Wilk <[email protected]>

>
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index d2517b0..ff3218a 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -446,18 +446,18 @@ void __init create_kmalloc_caches(unsigned long flags)
> if (!kmalloc_caches[i]) {
> kmalloc_caches[i] = create_kmalloc_cache(NULL,
> 1 << i, flags);
> + }
>
> - /*
> - * Caches that are not of the two-to-the-power-of size.
> - * These have to be created immediately after the
> - * earlier power of two caches
> - */
> - if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6)
> - kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags);
> + /*
> + * Caches that are not of the two-to-the-power-of size.
> + * These have to be created immediately after the
> + * earlier power of two caches
> + */
> + if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6)
> + kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags);
>
> - if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7)
> - kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags);
> - }
> + if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7)
> + kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags);
> }
>
> /* Kmalloc array is now usable */
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2013-05-08 21:48:49

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [PATCH] Fix crash during slab init

Chris Mason wrote:
> Commit 8a965b3b introduced a regression that caused us to crash early
> during boot. The commit was introducing ordering of slab creation,
> making sure two odd-sized slabs were created after specific powers of
> two sizes.
>
> But, if any of the power of two slabs were created earlier during boot,
> slabs at index 1 or 2 might not get created at all. This patch makes
> sure none of the slabs get skipped.
>
> Tony Lindgren bisected this down to the offending commit, which really
> helped because bisect kept bringing me to almost but not quite this one.
>
> Signed-off-by: Chris Mason <[email protected]>
> Acked-by: Christoph Lameter <[email protected]>
> Acked-by: Tony Lindgren <[email protected]>

This patch keeps my x86_32 debug kernel boots fine. Thank you.

2013-05-08 22:09:35

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] Fix crash during slab init

On Wed, 8 May 2013 15:56:28 -0400 Chris Mason <[email protected]> wrote:

> Commit 8a965b3b introduced a regression that caused us to crash early
> during boot. The commit was introducing ordering of slab creation,
> making sure two odd-sized slabs were created after specific powers of
> two sizes.
>
> But, if any of the power of two slabs were created earlier during boot,
> slabs at index 1 or 2 might not get created at all. This patch makes
> sure none of the slabs get skipped.
>
> Tony Lindgren bisected this down to the offending commit, which really
> helped because bisect kept bringing me to almost but not quite this one.

err, yes. Without your patch, current mainline does an ignominious
faceplant on my test box.


From: Chris Mason <[email protected]>
Subject: slab: fix crash during slab init

Commit 8a965b3b ("mm, slab_common: Fix bootstrap creation of kmalloc
caches") introduced a regression that caused us to crash early during
boot. The commit was introducing ordering of slab creation, making sure
two odd-sized slabs were created after specific powers of two sizes.

But, if any of the power of two slabs were created earlier during boot,
slabs at index 1 or 2 might not get created at all. This patch makes sure
none of the slabs get skipped.

Tony Lindgren bisected this down to the offending commit, which really
helped because bisect kept bringing me to almost but not quite this one.

Signed-off-by: Chris Mason <[email protected]>
Acked-by: Christoph Lameter <[email protected]>
Acked-by: Tony Lindgren <[email protected]>
Tested-by: Tetsuo Handa <[email protected]>
Cc: Pekka Enberg <[email protected]>
Tested-by: Andrew Morton <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---

mm/slab_common.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)

diff -puN mm/slab_common.c~slab-fix-crash-during-slab-init mm/slab_common.c
--- a/mm/slab_common.c~slab-fix-crash-during-slab-init
+++ a/mm/slab_common.c
@@ -446,18 +446,18 @@ void __init create_kmalloc_caches(unsign
if (!kmalloc_caches[i]) {
kmalloc_caches[i] = create_kmalloc_cache(NULL,
1 << i, flags);
+ }

- /*
- * Caches that are not of the two-to-the-power-of size.
- * These have to be created immediately after the
- * earlier power of two caches
- */
- if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6)
- kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags);
+ /*
+ * Caches that are not of the two-to-the-power-of size.
+ * These have to be created immediately after the
+ * earlier power of two caches
+ */
+ if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6)
+ kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags);

- if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7)
- kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags);
- }
+ if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7)
+ kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags);
}

/* Kmalloc array is now usable */
_