2018-10-09 06:36:00

by Heiko Carstens

Subject: [BUG -next 20181008] list corruption with "mm/slub: remove useless condition in deactivate_slab"

Hello,

with linux-next for 20181008 I can reliably crash my system with lots of
debugging options enabled on s390. List debugging triggers the list
corruption below, which I could bisect down to this commit:

fde06e07750477f049f12d7d471ffa505338a3e7 is the first bad commit
commit fde06e07750477f049f12d7d471ffa505338a3e7
Author: Pingfan Liu <[email protected]>
Date: Thu Oct 4 07:43:01 2018 +1000

mm/slub: remove useless condition in deactivate_slab

The var l should be used to reflect the original list, on which the page
should be. But c->page is not on any list. Furthermore, the current code
does not update the value of l. Hence remove the related logic

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Pingfan Liu <[email protected]>
Acked-by: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Stephen Rothwell <[email protected]>

list_add double add: new=000003d1029ecc08, prev=000000008ff846d0,next=000003d1029ecc08.
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:31!
illegal operation: 0001 ilc:1 [#1] PREEMPT SMP
Modules linked in:
CPU: 3 PID: 106 Comm: (sd-executor) Not tainted 4.19.0-rc6-00291-gfde06e077504 #21
Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
Krnl PSW : (____ptrval____) (____ptrval____) (__list_add_valid+0x98/0xa8)
R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
Krnl GPRS: 0000000074311fdf 0000000080000001 0000000000000058 0000000000e7b8b2
0000000000000000 0000000075438c64 00000000a7b31928 001c007b00000000
000000008fe99d00 00000000a7b31b40 000003d1029ecc08 00000000a7c03a80
000003d1029ecc08 000000008ff84680 00000000007b5674 00000000a7c03960
Krnl Code: 00000000007b5668: c0200034734a larl %r2,e43cfc
00000000007b566e: c0e5ffd0cf51 brasl %r14,1cf510
#00000000007b5674: a7f40001 brc 15,7b5676
>00000000007b5678: a7290001 lghi %r2,1
00000000007b567c: ebcff0a00004 lmg %r12,%r15,160(%r15)
00000000007b5682: 07fe bcr 15,%r14
00000000007b5684: 0707 bcr 0,%r7
00000000007b5686: 0707 bcr 0,%r7
Call Trace:
([<00000000007b5674>] __list_add_valid+0x94/0xa8)
[<000000000037d30e>] deactivate_slab.isra.15+0x45e/0x810
[<000000000037ede4>] ___slab_alloc+0x76c/0x7c0
[<000000000037eeb0>] __slab_alloc.isra.16+0x78/0xa8
[<00000000003808c8>] kmem_cache_alloc+0x160/0x458
[<0000000000141a3a>] vm_area_dup+0x3a/0x60
[<0000000000142f0a>] copy_process+0xd72/0x2100
[<000000000014449a>] _do_fork+0xba/0x688
[<0000000000144bb0>] sys_clone+0x48/0x50
[<0000000000b8faf0>] system_call+0xd8/0x2d0
INFO: lockdep is turned off.
Last Breaking-Event-Address:
[<00000000007b5674>] __list_add_valid+0x94/0xa8

Kernel panic - not syncing: Fatal exception: panic_on_oops
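
The "list_add double add" message is printed by a sanity check that runs before an entry is linked into a list. What follows is a minimal user-space sketch of that kind of check, not the kernel's lib/list_debug.c; the names list_add_valid() and checked_list_add() are illustrative assumptions. It shows how adding an entry that is already first on the list gives new == next, the same pattern as in the report above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Circular doubly linked list node, in the style of the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Illustrative check: refuse inconsistent neighbours and repeated adds. */
static bool list_add_valid(struct list_head *entry, struct list_head *prev,
			   struct list_head *next)
{
	if (next->prev != prev || prev->next != next) {
		fprintf(stderr, "list_add corruption\n");
		return false;
	}
	if (entry == prev || entry == next) {
		fprintf(stderr, "list_add double add: new=%p, prev=%p, next=%p.\n",
			(void *)entry, (void *)prev, (void *)next);
		return false;
	}
	return true;
}

/* Insert entry right after head; the list is left untouched if the check fails. */
static bool checked_list_add(struct list_head *entry, struct list_head *head)
{
	struct list_head *prev = head;
	struct list_head *next = head->next;

	assert(entry && head);
	if (!list_add_valid(entry, prev, next))
		return false;

	next->prev = entry;
	entry->next = next;
	entry->prev = prev;
	prev->next = entry;
	return true;
}
```

Calling checked_list_add() twice with the same entry and head succeeds the first time and fails the second time, because head->next already points at the entry.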



2018-10-09 23:25:23

by Andrew Morton

Subject: Re: [BUG -next 20181008] list corruption with "mm/slub: remove useless condition in deactivate_slab"

On Tue, 9 Oct 2018 08:35:00 +0200 Heiko Carstens <[email protected]> wrote:

> with linux-next for 20181008 I can reliably crash my system with lots of
> debugging options enabled on s390. List debugging triggers the list
> corruption below, which I could bisect down to this commit:
>
> fde06e07750477f049f12d7d471ffa505338a3e7 is the first bad commit
> commit fde06e07750477f049f12d7d471ffa505338a3e7
> Author: Pingfan Liu <[email protected]>
> Date: Thu Oct 4 07:43:01 2018 +1000
>
> mm/slub: remove useless condition in deactivate_slab
>
> The var l should be used to reflect the original list, on which the page
> should be. But c->page is not on any list. Furthermore, the current code
> does not update the value of l. Hence remove the related logic
>
> Link: http://lkml.kernel.org/r/[email protected]
> Signed-off-by: Pingfan Liu <[email protected]>
> Acked-by: Christoph Lameter <[email protected]>
> Cc: Pekka Enberg <[email protected]>
> Cc: David Rientjes <[email protected]>
> Cc: Joonsoo Kim <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
> Signed-off-by: Stephen Rothwell <[email protected]>
>
> list_add double add: new=000003d1029ecc08, prev=000000008ff846d0,next=000003d1029ecc08.
> ------------[ cut here ]------------
> kernel BUG at lib/list_debug.c:31!

Thanks much. I'll drop
mm-slub-remove-useless-condition-in-deactivate_slab.patch.


2018-10-15 05:54:58

by Pingfan Liu

Subject: Re: [BUG -next 20181008] list corruption with "mm/slub: remove useless condition in deactivate_slab"

On Tue, Oct 9, 2018 at 2:35 PM Heiko Carstens <[email protected]> wrote:
>
> Hello,
>
> with linux-next for 20181008 I can reliably crash my system with lots of
> debugging options enabled on s390. List debugging triggers the list
> corruption below, which I could bisect down to this commit:
>
> fde06e07750477f049f12d7d471ffa505338a3e7 is the first bad commit
> commit fde06e07750477f049f12d7d471ffa505338a3e7
> Author: Pingfan Liu <[email protected]>
> Date: Thu Oct 4 07:43:01 2018 +1000
>
> mm/slub: remove useless condition in deactivate_slab
>
> The var l should be used to reflect the original list, on which the page
> should be. But c->page is not on any list. Furthermore, the current code
> does not update the value of l. Hence remove the related logic
>
> Link: http://lkml.kernel.org/r/[email protected]
> Signed-off-by: Pingfan Liu <[email protected]>
> Acked-by: Christoph Lameter <[email protected]>
> Cc: Pekka Enberg <[email protected]>
> Cc: David Rientjes <[email protected]>
> Cc: Joonsoo Kim <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
> Signed-off-by: Stephen Rothwell <[email protected]>
>
> list_add double add: new=000003d1029ecc08, prev=000000008ff846d0,next=000003d1029ecc08.
> ------------[ cut here ]------------
> kernel BUG at lib/list_debug.c:31!
> illegal operation: 0001 ilc:1 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 3 PID: 106 Comm: (sd-executor) Not tainted 4.19.0-rc6-00291-gfde06e077504 #21
> Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
> Krnl PSW : (____ptrval____) (____ptrval____) (__list_add_valid+0x98/0xa8)
> R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
> Krnl GPRS: 0000000074311fdf 0000000080000001 0000000000000058 0000000000e7b8b2
> 0000000000000000 0000000075438c64 00000000a7b31928 001c007b00000000
> 000000008fe99d00 00000000a7b31b40 000003d1029ecc08 00000000a7c03a80
> 000003d1029ecc08 000000008ff84680 00000000007b5674 00000000a7c03960
> Krnl Code: 00000000007b5668: c0200034734a larl %r2,e43cfc
> 00000000007b566e: c0e5ffd0cf51 brasl %r14,1cf510
> #00000000007b5674: a7f40001 brc 15,7b5676
> >00000000007b5678: a7290001 lghi %r2,1
> 00000000007b567c: ebcff0a00004 lmg %r12,%r15,160(%r15)
> 00000000007b5682: 07fe bcr 15,%r14
> 00000000007b5684: 0707 bcr 0,%r7
> 00000000007b5686: 0707 bcr 0,%r7
> Call Trace:
> ([<00000000007b5674>] __list_add_valid+0x94/0xa8)
> [<000000000037d30e>] deactivate_slab.isra.15+0x45e/0x810
> [<000000000037ede4>] ___slab_alloc+0x76c/0x7c0
> [<000000000037eeb0>] __slab_alloc.isra.16+0x78/0xa8
> [<00000000003808c8>] kmem_cache_alloc+0x160/0x458
> [<0000000000141a3a>] vm_area_dup+0x3a/0x60
> [<0000000000142f0a>] copy_process+0xd72/0x2100
> [<000000000014449a>] _do_fork+0xba/0x688
> [<0000000000144bb0>] sys_clone+0x48/0x50
> [<0000000000b8faf0>] system_call+0xd8/0x2d0
> INFO: lockdep is turned off.
> Last Breaking-Event-Address:
> [<00000000007b5674>] __list_add_valid+0x94/0xa8
>
> Kernel panic - not syncing: Fatal exception: panic_on_oops
>

Sorry, I only just saw this email, hence the late reply.
I think it is caused by the union of page->lru and page->next. It can be fixed by:
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 3a1a1db..4aa0fb5 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -56,6 +56,7 @@ struct kmem_cache_cpu {
#define slub_set_percpu_partial(c, p) \
({ \
slub_percpu_partial(c) = (p)->next; \
+ p->next = NULL; \
})

I will do some testing and post the fix.

Thanks,
Pingfan

2018-10-16 06:32:06

by Pingfan Liu

Subject: Re: [BUG -next 20181008] list corruption with "mm/slub: remove useless condition in deactivate_slab"

Hi Heiko,

On Mon, Oct 15, 2018 at 1:54 PM Pingfan Liu <[email protected]> wrote:
>
> On Tue, Oct 9, 2018 at 2:35 PM Heiko Carstens <[email protected]> wrote:
> >
> > Hello,
> >
> > with linux-next for 20181008 I can reliably crash my system with lots of
> > debugging options enabled on s390. List debugging triggers the list
> > corruption below, which I could bisect down to this commit:
> >
> > fde06e07750477f049f12d7d471ffa505338a3e7 is the first bad commit
> > commit fde06e07750477f049f12d7d471ffa505338a3e7
> > Author: Pingfan Liu <[email protected]>
> > Date: Thu Oct 4 07:43:01 2018 +1000
> >
> > mm/slub: remove useless condition in deactivate_slab
> >
> > The var l should be used to reflect the original list, on which the page
> > should be. But c->page is not on any list. Furthermore, the current code
> > does not update the value of l. Hence remove the related logic
> >
> > Link: http://lkml.kernel.org/r/[email protected]
> > Signed-off-by: Pingfan Liu <[email protected]>
> > Acked-by: Christoph Lameter <[email protected]>
> > Cc: Pekka Enberg <[email protected]>
> > Cc: David Rientjes <[email protected]>
> > Cc: Joonsoo Kim <[email protected]>
> > Signed-off-by: Andrew Morton <[email protected]>
> > Signed-off-by: Stephen Rothwell <[email protected]>
> >
> > list_add double add: new=000003d1029ecc08, prev=000000008ff846d0,next=000003d1029ecc08.
> > ------------[ cut here ]------------
> > kernel BUG at lib/list_debug.c:31!
> > illegal operation: 0001 ilc:1 [#1] PREEMPT SMP
> > Modules linked in:
> > CPU: 3 PID: 106 Comm: (sd-executor) Not tainted 4.19.0-rc6-00291-gfde06e077504 #21
> > Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
> > Krnl PSW : (____ptrval____) (____ptrval____) (__list_add_valid+0x98/0xa8)
> > R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
> > Krnl GPRS: 0000000074311fdf 0000000080000001 0000000000000058 0000000000e7b8b2
> > 0000000000000000 0000000075438c64 00000000a7b31928 001c007b00000000
> > 000000008fe99d00 00000000a7b31b40 000003d1029ecc08 00000000a7c03a80
> > 000003d1029ecc08 000000008ff84680 00000000007b5674 00000000a7c03960
> > Krnl Code: 00000000007b5668: c0200034734a larl %r2,e43cfc
> > 00000000007b566e: c0e5ffd0cf51 brasl %r14,1cf510
> > #00000000007b5674: a7f40001 brc 15,7b5676
> > >00000000007b5678: a7290001 lghi %r2,1
> > 00000000007b567c: ebcff0a00004 lmg %r12,%r15,160(%r15)
> > 00000000007b5682: 07fe bcr 15,%r14
> > 00000000007b5684: 0707 bcr 0,%r7
> > 00000000007b5686: 0707 bcr 0,%r7
> > Call Trace:
> > ([<00000000007b5674>] __list_add_valid+0x94/0xa8)
> > [<000000000037d30e>] deactivate_slab.isra.15+0x45e/0x810
> > [<000000000037ede4>] ___slab_alloc+0x76c/0x7c0
> > [<000000000037eeb0>] __slab_alloc.isra.16+0x78/0xa8
> > [<00000000003808c8>] kmem_cache_alloc+0x160/0x458
> > [<0000000000141a3a>] vm_area_dup+0x3a/0x60
> > [<0000000000142f0a>] copy_process+0xd72/0x2100
> > [<000000000014449a>] _do_fork+0xba/0x688
> > [<0000000000144bb0>] sys_clone+0x48/0x50
> > [<0000000000b8faf0>] system_call+0xd8/0x2d0
> > INFO: lockdep is turned off.
> > Last Breaking-Event-Address:
> > [<00000000007b5674>] __list_add_valid+0x94/0xa8
> >
> > Kernel panic - not syncing: Fatal exception: panic_on_oops
> >
>
> Sorry, I only just saw this email, hence the late reply.
> I think it is caused by the union of page->lru and page->next. It can be fixed by:
> diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
> index 3a1a1db..4aa0fb5 100644
> --- a/include/linux/slub_def.h
> +++ b/include/linux/slub_def.h
> @@ -56,6 +56,7 @@ struct kmem_cache_cpu {
> #define slub_set_percpu_partial(c, p) \
> ({ \
> slub_percpu_partial(c) = (p)->next; \
> + p->next = NULL; \
> })
>
> I will do some testing and post the fix.
>
Please ignore the above comment. After re-checking the code, I am
sure that all callers of deactivate_slab() pass c->page, which means
that the page should not be on any list. But your test result "list_add
double add: new=000003d1029ecc08,
prev=000000008ff846d0,next=000003d1029ecc08" indicates that
page (new) is already on a list. I think something else may be wrong
that was hidden until now.
I cannot reproduce this bug on x86. Could you share your config and
cmdline? And did you turn on any SLUB debug options?

Thanks,
Pingfan

2018-10-16 07:37:18

by Heiko Carstens

Subject: Re: [BUG -next 20181008] list corruption with "mm/slub: remove useless condition in deactivate_slab"

On Tue, Oct 16, 2018 at 02:29:28PM +0800, Pingfan Liu wrote:
> > I think it is caused by the union of page->lru and page->next. It can be fixed by:
> > diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
> > index 3a1a1db..4aa0fb5 100644
> > --- a/include/linux/slub_def.h
> > +++ b/include/linux/slub_def.h
> > @@ -56,6 +56,7 @@ struct kmem_cache_cpu {
> > #define slub_set_percpu_partial(c, p) \
> > ({ \
> > slub_percpu_partial(c) = (p)->next; \
> > + p->next = NULL; \
> > })
> >
> > I will do some testing and post the fix.
> >
> > Please ignore the above comment. After re-checking the code, I am
> > sure that all callers of deactivate_slab() pass c->page, which means
> > that the page should not be on any list. But your test result "list_add
> > double add: new=000003d1029ecc08,
> > prev=000000008ff846d0,next=000003d1029ecc08" indicates that
> > page (new) is already on a list. I think something else may be wrong
> > that was hidden until now.
> > I cannot reproduce this bug on x86. Could you share your config and
> > cmdline? And did you turn on any SLUB debug options?

You can re-create the config with "make ARCH=s390 debug_defconfig".

Not sure which machine I used to reproduce this but most likely it was
a machine with these command line options:

dasd=e12d root=/dev/dasda1 userprocess_debug numa_debug sched_debug
ignore_loglevel sclp_con_drop=1 sclp_con_pages=32 audit=0
crashkernel=128M ignore_rlimit_data

You can ignore the dasd and sclp* command line options. These are
s390 specific. The rest should be available on any architecture.


2018-10-19 01:18:25

by Pingfan Liu

Subject: Re: [BUG -next 20181008] list corruption with "mm/slub: remove useless condition in deactivate_slab"

On Tue, Oct 16, 2018 at 3:36 PM Heiko Carstens
<[email protected]> wrote:
>
> On Tue, Oct 16, 2018 at 02:29:28PM +0800, Pingfan Liu wrote:
> > > I think it is caused by the union of page->lru and page->next. It can be fixed by:
> > > diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
> > > index 3a1a1db..4aa0fb5 100644
> > > --- a/include/linux/slub_def.h
> > > +++ b/include/linux/slub_def.h
> > > @@ -56,6 +56,7 @@ struct kmem_cache_cpu {
> > > #define slub_set_percpu_partial(c, p) \
> > > ({ \
> > > slub_percpu_partial(c) = (p)->next; \
> > > + p->next = NULL; \
> > > })
> > >
> > > I will do some testing and post the fix.
> > >
> > > Please ignore the above comment. After re-checking the code, I am
> > > sure that all callers of deactivate_slab() pass c->page, which means
> > > that the page should not be on any list. But your test result "list_add
> > > double add: new=000003d1029ecc08,
> > > prev=000000008ff846d0,next=000003d1029ecc08" indicates that
> > > page (new) is already on a list. I think something else may be wrong
> > > that was hidden until now.
> > > I cannot reproduce this bug on x86. Could you share your config and
> > > cmdline? And did you turn on any SLUB debug options?
>
> You can re-create the config with "make ARCH=s390 debug_defconfig".
>
> Not sure which machine I used to reproduce this but most likely it was
> a machine with these command line options:
>
> dasd=e12d root=/dev/dasda1 userprocess_debug numa_debug sched_debug
> ignore_loglevel sclp_con_drop=1 sclp_con_pages=32 audit=0
> crashkernel=128M ignore_rlimit_data
>
> You can ignore the dasd and sclp* command line options. These are
> s390 specific. The rest should be available on any architecture.
>
Thank you for the info. I can reproduce the bug and have confirmed that
it is caused by this commit. In deactivate_slab(), the page is first
added with add_full(); if the redo condition is then hit, it has to be
taken off again with remove_full(). The faulty commit removed that code.
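
The sequence described above can be modelled roughly as follows. This is a toy user-space model, not the code in mm/slub.c: the toy_* names, the retry counter standing in for a failed compare-and-exchange, and the assumption that the slab goes from full to partial between passes are all illustrative. It only contrasts a loop that remembers which list the page was put on in the previous pass with one that does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: which list the slab page currently sits on. */
enum slab_mode { M_NONE, M_PARTIAL, M_FULL };

struct toy_page {
	enum slab_mode on_list;	/* list the page is linked into, if any */
	int double_adds;	/* adds attempted while already on a list */
};

static void toy_add(struct toy_page *page, enum slab_mode m)
{
	if (page->on_list != M_NONE)
		page->double_adds++;	/* what list debugging would flag */
	page->on_list = m;
}

static void toy_remove(struct toy_page *page)
{
	assert(page->on_list != M_NONE);
	page->on_list = M_NONE;
}

/*
 * retries:    how many times the simulated update fails and forces a redo.
 * track_prev: remember the previous pass's list (l) and undo that add first.
 * Returns the number of double adds observed.
 */
static int toy_deactivate(int retries, bool track_prev)
{
	struct toy_page page = { M_NONE, 0 };
	enum slab_mode l = M_NONE;	/* list chosen in the previous pass */
	enum slab_mode m;		/* list chosen in this pass */
	int attempt = 0;

	for (;;) {
		/* Assumed scenario: full on the first pass, partial afterwards. */
		m = (attempt == 0) ? M_FULL : M_PARTIAL;

		if (track_prev) {
			if (l != m) {
				if (l != M_NONE)
					toy_remove(&page);
				toy_add(&page, m);
			}
			l = m;
		} else {
			toy_add(&page, m);
		}

		if (attempt++ >= retries)
			break;	/* simulated update succeeded */
	}
	return page.double_adds;
}
```

In this model toy_deactivate(1, true) reports no double add, while toy_deactivate(1, false) reports one, because the second pass adds a page that the first pass already linked into a list.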

Regards,
Pingfan