2019-11-06 22:25:30

by Laura Abbott

[permalink] [raw]
Subject: [PATCH] mm: slub: Really fix slab walking for init_on_free

Commit 1b7e816fc80e ("mm: slub: Fix slab walking for init_on_free")
fixed one problem with the slab walking but missed a key detail:
When walking the list, the head and tail pointers need to be updated
since we end up reversing the list as a result. Without doing this,
bulk free is broken. One way this is exposed is a NULL pointer with
slub_debug=F:

=============================================================================
BUG skbuff_head_cache (Tainted: G T): Object already free
-----------------------------------------------------------------------------

INFO: Slab 0x000000000d2d2f8f objects=16 used=3 fp=0x0000000064309071 flags=0x3fff00000000201
BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B T 5.3.8 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:print_trailer+0x70/0x1d5
Code: 28 4d 8b 4d 00 4d 8b 45 20 81 e2 ff 7f 00 00 e8 86 ce ef ff 8b 4b 20 48 89 ea 48 89 ee 4c 29 e2 48 c7 c7 90 6f d4 89 48 01 e9 <48> 33 09 48 33 8b 70 01 00 00 e8 61 ce ef ff f6 43 09 04 74 35 8b
RSP: 0018:ffffbf7680003d58 EFLAGS: 00010046
RAX: 000000000000005d RBX: ffffa3d2bb08e540 RCX: 0000000000000000
RDX: 00005c2d8fdc2000 RSI: 0000000000000000 RDI: ffffffff89d46f90
RBP: 0000000000000000 R08: 0000000000000242 R09: 000000000000006c
R10: 0000000000000000 R11: 0000000000000030 R12: ffffa3d27023e000
R13: fffff11080c08f80 R14: ffffa3d2bb047a80 R15: 0000000000000002
FS: 0000000000000000(0000) GS:ffffa3d2be400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000007a6c4000 CR4: 00000000000006f0
Call Trace:
<IRQ>
free_debug_processing.cold.37+0xc9/0x149
? __kfree_skb_flush+0x30/0x40
? __kfree_skb_flush+0x30/0x40
__slab_free+0x22a/0x3d0
? tcp_wfree+0x2a/0x140
? __sock_wfree+0x1b/0x30
kmem_cache_free_bulk+0x415/0x420
? __kfree_skb_flush+0x30/0x40
__kfree_skb_flush+0x30/0x40
net_rx_action+0x2dd/0x480
__do_softirq+0xf0/0x246
irq_exit+0x93/0xb0
do_IRQ+0xa0/0x110
common_interrupt+0xf/0xf
</IRQ>

Given we're now almost identical to the existing debugging
code which correctly walks the list, combine with that.

Link: https://lkml.kernel.org/r/[email protected]
Reported-by: Thibaut Sautereau <[email protected]>
Fixes: 1b7e816fc80e ("mm: slub: Fix slab walking for init_on_free")
Signed-off-by: Laura Abbott <[email protected]>
---
mm/slub.c | 39 +++++++++------------------------------
1 file changed, 9 insertions(+), 30 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index dac41cf0b94a..d2445dd1c7ed 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1432,12 +1432,15 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
void *old_tail = *tail ? *tail : *head;
int rsize;

- if (slab_want_init_on_free(s)) {
- void *p = NULL;
+ /* Head and tail of the reconstructed freelist */
+ *head = NULL;
+ *tail = NULL;

- do {
- object = next;
- next = get_freepointer(s, object);
+ do {
+ object = next;
+ next = get_freepointer(s, object);
+
+ if (slab_want_init_on_free(s)) {
/*
* Clear the object and the metadata, but don't touch
* the redzone.
@@ -1447,29 +1450,8 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
: 0;
memset((char *)object + s->inuse, 0,
s->size - s->inuse - rsize);
- set_freepointer(s, object, p);
- p = object;
- } while (object != old_tail);
- }
-
-/*
- * Compiler cannot detect this function can be removed if slab_free_hook()
- * evaluates to nothing. Thus, catch all relevant config debug options here.
- */
-#if defined(CONFIG_LOCKDEP) || \
- defined(CONFIG_DEBUG_KMEMLEAK) || \
- defined(CONFIG_DEBUG_OBJECTS_FREE) || \
- defined(CONFIG_KASAN)

- next = *head;
-
- /* Head and tail of the reconstructed freelist */
- *head = NULL;
- *tail = NULL;
-
- do {
- object = next;
- next = get_freepointer(s, object);
+ }
/* If object's reuse doesn't have to be delayed */
if (!slab_free_hook(s, object)) {
/* Move object to the new freelist */
@@ -1484,9 +1466,6 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
*tail = NULL;

return *head != NULL;
-#else
- return true;
-#endif
}

static void *setup_object(struct kmem_cache *s, struct page *page,
--
2.21.0


2019-11-07 11:16:30

by Alexander Potapenko

[permalink] [raw]
Subject: Re: [PATCH] mm: slub: Really fix slab walking for init_on_free

On Wed, Nov 6, 2019 at 11:22 PM Laura Abbott <[email protected]> wrote:
>
> Commit 1b7e816fc80e ("mm: slub: Fix slab walking for init_on_free")
> fixed one problem with the slab walking but missed a key detail:
> When walking the list, the head and tail pointers need to be updated
> since we end up reversing the list as a result. Without doing this,
> bulk free is broken. One way this is exposed is a NULL pointer with
> slub_debug=F:
Thanks for the fix!
Is it possible to reproduce the problem on a small test case that
could be added to lib/test_meminit.c?

> =============================================================================
> BUG skbuff_head_cache (Tainted: G T): Object already free
> -----------------------------------------------------------------------------
>
> INFO: Slab 0x000000000d2d2f8f objects=16 used=3 fp=0x0000000064309071 flags=0x3fff00000000201
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B T 5.3.8 #1
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
> RIP: 0010:print_trailer+0x70/0x1d5
> Code: 28 4d 8b 4d 00 4d 8b 45 20 81 e2 ff 7f 00 00 e8 86 ce ef ff 8b 4b 20 48 89 ea 48 89 ee 4c 29 e2 48 c7 c7 90 6f d4 89 48 01 e9 <48> 33 09 48 33 8b 70 01 00 00 e8 61 ce ef ff f6 43 09 04 74 35 8b
> RSP: 0018:ffffbf7680003d58 EFLAGS: 00010046
> RAX: 000000000000005d RBX: ffffa3d2bb08e540 RCX: 0000000000000000
> RDX: 00005c2d8fdc2000 RSI: 0000000000000000 RDI: ffffffff89d46f90
> RBP: 0000000000000000 R08: 0000000000000242 R09: 000000000000006c
> R10: 0000000000000000 R11: 0000000000000030 R12: ffffa3d27023e000
> R13: fffff11080c08f80 R14: ffffa3d2bb047a80 R15: 0000000000000002
> FS: 0000000000000000(0000) GS:ffffa3d2be400000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 000000007a6c4000 CR4: 00000000000006f0
> Call Trace:
> <IRQ>
> free_debug_processing.cold.37+0xc9/0x149
> ? __kfree_skb_flush+0x30/0x40
> ? __kfree_skb_flush+0x30/0x40
> __slab_free+0x22a/0x3d0
> ? tcp_wfree+0x2a/0x140
> ? __sock_wfree+0x1b/0x30
> kmem_cache_free_bulk+0x415/0x420
> ? __kfree_skb_flush+0x30/0x40
> __kfree_skb_flush+0x30/0x40
> net_rx_action+0x2dd/0x480
> __do_softirq+0xf0/0x246
> irq_exit+0x93/0xb0
> do_IRQ+0xa0/0x110
> common_interrupt+0xf/0xf
> </IRQ>
>
> Given we're now almost identical to the existing debugging
> code which correctly walks the list, combine with that.
>
> Link: https://lkml.kernel.org/r/[email protected]
> Reported-by: Thibaut Sautereau <[email protected]>
> Fixes: 1b7e816fc80e ("mm: slub: Fix slab walking for init_on_free")
> Signed-off-by: Laura Abbott <[email protected]>
Acked-by: Alexander Potapenko <[email protected]>
> ---
> mm/slub.c | 39 +++++++++------------------------------
> 1 file changed, 9 insertions(+), 30 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index dac41cf0b94a..d2445dd1c7ed 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1432,12 +1432,15 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
> void *old_tail = *tail ? *tail : *head;
> int rsize;
>
> - if (slab_want_init_on_free(s)) {
> - void *p = NULL;
> + /* Head and tail of the reconstructed freelist */
> + *head = NULL;
> + *tail = NULL;
>
> - do {
> - object = next;
> - next = get_freepointer(s, object);
> + do {
> + object = next;
> + next = get_freepointer(s, object);
> +
> + if (slab_want_init_on_free(s)) {
> /*
> * Clear the object and the metadata, but don't touch
> * the redzone.
> @@ -1447,29 +1450,8 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
> : 0;
> memset((char *)object + s->inuse, 0,
> s->size - s->inuse - rsize);
> - set_freepointer(s, object, p);
> - p = object;
> - } while (object != old_tail);
> - }
> -
> -/*
> - * Compiler cannot detect this function can be removed if slab_free_hook()
> - * evaluates to nothing. Thus, catch all relevant config debug options here.
> - */
> -#if defined(CONFIG_LOCKDEP) || \
> - defined(CONFIG_DEBUG_KMEMLEAK) || \
> - defined(CONFIG_DEBUG_OBJECTS_FREE) || \
> - defined(CONFIG_KASAN)
>
> - next = *head;
> -
> - /* Head and tail of the reconstructed freelist */
> - *head = NULL;
> - *tail = NULL;
> -
> - do {
> - object = next;
> - next = get_freepointer(s, object);
> + }
> /* If object's reuse doesn't have to be delayed */
> if (!slab_free_hook(s, object)) {
> /* Move object to the new freelist */
> @@ -1484,9 +1466,6 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
> *tail = NULL;
>
> return *head != NULL;
> -#else
> - return true;
> -#endif
> }
>
> static void *setup_object(struct kmem_cache *s, struct page *page,
> --
> 2.21.0
>


--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

2019-11-07 20:48:12

by David Rientjes

[permalink] [raw]
Subject: Re: [PATCH] mm: slub: Really fix slab walking for init_on_free

On Wed, 6 Nov 2019, Laura Abbott wrote:

> Commit 1b7e816fc80e ("mm: slub: Fix slab walking for init_on_free")
> fixed one problem with the slab walking but missed a key detail:
> When walking the list, the head and tail pointers need to be updated
> since we end up reversing the list as a result. Without doing this,
> bulk free is broken. One way this is exposed is a NULL pointer with
> slub_debug=F:
>
> =============================================================================
> BUG skbuff_head_cache (Tainted: G T): Object already free
> -----------------------------------------------------------------------------
>
> INFO: Slab 0x000000000d2d2f8f objects=16 used=3 fp=0x0000000064309071 flags=0x3fff00000000201
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B T 5.3.8 #1
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
> RIP: 0010:print_trailer+0x70/0x1d5
> Code: 28 4d 8b 4d 00 4d 8b 45 20 81 e2 ff 7f 00 00 e8 86 ce ef ff 8b 4b 20 48 89 ea 48 89 ee 4c 29 e2 48 c7 c7 90 6f d4 89 48 01 e9 <48> 33 09 48 33 8b 70 01 00 00 e8 61 ce ef ff f6 43 09 04 74 35 8b
> RSP: 0018:ffffbf7680003d58 EFLAGS: 00010046
> RAX: 000000000000005d RBX: ffffa3d2bb08e540 RCX: 0000000000000000
> RDX: 00005c2d8fdc2000 RSI: 0000000000000000 RDI: ffffffff89d46f90
> RBP: 0000000000000000 R08: 0000000000000242 R09: 000000000000006c
> R10: 0000000000000000 R11: 0000000000000030 R12: ffffa3d27023e000
> R13: fffff11080c08f80 R14: ffffa3d2bb047a80 R15: 0000000000000002
> FS: 0000000000000000(0000) GS:ffffa3d2be400000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 000000007a6c4000 CR4: 00000000000006f0
> Call Trace:
> <IRQ>
> free_debug_processing.cold.37+0xc9/0x149
> ? __kfree_skb_flush+0x30/0x40
> ? __kfree_skb_flush+0x30/0x40
> __slab_free+0x22a/0x3d0
> ? tcp_wfree+0x2a/0x140
> ? __sock_wfree+0x1b/0x30
> kmem_cache_free_bulk+0x415/0x420
> ? __kfree_skb_flush+0x30/0x40
> __kfree_skb_flush+0x30/0x40
> net_rx_action+0x2dd/0x480
> __do_softirq+0xf0/0x246
> irq_exit+0x93/0xb0
> do_IRQ+0xa0/0x110
> common_interrupt+0xf/0xf
> </IRQ>
>
> Given we're now almost identical to the existing debugging
> code which correctly walks the list, combine with that.
>
> Link: https://lkml.kernel.org/r/[email protected]
> Reported-by: Thibaut Sautereau <[email protected]>
> Fixes: 1b7e816fc80e ("mm: slub: Fix slab walking for init_on_free")
> Signed-off-by: Laura Abbott <[email protected]>

Acked-by: David Rientjes <[email protected]>