The discard list has to be re-initialized after the empty slabs queued on
it are freed. Otherwise, if there are > 1 nodes, we can get a
use-after-free while processing the second or higher node:
WARNING: CPU: 60 PID: 1 at lib/list_debug.c:29 __list_add+0x3c/0xa9()
list_add corruption. next->prev should be prev (ffff881ff0a6bb98), but was ffffea007ff57020. (next=ffffea007fbf7320).
Modules linked in:
CPU: 60 PID: 1 Comm: swapper/0 Not tainted 3.19.0-rc7-next-20150203-gb50cadf #2178
Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BIVTSDP1.86B.0038.R02.1307231126 07/23/2013
0000000000000009 ffff881ff0a6ba88 ffffffff81c2e096 ffffffff810e2d03
ffff881ff0a6bad8 ffff881ff0a6bac8 ffffffff8108b320 ffff881ff0a6bb18
ffffffff8154bbc7 ffff881ff0a6bb98 ffffea007fbf7320 ffffea00ffc3c220
Call Trace:
[<ffffffff81c2e096>] dump_stack+0x4c/0x65
[<ffffffff810e2d03>] ? console_unlock+0x398/0x3c7
[<ffffffff8108b320>] warn_slowpath_common+0xa1/0xbb
[<ffffffff8154bbc7>] ? __list_add+0x3c/0xa9
[<ffffffff8108b380>] warn_slowpath_fmt+0x46/0x48
[<ffffffff8154bbc7>] __list_add+0x3c/0xa9
[<ffffffff811bf5aa>] __kmem_cache_shrink+0x12b/0x24c
[<ffffffff81190ca9>] kmem_cache_shrink+0x26/0x38
[<ffffffff815848b4>] acpi_os_purge_cache+0xe/0x12
[<ffffffff815c6424>] acpi_purge_cached_objects+0x32/0x7a
[<ffffffff825f70f1>] acpi_initialize_objects+0x17e/0x1ae
[<ffffffff825f5177>] ? acpi_sleep_proc_init+0x2a/0x2a
[<ffffffff825f5209>] acpi_init+0x92/0x25e
[<ffffffff810002bd>] ? do_one_initcall+0x90/0x17f
[<ffffffff811bdfcd>] ? kfree+0x1fc/0x2d5
[<ffffffff825f5177>] ? acpi_sleep_proc_init+0x2a/0x2a
[<ffffffff8100031a>] do_one_initcall+0xed/0x17f
[<ffffffff825ae0e2>] kernel_init_freeable+0x1f0/0x278
[<ffffffff81c1f31a>] ? rest_init+0x13e/0x13e
[<ffffffff81c1f328>] kernel_init+0xe/0xda
[<ffffffff81c3ca7c>] ret_from_fork+0x7c/0xb0
[<ffffffff81c1f31a>] ? rest_init+0x13e/0x13e
fixes: slub-never-fail-to-shrink-cache
Signed-off-by: Vladimir Davydov <[email protected]>
Reported-by: Huang Ying <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
---
mm/slub.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/slub.c b/mm/slub.c
index 0909e13cf708..59dde3f3efed 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3499,6 +3499,8 @@ int __kmem_cache_shrink(struct kmem_cache *s, bool deactivate)
 		list_for_each_entry_safe(page, t, &discard, lru)
 			discard_slab(s, page);
 
+		INIT_LIST_HEAD(&discard);
+
 		if (slabs_node(s, node))
 			ret = 1;
 	}
--
1.7.10.4
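For reference, the failure mode can be reproduced outside the kernel in a
few lines of C. The sketch below is a userspace mock, not code from
mm/slub.c: the fake_page type, the simplified list_add() and the stand-in
discard_slab() are illustrative only. It shows the same pattern as above:
the on-stack discard head is initialized once, the page queued on it is
freed without being unlinked, and the next list_add() against the stale
head fails the same next->prev consistency check that CONFIG_DEBUG_LIST
prints in the trace.

#include <stdio.h>
#include <string.h>

struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h->prev = h;
}

static void list_add(struct list_head *new, struct list_head *head)
{
	/* the next->prev sanity check CONFIG_DEBUG_LIST performs before linking */
	if (head->next->prev != head)
		printf("list_add corruption: next->prev != head\n");
	new->next = head->next;
	new->prev = head;
	head->next->prev = new;
	head->next = new;
}

struct fake_page { struct list_head lru; };

/*
 * Stand-in for discard_slab(): "frees" the page but, like the real one,
 * never unlinks it from the temporary list it sits on.
 */
static void discard_slab(struct fake_page *page)
{
	memset(&page->lru, 0x6b, sizeof(page->lru));	/* POISON_FREE-style scribble */
}

int main(void)
{
	struct fake_page pages[2];
	struct list_head discard;
	int node;

	INIT_LIST_HEAD(&discard);		/* initialized once, as before the fix */

	for (node = 0; node < 2; node++) {	/* pretend we have two NUMA nodes */
		struct fake_page *page = &pages[node];

		list_add(&page->lru, &discard);	/* collect the empty slab */
		discard_slab(page);		/* free it, no list_del() */
		/* the fix: INIT_LIST_HEAD(&discard); here, before the next node */
	}
	return 0;
}

On the second iteration the message fires, because discard.next still
points at the page that was freed while shrinking the first node.
Re-initializing the head at the end of each iteration, which is what the
hunk above adds, makes the head point at itself again before it is reused.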
On Wed, 11 Feb 2015, Vladimir Davydov wrote:
> Otherwise, if there are > 1 nodes, we can get a use-after-free while
> processing the second or higher node:
Acked-by: Christoph Lameter <[email protected]>
Hmmmm... Thinking about this some more. It may be better to initialize the
list head at the beginning of the loop?
Also, the promote array should be initialized in the loop, right?
On Wed, Feb 11, 2015 at 09:00:39AM -0600, Christoph Lameter wrote:
> Hmmmm... Thinking about this some more. It may be better to initialize the
> list head at the beginning of the loop?
>
> Also, the promote array should be initialized in the loop, right?
I do initialize the promote lists in the loop, via list_splice_init, but
yeah, initializing them at the beginning of the loop would indeed be more
readable. The updated patch is below. Thanks!
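To spell out the difference the two versions rely on: list_splice_init()
re-initializes the source head after splicing, so reusing the promote[]
heads across iterations was safe in the first version even without an
explicit INIT_LIST_HEAD(); once the heads are re-initialized at the top of
every iteration, plain list_splice() is enough, which is what the updated
patch switches to. Below is a minimal userspace sketch with mock helpers
modelled on the kernel list API (the names and the single-entry setup are
illustrative only, not code from mm/slub.c):

#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add(struct list_head *new, struct list_head *head)
{
	new->next = head->next;
	new->prev = head;
	head->next->prev = new;
	head->next = new;
}

/* splice @list into @head; @list itself is left untouched (stale) */
static void list_splice(const struct list_head *list, struct list_head *head)
{
	if (list_empty(list))
		return;
	list->next->prev = head;
	list->prev->next = head->next;
	head->next->prev = list->prev;
	head->next = list->next;
}

/* same, but re-initialize @list so it can be reused right away */
static void list_splice_init(struct list_head *list, struct list_head *head)
{
	list_splice(list, head);
	INIT_LIST_HEAD(list);
}

int main(void)
{
	struct list_head promote, partial, entry;

	/* v1 approach: list_splice_init() empties the source head itself */
	INIT_LIST_HEAD(&promote);
	INIT_LIST_HEAD(&partial);
	list_add(&entry, &promote);		/* one slab waiting to be promoted */
	list_splice_init(&promote, &partial);
	printf("after list_splice_init: promote is %s\n",
	       list_empty(&promote) ? "empty, safe to reuse as is" : "stale");

	/*
	 * v2 approach: plain list_splice(), so the head must be re-initialized
	 * explicitly -- which the updated patch does at the top of the loop.
	 */
	INIT_LIST_HEAD(&promote);
	INIT_LIST_HEAD(&partial);
	list_add(&entry, &promote);
	list_splice(&promote, &partial);
	printf("after list_splice: promote is %s\n",
	       list_empty(&promote) ? "empty" : "stale until INIT_LIST_HEAD()");
	INIT_LIST_HEAD(&promote);		/* re-initialize before the next iteration */
	return 0;
}

Either variant leaves the destination list correct; the only difference is
whether the promote head can be reused immediately or needs the explicit
re-initialization that the updated patch moves to the top of the loop.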
---
From: Vladimir Davydov <[email protected]>
Subject: [PATCH] slub: kmem_cache_shrink: fix crash due to uninitialized
discard list
Currently, the discard list is initialized only once, at the beginning of
the function. As a result, if there are > 1 nodes, we can get a
use-after-free while processing the second or higher node:
WARNING: CPU: 60 PID: 1 at lib/list_debug.c:29 __list_add+0x3c/0xa9()
list_add corruption. next->prev should be prev (ffff881ff0a6bb98), but was ffffea007ff57020. (next=ffffea007fbf7320).
Modules linked in:
CPU: 60 PID: 1 Comm: swapper/0 Not tainted 3.19.0-rc7-next-20150203-gb50cadf #2178
Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BIVTSDP1.86B.0038.R02.1307231126 07/23/2013
0000000000000009 ffff881ff0a6ba88 ffffffff81c2e096 ffffffff810e2d03
ffff881ff0a6bad8 ffff881ff0a6bac8 ffffffff8108b320 ffff881ff0a6bb18
ffffffff8154bbc7 ffff881ff0a6bb98 ffffea007fbf7320 ffffea00ffc3c220
Call Trace:
[<ffffffff81c2e096>] dump_stack+0x4c/0x65
[<ffffffff810e2d03>] ? console_unlock+0x398/0x3c7
[<ffffffff8108b320>] warn_slowpath_common+0xa1/0xbb
[<ffffffff8154bbc7>] ? __list_add+0x3c/0xa9
[<ffffffff8108b380>] warn_slowpath_fmt+0x46/0x48
[<ffffffff8154bbc7>] __list_add+0x3c/0xa9
[<ffffffff811bf5aa>] __kmem_cache_shrink+0x12b/0x24c
[<ffffffff81190ca9>] kmem_cache_shrink+0x26/0x38
[<ffffffff815848b4>] acpi_os_purge_cache+0xe/0x12
[<ffffffff815c6424>] acpi_purge_cached_objects+0x32/0x7a
[<ffffffff825f70f1>] acpi_initialize_objects+0x17e/0x1ae
[<ffffffff825f5177>] ? acpi_sleep_proc_init+0x2a/0x2a
[<ffffffff825f5209>] acpi_init+0x92/0x25e
[<ffffffff810002bd>] ? do_one_initcall+0x90/0x17f
[<ffffffff811bdfcd>] ? kfree+0x1fc/0x2d5
[<ffffffff825f5177>] ? acpi_sleep_proc_init+0x2a/0x2a
[<ffffffff8100031a>] do_one_initcall+0xed/0x17f
[<ffffffff825ae0e2>] kernel_init_freeable+0x1f0/0x278
[<ffffffff81c1f31a>] ? rest_init+0x13e/0x13e
[<ffffffff81c1f328>] kernel_init+0xe/0xda
[<ffffffff81c3ca7c>] ret_from_fork+0x7c/0xb0
[<ffffffff81c1f31a>] ? rest_init+0x13e/0x13e
Fix this by initializing the discard list at each iteration of the
for_each_kmem_cache_node loop. Also, move the initialization of the promote
lists to the beginning of the loop for consistency.
fixes: slub-never-fail-to-shrink-cache
Signed-off-by: Vladimir Davydov <[email protected]>
Reported-by: Huang Ying <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
diff --git a/mm/slub.c b/mm/slub.c
index 0909e13cf708..6832c4eab104 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3437,7 +3437,7 @@ int __kmem_cache_shrink(struct kmem_cache *s, bool deactivate)
 	struct kmem_cache_node *n;
 	struct page *page;
 	struct page *t;
-	LIST_HEAD(discard);
+	struct list_head discard;
 	struct list_head promote[SHRINK_PROMOTE_MAX];
 	unsigned long flags;
 	int ret = 0;
@@ -3457,11 +3457,12 @@ int __kmem_cache_shrink(struct kmem_cache *s, bool deactivate)
 		kick_all_cpus_sync();
 	}
 
-	for (i = 0; i < SHRINK_PROMOTE_MAX; i++)
-		INIT_LIST_HEAD(promote + i);
-
 	flush_all(s);
 	for_each_kmem_cache_node(s, node, n) {
+		INIT_LIST_HEAD(&discard);
+		for (i = 0; i < SHRINK_PROMOTE_MAX; i++)
+			INIT_LIST_HEAD(promote + i);
+
 		spin_lock_irqsave(&n->list_lock, flags);
 
 		/*
@@ -3491,7 +3492,7 @@ int __kmem_cache_shrink(struct kmem_cache *s, bool deactivate)
 		 * partial list.
 		 */
 		for (i = SHRINK_PROMOTE_MAX - 1; i >= 0; i--)
-			list_splice_init(promote + i, &n->partial);
+			list_splice(promote + i, &n->partial);
 
 		spin_unlock_irqrestore(&n->list_lock, flags);
On Wed, 11 Feb 2015, Vladimir Davydov wrote:
> > Also, the promote array should be initialized in the loop, right?
>
> I do initialize the promote lists in the loop, via list_splice_init, but
> yeah, initializing them at the beginning of the loop would indeed be more
> readable. The updated patch is below. Thanks!
Acked-by: Christoph Lameter <[email protected]>
On Wed, 11 Feb 2015, Vladimir Davydov wrote:
> Currently, the discard list is initialized only once, at the beginning of
> the function. As a result, if there are > 1 nodes, we can get a
> use-after-free while processing the second or higher node:
>
> WARNING: CPU: 60 PID: 1 at lib/list_debug.c:29 __list_add+0x3c/0xa9()
> list_add corruption. next->prev should be prev (ffff881ff0a6bb98), but was ffffea007ff57020. (next=ffffea007fbf7320).
> Modules linked in:
> CPU: 60 PID: 1 Comm: swapper/0 Not tainted 3.19.0-rc7-next-20150203-gb50cadf #2178
> Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BIVTSDP1.86B.0038.R02.1307231126 07/23/2013
> 0000000000000009 ffff881ff0a6ba88 ffffffff81c2e096 ffffffff810e2d03
> ffff881ff0a6bad8 ffff881ff0a6bac8 ffffffff8108b320 ffff881ff0a6bb18
> ffffffff8154bbc7 ffff881ff0a6bb98 ffffea007fbf7320 ffffea00ffc3c220
> Call Trace:
> [<ffffffff81c2e096>] dump_stack+0x4c/0x65
> [<ffffffff810e2d03>] ? console_unlock+0x398/0x3c7
> [<ffffffff8108b320>] warn_slowpath_common+0xa1/0xbb
> [<ffffffff8154bbc7>] ? __list_add+0x3c/0xa9
> [<ffffffff8108b380>] warn_slowpath_fmt+0x46/0x48
> [<ffffffff8154bbc7>] __list_add+0x3c/0xa9
> [<ffffffff811bf5aa>] __kmem_cache_shrink+0x12b/0x24c
> [<ffffffff81190ca9>] kmem_cache_shrink+0x26/0x38
> [<ffffffff815848b4>] acpi_os_purge_cache+0xe/0x12
> [<ffffffff815c6424>] acpi_purge_cached_objects+0x32/0x7a
> [<ffffffff825f70f1>] acpi_initialize_objects+0x17e/0x1ae
> [<ffffffff825f5177>] ? acpi_sleep_proc_init+0x2a/0x2a
> [<ffffffff825f5209>] acpi_init+0x92/0x25e
> [<ffffffff810002bd>] ? do_one_initcall+0x90/0x17f
> [<ffffffff811bdfcd>] ? kfree+0x1fc/0x2d5
> [<ffffffff825f5177>] ? acpi_sleep_proc_init+0x2a/0x2a
> [<ffffffff8100031a>] do_one_initcall+0xed/0x17f
> [<ffffffff825ae0e2>] kernel_init_freeable+0x1f0/0x278
> [<ffffffff81c1f31a>] ? rest_init+0x13e/0x13e
> [<ffffffff81c1f328>] kernel_init+0xe/0xda
> [<ffffffff81c3ca7c>] ret_from_fork+0x7c/0xb0
> [<ffffffff81c1f31a>] ? rest_init+0x13e/0x13e
>
> Fix this by initializing the discard list at each iteration of the
> for_each_kmem_cache_node loop. Also, move the initialization of the promote
> lists to the beginning of the loop for consistency.
>
> fixes: slub-never-fail-to-shrink-cache
8f44a586ac86 ("slub: never fail to shrink cache")
> Signed-off-by: Vladimir Davydov <[email protected]>
> Reported-by: Huang Ying <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Pekka Enberg <[email protected]>
> Cc: David Rientjes <[email protected]>
> Cc: Joonsoo Kim <[email protected]>
Acked-by: David Rientjes <[email protected]>