2013-08-08 13:19:55

by Steven Rostedt

Subject: [PATCH] Revert "slub: do not put a slab to cpu partial list when cpu_partial is 0"

This reverts commit 318df36e57c0ca9f2146660d41ff28e8650af423

When a box sets all cpu_partials to zero (for example with:

  find /sys/kernel/slab -name 'cpu_partial' -print | \
	while read -r f; do echo 0 > "$f"; done
)

and runs hackbench 50, a memory leak occurs rapidly, and the system can
crash with:

[ 74.172202] hackbench invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
[ 74.179968] hackbench cpuset=/ mems_allowed=0
[ 74.184345] CPU: 4 PID: 6390 Comm: hackbench Not tainted 3.11.0-rc4-test+ #28
[ 74.191503] Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
[ 74.200477] 0000000000000000 ffff880011cbb8a8 ffffffff8162024f 00000000000000cc
[ 74.207965] ffff880011c6c650 ffff880011cbb928 ffffffff8161ca6e 000000000000821b
[ 74.215482] 0000000000000b31 ffff880011cbb8d8 ffffffff810f15fe ffff880000000000
[ 74.223082] Call Trace:
[ 74.225548] [<ffffffff8162024f>] dump_stack+0x46/0x58
[ 74.230721] [<ffffffff8161ca6e>] dump_header+0x7a/0x1be
[ 74.236055] [<ffffffff810f15fe>] ? __delayacct_freepages_end+0x2e/0x30
[ 74.242699] [<ffffffff812ee4c3>] ? ___ratelimit+0x93/0x110
[ 74.248291] [<ffffffff8112f65b>] oom_kill_process+0x1cb/0x330
[ 74.254149] [<ffffffff8105da15>] ? has_ns_capability_noaudit+0x15/0x20
[ 74.260803] [<ffffffff8105da37>] ? has_capability_noaudit+0x17/0x20
[ 74.267188] [<ffffffff8112fe20>] out_of_memory+0x470/0x4c0
[ 74.272785] [<ffffffff81135659>] __alloc_pages_nodemask+0xab9/0xad0
[ 74.279166] [<ffffffff810a857c>] ? ktime_get_ts+0x4c/0xe0
[ 74.284669] [<ffffffff81173cb3>] alloc_pages_vma+0xa3/0x150
[ 74.290345] [<ffffffff8116786b>] read_swap_cache_async+0x10b/0x190
[ 74.296626] [<ffffffff8116798e>] swapin_readahead+0x9e/0xf0
[ 74.302299] [<ffffffff81154e4f>] handle_pte_fault+0x29f/0xa60
[ 74.308147] [<ffffffff811567d6>] handle_mm_fault+0x256/0x5d0
[ 74.313908] [<ffffffff8162aa02>] __do_page_fault+0x182/0x4c0
[ 74.319671] [<ffffffff81194eac>] ? vfs_write+0x17c/0x200
[ 74.325081] [<ffffffff8162ad4e>] do_page_fault+0xe/0x10
[ 74.330405] [<ffffffff816273f2>] page_fault+0x22/0x30


Joonsoo Kim said that if cpu_partial is set to zero, the following can
happen:

"In this case, memory leak is possible in following case.
The code flow of the possible leak is as follows.

* in __slab_free()
1. (!new.inuse || !prior) && !was_frozen
2. !kmem_cache_debug && !prior
3. new.frozen = 1
4. after cmpxchg_double_slab, run the (!n) case with new.frozen=1
5. with this patch, put_cpu_partial() doesn't do anything,
because this cache's cpu_partial is 0
6. return

In step 5, the leak occurs."
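
As a rough illustration of the flow above (not kernel code), the steps can
be modeled with a toy C sketch. Every name in it (toy_cache, toy_slab,
toy_put_cpu_partial, toy_slab_free, toy_slab_leaked) is hypothetical, and
the real put_cpu_partial() does more than link a slab onto a list. The
sketch only tries to show why returning early when cpu_partial is 0 can
leave a frozen slab that no list points to.

```c
#include <stdbool.h>

/* Toy stand-ins for the kernel structures; all names are hypothetical. */
enum slab_home { ON_NONE, ON_CPU_PARTIAL, ON_NODE_PARTIAL };

struct toy_cache {
	unsigned int cpu_partial;	/* models s->cpu_partial */
};

struct toy_slab {
	bool frozen;			/* models page->frozen */
	enum slab_home home;		/* which list, if any, holds the slab */
};

/*
 * Models put_cpu_partial(). When with_early_return is true, it mimics
 * the check that the revert removes.
 */
static void toy_put_cpu_partial(struct toy_cache *s, struct toy_slab *page,
				bool with_early_return)
{
	if (with_early_return && !s->cpu_partial)
		return;	/* slab stays frozen but is linked nowhere */

	page->home = ON_CPU_PARTIAL;
}

/*
 * Models freeing the first object of a full, unfrozen slab
 * (steps 1-6 of the quoted flow, heavily simplified).
 */
static void toy_slab_free(struct toy_cache *s, struct toy_slab *page,
			  bool with_early_return)
{
	page->frozen = true;					/* step 3 */
	toy_put_cpu_partial(s, page, with_early_return);	/* step 5 */
}

/* A slab that is frozen yet on no list can never be found again. */
static bool toy_slab_leaked(const struct toy_slab *page)
{
	return page->frozen && page->home == ON_NONE;
}
```

In this model, calling toy_slab_free() with cpu_partial set to 0 and
with_early_return set to true leaves the slab frozen with home == ON_NONE,
which is the stranded state described in step 5. With with_early_return
set to false the slab ends up on the (modeled) cpu partial list instead.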

Joonsoo said that there's a better fix for this problem, but for now the
best thing to do is to revert the patch that causes the leak.

Link:
https://lkml.kernel.org/r/[email protected]

Signed-off-by: Steven Rostedt <[email protected]>

diff --git a/mm/slub.c b/mm/slub.c
index 2b02d66..e3ba1f2 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1968,9 +1968,6 @@ static void put_cpu_partial(struct kmem_cache *s, struct page *page, int drain)
 	int pages;
 	int pobjects;
 
-	if (!s->cpu_partial)
-		return;
-
 	do {
 		pages = 0;
 		pobjects = 0;


2013-08-08 13:57:54

by Steven Rostedt

Subject: Re: [PATCH] Revert "slub: do not put a slab to cpu partial list when cpu_partial is 0"

On Thu, 2013-08-08 at 09:19 -0400, Steven Rostedt wrote:

> Link:
> https://lkml.kernel.org/r/[email protected]
>

Evolution is really pissing me off. The wrapping that is used when I
compose a message is not the same as what is sent. This looked fine when
I hit send, but it should have been (/me switches to Preformatted mode):

Link: https://lkml.kernel.org/r/[email protected]

Anyone know how to fix Evolution to have the same wrap in the composer
as what is sent?

I'm using Debian testing, XFCE desktop, and Evolution 3.4.4.

Thanks,

-- Steve

2013-08-08 14:35:40

by Pekka Enberg

Subject: Re: [PATCH] Revert "slub: do not put a slab to cpu partial list when cpu_partial is 0"

On Thu, Aug 8, 2013 at 4:19 PM, Steven Rostedt <[email protected]> wrote:
> This reverts commit 318df36e57c0ca9f2146660d41ff28e8650af423

[snip]

> Signed-off-by: Steven Rostedt <[email protected]>

Acked-by: Pekka Enberg <[email protected]>

Linus, can you pick this up or do you want a pull request?

Pekka

2013-08-08 16:12:36

by Linus Torvalds

Subject: Re: [PATCH] Revert "slub: do not put a slab to cpu partial list when cpu_partial is 0"

On Thu, Aug 8, 2013 at 7:35 AM, Pekka Enberg <[email protected]> wrote:
>
> Linus, can you pick this up or do you want a pull request?

I just did the revert when reading the original thread, and hadn't
even gotten to this new email thread with an actual patch.

Anyway, it's reverted in my tree,

Linus

2013-08-08 16:22:11

by Steven Rostedt

Subject: Re: [PATCH] Revert "slub: do not put a slab to cpu partial list when cpu_partial is 0"

On Thu, 2013-08-08 at 09:12 -0700, Linus Torvalds wrote:
> On Thu, Aug 8, 2013 at 7:35 AM, Pekka Enberg <[email protected]> wrote:
> >
> > Linus, can you pick this up or do you want a pull request?
>
> I just did the revert when reading the original thread, and hadn't
> even gotten to this new email thread with an actual patch.
>
> Anyway, it's reverted in my tree,
>

Cool, thanks!

-- Steve

2013-08-08 17:39:33

by Steven Rostedt

Subject: Re: [PATCH] Revert "slub: do not put a slab to cpu partial list when cpu_partial is 0"

On Thu, 2013-08-08 at 09:57 -0400, Steven Rostedt wrote:

> I'm using Debian testing, XFCE desktop, and Evolution 3.4.4.

I'm starting to play with claws-mail. This may become my new default
mailer :-)

-- Steve