2009-06-11 08:44:32

by Pekka Enberg

[permalink] [raw]
Subject: [PATCH 2/2] SLUB: Disable debugging if it increases the minimum page order

From: Pekka Enberg <[email protected]>

As CONFIG_SLUB_DEBUG is almost always enabled, we need to disable debugging on
per-cache basis if the added debug metadata increases minimum page order. This
is a problem for large caches such as kmalloc-4096 as seen in the following
bugzilla report:

http://bugzilla.kernel.org/show_bug.cgi?id=13319

Cc: Christoph Lameter <[email protected]>
Cc: Larry Finger <[email protected]>
Cc: Mel Gorman <[email protected]>
Signed-off-by: Pekka Enberg <[email protected]>
---
mm/slub.c | 13 +++++++++++++
1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 2bbacfc..45080d3 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2385,6 +2385,16 @@ static int calculate_sizes(struct kmem_cache *s, int forced_order)

}

+#define MAX_DEBUG_SIZE (3 * sizeof(void *) + 2 * sizeof(struct track))
+
+static bool must_disable_debug(size_t size)
+{
+ /*
+ * Disable debugging if it increases the minimum page order.
+ */
+ return get_order(size + MAX_DEBUG_SIZE) > get_order(size);
+}
+
static int kmem_cache_open(struct kmem_cache *s, gfp_t gfpflags,
const char *name, size_t size,
size_t align, unsigned long flags,
@@ -2397,6 +2407,9 @@ static int kmem_cache_open(struct kmem_cache *s, gfp_t gfpflags,
s->align = align;
s->flags = kmem_cache_flags(size, flags, name, ctor);

+ if (must_disable_debug(size))
+ s->flags &= ~(SLAB_POISON|SLAB_RED_ZONE|SLAB_STORE_USER);
+
if (!calculate_sizes(s, -1))
goto error;

--
1.6.0.4


2009-06-11 13:44:17

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH 2/2] SLUB: Disable debugging if it increases the minimum page order

We have had that with SLAB. NO! This leads to the situation that some
slabs have debug on and some have not. You just do not know which.

Note that CONFIG_SLUB_DEBUG only enables the code to debug a slab. It does
not enable debugging for each slab. CONFIG_SLUB_DEBUG_ON does that.

On Thu, 11 Jun 2009, Pekka J Enberg wrote:

> From: Pekka Enberg <[email protected]>
>
> As CONFIG_SLUB_DEBUG is almost always enabled, we need to disable debugging on
> per-cache basis if the added debug metadata increases minimum page order. This
> is a problem for large caches such as kmalloc-4096 as seen in the following
> bugzilla report:
>
> http://bugzilla.kernel.org/show_bug.cgi?id=13319
>
> Cc: Christoph Lameter <[email protected]>
> Cc: Larry Finger <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Signed-off-by: Pekka Enberg <[email protected]>
> ---
> mm/slub.c | 13 +++++++++++++
> 1 files changed, 13 insertions(+), 0 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 2bbacfc..45080d3 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2385,6 +2385,16 @@ static int calculate_sizes(struct kmem_cache *s, int forced_order)
>
> }
>
> +#define MAX_DEBUG_SIZE (3 * sizeof(void *) + 2 * sizeof(struct track))
> +
> +static bool must_disable_debug(size_t size)
> +{
> + /*
> + * Disable debugging if it increases the minimum page order.
> + */
> + return get_order(size + MAX_DEBUG_SIZE) > get_order(size);
> +}
> +
> static int kmem_cache_open(struct kmem_cache *s, gfp_t gfpflags,
> const char *name, size_t size,
> size_t align, unsigned long flags,
> @@ -2397,6 +2407,9 @@ static int kmem_cache_open(struct kmem_cache *s, gfp_t gfpflags,
> s->align = align;
> s->flags = kmem_cache_flags(size, flags, name, ctor);
>
> + if (must_disable_debug(size))
> + s->flags &= ~(SLAB_POISON|SLAB_RED_ZONE|SLAB_STORE_USER);
> +
> if (!calculate_sizes(s, -1))
> goto error;
>
>

2009-06-11 14:00:31

by Pekka Enberg

[permalink] [raw]
Subject: Re: [PATCH 2/2] SLUB: Disable debugging if it increases the minimum page order

Hi Christoph,

On Thu, 2009-06-11 at 09:43 -0400, Christoph Lameter wrote:
> We have had that with SLAB. NO! This leads to the situation that some
> slabs have debug on and some have not. You just do not know which.

I do see your point but surely we don't want to use order 1 allocations
in the fall-back case for kmalloc-4096? Couldn't we just add a printk
saying that debug was disabled for the cache? After all, my patch is
much better than what SLAB does.

On Thu, 2009-06-11 at 09:43 -0400, Christoph Lameter wrote:
> Note that CONFIG_SLUB_DEBUG only enables the code to debug a slab. It does
> not enable debugging for each slab. CONFIG_SLUB_DEBUG_ON does that.

True. Larry, do you have CONFIG_SLUB_DEBUG_ON enabled or are you passing
SLUB debugging options to the kernel?

Pekka

2009-06-11 14:24:48

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH 2/2] SLUB: Disable debugging if it increases the minimum page order

On Thu, 11 Jun 2009, Pekka Enberg wrote:

> On Thu, 2009-06-11 at 09:43 -0400, Christoph Lameter wrote:
> > We have had that with SLAB. NO! This leads to the situation that some
> > slabs have debug on and some have not. You just do not know which.
>
> I do see your point but surely we don't want to use order 1 allocations
> in the fall-back case for kmalloc-4096? Couldn't we just add a printk
> saying that debug was disabled for the cache? After all, my patch is
> much better than what SLAB does.

If we are enabling global debugging then we are looking for memory
corruption in *all* slab caches. Disabling debugging of some cache behind
the scenes is bad even if this leads to order 1 allocations.

We could refine the way to specify groups of slab caches that should have
debugging on.

> On Thu, 2009-06-11 at 09:43 -0400, Christoph Lameter wrote:
> > Note that CONFIG_SLUB_DEBUG only enables the code to debug a slab. It does
> > not enable debugging for each slab. CONFIG_SLUB_DEBUG_ON does that.
>
> True. Larry, do you have CONFIG_SLUB_DEBUG_ON enabled or are you passing
> SLUB debugging options to the kernel?

2009-06-11 14:39:45

by Larry Finger

[permalink] [raw]
Subject: Re: [PATCH 2/2] SLUB: Disable debugging if it increases the minimum page order

Pekka Enberg wrote:
> Hi Christoph,
>
> On Thu, 2009-06-11 at 09:43 -0400, Christoph Lameter wrote:
>> We have had that with SLAB. NO! This leads to the situation that some
>> slabs have debug on and some have not. You just do not know which.
>
> I do see your point but surely we don't want to use order 1 allocations
> in the fall-back case for kmalloc-4096? Couldn't we just add a printk
> saying that debug was disabled for the cache? After all, my patch is
> much better than what SLAB does.
>
> On Thu, 2009-06-11 at 09:43 -0400, Christoph Lameter wrote:
>> Note that CONFIG_SLUB_DEBUG only enables the code to debug a slab. It does
>> not enable debugging for each slab. CONFIG_SLUB_DEBUG_ON does that.
>
> True. Larry, do you have CONFIG_SLUB_DEBUG_ON enabled or are you passing
> SLUB debugging options to the kernel?

It is enabled in the kernel. My SLUB options in .config are:

finger@larrylap:~> grep SLUB .config
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB=y
CONFIG_SLUB_DEBUG_ON=y
# CONFIG_SLUB_STATS is not set

Would having STATS enabled help?

For a bug that hits infrequently, determining that it is fixed is
difficult. That said, my system has been up about 22.5 hours during
which I have tried to force the failure. I see the fragmentation of
memory vary widely as shown below:

finger@larrylap:~> date ; cat /proc/buddyinfo
Wed Jun 10 20:15:34 CDT 2009
Node 0, zone DMA 4 3 5 2 4 1 2
0 1 0 0
Node 0, zone DMA32 5920 11678 2245 369 117 21 2
1 0 0 0
finger@larrylap:~> date ; cat /proc/buddyinfo
Wed Jun 10 23:32:39 CDT 2009
Node 0, zone DMA 4 3 5 2 4 1 2
0 1 0 0
Node 0, zone DMA32 2605 4140 3804 7 0 1 1
1 0 0 0
finger@larrylap:~> date ; cat /proc/buddyinfo
Thu Jun 11 09:28:06 CDT 2009
Node 0, zone DMA 4 3 5 2 4 1 2
0 1 0 0
Node 0, zone DMA32 2231 429 2726 54 1 0 1
1 0 0 0
finger@larrylap:~> cat /proc/uptime
80678.11 78.54

The latest value of the number of O(1) fragments is about as low as I
have seen.

Larry

2009-06-11 15:12:24

by Pekka Enberg

[permalink] [raw]
Subject: Re: [PATCH 2/2] SLUB: Disable debugging if it increases the minimum page order

Hi Christoph,

On Thu, 11 Jun 2009, Pekka Enberg wrote:
> > On Thu, 2009-06-11 at 09:43 -0400, Christoph Lameter wrote:
> > > We have had that with SLAB. NO! This leads to the situation that some
> > > slabs have debug on and some have not. You just do not know which.
> >
> > I do see your point but surely we don't want to use order 1 allocations
> > in the fall-back case for kmalloc-4096? Couldn't we just add a printk
> > saying that debug was disabled for the cache? After all, my patch is
> > much better than what SLAB does.

On Thu, 2009-06-11 at 10:24 -0400, Christoph Lameter wrote:
> If we are enabling global debugging then we are looking for memory
> corruption in *all* slab caches. Disabling debugging of some cache behind
> the scenes is bad even if this leads to order 1 allocations.
>
> We could refine the way to specify groups of slab caches that should have
> debugging on.

I think my patch is the simplest solution here: it turns off debugging
for those caches where the metadata bumps up the minimum allocation
order and I suspect that we disable cache only for 4096 in practice.

Also note that when we switch back to more aggressive page allocator
pass-through, we lose SLUB debugging support.

Pekka

2009-06-11 15:21:07

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH 2/2] SLUB: Disable debugging if it increases the minimum page order

On Thu, 11 Jun 2009, Pekka Enberg wrote:

> I think my patch is the simplest solution here: it turns off debugging
> for those caches where the metadata bumps up the minimum allocation
> order and I suspect that we disable cache only for 4096 in practice.

There could be other slab caches that would randomly be affected by this.
I ran a couple of times into situations where I had to hack slab to do
debugging of exempted slabs.

> Also note that when we switch back to more aggressive page allocator
> pass-through, we lose SLUB debugging support.

Debugging options can have side effects. Slab debugging is not something
suitable by default for production environments.

2009-06-11 15:28:37

by Pekka Enberg

[permalink] [raw]
Subject: Re: [PATCH 2/2] SLUB: Disable debugging if it increases the minimum page order

Hi Christoph,

On Thu, 11 Jun 2009, Pekka Enberg wrote:
> > I think my patch is the simplest solution here: it turns off debugging
> > for those caches where the metadata bumps up the minimum allocation
> > order and I suspect that we disable cache only for 4096 in practice.

On Thu, 2009-06-11 at 11:20 -0400, Christoph Lameter wrote:
> There could be other slab caches that would randomly be affected by this.
> I ran a couple of times into situations where I had to hack slab to do
> debugging of exempted slabs.

Yes, I do understand that and I have hit that too in the past. But
surely a printk() telling the user we disabled debugging can cure that?

My main point is that a lot of _testers_ will probably enable all SLUB
debugging by default because we encourage them to and it's pretty bad
that we end up causing order 1 allocations and oom conditions.

So I still think we need to fix _at minimum_ the kmalloc-4096 case
(assuming Larry won't hit the same problem still). I see you're not
happy with my patch so any suggestions how to handle that?

Pekka

2009-06-11 15:49:36

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH 2/2] SLUB: Disable debugging if it increases the minimum page order

On Thu, 11 Jun 2009, Pekka Enberg wrote:

> My main point is that a lot of _testers_ will probably enable all SLUB
> debugging by default because we encourage them to and it's pretty bad
> that we end up causing order 1 allocations and oom conditions.

Other test methods (like PAGE_ALLOC debugging) also have significant side
effects.

> So I still think we need to fix _at minimum_ the kmalloc-4096 case
> (assuming Larry won't hit the same problem still). I see you're not
> happy with my patch so any suggestions how to handle that?

Add a warning to Kconfig that the higher order page allocations may
increase with debugging on for caches with object sizes near or equal to
PAGE_SIZE?

Its good to run with full debugging on for even the 4k sized caches.
Otherwise we wont be catching overruns there. But the debugging can cause
some side effects.

2009-06-11 16:49:18

by Larry Finger

[permalink] [raw]
Subject: Re: [PATCH 2/2] SLUB: Disable debugging if it increases the minimum page order

Christoph Lameter wrote:
> On Thu, 11 Jun 2009, Pekka Enberg wrote:
>
>> My main point is that a lot of _testers_ will probably enable all SLUB
>> debugging by default because we encourage them to and it's pretty bad
>> that we end up causing order 1 allocations and oom conditions.
>
> Other test methods (like PAGE_ALLOC debugging) also have significant side
> effects.
>
>> So I still think we need to fix _at minimum_ the kmalloc-4096 case
>> (assuming Larry won't hit the same problem still). I see you're not
>> happy with my patch so any suggestions how to handle that?
>
> Add a warning to Kconfig that the higher order page allocations may
> increase with debugging on for caches with object sizes near or equal to
> PAGE_SIZE?
>
> Its good to run with full debugging on for even the 4k sized caches.
> Otherwise we wont be catching overruns there. But the debugging can cause
> some side effects.

This is obviously an extreme corner case. Both Ubuntu and openSUSE
distribute kernels with SLAB enabled, thus the bulk of users will
never get into this situation. In addition, I'm not aware of any other
testers that have reported this condition.

If the Kconfig warning is too strong, then even the testers might not
turn SLUB debugging on, and you lose people like me. Having a printk
that indicates that debugging was turned off for a particular request
would have been useful for my current test as we would then know that
the fix might have saved an oom condition, but the total number of
such printks should be limited as I wouldn't want my logs polluted too
much.

Is it easy to test the number of available O(1) fragments when
debugging would increase the minimum from 0 to 1, and only turn off
debugging if the available number is small?

After 23 hours of getting my system to a "steady state" condition, I
am currently running the following:

1. Two concurrent -j8 kernel builds with the sources on an NFS mounted
volume with b43 as the network device.
2. A simultaneous flood ping over the network.
3. A dump of the DMA32 line of /proc/buddyinfo every 5 seconds. The
number of O(1) fragments has seldom gotten below 1000.

Larry

2009-06-11 18:04:41

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 2/2] SLUB: Disable debugging if it increases the minimum page order

On Thu, Jun 11, 2009 at 11:49:19AM -0400, Christoph Lameter wrote:
> On Thu, 11 Jun 2009, Pekka Enberg wrote:
>
> > My main point is that a lot of _testers_ will probably enable all SLUB
> > debugging by default because we encourage them to and it's pretty bad
> > that we end up causing order 1 allocations and oom conditions.
>
> Other test methods (like PAGE_ALLOC debugging) also have significant side
> effects.
>

True, but they come with huge big warnings and I agree that we don't want
to make the SL*B debug warning too drastic, particularly because SL*B
debugging is often so valuable and relatively lightweight in comparison
to some debug options.


> > So I still think we need to fix _at minimum_ the kmalloc-4096 case
> > (assuming Larry won't hit the same problem still). I see you're not
> > happy with my patch so any suggestions how to handle that?
>
> Add a warning to Kconfig that the higher order page allocations may
> increase with debugging on for caches with object sizes near or equal to
> PAGE_SIZE?
>

Possibly clueless suggestion here. How possible would it be to implement
something like

"With SLUB_DEBUG, enable debug on all caches unless the increased meta-data
would force the minimum order up on order due to object sizes being near or
equal the PAGE_SIZE. If SLUB_DEBUG must be enabled, specify slub_debug=A for
'All caches enable debug regardless'"

?

> Its good to run with full debugging on for even the 4k sized caches.
> Otherwise we wont be catching overruns there. But the debugging can cause
> some side effects.
>

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2009-06-11 19:02:31

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH 2/2] SLUB: Disable debugging if it increases the minimum page order

On Thu, 11 Jun 2009, Mel Gorman wrote:

> "With SLUB_DEBUG, enable debug on all caches unless the increased meta-data
> would force the minimum order up on order due to object sizes being near or
> equal the PAGE_SIZE. If SLUB_DEBUG must be enabled, specify slub_debug=A for
> 'All caches enable debug regardless'"

Hmmm... No SLUB_DEBUG just compiles the code in for debugging. Does not
enable anything.

But we could make SLUB_DEBUG_ON enable debugging except on those slabs
where the order increases (for which it does a printk saying that the
debugging was disabled) and then have a slub_debug=A option that
overrides the order check. Generally if you know that you have memory
corruption you would want to boot with slub_debug=A. If you just want
continuous checks then SLUB_DEBUG_ON would do the trick.

2009-06-11 19:17:04

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 2/2] SLUB: Disable debugging if it increases the minimum page order

On Thu, Jun 11, 2009 at 02:26:02PM -0400, Christoph Lameter wrote:
> On Thu, 11 Jun 2009, Mel Gorman wrote:
>
> > "With SLUB_DEBUG, enable debug on all caches unless the increased meta-data
> > would force the minimum order up on order due to object sizes being near or
> > equal the PAGE_SIZE. If SLUB_DEBUG must be enabled, specify slub_debug=A for
> > 'All caches enable debug regardless'"
>
> Hmmm... No SLUB_DEBUG just compiles the code in for debugging. Does not
> enable anything.
>
> But we could make SLUB_DEBUG_ON enable debugging except on those slabs
> where the order increases (for which it does a printk saying that the
> debugging was disabled) and then have a slub_debug=A option that
> overrides the order check.

Sorry, this is what I meant and I failed to explain myself properly.

> Generally if you know that you have memory
> corruption you would want to boot with slub_debug=A. If you just want
> continuous checks then SLUB_DEBUG_ON would do the trick.
>

That's the general idea. It would catch a number of bugs that SLUB_DEBUG_ON
catches without creating a different class of bug report such as this
atomic-order-1 problem.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab