2010-11-18 02:54:50

by b32542

Subject: [PATCH] slub: operate cache name memory same to slab and slob

From: Zeng Zhaoming <[email protected]>

kmemleak reported a memory leak in ext4:
comm "mount", pid 1159, jiffies 4294904647 (age 6077.804s)
hex dump (first 32 bytes):
65 78 74 34 5f 67 72 6f 75 70 69 6e 66 6f 5f 31 ext4_groupinfo_1
30 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 0.kkkkkkkkkkkkk.
backtrace:
[<c068ade3>] kmemleak_alloc+0x93/0xd0
[<c024e54c>] __kmalloc_track_caller+0x30c/0x380
[<c02269d3>] kstrdup+0x33/0x60
[<c0318a70>] ext4_mb_init+0x4e0/0x550
[<c0304e0e>] ext4_fill_super+0x1e6e/0x2f60
[<c0261140>] mount_bdev+0x1c0/0x1f0
[<c02fc00f>] ext4_mount+0x1f/0x30
[<c02603d8>] vfs_kern_mount+0x78/0x250
[<c026060e>] do_kern_mount+0x3e/0x100
[<c027b4c2>] do_mount+0x2e2/0x780
[<c027ba04>] sys_mount+0xa4/0xd0
[<c010429f>] sysenter_do_call+0x12/0x38
[<ffffffff>] 0xffffffff

This is caused by slub managing the cache name differently from slab and
slob. Slab and slob only keep a reference to the name; allocating and
reclaiming that memory is the duty of the code that invoked
kmem_cache_create().

In slub, the cache name is duplicated at creation time. This ambiguity can
cause memory leaks and double frees if kmem_cache_create() is passed a
dynamically allocated cache name.

Signed-off-by: Zeng Zhaoming <[email protected]>
---
mm/slub.c | 11 +----------
1 files changed, 1 insertions(+), 10 deletions(-)
mode change 100644 => 100755 mm/slub.c

diff --git a/mm/slub.c b/mm/slub.c
old mode 100644
new mode 100755
index 981fb73..a223e08
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -209,7 +209,6 @@ static inline int sysfs_slab_alias(struct kmem_cache *s, const char *p)
{ return 0; }
static inline void sysfs_slab_remove(struct kmem_cache *s)
{
- kfree(s->name);
kfree(s);
}

@@ -3228,7 +3227,6 @@ struct kmem_cache *kmem_cache_create(const char *name, size_t size,
size_t align, unsigned long flags, void (*ctor)(void *))
{
struct kmem_cache *s;
- char *n;

if (WARN_ON(!name))
return NULL;
@@ -3252,25 +3250,19 @@ struct kmem_cache *kmem_cache_create(const char *name, size_t size,
return s;
}

- n = kstrdup(name, GFP_KERNEL);
- if (!n)
- goto err;
-
s = kmalloc(kmem_size, GFP_KERNEL);
if (s) {
- if (kmem_cache_open(s, n,
+ if (kmem_cache_open(s, name,
size, align, flags, ctor)) {
list_add(&s->list, &slab_caches);
if (sysfs_slab_add(s)) {
list_del(&s->list);
- kfree(n);
kfree(s);
goto err;
}
up_write(&slub_lock);
return s;
}
- kfree(n);
kfree(s);
}
err:
@@ -4421,7 +4413,6 @@ static void kmem_cache_release(struct kobject *kobj)
{
struct kmem_cache *s = to_slab(kobj);

- kfree(s->name);
kfree(s);
}

--
1.7.0.4


2010-11-18 21:15:16

by Matt Mackall

Subject: Re: [PATCH] slub: operate cache name memory same to slab and slob

On Thu, 2010-11-18 at 11:00 +0800, [email protected] wrote:
> From: Zeng Zhaoming <[email protected]>
>
> Get a memory leak complaint about ext4:
> comm "mount", pid 1159, jiffies 4294904647 (age 6077.804s)
> hex dump (first 32 bytes):
> 65 78 74 34 5f 67 72 6f 75 70 69 6e 66 6f 5f 31 ext4_groupinfo_1
> 30 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 0.kkkkkkkkkkkkk.
> backtrace:
> [<c068ade3>] kmemleak_alloc+0x93/0xd0
> [<c024e54c>] __kmalloc_track_caller+0x30c/0x380
> [<c02269d3>] kstrdup+0x33/0x60
> [<c0318a70>] ext4_mb_init+0x4e0/0x550
> [<c0304e0e>] ext4_fill_super+0x1e6e/0x2f60
> [<c0261140>] mount_bdev+0x1c0/0x1f0
> [<c02fc00f>] ext4_mount+0x1f/0x30
> [<c02603d8>] vfs_kern_mount+0x78/0x250
> [<c026060e>] do_kern_mount+0x3e/0x100
> [<c027b4c2>] do_mount+0x2e2/0x780
> [<c027ba04>] sys_mount+0xa4/0xd0
> [<c010429f>] sysenter_do_call+0x12/0x38
> [<ffffffff>] 0xffffffff
>
> It is cause by slub manage the cache name different from slab and slob.
> In slab and slob, only reference to name, alloc and reclaim the memory
> is the duty of the code that invoked kmem_cache_create().
>
> In slub, cache name duplicated when create. This ambiguity will cause
> some memory leaks and double free if kmem_cache_create() pass a
> dynamic malloc cache name.

I don't get it.

Caller allocates X, passes X to slub, slub duplicates X as X', and
properly frees X', then caller frees X. Yes, that's silly, but where's
the leak?

But slub and slab should obviously both manage names in the same way,
namely the historical "caller allocates" way. So:

Acked-by: Matt Mackall <[email protected]>

> ---
> mm/slub.c | 11 +----------
> 1 files changed, 1 insertions(+), 10 deletions(-)
> mode change 100644 => 100755 mm/slub.c
>
> diff --git a/mm/slub.c b/mm/slub.c
> old mode 100644
> new mode 100755
> index 981fb73..a223e08
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -209,7 +209,6 @@ static inline int sysfs_slab_alias(struct kmem_cache *s, const char *p)
> { return 0; }
> static inline void sysfs_slab_remove(struct kmem_cache *s)
> {
> - kfree(s->name);
> kfree(s);
> }
>
> @@ -3228,7 +3227,6 @@ struct kmem_cache *kmem_cache_create(const char *name, size_t size,
> size_t align, unsigned long flags, void (*ctor)(void *))
> {
> struct kmem_cache *s;
> - char *n;
>
> if (WARN_ON(!name))
> return NULL;
> @@ -3252,25 +3250,19 @@ struct kmem_cache *kmem_cache_create(const char *name, size_t size,
> return s;
> }
>
> - n = kstrdup(name, GFP_KERNEL);
> - if (!n)
> - goto err;
> -
> s = kmalloc(kmem_size, GFP_KERNEL);
> if (s) {
> - if (kmem_cache_open(s, n,
> + if (kmem_cache_open(s, name,
> size, align, flags, ctor)) {
> list_add(&s->list, &slab_caches);
> if (sysfs_slab_add(s)) {
> list_del(&s->list);
> - kfree(n);
> kfree(s);
> goto err;
> }
> up_write(&slub_lock);
> return s;
> }
> - kfree(n);
> kfree(s);
> }
> err:
> @@ -4421,7 +4413,6 @@ static void kmem_cache_release(struct kobject *kobj)
> {
> struct kmem_cache *s = to_slab(kobj);
>
> - kfree(s->name);
> kfree(s);
> }
>


--
Mathematics is the supreme nostalgia of our time.

2010-11-18 21:36:17

by David Rientjes

Subject: Re: [PATCH] slub: operate cache name memory same to slab and slob

On Thu, 18 Nov 2010, Matt Mackall wrote:

> > Get a memory leak complaint about ext4:
> > comm "mount", pid 1159, jiffies 4294904647 (age 6077.804s)
> > hex dump (first 32 bytes):
> > 65 78 74 34 5f 67 72 6f 75 70 69 6e 66 6f 5f 31 ext4_groupinfo_1
> > 30 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 0.kkkkkkkkkkkkk.
> > backtrace:
> > [<c068ade3>] kmemleak_alloc+0x93/0xd0
> > [<c024e54c>] __kmalloc_track_caller+0x30c/0x380
> > [<c02269d3>] kstrdup+0x33/0x60
> > [<c0318a70>] ext4_mb_init+0x4e0/0x550
> > [<c0304e0e>] ext4_fill_super+0x1e6e/0x2f60
> > [<c0261140>] mount_bdev+0x1c0/0x1f0
> > [<c02fc00f>] ext4_mount+0x1f/0x30
> > [<c02603d8>] vfs_kern_mount+0x78/0x250
> > [<c026060e>] do_kern_mount+0x3e/0x100
> > [<c027b4c2>] do_mount+0x2e2/0x780
> > [<c027ba04>] sys_mount+0xa4/0xd0
> > [<c010429f>] sysenter_do_call+0x12/0x38
> > [<ffffffff>] 0xffffffff
> >
> > It is cause by slub manage the cache name different from slab and slob.
> > In slab and slob, only reference to name, alloc and reclaim the memory
> > is the duty of the code that invoked kmem_cache_create().
> >
> > In slub, cache name duplicated when create. This ambiguity will cause
> > some memory leaks and double free if kmem_cache_create() pass a
> > dynamic malloc cache name.
>
> I don't get it.
>
> Caller allocates X, passes X to slub, slub duplicates X as X', and
> properly frees X', then caller frees X. Yes, that's silly, but where's
> the leak?
>

The leak in ext4_mb_init() above is because it is using kstrdup() to
allocate the string itself and then, on destroy, uses kmem_cache_name() to
obtain the slub allocator's pointer to the name, not the memory the ext4
layer allocated itself.
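
The pointer mismatch described above can be illustrated with a small userspace sketch. This is not kernel code: `toy_cache`, `toy_cache_create()`, `toy_cache_name()` and `name_roundtrips()` are hypothetical names invented for the illustration, and the `dup_name` flag merely stands in for the difference between the allocators as described in this thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for struct kmem_cache; not the kernel's definition. */
struct toy_cache {
	char *name;     /* what toy_cache_name() hands back */
	int owns_name;  /* nonzero when the cache made a private copy */
};

/* Portable replacement for strdup(), which is not ISO C before C23. */
static char *toy_strdup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/*
 * dup_name == 0 models slab/slob: keep the caller's pointer.
 * dup_name == 1 models slub as discussed here: keep a private copy.
 */
static struct toy_cache *toy_cache_create(char *name, int dup_name)
{
	struct toy_cache *c;

	assert(name != NULL);
	c = malloc(sizeof(*c));
	if (!c)
		return NULL;
	c->owns_name = dup_name;
	c->name = dup_name ? toy_strdup(name) : name;
	if (!c->name) {
		free(c);
		return NULL;
	}
	return c;
}

static const char *toy_cache_name(const struct toy_cache *c)
{
	return c->name;
}

static void toy_cache_destroy(struct toy_cache *c)
{
	if (c->owns_name)
		free(c->name);  /* the cache frees only its own copy */
	free(c);
}

/*
 * Returns 1 if the accessor hands back the caller's own allocation,
 * 0 if it hands back a different pointer, -1 on allocation failure.
 */
static int name_roundtrips(int dup_name)
{
	char *mine = toy_strdup("ext4_groupinfo_10");
	struct toy_cache *c;
	int same;

	if (!mine)
		return -1;
	c = toy_cache_create(mine, dup_name);
	if (!c) {
		free(mine);
		return -1;
	}
	same = (toy_cache_name(c) == mine);
	toy_cache_destroy(c);
	free(mine);  /* safe cleanup: free the pointer we allocated */
	return same;
}
```

In this model, a caller that frees whatever `toy_cache_name()` returns is only correct when the cache kept the caller's pointer. With the duplicating variant the accessor returns the cache's copy, so the caller's original string is never freed by that path.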

2010-11-18 21:41:20

by Pekka Enberg

Subject: Re: [PATCH] slub: operate cache name memory same to slab and slob

On 18.11.2010 23.15, Matt Mackall wrote:
> On Thu, 2010-11-18 at 11:00 +0800, [email protected] wrote:
>> From: Zeng Zhaoming<[email protected]>
>>
>> Get a memory leak complaint about ext4:
>> comm "mount", pid 1159, jiffies 4294904647 (age 6077.804s)
>> hex dump (first 32 bytes):
>> 65 78 74 34 5f 67 72 6f 75 70 69 6e 66 6f 5f 31 ext4_groupinfo_1
>> 30 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 0.kkkkkkkkkkkkk.
>> backtrace:
>> [<c068ade3>] kmemleak_alloc+0x93/0xd0
>> [<c024e54c>] __kmalloc_track_caller+0x30c/0x380
>> [<c02269d3>] kstrdup+0x33/0x60
>> [<c0318a70>] ext4_mb_init+0x4e0/0x550
>> [<c0304e0e>] ext4_fill_super+0x1e6e/0x2f60
>> [<c0261140>] mount_bdev+0x1c0/0x1f0
>> [<c02fc00f>] ext4_mount+0x1f/0x30
>> [<c02603d8>] vfs_kern_mount+0x78/0x250
>> [<c026060e>] do_kern_mount+0x3e/0x100
>> [<c027b4c2>] do_mount+0x2e2/0x780
>> [<c027ba04>] sys_mount+0xa4/0xd0
>> [<c010429f>] sysenter_do_call+0x12/0x38
>> [<ffffffff>] 0xffffffff
>>
>> It is cause by slub manage the cache name different from slab and slob.
>> In slab and slob, only reference to name, alloc and reclaim the memory
>> is the duty of the code that invoked kmem_cache_create().
>>
>> In slub, cache name duplicated when create. This ambiguity will cause
>> some memory leaks and double free if kmem_cache_create() pass a
>> dynamic malloc cache name.
>
> I don't get it.
>
> Caller allocates X, passes X to slub, slub duplicates X as X', and
> properly frees X', then caller frees X. Yes, that's silly, but where's
> the leak?
>
> But slub and slab should obviously both manage names in the same way,
> namely the historical "caller allocates" way. So:
>
> Acked-by: Matt Mackall<[email protected]>

The kstrdup() is there because of SLUB cache merging. See commit
84c1cf62465e2fb0a692620dcfeb52323ab03d48 ("SLUB: Fix merged slab cache
names") for details.
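
One way to picture why merging and name ownership interact is the userspace sketch below. It is an assumption-laden toy, not a summary of that commit: `merged_cache`, `mc_create()` and `mc_destroy()` are hypothetical names, merging is decided by object size alone, and only a single cache is supported. The point it tries to show is that a merged cache can outlive the caller whose name it carries, so a private copy of the name stays valid where a borrowed pointer might not.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model of a mergeable cache; field names are illustrative only. */
struct merged_cache {
	char *name;    /* private copy of the first creator's name */
	size_t size;   /* object size, the only merge criterion here */
	int refcount;  /* number of users sharing this cache */
};

/* Single-slot stand-in for the allocator's list of caches. */
static struct merged_cache *only_cache;

static char *copy_name(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

static struct merged_cache *mc_create(const char *name, size_t size)
{
	struct merged_cache *c;

	assert(name != NULL);
	if (only_cache) {
		if (only_cache->size != size)
			return NULL;        /* toy limit: one cache only */
		only_cache->refcount++;     /* merge: reuse existing cache */
		return only_cache;
	}
	c = malloc(sizeof(*c));
	if (!c)
		return NULL;
	c->name = copy_name(name);
	if (!c->name) {
		free(c);
		return NULL;
	}
	c->size = size;
	c->refcount = 1;
	only_cache = c;
	return c;
}

static void mc_destroy(struct merged_cache *c)
{
	if (--c->refcount > 0)
		return;  /* another user still shares the cache */
	free(c->name);
	free(c);
	only_cache = NULL;
}
```

If the first creator destroys its handle and then releases or overwrites its own name buffer, the second user still sees an intact name, because the cache holds its own copy.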

2010-11-18 22:27:51

by Matt Mackall

Subject: Re: [PATCH] slub: operate cache name memory same to slab and slob

On Thu, 2010-11-18 at 13:36 -0800, David Rientjes wrote:
> On Thu, 18 Nov 2010, Matt Mackall wrote:
>
> > > Get a memory leak complaint about ext4:
> > > comm "mount", pid 1159, jiffies 4294904647 (age 6077.804s)
> > > hex dump (first 32 bytes):
> > > 65 78 74 34 5f 67 72 6f 75 70 69 6e 66 6f 5f 31 ext4_groupinfo_1
> > > 30 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 0.kkkkkkkkkkkkk.
> > > backtrace:
> > > [<c068ade3>] kmemleak_alloc+0x93/0xd0
> > > [<c024e54c>] __kmalloc_track_caller+0x30c/0x380
> > > [<c02269d3>] kstrdup+0x33/0x60
> > > [<c0318a70>] ext4_mb_init+0x4e0/0x550
> > > [<c0304e0e>] ext4_fill_super+0x1e6e/0x2f60
> > > [<c0261140>] mount_bdev+0x1c0/0x1f0
> > > [<c02fc00f>] ext4_mount+0x1f/0x30
> > > [<c02603d8>] vfs_kern_mount+0x78/0x250
> > > [<c026060e>] do_kern_mount+0x3e/0x100
> > > [<c027b4c2>] do_mount+0x2e2/0x780
> > > [<c027ba04>] sys_mount+0xa4/0xd0
> > > [<c010429f>] sysenter_do_call+0x12/0x38
> > > [<ffffffff>] 0xffffffff
> > >
> > > It is cause by slub manage the cache name different from slab and slob.
> > > In slab and slob, only reference to name, alloc and reclaim the memory
> > > is the duty of the code that invoked kmem_cache_create().
> > >
> > > In slub, cache name duplicated when create. This ambiguity will cause
> > > some memory leaks and double free if kmem_cache_create() pass a
> > > dynamic malloc cache name.
> >
> > I don't get it.
> >
> > Caller allocates X, passes X to slub, slub duplicates X as X', and
> > properly frees X', then caller frees X. Yes, that's silly, but where's
> > the leak?
> >
>
> The leak in ext4_mb_init() above is because it is using kstrdup() to
> allocate the string itself and then on destroy uses kmem_cache_name() to
> attain the slub allocator's pointer to the name, not the memory the ext4
> layer allocated itself.

And Pekka says:

> The kstrdup() is there because of SLUB cache merging. See commit
> 84c1cf62465e2fb0a692620dcfeb52323ab03d48 ("SLUB: Fix merged slab
> cache names") for details.

I see. So we can either:

- force anyone using dynamically-allocated names to track their own damn
pointer
- implement kstrdup in the other allocators and fix all callers (the
bulk of which use static names!)
- eliminate dynamically-allocated names (mostly useless when we start
merging slabs!)
- add an indirection layer for slub that holds the unmerged details
- stop pretending we track slab names and show only generic names based
on size in /proc

kmem_cache_name() is also a highly suspect function in a
post-merged-slabs kernel. As ext4 is the only user in the kernel, and it
got it wrong, perhaps it's time to rip it out.
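
The first option in the list above (callers track their own pointer) might look roughly like the following userspace sketch. Everything here is hypothetical: `group_cache`, `group_cache_init()` and `group_cache_release()` are invented for illustration, `toy_cache` is a stand-in for a real cache, and the `"ext4_groupinfo_%u"` format is only a guess modelled on the name visible in the hex dump.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy cache; how it stores its label is deliberately irrelevant. */
struct toy_cache {
	char label[32];
};

/* The owner keeps the cache handle and its own name allocation together. */
struct group_cache {
	struct toy_cache *cache;
	char *name;  /* allocated and freed by the owner, never read back */
};

/* Returns 0 on success, -1 on allocation failure. */
static int group_cache_init(struct group_cache *g, unsigned int bits)
{
	char buf[32];
	size_t len;

	assert(g != NULL);
	snprintf(buf, sizeof(buf), "ext4_groupinfo_%u", bits);
	len = strlen(buf) + 1;

	g->name = malloc(len);
	if (!g->name)
		return -1;
	memcpy(g->name, buf, len);

	g->cache = malloc(sizeof(*g->cache));
	if (!g->cache) {
		free(g->name);
		g->name = NULL;
		return -1;
	}
	/* The toy cache copies the text; a real one might keep the pointer. */
	snprintf(g->cache->label, sizeof(g->cache->label), "%s", g->name);
	return 0;
}

static void group_cache_release(struct group_cache *g)
{
	free(g->cache);  /* tear down the cache first */
	g->cache = NULL;
	free(g->name);   /* then free the pointer the owner allocated */
	g->name = NULL;
}
```

Because the owner frees the pointer it stored itself, the cleanup path does not depend on what a name accessor would return, which is the property the discussion is after.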

--
Mathematics is the supreme nostalgia of our time.

2010-11-19 14:38:29

by Zeng Zhaoming

Subject: Re: [PATCH] slub: operate cache name memory same to slab and slob

> - eliminate dynamically-allocated names (mostly useless when we start
> merging slabs!)

Not permitting dynamically allocated names: I think this option is the
best one, but describing the rule in a header is not enough. It would be
helpful to print a warning when someone breaks the rule.

> kmem_cache_name() is also a highly suspect function in a
> post-merged-slabs kernel. As ext4 is the only user in the kernel, and it
> got it wrong, perhaps it's time to rip it out.

Agreed, kmem_cache_name() is ugly.

---
Best Regards
Zeng Zhaoming

2010-11-21 00:55:31

by David Rientjes

Subject: Re: [PATCH] slub: operate cache name memory same to slab and slob

On Thu, 18 Nov 2010, Matt Mackall wrote:

> > The leak in ext4_mb_init() above is because it is using kstrdup() to
> > allocate the string itself and then on destroy uses kmem_cache_name() to
> > attain the slub allocator's pointer to the name, not the memory the ext4
> > layer allocated itself.
>
> And Pekka says:
>
> > The kstrdup() is there because of SLUB cache merging. See commit
> > 84c1cf62465e2fb0a692620dcfeb52323ab03d48 ("SLUB: Fix merged slab
> > cache names") for details.
>
> I see. So we can either:
>
> - force anyone using dynamically-allocated names to track their own damn
> pointer
> - implement kstrdup in the other allocators and fix all callers (the
> bulk of which use static names!)
> - eliminate dynamically-allocated names (mostly useless when we start
> merging slabs!)
> - add an indirection layer for slub that holds the unmerged details
> - stop pretending we track slab names and show only generic names based
> on size in /proc
>

I agree that we should force each user to track its own memory, and this
is really what the issue is about (it doesn't matter if that memory is the
cache's name). This particular issue is an ext4 memory leak and not the
responsibility of any allocator.

> kmem_cache_name() is also a highly suspect function in a
> post-merged-slabs kernel. As ext4 is the only user in the kernel, and it
> got it wrong, perhaps it's time to rip it out.
>

Yes, I think kmem_cache_name() should be removed since it shouldn't be
used for anything other than the internal slabinfo/slabtop display, as the
slub allocator actually specifies in include/linux/slub_def.h. The only
user is ext4, which uses it to track this dynamically allocated pointer, so
we can eliminate the function if we leave ext4 to track its own memory
allocations (a slab allocator shouldn't be carrying a metadata payload).