2012-06-25 09:53:15

by Li Zhong

[permalink] [raw]
Subject: [PATCH SLUB 1/2] duplicate the cache name in saved_alias list

SLUB duplicates the cache name in kmem_cache_create(). However if the
cache could be merged to others during early booting, the name pointer
is saved in saved_alias list, and the string needs to be kept valid
before slab_sysfs_init() is called.

This patch tries to duplicate the cache name in saved_alias list, so
that the cache name could be safely kfreed after calling
kmem_cache_create(), if that name is kmalloced.

Signed-off-by: Li Zhong <[email protected]>
---
mm/slub.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 8c691fa..3dc8ed5 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5373,6 +5373,11 @@ static int sysfs_slab_alias(struct kmem_cache *s,
const char *name)

al->s = s;
al->name = name;
+ al->name = kstrdup(name, GFP_KERNEL);
+ if (!al->name) {
+ kfree(al);
+ return -ENOMEM;
+ }
al->next = alias_list;
alias_list = al;
return 0;
@@ -5409,6 +5414,7 @@ static int __init slab_sysfs_init(void)
if (err)
printk(KERN_ERR "SLUB: Unable to add boot slab alias"
" %s to sysfs\n", s->name);
+ kfree(al->name);
kfree(al);
}

--
1.7.1


2012-06-25 09:55:16

by Li Zhong

[permalink] [raw]
Subject: [PATCH powerpc 2/2] kfree the cache name of pgtable cache if SLUB is used

This patch tries to kfree the cache name of pgtables cache if SLUB is
used, as SLUB duplicates the cache name, and the original one is leaked.

This patch depends on patch 1 -- (duplicate the cache name in
saved_alias list) in this mail thread. As the pgtables cache might be
merged to other caches. In this case, the name could be safely kfreed
after calling kmem_cache_create() with patch 1.

Signed-off-by: Li Zhong <[email protected]>
---
arch/powerpc/mm/init_64.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 620b7ac..c9d2a7f 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -130,6 +130,9 @@ void pgtable_cache_add(unsigned shift, void
(*ctor)(void *))
align = max_t(unsigned long, align, minalign);
name = kasprintf(GFP_KERNEL, "pgtable-2^%d", shift);
new = kmem_cache_create(name, table_size, align, 0, ctor);
+#ifdef CONFIG_SLUB
+ kfree(name); /* SLUB duplicates the cache name */
+#endif
PGT_CACHE(shift) = new;

pr_debug("Allocated pgtable cache for order %d\n", shift);
--
1.7.1

2012-06-25 10:54:07

by Wanlong Gao

[permalink] [raw]
Subject: Re: [PATCH SLUB 1/2] duplicate the cache name in saved_alias list

On 06/25/2012 05:53 PM, Li Zhong wrote:
> SLUB duplicates the cache name in kmem_cache_create(). However if the
> cache could be merged to others during early booting, the name pointer
> is saved in saved_alias list, and the string needs to be kept valid
> before slab_sysfs_init() is called.
>
> This patch tries to duplicate the cache name in saved_alias list, so
> that the cache name could be safely kfreed after calling
> kmem_cache_create(), if that name is kmalloced.
>
> Signed-off-by: Li Zhong <[email protected]>
> ---
> mm/slub.c | 6 ++++++
> 1 files changed, 6 insertions(+), 0 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 8c691fa..3dc8ed5 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -5373,6 +5373,11 @@ static int sysfs_slab_alias(struct kmem_cache *s,
> const char *name)
>
> al->s = s;
> al->name = name;
> + al->name = kstrdup(name, GFP_KERNEL);

dup assigned the al->name ?


Thanks,
Wanlong Gao

> + if (!al->name) {
> + kfree(al);
> + return -ENOMEM;
> + }
> al->next = alias_list;
> alias_list = al;
> return 0;
> @@ -5409,6 +5414,7 @@ static int __init slab_sysfs_init(void)
> if (err)
> printk(KERN_ERR "SLUB: Unable to add boot slab alias"
> " %s to sysfs\n", s->name);
> + kfree(al->name);
> kfree(al);
> }
>
>

2012-06-25 11:13:43

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH SLUB 1/2] duplicate the cache name in saved_alias list

On 06/25/2012 01:53 PM, Li Zhong wrote:
> SLUB duplicates the cache name in kmem_cache_create(). However if the
> cache could be merged to others during early booting, the name pointer
> is saved in saved_alias list, and the string needs to be kept valid
> before slab_sysfs_init() is called.
>
> This patch tries to duplicate the cache name in saved_alias list, so
> that the cache name could be safely kfreed after calling
> kmem_cache_create(), if that name is kmalloced.
>
> Signed-off-by: Li Zhong <[email protected]>
> ---
> mm/slub.c | 6 ++++++
> 1 files changed, 6 insertions(+), 0 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 8c691fa..3dc8ed5 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -5373,6 +5373,11 @@ static int sysfs_slab_alias(struct kmem_cache *s,
> const char *name)
>
> al->s = s;
> al->name = name;
> + al->name = kstrdup(name, GFP_KERNEL);
> + if (!al->name) {
> + kfree(al);
> + return -ENOMEM;
> + }
> al->next = alias_list;
> alias_list = al;
> return 0;
> @@ -5409,6 +5414,7 @@ static int __init slab_sysfs_init(void)
> if (err)
> printk(KERN_ERR "SLUB: Unable to add boot slab alias"
> " %s to sysfs\n", s->name);
> + kfree(al->name);
> kfree(al);
> }
>
>

What's unsafe about the current state of affairs ?
Whenever we alias, we'll increase the reference counter.
kmem_cache_destroy will only actually destroy the structure whenever
that refcnt reaches zero.

This means that kfree shouldn't happen until then. So what is exactly
that you are seeing?

Now, if you ask me, keeping the name around in user-visible files like
/proc/slabinfo for caches that are removed already can be a bit
confusing (that is because we don't add aliases to the slab_cache list)

If you want to touch this, one thing you can do is to keep a list of
names bundled in an alias. If an alias is removed, you free that name.
If that name is the representative name of the bundle, you move to the
next one.

2012-06-26 02:49:35

by Li Zhong

[permalink] [raw]
Subject: Re: [PATCH SLUB 1/2] duplicate the cache name in saved_alias list

On Mon, 2012-06-25 at 18:54 +0800, Wanlong Gao wrote:
> On 06/25/2012 05:53 PM, Li Zhong wrote:
> > SLUB duplicates the cache name in kmem_cache_create(). However if the
> > cache could be merged to others during early booting, the name pointer
> > is saved in saved_alias list, and the string needs to be kept valid
> > before slab_sysfs_init() is called.
> >
> > This patch tries to duplicate the cache name in saved_alias list, so
> > that the cache name could be safely kfreed after calling
> > kmem_cache_create(), if that name is kmalloced.
> >
> > Signed-off-by: Li Zhong <[email protected]>
> > ---
> > mm/slub.c | 6 ++++++
> > 1 files changed, 6 insertions(+), 0 deletions(-)
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 8c691fa..3dc8ed5 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -5373,6 +5373,11 @@ static int sysfs_slab_alias(struct kmem_cache *s,
> > const char *name)
> >
> > al->s = s;
> > al->name = name;
> > + al->name = kstrdup(name, GFP_KERNEL);
>
> dup assigned the al->name ?
>

Ah, yes, there should be a '-' before the line al->name = name;

Thank you, I will update it.

> Thanks,
> Wanlong Gao
>
> > + if (!al->name) {
> > + kfree(al);
> > + return -ENOMEM;
> > + }
> > al->next = alias_list;
> > alias_list = al;
> > return 0;
> > @@ -5409,6 +5414,7 @@ static int __init slab_sysfs_init(void)
> > if (err)
> > printk(KERN_ERR "SLUB: Unable to add boot slab alias"
> > " %s to sysfs\n", s->name);
> > + kfree(al->name);
> > kfree(al);
> > }
> >
> >
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>


2012-06-26 02:58:42

by Li Zhong

[permalink] [raw]
Subject: Re: [PATCH SLUB 1/2] duplicate the cache name in saved_alias list

On Mon, 2012-06-25 at 15:10 +0400, Glauber Costa wrote:
> On 06/25/2012 01:53 PM, Li Zhong wrote:
> > SLUB duplicates the cache name in kmem_cache_create(). However if the
> > cache could be merged to others during early booting, the name pointer
> > is saved in saved_alias list, and the string needs to be kept valid
> > before slab_sysfs_init() is called.
> >
> > This patch tries to duplicate the cache name in saved_alias list, so
> > that the cache name could be safely kfreed after calling
> > kmem_cache_create(), if that name is kmalloced.
> >
> > Signed-off-by: Li Zhong <[email protected]>
> > ---
> > mm/slub.c | 6 ++++++
> > 1 files changed, 6 insertions(+), 0 deletions(-)
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 8c691fa..3dc8ed5 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -5373,6 +5373,11 @@ static int sysfs_slab_alias(struct kmem_cache *s,
> > const char *name)
> >
> > al->s = s;
> > al->name = name;
> > + al->name = kstrdup(name, GFP_KERNEL);
> > + if (!al->name) {
> > + kfree(al);
> > + return -ENOMEM;
> > + }
> > al->next = alias_list;
> > alias_list = al;
> > return 0;
> > @@ -5409,6 +5414,7 @@ static int __init slab_sysfs_init(void)
> > if (err)
> > printk(KERN_ERR "SLUB: Unable to add boot slab alias"
> > " %s to sysfs\n", s->name);
> > + kfree(al->name);
> > kfree(al);
> > }
> >
> >
>
> What's unsafe about the current state of affairs ?
> Whenever we alias, we'll increase the reference counter.
> kmem_cache_destroy will only actually destroy the structure whenever
> that refcnt reaches zero.
>
> This means that kfree shouldn't happen until then. So what is exactly
> that you are seeing?

Maybe I didn't describe it clearly ... It is only about the name string
passed into kmem_cache_create() during early boot.

kmem_cache_create() checks whether it is mergeable before creating one.
If not mergeable, the name is duplicated: n = kstrdup(name, GFP_KERNEL);

If it is mergeable, it calls sysfs_slab_alias(). If the sysfs is ready
(slab_state == SYSFS ), then the name is duplicated (or dropped if no
SYSFS support ) in sysfs_create_link() for use.

For the above cases, we could safely kfree the name string after calling
cache create.

However, During early boot, before sysfs is ready ( slab_state <
SYSFS ), the sysfs_slab_alias() saves the pointer of name in the
alias_list. And those entries in the list are added to sysfs later after
slab_sysfs_init() is called. So we need to keep the name string valid
until slab_sysfs_init() is called to set up the sysfs stuff. By
duplicating the name string here also, we are able to kfree the name
string after calling the cache create.

>
> Now, if you ask me, keeping the name around in user-visible files like
> /proc/slabinfo for caches that are removed already can be a bit
> confusing (that is because we don't add aliases to the slab_cache list)
>
> If you want to touch this, one thing you can do is to keep a list of
> names bundled in an alias. If an alias is removed, you free that name.
> If that name is the representative name of the bundle, you move to the
> next one.
>


2012-06-27 07:53:49

by Li Zhong

[permalink] [raw]
Subject: [PATCH SLUB 1/2 v2] duplicate the cache name in saved_alias list

SLUB duplicates the cache name string passed into kmem_cache_create().
However if the cache could be merged to others during early boot, the
name pointer is saved in saved_alias list, and the string needs to be
kept valid before slab_sysfs_init() is finished. With this patch, the
name string (if kmalloced) could be kfreed after calling
kmem_cache_create().

Some more details:

kmem_cache_create() checks whether it is mergeable before creating one.
If not mergeable, the name is duplicated: n = kstrdup(name, GFP_KERNEL);

If it is mergeable, it calls sysfs_slab_alias(). If the sysfs is ready
(slab_state == SYSFS), then the name is duplicated (or dropped if no
SYSFS support) in sysfs_create_link() for use.

For the above cases, we could safely kfree the name string after calling
cache create.

However, during early boot, before sysfs is ready (slab_state < SYSFS),
the sysfs_slab_alias() saves the pointer of name in the alias_list.
Those entries in the list are added to sysfs later in slab_sysfs_init()
to set up the sysfs stuff, and we need keep the name string passed in
valid until it finishes. By duplicating the name string here also, we
are able to safely kfree the name string after calling cache create.

v2: removed an unnecessary assignment in v1; some changes in change log,
added more details

Signed-off-by: Li Zhong <[email protected]>
---
mm/slub.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/slub.c b/mm/slub.c
index 8c691fa..ed9f3c5 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5372,7 +5372,11 @@ static int sysfs_slab_alias(struct kmem_cache *s,
const char *name)
return -ENOMEM;

al->s = s;
- al->name = name;
+ al->name = kstrdup(name, GFP_KERNEL);
+ if (!al->name) {
+ kfree(al);
+ return -ENOMEM;
+ }
al->next = alias_list;
alias_list = al;
return 0;
@@ -5409,6 +5413,7 @@ static int __init slab_sysfs_init(void)
if (err)
printk(KERN_ERR "SLUB: Unable to add boot slab alias"
" %s to sysfs\n", s->name);
+ kfree(al->name);
kfree(al);
}

--
1.7.9.5

2012-06-29 00:45:49

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: [PATCH powerpc 2/2] kfree the cache name of pgtable cache if SLUB is used

On Mon, 2012-06-25 at 17:54 +0800, Li Zhong wrote:

> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index 620b7ac..c9d2a7f 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -130,6 +130,9 @@ void pgtable_cache_add(unsigned shift, void
> (*ctor)(void *))
> align = max_t(unsigned long, align, minalign);
> name = kasprintf(GFP_KERNEL, "pgtable-2^%d", shift);
> new = kmem_cache_create(name, table_size, align, 0, ctor);
> +#ifdef CONFIG_SLUB
> + kfree(name); /* SLUB duplicates the cache name */
> +#endif
> PGT_CACHE(shift) = new;
>
> pr_debug("Allocated pgtable cache for order %d\n", shift);

This is very gross ... and fragile. Also the subtle difference in
semantics between SLUB and SLAB is a VERY BAD IDEA.

I reckon you should make the other allocators all copy the name
instead.

Ben.

2012-06-29 01:41:40

by Li Zhong

[permalink] [raw]
Subject: Re: [PATCH powerpc 2/2] kfree the cache name of pgtable cache if SLUB is used

On 06/29/2012 08:45 AM, Benjamin Herrenschmidt wrote:
> On Mon, 2012-06-25 at 17:54 +0800, Li Zhong wrote:
>
>> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
>> index 620b7ac..c9d2a7f 100644
>> --- a/arch/powerpc/mm/init_64.c
>> +++ b/arch/powerpc/mm/init_64.c
>> @@ -130,6 +130,9 @@ void pgtable_cache_add(unsigned shift, void
>> (*ctor)(void *))
>> align = max_t(unsigned long, align, minalign);
>> name = kasprintf(GFP_KERNEL, "pgtable-2^%d", shift);
>> new = kmem_cache_create(name, table_size, align, 0, ctor);
>> +#ifdef CONFIG_SLUB
>> + kfree(name); /* SLUB duplicates the cache name */
>> +#endif
>> PGT_CACHE(shift) = new;
>>
>> pr_debug("Allocated pgtable cache for order %d\n", shift);
>
> This is very gross ... and fragile. Also the subtle difference in
> semantics between SLUB and SLAB is a VERY BAD IDEA.

I agree.

> I reckon you should make the other allocators all copy the name
> instead.

Thank you for the suggestion. I will do it in the next version.

Thanks, Zhong

> Ben.
>
>