2013-03-21 01:22:57

by Zefan Li

[permalink] [raw]
Subject: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

As cgroup supports rename, it's unsafe to dereference dentry->d_name
without proper vfs locks. Fix this by using cgroup_name().

Signed-off-by: Li Zefan <[email protected]>
---

This patch depends on "cgroup: fix cgroup_path() vs rename() race",
which has been queued for 3.10.

---
mm/memcontrol.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 53b8201..72be5c9 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3217,17 +3217,16 @@ void mem_cgroup_destroy_cache(struct kmem_cache *cachep)
static char *memcg_cache_name(struct mem_cgroup *memcg, struct kmem_cache *s)
{
char *name;
- struct dentry *dentry;
+
+ name = (char *)__get_free_page(GFP_TEMPORARY);
+ if (!name)
+ return NULL;

rcu_read_lock();
- dentry = rcu_dereference(memcg->css.cgroup->dentry);
+ snprintf(name, PAGE_SIZE, "%s(%d:%s)", s->name, memcg_cache_id(memcg),
+ cgroup_name(memcg->css.cgroup));
rcu_read_unlock();

- BUG_ON(dentry == NULL);
-
- name = kasprintf(GFP_KERNEL, "%s(%d:%s)", s->name,
- memcg_cache_id(memcg), dentry->d_name.name);
-
return name;
}

@@ -3247,7 +3246,7 @@ static struct kmem_cache *kmem_cache_dup(struct mem_cgroup *memcg,
if (new)
new->allocflags |= __GFP_KMEMCG;

- kfree(name);
+ free_page((unsigned long)name);
return new;
}

--
1.8.0.2


2013-03-21 09:08:55

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

On Thu 21-03-13 09:22:21, Li Zefan wrote:
> As cgroup supports rename, it's unsafe to dereference dentry->d_name
> without proper vfs locks. Fix this by using cgroup_name().
>
> Signed-off-by: Li Zefan <[email protected]>
> ---
>
> This patch depends on "cgroup: fix cgroup_path() vs rename() race",
> which has been queued for 3.10.
>
> ---
> mm/memcontrol.c | 15 +++++++--------
> 1 file changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 53b8201..72be5c9 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3217,17 +3217,16 @@ void mem_cgroup_destroy_cache(struct kmem_cache *cachep)
> static char *memcg_cache_name(struct mem_cgroup *memcg, struct kmem_cache *s)
> {
> char *name;
> - struct dentry *dentry;
> +
> + name = (char *)__get_free_page(GFP_TEMPORARY);

Ouch. Can we use a static temporary buffer instead? This is called from
workqueue context so we do not have to be afraid of the deep call chain.
It is also not a hot path AFAICS.

Even GFP_ATOMIC for kasprintf would be an improvement IMO.

> + if (!name)
> + return NULL;
>
> rcu_read_lock();
> - dentry = rcu_dereference(memcg->css.cgroup->dentry);
> + snprintf(name, PAGE_SIZE, "%s(%d:%s)", s->name, memcg_cache_id(memcg),
> + cgroup_name(memcg->css.cgroup));
> rcu_read_unlock();
>
> - BUG_ON(dentry == NULL);
> -
> - name = kasprintf(GFP_KERNEL, "%s(%d:%s)", s->name,
> - memcg_cache_id(memcg), dentry->d_name.name);
> -
> return name;
> }
>
> @@ -3247,7 +3246,7 @@ static struct kmem_cache *kmem_cache_dup(struct mem_cgroup *memcg,
> if (new)
> new->allocflags |= __GFP_KMEMCG;
>
> - kfree(name);
> + free_page((unsigned long)name);
> return new;
> }
>
> --
> 1.8.0.2
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

--
Michal Hocko
SUSE Labs

2013-03-21 10:23:03

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

On Thu 21-03-13 10:08:49, Michal Hocko wrote:
> On Thu 21-03-13 09:22:21, Li Zefan wrote:
> > As cgroup supports rename, it's unsafe to dereference dentry->d_name
> > without proper vfs locks. Fix this by using cgroup_name().
> >
> > Signed-off-by: Li Zefan <[email protected]>
> > ---
> >
> > This patch depends on "cgroup: fix cgroup_path() vs rename() race",
> > which has been queued for 3.10.
> >
> > ---
> > mm/memcontrol.c | 15 +++++++--------
> > 1 file changed, 7 insertions(+), 8 deletions(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 53b8201..72be5c9 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -3217,17 +3217,16 @@ void mem_cgroup_destroy_cache(struct kmem_cache *cachep)
> > static char *memcg_cache_name(struct mem_cgroup *memcg, struct kmem_cache *s)
> > {
> > char *name;
> > - struct dentry *dentry;
> > +
> > + name = (char *)__get_free_page(GFP_TEMPORARY);
>
> Ouch. Can we use a static temporary buffer instead?

> This is called from workqueue context so we do not have to be afraid
> of the deep call chain.

Bahh, I was thinking about two things at the same time and that is how
it ends... I meant a temporary buffer on the stack. But a separate
allocation sounds even easier.

> It is also not a hot path AFAICS.
>
> Even GFP_ATOMIC for kasprintf would be an improvement IMO.

What about the following (not even compile tested because I do not have
cgroup_name in my tree yet):
---
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f608546..ede0382 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3370,13 +3370,18 @@ static char *memcg_cache_name(struct mem_cgroup *memcg, struct kmem_cache *s)
struct dentry *dentry;

rcu_read_lock();
- dentry = rcu_dereference(memcg->css.cgroup->dentry);
+ name = kasprintf(GFP_ATOMIC, "%s(%d:%s)", s->name,
+ memcg_cache_id(memcg), cgroup_name(memcg->css.cgroup));
rcu_read_unlock();

- BUG_ON(dentry == NULL);
-
- name = kasprintf(GFP_KERNEL, "%s(%d:%s)", s->name,
- memcg_cache_id(memcg), dentry->d_name.name);
+ if (!name) {
+ name = kmalloc(PAGE_SIZE, GFP_KERNEL);
+ rcu_read_lock();
+ name = snprintf(name, PAGE_SIZE, "%s(%d:%s)", s->name,
+ memcg_cache_id(memcg),
+ cgroup_name(memcg->css.cgroup));
+ rcu_read_unlock();
+ }

return name;
}
--
Michal Hocko
SUSE Labs

2013-03-22 01:22:32

by Zefan Li

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

On 2013/3/21 18:22, Michal Hocko wrote:
> On Thu 21-03-13 10:08:49, Michal Hocko wrote:
>> On Thu 21-03-13 09:22:21, Li Zefan wrote:
>>> As cgroup supports rename, it's unsafe to dereference dentry->d_name
>>> without proper vfs locks. Fix this by using cgroup_name().
>>>
>>> Signed-off-by: Li Zefan <[email protected]>
>>> ---
>>>
>>> This patch depends on "cgroup: fix cgroup_path() vs rename() race",
>>> which has been queued for 3.10.
>>>
>>> ---
>>> mm/memcontrol.c | 15 +++++++--------
>>> 1 file changed, 7 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>>> index 53b8201..72be5c9 100644
>>> --- a/mm/memcontrol.c
>>> +++ b/mm/memcontrol.c
>>> @@ -3217,17 +3217,16 @@ void mem_cgroup_destroy_cache(struct kmem_cache *cachep)
>>> static char *memcg_cache_name(struct mem_cgroup *memcg, struct kmem_cache *s)
>>> {
>>> char *name;
>>> - struct dentry *dentry;
>>> +
>>> + name = (char *)__get_free_page(GFP_TEMPORARY);
>>
>> Ouch. Can we use a static temporary buffer instead?
>
>> This is called from workqueue context so we do not have to be afraid
>> of the deep call chain.
>
> Bahh, I was thinking about two things at the same time and that is how
> it ends... I meant a temporary buffer on the stack. But a separate
> allocation sounds even easier.
>

Actually I don't care much about which way to take. Use on-stack buffer (if stack
usage is not a concern) or local static buffer (caller already held memcg_cache_mutex)
is simplest.

But why it's bad to allocate a page for temp use?

>> It is also not a hot path AFAICS.
>>
>> Even GFP_ATOMIC for kasprintf would be an improvement IMO.
>
> What about the following (not even compile tested because I do not have
> cgroup_name in my tree yet):

No, it won't compile. ;)

> ---
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index f608546..ede0382 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3370,13 +3370,18 @@ static char *memcg_cache_name(struct mem_cgroup *memcg, struct kmem_cache *s)
> struct dentry *dentry;
>
> rcu_read_lock();
> - dentry = rcu_dereference(memcg->css.cgroup->dentry);
> + name = kasprintf(GFP_ATOMIC, "%s(%d:%s)", s->name,
> + memcg_cache_id(memcg), cgroup_name(memcg->css.cgroup));
> rcu_read_unlock();
>
> - BUG_ON(dentry == NULL);
> -
> - name = kasprintf(GFP_KERNEL, "%s(%d:%s)", s->name,
> - memcg_cache_id(memcg), dentry->d_name.name);
> + if (!name) {
> + name = kmalloc(PAGE_SIZE, GFP_KERNEL);
> + rcu_read_lock();
> + name = snprintf(name, PAGE_SIZE, "%s(%d:%s)", s->name,
> + memcg_cache_id(memcg),
> + cgroup_name(memcg->css.cgroup));
> + rcu_read_unlock();
> + }
>
> return name;
> }
>

2013-03-22 08:07:58

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

On Fri 22-03-13 09:22:06, Li Zefan wrote:
> On 2013/3/21 18:22, Michal Hocko wrote:
> > On Thu 21-03-13 10:08:49, Michal Hocko wrote:
> >> On Thu 21-03-13 09:22:21, Li Zefan wrote:
> >>> As cgroup supports rename, it's unsafe to dereference dentry->d_name
> >>> without proper vfs locks. Fix this by using cgroup_name().
> >>>
> >>> Signed-off-by: Li Zefan <[email protected]>
> >>> ---
> >>>
> >>> This patch depends on "cgroup: fix cgroup_path() vs rename() race",
> >>> which has been queued for 3.10.
> >>>
> >>> ---
> >>> mm/memcontrol.c | 15 +++++++--------
> >>> 1 file changed, 7 insertions(+), 8 deletions(-)
> >>>
> >>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> >>> index 53b8201..72be5c9 100644
> >>> --- a/mm/memcontrol.c
> >>> +++ b/mm/memcontrol.c
> >>> @@ -3217,17 +3217,16 @@ void mem_cgroup_destroy_cache(struct kmem_cache *cachep)
> >>> static char *memcg_cache_name(struct mem_cgroup *memcg, struct kmem_cache *s)
> >>> {
> >>> char *name;
> >>> - struct dentry *dentry;
> >>> +
> >>> + name = (char *)__get_free_page(GFP_TEMPORARY);
> >>
> >> Ouch. Can we use a static temporary buffer instead?
> >
> >> This is called from workqueue context so we do not have to be afraid
> >> of the deep call chain.
> >
> > Bahh, I was thinking about two things at the same time and that is how
> > it ends... I meant a temporary buffer on the stack. But a separate
> > allocation sounds even easier.
> >
>
> Actually I don't care much about which way to take. Use on-stack buffer (if stack
> usage is not a concern) or local static buffer (caller already held memcg_cache_mutex)
> is simplest.
>
> But why it's bad to allocate a page for temp use?

GFP_TEMPORARY groups short lived allocations but the mem cache is not
an ideal candidate of this type of allocations..

> >> It is also not a hot path AFAICS.
> >>
> >> Even GFP_ATOMIC for kasprintf would be an improvement IMO.
> >
> > What about the following (not even compile tested because I do not have
> > cgroup_name in my tree yet):
>
> No, it won't compile. ;)

Somehow expected so as this was just a quick hack to show what I meant.
The full patch is bellow (compile time tested on top of for-3.10 branch
this time :P)
---
>From 7e1f6f0e266a230ced238a9bf2398b4069a6a764 Mon Sep 17 00:00:00 2001
From: Michal Hocko <[email protected]>
Date: Fri, 22 Mar 2013 09:04:58 +0100
Subject: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

As cgroup supports rename, it's unsafe to dereference dentry->d_name
without proper vfs locks. Fix this by using cgroup_name().

Signed-off-by: Li Zefan <[email protected]>
Signed-off-by: Michal Hocko <[email protected]>
---
mm/memcontrol.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 53b8201..5741bf5 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3220,13 +3220,18 @@ static char *memcg_cache_name(struct mem_cgroup *memcg, struct kmem_cache *s)
struct dentry *dentry;

rcu_read_lock();
- dentry = rcu_dereference(memcg->css.cgroup->dentry);
+ name = kasprintf(GFP_ATOMIC, "%s(%d:%s)", s->name,
+ memcg_cache_id(memcg), dcgroup_name(memcg->css.cgroup));
rcu_read_unlock();

- BUG_ON(dentry == NULL);
+ if (!name) {
+ name = kmalloc(PAGE_SIZE, GFP_KERNEL);
+ rcu_read_lock();
+ name = snprintf(name, PAGE_SIZE, "%s(%d:%s)", s->name,
+ memcg_cache_id(memcg), dcgroup_name(memcg->css.cgroup));
+ rcu_read_unlock();

- name = kasprintf(GFP_KERNEL, "%s(%d:%s)", s->name,
- memcg_cache_id(memcg), dentry->d_name.name);
+ }

return name;
}
--
1.7.10.4

--
Michal Hocko
SUSE Labs

2013-03-22 08:17:29

by Zefan Li

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

>>>>> @@ -3217,17 +3217,16 @@ void mem_cgroup_destroy_cache(struct kmem_cache *cachep)
>>>>> static char *memcg_cache_name(struct mem_cgroup *memcg, struct kmem_cache *s)
>>>>> {
>>>>> char *name;
>>>>> - struct dentry *dentry;
>>>>> +
>>>>> + name = (char *)__get_free_page(GFP_TEMPORARY);
>>>>
>>>> Ouch. Can we use a static temporary buffer instead?
>>>
>>>> This is called from workqueue context so we do not have to be afraid
>>>> of the deep call chain.
>>>
>>> Bahh, I was thinking about two things at the same time and that is how
>>> it ends... I meant a temporary buffer on the stack. But a separate
>>> allocation sounds even easier.
>>>
>>
>> Actually I don't care much about which way to take. Use on-stack buffer (if stack
>> usage is not a concern) or local static buffer (caller already held memcg_cache_mutex)
>> is simplest.
>>
>> But why it's bad to allocate a page for temp use?
>
> GFP_TEMPORARY groups short lived allocations but the mem cache is not
> an ideal candidate of this type of allocations..
>

I'm not sure I'm following you...

char *memcg_cache_name()
{
char *name = alloc();
return name;
}

kmem_cache_dup()
{
name = memcg_cache_name();
kmem_cache_create_memcg(name);
free(name);
}

Isn't this a short lived allocation?

>>>> It is also not a hot path AFAICS.
>>>>
>>>> Even GFP_ATOMIC for kasprintf would be an improvement IMO.
>>>
>>> What about the following (not even compile tested because I do not have
>>> cgroup_name in my tree yet):
>>
>> No, it won't compile. ;)
>
> Somehow expected so as this was just a quick hack to show what I meant.
> The full patch is bellow (compile time tested on top of for-3.10 branch
> this time :P)
> ---
>>From 7e1f6f0e266a230ced238a9bf2398b4069a6a764 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <[email protected]>
> Date: Fri, 22 Mar 2013 09:04:58 +0100
> Subject: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()
>
> As cgroup supports rename, it's unsafe to dereference dentry->d_name
> without proper vfs locks. Fix this by using cgroup_name().
>
> Signed-off-by: Li Zefan <[email protected]>
> Signed-off-by: Michal Hocko <[email protected]>
> ---
> mm/memcontrol.c | 13 +++++++++----
> 1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 53b8201..5741bf5 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3220,13 +3220,18 @@ static char *memcg_cache_name(struct mem_cgroup *memcg, struct kmem_cache *s)
> struct dentry *dentry;
>
> rcu_read_lock();
> - dentry = rcu_dereference(memcg->css.cgroup->dentry);
> + name = kasprintf(GFP_ATOMIC, "%s(%d:%s)", s->name,
> + memcg_cache_id(memcg), dcgroup_name(memcg->css.cgroup));
> rcu_read_unlock();
>
> - BUG_ON(dentry == NULL);
> + if (!name) {
> + name = kmalloc(PAGE_SIZE, GFP_KERNEL);
> + rcu_read_lock();
> + name = snprintf(name, PAGE_SIZE, "%s(%d:%s)", s->name,
> + memcg_cache_id(memcg), dcgroup_name(memcg->css.cgroup));
> + rcu_read_unlock();
>
> - name = kasprintf(GFP_KERNEL, "%s(%d:%s)", s->name,
> - memcg_cache_id(memcg), dentry->d_name.name);
> + }
>
> return name;
> }
>

2013-03-22 08:34:13

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

On 03/22/2013 12:17 PM, Li Zefan wrote:
>> GFP_TEMPORARY groups short lived allocations but the mem cache is not
>> > an ideal candidate of this type of allocations..
>> >
> I'm not sure I'm following you...
>
> char *memcg_cache_name()
> {
> char *name = alloc();
> return name;
> }
>
> kmem_cache_dup()
> {
> name = memcg_cache_name();
> kmem_cache_create_memcg(name);
> free(name);
> }
>
> Isn't this a short lived allocation?
>

Hi,

Thanks for identifying and fixing this.

Li is right. The cache name will live long, but this is because the
slab/slub caches will strdup it internally. So the actual memcg
allocation is short lived.

2013-03-22 09:31:50

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

On Fri 22-03-13 12:22:23, Glauber Costa wrote:
> On 03/22/2013 12:17 PM, Li Zefan wrote:
> >> GFP_TEMPORARY groups short lived allocations but the mem cache is not
> >> > an ideal candidate of this type of allocations..
> >> >
> > I'm not sure I'm following you...
> >
> > char *memcg_cache_name()
> > {
> > char *name = alloc();
> > return name;
> > }
> >
> > kmem_cache_dup()
> > {
> > name = memcg_cache_name();
> > kmem_cache_create_memcg(name);
> > free(name);
> > }
> >
> > Isn't this a short lived allocation?
> >
>
> Hi,
>
> Thanks for identifying and fixing this.
>
> Li is right. The cache name will live long, but this is because the
> slab/slub caches will strdup it internally. So the actual memcg
> allocation is short lived.

OK, I have totally missed that. Sorry about the confusion. Then all the
churn around the allocation is pointless, no?
What about:
---
>From 7ed7f53bb597e8cb40d9ac91ce16142fb60f1e93 Mon Sep 17 00:00:00 2001
From: Michal Hocko <[email protected]>
Date: Fri, 22 Mar 2013 10:22:54 +0100
Subject: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

As cgroup supports rename, it's unsafe to dereference dentry->d_name
without proper vfs locks. Fix this by using cgroup_name() rather than
dentry directly.

Also open code memcg_cache_name because it is called only from
kmem_cache_dup which frees the returned name right after
kmem_cache_create_memcg makes a copy of it. Such a short-lived
allocation doesn't make too much sense. So replace it by a static
buffer as kmem_cache_dup is called with memcg_cache_mutex.

Signed-off-by: Li Zefan <[email protected]>
Signed-off-by: Michal Hocko <[email protected]>
---
mm/memcontrol.c | 33 +++++++++++----------------------
1 file changed, 11 insertions(+), 22 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 53b8201..9452b56 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3214,40 +3214,29 @@ void mem_cgroup_destroy_cache(struct kmem_cache *cachep)
schedule_work(&cachep->memcg_params->destroy);
}

-static char *memcg_cache_name(struct mem_cgroup *memcg, struct kmem_cache *s)
-{
- char *name;
- struct dentry *dentry;
-
- rcu_read_lock();
- dentry = rcu_dereference(memcg->css.cgroup->dentry);
- rcu_read_unlock();
-
- BUG_ON(dentry == NULL);
-
- name = kasprintf(GFP_KERNEL, "%s(%d:%s)", s->name,
- memcg_cache_id(memcg), dentry->d_name.name);
-
- return name;
-}
-
+/*
+ * Called with memcg_cache_mutex held
+ */
static struct kmem_cache *kmem_cache_dup(struct mem_cgroup *memcg,
struct kmem_cache *s)
{
char *name;
struct kmem_cache *new;
+ static char tmp_name[PAGE_SIZE];

- name = memcg_cache_name(memcg, s);
- if (!name)
- return NULL;
+ lockdep_assert_held(&memcg_cache_mutex);
+
+ rcu_read_lock();
+ tmp_name = snprintf(tmp_name, sizeof(tmp_name), "%s(%d:%s)", s->name,
+ memcg_cache_id(memcg), cgroup_name(memcg->css.cgroup));
+ rcu_read_unlock();

- new = kmem_cache_create_memcg(memcg, name, s->object_size, s->align,
+ new = kmem_cache_create_memcg(memcg, tmp_name, s->object_size, s->align,
(s->flags & ~SLAB_PANIC), s->ctor, s);

if (new)
new->allocflags |= __GFP_KMEMCG;

- kfree(name);
return new;
}

--
1.7.10.4


--
Michal Hocko
SUSE Labs

2013-03-22 09:41:05

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

On 03/22/2013 01:31 PM, Michal Hocko wrote:
> On Fri 22-03-13 12:22:23, Glauber Costa wrote:
>> On 03/22/2013 12:17 PM, Li Zefan wrote:
>>>> GFP_TEMPORARY groups short lived allocations but the mem cache is not
>>>>> an ideal candidate of this type of allocations..
>>>>>
>>> I'm not sure I'm following you...
>>>
>>> char *memcg_cache_name()
>>> {
>>> char *name = alloc();
>>> return name;
>>> }
>>>
>>> kmem_cache_dup()
>>> {
>>> name = memcg_cache_name();
>>> kmem_cache_create_memcg(name);
>>> free(name);
>>> }
>>>
>>> Isn't this a short lived allocation?
>>>
>>
>> Hi,
>>
>> Thanks for identifying and fixing this.
>>
>> Li is right. The cache name will live long, but this is because the
>> slab/slub caches will strdup it internally. So the actual memcg
>> allocation is short lived.
>
> OK, I have totally missed that. Sorry about the confusion. Then all the
> churn around the allocation is pointless, no?
> What about:

If we're really not concerned about stack, then yes. Even if always
running from workqueues, a PAGE_SIZEd stack variable seems risky to me.

2013-03-22 09:48:36

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

On Fri 22-03-13 13:41:40, Glauber Costa wrote:
> On 03/22/2013 01:31 PM, Michal Hocko wrote:
> > On Fri 22-03-13 12:22:23, Glauber Costa wrote:
> >> On 03/22/2013 12:17 PM, Li Zefan wrote:
> >>>> GFP_TEMPORARY groups short lived allocations but the mem cache is not
> >>>>> an ideal candidate of this type of allocations..
> >>>>>
> >>> I'm not sure I'm following you...
> >>>
> >>> char *memcg_cache_name()
> >>> {
> >>> char *name = alloc();
> >>> return name;
> >>> }
> >>>
> >>> kmem_cache_dup()
> >>> {
> >>> name = memcg_cache_name();
> >>> kmem_cache_create_memcg(name);
> >>> free(name);
> >>> }
> >>>
> >>> Isn't this a short lived allocation?
> >>>
> >>
> >> Hi,
> >>
> >> Thanks for identifying and fixing this.
> >>
> >> Li is right. The cache name will live long, but this is because the
> >> slab/slub caches will strdup it internally. So the actual memcg
> >> allocation is short lived.
> >
> > OK, I have totally missed that. Sorry about the confusion. Then all the
> > churn around the allocation is pointless, no?
> > What about:
>
> If we're really not concerned about stack, then yes. Even if always
> running from workqueues, a PAGE_SIZEd stack variable seems risky to me.

This is not on stack. It is static

--
Michal Hocko
SUSE Labs

2013-03-22 10:02:53

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

On 03/22/2013 01:48 PM, Michal Hocko wrote:
> On Fri 22-03-13 13:41:40, Glauber Costa wrote:
>> On 03/22/2013 01:31 PM, Michal Hocko wrote:
>>> On Fri 22-03-13 12:22:23, Glauber Costa wrote:
>>>> On 03/22/2013 12:17 PM, Li Zefan wrote:
>>>>>> GFP_TEMPORARY groups short lived allocations but the mem cache is not
>>>>>>> an ideal candidate of this type of allocations..
>>>>>>>
>>>>> I'm not sure I'm following you...
>>>>>
>>>>> char *memcg_cache_name()
>>>>> {
>>>>> char *name = alloc();
>>>>> return name;
>>>>> }
>>>>>
>>>>> kmem_cache_dup()
>>>>> {
>>>>> name = memcg_cache_name();
>>>>> kmem_cache_create_memcg(name);
>>>>> free(name);
>>>>> }
>>>>>
>>>>> Isn't this a short lived allocation?
>>>>>
>>>>
>>>> Hi,
>>>>
>>>> Thanks for identifying and fixing this.
>>>>
>>>> Li is right. The cache name will live long, but this is because the
>>>> slab/slub caches will strdup it internally. So the actual memcg
>>>> allocation is short lived.
>>>
>>> OK, I have totally missed that. Sorry about the confusion. Then all the
>>> churn around the allocation is pointless, no?
>>> What about:
>>
>> If we're really not concerned about stack, then yes. Even if always
>> running from workqueues, a PAGE_SIZEd stack variable seems risky to me.
>
> This is not on stack. It is static
>
Ah, right, I totally missed that. And then you're taking the mutex.

But actually, you don't need to take the mutex. All calls to
kmem_cache_dup are protected by the memcg_cache_mutex. So you should be
able to just use the buffer without further problems.

2013-03-22 10:06:12

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

On Fri 22-03-13 14:03:30, Glauber Costa wrote:
> On 03/22/2013 01:48 PM, Michal Hocko wrote:
> > On Fri 22-03-13 13:41:40, Glauber Costa wrote:
> >> On 03/22/2013 01:31 PM, Michal Hocko wrote:
> >>> On Fri 22-03-13 12:22:23, Glauber Costa wrote:
> >>>> On 03/22/2013 12:17 PM, Li Zefan wrote:
> >>>>>> GFP_TEMPORARY groups short lived allocations but the mem cache is not
> >>>>>>> an ideal candidate of this type of allocations..
> >>>>>>>
> >>>>> I'm not sure I'm following you...
> >>>>>
> >>>>> char *memcg_cache_name()
> >>>>> {
> >>>>> char *name = alloc();
> >>>>> return name;
> >>>>> }
> >>>>>
> >>>>> kmem_cache_dup()
> >>>>> {
> >>>>> name = memcg_cache_name();
> >>>>> kmem_cache_create_memcg(name);
> >>>>> free(name);
> >>>>> }
> >>>>>
> >>>>> Isn't this a short lived allocation?
> >>>>>
> >>>>
> >>>> Hi,
> >>>>
> >>>> Thanks for identifying and fixing this.
> >>>>
> >>>> Li is right. The cache name will live long, but this is because the
> >>>> slab/slub caches will strdup it internally. So the actual memcg
> >>>> allocation is short lived.
> >>>
> >>> OK, I have totally missed that. Sorry about the confusion. Then all the
> >>> churn around the allocation is pointless, no?
> >>> What about:
> >>
> >> If we're really not concerned about stack, then yes. Even if always
> >> running from workqueues, a PAGE_SIZEd stack variable seems risky to me.
> >
> > This is not on stack. It is static
> >
> Ah, right, I totally missed that. And then you're taking the mutex.
>
> But actually, you don't need to take the mutex. All calls to
> kmem_cache_dup are protected by the memcg_cache_mutex.

Yes and I am not taking that mutex. I've just added lockdep assert to
make sure that this still holds true.

> So you should be able to just use the buffer without further problems.

--
Michal Hocko
SUSE Labs

2013-03-22 10:24:48

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

On 03/22/2013 02:06 PM, Michal Hocko wrote:
> On Fri 22-03-13 14:03:30, Glauber Costa wrote:
>> On 03/22/2013 01:48 PM, Michal Hocko wrote:
>>> On Fri 22-03-13 13:41:40, Glauber Costa wrote:
>>>> On 03/22/2013 01:31 PM, Michal Hocko wrote:
>>>>> On Fri 22-03-13 12:22:23, Glauber Costa wrote:
>>>>>> On 03/22/2013 12:17 PM, Li Zefan wrote:
>>>>>>>> GFP_TEMPORARY groups short lived allocations but the mem cache is not
>>>>>>>>> an ideal candidate of this type of allocations..
>>>>>>>>>
>>>>>>> I'm not sure I'm following you...
>>>>>>>
>>>>>>> char *memcg_cache_name()
>>>>>>> {
>>>>>>> char *name = alloc();
>>>>>>> return name;
>>>>>>> }
>>>>>>>
>>>>>>> kmem_cache_dup()
>>>>>>> {
>>>>>>> name = memcg_cache_name();
>>>>>>> kmem_cache_create_memcg(name);
>>>>>>> free(name);
>>>>>>> }
>>>>>>>
>>>>>>> Isn't this a short lived allocation?
>>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Thanks for identifying and fixing this.
>>>>>>
>>>>>> Li is right. The cache name will live long, but this is because the
>>>>>> slab/slub caches will strdup it internally. So the actual memcg
>>>>>> allocation is short lived.
>>>>>
>>>>> OK, I have totally missed that. Sorry about the confusion. Then all the
>>>>> churn around the allocation is pointless, no?
>>>>> What about:
>>>>
>>>> If we're really not concerned about stack, then yes. Even if always
>>>> running from workqueues, a PAGE_SIZEd stack variable seems risky to me.
>>>
>>> This is not on stack. It is static
>>>
>> Ah, right, I totally missed that. And then you're taking the mutex.
>>
>> But actually, you don't need to take the mutex. All calls to
>> kmem_cache_dup are protected by the memcg_cache_mutex.
>
> Yes and I am not taking that mutex. I've just added lockdep assert to
> make sure that this still holds true.
>
It is impressive what a busy week does to our brains...

I read the code as lockdep_assert(memcg_cache_mutex), and then later on
mutex_lock(&memcg_mutex). But reading again, that was a just an
rcu_read_lock(). Good thing it is Friday

You guys can add my Acked-by, and thanks again

2013-03-22 10:56:19

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

On Fri 22-03-13 14:25:23, Glauber Costa wrote:
> On 03/22/2013 02:06 PM, Michal Hocko wrote:
> > On Fri 22-03-13 14:03:30, Glauber Costa wrote:
> >> On 03/22/2013 01:48 PM, Michal Hocko wrote:
> >>> On Fri 22-03-13 13:41:40, Glauber Costa wrote:
> >>>> On 03/22/2013 01:31 PM, Michal Hocko wrote:
> >>>>> On Fri 22-03-13 12:22:23, Glauber Costa wrote:
> >>>>>> On 03/22/2013 12:17 PM, Li Zefan wrote:
> >>>>>>>> GFP_TEMPORARY groups short lived allocations but the mem cache is not
> >>>>>>>>> an ideal candidate of this type of allocations..
> >>>>>>>>>
> >>>>>>> I'm not sure I'm following you...
> >>>>>>>
> >>>>>>> char *memcg_cache_name()
> >>>>>>> {
> >>>>>>> char *name = alloc();
> >>>>>>> return name;
> >>>>>>> }
> >>>>>>>
> >>>>>>> kmem_cache_dup()
> >>>>>>> {
> >>>>>>> name = memcg_cache_name();
> >>>>>>> kmem_cache_create_memcg(name);
> >>>>>>> free(name);
> >>>>>>> }
> >>>>>>>
> >>>>>>> Isn't this a short lived allocation?
> >>>>>>>
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> Thanks for identifying and fixing this.
> >>>>>>
> >>>>>> Li is right. The cache name will live long, but this is because the
> >>>>>> slab/slub caches will strdup it internally. So the actual memcg
> >>>>>> allocation is short lived.
> >>>>>
> >>>>> OK, I have totally missed that. Sorry about the confusion. Then all the
> >>>>> churn around the allocation is pointless, no?
> >>>>> What about:
> >>>>
> >>>> If we're really not concerned about stack, then yes. Even if always
> >>>> running from workqueues, a PAGE_SIZEd stack variable seems risky to me.
> >>>
> >>> This is not on stack. It is static
> >>>
> >> Ah, right, I totally missed that. And then you're taking the mutex.
> >>
> >> But actually, you don't need to take the mutex. All calls to
> >> kmem_cache_dup are protected by the memcg_cache_mutex.
> >
> > Yes and I am not taking that mutex. I've just added lockdep assert to
> > make sure that this still holds true.
> >
> It is impressive what a busy week does to our brains...

Tell me something about that.

> I read the code as lockdep_assert(memcg_cache_mutex), and then later on
> mutex_lock(&memcg_mutex). But reading again, that was a just an
> rcu_read_lock(). Good thing it is Friday
>
> You guys can add my Acked-by, and thanks again

Li, are you ok to take the page via your tree?

--
Michal Hocko
SUSE Labs

2013-03-24 07:33:40

by Zefan Li

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

>> Thanks for identifying and fixing this.
>>
>> Li is right. The cache name will live long, but this is because the
>> slab/slub caches will strdup it internally. So the actual memcg
>> allocation is short lived.
>
> OK, I have totally missed that. Sorry about the confusion. Then all the
> churn around the allocation is pointless, no?
> What about:
> ---
>>From 7ed7f53bb597e8cb40d9ac91ce16142fb60f1e93 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <[email protected]>
> Date: Fri, 22 Mar 2013 10:22:54 +0100
> Subject: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()
>
> As cgroup supports rename, it's unsafe to dereference dentry->d_name
> without proper vfs locks. Fix this by using cgroup_name() rather than
> dentry directly.
>
> Also open code memcg_cache_name because it is called only from
> kmem_cache_dup which frees the returned name right after
> kmem_cache_create_memcg makes a copy of it. Such a short-lived
> allocation doesn't make too much sense. So replace it by a static
> buffer as kmem_cache_dup is called with memcg_cache_mutex.
>

I doubt it's a win to add 4K to kernel text size instead of adding
a few extra lines of code... but it's up to you.

> Signed-off-by: Li Zefan <[email protected]>
> Signed-off-by: Michal Hocko <[email protected]>
> ---
> mm/memcontrol.c | 33 +++++++++++----------------------
> 1 file changed, 11 insertions(+), 22 deletions(-)
...
> static struct kmem_cache *kmem_cache_dup(struct mem_cgroup *memcg,
> struct kmem_cache *s)
> {
> char *name;
> struct kmem_cache *new;
> + static char tmp_name[PAGE_SIZE];
>
> - name = memcg_cache_name(memcg, s);
> - if (!name)
> - return NULL;
> + lockdep_assert_held(&memcg_cache_mutex);
> +
> + rcu_read_lock();
> + tmp_name = snprintf(tmp_name, sizeof(tmp_name), "%s(%d:%s)", s->name,
> + memcg_cache_id(memcg), cgroup_name(memcg->css.cgroup));

I guess you didn't turn on CONFIG_MEMCG_KMEM?

snprintf() returns a int value.

> + rcu_read_unlock();
>
> - new = kmem_cache_create_memcg(memcg, name, s->object_size, s->align,
> + new = kmem_cache_create_memcg(memcg, tmp_name, s->object_size, s->align,
> (s->flags & ~SLAB_PANIC), s->ctor, s);
>
> if (new)
> new->allocflags |= __GFP_KMEMCG;
>
> - kfree(name);
> return new;
> }
>
>

2013-03-24 07:35:07

by Zefan Li

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

>> I read the code as lockdep_assert(memcg_cache_mutex), and then later on
>> mutex_lock(&memcg_mutex). But reading again, that was a just an
>> rcu_read_lock(). Good thing it is Friday
>>
>> You guys can add my Acked-by, and thanks again
>
> Li, are you ok to take the page via your tree?
>

I don't have a git tree in kernel.org. It's Tejun that picks up
cgroup patches.

2013-03-25 08:20:52

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

On Sun 24-03-13 15:34:51, Li Zefan wrote:
> >> I read the code as lockdep_assert(memcg_cache_mutex), and then later on
> >> mutex_lock(&memcg_mutex). But reading again, that was a just an
> >> rcu_read_lock(). Good thing it is Friday
> >>
> >> You guys can add my Acked-by, and thanks again
> >
> > Li, are you ok to take the page via your tree?
> >
>
> I don't have a git tree in kernel.org. It's Tejun that picks up
> cgroup patches.

Oh, I thought both of you push to that tree. Anyway, I will rework the
patch and send it again.
--
Michal Hocko
SUSE Labs

2013-03-25 09:06:36

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

On Sun 24-03-13 15:33:21, Li Zefan wrote:
> >> Thanks for identifying and fixing this.
> >>
> >> Li is right. The cache name will live long, but this is because the
> >> slab/slub caches will strdup it internally. So the actual memcg
> >> allocation is short lived.
> >
> > OK, I have totally missed that. Sorry about the confusion. Then all the
> > churn around the allocation is pointless, no?
> > What about:
> > ---
> >>From 7ed7f53bb597e8cb40d9ac91ce16142fb60f1e93 Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <[email protected]>
> > Date: Fri, 22 Mar 2013 10:22:54 +0100
> > Subject: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()
> >
> > As cgroup supports rename, it's unsafe to dereference dentry->d_name
> > without proper vfs locks. Fix this by using cgroup_name() rather than
> > dentry directly.
> >
> > Also open code memcg_cache_name because it is called only from
> > kmem_cache_dup which frees the returned name right after
> > kmem_cache_create_memcg makes a copy of it. Such a short-lived
> > allocation doesn't make too much sense. So replace it by a static
> > buffer as kmem_cache_dup is called with memcg_cache_mutex.
> >
>
> I doubt it's a win to add 4K to kernel text size instead of adding
> a few extra lines of code... but it's up to you.

I will leave the decision to Glauber. The updated version which uses
kmalloc for the static buffer is bellow.

> > Signed-off-by: Li Zefan <[email protected]>
> > Signed-off-by: Michal Hocko <[email protected]>
> > ---
> > mm/memcontrol.c | 33 +++++++++++----------------------
> > 1 file changed, 11 insertions(+), 22 deletions(-)
> ...
> > static struct kmem_cache *kmem_cache_dup(struct mem_cgroup *memcg,
> > struct kmem_cache *s)
> > {
> > char *name;
> > struct kmem_cache *new;
> > + static char tmp_name[PAGE_SIZE];
> >
> > - name = memcg_cache_name(memcg, s);
> > - if (!name)
> > - return NULL;
> > + lockdep_assert_held(&memcg_cache_mutex);
> > +
> > + rcu_read_lock();
> > + tmp_name = snprintf(tmp_name, sizeof(tmp_name), "%s(%d:%s)", s->name,
> > + memcg_cache_id(memcg), cgroup_name(memcg->css.cgroup));
>
> I guess you didn't turn on CONFIG_MEMCG_KMEM?

dohh. Friday effect...

> snprintf() returns a int value.
[...]
---
>From 6f5d4c08cde5c82ac9432608adf517916ab54634 Mon Sep 17 00:00:00 2001
From: Michal Hocko <[email protected]>
Date: Mon, 25 Mar 2013 10:04:02 +0100
Subject: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

As cgroup supports rename, it's unsafe to dereference dentry->d_name
without proper vfs locks. Fix this by using cgroup_name() rather than
dentry directly.

Also open code memcg_cache_name because it is called only from
kmem_cache_dup which frees the returned name right after
kmem_cache_create_memcg makes a copy of it. Such a short-lived
allocation doesn't make too much sense. So replace it by a static
buffer as kmem_cache_dup is called with memcg_cache_mutex.

Signed-off-by: Li Zefan <[email protected]>
Signed-off-by: Michal Hocko <[email protected]>
---
mm/memcontrol.c | 64 ++++++++++++++++++++++++++++---------------------------
1 file changed, 33 insertions(+), 31 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 53b8201..3a75f2c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3214,52 +3214,54 @@ void mem_cgroup_destroy_cache(struct kmem_cache *cachep)
schedule_work(&cachep->memcg_params->destroy);
}

-static char *memcg_cache_name(struct mem_cgroup *memcg, struct kmem_cache *s)
-{
- char *name;
- struct dentry *dentry;
-
- rcu_read_lock();
- dentry = rcu_dereference(memcg->css.cgroup->dentry);
- rcu_read_unlock();
-
- BUG_ON(dentry == NULL);
-
- name = kasprintf(GFP_KERNEL, "%s(%d:%s)", s->name,
- memcg_cache_id(memcg), dentry->d_name.name);
-
- return name;
-}
+/*
+ * This lock protects updaters, not readers. We want readers to be as fast as
+ * they can, and they will either see NULL or a valid cache value. Our model
+ * allow them to see NULL, in which case the root memcg will be selected.
+ *
+ * We need this lock because multiple allocations to the same cache from a non
+ * will span more than one worker. Only one of them can create the cache.
+ */
+static DEFINE_MUTEX(memcg_cache_mutex);

+/*
+ * Called with memcg_cache_mutex held
+ */
static struct kmem_cache *kmem_cache_dup(struct mem_cgroup *memcg,
struct kmem_cache *s)
{
- char *name;
struct kmem_cache *new;
+ static char *tmp_name = NULL;

- name = memcg_cache_name(memcg, s);
- if (!name)
- return NULL;
+ lockdep_assert_held(&memcg_cache_mutex);
+
+ /*
+ * kmem_cache_create_memcg duplicates the given name and
+ * cgroup_name for this name requires RCU context.
+ * This static temporary buffer is used to prevent from
+ * pointless shortliving allocation.
+ */
+ if (!tmp_name) {
+ tmp_name = kmalloc(PAGE_SIZE, GFP_KERNEL);
+ WARN_ON_ONCE(!tmp_name);
+ if (!tmp_name)
+ return NULL;
+ }
+
+ rcu_read_lock();
+ snprintf(tmp_name, PAGE_SIZE, "%s(%d:%s)", s->name,
+ memcg_cache_id(memcg), cgroup_name(memcg->css.cgroup));
+ rcu_read_unlock();

- new = kmem_cache_create_memcg(memcg, name, s->object_size, s->align,
+ new = kmem_cache_create_memcg(memcg, tmp_name, s->object_size, s->align,
(s->flags & ~SLAB_PANIC), s->ctor, s);

if (new)
new->allocflags |= __GFP_KMEMCG;

- kfree(name);
return new;
}

-/*
- * This lock protects updaters, not readers. We want readers to be as fast as
- * they can, and they will either see NULL or a valid cache value. Our model
- * allow them to see NULL, in which case the root memcg will be selected.
- *
- * We need this lock because multiple allocations to the same cache from a non
- * will span more than one worker. Only one of them can create the cache.
- */
-static DEFINE_MUTEX(memcg_cache_mutex);
static struct kmem_cache *memcg_create_kmem_cache(struct mem_cgroup *memcg,
struct kmem_cache *cachep)
{
--
1.7.10.4


--
Michal Hocko
SUSE Labs

2013-03-26 07:53:00

by Zefan Li

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

>>> >From 7ed7f53bb597e8cb40d9ac91ce16142fb60f1e93 Mon Sep 17 00:00:00 2001
>>> From: Michal Hocko <[email protected]>
>>> Date: Fri, 22 Mar 2013 10:22:54 +0100
>>> Subject: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()
>>>
>>> As cgroup supports rename, it's unsafe to dereference dentry->d_name
>>> without proper vfs locks. Fix this by using cgroup_name() rather than
>>> dentry directly.
>>>
>>> Also open code memcg_cache_name because it is called only from
>>> kmem_cache_dup which frees the returned name right after
>>> kmem_cache_create_memcg makes a copy of it. Such a short-lived
>>> allocation doesn't make too much sense. So replace it by a static
>>> buffer as kmem_cache_dup is called with memcg_cache_mutex.
>>>
>>
>> I doubt it's a win to add 4K to kernel text size instead of adding
>> a few extra lines of code... but it's up to you.
>
> I will leave the decision to Glauber. The updated version which uses
> kmalloc for the static buffer is bellow.
>

I don't have strong preference. Glauber, what's your opinion?

...
> static struct kmem_cache *kmem_cache_dup(struct mem_cgroup *memcg,
> struct kmem_cache *s)
> {
> - char *name;
> struct kmem_cache *new;
> + static char *tmp_name = NULL;

(minor nitpick) why not preserve the name "name"

2013-03-26 08:10:11

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

On Tue 26-03-13 15:52:32, Li Zefan wrote:
[...]
> ...
> > static struct kmem_cache *kmem_cache_dup(struct mem_cgroup *memcg,
> > struct kmem_cache *s)
> > {
> > - char *name;
> > struct kmem_cache *new;
> > + static char *tmp_name = NULL;
>
> (minor nitpick) why not preserve the name "name"

I wanted to make it clear that the name is just temporal

--
Michal Hocko
SUSE Labs

2013-03-26 08:35:24

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

>>
>> I doubt it's a win to add 4K to kernel text size instead of adding
>> a few extra lines of code... but it's up to you.
>
> I will leave the decision to Glauber. The updated version which uses
> kmalloc for the static buffer is bellow.
>
I prefer to allocate dynamically here. But although I understand why we
need to call cgroup_name, I don't understand what is wrong with
kasprintf if we're going to allocate anyway. It will allocate a string
just big enough. A PAGE_SIZE'd allocation is a lot more likely to fail.

Now, if we really want to be smart here, we can do something like what
I've done for the slub attribute buffers, that can actually have very
long values.

allocate a small buffer that will hold 80 % > of the allocations (256
bytes should be enough for most cache names), and if the string is
bigger than this, we allocate. Once we allocate, we save it in a static
pointer and leave it there. The hope here is that we may be able to
live without ever allocating in many systems.

> +
> + /*
> + * kmem_cache_create_memcg duplicates the given name and
> + * cgroup_name for this name requires RCU context.
> + * This static temporary buffer is used to prevent from
> + * pointless shortliving allocation.
> + */
The comment is also no longer true if you don't resort to a static buffer.

The following (untested) patch implements the idea I outlined above.

What do you guys think ?


Attachments:
0001-memcg-fix-memcg_cache_name-to-use-cgroup_name.patch (4.14 kB)

2013-03-26 08:43:55

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

On Tue 26-03-13 12:35:58, Glauber Costa wrote:
> >>
> >> I doubt it's a win to add 4K to kernel text size instead of adding
> >> a few extra lines of code... but it's up to you.
> >
> > I will leave the decision to Glauber. The updated version which uses
> > kmalloc for the static buffer is bellow.
> >
> I prefer to allocate dynamically here. But although I understand why we
> need to call cgroup_name, I don't understand what is wrong with
> kasprintf if we're going to allocate anyway. It will allocate a string
> just big enough. A PAGE_SIZE'd allocation is a lot more likely to fail.
>
> Now, if we really want to be smart here, we can do something like what
> I've done for the slub attribute buffers, that can actually have very
> long values.
>
> allocate a small buffer that will hold 80 % > of the allocations (256
> bytes should be enough for most cache names), and if the string is
> bigger than this, we allocate. Once we allocate, we save it in a static
> pointer and leave it there. The hope here is that we may be able to
> live without ever allocating in many systems.
>
> > +
> > + /*
> > + * kmem_cache_create_memcg duplicates the given name and
> > + * cgroup_name for this name requires RCU context.
> > + * This static temporary buffer is used to prevent from
> > + * pointless shortliving allocation.
> > + */
> The comment is also no longer true if you don't resort to a static buffer.

The buffer _is_ static (read global variable hidden with the function
scope).

> The following (untested) patch implements the idea I outlined above.
>
> What do you guys think ?

I really do not care which way to fix this.

[...]
> +static struct kmem_cache *kmem_cache_dup(struct mem_cgroup *memcg,
> + struct kmem_cache *s)
> {
> - char *name;
> - struct dentry *dentry;
> + const char *cgname; /* actual cache name */
> + char *name = NULL; /* actual cache name */
> + char buf[256]; /* stack buffer for small allocations */
> + int buf_len;
> + static char *buf_name; /* pointer to a page, if we ever need */
> + struct kmem_cache *new;
> +
> + lockdep_assert_held(&memcg_cache_mutex);
>
> rcu_read_lock();
> - dentry = rcu_dereference(memcg->css.cgroup->dentry);
> + cgname = cgroup_name(memcg->css.cgroup);
> rcu_read_unlock();

cgname is valid only within RCU read lock AFAIU.

--
Michal Hocko
SUSE Labs

2013-03-26 09:01:30

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

On 03/26/2013 12:43 PM, Michal Hocko wrote:
> On Tue 26-03-13 12:35:58, Glauber Costa wrote:
>>>>
>>>> I doubt it's a win to add 4K to kernel text size instead of adding
>>>> a few extra lines of code... but it's up to you.
>>>
>>> I will leave the decision to Glauber. The updated version which uses
>>> kmalloc for the static buffer is bellow.
>>>
>> I prefer to allocate dynamically here. But although I understand why we
>> need to call cgroup_name, I don't understand what is wrong with
>> kasprintf if we're going to allocate anyway. It will allocate a string
>> just big enough. A PAGE_SIZE'd allocation is a lot more likely to fail.
>>
>> Now, if we really want to be smart here, we can do something like what
>> I've done for the slub attribute buffers, that can actually have very
>> long values.
>>
>> allocate a small buffer that will hold 80 % > of the allocations (256
>> bytes should be enough for most cache names), and if the string is
>> bigger than this, we allocate. Once we allocate, we save it in a static
>> pointer and leave it there. The hope here is that we may be able to
>> live without ever allocating in many systems.
>>
>>> +
>>> + /*
>>> + * kmem_cache_create_memcg duplicates the given name and
>>> + * cgroup_name for this name requires RCU context.
>>> + * This static temporary buffer is used to prevent from
>>> + * pointless shortliving allocation.
>>> + */
>> The comment is also no longer true if you don't resort to a static buffer.
>
> The buffer _is_ static (read global variable hidden with the function
> scope).
>

Although correct, it is a bit misleading. It is static in the sense it
is held by a static variable. But it is acquired by kmalloc...

In any way, this is a tiny detail.

FWIW, I am fine with the patch you provided:

Acked-by: Glauber Costa <[email protected]>

We could go seeing how big those allocations are in practice, but I
doubt it is worth the trouble.

2013-03-27 01:16:13

by Zefan Li

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

> Although correct, it is a bit misleading. It is static in the sense it
> is held by a static variable. But it is acquired by kmalloc...
>
> In any way, this is a tiny detail.
>
> FWIW, I am fine with the patch you provided:
>
> Acked-by: Glauber Costa <[email protected]>
>

Michal, could you resend your final patch to Tejun in a new mail thread?
There are quite a few different patches inlined in this thread.

2013-03-27 08:37:12

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] memcg: fix memcg_cache_name() to use cgroup_name()

On Wed 27-03-13 09:15:53, Li Zefan wrote:
> > Although correct, it is a bit misleading. It is static in the sense it
> > is held by a static variable. But it is acquired by kmalloc...
> >
> > In any way, this is a tiny detail.
> >
> > FWIW, I am fine with the patch you provided:
> >
> > Acked-by: Glauber Costa <[email protected]>
> >
>
> Michal, could you resend your final patch to Tejun in a new mail thread?
> There are quite a few different patches inlined in this thread.

Done.

--
Michal Hocko
SUSE Labs