2012-11-20 01:44:39

by David Rientjes

[permalink] [raw]
Subject: [patch] mm, memcg: avoid unnecessary function call when memcg is disabled

While profiling numa/core v16 with cgroup_disable=memory on the command
line, I noticed mem_cgroup_count_vm_event() still showed up as high as
0.60% in perftop.

This occurs because the function is called extremely often even when memcg
is disabled.

To fix this, inline the check for mem_cgroup_disabled() so we avoid the
unnecessary function call if memcg is disabled.

Signed-off-by: David Rientjes <[email protected]>
---
include/linux/memcontrol.h | 9 ++++++++-
mm/memcontrol.c | 9 ++++-----
2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -181,7 +181,14 @@ unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
gfp_t gfp_mask,
unsigned long *total_scanned);

-void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
+void __mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
+static inline void mem_cgroup_count_vm_event(struct mm_struct *mm,
+ enum vm_event_item idx)
+{
+ if (mem_cgroup_disabled() || !mm)
+ return;
+ __mem_cgroup_count_vm_event(mm, idx);
+}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
void mem_cgroup_split_huge_fixup(struct page *head);
#endif
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -59,6 +59,8 @@
#include <trace/events/vmscan.h>

struct cgroup_subsys mem_cgroup_subsys __read_mostly;
+EXPORT_SYMBOL(mem_cgroup_subsys);
+
#define MEM_CGROUP_RECLAIM_RETRIES 5
static struct mem_cgroup *root_mem_cgroup __read_mostly;

@@ -1015,13 +1017,10 @@ void mem_cgroup_iter_break(struct mem_cgroup *root,
iter != NULL; \
iter = mem_cgroup_iter(NULL, iter, NULL))

-void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx)
+void __mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx)
{
struct mem_cgroup *memcg;

- if (!mm)
- return;
-
rcu_read_lock();
memcg = mem_cgroup_from_task(rcu_dereference(mm->owner));
if (unlikely(!memcg))
@@ -1040,7 +1039,7 @@ void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx)
out:
rcu_read_unlock();
}
-EXPORT_SYMBOL(mem_cgroup_count_vm_event);
+EXPORT_SYMBOL(__mem_cgroup_count_vm_event);

/**
* mem_cgroup_zone_lruvec - get the lru list vector for a zone and memcg


2012-11-20 04:23:56

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: [patch] mm, memcg: avoid unnecessary function call when memcg is disabled

(2012/11/20 10:44), David Rientjes wrote:
> While profiling numa/core v16 with cgroup_disable=memory on the command
> line, I noticed mem_cgroup_count_vm_event() still showed up as high as
> 0.60% in perftop.
>
> This occurs because the function is called extremely often even when memcg
> is disabled.
>
> To fix this, inline the check for mem_cgroup_disabled() so we avoid the
> unnecessary function call if memcg is disabled.
>
> Signed-off-by: David Rientjes <[email protected]>

Acked-By: KAMEZAWA Hiroyuki <[email protected]>


2012-11-20 07:07:33

by Michal Hocko

[permalink] [raw]
Subject: Re: [patch] mm, memcg: avoid unnecessary function call when memcg is disabled

On Mon 19-11-12 17:44:34, David Rientjes wrote:
> While profiling numa/core v16 with cgroup_disable=memory on the command
> line, I noticed mem_cgroup_count_vm_event() still showed up as high as
> 0.60% in perftop.
>
> This occurs because the function is called extremely often even when memcg
> is disabled.
>
> To fix this, inline the check for mem_cgroup_disabled() so we avoid the
> unnecessary function call if memcg is disabled.
>
> Signed-off-by: David Rientjes <[email protected]>

Acked-by: Michal Hocko <[email protected]>

Thanks!

> ---
> include/linux/memcontrol.h | 9 ++++++++-
> mm/memcontrol.c | 9 ++++-----
> 2 files changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -181,7 +181,14 @@ unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
> gfp_t gfp_mask,
> unsigned long *total_scanned);
>
> -void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
> +void __mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
> +static inline void mem_cgroup_count_vm_event(struct mm_struct *mm,
> + enum vm_event_item idx)
> +{
> + if (mem_cgroup_disabled() || !mm)
> + return;
> + __mem_cgroup_count_vm_event(mm, idx);
> +}
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> void mem_cgroup_split_huge_fixup(struct page *head);
> #endif
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -59,6 +59,8 @@
> #include <trace/events/vmscan.h>
>
> struct cgroup_subsys mem_cgroup_subsys __read_mostly;
> +EXPORT_SYMBOL(mem_cgroup_subsys);
> +
> #define MEM_CGROUP_RECLAIM_RETRIES 5
> static struct mem_cgroup *root_mem_cgroup __read_mostly;
>
> @@ -1015,13 +1017,10 @@ void mem_cgroup_iter_break(struct mem_cgroup *root,
> iter != NULL; \
> iter = mem_cgroup_iter(NULL, iter, NULL))
>
> -void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx)
> +void __mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx)
> {
> struct mem_cgroup *memcg;
>
> - if (!mm)
> - return;
> -
> rcu_read_lock();
> memcg = mem_cgroup_from_task(rcu_dereference(mm->owner));
> if (unlikely(!memcg))
> @@ -1040,7 +1039,7 @@ void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx)
> out:
> rcu_read_unlock();
> }
> -EXPORT_SYMBOL(mem_cgroup_count_vm_event);
> +EXPORT_SYMBOL(__mem_cgroup_count_vm_event);
>
> /**
> * mem_cgroup_zone_lruvec - get the lru list vector for a zone and memcg
> --
> To unsubscribe from this list: send the line "unsubscribe cgroups" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Michal Hocko
SUSE Labs

2012-11-20 08:34:37

by Glauber Costa

[permalink] [raw]
Subject: Re: [patch] mm, memcg: avoid unnecessary function call when memcg is disabled

On 11/20/2012 08:23 AM, Kamezawa Hiroyuki wrote:
> (2012/11/20 10:44), David Rientjes wrote:
>> While profiling numa/core v16 with cgroup_disable=memory on the command
>> line, I noticed mem_cgroup_count_vm_event() still showed up as high as
>> 0.60% in perftop.
>>
>> This occurs because the function is called extremely often even when
>> memcg
>> is disabled.
>>
>> To fix this, inline the check for mem_cgroup_disabled() so we avoid the
>> unnecessary function call if memcg is disabled.
>>
>> Signed-off-by: David Rientjes <[email protected]>
>
> Acked-By: KAMEZAWA Hiroyuki <[email protected]>
>
I am fine with this as well.

Acked-by: Glauber Costa <[email protected]>

2012-11-20 17:27:11

by Johannes Weiner

[permalink] [raw]
Subject: Re: [patch] mm, memcg: avoid unnecessary function call when memcg is disabled

On Mon, Nov 19, 2012 at 05:44:34PM -0800, David Rientjes wrote:
> While profiling numa/core v16 with cgroup_disable=memory on the command
> line, I noticed mem_cgroup_count_vm_event() still showed up as high as
> 0.60% in perftop.
>
> This occurs because the function is called extremely often even when memcg
> is disabled.
>
> To fix this, inline the check for mem_cgroup_disabled() so we avoid the
> unnecessary function call if memcg is disabled.
>
> Signed-off-by: David Rientjes <[email protected]>

Acked-by: Johannes Weiner <[email protected]>

2012-11-20 21:49:35

by Andrew Morton

[permalink] [raw]
Subject: Re: [patch] mm, memcg: avoid unnecessary function call when memcg is disabled

On Mon, 19 Nov 2012 17:44:34 -0800 (PST)
David Rientjes <[email protected]> wrote:

> While profiling numa/core v16 with cgroup_disable=memory on the command
> line, I noticed mem_cgroup_count_vm_event() still showed up as high as
> 0.60% in perftop.
>
> This occurs because the function is called extremely often even when memcg
> is disabled.
>
> To fix this, inline the check for mem_cgroup_disabled() so we avoid the
> unnecessary function call if memcg is disabled.
>
> ...
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -181,7 +181,14 @@ unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
> gfp_t gfp_mask,
> unsigned long *total_scanned);
>
> -void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
> +void __mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
> +static inline void mem_cgroup_count_vm_event(struct mm_struct *mm,
> + enum vm_event_item idx)
> +{
> + if (mem_cgroup_disabled() || !mm)
> + return;
> + __mem_cgroup_count_vm_event(mm, idx);
> +}

Does the !mm case occur frequently enough to justify inlining it, or
should that test remain out-of-line?

2012-11-21 01:03:00

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: [patch] mm, memcg: avoid unnecessary function call when memcg is disabled

(2012/11/21 6:49), Andrew Morton wrote:
> On Mon, 19 Nov 2012 17:44:34 -0800 (PST)
> David Rientjes <[email protected]> wrote:
>
>> While profiling numa/core v16 with cgroup_disable=memory on the command
>> line, I noticed mem_cgroup_count_vm_event() still showed up as high as
>> 0.60% in perftop.
>>
>> This occurs because the function is called extremely often even when memcg
>> is disabled.
>>
>> To fix this, inline the check for mem_cgroup_disabled() so we avoid the
>> unnecessary function call if memcg is disabled.
>>
>> ...
>>
>> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
>> --- a/include/linux/memcontrol.h
>> +++ b/include/linux/memcontrol.h
>> @@ -181,7 +181,14 @@ unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
>> gfp_t gfp_mask,
>> unsigned long *total_scanned);
>>
>> -void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
>> +void __mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
>> +static inline void mem_cgroup_count_vm_event(struct mm_struct *mm,
>> + enum vm_event_item idx)
>> +{
>> + if (mem_cgroup_disabled() || !mm)
>> + return;
>> + __mem_cgroup_count_vm_event(mm, idx);
>> +}
>
> Does the !mm case occur frequently enough to justify inlining it, or
> should that test remain out-of-line?
>
I think this should be out-of-line.

Thanks,
-Kame

2012-11-21 02:48:56

by David Rientjes

[permalink] [raw]
Subject: [patch] mm, memcg: avoid unnecessary function call when memcg is disabled fix

Move the check for !mm out of line as suggested by Andrew.

Signed-off-by: David Rientjes <[email protected]>
---
include/linux/memcontrol.h | 2 +-
mm/memcontrol.c | 3 +++
2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -185,7 +185,7 @@ void __mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
static inline void mem_cgroup_count_vm_event(struct mm_struct *mm,
enum vm_event_item idx)
{
- if (mem_cgroup_disabled() || !mm)
+ if (mem_cgroup_disabled())
return;
__mem_cgroup_count_vm_event(mm, idx);
}
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1021,6 +1021,9 @@ void __mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx)
{
struct mem_cgroup *memcg;

+ if (!mm)
+ return;
+
rcu_read_lock();
memcg = mem_cgroup_from_task(rcu_dereference(mm->owner));
if (unlikely(!memcg))

2012-11-21 04:11:38

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: [patch] mm, memcg: avoid unnecessary function call when memcg is disabled fix

(2012/11/21 11:48), David Rientjes wrote:
> Move the check for !mm out of line as suggested by Andrew.
>
> Signed-off-by: David Rientjes <[email protected]>

Thank you very much !

Acked-by: KAMEZAWA Hiroyuki <[email protected]>


> ---
> include/linux/memcontrol.h | 2 +-
> mm/memcontrol.c | 3 +++
> 2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -185,7 +185,7 @@ void __mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
> static inline void mem_cgroup_count_vm_event(struct mm_struct *mm,
> enum vm_event_item idx)
> {
> - if (mem_cgroup_disabled() || !mm)
> + if (mem_cgroup_disabled())
> return;
> __mem_cgroup_count_vm_event(mm, idx);
> }
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1021,6 +1021,9 @@ void __mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx)
> {
> struct mem_cgroup *memcg;
>
> + if (!mm)
> + return;
> +
> rcu_read_lock();
> memcg = mem_cgroup_from_task(rcu_dereference(mm->owner));
> if (unlikely(!memcg))
>

2012-11-21 08:35:24

by Michal Hocko

[permalink] [raw]
Subject: Re: [patch] mm, memcg: avoid unnecessary function call when memcg is disabled

On Tue 20-11-12 13:49:32, Andrew Morton wrote:
> On Mon, 19 Nov 2012 17:44:34 -0800 (PST)
> David Rientjes <[email protected]> wrote:
>
> > While profiling numa/core v16 with cgroup_disable=memory on the command
> > line, I noticed mem_cgroup_count_vm_event() still showed up as high as
> > 0.60% in perftop.
> >
> > This occurs because the function is called extremely often even when memcg
> > is disabled.
> >
> > To fix this, inline the check for mem_cgroup_disabled() so we avoid the
> > unnecessary function call if memcg is disabled.
> >
> > ...
> >
> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -181,7 +181,14 @@ unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
> > gfp_t gfp_mask,
> > unsigned long *total_scanned);
> >
> > -void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
> > +void __mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
> > +static inline void mem_cgroup_count_vm_event(struct mm_struct *mm,
> > + enum vm_event_item idx)
> > +{
> > + if (mem_cgroup_disabled() || !mm)
> > + return;
> > + __mem_cgroup_count_vm_event(mm, idx);
> > +}
>
> Does the !mm case occur frequently enough to justify inlining it, or
> should that test remain out-of-line?

Now that you've asked about it I started looking around and I cannot see
how mm can ever be NULL. The condition is there since the very beginning
(456f998e memcg: add the pagefault count into memcg stats) but all the
callers are page fault handlers and those shouldn't have mm==NULL.
Or is there anything obvious I am missing?

Ying, the whole thread starts https://lkml.org/lkml/2012/11/19/545 but
the primary question is why we need !mm test for mem_cgroup_count_vm_event
at all.

Thanks!
--
Michal Hocko
SUSE Labs

2012-11-28 23:29:30

by Hugh Dickins

[permalink] [raw]
Subject: Re: [patch] mm, memcg: avoid unnecessary function call when memcg is disabled

On Wed, 21 Nov 2012, Michal Hocko wrote:
> On Tue 20-11-12 13:49:32, Andrew Morton wrote:
> > On Mon, 19 Nov 2012 17:44:34 -0800 (PST)
> > David Rientjes <[email protected]> wrote:
> >
> > > While profiling numa/core v16 with cgroup_disable=memory on the command
> > > line, I noticed mem_cgroup_count_vm_event() still showed up as high as
> > > 0.60% in perftop.
> > >
> > > This occurs because the function is called extremely often even when memcg
> > > is disabled.
> > >
> > > To fix this, inline the check for mem_cgroup_disabled() so we avoid the
> > > unnecessary function call if memcg is disabled.
> > >
> > > ...
> > >
> > > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > > --- a/include/linux/memcontrol.h
> > > +++ b/include/linux/memcontrol.h
> > > @@ -181,7 +181,14 @@ unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
> > > gfp_t gfp_mask,
> > > unsigned long *total_scanned);
> > >
> > > -void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
> > > +void __mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
> > > +static inline void mem_cgroup_count_vm_event(struct mm_struct *mm,
> > > + enum vm_event_item idx)
> > > +{
> > > + if (mem_cgroup_disabled() || !mm)
> > > + return;
> > > + __mem_cgroup_count_vm_event(mm, idx);
> > > +}
> >
> > Does the !mm case occur frequently enough to justify inlining it, or
> > should that test remain out-of-line?
>
> Now that you've asked about it I started looking around and I cannot see
> how mm can ever be NULL. The condition is there since the very beginning
> (456f998e memcg: add the pagefault count into memcg stats) but all the
> callers are page fault handlers and those shouldn't have mm==NULL.
> Or is there anything obvious I am missing?
>
> Ying, the whole thread starts https://lkml.org/lkml/2012/11/19/545 but
> the primary question is why we need !mm test for mem_cgroup_count_vm_event
> at all.

Here's a guess: as Ying's 456f998e patch started out in akpm's tree,
shmem.c was calling mem_cgroup_count_vm_event(current->mm, PGMAJFAULT).

Then I insisted that was inconsistent with how we usually account when
one task touches another's address space, and rearranged it to work on
vma->vm_mm instead.

Done the original way, if the touching task were a kernel daemon (KSM's
ksmd comes to my mind), then the current->mm could well have been NULL.

I agree with you that it looks redundant now.

Hugh

2012-11-29 13:29:01

by Michal Hocko

[permalink] [raw]
Subject: [PATCH] memcg: do not check for mm in mem_cgroup_count_vm_event disabled

On Wed 28-11-12 15:29:30, Hugh Dickins wrote:
> On Wed, 21 Nov 2012, Michal Hocko wrote:
> > On Tue 20-11-12 13:49:32, Andrew Morton wrote:
> > > On Mon, 19 Nov 2012 17:44:34 -0800 (PST)
> > > David Rientjes <[email protected]> wrote:
[...]
> > > > -void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
> > > > +void __mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
> > > > +static inline void mem_cgroup_count_vm_event(struct mm_struct *mm,
> > > > + enum vm_event_item idx)
> > > > +{
> > > > + if (mem_cgroup_disabled() || !mm)
> > > > + return;
> > > > + __mem_cgroup_count_vm_event(mm, idx);
> > > > +}
> > >
> > > Does the !mm case occur frequently enough to justify inlining it, or
> > > should that test remain out-of-line?
> >
> > Now that you've asked about it I started looking around and I cannot see
> > how mm can ever be NULL. The condition is there since the very beginning
> > (456f998e memcg: add the pagefault count into memcg stats) but all the
> > callers are page fault handlers and those shouldn't have mm==NULL.
> > Or is there anything obvious I am missing?
> >
> > Ying, the whole thread starts https://lkml.org/lkml/2012/11/19/545 but
> > the primary question is why we need !mm test for mem_cgroup_count_vm_event
> > at all.
>
> Here's a guess: as Ying's 456f998e patch started out in akpm's tree,
> shmem.c was calling mem_cgroup_count_vm_event(current->mm, PGMAJFAULT).
>
> Then I insisted that was inconsistent with how we usually account when
> one task touches another's address space, and rearranged it to work on
> vma->vm_mm instead.

Thanks Hugh!

> Done the original way, if the touching task were a kernel daemon (KSM's
> ksmd comes to my mind), then the current->mm could well have been NULL.
>
> I agree with you that it looks redundant now.

Andrew could you please pick this up?
---
>From 619b1ab26c3e96944f6c60256cf7920671bafa5b Mon Sep 17 00:00:00 2001
From: Michal Hocko <[email protected]>
Date: Thu, 29 Nov 2012 14:20:58 +0100
Subject: [PATCH] memcg: do not check for mm in mem_cgroup_count_vm_event

mm given to mem_cgroup_count_vm_event cannot be NULL because the
function is either called from the page fault path or vma->vm_mm is
used. So the check can be dropped.

The check has been introduced by 456f998e (memcg: add the pagefault
count into memcg stats) because the originally proposed patch used
current->mm for shmem but this has been changed to vma->vm_mm later on
without the check being removed (thanks to Hugh for this recollection).

Signed-off-by: Michal Hocko <[email protected]>
---
include/linux/memcontrol.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 095d2b4..0108a56 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -188,7 +188,7 @@ void __mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
static inline void mem_cgroup_count_vm_event(struct mm_struct *mm,
enum vm_event_item idx)
{
- if (mem_cgroup_disabled() || !mm)
+ if (mem_cgroup_disabled())
return;
__mem_cgroup_count_vm_event(mm, idx);
}
--
1.7.10.4

--
Michal Hocko
SUSE Labs