2013-10-01 23:31:29

by David Rientjes

Subject: [patch for-3.12] mm, memcg: protect mem_cgroup_read_events for cpu hotplug

for_each_online_cpu() needs the protection of {get,put}_online_cpus() so
cpu_online_mask doesn't change during the iteration.

Signed-off-by: David Rientjes <[email protected]>
---
mm/memcontrol.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -866,6 +866,7 @@ static unsigned long mem_cgroup_read_events(struct mem_cgroup *memcg,
 	unsigned long val = 0;
 	int cpu;

+	get_online_cpus();
 	for_each_online_cpu(cpu)
 		val += per_cpu(memcg->stat->events[idx], cpu);
 #ifdef CONFIG_HOTPLUG_CPU
@@ -873,6 +874,7 @@ static unsigned long mem_cgroup_read_events(struct mem_cgroup *memcg,
 	val += memcg->nocpu_base.events[idx];
 	spin_unlock(&memcg->pcp_counter_lock);
 #endif
+	put_online_cpus();
 	return val;
 }


2013-10-02 00:46:28

by KOSAKI Motohiro

Subject: Re: [patch for-3.12] mm, memcg: protect mem_cgroup_read_events for cpu hotplug

(10/1/13 7:31 PM), David Rientjes wrote:
> for_each_online_cpu() needs the protection of {get,put}_online_cpus() so
> cpu_online_mask doesn't change during the iteration.
>
> Signed-off-by: David Rientjes <[email protected]>

Acked-by: KOSAKI Motohiro <[email protected]>

2013-10-02 02:22:40

by Johannes Weiner

Subject: Re: [patch for-3.12] mm, memcg: protect mem_cgroup_read_events for cpu hotplug

On Tue, Oct 01, 2013 at 04:31:23PM -0700, David Rientjes wrote:
> for_each_online_cpu() needs the protection of {get,put}_online_cpus() so
> cpu_online_mask doesn't change during the iteration.

There is no problem report here.

Is there a crash?

If it's just accuracy of the read, why would we care about some
inaccuracies in counters that can change before you even get the
results to userspace? And care to the point where we hold up CPU
hotplugging for this?

Also, the fact that you directly sent this to Linus suggests there is
some urgency for this fix. What's going on?

Thanks,
Johannes

2013-10-02 03:08:48

by David Rientjes

Subject: Re: [patch for-3.12] mm, memcg: protect mem_cgroup_read_events for cpu hotplug

On Tue, 1 Oct 2013, Johannes Weiner wrote:

> On Tue, Oct 01, 2013 at 04:31:23PM -0700, David Rientjes wrote:
> > for_each_online_cpu() needs the protection of {get,put}_online_cpus() so
> > cpu_online_mask doesn't change during the iteration.
>
> There is no problem report here.
>
> Is there a crash?
>

No.

> If it's just accuracy of the read, why would we care about some
> inaccuracies in counters that can change before you even get the
> results to userspace? And care to the point where we hold up CPU
> hotplugging for this?
>

cpu_hotplug.lock is held while a cpu is going down; it's a coarse lock
that is used kernel-wide to synchronize cpu hotplug activity. Memcg has
a cpu hotplug notifier, called while there may not be any cpu hotplug
refcounts, which drains per-cpu event counts to memcg->nocpu_base.events
to maintain a cumulative event count as cpus disappear. Without
get_online_cpus() in mem_cgroup_read_events(), it's possible to account
for the event count on a dying cpu twice, and this value may be
large.

In fact, every use of memcg->pcp_counter_lock should be nested inside
{get,put}_online_cpus().

This fixes that issue and ensures the reported statistics are not vastly
over-reported during cpu hotplug.

> Also, the fact that you directly sent this to Linus suggests there is
> some urgency for this fix. What's going on?
>

I believe users of cpu hotplug still want event counts that
approximate the real value and that this is 3.12 material.
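
The double accounting described in the reply above can be reproduced with a small userspace model. This is only a sketch, not kernel code: the arrays and the names reset_state(), cpu_offline_drain(), sum_online(), read_events_racy() and read_events_protected() are hypothetical stand-ins for the per-cpu event counters, memcg->nocpu_base.events, cpu_online_mask and the memcg hotplug notifier, and the interleaving is forced by hand rather than produced by real concurrency.

```c
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical stand-ins for the state discussed in the thread. */
unsigned long percpu_events[NR_CPUS]; /* models per_cpu(memcg->stat->events[idx], cpu) */
unsigned long nocpu_base_events;      /* models memcg->nocpu_base.events[idx] */
bool cpu_online[NR_CPUS];             /* models cpu_online_mask */

/* Four online cpus with 100 events each, so the true total is 400. */
void reset_state(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		percpu_events[cpu] = 100;
		cpu_online[cpu] = true;
	}
	nocpu_base_events = 0;
}

/* Models the hotplug notifier: fold the dying cpu's count into the base. */
void cpu_offline_drain(int cpu)
{
	nocpu_base_events += percpu_events[cpu];
	percpu_events[cpu] = 0;
	cpu_online[cpu] = false;
}

/* Models the for_each_online_cpu() loop in the reader. */
unsigned long sum_online(void)
{
	unsigned long val = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu])
			val += percpu_events[cpu];
	return val;
}

/* Unprotected reader: the cpu goes down between the loop and the base read. */
unsigned long read_events_racy(int dying_cpu)
{
	unsigned long val = sum_online();  /* already includes dying_cpu's count */

	cpu_offline_drain(dying_cpu);      /* hotplug slips in here */
	val += nocpu_base_events;          /* same count is added a second time */
	return val;
}

/* Protected reader: hotplug is held off until the whole read has finished. */
unsigned long read_events_protected(int dying_cpu)
{
	unsigned long val = sum_online();

	val += nocpu_base_events;
	cpu_offline_drain(dying_cpu);      /* runs only after the read completes */
	return val;
}
```

With the state from reset_state(), read_events_racy(3) returns 500 although only 400 events exist, while read_events_protected(3) returns 400. In the model the over-report equals the dying cpu's count, which is why the error can be large on a cpu that has accumulated many events.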