2019-01-09 04:02:46

by Rik van Riel

[permalink] [raw]
Subject: [PATCH] mm,slab,memcg: call memcg kmem put cache with same condition as get

There is an imbalance between when slab_pre_alloc_hook calls
memcg_kmem_get_cache and when slab_post_alloc_hook calls
memcg_kmem_put_cache.

This can cause a memcg kmem cache to be destroyed right as
an object from that cache is being allocated, which is probably
not good. It could lead to things like a memcg allocating new
kmalloc slabs instead of using freed space in old ones, maybe
memory leaks, and maybe oopses as a memcg kmalloc slab is getting
destroyed on one CPU while another CPU is trying to do an allocation
from that same memcg.

The obvious fix would be to use the same condition for calling
memcg_kmem_put_cache that we also use to decide whether to call
memcg_kmem_get_cache.

I am not sure how long this bug has been around, since the last
changeset to touch that code - 452647784b2f ("mm: memcontrol: cleanup
kmem charge functions") - merely moved the bug from one location to
another. I am still tagging that changeset, because the fix should
automatically apply that far back.

Signed-off-by: Rik van Riel <[email protected]>
Fixes: 452647784b2f ("mm: memcontrol: cleanup kmem charge functions")
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Alexey Dobriyan <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Tejun Heo <[email protected]>
---
mm/slab.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/slab.h b/mm/slab.h
index 4190c24ef0e9..ab3d95bef8a0 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -444,7 +444,8 @@ static inline void slab_post_alloc_hook(struct kmem_cache *s, gfp_t flags,
p[i] = kasan_slab_alloc(s, object, flags);
}

- if (memcg_kmem_enabled())
+ if (memcg_kmem_enabled() &&
+ ((flags & __GFP_ACCOUNT) || (s->flags & SLAB_ACCOUNT)))
memcg_kmem_put_cache(s);
}

--
2.17.1



2019-01-09 05:39:40

by Shakeel Butt

[permalink] [raw]
Subject: Re: [PATCH] mm,slab,memcg: call memcg kmem put cache with same condition as get

On Tue, Jan 8, 2019 at 8:01 PM Rik van Riel <[email protected]> wrote:
>
> There is an imbalance between when slab_pre_alloc_hook calls
> memcg_kmem_get_cache and when slab_post_alloc_hook calls
> memcg_kmem_put_cache.
>

Can you explain how there is an imbalance? If the returned kmem cache
from memcg_kmem_get_cache() is the memcg kmem cache then the refcnt of
memcg is elevated and the memcg_kmem_put_cache() will correctly
decrement the refcnt of the memcg.

> This can cause a memcg kmem cache to be destroyed right as
> an object from that cache is being allocated, which is probably
> not good. It could lead to things like a memcg allocating new
> kmalloc slabs instead of using freed space in old ones, maybe
> memory leaks, and maybe oopses as a memcg kmalloc slab is getting
> destroyed on one CPU while another CPU is trying to do an allocation
> from that same memcg.
>
> The obvious fix would be to use the same condition for calling
> memcg_kmem_put_cache that we also use to decide whether to call
> memcg_kmem_get_cache.
>
> I am not sure how long this bug has been around, since the last
> changeset to touch that code - 452647784b2f ("mm: memcontrol: cleanup
> kmem charge functions") - merely moved the bug from one location to
> another. I am still tagging that changeset, because the fix should
> automatically apply that far back.
>
> Signed-off-by: Rik van Riel <[email protected]>
> Fixes: 452647784b2f ("mm: memcontrol: cleanup kmem charge functions")
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: Alexey Dobriyan <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Pekka Enberg <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: David Rientjes <[email protected]>
> Cc: Joonsoo Kim <[email protected]>
> Cc: Johannes Weiner <[email protected]>
> Cc: Tejun Heo <[email protected]>
> ---
> mm/slab.h | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/slab.h b/mm/slab.h
> index 4190c24ef0e9..ab3d95bef8a0 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -444,7 +444,8 @@ static inline void slab_post_alloc_hook(struct kmem_cache *s, gfp_t flags,
> p[i] = kasan_slab_alloc(s, object, flags);
> }
>
> - if (memcg_kmem_enabled())
> + if (memcg_kmem_enabled() &&
> + ((flags & __GFP_ACCOUNT) || (s->flags & SLAB_ACCOUNT)))

I don't think these extra checks are needed. They are safe but not needed.

> memcg_kmem_put_cache(s);
> }
>
> --
> 2.17.1
>

thanks,
Shakeel

2019-01-09 05:57:20

by Shakeel Butt

[permalink] [raw]
Subject: Re: [PATCH] mm,slab,memcg: call memcg kmem put cache with same condition as get

On Tue, Jan 8, 2019 at 9:36 PM Shakeel Butt <[email protected]> wrote:
>
> On Tue, Jan 8, 2019 at 8:01 PM Rik van Riel <[email protected]> wrote:
> >
> > There is an imbalance between when slab_pre_alloc_hook calls
> > memcg_kmem_get_cache and when slab_post_alloc_hook calls
> > memcg_kmem_put_cache.
> >
>
> Can you explain how there is an imbalance? If the returned kmem cache
> from memcg_kmem_get_cache() is the memcg kmem cache then the refcnt of
> memcg is elevated and the memcg_kmem_put_cache() will correctly
> decrement the refcnt of the memcg.
>
> > This can cause a memcg kmem cache to be destroyed right as
> > an object from that cache is being allocated,

Also please note that the memcg kmem caches are destroyed (if empty)
on memcg offline. The css_tryget_online() within
memcg_kmem_get_cache() will fail.

See kernel/cgroup/cgroup.c
* 2. When the percpu_ref is confirmed to be visible as killed on all CPUs
* and thus css_tryget_online() is guaranteed to fail, the css can be
* offlined by invoking offline_css(). After offlining, the base ref is
* put. Implemented in css_killed_work_fn().

> > which is probably
> > not good. It could lead to things like a memcg allocating new
> > kmalloc slabs instead of using freed space in old ones, maybe
> > memory leaks, and maybe oopses as a memcg kmalloc slab is getting
> > destroyed on one CPU while another CPU is trying to do an allocation
> > from that same memcg.
> >
> > The obvious fix would be to use the same condition for calling
> > memcg_kmem_put_cache that we also use to decide whether to call
> > memcg_kmem_get_cache.
> >
> > I am not sure how long this bug has been around, since the last
> > changeset to touch that code - 452647784b2f ("mm: memcontrol: cleanup
> > kmem charge functions") - merely moved the bug from one location to
> > another. I am still tagging that changeset, because the fix should
> > automatically apply that far back.
> >
> > Signed-off-by: Rik van Riel <[email protected]>
> > Fixes: 452647784b2f ("mm: memcontrol: cleanup kmem charge functions")
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: Alexey Dobriyan <[email protected]>
> > Cc: Christoph Lameter <[email protected]>
> > Cc: Pekka Enberg <[email protected]>
> > Cc: Andrew Morton <[email protected]>
> > Cc: David Rientjes <[email protected]>
> > Cc: Joonsoo Kim <[email protected]>
> > Cc: Johannes Weiner <[email protected]>
> > Cc: Tejun Heo <[email protected]>
> > ---
> > mm/slab.h | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/slab.h b/mm/slab.h
> > index 4190c24ef0e9..ab3d95bef8a0 100644
> > --- a/mm/slab.h
> > +++ b/mm/slab.h
> > @@ -444,7 +444,8 @@ static inline void slab_post_alloc_hook(struct kmem_cache *s, gfp_t flags,
> > p[i] = kasan_slab_alloc(s, object, flags);
> > }
> >
> > - if (memcg_kmem_enabled())
> > + if (memcg_kmem_enabled() &&
> > + ((flags & __GFP_ACCOUNT) || (s->flags & SLAB_ACCOUNT)))
>
> I don't think these extra checks are needed. They are safe but not needed.
>
> > memcg_kmem_put_cache(s);
> > }
> >
> > --
> > 2.17.1
> >
>
> thanks,
> Shakeel

2019-01-09 14:13:00

by Rik van Riel

[permalink] [raw]
Subject: Re: [PATCH] mm,slab,memcg: call memcg kmem put cache with same condition as get

On Tue, 2019-01-08 at 21:36 -0800, Shakeel Butt wrote:
> On Tue, Jan 8, 2019 at 8:01 PM Rik van Riel <[email protected]> wrote:
> >
> > There is an imbalance between when slab_pre_alloc_hook calls
> > memcg_kmem_get_cache and when slab_post_alloc_hook calls
> > memcg_kmem_put_cache.
> >
>
> Can you explain how there is an imbalance? If the returned kmem cache
> from memcg_kmem_get_cache() is the memcg kmem cache then the refcnt
> of
> memcg is elevated and the memcg_kmem_put_cache() will correctly
> decrement the refcnt of the memcg.

Indeed, you are right. Never mind this patch.

Back to square one on that bug.

--
All Rights Reversed.


Attachments:
signature.asc (499.00 B)
This is a digitally signed message part