2021-07-28 09:18:24

by Wang Hai

[permalink] [raw]
Subject: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()

When I use kfree_rcu() to free a large memory allocated by
kmalloc_node(), the following dump occurs.

BUG: kernel NULL pointer dereference, address: 0000000000000020
[...]
Oops: 0000 [#1] SMP
[...]
Workqueue: events kfree_rcu_work
RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
[...]
Call Trace:
kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
kfree_bulk include/linux/slab.h:413 [inline]
kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
process_one_work+0x207/0x530 kernel/workqueue.c:2276
worker_thread+0x320/0x610 kernel/workqueue.c:2422
kthread+0x13d/0x160 kernel/kthread.c:313
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

When kmalloc_node() a large memory, page is allocated, not slab,
so when freeing memory via kfree_rcu(), this large memory should not
be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is
is used for slab.

So in this case, there is no need to do anything with this large
page in memcg_slab_free_hook(), just skip it.

Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
Signed-off-by: Wang Hai <[email protected]>
---
mm/slab.h | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/mm/slab.h b/mm/slab.h
index 67e06637ff2e..247d3f9c21f7 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -339,15 +339,20 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s_orig,
continue;

page = virt_to_head_page(p[i]);
+ if (!s_orig) {
+ if (unlikely(!PageSlab(page))) {
+ BUG_ON(!PageCompound(page));
+ continue;
+ }
+ s = page->slab_cache;
+ } else {
+ s = s_orig;
+ }
+
objcgs = page_objcgs(page);
if (!objcgs)
continue;

- if (!s_orig)
- s = page->slab_cache;
- else
- s = s_orig;
-
off = obj_to_index(s, page, p[i]);
objcg = objcgs[off];
if (!objcg)
--
2.17.1



2021-07-28 10:51:21

by Kefeng Wang

[permalink] [raw]
Subject: Re: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()


On 2021/7/28 17:13, Wang Hai wrote:
> When I use kfree_rcu() to free a large memory allocated by
> kmalloc_node(), the following dump occurs.
>
> BUG: kernel NULL pointer dereference, address: 0000000000000020
> [...]
> Oops: 0000 [#1] SMP
> [...]
> Workqueue: events kfree_rcu_work
> RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
> RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
> RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
> [...]
> Call Trace:
> kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
> kfree_bulk include/linux/slab.h:413 [inline]
> kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
> process_one_work+0x207/0x530 kernel/workqueue.c:2276
> worker_thread+0x320/0x610 kernel/workqueue.c:2422
> kthread+0x13d/0x160 kernel/kthread.c:313
> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
>
> When kmalloc_node() a large memory, page is allocated, not slab,
> so when freeing memory via kfree_rcu(), this large memory should not
> be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is
> is used for slab.
>
> So in this case, there is no need to do anything with this large
> page in memcg_slab_free_hook(), just skip it.
>
> Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
> Signed-off-by: Wang Hai <[email protected]>
Reviewed-by: Kefeng Wang <[email protected]>
> ---
> mm/slab.h | 15 ++++++++++-----
> 1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/mm/slab.h b/mm/slab.h
> index 67e06637ff2e..247d3f9c21f7 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -339,15 +339,20 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s_orig,
> continue;
>
> page = virt_to_head_page(p[i]);
> + if (!s_orig) {
> + if (unlikely(!PageSlab(page))) {
> + BUG_ON(!PageCompound(page));
> + continue;
> + }
> + s = page->slab_cache;
> + } else {
> + s = s_orig;
> + }
> +
> objcgs = page_objcgs(page);
> if (!objcgs)
> continue;
>
> - if (!s_orig)
> - s = page->slab_cache;
> - else
> - s = s_orig;
> -
> off = obj_to_index(s, page, p[i]);
> objcg = objcgs[off];
> if (!objcg)

2021-07-28 13:24:13

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()

On Wed 28-07-21 17:13:48, Wang Hai wrote:
> When I use kfree_rcu() to free a large memory allocated by
> kmalloc_node(), the following dump occurs.
>
> BUG: kernel NULL pointer dereference, address: 0000000000000020
> [...]
> Oops: 0000 [#1] SMP
> [...]
> Workqueue: events kfree_rcu_work
> RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
> RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
> RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
> [...]
> Call Trace:
> kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
> kfree_bulk include/linux/slab.h:413 [inline]
> kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
> process_one_work+0x207/0x530 kernel/workqueue.c:2276
> worker_thread+0x320/0x610 kernel/workqueue.c:2422
> kthread+0x13d/0x160 kernel/kthread.c:313
> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
>
> When kmalloc_node() a large memory, page is allocated, not slab,
> so when freeing memory via kfree_rcu(), this large memory should not
> be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is
> is used for slab.
>
> So in this case, there is no need to do anything with this large
> page in memcg_slab_free_hook(), just skip it.
>
> Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")

Are you sure that this commit is really breaking the code. Unless I have
missed something there shouldn't be any real change wrt. large
allocations here. page_has_obj_cgroups is just a different name for what
what page_objcgs is giving us.

I haven't studied the kfree_rcu part but isn't the problem its use of
kmem_cache_free_bulk or isn't the problem right there in the bulk free?

> Signed-off-by: Wang Hai <[email protected]>
> ---
> mm/slab.h | 15 ++++++++++-----
> 1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/mm/slab.h b/mm/slab.h
> index 67e06637ff2e..247d3f9c21f7 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -339,15 +339,20 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s_orig,
> continue;
>
> page = virt_to_head_page(p[i]);
> + if (!s_orig) {
> + if (unlikely(!PageSlab(page))) {
> + BUG_ON(!PageCompound(page));

BUG_ON is not really a good idea here. Why should we crash the kernel
just because of an unexpected page showing up. Leaking it would be more
appropriate (the same would apply to kfree btw). I would just warn
here. Also don't we need any hookd here. Looking at kfree path it does
call kfree_hook. Why is that not needed here?

> + continue;
> + }
> + s = page->slab_cache;
> + } else {
> + s = s_orig;
> + }
> +
> objcgs = page_objcgs(page);
> if (!objcgs)
> continue;
>
> - if (!s_orig)
> - s = page->slab_cache;
> - else
> - s = s_orig;
> -
> off = obj_to_index(s, page, p[i]);
> objcg = objcgs[off];
> if (!objcg)
> --
> 2.17.1

--
Michal Hocko
SUSE Labs

2021-07-28 14:11:57

by Shakeel Butt

[permalink] [raw]
Subject: Re: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()

+Roman

On Wed, Jul 28, 2021 at 6:23 AM Michal Hocko <[email protected]> wrote:
>
> On Wed 28-07-21 17:13:48, Wang Hai wrote:
> > When I use kfree_rcu() to free a large memory allocated by
> > kmalloc_node(), the following dump occurs.
> >
> > BUG: kernel NULL pointer dereference, address: 0000000000000020
> > [...]
> > Oops: 0000 [#1] SMP
> > [...]
> > Workqueue: events kfree_rcu_work
> > RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
> > RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
> > RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
> > [...]
> > Call Trace:
> > kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
> > kfree_bulk include/linux/slab.h:413 [inline]
> > kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
> > process_one_work+0x207/0x530 kernel/workqueue.c:2276
> > worker_thread+0x320/0x610 kernel/workqueue.c:2422
> > kthread+0x13d/0x160 kernel/kthread.c:313
> > ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> >
> > When kmalloc_node() a large memory, page is allocated, not slab,
> > so when freeing memory via kfree_rcu(), this large memory should not
> > be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is
> > is used for slab.
> >
> > So in this case, there is no need to do anything with this large
> > page in memcg_slab_free_hook(), just skip it.
> >
> > Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
>
> Are you sure that this commit is really breaking the code. Unless I have
> missed something there shouldn't be any real change wrt. large
> allocations here. page_has_obj_cgroups is just a different name for what
> what page_objcgs is giving us.

Actually they are different. For MEMCG_DATA_KMEM page,
page_has_obj_cgroups() will return false while page_objcgs() on
non-VM_DEBUG kernels will return "struct obj_cgroup *" instead of
"struct obj_cgroup **".

>
> I haven't studied the kfree_rcu part but isn't the problem its use of
> kmem_cache_free_bulk or isn't the problem right there in the bulk free?
>

SLUB's kmem_cache_free_bulk() is doing two passes. In first pass
uncharges all the objects and in the second pass frees them to the
slub infra. There is nothing wrong with that. It is just that SLUB
mixes page (for higher order) and slab allocations and requires
special handling.

> > Signed-off-by: Wang Hai <[email protected]>
> > ---
> > mm/slab.h | 15 ++++++++++-----
> > 1 file changed, 10 insertions(+), 5 deletions(-)
> >
> > diff --git a/mm/slab.h b/mm/slab.h
> > index 67e06637ff2e..247d3f9c21f7 100644
> > --- a/mm/slab.h
> > +++ b/mm/slab.h
> > @@ -339,15 +339,20 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s_orig,
> > continue;
> >
> > page = virt_to_head_page(p[i]);
> > + if (!s_orig) {
> > + if (unlikely(!PageSlab(page))) {
> > + BUG_ON(!PageCompound(page));
>
> BUG_ON is not really a good idea here. Why should we crash the kernel
> just because of an unexpected page showing up. Leaking it would be more
> appropriate (the same would apply to kfree btw). I would just warn
> here.

The simplest fix would be to not call memcg_slab_free_hook() in
kmem_cache_free_bulk() because slab_free() will call
memcg_slab_free_hook() for individual objects. Not sure if there will
be any performance impact.

> Also don't we need any hookd here. Looking at kfree path it does
> call kfree_hook. Why is that not needed here?

These are called later in build_detached_freelist().

2021-07-28 14:22:51

by Kefeng Wang

[permalink] [raw]
Subject: Re: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()


On 2021/7/28 21:23, Michal Hocko wrote:
> On Wed 28-07-21 17:13:48, Wang Hai wrote:
>> When I use kfree_rcu() to free a large memory allocated by
>> kmalloc_node(), the following dump occurs.
>>
>> BUG: kernel NULL pointer dereference, address: 0000000000000020
>> [...]
>> Oops: 0000 [#1] SMP
>> [...]
>> Workqueue: events kfree_rcu_work
>> RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
>> RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
>> RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
>> [...]
>> Call Trace:
>> kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
>> kfree_bulk include/linux/slab.h:413 [inline]
>> kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
>> process_one_work+0x207/0x530 kernel/workqueue.c:2276
>> worker_thread+0x320/0x610 kernel/workqueue.c:2422
>> kthread+0x13d/0x160 kernel/kthread.c:313
>> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
>>
>> When kmalloc_node() a large memory, page is allocated, not slab,
>> so when freeing memory via kfree_rcu(), this large memory should not
>> be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is
>> is used for slab.
>>
>> So in this case, there is no need to do anything with this large
>> page in memcg_slab_free_hook(), just skip it.
>>
>> Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
> Are you sure that this commit is really breaking the code. Unless I have
Yes, we confirmed that this commit introduces the bug.
> missed something there shouldn't be any real change wrt. large
> allocations here. page_has_obj_cgroups is just a different name for what
> what page_objcgs is giving us.

maybe we could simply use page_objcgs_check to fix the issue ? we will
check it again.

diff --git a/mm/slab.h b/mm/slab.h
index f997fd5e42c8..58c01a34e5b8 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -346,7 +346,7 @@ static inline void memcg_slab_free_hook(struct
kmem_cache *s_orig,
                        continue;

                page = virt_to_head_page(p[i]);
-               objcgs = page_objcgs(page);
+               objcgs = page_objcgs_check(page);
                if (!objcgs)
                        continue;

>
> I haven't studied the kfree_rcu part but isn't the problem its use of
> kmem_cache_free_bulk or isn't the problem right there in the bulk free?
>
>> Signed-off-by: Wang Hai <[email protected]>
>> ---
>> mm/slab.h | 15 ++++++++++-----
>> 1 file changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/mm/slab.h b/mm/slab.h
>> index 67e06637ff2e..247d3f9c21f7 100644
>> --- a/mm/slab.h
>> +++ b/mm/slab.h
>> @@ -339,15 +339,20 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s_orig,
>> continue;
>>
>> page = virt_to_head_page(p[i]);
>> + if (!s_orig) {
>> + if (unlikely(!PageSlab(page))) {
>> + BUG_ON(!PageCompound(page));
the same logical is in build_detached_freelist()
> BUG_ON is not really a good idea here. Why should we crash the kernel
> just because of an unexpected page showing up. Leaking it would be more
> appropriate (the same would apply to kfree btw). I would just warn
> here. Also don't we need any hookd here. Looking at kfree path it does
> call kfree_hook. Why is that not needed here?
kmem_cache_free_bulk
-- memcg_slab_free_hook  // skip objcgs ops for page
-- build_detached_freelist  // kfree_hook and page free


2021-07-28 14:32:20

by Shakeel Butt

[permalink] [raw]
Subject: Re: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()

On Wed, Jul 28, 2021 at 7:21 AM Kefeng Wang <[email protected]> wrote:
>
>
> On 2021/7/28 21:23, Michal Hocko wrote:
> > On Wed 28-07-21 17:13:48, Wang Hai wrote:
> >> When I use kfree_rcu() to free a large memory allocated by
> >> kmalloc_node(), the following dump occurs.
> >>
> >> BUG: kernel NULL pointer dereference, address: 0000000000000020
> >> [...]
> >> Oops: 0000 [#1] SMP
> >> [...]
> >> Workqueue: events kfree_rcu_work
> >> RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
> >> RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
> >> RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
> >> [...]
> >> Call Trace:
> >> kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
> >> kfree_bulk include/linux/slab.h:413 [inline]
> >> kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
> >> process_one_work+0x207/0x530 kernel/workqueue.c:2276
> >> worker_thread+0x320/0x610 kernel/workqueue.c:2422
> >> kthread+0x13d/0x160 kernel/kthread.c:313
> >> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> >>
> >> When kmalloc_node() a large memory, page is allocated, not slab,
> >> so when freeing memory via kfree_rcu(), this large memory should not
> >> be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is
> >> is used for slab.
> >>
> >> So in this case, there is no need to do anything with this large
> >> page in memcg_slab_free_hook(), just skip it.
> >>
> >> Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
> > Are you sure that this commit is really breaking the code. Unless I have
> Yes, we confirmed that this commit introduces the bug.
> > missed something there shouldn't be any real change wrt. large
> > allocations here. page_has_obj_cgroups is just a different name for what
> > what page_objcgs is giving us.
>
> maybe we could simply use page_objcgs_check to fix the issue ? we will
> check it again.

You will see the same crash with page_objcgs_check as well.

2021-07-28 14:36:38

by Wang Hai

[permalink] [raw]
Subject: Re: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()


在 2021/7/28 22:26, Shakeel Butt 写道:
> On Wed, Jul 28, 2021 at 7:21 AM Kefeng Wang <[email protected]> wrote:
>>
>> On 2021/7/28 21:23, Michal Hocko wrote:
>>> On Wed 28-07-21 17:13:48, Wang Hai wrote:
>>>> When I use kfree_rcu() to free a large memory allocated by
>>>> kmalloc_node(), the following dump occurs.
>>>>
>>>> BUG: kernel NULL pointer dereference, address: 0000000000000020
>>>> [...]
>>>> Oops: 0000 [#1] SMP
>>>> [...]
>>>> Workqueue: events kfree_rcu_work
>>>> RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
>>>> RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
>>>> RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
>>>> [...]
>>>> Call Trace:
>>>> kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
>>>> kfree_bulk include/linux/slab.h:413 [inline]
>>>> kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
>>>> process_one_work+0x207/0x530 kernel/workqueue.c:2276
>>>> worker_thread+0x320/0x610 kernel/workqueue.c:2422
>>>> kthread+0x13d/0x160 kernel/kthread.c:313
>>>> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
>>>>
>>>> When kmalloc_node() a large memory, page is allocated, not slab,
>>>> so when freeing memory via kfree_rcu(), this large memory should not
>>>> be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is
>>>> is used for slab.
>>>>
>>>> So in this case, there is no need to do anything with this large
>>>> page in memcg_slab_free_hook(), just skip it.
>>>>
>>>> Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
>>> Are you sure that this commit is really breaking the code. Unless I have
>> Yes, we confirmed that this commit introduces the bug.
>>> missed something there shouldn't be any real change wrt. large
>>> allocations here. page_has_obj_cgroups is just a different name for what
>>> what page_objcgs is giving us.
>> maybe we could simply use page_objcgs_check to fix the issue ? we will
>> check it again.
> You will see the same crash with page_objcgs_check as well.
> .

I just test it. It won't crash.

This is the test case:

node = kmalloc_node(299999, GFP_ATOMIC | __GFP_NOWARN | __GFP_ACCOUNT, -1);
kfree_rcu(node, rcu);


2021-07-28 14:46:34

by Shakeel Butt

[permalink] [raw]
Subject: Re: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()

On Wed, Jul 28, 2021 at 7:34 AM wanghai (M) <[email protected]> wrote:
>
>
> 在 2021/7/28 22:26, Shakeel Butt 写道:
> > On Wed, Jul 28, 2021 at 7:21 AM Kefeng Wang <[email protected]> wrote:
> >>
> >> On 2021/7/28 21:23, Michal Hocko wrote:
> >>> On Wed 28-07-21 17:13:48, Wang Hai wrote:
> >>>> When I use kfree_rcu() to free a large memory allocated by
> >>>> kmalloc_node(), the following dump occurs.
> >>>>
> >>>> BUG: kernel NULL pointer dereference, address: 0000000000000020
> >>>> [...]
> >>>> Oops: 0000 [#1] SMP
> >>>> [...]
> >>>> Workqueue: events kfree_rcu_work
> >>>> RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
> >>>> RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
> >>>> RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
> >>>> [...]
> >>>> Call Trace:
> >>>> kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
> >>>> kfree_bulk include/linux/slab.h:413 [inline]
> >>>> kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
> >>>> process_one_work+0x207/0x530 kernel/workqueue.c:2276
> >>>> worker_thread+0x320/0x610 kernel/workqueue.c:2422
> >>>> kthread+0x13d/0x160 kernel/kthread.c:313
> >>>> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> >>>>
> >>>> When kmalloc_node() a large memory, page is allocated, not slab,
> >>>> so when freeing memory via kfree_rcu(), this large memory should not
> >>>> be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is
> >>>> is used for slab.
> >>>>
> >>>> So in this case, there is no need to do anything with this large
> >>>> page in memcg_slab_free_hook(), just skip it.
> >>>>
> >>>> Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
> >>> Are you sure that this commit is really breaking the code. Unless I have
> >> Yes, we confirmed that this commit introduces the bug.
> >>> missed something there shouldn't be any real change wrt. large
> >>> allocations here. page_has_obj_cgroups is just a different name for what
> >>> what page_objcgs is giving us.
> >> maybe we could simply use page_objcgs_check to fix the issue ? we will
> >> check it again.
> > You will see the same crash with page_objcgs_check as well.
> > .
>
> I just test it. It won't crash.
>
> This is the test case:
>
> node = kmalloc_node(299999, GFP_ATOMIC | __GFP_NOWARN | __GFP_ACCOUNT, -1);
> kfree_rcu(node, rcu);
>

Indeed it is checking MEMCG_DATA_OBJCGS directly. Yeah, we can use
page_objcgs_check(). Please send a new version of the patch and do CC
Roman.

2021-07-28 16:45:35

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()

On Wed 28-07-21 07:10:26, Shakeel Butt wrote:
> +Roman
>
> On Wed, Jul 28, 2021 at 6:23 AM Michal Hocko <[email protected]> wrote:
> >
> > On Wed 28-07-21 17:13:48, Wang Hai wrote:
> > > When I use kfree_rcu() to free a large memory allocated by
> > > kmalloc_node(), the following dump occurs.
> > >
> > > BUG: kernel NULL pointer dereference, address: 0000000000000020
> > > [...]
> > > Oops: 0000 [#1] SMP
> > > [...]
> > > Workqueue: events kfree_rcu_work
> > > RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
> > > RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
> > > RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
> > > [...]
> > > Call Trace:
> > > kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
> > > kfree_bulk include/linux/slab.h:413 [inline]
> > > kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
> > > process_one_work+0x207/0x530 kernel/workqueue.c:2276
> > > worker_thread+0x320/0x610 kernel/workqueue.c:2422
> > > kthread+0x13d/0x160 kernel/kthread.c:313
> > > ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> > >
> > > When kmalloc_node() a large memory, page is allocated, not slab,
> > > so when freeing memory via kfree_rcu(), this large memory should not
> > > be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is
> > > is used for slab.
> > >
> > > So in this case, there is no need to do anything with this large
> > > page in memcg_slab_free_hook(), just skip it.
> > >
> > > Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
> >
> > Are you sure that this commit is really breaking the code. Unless I have
> > missed something there shouldn't be any real change wrt. large
> > allocations here. page_has_obj_cgroups is just a different name for what
> > what page_objcgs is giving us.
>
> Actually they are different. For MEMCG_DATA_KMEM page,
> page_has_obj_cgroups() will return false while page_objcgs() on
> non-VM_DEBUG kernels will return "struct obj_cgroup *" instead of
> "struct obj_cgroup **".

Right. Thanks for the clarification. I have missed that subtle
difference.
--
Michal Hocko
SUSE Labs