2023-08-10 08:11:38

by Xiaolei Wang

[permalink] [raw]
Subject: [PATCH 0/2] Bail out in __stack_depot_save() if the stack_table is not allocated and delete the kmemleak_initialized judgment in set_track_prepare()

patch1 solves the null pointer situation when __stack_depot_save()
is called when the stack_table is not initialized.

patch2 solved there is no call trace for the memory leak which object
is created before the kmemleak_late_init()

Xiaolei Wang (2):
lib/stackdepot: Bail out in __stack_depot_save() if the stack_table is
not allocated
mm/kmemleak: No need to check kmemleak_initialized in
set_track_prepare()

lib/stackdepot.c | 2 +-
mm/kmemleak.c | 2 --
2 files changed, 1 insertion(+), 3 deletions(-)

--
2.25.1



2023-08-10 09:01:38

by Xiaolei Wang

[permalink] [raw]
Subject: [PATCH 1/2] lib/stackdepot: Bail out in __stack_depot_save() if the stack_table is not allocated

The __stack_depot_save() may be used by some subsystems even before
the stack depot is initialized. So add a check of stack_table in
__stack_depot_save() to make sure no oops in this case.

Fixes: 56a61617dd22 ("mm: use stack_depot for recording kmemleak's backtrace")
Signed-off-by: Xiaolei Wang <[email protected]>
---
lib/stackdepot.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index 2f5aa851834e..a0651d013a0d 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -376,7 +376,7 @@ depot_stack_handle_t __stack_depot_save(unsigned long *entries,
*/
nr_entries = filter_irq_stacks(entries, nr_entries);

- if (unlikely(nr_entries == 0) || stack_depot_disabled)
+ if (unlikely(nr_entries == 0) || stack_depot_disabled || unlikely(!stack_table))
goto fast_exit;

hash = hash_stack(entries, nr_entries);
--
2.25.1


2023-08-10 09:34:13

by Xiaolei Wang

[permalink] [raw]
Subject: [PATCH 2/2] mm/kmemleak: No need to check kmemleak_initialized in set_track_prepare()

The kmemleak_late_init() is defined as a late_initcall. The current
implementation of set_track_prepare() depends on the kmemleak init.
That also means there is no call trace for the memory leak which object
is created before the kmemleak_late_init().



In a previous patch, we have fixed a bug in stack_depot_save() so that
it can be invoked even before stack depot is initialized. So there is
no reason to check the kmemleak_initialized in set_track_prepare().
So delete the kmemleak_initialized judgment in set_track_prepare()

unreferenced object 0xc674ca80 (size 64):
comm "swapper/0", pid 1, jiffies 4294938337 (age 204.880s)
hex dump (first 32 bytes):
80 55 75 c6 80 54 75 c6 00 55 75 c6 80 52 75 c6 .Uu..Tu..Uu..Ru.
00 53 75 c6 00 00 00 00 00 00 00 00 00 00 00 00 .Su..........

Fixes: 56a61617dd22 ("mm: use stack_depot for recording kmemleak's backtrace")
Signed-off-by: Xiaolei Wang <[email protected]>
---
mm/kmemleak.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index a2d34226e3c8..c9f2f816db19 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -610,8 +610,6 @@ static noinline depot_stack_handle_t set_track_prepare(void)
unsigned long entries[MAX_TRACE];
unsigned int nr_entries;

- if (!kmemleak_initialized)
- return 0;
nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 3);
trace_handle = stack_depot_save(entries, nr_entries, GFP_NOWAIT);

--
2.25.1


2023-08-10 10:58:57

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH 2/2] mm/kmemleak: No need to check kmemleak_initialized in set_track_prepare()

On 8/10/23 12:03, Vlastimil Babka wrote:
> On 8/10/23 09:47, Xiaolei Wang wrote:
>> The kmemleak_late_init() is defined as a late_initcall. The current
>> implementation of set_track_prepare() depends on the kmemleak init.
>> That also means there is no call trace for the memory leak which object
>> is created before the kmemleak_late_init().
>
> So if I understand correctly, we have the following sequence of events durin
> boot
>
> ...
> A: stack_depot is initialized
> ...
> B: kmemleak is initialized
> ...
>
> before this patchset, we can miss allocations before B, aftewards only
> before A (which can't be helped), so we now have between A and B.
>
> That's nice, but it's weird that can record kmemleak when
> !kmemleak_initialized. Why can't it be initialized sooner in that case?

Looking closer, I think what you want could be achieved by kmemleak_init()
setting a variable that is checked in kmemleak_initialized() instead of the
kmemleak_initialized that's set too late.

I think this should work because:
- I assume kmemleak can't record anything before kmemleak_init()
- stack depot early init is requested one way or the other
- mm_core_init() calls stack_depot_early_init() before kmemleak_init()

But I also wonder how kmemleak can even reach set_track_prepare() before
kmemleak_init(), maybe that's the issue?

>> In a previous patch, we have fixed a bug in stack_depot_save() so that
>> it can be invoked even before stack depot is initialized. So there is
>> no reason to check the kmemleak_initialized in set_track_prepare().
>> So delete the kmemleak_initialized judgment in set_track_prepare()
>>
>> unreferenced object 0xc674ca80 (size 64):
>> comm "swapper/0", pid 1, jiffies 4294938337 (age 204.880s)
>> hex dump (first 32 bytes):
>> 80 55 75 c6 80 54 75 c6 00 55 75 c6 80 52 75 c6 .Uu..Tu..Uu..Ru.
>> 00 53 75 c6 00 00 00 00 00 00 00 00 00 00 00 00 .Su..........
>>
>> Fixes: 56a61617dd22 ("mm: use stack_depot for recording kmemleak's backtrace")
>> Signed-off-by: Xiaolei Wang <[email protected]>
>> ---
>> mm/kmemleak.c | 2 --
>> 1 file changed, 2 deletions(-)
>>
>> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
>> index a2d34226e3c8..c9f2f816db19 100644
>> --- a/mm/kmemleak.c
>> +++ b/mm/kmemleak.c
>> @@ -610,8 +610,6 @@ static noinline depot_stack_handle_t set_track_prepare(void)
>> unsigned long entries[MAX_TRACE];
>> unsigned int nr_entries;
>>
>> - if (!kmemleak_initialized)
>> - return 0;
>> nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 3);
>> trace_handle = stack_depot_save(entries, nr_entries, GFP_NOWAIT);
>>
>


2023-08-10 11:19:23

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH 1/2] lib/stackdepot: Bail out in __stack_depot_save() if the stack_table is not allocated

On 8/10/23 09:47, Xiaolei Wang wrote:
> The __stack_depot_save() may be used by some subsystems even before
> the stack depot is initialized.

Does that currently happen, or only after patch 2/2 it starts happening via
kmemleak?

> So add a check of stack_table in
> __stack_depot_save() to make sure no oops in this case.
>
> Fixes: 56a61617dd22 ("mm: use stack_depot for recording kmemleak's backtrace")

In case it's only after 2/2 I don't think this is truly "Fixes"?

> Signed-off-by: Xiaolei Wang <[email protected]>
> ---
> lib/stackdepot.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/stackdepot.c b/lib/stackdepot.c
> index 2f5aa851834e..a0651d013a0d 100644
> --- a/lib/stackdepot.c
> +++ b/lib/stackdepot.c
> @@ -376,7 +376,7 @@ depot_stack_handle_t __stack_depot_save(unsigned long *entries,
> */
> nr_entries = filter_irq_stacks(entries, nr_entries);
>
> - if (unlikely(nr_entries == 0) || stack_depot_disabled)
> + if (unlikely(nr_entries == 0) || stack_depot_disabled || unlikely(!stack_table))
> goto fast_exit;
>
> hash = hash_stack(entries, nr_entries);


2023-08-10 12:39:31

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH 2/2] mm/kmemleak: No need to check kmemleak_initialized in set_track_prepare()

On 8/10/23 09:47, Xiaolei Wang wrote:
> The kmemleak_late_init() is defined as a late_initcall. The current
> implementation of set_track_prepare() depends on the kmemleak init.
> That also means there is no call trace for the memory leak which object
> is created before the kmemleak_late_init().

So if I understand correctly, we have the following sequence of events durin
boot

...
A: stack_depot is initialized
...
B: kmemleak is initialized
...

before this patchset, we can miss allocations before B, aftewards only
before A (which can't be helped), so we now have between A and B.

That's nice, but it's weird that can record kmemleak when
!kmemleak_initialized. Why can't it be initialized sooner in that case?

> In a previous patch, we have fixed a bug in stack_depot_save() so that
> it can be invoked even before stack depot is initialized. So there is
> no reason to check the kmemleak_initialized in set_track_prepare().
> So delete the kmemleak_initialized judgment in set_track_prepare()
>
> unreferenced object 0xc674ca80 (size 64):
> comm "swapper/0", pid 1, jiffies 4294938337 (age 204.880s)
> hex dump (first 32 bytes):
> 80 55 75 c6 80 54 75 c6 00 55 75 c6 80 52 75 c6 .Uu..Tu..Uu..Ru.
> 00 53 75 c6 00 00 00 00 00 00 00 00 00 00 00 00 .Su..........
>
> Fixes: 56a61617dd22 ("mm: use stack_depot for recording kmemleak's backtrace")
> Signed-off-by: Xiaolei Wang <[email protected]>
> ---
> mm/kmemleak.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index a2d34226e3c8..c9f2f816db19 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -610,8 +610,6 @@ static noinline depot_stack_handle_t set_track_prepare(void)
> unsigned long entries[MAX_TRACE];
> unsigned int nr_entries;
>
> - if (!kmemleak_initialized)
> - return 0;
> nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 3);
> trace_handle = stack_depot_save(entries, nr_entries, GFP_NOWAIT);
>


2023-08-11 02:41:16

by Xiaolei Wang

[permalink] [raw]
Subject: Re: [PATCH 2/2] mm/kmemleak: No need to check kmemleak_initialized in set_track_prepare()


On 8/10/23 9:16 PM, Vlastimil Babka wrote:
> CAUTION: This email comes from a non Wind River email account!
> Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
> On 8/10/23 12:03, Vlastimil Babka wrote:
>> On 8/10/23 09:47, Xiaolei Wang wrote:
>>> The kmemleak_late_init() is defined as a late_initcall. The current
>>> implementation of set_track_prepare() depends on the kmemleak init.
>>> That also means there is no call trace for the memory leak which object
>>> is created before the kmemleak_late_init().
>> So if I understand correctly, we have the following sequence of events durin
>> boot
>>
>> ...
>> A: stack_depot is initialized
>> ...
>> B: kmemleak is initialized
>> ...
>>
>> before this patchset, we can miss allocations before B, aftewards only
>> before A (which can't be helped), so we now have between A and B.
>>
>> That's nice, but it's weird that can record kmemleak when
>> !kmemleak_initialized. Why can't it be initialized sooner in that case?
> Looking closer, I think what you want could be achieved by kmemleak_init()
> setting a variable that is checked in kmemleak_initialized() instead of the
> kmemleak_initialized that's set too late.
>
> I think this should work because:
> - I assume kmemleak can't record anything before kmemleak_init()
> - stack depot early init is requested one way or the other
> - mm_core_init() calls stack_depot_early_init() before kmemleak_init()
>
> But I also wonder how kmemleak can even reach set_track_prepare() before
> kmemleak_init(), maybe that's the issue?

Before kmemleak_init, many places also need to allocate kmemleak_object,

and also need to save stack in advance, but kmemleak_object is allocated

in the form of an array, after kmemleak_init 'object_cache =
KMEM_CACHE(kmemleak_object, SLAB_NOLEAKTRACE);'


I think there is still some memory not recorded on the backtrace before

stack_depot_early_init(), does anyone have a better suggestion?


thanks

xiaolei

>
>>> In a previous patch, we have fixed a bug in stack_depot_save() so that
>>> it can be invoked even before stack depot is initialized. So there is
>>> no reason to check the kmemleak_initialized in set_track_prepare().
>>> So delete the kmemleak_initialized judgment in set_track_prepare()
>>>
>>> unreferenced object 0xc674ca80 (size 64):
>>> comm "swapper/0", pid 1, jiffies 4294938337 (age 204.880s)
>>> hex dump (first 32 bytes):
>>> 80 55 75 c6 80 54 75 c6 00 55 75 c6 80 52 75 c6 .Uu..Tu..Uu..Ru.
>>> 00 53 75 c6 00 00 00 00 00 00 00 00 00 00 00 00 .Su..........
>>>
>>> Fixes: 56a61617dd22 ("mm: use stack_depot for recording kmemleak's backtrace")
>>> Signed-off-by: Xiaolei Wang <[email protected]>
>>> ---
>>> mm/kmemleak.c | 2 --
>>> 1 file changed, 2 deletions(-)
>>>
>>> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
>>> index a2d34226e3c8..c9f2f816db19 100644
>>> --- a/mm/kmemleak.c
>>> +++ b/mm/kmemleak.c
>>> @@ -610,8 +610,6 @@ static noinline depot_stack_handle_t set_track_prepare(void)
>>> unsigned long entries[MAX_TRACE];
>>> unsigned int nr_entries;
>>>
>>> - if (!kmemleak_initialized)
>>> - return 0;
>>> nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 3);
>>> trace_handle = stack_depot_save(entries, nr_entries, GFP_NOWAIT);
>>>

2023-08-11 03:02:47

by Xiaolei Wang

[permalink] [raw]
Subject: Re: [PATCH 1/2] lib/stackdepot: Bail out in __stack_depot_save() if the stack_table is not allocated


On 8/10/23 8:53 PM, Vlastimil Babka wrote:
> CAUTION: This email comes from a non Wind River email account!
> Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
> On 8/10/23 09:47, Xiaolei Wang wrote:
>> The __stack_depot_save() may be used by some subsystems even before
>> the stack depot is initialized.
> Does that currently happen, or only after patch 2/2 it starts happening via
> kmemleak?
Yes, currently it happens after patch 2/2 it starts happening via kmemleak,

The reason why I take it as the first patch is because I think this can
avoid

exceptions when there is only patch2, such as when we use git bisect to
debug

>
>> So add a check of stack_table in
>> __stack_depot_save() to make sure no oops in this case.
>>
>> Fixes: 56a61617dd22 ("mm: use stack_depot for recording kmemleak's backtrace")
> In case it's only after 2/2 I don't think this is truly "Fixes"?

Yes, it is indeed for patch2 at present, it is to fix the situation that
kmemleak

has no backtrace in the boot phase, and at the same time, it can also
prevent

patch2 from entering the stable kernel and causing panic

thanks

xiaolei

>
>> Signed-off-by: Xiaolei Wang <[email protected]>
>> ---
>> lib/stackdepot.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/lib/stackdepot.c b/lib/stackdepot.c
>> index 2f5aa851834e..a0651d013a0d 100644
>> --- a/lib/stackdepot.c
>> +++ b/lib/stackdepot.c
>> @@ -376,7 +376,7 @@ depot_stack_handle_t __stack_depot_save(unsigned long *entries,
>> */
>> nr_entries = filter_irq_stacks(entries, nr_entries);
>>
>> - if (unlikely(nr_entries == 0) || stack_depot_disabled)
>> + if (unlikely(nr_entries == 0) || stack_depot_disabled || unlikely(!stack_table))
>> goto fast_exit;
>>
>> hash = hash_stack(entries, nr_entries);

2023-08-11 08:36:59

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH 2/2] mm/kmemleak: No need to check kmemleak_initialized in set_track_prepare()

On 8/11/23 04:03, wang xiaolei wrote:
>
> On 8/10/23 9:16 PM, Vlastimil Babka wrote:
>> CAUTION: This email comes from a non Wind River email account!
>> Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>> On 8/10/23 12:03, Vlastimil Babka wrote:
>>> On 8/10/23 09:47, Xiaolei Wang wrote:
>>>> The kmemleak_late_init() is defined as a late_initcall. The current
>>>> implementation of set_track_prepare() depends on the kmemleak init.
>>>> That also means there is no call trace for the memory leak which object
>>>> is created before the kmemleak_late_init().
>>> So if I understand correctly, we have the following sequence of events durin
>>> boot
>>>
>>> ...
>>> A: stack_depot is initialized
>>> ...
>>> B: kmemleak is initialized
>>> ...
>>>
>>> before this patchset, we can miss allocations before B, aftewards only
>>> before A (which can't be helped), so we now have between A and B.
>>>
>>> That's nice, but it's weird that can record kmemleak when
>>> !kmemleak_initialized. Why can't it be initialized sooner in that case?
>> Looking closer, I think what you want could be achieved by kmemleak_init()
>> setting a variable that is checked in kmemleak_initialized() instead of the
>> kmemleak_initialized that's set too late.
>>
>> I think this should work because:
>> - I assume kmemleak can't record anything before kmemleak_init()
>> - stack depot early init is requested one way or the other
>> - mm_core_init() calls stack_depot_early_init() before kmemleak_init()
>>
>> But I also wonder how kmemleak can even reach set_track_prepare() before
>> kmemleak_init(), maybe that's the issue?
>
> Before kmemleak_init, many places also need to allocate kmemleak_object,
>
> and also need to save stack in advance, but kmemleak_object is allocated
>
> in the form of an array, after kmemleak_init 'object_cache =
> KMEM_CACHE(kmemleak_object, SLAB_NOLEAKTRACE);'

Hm I see, kmemleak has this static mempool so it really can record some
objects very early.

> I think there is still some memory not recorded on the backtrace before
>
> stack_depot_early_init(), does anyone have a better suggestion?

No we can't record the backtrace earlier. But I don't think it's a problem
in practice. AFAIU kmemleak needs to record these very early allocations so
if they point to further objects, those are not suspected as orphans. But
the early allocations themselves also are very unlikely to be leaks, so does
it really matter that we don't have a backtrace for their allocation?
Because the backtrace is the only thing that's missing - the object is
otherwise recorded even if set_track_prepare() returns 0.

> thanks
>
> xiaolei
>
>>
>>>> In a previous patch, we have fixed a bug in stack_depot_save() so that
>>>> it can be invoked even before stack depot is initialized. So there is
>>>> no reason to check the kmemleak_initialized in set_track_prepare().
>>>> So delete the kmemleak_initialized judgment in set_track_prepare()
>>>>
>>>> unreferenced object 0xc674ca80 (size 64):
>>>> comm "swapper/0", pid 1, jiffies 4294938337 (age 204.880s)
>>>> hex dump (first 32 bytes):
>>>> 80 55 75 c6 80 54 75 c6 00 55 75 c6 80 52 75 c6 .Uu..Tu..Uu..Ru.
>>>> 00 53 75 c6 00 00 00 00 00 00 00 00 00 00 00 00 .Su..........
>>>>
>>>> Fixes: 56a61617dd22 ("mm: use stack_depot for recording kmemleak's backtrace")
>>>> Signed-off-by: Xiaolei Wang <[email protected]>
>>>> ---
>>>> mm/kmemleak.c | 2 --
>>>> 1 file changed, 2 deletions(-)
>>>>
>>>> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
>>>> index a2d34226e3c8..c9f2f816db19 100644
>>>> --- a/mm/kmemleak.c
>>>> +++ b/mm/kmemleak.c
>>>> @@ -610,8 +610,6 @@ static noinline depot_stack_handle_t set_track_prepare(void)
>>>> unsigned long entries[MAX_TRACE];
>>>> unsigned int nr_entries;
>>>>
>>>> - if (!kmemleak_initialized)
>>>> - return 0;
>>>> nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 3);
>>>> trace_handle = stack_depot_save(entries, nr_entries, GFP_NOWAIT);
>>>>


2023-08-14 16:33:12

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH 2/2] mm/kmemleak: No need to check kmemleak_initialized in set_track_prepare()

On Fri, Aug 11, 2023 at 10:09:08AM +0200, Vlastimil Babka wrote:
> On 8/11/23 04:03, wang xiaolei wrote:
> > On 8/10/23 9:16 PM, Vlastimil Babka wrote:
> >> Looking closer, I think what you want could be achieved by kmemleak_init()
> >> setting a variable that is checked in kmemleak_initialized() instead of the
> >> kmemleak_initialized that's set too late.
> >>
> >> I think this should work because:
> >> - I assume kmemleak can't record anything before kmemleak_init()
> >> - stack depot early init is requested one way or the other
> >> - mm_core_init() calls stack_depot_early_init() before kmemleak_init()
> >>
> >> But I also wonder how kmemleak can even reach set_track_prepare() before
> >> kmemleak_init(), maybe that's the issue?
> >
> > Before kmemleak_init, many places also need to allocate kmemleak_object,
> >
> > and also need to save stack in advance, but kmemleak_object is allocated
> >
> > in the form of an array, after kmemleak_init 'object_cache =
> > KMEM_CACHE(kmemleak_object, SLAB_NOLEAKTRACE);'
>
> Hm I see, kmemleak has this static mempool so it really can record some
> objects very early.

Indeed, otherwise we'd get a lot of false positives.

> > I think there is still some memory not recorded on the backtrace before
> >
> > stack_depot_early_init(), does anyone have a better suggestion?
>
> No we can't record the backtrace earlier. But I don't think it's a problem
> in practice. AFAIU kmemleak needs to record these very early allocations so
> if they point to further objects, those are not suspected as orphans. But
> the early allocations themselves also are very unlikely to be leaks, so does
> it really matter that we don't have a backtrace for their allocation?
> Because the backtrace is the only thing that's missing - the object is
> otherwise recorded even if set_track_prepare() returns 0.

It's not a functional problem, just a reporting one. There are
rare early leaks (usually false positives) so identifying them would
help. That said, I think set_track_prepare() is too conservative in
waiting for kmemleak_initialized to be set in kmemleak_late_init().
That's a late_initcall() meant for the scanning thread etc. not the core
kmemleak functionality (which is on from early boot).

We can instead use a different variable to check in set_track_prepare(),
e.g. object_cache. stack_depot_early_init() is called prior to
kmemleak_init(), so it should be fine.

If "kmemleak_initialized" is confusing, we could rename it to
"kmemleak_late_initialized" or "kmemleak_fully_initialized". I'm not too
fussed about this as long as we add some comment on why we check
object_cache instead of kmemleak_initialized.

--
Catalin