2024-01-02 15:35:01

by Chris von Recklinghausen

Subject: [PATCH] don't record leak information on allocations done between kmemleak_init and kmemleak_late_init

If an object is allocated after kmemleak_init is called but before
kmemleak_late_init is called, calls to kmemleak_not_leak or
kmemleak_ignore on the object don't prevent a scan from reporting the
object as a leak.

Avoid this situation by only registering objects in kmemleak_alloc when
kmemleak_initialized is set.

Signed-off-by: Chris von Recklinghausen <[email protected]>
---
mm/kmemleak.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 5501363d6b31..0c8a5f456874 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1117,7 +1117,8 @@ void __ref kmemleak_free_part(const void *ptr, size_t size)
 {
 	pr_debug("%s(0x%px)\n", __func__, ptr);

-	if (kmemleak_enabled && ptr && !IS_ERR(ptr))
+	if (kmemleak_enabled && kmemleak_late_initialized && ptr &&
+	    !IS_ERR(ptr))
 		delete_object_part((unsigned long)ptr, size, false);
 }
 EXPORT_SYMBOL_GPL(kmemleak_free_part);
@@ -1135,7 +1136,8 @@ void __ref kmemleak_free_percpu(const void __percpu *ptr)

 	pr_debug("%s(0x%px)\n", __func__, ptr);

-	if (kmemleak_free_enabled && ptr && !IS_ERR(ptr))
+	if (kmemleak_free_enabled && kmemleak_late_initialized && ptr &&
+	    !IS_ERR(ptr))
 		for_each_possible_cpu(cpu)
 			delete_object_full((unsigned long)per_cpu_ptr(ptr,
 								      cpu));
@@ -1189,7 +1191,8 @@ void __ref kmemleak_not_leak(const void *ptr)
 {
 	pr_debug("%s(0x%px)\n", __func__, ptr);

-	if (kmemleak_enabled && ptr && !IS_ERR(ptr))
+	if (kmemleak_enabled && kmamleak_late_initialized && ptr &&
+	    !IS_ERR(ptr))
 		make_gray_object((unsigned long)ptr);
 }
 EXPORT_SYMBOL(kmemleak_not_leak);
@@ -1207,7 +1210,8 @@ void __ref kmemleak_ignore(const void *ptr)
 {
 	pr_debug("%s(0x%px)\n", __func__, ptr);

-	if (kmemleak_enabled && ptr && !IS_ERR(ptr))
+	if (kmemleak_enabled && kmamleak_late_initialized && ptr &&
+	    !IS_ERR(ptr))
 		make_black_object((unsigned long)ptr, false);
 }
 EXPORT_SYMBOL(kmemleak_ignore);
--
2.43.0



2024-01-02 20:08:02

by Catalin Marinas

Subject: Re: [PATCH] don't record leak information on allocations done between kmemleak_init and kmemleak_late_init

On Tue, Jan 02, 2024 at 10:34:28AM -0500, Chris von Recklinghausen wrote:
> If an object is allocated after kmemleak_init is called but before
> kmemleak_late_init is called, calls to kmemleak_not_leak or
> kmemleak_ignore on the object don't prevent a scan from reporting the
> object as a leak.

This may be true but what is the reason for this? Can you give some
example of false positives you get?

> Avoid this situation by only registering objects in kmemleak_alloc when
> kmemleak_initialized is set.

I wouldn't do this, kmemleak needs to track all the early allocations,
otherwise it will lead to lots of false positives. However, looking at
your patch, it looks like it doesn't touch kmemleak_alloc() at all and
it does something completely different.

> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index 5501363d6b31..0c8a5f456874 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -1117,7 +1117,8 @@ void __ref kmemleak_free_part(const void *ptr, size_t size)
> {
> pr_debug("%s(0x%px)\n", __func__, ptr);
>
> - if (kmemleak_enabled && ptr && !IS_ERR(ptr))
> + if (kmemleak_enabled && kmemleak_late_initialized && ptr &&
> + !IS_ERR(ptr))
> delete_object_part((unsigned long)ptr, size, false);
> }

This leaves some memory to still be tracked by kmemleak when it was
actually freed. Later when it is reallocated, you'll get some errors and
kmemleak will disable itself.

> EXPORT_SYMBOL_GPL(kmemleak_free_part);
> @@ -1135,7 +1136,8 @@ void __ref kmemleak_free_percpu(const void __percpu *ptr)
>
> pr_debug("%s(0x%px)\n", __func__, ptr);
>
> - if (kmemleak_free_enabled && ptr && !IS_ERR(ptr))
> + if (kmemleak_free_enabled && kmemleak_late_initialized && ptr &&
> + !IS_ERR(ptr))
> for_each_possible_cpu(cpu)
> delete_object_full((unsigned long)per_cpu_ptr(ptr,
> cpu));

Same here.

> @@ -1189,7 +1191,8 @@ void __ref kmemleak_not_leak(const void *ptr)
> {
> pr_debug("%s(0x%px)\n", __func__, ptr);
>
> - if (kmemleak_enabled && ptr && !IS_ERR(ptr))
> + if (kmemleak_enabled && kmamleak_late_initialized && ptr &&
                            ^^^
I guess you haven't compiled this patch. Does it actually fix the
problem you are reporting?

> + !IS_ERR(ptr))
> make_gray_object((unsigned long)ptr);
> }

This change doesn't help at all with your problem statement.

> EXPORT_SYMBOL(kmemleak_not_leak);
> @@ -1207,7 +1210,8 @@ void __ref kmemleak_ignore(const void *ptr)
> {
> pr_debug("%s(0x%px)\n", __func__, ptr);
>
> - if (kmemleak_enabled && ptr && !IS_ERR(ptr))
> + if (kmemleak_enabled && kmamleak_late_initialized && ptr &&
> + !IS_ERR(ptr))
> make_black_object((unsigned long)ptr, false);
> }
> EXPORT_SYMBOL(kmemleak_ignore);

Neither does this.

Also if you re-post, please cc linux-mm as well. Andrew Morton tends to
pick up the kmemleak patches (once acked).

--
Catalin
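
The failure mode described above for the kmemleak_free_part() and
kmemleak_free_percpu() hunks can be sketched with a toy user-space model.
This is not kmemleak code: the toy_* names, the fixed-size table and the
gate_free switch are all hypothetical. The only behaviour taken from the
thread is that a record left behind by a skipped free collides with a later
allocation at the same address, after which the tracker turns itself off.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_OBJECTS 16

/* One tracked allocation (hypothetical stand-in for a kmemleak object). */
struct toy_object {
	uintptr_t start;
	size_t size;
	bool in_use;
};

struct toy_tracker {
	struct toy_object objects[TOY_MAX_OBJECTS];
	bool enabled;          /* loosely mirrors kmemleak_enabled */
	bool late_initialized; /* loosely mirrors kmemleak_late_initialized */
	bool gate_free;        /* true: apply the extra check from the patch */
};

static void toy_init(struct toy_tracker *t, bool gate_free)
{
	size_t i;

	for (i = 0; i < TOY_MAX_OBJECTS; i++)
		t->objects[i].in_use = false;
	t->enabled = true;
	t->late_initialized = false;
	t->gate_free = gate_free;
}

/* Record an allocation; overlapping a live record disables the tracker. */
static void toy_alloc(struct toy_tracker *t, uintptr_t start, size_t size)
{
	struct toy_object *slot = NULL;
	size_t i;

	if (!t->enabled)
		return;
	for (i = 0; i < TOY_MAX_OBJECTS; i++) {
		struct toy_object *o = &t->objects[i];

		if (!o->in_use) {
			if (!slot)
				slot = o;
			continue;
		}
		if (start < o->start + o->size && o->start < start + size) {
			t->enabled = false; /* stale or duplicate record */
			return;
		}
	}
	if (!slot) {
		t->enabled = false; /* table full: give up in this toy model */
		return;
	}
	slot->start = start;
	slot->size = size;
	slot->in_use = true;
}

/* Drop the record for a freed block, unless the gate skips early frees. */
static void toy_free(struct toy_tracker *t, uintptr_t start)
{
	size_t i;

	if (!t->enabled)
		return;
	if (t->gate_free && !t->late_initialized)
		return; /* the record stays behind although the block is gone */
	for (i = 0; i < TOY_MAX_OBJECTS; i++) {
		if (t->objects[i].in_use && t->objects[i].start == start)
			t->objects[i].in_use = false;
	}
}

/* Early alloc, early free, then reuse of the same address after late init. */
static bool toy_survives_early_free(bool gate_free)
{
	struct toy_tracker t;

	toy_init(&t, gate_free);
	toy_alloc(&t, 0x1000, 256);
	toy_free(&t, 0x1000);
	t.late_initialized = true;
	toy_alloc(&t, 0x1000, 256);
	return t.enabled;
}
```

In this model toy_survives_early_free(false) leaves the tracker enabled,
while toy_survives_early_free(true) ends with it disabled, because the early
free was skipped and the second allocation overlaps the stale record.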

2024-01-03 03:09:24

by kernel test robot

Subject: Re: [PATCH] don't record leak information on allocations done between kmemleak_init and kmemleak_late_init

Hi Chris,

kernel test robot noticed the following build errors:

[auto build test ERROR on v6.7-rc8]
[also build test ERROR on linus/master]
[cannot apply to akpm-mm/mm-everything next-20240102]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Chris-von-Recklinghausen/don-t-record-leak-information-on-allocations-done-between-kmemleak_init-and-kmemleak_late_init/20240102-233553
base: v6.7-rc8
patch link: https://lore.kernel.org/r/20240102153428.139984-1-crecklin%40redhat.com
patch subject: [PATCH] don't record leak information on allocations done between kmemleak_init and kmemleak_late_init
config: i386-randconfig-141-20240103 (https://download.01.org/0day-ci/archive/20240103/[email protected]/config)
compiler: ClangBuiltLinux clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240103/[email protected]/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All errors (new ones prefixed by >>):

>> mm/kmemleak.c:1194:26: error: use of undeclared identifier 'kmamleak_late_initialized'; did you mean 'kmemleak_late_initialized'?
1194 | if (kmemleak_enabled && kmamleak_late_initialized && ptr &&
| ^~~~~~~~~~~~~~~~~~~~~~~~~
| kmemleak_late_initialized
mm/kmemleak.c:221:12: note: 'kmemleak_late_initialized' declared here
221 | static int kmemleak_late_initialized;
| ^
mm/kmemleak.c:1213:26: error: use of undeclared identifier 'kmamleak_late_initialized'; did you mean 'kmemleak_late_initialized'?
1213 | if (kmemleak_enabled && kmamleak_late_initialized && ptr &&
| ^~~~~~~~~~~~~~~~~~~~~~~~~
| kmemleak_late_initialized
mm/kmemleak.c:221:12: note: 'kmemleak_late_initialized' declared here
221 | static int kmemleak_late_initialized;
| ^
2 errors generated.


vim +1194 mm/kmemleak.c

1182
1183 /**
1184 * kmemleak_not_leak - mark an allocated object as false positive
1185 * @ptr: pointer to beginning of the object
1186 *
1187 * Calling this function on an object will cause the memory block to no longer
1188 * be reported as leak and always be scanned.
1189 */
1190 void __ref kmemleak_not_leak(const void *ptr)
1191 {
1192 pr_debug("%s(0x%px)\n", __func__, ptr);
1193
> 1194 if (kmemleak_enabled && kmamleak_late_initialized && ptr &&
1195 !IS_ERR(ptr))
1196 make_gray_object((unsigned long)ptr);
1197 }
1198 EXPORT_SYMBOL(kmemleak_not_leak);
1199

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

2024-01-03 10:52:41

by kernel test robot

Subject: Re: [PATCH] don't record leak information on allocations done between kmemleak_init and kmemleak_late_init

Hi Chris,

kernel test robot noticed the following build errors:

[auto build test ERROR on v6.7-rc8]
[also build test ERROR on linus/master]
[cannot apply to akpm-mm/mm-everything next-20240103]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Chris-von-Recklinghausen/don-t-record-leak-information-on-allocations-done-between-kmemleak_init-and-kmemleak_late_init/20240102-233553
base: v6.7-rc8
patch link: https://lore.kernel.org/r/20240102153428.139984-1-crecklin%40redhat.com
patch subject: [PATCH] don't record leak information on allocations done between kmemleak_init and kmemleak_late_init
config: arc-randconfig-001-20240103 (https://download.01.org/0day-ci/archive/20240103/[email protected]/config)
compiler: arceb-elf-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240103/[email protected]/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All errors (new ones prefixed by >>):

mm/kmemleak.c: In function 'kmemleak_not_leak':
>> mm/kmemleak.c:1194:33: error: 'kmamleak_late_initialized' undeclared (first use in this function); did you mean 'kmemleak_late_initialized'?
1194 | if (kmemleak_enabled && kmamleak_late_initialized && ptr &&
| ^~~~~~~~~~~~~~~~~~~~~~~~~
| kmemleak_late_initialized
mm/kmemleak.c:1194:33: note: each undeclared identifier is reported only once for each function it appears in
mm/kmemleak.c: In function 'kmemleak_ignore':
mm/kmemleak.c:1213:33: error: 'kmamleak_late_initialized' undeclared (first use in this function); did you mean 'kmemleak_late_initialized'?
1213 | if (kmemleak_enabled && kmamleak_late_initialized && ptr &&
| ^~~~~~~~~~~~~~~~~~~~~~~~~
| kmemleak_late_initialized


vim +1194 mm/kmemleak.c

1182
1183 /**
1184 * kmemleak_not_leak - mark an allocated object as false positive
1185 * @ptr: pointer to beginning of the object
1186 *
1187 * Calling this function on an object will cause the memory block to no longer
1188 * be reported as leak and always be scanned.
1189 */
1190 void __ref kmemleak_not_leak(const void *ptr)
1191 {
1192 pr_debug("%s(0x%px)\n", __func__, ptr);
1193
> 1194 if (kmemleak_enabled && kmamleak_late_initialized && ptr &&
1195 !IS_ERR(ptr))
1196 make_gray_object((unsigned long)ptr);
1197 }
1198 EXPORT_SYMBOL(kmemleak_not_leak);
1199

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

2024-01-03 12:25:54

by Catalin Marinas

Subject: Re: [PATCH] don't record leak information on allocations done between kmemleak_init and kmemleak_late_init

(As you noticed, don't post HTML email; it usually gets rejected by the
lists.)

On Wed, Jan 03, 2024 at 06:20:16AM -0500, Christoph von Recklinghausen wrote:
> On 1/2/24 15:07, Catalin Marinas wrote:
> > On Tue, Jan 02, 2024 at 10:34:28AM -0500, Chris von Recklinghausen wrote:
> > > If an object is allocated after kmemleak_init is called but before
> > > kmemleak_late_init is called, calls to kmemleak_not_leak or
> > > kmemleak_ignore on the object don't prevent a scan from reporting the
> > > object as a leak.
> > This may be true but what is the reason for this? Can you give some
> > example of false positives you get?
>
> In centos-stream-9 on s390x I get the following complaint:
>
> WARN:(libsan.host.linux) Found kernel memory leak:
> unreferenced object 0x1bff7fffb30000 (size 65536):
> comm "swapper/0", pid 0, jiffies 4294937297 (age 76.530s)
> hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> backtrace:
> [<00000000eda98345>] __vmalloc_node_range+0x29a/0x360
> [<00000000e3051c75>] __vmalloc_node+0x9e/0xd0
> [<00000000a5dd11b7>] stack_alloc+0x38/0x50
> [<0000000081096e42>] smp_reinit_ipl_cpu+0xf8/0x3f8
> [<00000000ee13aae5>] arch_call_rest_init+0x22/0x100
> [<00000000b37567c9>] start_kernel+0x44c/0x460
> [<00000000548d9080>] startup_continue+0x30/0x50
>
>
> In arch/s390/kernel/setup.c we have
>
> unsigned long stack_alloc(void)
> {
> #ifdef CONFIG_VMAP_STACK
>         return (unsigned long)__vmalloc_node(THREAD_SIZE, THREAD_SIZE,
>                         THREADINFO_GFP, NUMA_NO_NODE,
>                         __builtin_return_address(0));
> #else
>         return __get_free_pages(GFP_KERNEL, THREAD_SIZE_ORDER);
> #endif
> }

I guess that's an older kernel as arch_call_rest_init() is no longer in
mainline.

Mainline stack_alloc() has a kmemleak_not_leak() call here with an
explanation in the commit log (it should have been added as a comment in
the code), 436fc4feeabb ("s390: add kmemleak annotation in
stack_alloc()"):

kmemleak with enabled auto scanning reports that our stack allocation is
lost. This is because we're saving the pointer + STACK_INIT_OFFSET to
lowcore. When kmemleak now scans the objects, it thinks that this one is
lost because it can't find a corresponding pointer.

Does this commit not fix it for you? It looks like it did the trick in
mainline. Late kmemleak initialisation should not interfere unless you
have a very old kernel and we had some bugs with tracking these (before
we introduced the mem_pool[] array in kmemleak for early allocations).

> void __init arch_call_rest_init(void)
> {
>         unsigned long stack;
>
>         smp_reinit_ipl_cpu();
>         stack = stack_alloc();
>         if (!stack)
>                 panic("Couldn't allocate kernel stack");
>         current->stack = (void *) stack;
> #ifdef CONFIG_VMAP_STACK
>         current->stack_vm_area = (void *) stack;
> #endif

In mainline at least, stack_vm_area is a struct vm_struct, so it
shouldn't be assigned the actual stack pointer (but maybe that's not the
case in your version, I haven't checked the history).

--
Catalin