2023-04-25 22:41:35

by Jim Cromie

[permalink] [raw]
Subject: [PATCH 0/3] kmemleak report format changes

If format changes are not /sys/** ABI violating, heres 3 minor ones:

1st strips "age <increasing>" from output. This makes the output
idempotent; unchanging until a new leak is reported.

2nd adds the backtrace.checksum to the "backtrace:" line. This lets a
user see repeats without actually reading the whole backtrace. So now
the backtrace line looks like this:

backtrace (ck 603070071): # also see below

Q: should ck be spelled crc ? it feels more communicative.

NB: with ck exposed, it becomes possible to do a "selective clear",
something like:

echo drop 603070071 > /sys/kernel/debug/kmemleak

The 3rd patch takes __init off of kmemleak_test_init(). This fixes a
bare-pointer in the 2nd line of the backtrace below, which previously
looked like:

[<00000000ef738764>] 0xffffffffc02350a2

NB: this happens still/again, after rmmod kmemleak-test.

unreferenced object 0xffff888005d9ca40 (size 32):
comm "modprobe", pid 412, jiffies 4294703300
hex dump (first 32 bytes):
00 cd d9 05 80 88 ff ff 40 cf d9 05 80 88 ff ff ........@.......
14 a7 c4 f6 7d f9 87 10 00 00 00 00 00 00 00 00 ....}...........
backtrace (ck 1354775490):
[<000000002c474f61>] kmalloc_trace+0x26/0x90
[<00000000b26599c1>] kmemleak_test_init+0x58/0x2d0 [kmemleak_test]
[<0000000044d13990>] do_one_initcall+0x43/0x210
[<00000000131bc505>] do_init_module+0x4a/0x210
[<00000000b2902890>] __do_sys_finit_module+0x93/0xf0
[<00000000673fdce2>] do_syscall_64+0x34/0x80
[<00000000357a2d80>] entry_SYSCALL_64_after_hwframe+0x46/0xb0


Jim Cromie (3):
kmemleak: drop (age <increasing>) from leak record
kmemleak: add checksum to backtrace report
kmemleak-test: drop __init to get better backtrace

mm/kmemleak.c | 8 +++-----
samples/kmemleak/kmemleak-test.c | 2 +-
2 files changed, 4 insertions(+), 6 deletions(-)

--
2.40.0


2023-04-25 22:41:40

by Jim Cromie

[permalink] [raw]
Subject: [PATCH 1/3] kmemleak: drop (age <increasing>) from leak record

Displaying age is pretty, but counter-productive; it surrenders
idempotency of the output, which breaks simple hash-based cataloging
of the records by the user.

The trouble: sequential reads, wo new leaks, get new results:

:#> sum /sys/kernel/debug/kmemleak
53439 74 /sys/kernel/debug/kmemleak
:#> sum /sys/kernel/debug/kmemleak
59066 74 /sys/kernel/debug/kmemleak

and age is why (nothing else changes):

:#> grep -v age /sys/kernel/debug/kmemleak | sum
58894 67
:#> grep -v age /sys/kernel/debug/kmemleak | sum
58894 67

Further, age is not an intrinsic property of the leak, its an artifact
of when it was scanned, and relative age is embedded in leak order.

While userspace could work around the always-changing output, ISTM
none could be relying upon age in any important way, and having
idempotent output is just better.

Signed-off-by: Jim Cromie <[email protected]>
---
mm/kmemleak.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index a2d34226e3c8..f025c7bc845b 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -355,14 +355,12 @@ static void print_unreferenced(struct seq_file *seq,
int i;
unsigned long *entries;
unsigned int nr_entries;
- unsigned int msecs_age = jiffies_to_msecs(jiffies - object->jiffies);

nr_entries = stack_depot_fetch(object->trace_handle, &entries);
warn_or_seq_printf(seq, "unreferenced object 0x%08lx (size %zu):\n",
object->pointer, object->size);
- warn_or_seq_printf(seq, " comm \"%s\", pid %d, jiffies %lu (age %d.%03ds)\n",
- object->comm, object->pid, object->jiffies,
- msecs_age / 1000, msecs_age % 1000);
+ warn_or_seq_printf(seq, " comm \"%s\", pid %d, jiffies %lu\n",
+ object->comm, object->pid, object->jiffies);
hex_dump_object(seq, object);
warn_or_seq_printf(seq, " backtrace:\n");

--
2.40.0

2023-04-25 22:42:03

by Jim Cromie

[permalink] [raw]
Subject: [PATCH 2/3] kmemleak: add checksum to backtrace report

change kmemleak report format:

from: " backtrace:"
to: " backtrace (ck <cksum>):"

The <cksum> allows a user to see recurring backtraces without
detailed/careful reading of multiline backtraces. So after cycling
kmemleak-test a few times, I know some leaks are repeating.

bash-5.2# grep backtrace /sys/kernel/debug/kmemleak | wc
62 186 1792
bash-5.2# grep backtrace /sys/kernel/debug/kmemleak | sort -u | wc
37 111 1067

Signed-off-by: Jim Cromie <[email protected]>
---
mm/kmemleak.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index f025c7bc845b..2d1dfed4293d 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -362,7 +362,7 @@ static void print_unreferenced(struct seq_file *seq,
warn_or_seq_printf(seq, " comm \"%s\", pid %d, jiffies %lu\n",
object->comm, object->pid, object->jiffies);
hex_dump_object(seq, object);
- warn_or_seq_printf(seq, " backtrace:\n");
+ warn_or_seq_printf(seq, " backtrace (ck %u):\n", object->checksum);

for (i = 0; i < nr_entries; i++) {
void *ptr = (void *)entries[i];
--
2.40.0

2023-04-25 22:42:09

by Jim Cromie

[permalink] [raw]
Subject: [PATCH 3/3] kmemleak-test: drop __init to get better backtrace

Drop the __init on kmemleak_test_init(). With it, the storage is
reclaimed, but then the symbol isn't available for "%pS" rendering,
and the backtrace gets a bare pointer where the actual leak happened.

unreferenced object 0xffff88800a2b0800 (size 1024):
comm "modprobe", pid 413, jiffies 4294953430
hex dump (first 32 bytes):
73 02 00 00 75 01 00 68 02 00 00 01 00 00 00 04 s...u..h........
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (ck 603070071):
[<00000000fabad728>] kmalloc_trace+0x26/0x90
[<00000000ef738764>] 0xffffffffc02350a2
[<00000000004e5795>] do_one_initcall+0x43/0x210
[<00000000d768905e>] do_init_module+0x4a/0x210
[<0000000087135ab5>] __do_sys_finit_module+0x93/0xf0
[<000000004fcb1fa2>] do_syscall_64+0x34/0x80
[<00000000c73c8d9d>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

with __init gone, that trace entry renders like:

[<00000000ef738764>] kmemleak_test_init+<offset>/<size>
---
samples/kmemleak/kmemleak-test.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/samples/kmemleak/kmemleak-test.c b/samples/kmemleak/kmemleak-test.c
index 7b476eb8285f..6ced5ddd99d4 100644
--- a/samples/kmemleak/kmemleak-test.c
+++ b/samples/kmemleak/kmemleak-test.c
@@ -32,7 +32,7 @@ static DEFINE_PER_CPU(void *, kmemleak_test_pointer);
* Some very simple testing. This function needs to be extended for
* proper testing.
*/
-static int __init kmemleak_test_init(void)
+static int kmemleak_test_init(void)
{
struct test_node *elem;
int i;
--
2.40.0

2023-04-28 17:13:57

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH 3/3] kmemleak-test: drop __init to get better backtrace

On Tue, Apr 25, 2023 at 04:24:46PM -0600, Jim Cromie wrote:
> Drop the __init on kmemleak_test_init(). With it, the storage is
> reclaimed, but then the symbol isn't available for "%pS" rendering,
> and the backtrace gets a bare pointer where the actual leak happened.
>
> unreferenced object 0xffff88800a2b0800 (size 1024):
> comm "modprobe", pid 413, jiffies 4294953430
> hex dump (first 32 bytes):
> 73 02 00 00 75 01 00 68 02 00 00 01 00 00 00 04 s...u..h........
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> backtrace (ck 603070071):
> [<00000000fabad728>] kmalloc_trace+0x26/0x90
> [<00000000ef738764>] 0xffffffffc02350a2
> [<00000000004e5795>] do_one_initcall+0x43/0x210
> [<00000000d768905e>] do_init_module+0x4a/0x210
> [<0000000087135ab5>] __do_sys_finit_module+0x93/0xf0
> [<000000004fcb1fa2>] do_syscall_64+0x34/0x80
> [<00000000c73c8d9d>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
>
> with __init gone, that trace entry renders like:
>
> [<00000000ef738764>] kmemleak_test_init+<offset>/<size>

Acked-by: Catalin Marinas <[email protected]>

2023-04-28 17:25:47

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH 0/3] kmemleak report format changes

On Tue, Apr 25, 2023 at 04:24:43PM -0600, Jim Cromie wrote:
> If format changes are not /sys/** ABI violating, heres 3 minor ones:
>
> 1st strips "age <increasing>" from output. This makes the output
> idempotent; unchanging until a new leak is reported.
>
> 2nd adds the backtrace.checksum to the "backtrace:" line. This lets a
> user see repeats without actually reading the whole backtrace. So now
> the backtrace line looks like this:
>
> backtrace (ck 603070071): # also see below
>
> Q: should ck be spelled crc ? it feels more communicative.

These all would make sense (and 'crc' sounds better) if they were done
from the start. I know there are test scripts out there parsing the
kmemleak sysfs file. I can't tell whether these changes would break
them.

Cc'ing Dmitry, I think syzbot was regularly checking kmemleak (not sure
it still does).

> NB: with ck exposed, it becomes possible to do a "selective clear",
> something like:
>
> echo drop 603070071 > /sys/kernel/debug/kmemleak
>
> The 3rd patch takes __init off of kmemleak_test_init(). This fixes a
> bare-pointer in the 2nd line of the backtrace below, which previously
> looked like:
>
> [<00000000ef738764>] 0xffffffffc02350a2
>
> NB: this happens still/again, after rmmod kmemleak-test.
>
> unreferenced object 0xffff888005d9ca40 (size 32):
> comm "modprobe", pid 412, jiffies 4294703300
> hex dump (first 32 bytes):
> 00 cd d9 05 80 88 ff ff 40 cf d9 05 80 88 ff ff ........@.......
> 14 a7 c4 f6 7d f9 87 10 00 00 00 00 00 00 00 00 ....}...........
> backtrace (ck 1354775490):
> [<000000002c474f61>] kmalloc_trace+0x26/0x90
> [<00000000b26599c1>] kmemleak_test_init+0x58/0x2d0 [kmemleak_test]
> [<0000000044d13990>] do_one_initcall+0x43/0x210
> [<00000000131bc505>] do_init_module+0x4a/0x210
> [<00000000b2902890>] __do_sys_finit_module+0x93/0xf0
> [<00000000673fdce2>] do_syscall_64+0x34/0x80
> [<00000000357a2d80>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

--
Catalin

2023-11-16 17:56:29

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH 0/3] kmemleak report format changes

On Fri, Nov 10, 2023 at 05:19:38PM -0700, [email protected] wrote:
> On Fri, Apr 28, 2023 at 11:25 AM Catalin Marinas
> <[email protected]> wrote:
> > On Tue, Apr 25, 2023 at 04:24:43PM -0600, Jim Cromie wrote:
> > > If format changes are not /sys/** ABI violating, heres 3 minor ones:
> > >
> > > 1st strips "age <increasing>" from output. This makes the output
> > > idempotent; unchanging until a new leak is reported.
> > >
> > > 2nd adds the backtrace.checksum to the "backtrace:" line. This lets a
> > > user see repeats without actually reading the whole backtrace. So now
> > > the backtrace line looks like this:
> > >
> > > backtrace (ck 603070071): # also see below
> > >
> > > Q: should ck be spelled crc ? it feels more communicative.
> >
> > These all would make sense (and 'crc' sounds better) if they were done
> > from the start. I know there are test scripts out there parsing the
> > kmemleak sysfs file. I can't tell whether these changes would break
> > them.
> >
> > Cc'ing Dmitry, I think syzbot was regularly checking kmemleak (not sure
> > it still does).
[...]
> QED: there are no kmemleak parsers in public github repos that would
> break with these changes

Thanks for digging into this, I completely forgot about this series.
Would you mind rebasing to the latest kernel and reposting?

Thanks.

--
Catalin