2022-02-26 00:16:37

by Vlastimil Babka

Subject: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

Hi,

this series combines and revives patches from Oliver's bachelor's thesis from
last year (where I was the advisor) that make SLUB's debugfs files
alloc_traces and free_traces more useful.
The resubmission was blocked on stackdepot changes that are now merged,
as explained in patch 2.

Patch 1 is a new preparatory cleanup.

Patch 2, originally submitted here [1], was merged to mainline but reverted
due to stackdepot-related issues, as explained in the patch.

Patches 3-5 were originally submitted as an RFC here [2]. In this submission I
have omitted the new file 'all_objects' (patch 3/3 in [2]) as it might
be considered too intrusive, so I will postpone it for later. The docs
patch is adjusted accordingly.

Also available in git, based on v5.17-rc1:
https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-stackdepot-v1

I'd like to ask for some review before I add this to the slab tree.

[1] https://lore.kernel.org/all/[email protected]/
[2] https://lore.kernel.org/all/[email protected]/

Oliver Glitta (4):
mm/slub: use stackdepot to save stack trace in objects
mm/slub: aggregate and print stack traces in debugfs files
mm/slub: sort debugfs output by frequency of stack traces
slab, documentation: add description of debugfs files for SLUB caches

Vlastimil Babka (1):
mm/slub: move struct track init out of set_track()

Documentation/vm/slub.rst | 61 +++++++++++++++
init/Kconfig | 1 +
mm/slub.c | 152 +++++++++++++++++++++++++-------------
3 files changed, 162 insertions(+), 52 deletions(-)

--
2.35.1


2022-02-26 00:16:37

by Vlastimil Babka

Subject: [PATCH 2/5] mm/slub: use stackdepot to save stack trace in objects

From: Oliver Glitta <[email protected]>

Many stack traces are similar, so there are many similar arrays.
Stackdepot saves each unique stack only once.

Replace the field addrs in struct track with a depot_stack_handle_t handle.
Use stackdepot to save the stack trace.

The benefits are smaller memory overhead and the possibility to aggregate
per-cache statistics in the following patch using the stackdepot handle
instead of matching stacks manually.
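
To make the aggregation point concrete, here is an illustrative sketch (not
part of the patch; the helper name is hypothetical, assuming the struct track
layout introduced below). Because stackdepot deduplicates saved stacks, two
track records captured the same stack exactly when their handles are equal,
so aggregation becomes a single integer comparison instead of comparing whole
addrs[] arrays:

	/* hypothetical helper, for illustration only */
	static bool tracks_have_same_stack(const struct track *a,
					   const struct track *b)
	{
		/* a zero handle means no stack was saved for that track */
		return a->handle && a->handle == b->handle;
	}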

[ [email protected]: rebase to 5.17-rc1 and adjust accordingly ]

This was initially merged as commit 788691464c29 and reverted by commit
ae14c63a9f20 due to several issues that should now be fixed.
The problem of unconditional memory overhead by stackdepot has been
addressed by commit 2dba5eb1c73b ("lib/stackdepot: allow optional init
and stack_table allocation by kvmalloc()"), so the dependency on
stackdepot will result in extra memory usage only when slab cache
tracking is actually enabled, and not for all CONFIG_SLUB_DEBUG builds.
The build failures on some architectures were also addressed, and the
reported issue with the xfs/433 test did not reproduce on 5.17-rc1 with
this patch.

Signed-off-by: Oliver Glitta <[email protected]>
Signed-off-by: Vlastimil Babka <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: Joonsoo Kim <[email protected]>
---
init/Kconfig | 1 +
mm/slub.c | 88 +++++++++++++++++++++++++++++-----------------------
2 files changed, 50 insertions(+), 39 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index e9119bf54b1f..b21dd3a4a106 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1871,6 +1871,7 @@ config SLUB_DEBUG
default y
bool "Enable SLUB debugging support" if EXPERT
depends on SLUB && SYSFS
+ select STACKDEPOT if STACKTRACE_SUPPORT
help
SLUB has extensive debug support features. Disabling these can
result in significant savings in code size. This also disables
diff --git a/mm/slub.c b/mm/slub.c
index 1fc451f4fe62..3140f763e819 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -26,6 +26,7 @@
#include <linux/cpuset.h>
#include <linux/mempolicy.h>
#include <linux/ctype.h>
+#include <linux/stackdepot.h>
#include <linux/debugobjects.h>
#include <linux/kallsyms.h>
#include <linux/kfence.h>
@@ -264,8 +265,8 @@ static inline bool kmem_cache_has_cpu_partial(struct kmem_cache *s)
#define TRACK_ADDRS_COUNT 16
struct track {
unsigned long addr; /* Called from address */
-#ifdef CONFIG_STACKTRACE
- unsigned long addrs[TRACK_ADDRS_COUNT]; /* Called from address */
+#ifdef CONFIG_STACKDEPOT
+ depot_stack_handle_t handle;
#endif
int cpu; /* Was running on cpu */
int pid; /* Pid context */
@@ -724,22 +725,20 @@ static struct track *get_track(struct kmem_cache *s, void *object,
return kasan_reset_tag(p + alloc);
}

-static void set_track(struct kmem_cache *s, void *object,
- enum track_item alloc, unsigned long addr)
+static noinline void
+set_track(struct kmem_cache *s, void *object, enum track_item alloc,
+ unsigned long addr, gfp_t flags)
{
struct track *p = get_track(s, object, alloc);

-#ifdef CONFIG_STACKTRACE
+#ifdef CONFIG_STACKDEPOT
+ unsigned long entries[TRACK_ADDRS_COUNT];
unsigned int nr_entries;

- metadata_access_enable();
- nr_entries = stack_trace_save(kasan_reset_tag(p->addrs),
- TRACK_ADDRS_COUNT, 3);
- metadata_access_disable();
-
- if (nr_entries < TRACK_ADDRS_COUNT)
- p->addrs[nr_entries] = 0;
+ nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 3);
+ p->handle = stack_depot_save(entries, nr_entries, flags);
#endif
+
p->addr = addr;
p->cpu = smp_processor_id();
p->pid = current->pid;
@@ -759,20 +758,19 @@ static void init_tracking(struct kmem_cache *s, void *object)

static void print_track(const char *s, struct track *t, unsigned long pr_time)
{
+ depot_stack_handle_t handle __maybe_unused;
+
if (!t->addr)
return;

pr_err("%s in %pS age=%lu cpu=%u pid=%d\n",
s, (void *)t->addr, pr_time - t->when, t->cpu, t->pid);
-#ifdef CONFIG_STACKTRACE
- {
- int i;
- for (i = 0; i < TRACK_ADDRS_COUNT; i++)
- if (t->addrs[i])
- pr_err("\t%pS\n", (void *)t->addrs[i]);
- else
- break;
- }
+#ifdef CONFIG_STACKDEPOT
+ handle = READ_ONCE(t->handle);
+ if (handle)
+ stack_depot_print(handle);
+ else
+ pr_err("object allocation/free stack trace missing\n");
#endif
}

@@ -1304,9 +1302,9 @@ static inline int alloc_consistency_checks(struct kmem_cache *s,
return 1;
}

-static noinline int alloc_debug_processing(struct kmem_cache *s,
- struct slab *slab,
- void *object, unsigned long addr)
+static noinline int
+alloc_debug_processing(struct kmem_cache *s, struct slab *slab, void *object,
+ unsigned long addr, gfp_t flags)
{
if (s->flags & SLAB_CONSISTENCY_CHECKS) {
if (!alloc_consistency_checks(s, slab, object))
@@ -1315,7 +1313,7 @@ static noinline int alloc_debug_processing(struct kmem_cache *s,

/* Success perform special debug activities for allocs */
if (s->flags & SLAB_STORE_USER)
- set_track(s, object, TRACK_ALLOC, addr);
+ set_track(s, object, TRACK_ALLOC, addr, flags);
trace(s, slab, object, 1);
init_object(s, object, SLUB_RED_ACTIVE);
return 1;
@@ -1395,7 +1393,7 @@ static noinline int free_debug_processing(
}

if (s->flags & SLAB_STORE_USER)
- set_track(s, object, TRACK_FREE, addr);
+ set_track(s, object, TRACK_FREE, addr, GFP_NOWAIT);
trace(s, slab, object, 0);
/* Freepointer not overwritten by init_object(), SLAB_POISON moved it */
init_object(s, object, SLUB_RED_INACTIVE);
@@ -1632,7 +1630,8 @@ static inline
void setup_slab_debug(struct kmem_cache *s, struct slab *slab, void *addr) {}

static inline int alloc_debug_processing(struct kmem_cache *s,
- struct slab *slab, void *object, unsigned long addr) { return 0; }
+ struct slab *slab, void *object, unsigned long addr,
+ gfp_t flags) { return 0; }

static inline int free_debug_processing(
struct kmem_cache *s, struct slab *slab,
@@ -3033,7 +3032,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
check_new_slab:

if (kmem_cache_debug(s)) {
- if (!alloc_debug_processing(s, slab, freelist, addr)) {
+ if (!alloc_debug_processing(s, slab, freelist, addr, gfpflags)) {
/* Slab failed checks. Next slab needed */
goto new_slab;
} else {
@@ -4221,6 +4220,9 @@ static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags)
s->remote_node_defrag_ratio = 1000;
#endif

+ if (s->flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
+ stack_depot_init();
+
/* Initialize the pre-computed randomized freelist if slab is up */
if (slab_state >= UP) {
if (init_cache_random_seq(s))
@@ -4352,18 +4354,26 @@ void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab)
objp = fixup_red_left(s, objp);
trackp = get_track(s, objp, TRACK_ALLOC);
kpp->kp_ret = (void *)trackp->addr;
-#ifdef CONFIG_STACKTRACE
- for (i = 0; i < KS_ADDRS_COUNT && i < TRACK_ADDRS_COUNT; i++) {
- kpp->kp_stack[i] = (void *)trackp->addrs[i];
- if (!kpp->kp_stack[i])
- break;
- }
+#ifdef CONFIG_STACKDEPOT
+ {
+ depot_stack_handle_t handle;
+ unsigned long *entries;
+ unsigned int nr_entries;
+
+ handle = READ_ONCE(trackp->handle);
+ if (handle) {
+ nr_entries = stack_depot_fetch(handle, &entries);
+ for (i = 0; i < KS_ADDRS_COUNT && i < nr_entries; i++)
+ kpp->kp_stack[i] = (void *)entries[i];
+ }

- trackp = get_track(s, objp, TRACK_FREE);
- for (i = 0; i < KS_ADDRS_COUNT && i < TRACK_ADDRS_COUNT; i++) {
- kpp->kp_free_stack[i] = (void *)trackp->addrs[i];
- if (!kpp->kp_free_stack[i])
- break;
+ trackp = get_track(s, objp, TRACK_FREE);
+ handle = READ_ONCE(trackp->handle);
+ if (handle) {
+ nr_entries = stack_depot_fetch(handle, &entries);
+ for (i = 0; i < KS_ADDRS_COUNT && i < nr_entries; i++)
+ kpp->kp_free_stack[i] = (void *)entries[i];
+ }
}
#endif
#endif
--
2.35.1

2022-02-26 01:03:35

by Vlastimil Babka

Subject: [PATCH 1/5] mm/slub: move struct track init out of set_track()

set_track() either zeroes out the struct track or fills it, depending on
the addr parameter. This is unnecessary as there's only one place that
calls it for the initialization - init_tracking(). We can simply do the
zeroing there, with a single memset() that covers both TRACK_ALLOC and
TRACK_FREE as they are adjacent.

Signed-off-by: Vlastimil Babka <[email protected]>
---
mm/slub.c | 32 +++++++++++++++-----------------
1 file changed, 15 insertions(+), 17 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 261474092e43..1fc451f4fe62 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -729,34 +729,32 @@ static void set_track(struct kmem_cache *s, void *object,
{
struct track *p = get_track(s, object, alloc);

- if (addr) {
#ifdef CONFIG_STACKTRACE
- unsigned int nr_entries;
+ unsigned int nr_entries;

- metadata_access_enable();
- nr_entries = stack_trace_save(kasan_reset_tag(p->addrs),
- TRACK_ADDRS_COUNT, 3);
- metadata_access_disable();
+ metadata_access_enable();
+ nr_entries = stack_trace_save(kasan_reset_tag(p->addrs),
+ TRACK_ADDRS_COUNT, 3);
+ metadata_access_disable();

- if (nr_entries < TRACK_ADDRS_COUNT)
- p->addrs[nr_entries] = 0;
+ if (nr_entries < TRACK_ADDRS_COUNT)
+ p->addrs[nr_entries] = 0;
#endif
- p->addr = addr;
- p->cpu = smp_processor_id();
- p->pid = current->pid;
- p->when = jiffies;
- } else {
- memset(p, 0, sizeof(struct track));
- }
+ p->addr = addr;
+ p->cpu = smp_processor_id();
+ p->pid = current->pid;
+ p->when = jiffies;
}

static void init_tracking(struct kmem_cache *s, void *object)
{
+ struct track *p;
+
if (!(s->flags & SLAB_STORE_USER))
return;

- set_track(s, object, TRACK_FREE, 0UL);
- set_track(s, object, TRACK_ALLOC, 0UL);
+ p = get_track(s, object, TRACK_ALLOC);
+ memset(p, 0, 2*sizeof(struct track));
}

static void print_track(const char *s, struct track *t, unsigned long pr_time)
--
2.35.1

2022-02-26 01:58:38

by Vlastimil Babka

Subject: [PATCH 4/5] mm/slub: sort debugfs output by frequency of stack traces

From: Oliver Glitta <[email protected]>

Sort the output of debugfs alloc_traces and free_traces by the frequency
of allocation/freeing stack traces. The most frequently used stack traces
will be printed first, e.g. for easier memory leak debugging.

Signed-off-by: Oliver Glitta <[email protected]>
Signed-off-by: Vlastimil Babka <[email protected]>
---
mm/slub.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/mm/slub.c b/mm/slub.c
index 06599db4faa3..a74afe59a403 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -38,6 +38,7 @@
#include <linux/memcontrol.h>
#include <linux/random.h>
#include <kunit/test.h>
+#include <linux/sort.h>

#include <linux/debugfs.h>
#include <trace/events/kmem.h>
@@ -6150,6 +6151,17 @@ static void *slab_debugfs_next(struct seq_file *seq, void *v, loff_t *ppos)
return NULL;
}

+static int cmp_loc_by_count(const void *a, const void *b, const void *data)
+{
+ struct location *loc1 = (struct location *)a;
+ struct location *loc2 = (struct location *)b;
+
+ if (loc1->count > loc2->count)
+ return -1;
+ else
+ return 1;
+}
+
static void *slab_debugfs_start(struct seq_file *seq, loff_t *ppos)
{
struct loc_track *t = seq->private;
@@ -6211,6 +6223,10 @@ static int slab_debug_trace_open(struct inode *inode, struct file *filep)
spin_unlock_irqrestore(&n->list_lock, flags);
}

+ /* Sort locations by count */
+ sort_r(t->loc, t->count, sizeof(struct location),
+ cmp_loc_by_count, NULL, NULL);
+
bitmap_free(obj_map);
return 0;
}
--
2.35.1

2022-02-26 02:10:12

by Vlastimil Babka

Subject: [PATCH 5/5] slab, documentation: add description of debugfs files for SLUB caches

From: Oliver Glitta <[email protected]>

Add description of debugfs files alloc_traces and free_traces
to SLUB cache documentation.

[ [email protected]: some rewording ]

Signed-off-by: Oliver Glitta <[email protected]>
Signed-off-by: Vlastimil Babka <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: [email protected]
---
Documentation/vm/slub.rst | 61 +++++++++++++++++++++++++++++++++++++++
1 file changed, 61 insertions(+)

diff --git a/Documentation/vm/slub.rst b/Documentation/vm/slub.rst
index d3028554b1e9..2b2b931e59fc 100644
--- a/Documentation/vm/slub.rst
+++ b/Documentation/vm/slub.rst
@@ -384,5 +384,66 @@ c) Execute ``slabinfo-gnuplot.sh`` in '-t' mode, passing all of the
40,60`` range will plot only samples collected between 40th and
60th seconds).

+
+DebugFS files for SLUB
+======================
+
+For more information about current state of SLUB caches with the user tracking
+debug option enabled, debugfs files are available, typically under
+/sys/kernel/debug/slab/<cache>/ (created only for caches with enabled user
+tracking). There are 2 types of these files with the following debug
+information:
+
+1. alloc_traces::
+
+ Prints information about unique allocation traces of the currently
+ allocated objects. The output is sorted by frequency of each trace.
+
+ Information in the output:
+ Number of objects, allocating function, minimal/average/maximal jiffies since alloc,
+ pid range of the allocating processes, cpu mask of allocating cpus, and stack trace.
+
+ Example:::
+
+ 1085 populate_error_injection_list+0x97/0x110 age=166678/166680/166682 pid=1 cpus=1::
+ __slab_alloc+0x6d/0x90
+ kmem_cache_alloc_trace+0x2eb/0x300
+ populate_error_injection_list+0x97/0x110
+ init_error_injection+0x1b/0x71
+ do_one_initcall+0x5f/0x2d0
+ kernel_init_freeable+0x26f/0x2d7
+ kernel_init+0xe/0x118
+ ret_from_fork+0x22/0x30
+
+
+2. free_traces::
+
+ Prints information about unique free traces of the currently free objects,
+ sorted by their frequency.
+
+ Information in the output:
+ Number of objects, freeing function, minimal/average/maximal jiffies since free,
+ pid range of the freeing processes, cpu mask of freeing cpus, and stack trace.
+
+ Example:::
+
+ 51 acpi_ut_update_ref_count+0x6a6/0x782 age=236886/237027/237772 pid=1 cpus=1
+ kfree+0x2db/0x420
+ acpi_ut_update_ref_count+0x6a6/0x782
+ acpi_ut_update_object_reference+0x1ad/0x234
+ acpi_ut_remove_reference+0x7d/0x84
+ acpi_rs_get_prt_method_data+0x97/0xd6
+ acpi_get_irq_routing_table+0x82/0xc4
+ acpi_pci_irq_find_prt_entry+0x8e/0x2e0
+ acpi_pci_irq_lookup+0x3a/0x1e0
+ acpi_pci_irq_enable+0x77/0x240
+ pcibios_enable_device+0x39/0x40
+ do_pci_enable_device.part.0+0x5d/0xe0
+ pci_enable_device_flags+0xfc/0x120
+ pci_enable_device+0x13/0x20
+ virtio_pci_probe+0x9e/0x170
+ local_pci_probe+0x48/0x80
+ pci_device_probe+0x105/0x1c0
+
Christoph Lameter, May 30, 2007
Sergey Senozhatsky, October 23, 2015
--
2.35.1

2022-02-26 07:27:37

by Hyeonggon Yoo

Subject: Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
> Hi,
>
> this series combines and revives patches from Oliver's last year
> bachelor thesis (where I was the advisor) that make SLUB's debugfs
> files alloc_traces and free_traces more useful.
> The resubmission was blocked on stackdepot changes that are now merged,
> as explained in patch 2.
>

Hello. I just started reviewing/testing this series.

It crashed on my system (arm64).

I ran with the boot parameter slub_debug=U, and without KASAN,
so CONFIG_STACKDEPOT_ALWAYS_INIT=n.

void * __init memblock_alloc_try_nid(
phys_addr_t size, phys_addr_t align,
phys_addr_t min_addr, phys_addr_t max_addr,
int nid)
{
void *ptr;

memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
__func__, (u64)size, (u64)align, nid, &min_addr,
&max_addr, (void *)_RET_IP_);
ptr = memblock_alloc_internal(size, align,
min_addr, max_addr, nid, false);
if (ptr)
memset(ptr, 0, size); <--- Crash Here

return ptr;
}

It crashed during create_boot_cache() -> stack_depot_init() ->
memblock_alloc().

I think that's because, in kmem_cache_init(), neither the slab allocator nor
memblock is available. (AFAIU memblock is not available after mem_init()
because of memblock_free_all(), right?)
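
For reference, the allocation choice in stack_depot_init() after commit
2dba5eb1c73b is roughly the following (simplified sketch of lib/stackdepot.c;
the same branches are visible in the stackdepot patch later in this thread):

	if (slab_is_available()) {
		/* still false while kmem_cache_init() creates the boot caches */
		stack_table = kvmalloc(size, GFP_KERNEL);
	} else {
		/* memblock has already been released by memblock_free_all() */
		stack_table = memblock_alloc(size, SMP_CACHE_BYTES);
	}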

Thanks!

/*
* Set up kernel memory allocators
*/
static void __init mm_init(void)
{
/*
* page_ext requires contiguous pages,
* bigger than MAX_ORDER unless SPARSEMEM.
*/
page_ext_init_flatmem();
init_mem_debugging_and_hardening();
kfence_alloc_pool();
report_meminit();
stack_depot_early_init();
mem_init();
mem_init_print_info();
kmem_cache_init();
/*
* page_owner must be initialized after buddy is ready, and also after
* slab is ready so that stack_depot_init() works properly
*/

> Patch 1 is a new preparatory cleanup.
>
> Patch 2 originally submitted here [1], was merged to mainline but
> reverted for stackdepot related issues as explained in the patch.
>
> Patches 3-5 originally submitted as RFC here [2]. In this submission I
> have omitted the new file 'all_objects' (patch 3/3 in [2]) as it might
> be considered too intrusive so I will postpone it for later. The docs
> patch is adjusted accordingly.
>
> Also available in git, based on v5.17-rc1:
> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-stackdepot-v1
>
> I'd like to ask for some review before I add this to the slab tree.
>
> [1] https://lore.kernel.org/all/[email protected]/
> [2] https://lore.kernel.org/all/[email protected]/
>
> Oliver Glitta (4):
> mm/slub: use stackdepot to save stack trace in objects
> mm/slub: aggregate and print stack traces in debugfs files
> mm/slub: sort debugfs output by frequency of stack traces
> slab, documentation: add description of debugfs files for SLUB caches
>
> Vlastimil Babka (1):
> mm/slub: move struct track init out of set_track()
>
> Documentation/vm/slub.rst | 61 +++++++++++++++
> init/Kconfig | 1 +
> mm/slub.c | 152 +++++++++++++++++++++++++-------------
> 3 files changed, 162 insertions(+), 52 deletions(-)
>
> --
> 2.35.1
>
>

--
Thank you, You are awesome!
Hyeonggon :-)

2022-02-26 10:49:05

by Hyeonggon Yoo

Subject: Re: [PATCH 2/5] mm/slub: use stackdepot to save stack trace in objects

On Fri, Feb 25, 2022 at 07:03:15PM +0100, Vlastimil Babka wrote:
> From: Oliver Glitta <[email protected]>
>
> Many stack traces are similar so there are many similar arrays.
> Stackdepot saves each unique stack only once.
>
> Replace field addrs in struct track with depot_stack_handle_t handle. Use
> stackdepot to save stack trace.
>
> The benefits are smaller memory overhead and possibility to aggregate
> per-cache statistics in the following patch using the stackdepot handle
> instead of matching stacks manually.
>
> [ [email protected]: rebase to 5.17-rc1 and adjust accordingly ]
>
> This was initially merged as commit 788691464c29 and reverted by commit
> ae14c63a9f20 due to several issues, that should now be fixed.
> The problem of unconditional memory overhead by stackdepot has been
> addressed by commit 2dba5eb1c73b ("lib/stackdepot: allow optional init
> and stack_table allocation by kvmalloc()"), so the dependency on
> stackdepot will result in extra memory usage only when a slab cache
> tracking is actually enabled, and not for all CONFIG_SLUB_DEBUG builds.
> The build failures on some architectures were also addressed, and the
> reported issue with xfs/433 test did not reproduce on 5.17-rc1 with this
> patch.
>
> Signed-off-by: Oliver Glitta <[email protected]>
> Signed-off-by: Vlastimil Babka <[email protected]>
> Cc: David Rientjes <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Pekka Enberg <[email protected]>
> Cc: Joonsoo Kim <[email protected]>
> ---
> init/Kconfig | 1 +
> mm/slub.c | 88 +++++++++++++++++++++++++++++-----------------------
> 2 files changed, 50 insertions(+), 39 deletions(-)
>
> diff --git a/init/Kconfig b/init/Kconfig
> index e9119bf54b1f..b21dd3a4a106 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1871,6 +1871,7 @@ config SLUB_DEBUG
> default y
> bool "Enable SLUB debugging support" if EXPERT
> depends on SLUB && SYSFS
> + select STACKDEPOT if STACKTRACE_SUPPORT
> help
> SLUB has extensive debug support features. Disabling these can
> result in significant savings in code size. This also disables
> diff --git a/mm/slub.c b/mm/slub.c
> index 1fc451f4fe62..3140f763e819 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -26,6 +26,7 @@
> #include <linux/cpuset.h>
> #include <linux/mempolicy.h>
> #include <linux/ctype.h>
> +#include <linux/stackdepot.h>
> #include <linux/debugobjects.h>
> #include <linux/kallsyms.h>
> #include <linux/kfence.h>
> @@ -264,8 +265,8 @@ static inline bool kmem_cache_has_cpu_partial(struct kmem_cache *s)
> #define TRACK_ADDRS_COUNT 16
> struct track {
> unsigned long addr; /* Called from address */
> -#ifdef CONFIG_STACKTRACE
> - unsigned long addrs[TRACK_ADDRS_COUNT]; /* Called from address */
> +#ifdef CONFIG_STACKDEPOT
> + depot_stack_handle_t handle;
> #endif
> int cpu; /* Was running on cpu */
> int pid; /* Pid context */
> @@ -724,22 +725,20 @@ static struct track *get_track(struct kmem_cache *s, void *object,
> return kasan_reset_tag(p + alloc);
> }
>
> -static void set_track(struct kmem_cache *s, void *object,
> - enum track_item alloc, unsigned long addr)
> +static noinline void
> +set_track(struct kmem_cache *s, void *object, enum track_item alloc,
> + unsigned long addr, gfp_t flags)
> {
> struct track *p = get_track(s, object, alloc);
>
> -#ifdef CONFIG_STACKTRACE
> +#ifdef CONFIG_STACKDEPOT
> + unsigned long entries[TRACK_ADDRS_COUNT];
> unsigned int nr_entries;
>
> - metadata_access_enable();
> - nr_entries = stack_trace_save(kasan_reset_tag(p->addrs),
> - TRACK_ADDRS_COUNT, 3);
> - metadata_access_disable();
> -
> - if (nr_entries < TRACK_ADDRS_COUNT)
> - p->addrs[nr_entries] = 0;
> + nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 3);
> + p->handle = stack_depot_save(entries, nr_entries, flags);
> #endif
> +
> p->addr = addr;
> p->cpu = smp_processor_id();
> p->pid = current->pid;
> @@ -759,20 +758,19 @@ static void init_tracking(struct kmem_cache *s, void *object)
>
> static void print_track(const char *s, struct track *t, unsigned long pr_time)
> {
> + depot_stack_handle_t handle __maybe_unused;
> +
> if (!t->addr)
> return;
>
> pr_err("%s in %pS age=%lu cpu=%u pid=%d\n",
> s, (void *)t->addr, pr_time - t->when, t->cpu, t->pid);
> -#ifdef CONFIG_STACKTRACE
> - {
> - int i;
> - for (i = 0; i < TRACK_ADDRS_COUNT; i++)
> - if (t->addrs[i])
> - pr_err("\t%pS\n", (void *)t->addrs[i]);
> - else
> - break;
> - }
> +#ifdef CONFIG_STACKDEPOT
> + handle = READ_ONCE(t->handle);
> + if (handle)
> + stack_depot_print(handle);
> + else
> + pr_err("object allocation/free stack trace missing\n");
> #endif
> }
>
> @@ -1304,9 +1302,9 @@ static inline int alloc_consistency_checks(struct kmem_cache *s,
> return 1;
> }
>
> -static noinline int alloc_debug_processing(struct kmem_cache *s,
> - struct slab *slab,
> - void *object, unsigned long addr)
> +static noinline int
> +alloc_debug_processing(struct kmem_cache *s, struct slab *slab, void *object,
> + unsigned long addr, gfp_t flags)
> {
> if (s->flags & SLAB_CONSISTENCY_CHECKS) {
> if (!alloc_consistency_checks(s, slab, object))
> @@ -1315,7 +1313,7 @@ static noinline int alloc_debug_processing(struct kmem_cache *s,
>
> /* Success perform special debug activities for allocs */
> if (s->flags & SLAB_STORE_USER)
> - set_track(s, object, TRACK_ALLOC, addr);
> + set_track(s, object, TRACK_ALLOC, addr, flags);

I see a warning because of this. We should not reuse the caller's flags here,
because alloc_debug_processing() can be called with preemption disabled while
the caller specified GFP_KERNEL.

[ 2.015902] BUG: sleeping function called from invalid context at mm/page_alloc.c:5164
[ 2.022052] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
[ 2.028357] preempt_count: 1, expected: 0
[ 2.031508] RCU nest depth: 0, expected: 0
[ 2.034722] 1 lock held by swapper/0/1:
[ 2.037905] #0: ffff00000488f4d0 (&sb->s_type->i_mutex_key#5){+.+.}-{4:4}, at: start_creating+0x58/0x130
[ 2.045393] Preemption disabled at:
[ 2.045400] [<ffff8000083bd008>] __slab_alloc.constprop.0+0x38/0xc0
[ 2.053039] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 5.17.0-rc5+ #105
[ 2.059365] Hardware name: linux,dummy-virt (DT)
[ 2.063160] Call trace:
[ 2.065217] dump_backtrace+0xf8/0x130
[ 2.068350] show_stack+0x24/0x80
[ 2.071104] dump_stack_lvl+0x9c/0xd8
[ 2.074140] dump_stack+0x18/0x34
[ 2.076894] __might_resched+0x1a0/0x280
[ 2.080146] __might_sleep+0x58/0x90
[ 2.083108] prepare_alloc_pages.constprop.0+0x1b4/0x1f0
[ 2.087468] __alloc_pages+0x88/0x1e0
[ 2.090502] alloc_page_interleave+0x24/0xb4
[ 2.094021] alloc_pages+0x10c/0x170
[ 2.096984] __stack_depot_save+0x3e0/0x4e0
[ 2.100446] stack_depot_save+0x14/0x20
[ 2.103617] set_track.isra.0+0x64/0xa4
[ 2.106787] alloc_debug_processing+0x11c/0x1e0
[ 2.110532] ___slab_alloc+0x3e8/0x750
[ 2.113643] __slab_alloc.constprop.0+0x64/0xc0
[ 2.117391] kmem_cache_alloc+0x304/0x350
[ 2.120702] security_inode_alloc+0x38/0xa4
[ 2.124169] inode_init_always+0xd0/0x264
[ 2.127501] alloc_inode+0x44/0xec
[ 2.130325] new_inode+0x28/0xc0
[ 2.133011] tracefs_create_file+0x74/0x1e0
[ 2.136459] init_tracer_tracefs+0x248/0x644
[ 2.140030] tracer_init_tracefs+0x9c/0x34c
[ 2.143483] do_one_initcall+0x44/0x170
[ 2.146654] do_initcalls+0x104/0x144
[ 2.149704] kernel_init_freeable+0x130/0x178

[...]
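
To illustrate why the check fires (a rough sketch using the generic gfp
helpers, not a proposed fix): GFP_KERNEL allows blocking, so the page
allocator asserts that it may sleep, but __slab_alloc() has already disabled
preemption on this path:

	/* roughly what might_alloc() checks in the page allocator */
	might_sleep_if(gfpflags_allow_blocking(gfp_mask)); /* true for GFP_KERNEL */
	/* ...and preempt_count is 1 here, hence the splat above */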

> trace(s, slab, object, 1);
> init_object(s, object, SLUB_RED_ACTIVE);
> return 1;
> @@ -1395,7 +1393,7 @@ static noinline int free_debug_processing(
> }
>
> if (s->flags & SLAB_STORE_USER)
> - set_track(s, object, TRACK_FREE, addr);
> + set_track(s, object, TRACK_FREE, addr, GFP_NOWAIT);
> trace(s, slab, object, 0);
> /* Freepointer not overwritten by init_object(), SLAB_POISON moved it */
> init_object(s, object, SLUB_RED_INACTIVE);
> @@ -1632,7 +1630,8 @@ static inline
> void setup_slab_debug(struct kmem_cache *s, struct slab *slab, void *addr) {}
>
> static inline int alloc_debug_processing(struct kmem_cache *s,
> - struct slab *slab, void *object, unsigned long addr) { return 0; }
> + struct slab *slab, void *object, unsigned long addr,
> + gfp_t flags) { return 0; }
>
> static inline int free_debug_processing(
> struct kmem_cache *s, struct slab *slab,
> @@ -3033,7 +3032,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
> check_new_slab:
>
> if (kmem_cache_debug(s)) {
> - if (!alloc_debug_processing(s, slab, freelist, addr)) {
> + if (!alloc_debug_processing(s, slab, freelist, addr, gfpflags)) {
> /* Slab failed checks. Next slab needed */
> goto new_slab;
> } else {
> @@ -4221,6 +4220,9 @@ static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags)
> s->remote_node_defrag_ratio = 1000;
> #endif
>
> + if (s->flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
> + stack_depot_init();
> +

As mentioned in my report, it can crash the system when creating boot caches
with debugging enabled.

The rest looks fine!

> /* Initialize the pre-computed randomized freelist if slab is up */
> if (slab_state >= UP) {
> if (init_cache_random_seq(s))
> @@ -4352,18 +4354,26 @@ void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab)
> objp = fixup_red_left(s, objp);
> trackp = get_track(s, objp, TRACK_ALLOC);
> kpp->kp_ret = (void *)trackp->addr;
> -#ifdef CONFIG_STACKTRACE
> - for (i = 0; i < KS_ADDRS_COUNT && i < TRACK_ADDRS_COUNT; i++) {
> - kpp->kp_stack[i] = (void *)trackp->addrs[i];
> - if (!kpp->kp_stack[i])
> - break;
> - }
> +#ifdef CONFIG_STACKDEPOT
> + {
> + depot_stack_handle_t handle;
> + unsigned long *entries;
> + unsigned int nr_entries;
> +
> + handle = READ_ONCE(trackp->handle);
> + if (handle) {
> + nr_entries = stack_depot_fetch(handle, &entries);
> + for (i = 0; i < KS_ADDRS_COUNT && i < nr_entries; i++)
> + kpp->kp_stack[i] = (void *)entries[i];
> + }
>
> - trackp = get_track(s, objp, TRACK_FREE);
> - for (i = 0; i < KS_ADDRS_COUNT && i < TRACK_ADDRS_COUNT; i++) {
> - kpp->kp_free_stack[i] = (void *)trackp->addrs[i];
> - if (!kpp->kp_free_stack[i])
> - break;
> + trackp = get_track(s, objp, TRACK_FREE);
> + handle = READ_ONCE(trackp->handle);
> + if (handle) {
> + nr_entries = stack_depot_fetch(handle, &entries);
> + for (i = 0; i < KS_ADDRS_COUNT && i < nr_entries; i++)
> + kpp->kp_free_stack[i] = (void *)entries[i];
> + }
> }
> #endif
> #endif
> --
> 2.35.1
>
>

--
Thank you, You are awesome!
Hyeonggon :-)

2022-02-26 12:44:14

by Hyeonggon Yoo

Subject: Re: [PATCH 1/5] mm/slub: move struct track init out of set_track()

On Fri, Feb 25, 2022 at 07:03:14PM +0100, Vlastimil Babka wrote:
> set_track() either zeroes out the struct track or fills it, depending on
> the addr parameter. This is unnecessary as there's only one place that
> calls it for the initialization - init_tracking(). We can simply do the
> zeroing there, with a single memset() that covers both TRACK_ALLOC and
> TRACK_FREE as they are adjacent.
>
> Signed-off-by: Vlastimil Babka <[email protected]>
> ---
> mm/slub.c | 32 +++++++++++++++-----------------
> 1 file changed, 15 insertions(+), 17 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 261474092e43..1fc451f4fe62 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -729,34 +729,32 @@ static void set_track(struct kmem_cache *s, void *object,
> {
> struct track *p = get_track(s, object, alloc);
>
> - if (addr) {
> #ifdef CONFIG_STACKTRACE
> - unsigned int nr_entries;
> + unsigned int nr_entries;
>
> - metadata_access_enable();
> - nr_entries = stack_trace_save(kasan_reset_tag(p->addrs),
> - TRACK_ADDRS_COUNT, 3);
> - metadata_access_disable();
> + metadata_access_enable();
> + nr_entries = stack_trace_save(kasan_reset_tag(p->addrs),
> + TRACK_ADDRS_COUNT, 3);
> + metadata_access_disable();
>
> - if (nr_entries < TRACK_ADDRS_COUNT)
> - p->addrs[nr_entries] = 0;
> + if (nr_entries < TRACK_ADDRS_COUNT)
> + p->addrs[nr_entries] = 0;
> #endif
> - p->addr = addr;
> - p->cpu = smp_processor_id();
> - p->pid = current->pid;
> - p->when = jiffies;
> - } else {
> - memset(p, 0, sizeof(struct track));
> - }
> + p->addr = addr;
> + p->cpu = smp_processor_id();
> + p->pid = current->pid;
> + p->when = jiffies;
> }
>
> static void init_tracking(struct kmem_cache *s, void *object)
> {
> + struct track *p;
> +
> if (!(s->flags & SLAB_STORE_USER))
> return;
>
> - set_track(s, object, TRACK_FREE, 0UL);
> - set_track(s, object, TRACK_ALLOC, 0UL);
> + p = get_track(s, object, TRACK_ALLOC);
> + memset(p, 0, 2*sizeof(struct track));
> }
>

Looks good.
Reviewed-by: Hyeonggon Yoo <[email protected]>

And works nicely.
Tested-by: Hyeonggon Yoo <[email protected]>

> static void print_track(const char *s, struct track *t, unsigned long pr_time)
> --
> 2.35.1
>
>

--
Thank you, You are awesome!
Hyeonggon :-)

2022-02-26 14:19:55

by Hyeonggon Yoo

Subject: Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
> Hi,
>
> this series combines and revives patches from Oliver's last year
> bachelor thesis (where I was the advisor) that make SLUB's debugfs
> files alloc_traces and free_traces more useful.
> The resubmission was blocked on stackdepot changes that are now merged,
> as explained in patch 2.
>
> Patch 1 is a new preparatory cleanup.
>
> Patch 2 originally submitted here [1], was merged to mainline but
> reverted for stackdepot related issues as explained in the patch.
>
> Patches 3-5 originally submitted as RFC here [2]. In this submission I
> have omitted the new file 'all_objects' (patch 3/3 in [2]) as it might
> be considered too intrusive so I will postpone it for later. The docs
> patch is adjusted accordingly.
>

This problem is not caused by this patch series.
But I think it's worth mentioning...

It's really weird that some stack traces are not recorded
when CONFIG_KASAN=y.

I made sure that:
- Stack Depot did not reach its limit
- the free path happens on CONFIG_KASAN=y too.

I have no clue why this happens.

# cat dentry/free_traces (CONFIG_KASAN=y)
6585 <not-available> age=4294912647 pid=0 cpus=0

# cat dentry/free_traces (CONFIG_KASAN=n)
1246 <not-available> age=4294906877 pid=0 cpus=0
379 __d_free+0x20/0x2c age=33/14225/14353 pid=0-122 cpus=0-3
kmem_cache_free+0x1f4/0x21c
__d_free+0x20/0x2c
rcu_core+0x334/0x580
rcu_core_si+0x14/0x20
__do_softirq+0x12c/0x2a8

2 dentry_free+0x58/0xb0 age=14101/14101/14101 pid=158 cpus=0
kmem_cache_free+0x1f4/0x21c
dentry_free+0x58/0xb0
__dentry_kill+0x18c/0x1d0
dput+0x1c4/0x2fc
__fput+0xb0/0x230
____fput+0x14/0x20
task_work_run+0x84/0x17c
do_notify_resume+0x208/0x1330
el0_svc+0x6c/0x80
el0t_64_sync_handler+0xa8/0x130
el0t_64_sync+0x1a0/0x1a4

1 dentry_free+0x58/0xb0 age=7678 pid=190 cpus=1
kmem_cache_free+0x1f4/0x21c
dentry_free+0x58/0xb0
__dentry_kill+0x18c/0x1d0
dput+0x1c4/0x2fc
__fput+0xb0/0x230
____fput+0x14/0x20
task_work_run+0x84/0x17c
do_exit+0x2dc/0x8e0
do_group_exit+0x38/0xa4
__wake_up_parent+0x0/0x34
invoke_syscall+0x48/0x114
el0_svc_common.constprop.0+0x44/0xfc
do_el0_svc+0x2c/0x94
el0_svc+0x28/0x80
el0t_64_sync_handler+0xa8/0x130
el0t_64_sync+0x1a0/0x1a4
--
Thank you, You are awesome!
Hyeonggon :-)

2022-02-26 20:16:26

by Hyeonggon Yoo

Subject: Re: [PATCH 4/5] mm/slub: sort debugfs output by frequency of stack traces

On Fri, Feb 25, 2022 at 07:03:17PM +0100, Vlastimil Babka wrote:
> From: Oliver Glitta <[email protected]>
>
> Sort the output of debugfs alloc_traces and free_traces by the frequency
> of allocation/freeing stack traces. Most frequently used stack traces
> will be printed first, e.g. for easier memory leak debugging.
>
> Signed-off-by: Oliver Glitta <[email protected]>
> Signed-off-by: Vlastimil Babka <[email protected]>
> ---
> mm/slub.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 06599db4faa3..a74afe59a403 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -38,6 +38,7 @@
> #include <linux/memcontrol.h>
> #include <linux/random.h>
> #include <kunit/test.h>
> +#include <linux/sort.h>
>
> #include <linux/debugfs.h>
> #include <trace/events/kmem.h>
> @@ -6150,6 +6151,17 @@ static void *slab_debugfs_next(struct seq_file *seq, void *v, loff_t *ppos)
> return NULL;
> }
>
> +static int cmp_loc_by_count(const void *a, const void *b, const void *data)
> +{
> + struct location *loc1 = (struct location *)a;
> + struct location *loc2 = (struct location *)b;
> +
> + if (loc1->count > loc2->count)
> + return -1;
> + else
> + return 1;
> +}
> +
> static void *slab_debugfs_start(struct seq_file *seq, loff_t *ppos)
> {
> struct loc_track *t = seq->private;
> @@ -6211,6 +6223,10 @@ static int slab_debug_trace_open(struct inode *inode, struct file *filep)
> spin_unlock_irqrestore(&n->list_lock, flags);
> }
>
> + /* Sort locations by count */
> + sort_r(t->loc, t->count, sizeof(struct location),
> + cmp_loc_by_count, NULL, NULL);
> +
> bitmap_free(obj_map);
> return 0;
> }

This is so cool!

Reviewed-by: Hyeonggon Yoo <[email protected]>
Tested-by: Hyeonggon Yoo <[email protected]>

> --
> 2.35.1
>
>

--
Thank you, You are awesome!
Hyeonggon :-)

2022-02-27 03:34:56

by Hyeonggon Yoo

Subject: [PATCH] lib/stackdepot: Use page allocator if both slab and memblock is unavailable

After commit 2dba5eb1c73b ("lib/stackdepot: allow optional init and
stack_table allocation by kvmalloc()"), stack_depot_init() is called
later if CONFIG_STACKDEPOT_ALWAYS_INIT=n to remove unnecessary memory
usage. It allocates stack_table using memblock_alloc() or kvmalloc(),
depending on the availability of the slab allocator.

But when stack_depot_init() is called while creating boot slab caches,
neither the slab allocator nor memblock is available, so the kernel crashes.
Allocate stack_table from the page allocator when both the slab allocator
and memblock are unavailable.

Limit the size of stack_table when using the page allocator because vmalloc()
is also unavailable in kmem_cache_init(). It must not be larger than
(PAGE_SIZE << (MAX_ORDER - 1)).

This patch was tested on both CONFIG_STACKDEPOT_ALWAYS_INIT=y and n.

Fixes: 2dba5eb1c73b ("lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()")
Signed-off-by: Hyeonggon Yoo <[email protected]>
---
lib/stackdepot.c | 28 +++++++++++++++++++++-------
1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index bf5ba9af0500..606f80ae2bf7 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -73,6 +73,14 @@ static int next_slab_inited;
static size_t depot_offset;
static DEFINE_RAW_SPINLOCK(depot_lock);

+static unsigned int stack_hash_size = (1 << CONFIG_STACK_HASH_ORDER);
+static inline unsigned int stack_hash_mask(void)
+{
+ return stack_hash_size - 1;
+}
+
+#define STACK_HASH_SEED 0x9747b28c
+
static bool init_stack_slab(void **prealloc)
{
if (!*prealloc)
@@ -142,10 +150,6 @@ depot_alloc_stack(unsigned long *entries, int size, u32 hash, void **prealloc)
return stack;
}

-#define STACK_HASH_SIZE (1L << CONFIG_STACK_HASH_ORDER)
-#define STACK_HASH_MASK (STACK_HASH_SIZE - 1)
-#define STACK_HASH_SEED 0x9747b28c
-
static bool stack_depot_disable;
static struct stack_record **stack_table;

@@ -172,18 +176,28 @@ __ref int stack_depot_init(void)

mutex_lock(&stack_depot_init_mutex);
if (!stack_depot_disable && !stack_table) {
- size_t size = (STACK_HASH_SIZE * sizeof(struct stack_record *));
+ size_t size = (stack_hash_size * sizeof(struct stack_record *));
int i;

if (slab_is_available()) {
pr_info("Stack Depot allocating hash table with kvmalloc\n");
stack_table = kvmalloc(size, GFP_KERNEL);
+ } else if (totalram_pages() > 0) {
+ /* Reduce size because vmalloc may be unavailable */
+ size = min(size, PAGE_SIZE << (MAX_ORDER - 1));
+ stack_hash_size = size / sizeof(struct stack_record *);
+
+ pr_info("Stack Depot allocating hash table with __get_free_pages\n");
+ stack_table = (struct stack_record **)
+ __get_free_pages(GFP_KERNEL, get_order(size));
} else {
pr_info("Stack Depot allocating hash table with memblock_alloc\n");
stack_table = memblock_alloc(size, SMP_CACHE_BYTES);
}
+
if (stack_table) {
- for (i = 0; i < STACK_HASH_SIZE; i++)
+ pr_info("Stack Depot hash table size=%u\n", stack_hash_size);
+ for (i = 0; i < stack_hash_size; i++)
stack_table[i] = NULL;
} else {
pr_err("Stack Depot hash table allocation failed, disabling\n");
@@ -363,7 +377,7 @@ depot_stack_handle_t __stack_depot_save(unsigned long *entries,
goto fast_exit;

hash = hash_stack(entries, nr_entries);
- bucket = &stack_table[hash & STACK_HASH_MASK];
+ bucket = &stack_table[hash & stack_hash_mask()];

/*
* Fast path: look the stack trace up without locking.
--
2.33.1

2022-02-27 07:38:01

by kernel test robot

Subject: Re: [PATCH] lib/stackdepot: Use page allocator if both slab and memblock is unavailable

Hi Hyeonggon,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.17-rc5 next-20220225]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Hyeonggon-Yoo/lib-stackdepot-Use-page-allocator-if-both-slab-and-memblock-is-unavailable/20220227-111029
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 2293be58d6a18cab800e25e42081bacb75c05752
config: hexagon-randconfig-r005-20220227 (https://download.01.org/0day-ci/archive/20220227/[email protected]/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project d271fc04d5b97b12e6b797c6067d3c96a8d7470e)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/fd37f88eccc357002cc03a6a5fac60fb42552bc7
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Hyeonggon-Yoo/lib-stackdepot-Use-page-allocator-if-both-slab-and-memblock-is-unavailable/20220227-111029
git checkout fd37f88eccc357002cc03a6a5fac60fb42552bc7
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

>> lib/stackdepot.c:187:11: warning: comparison of distinct pointer types ('typeof (size) *' (aka 'unsigned int *') and 'typeof ((1UL << 14) << (11 - 1)) *' (aka 'unsigned long *')) [-Wcompare-distinct-pointer-types]
size = min(size, PAGE_SIZE << (MAX_ORDER - 1));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/minmax.h:45:19: note: expanded from macro 'min'
#define min(x, y) __careful_cmp(x, y, <)
^~~~~~~~~~~~~~~~~~~~~~
include/linux/minmax.h:36:24: note: expanded from macro '__careful_cmp'
__builtin_choose_expr(__safe_cmp(x, y), \
^~~~~~~~~~~~~~~~
include/linux/minmax.h:26:4: note: expanded from macro '__safe_cmp'
(__typecheck(x, y) && __no_side_effects(x, y))
^~~~~~~~~~~~~~~~~
include/linux/minmax.h:20:28: note: expanded from macro '__typecheck'
(!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~
1 warning generated.


vim +187 lib/stackdepot.c

168
169 /*
170 * __ref because of memblock_alloc(), which will not be actually called after
171 * the __init code is gone, because at that point slab_is_available() is true
172 */
173 __ref int stack_depot_init(void)
174 {
175 static DEFINE_MUTEX(stack_depot_init_mutex);
176
177 mutex_lock(&stack_depot_init_mutex);
178 if (!stack_depot_disable && !stack_table) {
179 size_t size = (stack_hash_size * sizeof(struct stack_record *));
180 int i;
181
182 if (slab_is_available()) {
183 pr_info("Stack Depot allocating hash table with kvmalloc\n");
184 stack_table = kvmalloc(size, GFP_KERNEL);
185 } else if (totalram_pages() > 0) {
186 /* Reduce size because vmalloc may be unavailable */
> 187 size = min(size, PAGE_SIZE << (MAX_ORDER - 1));
188 stack_hash_size = size / sizeof(struct stack_record *);
189
190 pr_info("Stack Depot allocating hash table with __get_free_pages\n");
191 stack_table = (struct stack_record **)
192 __get_free_pages(GFP_KERNEL, get_order(size));
193 } else {
194 pr_info("Stack Depot allocating hash table with memblock_alloc\n");
195 stack_table = memblock_alloc(size, SMP_CACHE_BYTES);
196 }
197
198 if (stack_table) {
199 pr_info("Stack Depot hash table size=%u\n", stack_hash_size);
200 for (i = 0; i < stack_hash_size; i++)
201 stack_table[i] = NULL;
202 } else {
203 pr_err("Stack Depot hash table allocation failed, disabling\n");
204 stack_depot_disable = true;
205 mutex_unlock(&stack_depot_init_mutex);
206 return -ENOMEM;
207 }
208 }
209 mutex_unlock(&stack_depot_init_mutex);
210 return 0;
211 }
212 EXPORT_SYMBOL_GPL(stack_depot_init);
213

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]

2022-02-27 09:37:07

by Hyeonggon Yoo

Subject: [PATCH v2] lib/stackdepot: Use page allocator if both slab and memblock is unavailable

After commit 2dba5eb1c73b ("lib/stackdepot: allow optional init and
stack_table allocation by kvmalloc()"), stack_depot_init() is called
later if CONFIG_STACKDEPOT_ALWAYS_INIT=n to remove unnecessary memory
usage. It allocates stack_table using memblock_alloc() or kvmalloc(),
depending on the availability of the slab allocator.

But when stack_depot_init() is called while creating boot slab caches,
neither the slab allocator nor memblock is available, so the kernel crashes.
Allocate stack_table from the page allocator when both the slab allocator
and memblock are unavailable.

Limit the size of stack_table when using the page allocator because vmalloc()
is also unavailable in kmem_cache_init(). It must not be larger than
(PAGE_SIZE << (MAX_ORDER - 1)).

This patch was tested on both CONFIG_STACKDEPOT_ALWAYS_INIT=y and n.

[ [email protected]: Fix W=1 build warning ]

Fixes: 2dba5eb1c73b ("lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()")
Signed-off-by: Hyeonggon Yoo <[email protected]>
---
lib/stackdepot.c | 28 +++++++++++++++++++++-------
1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index bf5ba9af0500..a96f8fd78c42 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -73,6 +73,14 @@ static int next_slab_inited;
static size_t depot_offset;
static DEFINE_RAW_SPINLOCK(depot_lock);

+static size_t stack_hash_size = (1 << CONFIG_STACK_HASH_ORDER);
+static inline size_t stack_hash_mask(void)
+{
+ return stack_hash_size - 1;
+}
+
+#define STACK_HASH_SEED 0x9747b28c
+
static bool init_stack_slab(void **prealloc)
{
if (!*prealloc)
@@ -142,10 +150,6 @@ depot_alloc_stack(unsigned long *entries, int size, u32 hash, void **prealloc)
return stack;
}

-#define STACK_HASH_SIZE (1L << CONFIG_STACK_HASH_ORDER)
-#define STACK_HASH_MASK (STACK_HASH_SIZE - 1)
-#define STACK_HASH_SEED 0x9747b28c
-
static bool stack_depot_disable;
static struct stack_record **stack_table;

@@ -172,18 +176,28 @@ __ref int stack_depot_init(void)

mutex_lock(&stack_depot_init_mutex);
if (!stack_depot_disable && !stack_table) {
- size_t size = (STACK_HASH_SIZE * sizeof(struct stack_record *));
+ size_t size = (stack_hash_size * sizeof(struct stack_record *));
int i;

if (slab_is_available()) {
pr_info("Stack Depot allocating hash table with kvmalloc\n");
stack_table = kvmalloc(size, GFP_KERNEL);
+ } else if (totalram_pages() > 0) {
+ /* Reduce size because vmalloc may be unavailable */
+ size = min_t(size_t, size, PAGE_SIZE << (MAX_ORDER - 1));
+ stack_hash_size = size / sizeof(struct stack_record *);
+
+ pr_info("Stack Depot allocating hash table with __get_free_pages\n");
+ stack_table = (struct stack_record **)
+ __get_free_pages(GFP_KERNEL, get_order(size));
} else {
pr_info("Stack Depot allocating hash table with memblock_alloc\n");
stack_table = memblock_alloc(size, SMP_CACHE_BYTES);
}
+
if (stack_table) {
- for (i = 0; i < STACK_HASH_SIZE; i++)
+ pr_info("Stack Depot hash table size=%zu\n", stack_hash_size);
+ for (i = 0; i < stack_hash_size; i++)
stack_table[i] = NULL;
} else {
pr_err("Stack Depot hash table allocation failed, disabling\n");
@@ -363,7 +377,7 @@ depot_stack_handle_t __stack_depot_save(unsigned long *entries,
goto fast_exit;

hash = hash_stack(entries, nr_entries);
- bucket = &stack_table[hash & STACK_HASH_MASK];
+ bucket = &stack_table[hash & stack_hash_mask()];

/*
* Fast path: look the stack trace up without locking.
--
2.33.1

2022-02-27 10:45:42

by kernel test robot

Subject: Re: [PATCH] lib/stackdepot: Use page allocator if both slab and memblock is unavailable

Hi Hyeonggon,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.17-rc5 next-20220225]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Hyeonggon-Yoo/lib-stackdepot-Use-page-allocator-if-both-slab-and-memblock-is-unavailable/20220227-111029
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 2293be58d6a18cab800e25e42081bacb75c05752
config: i386-randconfig-s002 (https://download.01.org/0day-ci/archive/20220227/[email protected]/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce:
# apt-get install sparse
# sparse version: v0.6.4-dirty
# https://github.com/0day-ci/linux/commit/fd37f88eccc357002cc03a6a5fac60fb42552bc7
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Hyeonggon-Yoo/lib-stackdepot-Use-page-allocator-if-both-slab-and-memblock-is-unavailable/20220227-111029
git checkout fd37f88eccc357002cc03a6a5fac60fb42552bc7
# save the config file to linux build tree
mkdir build_dir
make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=i386 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>


sparse warnings: (new ones prefixed by >>)
>> lib/stackdepot.c:187:32: sparse: sparse: incompatible types in comparison expression (different type sizes):
>> lib/stackdepot.c:187:32: sparse: unsigned int *
>> lib/stackdepot.c:187:32: sparse: unsigned long *

vim +187 lib/stackdepot.c

168
169 /*
170 * __ref because of memblock_alloc(), which will not be actually called after
171 * the __init code is gone, because at that point slab_is_available() is true
172 */
173 __ref int stack_depot_init(void)
174 {
175 static DEFINE_MUTEX(stack_depot_init_mutex);
176
177 mutex_lock(&stack_depot_init_mutex);
178 if (!stack_depot_disable && !stack_table) {
179 size_t size = (stack_hash_size * sizeof(struct stack_record *));
180 int i;
181
182 if (slab_is_available()) {
183 pr_info("Stack Depot allocating hash table with kvmalloc\n");
184 stack_table = kvmalloc(size, GFP_KERNEL);
185 } else if (totalram_pages() > 0) {
186 /* Reduce size because vmalloc may be unavailable */
> 187 size = min(size, PAGE_SIZE << (MAX_ORDER - 1));
188 stack_hash_size = size / sizeof(struct stack_record *);
189
190 pr_info("Stack Depot allocating hash table with __get_free_pages\n");
191 stack_table = (struct stack_record **)
192 __get_free_pages(GFP_KERNEL, get_order(size));
193 } else {
194 pr_info("Stack Depot allocating hash table with memblock_alloc\n");
195 stack_table = memblock_alloc(size, SMP_CACHE_BYTES);
196 }
197
198 if (stack_table) {
199 pr_info("Stack Depot hash table size=%u\n", stack_hash_size);
200 for (i = 0; i < stack_hash_size; i++)
201 stack_table[i] = NULL;
202 } else {
203 pr_err("Stack Depot hash table allocation failed, disabling\n");
204 stack_depot_disable = true;
205 mutex_unlock(&stack_depot_init_mutex);
206 return -ENOMEM;
207 }
208 }
209 mutex_unlock(&stack_depot_init_mutex);
210 return 0;
211 }
212 EXPORT_SYMBOL_GPL(stack_depot_init);
213

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]

2022-02-27 11:44:12

by Hyeonggon Yoo

Subject: Re: [PATCH 5/5] slab, documentation: add description of debugfs files for SLUB caches

On Fri, Feb 25, 2022 at 07:03:18PM +0100, Vlastimil Babka wrote:
> From: Oliver Glitta <[email protected]>
>
> Add description of debugfs files alloc_traces and free_traces
> to SLUB cache documentation.
>
> [ [email protected]: some rewording ]
>
> Signed-off-by: Oliver Glitta <[email protected]>
> Signed-off-by: Vlastimil Babka <[email protected]>
> Cc: Jonathan Corbet <[email protected]>
> Cc: Randy Dunlap <[email protected]>
> Cc: [email protected]
> ---
> Documentation/vm/slub.rst | 61 +++++++++++++++++++++++++++++++++++++++
> 1 file changed, 61 insertions(+)
>
> diff --git a/Documentation/vm/slub.rst b/Documentation/vm/slub.rst
> index d3028554b1e9..2b2b931e59fc 100644
> --- a/Documentation/vm/slub.rst
> +++ b/Documentation/vm/slub.rst
> @@ -384,5 +384,66 @@ c) Execute ``slabinfo-gnuplot.sh`` in '-t' mode, passing all of the
> 40,60`` range will plot only samples collected between 40th and
> 60th seconds).
>
> +
> +DebugFS files for SLUB
> +======================
> +
> +For more information about current state of SLUB caches with the user tracking
> +debug option enabled, debugfs files are available, typically under
> +/sys/kernel/debug/slab/<cache>/ (created only for caches with enabled user
> +tracking). There are 2 types of these files with the following debug
> +information:
> +
> +1. alloc_traces::
> +
> + Prints information about unique allocation traces of the currently
> + allocated objects. The output is sorted by frequency of each trace.
> +
> + Information in the output:
> + Number of objects, allocating function, minimal/average/maximal jiffies since alloc,
> + pid range of the allocating processes, cpu mask of allocating cpus, and stack trace.
> +
> + Example:::
> +
> + 1085 populate_error_injection_list+0x97/0x110 age=166678/166680/166682 pid=1 cpus=1::
> + __slab_alloc+0x6d/0x90
> + kmem_cache_alloc_trace+0x2eb/0x300
> + populate_error_injection_list+0x97/0x110
> + init_error_injection+0x1b/0x71
> + do_one_initcall+0x5f/0x2d0
> + kernel_init_freeable+0x26f/0x2d7
> + kernel_init+0xe/0x118
> + ret_from_fork+0x22/0x30
> +
> +
> +2. free_traces::
> +
> + Prints information about unique free traces of the currently free objects,
> + sorted by their frequency.
> +

I'm not sure that it's traces of the "currently free objects".

static int slab_debug_trace_open(struct inode *inode, struct file *filep)
{
[...]

obj_map = bitmap_alloc(oo_objects(s->oo), GFP_KERNEL);

[...]

for_each_kmem_cache_node(s, node, n) {
unsigned long flags;
struct slab *slab;

if (!atomic_long_read(&n->nr_slabs))
continue;

spin_lock_irqsave(&n->list_lock, flags);
list_for_each_entry(slab, &n->partial, slab_list)
process_slab(t, s, slab, alloc, obj_map);
list_for_each_entry(slab, &n->full, slab_list)
process_slab(t, s, slab, alloc, obj_map);
spin_unlock_irqrestore(&n->list_lock, flags);
}

[...]

}

static void __fill_map(unsigned long *obj_map, struct kmem_cache *s,
struct slab *slab)
{
void *addr = slab_address(slab);
void *p;

bitmap_zero(obj_map, slab->objects);

for (p = slab->freelist; p; p = get_freepointer(s, p))
set_bit(__obj_to_index(s, addr, p), obj_map);
}

static void process_slab(struct loc_track *t, struct kmem_cache *s,
struct slab *slab, enum track_item alloc,
unsigned long *obj_map)
{
void *addr = slab_address(slab);
void *p;

__fill_map(obj_map, s, slab);

for_each_object(p, s, addr, slab->objects)
if (!test_bit(__obj_to_index(s, addr, p), obj_map))
add_location(t, s, get_track(s, p, alloc));
}

I think it's not the traces of "currently free objects",
because the index bits of the free objects are set in the obj_map bitmap?

It's weird, but these are the traces of currently allocated objects that have
been freed at least once before (or <not available>).

I think we can fix either the code or the doc?

Please tell me if I'm missing something :)

> + Information in the output:
> + Number of objects, freeing function, minimal/average/maximal jiffies since free,
> + pid range of the freeing processes, cpu mask of freeing cpus, and stack trace.
> +
> + Example:::
> +
> + 51 acpi_ut_update_ref_count+0x6a6/0x782 age=236886/237027/237772 pid=1 cpus=1
> + kfree+0x2db/0x420
> + acpi_ut_update_ref_count+0x6a6/0x782
> + acpi_ut_update_object_reference+0x1ad/0x234
> + acpi_ut_remove_reference+0x7d/0x84
> + acpi_rs_get_prt_method_data+0x97/0xd6
> + acpi_get_irq_routing_table+0x82/0xc4
> + acpi_pci_irq_find_prt_entry+0x8e/0x2e0
> + acpi_pci_irq_lookup+0x3a/0x1e0
> + acpi_pci_irq_enable+0x77/0x240
> + pcibios_enable_device+0x39/0x40
> + do_pci_enable_device.part.0+0x5d/0xe0
> + pci_enable_device_flags+0xfc/0x120
> + pci_enable_device+0x13/0x20
> + virtio_pci_probe+0x9e/0x170
> + local_pci_probe+0x48/0x80
> + pci_device_probe+0x105/0x1c0
> +

Everything else looks nice!

> Christoph Lameter, May 30, 2007
> Sergey Senozhatsky, October 23, 2015
> --
> 2.35.1
>
>

--
Thank you, You are awesome!
Hyeonggon :-)

2022-02-27 12:27:03

by Hyeonggon Yoo

Subject: Re: [PATCH 2/5] mm/slub: use stackdepot to save stack trace in objects

On Fri, Feb 25, 2022 at 07:03:15PM +0100, Vlastimil Babka wrote:
> From: Oliver Glitta <[email protected]>
>
> Many stack traces are similar so there are many similar arrays.
> Stackdepot saves each unique stack only once.
>
> Replace field addrs in struct track with depot_stack_handle_t handle. Use
> stackdepot to save stack trace.
>

I think it's not a replacement?

> The benefits are smaller memory overhead and possibility to aggregate
> per-cache statistics in the following patch using the stackdepot handle
> instead of matching stacks manually.
>
> [ [email protected]: rebase to 5.17-rc1 and adjust accordingly ]
>
> This was initially merged as commit 788691464c29 and reverted by commit
> ae14c63a9f20 due to several issues, that should now be fixed.
> The problem of unconditional memory overhead by stackdepot has been
> addressed by commit 2dba5eb1c73b ("lib/stackdepot: allow optional init
> and stack_table allocation by kvmalloc()"), so the dependency on
> stackdepot will result in extra memory usage only when a slab cache
> tracking is actually enabled, and not for all CONFIG_SLUB_DEBUG builds.
> The build failures on some architectures were also addressed, and the
> reported issue with xfs/433 test did not reproduce on 5.17-rc1 with this
> patch.

This is just an idea, and it goes beyond this patch.

After this patch we now have external storage that records the stack traces.

It's possible that some rare stack traces are still present in the stack
depot but are no longer reachable, because the track that referenced them
has been overwritten.

I think it's worth implementing a way to iterate over the stacks in the
stack depot?
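
Something like the sketch below, maybe? Completely untested, and no such
interface exists in lib/stackdepot.c today -- stack_table, STACK_HASH_SIZE
and struct stack_record are the existing internals, only the iterator is
made up. Locking against concurrent depot_alloc_stack() is ignored for
brevity:

typedef void (*stack_depot_visit_fn)(const unsigned long *entries,
                                     unsigned int nr_entries, void *private);

/* Walk every stack currently recorded in the depot's hash table. */
void stack_depot_for_each(stack_depot_visit_fn visit, void *private)
{
        int i;

        for (i = 0; i < STACK_HASH_SIZE; i++) {
                struct stack_record *rec;

                for (rec = stack_table[i]; rec; rec = rec->next)
                        visit(rec->entries, rec->size, private);
        }
}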

>
> Signed-off-by: Oliver Glitta <[email protected]>
> Signed-off-by: Vlastimil Babka <[email protected]>
> Cc: David Rientjes <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Pekka Enberg <[email protected]>
> Cc: Joonsoo Kim <[email protected]>

--
Thank you, You are awesome!
Hyeonggon :-)

2022-02-28 12:20:54

by Marco Elver

[permalink] [raw]
Subject: Re: [PATCH] lib/stackdepot: Use page allocator if both slab and memblock is unavailable

On Mon, 28 Feb 2022 at 11:05, Hyeonggon Yoo <[email protected]> wrote:
[...]
> > This is odd - who is calling stack_depot_init() while neither slab nor
> > memblock are available?
>
> It's not merged yet - but Oliver's patch (2/5) in his series [1] does:
> If user is debugging cache, it calls stack_depot_init() when creating
> cache.
>
> > @@ -4221,6 +4220,9 @@ static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags)
> > s->remote_node_defrag_ratio = 1000;
> > #endif
> >
> > + if (s->flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
> > + stack_depot_init();
> > +
>
> Oliver's patch series enables stack depot when arch supports stacktrace,
> to store slab objects' stack traces. (as slub debugging feature.)
>
> Because slub debugging is turned on by default, the commit 2dba5eb1c73b
> ("lib/stackdepot: allow optional init and stack_table allocation by
> kvmalloc()") made stack_depot_init() can be called later.
>
> With Oliver's patch applied, stack_depot_init() can be called in
> contexts below:
>
> 1) only memblock available (for kasan)
> 2) only buddy available, vmalloc/memblock unavailable (for boot caches)
> 3) buddy/slab available, vmalloc/memblock unavailable (vmap_area cache)
> 4) buddy/slab/vmalloc available, memblock unavailable (other caches)
>
> SLUB supports enabling debugging for specific cache by passing
> slub_debug boot parameter. As slab caches can be created in
> various context, stack_depot_init() should consider all contexts above.
>
> Writing this, I realized my patch does not handle case 3).. I'll send v3.
>
> [1] https://lore.kernel.org/linux-mm/YhoakP7Kih%[email protected]/T/#t
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-stackdepot-v1
>
> > Do you have a stacktrace?
>
> Yeah, here:
>
> You can reproduce this on vbabka's slab-stackdepot-v1 branch [2] with
> slub_debug=U, and CONFIG_STACKDEPOT_ALWAYS_INIT=n
>
[...]
> [ 0.000000] Call trace:
> [ 0.000000] __memset+0x16c/0x188
> [ 0.000000] stack_depot_init+0xc8/0x100
> [ 0.000000] __kmem_cache_create+0x454/0x570
> [ 0.000000] create_boot_cache+0xa0/0xe0

I think even before this point you have all the information required
to determine if stackdepot will be required. It's available after
setup_slub_debug().

So why can't you just call stack_depot_init() somewhere else and avoid
all this complexity?

> [ 0.000000] kmem_cache_init+0xf8/0x204
> [ 0.000000] start_kernel+0x3ec/0x668
> [ 0.000000] __primary_switched+0xc0/0xc8
> [ 0.000000] Code: 91010108 54ffff4a 8b040108 cb050042 (d50b7428)
> [ 0.000000] ---[ end trace 0000000000000000 ]---
> [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
> [ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

2022-02-28 13:18:12

by Marco Elver

[permalink] [raw]
Subject: Re: [PATCH] lib/stackdepot: Use page allocator if both slab and memblock is unavailable

On Sun, 27 Feb 2022 at 04:08, Hyeonggon Yoo <[email protected]> wrote:
>
> After commit 2dba5eb1c73b ("lib/stackdepot: allow optional init and
> stack_table allocation by kvmalloc()"), stack_depot_init() is called
> later if CONFIG_STACKDEPOT_ALWAYS_INIT=n to remove unnecessary memory
> usage. It allocates stack_table using memblock_alloc() or kvmalloc()
> depending on availability of slab allocator.
>
> But when stack_depot_init() is called while creating boot slab caches,
> both slab allocator and memblock is not available. So kernel crashes.
> Allocate stack_table from page allocator when both slab allocator and
> memblock is unavailable.

This is odd - who is calling stack_depot_init() while neither slab nor
memblock are available? Do you have a stacktrace?

> Limit size of stack_table when using page allocator because vmalloc()
> is also unavailable in kmem_cache_init(). it must not be larger than
> (PAGE_SIZE << (MAX_ORDER - 1)).
>
> This patch was tested on both CONFIG_STACKDEPOT_ALWAYS_INIT=y and n.
>
> Fixes: 2dba5eb1c73b ("lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()")
> Signed-off-by: Hyeonggon Yoo <[email protected]>
> ---
> lib/stackdepot.c | 28 +++++++++++++++++++++-------
> 1 file changed, 21 insertions(+), 7 deletions(-)
>
> diff --git a/lib/stackdepot.c b/lib/stackdepot.c
> index bf5ba9af0500..606f80ae2bf7 100644
> --- a/lib/stackdepot.c
> +++ b/lib/stackdepot.c
> @@ -73,6 +73,14 @@ static int next_slab_inited;
> static size_t depot_offset;
> static DEFINE_RAW_SPINLOCK(depot_lock);
>
> +static unsigned int stack_hash_size = (1 << CONFIG_STACK_HASH_ORDER);
> +static inline unsigned int stack_hash_mask(void)
> +{
> + return stack_hash_size - 1;
> +}
> +
> +#define STACK_HASH_SEED 0x9747b28c
> +
> static bool init_stack_slab(void **prealloc)
> {
> if (!*prealloc)
> @@ -142,10 +150,6 @@ depot_alloc_stack(unsigned long *entries, int size, u32 hash, void **prealloc)
> return stack;
> }
>
> -#define STACK_HASH_SIZE (1L << CONFIG_STACK_HASH_ORDER)
> -#define STACK_HASH_MASK (STACK_HASH_SIZE - 1)
> -#define STACK_HASH_SEED 0x9747b28c
> -
> static bool stack_depot_disable;
> static struct stack_record **stack_table;
>
> @@ -172,18 +176,28 @@ __ref int stack_depot_init(void)
>
> mutex_lock(&stack_depot_init_mutex);
> if (!stack_depot_disable && !stack_table) {
> - size_t size = (STACK_HASH_SIZE * sizeof(struct stack_record *));
> + size_t size = (stack_hash_size * sizeof(struct stack_record *));
> int i;
>
> if (slab_is_available()) {
> pr_info("Stack Depot allocating hash table with kvmalloc\n");
> stack_table = kvmalloc(size, GFP_KERNEL);
> + } else if (totalram_pages() > 0) {
> + /* Reduce size because vmalloc may be unavailable */
> + size = min(size, PAGE_SIZE << (MAX_ORDER - 1));
> + stack_hash_size = size / sizeof(struct stack_record *);
> +
> + pr_info("Stack Depot allocating hash table with __get_free_pages\n");
> + stack_table = (struct stack_record **)
> + __get_free_pages(GFP_KERNEL, get_order(size));
> } else {
> pr_info("Stack Depot allocating hash table with memblock_alloc\n");
> stack_table = memblock_alloc(size, SMP_CACHE_BYTES);
> }
> +
> if (stack_table) {
> - for (i = 0; i < STACK_HASH_SIZE; i++)
> + pr_info("Stack Depot hash table size=%u\n", stack_hash_size);
> + for (i = 0; i < stack_hash_size; i++)
> stack_table[i] = NULL;
> } else {
> pr_err("Stack Depot hash table allocation failed, disabling\n");
> @@ -363,7 +377,7 @@ depot_stack_handle_t __stack_depot_save(unsigned long *entries,
> goto fast_exit;
>
> hash = hash_stack(entries, nr_entries);
> - bucket = &stack_table[hash & STACK_HASH_MASK];
> + bucket = &stack_table[hash & stack_hash_mask()];
>
> /*
> * Fast path: look the stack trace up without locking.
> --
> 2.33.1

2022-02-28 14:11:48

by Hyeonggon Yoo

[permalink] [raw]
Subject: Re: [PATCH] lib/stackdepot: Use page allocator if both slab and memblock is unavailable

On Mon, Feb 28, 2022 at 08:00:00AM +0100, Marco Elver wrote:
> On Sun, 27 Feb 2022 at 04:08, Hyeonggon Yoo <[email protected]> wrote:
> >
> > After commit 2dba5eb1c73b ("lib/stackdepot: allow optional init and
> > stack_table allocation by kvmalloc()"), stack_depot_init() is called
> > later if CONFIG_STACKDEPOT_ALWAYS_INIT=n to remove unnecessary memory
> > usage. It allocates stack_table using memblock_alloc() or kvmalloc()
> > depending on availability of slab allocator.
> >
> > But when stack_depot_init() is called while creating boot slab caches,
> > both slab allocator and memblock is not available. So kernel crashes.
> > Allocate stack_table from page allocator when both slab allocator and
> > memblock is unavailable.
>
> This is odd - who is calling stack_depot_init() while neither slab nor
> memblock are available?

It's not merged yet - but Oliver's patch (2/5) in his series [1] does:
if the user is debugging a cache, it calls stack_depot_init() when creating
the cache.

> @@ -4221,6 +4220,9 @@ static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags)
> s->remote_node_defrag_ratio = 1000;
> #endif
>
> + if (s->flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
> + stack_depot_init();
> +

Oliver's patch series enables stack depot when the arch supports stacktrace,
in order to store slab objects' stack traces (as a SLUB debugging feature).

Because SLUB debugging is turned on by default, commit 2dba5eb1c73b
("lib/stackdepot: allow optional init and stack_table allocation by
kvmalloc()") allowed stack_depot_init() to be called later.

With Oliver's patch applied, stack_depot_init() can be called in
contexts below:

1) only memblock available (for kasan)
2) only buddy available, vmalloc/memblock unavailable (for boot caches)
3) buddy/slab available, vmalloc/memblock unavailable (vmap_area cache)
4) buddy/slab/vmalloc available, memblock unavailable (other caches)

SLUB supports enabling debugging for a specific cache by passing the
slub_debug boot parameter. As slab caches can be created in various
contexts, stack_depot_init() should handle all of the contexts above.

While writing this, I realized my patch does not handle case 3). I'll send v3.

[1] https://lore.kernel.org/linux-mm/YhoakP7Kih%[email protected]/T/#t
[2] https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-stackdepot-v1

> Do you have a stacktrace?

Yeah, here:

You can reproduce this on vbabka's slab-stackdepot-v1 branch [2] with
slub_debug=U, and CONFIG_STACKDEPOT_ALWAYS_INIT=n

[ 0.000000] Stack Depot allocating hash table with memblock_alloc
[ 0.000000] Unable to handle kernel paging request at virtual address ffff000097400000
[ 0.000000] Mem abort info:
[ 0.000000] ESR = 0x96000047
[ 0.000000] EC = 0x25: DABT (current EL), IL = 32 bits
[ 0.000000] SET = 0, FnV = 0
[ 0.000000] EA = 0, S1PTW = 0
[ 0.000000] FSC = 0x07: level 3 translation fault
[ 0.000000] Data abort info:
[ 0.000000] ISV = 0, ISS = 0x00000047
[ 0.000000] CM = 0, WnR = 1
[ 0.000000] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000041719000
[ 0.000000] [ffff000097400000] pgd=18000000dcff8003, p4d=18000000dcff8003, pud=18000000dcbfe003, pmd=18000000dcb43003, pte=00680000d7400706
[ 0.000000] Internal error: Oops: 96000047 [#1] PREEMPT SMP
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.17.0-rc1-11918-gbf5d03166d75 #51
[ 0.000000] Hardware name: linux,dummy-virt (DT)
[ 0.000000] pstate: 400000c5 (nZcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 0.000000] pc : __memset+0x16c/0x188
[ 0.000000] lr : memblock_alloc_try_nid+0xcc/0xe4
[ 0.000000] sp : ffff800009a33cd0
[ 0.000000] x29: ffff800009a33cd0 x28: 0000000041720018 x27: ffff800009362640
[ 0.000000] x26: ffff800009362640 x25: 0000000000000000 x24: 0000000000000000
[ 0.000000] x23: 0000000000002000 x22: ffff80000932bb50 x21: 00000000ffffffff
[ 0.000000] x20: ffff000097400000 x19: 0000000000800000 x18: ffffffffffffffff
[ 0.000000] x17: 373578302f383278 x16: 302b657461657263 x15: 0000001000000000
[ 0.000000] x14: 0000000000000360 x13: 0000000000009f8c x12: 00000000dcb0c070
[ 0.000000] x11: 0000001000000000 x10: 00000000004ea000 x9 : 0000000000000000
[ 0.000000] x8 : ffff000097400000 x7 : 0000000000000000 x6 : 000000000000003f
[ 0.000000] x5 : 0000000000000040 x4 : 0000000000000000 x3 : 0000000000000004
[ 0.000000] x2 : 00000000007fffc0 x1 : 0000000000000000 x0 : ffff000097400000
[ 0.000000] Call trace:
[ 0.000000] __memset+0x16c/0x188
[ 0.000000] stack_depot_init+0xc8/0x100
[ 0.000000] __kmem_cache_create+0x454/0x570
[ 0.000000] create_boot_cache+0xa0/0xe0
[ 0.000000] kmem_cache_init+0xf8/0x204
[ 0.000000] start_kernel+0x3ec/0x668
[ 0.000000] __primary_switched+0xc0/0xc8
[ 0.000000] Code: 91010108 54ffff4a 8b040108 cb050042 (d50b7428)
[ 0.000000] ---[ end trace 0000000000000000 ]---
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

Thanks!

> > Limit size of stack_table when using page allocator because vmalloc()
> > is also unavailable in kmem_cache_init(). it must not be larger than
> > (PAGE_SIZE << (MAX_ORDER - 1)).
> >
> > This patch was tested on both CONFIG_STACKDEPOT_ALWAYS_INIT=y and n.
> >
> > Fixes: 2dba5eb1c73b ("lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()")
> > Signed-off-by: Hyeonggon Yoo <[email protected]>
> > ---
> > lib/stackdepot.c | 28 +++++++++++++++++++++-------
> > 1 file changed, 21 insertions(+), 7 deletions(-)
> >
> > diff --git a/lib/stackdepot.c b/lib/stackdepot.c
> > index bf5ba9af0500..606f80ae2bf7 100644
> > --- a/lib/stackdepot.c
> > +++ b/lib/stackdepot.c
> > @@ -73,6 +73,14 @@ static int next_slab_inited;
> > static size_t depot_offset;
> > static DEFINE_RAW_SPINLOCK(depot_lock);
> >
> > +static unsigned int stack_hash_size = (1 << CONFIG_STACK_HASH_ORDER);
> > +static inline unsigned int stack_hash_mask(void)
> > +{
> > + return stack_hash_size - 1;
> > +}
> > +
> > +#define STACK_HASH_SEED 0x9747b28c
> > +
> > static bool init_stack_slab(void **prealloc)
> > {
> > if (!*prealloc)
> > @@ -142,10 +150,6 @@ depot_alloc_stack(unsigned long *entries, int size, u32 hash, void **prealloc)
> > return stack;
> > }
> >
> > -#define STACK_HASH_SIZE (1L << CONFIG_STACK_HASH_ORDER)
> > -#define STACK_HASH_MASK (STACK_HASH_SIZE - 1)
> > -#define STACK_HASH_SEED 0x9747b28c
> > -
> > static bool stack_depot_disable;
> > static struct stack_record **stack_table;
> >
> > @@ -172,18 +176,28 @@ __ref int stack_depot_init(void)
> >
> > mutex_lock(&stack_depot_init_mutex);
> > if (!stack_depot_disable && !stack_table) {
> > - size_t size = (STACK_HASH_SIZE * sizeof(struct stack_record *));
> > + size_t size = (stack_hash_size * sizeof(struct stack_record *));
> > int i;
> >
> > if (slab_is_available()) {
> > pr_info("Stack Depot allocating hash table with kvmalloc\n");
> > stack_table = kvmalloc(size, GFP_KERNEL);
> > + } else if (totalram_pages() > 0) {
> > + /* Reduce size because vmalloc may be unavailable */
> > + size = min(size, PAGE_SIZE << (MAX_ORDER - 1));
> > + stack_hash_size = size / sizeof(struct stack_record *);
> > +
> > + pr_info("Stack Depot allocating hash table with __get_free_pages\n");
> > + stack_table = (struct stack_record **)
> > + __get_free_pages(GFP_KERNEL, get_order(size));
> > } else {
> > pr_info("Stack Depot allocating hash table with memblock_alloc\n");
> > stack_table = memblock_alloc(size, SMP_CACHE_BYTES);
> > }
> > +
> > if (stack_table) {
> > - for (i = 0; i < STACK_HASH_SIZE; i++)
> > + pr_info("Stack Depot hash table size=%u\n", stack_hash_size);
> > + for (i = 0; i < stack_hash_size; i++)
> > stack_table[i] = NULL;
> > } else {
> > pr_err("Stack Depot hash table allocation failed, disabling\n");
> > @@ -363,7 +377,7 @@ depot_stack_handle_t __stack_depot_save(unsigned long *entries,
> > goto fast_exit;
> >
> > hash = hash_stack(entries, nr_entries);
> > - bucket = &stack_table[hash & STACK_HASH_MASK];
> > + bucket = &stack_table[hash & stack_hash_mask()];
> >
> > /*
> > * Fast path: look the stack trace up without locking.
> > --
> > 2.33.1

--
Thank you, You are awesome!
Hyeonggon :-)

2022-02-28 17:12:30

by Hyeonggon Yoo

[permalink] [raw]
Subject: Re: [PATCH] lib/stackdepot: Use page allocator if both slab and memblock is unavailable

On Mon, Feb 28, 2022 at 11:50:49AM +0100, Marco Elver wrote:
> On Mon, 28 Feb 2022 at 11:05, Hyeonggon Yoo <[email protected]> wrote:
> [...]
> > > This is odd - who is calling stack_depot_init() while neither slab nor
> > > memblock are available?
> >
> > It's not merged yet - but Oliver's patch (2/5) in his series [1] does:
> > If user is debugging cache, it calls stack_depot_init() when creating
> > cache.
> >
> > > @@ -4221,6 +4220,9 @@ static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags)
> > > s->remote_node_defrag_ratio = 1000;
> > > #endif
> > >
> > > + if (s->flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
> > > + stack_depot_init();
> > > +
> >
> > Oliver's patch series enables stack depot when arch supports stacktrace,
> > to store slab objects' stack traces. (as slub debugging feature.)
> >
> > Because slub debugging is turned on by default, the commit 2dba5eb1c73b
> > ("lib/stackdepot: allow optional init and stack_table allocation by
> > kvmalloc()") made stack_depot_init() can be called later.
> >
> > With Oliver's patch applied, stack_depot_init() can be called in
> > contexts below:
> >
> > 1) only memblock available (for kasan)
> > 2) only buddy available, vmalloc/memblock unavailable (for boot caches)
> > 3) buddy/slab available, vmalloc/memblock unavailable (vmap_area cache)
> > 4) buddy/slab/vmalloc available, memblock unavailable (other caches)
> >
> > SLUB supports enabling debugging for specific cache by passing
> > slub_debug boot parameter. As slab caches can be created in
> > various context, stack_depot_init() should consider all contexts above.
> >
> > Writing this, I realized my patch does not handle case 3).. I'll send v3.
> >
> > [1] https://lore.kernel.org/linux-mm/YhoakP7Kih%[email protected]/T/#t
> > [2] https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-stackdepot-v1
> >
> > > Do you have a stacktrace?
> >
> > Yeah, here:
> >
> > You can reproduce this on vbabka's slab-stackdepot-v1 branch [2] with
> > slub_debug=U, and CONFIG_STACKDEPOT_ALWAYS_INIT=n
> >
> [...]
> > [ 0.000000] Call trace:
> > [ 0.000000] __memset+0x16c/0x188
> > [ 0.000000] stack_depot_init+0xc8/0x100
> > [ 0.000000] __kmem_cache_create+0x454/0x570
> > [ 0.000000] create_boot_cache+0xa0/0xe0
>
> I think even before this point you have all the information required
> to determine if stackdepot will be required. It's available after
> setup_slub_debug().
>
> So why can't you just call stack_depot_init() somewhere else and avoid
> all this complexity?
>

You are right. That is much simpler and sounds good, as SLUB does not
support enabling the SLAB_STORE_USER flag once the system is up.

I'll try this approach.
Thank you!

> > [ 0.000000] kmem_cache_init+0xf8/0x204
> > [ 0.000000] start_kernel+0x3ec/0x668
> > [ 0.000000] __primary_switched+0xc0/0xc8
> > [ 0.000000] Code: 91010108 54ffff4a 8b040108 cb050042 (d50b7428)
> > [ 0.000000] ---[ end trace 0000000000000000 ]---
> > [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
> > [ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

--
Thank you, You are awesome!
Hyeonggon :-)

2022-02-28 17:28:10

by Hyeonggon Yoo

[permalink] [raw]
Subject: [PATCH] mm/slub: initialize stack depot in boot process

commit ba10d4b46655 ("mm/slub: use stackdepot to save stack trace in
objects") initializes stack depot while creating a cache if the
SLAB_STORE_USER flag is set.

This can make the kernel crash because a cache can be created in various
contexts. For example, if the user sets slub_debug=U, the kernel crashes
because create_boot_cache() calls stack_depot_init(), which tries to
allocate the hash table using memblock_alloc() when slab is not available.
But memblock is also not available at that time.

This patch solves the problem by initializing stack depot early in the
boot process if the SLAB_STORE_USER debug flag is set globally or is set
for at least one cache.

[ [email protected]: initialize stack depot depending on the slub_debug
parameter instead of allowing stack_depot_init() to be called from
kmem_cache_init(), for simplicity. ]

Link: https://lkml.org/lkml/2022/2/28/238
Fixes: ba10d4b46655 ("mm/slub: use stackdepot to save stack trace in objects")
Signed-off-by: Hyeonggon Yoo <[email protected]>
---
include/linux/slab.h | 1 +
init/main.c | 1 +
mm/slab.c | 4 ++++
mm/slob.c | 4 ++++
mm/slub.c | 28 +++++++++++++++++++++++++---
5 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 37bde99b74af..023f3f71ae35 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -139,6 +139,7 @@ struct mem_cgroup;
/*
* struct kmem_cache related prototypes
*/
+void __init kmem_cache_init_early(void);
void __init kmem_cache_init(void);
bool slab_is_available(void);

diff --git a/init/main.c b/init/main.c
index 65fa2e41a9c0..4fdb7975a085 100644
--- a/init/main.c
+++ b/init/main.c
@@ -835,6 +835,7 @@ static void __init mm_init(void)
kfence_alloc_pool();
report_meminit();
stack_depot_early_init();
+ kmem_cache_init_early();
mem_init();
mem_init_print_info();
kmem_cache_init();
diff --git a/mm/slab.c b/mm/slab.c
index ddf5737c63d9..80a6d01aab06 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1196,6 +1196,10 @@ static void __init set_up_node(struct kmem_cache *cachep, int index)
}
}

+void __init kmem_cache_init_early(void)
+{
+}
+
/*
* Initialisation. Called after the page allocator have been initialised and
* before smp_init().
diff --git a/mm/slob.c b/mm/slob.c
index 60c5842215f1..00e323af8be4 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -715,6 +715,10 @@ struct kmem_cache kmem_cache_boot = {
.align = ARCH_KMALLOC_MINALIGN,
};

+void __init kmem_cache_init_early(void)
+{
+}
+
void __init kmem_cache_init(void)
{
kmem_cache = &kmem_cache_boot;
diff --git a/mm/slub.c b/mm/slub.c
index a74afe59a403..40bcd18143b6 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4221,9 +4221,6 @@ static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags)
s->remote_node_defrag_ratio = 1000;
#endif

- if (s->flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
- stack_depot_init();
-
/* Initialize the pre-computed randomized freelist if slab is up */
if (slab_state >= UP) {
if (init_cache_random_seq(s))
@@ -4810,6 +4807,31 @@ static struct kmem_cache * __init bootstrap(struct kmem_cache *static_cache)
return s;
}

+/* Initialize stack depot if needed */
+void __init kmem_cache_init_early(void)
+{
+#ifdef CONFIG_STACKDEPOT
+ slab_flags_t block_flags;
+ char *next_block;
+ char *slab_list;
+
+ if (slub_debug & SLAB_STORE_USER)
+ goto init_stack_depot;
+
+ next_block = slub_debug_string;
+ while (next_block) {
+ next_block = parse_slub_debug_flags(next_block, &block_flags, &slab_list, false);
+ if (block_flags & SLAB_STORE_USER)
+ goto init_stack_depot;
+ }
+
+ return;
+
+init_stack_depot:
+ stack_depot_init();
+#endif
+}
+
void __init kmem_cache_init(void)
{
static __initdata struct kmem_cache boot_kmem_cache,
--
2.33.1

2022-02-28 17:37:10

by Marco Elver

[permalink] [raw]
Subject: Re: [PATCH] mm/slub: initialize stack depot in boot process

On Mon, Feb 28, 2022 at 03:09PM +0000, Hyeonggon Yoo wrote:
> commit ba10d4b46655 ("mm/slub: use stackdepot to save stack trace in
> objects") initializes stack depot while creating cache if SLAB_STORE_USER
> flag is set.
>
> This can make kernel crash because a cache can be created in various
> contexts. For example if user sets slub_debug=U, kernel crashes
> because create_boot_cache() calls stack_depot_init(), which tries to
> allocate hash table using memblock_alloc() if slab is not available.
> But memblock is also not available at that time.
>
> This patch solves the problem by initializing stack depot early
> in boot process if SLAB_STORE_USER debug flag is set globally
> or the flag is set to at least one cache.
>
> [ [email protected]: initialize stack depot depending on slub_debug
> parameter instead of allowing stack_depot_init() can be called
> in kmem_cache_init() for simplicity. ]
>
> Link: https://lkml.org/lkml/2022/2/28/238

This would be a better permalink:
https://lore.kernel.org/all/YhyeaP8lrzKgKm5A@ip-172-31-19-208.ap-northeast-1.compute.internal/

> Fixes: ba10d4b46655 ("mm/slub: use stackdepot to save stack trace in objects")

This commit does not exist in -next.

I assume you intend that "lib/stackdepot: Use page allocator if both
slab and memblock is unavailable" should be dropped now.

> Signed-off-by: Hyeonggon Yoo <[email protected]>
> ---
> include/linux/slab.h | 1 +
> init/main.c | 1 +
> mm/slab.c | 4 ++++
> mm/slob.c | 4 ++++
> mm/slub.c | 28 +++++++++++++++++++++++++---
> 5 files changed, 35 insertions(+), 3 deletions(-)
[...]
>
> +/* Initialize stack depot if needed */
> +void __init kmem_cache_init_early(void)
> +{
> +#ifdef CONFIG_STACKDEPOT
> + slab_flags_t block_flags;
> + char *next_block;
> + char *slab_list;
> +
> + if (slub_debug & SLAB_STORE_USER)
> + goto init_stack_depot;
> +
> + next_block = slub_debug_string;
> + while (next_block) {
> + next_block = parse_slub_debug_flags(next_block, &block_flags, &slab_list, false);
> + if (block_flags & SLAB_STORE_USER)
> + goto init_stack_depot;
> + }
> +
> + return;
> +
> +init_stack_depot:
> + stack_depot_init();
> +#endif
> +}

You can simplify this function to avoid the goto:

/* Initialize stack depot if needed */
void __init kmem_cache_init_early(void)
{
#ifdef CONFIG_STACKDEPOT
        slab_flags_t flags = slub_debug;
        char *next_block = slub_debug_string;
        char *slab_list;

        for (;;) {
                if (flags & SLAB_STORE_USER) {
                        stack_depot_init();
                        break;
                }
                if (!next_block)
                        break;
                next_block = parse_slub_debug_flags(next_block, &flags, &slab_list, false);
        }
#endif
}

^^ with this version, it'd also be much easier and less confusing to add
other initialization logic unrelated to stackdepot later after the loop
(should it ever be required).

2022-02-28 19:11:01

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH 2/5] mm/slub: use stackdepot to save stack trace in objects

On 2/26/22 11:24, Hyeonggon Yoo wrote:
> On Fri, Feb 25, 2022 at 07:03:15PM +0100, Vlastimil Babka wrote:
>> From: Oliver Glitta <[email protected]>
>>
>> Many stack traces are similar so there are many similar arrays.
>> Stackdepot saves each unique stack only once.
>>
>> Replace field addrs in struct track with depot_stack_handle_t handle. Use
>> stackdepot to save stack trace.
>>
>> The benefits are smaller memory overhead and possibility to aggregate
>> per-cache statistics in the following patch using the stackdepot handle
>> instead of matching stacks manually.
>>
>> [ [email protected]: rebase to 5.17-rc1 and adjust accordingly ]
>>
>> This was initially merged as commit 788691464c29 and reverted by commit
>> ae14c63a9f20 due to several issues, that should now be fixed.
>> The problem of unconditional memory overhead by stackdepot has been
>> addressed by commit 2dba5eb1c73b ("lib/stackdepot: allow optional init
>> and stack_table allocation by kvmalloc()"), so the dependency on
>> stackdepot will result in extra memory usage only when a slab cache
>> tracking is actually enabled, and not for all CONFIG_SLUB_DEBUG builds.
>> The build failures on some architectures were also addressed, and the
>> reported issue with xfs/433 test did not reproduce on 5.17-rc1 with this
>> patch.
>>
>> Signed-off-by: Oliver Glitta <[email protected]>
>> Signed-off-by: Vlastimil Babka <[email protected]>
>> Cc: David Rientjes <[email protected]>
>> Cc: Christoph Lameter <[email protected]>
>> Cc: Pekka Enberg <[email protected]>
>> Cc: Joonsoo Kim <[email protected]>
>> ---
>> init/Kconfig | 1 +
>> mm/slub.c | 88 +++++++++++++++++++++++++++++-----------------------
>> 2 files changed, 50 insertions(+), 39 deletions(-)
>>
>> diff --git a/init/Kconfig b/init/Kconfig
>> index e9119bf54b1f..b21dd3a4a106 100644
>> --- a/init/Kconfig
>> +++ b/init/Kconfig
>> @@ -1871,6 +1871,7 @@ config SLUB_DEBUG
>> default y
>> bool "Enable SLUB debugging support" if EXPERT
>> depends on SLUB && SYSFS
>> + select STACKDEPOT if STACKTRACE_SUPPORT
>> help
>> SLUB has extensive debug support features. Disabling these can
>> result in significant savings in code size. This also disables
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 1fc451f4fe62..3140f763e819 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -26,6 +26,7 @@
>> #include <linux/cpuset.h>
>> #include <linux/mempolicy.h>
>> #include <linux/ctype.h>
>> +#include <linux/stackdepot.h>
>> #include <linux/debugobjects.h>
>> #include <linux/kallsyms.h>
>> #include <linux/kfence.h>
>> @@ -264,8 +265,8 @@ static inline bool kmem_cache_has_cpu_partial(struct kmem_cache *s)
>> #define TRACK_ADDRS_COUNT 16
>> struct track {
>> unsigned long addr; /* Called from address */
>> -#ifdef CONFIG_STACKTRACE
>> - unsigned long addrs[TRACK_ADDRS_COUNT]; /* Called from address */
>> +#ifdef CONFIG_STACKDEPOT
>> + depot_stack_handle_t handle;
>> #endif
>> int cpu; /* Was running on cpu */
>> int pid; /* Pid context */
>> @@ -724,22 +725,20 @@ static struct track *get_track(struct kmem_cache *s, void *object,
>> return kasan_reset_tag(p + alloc);
>> }
>>
>> -static void set_track(struct kmem_cache *s, void *object,
>> - enum track_item alloc, unsigned long addr)
>> +static noinline void
>> +set_track(struct kmem_cache *s, void *object, enum track_item alloc,
>> + unsigned long addr, gfp_t flags)
>> {
>> struct track *p = get_track(s, object, alloc);
>>
>> -#ifdef CONFIG_STACKTRACE
>> +#ifdef CONFIG_STACKDEPOT
>> + unsigned long entries[TRACK_ADDRS_COUNT];
>> unsigned int nr_entries;
>>
>> - metadata_access_enable();
>> - nr_entries = stack_trace_save(kasan_reset_tag(p->addrs),
>> - TRACK_ADDRS_COUNT, 3);
>> - metadata_access_disable();
>> -
>> - if (nr_entries < TRACK_ADDRS_COUNT)
>> - p->addrs[nr_entries] = 0;
>> + nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 3);
>> + p->handle = stack_depot_save(entries, nr_entries, flags);
>> #endif
>> +
>> p->addr = addr;
>> p->cpu = smp_processor_id();
>> p->pid = current->pid;
>> @@ -759,20 +758,19 @@ static void init_tracking(struct kmem_cache *s, void *object)
>>
>> static void print_track(const char *s, struct track *t, unsigned long pr_time)
>> {
>> + depot_stack_handle_t handle __maybe_unused;
>> +
>> if (!t->addr)
>> return;
>>
>> pr_err("%s in %pS age=%lu cpu=%u pid=%d\n",
>> s, (void *)t->addr, pr_time - t->when, t->cpu, t->pid);
>> -#ifdef CONFIG_STACKTRACE
>> - {
>> - int i;
>> - for (i = 0; i < TRACK_ADDRS_COUNT; i++)
>> - if (t->addrs[i])
>> - pr_err("\t%pS\n", (void *)t->addrs[i]);
>> - else
>> - break;
>> - }
>> +#ifdef CONFIG_STACKDEPOT
>> + handle = READ_ONCE(t->handle);
>> + if (handle)
>> + stack_depot_print(handle);
>> + else
>> + pr_err("object allocation/free stack trace missing\n");
>> #endif
>> }
>>
>> @@ -1304,9 +1302,9 @@ static inline int alloc_consistency_checks(struct kmem_cache *s,
>> return 1;
>> }
>>
>> -static noinline int alloc_debug_processing(struct kmem_cache *s,
>> - struct slab *slab,
>> - void *object, unsigned long addr)
>> +static noinline int
>> +alloc_debug_processing(struct kmem_cache *s, struct slab *slab, void *object,
>> + unsigned long addr, gfp_t flags)
>> {
>> if (s->flags & SLAB_CONSISTENCY_CHECKS) {
>> if (!alloc_consistency_checks(s, slab, object))
>> @@ -1315,7 +1313,7 @@ static noinline int alloc_debug_processing(struct kmem_cache *s,
>>
>> /* Success perform special debug activities for allocs */
>> if (s->flags & SLAB_STORE_USER)
>> - set_track(s, object, TRACK_ALLOC, addr);
>> + set_track(s, object, TRACK_ALLOC, addr, flags);
>
> I see warning because of this.
> We should not reuse flags here because alloc_debug_processing() can be
> called with preemption disabled, and caller specified GFP_KERNEL.

Ugh, thanks for catching this; it looks like I forgot to test with the
necessary config options. Indeed, the previous version of this patch,
commit 788691464c29, used GFP_NOWAIT. I took the idea of passing the
allocation gfpflags from Imran's version (another Cc I forgot, sorry):
https://lore.kernel.org/all/[email protected]/
...

> [ 2.015902] BUG: sleeping function called from invalid context at mm/page_alloc.c:5164
> [ 2.022052] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
> [ 2.028357] preempt_count: 1, expected: 0
> [ 2.031508] RCU nest depth: 0, expected: 0
> [ 2.034722] 1 lock held by swapper/0/1:
> [ 2.037905] #0: ffff00000488f4d0 (&sb->s_type->i_mutex_key#5){+.+.}-{4:4}, at: start_creating+0x58/0x130
> [ 2.045393] Preemption disabled at:
> [ 2.045400] [<ffff8000083bd008>] __slab_alloc.constprop.0+0x38/0xc0

... but indeed __slab_alloc() disables preemption so that won't work, and we
can only safely use GFP_NOWAIT. Will fix in v2, thanks.
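
Roughly something like this (untested sketch, not the actual v2):

static noinline void
set_track(struct kmem_cache *s, void *object, enum track_item alloc,
          unsigned long addr)
{
        struct track *p = get_track(s, object, alloc);

#ifdef CONFIG_STACKDEPOT
        unsigned long entries[TRACK_ADDRS_COUNT];
        unsigned int nr_entries;

        nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 3);
        /*
         * Callers such as ___slab_alloc() run with preemption disabled, so
         * the caller's gfpflags (possibly GFP_KERNEL) must not reach the
         * page allocator from here; GFP_NOWAIT keeps stack_depot_save()
         * from sleeping.
         */
        p->handle = stack_depot_save(entries, nr_entries, GFP_NOWAIT);
#endif

        p->addr = addr;
        p->cpu = smp_processor_id();
        p->pid = current->pid;
        p->when = jiffies;
}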

> [ 2.053039] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 5.17.0-rc5+ #105
> [ 2.059365] Hardware name: linux,dummy-virt (DT)
> [ 2.063160] Call trace:
> [ 2.065217] dump_backtrace+0xf8/0x130
> [ 2.068350] show_stack+0x24/0x80
> [ 2.071104] dump_stack_lvl+0x9c/0xd8
> [ 2.074140] dump_stack+0x18/0x34
> [ 2.076894] __might_resched+0x1a0/0x280
> [ 2.080146] __might_sleep+0x58/0x90
> [ 2.083108] prepare_alloc_pages.constprop.0+0x1b4/0x1f0
> [ 2.087468] __alloc_pages+0x88/0x1e0
> [ 2.090502] alloc_page_interleave+0x24/0xb4
> [ 2.094021] alloc_pages+0x10c/0x170
> [ 2.096984] __stack_depot_save+0x3e0/0x4e0
> [ 2.100446] stack_depot_save+0x14/0x20
> [ 2.103617] set_track.isra.0+0x64/0xa4
> [ 2.106787] alloc_debug_processing+0x11c/0x1e0
> [ 2.110532] ___slab_alloc+0x3e8/0x750
> [ 2.113643] __slab_alloc.constprop.0+0x64/0xc0
> [ 2.117391] kmem_cache_alloc+0x304/0x350
> [ 2.120702] security_inode_alloc+0x38/0xa4
> [ 2.124169] inode_init_always+0xd0/0x264
> [ 2.127501] alloc_inode+0x44/0xec
> [ 2.130325] new_inode+0x28/0xc0
> [ 2.133011] tracefs_create_file+0x74/0x1e0
> [ 2.136459] init_tracer_tracefs+0x248/0x644
> [ 2.140030] tracer_init_tracefs+0x9c/0x34c
> [ 2.143483] do_one_initcall+0x44/0x170
> [ 2.146654] do_initcalls+0x104/0x144
> [ 2.149704] kernel_init_freeable+0x130/0x178
>
> [...]
>
>> trace(s, slab, object, 1);
>> init_object(s, object, SLUB_RED_ACTIVE);
>> return 1;
>> @@ -1395,7 +1393,7 @@ static noinline int free_debug_processing(
>> }
>>
>> if (s->flags & SLAB_STORE_USER)
>> - set_track(s, object, TRACK_FREE, addr);
>> + set_track(s, object, TRACK_FREE, addr, GFP_NOWAIT);
>> trace(s, slab, object, 0);
>> /* Freepointer not overwritten by init_object(), SLAB_POISON moved it */
>> init_object(s, object, SLUB_RED_INACTIVE);
>> @@ -1632,7 +1630,8 @@ static inline
>> void setup_slab_debug(struct kmem_cache *s, struct slab *slab, void *addr) {}
>>
>> static inline int alloc_debug_processing(struct kmem_cache *s,
>> - struct slab *slab, void *object, unsigned long addr) { return 0; }
>> + struct slab *slab, void *object, unsigned long addr,
>> + gfp_t flags) { return 0; }
>>
>> static inline int free_debug_processing(
>> struct kmem_cache *s, struct slab *slab,
>> @@ -3033,7 +3032,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
>> check_new_slab:
>>
>> if (kmem_cache_debug(s)) {
>> - if (!alloc_debug_processing(s, slab, freelist, addr)) {
>> + if (!alloc_debug_processing(s, slab, freelist, addr, gfpflags)) {
>> /* Slab failed checks. Next slab needed */
>> goto new_slab;
>> } else {
>> @@ -4221,6 +4220,9 @@ static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags)
>> s->remote_node_defrag_ratio = 1000;
>> #endif
>>
>> + if (s->flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
>> + stack_depot_init();
>> +
>
> As mentioned in my report, it can crash system when creating boot caches
> with debugging enabled.
>
> The rest looks fine!
>
>> /* Initialize the pre-computed randomized freelist if slab is up */
>> if (slab_state >= UP) {
>> if (init_cache_random_seq(s))
>> @@ -4352,18 +4354,26 @@ void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab)
>> objp = fixup_red_left(s, objp);
>> trackp = get_track(s, objp, TRACK_ALLOC);
>> kpp->kp_ret = (void *)trackp->addr;
>> -#ifdef CONFIG_STACKTRACE
>> - for (i = 0; i < KS_ADDRS_COUNT && i < TRACK_ADDRS_COUNT; i++) {
>> - kpp->kp_stack[i] = (void *)trackp->addrs[i];
>> - if (!kpp->kp_stack[i])
>> - break;
>> - }
>> +#ifdef CONFIG_STACKDEPOT
>> + {
>> + depot_stack_handle_t handle;
>> + unsigned long *entries;
>> + unsigned int nr_entries;
>> +
>> + handle = READ_ONCE(trackp->handle);
>> + if (handle) {
>> + nr_entries = stack_depot_fetch(handle, &entries);
>> + for (i = 0; i < KS_ADDRS_COUNT && i < nr_entries; i++)
>> + kpp->kp_stack[i] = (void *)entries[i];
>> + }
>>
>> - trackp = get_track(s, objp, TRACK_FREE);
>> - for (i = 0; i < KS_ADDRS_COUNT && i < TRACK_ADDRS_COUNT; i++) {
>> - kpp->kp_free_stack[i] = (void *)trackp->addrs[i];
>> - if (!kpp->kp_free_stack[i])
>> - break;
>> + trackp = get_track(s, objp, TRACK_FREE);
>> + handle = READ_ONCE(trackp->handle);
>> + if (handle) {
>> + nr_entries = stack_depot_fetch(handle, &entries);
>> + for (i = 0; i < KS_ADDRS_COUNT && i < nr_entries; i++)
>> + kpp->kp_free_stack[i] = (void *)entries[i];
>> + }
>> }
>> #endif
>> #endif
>> --
>> 2.35.1
>>
>>
>

2022-02-28 20:27:15

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

On 2/26/22 08:19, Hyeonggon Yoo wrote:
> On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
>> Hi,
>>
>> this series combines and revives patches from Oliver's last year
>> bachelor thesis (where I was the advisor) that make SLUB's debugfs
>> files alloc_traces and free_traces more useful.
>> The resubmission was blocked on stackdepot changes that are now merged,
>> as explained in patch 2.
>>
>
> Hello. I just started review/testing this series.
>
> it crashed on my system (arm64)

Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
from memblock. arm64 must have memblock freeing happen earlier or something.
(CCing memblock experts)

> I ran with boot parameter slub_debug=U, and without KASAN.
> So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
>
> void * __init memblock_alloc_try_nid(
> phys_addr_t size, phys_addr_t align,
> phys_addr_t min_addr, phys_addr_t max_addr,
> int nid)
> {
> void *ptr;
>
> memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
> __func__, (u64)size, (u64)align, nid, &min_addr,
> &max_addr, (void *)_RET_IP_);
> ptr = memblock_alloc_internal(size, align,
> min_addr, max_addr, nid, false);
> if (ptr)
> memset(ptr, 0, size); <--- Crash Here
>
> return ptr;
> }
>
> It crashed during create_boot_cache() -> stack_depot_init() ->
> memblock_alloc().
>
> I think That's because, in kmem_cache_init(), both slab and memblock is not
> available. (AFAIU memblock is not available after mem_init() because of
> memblock_free_all(), right?)

Hm, yes, I see; even the x86_64 version of mem_init() calls memblock_free_all().
But then I would expect stack_depot_init() to detect that memblock_alloc()
returns NULL, print "Stack Depot hash table allocation failed, disabling"
and disable itself. Instead it seems memblock_alloc() returns something
that's potentially already used by somebody else? Sounds like a bug?

> Thanks!
>
> /*
> * Set up kernel memory allocators
> */
> static void __init mm_init(void)
> {
> /*
> * page_ext requires contiguous pages,
> * bigger than MAX_ORDER unless SPARSEMEM.
> */
> page_ext_init_flatmem();
> init_mem_debugging_and_hardening();
> kfence_alloc_pool();
> report_meminit();
> stack_depot_early_init();
> mem_init();
> mem_init_print_info();
> kmem_cache_init();
> /*
> * page_owner must be initialized after buddy is ready, and also after
> * slab is ready so that stack_depot_init() works properly
> */)
>
>> Patch 1 is a new preparatory cleanup.
>>
>> Patch 2 originally submitted here [1], was merged to mainline but
>> reverted for stackdepot related issues as explained in the patch.
>>
>> Patches 3-5 originally submitted as RFC here [2]. In this submission I
>> have omitted the new file 'all_objects' (patch 3/3 in [2]) as it might
>> be considered too intrusive so I will postpone it for later. The docs
>> patch is adjusted accordingly.
>>
>> Also available in git, based on v5.17-rc1:
>> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-stackdepot-v1
>>
>> I'd like to ask for some review before I add this to the slab tree.
>>
>> [1] https://lore.kernel.org/all/[email protected]/
>> [2] https://lore.kernel.org/all/[email protected]/
>>
>> Oliver Glitta (4):
>> mm/slub: use stackdepot to save stack trace in objects
>> mm/slub: aggregate and print stack traces in debugfs files
>> mm/slub: sort debugfs output by frequency of stack traces
>> slab, documentation: add description of debugfs files for SLUB caches
>>
>> Vlastimil Babka (1):
>> mm/slub: move struct track init out of set_track()
>>
>> Documentation/vm/slub.rst | 61 +++++++++++++++
>> init/Kconfig | 1 +
>> mm/slub.c | 152 +++++++++++++++++++++++++-------------
>> 3 files changed, 162 insertions(+), 52 deletions(-)
>>
>> --
>> 2.35.1
>>
>>
>

2022-02-28 20:41:24

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
> On 2/26/22 08:19, Hyeonggon Yoo wrote:
> > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
> >> Hi,
> >>
> >> this series combines and revives patches from Oliver's last year
> >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
> >> files alloc_traces and free_traces more useful.
> >> The resubmission was blocked on stackdepot changes that are now merged,
> >> as explained in patch 2.
> >>
> >
> > Hello. I just started review/testing this series.
> >
> > it crashed on my system (arm64)
>
> Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
> from memblock. arm64 must have memblock freeing happen earlier or something.
> (CCing memblock experts)
>
> > I ran with boot parameter slub_debug=U, and without KASAN.
> > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
> >
> > void * __init memblock_alloc_try_nid(
> > phys_addr_t size, phys_addr_t align,
> > phys_addr_t min_addr, phys_addr_t max_addr,
> > int nid)
> > {
> > void *ptr;
> >
> > memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
> > __func__, (u64)size, (u64)align, nid, &min_addr,
> > &max_addr, (void *)_RET_IP_);
> > ptr = memblock_alloc_internal(size, align,
> > min_addr, max_addr, nid, false);
> > if (ptr)
> > memset(ptr, 0, size); <--- Crash Here
> >
> > return ptr;
> > }
> >
> > It crashed during create_boot_cache() -> stack_depot_init() ->
> > memblock_alloc().
> >
> > I think That's because, in kmem_cache_init(), both slab and memblock is not
> > available. (AFAIU memblock is not available after mem_init() because of
> > memblock_free_all(), right?)
>
> Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
> But then, I would expect stack_depot_init() to detect that memblock_alloc()
> returns NULL, we print ""Stack Depot hash table allocation failed,
> disabling" and disable it. Instead it seems memblock_alloc() returns
> something that's already potentially used by somebody else? Sounds like a bug?

If stack_depot_init() is called from kmem_cache_init(), there will be
confusion about which allocator should be used, because both
stack_depot_init() and memblock use slab_is_available() to decide when to
stop using memblock and start using kmalloc() instead.

Hyeonggon, did you run your tests with panic-on-warn by any chance?
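
For reference, the confusion boils down to both sides doing roughly this
(trimmed from the 5.17-era lib/stackdepot.c and mm/memblock.c):

        /* lib/stackdepot.c, stack_depot_init() */
        if (slab_is_available())
                stack_table = kvmalloc(size, GFP_KERNEL);
        else
                stack_table = memblock_alloc(size, SMP_CACHE_BYTES);

        /* mm/memblock.c, memblock_alloc_internal() */
        if (WARN_ON_ONCE(slab_is_available()))
                return kzalloc(size, GFP_NOWAIT);

During kmem_cache_init() slab_is_available() is still false, so stackdepot
falls back to memblock_alloc() and memblock does not redirect to kzalloc()
either, even though memblock_free_all() has already handed its memory over
to the buddy allocator.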

> > Thanks!
> >
> > /*
> > * Set up kernel memory allocators
> > */
> > static void __init mm_init(void)
> > {
> > /*
> > * page_ext requires contiguous pages,
> > * bigger than MAX_ORDER unless SPARSEMEM.
> > */
> > page_ext_init_flatmem();
> > init_mem_debugging_and_hardening();
> > kfence_alloc_pool();
> > report_meminit();
> > stack_depot_early_init();
> > mem_init();
> > mem_init_print_info();
> > kmem_cache_init();
> > /*
> > * page_owner must be initialized after buddy is ready, and also after
> > * slab is ready so that stack_depot_init() works properly
> > */)
> >
> >> Patch 1 is a new preparatory cleanup.
> >>
> >> Patch 2 originally submitted here [1], was merged to mainline but
> >> reverted for stackdepot related issues as explained in the patch.
> >>
> >> Patches 3-5 originally submitted as RFC here [2]. In this submission I
> >> have omitted the new file 'all_objects' (patch 3/3 in [2]) as it might
> >> be considered too intrusive so I will postpone it for later. The docs
> >> patch is adjusted accordingly.
> >>
> >> Also available in git, based on v5.17-rc1:
> >> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-stackdepot-v1
> >>
> >> I'd like to ask for some review before I add this to the slab tree.
> >>
> >> [1] https://lore.kernel.org/all/[email protected]/
> >> [2] https://lore.kernel.org/all/[email protected]/
> >>
> >> Oliver Glitta (4):
> >> mm/slub: use stackdepot to save stack trace in objects
> >> mm/slub: aggregate and print stack traces in debugfs files
> >> mm/slub: sort debugfs output by frequency of stack traces
> >> slab, documentation: add description of debugfs files for SLUB caches
> >>
> >> Vlastimil Babka (1):
> >> mm/slub: move struct track init out of set_track()
> >>
> >> Documentation/vm/slub.rst | 61 +++++++++++++++
> >> init/Kconfig | 1 +
> >> mm/slub.c | 152 +++++++++++++++++++++++++-------------
> >> 3 files changed, 162 insertions(+), 52 deletions(-)
> >>
> >> --
> >> 2.35.1
> >>
> >>
> >
>

--
Sincerely yours,
Mike.

2022-02-28 21:29:16

by Hyeonggon Yoo

[permalink] [raw]
Subject: Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

On Mon, Feb 28, 2022 at 10:01:29PM +0200, Mike Rapoport wrote:
> On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
> > On 2/26/22 08:19, Hyeonggon Yoo wrote:
> > > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
> > >> Hi,
> > >>
> > >> this series combines and revives patches from Oliver's last year
> > >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
> > >> files alloc_traces and free_traces more useful.
> > >> The resubmission was blocked on stackdepot changes that are now merged,
> > >> as explained in patch 2.
> > >>
> > >
> > > Hello. I just started review/testing this series.
> > >
> > > it crashed on my system (arm64)
> >
> > Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
> > from memblock. arm64 must have memblock freeing happen earlier or something.
> > (CCing memblock experts)
> >
> > > I ran with boot parameter slub_debug=U, and without KASAN.
> > > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
> > >
> > > void * __init memblock_alloc_try_nid(
> > > phys_addr_t size, phys_addr_t align,
> > > phys_addr_t min_addr, phys_addr_t max_addr,
> > > int nid)
> > > {
> > > void *ptr;
> > >
> > > memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
> > > __func__, (u64)size, (u64)align, nid, &min_addr,
> > > &max_addr, (void *)_RET_IP_);
> > > ptr = memblock_alloc_internal(size, align,
> > > min_addr, max_addr, nid, false);
> > > if (ptr)
> > > memset(ptr, 0, size); <--- Crash Here
> > >
> > > return ptr;
> > > }
> > >
> > > It crashed during create_boot_cache() -> stack_depot_init() ->
> > > memblock_alloc().
> > >
> > > I think That's because, in kmem_cache_init(), both slab and memblock is not
> > > available. (AFAIU memblock is not available after mem_init() because of
> > > memblock_free_all(), right?)
> >
> > Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
> > But then, I would expect stack_depot_init() to detect that memblock_alloc()
> > returns NULL, we print ""Stack Depot hash table allocation failed,
> > disabling" and disable it. Instead it seems memblock_alloc() returns
> > something that's already potentially used by somebody else? Sounds like a bug?
>

It's really weird, but memblock_alloc() did not fail after
memblock_free_all(); it just crashed while initializing the memory it
returned.

> If stack_depot_init() is called from kmem_cache_init(), there will be a
> confusion what allocator should be used because we use slab_is_available()
> to stop using memblock and start using kmalloc() instead in both
> stack_depot_init() and in memblock.
>
> Hyeonggon, did you run your tests with panic on warn at any chance?
>

Yeah, I think this stack trace would help:

[ 0.000000] Stack Depot allocating hash table with memblock_alloc
[ 0.000000] Unable to handle kernel paging request at virtual address ffff000097400000
[ 0.000000] Mem abort info:
[ 0.000000] ESR = 0x96000047
[ 0.000000] EC = 0x25: DABT (current EL), IL = 32 bits
[ 0.000000] SET = 0, FnV = 0
[ 0.000000] EA = 0, S1PTW = 0
[ 0.000000] FSC = 0x07: level 3 translation fault
[ 0.000000] Data abort info:
[ 0.000000] ISV = 0, ISS = 0x00000047
[ 0.000000] CM = 0, WnR = 1
[ 0.000000] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000041719000
[ 0.000000] [ffff000097400000] pgd=18000000dcff8003, p4d=18000000dcff8003, pud=18000000dcbfe003, pmd=18000000dcb43003, pte=00680000d7400706
[ 0.000000] Internal error: Oops: 96000047 [#1] PREEMPT SMP
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.17.0-rc1-11918-gbf5d03166d75 #51
[ 0.000000] Hardware name: linux,dummy-virt (DT)
[ 0.000000] pstate: 400000c5 (nZcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 0.000000] pc : __memset+0x16c/0x188
[ 0.000000] lr : memblock_alloc_try_nid+0xcc/0xe4
[ 0.000000] sp : ffff800009a33cd0
[ 0.000000] x29: ffff800009a33cd0 x28: 0000000041720018 x27: ffff800009362640
[ 0.000000] x26: ffff800009362640 x25: 0000000000000000 x24: 0000000000000000
[ 0.000000] x23: 0000000000002000 x22: ffff80000932bb50 x21: 00000000ffffffff
[ 0.000000] x20: ffff000097400000 x19: 0000000000800000 x18: ffffffffffffffff
[ 0.000000] x17: 373578302f383278 x16: 302b657461657263 x15: 0000001000000000
[ 0.000000] x14: 0000000000000360 x13: 0000000000009f8c x12: 00000000dcb0c070
[ 0.000000] x11: 0000001000000000 x10: 00000000004ea000 x9 : 0000000000000000
[ 0.000000] x8 : ffff000097400000 x7 : 0000000000000000 x6 : 000000000000003f
[ 0.000000] x5 : 0000000000000040 x4 : 0000000000000000 x3 : 0000000000000004
[ 0.000000] x2 : 00000000007fffc0 x1 : 0000000000000000 x0 : ffff000097400000
[ 0.000000] Call trace:
[ 0.000000] __memset+0x16c/0x188
[ 0.000000] stack_depot_init+0xc8/0x100
[ 0.000000] __kmem_cache_create+0x454/0x570
[ 0.000000] create_boot_cache+0xa0/0xe0
[ 0.000000] kmem_cache_init+0xf8/0x204
[ 0.000000] start_kernel+0x3ec/0x668
[ 0.000000] __primary_switched+0xc0/0xc8
[ 0.000000] Code: 91010108 54ffff4a 8b040108 cb050042 (d50b7428)
[ 0.000000] ---[ end trace 0000000000000000 ]---
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---


Thanks!

> > > Thanks!
> > >
> > > /*
> > > * Set up kernel memory allocators
> > > */
> > > static void __init mm_init(void)
> > > {
> > > /*
> > > * page_ext requires contiguous pages,
> > > * bigger than MAX_ORDER unless SPARSEMEM.
> > > */
> > > page_ext_init_flatmem();
> > > init_mem_debugging_and_hardening();
> > > kfence_alloc_pool();
> > > report_meminit();
> > > stack_depot_early_init();
> > > mem_init();
> > > mem_init_print_info();
> > > kmem_cache_init();
> > > /*
> > > * page_owner must be initialized after buddy is ready, and also after
> > > * slab is ready so that stack_depot_init() works properly
> > > */)
> > >
> > >> Patch 1 is a new preparatory cleanup.
> > >>
> > >> Patch 2 originally submitted here [1], was merged to mainline but
> > >> reverted for stackdepot related issues as explained in the patch.
> > >>
> > >> Patches 3-5 originally submitted as RFC here [2]. In this submission I
> > >> have omitted the new file 'all_objects' (patch 3/3 in [2]) as it might
> > >> be considered too intrusive so I will postpone it for later. The docs
> > >> patch is adjusted accordingly.
> > >>
> > >> Also available in git, based on v5.17-rc1:
> > >> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-stackdepot-v1
> > >>
> > >> I'd like to ask for some review before I add this to the slab tree.
> > >>
> > >> [1] https://lore.kernel.org/all/[email protected]/
> > >> [2] https://lore.kernel.org/all/[email protected]/
> > >>
> > >> Oliver Glitta (4):
> > >> mm/slub: use stackdepot to save stack trace in objects
> > >> mm/slub: aggregate and print stack traces in debugfs files
> > >> mm/slub: sort debugfs output by frequency of stack traces
> > >> slab, documentation: add description of debugfs files for SLUB caches
> > >>
> > >> Vlastimil Babka (1):
> > >> mm/slub: move struct track init out of set_track()
> > >>
> > >> Documentation/vm/slub.rst | 61 +++++++++++++++
> > >> init/Kconfig | 1 +
> > >> mm/slub.c | 152 +++++++++++++++++++++++++-------------
> > >> 3 files changed, 162 insertions(+), 52 deletions(-)
> > >>
> > >> --
> > >> 2.35.1
> > >>
> > >>
> > >
> >
>
> --
> Sincerely yours,
> Mike.

--
Thank you, You are awesome!
Hyeonggon :-)

2022-02-28 21:31:06

by Hyeonggon Yoo

[permalink] [raw]
Subject: Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
> On 2/26/22 08:19, Hyeonggon Yoo wrote:
> > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
> >> Hi,
> >>
> >> this series combines and revives patches from Oliver's last year
> >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
> >> files alloc_traces and free_traces more useful.
> >> The resubmission was blocked on stackdepot changes that are now merged,
> >> as explained in patch 2.
> >>
> >
> > Hello. I just started review/testing this series.
> >
> > it crashed on my system (arm64)
>
> Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
> from memblock. arm64 must have memblock freeing happen earlier or something.
> (CCing memblock experts)
>
> > I ran with boot parameter slub_debug=U, and without KASAN.
> > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
> >
> > void * __init memblock_alloc_try_nid(
> > phys_addr_t size, phys_addr_t align,
> > phys_addr_t min_addr, phys_addr_t max_addr,
> > int nid)
> > {
> > void *ptr;
> >
> > memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
> > __func__, (u64)size, (u64)align, nid, &min_addr,
> > &max_addr, (void *)_RET_IP_);
> > ptr = memblock_alloc_internal(size, align,
> > min_addr, max_addr, nid, false);
> > if (ptr)
> > memset(ptr, 0, size); <--- Crash Here
> >
> > return ptr;
> > }
> >
> > It crashed during create_boot_cache() -> stack_depot_init() ->
> > memblock_alloc().
> >
> > I think That's because, in kmem_cache_init(), both slab and memblock is not
> > available. (AFAIU memblock is not available after mem_init() because of
> > memblock_free_all(), right?)
>
> Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
> But then, I would expect stack_depot_init() to detect that memblock_alloc()
> returns NULL, we print ""Stack Depot hash table allocation failed,
> disabling" and disable it. Instead it seems memblock_alloc() returns
> something that's already potentially used by somebody else? Sounds like a bug?


By the way, I fixed this by allowing stack_depot_init() to be called from
kmem_cache_init() too [1], and Marco suggested calling stack_depot_init()
depending on the slub_debug parameter instead, for simplicity [2].

I would prefer [2]. Would you take a look?

[1] https://lkml.org/lkml/2022/2/27/31

[2] https://lkml.org/lkml/2022/2/28/717

> > Thanks!
> >
> > /*
> > * Set up kernel memory allocators
> > */
> > static void __init mm_init(void)
> > {
> > /*
> > * page_ext requires contiguous pages,
> > * bigger than MAX_ORDER unless SPARSEMEM.
> > */
> > page_ext_init_flatmem();
> > init_mem_debugging_and_hardening();
> > kfence_alloc_pool();
> > report_meminit();
> > stack_depot_early_init();
> > mem_init();
> > mem_init_print_info();
> > kmem_cache_init();
> > /*
> > * page_owner must be initialized after buddy is ready, and also after
> > * slab is ready so that stack_depot_init() works properly
> > */)
> >

--
Thank you, You are awesome!
Hyeonggon :-)

2022-03-01 00:46:25

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH] mm/slub: initialize stack depot in boot process

On 2/28/22 16:09, Hyeonggon Yoo wrote:
> commit ba10d4b46655 ("mm/slub: use stackdepot to save stack trace in
> objects") initializes stack depot while creating cache if SLAB_STORE_USER
> flag is set.
>
> This can make kernel crash because a cache can be created in various
> contexts. For example if user sets slub_debug=U, kernel crashes
> because create_boot_cache() calls stack_depot_init(), which tries to
> allocate hash table using memblock_alloc() if slab is not available.
> But memblock is also not available at that time.
>
> This patch solves the problem by initializing stack depot early
> in boot process if SLAB_STORE_USER debug flag is set globally
> or the flag is set to at least one cache.
>
> [ [email protected]: initialize stack depot depending on slub_debug
> parameter instead of allowing stack_depot_init() can be called
> in kmem_cache_init() for simplicity. ]
>
> Link: https://lkml.org/lkml/2022/2/28/238
> Fixes: ba10d4b46655 ("mm/slub: use stackdepot to save stack trace in objects")
> Signed-off-by: Hyeonggon Yoo <[email protected]>

I think a much easier approach would be to do this checking in
setup_slub_debug(). There we may either detect SLAB_STORE_USER in
global_flags, or check the flags returned by parse_slub_debug_flags() in the
while (str) loop, in the 'else' case where slab_list is present. Both cases
would just set some variable that stack_depot_early_init() (the
!CONFIG_STACKDEPOT_ALWAYS_INIT version, or a newly consolidated one) would
check. So that would be another way to request stack_depot_init() at a
well-defined point of boot, similar to CONFIG_STACKDEPOT_ALWAYS_INIT. This
works because setup_slub_debug() is called via __setup, which is processed
from start_kernel() -> parse_args(), i.e. before mm_init() -> stack_depot_early_init().
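
Something like this (completely untested sketch; stack_depot_request_early_init()
and the flag name are just placeholders, not an existing API):

/* lib/stackdepot.c -- sketch only */
static bool stack_depot_needed_early __initdata;

void __init stack_depot_request_early_init(void)
{
	stack_depot_needed_early = true;
}

/* the consolidated stack_depot_early_init(), still called from mm_init() */
int __init stack_depot_early_init(void)
{
	if (!stack_depot_needed_early || stack_depot_disable || stack_table)
		return 0;

	stack_table = memblock_alloc(STACK_HASH_SIZE * sizeof(struct stack_record *),
				     SMP_CACHE_BYTES);
	if (!stack_table) {
		pr_err("Stack Depot hash table allocation failed, disabling\n");
		stack_depot_disable = true;
		return -ENOMEM;
	}
	return 0;
}

/* mm/slub.c, in setup_slub_debug(), once SLAB_STORE_USER has been seen
 * either in global_flags or in a per-cache block returned by
 * parse_slub_debug_flags() */
	if (IS_ENABLED(CONFIG_STACKDEPOT))
		stack_depot_request_early_init();

The CONFIG_STACKDEPOT_ALWAYS_INIT=y case could then simply set the same flag
unconditionally.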

> ---
> include/linux/slab.h | 1 +
> init/main.c | 1 +
> mm/slab.c | 4 ++++
> mm/slob.c | 4 ++++
> mm/slub.c | 28 +++++++++++++++++++++++++---
> 5 files changed, 35 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 37bde99b74af..023f3f71ae35 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -139,6 +139,7 @@ struct mem_cgroup;
> /*
> * struct kmem_cache related prototypes
> */
> +void __init kmem_cache_init_early(void);
> void __init kmem_cache_init(void);
> bool slab_is_available(void);
>
> diff --git a/init/main.c b/init/main.c
> index 65fa2e41a9c0..4fdb7975a085 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -835,6 +835,7 @@ static void __init mm_init(void)
> kfence_alloc_pool();
> report_meminit();
> stack_depot_early_init();
> + kmem_cache_init_early();
> mem_init();
> mem_init_print_info();
> kmem_cache_init();
> diff --git a/mm/slab.c b/mm/slab.c
> index ddf5737c63d9..80a6d01aab06 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -1196,6 +1196,10 @@ static void __init set_up_node(struct kmem_cache *cachep, int index)
> }
> }
>
> +void __init kmem_cache_init_early(void)
> +{
> +}
> +
> /*
> * Initialisation. Called after the page allocator have been initialised and
> * before smp_init().
> diff --git a/mm/slob.c b/mm/slob.c
> index 60c5842215f1..00e323af8be4 100644
> --- a/mm/slob.c
> +++ b/mm/slob.c
> @@ -715,6 +715,10 @@ struct kmem_cache kmem_cache_boot = {
> .align = ARCH_KMALLOC_MINALIGN,
> };
>
> +void __init kmem_cache_init_early(void)
> +{
> +}
> +
> void __init kmem_cache_init(void)
> {
> kmem_cache = &kmem_cache_boot;
> diff --git a/mm/slub.c b/mm/slub.c
> index a74afe59a403..40bcd18143b6 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4221,9 +4221,6 @@ static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags)
> s->remote_node_defrag_ratio = 1000;
> #endif
>
> - if (s->flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
> - stack_depot_init();
> -
> /* Initialize the pre-computed randomized freelist if slab is up */
> if (slab_state >= UP) {
> if (init_cache_random_seq(s))
> @@ -4810,6 +4807,31 @@ static struct kmem_cache * __init bootstrap(struct kmem_cache *static_cache)
> return s;
> }
>
> +/* Initialize stack depot if needed */
> +void __init kmem_cache_init_early(void)
> +{
> +#ifdef CONFIG_STACKDEPOT
> + slab_flags_t block_flags;
> + char *next_block;
> + char *slab_list;
> +
> + if (slub_debug & SLAB_STORE_USER)
> + goto init_stack_depot;
> +
> + next_block = slub_debug_string;
> + while (next_block) {
> + next_block = parse_slub_debug_flags(next_block, &block_flags, &slab_list, false);
> + if (block_flags & SLAB_STORE_USER)
> + goto init_stack_depot;
> + }
> +
> + return;
> +
> +init_stack_depot:
> + stack_depot_init();
> +#endif
> +}
> +
> void __init kmem_cache_init(void)
> {
> static __initdata struct kmem_cache boot_kmem_cache,

2022-03-01 04:08:19

by Hyeonggon Yoo

[permalink] [raw]
Subject: Re: [PATCH] mm/slub: initialize stack depot in boot process

On Mon, Feb 28, 2022 at 05:28:17PM +0100, Marco Elver wrote:
> On Mon, Feb 28, 2022 at 03:09PM +0000, Hyeonggon Yoo wrote:
> > commit ba10d4b46655 ("mm/slub: use stackdepot to save stack trace in
> > objects") initializes stack depot while creating cache if SLAB_STORE_USER
> > flag is set.
> >
> > This can make kernel crash because a cache can be created in various
> > contexts. For example if user sets slub_debug=U, kernel crashes
> > because create_boot_cache() calls stack_depot_init(), which tries to
> > allocate hash table using memblock_alloc() if slab is not available.
> > But memblock is also not available at that time.
> >
> > This patch solves the problem by initializing stack depot early
> > in boot process if SLAB_STORE_USER debug flag is set globally
> > or the flag is set to at least one cache.
> >
> > [ [email protected]: initialize stack depot depending on slub_debug
> > parameter instead of allowing stack_depot_init() can be called
> > in kmem_cache_init() for simplicity. ]
> >
> > Link: https://lkml.org/lkml/2022/2/28/238
>
> This would be a better permalink:
> https://lore.kernel.org/all/YhyeaP8lrzKgKm5A@ip-172-31-19-208.ap-northeast-1.compute.internal/
>

Agreed.

> > Fixes: ba10d4b46655 ("mm/slub: use stackdepot to save stack trace in objects")
>
> This commit does not exist in -next.
>

It has not landed in -next yet.

> I assume you intend that "lib/stackdepot: Use page allocator if both
> slab and memblock is unavailable" should be dropped now.
>

I did not intend that, but I agree the patch you mentioned
should be dropped now.

> > Signed-off-by: Hyeonggon Yoo <[email protected]>
> > ---
> > include/linux/slab.h | 1 +
> > init/main.c | 1 +
> > mm/slab.c | 4 ++++
> > mm/slob.c | 4 ++++
> > mm/slub.c | 28 +++++++++++++++++++++++++---
> > 5 files changed, 35 insertions(+), 3 deletions(-)
> [...]
> >
> > +/* Initialize stack depot if needed */
> > +void __init kmem_cache_init_early(void)
> > +{
> > +#ifdef CONFIG_STACKDEPOT
> > + slab_flags_t block_flags;
> > + char *next_block;
> > + char *slab_list;
> > +
> > + if (slub_debug & SLAB_STORE_USER)
> > + goto init_stack_depot;
> > +
> > + next_block = slub_debug_string;
> > + while (next_block) {
> > + next_block = parse_slub_debug_flags(next_block, &block_flags, &slab_list, false);
> > + if (block_flags & SLAB_STORE_USER)
> > + goto init_stack_depot;
> > + }
> > +
> > + return;
> > +
> > +init_stack_depot:
> > + stack_depot_init();
> > +#endif
> > +}
>
> You can simplify this function to avoid the goto:
>
> /* Initialize stack depot if needed */
> void __init kmem_cache_init_early(void)
> {
> #ifdef CONFIG_STACKDEPOT
> slab_flags_t flags = slub_debug;
> char *next_block = slub_debug_string;
> char *slab_list;
>
> for (;;) {
> if (flags & SLAB_STORE_USER) {
> stack_depot_init();
> break;
> }
> if (!next_block)
> break;
> next_block = parse_slub_debug_flags(next_block, &flags, &slab_list, false);
> }
> #endif
> }
>
> ^^ with this version, it'd also be much easier and less confusing to add
> other initialization logic unrelated to stackdepot later after the loop
> (should it ever be required).

Thank you for the nice suggestion, but I want to try it in
setup_slub_debug() as Vlastimil suggested!

Thanks.

--
Thank you, You are awesome!
Hyeonggon :-)

2022-03-01 05:07:00

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

On 2/28/22 21:01, Mike Rapoport wrote:
> On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
>> On 2/26/22 08:19, Hyeonggon Yoo wrote:
>> > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
>> >> Hi,
>> >>
>> >> this series combines and revives patches from Oliver's last year
>> >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
>> >> files alloc_traces and free_traces more useful.
>> >> The resubmission was blocked on stackdepot changes that are now merged,
>> >> as explained in patch 2.
>> >>
>> >
>> > Hello. I just started review/testing this series.
>> >
>> > it crashed on my system (arm64)
>>
>> Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
>> from memblock. arm64 must have memblock freeing happen earlier or something.
>> (CCing memblock experts)
>>
>> > I ran with boot parameter slub_debug=U, and without KASAN.
>> > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
>> >
>> > void * __init memblock_alloc_try_nid(
>> > phys_addr_t size, phys_addr_t align,
>> > phys_addr_t min_addr, phys_addr_t max_addr,
>> > int nid)
>> > {
>> > void *ptr;
>> >
>> > memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
>> > __func__, (u64)size, (u64)align, nid, &min_addr,
>> > &max_addr, (void *)_RET_IP_);
>> > ptr = memblock_alloc_internal(size, align,
>> > min_addr, max_addr, nid, false);
>> > if (ptr)
>> > memset(ptr, 0, size); <--- Crash Here
>> >
>> > return ptr;
>> > }
>> >
>> > It crashed during create_boot_cache() -> stack_depot_init() ->
>> > memblock_alloc().
>> >
>> > I think That's because, in kmem_cache_init(), both slab and memblock is not
>> > available. (AFAIU memblock is not available after mem_init() because of
>> > memblock_free_all(), right?)
>>
>> Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
>> But then, I would expect stack_depot_init() to detect that memblock_alloc()
>> returns NULL, we print ""Stack Depot hash table allocation failed,
>> disabling" and disable it. Instead it seems memblock_alloc() returns
>> something that's already potentially used by somebody else? Sounds like a bug?
>
> If stack_depot_init() is called from kmem_cache_init(), there will be a
> confusion what allocator should be used because we use slab_is_available()
> to stop using memblock and start using kmalloc() instead in both
> stack_depot_init() and in memblock.

I did check that stack_depot_init() is called from kmem_cache_init()
*before* we make slab_is_available() true, hence I assumed that memblock would
still be available at that point and expected no confusion. But it seems that
if memblock is already past memblock_free_all(), then it being still available
is just an illusion?

> Hyeonggon, did you run your tests with panic on warn at any chance?
>

2022-03-01 09:52:39

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

Hi,

On Mon, Feb 28, 2022 at 09:27:02PM +0000, Hyeonggon Yoo wrote:
> On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
> > On 2/26/22 08:19, Hyeonggon Yoo wrote:
> > > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
> > >> Hi,
> > >>
> > >> this series combines and revives patches from Oliver's last year
> > >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
> > >> files alloc_traces and free_traces more useful.
> > >> The resubmission was blocked on stackdepot changes that are now merged,
> > >> as explained in patch 2.
> > >>
> > >
> > > Hello. I just started review/testing this series.
> > >
> > > it crashed on my system (arm64)
> >
> > Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
> > from memblock. arm64 must have memblock freeing happen earlier or something.
> > (CCing memblock experts)
> >
> > > I ran with boot parameter slub_debug=U, and without KASAN.
> > > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
> > >
> > > void * __init memblock_alloc_try_nid(
> > > phys_addr_t size, phys_addr_t align,
> > > phys_addr_t min_addr, phys_addr_t max_addr,
> > > int nid)
> > > {
> > > void *ptr;
> > >
> > > memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
> > > __func__, (u64)size, (u64)align, nid, &min_addr,
> > > &max_addr, (void *)_RET_IP_);
> > > ptr = memblock_alloc_internal(size, align,
> > > min_addr, max_addr, nid, false);
> > > if (ptr)
> > > memset(ptr, 0, size); <--- Crash Here
> > >
> > > return ptr;
> > > }
> > >
> > > It crashed during create_boot_cache() -> stack_depot_init() ->
> > > memblock_alloc().
> > >
> > > I think That's because, in kmem_cache_init(), both slab and memblock is not
> > > available. (AFAIU memblock is not available after mem_init() because of
> > > memblock_free_all(), right?)
> >
> > Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
> > But then, I would expect stack_depot_init() to detect that memblock_alloc()
> > returns NULL, we print ""Stack Depot hash table allocation failed,
> > disabling" and disable it. Instead it seems memblock_alloc() returns
> > something that's already potentially used by somebody else? Sounds like a bug?
>
>
> By the way, I fixed this by allowing stack_depot_init() to be called in
> kmem_cache_init() too [1] and Marco suggested that calling
> stack_depot_init() depending on slub_debug parameter for simplicity. [2]
>
> I would prefer [2], Would you take a look?
>
> [1] https://lkml.org/lkml/2022/2/27/31
>
> [2] https://lkml.org/lkml/2022/2/28/717

I'm still looking at stack_depot_init() callers, but I think it's possible
to make stack_depot_early_init() a proper function that calls
memblock_alloc() and only use stack_depot_init() when slab is actually
available.


--
Sincerely yours,
Mike.

2022-03-01 11:35:06

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

On Tue, Mar 01, 2022 at 12:38:11AM +0100, Vlastimil Babka wrote:
> On 2/28/22 21:01, Mike Rapoport wrote:
> > On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
> >> On 2/26/22 08:19, Hyeonggon Yoo wrote:
> >> > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
> >> >> Hi,
> >> >>
> >> >> this series combines and revives patches from Oliver's last year
> >> >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
> >> >> files alloc_traces and free_traces more useful.
> >> >> The resubmission was blocked on stackdepot changes that are now merged,
> >> >> as explained in patch 2.
> >> >>
> >> >
> >> > Hello. I just started review/testing this series.
> >> >
> >> > it crashed on my system (arm64)
> >>
> >> Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
> >> from memblock. arm64 must have memblock freeing happen earlier or something.
> >> (CCing memblock experts)
> >>
> >> > I ran with boot parameter slub_debug=U, and without KASAN.
> >> > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
> >> >
> >> > void * __init memblock_alloc_try_nid(
> >> > phys_addr_t size, phys_addr_t align,
> >> > phys_addr_t min_addr, phys_addr_t max_addr,
> >> > int nid)
> >> > {
> >> > void *ptr;
> >> >
> >> > memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
> >> > __func__, (u64)size, (u64)align, nid, &min_addr,
> >> > &max_addr, (void *)_RET_IP_);
> >> > ptr = memblock_alloc_internal(size, align,
> >> > min_addr, max_addr, nid, false);
> >> > if (ptr)
> >> > memset(ptr, 0, size); <--- Crash Here
> >> >
> >> > return ptr;
> >> > }
> >> >
> >> > It crashed during create_boot_cache() -> stack_depot_init() ->
> >> > memblock_alloc().
> >> >
> >> > I think That's because, in kmem_cache_init(), both slab and memblock is not
> >> > available. (AFAIU memblock is not available after mem_init() because of
> >> > memblock_free_all(), right?)
> >>
> >> Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
> >> But then, I would expect stack_depot_init() to detect that memblock_alloc()
> >> returns NULL, we print ""Stack Depot hash table allocation failed,
> >> disabling" and disable it. Instead it seems memblock_alloc() returns
> >> something that's already potentially used by somebody else? Sounds like a bug?
> >
> > If stack_depot_init() is called from kmem_cache_init(), there will be a
> > confusion what allocator should be used because we use slab_is_available()
> > to stop using memblock and start using kmalloc() instead in both
> > stack_depot_init() and in memblock.
>
> I did check that stack_depot_init() is called from kmem_cache_init()
> *before* we make slab_is_available() true, hence assumed that memblock would
> be still available at that point and expected no confusion. But seems if
> memblock is already beyond memblock_free_all() then it being still available
> is just an illusion?

Yeah, it appears it is an illusion :)

I think we have to deal with allocations that happen between
memblock_free_all() and slab_is_available() at the memblock level, and then
figure out where to put stack_depot_init() and how to allocate memory
there.

I believe something like this (untested) patch below addresses the first
issue. As for stack_depot_init() I'm still trying to figure out the
possible call paths, but it seems we can use stack_depot_early_init() for
the SLUB debugging case. I'll try to come up with something Really Soon (tm).

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 50ad19662a32..4ea89d44d22a 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -90,6 +90,7 @@ struct memblock_type {
*/
struct memblock {
bool bottom_up; /* is bottom up direction? */
+ bool mem_freed;
phys_addr_t current_limit;
struct memblock_type memory;
struct memblock_type reserved;
diff --git a/mm/memblock.c b/mm/memblock.c
index b12a364f2766..60196dc4980e 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -120,6 +120,7 @@ struct memblock memblock __initdata_memblock = {
.reserved.name = "reserved",

.bottom_up = false,
+ .mem_freed = false,
.current_limit = MEMBLOCK_ALLOC_ANYWHERE,
};

@@ -1487,6 +1488,13 @@ static void * __init memblock_alloc_internal(
if (WARN_ON_ONCE(slab_is_available()))
return kzalloc_node(size, GFP_NOWAIT, nid);

+ if (memblock.mem_freed) {
+ unsigned int order = get_order(size);
+
+ pr_warn("memblock: allocating from buddy\n");
+ return __alloc_pages_node(nid, order, GFP_KERNEL);
+ }
+
if (max_addr > memblock.current_limit)
max_addr = memblock.current_limit;

@@ -2116,6 +2124,7 @@ void __init memblock_free_all(void)

pages = free_low_memory_core_early();
totalram_pages_add(pages);
+ memblock.mem_freed = true;
}

#if defined(CONFIG_DEBUG_FS) && defined(CONFIG_ARCH_KEEP_MEMBLOCK)


--
Sincerely yours,
Mike.

2022-03-01 19:56:36

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

On Tue, Mar 01, 2022 at 10:41:10AM +0100, Vlastimil Babka wrote:
> On 3/1/22 10:21, Mike Rapoport wrote:
> > On Tue, Mar 01, 2022 at 12:38:11AM +0100, Vlastimil Babka wrote:
> >> On 2/28/22 21:01, Mike Rapoport wrote:
> >> > On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
> >> >
> >> > If stack_depot_init() is called from kmem_cache_init(), there will be a
> >> > confusion what allocator should be used because we use slab_is_available()
> >> > to stop using memblock and start using kmalloc() instead in both
> >> > stack_depot_init() and in memblock.
> >>
> >> I did check that stack_depot_init() is called from kmem_cache_init()
> >> *before* we make slab_is_available() true, hence assumed that memblock would
> >> be still available at that point and expected no confusion. But seems if
> >> memblock is already beyond memblock_free_all() then it being still available
> >> is just an illusion?
> >
> > Yeah, it appears it is an illusion :)
> >
> > I think we have to deal with allocations that happen between
> > memblock_free_all() and slab_is_available() at the memblock level and then
> > figure out the where to put stack_depot_init() and how to allocate memory
> > there.
> >
> > I believe something like this (untested) patch below addresses the first
> > issue. As for stack_depot_init() I'm still trying to figure out the
> > possible call paths, but it seems we can use stack_depot_early_init() for
> > SLUB debugging case. I'll try to come up with something Really Soon (tm).
>
> Yeah as you already noticed, we are pursuing an approach to decide on
> calling stack_depot_early_init(), which should be a good way to solve this
> given how special slab is in this case. For memblock I just wanted to point
> out that it could be more robust, your patch below seems to be on the right
> patch. Maybe it just doesn't have to fallback to buddy, which could be
> considered a layering violation, but just return NULL that can be
> immediately recognized as an error?

The layering violation is there anyway for the slab_is_available() case, so
adding a __alloc_pages() there would only be consistent.


--
Sincerely yours,
Mike.

2022-03-01 19:58:01

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

On 3/1/22 10:21, Mike Rapoport wrote:
> On Tue, Mar 01, 2022 at 12:38:11AM +0100, Vlastimil Babka wrote:
>> On 2/28/22 21:01, Mike Rapoport wrote:
>> > On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
>> >> On 2/26/22 08:19, Hyeonggon Yoo wrote:
>> >> > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
>> >> >> Hi,
>> >> >>
>> >> >> this series combines and revives patches from Oliver's last year
>> >> >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
>> >> >> files alloc_traces and free_traces more useful.
>> >> >> The resubmission was blocked on stackdepot changes that are now merged,
>> >> >> as explained in patch 2.
>> >> >>
>> >> >
>> >> > Hello. I just started review/testing this series.
>> >> >
>> >> > it crashed on my system (arm64)
>> >>
>> >> Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
>> >> from memblock. arm64 must have memblock freeing happen earlier or something.
>> >> (CCing memblock experts)
>> >>
>> >> > I ran with boot parameter slub_debug=U, and without KASAN.
>> >> > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
>> >> >
>> >> > void * __init memblock_alloc_try_nid(
>> >> > phys_addr_t size, phys_addr_t align,
>> >> > phys_addr_t min_addr, phys_addr_t max_addr,
>> >> > int nid)
>> >> > {
>> >> > void *ptr;
>> >> >
>> >> > memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
>> >> > __func__, (u64)size, (u64)align, nid, &min_addr,
>> >> > &max_addr, (void *)_RET_IP_);
>> >> > ptr = memblock_alloc_internal(size, align,
>> >> > min_addr, max_addr, nid, false);
>> >> > if (ptr)
>> >> > memset(ptr, 0, size); <--- Crash Here
>> >> >
>> >> > return ptr;
>> >> > }
>> >> >
>> >> > It crashed during create_boot_cache() -> stack_depot_init() ->
>> >> > memblock_alloc().
>> >> >
>> >> > I think That's because, in kmem_cache_init(), both slab and memblock is not
>> >> > available. (AFAIU memblock is not available after mem_init() because of
>> >> > memblock_free_all(), right?)
>> >>
>> >> Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
>> >> But then, I would expect stack_depot_init() to detect that memblock_alloc()
>> >> returns NULL, we print ""Stack Depot hash table allocation failed,
>> >> disabling" and disable it. Instead it seems memblock_alloc() returns
>> >> something that's already potentially used by somebody else? Sounds like a bug?
>> >
>> > If stack_depot_init() is called from kmem_cache_init(), there will be a
>> > confusion what allocator should be used because we use slab_is_available()
>> > to stop using memblock and start using kmalloc() instead in both
>> > stack_depot_init() and in memblock.
>>
>> I did check that stack_depot_init() is called from kmem_cache_init()
>> *before* we make slab_is_available() true, hence assumed that memblock would
>> be still available at that point and expected no confusion. But seems if
>> memblock is already beyond memblock_free_all() then it being still available
>> is just an illusion?
>
> Yeah, it appears it is an illusion :)
>
> I think we have to deal with allocations that happen between
> memblock_free_all() and slab_is_available() at the memblock level and then
> figure out the where to put stack_depot_init() and how to allocate memory
> there.
>
> I believe something like this (untested) patch below addresses the first
> issue. As for stack_depot_init() I'm still trying to figure out the
> possible call paths, but it seems we can use stack_depot_early_init() for
> SLUB debugging case. I'll try to come up with something Really Soon (tm).

Yeah, as you already noticed, we are pursuing an approach that decides on
calling stack_depot_early_init(), which should be a good way to solve this
given how special slab is in this case. For memblock I just wanted to point
out that it could be more robust; your patch below seems to be on the right
path. Maybe it just doesn't have to fall back to buddy, which could be
considered a layering violation, but could instead return NULL, which can be
immediately recognized as an error?
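
I.e. instead of the buddy fallback in the hunk below, something like this
(untested, just to illustrate, reusing the mem_freed flag from your patch):

	if (WARN_ON_ONCE(slab_is_available()))
		return kzalloc_node(size, GFP_NOWAIT, nid);

	/* memblock's memory was already released to buddy: refuse, and let
	 * the caller (e.g. stack_depot_init()) handle the failure */
	if (WARN_ON_ONCE(memblock.mem_freed))
		return NULL;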

> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index 50ad19662a32..4ea89d44d22a 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -90,6 +90,7 @@ struct memblock_type {
> */
> struct memblock {
> bool bottom_up; /* is bottom up direction? */
> + bool mem_freed;
> phys_addr_t current_limit;
> struct memblock_type memory;
> struct memblock_type reserved;
> diff --git a/mm/memblock.c b/mm/memblock.c
> index b12a364f2766..60196dc4980e 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -120,6 +120,7 @@ struct memblock memblock __initdata_memblock = {
> .reserved.name = "reserved",
>
> .bottom_up = false,
> + .mem_freed = false,
> .current_limit = MEMBLOCK_ALLOC_ANYWHERE,
> };
>
> @@ -1487,6 +1488,13 @@ static void * __init memblock_alloc_internal(
> if (WARN_ON_ONCE(slab_is_available()))
> return kzalloc_node(size, GFP_NOWAIT, nid);
>
> + if (memblock.mem_freed) {
> + unsigned int order = get_order(size);
> +
> + pr_warn("memblock: allocating from buddy\n");
> + return __alloc_pages_node(nid, order, GFP_KERNEL);
> + }
> +
> if (max_addr > memblock.current_limit)
> max_addr = memblock.current_limit;
>
> @@ -2116,6 +2124,7 @@ void __init memblock_free_all(void)
>
> pages = free_low_memory_core_early();
> totalram_pages_add(pages);
> + memblock.mem_freed = true;
> }
>
> #if defined(CONFIG_DEBUG_FS) && defined(CONFIG_ARCH_KEEP_MEMBLOCK)
>

2022-03-02 10:47:09

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

On 3/2/22 09:37, Mike Rapoport wrote:
> On Mon, Feb 28, 2022 at 09:27:02PM +0000, Hyeonggon Yoo wrote:
>> On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
>> > On 2/26/22 08:19, Hyeonggon Yoo wrote:
>> > > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
>> > >> Hi,
>> > >>
>> > >> this series combines and revives patches from Oliver's last year
>> > >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
>> > >> files alloc_traces and free_traces more useful.
>> > >> The resubmission was blocked on stackdepot changes that are now merged,
>> > >> as explained in patch 2.
>> > >>
>> > >
>> > > Hello. I just started review/testing this series.
>> > >
>> > > it crashed on my system (arm64)
>> >
>> > Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
>> > from memblock. arm64 must have memblock freeing happen earlier or something.
>> > (CCing memblock experts)
>> >
>> > > I ran with boot parameter slub_debug=U, and without KASAN.
>> > > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
>> > >
>> > > void * __init memblock_alloc_try_nid(
>> > > phys_addr_t size, phys_addr_t align,
>> > > phys_addr_t min_addr, phys_addr_t max_addr,
>> > > int nid)
>> > > {
>> > > void *ptr;
>> > >
>> > > memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
>> > > __func__, (u64)size, (u64)align, nid, &min_addr,
>> > > &max_addr, (void *)_RET_IP_);
>> > > ptr = memblock_alloc_internal(size, align,
>> > > min_addr, max_addr, nid, false);
>> > > if (ptr)
>> > > memset(ptr, 0, size); <--- Crash Here
>> > >
>> > > return ptr;
>> > > }
>> > >
>> > > It crashed during create_boot_cache() -> stack_depot_init() ->
>> > > memblock_alloc().
>> > >
>> > > I think That's because, in kmem_cache_init(), both slab and memblock is not
>> > > available. (AFAIU memblock is not available after mem_init() because of
>> > > memblock_free_all(), right?)
>> >
>> > Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
>> > But then, I would expect stack_depot_init() to detect that memblock_alloc()
>> > returns NULL, we print ""Stack Depot hash table allocation failed,
>> > disabling" and disable it. Instead it seems memblock_alloc() returns
>> > something that's already potentially used by somebody else? Sounds like a bug?
>>
>>
>> By the way, I fixed this by allowing stack_depot_init() to be called in
>> kmem_cache_init() too [1] and Marco suggested that calling
>> stack_depot_init() depending on slub_debug parameter for simplicity. [2]
>>
>> I would prefer [2], Would you take a look?
>>
>> [1] https://lkml.org/lkml/2022/2/27/31
>>
>> [2] https://lkml.org/lkml/2022/2/28/717
>
> I have the third version :)

While simple, it changes the timing of stack_depot_early_init(), which was
supposed to be called from a single callsite - now it's less predictable and
depends on e.g. kernel parameter ordering. Some arch/config combo could break,
dunno. Setting a variable that stack_depot_early_init() checks should be
more robust.

>
> diff --git a/mm/slub.c b/mm/slub.c
> index a74afe59a403..0c3ab2335b46 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1548,6 +1548,10 @@ static int __init setup_slub_debug(char *str)
> }
> out:
> slub_debug = global_flags;
> +
> + if (slub_flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
> + stack_depot_early_init();
> +
> if (slub_debug != 0 || slub_debug_string)
> static_branch_enable(&slub_debug_enabled);
> else
> @@ -4221,9 +4225,6 @@ static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags)
> s->remote_node_defrag_ratio = 1000;
> #endif
>
> - if (s->flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
> - stack_depot_init();
> -
> /* Initialize the pre-computed randomized freelist if slab is up */
> if (slab_state >= UP) {
> if (init_cache_random_seq(s))
>
>> --
>> Thank you, You are awesome!
>> Hyeonggon :-)
>

2022-03-02 11:14:42

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

On Mon, Feb 28, 2022 at 09:27:02PM +0000, Hyeonggon Yoo wrote:
> On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
> > On 2/26/22 08:19, Hyeonggon Yoo wrote:
> > > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
> > >> Hi,
> > >>
> > >> this series combines and revives patches from Oliver's last year
> > >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
> > >> files alloc_traces and free_traces more useful.
> > >> The resubmission was blocked on stackdepot changes that are now merged,
> > >> as explained in patch 2.
> > >>
> > >
> > > Hello. I just started review/testing this series.
> > >
> > > it crashed on my system (arm64)
> >
> > Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
> > from memblock. arm64 must have memblock freeing happen earlier or something.
> > (CCing memblock experts)
> >
> > > I ran with boot parameter slub_debug=U, and without KASAN.
> > > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
> > >
> > > void * __init memblock_alloc_try_nid(
> > > phys_addr_t size, phys_addr_t align,
> > > phys_addr_t min_addr, phys_addr_t max_addr,
> > > int nid)
> > > {
> > > void *ptr;
> > >
> > > memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
> > > __func__, (u64)size, (u64)align, nid, &min_addr,
> > > &max_addr, (void *)_RET_IP_);
> > > ptr = memblock_alloc_internal(size, align,
> > > min_addr, max_addr, nid, false);
> > > if (ptr)
> > > memset(ptr, 0, size); <--- Crash Here
> > >
> > > return ptr;
> > > }
> > >
> > > It crashed during create_boot_cache() -> stack_depot_init() ->
> > > memblock_alloc().
> > >
> > > I think That's because, in kmem_cache_init(), both slab and memblock is not
> > > available. (AFAIU memblock is not available after mem_init() because of
> > > memblock_free_all(), right?)
> >
> > Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
> > But then, I would expect stack_depot_init() to detect that memblock_alloc()
> > returns NULL, we print ""Stack Depot hash table allocation failed,
> > disabling" and disable it. Instead it seems memblock_alloc() returns
> > something that's already potentially used by somebody else? Sounds like a bug?
>
>
> By the way, I fixed this by allowing stack_depot_init() to be called in
> kmem_cache_init() too [1] and Marco suggested that calling
> stack_depot_init() depending on slub_debug parameter for simplicity. [2]
>
> I would prefer [2], Would you take a look?
>
> [1] https://lkml.org/lkml/2022/2/27/31
>
> [2] https://lkml.org/lkml/2022/2/28/717

I have the third version :)

diff --git a/mm/slub.c b/mm/slub.c
index a74afe59a403..0c3ab2335b46 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1548,6 +1548,10 @@ static int __init setup_slub_debug(char *str)
}
out:
slub_debug = global_flags;
+
+ if (slub_flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
+ stack_depot_early_init();
+
if (slub_debug != 0 || slub_debug_string)
static_branch_enable(&slub_debug_enabled);
else
@@ -4221,9 +4225,6 @@ static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags)
s->remote_node_defrag_ratio = 1000;
#endif

- if (s->flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
- stack_depot_init();
-
/* Initialize the pre-computed randomized freelist if slab is up */
if (slab_state >= UP) {
if (init_cache_random_seq(s))

> --
> Thank you, You are awesome!
> Hyeonggon :-)

--
Sincerely yours,
Mike.

2022-03-02 17:53:20

by Marco Elver

[permalink] [raw]
Subject: Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

On Wed, 2 Mar 2022 at 18:02, Hyeonggon Yoo <[email protected]> wrote:
[...]
> So IMO we have two solutions.
>
> First solution is only allowing early init and avoiding late init.
> (setting a global variable that is visible to stack depot would do this)
>
> And second solution is to make caller allocate and manage its own hash
> table. All of this complexity is because we're trying to make stack_table
> global.

I think this would be a mistake, because then we have to continuously
audit all users of stackdepot and make sure that allocation stack
traces don't end up in duplicate hash tables. It's global for a
reason.

> First solution looks ok if we have few users of stack depot.
> But I think we should use second approach if stack depot is growing
> more and more callers?

The problem here really is just that initialization of stackdepot and
slabs can have a cyclic dependency with the changes you're making. I
very much doubt there'll be other cases (beyond the allocator itself
used by stackdepot) which can introduce such a cyclic dependency.

The easiest way to break the cyclic dependency is to initialize
stackdepot earlier, assuming it can be determined it is required (in
this case it can because the command line is parsed before slab
creation). The suggestion with the stack_depot_needed_early variable
(like Mike's suggested code) would solve all that.

I don't understand the concern about multiple contexts. The problem is
just about a cyclic dependency during early init, and I doubt we'll
have more of that.

Thanks,
-- Marco

2022-03-02 18:38:06

by Hyeonggon Yoo

[permalink] [raw]
Subject: Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

On Wed, Mar 02, 2022 at 02:30:56PM +0200, Mike Rapoport wrote:
> On Wed, Mar 02, 2022 at 10:09:37AM +0100, Vlastimil Babka wrote:
> > On 3/2/22 09:37, Mike Rapoport wrote:
> > > I have the third version :)
> >
> > While simple, it changes the timing of stack_depot_early_init() that was
> > supposed to be at a single callsite - now it's less predictable and depends
> > on e.g. kernel parameter ordering. Some arch/config combo could break,
> > dunno. Setting a variable that stack_depot_early_init() checks should be
> > more robust.
>
> Not sure I follow.
> stack_depot_early_init() is a wrapper for stack_depot_init() which already
> checks
>
> if (!stack_depot_disable && !stack_table)
>
> So largely it can be at multiple call sites just like stack_depot_init...

In my opinion, allowing stack_depot_init() to be called in various contexts is
not a good idea. As another simple example, slub_debug=U,vmap_area can fool the
current code, because then stack_depot_init() gets called in a context where
slab is available but vmalloc is not, and it will try to allocate using
kvmalloc(). Late initialization adds too much complexity.

So IMO we have two solutions.

The first solution is to allow only early init and avoid late init
(setting a global variable that is visible to stack depot would do this).

The second solution is to make each caller allocate and manage its own hash
table. All of this complexity exists because we're trying to make stack_table
global.

The first solution looks OK if we have few users of stack depot.
But shouldn't we use the second approach if stack depot keeps gaining
more and more callers?

> Still, I understand your concern of having multiple call sites for
> stack_depot_early_init().
>
> The most robust way I can think of will be to make stack_depot_early_init()
> a proper function, move memblock_alloc() there and add a variable, say
> stack_depot_needed_early that will be set to 1 if
> CONFIG_STACKDEPOT_ALWAYS_INIT=y or by the callers that need to allocate the
> stack_table before kmalloc is up.
>
> E.g
>
> __init int stack_depot_early_init(void)
> {
>
> if (stack_depot_needed_early && !stack_table) {
> size_t size = (STACK_HASH_SIZE * sizeof(struct stack_record *));
> int i;
>
> pr_info("Stack Depot allocating hash table with memblock_alloc\n");
> stack_table = memblock_alloc(size, SMP_CACHE_BYTES);
>
> if (!stack_table) {
> pr_err("Stack Depot hash table allocation failed, disabling\n");
> stack_depot_disable = true;
> return -ENOMEM;
> }
> }
>
> return 0;
> }
>
> The mutex is not needed here because mm_init() -> stack_depot_early_init()
> happens before SMP and setting stack_table[i] to NULL is redundant with
> memblock_alloc(). (btw, kvmalloc case could use __GFP_ZERO as well).
>
> I'm not sure if the stack depot should be disabled for good if the early
> allocation failed, but that's another story.

--
Thank you, You are awesome!
Hyeonggon :-)

2022-03-02 21:00:28

by Hyeonggon Yoo

[permalink] [raw]
Subject: Re: [PATCH 2/5] mm/slub: use stackdepot to save stack trace in objects

On Wed, Mar 02, 2022 at 05:51:32PM +0100, Vlastimil Babka wrote:
> On 2/27/22 10:44, Hyeonggon Yoo wrote:
> > On Fri, Feb 25, 2022 at 07:03:15PM +0100, Vlastimil Babka wrote:
> >> From: Oliver Glitta <[email protected]>
> >>
> >> Many stack traces are similar so there are many similar arrays.
> >> Stackdepot saves each unique stack only once.
> >>
> >> Replace field addrs in struct track with depot_stack_handle_t handle. Use
> >> stackdepot to save stack trace.
> >>
> >
> > I think it's not a replacement?
>
> It is, for the array 'addrs':
>
> -#ifdef CONFIG_STACKTRACE
> - unsigned long addrs[TRACK_ADDRS_COUNT]; /* Called from address */
> +#ifdef CONFIG_STACKDEPOT
> + depot_stack_handle_t handle;
>
> Not confuse with 'addr' which is the immediate caller and indeed stays
> for redundancy/kernels without stack trace enabled.
>

Oh, my fault. Right. I was confused.
I should read it again.

> >> The benefits are smaller memory overhead and possibility to aggregate
> >> per-cache statistics in the following patch using the stackdepot handle
> >> instead of matching stacks manually.
> >>
> >> [ [email protected]: rebase to 5.17-rc1 and adjust accordingly ]
> >>
> >> This was initially merged as commit 788691464c29 and reverted by commit
> >> ae14c63a9f20 due to several issues, that should now be fixed.
> >> The problem of unconditional memory overhead by stackdepot has been
> >> addressed by commit 2dba5eb1c73b ("lib/stackdepot: allow optional init
> >> and stack_table allocation by kvmalloc()"), so the dependency on
> >> stackdepot will result in extra memory usage only when a slab cache
> >> tracking is actually enabled, and not for all CONFIG_SLUB_DEBUG builds.
> >> The build failures on some architectures were also addressed, and the
> >> reported issue with xfs/433 test did not reproduce on 5.17-rc1 with this
> >> patch.
> >
> > This is just an idea and beyond this patch.
> >
> > After this patch, now we have external storage that records stack traces.
>
> Well, we had it before this patch too.
>
> > It's possible that some rare stack traces are in stack depot, but
> > not reachable because the track is overwritten.
>
> Yes.
>
> > I think it's worth implementing a way to iterate through stacks in stack depot?
>
> The question is for what use case? We might not even know who stored
> them - could have been page_owner, or other stack depot users.

> But the point is usually not to learn about all existing traces, but to
> determine which ones cause an object lifetime bug, or memory leak.

Yeah, this is exactly what I misunderstood.
I thought the purpose of free_traces was to show all existing traces.
But I realized today that a free trace without an alloc trace is not useful.

I'll review v2 with these in mind.
Thank you.

> >>
> >> Signed-off-by: Oliver Glitta <[email protected]>
> >> Signed-off-by: Vlastimil Babka <[email protected]>
> >> Cc: David Rientjes <[email protected]>
> >> Cc: Christoph Lameter <[email protected]>
> >> Cc: Pekka Enberg <[email protected]>
> >> Cc: Joonsoo Kim <[email protected]>
> >
>

--
Thank you, You are awesome!
Hyeonggon :-)

2022-03-02 21:12:17

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH 2/5] mm/slub: use stackdepot to save stack trace in objects

On 2/27/22 10:44, Hyeonggon Yoo wrote:
> On Fri, Feb 25, 2022 at 07:03:15PM +0100, Vlastimil Babka wrote:
>> From: Oliver Glitta <[email protected]>
>>
>> Many stack traces are similar so there are many similar arrays.
>> Stackdepot saves each unique stack only once.
>>
>> Replace field addrs in struct track with depot_stack_handle_t handle. Use
>> stackdepot to save stack trace.
>>
>
> I think it's not a replacement?

It is, for the array 'addrs':

-#ifdef CONFIG_STACKTRACE
-	unsigned long addrs[TRACK_ADDRS_COUNT];	/* Called from address */
+#ifdef CONFIG_STACKDEPOT
+	depot_stack_handle_t handle;

Not to be confused with 'addr', which is the immediate caller and indeed
stays for redundancy and for kernels without stack traces enabled.
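
For reference, struct track then ends up looking roughly like this (a sketch
based on the hunk above; the comments are mine, the remaining fields are
unchanged from the existing code):

struct track {
	unsigned long addr;	/* immediate caller, kept unconditionally */
#ifdef CONFIG_STACKDEPOT
	depot_stack_handle_t handle;	/* deduplicated full stack trace */
#endif
	int cpu;		/* Was running on cpu */
	int pid;		/* Pid context */
	unsigned long when;	/* When did the operation occur */
};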

>> The benefits are smaller memory overhead and possibility to aggregate
>> per-cache statistics in the following patch using the stackdepot handle
>> instead of matching stacks manually.
>>
>> [ [email protected]: rebase to 5.17-rc1 and adjust accordingly ]
>>
>> This was initially merged as commit 788691464c29 and reverted by commit
>> ae14c63a9f20 due to several issues, that should now be fixed.
>> The problem of unconditional memory overhead by stackdepot has been
>> addressed by commit 2dba5eb1c73b ("lib/stackdepot: allow optional init
>> and stack_table allocation by kvmalloc()"), so the dependency on
>> stackdepot will result in extra memory usage only when a slab cache
>> tracking is actually enabled, and not for all CONFIG_SLUB_DEBUG builds.
>> The build failures on some architectures were also addressed, and the
>> reported issue with xfs/433 test did not reproduce on 5.17-rc1 with this
>> patch.
>
> This is just an idea and beyond this patch.
>
> After this patch, now we have external storage that records stack traces.

Well, we had it before this patch too.

> It's possible that some rare stack traces are in stack depot, but
> not reachable because the track is overwritten.

Yes.

> I think it's worth implementing a way to iterate through stacks in stack depot?

The question is for what use case? We might not even know who stored
them - could have been page_owner, or other stack depot users. But the
point is usually not to learn about all existing traces, but to
determine which ones cause an object lifetime bug, or memory leak.

>>
>> Signed-off-by: Oliver Glitta <[email protected]>
>> Signed-off-by: Vlastimil Babka <[email protected]>
>> Cc: David Rientjes <[email protected]>
>> Cc: Christoph Lameter <[email protected]>
>> Cc: Pekka Enberg <[email protected]>
>> Cc: Joonsoo Kim <[email protected]>
>

2022-03-02 23:42:59

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

On Wed, Mar 02, 2022 at 10:09:37AM +0100, Vlastimil Babka wrote:
> On 3/2/22 09:37, Mike Rapoport wrote:
> > On Mon, Feb 28, 2022 at 09:27:02PM +0000, Hyeonggon Yoo wrote:
> >> On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
> >> > On 2/26/22 08:19, Hyeonggon Yoo wrote:
> >> > > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
> >> > >> Hi,
> >> > >>
> >> > >> this series combines and revives patches from Oliver's last year
> >> > >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
> >> > >> files alloc_traces and free_traces more useful.
> >> > >> The resubmission was blocked on stackdepot changes that are now merged,
> >> > >> as explained in patch 2.
> >> > >>
> >> > >
> >> > > Hello. I just started review/testing this series.
> >> > >
> >> > > It crashed on my system (arm64).
> >> >
> >> > Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
> >> > from memblock. arm64 must have memblock freeing happen earlier or something.
> >> > (CCing memblock experts)
> >> >
> >> > > I ran with boot parameter slub_debug=U, and without KASAN.
> >> > > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
> >> > >
> >> > > void * __init memblock_alloc_try_nid(
> >> > > 			phys_addr_t size, phys_addr_t align,
> >> > > 			phys_addr_t min_addr, phys_addr_t max_addr,
> >> > > 			int nid)
> >> > > {
> >> > > 	void *ptr;
> >> > >
> >> > > 	memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
> >> > > 		     __func__, (u64)size, (u64)align, nid, &min_addr,
> >> > > 		     &max_addr, (void *)_RET_IP_);
> >> > > 	ptr = memblock_alloc_internal(size, align,
> >> > > 				      min_addr, max_addr, nid, false);
> >> > > 	if (ptr)
> >> > > 		memset(ptr, 0, size);    <--- Crash Here
> >> > >
> >> > > 	return ptr;
> >> > > }
> >> > >
> >> > > It crashed during create_boot_cache() -> stack_depot_init() ->
> >> > > memblock_alloc().
> >> > >
> >> > > I think that's because, in kmem_cache_init(), neither slab nor memblock is
> >> > > available. (AFAIU memblock is not available after mem_init() because of
> >> > > memblock_free_all(), right?)
> >> >
> >> > Hm yes, I see, even the x86_64 version of mem_init() calls memblock_free_all().
> >> > But then I would expect stack_depot_init() to detect that memblock_alloc()
> >> > returns NULL, print "Stack Depot hash table allocation failed, disabling"
> >> > and disable it. Instead it seems memblock_alloc() returns something that's
> >> > potentially already used by somebody else? Sounds like a bug?
> >>
> >>
> >> By the way, I fixed this by allowing stack_depot_init() to be called in
> >> kmem_cache_init() too [1], and Marco suggested calling stack_depot_init()
> >> depending on the slub_debug parameter, for simplicity. [2]
> >>
> >> I would prefer [2]. Would you take a look?
> >>
> >> [1] https://lkml.org/lkml/2022/2/27/31
> >>
> >> [2] https://lkml.org/lkml/2022/2/28/717
> >
> > I have the third version :)
>
> While simple, it changes the timing of stack_depot_early_init(), which was
> supposed to be called from a single callsite - now it's less predictable and
> depends on e.g. kernel parameter ordering. Some arch/config combo could
> break, who knows. Setting a variable that stack_depot_early_init() checks
> should be more robust.

Not sure I follow.
stack_depot_early_init() is a wrapper for stack_depot_init(), which already
checks:

	if (!stack_depot_disable && !stack_table)

so largely it can be called from multiple call sites, just like
stack_depot_init()...

Still, I understand your concern about having multiple call sites for
stack_depot_early_init().

The most robust way I can think of would be to make stack_depot_early_init()
a proper function, move memblock_alloc() there, and add a variable, say
stack_depot_needed_early, that would be set to 1 if
CONFIG_STACKDEPOT_ALWAYS_INIT=y or by the callers that need to allocate the
stack_table before kmalloc is up.

E.g.:

__init int stack_depot_early_init(void)
{
	if (stack_depot_needed_early && !stack_table) {
		size_t size = (STACK_HASH_SIZE * sizeof(struct stack_record *));

		pr_info("Stack Depot allocating hash table with memblock_alloc\n");
		/* memblock_alloc() returns zeroed memory, no explicit NULLing needed */
		stack_table = memblock_alloc(size, SMP_CACHE_BYTES);

		if (!stack_table) {
			pr_err("Stack Depot hash table allocation failed, disabling\n");
			stack_depot_disable = true;
			return -ENOMEM;
		}
	}

	return 0;
}

The mutex is not needed here because mm_init() -> stack_depot_early_init()
happens before SMP, and explicitly setting stack_table[i] to NULL would be
redundant since memblock_alloc() returns zeroed memory. (btw, the kvmalloc
case could use __GFP_ZERO as well.)
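
To make that concrete, the slab-available branch of stack_depot_init() could
then be reduced to roughly the following (just a sketch of the idea, not a
tested change):

	if (slab_is_available()) {
		pr_info("Stack Depot allocating hash table with kvmalloc\n");
		/* __GFP_ZERO makes the explicit NULL initialization loop unnecessary */
		stack_table = kvmalloc(size, GFP_KERNEL | __GFP_ZERO);
	}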

I'm not sure if the stack depot should be disabled for good if the early
allocation failed, but that's another story.
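
With such a flag, a caller like setup_slub_debug() would only need to set it
instead of calling into the stack depot allocation directly, e.g. (purely
illustrative, reusing the flag name from the sketch above and the
SLAB_STORE_USER check from the hunk below):

	/* runs while parsing the command line, before mm_init() -> stack_depot_early_init() */
	if (IS_ENABLED(CONFIG_STACKDEPOT) && (slub_flags & SLAB_STORE_USER))
		stack_depot_needed_early = 1;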

> > diff --git a/mm/slub.c b/mm/slub.c
> > index a74afe59a403..0c3ab2335b46 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -1548,6 +1548,10 @@ static int __init setup_slub_debug(char *str)
> >  	}
> >  out:
> >  	slub_debug = global_flags;
> > +
> > +	if (slub_flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
> > +		stack_depot_early_init();
> > +
> >  	if (slub_debug != 0 || slub_debug_string)
> >  		static_branch_enable(&slub_debug_enabled);
> >  	else
> > @@ -4221,9 +4225,6 @@ static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags)
> >  	s->remote_node_defrag_ratio = 1000;
> >  #endif
> >
> > -	if (s->flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
> > -		stack_depot_init();
> > -
> >  	/* Initialize the pre-computed randomized freelist if slab is up */
> >  	if (slab_state >= UP) {
> >  		if (init_cache_random_seq(s))
> >
> >> --
> >> Thank you, You are awesome!
> >> Hyeonggon :-)
> >
>

--
Sincerely yours,
Mike.

2022-03-03 00:13:21

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH 5/5] slab, documentation: add description of debugfs files for SLUB caches

On 2/27/22 04:49, Hyeonggon Yoo wrote:
> I think it's not traces of "currently free objects",
> because the index bits of free objects are set in the obj_map bitmap?

Hm right, thanks.

> It's weird, but it's traces of allocated objects that have been freed at
> least once (or <not available>).
>
> I think we should fix either the code or the doc?

For now I'll fix the doc. It's not entirely clear even to me what the best
use case for the free_traces file is. For alloc_traces it's clearly debugging
memory leaks. Freeing traces are most useful when a bug is detected and they
are dumped to dmesg; the debugfs file might just give a rough idea of where
freeing usually happens.

> Please tell me if I'm missing something :)
>
>> + Information in the output:
>> + Number of objects, freeing function, minimal/average/maximal jiffies since free,
>> + pid range of the freeing processes, cpu mask of freeing cpus, and stack trace.
>> +
>> + Example:::
>> +
>> + 51 acpi_ut_update_ref_count+0x6a6/0x782 age=236886/237027/237772 pid=1 cpus=1
>> +	kfree+0x2db/0x420
>> +	acpi_ut_update_ref_count+0x6a6/0x782
>> +	acpi_ut_update_object_reference+0x1ad/0x234
>> +	acpi_ut_remove_reference+0x7d/0x84
>> +	acpi_rs_get_prt_method_data+0x97/0xd6
>> +	acpi_get_irq_routing_table+0x82/0xc4
>> +	acpi_pci_irq_find_prt_entry+0x8e/0x2e0
>> +	acpi_pci_irq_lookup+0x3a/0x1e0
>> +	acpi_pci_irq_enable+0x77/0x240
>> +	pcibios_enable_device+0x39/0x40
>> +	do_pci_enable_device.part.0+0x5d/0xe0
>> +	pci_enable_device_flags+0xfc/0x120
>> +	pci_enable_device+0x13/0x20
>> +	virtio_pci_probe+0x9e/0x170
>> +	local_pci_probe+0x48/0x80
>> +	pci_device_probe+0x105/0x1c0
>> +
>
> Everything else looks nice!
>
>> Christoph Lameter, May 30, 2007
>> Sergey Senozhatsky, October 23, 2015
>> --
>> 2.35.1
>>
>>
>

2022-03-04 19:11:40

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

On 2/26/22 13:18, Hyeonggon Yoo wrote:
> On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
>> Hi,
>>
>> this series combines and revives patches from Oliver's last year
>> bachelor thesis (where I was the advisor) that make SLUB's debugfs
>> files alloc_traces and free_traces more useful.
>> The resubmission was blocked on stackdepot changes that are now merged,
>> as explained in patch 2.
>>
>> Patch 1 is a new preparatory cleanup.
>>
>> Patch 2 originally submitted here [1], was merged to mainline but
>> reverted for stackdepot related issues as explained in the patch.
>>
>> Patches 3-5 originally submitted as RFC here [2]. In this submission I
>> have omitted the new file 'all_objects' (patch 3/3 in [2]) as it might
>> be considered too intrusive so I will postpone it for later. The docs
>> patch is adjusted accordingly.
>>
>
> This problem is not caused by this patch series.
> But I think it's worth mentioning...
>
> It's really weird that some stack traces are not recorded
> when CONFIG_KASAN=y.
>
> I made sure that:
> - Stack Depot did not reach its limit
> - the free path happens with CONFIG_KASAN=y too.
>
> I have no clue why this happens.
>
> # cat dentry/free_traces (CONFIG_KASAN=y)
> 6585 <not-available> age=4294912647 pid=0 cpus=0

I think it's some kind of KASAN quarantining of freed objects, so they
haven't been properly freed through the SLUB layer yet.

> # cat dentry/free_traces (CONFIG_KASAN=n)
> 1246 <not-available> age=4294906877 pid=0 cpus=0
> 379 __d_free+0x20/0x2c age=33/14225/14353 pid=0-122 cpus=0-3
> 	kmem_cache_free+0x1f4/0x21c
> 	__d_free+0x20/0x2c
> 	rcu_core+0x334/0x580
> 	rcu_core_si+0x14/0x20
> 	__do_softirq+0x12c/0x2a8
>
> 2 dentry_free+0x58/0xb0 age=14101/14101/14101 pid=158 cpus=0
> 	kmem_cache_free+0x1f4/0x21c
> 	dentry_free+0x58/0xb0
> 	__dentry_kill+0x18c/0x1d0
> 	dput+0x1c4/0x2fc
> 	__fput+0xb0/0x230
> 	____fput+0x14/0x20
> 	task_work_run+0x84/0x17c
> 	do_notify_resume+0x208/0x1330
> 	el0_svc+0x6c/0x80
> 	el0t_64_sync_handler+0xa8/0x130
> 	el0t_64_sync+0x1a0/0x1a4
>
> 1 dentry_free+0x58/0xb0 age=7678 pid=190 cpus=1
> 	kmem_cache_free+0x1f4/0x21c
> 	dentry_free+0x58/0xb0
> 	__dentry_kill+0x18c/0x1d0
> 	dput+0x1c4/0x2fc
> 	__fput+0xb0/0x230
> 	____fput+0x14/0x20
> 	task_work_run+0x84/0x17c
> 	do_exit+0x2dc/0x8e0
> 	do_group_exit+0x38/0xa4
> 	__wake_up_parent+0x0/0x34
> 	invoke_syscall+0x48/0x114
> 	el0_svc_common.constprop.0+0x44/0xfc
> 	do_el0_svc+0x2c/0x94
> 	el0_svc+0x28/0x80
> 	el0t_64_sync_handler+0xa8/0x130
> 	el0t_64_sync+0x1a0/0x1a4