There are 50+ different shrinkers in the kernel, many with their own bells and
whistles. Under memory pressure the kernel applies some pressure to each of
them in the order in which they were registered in the system. Some of them
may contain only a few objects, some can be quite large. Some can be
effective at reclaiming memory, some not.
The only existing debugging mechanism is a couple of tracepoints in
do_shrink_slab(): mm_shrink_slab_start and mm_shrink_slab_end. They don't
cover everything, though: shrinkers which report 0 objects never show up, and
there is no support for memcg-aware shrinkers. Shrinkers are identified by their
scan function, which is not always enough (e.g. it is hard to guess which super
block's shrinker it is from "super_cache_scan" alone).
To provide better visibility and debugging options for memory shrinkers,
this patchset introduces a /sys/kernel/debug/shrinker interface, to some extent
similar to /sys/kernel/slab.
For each shrinker registered in the system a directory is created.
For now, the directory contains a "count" file, which reports the number of
managed objects for each memory cgroup (for memcg-aware shrinkers) and each
NUMA node (for NUMA-aware shrinkers on a NUMA machine); a separate patch in
the series adds the "scan" interface. Other interfaces might be added in the
future.
To make debugging more pleasant, the patchset also names all shrinkers,
so that debugfs entries can have meaningful names.
v3:
1) separated the "scan" part into a separate patch, by Dave
2) merged *_memcg, *_node and *_memcg_node interfaces, by Dave
3) shrinkers naming enhancements, by Christophe and Dave
4) added signal_pending() check, by Hillf
5) enabled by default, by Dave
v2:
1) switched to debugfs, suggested by Mike, Andrew, Greg and others
2) switched to seq_file API for output, no PAGE_SIZE limit anymore, by Andrew
3) switched to down_read_killable(), suggested by Hillf
4) dropped stateful filtering and "freed" returning, by Kent
5) added docs, by Andrew
6) added memcg_shrinker.py tool
rfc:
https://lwn.net/Articles/891542/
Roman Gushchin (6):
mm: memcontrol: introduce mem_cgroup_ino() and
mem_cgroup_get_from_ino()
mm: shrinkers: introduce debugfs interface for memory shrinkers
mm: shrinkers: provide shrinkers with names
mm: docs: document shrinker debugfs
tools: add memcg_shrinker.py
mm: shrinkers: add scan interface for shrinker debugfs
Documentation/admin-guide/mm/index.rst | 1 +
.../admin-guide/mm/shrinker_debugfs.rst | 131 ++++++++
arch/x86/kvm/mmu/mmu.c | 2 +-
drivers/android/binder_alloc.c | 2 +-
drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 3 +-
drivers/gpu/drm/msm/msm_gem_shrinker.c | 2 +-
.../gpu/drm/panfrost/panfrost_gem_shrinker.c | 2 +-
drivers/gpu/drm/ttm/ttm_pool.c | 2 +-
drivers/md/bcache/btree.c | 2 +-
drivers/md/dm-bufio.c | 2 +-
drivers/md/dm-zoned-metadata.c | 2 +-
drivers/md/raid5.c | 2 +-
drivers/misc/vmw_balloon.c | 2 +-
drivers/virtio/virtio_balloon.c | 2 +-
drivers/xen/xenbus/xenbus_probe_backend.c | 2 +-
fs/btrfs/super.c | 2 +
fs/erofs/utils.c | 2 +-
fs/ext4/extents_status.c | 3 +-
fs/f2fs/super.c | 2 +-
fs/gfs2/glock.c | 2 +-
fs/gfs2/main.c | 2 +-
fs/jbd2/journal.c | 2 +-
fs/mbcache.c | 2 +-
fs/nfs/nfs42xattr.c | 7 +-
fs/nfs/super.c | 2 +-
fs/nfsd/filecache.c | 2 +-
fs/nfsd/nfscache.c | 2 +-
fs/quota/dquot.c | 2 +-
fs/super.c | 6 +-
fs/ubifs/super.c | 2 +-
fs/xfs/xfs_buf.c | 2 +-
fs/xfs/xfs_icache.c | 2 +-
fs/xfs/xfs_qm.c | 2 +-
include/linux/memcontrol.h | 21 ++
include/linux/shrinker.h | 31 +-
kernel/rcu/tree.c | 2 +-
lib/Kconfig.debug | 9 +
mm/Makefile | 1 +
mm/huge_memory.c | 4 +-
mm/memcontrol.c | 23 ++
mm/shrinker_debug.c | 285 ++++++++++++++++++
mm/vmscan.c | 64 +++-
mm/workingset.c | 2 +-
mm/zsmalloc.c | 2 +-
net/sunrpc/auth.c | 2 +-
tools/cgroup/memcg_shrinker.py | 71 +++++
46 files changed, 675 insertions(+), 47 deletions(-)
create mode 100644 Documentation/admin-guide/mm/shrinker_debugfs.rst
create mode 100644 mm/shrinker_debug.c
create mode 100755 tools/cgroup/memcg_shrinker.py
--
2.35.3
Shrinker debugfs requires a way to represent memory cgroups without
using full paths, both for displaying information and getting input
from a user.
The cgroup inode number is a good fit for this: it is already used by bpf.
This commit adds a couple of helper functions which will be used
to handle memcg-aware shrinkers.
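As a userspace illustration only (not part of this patch): the inode number
of a cgroup is what stat() reports for its directory in cgroupfs, so a tool
can build an inode-to-path map by walking the hierarchy. A minimal sketch,
assuming cgroup v2 is mounted at /sys/fs/cgroup; the helper name
cgroup_inodes() is hypothetical:

```python
import os


def cgroup_inodes(cgroup_root):
    """Map inode number -> path for cgroup_root and every directory below it.

    cgroup_inodes() is a hypothetical helper, not something this series adds.
    """
    inodes = {}
    for root, _subdirs, _files in os.walk(cgroup_root):
        inodes[os.stat(root).st_ino] = root
    return inodes


# Usage (assumption: cgroup v2 is mounted at /sys/fs/cgroup):
#   ino_to_path = cgroup_inodes("/sys/fs/cgroup")
```

Unlike a full path, the inode number is a single integer, which keeps both
the debugfs output and any user input short.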
Signed-off-by: Roman Gushchin <[email protected]>
---
include/linux/memcontrol.h | 21 +++++++++++++++++++++
mm/memcontrol.c | 23 +++++++++++++++++++++++
2 files changed, 44 insertions(+)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index fe580cb96683..a6de9e5c1549 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -831,6 +831,15 @@ static inline unsigned short mem_cgroup_id(struct mem_cgroup *memcg)
}
struct mem_cgroup *mem_cgroup_from_id(unsigned short id);
+#ifdef CONFIG_SHRINKER_DEBUG
+static inline unsigned long mem_cgroup_ino(struct mem_cgroup *memcg)
+{
+ return memcg ? cgroup_ino(memcg->css.cgroup) : 0;
+}
+
+struct mem_cgroup *mem_cgroup_get_from_ino(unsigned long ino);
+#endif
+
static inline struct mem_cgroup *mem_cgroup_from_seq(struct seq_file *m)
{
return mem_cgroup_from_css(seq_css(m));
@@ -1324,6 +1333,18 @@ static inline struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
return NULL;
}
+#ifdef CONFIG_SHRINKER_DEBUG
+static inline unsigned long mem_cgroup_ino(struct mem_cgroup *memcg)
+{
+ return 0;
+}
+
+static inline struct mem_cgroup *mem_cgroup_get_from_ino(unsigned long ino)
+{
+ return NULL;
+}
+#endif
+
static inline struct mem_cgroup *mem_cgroup_from_seq(struct seq_file *m)
{
return NULL;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 04cea4fa362a..e6472728fa66 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5018,6 +5018,29 @@ struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
return idr_find(&mem_cgroup_idr, id);
}
+#ifdef CONFIG_SHRINKER_DEBUG
+struct mem_cgroup *mem_cgroup_get_from_ino(unsigned long ino)
+{
+ struct cgroup *cgrp;
+ struct cgroup_subsys_state *css;
+ struct mem_cgroup *memcg;
+
+ cgrp = cgroup_get_from_id(ino);
+ if (!cgrp)
+ return ERR_PTR(-ENOENT);
+
+ css = cgroup_get_e_css(cgrp, &memory_cgrp_subsys);
+ if (css)
+ memcg = container_of(css, struct mem_cgroup, css);
+ else
+ memcg = ERR_PTR(-ENOENT);
+
+ cgroup_put(cgrp);
+
+ return memcg;
+}
+#endif
+
static int alloc_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
{
struct mem_cgroup_per_node *pn;
--
2.35.3
Add a document describing the shrinker debugfs interface.
Signed-off-by: Roman Gushchin <[email protected]>
---
Documentation/admin-guide/mm/index.rst | 1 +
.../admin-guide/mm/shrinker_debugfs.rst | 100 ++++++++++++++++++
2 files changed, 101 insertions(+)
create mode 100644 Documentation/admin-guide/mm/shrinker_debugfs.rst
diff --git a/Documentation/admin-guide/mm/index.rst b/Documentation/admin-guide/mm/index.rst
index c21b5823f126..1bd11118dfb1 100644
--- a/Documentation/admin-guide/mm/index.rst
+++ b/Documentation/admin-guide/mm/index.rst
@@ -36,6 +36,7 @@ the Linux memory management.
numa_memory_policy
numaperf
pagemap
+ shrinker_debugfs
soft-dirty
swap_numa
transhuge
diff --git a/Documentation/admin-guide/mm/shrinker_debugfs.rst b/Documentation/admin-guide/mm/shrinker_debugfs.rst
new file mode 100644
index 000000000000..6783f8190e63
--- /dev/null
+++ b/Documentation/admin-guide/mm/shrinker_debugfs.rst
@@ -0,0 +1,100 @@
+.. _shrinker_debugfs:
+
+==========================
+Shrinker Debugfs Interface
+==========================
+
+The shrinker debugfs interface provides visibility into the kernel memory
+shrinkers subsystem and allows getting information about individual shrinkers.
+
+For each shrinker registered in the system a directory in **<debugfs>/shrinker/**
+is created. The directory's name is composed of the shrinker's name and a
+unique id: e.g. *kfree_rcu-0* or *sb-xfs:vda1-36*.
+
+Each shrinker directory contains the **count** file, which allows triggering
+the *count_objects()* callback for each memcg and NUMA node (if applicable).
+
+Usage:
+------
+
+1. *List registered shrinkers*
+
+ ::
+
+ $ cd /sys/kernel/debug/shrinker/
+ $ ls
+ dqcache-16 sb-hugetlbfs-17 sb-rootfs-2 sb-tmpfs-50
+ kfree_rcu-0 sb-hugetlbfs-33 sb-securityfs-6 sb-tracefs-13
+ sb-aio-20 sb-iomem-12 sb-selinuxfs-22 sb-xfs:vda1-36
+ sb-anon_inodefs-15 sb-mqueue-21 sb-sockfs-8 sb-zsmalloc-19
+ sb-bdev-3 sb-nsfs-4 sb-sysfs-26 shadow-18
+ sb-bpf-32 sb-pipefs-14 sb-tmpfs-1 thp_deferred_split-10
+ sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-27 thp_zero-9
+ sb-cgroup2-30 sb-proc-39 sb-tmpfs-29 xfs_buf-37
+ sb-configfs-23 sb-proc-41 sb-tmpfs-35 xfs_inodegc-38
+ sb-dax-11 sb-proc-45 sb-tmpfs-40 zspool-34
+ sb-debugfs-7 sb-proc-46 sb-tmpfs-42
+ sb-devpts-28 sb-proc-47 sb-tmpfs-43
+ sb-devtmpfs-5 sb-pstore-31 sb-tmpfs-44
+
+2. *Get information about a specific shrinker*
+
+ ::
+
+ $ cd sb-btrfs\:vda2-24/
+ $ ls
+ count
+
+3. *Count objects*
+
+ Each line in the output has the following format::
+
+ <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1> ...
+ <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1> ...
+ ...
+
+ If there are no objects on any NUMA node, the line is omitted. If there
+ are no objects at all, the output might be empty.
+ ::
+
+ $ cat count
+ 1 224 2
+ 21 98 0
+ 55 818 10
+ 2367 2 0
+ 2401 30 0
+ 225 13 0
+ 599 35 0
+ 939 124 0
+ 1041 3 0
+ 1075 1 0
+ 1109 1 0
+ 1279 60 0
+ 1313 7 0
+ 1347 39 0
+ 1381 3 0
+ 1449 14 0
+ 1483 63 0
+ 1517 53 0
+ 1551 6 0
+ 1585 1 0
+ 1619 6 0
+ 1653 40 0
+ 1687 11 0
+ 1721 8 0
+ 1755 4 0
+ 1789 52 0
+ 1823 888 0
+ 1857 1 0
+ 1925 2 0
+ 1959 32 0
+ 2027 22 0
+ 2061 9 0
+ 2469 799 0
+ 2537 861 0
+ 2639 1 0
+ 2707 70 0
+ 2775 4 0
+ 2877 84 0
+ 293 1 0
+ 735 8 0
--
2.35.3
This commit introduces the /sys/kernel/debug/shrinker debugfs
interface, which makes it possible to observe the state of
individual kernel memory shrinkers.
Because the feature adds some memory overhead (which shouldn't be
large unless there is a huge number of registered shrinkers), it's
guarded by a config option (enabled by default).
This commit adds the "count" interface for each shrinker
registered in the system.
The output is in the following format:
<cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
<cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
...
To reduce the size of the output on machines with many thousands of cgroups,
if the total number of objects on all nodes is 0, the line is omitted.
If the shrinker is not memcg-aware or CONFIG_MEMCG is off, 0 is
printed as cgroup inode id. If the shrinker is not numa-aware, 0's are
printed for all nodes except the first one.
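A consumer of this format might parse it as follows. This is only a sketch,
assuming whitespace-separated decimal fields as in the format shown above;
parse_count_line() and total_objects() are hypothetical helper names, and the
sample lines are taken from the documentation example in this series:

```python
def parse_count_line(line):
    """Split one line of a "count" file into (cgroup inode, per-node counts)."""
    fields = line.split()
    ino = int(fields[0])
    per_node = [int(x) for x in fields[1:]]
    return ino, per_node


def total_objects(lines):
    """Sum the objects over all nodes for each cgroup inode."""
    totals = {}
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        ino, per_node = parse_count_line(line)
        totals[ino] = totals.get(ino, 0) + sum(per_node)
    return totals


# Sample lines in the "<cgroup inode id> <node 0> <node 1>" format.
sample = ["1 224 2\n", "21 98 0\n", "55 818 10\n"]
print(total_objects(sample))  # prints {1: 226, 21: 98, 55: 828}
```

Since lines whose total is 0 are omitted, a cgroup that is missing from the
output should be treated as having no objects.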
This commit gives debugfs entries simple numeric names, which are not
very convenient. The following commit in the series will provide
shrinkers with more meaningful names.
Signed-off-by: Roman Gushchin <[email protected]>
---
include/linux/shrinker.h | 19 ++++-
lib/Kconfig.debug | 9 +++
mm/Makefile | 1 +
mm/shrinker_debug.c | 171 +++++++++++++++++++++++++++++++++++++++
mm/vmscan.c | 6 +-
5 files changed, 203 insertions(+), 3 deletions(-)
create mode 100644 mm/shrinker_debug.c
diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 76fbf92b04d9..2ced8149c513 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -72,6 +72,10 @@ struct shrinker {
#ifdef CONFIG_MEMCG
/* ID in shrinker_idr */
int id;
+#endif
+#ifdef CONFIG_SHRINKER_DEBUG
+ int debugfs_id;
+ struct dentry *debugfs_entry;
#endif
/* objs pending delete, per node */
atomic_long_t *nr_deferred;
@@ -94,4 +98,17 @@ extern int register_shrinker(struct shrinker *shrinker);
extern void unregister_shrinker(struct shrinker *shrinker);
extern void free_prealloced_shrinker(struct shrinker *shrinker);
extern void synchronize_shrinkers(void);
-#endif
+
+#ifdef CONFIG_SHRINKER_DEBUG
+extern int shrinker_debugfs_add(struct shrinker *shrinker);
+extern void shrinker_debugfs_remove(struct shrinker *shrinker);
+#else /* CONFIG_SHRINKER_DEBUG */
+static inline int shrinker_debugfs_add(struct shrinker *shrinker)
+{
+ return 0;
+}
+static inline void shrinker_debugfs_remove(struct shrinker *shrinker)
+{
+}
+#endif /* CONFIG_SHRINKER_DEBUG */
+#endif /* _LINUX_SHRINKER_H */
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 3fd7a2e9eaf1..5fa65a649798 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -733,6 +733,15 @@ config SLUB_STATS
out which slabs are relevant to a particular load.
Try running: slabinfo -DA
+config SHRINKER_DEBUG
+ default y
+ bool "Enable shrinker debugging support"
+ depends on DEBUG_FS
+ help
+ Say Y to enable the shrinker debugfs interface which provides
+ visibility into the kernel memory shrinkers subsystem.
+ Disable it to avoid an extra memory footprint.
+
config HAVE_DEBUG_KMEMLEAK
bool
diff --git a/mm/Makefile b/mm/Makefile
index 298c9991ab75..8083fa85a348 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -133,3 +133,4 @@ obj-$(CONFIG_PAGE_REPORTING) += page_reporting.o
obj-$(CONFIG_IO_MAPPING) += io-mapping.o
obj-$(CONFIG_HAVE_BOOTMEM_INFO_NODE) += bootmem_info.o
obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
+obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
new file mode 100644
index 000000000000..fd1f805a581a
--- /dev/null
+++ b/mm/shrinker_debug.c
@@ -0,0 +1,171 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/idr.h>
+#include <linux/slab.h>
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
+#include <linux/shrinker.h>
+#include <linux/memcontrol.h>
+
+/* defined in vmscan.c */
+extern struct rw_semaphore shrinker_rwsem;
+extern struct list_head shrinker_list;
+
+static DEFINE_IDA(shrinker_debugfs_ida);
+static struct dentry *shrinker_debugfs_root;
+
+static unsigned long shrinker_count_objects(struct shrinker *shrinker,
+ struct mem_cgroup *memcg,
+ unsigned long *count_per_node)
+{
+ unsigned long nr, total = 0;
+ int nid;
+
+ for_each_node(nid) {
+ if (nid == 0 || (shrinker->flags & SHRINKER_NUMA_AWARE)) {
+ struct shrink_control sc = {
+ .gfp_mask = GFP_KERNEL,
+ .nid = nid,
+ .memcg = memcg,
+ };
+
+ nr = shrinker->count_objects(shrinker, &sc);
+ if (nr == SHRINK_EMPTY)
+ nr = 0;
+ } else {
+ nr = 0;
+ }
+
+ count_per_node[nid] = nr;
+ total += nr;
+ }
+
+ return total;
+}
+
+static int shrinker_debugfs_count_show(struct seq_file *m, void *v)
+{
+ struct shrinker *shrinker = (struct shrinker *)m->private;
+ unsigned long *count_per_node = NULL;
+ struct mem_cgroup *memcg;
+ unsigned long total;
+ bool memcg_aware;
+ int ret, nid;
+
+ count_per_node = kcalloc(nr_node_ids, sizeof(unsigned long), GFP_KERNEL);
+ if (!count_per_node)
+ return -ENOMEM;
+
+ ret = down_read_killable(&shrinker_rwsem);
+ if (ret) {
+ kfree(count_per_node);
+ return ret;
+ }
+ rcu_read_lock();
+
+ memcg_aware = shrinker->flags & SHRINKER_MEMCG_AWARE;
+
+ memcg = mem_cgroup_iter(NULL, NULL, NULL);
+ do {
+ if (memcg && !mem_cgroup_online(memcg))
+ continue;
+
+ total = shrinker_count_objects(shrinker,
+ memcg_aware ? memcg : NULL,
+ count_per_node);
+ if (total) {
+ seq_printf(m, "%lu", mem_cgroup_ino(memcg));
+ for_each_node(nid)
+ seq_printf(m, " %lu", count_per_node[nid]);
+ seq_puts(m, "\n");
+ }
+
+ if (!memcg_aware) {
+ mem_cgroup_iter_break(NULL, memcg);
+ break;
+ }
+
+ if (signal_pending(current)) {
+ mem_cgroup_iter_break(NULL, memcg);
+ ret = -EINTR;
+ break;
+ }
+
+ cond_resched();
+ } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
+
+ rcu_read_unlock();
+ up_read(&shrinker_rwsem);
+
+ kfree(count_per_node);
+ return ret;
+}
+DEFINE_SHOW_ATTRIBUTE(shrinker_debugfs_count);
+
+int shrinker_debugfs_add(struct shrinker *shrinker)
+{
+ struct dentry *entry;
+ char buf[16];
+ int id;
+
+ lockdep_assert_held(&shrinker_rwsem);
+
+ /* debugfs isn't initialized yet, add debugfs entries later. */
+ if (!shrinker_debugfs_root)
+ return 0;
+
+ id = ida_alloc(&shrinker_debugfs_ida, GFP_KERNEL);
+ if (id < 0)
+ return id;
+ shrinker->debugfs_id = id;
+
+ snprintf(buf, sizeof(buf), "%d", id);
+
+ /* create debugfs entry */
+ entry = debugfs_create_dir(buf, shrinker_debugfs_root);
+ if (IS_ERR(entry)) {
+ ida_free(&shrinker_debugfs_ida, id);
+ return PTR_ERR(entry);
+ }
+ shrinker->debugfs_entry = entry;
+
+ debugfs_create_file("count", 0440, entry, shrinker,
+ &shrinker_debugfs_count_fops);
+ return 0;
+}
+
+void shrinker_debugfs_remove(struct shrinker *shrinker)
+{
+ lockdep_assert_held(&shrinker_rwsem);
+
+ if (!shrinker->debugfs_entry)
+ return;
+
+ debugfs_remove_recursive(shrinker->debugfs_entry);
+ ida_free(&shrinker_debugfs_ida, shrinker->debugfs_id);
+}
+
+static int __init shrinker_debugfs_init(void)
+{
+ struct shrinker *shrinker;
+ int ret;
+
+ if (!debugfs_initialized())
+ return -ENODEV;
+
+ shrinker_debugfs_root = debugfs_create_dir("shrinker", NULL);
+ if (!shrinker_debugfs_root)
+ return -ENOMEM;
+
+ /* Create debugfs entries for shrinkers registered at boot */
+ ret = down_write_killable(&shrinker_rwsem);
+ if (ret)
+ return ret;
+
+ list_for_each_entry(shrinker, &shrinker_list, list)
+ if (!shrinker->debugfs_entry)
+ ret = shrinker_debugfs_add(shrinker);
+ up_write(&shrinker_rwsem);
+
+ return ret;
+}
+late_initcall(shrinker_debugfs_init);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c6918fff06e1..024f7056b98c 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -190,8 +190,8 @@ static void set_task_reclaim_state(struct task_struct *task,
task->reclaim_state = rs;
}
-static LIST_HEAD(shrinker_list);
-static DECLARE_RWSEM(shrinker_rwsem);
+LIST_HEAD(shrinker_list);
+DECLARE_RWSEM(shrinker_rwsem);
#ifdef CONFIG_MEMCG
static int shrinker_nr_max;
@@ -655,6 +655,7 @@ void register_shrinker_prepared(struct shrinker *shrinker)
down_write(&shrinker_rwsem);
list_add_tail(&shrinker->list, &shrinker_list);
shrinker->flags |= SHRINKER_REGISTERED;
+ WARN_ON_ONCE(shrinker_debugfs_add(shrinker));
up_write(&shrinker_rwsem);
}
@@ -682,6 +683,7 @@ void unregister_shrinker(struct shrinker *shrinker)
shrinker->flags &= ~SHRINKER_REGISTERED;
if (shrinker->flags & SHRINKER_MEMCG_AWARE)
unregister_memcg_shrinker(shrinker);
+ shrinker_debugfs_remove(shrinker);
up_write(&shrinker_rwsem);
kfree(shrinker->nr_deferred);
--
2.35.3
Add a simple tool which prints a list of shrinker entries sorted by the
number of objects, in the following format:
(number of objects, shrinker name, cgroup).
Example:
$ ./memcg_shrinker.py -n 10
2090 sb-sysfs-26 /sys/fs/cgroup/system.slice
1809 sb-sysfs-26 /sys/fs/cgroup/system.slice/systemd-udevd.service
1044 sb-btrfs:vda2-24 /sys/fs/cgroup/system.slice/system-dbus\x2d:1.3\...
861 sb-btrfs:vda2-24 /sys/fs/cgroup/system.slice/system-dbus\x2d:1.3\...
804 sb-btrfs:vda2-24 /sys/fs/cgroup/system.slice
643 sb-btrfs:vda2-24 /sys/fs/cgroup/system.slice/firewalld.service
616 sb-cgroup2-30 /sys/fs/cgroup/init.scope
275 sb-sysfs-26 /
238 sb-proc-25 /sys/fs/cgroup/system.slice/systemd-journald.service
225 sb-proc-25 /sys/fs/cgroup/system.slice/abrtd.service
Signed-off-by: Roman Gushchin <[email protected]>
---
tools/cgroup/memcg_shrinker.py | 71 ++++++++++++++++++++++++++++++++++
1 file changed, 71 insertions(+)
create mode 100755 tools/cgroup/memcg_shrinker.py
diff --git a/tools/cgroup/memcg_shrinker.py b/tools/cgroup/memcg_shrinker.py
new file mode 100755
index 000000000000..706ab27666a4
--- /dev/null
+++ b/tools/cgroup/memcg_shrinker.py
@@ -0,0 +1,71 @@
+#!/usr/bin/env python3
+#
+# Copyright (C) 2022 Roman Gushchin <[email protected]>
+# Copyright (C) 2022 Meta
+
+import os
+import argparse
+import sys
+
+
+def scan_cgroups(cgroup_root):
+ cgroups = {}
+
+ for root, subdirs, _ in os.walk(cgroup_root):
+ for cgroup in subdirs:
+ path = os.path.join(root, cgroup)
+ ino = os.stat(path).st_ino
+ cgroups[ino] = path
+
+ # (memcg ino, path)
+ return cgroups
+
+
+def scan_shrinkers(shrinker_debugfs):
+ shrinkers = []
+
+ for root, subdirs, _ in os.walk(shrinker_debugfs):
+ for shrinker in subdirs:
+ count_path = os.path.join(root, shrinker, "count")
+ with open(count_path) as f:
+ for line in f.readlines():
+ items = line.split(' ')
+ ino = int(items[0])
+ # (count, shrinker, memcg ino)
+ shrinkers.append((int(items[1]), shrinker, ino))
+ return shrinkers
+
+
+def main():
+ parser = argparse.ArgumentParser(description='Display biggest shrinkers')
+ parser.add_argument('-n', '--lines', type=int, help='Number of lines to print')
+
+ args = parser.parse_args()
+
+ cgroups = scan_cgroups("/sys/fs/cgroup/")
+ shrinkers = scan_shrinkers("/sys/kernel/debug/shrinker/")
+ shrinkers = sorted(shrinkers, reverse = True, key = lambda x: x[0])
+
+ n = 0
+ for s in shrinkers:
+ count, name, ino = (s[0], s[1], s[2])
+ if count == 0:
+ break
+
+ if ino == 0 or ino == 1:
+ cg = "/"
+ else:
+ try:
+ cg = cgroups[ino]
+ except KeyError:
+ cg = "unknown (%d)" % ino
+
+ print("%-8s %-20s %s" % (count, name, cg))
+
+ n += 1
+ if args.lines and n >= args.lines:
+ break
+
+
+if __name__ == '__main__':
+ main()
--
2.35.3
Currently shrinkers are anonymous objects. For debugging purposes they
can be identified by count/scan function names, but that's not always
useful: e.g. for superblock shrinkers it's nice to have at least
an idea of which superblock the shrinker belongs to.
This commit adds names to shrinkers. The register_shrinker() and
prealloc_shrinker() functions are extended to take a format string and
arguments from which the name is built.
In some cases it's not possible to determine a good name at the time
when a shrinker is allocated. For such cases shrinker_debugfs_rename()
is provided.
After this change the shrinker debugfs directory looks like:
$ cd /sys/kernel/debug/shrinker/
$ ls
dqcache-16 sb-hugetlbfs-17 sb-rootfs-2 sb-tmpfs-50
kfree_rcu-0 sb-hugetlbfs-33 sb-securityfs-6 sb-tracefs-13
sb-aio-20 sb-iomem-12 sb-selinuxfs-22 sb-xfs:vda1-36
sb-anon_inodefs-15 sb-mqueue-21 sb-sockfs-8 sb-zsmalloc-19
sb-bdev-3 sb-nsfs-4 sb-sysfs-26 shadow-18
sb-bpf-32 sb-pipefs-14 sb-tmpfs-1 thp_deferred_split-10
sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-27 thp_zero-9
sb-cgroup2-30 sb-proc-39 sb-tmpfs-29 xfs_buf-37
sb-configfs-23 sb-proc-41 sb-tmpfs-35 xfs_inodegc-38
sb-dax-11 sb-proc-45 sb-tmpfs-40 zspool-34
sb-debugfs-7 sb-proc-46 sb-tmpfs-42
sb-devpts-28 sb-proc-47 sb-tmpfs-43
sb-devtmpfs-5 sb-pstore-31 sb-tmpfs-44
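A userspace tool that needs the name and the id separately could split a
directory name at its last dash. A minimal sketch, assuming (based on the
listing above) that the unique id is always the final dash-separated field;
split_shrinker_dir() is a hypothetical helper:

```python
def split_shrinker_dir(dirname):
    """Split "<name>-<id>" into (name, id); the name itself may contain dashes."""
    name, _, debugfs_id = dirname.rpartition("-")
    return name, int(debugfs_id)


print(split_shrinker_dir("sb-xfs:vda1-36"))  # prints ('sb-xfs:vda1', 36)
print(split_shrinker_dir("kfree_rcu-0"))     # prints ('kfree_rcu', 0)
```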
Signed-off-by: Roman Gushchin <[email protected]>
---
arch/x86/kvm/mmu/mmu.c | 2 +-
drivers/android/binder_alloc.c | 2 +-
drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 3 +-
drivers/gpu/drm/msm/msm_gem_shrinker.c | 2 +-
.../gpu/drm/panfrost/panfrost_gem_shrinker.c | 2 +-
drivers/gpu/drm/ttm/ttm_pool.c | 2 +-
drivers/md/bcache/btree.c | 2 +-
drivers/md/dm-bufio.c | 2 +-
drivers/md/dm-zoned-metadata.c | 2 +-
drivers/md/raid5.c | 2 +-
drivers/misc/vmw_balloon.c | 2 +-
drivers/virtio/virtio_balloon.c | 2 +-
drivers/xen/xenbus/xenbus_probe_backend.c | 2 +-
fs/btrfs/super.c | 2 +
fs/erofs/utils.c | 2 +-
fs/ext4/extents_status.c | 3 +-
fs/f2fs/super.c | 2 +-
fs/gfs2/glock.c | 2 +-
fs/gfs2/main.c | 2 +-
fs/jbd2/journal.c | 2 +-
fs/mbcache.c | 2 +-
fs/nfs/nfs42xattr.c | 7 ++-
fs/nfs/super.c | 2 +-
fs/nfsd/filecache.c | 2 +-
fs/nfsd/nfscache.c | 2 +-
fs/quota/dquot.c | 2 +-
fs/super.c | 6 +-
fs/ubifs/super.c | 2 +-
fs/xfs/xfs_buf.c | 2 +-
fs/xfs/xfs_icache.c | 2 +-
fs/xfs/xfs_qm.c | 2 +-
include/linux/shrinker.h | 12 +++-
kernel/rcu/tree.c | 2 +-
mm/huge_memory.c | 4 +-
mm/shrinker_debug.c | 45 +++++++++++++-
mm/vmscan.c | 58 ++++++++++++++++++-
mm/workingset.c | 2 +-
mm/zsmalloc.c | 2 +-
net/sunrpc/auth.c | 2 +-
39 files changed, 154 insertions(+), 46 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index c623019929a7..8cfabdd63406 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6283,7 +6283,7 @@ int kvm_mmu_vendor_module_init(void)
if (percpu_counter_init(&kvm_total_used_mmu_pages, 0, GFP_KERNEL))
goto out;
- ret = register_shrinker(&mmu_shrinker);
+ ret = register_shrinker(&mmu_shrinker, "mmu");
if (ret)
goto out;
diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 2ac1008a5f39..951343c41ba8 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -1084,7 +1084,7 @@ int binder_alloc_shrinker_init(void)
int ret = list_lru_init(&binder_alloc_lru);
if (ret == 0) {
- ret = register_shrinker(&binder_shrinker);
+ ret = register_shrinker(&binder_shrinker, "binder");
if (ret)
list_lru_destroy(&binder_alloc_lru);
}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index 6a6ff98a8746..85524ef92ea4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -426,7 +426,8 @@ void i915_gem_driver_register__shrinker(struct drm_i915_private *i915)
i915->mm.shrinker.count_objects = i915_gem_shrinker_count;
i915->mm.shrinker.seeks = DEFAULT_SEEKS;
i915->mm.shrinker.batch = 4096;
- drm_WARN_ON(&i915->drm, register_shrinker(&i915->mm.shrinker));
+ drm_WARN_ON(&i915->drm, register_shrinker(&i915->mm.shrinker,
+ "drm_i915_gem"));
i915->mm.oom_notifier.notifier_call = i915_gem_shrinker_oom;
drm_WARN_ON(&i915->drm, register_oom_notifier(&i915->mm.oom_notifier));
diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c
index 086dacf2f26a..2d3cf4f13dfd 100644
--- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
+++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
@@ -221,7 +221,7 @@ void msm_gem_shrinker_init(struct drm_device *dev)
priv->shrinker.count_objects = msm_gem_shrinker_count;
priv->shrinker.scan_objects = msm_gem_shrinker_scan;
priv->shrinker.seeks = DEFAULT_SEEKS;
- WARN_ON(register_shrinker(&priv->shrinker));
+ WARN_ON(register_shrinker(&priv->shrinker, "drm_msm_gem"));
priv->vmap_notifier.notifier_call = msm_gem_shrinker_vmap;
WARN_ON(register_vmap_purge_notifier(&priv->vmap_notifier));
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
index 77e7cb6d1ae3..0d028266ee9e 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
@@ -103,7 +103,7 @@ void panfrost_gem_shrinker_init(struct drm_device *dev)
pfdev->shrinker.count_objects = panfrost_gem_shrinker_count;
pfdev->shrinker.scan_objects = panfrost_gem_shrinker_scan;
pfdev->shrinker.seeks = DEFAULT_SEEKS;
- WARN_ON(register_shrinker(&pfdev->shrinker));
+ WARN_ON(register_shrinker(&pfdev->shrinker, "drm_panfrost"));
}
/**
diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index 1bba0a0ed3f9..b8b41d242197 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -722,7 +722,7 @@ int ttm_pool_mgr_init(unsigned long num_pages)
mm_shrinker.count_objects = ttm_pool_shrinker_count;
mm_shrinker.scan_objects = ttm_pool_shrinker_scan;
mm_shrinker.seeks = 1;
- return register_shrinker(&mm_shrinker);
+ return register_shrinker(&mm_shrinker, "drm_ttm_pool");
}
/**
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index ad9f16689419..c1f734ab86b3 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -812,7 +812,7 @@ int bch_btree_cache_alloc(struct cache_set *c)
c->shrink.seeks = 4;
c->shrink.batch = c->btree_pages * 2;
- if (register_shrinker(&c->shrink))
+ if (register_shrinker(&c->shrink, "btree"))
pr_warn("bcache: %s: could not register shrinker\n",
__func__);
diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 5ffa1dcf84cf..bcc95898c341 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1806,7 +1806,7 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign
c->shrinker.scan_objects = dm_bufio_shrink_scan;
c->shrinker.seeks = 1;
c->shrinker.batch = 0;
- r = register_shrinker(&c->shrinker);
+ r = register_shrinker(&c->shrinker, "dm_bufio");
if (r)
goto bad;
diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index d1ea66114d14..05f2fd12066b 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -2944,7 +2944,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev,
zmd->mblk_shrinker.seeks = DEFAULT_SEEKS;
/* Metadata cache shrinker */
- ret = register_shrinker(&zmd->mblk_shrinker);
+ ret = register_shrinker(&zmd->mblk_shrinker, "md_meta");
if (ret) {
dmz_zmd_err(zmd, "Register metadata cache shrinker failed");
goto err;
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 59f91e392a2a..34ddebd3aff7 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7383,7 +7383,7 @@ static struct r5conf *setup_conf(struct mddev *mddev)
conf->shrinker.count_objects = raid5_cache_count;
conf->shrinker.batch = 128;
conf->shrinker.flags = 0;
- if (register_shrinker(&conf->shrinker)) {
+ if (register_shrinker(&conf->shrinker, "md-%s", mdname(mddev))) {
pr_warn("md/raid:%s: couldn't register shrinker.\n",
mdname(mddev));
goto abort;
diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index f1d8ba6d4857..6c9ddf1187dd 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -1587,7 +1587,7 @@ static int vmballoon_register_shrinker(struct vmballoon *b)
b->shrinker.count_objects = vmballoon_shrinker_count;
b->shrinker.seeks = DEFAULT_SEEKS;
- r = register_shrinker(&b->shrinker);
+ r = register_shrinker(&b->shrinker, "vmw_balloon");
if (r == 0)
b->shrinker_registered = true;
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index f4c34a2a6b8e..093e06e19d0e 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -875,7 +875,7 @@ static int virtio_balloon_register_shrinker(struct virtio_balloon *vb)
vb->shrinker.count_objects = virtio_balloon_shrinker_count;
vb->shrinker.seeks = DEFAULT_SEEKS;
- return register_shrinker(&vb->shrinker);
+ return register_shrinker(&vb->shrinker, "virtio_balloon");
}
static int virtballoon_probe(struct virtio_device *vdev)
diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c
index 5abded97e1a7..a6c5e344017d 100644
--- a/drivers/xen/xenbus/xenbus_probe_backend.c
+++ b/drivers/xen/xenbus/xenbus_probe_backend.c
@@ -305,7 +305,7 @@ static int __init xenbus_probe_backend_init(void)
register_xenstore_notifier(&xenstore_notifier);
- if (register_shrinker(&backend_memory_shrinker))
+ if (register_shrinker(&backend_memory_shrinker, "xen_backend"))
pr_warn("shrinker registration failed\n");
return 0;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 206f44005c52..062dbd8071e2 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1790,6 +1790,8 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
error = -EBUSY;
} else {
snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev);
+ shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s", fs_type->name,
+ s->s_id);
btrfs_sb(s)->bdev_holder = fs_type;
if (!strstr(crc32c_impl(), "generic"))
set_bit(BTRFS_FS_CSUM_IMPL_FAST, &fs_info->flags);
diff --git a/fs/erofs/utils.c b/fs/erofs/utils.c
index ec9a1d780dc1..67eb64fadd4f 100644
--- a/fs/erofs/utils.c
+++ b/fs/erofs/utils.c
@@ -282,7 +282,7 @@ static struct shrinker erofs_shrinker_info = {
int __init erofs_init_shrinker(void)
{
- return register_shrinker(&erofs_shrinker_info);
+ return register_shrinker(&erofs_shrinker_info, "erofs");
}
void erofs_exit_shrinker(void)
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 9a3a8996aacf..a7aa79d580e5 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -1650,11 +1650,10 @@ int ext4_es_register_shrinker(struct ext4_sb_info *sbi)
err = percpu_counter_init(&sbi->s_es_stats.es_stats_shk_cnt, 0, GFP_KERNEL);
if (err)
goto err3;
-
sbi->s_es_shrinker.scan_objects = ext4_es_scan;
sbi->s_es_shrinker.count_objects = ext4_es_count;
sbi->s_es_shrinker.seeks = DEFAULT_SEEKS;
- err = register_shrinker(&sbi->s_es_shrinker);
+ err = register_shrinker(&sbi->s_es_shrinker, "ext4_es");
if (err)
goto err4;
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 4368f90571bd..2fc40a1635f3 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -4579,7 +4579,7 @@ static int __init init_f2fs_fs(void)
err = f2fs_init_sysfs();
if (err)
goto free_garbage_collection_cache;
- err = register_shrinker(&f2fs_shrinker_info);
+ err = register_shrinker(&f2fs_shrinker_info, "f2fs");
if (err)
goto free_sysfs;
err = register_filesystem(&f2fs_fs_type);
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 26169cedcefc..791c23d9f7e7 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -2549,7 +2549,7 @@ int __init gfs2_glock_init(void)
return -ENOMEM;
}
- ret = register_shrinker(&glock_shrinker);
+ ret = register_shrinker(&glock_shrinker, "gfs2_glock");
if (ret) {
destroy_workqueue(gfs2_delete_workqueue);
destroy_workqueue(glock_workqueue);
diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
index 28d0eb23e18e..dde981b78488 100644
--- a/fs/gfs2/main.c
+++ b/fs/gfs2/main.c
@@ -150,7 +150,7 @@ static int __init init_gfs2_fs(void)
if (!gfs2_trans_cachep)
goto fail_cachep8;
- error = register_shrinker(&gfs2_qd_shrinker);
+ error = register_shrinker(&gfs2_qd_shrinker, "gfs2_qd");
if (error)
goto fail_shrinker;
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index c0cbeeaec2d1..e7786445ecc1 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1418,7 +1418,7 @@ static journal_t *journal_init_common(struct block_device *bdev,
if (percpu_counter_init(&journal->j_checkpoint_jh_count, 0, GFP_KERNEL))
goto err_cleanup;
- if (register_shrinker(&journal->j_shrinker)) {
+ if (register_shrinker(&journal->j_shrinker, "jbd2_journal")) {
percpu_counter_destroy(&journal->j_checkpoint_jh_count);
goto err_cleanup;
}
diff --git a/fs/mbcache.c b/fs/mbcache.c
index 97c54d3a2227..379dc5b0b6ad 100644
--- a/fs/mbcache.c
+++ b/fs/mbcache.c
@@ -367,7 +367,7 @@ struct mb_cache *mb_cache_create(int bucket_bits)
cache->c_shrink.count_objects = mb_cache_count;
cache->c_shrink.scan_objects = mb_cache_scan;
cache->c_shrink.seeks = DEFAULT_SEEKS;
- if (register_shrinker(&cache->c_shrink)) {
+ if (register_shrinker(&cache->c_shrink, "mb_cache")) {
kfree(cache->c_hash);
kfree(cache);
goto err_out;
diff --git a/fs/nfs/nfs42xattr.c b/fs/nfs/nfs42xattr.c
index e7b34f7e0614..147b8a2f2dc6 100644
--- a/fs/nfs/nfs42xattr.c
+++ b/fs/nfs/nfs42xattr.c
@@ -1017,15 +1017,16 @@ int __init nfs4_xattr_cache_init(void)
if (ret)
goto out2;
- ret = register_shrinker(&nfs4_xattr_cache_shrinker);
+ ret = register_shrinker(&nfs4_xattr_cache_shrinker, "nfs_xattr_cache");
if (ret)
goto out1;
- ret = register_shrinker(&nfs4_xattr_entry_shrinker);
+ ret = register_shrinker(&nfs4_xattr_entry_shrinker, "nfs_xattr_entry");
if (ret)
goto out;
- ret = register_shrinker(&nfs4_xattr_large_entry_shrinker);
+ ret = register_shrinker(&nfs4_xattr_large_entry_shrinker,
+ "nfs_xattr_large_entry");
if (!ret)
return 0;
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 6ab5eeb000dc..c7a2aef911f1 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -149,7 +149,7 @@ int __init register_nfs_fs(void)
ret = nfs_register_sysctl();
if (ret < 0)
goto error_2;
- ret = register_shrinker(&acl_shrinker);
+ ret = register_shrinker(&acl_shrinker, "nfs_acl");
if (ret < 0)
goto error_3;
#ifdef CONFIG_NFS_V4_2
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 2c1b027774d4..9c2879a3c3c0 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -666,7 +666,7 @@ nfsd_file_cache_init(void)
goto out_err;
}
- ret = register_shrinker(&nfsd_file_shrinker);
+ ret = register_shrinker(&nfsd_file_shrinker, "nfsd_filecache");
if (ret) {
pr_err("nfsd: failed to register nfsd_file_shrinker: %d\n", ret);
goto out_lru;
diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
index 0b3f12aa37ff..f1cfb06d0be5 100644
--- a/fs/nfsd/nfscache.c
+++ b/fs/nfsd/nfscache.c
@@ -176,7 +176,7 @@ int nfsd_reply_cache_init(struct nfsd_net *nn)
nn->nfsd_reply_cache_shrinker.scan_objects = nfsd_reply_cache_scan;
nn->nfsd_reply_cache_shrinker.count_objects = nfsd_reply_cache_count;
nn->nfsd_reply_cache_shrinker.seeks = 1;
- status = register_shrinker(&nn->nfsd_reply_cache_shrinker);
+ status = register_shrinker(&nn->nfsd_reply_cache_shrinker, "nfsd_reply");
if (status)
goto out_stats_destroy;
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index a74aef99bd3d..854d2b1d0914 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -2985,7 +2985,7 @@ static int __init dquot_init(void)
pr_info("VFS: Dquot-cache hash table entries: %ld (order %ld,"
" %ld bytes)\n", nr_hash, order, (PAGE_SIZE << order));
- if (register_shrinker(&dqcache_shrinker))
+ if (register_shrinker(&dqcache_shrinker, "dqcache"))
panic("Cannot register dquot shrinker");
return 0;
diff --git a/fs/super.c b/fs/super.c
index 60f57c7bc0a6..4fca6657f442 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -265,7 +265,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
s->s_shrink.count_objects = super_cache_count;
s->s_shrink.batch = 1024;
s->s_shrink.flags = SHRINKER_NUMA_AWARE | SHRINKER_MEMCG_AWARE;
- if (prealloc_shrinker(&s->s_shrink))
+ if (prealloc_shrinker(&s->s_shrink, "sb-%s", type->name))
goto fail;
if (list_lru_init_memcg(&s->s_dentry_lru, &s->s_shrink))
goto fail;
@@ -1288,6 +1288,8 @@ int get_tree_bdev(struct fs_context *fc,
} else {
s->s_mode = mode;
snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev);
+ shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s",
+ fc->fs_type->name, s->s_id);
sb_set_blocksize(s, block_size(bdev));
error = fill_super(s, fc);
if (error) {
@@ -1363,6 +1365,8 @@ struct dentry *mount_bdev(struct file_system_type *fs_type,
} else {
s->s_mode = mode;
snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev);
+ shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s",
+ fs_type->name, s->s_id);
sb_set_blocksize(s, block_size(bdev));
error = fill_super(s, data, flags & SB_SILENT ? 1 : 0);
if (error) {
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index bad67455215f..a3663d201f64 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -2430,7 +2430,7 @@ static int __init ubifs_init(void)
if (!ubifs_inode_slab)
return -ENOMEM;
- err = register_shrinker(&ubifs_shrinker_info);
+ err = register_shrinker(&ubifs_shrinker_info, "ubifs");
if (err)
goto out_slab;
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index e1afb9e503e1..5645e92df0c9 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1986,7 +1986,7 @@ xfs_alloc_buftarg(
btp->bt_shrinker.scan_objects = xfs_buftarg_shrink_scan;
btp->bt_shrinker.seeks = DEFAULT_SEEKS;
btp->bt_shrinker.flags = SHRINKER_NUMA_AWARE;
- if (register_shrinker(&btp->bt_shrinker))
+ if (register_shrinker(&btp->bt_shrinker, "xfs_buf"))
goto error_pcpu;
return btp;
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index bffd6eb0b298..d0c4e74ff763 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -2198,5 +2198,5 @@ xfs_inodegc_register_shrinker(
shrink->flags = SHRINKER_NONSLAB;
shrink->batch = XFS_INODEGC_SHRINKER_BATCH;
- return register_shrinker(shrink);
+ return register_shrinker(shrink, "xfs_inodegc");
}
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index f165d1a3de1d..93ded9e81f49 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -686,7 +686,7 @@ xfs_qm_init_quotainfo(
qinf->qi_shrinker.seeks = DEFAULT_SEEKS;
qinf->qi_shrinker.flags = SHRINKER_NUMA_AWARE;
- error = register_shrinker(&qinf->qi_shrinker);
+ error = register_shrinker(&qinf->qi_shrinker, "xfs_qm");
if (error)
goto out_free_inos;
diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 2ced8149c513..64416f3e0a1f 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -75,6 +75,7 @@ struct shrinker {
#endif
#ifdef CONFIG_SHRINKER_DEBUG
int debugfs_id;
+ const char *name;
struct dentry *debugfs_entry;
#endif
/* objs pending delete, per node */
@@ -92,9 +93,9 @@ struct shrinker {
*/
#define SHRINKER_NONSLAB (1 << 3)
-extern int prealloc_shrinker(struct shrinker *shrinker);
+extern int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...);
extern void register_shrinker_prepared(struct shrinker *shrinker);
-extern int register_shrinker(struct shrinker *shrinker);
+extern int register_shrinker(struct shrinker *shrinker, const char *fmt, ...);
extern void unregister_shrinker(struct shrinker *shrinker);
extern void free_prealloced_shrinker(struct shrinker *shrinker);
extern void synchronize_shrinkers(void);
@@ -102,6 +103,8 @@ extern void synchronize_shrinkers(void);
#ifdef CONFIG_SHRINKER_DEBUG
extern int shrinker_debugfs_add(struct shrinker *shrinker);
extern void shrinker_debugfs_remove(struct shrinker *shrinker);
+extern int shrinker_debugfs_rename(struct shrinker *shrinker,
+ const char *fmt, ...);
#else /* CONFIG_SHRINKER_DEBUG */
static inline int shrinker_debugfs_add(struct shrinker *shrinker)
{
@@ -110,5 +113,10 @@ static inline int shrinker_debugfs_add(struct shrinker *shrinker)
static inline void shrinker_debugfs_remove(struct shrinker *shrinker)
{
}
+static inline int shrinker_debugfs_rename(struct shrinker *shrinker,
+ const char *fmt, ...)
+{
+ return 0;
+}
#endif /* CONFIG_SHRINKER_DEBUG */
#endif /* _LINUX_SHRINKER_H */
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 5c587e00811c..b4c66916bea9 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4978,7 +4978,7 @@ static void __init kfree_rcu_batch_init(void)
INIT_DELAYED_WORK(&krcp->page_cache_work, fill_page_cache_func);
krcp->initialized = true;
}
- if (register_shrinker(&kfree_rcu_shrinker))
+ if (register_shrinker(&kfree_rcu_shrinker, "kfree_rcu"))
pr_err("Failed to register kfree_rcu() shrinker!\n");
}
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index fa6a1623976a..a40df19c0e38 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -423,10 +423,10 @@ static int __init hugepage_init(void)
if (err)
goto err_slab;
- err = register_shrinker(&huge_zero_page_shrinker);
+ err = register_shrinker(&huge_zero_page_shrinker, "thp_zero");
if (err)
goto err_hzp_shrinker;
- err = register_shrinker(&deferred_split_shrinker);
+ err = register_shrinker(&deferred_split_shrinker, "thp_deferred_split");
if (err)
goto err_split_shrinker;
diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
index fd1f805a581a..28b1c1ab60ef 100644
--- a/mm/shrinker_debug.c
+++ b/mm/shrinker_debug.c
@@ -104,7 +104,7 @@ DEFINE_SHOW_ATTRIBUTE(shrinker_debugfs_count);
int shrinker_debugfs_add(struct shrinker *shrinker)
{
struct dentry *entry;
- char buf[16];
+ char buf[128];
int id;
lockdep_assert_held(&shrinker_rwsem);
@@ -118,7 +118,7 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
return id;
shrinker->debugfs_id = id;
- snprintf(buf, sizeof(buf), "%d", id);
+ snprintf(buf, sizeof(buf), "%s-%d", shrinker->name, id);
/* create debugfs entry */
entry = debugfs_create_dir(buf, shrinker_debugfs_root);
@@ -133,10 +133,51 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
return 0;
}
+int shrinker_debugfs_rename(struct shrinker *shrinker, const char *fmt, ...)
+{
+ struct dentry *entry;
+ char buf[128];
+ const char *old;
+ va_list ap;
+ int ret = 0;
+
+ down_write(&shrinker_rwsem);
+
+ old = shrinker->name;
+
+ va_start(ap, fmt);
+ shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
+ va_end(ap);
+ if (!shrinker->name) {
+ shrinker->name = old;
+ ret = -ENOMEM;
+ } else if (shrinker->debugfs_entry) {
+ snprintf(buf, sizeof(buf), "%s-%d", shrinker->name,
+ shrinker->debugfs_id);
+
+ entry = debugfs_rename(shrinker_debugfs_root,
+ shrinker->debugfs_entry,
+ shrinker_debugfs_root, buf);
+ if (IS_ERR(entry))
+ ret = PTR_ERR(entry);
+ else
+ shrinker->debugfs_entry = entry;
+
+ kfree_const(old);
+ }
+
+ up_write(&shrinker_rwsem);
+
+ return ret;
+}
+EXPORT_SYMBOL(shrinker_debugfs_rename);
+
void shrinker_debugfs_remove(struct shrinker *shrinker)
{
lockdep_assert_held(&shrinker_rwsem);
+ kfree_const(shrinker->name);
+
if (!shrinker->debugfs_entry)
return;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 024f7056b98c..42bae0fd0442 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -613,7 +613,7 @@ static unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru,
/*
* Add a shrinker callback to be called from the vm.
*/
-int prealloc_shrinker(struct shrinker *shrinker)
+static int __prealloc_shrinker(struct shrinker *shrinker)
{
unsigned int size;
int err;
@@ -637,8 +637,36 @@ int prealloc_shrinker(struct shrinker *shrinker)
return 0;
}
+#ifdef CONFIG_SHRINKER_DEBUG
+int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...)
+{
+ va_list ap;
+ int err;
+
+ va_start(ap, fmt);
+ shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
+ va_end(ap);
+ if (!shrinker->name)
+ return -ENOMEM;
+
+ err = __prealloc_shrinker(shrinker);
+ if (err)
+ kfree_const(shrinker->name);
+
+ return err;
+}
+#else
+int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...)
+{
+ return __prealloc_shrinker(shrinker);
+}
+#endif
+
void free_prealloced_shrinker(struct shrinker *shrinker)
{
+#ifdef CONFIG_SHRINKER_DEBUG
+ kfree_const(shrinker->name);
+#endif
if (shrinker->flags & SHRINKER_MEMCG_AWARE) {
down_write(&shrinker_rwsem);
unregister_memcg_shrinker(shrinker);
@@ -659,15 +687,39 @@ void register_shrinker_prepared(struct shrinker *shrinker)
up_write(&shrinker_rwsem);
}
-int register_shrinker(struct shrinker *shrinker)
+static int __register_shrinker(struct shrinker *shrinker)
{
- int err = prealloc_shrinker(shrinker);
+ int err = __prealloc_shrinker(shrinker);
if (err)
return err;
register_shrinker_prepared(shrinker);
return 0;
}
+
+#ifdef CONFIG_SHRINKER_DEBUG
+int register_shrinker(struct shrinker *shrinker, const char *fmt, ...)
+{
+ va_list ap;
+ int err;
+
+ va_start(ap, fmt);
+ shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
+ va_end(ap);
+ if (!shrinker->name)
+ return -ENOMEM;
+
+ err = __register_shrinker(shrinker);
+ if (err)
+ kfree_const(shrinker->name);
+ return err;
+}
+#else
+int register_shrinker(struct shrinker *shrinker, const char *fmt, ...)
+{
+ return __register_shrinker(shrinker);
+}
+#endif
EXPORT_SYMBOL(register_shrinker);
/*
diff --git a/mm/workingset.c b/mm/workingset.c
index 592569a8974c..840986179cf3 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -625,7 +625,7 @@ static int __init workingset_init(void)
pr_info("workingset: timestamp_bits=%d max_order=%d bucket_order=%u\n",
timestamp_bits, max_order, bucket_order);
- ret = prealloc_shrinker(&workingset_shadow_shrinker);
+ ret = prealloc_shrinker(&workingset_shadow_shrinker, "shadow");
if (ret)
goto err;
ret = __list_lru_init(&shadow_nodes, true, &shadow_nodes_key,
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 9152fbde33b5..a19de176f604 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -2188,7 +2188,7 @@ static int zs_register_shrinker(struct zs_pool *pool)
pool->shrinker.batch = 0;
pool->shrinker.seeks = DEFAULT_SEEKS;
- return register_shrinker(&pool->shrinker);
+ return register_shrinker(&pool->shrinker, "zspool");
}
/**
diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c
index 682fcd24bf43..a29742a9c3f1 100644
--- a/net/sunrpc/auth.c
+++ b/net/sunrpc/auth.c
@@ -874,7 +874,7 @@ int __init rpcauth_init_module(void)
err = rpc_init_authunix();
if (err < 0)
goto out1;
- err = register_shrinker(&rpc_cred_shrinker);
+ err = register_shrinker(&rpc_cred_shrinker, "rpc_cred");
if (err < 0)
goto out2;
return 0;
--
2.35.3
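The naming scheme from the patch above can be illustrated with a small userspace sketch. It only mirrors the two format strings visible in the diff ("%s-%d" in shrinker_debugfs_add() and "sb-%s:%s" in the superblock rename calls); the helper names `shrinker_dir_name` and `sb_shrinker_name` are hypothetical and do not exist in the kernel.

```c
#include <stdio.h>

/*
 * Hypothetical userspace helper: builds a debugfs directory name the same
 * way the patch does, i.e. "<shrinker name>-<debugfs id>".
 * Returns what snprintf() returns; a value >= len means truncation.
 */
static int shrinker_dir_name(char *buf, size_t len, const char *name, int id)
{
	return snprintf(buf, len, "%s-%d", name, id);
}

/*
 * Hypothetical userspace helper: builds the superblock shrinker name used
 * after the block device is known, i.e. "sb-<fs type>:<s_id>".
 */
static int sb_shrinker_name(char *buf, size_t len, const char *fs,
			    const char *s_id)
{
	return snprintf(buf, len, "sb-%s:%s", fs, s_id);
}
```

Feeding "xfs" and "vda1" through sb_shrinker_name() and then id 36 through shrinker_dir_name() gives the same shape as the *sb-xfs:vda1-36* example in the documentation. The patch formats into a 128-byte stack buffer, so a long enough name would be silently truncated by snprintf().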
Add a scan interface that triggers scanning of a particular shrinker
for a specified memcg and numa node. It's useful for testing,
debugging and profiling a specific scan_objects() callback.
Unlike the alternatives (creating real memory pressure or dropping
caches via /proc/sys/vm/drop_caches), this interface interacts
with only one shrinker at a time. Also, if a shrinker misreports
the number of objects (as some do), scanning is not affected.
Signed-off-by: Roman Gushchin <[email protected]>
---
.../admin-guide/mm/shrinker_debugfs.rst | 39 +++++++++-
mm/shrinker_debug.c | 73 +++++++++++++++++++
2 files changed, 108 insertions(+), 4 deletions(-)
diff --git a/Documentation/admin-guide/mm/shrinker_debugfs.rst b/Documentation/admin-guide/mm/shrinker_debugfs.rst
index 6783f8190e63..8fecf81d60ee 100644
--- a/Documentation/admin-guide/mm/shrinker_debugfs.rst
+++ b/Documentation/admin-guide/mm/shrinker_debugfs.rst
@@ -5,14 +5,16 @@ Shrinker Debugfs Interface
==========================
Shrinker debugfs interface provides a visibility into the kernel memory
-shrinkers subsystem and allows to get information about individual shrinkers.
+shrinkers subsystem and allows to get information about individual shrinkers
+and interact with them.
For each shrinker registered in the system a directory in **<debugfs>/shrinker/**
is created. The directory's name is composed from the shrinker's name and an
unique id: e.g. *kfree_rcu-0* or *sb-xfs:vda1-36*.
-Each shrinker directory contains the **count** file, which allows to trigger
-the *count_objects()* callback for each memcg and numa node (if applicable).
+Each shrinker directory contains **count** and **scan** files, which allow to
+trigger *count_objects()* and *scan_objects()* callbacks for each memcg and
+numa node (if applicable).
Usage:
------
@@ -43,7 +45,7 @@ Usage:
$ cd sb-btrfs\:vda2-24/
$ ls
- count
+ count scan
3. *Count objects*
@@ -98,3 +100,32 @@ Usage:
2877 84 0
293 1 0
735 8 0
+
+4. *Scan objects*
+
+ The expected input format::
+
+ <cgroup inode id> <numa id> <number of objects to scan>
+
+ For a non-memcg-aware shrinker or on a system with no memory
+ cgroups **0** should be passed as cgroup id.
+ ::
+
+ $ cd /sys/kernel/debug/shrinker/
+ $ cd sb-btrfs\:vda2-24/
+
+ $ cat count | head -n 5
+ 1 212 0
+ 21 97 0
+ 55 802 5
+ 2367 2 0
+ 225 13 0
+
+ $ echo "55 0 200" > scan
+
+ $ cat count | head -n 5
+ 1 212 0
+ 21 96 0
+ 55 752 5
+ 2367 2 0
+ 225 13 0
diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
index 28b1c1ab60ef..8f67fef5a643 100644
--- a/mm/shrinker_debug.c
+++ b/mm/shrinker_debug.c
@@ -101,6 +101,77 @@ static int shrinker_debugfs_count_show(struct seq_file *m, void *v)
}
DEFINE_SHOW_ATTRIBUTE(shrinker_debugfs_count);
+static int shrinker_debugfs_scan_open(struct inode *inode, struct file *file)
+{
+ file->private_data = inode->i_private;
+ return nonseekable_open(inode, file);
+}
+
+static ssize_t shrinker_debugfs_scan_write(struct file *file,
+ const char __user *buf,
+ size_t size, loff_t *pos)
+{
+ struct shrinker *shrinker = (struct shrinker *)file->private_data;
+ unsigned long nr_to_scan = 0, ino;
+ struct shrink_control sc = {
+ .gfp_mask = GFP_KERNEL,
+ };
+ struct mem_cgroup *memcg = NULL;
+ int nid;
+ char kbuf[72];
+ int read_len = size < (sizeof(kbuf) - 1) ? size : (sizeof(kbuf) - 1);
+ ssize_t ret;
+
+ if (copy_from_user(kbuf, buf, read_len))
+ return -EFAULT;
+ kbuf[read_len] = '\0';
+
+ if (sscanf(kbuf, "%lu %d %lu", &ino, &nid, &nr_to_scan) < 2)
+ return -EINVAL;
+
+ if (nid < 0 || nid >= nr_node_ids)
+ return -EINVAL;
+
+ if (shrinker->flags & SHRINKER_MEMCG_AWARE) {
+ memcg = mem_cgroup_get_from_ino(ino);
+ if (!memcg || IS_ERR(memcg))
+ return -ENOENT;
+
+ if (!mem_cgroup_online(memcg)) {
+ mem_cgroup_put(memcg);
+ return -ENOENT;
+ }
+ } else {
+ if (ino != 0)
+ return -EINVAL;
+ memcg = NULL;
+ }
+
+ ret = down_read_killable(&shrinker_rwsem);
+ if (ret) {
+ mem_cgroup_put(memcg);
+ return ret;
+ }
+
+ sc.nid = nid;
+ sc.memcg = memcg;
+ sc.nr_to_scan = nr_to_scan;
+ sc.nr_scanned = nr_to_scan;
+
+ shrinker->scan_objects(shrinker, &sc);
+
+ up_read(&shrinker_rwsem);
+ mem_cgroup_put(memcg);
+
+ return ret ? ret : size;
+}
+
+static const struct file_operations shrinker_debugfs_scan_fops = {
+ .owner = THIS_MODULE,
+ .open = shrinker_debugfs_scan_open,
+ .write = shrinker_debugfs_scan_write,
+};
+
int shrinker_debugfs_add(struct shrinker *shrinker)
{
struct dentry *entry;
@@ -130,6 +201,8 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
debugfs_create_file("count", 0220, entry, shrinker,
&shrinker_debugfs_count_fops);
+ debugfs_create_file("scan", 0220, entry, shrinker,
+ &shrinker_debugfs_scan_fops);
return 0;
}
--
2.35.3
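The input format accepted by the scan file can be sketched as a userspace parser. This is an illustration under assumptions, not the kernel code: `struct scan_request` and `parse_scan_request` are hypothetical names, the number of nodes is passed in as a plain parameter, and the sketch is stricter than the posted patch, which accepts two fields and leaves the object count at 0.

```c
#include <stdio.h>

/* Hypothetical container for the three fields written to the scan file. */
struct scan_request {
	unsigned long ino;        /* cgroup inode id, 0 if not memcg-aware */
	int nid;                  /* numa node id */
	unsigned long nr_to_scan; /* number of objects to scan */
};

/*
 * Parses "<cgroup inode id> <numa id> <number of objects to scan>".
 * Returns 0 on success, -1 on malformed input or an out-of-range node.
 * Assumption: all three fields are required (the patch requires two).
 */
static int parse_scan_request(const char *line, int nr_nodes,
			      struct scan_request *req)
{
	req->nr_to_scan = 0;
	if (sscanf(line, "%lu %d %lu",
		   &req->ino, &req->nid, &req->nr_to_scan) != 3)
		return -1;
	if (req->nid < 0 || req->nid >= nr_nodes)
		return -1;
	return 0;
}
```

With the documentation's example input "55 0 200" and a two-node machine, the parser fills in cgroup 55, node 0 and 200 objects. An input naming node 3 on the same machine is rejected, which corresponds to the -EINVAL path in the patch.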
On Mon, May 09, 2022 at 11:38:14AM -0700, Roman Gushchin wrote:
> There are 50+ different shrinkers in the kernel, many with their own bells and
> whistles. Under the memory pressure the kernel applies some pressure on each of
> them in the order of which they were created/registered in the system. Some
> of them can contain only few objects, some can be quite large. Some can be
> effective at reclaiming memory, some not.
>
> The only existing debugging mechanism is a couple of tracepoints in
> do_shrink_slab(): mm_shrink_slab_start and mm_shrink_slab_end. They aren't
> covering everything though: shrinkers which report 0 objects will never show up,
> there is no support for memcg-aware shrinkers. Shrinkers are identified by their
> scan function, which is not always enough (e.g. hard to guess which super
> block's shrinker it is having only "super_cache_scan").
>
> To provide a better visibility and debug options for memory shrinkers
> this patchset introduces a /sys/kernel/debug/shrinker interface, to some extent
> similar to /sys/kernel/slab.
>
> For each shrinker registered in the system a directory is created.
> For now, the directory will contain only a "count" file, which reports
> the number of managed objects for each memory cgroup (for memcg-aware shrinkers)
> and each numa node (for numa-aware shrinkers on a numa machine). Other
> interfaces might be added in the future.
>
> To make debugging more pleasant, the patchset also names all shrinkers,
> so that debugfs entries can have meaningful names.
>
>
> v3:
> 1) separated the "scan" part into a separate patch, by Dave
> 2) merged *_memcg, *_node and *_memcg_node interfaces, by Dave
> 3) shrinkers naming enhancements, by Christophe and Dave
> 4) added signal_pending() check, by Hillf
> 5) enabled by default, by Dave
Any comments? Thoughts? Objections?
Thanks!
On Mon, May 09, 2022 at 11:38:16AM -0700, Roman Gushchin wrote:
> This commit introduces the /sys/kernel/debug/shrinker debugfs
> interface which provides an ability to observe the state of
> individual kernel memory shrinkers.
>
> Because the feature adds some memory overhead (which shouldn't be
> large unless there is a huge amount of registered shrinkers), it's
> guarded by a config option (enabled by default).
>
> This commit introduces the "count" interface for each shrinker
> registered in the system.
>
> The output is in the following format:
> <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
> <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
> ...
>
> To reduce the size of output on machines with many thousands cgroups,
> if the total number of objects on all nodes is 0, the line is omitted.
>
> If the shrinker is not memcg-aware or CONFIG_MEMCG is off, 0 is
> printed as cgroup inode id. If the shrinker is not numa-aware, 0's are
> printed for all nodes except the first one.
>
> This commit gives debugfs entries simple numeric names, which are not
> very convenient. The following commit in the series will provide
> shrinkers with more meaningful names.
>
> Signed-off-by: Roman Gushchin <[email protected]>
I think this looks reasonable
Reviewed-by: Kent Overstreet <[email protected]>
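The count output format described in the quoted commit message (one line per cgroup: the cgroup inode id followed by one object count per node) could be consumed from userspace roughly as follows. This is a sketch only; `count_line_total` is a hypothetical helper name and the code assumes the fields are whitespace-separated decimal numbers, as in the examples.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parses one line of "count" output.
 * Stores the cgroup inode id in *ino and returns the total number of
 * objects summed over all nodes, or -1 if the line has no id or no
 * per-node fields.
 */
static long count_line_total(const char *line, unsigned long *ino)
{
	char *end;
	long total = 0;
	int fields = 0;

	*ino = strtoul(line, &end, 10);
	if (end == line)
		return -1;	/* no cgroup inode id found */

	for (;;) {
		const char *p = end;
		unsigned long n = strtoul(p, &end, 10);

		if (end == p)
			break;	/* no more numbers on this line */
		total += (long)n;
		fields++;
	}

	return fields > 0 ? total : -1;
}
```

Since the kernel side omits lines whose total is 0, a consumer built on this helper never sees a zero total for a well-formed line and has to treat a missing cgroup as "no objects".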
On Fri, May 20, 2022 at 06:58:12PM +0200, Christophe JAILLET wrote:
> Le 09/05/2022 à 20:38, Roman Gushchin a écrit :
> > This commit introduces the /sys/kernel/debug/shrinker debugfs
> > interface which provides an ability to observe the state of
> > individual kernel memory shrinkers.
> >
> > Because the feature adds some memory overhead (which shouldn't be
> > large unless there is a huge amount of registered shrinkers), it's
> > guarded by a config option (enabled by default).
> >
> > This commit introduces the "count" interface for each shrinker
> > registered in the system.
> >
> > The output is in the following format:
> > <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
> > <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
> > ...
> >
> > To reduce the size of output on machines with many thousands cgroups,
> > if the total number of objects on all nodes is 0, the line is omitted.
> >
> > If the shrinker is not memcg-aware or CONFIG_MEMCG is off, 0 is
> > printed as cgroup inode id. If the shrinker is not numa-aware, 0's are
> > printed for all nodes except the first one.
> >
> > This commit gives debugfs entries simple numeric names, which are not
> > very convenient. The following commit in the series will provide
> > shrinkers with more meaningful names.
> >
> > Signed-off-by: Roman Gushchin <[email protected]>
> > ---
> > include/linux/shrinker.h | 19 ++++-
> > lib/Kconfig.debug | 9 +++
> > mm/Makefile | 1 +
> > mm/shrinker_debug.c | 171 +++++++++++++++++++++++++++++++++++++++
> > mm/vmscan.c | 6 +-
> > 5 files changed, 203 insertions(+), 3 deletions(-)
> > create mode 100644 mm/shrinker_debug.c
> >
> > diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
> > index 76fbf92b04d9..2ced8149c513 100644
> > --- a/include/linux/shrinker.h
> > +++ b/include/linux/shrinker.h
> > @@ -72,6 +72,10 @@ struct shrinker {
> > #ifdef CONFIG_MEMCG
> > /* ID in shrinker_idr */
> > int id;
> > +#endif
> > +#ifdef CONFIG_SHRINKER_DEBUG
> > + int debugfs_id;
> > + struct dentry *debugfs_entry;
> > #endif
> > /* objs pending delete, per node */
> > atomic_long_t *nr_deferred;
> > @@ -94,4 +98,17 @@ extern int register_shrinker(struct shrinker *shrinker);
> > extern void unregister_shrinker(struct shrinker *shrinker);
> > extern void free_prealloced_shrinker(struct shrinker *shrinker);
> > extern void synchronize_shrinkers(void);
> > -#endif
> > +
> > +#ifdef CONFIG_SHRINKER_DEBUG
> > +extern int shrinker_debugfs_add(struct shrinker *shrinker);
> > +extern void shrinker_debugfs_remove(struct shrinker *shrinker);
> > +#else /* CONFIG_SHRINKER_DEBUG */
> > +static inline int shrinker_debugfs_add(struct shrinker *shrinker)
> > +{
> > + return 0;
> > +}
> > +static inline void shrinker_debugfs_remove(struct shrinker *shrinker)
> > +{
> > +}
> > +#endif /* CONFIG_SHRINKER_DEBUG */
> > +#endif /* _LINUX_SHRINKER_H */
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index 3fd7a2e9eaf1..5fa65a649798 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -733,6 +733,15 @@ config SLUB_STATS
> > out which slabs are relevant to a particular load.
> > Try running: slabinfo -DA
> > +config SHRINKER_DEBUG
> > + default y
>
> The previous version of the series had default 'n'.
> Is it intentional to have it now activated by default? It looked more like a
> tuning functionality for when fine-grained management of shrinkers is needed.
Yes, it is intentional.
The overhead is small, so I don't think we have a good reason to hide it
by default behind a config option. In my opinion, enabling it by default
will increase the chances of gathering useful data.
It was the feedback I've received for one of the previous versions of
the patchset, and I think it's totally valid.
And preserving the config option allows zero overhead on
really constrained systems.
Thanks!
On Mon, May 09, 2022 at 11:38:20AM -0700, Roman Gushchin wrote:
> Add a scan interface which allows to trigger scanning of a particular
> shrinker and specify memcg and numa node. It's useful for testing,
> debugging and profiling of a specific scan_objects() callback.
> Unlike alternatives (creating a real memory pressure and dropping
> caches via /proc/sys/vm/drop_caches) this interface allows to interact
> with only one shrinker at once. Also, if a shrinker is misreporting
> the number of objects (as some do), it doesn't affect scanning.
>
> Signed-off-by: Roman Gushchin <[email protected]>
> ---
> .../admin-guide/mm/shrinker_debugfs.rst | 39 +++++++++-
> mm/shrinker_debug.c | 73 +++++++++++++++++++
> 2 files changed, 108 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/admin-guide/mm/shrinker_debugfs.rst b/Documentation/admin-guide/mm/shrinker_debugfs.rst
> index 6783f8190e63..8fecf81d60ee 100644
> --- a/Documentation/admin-guide/mm/shrinker_debugfs.rst
> +++ b/Documentation/admin-guide/mm/shrinker_debugfs.rst
> @@ -5,14 +5,16 @@ Shrinker Debugfs Interface
> ==========================
>
> Shrinker debugfs interface provides a visibility into the kernel memory
> -shrinkers subsystem and allows to get information about individual shrinkers.
> +shrinkers subsystem and allows to get information about individual shrinkers
> +and interact with them.
>
> For each shrinker registered in the system a directory in **<debugfs>/shrinker/**
> is created. The directory's name is composed from the shrinker's name and an
> unique id: e.g. *kfree_rcu-0* or *sb-xfs:vda1-36*.
>
> -Each shrinker directory contains the **count** file, which allows to trigger
> -the *count_objects()* callback for each memcg and numa node (if applicable).
> +Each shrinker directory contains **count** and **scan** files, which allow to
> +trigger *count_objects()* and *scan_objects()* callbacks for each memcg and
> +numa node (if applicable).
>
> Usage:
> ------
> @@ -43,7 +45,7 @@ Usage:
>
> $ cd sb-btrfs\:vda2-24/
> $ ls
> - count
> + count scan
>
> 3. *Count objects*
>
> @@ -98,3 +100,32 @@ Usage:
> 2877 84 0
> 293 1 0
> 735 8 0
> +
> +4. *Scan objects*
> +
> + The expected input format::
> +
> + <cgroup inode id> <numa id> <number of objects to scan>
> +
> + For a non-memcg-aware shrinker or on a system with no memory
> + cgroups **0** should be passed as cgroup id.
> + ::
> +
> + $ cd /sys/kernel/debug/shrinker/
> + $ cd sb-btrfs\:vda2-24/
> +
> + $ cat count | head -n 5
> + 1 212 0
> + 21 97 0
> + 55 802 5
> + 2367 2 0
> + 225 13 0
> +
> + $ echo "55 0 200" > scan
> +
> + $ cat count | head -n 5
> + 1 212 0
> + 21 96 0
> + 55 752 5
> + 2367 2 0
> + 225 13 0
> diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
> index 28b1c1ab60ef..8f67fef5a643 100644
> --- a/mm/shrinker_debug.c
> +++ b/mm/shrinker_debug.c
> @@ -101,6 +101,77 @@ static int shrinker_debugfs_count_show(struct seq_file *m, void *v)
> }
> DEFINE_SHOW_ATTRIBUTE(shrinker_debugfs_count);
>
> +static int shrinker_debugfs_scan_open(struct inode *inode, struct file *file)
> +{
> + file->private_data = inode->i_private;
> + return nonseekable_open(inode, file);
> +}
> +
> +static ssize_t shrinker_debugfs_scan_write(struct file *file,
> + const char __user *buf,
> + size_t size, loff_t *pos)
> +{
> + struct shrinker *shrinker = (struct shrinker *)file->private_data;
Seems we could drop the cast since ->private_data is void * type.
> + unsigned long nr_to_scan = 0, ino;
> + struct shrink_control sc = {
> + .gfp_mask = GFP_KERNEL,
> + };
> + struct mem_cgroup *memcg = NULL;
> + int nid;
> + char kbuf[72];
> + int read_len = size < (sizeof(kbuf) - 1) ? size : (sizeof(kbuf) - 1);
> + ssize_t ret;
> +
> + if (copy_from_user(kbuf, buf, read_len))
> + return -EFAULT;
> + kbuf[read_len] = '\0';
> +
> + if (sscanf(kbuf, "%lu %d %lu", &ino, &nid, &nr_to_scan) < 2)
> + return -EINVAL;
> +
> + if (nid < 0 || nid >= nr_node_ids)
> + return -EINVAL;
> +
Should we break here if nr_to_scan is zero?
> + if (shrinker->flags & SHRINKER_MEMCG_AWARE) {
> + memcg = mem_cgroup_get_from_ino(ino);
> + if (!memcg || IS_ERR(memcg))
Should we drop the check of "!memcg" since mem_cgroup_get_from_ino
cannot return NULL?
> + return -ENOENT;
> +
> + if (!mem_cgroup_online(memcg)) {
> + mem_cgroup_put(memcg);
> + return -ENOENT;
> + }
> + } else {
> + if (ino != 0)
> + return -EINVAL;
> + memcg = NULL;
IIUC, memcg is already NULL if we reach here, right? Then the
assignment is not necessary. Or we could remove the initialization
of 'memcg' where it is defined.
> + }
> +
> + ret = down_read_killable(&shrinker_rwsem);
> + if (ret) {
> + mem_cgroup_put(memcg);
> + return ret;
> + }
> +
> + sc.nid = nid;
> + sc.memcg = memcg;
> + sc.nr_to_scan = nr_to_scan;
> + sc.nr_scanned = nr_to_scan;
> +
> + shrinker->scan_objects(shrinker, &sc);
> +
> + up_read(&shrinker_rwsem);
> + mem_cgroup_put(memcg);
> +
> + return ret ? ret : size;
Seems "ret" is always equal to 0 here, should we simplify this
to "return size"?
Thanks.
> +}
> +
> +static const struct file_operations shrinker_debugfs_scan_fops = {
> + .owner = THIS_MODULE,
> + .open = shrinker_debugfs_scan_open,
> + .write = shrinker_debugfs_scan_write,
> +};
> +
> int shrinker_debugfs_add(struct shrinker *shrinker)
> {
> struct dentry *entry;
> @@ -130,6 +201,8 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
>
> debugfs_create_file("count", 0440, entry, shrinker,
> &shrinker_debugfs_count_fops);
> + debugfs_create_file("scan", 0220, entry, shrinker,
> + &shrinker_debugfs_scan_fops);
> return 0;
> }
>
> --
> 2.35.3
>
>
On Fri, May 20, 2022 at 06:58:12PM +0200, Christophe JAILLET wrote:
> Le 09/05/2022 à 20:38, Roman Gushchin a écrit :
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index 3fd7a2e9eaf1..5fa65a649798 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -733,6 +733,15 @@ config SLUB_STATS
> > out which slabs are relevant to a particular load.
> > Try running: slabinfo -DA
> > +config SHRINKER_DEBUG
> > + default y
>
> The previous version of the series had default 'n'.
> Is it intentional to have it now enabled by default? It looked more like a
> tuning feature for when fine-grained management of shrinkers is needed.
I think having this on by default if you've already enabled debugfs is smart -
it doesn't add runtime overhead, just a bit of code, and things that make the
system more observable are great to have on by default.
On Fri, May 20, 2022 at 12:45:12PM -0400, Kent Overstreet wrote:
> On Mon, May 09, 2022 at 11:38:16AM -0700, Roman Gushchin wrote:
> > This commit introduces the /sys/kernel/debug/shrinker debugfs
> > interface which provides an ability to observe the state of
> > individual kernel memory shrinkers.
> >
> > Because the feature adds some memory overhead (which shouldn't be
> > large unless there is a huge amount of registered shrinkers), it's
> > guarded by a config option (enabled by default).
> >
> > This commit introduces the "count" interface for each shrinker
> > registered in the system.
> >
> > The output is in the following format:
> > <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
> > <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
> > ...
> >
> > To reduce the size of output on machines with many thousands cgroups,
> > if the total number of objects on all nodes is 0, the line is omitted.
> >
> > If the shrinker is not memcg-aware or CONFIG_MEMCG is off, 0 is
> > printed as cgroup inode id. If the shrinker is not numa-aware, 0's are
> > printed for all nodes except the first one.
> >
> > This commit gives debugfs entries simple numeric names, which are not
> > very convenient. The following commit in the series will provide
> > shrinkers with more meaningful names.
> >
> > Signed-off-by: Roman Gushchin <[email protected]>
>
> I think this looks reasonable
>
> Reviewed-by: Kent Overstreet <[email protected]>
Thank you!
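For reference, the "count" output format described in the quoted commit message is easy to consume from userspace. The following is only a sketch: parse_count_line() is a hypothetical helper, not part of the patchset, and it assumes each line has exactly the form "<cgroup inode id> <count on node 0> <count on node 1>...":

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse one line of a shrinker "count" file.
 * Stores the cgroup inode id in *ino and up to max_nodes per-node
 * object counts in counts[]. Returns the number of per-node counts
 * parsed, or -1 if the line does not start with a number.
 */
static int parse_count_line(const char *line, unsigned long *ino,
			    unsigned long *counts, int max_nodes)
{
	char *end;
	int n = 0;

	*ino = strtoul(line, &end, 10);
	if (end == line)
		return -1;

	while (n < max_nodes) {
		const char *p = end;
		unsigned long v = strtoul(p, &end, 10);

		if (end == p)	/* no more numbers on this line */
			break;
		counts[n++] = v;
	}

	return n;
}
```

Since lines whose counts are all zero are omitted from the output, a consumer has to treat a missing cgroup as "0 objects" rather than as an error.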
On Mon, May 09, 2022 at 11:38:17AM -0700, Roman Gushchin wrote:
> diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
> index ad9f16689419..c1f734ab86b3 100644
> --- a/drivers/md/bcache/btree.c
> +++ b/drivers/md/bcache/btree.c
> @@ -812,7 +812,7 @@ int bch_btree_cache_alloc(struct cache_set *c)
> c->shrink.seeks = 4;
> c->shrink.batch = c->btree_pages * 2;
>
> - if (register_shrinker(&c->shrink))
> + if (register_shrinker(&c->shrink, "btree"))
> pr_warn("bcache: %s: could not register shrinker\n",
> __func__);
These drivers need better names for their shrinkers - there will often be
multiple instances in use on a system and we need to distinguish between them.
Also, "btree" isn't a good name for the bcache shrinker - "bcache-%pU",
c->set_uuid would be a better one, since it'll match up with the
cache set directory in /sys/fs/bcache/.
For others (device mapper, md, etc.) there should be a minor device number you
can reference.
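As a rough illustration of the kind of name this would produce, here is a userspace sketch. %pU is a kernel-only printk extension, so the hypothetical helper below spells out the UUID formatting by hand; it assumes the plain byte-order, lowercase form and is not code from bcache or from the patch:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for a "<prefix>-%pU" shrinker name.
 * Formats the 16 UUID bytes as 8-4-4-4-12 lowercase hex digits.
 * Returns what snprintf() returns: the length the full name needs,
 * so a result >= len means the name was truncated.
 */
static int format_uuid_name(char *buf, size_t len, const char *prefix,
			    const unsigned char uuid[16])
{
	return snprintf(buf, len,
			"%s-%02x%02x%02x%02x-%02x%02x-%02x%02x-"
			"%02x%02x-%02x%02x%02x%02x%02x%02x",
			prefix,
			uuid[0], uuid[1], uuid[2], uuid[3],
			uuid[4], uuid[5], uuid[6], uuid[7],
			uuid[8], uuid[9], uuid[10], uuid[11],
			uuid[12], uuid[13], uuid[14], uuid[15]);
}
```

A name built this way is 43 characters long with the "bcache" prefix, which still fits the 128-byte buffer used in shrinker_debugfs_add() after the "-<id>" suffix is appended.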
On Mon, May 09, 2022 at 11:38:17AM -0700, Roman Gushchin wrote:
> Currently shrinkers are anonymous objects. For debugging purposes they
> can be identified by count/scan function names, but it's not always
> useful: e.g. for superblock's shrinkers it's nice to have at least
> an idea of which superblock the shrinker belongs to.
>
> This commit adds names to shrinkers. register_shrinker() and
> prealloc_shrinker() functions are extended to take a format and
> arguments to construct a name.
>
> In some cases it's not possible to determine a good name at the time
> when a shrinker is allocated. For such cases shrinker_debugfs_rename()
> is provided.
>
> After this change the shrinker debugfs directory looks like:
> $ cd /sys/kernel/debug/shrinker/
> $ ls
> dqcache-16 sb-hugetlbfs-17 sb-rootfs-2 sb-tmpfs-50
> kfree_rcu-0 sb-hugetlbfs-33 sb-securityfs-6 sb-tracefs-13
> sb-aio-20 sb-iomem-12 sb-selinuxfs-22 sb-xfs:vda1-36
> sb-anon_inodefs-15 sb-mqueue-21 sb-sockfs-8 sb-zsmalloc-19
> sb-bdev-3 sb-nsfs-4 sb-sysfs-26 shadow-18
> sb-bpf-32 sb-pipefs-14 sb-tmpfs-1 thp_deferred_split-10
> sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-27 thp_zero-9
> sb-cgroup2-30 sb-proc-39 sb-tmpfs-29 xfs_buf-37
> sb-configfs-23 sb-proc-41 sb-tmpfs-35 xfs_inodegc-38
> sb-dax-11 sb-proc-45 sb-tmpfs-40 zspool-34
> sb-debugfs-7 sb-proc-46 sb-tmpfs-42
> sb-devpts-28 sb-proc-47 sb-tmpfs-43
> sb-devtmpfs-5 sb-pstore-31 sb-tmpfs-44
>
> Signed-off-by: Roman Gushchin <[email protected]>
> ---
> arch/x86/kvm/mmu/mmu.c | 2 +-
> drivers/android/binder_alloc.c | 2 +-
> drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 3 +-
> drivers/gpu/drm/msm/msm_gem_shrinker.c | 2 +-
> .../gpu/drm/panfrost/panfrost_gem_shrinker.c | 2 +-
> drivers/gpu/drm/ttm/ttm_pool.c | 2 +-
> drivers/md/bcache/btree.c | 2 +-
> drivers/md/dm-bufio.c | 2 +-
> drivers/md/dm-zoned-metadata.c | 2 +-
> drivers/md/raid5.c | 2 +-
> drivers/misc/vmw_balloon.c | 2 +-
> drivers/virtio/virtio_balloon.c | 2 +-
> drivers/xen/xenbus/xenbus_probe_backend.c | 2 +-
> fs/btrfs/super.c | 2 +
> fs/erofs/utils.c | 2 +-
> fs/ext4/extents_status.c | 3 +-
> fs/f2fs/super.c | 2 +-
> fs/gfs2/glock.c | 2 +-
> fs/gfs2/main.c | 2 +-
> fs/jbd2/journal.c | 2 +-
> fs/mbcache.c | 2 +-
> fs/nfs/nfs42xattr.c | 7 ++-
> fs/nfs/super.c | 2 +-
> fs/nfsd/filecache.c | 2 +-
> fs/nfsd/nfscache.c | 2 +-
> fs/quota/dquot.c | 2 +-
> fs/super.c | 6 +-
> fs/ubifs/super.c | 2 +-
> fs/xfs/xfs_buf.c | 2 +-
> fs/xfs/xfs_icache.c | 2 +-
> fs/xfs/xfs_qm.c | 2 +-
> include/linux/shrinker.h | 12 +++-
> kernel/rcu/tree.c | 2 +-
> mm/huge_memory.c | 4 +-
> mm/shrinker_debug.c | 45 +++++++++++++-
> mm/vmscan.c | 58 ++++++++++++++++++-
> mm/workingset.c | 2 +-
> mm/zsmalloc.c | 2 +-
> net/sunrpc/auth.c | 2 +-
> 39 files changed, 154 insertions(+), 46 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index c623019929a7..8cfabdd63406 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -6283,7 +6283,7 @@ int kvm_mmu_vendor_module_init(void)
> if (percpu_counter_init(&kvm_total_used_mmu_pages, 0, GFP_KERNEL))
> goto out;
>
> - ret = register_shrinker(&mmu_shrinker);
> + ret = register_shrinker(&mmu_shrinker, "mmu");
> if (ret)
> goto out;
>
> diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
> index 2ac1008a5f39..951343c41ba8 100644
> --- a/drivers/android/binder_alloc.c
> +++ b/drivers/android/binder_alloc.c
> @@ -1084,7 +1084,7 @@ int binder_alloc_shrinker_init(void)
> int ret = list_lru_init(&binder_alloc_lru);
>
> if (ret == 0) {
> - ret = register_shrinker(&binder_shrinker);
> + ret = register_shrinker(&binder_shrinker, "binder");
> if (ret)
> list_lru_destroy(&binder_alloc_lru);
> }
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> index 6a6ff98a8746..85524ef92ea4 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> @@ -426,7 +426,8 @@ void i915_gem_driver_register__shrinker(struct drm_i915_private *i915)
> i915->mm.shrinker.count_objects = i915_gem_shrinker_count;
> i915->mm.shrinker.seeks = DEFAULT_SEEKS;
> i915->mm.shrinker.batch = 4096;
> - drm_WARN_ON(&i915->drm, register_shrinker(&i915->mm.shrinker));
> + drm_WARN_ON(&i915->drm, register_shrinker(&i915->mm.shrinker,
> + "drm_i915_gem"));
>
> i915->mm.oom_notifier.notifier_call = i915_gem_shrinker_oom;
> drm_WARN_ON(&i915->drm, register_oom_notifier(&i915->mm.oom_notifier));
> diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c
> index 086dacf2f26a..2d3cf4f13dfd 100644
> --- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
> +++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
> @@ -221,7 +221,7 @@ void msm_gem_shrinker_init(struct drm_device *dev)
> priv->shrinker.count_objects = msm_gem_shrinker_count;
> priv->shrinker.scan_objects = msm_gem_shrinker_scan;
> priv->shrinker.seeks = DEFAULT_SEEKS;
> - WARN_ON(register_shrinker(&priv->shrinker));
> + WARN_ON(register_shrinker(&priv->shrinker, "drm_msm_gem"));
>
> priv->vmap_notifier.notifier_call = msm_gem_shrinker_vmap;
> WARN_ON(register_vmap_purge_notifier(&priv->vmap_notifier));
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> index 77e7cb6d1ae3..0d028266ee9e 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> @@ -103,7 +103,7 @@ void panfrost_gem_shrinker_init(struct drm_device *dev)
> pfdev->shrinker.count_objects = panfrost_gem_shrinker_count;
> pfdev->shrinker.scan_objects = panfrost_gem_shrinker_scan;
> pfdev->shrinker.seeks = DEFAULT_SEEKS;
> - WARN_ON(register_shrinker(&pfdev->shrinker));
> + WARN_ON(register_shrinker(&pfdev->shrinker, "drm_panfrost"));
> }
>
> /**
> diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
> index 1bba0a0ed3f9..b8b41d242197 100644
> --- a/drivers/gpu/drm/ttm/ttm_pool.c
> +++ b/drivers/gpu/drm/ttm/ttm_pool.c
> @@ -722,7 +722,7 @@ int ttm_pool_mgr_init(unsigned long num_pages)
> mm_shrinker.count_objects = ttm_pool_shrinker_count;
> mm_shrinker.scan_objects = ttm_pool_shrinker_scan;
> mm_shrinker.seeks = 1;
> - return register_shrinker(&mm_shrinker);
> + return register_shrinker(&mm_shrinker, "drm_ttm_pool");
> }
>
> /**
> diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
> index ad9f16689419..c1f734ab86b3 100644
> --- a/drivers/md/bcache/btree.c
> +++ b/drivers/md/bcache/btree.c
> @@ -812,7 +812,7 @@ int bch_btree_cache_alloc(struct cache_set *c)
> c->shrink.seeks = 4;
> c->shrink.batch = c->btree_pages * 2;
>
> - if (register_shrinker(&c->shrink))
> + if (register_shrinker(&c->shrink, "btree"))
> pr_warn("bcache: %s: could not register shrinker\n",
> __func__);
>
> diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
> index 5ffa1dcf84cf..bcc95898c341 100644
> --- a/drivers/md/dm-bufio.c
> +++ b/drivers/md/dm-bufio.c
> @@ -1806,7 +1806,7 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign
> c->shrinker.scan_objects = dm_bufio_shrink_scan;
> c->shrinker.seeks = 1;
> c->shrinker.batch = 0;
> - r = register_shrinker(&c->shrinker);
> + r = register_shrinker(&c->shrinker, "dm_bufio");
> if (r)
> goto bad;
>
> diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
> index d1ea66114d14..05f2fd12066b 100644
> --- a/drivers/md/dm-zoned-metadata.c
> +++ b/drivers/md/dm-zoned-metadata.c
> @@ -2944,7 +2944,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev,
> zmd->mblk_shrinker.seeks = DEFAULT_SEEKS;
>
> /* Metadata cache shrinker */
> - ret = register_shrinker(&zmd->mblk_shrinker);
> + ret = register_shrinker(&zmd->mblk_shrinker, "md_meta");
> if (ret) {
> dmz_zmd_err(zmd, "Register metadata cache shrinker failed");
> goto err;
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 59f91e392a2a..34ddebd3aff7 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -7383,7 +7383,7 @@ static struct r5conf *setup_conf(struct mddev *mddev)
> conf->shrinker.count_objects = raid5_cache_count;
> conf->shrinker.batch = 128;
> conf->shrinker.flags = 0;
> - if (register_shrinker(&conf->shrinker)) {
> + if (register_shrinker(&conf->shrinker, "md-%s", mdname(mddev))) {
> pr_warn("md/raid:%s: couldn't register shrinker.\n",
> mdname(mddev));
> goto abort;
> diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
> index f1d8ba6d4857..6c9ddf1187dd 100644
> --- a/drivers/misc/vmw_balloon.c
> +++ b/drivers/misc/vmw_balloon.c
> @@ -1587,7 +1587,7 @@ static int vmballoon_register_shrinker(struct vmballoon *b)
> b->shrinker.count_objects = vmballoon_shrinker_count;
> b->shrinker.seeks = DEFAULT_SEEKS;
>
> - r = register_shrinker(&b->shrinker);
> + r = register_shrinker(&b->shrinker, "vmw_balloon");
>
> if (r == 0)
> b->shrinker_registered = true;
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index f4c34a2a6b8e..093e06e19d0e 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -875,7 +875,7 @@ static int virtio_balloon_register_shrinker(struct virtio_balloon *vb)
> vb->shrinker.count_objects = virtio_balloon_shrinker_count;
> vb->shrinker.seeks = DEFAULT_SEEKS;
>
> - return register_shrinker(&vb->shrinker);
> + return register_shrinker(&vb->shrinker, "virtio_balloon");
> }
>
> static int virtballoon_probe(struct virtio_device *vdev)
> diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c
> index 5abded97e1a7..a6c5e344017d 100644
> --- a/drivers/xen/xenbus/xenbus_probe_backend.c
> +++ b/drivers/xen/xenbus/xenbus_probe_backend.c
> @@ -305,7 +305,7 @@ static int __init xenbus_probe_backend_init(void)
>
> register_xenstore_notifier(&xenstore_notifier);
>
> - if (register_shrinker(&backend_memory_shrinker))
> + if (register_shrinker(&backend_memory_shrinker, "xen_backend"))
> pr_warn("shrinker registration failed\n");
>
> return 0;
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 206f44005c52..062dbd8071e2 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -1790,6 +1790,8 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
> error = -EBUSY;
> } else {
> snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev);
> + shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s", fs_type->name,
> + s->s_id);
> btrfs_sb(s)->bdev_holder = fs_type;
> if (!strstr(crc32c_impl(), "generic"))
> set_bit(BTRFS_FS_CSUM_IMPL_FAST, &fs_info->flags);
> diff --git a/fs/erofs/utils.c b/fs/erofs/utils.c
> index ec9a1d780dc1..67eb64fadd4f 100644
> --- a/fs/erofs/utils.c
> +++ b/fs/erofs/utils.c
> @@ -282,7 +282,7 @@ static struct shrinker erofs_shrinker_info = {
>
> int __init erofs_init_shrinker(void)
> {
> - return register_shrinker(&erofs_shrinker_info);
> + return register_shrinker(&erofs_shrinker_info, "erofs");
> }
>
> void erofs_exit_shrinker(void)
> diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
> index 9a3a8996aacf..a7aa79d580e5 100644
> --- a/fs/ext4/extents_status.c
> +++ b/fs/ext4/extents_status.c
> @@ -1650,11 +1650,10 @@ int ext4_es_register_shrinker(struct ext4_sb_info *sbi)
> err = percpu_counter_init(&sbi->s_es_stats.es_stats_shk_cnt, 0, GFP_KERNEL);
> if (err)
> goto err3;
> -
> sbi->s_es_shrinker.scan_objects = ext4_es_scan;
> sbi->s_es_shrinker.count_objects = ext4_es_count;
> sbi->s_es_shrinker.seeks = DEFAULT_SEEKS;
> - err = register_shrinker(&sbi->s_es_shrinker);
> + err = register_shrinker(&sbi->s_es_shrinker, "ext4_es");
> if (err)
> goto err4;
>
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 4368f90571bd..2fc40a1635f3 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -4579,7 +4579,7 @@ static int __init init_f2fs_fs(void)
> err = f2fs_init_sysfs();
> if (err)
> goto free_garbage_collection_cache;
> - err = register_shrinker(&f2fs_shrinker_info);
> + err = register_shrinker(&f2fs_shrinker_info, "f2fs");
> if (err)
> goto free_sysfs;
> err = register_filesystem(&f2fs_fs_type);
> diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
> index 26169cedcefc..791c23d9f7e7 100644
> --- a/fs/gfs2/glock.c
> +++ b/fs/gfs2/glock.c
> @@ -2549,7 +2549,7 @@ int __init gfs2_glock_init(void)
> return -ENOMEM;
> }
>
> - ret = register_shrinker(&glock_shrinker);
> + ret = register_shrinker(&glock_shrinker, "gfs2_glock");
> if (ret) {
> destroy_workqueue(gfs2_delete_workqueue);
> destroy_workqueue(glock_workqueue);
> diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
> index 28d0eb23e18e..dde981b78488 100644
> --- a/fs/gfs2/main.c
> +++ b/fs/gfs2/main.c
> @@ -150,7 +150,7 @@ static int __init init_gfs2_fs(void)
> if (!gfs2_trans_cachep)
> goto fail_cachep8;
>
> - error = register_shrinker(&gfs2_qd_shrinker);
> + error = register_shrinker(&gfs2_qd_shrinker, "gfs2_qd");
> if (error)
> goto fail_shrinker;
>
> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> index c0cbeeaec2d1..e7786445ecc1 100644
> --- a/fs/jbd2/journal.c
> +++ b/fs/jbd2/journal.c
> @@ -1418,7 +1418,7 @@ static journal_t *journal_init_common(struct block_device *bdev,
> if (percpu_counter_init(&journal->j_checkpoint_jh_count, 0, GFP_KERNEL))
> goto err_cleanup;
>
> - if (register_shrinker(&journal->j_shrinker)) {
> + if (register_shrinker(&journal->j_shrinker, "jbd2_journal")) {
> percpu_counter_destroy(&journal->j_checkpoint_jh_count);
> goto err_cleanup;
> }
> diff --git a/fs/mbcache.c b/fs/mbcache.c
> index 97c54d3a2227..379dc5b0b6ad 100644
> --- a/fs/mbcache.c
> +++ b/fs/mbcache.c
> @@ -367,7 +367,7 @@ struct mb_cache *mb_cache_create(int bucket_bits)
> cache->c_shrink.count_objects = mb_cache_count;
> cache->c_shrink.scan_objects = mb_cache_scan;
> cache->c_shrink.seeks = DEFAULT_SEEKS;
> - if (register_shrinker(&cache->c_shrink)) {
> + if (register_shrinker(&cache->c_shrink, "mb_cache")) {
> kfree(cache->c_hash);
> kfree(cache);
> goto err_out;
> diff --git a/fs/nfs/nfs42xattr.c b/fs/nfs/nfs42xattr.c
> index e7b34f7e0614..147b8a2f2dc6 100644
> --- a/fs/nfs/nfs42xattr.c
> +++ b/fs/nfs/nfs42xattr.c
> @@ -1017,15 +1017,16 @@ int __init nfs4_xattr_cache_init(void)
> if (ret)
> goto out2;
>
> - ret = register_shrinker(&nfs4_xattr_cache_shrinker);
> + ret = register_shrinker(&nfs4_xattr_cache_shrinker, "nfs_xattr_cache");
> if (ret)
> goto out1;
>
> - ret = register_shrinker(&nfs4_xattr_entry_shrinker);
> + ret = register_shrinker(&nfs4_xattr_entry_shrinker, "nfs_xattr_entry");
> if (ret)
> goto out;
>
> - ret = register_shrinker(&nfs4_xattr_large_entry_shrinker);
> + ret = register_shrinker(&nfs4_xattr_large_entry_shrinker,
> + "nfs_xattr_large_entry");
> if (!ret)
> return 0;
>
> diff --git a/fs/nfs/super.c b/fs/nfs/super.c
> index 6ab5eeb000dc..c7a2aef911f1 100644
> --- a/fs/nfs/super.c
> +++ b/fs/nfs/super.c
> @@ -149,7 +149,7 @@ int __init register_nfs_fs(void)
> ret = nfs_register_sysctl();
> if (ret < 0)
> goto error_2;
> - ret = register_shrinker(&acl_shrinker);
> + ret = register_shrinker(&acl_shrinker, "nfs_acl");
> if (ret < 0)
> goto error_3;
> #ifdef CONFIG_NFS_V4_2
> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index 2c1b027774d4..9c2879a3c3c0 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -666,7 +666,7 @@ nfsd_file_cache_init(void)
> goto out_err;
> }
>
> - ret = register_shrinker(&nfsd_file_shrinker);
> + ret = register_shrinker(&nfsd_file_shrinker, "nfsd_filecache");
> if (ret) {
> pr_err("nfsd: failed to register nfsd_file_shrinker: %d\n", ret);
> goto out_lru;
> diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
> index 0b3f12aa37ff..f1cfb06d0be5 100644
> --- a/fs/nfsd/nfscache.c
> +++ b/fs/nfsd/nfscache.c
> @@ -176,7 +176,7 @@ int nfsd_reply_cache_init(struct nfsd_net *nn)
> nn->nfsd_reply_cache_shrinker.scan_objects = nfsd_reply_cache_scan;
> nn->nfsd_reply_cache_shrinker.count_objects = nfsd_reply_cache_count;
> nn->nfsd_reply_cache_shrinker.seeks = 1;
> - status = register_shrinker(&nn->nfsd_reply_cache_shrinker);
> + status = register_shrinker(&nn->nfsd_reply_cache_shrinker, "nfsd_reply");
> if (status)
> goto out_stats_destroy;
>
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index a74aef99bd3d..854d2b1d0914 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -2985,7 +2985,7 @@ static int __init dquot_init(void)
> pr_info("VFS: Dquot-cache hash table entries: %ld (order %ld,"
> " %ld bytes)\n", nr_hash, order, (PAGE_SIZE << order));
>
> - if (register_shrinker(&dqcache_shrinker))
> + if (register_shrinker(&dqcache_shrinker, "dqcache"))
> panic("Cannot register dquot shrinker");
>
> return 0;
> diff --git a/fs/super.c b/fs/super.c
> index 60f57c7bc0a6..4fca6657f442 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -265,7 +265,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
> s->s_shrink.count_objects = super_cache_count;
> s->s_shrink.batch = 1024;
> s->s_shrink.flags = SHRINKER_NUMA_AWARE | SHRINKER_MEMCG_AWARE;
> - if (prealloc_shrinker(&s->s_shrink))
> + if (prealloc_shrinker(&s->s_shrink, "sb-%s", type->name))
> goto fail;
> if (list_lru_init_memcg(&s->s_dentry_lru, &s->s_shrink))
> goto fail;
> @@ -1288,6 +1288,8 @@ int get_tree_bdev(struct fs_context *fc,
> } else {
> s->s_mode = mode;
> snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev);
> + shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s",
> + fc->fs_type->name, s->s_id);
> sb_set_blocksize(s, block_size(bdev));
> error = fill_super(s, fc);
> if (error) {
> @@ -1363,6 +1365,8 @@ struct dentry *mount_bdev(struct file_system_type *fs_type,
> } else {
> s->s_mode = mode;
> snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev);
> + shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s",
> + fs_type->name, s->s_id);
> sb_set_blocksize(s, block_size(bdev));
> error = fill_super(s, data, flags & SB_SILENT ? 1 : 0);
> if (error) {
> diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
> index bad67455215f..a3663d201f64 100644
> --- a/fs/ubifs/super.c
> +++ b/fs/ubifs/super.c
> @@ -2430,7 +2430,7 @@ static int __init ubifs_init(void)
> if (!ubifs_inode_slab)
> return -ENOMEM;
>
> - err = register_shrinker(&ubifs_shrinker_info);
> + err = register_shrinker(&ubifs_shrinker_info, "ubifs");
> if (err)
> goto out_slab;
>
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index e1afb9e503e1..5645e92df0c9 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -1986,7 +1986,7 @@ xfs_alloc_buftarg(
> btp->bt_shrinker.scan_objects = xfs_buftarg_shrink_scan;
> btp->bt_shrinker.seeks = DEFAULT_SEEKS;
> btp->bt_shrinker.flags = SHRINKER_NUMA_AWARE;
> - if (register_shrinker(&btp->bt_shrinker))
> + if (register_shrinker(&btp->bt_shrinker, "xfs_buf"))
> goto error_pcpu;
> return btp;
>
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index bffd6eb0b298..d0c4e74ff763 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -2198,5 +2198,5 @@ xfs_inodegc_register_shrinker(
> shrink->flags = SHRINKER_NONSLAB;
> shrink->batch = XFS_INODEGC_SHRINKER_BATCH;
>
> - return register_shrinker(shrink);
> + return register_shrinker(shrink, "xfs_inodegc");
> }
> diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> index f165d1a3de1d..93ded9e81f49 100644
> --- a/fs/xfs/xfs_qm.c
> +++ b/fs/xfs/xfs_qm.c
> @@ -686,7 +686,7 @@ xfs_qm_init_quotainfo(
> qinf->qi_shrinker.seeks = DEFAULT_SEEKS;
> qinf->qi_shrinker.flags = SHRINKER_NUMA_AWARE;
>
> - error = register_shrinker(&qinf->qi_shrinker);
> + error = register_shrinker(&qinf->qi_shrinker, "xfs_qm");
> if (error)
> goto out_free_inos;
>
> diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
> index 2ced8149c513..64416f3e0a1f 100644
> --- a/include/linux/shrinker.h
> +++ b/include/linux/shrinker.h
> @@ -75,6 +75,7 @@ struct shrinker {
> #endif
> #ifdef CONFIG_SHRINKER_DEBUG
> int debugfs_id;
> + const char *name;
> struct dentry *debugfs_entry;
> #endif
> /* objs pending delete, per node */
> @@ -92,9 +93,9 @@ struct shrinker {
> */
> #define SHRINKER_NONSLAB (1 << 3)
>
> -extern int prealloc_shrinker(struct shrinker *shrinker);
> +extern int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...);
> extern void register_shrinker_prepared(struct shrinker *shrinker);
> -extern int register_shrinker(struct shrinker *shrinker);
> +extern int register_shrinker(struct shrinker *shrinker, const char *fmt, ...);
> extern void unregister_shrinker(struct shrinker *shrinker);
> extern void free_prealloced_shrinker(struct shrinker *shrinker);
> extern void synchronize_shrinkers(void);
> @@ -102,6 +103,8 @@ extern void synchronize_shrinkers(void);
> #ifdef CONFIG_SHRINKER_DEBUG
> extern int shrinker_debugfs_add(struct shrinker *shrinker);
> extern void shrinker_debugfs_remove(struct shrinker *shrinker);
> +extern int shrinker_debugfs_rename(struct shrinker *shrinker,
> + const char *fmt, ...);
> #else /* CONFIG_SHRINKER_DEBUG */
> static inline int shrinker_debugfs_add(struct shrinker *shrinker)
> {
> @@ -110,5 +113,10 @@ static inline int shrinker_debugfs_add(struct shrinker *shrinker)
> static inline void shrinker_debugfs_remove(struct shrinker *shrinker)
> {
> }
> +static inline int shrinker_debugfs_rename(struct shrinker *shrinker,
> + const char *fmt, ...)
> +{
> + return 0;
> +}
> #endif /* CONFIG_SHRINKER_DEBUG */
> #endif /* _LINUX_SHRINKER_H */
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 5c587e00811c..b4c66916bea9 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -4978,7 +4978,7 @@ static void __init kfree_rcu_batch_init(void)
> INIT_DELAYED_WORK(&krcp->page_cache_work, fill_page_cache_func);
> krcp->initialized = true;
> }
> - if (register_shrinker(&kfree_rcu_shrinker))
> + if (register_shrinker(&kfree_rcu_shrinker, "kfree_rcu"))
> pr_err("Failed to register kfree_rcu() shrinker!\n");
> }
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index fa6a1623976a..a40df19c0e38 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -423,10 +423,10 @@ static int __init hugepage_init(void)
> if (err)
> goto err_slab;
>
> - err = register_shrinker(&huge_zero_page_shrinker);
> + err = register_shrinker(&huge_zero_page_shrinker, "thp_zero");
> if (err)
> goto err_hzp_shrinker;
> - err = register_shrinker(&deferred_split_shrinker);
> + err = register_shrinker(&deferred_split_shrinker, "thp_deferred_split");
> if (err)
> goto err_split_shrinker;
>
> diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
> index fd1f805a581a..28b1c1ab60ef 100644
> --- a/mm/shrinker_debug.c
> +++ b/mm/shrinker_debug.c
> @@ -104,7 +104,7 @@ DEFINE_SHOW_ATTRIBUTE(shrinker_debugfs_count);
> int shrinker_debugfs_add(struct shrinker *shrinker)
> {
> struct dentry *entry;
> - char buf[16];
> + char buf[128];
> int id;
>
> lockdep_assert_held(&shrinker_rwsem);
> @@ -118,7 +118,7 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
> return id;
> shrinker->debugfs_id = id;
>
> - snprintf(buf, sizeof(buf), "%d", id);
> + snprintf(buf, sizeof(buf), "%s-%d", shrinker->name, id);
>
> /* create debugfs entry */
> entry = debugfs_create_dir(buf, shrinker_debugfs_root);
> @@ -133,10 +133,51 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
> return 0;
> }
>
> +int shrinker_debugfs_rename(struct shrinker *shrinker, const char *fmt, ...)
> +{
> + struct dentry *entry;
> + char buf[128];
> + const char *old;
> + va_list ap;
> + int ret = 0;
> +
> + down_write(&shrinker_rwsem);
> +
> + old = shrinker->name;
> +
> + va_start(ap, fmt);
> + shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
> + va_end(ap);
> + if (!shrinker->name) {
> + shrinker->name = old;
> + ret = -ENOMEM;
It seems we could move those 6 lines out of shrinker_rwsem. I know
this function is not in a hot path, but it is better to improve
it if it is easy, right?
> + } else if (shrinker->debugfs_entry) {
> + snprintf(buf, sizeof(buf), "%s-%d", shrinker->name,
> + shrinker->debugfs_id);
> +
> + entry = debugfs_rename(shrinker_debugfs_root,
> + shrinker->debugfs_entry,
> + shrinker_debugfs_root, buf);
> + if (IS_ERR(entry))
> + ret = PTR_ERR(entry);
> + else
> + shrinker->debugfs_entry = entry;
> +
> + kfree_const(old);
> + }
> +
> + up_write(&shrinker_rwsem);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL(shrinker_debugfs_rename);
> +
> void shrinker_debugfs_remove(struct shrinker *shrinker)
> {
> lockdep_assert_held(&shrinker_rwsem);
>
> + kfree_const(shrinker->name);
> +
> if (!shrinker->debugfs_entry)
> return;
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 024f7056b98c..42bae0fd0442 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -613,7 +613,7 @@ static unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru,
> /*
> * Add a shrinker callback to be called from the vm.
> */
> -int prealloc_shrinker(struct shrinker *shrinker)
> +static int __prealloc_shrinker(struct shrinker *shrinker)
> {
> unsigned int size;
> int err;
> @@ -637,8 +637,36 @@ int prealloc_shrinker(struct shrinker *shrinker)
> return 0;
> }
>
> +#ifdef CONFIG_SHRINKER_DEBUG
> +int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...)
> +{
> + va_list ap;
> + int err;
> +
> + va_start(ap, fmt);
> + shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
> + va_end(ap);
> + if (!shrinker->name)
> + return -ENOMEM;
> +
> + err = __prealloc_shrinker(shrinker);
> + if (err)
> + kfree_const(shrinker->name);
> +
> + return err;
> +}
> +#else
> +int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...)
> +{
> + return __prealloc_shrinker(shrinker);
> +}
> +#endif
> +
> void free_prealloced_shrinker(struct shrinker *shrinker)
> {
> +#ifdef CONFIG_SHRINKER_DEBUG
> + kfree_const(shrinker->name);
> +#endif
> if (shrinker->flags & SHRINKER_MEMCG_AWARE) {
> down_write(&shrinker_rwsem);
> unregister_memcg_shrinker(shrinker);
> @@ -659,15 +687,39 @@ void register_shrinker_prepared(struct shrinker *shrinker)
> up_write(&shrinker_rwsem);
> }
>
> -int register_shrinker(struct shrinker *shrinker)
> +static int __register_shrinker(struct shrinker *shrinker)
> {
> - int err = prealloc_shrinker(shrinker);
> + int err = __prealloc_shrinker(shrinker);
>
> if (err)
> return err;
> register_shrinker_prepared(shrinker);
> return 0;
> }
> +
> +#ifdef CONFIG_SHRINKER_DEBUG
> +int register_shrinker(struct shrinker *shrinker, const char *fmt, ...)
> +{
> + va_list ap;
> + int err;
> +
> + va_start(ap, fmt);
> + shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
> + va_end(ap);
> + if (!shrinker->name)
> + return -ENOMEM;
> +
How about moving the initialization of the name into shrinker_debugfs_add()?
Then maybe we could hide the handling of "name" from vmscan.c.
Thanks.
> + err = __register_shrinker(shrinker);
> + if (err)
> + kfree_const(shrinker->name);
> + return err;
> +}
> +#else
> +int register_shrinker(struct shrinker *shrinker, const char *fmt, ...)
> +{
> + return __register_shrinker(shrinker);
> +}
> +#endif
> EXPORT_SYMBOL(register_shrinker);
>
> /*
> diff --git a/mm/workingset.c b/mm/workingset.c
> index 592569a8974c..840986179cf3 100644
> --- a/mm/workingset.c
> +++ b/mm/workingset.c
> @@ -625,7 +625,7 @@ static int __init workingset_init(void)
> pr_info("workingset: timestamp_bits=%d max_order=%d bucket_order=%u\n",
> timestamp_bits, max_order, bucket_order);
>
> - ret = prealloc_shrinker(&workingset_shadow_shrinker);
> + ret = prealloc_shrinker(&workingset_shadow_shrinker, "shadow");
> if (ret)
> goto err;
> ret = __list_lru_init(&shadow_nodes, true, &shadow_nodes_key,
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index 9152fbde33b5..a19de176f604 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -2188,7 +2188,7 @@ static int zs_register_shrinker(struct zs_pool *pool)
> pool->shrinker.batch = 0;
> pool->shrinker.seeks = DEFAULT_SEEKS;
>
> - return register_shrinker(&pool->shrinker);
> + return register_shrinker(&pool->shrinker, "zspool");
> }
>
> /**
> diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c
> index 682fcd24bf43..a29742a9c3f1 100644
> --- a/net/sunrpc/auth.c
> +++ b/net/sunrpc/auth.c
> @@ -874,7 +874,7 @@ int __init rpcauth_init_module(void)
> err = rpc_init_authunix();
> if (err < 0)
> goto out1;
> - err = register_shrinker(&rpc_cred_shrinker);
> + err = register_shrinker(&rpc_cred_shrinker, "rpc_cred");
> if (err < 0)
> goto out2;
> return 0;
> --
> 2.35.3
>
>
On Thu, May 19, 2022 at 10:15:04AM -0700, Roman Gushchin wrote:
> On Mon, May 09, 2022 at 11:38:14AM -0700, Roman Gushchin wrote:
> > There are 50+ different shrinkers in the kernel, many with their own bells and
> > whistles. Under the memory pressure the kernel applies some pressure on each of
> > them in the order of which they were created/registered in the system. Some
> > of them can contain only few objects, some can be quite large. Some can be
> > effective at reclaiming memory, some not.
> >
> > The only existing debugging mechanism is a couple of tracepoints in
> > do_shrink_slab(): mm_shrink_slab_start and mm_shrink_slab_end. They aren't
> > covering everything though: shrinkers which report 0 objects will never show up,
> > there is no support for memcg-aware shrinkers. Shrinkers are identified by their
> > scan function, which is not always enough (e.g. hard to guess which super
> > block's shrinker it is having only "super_cache_scan").
> >
> > To provide a better visibility and debug options for memory shrinkers
> > this patchset introduces a /sys/kernel/debug/shrinker interface, to some extent
> > similar to /sys/kernel/slab.
> >
> > For each shrinker registered in the system a directory is created.
> > As now, the directory will contain only a "scan" file, which allows to get
> > the number of managed objects for each memory cgroup (for memcg-aware shrinkers)
> > and each numa node (for numa-aware shrinkers on a numa machine). Other
> > interfaces might be added in the future.
> >
> > To make debugging more pleasant, the patchset also names all shrinkers,
> > so that debugfs entries can have meaningful names.
> >
> >
> > v3:
> > 1) separated the "scan" part into a separate patch, by Dave
> > 2) merged *_memcg, *_node and *_memcg_node interfaces, by Dave
> > 3) shrinkers naming enhancements, by Christophe and Dave
> > 4) added signal_pending() check, by Hillf
> > 5) enabled by default, by Dave
>
> Any comments? Thoughts? Objections?
I have no time available to look at this right now, and won't for a
while.
Cheers,
Dave.
--
Dave Chinner
[email protected]
On Fri, May 20, 2022 at 12:41:15PM -0400, Kent Overstreet wrote:
> On Mon, May 09, 2022 at 11:38:17AM -0700, Roman Gushchin wrote:
> > diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
> > index ad9f16689419..c1f734ab86b3 100644
> > --- a/drivers/md/bcache/btree.c
> > +++ b/drivers/md/bcache/btree.c
> > @@ -812,7 +812,7 @@ int bch_btree_cache_alloc(struct cache_set *c)
> > c->shrink.seeks = 4;
> > c->shrink.batch = c->btree_pages * 2;
> >
> > - if (register_shrinker(&c->shrink))
> > + if (register_shrinker(&c->shrink, "btree"))
> > pr_warn("bcache: %s: could not register shrinker\n",
> > __func__);
>
> These drivers need better names for their shrinkers - there will often be
> multiple instances in use on a system and we need to distinguish.
>
> Also, "btree" isn't a good name for the bcache shrinker - "bcache-%pU",
> c->set_uuid would be a good name for bcache's shrinker, it'll match up with the
> cache set directory in /sys/fs/bcache/.
Sure, will improve in the next version. Thanks!
>
> For others (device mapper, md, etc.) there should be a minor device number you
> can reference.
Good point, will think about it.
Thank you for taking a look!
On Mon, May 09, 2022 at 11:38:16AM -0700, Roman Gushchin wrote:
> This commit introduces the /sys/kernel/debug/shrinker debugfs
> interface which provides an ability to observe the state of
> individual kernel memory shrinkers.
>
> Because the feature adds some memory overhead (which shouldn't be
> large unless there is a huge amount of registered shrinkers), it's
> guarded by a config option (enabled by default).
>
> This commit introduces the "count" interface for each shrinker
> registered in the system.
>
> The output is in the following format:
Hi Roman,
Should we print a header line to show what those numbers mean? That would
make the output easier to understand.
> <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
> <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
> ...
>
> To reduce the size of output on machines with many thousands cgroups,
> if the total number of objects on all nodes is 0, the line is omitted.
>
> If the shrinker is not memcg-aware or CONFIG_MEMCG is off, 0 is
> printed as cgroup inode id. If the shrinker is not numa-aware, 0's are
> printed for all nodes except the first one.
>
> This commit gives debugfs entries simple numeric names, which are not
> very convenient. The following commit in the series will provide
> shrinkers with more meaningful names.
>
> Signed-off-by: Roman Gushchin <[email protected]>
> ---
> include/linux/shrinker.h | 19 ++++-
> lib/Kconfig.debug | 9 +++
> mm/Makefile | 1 +
> mm/shrinker_debug.c | 171 +++++++++++++++++++++++++++++++++++++++
> mm/vmscan.c | 6 +-
> 5 files changed, 203 insertions(+), 3 deletions(-)
> create mode 100644 mm/shrinker_debug.c
>
> diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
> index 76fbf92b04d9..2ced8149c513 100644
> --- a/include/linux/shrinker.h
> +++ b/include/linux/shrinker.h
> @@ -72,6 +72,10 @@ struct shrinker {
> #ifdef CONFIG_MEMCG
> /* ID in shrinker_idr */
> int id;
> +#endif
> +#ifdef CONFIG_SHRINKER_DEBUG
> + int debugfs_id;
> + struct dentry *debugfs_entry;
> #endif
> /* objs pending delete, per node */
> atomic_long_t *nr_deferred;
> @@ -94,4 +98,17 @@ extern int register_shrinker(struct shrinker *shrinker);
> extern void unregister_shrinker(struct shrinker *shrinker);
> extern void free_prealloced_shrinker(struct shrinker *shrinker);
> extern void synchronize_shrinkers(void);
> -#endif
> +
> +#ifdef CONFIG_SHRINKER_DEBUG
> +extern int shrinker_debugfs_add(struct shrinker *shrinker);
> +extern void shrinker_debugfs_remove(struct shrinker *shrinker);
> +#else /* CONFIG_SHRINKER_DEBUG */
> +static inline int shrinker_debugfs_add(struct shrinker *shrinker)
> +{
> + return 0;
> +}
> +static inline void shrinker_debugfs_remove(struct shrinker *shrinker)
> +{
> +}
> +#endif /* CONFIG_SHRINKER_DEBUG */
> +#endif /* _LINUX_SHRINKER_H */
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 3fd7a2e9eaf1..5fa65a649798 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -733,6 +733,15 @@ config SLUB_STATS
> out which slabs are relevant to a particular load.
> Try running: slabinfo -DA
>
> +config SHRINKER_DEBUG
> + default y
> + bool "Enable shrinker debugging support"
> + depends on DEBUG_FS
> + help
> + Say Y to enable the shrinker debugfs interface which provides
> + visibility into the kernel memory shrinkers subsystem.
> + Disable it to avoid an extra memory footprint.
> +
> config HAVE_DEBUG_KMEMLEAK
> bool
>
> diff --git a/mm/Makefile b/mm/Makefile
> index 298c9991ab75..8083fa85a348 100644
> --- a/mm/Makefile
> +++ b/mm/Makefile
> @@ -133,3 +133,4 @@ obj-$(CONFIG_PAGE_REPORTING) += page_reporting.o
> obj-$(CONFIG_IO_MAPPING) += io-mapping.o
> obj-$(CONFIG_HAVE_BOOTMEM_INFO_NODE) += bootmem_info.o
> obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
> +obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
> diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
> new file mode 100644
> index 000000000000..fd1f805a581a
> --- /dev/null
> +++ b/mm/shrinker_debug.c
> @@ -0,0 +1,171 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/idr.h>
> +#include <linux/slab.h>
> +#include <linux/debugfs.h>
> +#include <linux/seq_file.h>
> +#include <linux/shrinker.h>
> +#include <linux/memcontrol.h>
> +
> +/* defined in vmscan.c */
> +extern struct rw_semaphore shrinker_rwsem;
> +extern struct list_head shrinker_list;
> +
> +static DEFINE_IDA(shrinker_debugfs_ida);
> +static struct dentry *shrinker_debugfs_root;
> +
> +static unsigned long shrinker_count_objects(struct shrinker *shrinker,
> + struct mem_cgroup *memcg,
> + unsigned long *count_per_node)
> +{
> + unsigned long nr, total = 0;
> + int nid;
> +
> + for_each_node(nid) {
> + if (nid == 0 || (shrinker->flags & SHRINKER_NUMA_AWARE)) {
> + struct shrink_control sc = {
> + .gfp_mask = GFP_KERNEL,
> + .nid = nid,
> + .memcg = memcg,
> + };
> +
> + nr = shrinker->count_objects(shrinker, &sc);
> + if (nr == SHRINK_EMPTY)
> + nr = 0;
> + } else {
> + nr = 0;
For efficiency, we could break here, right?
> + }
> +
> + count_per_node[nid] = nr;
> + total += nr;
> + }
> +
> + return total;
> +}
> +
> +static int shrinker_debugfs_count_show(struct seq_file *m, void *v)
> +{
> + struct shrinker *shrinker = (struct shrinker *)m->private;
Maybe we could drop the cast, since m->private is already a void *.
> + unsigned long *count_per_node = NULL;
This does not need to be initialized, right?
> + struct mem_cgroup *memcg;
> + unsigned long total;
> + bool memcg_aware;
> + int ret, nid;
> +
> + count_per_node = kcalloc(nr_node_ids, sizeof(unsigned long), GFP_KERNEL);
> + if (!count_per_node)
> + return -ENOMEM;
> +
> + ret = down_read_killable(&shrinker_rwsem);
> + if (ret) {
> + kfree(count_per_node);
> + return ret;
> + }
> + rcu_read_lock();
> +
> + memcg_aware = shrinker->flags & SHRINKER_MEMCG_AWARE;
> +
> + memcg = mem_cgroup_iter(NULL, NULL, NULL);
> + do {
> + if (memcg && !mem_cgroup_online(memcg))
> + continue;
> +
> + total = shrinker_count_objects(shrinker,
> + memcg_aware ? memcg : NULL,
> + count_per_node);
> + if (total) {
> + seq_printf(m, "%lu", mem_cgroup_ino(memcg));
> + for_each_node(nid)
> + seq_printf(m, " %lu", count_per_node[nid]);
> + seq_puts(m, "\n");
seq_putc(m, '\n') is more efficient.
> + }
> +
> + if (!memcg_aware) {
> + mem_cgroup_iter_break(NULL, memcg);
> + break;
> + }
> +
> + if (signal_pending(current)) {
> + mem_cgroup_iter_break(NULL, memcg);
> + ret = -EINTR;
> + break;
> + }
> +
> + cond_resched();
We are inside an RCU read-side critical section here, so we cannot schedule, right?
> + } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
> +
> + rcu_read_unlock();
> + up_read(&shrinker_rwsem);
> +
> + kfree(count_per_node);
> + return ret;
> +}
> +DEFINE_SHOW_ATTRIBUTE(shrinker_debugfs_count);
> +
> +int shrinker_debugfs_add(struct shrinker *shrinker)
> +{
> + struct dentry *entry;
> + char buf[16];
> + int id;
> +
> + lockdep_assert_held(&shrinker_rwsem);
> +
> + /* debugfs isn't initialized yet, add debugfs entries later. */
> + if (!shrinker_debugfs_root)
> + return 0;
> +
> + id = ida_alloc(&shrinker_debugfs_ida, GFP_KERNEL);
> + if (id < 0)
> + return id;
> + shrinker->debugfs_id = id;
> +
> + snprintf(buf, sizeof(buf), "%d", id);
> +
> + /* create debugfs entry */
> + entry = debugfs_create_dir(buf, shrinker_debugfs_root);
> + if (IS_ERR(entry)) {
> + ida_free(&shrinker_debugfs_ida, id);
> + return PTR_ERR(entry);
> + }
> + shrinker->debugfs_entry = entry;
> +
> + debugfs_create_file("count", 0220, entry, shrinker,
> + &shrinker_debugfs_count_fops);
> + return 0;
> +}
> +
> +void shrinker_debugfs_remove(struct shrinker *shrinker)
> +{
> + lockdep_assert_held(&shrinker_rwsem);
> +
> + if (!shrinker->debugfs_entry)
> + return;
> +
> + debugfs_remove_recursive(shrinker->debugfs_entry);
> + ida_free(&shrinker_debugfs_ida, shrinker->debugfs_id);
> +}
> +
> +static int __init shrinker_debugfs_init(void)
> +{
> + struct shrinker *shrinker;
> + int ret;
> +
> + if (!debugfs_initialized())
> + return -ENODEV;
> +
Redundant check since it is checked in debugfs_create_dir().
So I think we could remove this.
> + shrinker_debugfs_root = debugfs_create_dir("shrinker", NULL);
We should use IS_ERR() to detect the error code. So the following
check is wrong.
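To illustrate the point, here is a minimal userspace sketch (not kernel code)
of the error-pointer convention. The helpers err_ptr(), is_err(), ptr_err()
and fake_create_dir() are simplified stand-ins written for this example; they
only model the idea that a failing call returns an encoded errno rather than
NULL, so a plain NULL test never fires.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace model of the kernel's error-pointer encoding. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if the pointer value falls into the top MAX_ERRNO addresses. */
static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *ptr)
{
	assert(is_err(ptr));
	return (long)(intptr_t)ptr;
}

/*
 * Hypothetical stand-in for a debugfs_create_dir()-like call:
 * on failure it returns an error pointer, never NULL.
 */
static void *fake_create_dir(bool fail)
{
	static int dir;	/* dummy object standing in for a dentry */

	return fail ? err_ptr(-ENOMEM) : (void *)&dir;
}

/* The style of check used in the patch: it only catches NULL. */
static bool null_check_detects_failure(const void *ptr)
{
	return !ptr;
}
```

With these definitions, null_check_detects_failure(fake_create_dir(true))
is false while is_err() on the same pointer is true, which is why the
"if (!shrinker_debugfs_root)" test cannot catch the failure.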
> + if (!shrinker_debugfs_root)
> + return -ENOMEM;
> +
> + /* Create debugfs entries for shrinkers registered at boot */
> + ret = down_write_killable(&shrinker_rwsem);
How could we kill this process? IIUC, late_initcall() is called
from early init process, there is no way to kill this. Right?
If yes, I think we could just use down_write().
Thanks.
> + if (ret)
> + return ret;
> +
> + list_for_each_entry(shrinker, &shrinker_list, list)
> + if (!shrinker->debugfs_entry)
> + ret = shrinker_debugfs_add(shrinker);
> + up_write(&shrinker_rwsem);
> +
> + return ret;
> +}
> +late_initcall(shrinker_debugfs_init);
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c6918fff06e1..024f7056b98c 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -190,8 +190,8 @@ static void set_task_reclaim_state(struct task_struct *task,
> task->reclaim_state = rs;
> }
>
> -static LIST_HEAD(shrinker_list);
> -static DECLARE_RWSEM(shrinker_rwsem);
> +LIST_HEAD(shrinker_list);
> +DECLARE_RWSEM(shrinker_rwsem);
>
> #ifdef CONFIG_MEMCG
> static int shrinker_nr_max;
> @@ -655,6 +655,7 @@ void register_shrinker_prepared(struct shrinker *shrinker)
> down_write(&shrinker_rwsem);
> list_add_tail(&shrinker->list, &shrinker_list);
> shrinker->flags |= SHRINKER_REGISTERED;
> + WARN_ON_ONCE(shrinker_debugfs_add(shrinker));
> up_write(&shrinker_rwsem);
> }
>
> @@ -682,6 +683,7 @@ void unregister_shrinker(struct shrinker *shrinker)
> shrinker->flags &= ~SHRINKER_REGISTERED;
> if (shrinker->flags & SHRINKER_MEMCG_AWARE)
> unregister_memcg_shrinker(shrinker);
> + shrinker_debugfs_remove(shrinker);
> up_write(&shrinker_rwsem);
>
> kfree(shrinker->nr_deferred);
> --
> 2.35.3
>
>
On Mon, May 09, 2022 at 11:38:15AM -0700, Roman Gushchin wrote:
> Shrinker debugfs requires a way to represent memory cgroups without
> using full paths, both for displaying information and getting input
> from a user.
>
> Cgroup inode number is a perfect way, already used by bpf.
>
> This commit adds a couple of helper functions which will be used
> to handle memcg-aware shrinkers.
>
> Signed-off-by: Roman Gushchin <[email protected]>
> ---
> include/linux/memcontrol.h | 21 +++++++++++++++++++++
> mm/memcontrol.c | 23 +++++++++++++++++++++++
> 2 files changed, 44 insertions(+)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index fe580cb96683..a6de9e5c1549 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -831,6 +831,15 @@ static inline unsigned short mem_cgroup_id(struct mem_cgroup *memcg)
> }
> struct mem_cgroup *mem_cgroup_from_id(unsigned short id);
>
> +#ifdef CONFIG_SHRINKER_DEBUG
> +static inline unsigned long mem_cgroup_ino(struct mem_cgroup *memcg)
> +{
> + return memcg ? cgroup_ino(memcg->css.cgroup) : 0;
> +}
> +
> +struct mem_cgroup *mem_cgroup_get_from_ino(unsigned long ino);
> +#endif
> +
> static inline struct mem_cgroup *mem_cgroup_from_seq(struct seq_file *m)
> {
> return mem_cgroup_from_css(seq_css(m));
> @@ -1324,6 +1333,18 @@ static inline struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
> return NULL;
> }
>
> +#ifdef CONFIG_SHRINKER_DEBUG
> +static inline unsigned long mem_cgroup_ino(struct mem_cgroup *memcg)
> +{
> + return 0;
> +}
> +
> +static inline struct mem_cgroup *mem_cgroup_get_from_ino(unsigned long ino)
> +{
> + return NULL;
> +}
> +#endif
> +
> static inline struct mem_cgroup *mem_cgroup_from_seq(struct seq_file *m)
> {
> return NULL;
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 04cea4fa362a..e6472728fa66 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5018,6 +5018,29 @@ struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
> return idr_find(&mem_cgroup_idr, id);
> }
>
> +#ifdef CONFIG_SHRINKER_DEBUG
> +struct mem_cgroup *mem_cgroup_get_from_ino(unsigned long ino)
> +{
> + struct cgroup *cgrp;
> + struct cgroup_subsys_state *css;
> + struct mem_cgroup *memcg;
> +
> + cgrp = cgroup_get_from_id(ino);
> + if (!cgrp)
> + return ERR_PTR(-ENOENT);
> +
> + css = cgroup_get_e_css(cgrp, &memory_cgrp_subsys);
> + if (css)
> + memcg = container_of(css, struct mem_cgroup, css);
> + else
> + memcg = ERR_PTR(-ENOENT);
> +
> + cgroup_put(cgrp);
I think it's better to use css_put() here, since the reference is taken
via cgroup_get_e_css(), which returns a css struct.
Thanks.
> +
> + return memcg;
> +}
> +#endif
> +
> static int alloc_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
> {
> struct mem_cgroup_per_node *pn;
> --
> 2.35.3
>
>
On Mon, May 09, 2022 at 11:38:17AM -0700, Roman Gushchin wrote:
> Currently shrinkers are anonymous objects. For debugging purposes they
> can be identified by count/scan function names, but it's not always
> useful: e.g. for superblock's shrinkers it's nice to have at least
> an idea of to which superblock the shrinker belongs.
>
> This commit adds names to shrinkers. register_shrinker() and
> prealloc_shrinker() functions are extended to take a format and
> arguments to master a name.
>
> In some cases it's not possible to determine a good name at the time
> when a shrinker is allocated. For such cases shrinker_debugfs_rename()
> is provided.
>
> After this change the shrinker debugfs directory looks like:
> $ cd /sys/kernel/debug/shrinker/
> $ ls
> dqcache-16 sb-hugetlbfs-17 sb-rootfs-2 sb-tmpfs-50
> kfree_rcu-0 sb-hugetlbfs-33 sb-securityfs-6 sb-tracefs-13
> sb-aio-20 sb-iomem-12 sb-selinuxfs-22 sb-xfs:vda1-36
> sb-anon_inodefs-15 sb-mqueue-21 sb-sockfs-8 sb-zsmalloc-19
> sb-bdev-3 sb-nsfs-4 sb-sysfs-26 shadow-18
> sb-bpf-32 sb-pipefs-14 sb-tmpfs-1 thp_deferred_split-10
> sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-27 thp_zero-9
> sb-cgroup2-30 sb-proc-39 sb-tmpfs-29 xfs_buf-37
> sb-configfs-23 sb-proc-41 sb-tmpfs-35 xfs_inodegc-38
^^^^^^^^^^^^^^
These XFS shrinkers are also per-block device, like the superblock.
They need to read like "sb-xfs:vda1-36". And even though it is not
in this list, the xfs dquot shrinker will need this as well.
> sb-dax-11 sb-proc-45 sb-tmpfs-40 zspool-34
> sb-debugfs-7 sb-proc-46 sb-tmpfs-42
> sb-devpts-28 sb-proc-47 sb-tmpfs-43
> sb-devtmpfs-5 sb-pstore-31 sb-tmpfs-44
The proc and tmpfs shrinkers have the same problem - what instance
do they actually refer to?
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index e1afb9e503e1..5645e92df0c9 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -1986,7 +1986,7 @@ xfs_alloc_buftarg(
> btp->bt_shrinker.scan_objects = xfs_buftarg_shrink_scan;
> btp->bt_shrinker.seeks = DEFAULT_SEEKS;
> btp->bt_shrinker.flags = SHRINKER_NUMA_AWARE;
> - if (register_shrinker(&btp->bt_shrinker))
> + if (register_shrinker(&btp->bt_shrinker, "xfs_buf"))
> goto error_pcpu;
> return btp;
>
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index bffd6eb0b298..d0c4e74ff763 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -2198,5 +2198,5 @@ xfs_inodegc_register_shrinker(
> shrink->flags = SHRINKER_NONSLAB;
> shrink->batch = XFS_INODEGC_SHRINKER_BATCH;
>
> - return register_shrinker(shrink);
> + return register_shrinker(shrink, "xfs_inodegc");
> }
> diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> index f165d1a3de1d..93ded9e81f49 100644
> --- a/fs/xfs/xfs_qm.c
> +++ b/fs/xfs/xfs_qm.c
> @@ -686,7 +686,7 @@ xfs_qm_init_quotainfo(
> qinf->qi_shrinker.seeks = DEFAULT_SEEKS;
> qinf->qi_shrinker.flags = SHRINKER_NUMA_AWARE;
>
> - error = register_shrinker(&qinf->qi_shrinker);
> + error = register_shrinker(&qinf->qi_shrinker, "xfs_qm");
> if (error)
> goto out_free_inos;
Yeah, these all have a xfs_mount passed to them, so the superblock is
easily accessible here (mp->m_super)....
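As a rough userspace sketch of what per-instance naming through the new
fmt/... arguments could look like: format_shrinker_name() below is a
hypothetical helper written for this example, and the "xfs_buf:%s" pattern
and "vda1" device id are assumptions, not names taken from the patch. It only
shows the variadic formatting step, with truncation treated as an error.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace analogue of building a shrinker name from a
 * printf-style format. Returns 0 on success, -1 on error or truncation.
 */
static int format_shrinker_name(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf && len > 0 && fmt);

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);

	/* vsnprintf() returns the untruncated length; reject if it did not fit. */
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

Called as format_shrinker_name(name, sizeof(name), "xfs_buf:%s", "vda1"),
it leaves "xfs_buf:vda1" in name; in the kernel the device id would come
from the superblock rather than a string literal.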
Cheers,
Dave.
--
Dave Chinner
[email protected]
Le 09/05/2022 à 20:38, Roman Gushchin a écrit :
> This commit introduces the /sys/kernel/debug/shrinker debugfs
> interface which provides an ability to observe the state of
> individual kernel memory shrinkers.
>
> Because the feature adds some memory overhead (which shouldn't be
> large unless there is a huge amount of registered shrinkers), it's
> guarded by a config option (enabled by default).
>
> This commit introduces the "count" interface for each shrinker
> registered in the system.
>
> The output is in the following format:
> <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
> <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
> ...
>
> To reduce the size of output on machines with many thousands cgroups,
> if the total number of objects on all nodes is 0, the line is omitted.
>
> If the shrinker is not memcg-aware or CONFIG_MEMCG is off, 0 is
> printed as cgroup inode id. If the shrinker is not numa-aware, 0's are
> printed for all nodes except the first one.
>
> This commit gives debugfs entries simple numeric names, which are not
> very convenient. The following commit in the series will provide
> shrinkers with more meaningful names.
>
> Signed-off-by: Roman Gushchin <[email protected]>
> ---
> include/linux/shrinker.h | 19 ++++-
> lib/Kconfig.debug | 9 +++
> mm/Makefile | 1 +
> mm/shrinker_debug.c | 171 +++++++++++++++++++++++++++++++++++++++
> mm/vmscan.c | 6 +-
> 5 files changed, 203 insertions(+), 3 deletions(-)
> create mode 100644 mm/shrinker_debug.c
>
> diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
> index 76fbf92b04d9..2ced8149c513 100644
> --- a/include/linux/shrinker.h
> +++ b/include/linux/shrinker.h
> @@ -72,6 +72,10 @@ struct shrinker {
> #ifdef CONFIG_MEMCG
> /* ID in shrinker_idr */
> int id;
> +#endif
> +#ifdef CONFIG_SHRINKER_DEBUG
> + int debugfs_id;
> + struct dentry *debugfs_entry;
> #endif
> /* objs pending delete, per node */
> atomic_long_t *nr_deferred;
> @@ -94,4 +98,17 @@ extern int register_shrinker(struct shrinker *shrinker);
> extern void unregister_shrinker(struct shrinker *shrinker);
> extern void free_prealloced_shrinker(struct shrinker *shrinker);
> extern void synchronize_shrinkers(void);
> -#endif
> +
> +#ifdef CONFIG_SHRINKER_DEBUG
> +extern int shrinker_debugfs_add(struct shrinker *shrinker);
> +extern void shrinker_debugfs_remove(struct shrinker *shrinker);
> +#else /* CONFIG_SHRINKER_DEBUG */
> +static inline int shrinker_debugfs_add(struct shrinker *shrinker)
> +{
> + return 0;
> +}
> +static inline void shrinker_debugfs_remove(struct shrinker *shrinker)
> +{
> +}
> +#endif /* CONFIG_SHRINKER_DEBUG */
> +#endif /* _LINUX_SHRINKER_H */
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 3fd7a2e9eaf1..5fa65a649798 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -733,6 +733,15 @@ config SLUB_STATS
> out which slabs are relevant to a particular load.
> Try running: slabinfo -DA
>
> +config SHRINKER_DEBUG
> + default y
The previous version of the series had default 'n'.
Is it intentional to have it enabled by default now? It looked more
like a tuning feature, for when fine-grained management of shrinkers is
needed.
> + bool "Enable shrinker debugging support"
> + depends on DEBUG_FS
> + help
> + Say Y to enable the shrinker debugfs interface which provides
> + visibility into the kernel memory shrinkers subsystem.
> + Disable it to avoid an extra memory footprint.
> +
[...]
On Sun, May 22, 2022 at 03:05:33PM +0800, Muchun Song wrote:
> On Mon, May 09, 2022 at 11:38:15AM -0700, Roman Gushchin wrote:
> > Shrinker debugfs requires a way to represent memory cgroups without
> > using full paths, both for displaying information and getting input
> > from a user.
> >
> > Cgroup inode number is a perfect way, already used by bpf.
> >
> > This commit adds a couple of helper functions which will be used
> > to handle memcg-aware shrinkers.
> >
> > Signed-off-by: Roman Gushchin <[email protected]>
> > ---
> > include/linux/memcontrol.h | 21 +++++++++++++++++++++
> > mm/memcontrol.c | 23 +++++++++++++++++++++++
> > 2 files changed, 44 insertions(+)
> >
> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > index fe580cb96683..a6de9e5c1549 100644
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -831,6 +831,15 @@ static inline unsigned short mem_cgroup_id(struct mem_cgroup *memcg)
> > }
> > struct mem_cgroup *mem_cgroup_from_id(unsigned short id);
> >
> > +#ifdef CONFIG_SHRINKER_DEBUG
> > +static inline unsigned long mem_cgroup_ino(struct mem_cgroup *memcg)
> > +{
> > + return memcg ? cgroup_ino(memcg->css.cgroup) : 0;
> > +}
> > +
> > +struct mem_cgroup *mem_cgroup_get_from_ino(unsigned long ino);
> > +#endif
> > +
> > static inline struct mem_cgroup *mem_cgroup_from_seq(struct seq_file *m)
> > {
> > return mem_cgroup_from_css(seq_css(m));
> > @@ -1324,6 +1333,18 @@ static inline struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
> > return NULL;
> > }
> >
> > +#ifdef CONFIG_SHRINKER_DEBUG
> > +static inline unsigned long mem_cgroup_ino(struct mem_cgroup *memcg)
> > +{
> > + return 0;
> > +}
> > +
> > +static inline struct mem_cgroup *mem_cgroup_get_from_ino(unsigned long ino)
> > +{
> > + return NULL;
> > +}
> > +#endif
> > +
> > static inline struct mem_cgroup *mem_cgroup_from_seq(struct seq_file *m)
> > {
> > return NULL;
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 04cea4fa362a..e6472728fa66 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -5018,6 +5018,29 @@ struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
> > return idr_find(&mem_cgroup_idr, id);
> > }
> >
> > +#ifdef CONFIG_SHRINKER_DEBUG
> > +struct mem_cgroup *mem_cgroup_get_from_ino(unsigned long ino)
> > +{
> > + struct cgroup *cgrp;
> > + struct cgroup_subsys_state *css;
> > + struct mem_cgroup *memcg;
> > +
> > + cgrp = cgroup_get_from_id(ino);
> > + if (!cgrp)
> > + return ERR_PTR(-ENOENT);
> > +
> > + css = cgroup_get_e_css(cgrp, &memory_cgrp_subsys);
> > + if (css)
> > + memcg = container_of(css, struct mem_cgroup, css);
> > + else
> > + memcg = ERR_PTR(-ENOENT);
> > +
> > + cgroup_put(cgrp);
>
> I think it's better to use css_put() here, since the reference is taken
> via cgroup_get_e_css(), which returns a css struct.
cgroup_put() matches cgroup_get_from_id().
The reference grabbed by cgroup_get_e_css() shouldn't be dropped,
because mem_cgroup_get_from_ino() has "get" semantics: the caller receives
the memcg with an elevated refcount and is responsible for dropping it.
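A toy userspace model of that pairing, in case it helps. The toy_* types and
functions are invented for this sketch and only mimic the shape of the
function under discussion: one reference is taken for the lookup and dropped
before returning, the other is handed to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Toy refcounted objects; illustrative only, not kernel types. */
struct toy_cgroup { int refs; };
struct toy_css { int refs; };

static void toy_cgroup_get(struct toy_cgroup *cgrp)
{
	cgrp->refs++;
}

static void toy_cgroup_put(struct toy_cgroup *cgrp)
{
	assert(cgrp->refs > 0);
	cgrp->refs--;
}

static struct toy_css *toy_css_get(struct toy_css *css)
{
	css->refs++;
	return css;
}

static void toy_css_put(struct toy_css *css)
{
	assert(css->refs > 0);
	css->refs--;
}

/*
 * Lookup with "get" semantics: the cgroup reference is temporary and
 * is released here, the css reference is returned to the caller.
 */
static struct toy_css *toy_lookup_get(struct toy_cgroup *cgrp,
				      struct toy_css *css)
{
	struct toy_css *ret;

	toy_cgroup_get(cgrp);	/* reference taken by the lookup */
	ret = toy_css_get(css);	/* reference handed to the caller */
	toy_cgroup_put(cgrp);	/* pairs with the lookup reference */

	return ret;
}
```

After toy_lookup_get() the cgroup count is back where it started and the css
count is one higher until the caller invokes toy_css_put().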
Thanks!
On Sun, May 22, 2022 at 06:36:56PM +0800, Muchun Song wrote:
> On Mon, May 09, 2022 at 11:38:16AM -0700, Roman Gushchin wrote:
> > This commit introduces the /sys/kernel/debug/shrinker debugfs
> > interface which provides an ability to observe the state of
> > individual kernel memory shrinkers.
> >
> > Because the feature adds some memory overhead (which shouldn't be
> > large unless there is a huge amount of registered shrinkers), it's
> > guarded by a config option (enabled by default).
> >
> > This commit introduces the "count" interface for each shrinker
> > registered in the system.
> >
> > The output is in the following format:
>
> Hi Roman,
Hi Muchun!
Thank you for taking a look!
>
> Should we print a header line to show what those numbers mean? That would
> make the output easier to understand.
No, I don't think so: this interface is not supposed to be used by
an average user and those who will be using it can refer to the provided
documentation. Printing the header each time will add some overhead for
no good reason.
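For what it's worth, a tool consuming the header-less lines needs very little
code. The following is a minimal userspace sketch, assuming the
"<cgroup inode id> <count on node 0> <count on node 1>..." layout from the
commit message; parse_count_line() is a hypothetical helper written for this
example, not part of the patchset.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse one line of the form "<cgroup inode id> <n0> <n1> ...".
 * Stores the inode id in *ino and up to max_nodes counts in counts[].
 * Returns the number of per-node counts parsed, or -1 on malformed input.
 */
static int parse_count_line(const char *line, unsigned long *ino,
			    unsigned long *counts, int max_nodes)
{
	char *end;
	int n = 0;

	assert(line && ino && counts);

	errno = 0;
	*ino = strtoul(line, &end, 10);
	if (end == line || errno)
		return -1;

	for (;;) {
		const char *start = end;
		unsigned long val = strtoul(start, &end, 10);

		if (end == start)
			break;		/* no further number on this line */
		if (errno || n == max_nodes)
			return -1;	/* overflow or more nodes than expected */
		counts[n++] = val;
	}

	/* Only trailing whitespace may remain after the last number. */
	while (*end == ' ' || *end == '\n')
		end++;
	if (*end != '\0')
		return -1;

	return n;
}
```

For the input "42 10 0\n" this yields inode id 42 with per-node counts 10 and
0; lines for cgroups with no objects are simply absent from the file, so the
caller treats a missing id as zero.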
> > <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
> > <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
> > ...
> >
> > To reduce the size of output on machines with many thousands cgroups,
> > if the total number of objects on all nodes is 0, the line is omitted.
> >
> > If the shrinker is not memcg-aware or CONFIG_MEMCG is off, 0 is
> > printed as cgroup inode id. If the shrinker is not numa-aware, 0's are
> > printed for all nodes except the first one.
> >
> > This commit gives debugfs entries simple numeric names, which are not
> > very convenient. The following commit in the series will provide
> > shrinkers with more meaningful names.
> >
> > Signed-off-by: Roman Gushchin <[email protected]>
> > ---
> > include/linux/shrinker.h | 19 ++++-
> > lib/Kconfig.debug | 9 +++
> > mm/Makefile | 1 +
> > mm/shrinker_debug.c | 171 +++++++++++++++++++++++++++++++++++++++
> > mm/vmscan.c | 6 +-
> > 5 files changed, 203 insertions(+), 3 deletions(-)
> > create mode 100644 mm/shrinker_debug.c
> >
> > diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
> > index 76fbf92b04d9..2ced8149c513 100644
> > --- a/include/linux/shrinker.h
> > +++ b/include/linux/shrinker.h
> > @@ -72,6 +72,10 @@ struct shrinker {
> > #ifdef CONFIG_MEMCG
> > /* ID in shrinker_idr */
> > int id;
> > +#endif
> > +#ifdef CONFIG_SHRINKER_DEBUG
> > + int debugfs_id;
> > + struct dentry *debugfs_entry;
> > #endif
> > /* objs pending delete, per node */
> > atomic_long_t *nr_deferred;
> > @@ -94,4 +98,17 @@ extern int register_shrinker(struct shrinker *shrinker);
> > extern void unregister_shrinker(struct shrinker *shrinker);
> > extern void free_prealloced_shrinker(struct shrinker *shrinker);
> > extern void synchronize_shrinkers(void);
> > -#endif
> > +
> > +#ifdef CONFIG_SHRINKER_DEBUG
> > +extern int shrinker_debugfs_add(struct shrinker *shrinker);
> > +extern void shrinker_debugfs_remove(struct shrinker *shrinker);
> > +#else /* CONFIG_SHRINKER_DEBUG */
> > +static inline int shrinker_debugfs_add(struct shrinker *shrinker)
> > +{
> > + return 0;
> > +}
> > +static inline void shrinker_debugfs_remove(struct shrinker *shrinker)
> > +{
> > +}
> > +#endif /* CONFIG_SHRINKER_DEBUG */
> > +#endif /* _LINUX_SHRINKER_H */
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index 3fd7a2e9eaf1..5fa65a649798 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -733,6 +733,15 @@ config SLUB_STATS
> > out which slabs are relevant to a particular load.
> > Try running: slabinfo -DA
> >
> > +config SHRINKER_DEBUG
> > + default y
> > + bool "Enable shrinker debugging support"
> > + depends on DEBUG_FS
> > + help
> > + Say Y to enable the shrinker debugfs interface which provides
> > + visibility into the kernel memory shrinkers subsystem.
> > + Disable it to avoid an extra memory footprint.
> > +
> > config HAVE_DEBUG_KMEMLEAK
> > bool
> >
> > diff --git a/mm/Makefile b/mm/Makefile
> > index 298c9991ab75..8083fa85a348 100644
> > --- a/mm/Makefile
> > +++ b/mm/Makefile
> > @@ -133,3 +133,4 @@ obj-$(CONFIG_PAGE_REPORTING) += page_reporting.o
> > obj-$(CONFIG_IO_MAPPING) += io-mapping.o
> > obj-$(CONFIG_HAVE_BOOTMEM_INFO_NODE) += bootmem_info.o
> > obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
> > +obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
> > diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
> > new file mode 100644
> > index 000000000000..fd1f805a581a
> > --- /dev/null
> > +++ b/mm/shrinker_debug.c
> > @@ -0,0 +1,171 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <linux/idr.h>
> > +#include <linux/slab.h>
> > +#include <linux/debugfs.h>
> > +#include <linux/seq_file.h>
> > +#include <linux/shrinker.h>
> > +#include <linux/memcontrol.h>
> > +
> > +/* defined in vmscan.c */
> > +extern struct rw_semaphore shrinker_rwsem;
> > +extern struct list_head shrinker_list;
> > +
> > +static DEFINE_IDA(shrinker_debugfs_ida);
> > +static struct dentry *shrinker_debugfs_root;
> > +
> > +static unsigned long shrinker_count_objects(struct shrinker *shrinker,
> > + struct mem_cgroup *memcg,
> > + unsigned long *count_per_node)
> > +{
> > + unsigned long nr, total = 0;
> > + int nid;
> > +
> > + for_each_node(nid) {
> > + if (nid == 0 || (shrinker->flags & SHRINKER_NUMA_AWARE)) {
> > + struct shrink_control sc = {
> > + .gfp_mask = GFP_KERNEL,
> > + .nid = nid,
> > + .memcg = memcg,
> > + };
> > +
> > + nr = shrinker->count_objects(shrinker, &sc);
> > + if (nr == SHRINK_EMPTY)
> > + nr = 0;
> > + } else {
> > + nr = 0;
>
> For efficiency, we could break here, right?
Not really, we need to fill count_per_node[] with zeros.
>
> > + }
> > +
> > + count_per_node[nid] = nr;
> > + total += nr;
> > + }
> > +
> > + return total;
> > +}
> > +
> > +static int shrinker_debugfs_count_show(struct seq_file *m, void *v)
> > +{
> > + struct shrinker *shrinker = (struct shrinker *)m->private;
>
> Maybe we could drop the cast, since m->private is already a void *.
Ok.
>
> > + unsigned long *count_per_node = NULL;
>
> This does not need to be initialized, right?
Right, will fix in v4.
>
> > + struct mem_cgroup *memcg;
> > + unsigned long total;
> > + bool memcg_aware;
> > + int ret, nid;
> > +
> > + count_per_node = kcalloc(nr_node_ids, sizeof(unsigned long), GFP_KERNEL);
> > + if (!count_per_node)
> > + return -ENOMEM;
> > +
> > + ret = down_read_killable(&shrinker_rwsem);
> > + if (ret) {
> > + kfree(count_per_node);
> > + return ret;
> > + }
> > + rcu_read_lock();
> > +
> > + memcg_aware = shrinker->flags & SHRINKER_MEMCG_AWARE;
> > +
> > + memcg = mem_cgroup_iter(NULL, NULL, NULL);
> > + do {
> > + if (memcg && !mem_cgroup_online(memcg))
> > + continue;
> > +
> > + total = shrinker_count_objects(shrinker,
> > + memcg_aware ? memcg : NULL,
> > + count_per_node);
> > + if (total) {
> > + seq_printf(m, "%lu", mem_cgroup_ino(memcg));
> > + for_each_node(nid)
> > + seq_printf(m, " %lu", count_per_node[nid]);
> > + seq_puts(m, "\n");
>
> seq_putc(m, '\n') is more efficient.
Ok.
>
> > + }
> > +
> > + if (!memcg_aware) {
> > + mem_cgroup_iter_break(NULL, memcg);
> > + break;
> > + }
> > +
> > + if (signal_pending(current)) {
> > + mem_cgroup_iter_break(NULL, memcg);
> > + ret = -EINTR;
> > + break;
> > + }
> > +
> > + cond_resched();
>
> We are in rcu read lock, cannot be scheduled, right?
This is a good one, thanks. Fixed.
>
> > + } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
> > +
> > + rcu_read_unlock();
> > + up_read(&shrinker_rwsem);
> > +
> > + kfree(count_per_node);
> > + return ret;
> > +}
> > +DEFINE_SHOW_ATTRIBUTE(shrinker_debugfs_count);
> > +
> > +int shrinker_debugfs_add(struct shrinker *shrinker)
> > +{
> > + struct dentry *entry;
> > + char buf[16];
> > + int id;
> > +
> > + lockdep_assert_held(&shrinker_rwsem);
> > +
> > + /* debugfs isn't initialized yet, add debugfs entries later. */
> > + if (!shrinker_debugfs_root)
> > + return 0;
> > +
> > + id = ida_alloc(&shrinker_debugfs_ida, GFP_KERNEL);
> > + if (id < 0)
> > + return id;
> > + shrinker->debugfs_id = id;
> > +
> > + snprintf(buf, sizeof(buf), "%d", id);
> > +
> > + /* create debugfs entry */
> > + entry = debugfs_create_dir(buf, shrinker_debugfs_root);
> > + if (IS_ERR(entry)) {
> > + ida_free(&shrinker_debugfs_ida, id);
> > + return PTR_ERR(entry);
> > + }
> > + shrinker->debugfs_entry = entry;
> > +
> > + debugfs_create_file("count", 0220, entry, shrinker,
> > + &shrinker_debugfs_count_fops);
> > + return 0;
> > +}
> > +
> > +void shrinker_debugfs_remove(struct shrinker *shrinker)
> > +{
> > + lockdep_assert_held(&shrinker_rwsem);
> > +
> > + if (!shrinker->debugfs_entry)
> > + return;
> > +
> > + debugfs_remove_recursive(shrinker->debugfs_entry);
> > + ida_free(&shrinker_debugfs_ida, shrinker->debugfs_id);
> > +}
> > +
> > +static int __init shrinker_debugfs_init(void)
> > +{
> > + struct shrinker *shrinker;
> > + int ret;
> > +
> > + if (!debugfs_initialized())
> > + return -ENODEV;
> > +
>
> Redundant check since it is checked in debugfs_create_dir().
> So I think we could remove this.
>
> > + shrinker_debugfs_root = debugfs_create_dir("shrinker", NULL);
>
> We should use IS_ERR() to detect the error code. So the following
> check is wrong.
Right, will fix in the next version.
>
> > + if (!shrinker_debugfs_root)
> > + return -ENOMEM;
> > +
> > + /* Create debugfs entries for shrinkers registered at boot */
> > + ret = down_write_killable(&shrinker_rwsem);
>
> How could we kill this process? IIUC, late_initcall() is called
> from early init process, there is no way to kill this. Right?
> If yes, I think we could just use down_write().
Ok, agree.
Thanks!
On Sun, May 22, 2022 at 07:35:59PM +0800, Muchun Song wrote:
> On Mon, May 09, 2022 at 11:38:20AM -0700, Roman Gushchin wrote:
> > Add a scan interface which allows to trigger scanning of a particular
> > shrinker and specify memcg and numa node. It's useful for testing,
> > debugging and profiling of a specific scan_objects() callback.
> > Unlike alternatives (creating a real memory pressure and dropping
> > caches via /proc/sys/vm/drop_caches) this interface allows to interact
> > with only one shrinker at once. Also, if a shrinker is misreporting
> > the number of objects (as some do), it doesn't affect scanning.
> >
> > Signed-off-by: Roman Gushchin <[email protected]>
> > ---
> > .../admin-guide/mm/shrinker_debugfs.rst | 39 +++++++++-
> > mm/shrinker_debug.c | 73 +++++++++++++++++++
> > 2 files changed, 108 insertions(+), 4 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/mm/shrinker_debugfs.rst b/Documentation/admin-guide/mm/shrinker_debugfs.rst
> > index 6783f8190e63..8fecf81d60ee 100644
> > --- a/Documentation/admin-guide/mm/shrinker_debugfs.rst
> > +++ b/Documentation/admin-guide/mm/shrinker_debugfs.rst
> > @@ -5,14 +5,16 @@ Shrinker Debugfs Interface
> > ==========================
> >
> > Shrinker debugfs interface provides a visibility into the kernel memory
> > -shrinkers subsystem and allows to get information about individual shrinkers.
> > +shrinkers subsystem and allows to get information about individual shrinkers
> > +and interact with them.
> >
> > For each shrinker registered in the system a directory in **<debugfs>/shrinker/**
> > is created. The directory's name is composed from the shrinker's name and an
> > unique id: e.g. *kfree_rcu-0* or *sb-xfs:vda1-36*.
> >
> > -Each shrinker directory contains the **count** file, which allows to trigger
> > -the *count_objects()* callback for each memcg and numa node (if applicable).
> > +Each shrinker directory contains **count** and **scan** files, which allow to
> > +trigger *count_objects()* and *scan_objects()* callbacks for each memcg and
> > +numa node (if applicable).
> >
> > Usage:
> > ------
> > @@ -43,7 +45,7 @@ Usage:
> >
> > $ cd sb-btrfs\:vda2-24/
> > $ ls
> > - count
> > + count scan
> >
> > 3. *Count objects*
> >
> > @@ -98,3 +100,32 @@ Usage:
> > 2877 84 0
> > 293 1 0
> > 735 8 0
> > +
> > +4. *Scan objects*
> > +
> > + The expected input format::
> > +
> > + <cgroup inode id> <numa id> <number of objects to scan>
> > +
> > + For a non-memcg-aware shrinker or on a system with no memory
> > + cgroups **0** should be passed as cgroup id.
> > + ::
> > +
> > + $ cd /sys/kernel/debug/shrinker/
> > + $ cd sb-btrfs\:vda2-24/
> > +
> > + $ cat count | head -n 5
> > + 1 212 0
> > + 21 97 0
> > + 55 802 5
> > + 2367 2 0
> > + 225 13 0
> > +
> > + $ echo "55 0 200" > scan
> > +
> > + $ cat count | head -n 5
> > + 1 212 0
> > + 21 96 0
> > + 55 752 5
> > + 2367 2 0
> > + 225 13 0
> > diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
> > index 28b1c1ab60ef..8f67fef5a643 100644
> > --- a/mm/shrinker_debug.c
> > +++ b/mm/shrinker_debug.c
> > @@ -101,6 +101,77 @@ static int shrinker_debugfs_count_show(struct seq_file *m, void *v)
> > }
> > DEFINE_SHOW_ATTRIBUTE(shrinker_debugfs_count);
> >
> > +static int shrinker_debugfs_scan_open(struct inode *inode, struct file *file)
> > +{
> > + file->private_data = inode->i_private;
> > + return nonseekable_open(inode, file);
> > +}
> > +
> > +static ssize_t shrinker_debugfs_scan_write(struct file *file,
> > + const char __user *buf,
> > + size_t size, loff_t *pos)
> > +{
> > + struct shrinker *shrinker = (struct shrinker *)file->private_data;
>
> Seems we could drop the cast since ->private_data is void * type.
Yep, fixed. Thanks!
>
> > + unsigned long nr_to_scan = 0, ino;
> > + struct shrink_control sc = {
> > + .gfp_mask = GFP_KERNEL,
> > + };
> > + struct mem_cgroup *memcg = NULL;
> > + int nid;
> > + char kbuf[72];
> > + int read_len = size < (sizeof(kbuf) - 1) ? size : (sizeof(kbuf) - 1);
> > + ssize_t ret;
> > +
> > + if (copy_from_user(kbuf, buf, read_len))
> > + return -EFAULT;
> > + kbuf[read_len] = '\0';
> > +
> > + if (sscanf(kbuf, "%lu %d %lu", &ino, &nid, &nr_to_scan) < 2)
> > + return -EINVAL;
> > +
> > + if (nid < 0 || nid >= nr_node_ids)
> > + return -EINVAL;
> > +
>
> Should we break here if nr_to_scan is zero?
Not a very likely scenario, but ok.
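
For reference, the input handling with that early exit could look roughly
like the userspace sketch below. parse_scan_request() and struct
scan_request are hypothetical names used only for illustration, and
treating a zero count as "nothing to do" (rather than an error) is an
assumption, not necessarily what the next version will do.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical container for "<cgroup inode id> <numa id> <nr to scan>". */
struct scan_request {
	unsigned long ino;
	int nid;
	unsigned long nr_to_scan;
};

/*
 * Returns 0 when there is work to do, 1 when the request is valid but
 * asks for zero objects, and -EINVAL on malformed or out-of-range input.
 * As in the patch, the third field is optional and defaults to 0.
 */
static int parse_scan_request(const char *buf, int nr_node_ids,
			      struct scan_request *req)
{
	assert(buf && req);

	req->nr_to_scan = 0;
	if (sscanf(buf, "%lu %d %lu", &req->ino, &req->nid,
		   &req->nr_to_scan) < 2)
		return -EINVAL;

	if (req->nid < 0 || req->nid >= nr_node_ids)
		return -EINVAL;

	/* Early exit: no point in taking locks to scan zero objects. */
	return req->nr_to_scan == 0 ? 1 : 0;
}
```

With this shape the caller can skip the memcg lookup and the
shrinker_rwsem entirely when the helper returns 1.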
>
> > + if (shrinker->flags & SHRINKER_MEMCG_AWARE) {
> > + memcg = mem_cgroup_get_from_ino(ino);
> > + if (!memcg || IS_ERR(memcg))
>
> Should we drop the check of "!memcg" since mem_cgroup_get_from_ino
> cannot return NULL?
It can if !CONFIG_MEMCG. You might argue that then shrinker can not have
the SHRINKER_MEMCG_AWARE flag, but since it's not a hot path at all,
I'll keep it for extra safety.
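
To spell out why both checks are needed: IS_ERR() only recognizes
pointers that encode an errno value, so a NULL pointer passes it. Below
is a small userspace sketch of that convention. err_ptr(), ptr_err() and
is_err() are simplified stand-ins for the kernel helpers, and
memcg_lookup_failed() is a hypothetical name; none of this is the
kernel's actual code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Largest errno value that may be encoded in a pointer (sketch only). */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno as a pointer in the top SKETCH_MAX_ERRNO bytes. */
static void *err_ptr(long error)
{
	assert(error < 0 && error >= -SKETCH_MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* Decode the errno back out of an error pointer. */
static long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for encoded errors; NULL is NOT an error pointer. */
static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Mirrors the "!memcg || IS_ERR(memcg)" test from the patch. */
static int memcg_lookup_failed(const void *memcg)
{
	if (!memcg || is_err(memcg))
		return -ENOENT;
	return 0;
}
```

The same reasoning is behind the debugfs_create_dir() comment in the
other patch: a helper that reports failure via an error pointer is not
caught by a plain NULL check.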
>
> > + return -ENOENT;
> > +
> > + if (!mem_cgroup_online(memcg)) {
> > + mem_cgroup_put(memcg);
> > + return -ENOENT;
> > + }
> > + } else {
> > + if (ino != 0)
> > + return -EINVAL;
> > + memcg = NULL;
>
> IIUC, memcg is already NULL if we reach here, right? Then the
> assignment is not necessary. Or we could remove the initialization
> of 'memcg' where it is defined.
Right, removed.
>
> > + }
> > +
> > + ret = down_read_killable(&shrinker_rwsem);
> > + if (ret) {
> > + mem_cgroup_put(memcg);
> > + return ret;
> > + }
> > +
> > + sc.nid = nid;
> > + sc.memcg = memcg;
> > + sc.nr_to_scan = nr_to_scan;
> > + sc.nr_scanned = nr_to_scan;
> > +
> > + shrinker->scan_objects(shrinker, &sc);
> > +
> > + up_read(&shrinker_rwsem);
> > + mem_cgroup_put(memcg);
> > +
> > + return ret ? ret : size;
>
> Seems "ret" is always equal to 0 here, should we simplify this
> to "return size"?
Right.
Thank you for the review!
On Mon, May 23, 2022 at 11:24:10AM -0700, Roman Gushchin wrote:
> On Sun, May 22, 2022 at 06:36:56PM +0800, Muchun Song wrote:
> > On Mon, May 09, 2022 at 11:38:16AM -0700, Roman Gushchin wrote:
> > > This commit introduces the /sys/kernel/debug/shrinker debugfs
> > > interface which provides an ability to observe the state of
> > > individual kernel memory shrinkers.
> > >
> > > Because the feature adds some memory overhead (which shouldn't be
> > > large unless there is a huge amount of registered shrinkers), it's
> > > guarded by a config option (enabled by default).
> > >
> > > This commit introduces the "count" interface for each shrinker
> > > registered in the system.
> > >
> > > The output is in the following format:
> >
> > Hi Roman,
>
> Hi Muchun!
>
> Thank you for taking a look!
>
> >
> > Should we print a title to show what those numbers mean? In this case,
> > it is more understandable.
>
> No, I don't think so: this interface is not supposed to be used by
> an average user and those who will be using it can refer to the provided
> documentation. Printing the header each time will add some overhead for
> no good reason.
>
Got it. Makes sense.
> > > <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
> > > <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
> > > ...
> > >
> > > To reduce the size of output on machines with many thousands cgroups,
> > > if the total number of objects on all nodes is 0, the line is omitted.
> > >
> > > If the shrinker is not memcg-aware or CONFIG_MEMCG is off, 0 is
> > > printed as cgroup inode id. If the shrinker is not numa-aware, 0's are
> > > printed for all nodes except the first one.
> > >
> > > This commit gives debugfs entries simple numeric names, which are not
> > > very convenient. The following commit in the series will provide
> > > shrinkers with more meaningful names.
> > >
> > > Signed-off-by: Roman Gushchin <[email protected]>
> > > ---
> > > include/linux/shrinker.h | 19 ++++-
> > > lib/Kconfig.debug | 9 +++
> > > mm/Makefile | 1 +
> > > mm/shrinker_debug.c | 171 +++++++++++++++++++++++++++++++++++++++
> > > mm/vmscan.c | 6 +-
> > > 5 files changed, 203 insertions(+), 3 deletions(-)
> > > create mode 100644 mm/shrinker_debug.c
> > >
> > > diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
> > > index 76fbf92b04d9..2ced8149c513 100644
> > > --- a/include/linux/shrinker.h
> > > +++ b/include/linux/shrinker.h
> > > @@ -72,6 +72,10 @@ struct shrinker {
> > > #ifdef CONFIG_MEMCG
> > > /* ID in shrinker_idr */
> > > int id;
> > > +#endif
> > > +#ifdef CONFIG_SHRINKER_DEBUG
> > > + int debugfs_id;
> > > + struct dentry *debugfs_entry;
> > > #endif
> > > /* objs pending delete, per node */
> > > atomic_long_t *nr_deferred;
> > > @@ -94,4 +98,17 @@ extern int register_shrinker(struct shrinker *shrinker);
> > > extern void unregister_shrinker(struct shrinker *shrinker);
> > > extern void free_prealloced_shrinker(struct shrinker *shrinker);
> > > extern void synchronize_shrinkers(void);
> > > -#endif
> > > +
> > > +#ifdef CONFIG_SHRINKER_DEBUG
> > > +extern int shrinker_debugfs_add(struct shrinker *shrinker);
> > > +extern void shrinker_debugfs_remove(struct shrinker *shrinker);
> > > +#else /* CONFIG_SHRINKER_DEBUG */
> > > +static inline int shrinker_debugfs_add(struct shrinker *shrinker)
> > > +{
> > > + return 0;
> > > +}
> > > +static inline void shrinker_debugfs_remove(struct shrinker *shrinker)
> > > +{
> > > +}
> > > +#endif /* CONFIG_SHRINKER_DEBUG */
> > > +#endif /* _LINUX_SHRINKER_H */
> > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > > index 3fd7a2e9eaf1..5fa65a649798 100644
> > > --- a/lib/Kconfig.debug
> > > +++ b/lib/Kconfig.debug
> > > @@ -733,6 +733,15 @@ config SLUB_STATS
> > > out which slabs are relevant to a particular load.
> > > Try running: slabinfo -DA
> > >
> > > +config SHRINKER_DEBUG
> > > + default y
> > > + bool "Enable shrinker debugging support"
> > > + depends on DEBUG_FS
> > > + help
> > > + Say Y to enable the shrinker debugfs interface which provides
> > > + visibility into the kernel memory shrinkers subsystem.
> > > + Disable it to avoid an extra memory footprint.
> > > +
> > > config HAVE_DEBUG_KMEMLEAK
> > > bool
> > >
> > > diff --git a/mm/Makefile b/mm/Makefile
> > > index 298c9991ab75..8083fa85a348 100644
> > > --- a/mm/Makefile
> > > +++ b/mm/Makefile
> > > @@ -133,3 +133,4 @@ obj-$(CONFIG_PAGE_REPORTING) += page_reporting.o
> > > obj-$(CONFIG_IO_MAPPING) += io-mapping.o
> > > obj-$(CONFIG_HAVE_BOOTMEM_INFO_NODE) += bootmem_info.o
> > > obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
> > > +obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
> > > diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
> > > new file mode 100644
> > > index 000000000000..fd1f805a581a
> > > --- /dev/null
> > > +++ b/mm/shrinker_debug.c
> > > @@ -0,0 +1,171 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +#include <linux/idr.h>
> > > +#include <linux/slab.h>
> > > +#include <linux/debugfs.h>
> > > +#include <linux/seq_file.h>
> > > +#include <linux/shrinker.h>
> > > +#include <linux/memcontrol.h>
> > > +
> > > +/* defined in vmscan.c */
> > > +extern struct rw_semaphore shrinker_rwsem;
> > > +extern struct list_head shrinker_list;
> > > +
> > > +static DEFINE_IDA(shrinker_debugfs_ida);
> > > +static struct dentry *shrinker_debugfs_root;
> > > +
> > > +static unsigned long shrinker_count_objects(struct shrinker *shrinker,
> > > + struct mem_cgroup *memcg,
> > > + unsigned long *count_per_node)
> > > +{
> > > + unsigned long nr, total = 0;
> > > + int nid;
> > > +
> > > + for_each_node(nid) {
> > > + if (nid == 0 || (shrinker->flags & SHRINKER_NUMA_AWARE)) {
> > > + struct shrink_control sc = {
> > > + .gfp_mask = GFP_KERNEL,
> > > + .nid = nid,
> > > + .memcg = memcg,
> > > + };
> > > +
> > > + nr = shrinker->count_objects(shrinker, &sc);
> > > + if (nr == SHRINK_EMPTY)
> > > + nr = 0;
> > > + } else {
> > > + nr = 0;
> >
> > For efficiency, we could break here, right?
>
> Not really, we need to fill count_per_node[] with zeros.
>
I thought count_per_node[] was zero-initialized by the caller when it
was allocated. However, I was wrong: the buffer is reused on each
iteration of the memcg loop. You are right.
> >
> > > + }
> > > +
> > > + count_per_node[nid] = nr;
> > > + total += nr;
> > > + }
> > > +
> > > + return total;
> > > +}
> > > +
> > > +static int shrinker_debugfs_count_show(struct seq_file *m, void *v)
> > > +{
> > > + struct shrinker *shrinker = (struct shrinker *)m->private;
> >
> > Maybe we could drop the cast since m->private is a void * type.
>
> Ok.
>
> >
> > > + unsigned long *count_per_node = NULL;
> >
> > Do not need to be initialized, right?
>
> Right, will fix in v4.
>
> >
> > > + struct mem_cgroup *memcg;
> > > + unsigned long total;
> > > + bool memcg_aware;
> > > + int ret, nid;
> > > +
> > > + count_per_node = kcalloc(nr_node_ids, sizeof(unsigned long), GFP_KERNEL);
> > > + if (!count_per_node)
> > > + return -ENOMEM;
> > > +
> > > + ret = down_read_killable(&shrinker_rwsem);
> > > + if (ret) {
> > > + kfree(count_per_node);
> > > + return ret;
> > > + }
> > > + rcu_read_lock();
> > > +
> > > + memcg_aware = shrinker->flags & SHRINKER_MEMCG_AWARE;
> > > +
> > > + memcg = mem_cgroup_iter(NULL, NULL, NULL);
> > > + do {
> > > + if (memcg && !mem_cgroup_online(memcg))
> > > + continue;
> > > +
> > > + total = shrinker_count_objects(shrinker,
> > > + memcg_aware ? memcg : NULL,
> > > + count_per_node);
> > > + if (total) {
> > > + seq_printf(m, "%lu", mem_cgroup_ino(memcg));
> > > + for_each_node(nid)
> > > + seq_printf(m, " %lu", count_per_node[nid]);
> > > + seq_puts(m, "\n");
> >
> > seq_putc(m, '\n') is more efficient.
>
> Ok.
>
> >
> > > + }
> > > +
> > > + if (!memcg_aware) {
> > > + mem_cgroup_iter_break(NULL, memcg);
> > > + break;
> > > + }
> > > +
> > > + if (signal_pending(current)) {
> > > + mem_cgroup_iter_break(NULL, memcg);
> > > + ret = -EINTR;
> > > + break;
> > > + }
> > > +
> > > + cond_resched();
> >
> > We are in rcu read lock, cannot be scheduled, right?
>
> This is a good one, thanks. Fixed.
>
> >
> > > + } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
> > > +
> > > + rcu_read_unlock();
> > > + up_read(&shrinker_rwsem);
> > > +
> > > + kfree(count_per_node);
> > > + return ret;
> > > +}
> > > +DEFINE_SHOW_ATTRIBUTE(shrinker_debugfs_count);
> > > +
> > > +int shrinker_debugfs_add(struct shrinker *shrinker)
> > > +{
> > > + struct dentry *entry;
> > > + char buf[16];
> > > + int id;
> > > +
> > > + lockdep_assert_held(&shrinker_rwsem);
> > > +
> > > + /* debugfs isn't initialized yet, add debugfs entries later. */
> > > + if (!shrinker_debugfs_root)
> > > + return 0;
> > > +
> > > + id = ida_alloc(&shrinker_debugfs_ida, GFP_KERNEL);
> > > + if (id < 0)
> > > + return id;
> > > + shrinker->debugfs_id = id;
> > > +
> > > + snprintf(buf, sizeof(buf), "%d", id);
> > > +
> > > + /* create debugfs entry */
> > > + entry = debugfs_create_dir(buf, shrinker_debugfs_root);
> > > + if (IS_ERR(entry)) {
> > > + ida_free(&shrinker_debugfs_ida, id);
> > > + return PTR_ERR(entry);
> > > + }
> > > + shrinker->debugfs_entry = entry;
> > > +
> > > + debugfs_create_file("count", 0220, entry, shrinker,
> > > + &shrinker_debugfs_count_fops);
> > > + return 0;
> > > +}
> > > +
> > > +void shrinker_debugfs_remove(struct shrinker *shrinker)
> > > +{
> > > + lockdep_assert_held(&shrinker_rwsem);
> > > +
> > > + if (!shrinker->debugfs_entry)
> > > + return;
> > > +
> > > + debugfs_remove_recursive(shrinker->debugfs_entry);
> > > + ida_free(&shrinker_debugfs_ida, shrinker->debugfs_id);
> > > +}
> > > +
> > > +static int __init shrinker_debugfs_init(void)
> > > +{
> > > + struct shrinker *shrinker;
> > > + int ret;
> > > +
> > > + if (!debugfs_initialized())
> > > + return -ENODEV;
> > > +
> >
> > Redundant check since it is checked in debugfs_create_dir().
> > So I think we could remove this.
> >
> > > + shrinker_debugfs_root = debugfs_create_dir("shrinker", NULL);
> >
> > We should use IS_ERR() to detect the error code. So the following
> > check is wrong.
>
> Right, will fix in the next version.
>
> >
> > > + if (!shrinker_debugfs_root)
> > > + return -ENOMEM;
> > > +
> > > + /* Create debugfs entries for shrinkers registered at boot */
> > > + ret = down_write_killable(&shrinker_rwsem);
> >
> > How could we kill this process? IIUC, late_initcall() is called
> > from early init process, there is no way to kill this. Right?
> > If yes, I think we could just use down_write().
>
> Ok, agree.
>
> Thanks!
>
On Mon, May 23, 2022 at 01:54:24PM -0700, Roman Gushchin wrote:
> On Sun, May 22, 2022 at 07:35:59PM +0800, Muchun Song wrote:
> > On Mon, May 09, 2022 at 11:38:20AM -0700, Roman Gushchin wrote:
> > > Add a scan interface which allows to trigger scanning of a particular
> > > shrinker and specify memcg and numa node. It's useful for testing,
> > > debugging and profiling of a specific scan_objects() callback.
> > > Unlike alternatives (creating a real memory pressure and dropping
> > > caches via /proc/sys/vm/drop_caches) this interface allows to interact
> > > with only one shrinker at once. Also, if a shrinker is misreporting
> > > the number of objects (as some do), it doesn't affect scanning.
> > >
> > > Signed-off-by: Roman Gushchin <[email protected]>
> > > ---
> > > .../admin-guide/mm/shrinker_debugfs.rst | 39 +++++++++-
> > > mm/shrinker_debug.c | 73 +++++++++++++++++++
> > > 2 files changed, 108 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/Documentation/admin-guide/mm/shrinker_debugfs.rst b/Documentation/admin-guide/mm/shrinker_debugfs.rst
> > > index 6783f8190e63..8fecf81d60ee 100644
> > > --- a/Documentation/admin-guide/mm/shrinker_debugfs.rst
> > > +++ b/Documentation/admin-guide/mm/shrinker_debugfs.rst
> > > @@ -5,14 +5,16 @@ Shrinker Debugfs Interface
> > > ==========================
> > >
> > > Shrinker debugfs interface provides a visibility into the kernel memory
> > > -shrinkers subsystem and allows to get information about individual shrinkers.
> > > +shrinkers subsystem and allows to get information about individual shrinkers
> > > +and interact with them.
> > >
> > > For each shrinker registered in the system a directory in **<debugfs>/shrinker/**
> > > is created. The directory's name is composed from the shrinker's name and an
> > > unique id: e.g. *kfree_rcu-0* or *sb-xfs:vda1-36*.
> > >
> > > -Each shrinker directory contains the **count** file, which allows to trigger
> > > -the *count_objects()* callback for each memcg and numa node (if applicable).
> > > +Each shrinker directory contains **count** and **scan** files, which allow to
> > > +trigger *count_objects()* and *scan_objects()* callbacks for each memcg and
> > > +numa node (if applicable).
> > >
> > > Usage:
> > > ------
> > > @@ -43,7 +45,7 @@ Usage:
> > >
> > > $ cd sb-btrfs\:vda2-24/
> > > $ ls
> > > - count
> > > + count scan
> > >
> > > 3. *Count objects*
> > >
> > > @@ -98,3 +100,32 @@ Usage:
> > > 2877 84 0
> > > 293 1 0
> > > 735 8 0
> > > +
> > > +4. *Scan objects*
> > > +
> > > + The expected input format::
> > > +
> > > + <cgroup inode id> <numa id> <number of objects to scan>
> > > +
> > > + For a non-memcg-aware shrinker or on a system with no memory
> > > + cgroups **0** should be passed as cgroup id.
> > > + ::
> > > +
> > > + $ cd /sys/kernel/debug/shrinker/
> > > + $ cd sb-btrfs\:vda2-24/
> > > +
> > > + $ cat count | head -n 5
> > > + 1 212 0
> > > + 21 97 0
> > > + 55 802 5
> > > + 2367 2 0
> > > + 225 13 0
> > > +
> > > + $ echo "55 0 200" > scan
> > > +
> > > + $ cat count | head -n 5
> > > + 1 212 0
> > > + 21 96 0
> > > + 55 752 5
> > > + 2367 2 0
> > > + 225 13 0
> > > diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
> > > index 28b1c1ab60ef..8f67fef5a643 100644
> > > --- a/mm/shrinker_debug.c
> > > +++ b/mm/shrinker_debug.c
> > > @@ -101,6 +101,77 @@ static int shrinker_debugfs_count_show(struct seq_file *m, void *v)
> > > }
> > > DEFINE_SHOW_ATTRIBUTE(shrinker_debugfs_count);
> > >
> > > +static int shrinker_debugfs_scan_open(struct inode *inode, struct file *file)
> > > +{
> > > + file->private_data = inode->i_private;
> > > + return nonseekable_open(inode, file);
> > > +}
> > > +
> > > +static ssize_t shrinker_debugfs_scan_write(struct file *file,
> > > + const char __user *buf,
> > > + size_t size, loff_t *pos)
> > > +{
> > > + struct shrinker *shrinker = (struct shrinker *)file->private_data;
> >
> > Seems we could drop the cast since ->private_data is void * type.
>
> Yep, fixed. Thanks!
>
> >
> > > + unsigned long nr_to_scan = 0, ino;
> > > + struct shrink_control sc = {
> > > + .gfp_mask = GFP_KERNEL,
> > > + };
> > > + struct mem_cgroup *memcg = NULL;
> > > + int nid;
> > > + char kbuf[72];
> > > + int read_len = size < (sizeof(kbuf) - 1) ? size : (sizeof(kbuf) - 1);
> > > + ssize_t ret;
> > > +
> > > + if (copy_from_user(kbuf, buf, read_len))
> > > + return -EFAULT;
> > > + kbuf[read_len] = '\0';
> > > +
> > > + if (sscanf(kbuf, "%lu %d %lu", &ino, &nid, &nr_to_scan) < 2)
> > > + return -EINVAL;
> > > +
> > > + if (nid < 0 || nid >= nr_node_ids)
> > > + return -EINVAL;
> > > +
> >
> > Should we break here if nr_to_scan is zero?
>
> Not a very likely scenario, but ok.
>
Agree.
> >
> > > + if (shrinker->flags & SHRINKER_MEMCG_AWARE) {
> > > + memcg = mem_cgroup_get_from_ino(ino);
> > > + if (!memcg || IS_ERR(memcg))
> >
> > Should we drop the check of "!memcg" since mem_cgroup_get_from_ino
> > cannot return NULL?
>
> It can if !CONFIG_MEMCG. You might argue that then shrinker can not have
> the SHRINKER_MEMCG_AWARE flag, but since it's not a hot path at all,
> I'll keep it for extra safety.
>
Makes sense.
> >
> > > + return -ENOENT;
> > > +
> > > + if (!mem_cgroup_online(memcg)) {
> > > + mem_cgroup_put(memcg);
> > > + return -ENOENT;
> > > + }
> > > + } else {
> > > + if (ino != 0)
> > > + return -EINVAL;
> > > + memcg = NULL;
> >
> > IIUC, memcg is already NULL if we reach here, right? Then the
> > assignment is not necessary. Or we could remove the initialization
> > of 'memcg' where it is defined.
>
> Right, removed.
>
> >
> > > + }
> > > +
> > > + ret = down_read_killable(&shrinker_rwsem);
> > > + if (ret) {
> > > + mem_cgroup_put(memcg);
> > > + return ret;
> > > + }
> > > +
> > > + sc.nid = nid;
> > > + sc.memcg = memcg;
> > > + sc.nr_to_scan = nr_to_scan;
> > > + sc.nr_scanned = nr_to_scan;
> > > +
> > > + shrinker->scan_objects(shrinker, &sc);
> > > +
> > > + up_read(&shrinker_rwsem);
> > > + mem_cgroup_put(memcg);
> > > +
> > > + return ret ? ret : size;
> >
> > Seems "ret" is always equal to 0 here, should we simplify this
> > to "return size"?
>
> Right.
>
> Thank you for the review!
>
My pleasure.
Thanks.
On Mon, May 23, 2022 at 11:12:12AM -0700, Roman Gushchin wrote:
> On Sun, May 22, 2022 at 03:05:33PM +0800, Muchun Song wrote:
> > On Mon, May 09, 2022 at 11:38:15AM -0700, Roman Gushchin wrote:
> > > Shrinker debugfs requires a way to represent memory cgroups without
> > > using full paths, both for displaying information and getting input
> > > from a user.
> > >
> > > Cgroup inode number is a perfect way, already used by bpf.
> > >
> > > This commit adds a couple of helper functions which will be used
> > > to handle memcg-aware shrinkers.
> > >
> > > Signed-off-by: Roman Gushchin <[email protected]>
> > > ---
> > > include/linux/memcontrol.h | 21 +++++++++++++++++++++
> > > mm/memcontrol.c | 23 +++++++++++++++++++++++
> > > 2 files changed, 44 insertions(+)
> > >
> > > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > > index fe580cb96683..a6de9e5c1549 100644
> > > --- a/include/linux/memcontrol.h
> > > +++ b/include/linux/memcontrol.h
> > > @@ -831,6 +831,15 @@ static inline unsigned short mem_cgroup_id(struct mem_cgroup *memcg)
> > > }
> > > struct mem_cgroup *mem_cgroup_from_id(unsigned short id);
> > >
> > > +#ifdef CONFIG_SHRINKER_DEBUG
> > > +static inline unsigned long mem_cgroup_ino(struct mem_cgroup *memcg)
> > > +{
> > > + return memcg ? cgroup_ino(memcg->css.cgroup) : 0;
> > > +}
> > > +
> > > +struct mem_cgroup *mem_cgroup_get_from_ino(unsigned long ino);
> > > +#endif
> > > +
> > > static inline struct mem_cgroup *mem_cgroup_from_seq(struct seq_file *m)
> > > {
> > > return mem_cgroup_from_css(seq_css(m));
> > > @@ -1324,6 +1333,18 @@ static inline struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
> > > return NULL;
> > > }
> > >
> > > +#ifdef CONFIG_SHRINKER_DEBUG
> > > +static inline unsigned long mem_cgroup_ino(struct mem_cgroup *memcg)
> > > +{
> > > + return 0;
> > > +}
> > > +
> > > +static inline struct mem_cgroup *mem_cgroup_get_from_ino(unsigned long ino)
> > > +{
> > > + return NULL;
> > > +}
> > > +#endif
> > > +
> > > static inline struct mem_cgroup *mem_cgroup_from_seq(struct seq_file *m)
> > > {
> > > return NULL;
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index 04cea4fa362a..e6472728fa66 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -5018,6 +5018,29 @@ struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
> > > return idr_find(&mem_cgroup_idr, id);
> > > }
> > >
> > > +#ifdef CONFIG_SHRINKER_DEBUG
> > > +struct mem_cgroup *mem_cgroup_get_from_ino(unsigned long ino)
> > > +{
> > > + struct cgroup *cgrp;
> > > + struct cgroup_subsys_state *css;
> > > + struct mem_cgroup *memcg;
> > > +
> > > + cgrp = cgroup_get_from_id(ino);
> > > + if (!cgrp)
> > > + return ERR_PTR(-ENOENT);
> > > +
> > > + css = cgroup_get_e_css(cgrp, &memory_cgrp_subsys);
> > > + if (css)
> > > + memcg = container_of(css, struct mem_cgroup, css);
> > > + else
> > > + memcg = ERR_PTR(-ENOENT);
> > > +
> > > + cgroup_put(cgrp);
> >
> > I think it's better to use css_put() here since the reference is taken
> > via cgroup_get_e_css(), which returns a css struct.
>
> cgroup_put() is matching cgroup_get_from_id().
>
> The reference grabbed by cgroup_get_e_css() shouldn't be dropped
> because mem_cgroup_get_from_ino() has a "get" semantics.
>
My bad. I have misread it here. Thanks.
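
For anyone else reading along, the pairing can be modelled with plain
counters. This is only a toy userspace sketch: the toy_* types and
functions are hypothetical stand-ins, not the cgroup API, and the
"memory css" is reduced to a bare refcount.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for a css and the cgroup that owns it. */
struct toy_css {
	int refcnt;
};

struct toy_cgroup {
	int refcnt;
	struct toy_css *memory_css;
};

/* Lookup that takes a cgroup reference (models cgroup_get_from_id()). */
static struct toy_cgroup *toy_cgroup_get(struct toy_cgroup *cgrp)
{
	if (cgrp)
		cgrp->refcnt++;
	return cgrp;
}

static void toy_cgroup_put(struct toy_cgroup *cgrp)
{
	assert(cgrp->refcnt > 0);
	cgrp->refcnt--;
}

/* Takes a css reference (models cgroup_get_e_css()). */
static struct toy_css *toy_css_get(struct toy_cgroup *cgrp)
{
	if (cgrp->memory_css)
		cgrp->memory_css->refcnt++;
	return cgrp->memory_css;
}

static void toy_css_put(struct toy_css *css)
{
	assert(css->refcnt > 0);
	css->refcnt--;
}

/*
 * "get" semantics: the cgroup reference is temporary and dropped here,
 * while the css reference is handed to the caller, who must put it.
 */
static struct toy_css *toy_memcg_get_from_ino(struct toy_cgroup *cgrp)
{
	struct toy_css *css;

	if (!toy_cgroup_get(cgrp))
		return NULL;

	css = toy_css_get(cgrp);
	toy_cgroup_put(cgrp);	/* pairs with toy_cgroup_get() */
	return css;		/* css reference intentionally kept */
}
```

After a successful call the cgroup count is back where it started and
the css count is one higher, which corresponds to the reference the scan
handler later drops with mem_cgroup_put().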
On Sun, May 22, 2022 at 07:08:24PM +0800, Muchun Song wrote:
> On Mon, May 09, 2022 at 11:38:17AM -0700, Roman Gushchin wrote:
> > Currently shrinkers are anonymous objects. For debugging purposes they
> > can be identified by count/scan function names, but it's not always
> > useful: e.g. for superblock's shrinkers it's nice to have at least
> > an idea of to which superblock the shrinker belongs.
> >
> > This commit adds names to shrinkers. register_shrinker() and
> > prealloc_shrinker() functions are extended to take a format and
> > arguments to master a name.
> >
> > In some cases it's not possible to determine a good name at the time
> > when a shrinker is allocated. For such cases shrinker_debugfs_rename()
> > is provided.
> >
> > After this change the shrinker debugfs directory looks like:
> > $ cd /sys/kernel/debug/shrinker/
> > $ ls
> > dqcache-16 sb-hugetlbfs-17 sb-rootfs-2 sb-tmpfs-50
> > kfree_rcu-0 sb-hugetlbfs-33 sb-securityfs-6 sb-tracefs-13
> > sb-aio-20 sb-iomem-12 sb-selinuxfs-22 sb-xfs:vda1-36
> > sb-anon_inodefs-15 sb-mqueue-21 sb-sockfs-8 sb-zsmalloc-19
> > sb-bdev-3 sb-nsfs-4 sb-sysfs-26 shadow-18
> > sb-bpf-32 sb-pipefs-14 sb-tmpfs-1 thp_deferred_split-10
> > sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-27 thp_zero-9
> > sb-cgroup2-30 sb-proc-39 sb-tmpfs-29 xfs_buf-37
> > sb-configfs-23 sb-proc-41 sb-tmpfs-35 xfs_inodegc-38
> > sb-dax-11 sb-proc-45 sb-tmpfs-40 zspool-34
> > sb-debugfs-7 sb-proc-46 sb-tmpfs-42
> > sb-devpts-28 sb-proc-47 sb-tmpfs-43
> > sb-devtmpfs-5 sb-pstore-31 sb-tmpfs-44
> >
> > Signed-off-by: Roman Gushchin <[email protected]>
> > ---
> > arch/x86/kvm/mmu/mmu.c | 2 +-
> > drivers/android/binder_alloc.c | 2 +-
> > drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 3 +-
> > drivers/gpu/drm/msm/msm_gem_shrinker.c | 2 +-
> > .../gpu/drm/panfrost/panfrost_gem_shrinker.c | 2 +-
> > drivers/gpu/drm/ttm/ttm_pool.c | 2 +-
> > drivers/md/bcache/btree.c | 2 +-
> > drivers/md/dm-bufio.c | 2 +-
> > drivers/md/dm-zoned-metadata.c | 2 +-
> > drivers/md/raid5.c | 2 +-
> > drivers/misc/vmw_balloon.c | 2 +-
> > drivers/virtio/virtio_balloon.c | 2 +-
> > drivers/xen/xenbus/xenbus_probe_backend.c | 2 +-
> > fs/btrfs/super.c | 2 +
> > fs/erofs/utils.c | 2 +-
> > fs/ext4/extents_status.c | 3 +-
> > fs/f2fs/super.c | 2 +-
> > fs/gfs2/glock.c | 2 +-
> > fs/gfs2/main.c | 2 +-
> > fs/jbd2/journal.c | 2 +-
> > fs/mbcache.c | 2 +-
> > fs/nfs/nfs42xattr.c | 7 ++-
> > fs/nfs/super.c | 2 +-
> > fs/nfsd/filecache.c | 2 +-
> > fs/nfsd/nfscache.c | 2 +-
> > fs/quota/dquot.c | 2 +-
> > fs/super.c | 6 +-
> > fs/ubifs/super.c | 2 +-
> > fs/xfs/xfs_buf.c | 2 +-
> > fs/xfs/xfs_icache.c | 2 +-
> > fs/xfs/xfs_qm.c | 2 +-
> > include/linux/shrinker.h | 12 +++-
> > kernel/rcu/tree.c | 2 +-
> > mm/huge_memory.c | 4 +-
> > mm/shrinker_debug.c | 45 +++++++++++++-
> > mm/vmscan.c | 58 ++++++++++++++++++-
> > mm/workingset.c | 2 +-
> > mm/zsmalloc.c | 2 +-
> > net/sunrpc/auth.c | 2 +-
> > 39 files changed, 154 insertions(+), 46 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index c623019929a7..8cfabdd63406 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -6283,7 +6283,7 @@ int kvm_mmu_vendor_module_init(void)
> > if (percpu_counter_init(&kvm_total_used_mmu_pages, 0, GFP_KERNEL))
> > goto out;
> >
> > - ret = register_shrinker(&mmu_shrinker);
> > + ret = register_shrinker(&mmu_shrinker, "mmu");
> > if (ret)
> > goto out;
> >
> > diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
> > index 2ac1008a5f39..951343c41ba8 100644
> > --- a/drivers/android/binder_alloc.c
> > +++ b/drivers/android/binder_alloc.c
> > @@ -1084,7 +1084,7 @@ int binder_alloc_shrinker_init(void)
> > int ret = list_lru_init(&binder_alloc_lru);
> >
> > if (ret == 0) {
> > - ret = register_shrinker(&binder_shrinker);
> > + ret = register_shrinker(&binder_shrinker, "binder");
> > if (ret)
> > list_lru_destroy(&binder_alloc_lru);
> > }
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> > index 6a6ff98a8746..85524ef92ea4 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> > @@ -426,7 +426,8 @@ void i915_gem_driver_register__shrinker(struct drm_i915_private *i915)
> > i915->mm.shrinker.count_objects = i915_gem_shrinker_count;
> > i915->mm.shrinker.seeks = DEFAULT_SEEKS;
> > i915->mm.shrinker.batch = 4096;
> > - drm_WARN_ON(&i915->drm, register_shrinker(&i915->mm.shrinker));
> > + drm_WARN_ON(&i915->drm, register_shrinker(&i915->mm.shrinker,
> > + "drm_i915_gem"));
> >
> > i915->mm.oom_notifier.notifier_call = i915_gem_shrinker_oom;
> > drm_WARN_ON(&i915->drm, register_oom_notifier(&i915->mm.oom_notifier));
> > diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c
> > index 086dacf2f26a..2d3cf4f13dfd 100644
> > --- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
> > +++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
> > @@ -221,7 +221,7 @@ void msm_gem_shrinker_init(struct drm_device *dev)
> > priv->shrinker.count_objects = msm_gem_shrinker_count;
> > priv->shrinker.scan_objects = msm_gem_shrinker_scan;
> > priv->shrinker.seeks = DEFAULT_SEEKS;
> > - WARN_ON(register_shrinker(&priv->shrinker));
> > + WARN_ON(register_shrinker(&priv->shrinker, "drm_msm_gem"));
> >
> > priv->vmap_notifier.notifier_call = msm_gem_shrinker_vmap;
> > WARN_ON(register_vmap_purge_notifier(&priv->vmap_notifier));
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> > index 77e7cb6d1ae3..0d028266ee9e 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> > @@ -103,7 +103,7 @@ void panfrost_gem_shrinker_init(struct drm_device *dev)
> > pfdev->shrinker.count_objects = panfrost_gem_shrinker_count;
> > pfdev->shrinker.scan_objects = panfrost_gem_shrinker_scan;
> > pfdev->shrinker.seeks = DEFAULT_SEEKS;
> > - WARN_ON(register_shrinker(&pfdev->shrinker));
> > + WARN_ON(register_shrinker(&pfdev->shrinker, "drm_panfrost"));
> > }
> >
> > /**
> > diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
> > index 1bba0a0ed3f9..b8b41d242197 100644
> > --- a/drivers/gpu/drm/ttm/ttm_pool.c
> > +++ b/drivers/gpu/drm/ttm/ttm_pool.c
> > @@ -722,7 +722,7 @@ int ttm_pool_mgr_init(unsigned long num_pages)
> > mm_shrinker.count_objects = ttm_pool_shrinker_count;
> > mm_shrinker.scan_objects = ttm_pool_shrinker_scan;
> > mm_shrinker.seeks = 1;
> > - return register_shrinker(&mm_shrinker);
> > + return register_shrinker(&mm_shrinker, "drm_ttm_pool");
> > }
> >
> > /**
> > diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
> > index ad9f16689419..c1f734ab86b3 100644
> > --- a/drivers/md/bcache/btree.c
> > +++ b/drivers/md/bcache/btree.c
> > @@ -812,7 +812,7 @@ int bch_btree_cache_alloc(struct cache_set *c)
> > c->shrink.seeks = 4;
> > c->shrink.batch = c->btree_pages * 2;
> >
> > - if (register_shrinker(&c->shrink))
> > + if (register_shrinker(&c->shrink, "btree"))
> > pr_warn("bcache: %s: could not register shrinker\n",
> > __func__);
> >
> > diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
> > index 5ffa1dcf84cf..bcc95898c341 100644
> > --- a/drivers/md/dm-bufio.c
> > +++ b/drivers/md/dm-bufio.c
> > @@ -1806,7 +1806,7 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign
> > c->shrinker.scan_objects = dm_bufio_shrink_scan;
> > c->shrinker.seeks = 1;
> > c->shrinker.batch = 0;
> > - r = register_shrinker(&c->shrinker);
> > + r = register_shrinker(&c->shrinker, "dm_bufio");
> > if (r)
> > goto bad;
> >
> > diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
> > index d1ea66114d14..05f2fd12066b 100644
> > --- a/drivers/md/dm-zoned-metadata.c
> > +++ b/drivers/md/dm-zoned-metadata.c
> > @@ -2944,7 +2944,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev,
> > zmd->mblk_shrinker.seeks = DEFAULT_SEEKS;
> >
> > /* Metadata cache shrinker */
> > - ret = register_shrinker(&zmd->mblk_shrinker);
> > + ret = register_shrinker(&zmd->mblk_shrinker, "md_meta");
> > if (ret) {
> > dmz_zmd_err(zmd, "Register metadata cache shrinker failed");
> > goto err;
> > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> > index 59f91e392a2a..34ddebd3aff7 100644
> > --- a/drivers/md/raid5.c
> > +++ b/drivers/md/raid5.c
> > @@ -7383,7 +7383,7 @@ static struct r5conf *setup_conf(struct mddev *mddev)
> > conf->shrinker.count_objects = raid5_cache_count;
> > conf->shrinker.batch = 128;
> > conf->shrinker.flags = 0;
> > - if (register_shrinker(&conf->shrinker)) {
> > + if (register_shrinker(&conf->shrinker, "md-%s", mdname(mddev))) {
> > pr_warn("md/raid:%s: couldn't register shrinker.\n",
> > mdname(mddev));
> > goto abort;
> > diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
> > index f1d8ba6d4857..6c9ddf1187dd 100644
> > --- a/drivers/misc/vmw_balloon.c
> > +++ b/drivers/misc/vmw_balloon.c
> > @@ -1587,7 +1587,7 @@ static int vmballoon_register_shrinker(struct vmballoon *b)
> > b->shrinker.count_objects = vmballoon_shrinker_count;
> > b->shrinker.seeks = DEFAULT_SEEKS;
> >
> > - r = register_shrinker(&b->shrinker);
> > + r = register_shrinker(&b->shrinker, "vmw_balloon");
> >
> > if (r == 0)
> > b->shrinker_registered = true;
> > diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> > index f4c34a2a6b8e..093e06e19d0e 100644
> > --- a/drivers/virtio/virtio_balloon.c
> > +++ b/drivers/virtio/virtio_balloon.c
> > @@ -875,7 +875,7 @@ static int virtio_balloon_register_shrinker(struct virtio_balloon *vb)
> > vb->shrinker.count_objects = virtio_balloon_shrinker_count;
> > vb->shrinker.seeks = DEFAULT_SEEKS;
> >
> > - return register_shrinker(&vb->shrinker);
> > + return register_shrinker(&vb->shrinker, "virtio_balloon");
> > }
> >
> > static int virtballoon_probe(struct virtio_device *vdev)
> > diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c
> > index 5abded97e1a7..a6c5e344017d 100644
> > --- a/drivers/xen/xenbus/xenbus_probe_backend.c
> > +++ b/drivers/xen/xenbus/xenbus_probe_backend.c
> > @@ -305,7 +305,7 @@ static int __init xenbus_probe_backend_init(void)
> >
> > register_xenstore_notifier(&xenstore_notifier);
> >
> > - if (register_shrinker(&backend_memory_shrinker))
> > + if (register_shrinker(&backend_memory_shrinker, "xen_backend"))
> > pr_warn("shrinker registration failed\n");
> >
> > return 0;
> > diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> > index 206f44005c52..062dbd8071e2 100644
> > --- a/fs/btrfs/super.c
> > +++ b/fs/btrfs/super.c
> > @@ -1790,6 +1790,8 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
> > error = -EBUSY;
> > } else {
> > snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev);
> > + shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s", fs_type->name,
> > + s->s_id);
> > btrfs_sb(s)->bdev_holder = fs_type;
> > if (!strstr(crc32c_impl(), "generic"))
> > set_bit(BTRFS_FS_CSUM_IMPL_FAST, &fs_info->flags);
> > diff --git a/fs/erofs/utils.c b/fs/erofs/utils.c
> > index ec9a1d780dc1..67eb64fadd4f 100644
> > --- a/fs/erofs/utils.c
> > +++ b/fs/erofs/utils.c
> > @@ -282,7 +282,7 @@ static struct shrinker erofs_shrinker_info = {
> >
> > int __init erofs_init_shrinker(void)
> > {
> > - return register_shrinker(&erofs_shrinker_info);
> > + return register_shrinker(&erofs_shrinker_info, "erofs");
> > }
> >
> > void erofs_exit_shrinker(void)
> > diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
> > index 9a3a8996aacf..a7aa79d580e5 100644
> > --- a/fs/ext4/extents_status.c
> > +++ b/fs/ext4/extents_status.c
> > @@ -1650,11 +1650,10 @@ int ext4_es_register_shrinker(struct ext4_sb_info *sbi)
> > err = percpu_counter_init(&sbi->s_es_stats.es_stats_shk_cnt, 0, GFP_KERNEL);
> > if (err)
> > goto err3;
> > -
> > sbi->s_es_shrinker.scan_objects = ext4_es_scan;
> > sbi->s_es_shrinker.count_objects = ext4_es_count;
> > sbi->s_es_shrinker.seeks = DEFAULT_SEEKS;
> > - err = register_shrinker(&sbi->s_es_shrinker);
> > + err = register_shrinker(&sbi->s_es_shrinker, "ext4_es");
> > if (err)
> > goto err4;
> >
> > diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> > index 4368f90571bd..2fc40a1635f3 100644
> > --- a/fs/f2fs/super.c
> > +++ b/fs/f2fs/super.c
> > @@ -4579,7 +4579,7 @@ static int __init init_f2fs_fs(void)
> > err = f2fs_init_sysfs();
> > if (err)
> > goto free_garbage_collection_cache;
> > - err = register_shrinker(&f2fs_shrinker_info);
> > + err = register_shrinker(&f2fs_shrinker_info, "f2fs");
> > if (err)
> > goto free_sysfs;
> > err = register_filesystem(&f2fs_fs_type);
> > diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
> > index 26169cedcefc..791c23d9f7e7 100644
> > --- a/fs/gfs2/glock.c
> > +++ b/fs/gfs2/glock.c
> > @@ -2549,7 +2549,7 @@ int __init gfs2_glock_init(void)
> > return -ENOMEM;
> > }
> >
> > - ret = register_shrinker(&glock_shrinker);
> > + ret = register_shrinker(&glock_shrinker, "gfs2_glock");
> > if (ret) {
> > destroy_workqueue(gfs2_delete_workqueue);
> > destroy_workqueue(glock_workqueue);
> > diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
> > index 28d0eb23e18e..dde981b78488 100644
> > --- a/fs/gfs2/main.c
> > +++ b/fs/gfs2/main.c
> > @@ -150,7 +150,7 @@ static int __init init_gfs2_fs(void)
> > if (!gfs2_trans_cachep)
> > goto fail_cachep8;
> >
> > - error = register_shrinker(&gfs2_qd_shrinker);
> > + error = register_shrinker(&gfs2_qd_shrinker, "gfs2_qd");
> > if (error)
> > goto fail_shrinker;
> >
> > diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> > index c0cbeeaec2d1..e7786445ecc1 100644
> > --- a/fs/jbd2/journal.c
> > +++ b/fs/jbd2/journal.c
> > @@ -1418,7 +1418,7 @@ static journal_t *journal_init_common(struct block_device *bdev,
> > if (percpu_counter_init(&journal->j_checkpoint_jh_count, 0, GFP_KERNEL))
> > goto err_cleanup;
> >
> > - if (register_shrinker(&journal->j_shrinker)) {
> > + if (register_shrinker(&journal->j_shrinker, "jbd2_journal")) {
> > percpu_counter_destroy(&journal->j_checkpoint_jh_count);
> > goto err_cleanup;
> > }
> > diff --git a/fs/mbcache.c b/fs/mbcache.c
> > index 97c54d3a2227..379dc5b0b6ad 100644
> > --- a/fs/mbcache.c
> > +++ b/fs/mbcache.c
> > @@ -367,7 +367,7 @@ struct mb_cache *mb_cache_create(int bucket_bits)
> > cache->c_shrink.count_objects = mb_cache_count;
> > cache->c_shrink.scan_objects = mb_cache_scan;
> > cache->c_shrink.seeks = DEFAULT_SEEKS;
> > - if (register_shrinker(&cache->c_shrink)) {
> > + if (register_shrinker(&cache->c_shrink, "mb_cache")) {
> > kfree(cache->c_hash);
> > kfree(cache);
> > goto err_out;
> > diff --git a/fs/nfs/nfs42xattr.c b/fs/nfs/nfs42xattr.c
> > index e7b34f7e0614..147b8a2f2dc6 100644
> > --- a/fs/nfs/nfs42xattr.c
> > +++ b/fs/nfs/nfs42xattr.c
> > @@ -1017,15 +1017,16 @@ int __init nfs4_xattr_cache_init(void)
> > if (ret)
> > goto out2;
> >
> > - ret = register_shrinker(&nfs4_xattr_cache_shrinker);
> > + ret = register_shrinker(&nfs4_xattr_cache_shrinker, "nfs_xattr_cache");
> > if (ret)
> > goto out1;
> >
> > - ret = register_shrinker(&nfs4_xattr_entry_shrinker);
> > + ret = register_shrinker(&nfs4_xattr_entry_shrinker, "nfs_xattr_entry");
> > if (ret)
> > goto out;
> >
> > - ret = register_shrinker(&nfs4_xattr_large_entry_shrinker);
> > + ret = register_shrinker(&nfs4_xattr_large_entry_shrinker,
> > + "nfs_xattr_large_entry");
> > if (!ret)
> > return 0;
> >
> > diff --git a/fs/nfs/super.c b/fs/nfs/super.c
> > index 6ab5eeb000dc..c7a2aef911f1 100644
> > --- a/fs/nfs/super.c
> > +++ b/fs/nfs/super.c
> > @@ -149,7 +149,7 @@ int __init register_nfs_fs(void)
> > ret = nfs_register_sysctl();
> > if (ret < 0)
> > goto error_2;
> > - ret = register_shrinker(&acl_shrinker);
> > + ret = register_shrinker(&acl_shrinker, "nfs_acl");
> > if (ret < 0)
> > goto error_3;
> > #ifdef CONFIG_NFS_V4_2
> > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > index 2c1b027774d4..9c2879a3c3c0 100644
> > --- a/fs/nfsd/filecache.c
> > +++ b/fs/nfsd/filecache.c
> > @@ -666,7 +666,7 @@ nfsd_file_cache_init(void)
> > goto out_err;
> > }
> >
> > - ret = register_shrinker(&nfsd_file_shrinker);
> > + ret = register_shrinker(&nfsd_file_shrinker, "nfsd_filecache");
> > if (ret) {
> > pr_err("nfsd: failed to register nfsd_file_shrinker: %d\n", ret);
> > goto out_lru;
> > diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
> > index 0b3f12aa37ff..f1cfb06d0be5 100644
> > --- a/fs/nfsd/nfscache.c
> > +++ b/fs/nfsd/nfscache.c
> > @@ -176,7 +176,7 @@ int nfsd_reply_cache_init(struct nfsd_net *nn)
> > nn->nfsd_reply_cache_shrinker.scan_objects = nfsd_reply_cache_scan;
> > nn->nfsd_reply_cache_shrinker.count_objects = nfsd_reply_cache_count;
> > nn->nfsd_reply_cache_shrinker.seeks = 1;
> > - status = register_shrinker(&nn->nfsd_reply_cache_shrinker);
> > + status = register_shrinker(&nn->nfsd_reply_cache_shrinker, "nfsd_reply");
> > if (status)
> > goto out_stats_destroy;
> >
> > diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> > index a74aef99bd3d..854d2b1d0914 100644
> > --- a/fs/quota/dquot.c
> > +++ b/fs/quota/dquot.c
> > @@ -2985,7 +2985,7 @@ static int __init dquot_init(void)
> > pr_info("VFS: Dquot-cache hash table entries: %ld (order %ld,"
> > " %ld bytes)\n", nr_hash, order, (PAGE_SIZE << order));
> >
> > - if (register_shrinker(&dqcache_shrinker))
> > + if (register_shrinker(&dqcache_shrinker, "dqcache"))
> > panic("Cannot register dquot shrinker");
> >
> > return 0;
> > diff --git a/fs/super.c b/fs/super.c
> > index 60f57c7bc0a6..4fca6657f442 100644
> > --- a/fs/super.c
> > +++ b/fs/super.c
> > @@ -265,7 +265,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
> > s->s_shrink.count_objects = super_cache_count;
> > s->s_shrink.batch = 1024;
> > s->s_shrink.flags = SHRINKER_NUMA_AWARE | SHRINKER_MEMCG_AWARE;
> > - if (prealloc_shrinker(&s->s_shrink))
> > + if (prealloc_shrinker(&s->s_shrink, "sb-%s", type->name))
> > goto fail;
> > if (list_lru_init_memcg(&s->s_dentry_lru, &s->s_shrink))
> > goto fail;
> > @@ -1288,6 +1288,8 @@ int get_tree_bdev(struct fs_context *fc,
> > } else {
> > s->s_mode = mode;
> > snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev);
> > + shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s",
> > + fc->fs_type->name, s->s_id);
> > sb_set_blocksize(s, block_size(bdev));
> > error = fill_super(s, fc);
> > if (error) {
> > @@ -1363,6 +1365,8 @@ struct dentry *mount_bdev(struct file_system_type *fs_type,
> > } else {
> > s->s_mode = mode;
> > snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev);
> > + shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s",
> > + fs_type->name, s->s_id);
> > sb_set_blocksize(s, block_size(bdev));
> > error = fill_super(s, data, flags & SB_SILENT ? 1 : 0);
> > if (error) {
> > diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
> > index bad67455215f..a3663d201f64 100644
> > --- a/fs/ubifs/super.c
> > +++ b/fs/ubifs/super.c
> > @@ -2430,7 +2430,7 @@ static int __init ubifs_init(void)
> > if (!ubifs_inode_slab)
> > return -ENOMEM;
> >
> > - err = register_shrinker(&ubifs_shrinker_info);
> > + err = register_shrinker(&ubifs_shrinker_info, "ubifs");
> > if (err)
> > goto out_slab;
> >
> > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> > index e1afb9e503e1..5645e92df0c9 100644
> > --- a/fs/xfs/xfs_buf.c
> > +++ b/fs/xfs/xfs_buf.c
> > @@ -1986,7 +1986,7 @@ xfs_alloc_buftarg(
> > btp->bt_shrinker.scan_objects = xfs_buftarg_shrink_scan;
> > btp->bt_shrinker.seeks = DEFAULT_SEEKS;
> > btp->bt_shrinker.flags = SHRINKER_NUMA_AWARE;
> > - if (register_shrinker(&btp->bt_shrinker))
> > + if (register_shrinker(&btp->bt_shrinker, "xfs_buf"))
> > goto error_pcpu;
> > return btp;
> >
> > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> > index bffd6eb0b298..d0c4e74ff763 100644
> > --- a/fs/xfs/xfs_icache.c
> > +++ b/fs/xfs/xfs_icache.c
> > @@ -2198,5 +2198,5 @@ xfs_inodegc_register_shrinker(
> > shrink->flags = SHRINKER_NONSLAB;
> > shrink->batch = XFS_INODEGC_SHRINKER_BATCH;
> >
> > - return register_shrinker(shrink);
> > + return register_shrinker(shrink, "xfs_inodegc");
> > }
> > diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> > index f165d1a3de1d..93ded9e81f49 100644
> > --- a/fs/xfs/xfs_qm.c
> > +++ b/fs/xfs/xfs_qm.c
> > @@ -686,7 +686,7 @@ xfs_qm_init_quotainfo(
> > qinf->qi_shrinker.seeks = DEFAULT_SEEKS;
> > qinf->qi_shrinker.flags = SHRINKER_NUMA_AWARE;
> >
> > - error = register_shrinker(&qinf->qi_shrinker);
> > + error = register_shrinker(&qinf->qi_shrinker, "xfs_qm");
> > if (error)
> > goto out_free_inos;
> >
> > diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
> > index 2ced8149c513..64416f3e0a1f 100644
> > --- a/include/linux/shrinker.h
> > +++ b/include/linux/shrinker.h
> > @@ -75,6 +75,7 @@ struct shrinker {
> > #endif
> > #ifdef CONFIG_SHRINKER_DEBUG
> > int debugfs_id;
> > + const char *name;
> > struct dentry *debugfs_entry;
> > #endif
> > /* objs pending delete, per node */
> > @@ -92,9 +93,9 @@ struct shrinker {
> > */
> > #define SHRINKER_NONSLAB (1 << 3)
> >
> > -extern int prealloc_shrinker(struct shrinker *shrinker);
> > +extern int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...);
> > extern void register_shrinker_prepared(struct shrinker *shrinker);
> > -extern int register_shrinker(struct shrinker *shrinker);
> > +extern int register_shrinker(struct shrinker *shrinker, const char *fmt, ...);
> > extern void unregister_shrinker(struct shrinker *shrinker);
> > extern void free_prealloced_shrinker(struct shrinker *shrinker);
> > extern void synchronize_shrinkers(void);
> > @@ -102,6 +103,8 @@ extern void synchronize_shrinkers(void);
> > #ifdef CONFIG_SHRINKER_DEBUG
> > extern int shrinker_debugfs_add(struct shrinker *shrinker);
> > extern void shrinker_debugfs_remove(struct shrinker *shrinker);
> > +extern int shrinker_debugfs_rename(struct shrinker *shrinker,
> > + const char *fmt, ...);
> > #else /* CONFIG_SHRINKER_DEBUG */
> > static inline int shrinker_debugfs_add(struct shrinker *shrinker)
> > {
> > @@ -110,5 +113,10 @@ static inline int shrinker_debugfs_add(struct shrinker *shrinker)
> > static inline void shrinker_debugfs_remove(struct shrinker *shrinker)
> > {
> > }
> > +static inline int shrinker_debugfs_rename(struct shrinker *shrinker,
> > + const char *fmt, ...)
> > +{
> > + return 0;
> > +}
> > #endif /* CONFIG_SHRINKER_DEBUG */
> > #endif /* _LINUX_SHRINKER_H */
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 5c587e00811c..b4c66916bea9 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -4978,7 +4978,7 @@ static void __init kfree_rcu_batch_init(void)
> > INIT_DELAYED_WORK(&krcp->page_cache_work, fill_page_cache_func);
> > krcp->initialized = true;
> > }
> > - if (register_shrinker(&kfree_rcu_shrinker))
> > + if (register_shrinker(&kfree_rcu_shrinker, "kfree_rcu"))
> > pr_err("Failed to register kfree_rcu() shrinker!\n");
> > }
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index fa6a1623976a..a40df19c0e38 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -423,10 +423,10 @@ static int __init hugepage_init(void)
> > if (err)
> > goto err_slab;
> >
> > - err = register_shrinker(&huge_zero_page_shrinker);
> > + err = register_shrinker(&huge_zero_page_shrinker, "thp_zero");
> > if (err)
> > goto err_hzp_shrinker;
> > - err = register_shrinker(&deferred_split_shrinker);
> > + err = register_shrinker(&deferred_split_shrinker, "thp_deferred_split");
> > if (err)
> > goto err_split_shrinker;
> >
> > diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
> > index fd1f805a581a..28b1c1ab60ef 100644
> > --- a/mm/shrinker_debug.c
> > +++ b/mm/shrinker_debug.c
> > @@ -104,7 +104,7 @@ DEFINE_SHOW_ATTRIBUTE(shrinker_debugfs_count);
> > int shrinker_debugfs_add(struct shrinker *shrinker)
> > {
> > struct dentry *entry;
> > - char buf[16];
> > + char buf[128];
> > int id;
> >
> > lockdep_assert_held(&shrinker_rwsem);
> > @@ -118,7 +118,7 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
> > return id;
> > shrinker->debugfs_id = id;
> >
> > - snprintf(buf, sizeof(buf), "%d", id);
> > + snprintf(buf, sizeof(buf), "%s-%d", shrinker->name, id);
> >
> > /* create debugfs entry */
> > entry = debugfs_create_dir(buf, shrinker_debugfs_root);
> > @@ -133,10 +133,51 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
> > return 0;
> > }
> >
> > +int shrinker_debugfs_rename(struct shrinker *shrinker, const char *fmt, ...)
> > +{
> > + struct dentry *entry;
> > + char buf[128];
> > + const char *old;
> > + va_list ap;
> > + int ret = 0;
> > +
> > + down_write(&shrinker_rwsem);
> > +
> > + old = shrinker->name;
> > +
> > + va_start(ap, fmt);
> > + shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
> > + va_end(ap);
> > + if (!shrinker->name) {
> > + shrinker->name = old;
> > + ret = -ENOMEM;
>
> Seems we could move those 6 lines out of shrinker_rwsem. I know
> this function is not in a hot path, but it is better to improve
> it if it is easy, right?
Not sure if it's worth it, but ok, it's indeed easy.
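A rough userspace sketch of that reordering, for illustration only. All demo_* names are made up and are not kernel API; a counter stub stands in for shrinker_rwsem, and malloc()/vsnprintf() stand in for kvasprintf_const(). The idea is simply to format the new name before taking the lock, so that only the pointer swap happens under it:

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct shrinker. */
struct demo_shrinker {
	char *name;
	int id;
};

/* Stub lock: only tracks nesting depth, it does not block anything. */
static int demo_lock_depth;
static void demo_down_write(void) { demo_lock_depth++; }
static void demo_up_write(void) { demo_lock_depth--; }

/* Format into a freshly allocated buffer; returns NULL on failure. */
static char *demo_vasprintf(const char *fmt, va_list ap)
{
	va_list aq;
	char *p;
	int len;

	/* First pass on a copy of the arguments to learn the length. */
	va_copy(aq, ap);
	len = vsnprintf(NULL, 0, fmt, aq);
	va_end(aq);
	if (len < 0)
		return NULL;

	p = malloc((size_t)len + 1);
	if (p)
		vsnprintf(p, (size_t)len + 1, fmt, ap);
	return p;
}

int demo_rename(struct demo_shrinker *s, const char *fmt, ...)
{
	char *name, *old;
	va_list ap;

	/* Allocation and formatting happen before the lock is taken. */
	va_start(ap, fmt);
	name = demo_vasprintf(fmt, ap);
	va_end(ap);
	if (!name)
		return -ENOMEM;	/* the old name is left untouched */

	/* Only the pointer swap is done under the (stub) write lock. */
	demo_down_write();
	old = s->name;
	s->name = name;
	demo_up_write();

	free(old);
	return 0;
}
```

With this shape the failure path never touches the lock at all, which also removes the need to restore the old pointer.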
>
> > + } else if (shrinker->debugfs_entry) {
> > + snprintf(buf, sizeof(buf), "%s-%d", shrinker->name,
> > + shrinker->debugfs_id);
> > +
> > + entry = debugfs_rename(shrinker_debugfs_root,
> > + shrinker->debugfs_entry,
> > + shrinker_debugfs_root, buf);
> > + if (IS_ERR(entry))
> > + ret = PTR_ERR(entry);
> > + else
> > + shrinker->debugfs_entry = entry;
> > +
> > + kfree_const(old);
> > + }
> > +
> > + up_write(&shrinker_rwsem);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL(shrinker_debugfs_rename);
> > +
> > void shrinker_debugfs_remove(struct shrinker *shrinker)
> > {
> > lockdep_assert_held(&shrinker_rwsem);
> >
> > + kfree_const(shrinker->name);
> > +
> > if (!shrinker->debugfs_entry)
> > return;
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 024f7056b98c..42bae0fd0442 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -613,7 +613,7 @@ static unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru,
> > /*
> > * Add a shrinker callback to be called from the vm.
> > */
> > -int prealloc_shrinker(struct shrinker *shrinker)
> > +static int __prealloc_shrinker(struct shrinker *shrinker)
> > {
> > unsigned int size;
> > int err;
> > @@ -637,8 +637,36 @@ int prealloc_shrinker(struct shrinker *shrinker)
> > return 0;
> > }
> >
> > +#ifdef CONFIG_SHRINKER_DEBUG
> > +int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...)
> > +{
> > + va_list ap;
> > + int err;
> > +
> > + va_start(ap, fmt);
> > + shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
> > + va_end(ap);
> > + if (!shrinker->name)
> > + return -ENOMEM;
> > +
> > + err = __prealloc_shrinker(shrinker);
> > + if (err)
> > + kfree_const(shrinker->name);
> > +
> > + return err;
> > +}
> > +#else
> > +int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...)
> > +{
> > + return __prealloc_shrinker(shrinker);
> > +}
> > +#endif
> > +
> > void free_prealloced_shrinker(struct shrinker *shrinker)
> > {
> > +#ifdef CONFIG_SHRINKER_DEBUG
> > + kfree_const(shrinker->name);
> > +#endif
> > if (shrinker->flags & SHRINKER_MEMCG_AWARE) {
> > down_write(&shrinker_rwsem);
> > unregister_memcg_shrinker(shrinker);
> > @@ -659,15 +687,39 @@ void register_shrinker_prepared(struct shrinker *shrinker)
> > up_write(&shrinker_rwsem);
> > }
> >
> > -int register_shrinker(struct shrinker *shrinker)
> > +static int __register_shrinker(struct shrinker *shrinker)
> > {
> > - int err = prealloc_shrinker(shrinker);
> > + int err = __prealloc_shrinker(shrinker);
> >
> > if (err)
> > return err;
> > register_shrinker_prepared(shrinker);
> > return 0;
> > }
> > +
> > +#ifdef CONFIG_SHRINKER_DEBUG
> > +int register_shrinker(struct shrinker *shrinker, const char *fmt, ...)
> > +{
> > + va_list ap;
> > + int err;
> > +
> > + va_start(ap, fmt);
> > + shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
> > + va_end(ap);
> > + if (!shrinker->name)
> > + return -ENOMEM;
> > +
>
> How about moving the initialization of the name into shrinker_debugfs_add()?
> Then maybe we could hide the handling of "name" from vmscan.c.
shrinker_debugfs_add() can be delayed up to the moment when debugfs is
initialized. Some shrinkers are registered earlier. So it's not (easily)
possible.
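To illustrate the ordering problem, here is a simplified userspace model. The demo_* names, the fixed-size array and the "ready" flag are all assumptions made for the sketch, not the actual kernel code. A shrinker registered before debugfs is up has to carry its name around until a later init pass creates the directory:

```c
#include <stdbool.h>
#include <stdio.h>

#define DEMO_MAX_SHRINKERS 8

/* Hypothetical userspace stand-in for struct shrinker. */
struct demo_shrinker {
	const char *name;	/* stored at registration time */
	int id;			/* -1 until a directory exists */
	char dir[64];		/* models the debugfs directory name */
};

static struct demo_shrinker *demo_list[DEMO_MAX_SHRINKERS];
static int demo_count;
static int demo_next_id;
static bool demo_debugfs_ready;

/* Models directory creation: it needs the name to already be known. */
static void demo_debugfs_add(struct demo_shrinker *s)
{
	s->id = demo_next_id++;
	snprintf(s->dir, sizeof(s->dir), "%s-%d", s->name, s->id);
}

int demo_register(struct demo_shrinker *s, const char *name)
{
	if (demo_count == DEMO_MAX_SHRINKERS)
		return -1;

	/* The name is saved here, because the directory may come later. */
	s->name = name;
	s->id = -1;
	s->dir[0] = '\0';
	demo_list[demo_count++] = s;

	if (demo_debugfs_ready)
		demo_debugfs_add(s);
	return 0;
}

/* Models the late init pass that catches up on early registrations. */
void demo_debugfs_late_init(void)
{
	int i;

	demo_debugfs_ready = true;
	for (i = 0; i < demo_count; i++)
		if (demo_list[i]->id < 0)
			demo_debugfs_add(demo_list[i]);
}
```

If the name were only formatted inside demo_debugfs_add(), the format arguments passed by an early caller would be long gone by the time the late pass runs.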
Thanks!
On Mon, May 23, 2022 at 08:13:41AM +1000, Dave Chinner wrote:
> On Mon, May 09, 2022 at 11:38:17AM -0700, Roman Gushchin wrote:
> > Currently shrinkers are anonymous objects. For debugging purposes they
> > can be identified by count/scan function names, but it's not always
> > useful: e.g. for superblock's shrinkers it's nice to have at least
> > an idea of which superblock the shrinker belongs to.
> >
> > This commit adds names to shrinkers. register_shrinker() and
> > prealloc_shrinker() functions are extended to take a format and
> > arguments to construct a name.
> >
> > In some cases it's not possible to determine a good name at the time
> > when a shrinker is allocated. For such cases shrinker_debugfs_rename()
> > is provided.
> >
> > After this change the shrinker debugfs directory looks like:
> > $ cd /sys/kernel/debug/shrinker/
> > $ ls
> > dqcache-16 sb-hugetlbfs-17 sb-rootfs-2 sb-tmpfs-50
> > kfree_rcu-0 sb-hugetlbfs-33 sb-securityfs-6 sb-tracefs-13
> > sb-aio-20 sb-iomem-12 sb-selinuxfs-22 sb-xfs:vda1-36
> > sb-anon_inodefs-15 sb-mqueue-21 sb-sockfs-8 sb-zsmalloc-19
> > sb-bdev-3 sb-nsfs-4 sb-sysfs-26 shadow-18
> > sb-bpf-32 sb-pipefs-14 sb-tmpfs-1 thp_deferred_split-10
> > sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-27 thp_zero-9
> > sb-cgroup2-30 sb-proc-39 sb-tmpfs-29 xfs_buf-37
> > sb-configfs-23 sb-proc-41 sb-tmpfs-35 xfs_inodegc-38
> ^^^^^^^^^^^^^^
>
> These XFS shrinkers are also per-block device, like the superblock.
> They need to read like "sb-xfs:vda1-36", and even though it is not
> in this list, the xfs dquot shrinker will need this as well.
Sure, will do in v4. Thanks!
>
>
> > sb-dax-11 sb-proc-45 sb-tmpfs-40 zspool-34
> > sb-debugfs-7 sb-proc-46 sb-tmpfs-42
> > sb-devpts-28 sb-proc-47 sb-tmpfs-43
> > sb-devtmpfs-5 sb-pstore-31 sb-tmpfs-44
>
> The proc and tmpfs shrinkers have the same problem - what instance
> do they actually refer to?
Any ideas on how to name/identify them?
On Mon, May 23, 2022 at 03:06:22PM -0700, Roman Gushchin wrote:
> On Sun, May 22, 2022 at 07:08:24PM +0800, Muchun Song wrote:
> > On Mon, May 09, 2022 at 11:38:17AM -0700, Roman Gushchin wrote:
> > > Currently shrinkers are anonymous objects. For debugging purposes they
> > > can be identified by count/scan function names, but it's not always
> > > > useful: e.g. for superblock's shrinkers it's nice to have at least
> > > > an idea of which superblock the shrinker belongs to.
> > > >
> > > > This commit adds names to shrinkers. register_shrinker() and
> > > > prealloc_shrinker() functions are extended to take a format and
> > > > arguments to construct a name.
> > >
> > > In some cases it's not possible to determine a good name at the time
> > > when a shrinker is allocated. For such cases shrinker_debugfs_rename()
> > > is provided.
> > >
> > > After this change the shrinker debugfs directory looks like:
> > > $ cd /sys/kernel/debug/shrinker/
> > > $ ls
> > > dqcache-16 sb-hugetlbfs-17 sb-rootfs-2 sb-tmpfs-50
> > > kfree_rcu-0 sb-hugetlbfs-33 sb-securityfs-6 sb-tracefs-13
> > > sb-aio-20 sb-iomem-12 sb-selinuxfs-22 sb-xfs:vda1-36
> > > sb-anon_inodefs-15 sb-mqueue-21 sb-sockfs-8 sb-zsmalloc-19
> > > sb-bdev-3 sb-nsfs-4 sb-sysfs-26 shadow-18
> > > sb-bpf-32 sb-pipefs-14 sb-tmpfs-1 thp_deferred_split-10
> > > sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-27 thp_zero-9
> > > sb-cgroup2-30 sb-proc-39 sb-tmpfs-29 xfs_buf-37
> > > sb-configfs-23 sb-proc-41 sb-tmpfs-35 xfs_inodegc-38
> > > sb-dax-11 sb-proc-45 sb-tmpfs-40 zspool-34
> > > sb-debugfs-7 sb-proc-46 sb-tmpfs-42
> > > sb-devpts-28 sb-proc-47 sb-tmpfs-43
> > > sb-devtmpfs-5 sb-pstore-31 sb-tmpfs-44
> > >
> > > Signed-off-by: Roman Gushchin <[email protected]>
> > > ---
> > > arch/x86/kvm/mmu/mmu.c | 2 +-
> > > drivers/android/binder_alloc.c | 2 +-
> > > drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 3 +-
> > > drivers/gpu/drm/msm/msm_gem_shrinker.c | 2 +-
> > > .../gpu/drm/panfrost/panfrost_gem_shrinker.c | 2 +-
> > > drivers/gpu/drm/ttm/ttm_pool.c | 2 +-
> > > drivers/md/bcache/btree.c | 2 +-
> > > drivers/md/dm-bufio.c | 2 +-
> > > drivers/md/dm-zoned-metadata.c | 2 +-
> > > drivers/md/raid5.c | 2 +-
> > > drivers/misc/vmw_balloon.c | 2 +-
> > > drivers/virtio/virtio_balloon.c | 2 +-
> > > drivers/xen/xenbus/xenbus_probe_backend.c | 2 +-
> > > fs/btrfs/super.c | 2 +
> > > fs/erofs/utils.c | 2 +-
> > > fs/ext4/extents_status.c | 3 +-
> > > fs/f2fs/super.c | 2 +-
> > > fs/gfs2/glock.c | 2 +-
> > > fs/gfs2/main.c | 2 +-
> > > fs/jbd2/journal.c | 2 +-
> > > fs/mbcache.c | 2 +-
> > > fs/nfs/nfs42xattr.c | 7 ++-
> > > fs/nfs/super.c | 2 +-
> > > fs/nfsd/filecache.c | 2 +-
> > > fs/nfsd/nfscache.c | 2 +-
> > > fs/quota/dquot.c | 2 +-
> > > fs/super.c | 6 +-
> > > fs/ubifs/super.c | 2 +-
> > > fs/xfs/xfs_buf.c | 2 +-
> > > fs/xfs/xfs_icache.c | 2 +-
> > > fs/xfs/xfs_qm.c | 2 +-
> > > include/linux/shrinker.h | 12 +++-
> > > kernel/rcu/tree.c | 2 +-
> > > mm/huge_memory.c | 4 +-
> > > mm/shrinker_debug.c | 45 +++++++++++++-
> > > mm/vmscan.c | 58 ++++++++++++++++++-
> > > mm/workingset.c | 2 +-
> > > mm/zsmalloc.c | 2 +-
> > > net/sunrpc/auth.c | 2 +-
> > > 39 files changed, 154 insertions(+), 46 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index c623019929a7..8cfabdd63406 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -6283,7 +6283,7 @@ int kvm_mmu_vendor_module_init(void)
> > > if (percpu_counter_init(&kvm_total_used_mmu_pages, 0, GFP_KERNEL))
> > > goto out;
> > >
> > > - ret = register_shrinker(&mmu_shrinker);
> > > + ret = register_shrinker(&mmu_shrinker, "mmu");
> > > if (ret)
> > > goto out;
> > >
> > > diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
> > > index 2ac1008a5f39..951343c41ba8 100644
> > > --- a/drivers/android/binder_alloc.c
> > > +++ b/drivers/android/binder_alloc.c
> > > @@ -1084,7 +1084,7 @@ int binder_alloc_shrinker_init(void)
> > > int ret = list_lru_init(&binder_alloc_lru);
> > >
> > > if (ret == 0) {
> > > - ret = register_shrinker(&binder_shrinker);
> > > + ret = register_shrinker(&binder_shrinker, "binder");
> > > if (ret)
> > > list_lru_destroy(&binder_alloc_lru);
> > > }
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> > > index 6a6ff98a8746..85524ef92ea4 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> > > @@ -426,7 +426,8 @@ void i915_gem_driver_register__shrinker(struct drm_i915_private *i915)
> > > i915->mm.shrinker.count_objects = i915_gem_shrinker_count;
> > > i915->mm.shrinker.seeks = DEFAULT_SEEKS;
> > > i915->mm.shrinker.batch = 4096;
> > > - drm_WARN_ON(&i915->drm, register_shrinker(&i915->mm.shrinker));
> > > + drm_WARN_ON(&i915->drm, register_shrinker(&i915->mm.shrinker,
> > > + "drm_i915_gem"));
> > >
> > > i915->mm.oom_notifier.notifier_call = i915_gem_shrinker_oom;
> > > drm_WARN_ON(&i915->drm, register_oom_notifier(&i915->mm.oom_notifier));
> > > diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c
> > > index 086dacf2f26a..2d3cf4f13dfd 100644
> > > --- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
> > > +++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
> > > @@ -221,7 +221,7 @@ void msm_gem_shrinker_init(struct drm_device *dev)
> > > priv->shrinker.count_objects = msm_gem_shrinker_count;
> > > priv->shrinker.scan_objects = msm_gem_shrinker_scan;
> > > priv->shrinker.seeks = DEFAULT_SEEKS;
> > > - WARN_ON(register_shrinker(&priv->shrinker));
> > > + WARN_ON(register_shrinker(&priv->shrinker, "drm_msm_gem"));
> > >
> > > priv->vmap_notifier.notifier_call = msm_gem_shrinker_vmap;
> > > WARN_ON(register_vmap_purge_notifier(&priv->vmap_notifier));
> > > diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> > > index 77e7cb6d1ae3..0d028266ee9e 100644
> > > --- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> > > +++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> > > @@ -103,7 +103,7 @@ void panfrost_gem_shrinker_init(struct drm_device *dev)
> > > pfdev->shrinker.count_objects = panfrost_gem_shrinker_count;
> > > pfdev->shrinker.scan_objects = panfrost_gem_shrinker_scan;
> > > pfdev->shrinker.seeks = DEFAULT_SEEKS;
> > > - WARN_ON(register_shrinker(&pfdev->shrinker));
> > > + WARN_ON(register_shrinker(&pfdev->shrinker, "drm_panfrost"));
> > > }
> > >
> > > /**
> > > diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
> > > index 1bba0a0ed3f9..b8b41d242197 100644
> > > --- a/drivers/gpu/drm/ttm/ttm_pool.c
> > > +++ b/drivers/gpu/drm/ttm/ttm_pool.c
> > > @@ -722,7 +722,7 @@ int ttm_pool_mgr_init(unsigned long num_pages)
> > > mm_shrinker.count_objects = ttm_pool_shrinker_count;
> > > mm_shrinker.scan_objects = ttm_pool_shrinker_scan;
> > > mm_shrinker.seeks = 1;
> > > - return register_shrinker(&mm_shrinker);
> > > + return register_shrinker(&mm_shrinker, "drm_ttm_pool");
> > > }
> > >
> > > /**
> > > diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
> > > index ad9f16689419..c1f734ab86b3 100644
> > > --- a/drivers/md/bcache/btree.c
> > > +++ b/drivers/md/bcache/btree.c
> > > @@ -812,7 +812,7 @@ int bch_btree_cache_alloc(struct cache_set *c)
> > > c->shrink.seeks = 4;
> > > c->shrink.batch = c->btree_pages * 2;
> > >
> > > - if (register_shrinker(&c->shrink))
> > > + if (register_shrinker(&c->shrink, "btree"))
> > > pr_warn("bcache: %s: could not register shrinker\n",
> > > __func__);
> > >
> > > diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
> > > index 5ffa1dcf84cf..bcc95898c341 100644
> > > --- a/drivers/md/dm-bufio.c
> > > +++ b/drivers/md/dm-bufio.c
> > > @@ -1806,7 +1806,7 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign
> > > c->shrinker.scan_objects = dm_bufio_shrink_scan;
> > > c->shrinker.seeks = 1;
> > > c->shrinker.batch = 0;
> > > - r = register_shrinker(&c->shrinker);
> > > + r = register_shrinker(&c->shrinker, "dm_bufio");
> > > if (r)
> > > goto bad;
> > >
> > > diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
> > > index d1ea66114d14..05f2fd12066b 100644
> > > --- a/drivers/md/dm-zoned-metadata.c
> > > +++ b/drivers/md/dm-zoned-metadata.c
> > > @@ -2944,7 +2944,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev,
> > > zmd->mblk_shrinker.seeks = DEFAULT_SEEKS;
> > >
> > > /* Metadata cache shrinker */
> > > - ret = register_shrinker(&zmd->mblk_shrinker);
> > > + ret = register_shrinker(&zmd->mblk_shrinker, "md_meta");
> > > if (ret) {
> > > dmz_zmd_err(zmd, "Register metadata cache shrinker failed");
> > > goto err;
> > > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> > > index 59f91e392a2a..34ddebd3aff7 100644
> > > --- a/drivers/md/raid5.c
> > > +++ b/drivers/md/raid5.c
> > > @@ -7383,7 +7383,7 @@ static struct r5conf *setup_conf(struct mddev *mddev)
> > > conf->shrinker.count_objects = raid5_cache_count;
> > > conf->shrinker.batch = 128;
> > > conf->shrinker.flags = 0;
> > > - if (register_shrinker(&conf->shrinker)) {
> > > + if (register_shrinker(&conf->shrinker, "md-%s", mdname(mddev))) {
> > > pr_warn("md/raid:%s: couldn't register shrinker.\n",
> > > mdname(mddev));
> > > goto abort;
> > > diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
> > > index f1d8ba6d4857..6c9ddf1187dd 100644
> > > --- a/drivers/misc/vmw_balloon.c
> > > +++ b/drivers/misc/vmw_balloon.c
> > > @@ -1587,7 +1587,7 @@ static int vmballoon_register_shrinker(struct vmballoon *b)
> > > b->shrinker.count_objects = vmballoon_shrinker_count;
> > > b->shrinker.seeks = DEFAULT_SEEKS;
> > >
> > > - r = register_shrinker(&b->shrinker);
> > > + r = register_shrinker(&b->shrinker, "vmw_balloon");
> > >
> > > if (r == 0)
> > > b->shrinker_registered = true;
> > > diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> > > index f4c34a2a6b8e..093e06e19d0e 100644
> > > --- a/drivers/virtio/virtio_balloon.c
> > > +++ b/drivers/virtio/virtio_balloon.c
> > > @@ -875,7 +875,7 @@ static int virtio_balloon_register_shrinker(struct virtio_balloon *vb)
> > > vb->shrinker.count_objects = virtio_balloon_shrinker_count;
> > > vb->shrinker.seeks = DEFAULT_SEEKS;
> > >
> > > - return register_shrinker(&vb->shrinker);
> > > + return register_shrinker(&vb->shrinker, "virtio_balloon");
> > > }
> > >
> > > static int virtballoon_probe(struct virtio_device *vdev)
> > > diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c
> > > index 5abded97e1a7..a6c5e344017d 100644
> > > --- a/drivers/xen/xenbus/xenbus_probe_backend.c
> > > +++ b/drivers/xen/xenbus/xenbus_probe_backend.c
> > > @@ -305,7 +305,7 @@ static int __init xenbus_probe_backend_init(void)
> > >
> > > register_xenstore_notifier(&xenstore_notifier);
> > >
> > > - if (register_shrinker(&backend_memory_shrinker))
> > > + if (register_shrinker(&backend_memory_shrinker, "xen_backend"))
> > > pr_warn("shrinker registration failed\n");
> > >
> > > return 0;
> > > diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> > > index 206f44005c52..062dbd8071e2 100644
> > > --- a/fs/btrfs/super.c
> > > +++ b/fs/btrfs/super.c
> > > @@ -1790,6 +1790,8 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
> > > error = -EBUSY;
> > > } else {
> > > snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev);
> > > + shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s", fs_type->name,
> > > + s->s_id);
> > > btrfs_sb(s)->bdev_holder = fs_type;
> > > if (!strstr(crc32c_impl(), "generic"))
> > > set_bit(BTRFS_FS_CSUM_IMPL_FAST, &fs_info->flags);
> > > diff --git a/fs/erofs/utils.c b/fs/erofs/utils.c
> > > index ec9a1d780dc1..67eb64fadd4f 100644
> > > --- a/fs/erofs/utils.c
> > > +++ b/fs/erofs/utils.c
> > > @@ -282,7 +282,7 @@ static struct shrinker erofs_shrinker_info = {
> > >
> > > int __init erofs_init_shrinker(void)
> > > {
> > > - return register_shrinker(&erofs_shrinker_info);
> > > + return register_shrinker(&erofs_shrinker_info, "erofs");
> > > }
> > >
> > > void erofs_exit_shrinker(void)
> > > diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
> > > index 9a3a8996aacf..a7aa79d580e5 100644
> > > --- a/fs/ext4/extents_status.c
> > > +++ b/fs/ext4/extents_status.c
> > > @@ -1650,11 +1650,10 @@ int ext4_es_register_shrinker(struct ext4_sb_info *sbi)
> > > err = percpu_counter_init(&sbi->s_es_stats.es_stats_shk_cnt, 0, GFP_KERNEL);
> > > if (err)
> > > goto err3;
> > > -
> > > sbi->s_es_shrinker.scan_objects = ext4_es_scan;
> > > sbi->s_es_shrinker.count_objects = ext4_es_count;
> > > sbi->s_es_shrinker.seeks = DEFAULT_SEEKS;
> > > - err = register_shrinker(&sbi->s_es_shrinker);
> > > + err = register_shrinker(&sbi->s_es_shrinker, "ext4_es");
> > > if (err)
> > > goto err4;
> > >
> > > diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> > > index 4368f90571bd..2fc40a1635f3 100644
> > > --- a/fs/f2fs/super.c
> > > +++ b/fs/f2fs/super.c
> > > @@ -4579,7 +4579,7 @@ static int __init init_f2fs_fs(void)
> > > err = f2fs_init_sysfs();
> > > if (err)
> > > goto free_garbage_collection_cache;
> > > - err = register_shrinker(&f2fs_shrinker_info);
> > > + err = register_shrinker(&f2fs_shrinker_info, "f2fs");
> > > if (err)
> > > goto free_sysfs;
> > > err = register_filesystem(&f2fs_fs_type);
> > > diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
> > > index 26169cedcefc..791c23d9f7e7 100644
> > > --- a/fs/gfs2/glock.c
> > > +++ b/fs/gfs2/glock.c
> > > @@ -2549,7 +2549,7 @@ int __init gfs2_glock_init(void)
> > > return -ENOMEM;
> > > }
> > >
> > > - ret = register_shrinker(&glock_shrinker);
> > > + ret = register_shrinker(&glock_shrinker, "gfs2_glock");
> > > if (ret) {
> > > destroy_workqueue(gfs2_delete_workqueue);
> > > destroy_workqueue(glock_workqueue);
> > > diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
> > > index 28d0eb23e18e..dde981b78488 100644
> > > --- a/fs/gfs2/main.c
> > > +++ b/fs/gfs2/main.c
> > > @@ -150,7 +150,7 @@ static int __init init_gfs2_fs(void)
> > > if (!gfs2_trans_cachep)
> > > goto fail_cachep8;
> > >
> > > - error = register_shrinker(&gfs2_qd_shrinker);
> > > + error = register_shrinker(&gfs2_qd_shrinker, "gfs2_qd");
> > > if (error)
> > > goto fail_shrinker;
> > >
> > > diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> > > index c0cbeeaec2d1..e7786445ecc1 100644
> > > --- a/fs/jbd2/journal.c
> > > +++ b/fs/jbd2/journal.c
> > > @@ -1418,7 +1418,7 @@ static journal_t *journal_init_common(struct block_device *bdev,
> > > if (percpu_counter_init(&journal->j_checkpoint_jh_count, 0, GFP_KERNEL))
> > > goto err_cleanup;
> > >
> > > - if (register_shrinker(&journal->j_shrinker)) {
> > > + if (register_shrinker(&journal->j_shrinker, "jbd2_journal")) {
> > > percpu_counter_destroy(&journal->j_checkpoint_jh_count);
> > > goto err_cleanup;
> > > }
> > > diff --git a/fs/mbcache.c b/fs/mbcache.c
> > > index 97c54d3a2227..379dc5b0b6ad 100644
> > > --- a/fs/mbcache.c
> > > +++ b/fs/mbcache.c
> > > @@ -367,7 +367,7 @@ struct mb_cache *mb_cache_create(int bucket_bits)
> > > cache->c_shrink.count_objects = mb_cache_count;
> > > cache->c_shrink.scan_objects = mb_cache_scan;
> > > cache->c_shrink.seeks = DEFAULT_SEEKS;
> > > - if (register_shrinker(&cache->c_shrink)) {
> > > + if (register_shrinker(&cache->c_shrink, "mb_cache")) {
> > > kfree(cache->c_hash);
> > > kfree(cache);
> > > goto err_out;
> > > diff --git a/fs/nfs/nfs42xattr.c b/fs/nfs/nfs42xattr.c
> > > index e7b34f7e0614..147b8a2f2dc6 100644
> > > --- a/fs/nfs/nfs42xattr.c
> > > +++ b/fs/nfs/nfs42xattr.c
> > > @@ -1017,15 +1017,16 @@ int __init nfs4_xattr_cache_init(void)
> > > if (ret)
> > > goto out2;
> > >
> > > - ret = register_shrinker(&nfs4_xattr_cache_shrinker);
> > > + ret = register_shrinker(&nfs4_xattr_cache_shrinker, "nfs_xattr_cache");
> > > if (ret)
> > > goto out1;
> > >
> > > - ret = register_shrinker(&nfs4_xattr_entry_shrinker);
> > > + ret = register_shrinker(&nfs4_xattr_entry_shrinker, "nfs_xattr_entry");
> > > if (ret)
> > > goto out;
> > >
> > > - ret = register_shrinker(&nfs4_xattr_large_entry_shrinker);
> > > + ret = register_shrinker(&nfs4_xattr_large_entry_shrinker,
> > > + "nfs_xattr_large_entry");
> > > if (!ret)
> > > return 0;
> > >
> > > diff --git a/fs/nfs/super.c b/fs/nfs/super.c
> > > index 6ab5eeb000dc..c7a2aef911f1 100644
> > > --- a/fs/nfs/super.c
> > > +++ b/fs/nfs/super.c
> > > @@ -149,7 +149,7 @@ int __init register_nfs_fs(void)
> > > ret = nfs_register_sysctl();
> > > if (ret < 0)
> > > goto error_2;
> > > - ret = register_shrinker(&acl_shrinker);
> > > + ret = register_shrinker(&acl_shrinker, "nfs_acl");
> > > if (ret < 0)
> > > goto error_3;
> > > #ifdef CONFIG_NFS_V4_2
> > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > > index 2c1b027774d4..9c2879a3c3c0 100644
> > > --- a/fs/nfsd/filecache.c
> > > +++ b/fs/nfsd/filecache.c
> > > @@ -666,7 +666,7 @@ nfsd_file_cache_init(void)
> > > goto out_err;
> > > }
> > >
> > > - ret = register_shrinker(&nfsd_file_shrinker);
> > > + ret = register_shrinker(&nfsd_file_shrinker, "nfsd_filecache");
> > > if (ret) {
> > > pr_err("nfsd: failed to register nfsd_file_shrinker: %d\n", ret);
> > > goto out_lru;
> > > diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
> > > index 0b3f12aa37ff..f1cfb06d0be5 100644
> > > --- a/fs/nfsd/nfscache.c
> > > +++ b/fs/nfsd/nfscache.c
> > > @@ -176,7 +176,7 @@ int nfsd_reply_cache_init(struct nfsd_net *nn)
> > > nn->nfsd_reply_cache_shrinker.scan_objects = nfsd_reply_cache_scan;
> > > nn->nfsd_reply_cache_shrinker.count_objects = nfsd_reply_cache_count;
> > > nn->nfsd_reply_cache_shrinker.seeks = 1;
> > > - status = register_shrinker(&nn->nfsd_reply_cache_shrinker);
> > > + status = register_shrinker(&nn->nfsd_reply_cache_shrinker, "nfsd_reply");
> > > if (status)
> > > goto out_stats_destroy;
> > >
> > > diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> > > index a74aef99bd3d..854d2b1d0914 100644
> > > --- a/fs/quota/dquot.c
> > > +++ b/fs/quota/dquot.c
> > > @@ -2985,7 +2985,7 @@ static int __init dquot_init(void)
> > > pr_info("VFS: Dquot-cache hash table entries: %ld (order %ld,"
> > > " %ld bytes)\n", nr_hash, order, (PAGE_SIZE << order));
> > >
> > > - if (register_shrinker(&dqcache_shrinker))
> > > + if (register_shrinker(&dqcache_shrinker, "dqcache"))
> > > panic("Cannot register dquot shrinker");
> > >
> > > return 0;
> > > diff --git a/fs/super.c b/fs/super.c
> > > index 60f57c7bc0a6..4fca6657f442 100644
> > > --- a/fs/super.c
> > > +++ b/fs/super.c
> > > @@ -265,7 +265,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
> > > s->s_shrink.count_objects = super_cache_count;
> > > s->s_shrink.batch = 1024;
> > > s->s_shrink.flags = SHRINKER_NUMA_AWARE | SHRINKER_MEMCG_AWARE;
> > > - if (prealloc_shrinker(&s->s_shrink))
> > > + if (prealloc_shrinker(&s->s_shrink, "sb-%s", type->name))
> > > goto fail;
> > > if (list_lru_init_memcg(&s->s_dentry_lru, &s->s_shrink))
> > > goto fail;
> > > @@ -1288,6 +1288,8 @@ int get_tree_bdev(struct fs_context *fc,
> > > } else {
> > > s->s_mode = mode;
> > > snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev);
> > > + shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s",
> > > + fc->fs_type->name, s->s_id);
> > > sb_set_blocksize(s, block_size(bdev));
> > > error = fill_super(s, fc);
> > > if (error) {
> > > @@ -1363,6 +1365,8 @@ struct dentry *mount_bdev(struct file_system_type *fs_type,
> > > } else {
> > > s->s_mode = mode;
> > > snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev);
> > > + shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s",
> > > + fs_type->name, s->s_id);
> > > sb_set_blocksize(s, block_size(bdev));
> > > error = fill_super(s, data, flags & SB_SILENT ? 1 : 0);
> > > if (error) {
> > > diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
> > > index bad67455215f..a3663d201f64 100644
> > > --- a/fs/ubifs/super.c
> > > +++ b/fs/ubifs/super.c
> > > @@ -2430,7 +2430,7 @@ static int __init ubifs_init(void)
> > > if (!ubifs_inode_slab)
> > > return -ENOMEM;
> > >
> > > - err = register_shrinker(&ubifs_shrinker_info);
> > > + err = register_shrinker(&ubifs_shrinker_info, "ubifs");
> > > if (err)
> > > goto out_slab;
> > >
> > > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> > > index e1afb9e503e1..5645e92df0c9 100644
> > > --- a/fs/xfs/xfs_buf.c
> > > +++ b/fs/xfs/xfs_buf.c
> > > @@ -1986,7 +1986,7 @@ xfs_alloc_buftarg(
> > > btp->bt_shrinker.scan_objects = xfs_buftarg_shrink_scan;
> > > btp->bt_shrinker.seeks = DEFAULT_SEEKS;
> > > btp->bt_shrinker.flags = SHRINKER_NUMA_AWARE;
> > > - if (register_shrinker(&btp->bt_shrinker))
> > > + if (register_shrinker(&btp->bt_shrinker, "xfs_buf"))
> > > goto error_pcpu;
> > > return btp;
> > >
> > > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> > > index bffd6eb0b298..d0c4e74ff763 100644
> > > --- a/fs/xfs/xfs_icache.c
> > > +++ b/fs/xfs/xfs_icache.c
> > > @@ -2198,5 +2198,5 @@ xfs_inodegc_register_shrinker(
> > > shrink->flags = SHRINKER_NONSLAB;
> > > shrink->batch = XFS_INODEGC_SHRINKER_BATCH;
> > >
> > > - return register_shrinker(shrink);
> > > + return register_shrinker(shrink, "xfs_inodegc");
> > > }
> > > diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> > > index f165d1a3de1d..93ded9e81f49 100644
> > > --- a/fs/xfs/xfs_qm.c
> > > +++ b/fs/xfs/xfs_qm.c
> > > @@ -686,7 +686,7 @@ xfs_qm_init_quotainfo(
> > > qinf->qi_shrinker.seeks = DEFAULT_SEEKS;
> > > qinf->qi_shrinker.flags = SHRINKER_NUMA_AWARE;
> > >
> > > - error = register_shrinker(&qinf->qi_shrinker);
> > > + error = register_shrinker(&qinf->qi_shrinker, "xfs_qm");
> > > if (error)
> > > goto out_free_inos;
> > >
> > > diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
> > > index 2ced8149c513..64416f3e0a1f 100644
> > > --- a/include/linux/shrinker.h
> > > +++ b/include/linux/shrinker.h
> > > @@ -75,6 +75,7 @@ struct shrinker {
> > > #endif
> > > #ifdef CONFIG_SHRINKER_DEBUG
> > > int debugfs_id;
> > > + const char *name;
> > > struct dentry *debugfs_entry;
> > > #endif
> > > /* objs pending delete, per node */
> > > @@ -92,9 +93,9 @@ struct shrinker {
> > > */
> > > #define SHRINKER_NONSLAB (1 << 3)
> > >
> > > -extern int prealloc_shrinker(struct shrinker *shrinker);
> > > +extern int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...);
> > > extern void register_shrinker_prepared(struct shrinker *shrinker);
> > > -extern int register_shrinker(struct shrinker *shrinker);
> > > +extern int register_shrinker(struct shrinker *shrinker, const char *fmt, ...);
> > > extern void unregister_shrinker(struct shrinker *shrinker);
> > > extern void free_prealloced_shrinker(struct shrinker *shrinker);
> > > extern void synchronize_shrinkers(void);
> > > @@ -102,6 +103,8 @@ extern void synchronize_shrinkers(void);
> > > #ifdef CONFIG_SHRINKER_DEBUG
> > > extern int shrinker_debugfs_add(struct shrinker *shrinker);
> > > extern void shrinker_debugfs_remove(struct shrinker *shrinker);
> > > +extern int shrinker_debugfs_rename(struct shrinker *shrinker,
> > > + const char *fmt, ...);
> > > #else /* CONFIG_SHRINKER_DEBUG */
> > > static inline int shrinker_debugfs_add(struct shrinker *shrinker)
> > > {
> > > @@ -110,5 +113,10 @@ static inline int shrinker_debugfs_add(struct shrinker *shrinker)
> > > static inline void shrinker_debugfs_remove(struct shrinker *shrinker)
> > > {
> > > }
> > > +static inline int shrinker_debugfs_rename(struct shrinker *shrinker,
> > > + const char *fmt, ...)
> > > +{
> > > + return 0;
> > > +}
> > > #endif /* CONFIG_SHRINKER_DEBUG */
> > > #endif /* _LINUX_SHRINKER_H */
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index 5c587e00811c..b4c66916bea9 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -4978,7 +4978,7 @@ static void __init kfree_rcu_batch_init(void)
> > > INIT_DELAYED_WORK(&krcp->page_cache_work, fill_page_cache_func);
> > > krcp->initialized = true;
> > > }
> > > - if (register_shrinker(&kfree_rcu_shrinker))
> > > + if (register_shrinker(&kfree_rcu_shrinker, "kfree_rcu"))
> > > pr_err("Failed to register kfree_rcu() shrinker!\n");
> > > }
> > >
> > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > > index fa6a1623976a..a40df19c0e38 100644
> > > --- a/mm/huge_memory.c
> > > +++ b/mm/huge_memory.c
> > > @@ -423,10 +423,10 @@ static int __init hugepage_init(void)
> > > if (err)
> > > goto err_slab;
> > >
> > > - err = register_shrinker(&huge_zero_page_shrinker);
> > > + err = register_shrinker(&huge_zero_page_shrinker, "thp_zero");
> > > if (err)
> > > goto err_hzp_shrinker;
> > > - err = register_shrinker(&deferred_split_shrinker);
> > > + err = register_shrinker(&deferred_split_shrinker, "thp_deferred_split");
> > > if (err)
> > > goto err_split_shrinker;
> > >
> > > diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
> > > index fd1f805a581a..28b1c1ab60ef 100644
> > > --- a/mm/shrinker_debug.c
> > > +++ b/mm/shrinker_debug.c
> > > @@ -104,7 +104,7 @@ DEFINE_SHOW_ATTRIBUTE(shrinker_debugfs_count);
> > > int shrinker_debugfs_add(struct shrinker *shrinker)
> > > {
> > > struct dentry *entry;
> > > - char buf[16];
> > > + char buf[128];
> > > int id;
> > >
> > > lockdep_assert_held(&shrinker_rwsem);
> > > @@ -118,7 +118,7 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
> > > return id;
> > > shrinker->debugfs_id = id;
> > >
> > > - snprintf(buf, sizeof(buf), "%d", id);
> > > + snprintf(buf, sizeof(buf), "%s-%d", shrinker->name, id);
> > >
> > > /* create debugfs entry */
> > > entry = debugfs_create_dir(buf, shrinker_debugfs_root);
> > > @@ -133,10 +133,51 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
> > > return 0;
> > > }
> > >
> > > +int shrinker_debugfs_rename(struct shrinker *shrinker, const char *fmt, ...)
> > > +{
> > > + struct dentry *entry;
> > > + char buf[128];
> > > + const char *old;
> > > + va_list ap;
> > > + int ret = 0;
> > > +
> > > + down_write(&shrinker_rwsem);
> > > +
> > > + old = shrinker->name;
> > > +
> > > + va_start(ap, fmt);
> > > + shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
> > > + va_end(ap);
> > > + if (!shrinker->name) {
> > > + shrinker->name = old;
> > > + ret = -ENOMEM;
> >
> > Seems we could move those 6 lines out of shrinker_rwsem. I know
> > this function is not in a hot path, but it is better to improve
> > it if it is easy, right?
>
> Not sure if it's worth it, but ok, it's indeed easy.
>
> >
> > > + } else if (shrinker->debugfs_entry) {
> > > + snprintf(buf, sizeof(buf), "%s-%d", shrinker->name,
> > > + shrinker->debugfs_id);
> > > +
> > > + entry = debugfs_rename(shrinker_debugfs_root,
> > > + shrinker->debugfs_entry,
> > > + shrinker_debugfs_root, buf);
> > > + if (IS_ERR(entry))
> > > + ret = PTR_ERR(entry);
> > > + else
> > > + shrinker->debugfs_entry = entry;
> > > +
> > > + kfree_const(old);
> > > + }
> > > +
> > > + up_write(&shrinker_rwsem);
> > > +
> > > + return ret;
> > > +}
> > > +EXPORT_SYMBOL(shrinker_debugfs_rename);
> > > +
> > > void shrinker_debugfs_remove(struct shrinker *shrinker)
> > > {
> > > lockdep_assert_held(&shrinker_rwsem);
> > >
> > > + kfree_const(shrinker->name);
> > > +
> > > if (!shrinker->debugfs_entry)
> > > return;
> > >
> > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > index 024f7056b98c..42bae0fd0442 100644
> > > --- a/mm/vmscan.c
> > > +++ b/mm/vmscan.c
> > > @@ -613,7 +613,7 @@ static unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru,
> > > /*
> > > * Add a shrinker callback to be called from the vm.
> > > */
> > > -int prealloc_shrinker(struct shrinker *shrinker)
> > > +static int __prealloc_shrinker(struct shrinker *shrinker)
> > > {
> > > unsigned int size;
> > > int err;
> > > @@ -637,8 +637,36 @@ int prealloc_shrinker(struct shrinker *shrinker)
> > > return 0;
> > > }
> > >
> > > +#ifdef CONFIG_SHRINKER_DEBUG
> > > +int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...)
> > > +{
> > > + va_list ap;
> > > + int err;
> > > +
> > > + va_start(ap, fmt);
> > > + shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
> > > + va_end(ap);
> > > + if (!shrinker->name)
> > > + return -ENOMEM;
> > > +
> > > + err = __prealloc_shrinker(shrinker);
> > > + if (err)
> > > + kfree_const(shrinker->name);
> > > +
> > > + return err;
> > > +}
> > > +#else
> > > +int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...)
> > > +{
> > > + return __prealloc_shrinker(shrinker);
> > > +}
> > > +#endif
> > > +
> > > void free_prealloced_shrinker(struct shrinker *shrinker)
> > > {
> > > +#ifdef CONFIG_SHRINKER_DEBUG
> > > + kfree_const(shrinker->name);
> > > +#endif
> > > if (shrinker->flags & SHRINKER_MEMCG_AWARE) {
> > > down_write(&shrinker_rwsem);
> > > unregister_memcg_shrinker(shrinker);
> > > @@ -659,15 +687,39 @@ void register_shrinker_prepared(struct shrinker *shrinker)
> > > up_write(&shrinker_rwsem);
> > > }
> > >
> > > -int register_shrinker(struct shrinker *shrinker)
> > > +static int __register_shrinker(struct shrinker *shrinker)
> > > {
> > > - int err = prealloc_shrinker(shrinker);
> > > + int err = __prealloc_shrinker(shrinker);
> > >
> > > if (err)
> > > return err;
> > > register_shrinker_prepared(shrinker);
> > > return 0;
> > > }
> > > +
> > > +#ifdef CONFIG_SHRINKER_DEBUG
> > > +int register_shrinker(struct shrinker *shrinker, const char *fmt, ...)
> > > +{
> > > + va_list ap;
> > > + int err;
> > > +
> > > + va_start(ap, fmt);
> > > + shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
> > > + va_end(ap);
> > > + if (!shrinker->name)
> > > + return -ENOMEM;
> > > +
> >
> > How about moving the initialization of the name into shrinker_debugfs_add()?
> > Then maybe we could hide the handling of "name" from vmscan.c.
>
> shrinker_debugfs_add() can be delayed up to the moment when debugfs is
> initialized. Some shrinkers are registered earlier. So it's not (easily)
> possible.
>
You are right. I just tried it and then I gave up.
It is not easy, as you said.
Thanks.
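
For readers following the locking suggestion above, here is a minimal
userspace sketch of the pattern being discussed: format the new name
before taking the lock, swap the pointer under the lock, and free the old
name afterwards. It is not the kernel code. struct named_obj, obj_lock,
format_name() and obj_rename() are hypothetical names, a pthread mutex
stands in for shrinker_rwsem, and malloc()/free() stand in for
kvasprintf_const()/kfree_const().

```c
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct shrinker; only the name matters here. */
struct named_obj {
	char *name;
};

/* Stands in for shrinker_rwsem held for write. */
static pthread_mutex_t obj_lock = PTHREAD_MUTEX_INITIALIZER;

/* Format a name into freshly allocated memory; returns NULL on failure. */
static char *format_name(const char *fmt, va_list ap)
{
	va_list aq;
	char *buf;
	int len;

	/* First pass only measures the formatted length. */
	va_copy(aq, ap);
	len = vsnprintf(NULL, 0, fmt, aq);
	va_end(aq);
	if (len < 0)
		return NULL;

	buf = malloc((size_t)len + 1);
	if (buf)
		vsnprintf(buf, (size_t)len + 1, fmt, ap);
	return buf;
}

/* Rename obj: allocate outside the lock, swap under it, free after it. */
int obj_rename(struct named_obj *obj, const char *fmt, ...)
{
	char *new_name, *old;
	va_list ap;

	va_start(ap, fmt);
	new_name = format_name(fmt, ap);
	va_end(ap);
	if (!new_name)
		return -1;	/* the kernel version would return -ENOMEM */

	pthread_mutex_lock(&obj_lock);
	old = obj->name;
	obj->name = new_name;
	pthread_mutex_unlock(&obj_lock);

	free(old);	/* free(NULL) is a no-op */
	return 0;
}
```

With this ordering an allocation failure leaves the old name untouched
without any rollback, and the critical section shrinks to a pointer swap
(plus, in the kernel case, the debugfs rename itself).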
On Mon, May 23, 2022 at 07:18:27PM -0700, Roman Gushchin wrote:
> On Mon, May 23, 2022 at 08:13:41AM +1000, Dave Chinner wrote:
> > On Mon, May 09, 2022 at 11:38:17AM -0700, Roman Gushchin wrote:
> > > Currently shrinkers are anonymous objects. For debugging purposes they
> > > can be identified by count/scan function names, but it's not always
> > > > useful: e.g. for superblock's shrinkers it's nice to have at least
> > > > an idea of which superblock the shrinker belongs to.
> > >
> > > This commit adds names to shrinkers. register_shrinker() and
> > > > prealloc_shrinker() functions are extended to take a format and
> > > > arguments to construct a name.
> > >
> > > In some cases it's not possible to determine a good name at the time
> > > when a shrinker is allocated. For such cases shrinker_debugfs_rename()
> > > is provided.
> > >
> > > After this change the shrinker debugfs directory looks like:
> > > $ cd /sys/kernel/debug/shrinker/
> > > $ ls
> > > dqcache-16 sb-hugetlbfs-17 sb-rootfs-2 sb-tmpfs-50
> > > kfree_rcu-0 sb-hugetlbfs-33 sb-securityfs-6 sb-tracefs-13
> > > sb-aio-20 sb-iomem-12 sb-selinuxfs-22 sb-xfs:vda1-36
> > > sb-anon_inodefs-15 sb-mqueue-21 sb-sockfs-8 sb-zsmalloc-19
> > > sb-bdev-3 sb-nsfs-4 sb-sysfs-26 shadow-18
> > > sb-bpf-32 sb-pipefs-14 sb-tmpfs-1 thp_deferred_split-10
> > > sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-27 thp_zero-9
> > > sb-cgroup2-30 sb-proc-39 sb-tmpfs-29 xfs_buf-37
> > > sb-configfs-23 sb-proc-41 sb-tmpfs-35 xfs_inodegc-38
> > ^^^^^^^^^^^^^^
> >
> > > These XFS shrinkers are also per-block device like the superblock.
> > > They need to read like "sb-xfs:vda1-36". Even though it is not
> > > in this list, the xfs dquot shrinker will need this as well.
>
> Sure, will do in v4. Thanks!
>
> >
> >
> > > sb-dax-11 sb-proc-45 sb-tmpfs-40 zspool-34
> > > sb-debugfs-7 sb-proc-46 sb-tmpfs-42
> > > sb-devpts-28 sb-proc-47 sb-tmpfs-43
> > > sb-devtmpfs-5 sb-pstore-31 sb-tmpfs-44
> >
> > The proc and tmpfs shrinkers have the same problem - what instance
> > do they actually refer to?
>
> Any ideas on how to name/identify them?
I've looked into it a bit and realized I have no good idea:
I naively thought that procfs superblocks have different pid namespaces,
but that's not true anymore, so I don't know how to identify them.
The same applies to tmpfs instances: there is no simple id.
If somebody has any ideas, I'd appreciate them. If not, we can enhance it later.
Thanks!
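
For reference, the directory names in the listing above follow the
"<name>-<id>" convention from the patch, where <name> is rendered from
the format passed at registration (for example "sb-%s:%s"). Below is a
rough userspace sketch of that composition. format_dir_name() and
NAME_BUF_LEN are hypothetical names, and unlike the snprintf() call in
the patch, this sketch reports truncation instead of silently cutting
the name.

```c
#include <stdarg.h>
#include <stdio.h>

/* Same buffer size the patch uses in shrinker_debugfs_add(). */
#define NAME_BUF_LEN 128

/*
 * Hypothetical helper: render the shrinker name from fmt and its
 * arguments, then append "-<id>" to form the debugfs directory name.
 * Returns 0 on success, -1 if either step would not fit.
 */
int format_dir_name(char *buf, size_t len, int id, const char *fmt, ...)
{
	char name[NAME_BUF_LEN];
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(name, sizeof(name), fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= sizeof(name))
		return -1;

	n = snprintf(buf, len, "%s-%d", name, id);
	if (n < 0 || (size_t)n >= len)
		return -1;

	return 0;
}
```

Calling format_dir_name(buf, sizeof(buf), 36, "sb-%s:%s", "xfs", "vda1")
fills buf with "sb-xfs:vda1-36", which matches the entry shown in the
listing. The numeric suffix is what keeps otherwise identical names,
such as the several sb-tmpfs and sb-proc instances, distinct.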