2022-05-26 05:52:10

by Roman Gushchin

Subject: [PATCH v4 0/6] mm: introduce shrinker debugfs interface

The only existing debugging mechanism is a couple of tracepoints in
do_shrink_slab(): mm_shrink_slab_start and mm_shrink_slab_end. They don't
cover everything, though: shrinkers which report 0 objects never show up,
and there is no support for memcg-aware shrinkers. Shrinkers are identified
by their scan function, which is not always enough (e.g. it is hard to guess
which super block's shrinker it is from "super_cache_scan" alone).

To provide better visibility and debugging options for memory shrinkers,
this patchset introduces a /sys/kernel/debug/shrinker interface, to some
extent similar to /sys/kernel/slab.

For each shrinker registered in the system a directory is created.
For now, the directory will contain only a "scan" file, which allows getting
the number of managed objects for each memory cgroup (for memcg-aware
shrinkers) and each NUMA node (for NUMA-aware shrinkers on a NUMA machine).
Other interfaces might be added in the future.

To make debugging more pleasant, the patchset also names all shrinkers,
so that debugfs entries can have meaningful names.


v4:
1) multiple shrinkers naming enhancements, by Kent and Dave
2) multiple minor fixes/optimizations, by Muchun

v3:
1) separated the "scan" part into a separate patch, by Dave
2) merged *_memcg, *_node and *_memcg_node interfaces, by Dave
3) shrinkers naming enhancements, by Christophe and Dave
4) added signal_pending() check, by Hillf
5) enabled by default, by Dave

v2:
1) switched to debugfs, suggested by Mike, Andrew, Greg and others
2) switched to seq_file API for output, no PAGE_SIZE limit anymore, by Andrew
3) switched to down_read_killable(), suggested by Hillf
4) dropped stateful filtering and "freed" returning, by Kent
5) added docs, by Andrew
6) added memcg_shrinker.py tool

rfc:
https://lwn.net/Articles/891542/


Roman Gushchin (6):
mm: memcontrol: introduce mem_cgroup_ino() and
mem_cgroup_get_from_ino()
mm: shrinkers: introduce debugfs interface for memory shrinkers
mm: shrinkers: provide shrinkers with names
mm: docs: document shrinker debugfs
tools: add memcg_shrinker.py
mm: shrinkers: add scan interface for shrinker debugfs

Documentation/admin-guide/mm/index.rst | 1 +
.../admin-guide/mm/shrinker_debugfs.rst | 131 ++++++++
arch/x86/kvm/mmu/mmu.c | 2 +-
drivers/android/binder_alloc.c | 2 +-
drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 3 +-
drivers/gpu/drm/msm/msm_gem_shrinker.c | 2 +-
.../gpu/drm/panfrost/panfrost_gem_shrinker.c | 2 +-
drivers/gpu/drm/ttm/ttm_pool.c | 2 +-
drivers/md/bcache/btree.c | 2 +-
drivers/md/dm-bufio.c | 3 +-
drivers/md/dm-zoned-metadata.c | 4 +-
drivers/md/raid5.c | 2 +-
drivers/misc/vmw_balloon.c | 2 +-
drivers/virtio/virtio_balloon.c | 2 +-
drivers/xen/xenbus/xenbus_probe_backend.c | 2 +-
fs/btrfs/super.c | 2 +
fs/erofs/utils.c | 2 +-
fs/ext4/extents_status.c | 3 +-
fs/f2fs/super.c | 2 +-
fs/gfs2/glock.c | 2 +-
fs/gfs2/main.c | 2 +-
fs/jbd2/journal.c | 3 +-
fs/mbcache.c | 2 +-
fs/nfs/nfs42xattr.c | 7 +-
fs/nfs/super.c | 2 +-
fs/nfsd/filecache.c | 2 +-
fs/nfsd/nfscache.c | 3 +-
fs/quota/dquot.c | 2 +-
fs/super.c | 6 +-
fs/ubifs/super.c | 2 +-
fs/xfs/xfs_buf.c | 3 +-
fs/xfs/xfs_icache.c | 2 +-
fs/xfs/xfs_qm.c | 3 +-
include/linux/memcontrol.h | 21 ++
include/linux/shrinker.h | 31 +-
kernel/rcu/tree.c | 2 +-
lib/Kconfig.debug | 9 +
mm/Makefile | 1 +
mm/huge_memory.c | 4 +-
mm/memcontrol.c | 23 ++
mm/shrinker_debug.c | 285 ++++++++++++++++++
mm/vmscan.c | 64 +++-
mm/workingset.c | 2 +-
mm/zsmalloc.c | 3 +-
net/sunrpc/auth.c | 2 +-
tools/cgroup/memcg_shrinker.py | 71 +++++
46 files changed, 684 insertions(+), 46 deletions(-)
create mode 100644 Documentation/admin-guide/mm/shrinker_debugfs.rst
create mode 100644 mm/shrinker_debug.c
create mode 100755 tools/cgroup/memcg_shrinker.py

--
2.35.3



2022-05-26 09:09:09

by Roman Gushchin

Subject: [PATCH v4 1/6] mm: memcontrol: introduce mem_cgroup_ino() and mem_cgroup_get_from_ino()

Shrinker debugfs requires a way to represent memory cgroups without
using full paths, both for displaying information and for getting input
from a user.

The cgroup inode number is a good fit for this and is already used by bpf.

This commit adds a couple of helper functions which will be used
to handle memcg-aware shrinkers.

Signed-off-by: Roman Gushchin <[email protected]>
---
include/linux/memcontrol.h | 21 +++++++++++++++++++++
mm/memcontrol.c | 23 +++++++++++++++++++++++
2 files changed, 44 insertions(+)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 89b14729d59f..66a4f22e8154 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -831,6 +831,15 @@ static inline unsigned short mem_cgroup_id(struct mem_cgroup *memcg)
}
struct mem_cgroup *mem_cgroup_from_id(unsigned short id);

+#ifdef CONFIG_SHRINKER_DEBUG
+static inline unsigned long mem_cgroup_ino(struct mem_cgroup *memcg)
+{
+ return memcg ? cgroup_ino(memcg->css.cgroup) : 0;
+}
+
+struct mem_cgroup *mem_cgroup_get_from_ino(unsigned long ino);
+#endif
+
static inline struct mem_cgroup *mem_cgroup_from_seq(struct seq_file *m)
{
return mem_cgroup_from_css(seq_css(m));
@@ -1328,6 +1337,18 @@ static inline struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
return NULL;
}

+#ifdef CONFIG_SHRINKER_DEBUG
+static inline unsigned long mem_cgroup_ino(struct mem_cgroup *memcg)
+{
+ return 0;
+}
+
+static inline struct mem_cgroup *mem_cgroup_get_from_ino(unsigned long ino)
+{
+ return NULL;
+}
+#endif
+
static inline struct mem_cgroup *mem_cgroup_from_seq(struct seq_file *m)
{
return NULL;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 598fece89e2b..d0f892bde698 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5027,6 +5027,29 @@ struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
return idr_find(&mem_cgroup_idr, id);
}

+#ifdef CONFIG_SHRINKER_DEBUG
+struct mem_cgroup *mem_cgroup_get_from_ino(unsigned long ino)
+{
+ struct cgroup *cgrp;
+ struct cgroup_subsys_state *css;
+ struct mem_cgroup *memcg;
+
+ cgrp = cgroup_get_from_id(ino);
+ if (!cgrp)
+ return ERR_PTR(-ENOENT);
+
+ css = cgroup_get_e_css(cgrp, &memory_cgrp_subsys);
+ if (css)
+ memcg = container_of(css, struct mem_cgroup, css);
+ else
+ memcg = ERR_PTR(-ENOENT);
+
+ cgroup_put(cgrp);
+
+ return memcg;
+}
+#endif
+
static int alloc_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
{
struct mem_cgroup_per_node *pn;
--
2.35.3


2022-05-26 09:10:53

by Roman Gushchin

Subject: [PATCH v4 3/6] mm: shrinkers: provide shrinkers with names

Currently shrinkers are anonymous objects. For debugging purposes they
can be identified by count/scan function names, but that's not always
useful: e.g. for superblock shrinkers it's nice to have at least
an idea of which superblock the shrinker belongs to.

This commit adds names to shrinkers. The register_shrinker() and
prealloc_shrinker() functions are extended to take a format string and
arguments to build a name.

In some cases it's not possible to determine a good name at the time
when a shrinker is allocated. For such cases shrinker_debugfs_rename()
is provided.
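
As a rough userspace illustration of the naming scheme (not kernel code):
the name is built from a printf-style format, and the debugfs directory
name is then "<name>-<id>", matching the "%s-%d" snprintf() in
shrinker_debugfs_add() below. The helper shrinker_dir_name() is
hypothetical and only mimics that composition.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format the shrinker name from @fmt and its
 * arguments, then append "-<id>" the way the debugfs code does.
 * Returns the snprintf() result for the final string.
 */
static int shrinker_dir_name(char *buf, size_t size, int id,
			     const char *fmt, ...)
{
	char name[128];
	va_list ap;

	assert(buf != NULL && size > 0);

	va_start(ap, fmt);
	vsnprintf(name, sizeof(name), fmt, ap);
	va_end(ap);

	return snprintf(buf, size, "%s-%d", name, id);
}
```

For example, passing id 36 with the format "sb-%s:%s" and the arguments
"xfs" and "vda1" produces "sb-xfs:vda1-36".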

After this change the shrinker debugfs directory looks like:
$ cd /sys/kernel/debug/shrinker/
$ ls
dqcache-16 sb-hugetlbfs-17 sb-rootfs-2 sb-tmpfs-49
kfree_rcu-0 sb-hugetlbfs-33 sb-securityfs-6 sb-tracefs-13
sb-aio-20 sb-iomem-12 sb-selinuxfs-22 sb-xfs:vda1-36
sb-anon_inodefs-15 sb-mqueue-21 sb-sockfs-8 sb-zsmalloc-19
sb-bdev-3 sb-nsfs-4 sb-sysfs-26 shadow-18
sb-bpf-32 sb-pipefs-14 sb-tmpfs-1 thp_deferred_split-10
sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-27 thp_zero-9
sb-cgroup2-30 sb-proc-39 sb-tmpfs-29 xfs_buf-vda1-37
sb-configfs-23 sb-proc-41 sb-tmpfs-35 xfs_inodegc-vda1-38
sb-dax-11 sb-proc-45 sb-tmpfs-40 zspool-zram0-34
sb-debugfs-7 sb-proc-46 sb-tmpfs-42
sb-devpts-28 sb-proc-47 sb-tmpfs-43
sb-devtmpfs-5 sb-pstore-31 sb-tmpfs-44
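
A tool reading this directory may want to split an entry back into the
shrinker name and its numeric id. A minimal userspace sketch, assuming the
id is always the part after the last '-' (the helper split_shrinker_dir()
is hypothetical):

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: split "<name>-<id>" at the last '-'.
 * Copies the name into @name and stores the id in @id.
 * Returns 0 on success, -1 if the entry is malformed or too long.
 */
static int split_shrinker_dir(const char *entry, char *name, size_t size,
			      int *id)
{
	const char *dash = strrchr(entry, '-');
	const char *p;
	size_t len;
	long val;

	if (!dash || dash == entry || dash[1] == '\0')
		return -1;

	/* The id must consist of decimal digits only. */
	for (p = dash + 1; *p; p++)
		if (!isdigit((unsigned char)*p))
			return -1;

	errno = 0;
	val = strtol(dash + 1, NULL, 10);
	if (errno == ERANGE || val > INT_MAX)
		return -1;

	len = (size_t)(dash - entry);
	if (len >= size)
		return -1;

	memcpy(name, entry, len);
	name[len] = '\0';
	*id = (int)val;

	return 0;
}
```

Splitting at the last dash matters because names such as "xfs_buf-vda1"
contain dashes themselves.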

Signed-off-by: Roman Gushchin <[email protected]>
---
arch/x86/kvm/mmu/mmu.c | 2 +-
drivers/android/binder_alloc.c | 2 +-
drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 3 +-
drivers/gpu/drm/msm/msm_gem_shrinker.c | 2 +-
.../gpu/drm/panfrost/panfrost_gem_shrinker.c | 2 +-
drivers/gpu/drm/ttm/ttm_pool.c | 2 +-
drivers/md/bcache/btree.c | 2 +-
drivers/md/dm-bufio.c | 3 +-
drivers/md/dm-zoned-metadata.c | 4 +-
drivers/md/raid5.c | 2 +-
drivers/misc/vmw_balloon.c | 2 +-
drivers/virtio/virtio_balloon.c | 2 +-
drivers/xen/xenbus/xenbus_probe_backend.c | 2 +-
fs/btrfs/super.c | 2 +
fs/erofs/utils.c | 2 +-
fs/ext4/extents_status.c | 3 +-
fs/f2fs/super.c | 2 +-
fs/gfs2/glock.c | 2 +-
fs/gfs2/main.c | 2 +-
fs/jbd2/journal.c | 3 +-
fs/mbcache.c | 2 +-
fs/nfs/nfs42xattr.c | 7 ++-
fs/nfs/super.c | 2 +-
fs/nfsd/filecache.c | 2 +-
fs/nfsd/nfscache.c | 3 +-
fs/quota/dquot.c | 2 +-
fs/super.c | 6 +-
fs/ubifs/super.c | 2 +-
fs/xfs/xfs_buf.c | 3 +-
fs/xfs/xfs_icache.c | 2 +-
fs/xfs/xfs_qm.c | 3 +-
include/linux/shrinker.h | 12 +++-
kernel/rcu/tree.c | 2 +-
mm/huge_memory.c | 4 +-
mm/shrinker_debug.c | 47 ++++++++++++++-
mm/vmscan.c | 58 ++++++++++++++++++-
mm/workingset.c | 2 +-
mm/zsmalloc.c | 3 +-
net/sunrpc/auth.c | 2 +-
39 files changed, 165 insertions(+), 45 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index f9080ee50ffa..7f3abc800621 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6283,7 +6283,7 @@ int kvm_mmu_vendor_module_init(void)
if (percpu_counter_init(&kvm_total_used_mmu_pages, 0, GFP_KERNEL))
goto out;

- ret = register_shrinker(&mmu_shrinker);
+ ret = register_shrinker(&mmu_shrinker, "mmu");
if (ret)
goto out;

diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 2ac1008a5f39..951343c41ba8 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -1084,7 +1084,7 @@ int binder_alloc_shrinker_init(void)
int ret = list_lru_init(&binder_alloc_lru);

if (ret == 0) {
- ret = register_shrinker(&binder_shrinker);
+ ret = register_shrinker(&binder_shrinker, "binder");
if (ret)
list_lru_destroy(&binder_alloc_lru);
}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index 6a6ff98a8746..85524ef92ea4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -426,7 +426,8 @@ void i915_gem_driver_register__shrinker(struct drm_i915_private *i915)
i915->mm.shrinker.count_objects = i915_gem_shrinker_count;
i915->mm.shrinker.seeks = DEFAULT_SEEKS;
i915->mm.shrinker.batch = 4096;
- drm_WARN_ON(&i915->drm, register_shrinker(&i915->mm.shrinker));
+ drm_WARN_ON(&i915->drm, register_shrinker(&i915->mm.shrinker,
+ "drm_i915_gem"));

i915->mm.oom_notifier.notifier_call = i915_gem_shrinker_oom;
drm_WARN_ON(&i915->drm, register_oom_notifier(&i915->mm.oom_notifier));
diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c
index 086dacf2f26a..2d3cf4f13dfd 100644
--- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
+++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
@@ -221,7 +221,7 @@ void msm_gem_shrinker_init(struct drm_device *dev)
priv->shrinker.count_objects = msm_gem_shrinker_count;
priv->shrinker.scan_objects = msm_gem_shrinker_scan;
priv->shrinker.seeks = DEFAULT_SEEKS;
- WARN_ON(register_shrinker(&priv->shrinker));
+ WARN_ON(register_shrinker(&priv->shrinker, "drm_msm_gem"));

priv->vmap_notifier.notifier_call = msm_gem_shrinker_vmap;
WARN_ON(register_vmap_purge_notifier(&priv->vmap_notifier));
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
index 77e7cb6d1ae3..0d028266ee9e 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
@@ -103,7 +103,7 @@ void panfrost_gem_shrinker_init(struct drm_device *dev)
pfdev->shrinker.count_objects = panfrost_gem_shrinker_count;
pfdev->shrinker.scan_objects = panfrost_gem_shrinker_scan;
pfdev->shrinker.seeks = DEFAULT_SEEKS;
- WARN_ON(register_shrinker(&pfdev->shrinker));
+ WARN_ON(register_shrinker(&pfdev->shrinker, "drm_panfrost"));
}

/**
diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index 1bba0a0ed3f9..b8b41d242197 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -722,7 +722,7 @@ int ttm_pool_mgr_init(unsigned long num_pages)
mm_shrinker.count_objects = ttm_pool_shrinker_count;
mm_shrinker.scan_objects = ttm_pool_shrinker_scan;
mm_shrinker.seeks = 1;
- return register_shrinker(&mm_shrinker);
+ return register_shrinker(&mm_shrinker, "drm_ttm_pool");
}

/**
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index ad9f16689419..7ee2bf84a0b9 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -812,7 +812,7 @@ int bch_btree_cache_alloc(struct cache_set *c)
c->shrink.seeks = 4;
c->shrink.batch = c->btree_pages * 2;

- if (register_shrinker(&c->shrink))
+ if (register_shrinker(&c->shrink, "bcache-%pU", c->set_uuid))
pr_warn("bcache: %s: could not register shrinker\n",
__func__);

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index e9cbc70d5a0e..1095e2bf9e9b 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1807,7 +1807,8 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign
c->shrinker.scan_objects = dm_bufio_shrink_scan;
c->shrinker.seeks = 1;
c->shrinker.batch = 0;
- r = register_shrinker(&c->shrinker);
+ r = register_shrinker(&c->shrinker, "%s-(%u:%u)", slab_name,
+ MAJOR(bdev->bd_dev), MINOR(bdev->bd_dev));
if (r)
goto bad;

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index d1ea66114d14..57a4e0eab1eb 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -2944,7 +2944,9 @@ int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev,
zmd->mblk_shrinker.seeks = DEFAULT_SEEKS;

/* Metadata cache shrinker */
- ret = register_shrinker(&zmd->mblk_shrinker);
+ ret = register_shrinker(&zmd->mblk_shrinker, "md_meta-(%u:%u)",
+ MAJOR(dev->bdev->bd_dev),
+ MINOR(dev->bdev->bd_dev));
if (ret) {
dmz_zmd_err(zmd, "Register metadata cache shrinker failed");
goto err;
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 351d341a1ffa..73c35b99836b 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7383,7 +7383,7 @@ static struct r5conf *setup_conf(struct mddev *mddev)
conf->shrinker.count_objects = raid5_cache_count;
conf->shrinker.batch = 128;
conf->shrinker.flags = 0;
- if (register_shrinker(&conf->shrinker)) {
+ if (register_shrinker(&conf->shrinker, "md-%s", mdname(mddev))) {
pr_warn("md/raid:%s: couldn't register shrinker.\n",
mdname(mddev));
goto abort;
diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index f1d8ba6d4857..6c9ddf1187dd 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -1587,7 +1587,7 @@ static int vmballoon_register_shrinker(struct vmballoon *b)
b->shrinker.count_objects = vmballoon_shrinker_count;
b->shrinker.seeks = DEFAULT_SEEKS;

- r = register_shrinker(&b->shrinker);
+ r = register_shrinker(&b->shrinker, "vmw_balloon");

if (r == 0)
b->shrinker_registered = true;
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index f4c34a2a6b8e..093e06e19d0e 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -875,7 +875,7 @@ static int virtio_balloon_register_shrinker(struct virtio_balloon *vb)
vb->shrinker.count_objects = virtio_balloon_shrinker_count;
vb->shrinker.seeks = DEFAULT_SEEKS;

- return register_shrinker(&vb->shrinker);
+ return register_shrinker(&vb->shrinker, "virtio_balloon");
}

static int virtballoon_probe(struct virtio_device *vdev)
diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c
index 5abded97e1a7..a6c5e344017d 100644
--- a/drivers/xen/xenbus/xenbus_probe_backend.c
+++ b/drivers/xen/xenbus/xenbus_probe_backend.c
@@ -305,7 +305,7 @@ static int __init xenbus_probe_backend_init(void)

register_xenstore_notifier(&xenstore_notifier);

- if (register_shrinker(&backend_memory_shrinker))
+ if (register_shrinker(&backend_memory_shrinker, "xen_backend"))
pr_warn("shrinker registration failed\n");

return 0;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index b228efe8ab6e..ba0bf210895d 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1790,6 +1790,8 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
error = -EBUSY;
} else {
snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev);
+ shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s", fs_type->name,
+ s->s_id);
btrfs_sb(s)->bdev_holder = fs_type;
if (!strstr(crc32c_impl(), "generic"))
set_bit(BTRFS_FS_CSUM_IMPL_FAST, &fs_info->flags);
diff --git a/fs/erofs/utils.c b/fs/erofs/utils.c
index ec9a1d780dc1..67eb64fadd4f 100644
--- a/fs/erofs/utils.c
+++ b/fs/erofs/utils.c
@@ -282,7 +282,7 @@ static struct shrinker erofs_shrinker_info = {

int __init erofs_init_shrinker(void)
{
- return register_shrinker(&erofs_shrinker_info);
+ return register_shrinker(&erofs_shrinker_info, "erofs");
}

void erofs_exit_shrinker(void)
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 9a3a8996aacf..dbf06e40a2f7 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -1654,7 +1654,8 @@ int ext4_es_register_shrinker(struct ext4_sb_info *sbi)
sbi->s_es_shrinker.scan_objects = ext4_es_scan;
sbi->s_es_shrinker.count_objects = ext4_es_count;
sbi->s_es_shrinker.seeks = DEFAULT_SEEKS;
- err = register_shrinker(&sbi->s_es_shrinker);
+ err = register_shrinker(&sbi->s_es_shrinker, "ext4_es-%s",
+ sbi->s_sb->s_id);
if (err)
goto err4;

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 4368f90571bd..2fc40a1635f3 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -4579,7 +4579,7 @@ static int __init init_f2fs_fs(void)
err = f2fs_init_sysfs();
if (err)
goto free_garbage_collection_cache;
- err = register_shrinker(&f2fs_shrinker_info);
+ err = register_shrinker(&f2fs_shrinker_info, "f2fs");
if (err)
goto free_sysfs;
err = register_filesystem(&f2fs_fs_type);
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 630c6550eacf..3d1c7d2f32d7 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -2530,7 +2530,7 @@ int __init gfs2_glock_init(void)
return -ENOMEM;
}

- ret = register_shrinker(&glock_shrinker);
+ ret = register_shrinker(&glock_shrinker, "gfs2_glock");
if (ret) {
destroy_workqueue(gfs2_delete_workqueue);
destroy_workqueue(glock_workqueue);
diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
index 28d0eb23e18e..dde981b78488 100644
--- a/fs/gfs2/main.c
+++ b/fs/gfs2/main.c
@@ -150,7 +150,7 @@ static int __init init_gfs2_fs(void)
if (!gfs2_trans_cachep)
goto fail_cachep8;

- error = register_shrinker(&gfs2_qd_shrinker);
+ error = register_shrinker(&gfs2_qd_shrinker, "gfs2_qd");
if (error)
goto fail_shrinker;

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index fcacafa4510d..786d2e4bc4f8 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1418,7 +1418,8 @@ static journal_t *journal_init_common(struct block_device *bdev,
if (percpu_counter_init(&journal->j_checkpoint_jh_count, 0, GFP_KERNEL))
goto err_cleanup;

- if (register_shrinker(&journal->j_shrinker)) {
+ if (register_shrinker(&journal->j_shrinker, "jbd2_journal-(%u:%u)",
+ MAJOR(bdev->bd_dev), MINOR(bdev->bd_dev))) {
percpu_counter_destroy(&journal->j_checkpoint_jh_count);
goto err_cleanup;
}
diff --git a/fs/mbcache.c b/fs/mbcache.c
index 97c54d3a2227..379dc5b0b6ad 100644
--- a/fs/mbcache.c
+++ b/fs/mbcache.c
@@ -367,7 +367,7 @@ struct mb_cache *mb_cache_create(int bucket_bits)
cache->c_shrink.count_objects = mb_cache_count;
cache->c_shrink.scan_objects = mb_cache_scan;
cache->c_shrink.seeks = DEFAULT_SEEKS;
- if (register_shrinker(&cache->c_shrink)) {
+ if (register_shrinker(&cache->c_shrink, "mb_cache")) {
kfree(cache->c_hash);
kfree(cache);
goto err_out;
diff --git a/fs/nfs/nfs42xattr.c b/fs/nfs/nfs42xattr.c
index e7b34f7e0614..147b8a2f2dc6 100644
--- a/fs/nfs/nfs42xattr.c
+++ b/fs/nfs/nfs42xattr.c
@@ -1017,15 +1017,16 @@ int __init nfs4_xattr_cache_init(void)
if (ret)
goto out2;

- ret = register_shrinker(&nfs4_xattr_cache_shrinker);
+ ret = register_shrinker(&nfs4_xattr_cache_shrinker, "nfs_xattr_cache");
if (ret)
goto out1;

- ret = register_shrinker(&nfs4_xattr_entry_shrinker);
+ ret = register_shrinker(&nfs4_xattr_entry_shrinker, "nfs_xattr_entry");
if (ret)
goto out;

- ret = register_shrinker(&nfs4_xattr_large_entry_shrinker);
+ ret = register_shrinker(&nfs4_xattr_large_entry_shrinker,
+ "nfs_xattr_large_entry");
if (!ret)
return 0;

diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 6ab5eeb000dc..c7a2aef911f1 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -149,7 +149,7 @@ int __init register_nfs_fs(void)
ret = nfs_register_sysctl();
if (ret < 0)
goto error_2;
- ret = register_shrinker(&acl_shrinker);
+ ret = register_shrinker(&acl_shrinker, "nfs_acl");
if (ret < 0)
goto error_3;
#ifdef CONFIG_NFS_V4_2
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 2c1b027774d4..9c2879a3c3c0 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -666,7 +666,7 @@ nfsd_file_cache_init(void)
goto out_err;
}

- ret = register_shrinker(&nfsd_file_shrinker);
+ ret = register_shrinker(&nfsd_file_shrinker, "nfsd_filecache");
if (ret) {
pr_err("nfsd: failed to register nfsd_file_shrinker: %d\n", ret);
goto out_lru;
diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
index 0b3f12aa37ff..698bd314cd66 100644
--- a/fs/nfsd/nfscache.c
+++ b/fs/nfsd/nfscache.c
@@ -176,7 +176,8 @@ int nfsd_reply_cache_init(struct nfsd_net *nn)
nn->nfsd_reply_cache_shrinker.scan_objects = nfsd_reply_cache_scan;
nn->nfsd_reply_cache_shrinker.count_objects = nfsd_reply_cache_count;
nn->nfsd_reply_cache_shrinker.seeks = 1;
- status = register_shrinker(&nn->nfsd_reply_cache_shrinker);
+ status = register_shrinker(&nn->nfsd_reply_cache_shrinker,
+ "nfsd_reply-%s", nn->nfsd_name);
if (status)
goto out_stats_destroy;

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index a74aef99bd3d..854d2b1d0914 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -2985,7 +2985,7 @@ static int __init dquot_init(void)
pr_info("VFS: Dquot-cache hash table entries: %ld (order %ld,"
" %ld bytes)\n", nr_hash, order, (PAGE_SIZE << order));

- if (register_shrinker(&dqcache_shrinker))
+ if (register_shrinker(&dqcache_shrinker, "dqcache"))
panic("Cannot register dquot shrinker");

return 0;
diff --git a/fs/super.c b/fs/super.c
index f1d4a193602d..4ce867ab1f4e 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -265,7 +265,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
s->s_shrink.count_objects = super_cache_count;
s->s_shrink.batch = 1024;
s->s_shrink.flags = SHRINKER_NUMA_AWARE | SHRINKER_MEMCG_AWARE;
- if (prealloc_shrinker(&s->s_shrink))
+ if (prealloc_shrinker(&s->s_shrink, "sb-%s", type->name))
goto fail;
if (list_lru_init_memcg(&s->s_dentry_lru, &s->s_shrink))
goto fail;
@@ -1288,6 +1288,8 @@ int get_tree_bdev(struct fs_context *fc,
} else {
s->s_mode = mode;
snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev);
+ shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s",
+ fc->fs_type->name, s->s_id);
sb_set_blocksize(s, block_size(bdev));
error = fill_super(s, fc);
if (error) {
@@ -1363,6 +1365,8 @@ struct dentry *mount_bdev(struct file_system_type *fs_type,
} else {
s->s_mode = mode;
snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev);
+ shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s",
+ fs_type->name, s->s_id);
sb_set_blocksize(s, block_size(bdev));
error = fill_super(s, data, flags & SB_SILENT ? 1 : 0);
if (error) {
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index bad67455215f..a3663d201f64 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -2430,7 +2430,7 @@ static int __init ubifs_init(void)
if (!ubifs_inode_slab)
return -ENOMEM;

- err = register_shrinker(&ubifs_shrinker_info);
+ err = register_shrinker(&ubifs_shrinker_info, "ubifs");
if (err)
goto out_slab;

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index bf4e60871068..1aeb3d6407fc 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1986,7 +1986,8 @@ xfs_alloc_buftarg(
btp->bt_shrinker.scan_objects = xfs_buftarg_shrink_scan;
btp->bt_shrinker.seeks = DEFAULT_SEEKS;
btp->bt_shrinker.flags = SHRINKER_NUMA_AWARE;
- if (register_shrinker(&btp->bt_shrinker))
+ if (register_shrinker(&btp->bt_shrinker, "xfs_buf-%s",
+ mp->m_super->s_id))
goto error_pcpu;
return btp;

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index bffd6eb0b298..59407fc10652 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -2198,5 +2198,5 @@ xfs_inodegc_register_shrinker(
shrink->flags = SHRINKER_NONSLAB;
shrink->batch = XFS_INODEGC_SHRINKER_BATCH;

- return register_shrinker(shrink);
+ return register_shrinker(shrink, "xfs_inodegc-%s", mp->m_super->s_id);
}
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index f165d1a3de1d..052c66299066 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -686,7 +686,8 @@ xfs_qm_init_quotainfo(
qinf->qi_shrinker.seeks = DEFAULT_SEEKS;
qinf->qi_shrinker.flags = SHRINKER_NUMA_AWARE;

- error = register_shrinker(&qinf->qi_shrinker);
+ error = register_shrinker(&qinf->qi_shrinker, "xfs_qm-%s",
+ mp->m_super->s_id);
if (error)
goto out_free_inos;

diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 2ced8149c513..64416f3e0a1f 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -75,6 +75,7 @@ struct shrinker {
#endif
#ifdef CONFIG_SHRINKER_DEBUG
int debugfs_id;
+ const char *name;
struct dentry *debugfs_entry;
#endif
/* objs pending delete, per node */
@@ -92,9 +93,9 @@ struct shrinker {
*/
#define SHRINKER_NONSLAB (1 << 3)

-extern int prealloc_shrinker(struct shrinker *shrinker);
+extern int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...);
extern void register_shrinker_prepared(struct shrinker *shrinker);
-extern int register_shrinker(struct shrinker *shrinker);
+extern int register_shrinker(struct shrinker *shrinker, const char *fmt, ...);
extern void unregister_shrinker(struct shrinker *shrinker);
extern void free_prealloced_shrinker(struct shrinker *shrinker);
extern void synchronize_shrinkers(void);
@@ -102,6 +103,8 @@ extern void synchronize_shrinkers(void);
#ifdef CONFIG_SHRINKER_DEBUG
extern int shrinker_debugfs_add(struct shrinker *shrinker);
extern void shrinker_debugfs_remove(struct shrinker *shrinker);
+extern int shrinker_debugfs_rename(struct shrinker *shrinker,
+ const char *fmt, ...);
#else /* CONFIG_SHRINKER_DEBUG */
static inline int shrinker_debugfs_add(struct shrinker *shrinker)
{
@@ -110,5 +113,10 @@ static inline int shrinker_debugfs_add(struct shrinker *shrinker)
static inline void shrinker_debugfs_remove(struct shrinker *shrinker)
{
}
+static inline int shrinker_debugfs_rename(struct shrinker *shrinker,
+ const char *fmt, ...)
+{
+ return 0;
+}
#endif /* CONFIG_SHRINKER_DEBUG */
#endif /* _LINUX_SHRINKER_H */
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index a4b8189455d5..730541140e86 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4776,7 +4776,7 @@ static void __init kfree_rcu_batch_init(void)
INIT_DELAYED_WORK(&krcp->page_cache_work, fill_page_cache_func);
krcp->initialized = true;
}
- if (register_shrinker(&kfree_rcu_shrinker))
+ if (register_shrinker(&kfree_rcu_shrinker, "kfree_rcu"))
pr_err("Failed to register kfree_rcu() shrinker!\n");
}

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index c468fee595ff..460039df9ae1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -429,10 +429,10 @@ static int __init hugepage_init(void)
if (err)
goto err_slab;

- err = register_shrinker(&huge_zero_page_shrinker);
+ err = register_shrinker(&huge_zero_page_shrinker, "thp_zero");
if (err)
goto err_hzp_shrinker;
- err = register_shrinker(&deferred_split_shrinker);
+ err = register_shrinker(&deferred_split_shrinker, "thp_deferred_split");
if (err)
goto err_split_shrinker;

diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
index 1a70556bd46c..781ecbd3d608 100644
--- a/mm/shrinker_debug.c
+++ b/mm/shrinker_debug.c
@@ -102,7 +102,7 @@ DEFINE_SHOW_ATTRIBUTE(shrinker_debugfs_count);
int shrinker_debugfs_add(struct shrinker *shrinker)
{
struct dentry *entry;
- char buf[16];
+ char buf[128];
int id;

lockdep_assert_held(&shrinker_rwsem);
@@ -116,7 +116,7 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
return id;
shrinker->debugfs_id = id;

- snprintf(buf, sizeof(buf), "%d", id);
+ snprintf(buf, sizeof(buf), "%s-%d", shrinker->name, id);

/* create debugfs entry */
entry = debugfs_create_dir(buf, shrinker_debugfs_root);
@@ -131,10 +131,53 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
return 0;
}

+int shrinker_debugfs_rename(struct shrinker *shrinker, const char *fmt, ...)
+{
+ struct dentry *entry;
+ char buf[128];
+ const char *new, *old;
+ va_list ap;
+ int ret = 0;
+
+ va_start(ap, fmt);
+ new = kvasprintf_const(GFP_KERNEL, fmt, ap);
+ va_end(ap);
+
+ if (!new)
+ return -ENOMEM;
+
+ down_write(&shrinker_rwsem);
+
+ old = shrinker->name;
+ shrinker->name = new;
+
+ if (shrinker->debugfs_entry) {
+ snprintf(buf, sizeof(buf), "%s-%d", shrinker->name,
+ shrinker->debugfs_id);
+
+ entry = debugfs_rename(shrinker_debugfs_root,
+ shrinker->debugfs_entry,
+ shrinker_debugfs_root, buf);
+ if (IS_ERR(entry))
+ ret = PTR_ERR(entry);
+ else
+ shrinker->debugfs_entry = entry;
+ }
+
+ up_write(&shrinker_rwsem);
+
+ kfree_const(old);
+
+ return ret;
+}
+EXPORT_SYMBOL(shrinker_debugfs_rename);
+
void shrinker_debugfs_remove(struct shrinker *shrinker)
{
lockdep_assert_held(&shrinker_rwsem);

+ kfree_const(shrinker->name);
+
if (!shrinker->debugfs_entry)
return;

diff --git a/mm/vmscan.c b/mm/vmscan.c
index f54df1ce9312..fd8a472b6501 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -612,7 +612,7 @@ static unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru,
/*
* Add a shrinker callback to be called from the vm.
*/
-int prealloc_shrinker(struct shrinker *shrinker)
+static int __prealloc_shrinker(struct shrinker *shrinker)
{
unsigned int size;
int err;
@@ -636,8 +636,36 @@ int prealloc_shrinker(struct shrinker *shrinker)
return 0;
}

+#ifdef CONFIG_SHRINKER_DEBUG
+int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...)
+{
+ va_list ap;
+ int err;
+
+ va_start(ap, fmt);
+ shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
+ va_end(ap);
+ if (!shrinker->name)
+ return -ENOMEM;
+
+ err = __prealloc_shrinker(shrinker);
+ if (err)
+ kfree_const(shrinker->name);
+
+ return err;
+}
+#else
+int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...)
+{
+ return __prealloc_shrinker(shrinker);
+}
+#endif
+
void free_prealloced_shrinker(struct shrinker *shrinker)
{
+#ifdef CONFIG_SHRINKER_DEBUG
+ kfree_const(shrinker->name);
+#endif
if (shrinker->flags & SHRINKER_MEMCG_AWARE) {
down_write(&shrinker_rwsem);
unregister_memcg_shrinker(shrinker);
@@ -658,15 +686,39 @@ void register_shrinker_prepared(struct shrinker *shrinker)
up_write(&shrinker_rwsem);
}

-int register_shrinker(struct shrinker *shrinker)
+static int __register_shrinker(struct shrinker *shrinker)
{
- int err = prealloc_shrinker(shrinker);
+ int err = __prealloc_shrinker(shrinker);

if (err)
return err;
register_shrinker_prepared(shrinker);
return 0;
}
+
+#ifdef CONFIG_SHRINKER_DEBUG
+int register_shrinker(struct shrinker *shrinker, const char *fmt, ...)
+{
+ va_list ap;
+ int err;
+
+ va_start(ap, fmt);
+ shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
+ va_end(ap);
+ if (!shrinker->name)
+ return -ENOMEM;
+
+ err = __register_shrinker(shrinker);
+ if (err)
+ kfree_const(shrinker->name);
+ return err;
+}
+#else
+int register_shrinker(struct shrinker *shrinker, const char *fmt, ...)
+{
+ return __register_shrinker(shrinker);
+}
+#endif
EXPORT_SYMBOL(register_shrinker);

/*
diff --git a/mm/workingset.c b/mm/workingset.c
index 592569a8974c..840986179cf3 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -625,7 +625,7 @@ static int __init workingset_init(void)
pr_info("workingset: timestamp_bits=%d max_order=%d bucket_order=%u\n",
timestamp_bits, max_order, bucket_order);

- ret = prealloc_shrinker(&workingset_shadow_shrinker);
+ ret = prealloc_shrinker(&workingset_shadow_shrinker, "shadow");
if (ret)
goto err;
ret = __list_lru_init(&shadow_nodes, true, &shadow_nodes_key,
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 9152fbde33b5..1b759ba69ca6 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -2188,7 +2188,8 @@ static int zs_register_shrinker(struct zs_pool *pool)
pool->shrinker.batch = 0;
pool->shrinker.seeks = DEFAULT_SEEKS;

- return register_shrinker(&pool->shrinker);
+ return register_shrinker(&pool->shrinker, "zspool-%s",
+ pool->name);
}

/**
diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c
index 682fcd24bf43..a29742a9c3f1 100644
--- a/net/sunrpc/auth.c
+++ b/net/sunrpc/auth.c
@@ -874,7 +874,7 @@ int __init rpcauth_init_module(void)
err = rpc_init_authunix();
if (err < 0)
goto out1;
- err = register_shrinker(&rpc_cred_shrinker);
+ err = register_shrinker(&rpc_cred_shrinker, "rpc_cred");
if (err < 0)
goto out2;
return 0;
--
2.35.3


2022-05-26 13:51:08

by Roman Gushchin

Subject: [PATCH v4 6/6] mm: shrinkers: add scan interface for shrinker debugfs

Add a scan interface which allows triggering a scan of a particular
shrinker for a specified memcg and numa node. It's useful for testing,
debugging and profiling a specific scan_objects() callback.
Unlike the alternatives (creating real memory pressure or dropping
caches via /proc/sys/vm/drop_caches), this interface interacts
with only one shrinker at a time. Also, if a shrinker is misreporting
the number of objects (as some do), it doesn't affect scanning.
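
As a rough illustration of how the interface could be driven from
userspace, the sketch below builds a request line and writes it to a
shrinker's scan file. The helper names (format_scan_request, trigger_scan)
are made up for this example; the three-field input format is the one
documented in this patch, and the default debugfs mount point is assumed.
Calling trigger_scan() needs root and a kernel built with
CONFIG_SHRINKER_DEBUG.

```python
import os

# Assumption: debugfs is mounted at its usual location.
SHRINKER_DEBUGFS = "/sys/kernel/debug/shrinker"


def format_scan_request(cgroup_ino, nid, nr_to_scan):
    """Build the '<cgroup inode id> <numa id> <number of objects to scan>' line."""
    if cgroup_ino < 0 or nid < 0 or nr_to_scan < 0:
        raise ValueError("all fields must be non-negative")
    return "%d %d %d\n" % (cgroup_ino, nid, nr_to_scan)


def trigger_scan(shrinker_dir, cgroup_ino, nid, nr_to_scan, root=SHRINKER_DEBUGFS):
    """Ask a single shrinker to scan objects for one memcg and one NUMA node."""
    path = os.path.join(root, shrinker_dir, "scan")
    with open(path, "w") as f:
        f.write(format_scan_request(cgroup_ino, nid, nr_to_scan))
```

For a shrinker that is not memcg-aware, 0 would be passed as the cgroup
inode id, e.g. trigger_scan("kfree_rcu-0", 0, 0, 100).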

Signed-off-by: Roman Gushchin <[email protected]>
---
.../admin-guide/mm/shrinker_debugfs.rst | 39 +++++++++-
mm/shrinker_debug.c | 74 +++++++++++++++++++
2 files changed, 109 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/mm/shrinker_debugfs.rst b/Documentation/admin-guide/mm/shrinker_debugfs.rst
index 2033d696aa59..97e6da829a43 100644
--- a/Documentation/admin-guide/mm/shrinker_debugfs.rst
+++ b/Documentation/admin-guide/mm/shrinker_debugfs.rst
@@ -5,14 +5,16 @@ Shrinker Debugfs Interface
==========================

Shrinker debugfs interface provides a visibility into the kernel memory
-shrinkers subsystem and allows to get information about individual shrinkers.
+shrinkers subsystem and allows to get information about individual shrinkers
+and interact with them.

For each shrinker registered in the system a directory in **<debugfs>/shrinker/**
is created. The directory's name is composed from the shrinker's name and a
unique id: e.g. *kfree_rcu-0* or *sb-xfs:vda1-36*.

-Each shrinker directory contains the **count** file, which allows to trigger
-the *count_objects()* callback for each memcg and numa node (if applicable).
+Each shrinker directory contains **count** and **scan** files, which allow to
+trigger *count_objects()* and *scan_objects()* callbacks for each memcg and
+numa node (if applicable).

Usage:
------
@@ -43,7 +45,7 @@ Usage:

$ cd sb-btrfs\:vda2-24/
$ ls
- count
+ count scan

3. *Count objects*

@@ -98,3 +100,32 @@ Usage:
2877 84 0
293 1 0
735 8 0
+
+4. *Scan objects*
+
+ The expected input format::
+
+ <cgroup inode id> <numa id> <number of objects to scan>
+
+ For a non-memcg-aware shrinker or on a system with no memory
+ cgroups **0** should be passed as cgroup id.
+ ::
+
+ $ cd /sys/kernel/debug/shrinker/
+ $ cd sb-btrfs\:vda2-24/
+
+ $ cat count | head -n 5
+ 1 212 0
+ 21 97 0
+ 55 802 5
+ 2367 2 0
+ 225 13 0
+
+ $ echo "55 0 200" > scan
+
+ $ cat count | head -n 5
+ 1 212 0
+ 21 96 0
+ 55 752 5
+ 2367 2 0
+ 225 13 0
diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
index 781ecbd3d608..e25114e0c41c 100644
--- a/mm/shrinker_debug.c
+++ b/mm/shrinker_debug.c
@@ -99,6 +99,78 @@ static int shrinker_debugfs_count_show(struct seq_file *m, void *v)
}
DEFINE_SHOW_ATTRIBUTE(shrinker_debugfs_count);

+static int shrinker_debugfs_scan_open(struct inode *inode, struct file *file)
+{
+ file->private_data = inode->i_private;
+ return nonseekable_open(inode, file);
+}
+
+static ssize_t shrinker_debugfs_scan_write(struct file *file,
+ const char __user *buf,
+ size_t size, loff_t *pos)
+{
+ struct shrinker *shrinker = file->private_data;
+ unsigned long nr_to_scan = 0, ino;
+ struct shrink_control sc = {
+ .gfp_mask = GFP_KERNEL,
+ };
+ struct mem_cgroup *memcg = NULL;
+ int nid;
+ char kbuf[72];
+ int read_len = size < (sizeof(kbuf) - 1) ? size : (sizeof(kbuf) - 1);
+ ssize_t ret;
+
+ if (copy_from_user(kbuf, buf, read_len))
+ return -EFAULT;
+ kbuf[read_len] = '\0';
+
+ if (sscanf(kbuf, "%lu %d %lu", &ino, &nid, &nr_to_scan) < 2)
+ return -EINVAL;
+
+ if (nid < 0 || nid >= nr_node_ids)
+ return -EINVAL;
+
+ if (nr_to_scan == 0)
+ return size;
+
+ if (shrinker->flags & SHRINKER_MEMCG_AWARE) {
+ memcg = mem_cgroup_get_from_ino(ino);
+ if (!memcg || IS_ERR(memcg))
+ return -ENOENT;
+
+ if (!mem_cgroup_online(memcg)) {
+ mem_cgroup_put(memcg);
+ return -ENOENT;
+ }
+ } else if (ino != 0) {
+ return -EINVAL;
+ }
+
+ ret = down_read_killable(&shrinker_rwsem);
+ if (ret) {
+ mem_cgroup_put(memcg);
+ return ret;
+ }
+
+ sc.nid = nid;
+ sc.memcg = memcg;
+ sc.nr_to_scan = nr_to_scan;
+ sc.nr_scanned = nr_to_scan;
+
+ shrinker->scan_objects(shrinker, &sc);
+
+ up_read(&shrinker_rwsem);
+ mem_cgroup_put(memcg);
+
+ return size;
+}
+
+static const struct file_operations shrinker_debugfs_scan_fops = {
+ .owner = THIS_MODULE,
+ .open = shrinker_debugfs_scan_open,
+ .write = shrinker_debugfs_scan_write,
+};
+
int shrinker_debugfs_add(struct shrinker *shrinker)
{
struct dentry *entry;
@@ -128,6 +200,8 @@ int shrinker_debugfs_add(struct shrinker *shrinker)

debugfs_create_file("count", 0440, entry, shrinker,
&shrinker_debugfs_count_fops);
+ debugfs_create_file("scan", 0220, entry, shrinker,
+ &shrinker_debugfs_scan_fops);
return 0;
}

--
2.35.3


2022-05-26 17:20:25

by Muchun Song

Subject: Re: [PATCH v4 1/6] mm: memcontrol: introduce mem_cgroup_ino() and mem_cgroup_get_from_ino()

On Wed, May 25, 2022 at 01:25:55PM -0700, Roman Gushchin wrote:
> Shrinker debugfs requires a way to represent memory cgroups without
> using full paths, both for displaying information and getting input
> from a user.
>
> Cgroup inode number is a perfect way, already used by bpf.
>
> This commit adds a couple of helper functions which will be used
> to handle memcg-aware shrinkers.
>
> Signed-off-by: Roman Gushchin <[email protected]>

Acked-by: Muchun Song <[email protected]>

Thanks.

2022-05-26 18:42:03

by Roman Gushchin

Subject: [PATCH v4 2/6] mm: shrinkers: introduce debugfs interface for memory shrinkers

This commit introduces the /sys/kernel/debug/shrinker debugfs
interface which provides an ability to observe the state of
individual kernel memory shrinkers.

Because the feature adds some memory overhead (which shouldn't be
large unless there is a huge number of registered shrinkers), it's
guarded by a config option (enabled by default).

This commit introduces the "count" interface for each shrinker
registered in the system.

The output is in the following format:
<cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
<cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
...

To reduce the size of the output on machines with many thousands of cgroups,
a line is omitted if the total number of objects on all nodes is 0.

If the shrinker is not memcg-aware or CONFIG_MEMCG is off, 0 is
printed as cgroup inode id. If the shrinker is not numa-aware, 0's are
printed for all nodes except the first one.
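
For illustration only, a userspace consumer could parse this format
roughly as follows. The function name parse_count_output is hypothetical,
and the sample text is taken from the documentation example later in the
series.

```python
def parse_count_output(text):
    """Parse '<cgroup inode id> <node 0 count> <node 1 count> ...' lines.

    Returns a dict mapping cgroup inode id -> list of per-node object counts.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # tolerate blank lines
        ino = int(fields[0])
        result[ino] = [int(x) for x in fields[1:]]
    return result


sample = "1 224 2\n21 98 0\n55 818 10\n"
counts = parse_count_output(sample)
print(counts[55])       # per-node counts for cgroup inode 55
print(sum(counts[55]))  # total across all nodes
```

Since all-zero lines are omitted by the kernel, a cgroup missing from the
resulting dict simply has no objects on this shrinker.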

This commit gives debugfs entries simple numeric names, which are not
very convenient. The following commit in the series will provide
shrinkers with more meaningful names.

Signed-off-by: Roman Gushchin <[email protected]>
Reviewed-by: Kent Overstreet <[email protected]>
---
include/linux/shrinker.h | 19 ++++-
lib/Kconfig.debug | 9 +++
mm/Makefile | 1 +
mm/shrinker_debug.c | 168 +++++++++++++++++++++++++++++++++++++++
mm/vmscan.c | 6 +-
5 files changed, 200 insertions(+), 3 deletions(-)
create mode 100644 mm/shrinker_debug.c

diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 76fbf92b04d9..2ced8149c513 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -72,6 +72,10 @@ struct shrinker {
#ifdef CONFIG_MEMCG
/* ID in shrinker_idr */
int id;
+#endif
+#ifdef CONFIG_SHRINKER_DEBUG
+ int debugfs_id;
+ struct dentry *debugfs_entry;
#endif
/* objs pending delete, per node */
atomic_long_t *nr_deferred;
@@ -94,4 +98,17 @@ extern int register_shrinker(struct shrinker *shrinker);
extern void unregister_shrinker(struct shrinker *shrinker);
extern void free_prealloced_shrinker(struct shrinker *shrinker);
extern void synchronize_shrinkers(void);
-#endif
+
+#ifdef CONFIG_SHRINKER_DEBUG
+extern int shrinker_debugfs_add(struct shrinker *shrinker);
+extern void shrinker_debugfs_remove(struct shrinker *shrinker);
+#else /* CONFIG_SHRINKER_DEBUG */
+static inline int shrinker_debugfs_add(struct shrinker *shrinker)
+{
+ return 0;
+}
+static inline void shrinker_debugfs_remove(struct shrinker *shrinker)
+{
+}
+#endif /* CONFIG_SHRINKER_DEBUG */
+#endif /* _LINUX_SHRINKER_H */
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 075cd25363ac..6fda0ac6661c 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -732,6 +732,15 @@ config SLUB_STATS
out which slabs are relevant to a particular load.
Try running: slabinfo -DA

+config SHRINKER_DEBUG
+ default y
+ bool "Enable shrinker debugging support"
+ depends on DEBUG_FS
+ help
+ Say Y to enable the shrinker debugfs interface which provides
+ visibility into the kernel memory shrinkers subsystem.
+ Disable it to avoid an extra memory footprint.
+
config HAVE_DEBUG_KMEMLEAK
bool

diff --git a/mm/Makefile b/mm/Makefile
index 4cc13f3179a5..b693cbec9aa9 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -133,3 +133,4 @@ obj-$(CONFIG_PAGE_REPORTING) += page_reporting.o
obj-$(CONFIG_IO_MAPPING) += io-mapping.o
obj-$(CONFIG_HAVE_BOOTMEM_INFO_NODE) += bootmem_info.o
obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
+obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
new file mode 100644
index 000000000000..1a70556bd46c
--- /dev/null
+++ b/mm/shrinker_debug.c
@@ -0,0 +1,168 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/idr.h>
+#include <linux/slab.h>
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
+#include <linux/shrinker.h>
+#include <linux/memcontrol.h>
+
+/* defined in vmscan.c */
+extern struct rw_semaphore shrinker_rwsem;
+extern struct list_head shrinker_list;
+
+static DEFINE_IDA(shrinker_debugfs_ida);
+static struct dentry *shrinker_debugfs_root;
+
+static unsigned long shrinker_count_objects(struct shrinker *shrinker,
+ struct mem_cgroup *memcg,
+ unsigned long *count_per_node)
+{
+ unsigned long nr, total = 0;
+ int nid;
+
+ for_each_node(nid) {
+ if (nid == 0 || (shrinker->flags & SHRINKER_NUMA_AWARE)) {
+ struct shrink_control sc = {
+ .gfp_mask = GFP_KERNEL,
+ .nid = nid,
+ .memcg = memcg,
+ };
+
+ nr = shrinker->count_objects(shrinker, &sc);
+ if (nr == SHRINK_EMPTY)
+ nr = 0;
+ } else {
+ nr = 0;
+ }
+
+ count_per_node[nid] = nr;
+ total += nr;
+ }
+
+ return total;
+}
+
+static int shrinker_debugfs_count_show(struct seq_file *m, void *v)
+{
+ struct shrinker *shrinker = m->private;
+ unsigned long *count_per_node;
+ struct mem_cgroup *memcg;
+ unsigned long total;
+ bool memcg_aware;
+ int ret, nid;
+
+ count_per_node = kcalloc(nr_node_ids, sizeof(unsigned long), GFP_KERNEL);
+ if (!count_per_node)
+ return -ENOMEM;
+
+ ret = down_read_killable(&shrinker_rwsem);
+ if (ret) {
+ kfree(count_per_node);
+ return ret;
+ }
+ rcu_read_lock();
+
+ memcg_aware = shrinker->flags & SHRINKER_MEMCG_AWARE;
+
+ memcg = mem_cgroup_iter(NULL, NULL, NULL);
+ do {
+ if (memcg && !mem_cgroup_online(memcg))
+ continue;
+
+ total = shrinker_count_objects(shrinker,
+ memcg_aware ? memcg : NULL,
+ count_per_node);
+ if (total) {
+ seq_printf(m, "%lu", mem_cgroup_ino(memcg));
+ for_each_node(nid)
+ seq_printf(m, " %lu", count_per_node[nid]);
+ seq_putc(m, '\n');
+ }
+
+ if (!memcg_aware) {
+ mem_cgroup_iter_break(NULL, memcg);
+ break;
+ }
+
+ if (signal_pending(current)) {
+ mem_cgroup_iter_break(NULL, memcg);
+ ret = -EINTR;
+ break;
+ }
+ } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
+
+ rcu_read_unlock();
+ up_read(&shrinker_rwsem);
+
+ kfree(count_per_node);
+ return ret;
+}
+DEFINE_SHOW_ATTRIBUTE(shrinker_debugfs_count);
+
+int shrinker_debugfs_add(struct shrinker *shrinker)
+{
+ struct dentry *entry;
+ char buf[16];
+ int id;
+
+ lockdep_assert_held(&shrinker_rwsem);
+
+ /* debugfs isn't initialized yet, add debugfs entries later. */
+ if (!shrinker_debugfs_root)
+ return 0;
+
+ id = ida_alloc(&shrinker_debugfs_ida, GFP_KERNEL);
+ if (id < 0)
+ return id;
+ shrinker->debugfs_id = id;
+
+ snprintf(buf, sizeof(buf), "%d", id);
+
+ /* create debugfs entry */
+ entry = debugfs_create_dir(buf, shrinker_debugfs_root);
+ if (IS_ERR(entry)) {
+ ida_free(&shrinker_debugfs_ida, id);
+ return PTR_ERR(entry);
+ }
+ shrinker->debugfs_entry = entry;
+
+ debugfs_create_file("count", 0440, entry, shrinker,
+ &shrinker_debugfs_count_fops);
+ return 0;
+}
+
+void shrinker_debugfs_remove(struct shrinker *shrinker)
+{
+ lockdep_assert_held(&shrinker_rwsem);
+
+ if (!shrinker->debugfs_entry)
+ return;
+
+ debugfs_remove_recursive(shrinker->debugfs_entry);
+ ida_free(&shrinker_debugfs_ida, shrinker->debugfs_id);
+}
+
+static int __init shrinker_debugfs_init(void)
+{
+ struct shrinker *shrinker;
+ struct dentry *dentry;
+ int ret = 0;
+
+ dentry = debugfs_create_dir("shrinker", NULL);
+ if (IS_ERR(dentry))
+ return PTR_ERR(dentry);
+ shrinker_debugfs_root = dentry;
+
+ /* Create debugfs entries for shrinkers registered at boot */
+ down_write(&shrinker_rwsem);
+ list_for_each_entry(shrinker, &shrinker_list, list)
+ if (!shrinker->debugfs_entry) {
+ ret = shrinker_debugfs_add(shrinker);
+ if (ret)
+ break;
+ }
+ up_write(&shrinker_rwsem);
+
+ return ret;
+}
+late_initcall(shrinker_debugfs_init);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 1678802e03e7..f54df1ce9312 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -189,8 +189,8 @@ static void set_task_reclaim_state(struct task_struct *task,
task->reclaim_state = rs;
}

-static LIST_HEAD(shrinker_list);
-static DECLARE_RWSEM(shrinker_rwsem);
+LIST_HEAD(shrinker_list);
+DECLARE_RWSEM(shrinker_rwsem);

#ifdef CONFIG_MEMCG
static int shrinker_nr_max;
@@ -654,6 +654,7 @@ void register_shrinker_prepared(struct shrinker *shrinker)
down_write(&shrinker_rwsem);
list_add_tail(&shrinker->list, &shrinker_list);
shrinker->flags |= SHRINKER_REGISTERED;
+ WARN_ON_ONCE(shrinker_debugfs_add(shrinker));
up_write(&shrinker_rwsem);
}

@@ -681,6 +682,7 @@ void unregister_shrinker(struct shrinker *shrinker)
shrinker->flags &= ~SHRINKER_REGISTERED;
if (shrinker->flags & SHRINKER_MEMCG_AWARE)
unregister_memcg_shrinker(shrinker);
+ shrinker_debugfs_remove(shrinker);
up_write(&shrinker_rwsem);

kfree(shrinker->nr_deferred);
--
2.35.3


2022-05-26 22:12:51

by Roman Gushchin

Subject: [PATCH v4 4/6] mm: docs: document shrinker debugfs

Add a document describing the shrinker debugfs interface.

Signed-off-by: Roman Gushchin <[email protected]>
---
Documentation/admin-guide/mm/index.rst | 1 +
.../admin-guide/mm/shrinker_debugfs.rst | 100 ++++++++++++++++++
2 files changed, 101 insertions(+)
create mode 100644 Documentation/admin-guide/mm/shrinker_debugfs.rst

diff --git a/Documentation/admin-guide/mm/index.rst b/Documentation/admin-guide/mm/index.rst
index c21b5823f126..1bd11118dfb1 100644
--- a/Documentation/admin-guide/mm/index.rst
+++ b/Documentation/admin-guide/mm/index.rst
@@ -36,6 +36,7 @@ the Linux memory management.
numa_memory_policy
numaperf
pagemap
+ shrinker_debugfs
soft-dirty
swap_numa
transhuge
diff --git a/Documentation/admin-guide/mm/shrinker_debugfs.rst b/Documentation/admin-guide/mm/shrinker_debugfs.rst
new file mode 100644
index 000000000000..2033d696aa59
--- /dev/null
+++ b/Documentation/admin-guide/mm/shrinker_debugfs.rst
@@ -0,0 +1,100 @@
+.. _shrinker_debugfs:
+
+==========================
+Shrinker Debugfs Interface
+==========================
+
+Shrinker debugfs interface provides a visibility into the kernel memory
+shrinkers subsystem and allows to get information about individual shrinkers.
+
+For each shrinker registered in the system a directory in **<debugfs>/shrinker/**
+is created. The directory's name is composed from the shrinker's name and a
+unique id: e.g. *kfree_rcu-0* or *sb-xfs:vda1-36*.
+
+Each shrinker directory contains the **count** file, which allows to trigger
+the *count_objects()* callback for each memcg and numa node (if applicable).
+
+Usage:
+------
+
+1. *List registered shrinkers*
+
+ ::
+
+ $ cd /sys/kernel/debug/shrinker/
+ $ ls
+ dqcache-16 sb-hugetlbfs-17 sb-rootfs-2 sb-tmpfs-49
+ kfree_rcu-0 sb-hugetlbfs-33 sb-securityfs-6 sb-tracefs-13
+ sb-aio-20 sb-iomem-12 sb-selinuxfs-22 sb-xfs:vda1-36
+ sb-anon_inodefs-15 sb-mqueue-21 sb-sockfs-8 sb-zsmalloc-19
+ sb-bdev-3 sb-nsfs-4 sb-sysfs-26 shadow-18
+ sb-bpf-32 sb-pipefs-14 sb-tmpfs-1 thp_deferred_split-10
+ sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-27 thp_zero-9
+ sb-cgroup2-30 sb-proc-39 sb-tmpfs-29 xfs_buf-vda1-37
+ sb-configfs-23 sb-proc-41 sb-tmpfs-35 xfs_inodegc-vda1-38
+ sb-dax-11 sb-proc-45 sb-tmpfs-40 zspool-zram0-34
+ sb-debugfs-7 sb-proc-46 sb-tmpfs-42
+ sb-devpts-28 sb-proc-47 sb-tmpfs-43
+ sb-devtmpfs-5 sb-pstore-31 sb-tmpfs-44
+
+2. *Get information about a specific shrinker*
+
+ ::
+
+ $ cd sb-btrfs\:vda2-24/
+ $ ls
+ count
+
+3. *Count objects*
+
+ Each line in the output has the following format::
+
+ <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1> ...
+ <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1> ...
+ ...
+
+ If there are no objects on all numa nodes, a line is omitted. If there
+ are no objects at all, the output might be empty.
+ ::
+
+ $ cat count
+ 1 224 2
+ 21 98 0
+ 55 818 10
+ 2367 2 0
+ 2401 30 0
+ 225 13 0
+ 599 35 0
+ 939 124 0
+ 1041 3 0
+ 1075 1 0
+ 1109 1 0
+ 1279 60 0
+ 1313 7 0
+ 1347 39 0
+ 1381 3 0
+ 1449 14 0
+ 1483 63 0
+ 1517 53 0
+ 1551 6 0
+ 1585 1 0
+ 1619 6 0
+ 1653 40 0
+ 1687 11 0
+ 1721 8 0
+ 1755 4 0
+ 1789 52 0
+ 1823 888 0
+ 1857 1 0
+ 1925 2 0
+ 1959 32 0
+ 2027 22 0
+ 2061 9 0
+ 2469 799 0
+ 2537 861 0
+ 2639 1 0
+ 2707 70 0
+ 2775 4 0
+ 2877 84 0
+ 293 1 0
+ 735 8 0
--
2.35.3


2022-05-27 01:28:48

by Muchun Song

Subject: Re: [PATCH v4 6/6] mm: shrinkers: add scan interface for shrinker debugfs

On Wed, May 25, 2022 at 01:26:00PM -0700, Roman Gushchin wrote:
> Add a scan interface which allows to trigger scanning of a particular
> shrinker and specify memcg and numa node. It's useful for testing,
> debugging and profiling of a specific scan_objects() callback.
> Unlike alternatives (creating a real memory pressure and dropping
> caches via /proc/sys/vm/drop_caches) this interface allows to interact
> with only one shrinker at once. Also, if a shrinker is misreporting
> the number of objects (as some do), it doesn't affect scanning.
>
> Signed-off-by: Roman Gushchin <[email protected]>

Acked-by: Muchun Song <[email protected]>

Thanks.

2022-05-27 15:02:24

by Muchun Song

Subject: Re: [PATCH v4 2/6] mm: shrinkers: introduce debugfs interface for memory shrinkers

On Wed, May 25, 2022 at 01:25:56PM -0700, Roman Gushchin wrote:
> This commit introduces the /sys/kernel/debug/shrinker debugfs
> interface which provides an ability to observe the state of
> individual kernel memory shrinkers.
>
> Because the feature adds some memory overhead (which shouldn't be
> large unless there is a huge amount of registered shrinkers), it's
> guarded by a config option (enabled by default).
>
> This commit introduces the "count" interface for each shrinker
> registered in the system.
>
> The output is in the following format:
> <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
> <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
> ...
>
> To reduce the size of output on machines with many thousands cgroups,
> if the total number of objects on all nodes is 0, the line is omitted.
>
> If the shrinker is not memcg-aware or CONFIG_MEMCG is off, 0 is
> printed as cgroup inode id. If the shrinker is not numa-aware, 0's are
> printed for all nodes except the first one.
>
> This commit gives debugfs entries simple numeric names, which are not
> very convenient. The following commit in the series will provide
> shrinkers with more meaningful names.
>
> Signed-off-by: Roman Gushchin <[email protected]>
> Reviewed-by: Kent Overstreet <[email protected]>

Acked-by: Muchun Song <[email protected]>

Thanks.

2022-05-27 21:55:54

by Roman Gushchin

Subject: [PATCH v4 5/6] tools: add memcg_shrinker.py

Add a simple tool which prints shrinker lists sorted by size, largest
first, in the following format: (number of objects, shrinker name, cgroup).

Example:
$ ./memcg_shrinker.py -n 10
2090 sb-sysfs-26 /sys/fs/cgroup/system.slice
1809 sb-sysfs-26 /sys/fs/cgroup/system.slice/systemd-udevd.service
1044 sb-btrfs:vda2-24 /sys/fs/cgroup/system.slice/system-dbus\x2d:1.3\...
861 sb-btrfs:vda2-24 /sys/fs/cgroup/system.slice/system-dbus\x2d:1.3\...
804 sb-btrfs:vda2-24 /sys/fs/cgroup/system.slice
643 sb-btrfs:vda2-24 /sys/fs/cgroup/system.slice/firewalld.service
616 sb-cgroup2-30 /sys/fs/cgroup/init.scope
275 sb-sysfs-26 /
238 sb-proc-25 /sys/fs/cgroup/system.slice/systemd-journald.service
225 sb-proc-25 /sys/fs/cgroup/system.slice/abrtd.service
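
Note that the script in this patch ranks entries by the first count
column only, i.e. the objects on node 0. On a multi-node machine one
might prefer to rank by the total across nodes. A minimal sketch of that
variant, assuming the per-node counts have already been parsed into
lists (rank_by_total is a hypothetical helper, not part of the tool):

```python
def rank_by_total(entries, limit=None):
    """Rank shrinker lists by total object count across all NUMA nodes.

    entries: iterable of (shrinker name, cgroup inode id, [per-node counts]).
    Returns (total, shrinker name, cgroup inode id) tuples, largest first,
    with empty lists dropped.
    """
    totals = [(sum(per_node), name, ino) for name, ino, per_node in entries]
    totals = [t for t in totals if t[0] > 0]
    totals.sort(key=lambda t: t[0], reverse=True)
    return totals[:limit] if limit else totals


# Values borrowed from the sb-btrfs:vda2-24 example in the documentation.
entries = [
    ("sb-btrfs:vda2-24", 21, [98, 0]),
    ("sb-btrfs:vda2-24", 55, [818, 10]),
    ("sb-btrfs:vda2-24", 2367, [2, 0]),
]
top = rank_by_total(entries, limit=2)
```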

Signed-off-by: Roman Gushchin <[email protected]>
---
tools/cgroup/memcg_shrinker.py | 71 ++++++++++++++++++++++++++++++++++
1 file changed, 71 insertions(+)
create mode 100755 tools/cgroup/memcg_shrinker.py

diff --git a/tools/cgroup/memcg_shrinker.py b/tools/cgroup/memcg_shrinker.py
new file mode 100755
index 000000000000..706ab27666a4
--- /dev/null
+++ b/tools/cgroup/memcg_shrinker.py
@@ -0,0 +1,71 @@
+#!/usr/bin/env python3
+#
+# Copyright (C) 2022 Roman Gushchin <[email protected]>
+# Copyright (C) 2022 Meta
+
+import os
+import argparse
+import sys
+
+
+def scan_cgroups(cgroup_root):
+ cgroups = {}
+
+ for root, subdirs, _ in os.walk(cgroup_root):
+ for cgroup in subdirs:
+ path = os.path.join(root, cgroup)
+ ino = os.stat(path).st_ino
+ cgroups[ino] = path
+
+ # (memcg ino, path)
+ return cgroups
+
+
+def scan_shrinkers(shrinker_debugfs):
+ shrinkers = []
+
+ for root, subdirs, _ in os.walk(shrinker_debugfs):
+ for shrinker in subdirs:
+ count_path = os.path.join(root, shrinker, "count")
+ with open(count_path) as f:
+ for line in f.readlines():
+ items = line.split(' ')
+ ino = int(items[0])
+ # (count, shrinker, memcg ino)
+ shrinkers.append((int(items[1]), shrinker, ino))
+ return shrinkers
+
+
+def main():
+ parser = argparse.ArgumentParser(description='Display biggest shrinkers')
+ parser.add_argument('-n', '--lines', type=int, help='Number of lines to print')
+
+ args = parser.parse_args()
+
+ cgroups = scan_cgroups("/sys/fs/cgroup/")
+ shrinkers = scan_shrinkers("/sys/kernel/debug/shrinker/")
+ shrinkers = sorted(shrinkers, reverse = True, key = lambda x: x[0])
+
+ n = 0
+ for s in shrinkers:
+ count, name, ino = (s[0], s[1], s[2])
+ if count == 0:
+ break
+
+ if ino == 0 or ino == 1:
+ cg = "/"
+ else:
+ try:
+ cg = cgroups[ino]
+ except KeyError:
+ cg = "unknown (%d)" % ino
+
+ print("%-8s %-20s %s" % (count, name, cg))
+
+ n += 1
+ if args.lines and n >= args.lines:
+ break
+
+
+if __name__ == '__main__':
+ main()
--
2.35.3


2022-05-28 19:08:50

by Dave Chinner

Subject: Re: [PATCH v4 3/6] mm: shrinkers: provide shrinkers with names

On Wed, May 25, 2022 at 01:25:57PM -0700, Roman Gushchin wrote:
> Currently shrinkers are anonymous objects. For debugging purposes they
> can be identified by count/scan function names, but it's not always
> useful: e.g. for superblock's shrinkers it's nice to have at least
> an idea of to which superblock the shrinker belongs.
>
> This commit adds names to shrinkers. register_shrinker() and
> prealloc_shrinker() functions are extended to take a format and
> arguments to master a name.
>
> In some cases it's not possible to determine a good name at the time
> when a shrinker is allocated. For such cases shrinker_debugfs_rename()
> is provided.
>
> After this change the shrinker debugfs directory looks like:
> $ cd /sys/kernel/debug/shrinker/
> $ ls
> dqcache-16 sb-hugetlbfs-17 sb-rootfs-2 sb-tmpfs-49
> kfree_rcu-0 sb-hugetlbfs-33 sb-securityfs-6 sb-tracefs-13
> sb-aio-20 sb-iomem-12 sb-selinuxfs-22 sb-xfs:vda1-36
> sb-anon_inodefs-15 sb-mqueue-21 sb-sockfs-8 sb-zsmalloc-19
> sb-bdev-3 sb-nsfs-4 sb-sysfs-26 shadow-18
> sb-bpf-32 sb-pipefs-14 sb-tmpfs-1 thp_deferred_split-10
> sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-27 thp_zero-9
> sb-cgroup2-30 sb-proc-39 sb-tmpfs-29 xfs_buf-vda1-37
> sb-configfs-23 sb-proc-41 sb-tmpfs-35 xfs_inodegc-vda1-38

sb-xfs:vda1-36
xfs_buf-vda1-37
xfs_inodegc-vda1-38

That's a parsing nightmare right there. Please use the same format
for everything. You have <subsystem>-<type>:<instance>-<id> for
superblock stuff, but <subsys>_<type>-<instance>-<id> for the XFS
stuff. Make it consistent so we aren't reduced to pulling out our
hair trying to parse this in any useful way:

sb-xfs:vda1-36
xfs-buf:vda1-37
xfs-inodegc:vda1-38
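
To show what a consistent format buys, here is a rough sketch of a
parser for <subsystem>-<type>:<instance>-<id> directory names. The
function name and the returned keys are invented for this example, and
it assumes <type> and <instance> may be absent (as in kfree_rcu-0 or
sb-tmpfs-27) and that the subsystem itself contains no '-':

```python
def parse_shrinker_dir(dirname):
    """Split '<subsystem>-<type>:<instance>-<id>' into its parts.

    <type> and <instance> are optional; missing parts come back as None.
    """
    # The numeric id is always the last '-'-separated component.
    name, sep, id_str = dirname.rpartition("-")
    if not sep or not id_str.isdigit():
        raise ValueError("no trailing numeric id in %r" % dirname)
    # Everything after the first ':' is the instance (it may contain '-').
    head, colon, instance = name.partition(":")
    subsystem, dash, kind = head.partition("-")
    return {
        "subsystem": subsystem,
        "type": kind if dash else None,
        "instance": instance if colon else None,
        "id": int(id_str),
    }
```

With the mixed naming in this version, xfs_buf-vda1-37 would come back
with "vda1" as the type and no instance, which is exactly the ambiguity
being complained about.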

FWIW, how we are supposed to know what actually owns these:

sb-tmpfs-1
sb-tmpfs-27
sb-tmpfs-29
sb-tmpfs-35
sb-tmpfs-49

tmpfs-27 might own all the memory - how do we link that back to a
mount point, container, user, workload, etc?

Cheers,

Dave.
--
Dave Chinner
[email protected]


2022-05-28 20:46:34

by Roman Gushchin

Subject: Re: [PATCH v4 3/6] mm: shrinkers: provide shrinkers with names

On Fri, May 27, 2022 at 07:30:34PM +1000, Dave Chinner wrote:
> On Wed, May 25, 2022 at 01:25:57PM -0700, Roman Gushchin wrote:
> > Currently shrinkers are anonymous objects. For debugging purposes they
> > can be identified by count/scan function names, but it's not always
> > useful: e.g. for superblock's shrinkers it's nice to have at least
> > an idea of to which superblock the shrinker belongs.
> >
> > This commit adds names to shrinkers. register_shrinker() and
> > prealloc_shrinker() functions are extended to take a format and
> > arguments to master a name.
> >
> > In some cases it's not possible to determine a good name at the time
> > when a shrinker is allocated. For such cases shrinker_debugfs_rename()
> > is provided.
> >
> > After this change the shrinker debugfs directory looks like:
> > $ cd /sys/kernel/debug/shrinker/
> > $ ls
> > dqcache-16 sb-hugetlbfs-17 sb-rootfs-2 sb-tmpfs-49
> > kfree_rcu-0 sb-hugetlbfs-33 sb-securityfs-6 sb-tracefs-13
> > sb-aio-20 sb-iomem-12 sb-selinuxfs-22 sb-xfs:vda1-36
> > sb-anon_inodefs-15 sb-mqueue-21 sb-sockfs-8 sb-zsmalloc-19
> > sb-bdev-3 sb-nsfs-4 sb-sysfs-26 shadow-18
> > sb-bpf-32 sb-pipefs-14 sb-tmpfs-1 thp_deferred_split-10
> > sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-27 thp_zero-9
> > sb-cgroup2-30 sb-proc-39 sb-tmpfs-29 xfs_buf-vda1-37
> > sb-configfs-23 sb-proc-41 sb-tmpfs-35 xfs_inodegc-vda1-38
>
> sb-xfs:vda1-36
> xfs_buf-vda1-37
> xfs_inodegc-vda1-38
>
> That's a parsing nightmare right there. Please use the same format
> for everything. You have <subsystem>-<type>:<instance>-<id> for
> superblock stuff, but <subsys>_<type>-<instance>-<id> for the XFS
> stuff. Make it consistent so we aren't reduced to pulling out our
> hair trying to parse this in any useful way:
>
> sb-xfs:vda1-36
> xfs-buf:vda1-37
> xfs-inodegc:vda1-38

Ok, good point, will do in the next version.

>
> FWIW, how we are supposed to know what actually owns these:
>
> sb-tmpfs-1
> sb-tmpfs-27
> sb-tmpfs-29
> sb-tmpfs-35
> sb-tmpfs-49
>
> tmpfs-27 might own all the memory - how do we link that back to a
> mount point, container, user, workload, etc?

I agree, but I've no good idea what to use as an id. We can't put the mount
point, user, group etc together in the file name - it will be too lengthy
(and mount namespaces are making it even more complicated).

Maybe we can add a symlink to the mount point from within the directory?

Do you have any ideas here?

Thanks!