Received: by 2002:a05:6602:18e:0:0:0:0 with SMTP id m14csp5744407ioo; Wed, 1 Jun 2022 11:37:17 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy6Rai4yVDtUTdbhwGyAVzi5/QXAM6ZGKYEdYRVz0ikqxTAhxyLX0UaW1UQGwCkuiYiGzgl X-Received: by 2002:a17:90b:4b82:b0:1e6:7835:2f05 with SMTP id lr2-20020a17090b4b8200b001e678352f05mr724865pjb.121.1654108636923; Wed, 01 Jun 2022 11:37:16 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1654108636; cv=none; d=google.com; s=arc-20160816; b=RJ04I5+48StO/USanb+V3BhsO6pDJ/d6eRCRuySd4AlI2sDRlGLMebMO3und7gbh9b TntKfl5jI2fxGgJ35XnABfCoZ238AxtB0crN6OcUmUZRV1HeMd+Cwd734KqqSel3PtUl tXCv2ECNyIedCjPTvbjH128P2XelDpx2QUmxaacoBSsu+OSoVORTammcxSHaWq2516pd qSBqy4pWvvvx35nahu26qPv7Me48HWbznwuI3tTNyskys+FqEjNcvqqydx8p9wRFSae0 7rf1bo9OMAymeK6RJUA3HUuRLYkGYUcxYUBmeqOPoy8FXtFcmAp/HBs55JlOWwMmGqbo xV4g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=G1XnzXVFcgkxUaEXroUxqWAHQBR97Aftvk9f2kX+02o=; b=QXjf8Nz++cmICVJuu3K0VhPvUgbBeWlA2fo7pFtU0oKlsbNd+o09GKBYXpCXP2o9J6 6/JOuVHWrrqN9n6eOETLvxBR6GrF6nBo7qeyeeCNNpm2HYPWHQXi/qYVtX78eRqvmFIe rrqTIxubUdH+Fv25qlAa7iI0IUex1XiQ0UwD6sxnWMn8HZhvUnBFShVYwYmdedUe9zHH lk8TPYyWBrtHDOy3A4SM2ZeoHWAyejAZcFYhM20Qje0SSpQpZfFAXkrRXLLGU9BzOf1I vQ1tqXIbCqy6VoWhcLXtZcROu7NRUPNU5cj0wQk3xg06lL3MfYw6Le3KxpuO2dieflf8 h1jg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=PtFtkfpb; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [23.128.96.19]) by mx.google.com with ESMTPS id o13-20020a170902d4cd00b001637dbe1bc4si3449278plg.44.2022.06.01.11.37.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Jun 2022 11:37:16 -0700 (PDT) Received-SPF: softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) client-ip=23.128.96.19; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=PtFtkfpb; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 592F6B41DD; Wed, 1 Jun 2022 11:36:35 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1349343AbiFADWy (ORCPT + 99 others); Tue, 31 May 2022 23:22:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37236 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1349340AbiFADWo (ORCPT ); Tue, 31 May 2022 23:22:44 -0400 Received: from out2.migadu.com (out2.migadu.com [IPv6:2001:41d0:2:aacc::]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B153020BED for ; Tue, 31 May 2022 20:22:42 -0700 (PDT) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1654053761; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=G1XnzXVFcgkxUaEXroUxqWAHQBR97Aftvk9f2kX+02o=; b=PtFtkfpba4MJyyCoGHOxtr2b2whL6H2an9XnO9rDr83Wi52z9jC95tGuaf02FQN0mRp5uW KIB2zwkKJ1cAdGMrZGl/iVgU1oucRb2hOneeMqUvO4IJOu72IW0FUazbJs2jOs0eUTQLcr bACWx7PrtRvcSpy2MZcdWwRfYFaBTv0= From: Roman Gushchin To: Andrew Morton , linux-mm@kvack.org Cc: Dave Chinner , linux-kernel@vger.kernel.org, Kent Overstreet , Hillf Danton , Christophe JAILLET , Muchun Song , Roman Gushchin Subject: [PATCH v5 2/6] mm: shrinkers: introduce debugfs interface for memory shrinkers Date: Tue, 31 May 2022 20:22:23 -0700 Message-Id: <20220601032227.4076670-3-roman.gushchin@linux.dev> In-Reply-To: <20220601032227.4076670-1-roman.gushchin@linux.dev> References: <20220601032227.4076670-1-roman.gushchin@linux.dev> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Migadu-Flow: FLOW_OUT X-Migadu-Auth-User: linux.dev X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RDNS_NONE,SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This commit introduces the /sys/kernel/debug/shrinker debugfs interface which provides an ability to observe the state of individual kernel memory shrinkers. Because the feature adds some memory overhead (which shouldn't be large unless there is a huge amount of registered shrinkers), it's guarded by a config option (enabled by default). This commit introduces the "count" interface for each shrinker registered in the system. The output is in the following format: ... ... ... To reduce the size of output on machines with many thousands cgroups, if the total number of objects on all nodes is 0, the line is omitted. If the shrinker is not memcg-aware or CONFIG_MEMCG is off, 0 is printed as cgroup inode id. If the shrinker is not numa-aware, 0's are printed for all nodes except the first one. This commit gives debugfs entries simple numeric names, which are not very convenient. The following commit in the series will provide shrinkers with more meaningful names. Signed-off-by: Roman Gushchin Reviewed-by: Kent Overstreet Acked-by: Muchun Song --- include/linux/shrinker.h | 19 ++++- lib/Kconfig.debug | 9 +++ mm/Makefile | 1 + mm/shrinker_debug.c | 168 +++++++++++++++++++++++++++++++++++++++ mm/vmscan.c | 6 +- 5 files changed, 200 insertions(+), 3 deletions(-) create mode 100644 mm/shrinker_debug.c diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h index 76fbf92b04d9..2ced8149c513 100644 --- a/include/linux/shrinker.h +++ b/include/linux/shrinker.h @@ -72,6 +72,10 @@ struct shrinker { #ifdef CONFIG_MEMCG /* ID in shrinker_idr */ int id; +#endif +#ifdef CONFIG_SHRINKER_DEBUG + int debugfs_id; + struct dentry *debugfs_entry; #endif /* objs pending delete, per node */ atomic_long_t *nr_deferred; @@ -94,4 +98,17 @@ extern int register_shrinker(struct shrinker *shrinker); extern void unregister_shrinker(struct shrinker *shrinker); extern void free_prealloced_shrinker(struct shrinker *shrinker); extern void synchronize_shrinkers(void); -#endif + +#ifdef CONFIG_SHRINKER_DEBUG +extern int shrinker_debugfs_add(struct shrinker *shrinker); +extern void shrinker_debugfs_remove(struct shrinker *shrinker); +#else /* CONFIG_SHRINKER_DEBUG */ +static inline int shrinker_debugfs_add(struct shrinker *shrinker) +{ + return 0; +} +static inline void shrinker_debugfs_remove(struct shrinker *shrinker) +{ +} +#endif /* CONFIG_SHRINKER_DEBUG */ +#endif /* _LINUX_SHRINKER_H */ diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 075cd25363ac..6fda0ac6661c 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -732,6 +732,15 @@ config SLUB_STATS out which slabs are relevant to a particular load. Try running: slabinfo -DA +config SHRINKER_DEBUG + default y + bool "Enable shrinker debugging support" + depends on DEBUG_FS + help + Say Y to enable the shrinker debugfs interface which provides + visibility into the kernel memory shrinkers subsystem. + Disable it to avoid an extra memory footprint. + config HAVE_DEBUG_KMEMLEAK bool diff --git a/mm/Makefile b/mm/Makefile index 4cc13f3179a5..b693cbec9aa9 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -133,3 +133,4 @@ obj-$(CONFIG_PAGE_REPORTING) += page_reporting.o obj-$(CONFIG_IO_MAPPING) += io-mapping.o obj-$(CONFIG_HAVE_BOOTMEM_INFO_NODE) += bootmem_info.o obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o +obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c new file mode 100644 index 000000000000..1a70556bd46c --- /dev/null +++ b/mm/shrinker_debug.c @@ -0,0 +1,168 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include +#include + +/* defined in vmscan.c */ +extern struct rw_semaphore shrinker_rwsem; +extern struct list_head shrinker_list; + +static DEFINE_IDA(shrinker_debugfs_ida); +static struct dentry *shrinker_debugfs_root; + +static unsigned long shrinker_count_objects(struct shrinker *shrinker, + struct mem_cgroup *memcg, + unsigned long *count_per_node) +{ + unsigned long nr, total = 0; + int nid; + + for_each_node(nid) { + if (nid == 0 || (shrinker->flags & SHRINKER_NUMA_AWARE)) { + struct shrink_control sc = { + .gfp_mask = GFP_KERNEL, + .nid = nid, + .memcg = memcg, + }; + + nr = shrinker->count_objects(shrinker, &sc); + if (nr == SHRINK_EMPTY) + nr = 0; + } else { + nr = 0; + } + + count_per_node[nid] = nr; + total += nr; + } + + return total; +} + +static int shrinker_debugfs_count_show(struct seq_file *m, void *v) +{ + struct shrinker *shrinker = m->private; + unsigned long *count_per_node; + struct mem_cgroup *memcg; + unsigned long total; + bool memcg_aware; + int ret, nid; + + count_per_node = kcalloc(nr_node_ids, sizeof(unsigned long), GFP_KERNEL); + if (!count_per_node) + return -ENOMEM; + + ret = down_read_killable(&shrinker_rwsem); + if (ret) { + kfree(count_per_node); + return ret; + } + rcu_read_lock(); + + memcg_aware = shrinker->flags & SHRINKER_MEMCG_AWARE; + + memcg = mem_cgroup_iter(NULL, NULL, NULL); + do { + if (memcg && !mem_cgroup_online(memcg)) + continue; + + total = shrinker_count_objects(shrinker, + memcg_aware ? memcg : NULL, + count_per_node); + if (total) { + seq_printf(m, "%lu", mem_cgroup_ino(memcg)); + for_each_node(nid) + seq_printf(m, " %lu", count_per_node[nid]); + seq_putc(m, '\n'); + } + + if (!memcg_aware) { + mem_cgroup_iter_break(NULL, memcg); + break; + } + + if (signal_pending(current)) { + mem_cgroup_iter_break(NULL, memcg); + ret = -EINTR; + break; + } + } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL); + + rcu_read_unlock(); + up_read(&shrinker_rwsem); + + kfree(count_per_node); + return ret; +} +DEFINE_SHOW_ATTRIBUTE(shrinker_debugfs_count); + +int shrinker_debugfs_add(struct shrinker *shrinker) +{ + struct dentry *entry; + char buf[16]; + int id; + + lockdep_assert_held(&shrinker_rwsem); + + /* debugfs isn't initialized yet, add debugfs entries later. */ + if (!shrinker_debugfs_root) + return 0; + + id = ida_alloc(&shrinker_debugfs_ida, GFP_KERNEL); + if (id < 0) + return id; + shrinker->debugfs_id = id; + + snprintf(buf, sizeof(buf), "%d", id); + + /* create debugfs entry */ + entry = debugfs_create_dir(buf, shrinker_debugfs_root); + if (IS_ERR(entry)) { + ida_free(&shrinker_debugfs_ida, id); + return PTR_ERR(entry); + } + shrinker->debugfs_entry = entry; + + debugfs_create_file("count", 0220, entry, shrinker, + &shrinker_debugfs_count_fops); + return 0; +} + +void shrinker_debugfs_remove(struct shrinker *shrinker) +{ + lockdep_assert_held(&shrinker_rwsem); + + if (!shrinker->debugfs_entry) + return; + + debugfs_remove_recursive(shrinker->debugfs_entry); + ida_free(&shrinker_debugfs_ida, shrinker->debugfs_id); +} + +static int __init shrinker_debugfs_init(void) +{ + struct shrinker *shrinker; + struct dentry *dentry; + int ret = 0; + + dentry = debugfs_create_dir("shrinker", NULL); + if (IS_ERR(dentry)) + return PTR_ERR(dentry); + shrinker_debugfs_root = dentry; + + /* Create debugfs entries for shrinkers registered at boot */ + down_write(&shrinker_rwsem); + list_for_each_entry(shrinker, &shrinker_list, list) + if (!shrinker->debugfs_entry) { + ret = shrinker_debugfs_add(shrinker); + if (ret) + break; + } + up_write(&shrinker_rwsem); + + return ret; +} +late_initcall(shrinker_debugfs_init); diff --git a/mm/vmscan.c b/mm/vmscan.c index 1678802e03e7..f54df1ce9312 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -189,8 +189,8 @@ static void set_task_reclaim_state(struct task_struct *task, task->reclaim_state = rs; } -static LIST_HEAD(shrinker_list); -static DECLARE_RWSEM(shrinker_rwsem); +LIST_HEAD(shrinker_list); +DECLARE_RWSEM(shrinker_rwsem); #ifdef CONFIG_MEMCG static int shrinker_nr_max; @@ -654,6 +654,7 @@ void register_shrinker_prepared(struct shrinker *shrinker) down_write(&shrinker_rwsem); list_add_tail(&shrinker->list, &shrinker_list); shrinker->flags |= SHRINKER_REGISTERED; + WARN_ON_ONCE(shrinker_debugfs_add(shrinker)); up_write(&shrinker_rwsem); } @@ -681,6 +682,7 @@ void unregister_shrinker(struct shrinker *shrinker) shrinker->flags &= ~SHRINKER_REGISTERED; if (shrinker->flags & SHRINKER_MEMCG_AWARE) unregister_memcg_shrinker(shrinker); + shrinker_debugfs_remove(shrinker); up_write(&shrinker_rwsem); kfree(shrinker->nr_deferred); -- 2.35.3