2024-04-22 07:53:11

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH v4 0/4] Improve visibility of writeback

v3->v4:
-Fix build warning that filepages, headroom and writeback in
cgwb_calc_thresh is used uninitialized when CONFIG_CGROUP_WRITEBACK
is not enabled.

v2->v3:
-Drop patches to protect non-exist race and to define GDTC_INIT_NO_WB to
null.
-Add wb_tryget to wb from which we collect stats to bdi stats.
-Create wb_stats when CONFIG_CGROUP_WRITEBACK is not enabled.
-Add a blank line between two wb stats in wb_stats.

v1->v2:
-Send cleanup to wq_monitor.py separately.
-Add patch to avoid use after free of bdi.
-Rename wb_calc_cg_thresh to cgwb_calc_thresh as Tejun suggested.
-Use rcu walk to avoid use after free.
-Add debug output to each related patches.

This series tries to improve visilibity of writeback. Patch 1 make
/sys/kernel/debug/bdi/xxx/stats show writeback info of whole bdi
instead of only writeback info in root cgroup. Patch 2 add a new
debug file /sys/kernel/debug/bdi/xxx/wb_stats to show per wb writeback
info. Patch 3 add wb_monitor.py to monitor basic writeback info
of running system, more info could be added on demand. Patch 4
is a random cleanup. More details can be found in respective
patches. Thanks!

Following domain hierarchy is tested:
global domain (320G)
/ \
cgroup domain1(10G) cgroup domain2(10G)
| |
bdi wb1 wb2

/* all writeback info of bdi is successfully collected */
cat stats
BdiWriteback: 4704 kB
BdiReclaimable: 1294496 kB
BdiDirtyThresh: 204208088 kB
DirtyThresh: 195259944 kB
BackgroundThresh: 32503588 kB
BdiDirtied: 48519296 kB
BdiWritten: 47225696 kB
BdiWriteBandwidth: 1173892 kBps
b_dirty: 1
b_io: 0
b_more_io: 1
b_dirty_time: 0
bdi_list: 1
state: 1

/* per wb writeback info of bdi is collected */
cat /sys/kernel/debug/bdi/252:16/wb_stats
WbCgIno: 1
WbWriteback: 0 kB
WbReclaimable: 0 kB
WbDirtyThresh: 0 kB
WbDirtied: 0 kB
WbWritten: 0 kB
WbWriteBandwidth: 102400 kBps
b_dirty: 0
b_io: 0
b_more_io: 0
b_dirty_time: 0
state: 1

WbCgIno: 4208
WbWriteback: 59808 kB
WbReclaimable: 676480 kB
WbDirtyThresh: 6004624 kB
WbDirtied: 23348192 kB
WbWritten: 22614592 kB
WbWriteBandwidth: 593204 kBps
b_dirty: 1
b_io: 1
b_more_io: 0
b_dirty_time: 0
state: 7

WbCgIno: 4249
WbWriteback: 144256 kB
WbReclaimable: 432096 kB
WbDirtyThresh: 6004344 kB
WbDirtied: 25727744 kB
WbWritten: 25154752 kB
WbWriteBandwidth: 577904 kBps
b_dirty: 0
b_io: 1
b_more_io: 0
b_dirty_time: 0
state: 7

The wb_monitor.py script output is as following:
/wb_monitor.py 252:16 -c
writeback reclaimable dirtied written avg_bw
252:16_1 0 0 0 0 102400
252:16_4284 672 820064 9230368 8410304 685612
252:16_4325 896 819840 10491264 9671648 652348
252:16 1568 1639904 19721632 18081952 1440360

writeback reclaimable dirtied written avg_bw
252:16_1 0 0 0 0 102400
252:16_4284 672 820064 9230368 8410304 685612
252:16_4325 896 819840 10491264 9671648 652348
252:16 1568 1639904 19721632 18081952 1440360
..

Kemeng Shi (4):
writeback: collect stats of all wb of bdi in bdi_debug_stats_show
writeback: support retrieving per group debug writeback stats of bdi
writeback: add wb_monitor.py script to monitor writeback info on bdi
writeback: rename nr_reclaimable to nr_dirty in balance_dirty_pages

include/linux/writeback.h | 1 +
mm/backing-dev.c | 174 +++++++++++++++++++++++++++++-----
mm/page-writeback.c | 27 +++++-
tools/writeback/wb_monitor.py | 172 +++++++++++++++++++++++++++++++++
4 files changed, 345 insertions(+), 29 deletions(-)
create mode 100644 tools/writeback/wb_monitor.py

--
2.30.0



2024-04-22 07:53:26

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH v4 4/4] writeback: rename nr_reclaimable to nr_dirty in balance_dirty_pages

Commit 8d92890bd6b85 ("mm/writeback: discard NR_UNSTABLE_NFS, use
NR_WRITEBACK instead") removed NR_UNSTABLE_NFS and nr_reclaimable
only contains dirty page now.
Rename nr_reclaimable to nr_dirty properly.

Signed-off-by: Kemeng Shi <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
---
mm/page-writeback.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 3bb3bed102ef..44df5c899a33 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1694,7 +1694,7 @@ static int balance_dirty_pages(struct bdi_writeback *wb,
struct dirty_throttle_control * const mdtc = mdtc_valid(&mdtc_stor) ?
&mdtc_stor : NULL;
struct dirty_throttle_control *sdtc;
- unsigned long nr_reclaimable; /* = file_dirty */
+ unsigned long nr_dirty;
long period;
long pause;
long max_pause;
@@ -1715,9 +1715,9 @@ static int balance_dirty_pages(struct bdi_writeback *wb,
unsigned long m_thresh = 0;
unsigned long m_bg_thresh = 0;

- nr_reclaimable = global_node_page_state(NR_FILE_DIRTY);
+ nr_dirty = global_node_page_state(NR_FILE_DIRTY);
gdtc->avail = global_dirtyable_memory();
- gdtc->dirty = nr_reclaimable + global_node_page_state(NR_WRITEBACK);
+ gdtc->dirty = nr_dirty + global_node_page_state(NR_WRITEBACK);

domain_dirty_limits(gdtc);

@@ -1768,7 +1768,7 @@ static int balance_dirty_pages(struct bdi_writeback *wb,
* In normal mode, we start background writeout at the lower
* background_thresh, to keep the amount of dirty memory low.
*/
- if (!laptop_mode && nr_reclaimable > gdtc->bg_thresh &&
+ if (!laptop_mode && nr_dirty > gdtc->bg_thresh &&
!writeback_in_progress(wb))
wb_start_background_writeback(wb);

--
2.30.0


2024-04-22 07:53:29

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH v4 1/4] writeback: collect stats of all wb of bdi in bdi_debug_stats_show

/sys/kernel/debug/bdi/xxx/stats is supposed to show writeback information
of whole bdi, but only writeback information of bdi in root cgroup is
collected. So writeback information in non-root cgroup are missing now.
To be more specific, considering following case:

/* create writeback cgroup */
cd /sys/fs/cgroup
echo "+memory +io" > cgroup.subtree_control
mkdir group1
cd group1
echo $$ > cgroup.procs
/* do writeback in cgroup */
fio -name test -filename=/dev/vdb ...
/* get writeback info of bdi */
cat /sys/kernel/debug/bdi/xxx/stats
The cat result unexpectedly implies that there is no writeback on target
bdi.

Fix this by collecting stats of all wb in bdi instead of only wb in
root cgroup.

Following domain hierarchy is tested:
global domain (320G)
/ \
cgroup domain1(10G) cgroup domain2(10G)
| |
bdi wb1 wb2

/* all writeback info of bdi is successfully collected */
cat stats
BdiWriteback: 2912 kB
BdiReclaimable: 1598464 kB
BdiDirtyThresh: 167479028 kB
DirtyThresh: 195038532 kB
BackgroundThresh: 32466728 kB
BdiDirtied: 19141696 kB
BdiWritten: 17543456 kB
BdiWriteBandwidth: 1136172 kBps
b_dirty: 2
b_io: 0
b_more_io: 1
b_dirty_time: 0
bdi_list: 1
state: 1

Signed-off-by: Kemeng Shi <[email protected]>
---
mm/backing-dev.c | 96 ++++++++++++++++++++++++++++++++++++------------
1 file changed, 73 insertions(+), 23 deletions(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 5fa3666356f9..089146feb830 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -39,6 +39,19 @@ struct workqueue_struct *bdi_wq;
#include <linux/debugfs.h>
#include <linux/seq_file.h>

+struct wb_stats {
+ unsigned long nr_dirty;
+ unsigned long nr_io;
+ unsigned long nr_more_io;
+ unsigned long nr_dirty_time;
+ unsigned long nr_writeback;
+ unsigned long nr_reclaimable;
+ unsigned long nr_dirtied;
+ unsigned long nr_written;
+ unsigned long dirty_thresh;
+ unsigned long wb_thresh;
+};
+
static struct dentry *bdi_debug_root;

static void bdi_debug_init(void)
@@ -46,31 +59,68 @@ static void bdi_debug_init(void)
bdi_debug_root = debugfs_create_dir("bdi", NULL);
}

-static int bdi_debug_stats_show(struct seq_file *m, void *v)
+static void collect_wb_stats(struct wb_stats *stats,
+ struct bdi_writeback *wb)
{
- struct backing_dev_info *bdi = m->private;
- struct bdi_writeback *wb = &bdi->wb;
- unsigned long background_thresh;
- unsigned long dirty_thresh;
- unsigned long wb_thresh;
- unsigned long nr_dirty, nr_io, nr_more_io, nr_dirty_time;
struct inode *inode;

- nr_dirty = nr_io = nr_more_io = nr_dirty_time = 0;
spin_lock(&wb->list_lock);
list_for_each_entry(inode, &wb->b_dirty, i_io_list)
- nr_dirty++;
+ stats->nr_dirty++;
list_for_each_entry(inode, &wb->b_io, i_io_list)
- nr_io++;
+ stats->nr_io++;
list_for_each_entry(inode, &wb->b_more_io, i_io_list)
- nr_more_io++;
+ stats->nr_more_io++;
list_for_each_entry(inode, &wb->b_dirty_time, i_io_list)
if (inode->i_state & I_DIRTY_TIME)
- nr_dirty_time++;
+ stats->nr_dirty_time++;
spin_unlock(&wb->list_lock);

+ stats->nr_writeback += wb_stat(wb, WB_WRITEBACK);
+ stats->nr_reclaimable += wb_stat(wb, WB_RECLAIMABLE);
+ stats->nr_dirtied += wb_stat(wb, WB_DIRTIED);
+ stats->nr_written += wb_stat(wb, WB_WRITTEN);
+ stats->wb_thresh += wb_calc_thresh(wb, stats->dirty_thresh);
+}
+
+#ifdef CONFIG_CGROUP_WRITEBACK
+static void bdi_collect_stats(struct backing_dev_info *bdi,
+ struct wb_stats *stats)
+{
+ struct bdi_writeback *wb;
+
+ rcu_read_lock();
+ list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node) {
+ if (!wb_tryget(wb))
+ continue;
+
+ collect_wb_stats(stats, wb);
+ wb_put(wb);
+ }
+ rcu_read_unlock();
+}
+#else
+static void bdi_collect_stats(struct backing_dev_info *bdi,
+ struct wb_stats *stats)
+{
+ collect_wb_stats(stats, &bdi->wb);
+}
+#endif
+
+static int bdi_debug_stats_show(struct seq_file *m, void *v)
+{
+ struct backing_dev_info *bdi = m->private;
+ unsigned long background_thresh;
+ unsigned long dirty_thresh;
+ struct wb_stats stats;
+ unsigned long tot_bw;
+
global_dirty_limits(&background_thresh, &dirty_thresh);
- wb_thresh = wb_calc_thresh(wb, dirty_thresh);
+
+ memset(&stats, 0, sizeof(stats));
+ stats.dirty_thresh = dirty_thresh;
+ bdi_collect_stats(bdi, &stats);
+ tot_bw = atomic_long_read(&bdi->tot_write_bandwidth);

seq_printf(m,
"BdiWriteback: %10lu kB\n"
@@ -87,18 +137,18 @@ static int bdi_debug_stats_show(struct seq_file *m, void *v)
"b_dirty_time: %10lu\n"
"bdi_list: %10u\n"
"state: %10lx\n",
- (unsigned long) K(wb_stat(wb, WB_WRITEBACK)),
- (unsigned long) K(wb_stat(wb, WB_RECLAIMABLE)),
- K(wb_thresh),
+ K(stats.nr_writeback),
+ K(stats.nr_reclaimable),
+ K(stats.wb_thresh),
K(dirty_thresh),
K(background_thresh),
- (unsigned long) K(wb_stat(wb, WB_DIRTIED)),
- (unsigned long) K(wb_stat(wb, WB_WRITTEN)),
- (unsigned long) K(wb->write_bandwidth),
- nr_dirty,
- nr_io,
- nr_more_io,
- nr_dirty_time,
+ K(stats.nr_dirtied),
+ K(stats.nr_written),
+ K(tot_bw),
+ stats.nr_dirty,
+ stats.nr_io,
+ stats.nr_more_io,
+ stats.nr_dirty_time,
!list_empty(&bdi->bdi_list), bdi->wb.state);

return 0;
--
2.30.0


2024-04-22 07:53:43

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH v4 3/4] writeback: add wb_monitor.py script to monitor writeback info on bdi

Add wb_monitor.py script to monitor writeback information on backing dev
which makes it easier and more convenient to observe writeback behaviors
of running system.

The wb_monitor.py script is written based on wq_monitor.py.

Following domain hierarchy is tested:
global domain (320G)
/ \
cgroup domain1(10G) cgroup domain2(10G)
| |
bdi wb1 wb2

The wb_monitor.py script output is as following:
/wb_monitor.py 252:16 -c
writeback reclaimable dirtied written avg_bw
252:16_1 0 0 0 0 102400
252:16_4284 672 820064 9230368 8410304 685612
252:16_4325 896 819840 10491264 9671648 652348
252:16 1568 1639904 19721632 18081952 1440360

writeback reclaimable dirtied written avg_bw
252:16_1 0 0 0 0 102400
252:16_4284 672 820064 9230368 8410304 685612
252:16_4325 896 819840 10491264 9671648 652348
252:16 1568 1639904 19721632 18081952 1440360
..

Signed-off-by: Kemeng Shi <[email protected]>
Suggested-by: Tejun Heo <[email protected]>
---
tools/writeback/wb_monitor.py | 172 ++++++++++++++++++++++++++++++++++
1 file changed, 172 insertions(+)
create mode 100644 tools/writeback/wb_monitor.py

diff --git a/tools/writeback/wb_monitor.py b/tools/writeback/wb_monitor.py
new file mode 100644
index 000000000000..5e3591f1f9a9
--- /dev/null
+++ b/tools/writeback/wb_monitor.py
@@ -0,0 +1,172 @@
+#!/usr/bin/env drgn
+#
+# Copyright (C) 2024 Kemeng Shi <[email protected]>
+# Copyright (C) 2024 Huawei Inc
+
+desc = """
+This is a drgn script based on wq_monitor.py to monitor writeback info on
+backing dev. For more info on drgn, visit https://github.com/osandov/drgn.
+
+ writeback(kB) Amount of dirty pages are currently being written back to
+ disk.
+
+ reclaimable(kB) Amount of pages are currently reclaimable.
+
+ dirtied(kB) Amount of pages have been dirtied.
+
+ wrttien(kB) Amount of dirty pages have been written back to disk.
+
+ avg_wb(kBps) Smoothly estimated write bandwidth of writing dirty pages
+ back to disk.
+"""
+
+import signal
+import re
+import time
+import json
+
+import drgn
+from drgn.helpers.linux.list import list_for_each_entry
+
+import argparse
+parser = argparse.ArgumentParser(description=desc,
+ formatter_class=argparse.RawTextHelpFormatter)
+parser.add_argument('bdi', metavar='REGEX', nargs='*',
+ help='Target backing device name patterns (all if empty)')
+parser.add_argument('-i', '--interval', metavar='SECS', type=float, default=1,
+ help='Monitoring interval (0 to print once and exit)')
+parser.add_argument('-j', '--json', action='store_true',
+ help='Output in json')
+parser.add_argument('-c', '--cgroup', action='store_true',
+ help='show writeback of bdi in cgroup')
+args = parser.parse_args()
+
+bdi_list = prog['bdi_list']
+
+WB_RECLAIMABLE = prog['WB_RECLAIMABLE']
+WB_WRITEBACK = prog['WB_WRITEBACK']
+WB_DIRTIED = prog['WB_DIRTIED']
+WB_WRITTEN = prog['WB_WRITTEN']
+NR_WB_STAT_ITEMS = prog['NR_WB_STAT_ITEMS']
+
+PAGE_SHIFT = prog['PAGE_SHIFT']
+
+def K(x):
+ return x << (PAGE_SHIFT - 10)
+
+class Stats:
+ def dict(self, now):
+ return { 'timestamp' : now,
+ 'name' : self.name,
+ 'writeback' : self.stats[WB_WRITEBACK],
+ 'reclaimable' : self.stats[WB_RECLAIMABLE],
+ 'dirtied' : self.stats[WB_DIRTIED],
+ 'written' : self.stats[WB_WRITTEN],
+ 'avg_wb' : self.avg_bw, }
+
+ def table_header_str():
+ return f'{"":>16} {"writeback":>10} {"reclaimable":>12} ' \
+ f'{"dirtied":>9} {"written":>9} {"avg_bw":>9}'
+
+ def table_row_str(self):
+ out = f'{self.name[-16:]:16} ' \
+ f'{self.stats[WB_WRITEBACK]:10} ' \
+ f'{self.stats[WB_RECLAIMABLE]:12} ' \
+ f'{self.stats[WB_DIRTIED]:9} ' \
+ f'{self.stats[WB_WRITTEN]:9} ' \
+ f'{self.avg_bw:9} '
+ return out
+
+ def show_header():
+ if Stats.table_fmt:
+ print()
+ print(Stats.table_header_str())
+
+ def show_stats(self):
+ if Stats.table_fmt:
+ print(self.table_row_str())
+ else:
+ print(self.dict(Stats.now))
+
+class WbStats(Stats):
+ def __init__(self, wb):
+ bdi_name = wb.bdi.dev_name.string_().decode()
+ # avoid to use bdi.wb.memcg_css which is only defined when
+ # CONFIG_CGROUP_WRITEBACK is enabled
+ if wb == wb.bdi.wb.address_of_():
+ ino = "1"
+ else:
+ ino = str(wb.memcg_css.cgroup.kn.id.value_())
+ self.name = bdi_name + '_' + ino
+
+ self.stats = [0] * NR_WB_STAT_ITEMS
+ for i in range(NR_WB_STAT_ITEMS):
+ if wb.stat[i].count >= 0:
+ self.stats[i] = int(K(wb.stat[i].count))
+ else:
+ self.stats[i] = 0
+
+ self.avg_bw = int(K(wb.avg_write_bandwidth))
+
+class BdiStats(Stats):
+ def __init__(self, bdi):
+ self.name = bdi.dev_name.string_().decode()
+ self.stats = [0] * NR_WB_STAT_ITEMS
+ self.avg_bw = 0
+
+ def collectStats(self, wb_stats):
+ for i in range(NR_WB_STAT_ITEMS):
+ self.stats[i] += wb_stats.stats[i]
+
+ self.avg_bw += wb_stats.avg_bw
+
+exit_req = False
+
+def sigint_handler(signr, frame):
+ global exit_req
+ exit_req = True
+
+def main():
+ # handle args
+ Stats.table_fmt = not args.json
+ interval = args.interval
+ cgroup = args.cgroup
+
+ re_str = None
+ if args.bdi:
+ for r in args.bdi:
+ if re_str is None:
+ re_str = r
+ else:
+ re_str += '|' + r
+
+ filter_re = re.compile(re_str) if re_str else None
+
+ # monitoring loop
+ signal.signal(signal.SIGINT, sigint_handler)
+
+ while not exit_req:
+ Stats.now = time.time()
+
+ Stats.show_header()
+ for bdi in list_for_each_entry('struct backing_dev_info', bdi_list.address_of_(), 'bdi_list'):
+ bdi_stats = BdiStats(bdi)
+ if filter_re and not filter_re.search(bdi_stats.name):
+ continue
+
+ for wb in list_for_each_entry('struct bdi_writeback', bdi.wb_list.address_of_(), 'bdi_node'):
+ wb_stats = WbStats(wb)
+ bdi_stats.collectStats(wb_stats)
+ if cgroup:
+ wb_stats.show_stats()
+
+ bdi_stats.show_stats()
+ if cgroup and Stats.table_fmt:
+ print()
+
+ if interval == 0:
+ break
+ time.sleep(interval)
+
+if __name__ == "__main__":
+ main()
--
2.30.0


2024-04-22 07:54:05

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH v4 2/4] writeback: support retrieving per group debug writeback stats of bdi

Add /sys/kernel/debug/bdi/xxx/wb_stats to show per group writeback stats
of bdi.

Following domain hierarchy is tested:
global domain (320G)
/ \
cgroup domain1(10G) cgroup domain2(10G)
| |
bdi wb1 wb2

/* per wb writeback info of bdi is collected */
cat /sys/kernel/debug/bdi/252:16/wb_stats
WbCgIno: 1
WbWriteback: 0 kB
WbReclaimable: 0 kB
WbDirtyThresh: 0 kB
WbDirtied: 0 kB
WbWritten: 0 kB
WbWriteBandwidth: 102400 kBps
b_dirty: 0
b_io: 0
b_more_io: 0
b_dirty_time: 0
state: 1
WbCgIno: 4094
WbWriteback: 54432 kB
WbReclaimable: 766080 kB
WbDirtyThresh: 3094760 kB
WbDirtied: 1656480 kB
WbWritten: 837088 kB
WbWriteBandwidth: 132772 kBps
b_dirty: 1
b_io: 1
b_more_io: 0
b_dirty_time: 0
state: 7
WbCgIno: 4135
WbWriteback: 15232 kB
WbReclaimable: 786688 kB
WbDirtyThresh: 2909984 kB
WbDirtied: 1482656 kB
WbWritten: 681408 kB
WbWriteBandwidth: 124848 kBps
b_dirty: 0
b_io: 1
b_more_io: 0
b_dirty_time: 0
state: 7

Signed-off-by: Kemeng Shi <[email protected]>
---
include/linux/writeback.h | 1 +
mm/backing-dev.c | 78 ++++++++++++++++++++++++++++++++++++++-
mm/page-writeback.c | 19 ++++++++++
3 files changed, 96 insertions(+), 2 deletions(-)

diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 9845cb62e40b..112d806ddbe4 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -355,6 +355,7 @@ int dirtytime_interval_handler(struct ctl_table *table, int write,

void global_dirty_limits(unsigned long *pbackground, unsigned long *pdirty);
unsigned long wb_calc_thresh(struct bdi_writeback *wb, unsigned long thresh);
+unsigned long cgwb_calc_thresh(struct bdi_writeback *wb);

void wb_update_bandwidth(struct bdi_writeback *wb);

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 089146feb830..6ecd11bdce6e 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -155,19 +155,93 @@ static int bdi_debug_stats_show(struct seq_file *m, void *v)
}
DEFINE_SHOW_ATTRIBUTE(bdi_debug_stats);

+static void wb_stats_show(struct seq_file *m, struct bdi_writeback *wb,
+ struct wb_stats *stats)
+{
+
+ seq_printf(m,
+ "WbCgIno: %10lu\n"
+ "WbWriteback: %10lu kB\n"
+ "WbReclaimable: %10lu kB\n"
+ "WbDirtyThresh: %10lu kB\n"
+ "WbDirtied: %10lu kB\n"
+ "WbWritten: %10lu kB\n"
+ "WbWriteBandwidth: %10lu kBps\n"
+ "b_dirty: %10lu\n"
+ "b_io: %10lu\n"
+ "b_more_io: %10lu\n"
+ "b_dirty_time: %10lu\n"
+ "state: %10lx\n\n",
+ cgroup_ino(wb->memcg_css->cgroup),
+ K(stats->nr_writeback),
+ K(stats->nr_reclaimable),
+ K(stats->wb_thresh),
+ K(stats->nr_dirtied),
+ K(stats->nr_written),
+ K(wb->avg_write_bandwidth),
+ stats->nr_dirty,
+ stats->nr_io,
+ stats->nr_more_io,
+ stats->nr_dirty_time,
+ wb->state);
+}
+
+static int cgwb_debug_stats_show(struct seq_file *m, void *v)
+{
+ struct backing_dev_info *bdi = m->private;
+ unsigned long background_thresh;
+ unsigned long dirty_thresh;
+ struct bdi_writeback *wb;
+ struct wb_stats stats;
+
+ global_dirty_limits(&background_thresh, &dirty_thresh);
+
+ rcu_read_lock();
+ list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node) {
+ struct wb_stats stats = { .dirty_thresh = dirty_thresh };
+
+ if (!wb_tryget(wb))
+ continue;
+
+ collect_wb_stats(&stats, wb);
+
+ /*
+ * Calculate thresh of wb in writeback cgroup which is min of
+ * thresh in global domain and thresh in cgroup domain. Drop
+ * rcu lock because cgwb_calc_thresh may sleep in
+ * cgroup_rstat_flush. We can do so here because we have a ref.
+ */
+ if (mem_cgroup_wb_domain(wb)) {
+ rcu_read_unlock();
+ stats.wb_thresh = min(stats.wb_thresh, cgwb_calc_thresh(wb));
+ rcu_read_lock();
+ }
+
+ wb_stats_show(m, wb, &stats);
+
+ wb_put(wb);
+ }
+ rcu_read_unlock();
+
+ return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(cgwb_debug_stats);
+
static void bdi_debug_register(struct backing_dev_info *bdi, const char *name)
{
bdi->debug_dir = debugfs_create_dir(name, bdi_debug_root);

debugfs_create_file("stats", 0444, bdi->debug_dir, bdi,
&bdi_debug_stats_fops);
+ debugfs_create_file("wb_stats", 0444, bdi->debug_dir, bdi,
+ &cgwb_debug_stats_fops);
}

static void bdi_debug_unregister(struct backing_dev_info *bdi)
{
debugfs_remove_recursive(bdi->debug_dir);
}
-#else
+#else /* CONFIG_DEBUG_FS */
static inline void bdi_debug_init(void)
{
}
@@ -178,7 +252,7 @@ static inline void bdi_debug_register(struct backing_dev_info *bdi,
static inline void bdi_debug_unregister(struct backing_dev_info *bdi)
{
}
-#endif
+#endif /* CONFIG_DEBUG_FS */

static ssize_t read_ahead_kb_store(struct device *dev,
struct device_attribute *attr,
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 3e19b87049db..3bb3bed102ef 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -892,6 +892,25 @@ unsigned long wb_calc_thresh(struct bdi_writeback *wb, unsigned long thresh)
return __wb_calc_thresh(&gdtc);
}

+unsigned long cgwb_calc_thresh(struct bdi_writeback *wb)
+{
+ struct dirty_throttle_control gdtc = { GDTC_INIT_NO_WB };
+ struct dirty_throttle_control mdtc = { MDTC_INIT(wb, &gdtc) };
+ unsigned long filepages = 0, headroom = 0, writeback = 0;
+
+ gdtc.avail = global_dirtyable_memory();
+ gdtc.dirty = global_node_page_state(NR_FILE_DIRTY) +
+ global_node_page_state(NR_WRITEBACK);
+
+ mem_cgroup_wb_stats(wb, &filepages, &headroom,
+ &mdtc.dirty, &writeback);
+ mdtc.dirty += writeback;
+ mdtc_calc_avail(&mdtc, filepages, headroom);
+ domain_dirty_limits(&mdtc);
+
+ return __wb_calc_thresh(&mdtc);
+}
+
/*
* setpoint - dirty 3
* f(dirty) := 1.0 + (----------------)
--
2.30.0


2024-04-22 18:23:38

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH v4 1/4] writeback: collect stats of all wb of bdi in bdi_debug_stats_show

On Tue, Apr 23, 2024 at 12:48:05AM +0800, Kemeng Shi wrote:
> /sys/kernel/debug/bdi/xxx/stats is supposed to show writeback information
> of whole bdi, but only writeback information of bdi in root cgroup is
> collected. So writeback information in non-root cgroup are missing now.
> To be more specific, considering following case:
>
> /* create writeback cgroup */
> cd /sys/fs/cgroup
> echo "+memory +io" > cgroup.subtree_control
> mkdir group1
> cd group1
> echo $$ > cgroup.procs
> /* do writeback in cgroup */
> fio -name test -filename=/dev/vdb ...
> /* get writeback info of bdi */
> cat /sys/kernel/debug/bdi/xxx/stats
> The cat result unexpectedly implies that there is no writeback on target
> bdi.
>
> Fix this by collecting stats of all wb in bdi instead of only wb in
> root cgroup.
>
> Following domain hierarchy is tested:
> global domain (320G)
> / \
> cgroup domain1(10G) cgroup domain2(10G)
> | |
> bdi wb1 wb2
>
> /* all writeback info of bdi is successfully collected */
> cat stats
> BdiWriteback: 2912 kB
> BdiReclaimable: 1598464 kB
> BdiDirtyThresh: 167479028 kB
> DirtyThresh: 195038532 kB
> BackgroundThresh: 32466728 kB
> BdiDirtied: 19141696 kB
> BdiWritten: 17543456 kB
> BdiWriteBandwidth: 1136172 kBps
> b_dirty: 2
> b_io: 0
> b_more_io: 1
> b_dirty_time: 0
> bdi_list: 1
> state: 1
>
> Signed-off-by: Kemeng Shi <[email protected]>

Acked-by: Tejun Heo <[email protected]>

Thanks.

--
tejun

2024-04-22 18:32:50

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH v4 2/4] writeback: support retrieving per group debug writeback stats of bdi

Hello,

On Tue, Apr 23, 2024 at 12:48:06AM +0800, Kemeng Shi wrote:
> Add /sys/kernel/debug/bdi/xxx/wb_stats to show per group writeback stats
> of bdi.
>
> Following domain hierarchy is tested:
> global domain (320G)
> / \
> cgroup domain1(10G) cgroup domain2(10G)
> | |
> bdi wb1 wb2
>
> /* per wb writeback info of bdi is collected */
> cat /sys/kernel/debug/bdi/252:16/wb_stats
> WbCgIno: 1
> WbWriteback: 0 kB
> WbReclaimable: 0 kB
> WbDirtyThresh: 0 kB
> WbDirtied: 0 kB
> WbWritten: 0 kB
> WbWriteBandwidth: 102400 kBps
> b_dirty: 0
> b_io: 0
> b_more_io: 0
> b_dirty_time: 0
> state: 1
> WbCgIno: 4094
> WbWriteback: 54432 kB
> WbReclaimable: 766080 kB
> WbDirtyThresh: 3094760 kB
> WbDirtied: 1656480 kB
> WbWritten: 837088 kB
> WbWriteBandwidth: 132772 kBps
> b_dirty: 1
> b_io: 1
> b_more_io: 0
> b_dirty_time: 0
> state: 7
> WbCgIno: 4135
> WbWriteback: 15232 kB
> WbReclaimable: 786688 kB
> WbDirtyThresh: 2909984 kB
> WbDirtied: 1482656 kB
> WbWritten: 681408 kB
> WbWriteBandwidth: 124848 kBps
> b_dirty: 0
> b_io: 1
> b_more_io: 0
> b_dirty_time: 0
> state: 7
>
> Signed-off-by: Kemeng Shi <[email protected]>

Maybe it'd be useful to delineate each wb better in the output? It's a bit
difficult to parse visually. Other than that,

Acked-by: Tejun Heo <[email protected]>

Thanks.

--
tejun

2024-04-22 18:33:14

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH v4 3/4] writeback: add wb_monitor.py script to monitor writeback info on bdi

On Tue, Apr 23, 2024 at 12:48:07AM +0800, Kemeng Shi wrote:
> Add wb_monitor.py script to monitor writeback information on backing dev
> which makes it easier and more convenient to observe writeback behaviors
> of running system.
>
> The wb_monitor.py script is written based on wq_monitor.py.
>
> Following domain hierarchy is tested:
> global domain (320G)
> / \
> cgroup domain1(10G) cgroup domain2(10G)
> | |
> bdi wb1 wb2
>
> The wb_monitor.py script output is as following:
> ./wb_monitor.py 252:16 -c
> writeback reclaimable dirtied written avg_bw
> 252:16_1 0 0 0 0 102400
> 252:16_4284 672 820064 9230368 8410304 685612
> 252:16_4325 896 819840 10491264 9671648 652348
> 252:16 1568 1639904 19721632 18081952 1440360
>
> writeback reclaimable dirtied written avg_bw
> 252:16_1 0 0 0 0 102400
> 252:16_4284 672 820064 9230368 8410304 685612
> 252:16_4325 896 819840 10491264 9671648 652348
> 252:16 1568 1639904 19721632 18081952 1440360
> ...
>
> Signed-off-by: Kemeng Shi <[email protected]>
> Suggested-by: Tejun Heo <[email protected]>

Acked-by: Tejun Heo <[email protected]>

Thanks.

--
tejun

2024-04-22 18:33:39

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH v4 4/4] writeback: rename nr_reclaimable to nr_dirty in balance_dirty_pages

On Tue, Apr 23, 2024 at 12:48:08AM +0800, Kemeng Shi wrote:
> Commit 8d92890bd6b85 ("mm/writeback: discard NR_UNSTABLE_NFS, use
> NR_WRITEBACK instead") removed NR_UNSTABLE_NFS and nr_reclaimable
> only contains dirty page now.
> Rename nr_reclaimable to nr_dirty properly.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> Reviewed-by: Jan Kara <[email protected]>

Acked-by: Tejun Heo <[email protected]>

Thanks.

--
tejun

2024-04-22 21:02:31

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] Improve visibility of writeback

On Tue, Apr 23, 2024 at 12:48:04AM +0800, Kemeng Shi wrote:

Your clock is a day out! This causes your patches to be sorted
abnormally in my mail client. Please fix.

2024-04-23 00:44:59

by SeongJae Park

[permalink] [raw]
Subject: Re: [PATCH v4 2/4] writeback: support retrieving per group debug writeback stats of bdi

Hi Kemeng,

On Tue, 23 Apr 2024 00:48:06 +0800 Kemeng Shi <[email protected]> wrote:

> Add /sys/kernel/debug/bdi/xxx/wb_stats to show per group writeback stats
> of bdi.
>
> Following domain hierarchy is tested:
> global domain (320G)
> / \
> cgroup domain1(10G) cgroup domain2(10G)
> | |
> bdi wb1 wb2
>
> /* per wb writeback info of bdi is collected */
> cat /sys/kernel/debug/bdi/252:16/wb_stats
> WbCgIno: 1
> WbWriteback: 0 kB
> WbReclaimable: 0 kB
> WbDirtyThresh: 0 kB
> WbDirtied: 0 kB
> WbWritten: 0 kB
> WbWriteBandwidth: 102400 kBps
> b_dirty: 0
> b_io: 0
> b_more_io: 0
> b_dirty_time: 0
> state: 1
> WbCgIno: 4094
> WbWriteback: 54432 kB
> WbReclaimable: 766080 kB
> WbDirtyThresh: 3094760 kB
> WbDirtied: 1656480 kB
> WbWritten: 837088 kB
> WbWriteBandwidth: 132772 kBps
> b_dirty: 1
> b_io: 1
> b_more_io: 0
> b_dirty_time: 0
> state: 7
> WbCgIno: 4135
> WbWriteback: 15232 kB
> WbReclaimable: 786688 kB
> WbDirtyThresh: 2909984 kB
> WbDirtied: 1482656 kB
> WbWritten: 681408 kB
> WbWriteBandwidth: 124848 kBps
> b_dirty: 0
> b_io: 1
> b_more_io: 0
> b_dirty_time: 0
> state: 7
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> include/linux/writeback.h | 1 +
> mm/backing-dev.c | 78 ++++++++++++++++++++++++++++++++++++++-
> mm/page-writeback.c | 19 ++++++++++
> 3 files changed, 96 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index 9845cb62e40b..112d806ddbe4 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -355,6 +355,7 @@ int dirtytime_interval_handler(struct ctl_table *table, int write,
>
> void global_dirty_limits(unsigned long *pbackground, unsigned long *pdirty);
> unsigned long wb_calc_thresh(struct bdi_writeback *wb, unsigned long thresh);
> +unsigned long cgwb_calc_thresh(struct bdi_writeback *wb);
>
> void wb_update_bandwidth(struct bdi_writeback *wb);
>
> diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> index 089146feb830..6ecd11bdce6e 100644
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -155,19 +155,93 @@ static int bdi_debug_stats_show(struct seq_file *m, void *v)
> }
> DEFINE_SHOW_ATTRIBUTE(bdi_debug_stats);
>
> +static void wb_stats_show(struct seq_file *m, struct bdi_writeback *wb,
> + struct wb_stats *stats)
> +{
> +
> + seq_printf(m,
> + "WbCgIno: %10lu\n"
> + "WbWriteback: %10lu kB\n"
> + "WbReclaimable: %10lu kB\n"
> + "WbDirtyThresh: %10lu kB\n"
> + "WbDirtied: %10lu kB\n"
> + "WbWritten: %10lu kB\n"
> + "WbWriteBandwidth: %10lu kBps\n"
> + "b_dirty: %10lu\n"
> + "b_io: %10lu\n"
> + "b_more_io: %10lu\n"
> + "b_dirty_time: %10lu\n"
> + "state: %10lx\n\n",
> + cgroup_ino(wb->memcg_css->cgroup),

I'm getting below kunit build failure from the latest mm-unstable tree, and
'git bisect' points this patch.

ERROR:root:.../linux/mm/backing-dev.c: In function ‘wb_stats_show’:
.../linux/mm/backing-dev.c:175:20: error: implicit declaration of function ‘cgroup_ino’; did you mean ‘cgroup_init’? [-Werror=implicit-function-declaration]
175 | cgroup_ino(wb->memcg_css->cgroup),
| ^~~~~~~~~~
| cgroup_init
.../linux/mm/backing-dev.c:175:33: error: ‘struct bdi_writeback’ has no member named ‘memcg_css’
175 | cgroup_ino(wb->memcg_css->cgroup),
| ^~

The kunit build config is not having CONFIG_CGROUPS. I guess we need to check
the case? I confirmed below dumb change is fixing the issue, but I guess it
could be cleaner. May I ask your opinion?

--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -160,7 +160,9 @@ static void wb_stats_show(struct seq_file *m, struct bdi_writeback *wb,
{

seq_printf(m,
+#ifdef CONFIG_CGROUPS
"WbCgIno: %10lu\n"
+#endif
"WbWriteback: %10lu kB\n"
"WbReclaimable: %10lu kB\n"
"WbDirtyThresh: %10lu kB\n"
@@ -172,7 +174,9 @@ static void wb_stats_show(struct seq_file *m, struct bdi_writeback *wb,
"b_more_io: %10lu\n"
"b_dirty_time: %10lu\n"
"state: %10lx\n\n",
+#ifdef CONFIG_CGROUPS
cgroup_ino(wb->memcg_css->cgroup),
+#endif
K(stats->nr_writeback),
K(stats->nr_reclaimable),
K(stats->wb_thresh),


> + K(stats->nr_writeback),
> + K(stats->nr_reclaimable),
> + K(stats->wb_thresh),
> + K(stats->nr_dirtied),
> + K(stats->nr_written),
> + K(wb->avg_write_bandwidth),
> + stats->nr_dirty,
> + stats->nr_io,
> + stats->nr_more_io,
> + stats->nr_dirty_time,
> + wb->state);
> +}
> +
> +static int cgwb_debug_stats_show(struct seq_file *m, void *v)
> +{
> + struct backing_dev_info *bdi = m->private;
> + unsigned long background_thresh;
> + unsigned long dirty_thresh;
> + struct bdi_writeback *wb;
> + struct wb_stats stats;

Kunit build also shows below warning:

.../linux/mm/backing-dev.c: In function ‘cgwb_debug_stats_show’:
.../linux/mm/backing-dev.c:195:25: warning: unused variable ‘stats’ [-Wunused-variable]
195 | struct wb_stats stats;
| ^~~~~

I guess above line can simply removed?

> +
> + global_dirty_limits(&background_thresh, &dirty_thresh);
> +
> + rcu_read_lock();
> + list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node) {
> + struct wb_stats stats = { .dirty_thresh = dirty_thresh };
> +
> + if (!wb_tryget(wb))
> + continue;
> +
> + collect_wb_stats(&stats, wb);
> +
> + /*
> + * Calculate thresh of wb in writeback cgroup which is min of
> + * thresh in global domain and thresh in cgroup domain. Drop
> + * rcu lock because cgwb_calc_thresh may sleep in
> + * cgroup_rstat_flush. We can do so here because we have a ref.
> + */
> + if (mem_cgroup_wb_domain(wb)) {
> + rcu_read_unlock();
> + stats.wb_thresh = min(stats.wb_thresh, cgwb_calc_thresh(wb));
> + rcu_read_lock();
> + }
> +
> + wb_stats_show(m, wb, &stats);
> +
> + wb_put(wb);
> + }
> + rcu_read_unlock();
> +
> + return 0;
> +}


Thanks,
SJ

[...]

2024-04-23 01:18:57

by Kemeng Shi

[permalink] [raw]
Subject: Re: [PATCH v4 2/4] writeback: support retrieving per group debug writeback stats of bdi



on 4/23/2024 2:32 AM, Tejun Heo wrote:
> Hello,
>
> On Tue, Apr 23, 2024 at 12:48:06AM +0800, Kemeng Shi wrote:
>> Add /sys/kernel/debug/bdi/xxx/wb_stats to show per group writeback stats
>> of bdi.
>>
>> Following domain hierarchy is tested:
>> global domain (320G)
>> / \
>> cgroup domain1(10G) cgroup domain2(10G)
>> | |
>> bdi wb1 wb2
>>
>> /* per wb writeback info of bdi is collected */
>> cat /sys/kernel/debug/bdi/252:16/wb_stats
>> WbCgIno: 1
>> WbWriteback: 0 kB
>> WbReclaimable: 0 kB
>> WbDirtyThresh: 0 kB
>> WbDirtied: 0 kB
>> WbWritten: 0 kB
>> WbWriteBandwidth: 102400 kBps
>> b_dirty: 0
>> b_io: 0
>> b_more_io: 0
>> b_dirty_time: 0
>> state: 1
>> WbCgIno: 4094
>> WbWriteback: 54432 kB
>> WbReclaimable: 766080 kB
>> WbDirtyThresh: 3094760 kB
>> WbDirtied: 1656480 kB
>> WbWritten: 837088 kB
>> WbWriteBandwidth: 132772 kBps
>> b_dirty: 1
>> b_io: 1
>> b_more_io: 0
>> b_dirty_time: 0
>> state: 7
>> WbCgIno: 4135
>> WbWriteback: 15232 kB
>> WbReclaimable: 786688 kB
>> WbDirtyThresh: 2909984 kB
>> WbDirtied: 1482656 kB
>> WbWritten: 681408 kB
>> WbWriteBandwidth: 124848 kBps
>> b_dirty: 0
>> b_io: 1
>> b_more_io: 0
>> b_dirty_time: 0
>> state: 7
>>
>> Signed-off-by: Kemeng Shi <[email protected]>
>
> Maybe it'd be useful to delineate each wb better in the output? It's a bit
> difficult to parse visually. Other than that,
Sorry for this. A blank line was added at end of each wb in this patch but I
forgot to update the changelog.
As there is some kunit build problem reported, I will correct changelog with
the fix to build problem.
Thanks a lot for review.

Kemeng
>
> Acked-by: Tejun Heo <[email protected]>
>
> Thanks.
>


2024-04-23 01:29:24

by Kemeng Shi

[permalink] [raw]
Subject: Re: [PATCH v4 2/4] writeback: support retrieving per group debug writeback stats of bdi


Hello
on 4/23/2024 8:44 AM, SeongJae Park wrote:
> Hi Kemeng,
>
> On Tue, 23 Apr 2024 00:48:06 +0800 Kemeng Shi <[email protected]> wrote:
>
>> Add /sys/kernel/debug/bdi/xxx/wb_stats to show per group writeback stats
>> of bdi.
>>
>> Following domain hierarchy is tested:
>> global domain (320G)
>> / \
>> cgroup domain1(10G) cgroup domain2(10G)
>> | |
>> bdi wb1 wb2
>>
>> /* per wb writeback info of bdi is collected */
>> cat /sys/kernel/debug/bdi/252:16/wb_stats
>> WbCgIno: 1
>> WbWriteback: 0 kB
>> WbReclaimable: 0 kB
>> WbDirtyThresh: 0 kB
>> WbDirtied: 0 kB
>> WbWritten: 0 kB
>> WbWriteBandwidth: 102400 kBps
>> b_dirty: 0
>> b_io: 0
>> b_more_io: 0
>> b_dirty_time: 0
>> state: 1
>> WbCgIno: 4094
>> WbWriteback: 54432 kB
>> WbReclaimable: 766080 kB
>> WbDirtyThresh: 3094760 kB
>> WbDirtied: 1656480 kB
>> WbWritten: 837088 kB
>> WbWriteBandwidth: 132772 kBps
>> b_dirty: 1
>> b_io: 1
>> b_more_io: 0
>> b_dirty_time: 0
>> state: 7
>> WbCgIno: 4135
>> WbWriteback: 15232 kB
>> WbReclaimable: 786688 kB
>> WbDirtyThresh: 2909984 kB
>> WbDirtied: 1482656 kB
>> WbWritten: 681408 kB
>> WbWriteBandwidth: 124848 kBps
>> b_dirty: 0
>> b_io: 1
>> b_more_io: 0
>> b_dirty_time: 0
>> state: 7
>>
>> Signed-off-by: Kemeng Shi <[email protected]>
>> ---
>> include/linux/writeback.h | 1 +
>> mm/backing-dev.c | 78 ++++++++++++++++++++++++++++++++++++++-
>> mm/page-writeback.c | 19 ++++++++++
>> 3 files changed, 96 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
>> index 9845cb62e40b..112d806ddbe4 100644
>> --- a/include/linux/writeback.h
>> +++ b/include/linux/writeback.h
>> @@ -355,6 +355,7 @@ int dirtytime_interval_handler(struct ctl_table *table, int write,
>>
>> void global_dirty_limits(unsigned long *pbackground, unsigned long *pdirty);
>> unsigned long wb_calc_thresh(struct bdi_writeback *wb, unsigned long thresh);
>> +unsigned long cgwb_calc_thresh(struct bdi_writeback *wb);
>>
>> void wb_update_bandwidth(struct bdi_writeback *wb);
>>
>> diff --git a/mm/backing-dev.c b/mm/backing-dev.c
>> index 089146feb830..6ecd11bdce6e 100644
>> --- a/mm/backing-dev.c
>> +++ b/mm/backing-dev.c
>> @@ -155,19 +155,93 @@ static int bdi_debug_stats_show(struct seq_file *m, void *v)
>> }
>> DEFINE_SHOW_ATTRIBUTE(bdi_debug_stats);
>>
>> +static void wb_stats_show(struct seq_file *m, struct bdi_writeback *wb,
>> + struct wb_stats *stats)
>> +{
>> +
>> + seq_printf(m,
>> + "WbCgIno: %10lu\n"
>> + "WbWriteback: %10lu kB\n"
>> + "WbReclaimable: %10lu kB\n"
>> + "WbDirtyThresh: %10lu kB\n"
>> + "WbDirtied: %10lu kB\n"
>> + "WbWritten: %10lu kB\n"
>> + "WbWriteBandwidth: %10lu kBps\n"
>> + "b_dirty: %10lu\n"
>> + "b_io: %10lu\n"
>> + "b_more_io: %10lu\n"
>> + "b_dirty_time: %10lu\n"
>> + "state: %10lx\n\n",
>> + cgroup_ino(wb->memcg_css->cgroup),
>
> I'm getting below kunit build failure from the latest mm-unstable tree, and
> 'git bisect' points this patch.
>
> ERROR:root:.../linux/mm/backing-dev.c: In function ‘wb_stats_show’:
> .../linux/mm/backing-dev.c:175:20: error: implicit declaration of function ‘cgroup_ino’; did you mean ‘cgroup_init’? [-Werror=implicit-function-declaration]
> 175 | cgroup_ino(wb->memcg_css->cgroup),
> | ^~~~~~~~~~
> | cgroup_init
> .../linux/mm/backing-dev.c:175:33: error: ‘struct bdi_writeback’ has no member named ‘memcg_css’
> 175 | cgroup_ino(wb->memcg_css->cgroup),
> | ^~
>
> The kunit build config is not having CONFIG_CGROUPS. I guess we need to check
> the case? I confirmed below dumb change is fixing the issue, but I guess it
> could be cleaner. May I ask your opinion?
Thanks for report the issue. As discussed before, we try to keep the same output
whether CGROUP is enable or not. So we'd better show cgroup number 0 for bdi->wb
instead remove cgroup number from output.
>
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -160,7 +160,9 @@ static void wb_stats_show(struct seq_file *m, struct bdi_writeback *wb,
> {
>
> seq_printf(m,
> +#ifdef CONFIG_CGROUPS
> "WbCgIno: %10lu\n"
> +#endif
> "WbWriteback: %10lu kB\n"
> "WbReclaimable: %10lu kB\n"
> "WbDirtyThresh: %10lu kB\n"
> @@ -172,7 +174,9 @@ static void wb_stats_show(struct seq_file *m, struct bdi_writeback *wb,
> "b_more_io: %10lu\n"
> "b_dirty_time: %10lu\n"
> "state: %10lx\n\n",
> +#ifdef CONFIG_CGROUPS
> cgroup_ino(wb->memcg_css->cgroup),
> +#endif
> K(stats->nr_writeback),
> K(stats->nr_reclaimable),
> K(stats->wb_thresh),
>
>
>> + K(stats->nr_writeback),
>> + K(stats->nr_reclaimable),
>> + K(stats->wb_thresh),
>> + K(stats->nr_dirtied),
>> + K(stats->nr_written),
>> + K(wb->avg_write_bandwidth),
>> + stats->nr_dirty,
>> + stats->nr_io,
>> + stats->nr_more_io,
>> + stats->nr_dirty_time,
>> + wb->state);
>> +}
>> +
>> +static int cgwb_debug_stats_show(struct seq_file *m, void *v)
>> +{
>> + struct backing_dev_info *bdi = m->private;
>> + unsigned long background_thresh;
>> + unsigned long dirty_thresh;
>> + struct bdi_writeback *wb;
>> + struct wb_stats stats;
>
> Kunit build also shows below warning:
>
> .../linux/mm/backing-dev.c: In function ‘cgwb_debug_stats_show’:
> .../linux/mm/backing-dev.c:195:25: warning: unused variable ‘stats’ [-Wunused-variable]
> 195 | struct wb_stats stats;
> | ^~~~~
>
> I guess above line can simply removed?
Yes, this could be simply removed.

I'd submit a new series to fix this very soon.

Thanks.
>
>> +
>> + global_dirty_limits(&background_thresh, &dirty_thresh);
>> +
>> + rcu_read_lock();
>> + list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node) {
>> + struct wb_stats stats = { .dirty_thresh = dirty_thresh };
>> +
>> + if (!wb_tryget(wb))
>> + continue;
>> +
>> + collect_wb_stats(&stats, wb);
>> +
>> + /*
>> + * Calculate thresh of wb in writeback cgroup which is min of
>> + * thresh in global domain and thresh in cgroup domain. Drop
>> + * rcu lock because cgwb_calc_thresh may sleep in
>> + * cgroup_rstat_flush. We can do so here because we have a ref.
>> + */
>> + if (mem_cgroup_wb_domain(wb)) {
>> + rcu_read_unlock();
>> + stats.wb_thresh = min(stats.wb_thresh, cgwb_calc_thresh(wb));
>> + rcu_read_lock();
>> + }
>> +
>> + wb_stats_show(m, wb, &stats);
>> +
>> + wb_put(wb);
>> + }
>> + rcu_read_unlock();
>> +
>> + return 0;
>> +}
>
>
> Thanks,
> SJ
>
> [...]
>