2010-08-05 00:44:08

by Michael Rubin

Subject: [PATCH 0/2] Adding two writeback files in /proc/sys/vm

Patch #1 sets up some helper functions for accounting.

Patch #2 adds writeback files for visibility

To help developers and applications gain visibility into writeback
behaviour, this series adds two read-only sysctl files to /proc/sys/vm.
These files allow user apps to understand writeback behaviour over time
and learn how it is impacting their performance.

# cat /proc/sys/vm/pages_dirtied
3747
# cat /proc/sys/vm/pages_entered_writeback
3618

These two new files are necessary to give visibility into writeback
behaviour. We have /proc/diskstats, which lets us understand the IO in
the block layer. We have blktrace for more in-depth understanding. We have
e2fsprogs and debugfs to give insight into the file system's behaviour,
but we don't offer our users the ability to understand what writeback is
doing. There is no way to know how active it is over the whole system,
whether it's falling behind, or to quantify its efforts. With these values
exported, users can easily see how much data applications are sending
through writeback and at what rate writeback is processing this
data. Comparing the rates of change between the two allows developers
to see when writeback is not able to keep up with incoming traffic and
the rate of dirty memory being sent to the IO back end. This allows
folks to understand their IO workloads and track kernel issues. Non-kernel
engineers at Google often use these counters to solve puzzling
performance problems.
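
As a rough illustration of the rate comparison described above, here is a
minimal Python sketch. It assumes the counters can be read as "name value"
lines (the /proc/vmstat layout, where patch 2/2 names them
nr_file_pages_dirtied and nr_pages_entered_writeback). The helper names and
the sample numbers are hypothetical, not part of the patch.

```python
def parse_counters(text):
    """Parse 'name value' lines (the /proc/vmstat layout) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def writeback_rates(before, after, interval_s):
    """Return (pages dirtied per second, pages entering writeback per second)."""
    dirtied = after["nr_file_pages_dirtied"] - before["nr_file_pages_dirtied"]
    entered = (after["nr_pages_entered_writeback"]
               - before["nr_pages_entered_writeback"])
    return dirtied / interval_s, entered / interval_s


# Made-up samples, assumed to be taken 2 seconds apart.
sample_t0 = "nr_file_pages_dirtied 3747\nnr_pages_entered_writeback 3618\n"
sample_t1 = "nr_file_pages_dirtied 5747\nnr_pages_entered_writeback 4618\n"

dirty_rate, wb_rate = writeback_rates(
    parse_counters(sample_t0), parse_counters(sample_t1), 2.0)
print(dirty_rate, wb_rate)
```

If the dirtying rate stays above the writeback rate across many samples,
writeback is likely falling behind the incoming traffic.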


Michael Rubin (2):
mm: helper functions for dirty and writeback accounting
writeback: Adding pages_dirtied and pages_entered_writeback

Documentation/sysctl/vm.txt | 20 +++++++++++++++---
drivers/base/node.c | 14 +++++++++++++
fs/ceph/addr.c | 8 +-----
fs/nilfs2/segment.c | 2 +-
include/linux/mm.h | 1 +
include/linux/mmzone.h | 2 +
include/linux/writeback.h | 9 ++++++++
kernel/sysctl.c | 14 +++++++++++++
mm/page-writeback.c | 45 ++++++++++++++++++++++++++++++++++++++++--
mm/vmstat.c | 2 +
10 files changed, 103 insertions(+), 14 deletions(-)


2010-08-05 00:43:47

by Michael Rubin

Subject: [PATCH 1/2] mm: helper functions for dirty and writeback accounting

Export account_page_dirtied and add a symmetric routine,
account_page_writeback.

This allows code outside of the mm core to safely manipulate page state
and not worry about the other accounting. Not using these routines means
that some code will lose track of the accounting and we get bugs. This
has happened once already.

Signed-off-by: Michael Rubin <[email protected]>
---
fs/ceph/addr.c | 8 ++------
fs/nilfs2/segment.c | 2 +-
include/linux/mm.h | 1 +
mm/page-writeback.c | 15 +++++++++++++++
4 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index d9c60b8..359aa3a 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -106,12 +106,8 @@ static int ceph_set_page_dirty(struct page *page)
if (page->mapping) { /* Race with truncate? */
WARN_ON_ONCE(!PageUptodate(page));

- if (mapping_cap_account_dirty(mapping)) {
- __inc_zone_page_state(page, NR_FILE_DIRTY);
- __inc_bdi_stat(mapping->backing_dev_info,
- BDI_RECLAIMABLE);
- task_io_account_write(PAGE_CACHE_SIZE);
- }
+ if (mapping_cap_account_dirty(mapping))
+ account_page_dirtied(page, page->mapping);
radix_tree_tag_set(&mapping->page_tree,
page_index(page), PAGECACHE_TAG_DIRTY);

diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index c920164..967ed7d 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -1599,7 +1599,7 @@ nilfs_copy_replace_page_buffers(struct page *page, struct list_head *out)
kunmap_atomic(kaddr, KM_USER0);

if (!TestSetPageWriteback(clone_page))
- inc_zone_page_state(clone_page, NR_WRITEBACK);
+ account_page_writeback(clone_page, page_mapping(clone_page));
unlock_page(clone_page);

return 0;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index a2b4804..b138392 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -855,6 +855,7 @@ int __set_page_dirty_no_writeback(struct page *page);
int redirty_page_for_writepage(struct writeback_control *wbc,
struct page *page);
void account_page_dirtied(struct page *page, struct address_space *mapping);
+void account_page_writeback(struct page *page, struct address_space *mapping);
int set_page_dirty(struct page *page);
int set_page_dirty_lock(struct page *page);
int clear_page_dirty_for_io(struct page *page);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 37498ef..b8e7b3b 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1096,6 +1096,21 @@ void account_page_dirtied(struct page *page, struct address_space *mapping)
task_io_account_write(PAGE_CACHE_SIZE);
}
}
+EXPORT_SYMBOL(account_page_dirtied);
+
+/*
+ * Helper function for set_page_writeback family.
+ * NOTE: Unlike account_page_dirtied this does not rely on being atomic
+ * wrt interrupts.
+ */
+
+void account_page_writeback(struct page *page, struct address_space *mapping)
+{
+ if (mapping_cap_account_dirty(mapping))
+ inc_zone_page_state(page, NR_WRITEBACK);
+}
+EXPORT_SYMBOL(account_page_writeback);
+

/*
* For address_spaces which do not use buffers. Just tag the page as dirty in
--
1.7.1

2010-08-05 00:44:13

by Michael Rubin

Subject: [PATCH 2/2] writeback: Adding pages_dirtied and pages_entered_writeback

To help developers and applications gain visibility into writeback
behaviour, this patch adds two read-only sysctl files to /proc/sys/vm.
These files allow user apps to understand writeback behaviour over time
and learn how it is impacting their performance.

# cat /proc/sys/vm/pages_dirtied
3747
# cat /proc/sys/vm/pages_entered_writeback
3618

Documentation/sysctl/vm.txt has been updated.

In order to track the "dirtied" and "entered writeback" counts we added
two zone_stat_items. Per memory node stats have been added as well, so we
can see per node granularity:

# cat /sys/devices/system/node/node20/writebackstat
Node 20 pages_writeback: 0 times
Node 20 pages_dirtied: 0 times

Signed-off-by: Michael Rubin <[email protected]>
---
Documentation/sysctl/vm.txt | 20 ++++++++++++++++----
drivers/base/node.c | 14 ++++++++++++++
include/linux/mmzone.h | 2 ++
include/linux/writeback.h | 9 +++++++++
kernel/sysctl.c | 14 ++++++++++++++
mm/page-writeback.c | 36 ++++++++++++++++++++++++++++++------
mm/vmstat.c | 2 ++
7 files changed, 87 insertions(+), 10 deletions(-)

diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index 5fdbb61..de9ec6a 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -50,6 +50,8 @@ Currently, these files are in /proc/sys/vm:
- overcommit_memory
- overcommit_ratio
- page-cluster
+- pages_dirtied
+- pages_entered_writeback
- panic_on_oom
- percpu_pagelist_fraction
- stat_interval
@@ -425,10 +427,7 @@ See Documentation/vm/hugetlbpage.txt
nr_pdflush_threads

The current number of pdflush threads. This value is read-only.
-The value changes according to the number of dirty pages in the system.
-
-When necessary, additional pdflush threads are created, one per second, up to
-nr_pdflush_threads_max.
+This value is obsolete.

==============================================================

@@ -582,6 +581,19 @@ swap-intensive.

=============================================================

+pages_dirtied
+
+Number of pages that have ever been dirtied since boot.
+This value is read-only.
+
+=============================================================
+
+pages_entered_writeback
+
+Number of pages that have been moved from dirty to writeback since boot.
+This is only a count of file pages. This value is read-only.
+
+=============================================================
panic_on_oom

This enables or disables panic on out-of-memory feature.
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 2bdd8a9..b321d32 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -160,6 +160,18 @@ static ssize_t node_read_numastat(struct sys_device * dev,
}
static SYSDEV_ATTR(numastat, S_IRUGO, node_read_numastat, NULL);

+static ssize_t node_read_writebackstat(struct sys_device *dev,
+ struct sysdev_attribute *attr, char *buf)
+{
+ int nid = dev->id;
+ return sprintf(buf,
+ "Node %d pages_writeback: %lu times\n"
+ "Node %d pages_dirtied: %lu times\n",
+ nid, node_page_state(nid, NR_PAGES_ENTERED_WRITEBACK),
+ nid, node_page_state(nid, NR_FILE_PAGES_DIRTIED));
+}
+static SYSDEV_ATTR(writebackstat, S_IRUGO, node_read_writebackstat, NULL);
+
static ssize_t node_read_distance(struct sys_device * dev,
struct sysdev_attribute *attr, char * buf)
{
@@ -243,6 +255,7 @@ int register_node(struct node *node, int num, struct node *parent)
sysdev_create_file(&node->sysdev, &attr_meminfo);
sysdev_create_file(&node->sysdev, &attr_numastat);
sysdev_create_file(&node->sysdev, &attr_distance);
+ sysdev_create_file(&node->sysdev, &attr_writebackstat);

scan_unevictable_register_node(node);

@@ -267,6 +280,7 @@ void unregister_node(struct node *node)
sysdev_remove_file(&node->sysdev, &attr_meminfo);
sysdev_remove_file(&node->sysdev, &attr_numastat);
sysdev_remove_file(&node->sysdev, &attr_distance);
+ sysdev_remove_file(&node->sysdev, &attr_writebackstat);

scan_unevictable_unregister_node(node);
hugetlb_unregister_node(node); /* no-op, if memoryless node */
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index b4d109e..c0cd2bd 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -112,6 +112,8 @@ enum zone_stat_item {
NUMA_LOCAL, /* allocation from local node */
NUMA_OTHER, /* allocation from other node */
#endif
+ NR_PAGES_ENTERED_WRITEBACK, /* number of times pages enter writeback */
+ NR_FILE_PAGES_DIRTIED, /* number of times pages get dirtied */
NR_VM_ZONE_STAT_ITEMS };

/*
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index c24eca7..2d47afb 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -99,6 +99,8 @@ extern int dirty_background_ratio;
extern unsigned long dirty_background_bytes;
extern int vm_dirty_ratio;
extern unsigned long vm_dirty_bytes;
+extern unsigned long vm_pages_dirtied;
+extern unsigned long vm_pages_entered_writeback;
extern unsigned int dirty_writeback_interval;
extern unsigned int dirty_expire_interval;
extern int vm_highmem_is_dirtyable;
@@ -120,6 +122,13 @@ extern int dirty_bytes_handler(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp,
loff_t *ppos);

+extern int pages_dirtied_handler(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp,
+ loff_t *ppos);
+extern int pages_entered_writeback_handler(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp,
+ loff_t *ppos);
+
struct ctl_table;
int dirty_writeback_centisecs_handler(struct ctl_table *, int,
void __user *, size_t *, loff_t *);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index d24f761..33c3589 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1053,6 +1053,20 @@ static struct ctl_table vm_table[] = {
.proc_handler = proc_dointvec,
},
{
+ .procname = "pages_dirtied",
+ .data = &vm_pages_dirtied,
+ .maxlen = sizeof(vm_pages_dirtied),
+ .mode = 0444 /* read-only */,
+ .proc_handler = pages_dirtied_handler,
+ },
+ {
+ .procname = "pages_entered_writeback",
+ .data = &vm_pages_entered_writeback,
+ .maxlen = sizeof(vm_pages_entered_writeback),
+ .mode = 0444 /* read-only */,
+ .proc_handler = pages_entered_writeback_handler,
+ },
+ {
.procname = "nr_pdflush_threads",
.data = &nr_pdflush_threads,
.maxlen = sizeof nr_pdflush_threads,
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index b8e7b3b..4ed5dec 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -95,6 +95,14 @@ unsigned int dirty_writeback_interval = 5 * 100; /* centiseconds */
*/
unsigned int dirty_expire_interval = 30 * 100; /* centiseconds */

+
+/*
+ * Number of pages dirtied and entered writeback state
+ */
+
+unsigned long vm_pages_dirtied;
+unsigned long vm_pages_entered_writeback;
+
/*
* Flag that makes the machine dump writes/reads and block dirtyings.
*/
@@ -196,7 +204,6 @@ int dirty_ratio_handler(struct ctl_table *table, int write,
return ret;
}

-
int dirty_bytes_handler(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp,
loff_t *ppos)
@@ -212,6 +219,23 @@ int dirty_bytes_handler(struct ctl_table *table, int write,
return ret;
}

+int pages_dirtied_handler(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp,
+ loff_t *ppos)
+{
+ vm_pages_dirtied = global_page_state(NR_FILE_PAGES_DIRTIED);
+ return proc_doulongvec_minmax(table, write, buffer, lenp, ppos);
+}
+
+int pages_entered_writeback_handler(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp,
+ loff_t *ppos)
+{
+ vm_pages_entered_writeback =
+ global_page_state(NR_PAGES_ENTERED_WRITEBACK);
+ return proc_doulongvec_minmax(table, write, buffer, lenp, ppos);
+}
+
/*
* Increment the BDI's writeout completion count and the global writeout
* completion count. Called from test_clear_page_writeback().
@@ -1091,6 +1115,7 @@ void account_page_dirtied(struct page *page, struct address_space *mapping)
{
if (mapping_cap_account_dirty(mapping)) {
__inc_zone_page_state(page, NR_FILE_DIRTY);
+ __inc_zone_page_state(page, NR_FILE_PAGES_DIRTIED);
__inc_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
task_dirty_inc(current);
task_io_account_write(PAGE_CACHE_SIZE);
@@ -1103,15 +1128,15 @@ EXPORT_SYMBOL(account_page_dirtied);
* NOTE: Unlike account_page_dirtied this does not rely on being atomic
* wrt interrupts.
*/
-
void account_page_writeback(struct page *page, struct address_space *mapping)
{
- if (mapping_cap_account_dirty(mapping))
+ if (mapping_cap_account_dirty(mapping)) {
inc_zone_page_state(page, NR_WRITEBACK);
+ inc_zone_page_state(page, NR_PAGES_ENTERED_WRITEBACK);
+ }
}
EXPORT_SYMBOL(account_page_writeback);

-
/*
* For address_spaces which do not use buffers. Just tag the page as dirty in
* its radix tree.
@@ -1347,9 +1372,8 @@ int test_set_page_writeback(struct page *page)
ret = TestSetPageWriteback(page);
}
if (!ret)
- inc_zone_page_state(page, NR_WRITEBACK);
+ account_page_writeback(page, mapping);
return ret;
-
}
EXPORT_SYMBOL(test_set_page_writeback);

diff --git a/mm/vmstat.c b/mm/vmstat.c
index 7759941..e177a40 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -740,6 +740,8 @@ static const char * const vmstat_text[] = {
"numa_local",
"numa_other",
#endif
+ "nr_pages_entered_writeback",
+ "nr_file_pages_dirtied",

#ifdef CONFIG_VM_EVENT_COUNTERS
"pgpgin",
--
1.7.1

2010-08-05 20:25:29

by Andrew Morton

Subject: Re: [PATCH 2/2] writeback: Adding pages_dirtied and pages_entered_writeback

On Wed, 4 Aug 2010 17:43:24 -0700
Michael Rubin <[email protected]> wrote:

> To help developers and applications gain visibility into writeback
> behaviour adding four read only sysctl files into /proc/sys/vm.
> These files allow user apps to understand writeback behaviour over time
> and learn how it is impacting their performance.
>
> # cat /proc/sys/vm/pages_dirtied
> 3747
> # cat /proc/sys/vm/pages_entered_writeback
> 3618
>
> Documentation/vm.txt has been updated.
>
> In order to track the "cleaned" and "dirtied" counts we added two
> vm_stat_items. Per memory node stats have been added also. So we can
> see per node granularity:
>
> # cat /sys/devices/system/node/node20/writebackstat
> Node 20 pages_writeback: 0 times
> Node 20 pages_dirtied: 0 times
>
> ...
>
> @@ -1091,6 +1115,7 @@ void account_page_dirtied(struct page *page, struct address_space *mapping)
> {
> if (mapping_cap_account_dirty(mapping)) {
> __inc_zone_page_state(page, NR_FILE_DIRTY);
> + __inc_zone_page_state(page, NR_FILE_PAGES_DIRTIED);
> __inc_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
> task_dirty_inc(current);
> task_io_account_write(PAGE_CACHE_SIZE);

I hope the utility of this change is worth the overhead :(

> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -740,6 +740,8 @@ static const char * const vmstat_text[] = {
> "numa_local",
> "numa_other",
> #endif
> + "nr_pages_entered_writeback",
> + "nr_file_pages_dirtied",
>

Wait. These counters appear in /proc/vmstat. So why create standalone
/proc/sys/vm files as well?

2010-08-05 22:06:24

by Michael Rubin

Subject: Re: [PATCH 2/2] writeback: Adding pages_dirtied and pages_entered_writeback

On Thu, Aug 5, 2010 at 1:24 PM, Andrew Morton <[email protected]> wrote:
> On Wed, 4 Aug 2010 17:43:24 -0700
> Michael Rubin <[email protected]> wrote:
> Wait. These counters appear in /proc/vmstat. So why create standalone
> /proc/sys/vm files as well?

I did not know they would show up in /proc/vmstat.

I thought it made sense to put them in /proc/sys/vm since the other
writeback controls are there, but I have no problem with just adding
them to /proc/vmstat if that makes more sense.

mrubin

2010-08-05 23:56:28

by KOSAKI Motohiro

Subject: Re: [PATCH 2/2] writeback: Adding pages_dirtied and pages_entered_writeback

> On Thu, Aug 5, 2010 at 1:24 PM, Andrew Morton <[email protected]> wrote:
> > On Wed, 4 Aug 2010 17:43:24 -0700
> > Michael Rubin <[email protected]> wrote:
> > Wait. These counters appear in /proc/vmstat. So why create standalone
> > /proc/sys/vm files as well?
>
> I did not know they would show up in /proc/vmstat.
>
> I thought it made sense to put them in /proc/sys/vm since the other
> writeback controls are there.
> but have no problems just adding them to /proc/vmstat if that makes more sense.

?

/proc/vmstat already has both.

cat /proc/vmstat |grep nr_dirty
cat /proc/vmstat |grep nr_writeback

Also, /sys/devices/system/node/node0/meminfo shows per-node stats.

Perhaps I'm missing your point.

2010-08-06 00:12:01

by Michael Rubin

Subject: Re: [PATCH 2/2] writeback: Adding pages_dirtied and pages_entered_writeback

On Thu, Aug 5, 2010 at 4:56 PM, KOSAKI Motohiro
<[email protected]> wrote:
> /proc/vmstat already have both.
>
> cat /proc/vmstat |grep nr_dirty
> cat /proc/vmstat |grep nr_writeback
>
> Also, /sys/devices/system/node/node0/meminfo show per-node stat.
>
> Perhaps, I'm missing your point.

These only show the number of dirty pages present in the system at the
point they are queried.
The counters I am trying to add only increase over time. They allow
developers to see the rates at which pages are being dirtied and are
entering writeback, which is very helpful.

mrubin

2010-08-06 00:19:11

by KOSAKI Motohiro

Subject: Re: [PATCH 2/2] writeback: Adding pages_dirtied and pages_entered_writeback

> On Thu, Aug 5, 2010 at 4:56 PM, KOSAKI Motohiro
> <[email protected]> wrote:
> > /proc/vmstat already have both.
> >
> > cat /proc/vmstat |grep nr_dirty
> > cat /proc/vmstat |grep nr_writeback
> >
> > Also, /sys/devices/system/node/node0/meminfo show per-node stat.
> >
> > Perhaps, I'm missing your point.
>
> These only show the number of dirty pages present in the system at the
> point they are queried.
> The counter I am trying to add are increasing over time. They allow
> developers to see rates of pages being dirtied and entering writeback.
> Which is very helpful.

Usually administrators sample the data twice and subtract. Isn't that sufficient?

2010-08-06 00:28:28

by Andrew Morton

Subject: Re: [PATCH 2/2] writeback: Adding pages_dirtied and pages_entered_writeback

On Fri, 6 Aug 2010 09:18:59 +0900 (JST)
KOSAKI Motohiro <[email protected]> wrote:

> > On Thu, Aug 5, 2010 at 4:56 PM, KOSAKI Motohiro
> > <[email protected]> wrote:
> > > /proc/vmstat already have both.
> > >
> > > cat /proc/vmstat |grep nr_dirty
> > > cat /proc/vmstat |grep nr_writeback
> > >
> > > Also, /sys/devices/system/node/node0/meminfo show per-node stat.
> > >
> > > Perhaps, I'm missing your point.
> >
> > These only show the number of dirty pages present in the system at the
> > point they are queried.
> > The counter I am trying to add are increasing over time. They allow
> > developers to see rates of pages being dirtied and entering writeback.
> > Which is very helpful.
>
> Usually administrators get the data two times and subtract them. Isn't it sufficient?
>

Nope. The existing nr_dirty is "number of pages dirtied since boot"
minus "number of pages cleaned since boot". If you do the
wait-one-second-then-subtract thing on nr_dirty, the result is
dirtying-bandwidth minus cleaning-bandwidth, and can't be used to
determine dirtying-bandwidth.

I can see that a graph of dirtying events versus time could be an
interesting thing. I don't see how it could be obtained using the
existing instrumentation. tracepoints, probably..
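
The distinction can be sketched with a toy model in Python. It assumes
constant per-second dirtying and cleaning rates. The function name and all
numbers are made up for illustration; nothing here reads real kernel state.

```python
def simulate(dirty_per_s, clean_per_s, seconds, start_dirty=0):
    """Model a gauge (like nr_dirty) and a cumulative dirtied counter."""
    nr_dirty = start_dirty      # pages dirty right now (goes up and down)
    total_dirtied = 0           # pages ever dirtied (only goes up)
    for _ in range(seconds):
        total_dirtied += dirty_per_s
        nr_dirty += dirty_per_s - clean_per_s
    return nr_dirty, total_dirtied


# Two very different workloads observed over one second.
gauge_a, total_a = simulate(dirty_per_s=1000, clean_per_s=900, seconds=1)
gauge_b, total_b = simulate(dirty_per_s=200, clean_per_s=100, seconds=1)

# The gauge delta is identical for both; only the cumulative counter
# reveals the dirtying bandwidth.
print(gauge_a, gauge_b, total_a, total_b)
```

Both workloads move the gauge by the same amount, so subtracting two
nr_dirty samples cannot tell them apart, while the cumulative counter can.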

2010-08-06 00:44:58

by KOSAKI Motohiro

Subject: Re: [PATCH 2/2] writeback: Adding pages_dirtied and pages_entered_writeback

> On Fri, 6 Aug 2010 09:18:59 +0900 (JST)
> KOSAKI Motohiro <[email protected]> wrote:
>
> > > On Thu, Aug 5, 2010 at 4:56 PM, KOSAKI Motohiro
> > > <[email protected]> wrote:
> > > > /proc/vmstat already have both.
> > > >
> > > > cat /proc/vmstat |grep nr_dirty
> > > > cat /proc/vmstat |grep nr_writeback
> > > >
> > > > Also, /sys/devices/system/node/node0/meminfo show per-node stat.
> > > >
> > > > Perhaps, I'm missing your point.
> > >
> > > These only show the number of dirty pages present in the system at the
> > > point they are queried.
> > > The counter I am trying to add are increasing over time. They allow
> > > developers to see rates of pages being dirtied and entering writeback.
> > > Which is very helpful.
> >
> > Usually administrators get the data two times and subtract them. Isn't it sufficient?
> >
>
> Nope. The existing nr_dirty is "number of pages dirtied since boot"
> minus "number of pages cleaned since boot". If you do the
> wait-one-second-then-subtract thing on nr_dirty, the result is
> dirtying-bandwidth minus cleaning-bandwidth, and can't be used to
> determine dirtying-bandwidth.

Technically, yes. I meant that, _right now_, typical administrators use
the subtraction.
Do you mean this is wrong, or do you have another use case?

Just curious.


> I can see that a graph of dirtying events versus time could be an
> interesting thing. I don't see how it could be obtained using the
> existing instrumentation. tracepoints, probably..

I think it depends on how frequent the use case is. If the use case is
common enough, a convenient interface (e.g. /proc/vmstat) is very helpful.

Probably I haven't understood the use case of this feature.


2010-08-06 07:19:26

by Michael Rubin

Subject: Re: [PATCH 2/2] writeback: Adding pages_dirtied and pages_entered_writeback

On Thu, Aug 5, 2010 at 1:24 PM, Andrew Morton <[email protected]> wrote:
> Wait. These counters appear in /proc/vmstat. So why create standalone
> /proc/sys/vm files as well?

Andrew, I was thinking about this today, and I think there is a case
for keeping the proc files.
Christoph was the one who pointed out to me that this is their proper
home, and I think he's right. Most if not all of the tunables for
writeback are there. When one is trying to find the state of the system's
writeback activity, that's the directory to look in. Only having these
variables in /proc/vmstat feels to me like a way to make sure that users
who would need them won't find them unless they are reading source. And
these are folks who aren't reading source.

/proc/vmstat _does_ look like a good place to put the thresholds, as it
already has similar values, such as
kswapd_low_wmark_hit_quickly.

mrubin