2005-12-20 22:02:08

by Christoph Lameter

Subject: Zoned counters V1 [ 0/14]: Overview

Zone based VM statistics are necessary to determine the state of memory
in a single zone. On a NUMA system this can be helpful for doing local
reclaim and other memory optimizations that shift VM load in order to
optimize page allocation. It is also helpful to know how the computing
load affects the memory allocations in the various zones.

The patchset introduces a framework for counters that is a cross between
the existing page_state --which is simply a set of global counters split
per cpu-- and the approach of deferred incremental updates implemented
for nr_pagecache.

Small per cpu 8 bit counters are introduced in struct zone. If a counter
exceeds a certain threshold then the per cpu differential is folded into
an array in the zone of the page and into a global array. This means
that access to VM counter information for a zone and for the whole
machine is possible by simply indexing an array. [Thanks to Nick Piggin
for pointing me at that approach.]
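
As a sketch of how these counters are then read (not part of the
patchset; it uses the accessors from patch 2 and the NR_MAPPED item
added in patch 4):

	static void show_mapped(struct zone *zone, int nid)
	{
		/* whole machine: a single array read */
		unsigned long total = global_page_state(NR_MAPPED);
		/* one zone: also a simple array read */
		unsigned long in_zone = zone_page_state(zone, NR_MAPPED);
		/* one node: sums up the zones of that node */
		unsigned long on_node = node_page_state(nid, NR_MAPPED);

		printk(KERN_INFO "mapped: %lu total, %lu zone, %lu node\n",
			total, in_zone, on_node);
	}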

The remaining counters in page_state are only used to show statistics
via /proc. Another patchset, "VM event counters", will convert those
remaining counters to lightweight inline counters and allow switching
off nonessential counters for embedded systems.

This patchset is against 2.6.15-rc5-mm3. Only the first 4 patches are needed
to support zone reclaim.

1 Add some consts for inlines in mm.h
2 Basic zoned counter functionality
3 Make /proc/vmstat include zoned counters
4 Convert nr_mapped
5 Convert nr_pagecache
6 Expanded node and zone statistics
7 Convert nr_slab
8 Convert nr_page_table
9 Convert nr_dirty
10 Convert nr_writeback
11 Convert nr_unstable
12 Convert nr_bounce
13 Remove get_page_state functions
14 Remove wbs


2005-12-20 22:02:10

by Christoph Lameter

Subject: Zoned counters V1 [ 1/14]: Add some consts for inlines in mm.h

[PATCH] const attributes for some inlines in mm.h

Const attributes allow the compiler to generate more efficient code by
allowing callers to keep elements of struct page in registers. [Or, if
the architecture does not have many registers, it will at least avoid
address recalculation and allow common subexpression elimination to
work.]

Some of the zoned vm statistics functions need to be passed a
"struct page *". That parameter is defined const, which in turn requires
that the inlines used by those functions also take const struct page *
parameters.
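
As a small illustration (not part of the patch): with the const
qualifiers in place, these helpers can be called from functions that
only hold a const pointer, without casting constness away:

	static inline int page_on_node(const struct page *page, int nid)
	{
		return page_to_nid(page) == nid;
	}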

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.15-rc5-mm3/include/linux/mm.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/mm.h 2005-12-16 11:44:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/mm.h 2005-12-20 11:54:03.000000000 -0800
@@ -456,7 +456,7 @@ void put_page(struct page *page);
#define SECTIONS_MASK ((1UL << SECTIONS_WIDTH) - 1)
#define ZONETABLE_MASK ((1UL << ZONETABLE_SHIFT) - 1)

-static inline unsigned long page_zonenum(struct page *page)
+static inline unsigned long page_zonenum(const struct page *page)
{
return (page->flags >> ZONES_PGSHIFT) & ZONES_MASK;
}
@@ -464,20 +464,20 @@ static inline unsigned long page_zonenum
struct zone;
extern struct zone *zone_table[];

-static inline struct zone *page_zone(struct page *page)
+static inline struct zone *page_zone(const struct page *page)
{
return zone_table[(page->flags >> ZONETABLE_PGSHIFT) &
ZONETABLE_MASK];
}

-static inline unsigned long page_to_nid(struct page *page)
+static inline unsigned long page_to_nid(const struct page *page)
{
if (FLAGS_HAS_NODE)
return (page->flags >> NODES_PGSHIFT) & NODES_MASK;
else
return page_zone(page)->zone_pgdat->node_id;
}
-static inline unsigned long page_to_section(struct page *page)
+static inline unsigned long page_to_section(const struct page *page)
{
return (page->flags >> SECTIONS_PGSHIFT) & SECTIONS_MASK;
}
@@ -511,7 +511,7 @@ static inline void set_page_links(struct
extern struct page *mem_map;
#endif

-static inline void *lowmem_page_address(struct page *page)
+static inline void *lowmem_page_address(const struct page *page)
{
return __va(page_to_pfn(page) << PAGE_SHIFT);
}
@@ -553,7 +553,7 @@ void page_address_init(void);
#define PAGE_MAPPING_ANON 1

extern struct address_space swapper_space;
-static inline struct address_space *page_mapping(struct page *page)
+static inline struct address_space *page_mapping(const struct page *page)
{
struct address_space *mapping = page->mapping;

@@ -564,7 +564,7 @@ static inline struct address_space *page
return mapping;
}

-static inline int PageAnon(struct page *page)
+static inline int PageAnon(const struct page *page)
{
return ((unsigned long)page->mapping & PAGE_MAPPING_ANON) != 0;
}
@@ -573,7 +573,7 @@ static inline int PageAnon(struct page *
* Return the pagecache index of the passed page. Regular pagecache pages
* use ->index whereas swapcache pages use ->private
*/
-static inline pgoff_t page_index(struct page *page)
+static inline pgoff_t page_index(const struct page *page)
{
if (unlikely(PageSwapCache(page)))
return page_private(page);
@@ -590,7 +590,7 @@ static inline void reset_page_mapcount(s
atomic_set(&(page)->_mapcount, -1);
}

-static inline int page_mapcount(struct page *page)
+static inline int page_mapcount(const struct page *page)
{
return atomic_read(&(page)->_mapcount) + 1;
}
@@ -598,7 +598,7 @@ static inline int page_mapcount(struct p
/*
* Return true if this page is mapped into pagetables.
*/
-static inline int page_mapped(struct page *page)
+static inline int page_mapped(const struct page *page)
{
return atomic_read(&(page)->_mapcount) >= 0;
}

2005-12-20 22:02:37

by Christoph Lameter

Subject: Zoned counters V1 [ 7/14]: Convert nr_slab

Convert nr_slab

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.15-rc5-mm3/drivers/base/node.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/drivers/base/node.c 2005-12-20 12:57:57.000000000 -0800
+++ linux-2.6.15-rc5-mm3/drivers/base/node.c 2005-12-20 12:58:02.000000000 -0800
@@ -88,7 +88,7 @@ static ssize_t node_read_meminfo(struct
nid, K(ps.nr_writeback),
nid, K(nr[NR_MAPPED]),
nid, K(nr[NR_PAGECACHE]),
- nid, K(ps.nr_slab));
+ nid, K(nr[NR_SLAB]));
n += hugetlb_report_node_meminfo(nid, buf + n);
return n;
}
Index: linux-2.6.15-rc5-mm3/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/fs/proc/proc_misc.c 2005-12-20 12:57:55.000000000 -0800
+++ linux-2.6.15-rc5-mm3/fs/proc/proc_misc.c 2005-12-20 12:58:02.000000000 -0800
@@ -191,7 +191,7 @@ static int meminfo_read_proc(char *page,
K(ps.nr_dirty),
K(ps.nr_writeback),
K(global_page_state(NR_MAPPED)),
- K(ps.nr_slab),
+ K(global_page_state(NR_SLAB)),
K(allowed),
K(committed),
K(ps.nr_page_table_pages),
Index: linux-2.6.15-rc5-mm3/mm/page_alloc.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/page_alloc.c 2005-12-20 12:57:57.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/page_alloc.c 2005-12-20 12:58:17.000000000 -0800
@@ -597,7 +597,7 @@ static int rmqueue_bulk(struct zone *zon
return i;
}

-char *stat_item_descr[NR_STAT_ITEMS] = { "mapped","pagecache" };
+char *stat_item_descr[NR_STAT_ITEMS] = { "mapped","pagecache", "slab" };

/*
* Manage combined zone based / global counters
@@ -1784,7 +1784,7 @@ void show_free_areas(void)
ps.nr_writeback,
ps.nr_unstable,
nr_free_pages(),
- ps.nr_slab,
+ global_page_state(NR_SLAB),
global_page_state(NR_MAPPED),
ps.nr_page_table_pages);

@@ -2677,13 +2677,13 @@ static char *vmstat_text[] = {
/* Zoned VM counters */
"nr_mapped",
"nr_pagecache",
+ "nr_slab",

/* Page state */
"nr_dirty",
"nr_writeback",
"nr_unstable",
"nr_page_table_pages",
- "nr_slab",

"pgpgin",
"pgpgout",
Index: linux-2.6.15-rc5-mm3/mm/slab.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/slab.c 2005-12-20 12:57:37.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/slab.c 2005-12-20 12:58:02.000000000 -0800
@@ -1236,7 +1236,7 @@ static void *kmem_getpages(kmem_cache_t
i = (1 << cachep->gfporder);
if (cachep->flags & SLAB_RECLAIM_ACCOUNT)
atomic_add(i, &slab_reclaim_pages);
- add_page_state(nr_slab, i);
+ add_zone_page_state(page_zone(page), NR_SLAB, i);
while (i--) {
SetPageSlab(page);
page++;
@@ -1258,7 +1258,7 @@ static void kmem_freepages(kmem_cache_t
BUG();
page++;
}
- sub_page_state(nr_slab, nr_freed);
+ sub_zone_page_state(page_zone(page), NR_SLAB, nr_freed);
if (current->reclaim_state)
current->reclaim_state->reclaimed_slab += nr_freed;
free_pages((unsigned long)addr, cachep->gfporder);
Index: linux-2.6.15-rc5-mm3/include/linux/mmzone.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/mmzone.h 2005-12-20 12:57:55.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/mmzone.h 2005-12-20 12:58:02.000000000 -0800
@@ -48,6 +48,7 @@ enum zone_stat_item {
NR_MAPPED, /* mapped into pagetables.
only modified from process context */
NR_PAGECACHE, /* file backed pages */
+ NR_SLAB, /* used by slab allocator */
NR_STAT_ITEMS };

#ifdef CONFIG_SMP
Index: linux-2.6.15-rc5-mm3/include/linux/page-flags.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/page-flags.h 2005-12-20 12:57:42.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/page-flags.h 2005-12-20 12:58:02.000000000 -0800
@@ -95,8 +95,7 @@ struct page_state {
unsigned long nr_writeback; /* Pages under writeback */
unsigned long nr_unstable; /* NFS unstable pages */
unsigned long nr_page_table_pages;/* Pages used for pagetables */
- unsigned long nr_slab; /* In slab */
-#define GET_PAGE_STATE_LAST nr_slab
+#define GET_PAGE_STATE_LAST nr_page_table_pages

/*
* The below are zeroed by get_page_state(). Use get_full_page_state()

2005-12-20 22:02:34

by Christoph Lameter

Subject: Zoned counters V1 [ 2/14]: Basic counter functionality

Currently we have various vm counters for the pages in a zone that are
split per cpu. This arrangement does not allow access to the per zone
statistics that are important for optimizing VM behavior on NUMA
architectures. All one can deduce from the per cpu differential
variables is how much a certain variable was changed by this cpu; it is
not possible to tell how many pages in each zone are of a certain type.

This framework implements differential counters for each processor in
struct zone. The differential counters are consolidated when a threshold
is exceeded (as is done in the current implementation for nr_pagecache),
when slab reaping occurs or when a consolidation function is called.

Consolidation uses atomic operations and accumulates counters per zone in
the zone structure and also globally in the vm_stat array. VM functions can
access the counts by simply indexing a global or zone specific array.

The arrangement of counters in an array also simplifies processing when output
has to be generated for /proc/*.

Counter updates can be triggered by calling *_zone_page_state or
__*_zone_page_state. The latter functions may be used when it is known
that interrupts are disabled.

Specially optimized increment and decrement functions are provided. These
can avoid certain checks and use increment or decrement instructions that
an architecture may provide.
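
A usage sketch (NR_MAPPED only appears in a later patch of this series
and merely serves as the example item here):

	/*
	 * Caller knows interrupts are disabled, or the counter is never
	 * modified from interrupt context:
	 */
	__inc_zone_page_state(page, NR_MAPPED);
	__mod_zone_page_state(zone, NR_MAPPED, -nr_pages);

	/*
	 * Unknown interrupt state; these variants save and restore the
	 * interrupt flags internally:
	 */
	inc_zone_page_state(page, NR_MAPPED);
	mod_zone_page_state(zone, NR_MAPPED, delta);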

Two other patchsets depend on zoned VM stats:
1. Zone reclaim patchset (needs zoned VM stats to determine when to run a
reclaim scan)
2. event counter patchset. This introduces lightweight counters and
converts the rest of the page_state.

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.15-rc5-mm3/include/linux/mmzone.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/mmzone.h 2005-12-16 11:44:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/mmzone.h 2005-12-20 11:58:27.000000000 -0800
@@ -44,6 +44,19 @@ struct zone_padding {
#define ZONE_PADDING(name)
#endif

+enum zone_stat_item {
+ NR_STAT_ITEMS };
+
+#ifdef CONFIG_SMP
+typedef atomic_long_t vm_stat_t;
+#define VM_STAT_GET(x) atomic_long_read(&(x))
+#define VM_STAT_ADD(x,v) atomic_long_add(v, &(x))
+#else
+typedef unsigned long vm_stat_t;
+#define VM_STAT_GET(x) (x)
+#define VM_STAT_ADD(x,v) (x) += (v)
+#endif
+
struct per_cpu_pages {
int count; /* number of pages in the list */
int high; /* high watermark, emptying needed */
@@ -53,6 +66,10 @@ struct per_cpu_pages {

struct per_cpu_pageset {
struct per_cpu_pages pcp[2]; /* 0: hot. 1: cold */
+#ifdef CONFIG_SMP
+ s8 vm_stat_diff[NR_STAT_ITEMS];
+#endif
+
#ifdef CONFIG_NUMA
unsigned long numa_hit; /* allocated in intended node */
unsigned long numa_miss; /* allocated in non intended node */
@@ -149,6 +166,8 @@ struct zone {
unsigned long pages_scanned; /* since last reclaim */
int all_unreclaimable; /* All pages pinned */

+ /* Zone statistics */
+ vm_stat_t vm_stat[NR_STAT_ITEMS];
/*
* Does the allocator try to reclaim pages from the zone as soon
* as it fails a watermark_ok() in __alloc_pages?
Index: linux-2.6.15-rc5-mm3/include/linux/page-flags.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/page-flags.h 2005-12-16 11:44:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/page-flags.h 2005-12-20 12:10:14.000000000 -0800
@@ -205,6 +205,50 @@ extern void __mod_page_state_offset(unsi
} while (0)

/*
+ * Zone based accounting with per cpu differentials.
+ */
+extern vm_stat_t vm_stat[NR_STAT_ITEMS];
+
+static inline unsigned long global_page_state(enum zone_stat_item item)
+{
+ long x = VM_STAT_GET(vm_stat[item]);
+
+ if (x < 0)
+ x = 0;
+ return x;
+}
+
+static inline unsigned long zone_page_state(struct zone *zone,
+ enum zone_stat_item item)
+{
+ long x = VM_STAT_GET(zone->vm_stat[item]);
+
+ if (x < 0)
+ x = 0;
+ return x;
+}
+
+#ifdef CONFIG_NUMA
+unsigned long node_page_state(int node, enum zone_stat_item);
+#else
+#define node_page_state(node, item) global_page_state(item)
+#endif
+
+void __mod_zone_page_state(struct zone *, enum zone_stat_item item, int);
+void __inc_zone_page_state(const struct page *, enum zone_stat_item);
+void __dec_zone_page_state(const struct page *, enum zone_stat_item);
+
+#define __add_zone_page_state(__z, __i, __d) __mod_zone_page_state(__z, __i, __d)
+#define __sub_zone_page_state(__z, __i, __d) __mod_zone_page_state(__z, __i,-(__d))
+
+void mod_zone_page_state(struct zone *, enum zone_stat_item, int);
+void inc_zone_page_state(const struct page *, enum zone_stat_item);
+void dec_zone_page_state(const struct page *, enum zone_stat_item);
+
+#define add_zone_page_state(__z, __i, __d) mod_zone_page_state(__z, __i, __d)
+#define sub_zone_page_state(__z, __i, __d) mod_zone_page_state(__z, __i, -(__d))
+
+/*
* Manipulation of page state flags
*/
#define PageLocked(page) \
Index: linux-2.6.15-rc5-mm3/mm/page_alloc.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/page_alloc.c 2005-12-16 11:44:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/page_alloc.c 2005-12-20 12:05:56.000000000 -0800
@@ -597,7 +597,279 @@ static int rmqueue_bulk(struct zone *zon
return i;
}

+/*
+ * Manage combined zone based / global counters
+ */
+vm_stat_t vm_stat[NR_STAT_ITEMS];
+
+static inline void zone_page_state_add(long x, struct zone *zone,
+ enum zone_stat_item item)
+{
+ VM_STAT_ADD(zone->vm_stat[item], x);
+ VM_STAT_ADD(vm_stat[item], x);
+}
+
+#ifdef CONFIG_SMP
+
+#define STAT_THRESHOLD 32
+
+/*
+ * Determine pointer to currently valid differential byte given a zone and
+ * the item number.
+ *
+ * Preemption must be off
+ */
+static inline s8 *diff_pointer(struct zone *zone, enum zone_stat_item item)
+{
+ return &zone_pcp(zone, raw_smp_processor_id())->vm_stat_diff[item];
+}
+
+/*
+ * For use when we know that interrupts are disabled.
+ */
+void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
+ int delta)
+{
+ s8 *p;
+ long x;
+
+ p = diff_pointer(zone, item);
+ x = delta + *p;
+
+ if (unlikely(x > STAT_THRESHOLD || x < -STAT_THRESHOLD)) {
+ zone_page_state_add(x, zone, item);
+ x = 0;
+ }
+
+ *p = x;
+}
+EXPORT_SYMBOL(__mod_zone_page_state);
+
+/*
+ * For an unknown interrupt state
+ */
+void mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
+ int delta)
+{
+ unsigned long flags;
+
+ local_irq_save(flags);
+ __mod_zone_page_state(zone, item, delta);
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(mod_zone_page_state);
+
+/*
+ * Optimized increment and decrement functions.
+ *
+ * These are only for a single page and therefore can take a struct page *
+ * argument instead of struct zone *. This allows the inclusion of the code
+ * generated for page_zone(page) into the optimized functions.
+ *
+ * No overflow check is necessary and therefore the differential can be
+ * incremented or decremented in place which may allow the compilers to
+ * generate better code.
+ *
+ * The increment or decrement is known and therefore one boundary check can
+ * be omitted.
+ *
+ * Some processors have inc/dec instructions that are atomic vs an interrupt.
+ * However, the code must first determine the differential location in a zone
+ * based on the processor number and then inc/dec the counter. There is no
+ * guarantee without disabling preemption that the processor will not change
+ * in between and therefore the atomicity vs. interrupt cannot be exploited
+ * in a useful way here.
+ */
+void __inc_zone_page_state(const struct page *page, enum zone_stat_item item)
+{
+ struct zone *zone = page_zone(page);
+ s8 *p = diff_pointer(zone, item);
+
+ (*p)++;
+
+ if (unlikely(*p > STAT_THRESHOLD)) {
+ zone_page_state_add(*p, zone, item);
+ *p = 0;
+ }
+}
+EXPORT_SYMBOL(__inc_zone_page_state);
+
+void __dec_zone_page_state(const struct page *page, enum zone_stat_item item)
+{
+ struct zone *zone = page_zone(page);
+ s8 *p = diff_pointer(zone, item);
+
+ (*p)--;
+
+ if (unlikely(*p < -STAT_THRESHOLD)) {
+ zone_page_state_add(*p, zone, item);
+ *p = 0;
+ }
+}
+EXPORT_SYMBOL(__dec_zone_page_state);
+
+void inc_zone_page_state(const struct page *page, enum zone_stat_item item)
+{
+ unsigned long flags;
+ struct zone *zone;
+ s8 *p;
+
+ local_irq_save(flags);
+ zone = page_zone(page);
+ p = diff_pointer(zone, item);
+
+ (*p)++;
+
+ if (unlikely(*p > STAT_THRESHOLD)) {
+ zone_page_state_add(*p, zone, item);
+ *p = 0;
+ }
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(inc_zone_page_state);
+
+void dec_zone_page_state(const struct page *page, enum zone_stat_item item)
+{
+ unsigned long flags;
+ struct zone *zone;
+ s8 *p;
+
+ local_irq_save(flags);
+ zone = page_zone(page);
+ p = diff_pointer(zone, item);
+
+ (*p)--;
+
+ if (unlikely(*p < -STAT_THRESHOLD)) {
+ zone_page_state_add(*p, zone, item);
+ *p = 0;
+ }
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(dec_zone_page_state);
+
+/*
+ * Update the zone counters for one cpu.
+ */
+void refresh_cpu_vm_stats(void)
+{
+ struct zone *zone;
+ int i;
+ unsigned long flags;
+
+ local_irq_save(flags);
+ for_each_zone(zone) {
+ struct per_cpu_pageset *pcp =
+ zone_pcp(zone, raw_smp_processor_id());
+
+ for(i = 0; i < NR_STAT_ITEMS; i++) {
+ int v;
+
+ v = pcp->vm_stat_diff[i];
+ if (v) {
+ pcp->vm_stat_diff[i] = 0;
+ zone_page_state_add(v, zone, i);
+ }
+ }
+ }
+ local_irq_restore(flags);
+}
+
+static void __refresh_cpu_vm_stats(void *dummy)
+{
+ refresh_cpu_vm_stats();
+}
+
+/*
+ * Consolidate all counters.
+ *
+ * Note that the result is less inaccurate but still inaccurate
+ * since concurrent processes can increment/decrement counters
+ * while this function runs.
+ */
+void refresh_vm_stats(void)
+{
+ schedule_on_each_cpu(__refresh_cpu_vm_stats, NULL);
+}
+EXPORT_SYMBOL(refresh_vm_stats);
+
+#else /* CONFIG_SMP */
+
+/*
+ * We do not maintain differentials in a single processor configuration.
+ * The functions directly modify the zone and global counters.
+ */
+
+void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
+ int delta)
+{
+ zone_page_state_add(delta, zone, item);
+}
+EXPORT_SYMBOL(__mod_zone_page_state);
+
+void mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
+ int delta)
+{
+ unsigned long flags;
+
+ local_irq_save(flags);
+ zone_page_state_add(delta, zone, item);
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(mod_zone_page_state);
+
+void __inc_zone_page_state(const struct page *page, enum zone_stat_item item)
+{
+ zone_page_state_add(1, page_zone(page), item);
+}
+EXPORT_SYMBOL(__inc_zone_page_state);
+
+void __dec_zone_page_state(const struct page *page, enum zone_stat_item item)
+{
+ zone_page_state_add(-1, page_zone(page), item);
+}
+EXPORT_SYMBOL(__dec_zone_page_state);
+
+void inc_zone_page_state(const struct page *page, enum zone_stat_item item)
+{
+ unsigned long flags;
+
+ local_irq_save(flags);
+ zone_page_state_add(1, page_zone(page), item);
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(inc_zone_page_state);
+
+void dec_zone_page_state(const struct page *page, enum zone_stat_item item)
+{
+ unsigned long flags;
+
+ local_irq_save(flags);
+ zone_page_state_add(-1, page_zone(page), item);
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(dec_zone_page_state);
+#endif
+
#ifdef CONFIG_NUMA
+/*
+ * Determine the per node value of a stat item. This is done by cycling
+ * through all the zones of a node.
+ */
+unsigned long node_page_state(int node, enum zone_stat_item item)
+{
+ struct zone *zones = NODE_DATA(node)->node_zones;
+ int i;
+ long v = 0;
+
+ for (i = 0; i < MAX_NR_ZONES; i++)
+ v += VM_STAT_GET(zones[i].vm_stat[item]);
+ if (v < 0)
+ v = 0;
+ return v;
+}
+EXPORT_SYMBOL(node_page_state);
+
/* Called from the slab reaper to drain remote pagesets */
void drain_remote_pages(void)
{
@@ -2169,6 +2441,7 @@ static void __init free_area_init_core(s
zone->nr_scan_inactive = 0;
zone->nr_active = 0;
zone->nr_inactive = 0;
+ memset(zone->vm_stat, 0, sizeof(zone->vm_stat));
atomic_set(&zone->reclaim_in_progress, 0);
if (!size)
continue;
Index: linux-2.6.15-rc5-mm3/include/linux/gfp.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/gfp.h 2005-12-16 11:44:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/gfp.h 2005-12-20 11:58:27.000000000 -0800
@@ -164,4 +164,12 @@ void drain_remote_pages(void);
static inline void drain_remote_pages(void) { };
#endif

+#ifdef CONFIG_SMP
+void refresh_cpu_vm_stats(void);
+void refresh_vm_stats(void);
+#else
+static inline void refresh_cpu_vm_stats(void) { };
+static inline void refresh_vm_stats(void) { };
+#endif
+
#endif /* __LINUX_GFP_H */
Index: linux-2.6.15-rc5-mm3/mm/slab.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/slab.c 2005-12-16 11:44:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/slab.c 2005-12-20 11:58:27.000000000 -0800
@@ -3423,6 +3423,7 @@ static void cache_reap(void *unused)
check_irq_on();
up(&cache_chain_sem);
drain_remote_pages();
+ refresh_cpu_vm_stats();
/* Setup the next iteration */
schedule_delayed_work(&__get_cpu_var(reap_work), REAPTIMEOUT_CPUC);
}

2005-12-20 22:03:05

by Christoph Lameter

Subject: Zoned counters V1 [ 3/14]: Include zoned counters in /proc/vmstat

Make /proc/vmstat include zoned counters

This makes vmstat print counters from a combined array of zoned counters
plus the current page_state (which will later be converted by the
event counter patchset).
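
The private buffer built in vmstat_start() is therefore laid out as
follows (sketch for illustration):

	v[0] .. v[NR_STAT_ITEMS-1]    zoned counters, filled from
	                              global_page_state(i)
	v[NR_STAT_ITEMS] onwards      the struct page_state fields,
	                              filled by get_full_page_state()

vmstat_text[] names the entries in exactly this order, so offset *pos
into the text array matches the value returned at v + *pos.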

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.15-rc5-mm3/mm/page_alloc.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/page_alloc.c 2005-12-20 12:05:56.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/page_alloc.c 2005-12-20 12:19:01.000000000 -0800
@@ -2673,6 +2673,9 @@ struct seq_operations zoneinfo_op = {
};

static char *vmstat_text[] = {
+ /* Zoned VM counters */
+
+ /* Page state */
"nr_dirty",
"nr_writeback",
"nr_unstable",
@@ -2730,19 +2733,25 @@ static char *vmstat_text[] = {

static void *vmstat_start(struct seq_file *m, loff_t *pos)
{
+ unsigned long *v;
struct page_state *ps;
+ int i;

if (*pos >= ARRAY_SIZE(vmstat_text))
return NULL;

- ps = kmalloc(sizeof(*ps), GFP_KERNEL);
- m->private = ps;
- if (!ps)
+ v = kmalloc(NR_STAT_ITEMS * sizeof(unsigned long)
+ + sizeof(struct page_state), GFP_KERNEL);
+ m->private = v;
+ if (!v)
return ERR_PTR(-ENOMEM);
+ for (i = 0; i < NR_STAT_ITEMS; i++)
+ v[i] = global_page_state(i);
+ ps = (struct page_state *)(v + NR_STAT_ITEMS);
get_full_page_state(ps);
ps->pgpgin /= 2; /* sectors -> kbytes */
ps->pgpgout /= 2;
- return (unsigned long *)ps + *pos;
+ return v + *pos;
}

static void *vmstat_next(struct seq_file *m, void *arg, loff_t *pos)

2005-12-20 22:03:15

by Christoph Lameter

Subject: Zoned counters V1 [ 9/14]: Convert nr_dirty

Convert nr_dirty

This makes nr_dirty a per zone counter, so that we can determine the number
of dirty pages per node etc.

The counter aggregation for nr_dirty had to be undone in the NFS layer
since it summed up the pages from multiple zones.
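
With the zoned counter in place, a per node dirty page count can be read
directly (illustrative sketch, not part of the patch):

	unsigned long nid_dirty = node_page_state(nid, NR_DIRTY);

	printk(KERN_INFO "Node %d: %lu dirty pages\n", nid, nid_dirty);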

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.15-rc5-mm3/mm/page_alloc.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/page_alloc.c 2005-12-20 12:58:47.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/page_alloc.c 2005-12-20 12:58:54.000000000 -0800
@@ -597,7 +597,9 @@ static int rmqueue_bulk(struct zone *zon
return i;
}

-char *stat_item_descr[NR_STAT_ITEMS] = { "mapped","pagecache", "slab", "pagetable" };
+char *stat_item_descr[NR_STAT_ITEMS] = {
+ "mapped","pagecache", "slab", "pagetable", "dirty"
+};

/*
* Manage combined zone based / global counters
@@ -1780,7 +1782,7 @@ void show_free_areas(void)
"unstable:%lu free:%u slab:%lu mapped:%lu pagetables:%lu\n",
active,
inactive,
- ps.nr_dirty,
+ global_page_state(NR_DIRTY),
ps.nr_writeback,
ps.nr_unstable,
nr_free_pages(),
@@ -2679,9 +2681,9 @@ static char *vmstat_text[] = {
"nr_pagecache",
"nr_slab",
"nr_page_table_pages",
+ "nr_dirty",

/* Page state */
- "nr_dirty",
"nr_writeback",
"nr_unstable",

Index: linux-2.6.15-rc5-mm3/include/linux/page-flags.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/page-flags.h 2005-12-20 12:58:31.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/page-flags.h 2005-12-20 12:58:54.000000000 -0800
@@ -91,7 +91,6 @@
* In this case, the field should be commented here.
*/
struct page_state {
- unsigned long nr_dirty; /* Dirty writeable pages */
unsigned long nr_writeback; /* Pages under writeback */
unsigned long nr_unstable; /* NFS unstable pages */
#define GET_PAGE_STATE_LAST nr_unstable
Index: linux-2.6.15-rc5-mm3/mm/page-writeback.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/page-writeback.c 2005-12-20 12:57:42.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/page-writeback.c 2005-12-20 12:58:54.000000000 -0800
@@ -109,7 +109,7 @@ struct writeback_state

static void get_writeback_state(struct writeback_state *wbs)
{
- wbs->nr_dirty = read_page_state(nr_dirty);
+ wbs->nr_dirty = global_page_state(NR_DIRTY);
wbs->nr_unstable = read_page_state(nr_unstable);
wbs->nr_mapped = global_page_state(NR_MAPPED);
wbs->nr_writeback = read_page_state(nr_writeback);
@@ -632,7 +632,7 @@ int __set_page_dirty_nobuffers(struct pa
if (mapping2) { /* Race with truncate? */
BUG_ON(mapping2 != mapping);
if (mapping_cap_account_dirty(mapping))
- inc_page_state(nr_dirty);
+ __inc_zone_page_state(page, NR_DIRTY);
radix_tree_tag_set(&mapping->page_tree,
page_index(page), PAGECACHE_TAG_DIRTY);
}
@@ -716,9 +716,9 @@ int test_clear_page_dirty(struct page *p
radix_tree_tag_clear(&mapping->page_tree,
page_index(page),
PAGECACHE_TAG_DIRTY);
- write_unlock_irqrestore(&mapping->tree_lock, flags);
if (mapping_cap_account_dirty(mapping))
- dec_page_state(nr_dirty);
+ __dec_zone_page_state(page, NR_DIRTY);
+ write_unlock_irqrestore(&mapping->tree_lock, flags);
return 1;
}
write_unlock_irqrestore(&mapping->tree_lock, flags);
@@ -749,7 +749,7 @@ int clear_page_dirty_for_io(struct page
if (mapping) {
if (TestClearPageDirty(page)) {
if (mapping_cap_account_dirty(mapping))
- dec_page_state(nr_dirty);
+ dec_zone_page_state(page, NR_DIRTY);
return 1;
}
return 0;
Index: linux-2.6.15-rc5-mm3/include/linux/mmzone.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/mmzone.h 2005-12-20 12:58:31.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/mmzone.h 2005-12-20 12:58:54.000000000 -0800
@@ -50,6 +50,7 @@ enum zone_stat_item {
NR_PAGECACHE, /* file backed pages */
NR_SLAB, /* used by slab allocator */
NR_PAGETABLE, /* used for pagetables */
+ NR_DIRTY,
NR_STAT_ITEMS };

#ifdef CONFIG_SMP
Index: linux-2.6.15-rc5-mm3/drivers/base/node.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/drivers/base/node.c 2005-12-20 12:58:31.000000000 -0800
+++ linux-2.6.15-rc5-mm3/drivers/base/node.c 2005-12-20 12:58:54.000000000 -0800
@@ -53,8 +53,6 @@ static ssize_t node_read_meminfo(struct
nr[j] = node_page_state(nid, j);

/* Check for negative values in these approximate counters */
- if ((long)ps.nr_dirty < 0)
- ps.nr_dirty = 0;
if ((long)ps.nr_writeback < 0)
ps.nr_writeback = 0;

@@ -82,7 +80,7 @@ static ssize_t node_read_meminfo(struct
nid, K(i.freehigh),
nid, K(i.totalram - i.totalhigh),
nid, K(i.freeram - i.freehigh),
- nid, K(ps.nr_dirty),
+ nid, K(nr[NR_DIRTY]),
nid, K(ps.nr_writeback),
nid, K(nr[NR_MAPPED]),
nid, K(nr[NR_PAGECACHE]),
Index: linux-2.6.15-rc5-mm3/fs/fs-writeback.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/fs/fs-writeback.c 2005-12-16 11:44:08.000000000 -0800
+++ linux-2.6.15-rc5-mm3/fs/fs-writeback.c 2005-12-20 12:58:54.000000000 -0800
@@ -470,7 +470,7 @@ void sync_inodes_sb(struct super_block *
struct writeback_control wbc = {
.sync_mode = wait ? WB_SYNC_ALL : WB_SYNC_HOLD,
};
- unsigned long nr_dirty = read_page_state(nr_dirty);
+ unsigned long nr_dirty = global_page_state(NR_DIRTY);
unsigned long nr_unstable = read_page_state(nr_unstable);

wbc.nr_to_write = nr_dirty + nr_unstable +
Index: linux-2.6.15-rc5-mm3/fs/buffer.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/fs/buffer.c 2005-12-16 11:44:08.000000000 -0800
+++ linux-2.6.15-rc5-mm3/fs/buffer.c 2005-12-20 12:58:54.000000000 -0800
@@ -857,7 +857,7 @@ int __set_page_dirty_buffers(struct page
write_lock_irq(&mapping->tree_lock);
if (page->mapping) { /* Race with truncate? */
if (mapping_cap_account_dirty(mapping))
- inc_page_state(nr_dirty);
+ __inc_zone_page_state(page, NR_DIRTY);
radix_tree_tag_set(&mapping->page_tree,
page_index(page),
PAGECACHE_TAG_DIRTY);
Index: linux-2.6.15-rc5-mm3/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/fs/proc/proc_misc.c 2005-12-20 12:58:31.000000000 -0800
+++ linux-2.6.15-rc5-mm3/fs/proc/proc_misc.c 2005-12-20 12:58:54.000000000 -0800
@@ -188,7 +188,7 @@ static int meminfo_read_proc(char *page,
K(i.freeram-i.freehigh),
K(i.totalswap),
K(i.freeswap),
- K(ps.nr_dirty),
+ K(global_page_state(NR_DIRTY)),
K(ps.nr_writeback),
K(global_page_state(NR_MAPPED)),
K(global_page_state(NR_SLAB)),
Index: linux-2.6.15-rc5-mm3/arch/i386/mm/pgtable.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/arch/i386/mm/pgtable.c 2005-12-03 21:10:42.000000000 -0800
+++ linux-2.6.15-rc5-mm3/arch/i386/mm/pgtable.c 2005-12-20 12:58:54.000000000 -0800
@@ -59,7 +59,7 @@ void show_mem(void)
printk(KERN_INFO "%d pages swap cached\n", cached);

get_page_state(&ps);
- printk(KERN_INFO "%lu pages dirty\n", ps.nr_dirty);
+ printk(KERN_INFO "%lu pages dirty\n", global_page_state(NR_DIRTY));
printk(KERN_INFO "%lu pages writeback\n", ps.nr_writeback);
printk(KERN_INFO "%lu pages mapped\n", ps.nr_mapped);
printk(KERN_INFO "%lu pages slab\n", ps.nr_slab);
Index: linux-2.6.15-rc5-mm3/fs/reiser4/page_cache.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/fs/reiser4/page_cache.c 2005-12-16 11:44:08.000000000 -0800
+++ linux-2.6.15-rc5-mm3/fs/reiser4/page_cache.c 2005-12-20 12:58:54.000000000 -0800
@@ -470,7 +470,7 @@ int set_page_dirty_internal(struct page

if (!TestSetPageDirty(page)) {
if (mapping_cap_account_dirty(mapping))
- inc_page_state(nr_dirty);
+ inc_zone_page_state(page, NR_DIRTY);

__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
}
Index: linux-2.6.15-rc5-mm3/fs/nfs/write.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/fs/nfs/write.c 2005-12-16 11:44:08.000000000 -0800
+++ linux-2.6.15-rc5-mm3/fs/nfs/write.c 2005-12-20 12:58:54.000000000 -0800
@@ -461,7 +461,7 @@ nfs_mark_request_dirty(struct nfs_page *
nfs_list_add_request(req, &nfsi->dirty);
nfsi->ndirty++;
spin_unlock(&nfsi->req_lock);
- inc_page_state(nr_dirty);
+ inc_zone_page_state(req->wb_page, NR_DIRTY);
mark_inode_dirty(inode);
}

@@ -554,7 +554,6 @@ nfs_scan_dirty(struct inode *inode, stru
if (nfsi->ndirty != 0) {
res = nfs_scan_lock_dirty(nfsi, dst, idx_start, npages);
nfsi->ndirty -= res;
- sub_page_state(nr_dirty,res);
if ((nfsi->ndirty == 0) != list_empty(&nfsi->dirty))
printk(KERN_ERR "NFS: desynchronized value of nfs_i.ndirty.\n");
}
Index: linux-2.6.15-rc5-mm3/fs/reiser4/emergency_flush.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/fs/reiser4/emergency_flush.c 2005-12-16 11:44:08.000000000 -0800
+++ linux-2.6.15-rc5-mm3/fs/reiser4/emergency_flush.c 2005-12-20 12:58:54.000000000 -0800
@@ -740,7 +740,7 @@ void eflush_del(jnode * node, int page_l
if (!TestSetPageDirty(page)) {
BUG_ON(jnode_get_mapping(node) != page->mapping);
if (mapping_cap_account_dirty(page->mapping))
- inc_page_state(nr_dirty);
+ inc_zone_page_state(page, NR_DIRTY);
}

assert("nikita-2766", atomic_read(&node->x_count) > 1);
Index: linux-2.6.15-rc5-mm3/fs/reiser4/as_ops.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/fs/reiser4/as_ops.c 2005-12-16 11:44:08.000000000 -0800
+++ linux-2.6.15-rc5-mm3/fs/reiser4/as_ops.c 2005-12-20 12:58:54.000000000 -0800
@@ -84,7 +84,7 @@ int reiser4_set_page_dirty(struct page *
if (page->mapping) {
assert("vs-1652", page->mapping == mapping);
if (mapping_cap_account_dirty(mapping))
- inc_page_state(nr_dirty);
+ __inc_zone_page_state(page, NR_DIRTY);
radix_tree_tag_set(&mapping->page_tree,
page->index,
PAGECACHE_TAG_REISER4_MOVED);
Index: linux-2.6.15-rc5-mm3/fs/nfs/pagelist.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/fs/nfs/pagelist.c 2005-12-03 21:10:42.000000000 -0800
+++ linux-2.6.15-rc5-mm3/fs/nfs/pagelist.c 2005-12-20 12:58:54.000000000 -0800
@@ -309,6 +309,7 @@ nfs_scan_lock_dirty(struct nfs_inode *nf
req->wb_index, NFS_PAGE_TAG_DIRTY);
nfs_list_remove_request(req);
nfs_list_add_request(req, dst);
+ inc_zone_page_state(req->wb_page, NR_DIRTY);
res++;
}
}

2005-12-20 22:04:38

by Christoph Lameter

Subject: Zoned counters V1 [14/14]: Remove wbs

Remove writeback state

We can now remove some functions that were needed to calculate the page
state for writeback control, since these statistics are directly
available.

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.15-rc5-mm3/mm/page-writeback.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/page-writeback.c 2005-12-20 12:59:17.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/page-writeback.c 2005-12-20 13:16:34.000000000 -0800
@@ -99,22 +99,6 @@ EXPORT_SYMBOL(laptop_mode);

static void background_writeout(unsigned long _min_pages);

-struct writeback_state
-{
- unsigned long nr_dirty;
- unsigned long nr_unstable;
- unsigned long nr_mapped;
- unsigned long nr_writeback;
-};
-
-static void get_writeback_state(struct writeback_state *wbs)
-{
- wbs->nr_dirty = global_page_state(NR_DIRTY);
- wbs->nr_unstable = global_page_state(NR_UNSTABLE);
- wbs->nr_mapped = global_page_state(NR_MAPPED);
- wbs->nr_writeback = global_page_state(NR_WRITEBACK);
-}
-
/*
* Work out the current dirty-memory clamping and background writeout
* thresholds.
@@ -133,8 +117,7 @@ static void get_writeback_state(struct w
* clamping level.
*/
static void
-get_dirty_limits(struct writeback_state *wbs, long *pbackground, long *pdirty,
- struct address_space *mapping)
+get_dirty_limits(long *pbackground, long *pdirty, struct address_space *mapping)
{
int background_ratio; /* Percentages */
int dirty_ratio;
@@ -144,8 +127,6 @@ get_dirty_limits(struct writeback_state
unsigned long available_memory = total_pages;
struct task_struct *tsk;

- get_writeback_state(wbs);
-
#ifdef CONFIG_HIGHMEM
/*
* If this mapping can only allocate from low memory,
@@ -156,7 +137,7 @@ get_dirty_limits(struct writeback_state
#endif


- unmapped_ratio = 100 - (wbs->nr_mapped * 100) / total_pages;
+ unmapped_ratio = 100 - (global_page_state(NR_MAPPED) * 100) / total_pages;

dirty_ratio = vm_dirty_ratio;
if (dirty_ratio > unmapped_ratio / 2)
@@ -189,7 +170,6 @@ get_dirty_limits(struct writeback_state
*/
static void balance_dirty_pages(struct address_space *mapping)
{
- struct writeback_state wbs;
long nr_reclaimable;
long background_thresh;
long dirty_thresh;
@@ -206,10 +186,9 @@ static void balance_dirty_pages(struct a
.nr_to_write = write_chunk,
};

- get_dirty_limits(&wbs, &background_thresh,
- &dirty_thresh, mapping);
- nr_reclaimable = wbs.nr_dirty + wbs.nr_unstable;
- if (nr_reclaimable + wbs.nr_writeback <= dirty_thresh)
+ get_dirty_limits(&background_thresh, &dirty_thresh, mapping);
+ nr_reclaimable = global_page_state(NR_DIRTY) + global_page_state(NR_UNSTABLE);
+ if (nr_reclaimable + global_page_state(NR_WRITEBACK) <= dirty_thresh)
break;

dirty_exceeded = 1;
@@ -222,10 +201,9 @@ static void balance_dirty_pages(struct a
*/
if (nr_reclaimable) {
writeback_inodes(&wbc);
- get_dirty_limits(&wbs, &background_thresh,
- &dirty_thresh, mapping);
- nr_reclaimable = wbs.nr_dirty + wbs.nr_unstable;
- if (nr_reclaimable + wbs.nr_writeback <= dirty_thresh)
+ get_dirty_limits(&background_thresh, &dirty_thresh, mapping);
+ nr_reclaimable = global_page_state(NR_DIRTY) + global_page_state(NR_UNSTABLE);
+ if (nr_reclaimable + global_page_state(NR_WRITEBACK) <= dirty_thresh)
break;
pages_written += write_chunk - wbc.nr_to_write;
if (pages_written >= write_chunk)
@@ -234,7 +212,7 @@ static void balance_dirty_pages(struct a
blk_congestion_wait(WRITE, HZ/10);
}

- if (nr_reclaimable + wbs.nr_writeback <= dirty_thresh)
+ if (nr_reclaimable + global_page_state(NR_WRITEBACK) <= dirty_thresh)
dirty_exceeded = 0;

if (writeback_in_progress(bdi))
@@ -291,12 +269,11 @@ EXPORT_SYMBOL(balance_dirty_pages_rateli

void throttle_vm_writeout(void)
{
- struct writeback_state wbs;
long background_thresh;
long dirty_thresh;

for ( ; ; ) {
- get_dirty_limits(&wbs, &background_thresh, &dirty_thresh, NULL);
+ get_dirty_limits(&background_thresh, &dirty_thresh, NULL);

/*
* Boost the allowable dirty threshold a bit for page
@@ -304,7 +281,7 @@ void throttle_vm_writeout(void)
*/
dirty_thresh += dirty_thresh / 10; /* wheeee... */

- if (wbs.nr_unstable + wbs.nr_writeback <= dirty_thresh)
+ if (global_page_state(NR_UNSTABLE) + global_page_state(NR_WRITEBACK) <= dirty_thresh)
break;
blk_congestion_wait(WRITE, HZ/10);
}
@@ -327,12 +304,11 @@ static void background_writeout(unsigned
};

for ( ; ; ) {
- struct writeback_state wbs;
long background_thresh;
long dirty_thresh;

- get_dirty_limits(&wbs, &background_thresh, &dirty_thresh, NULL);
- if (wbs.nr_dirty + wbs.nr_unstable < background_thresh
+ get_dirty_limits(&background_thresh, &dirty_thresh, NULL);
+ if (global_page_state(NR_DIRTY) + global_page_state(NR_UNSTABLE) < background_thresh
&& min_pages <= 0)
break;
wbc.encountered_congestion = 0;
@@ -356,12 +332,8 @@ static void background_writeout(unsigned
*/
int wakeup_pdflush(long nr_pages)
{
- if (nr_pages == 0) {
- struct writeback_state wbs;
-
- get_writeback_state(&wbs);
- nr_pages = wbs.nr_dirty + wbs.nr_unstable;
- }
+ if (nr_pages == 0)
+ nr_pages = global_page_state(NR_DIRTY) + global_page_state(NR_UNSTABLE);
return pdflush_operation(background_writeout, nr_pages);
}

@@ -392,7 +364,6 @@ static void wb_kupdate(unsigned long arg
unsigned long start_jif;
unsigned long next_jif;
long nr_to_write;
- struct writeback_state wbs;
struct writeback_control wbc = {
.bdi = NULL,
.sync_mode = WB_SYNC_NONE,
@@ -404,11 +375,10 @@ static void wb_kupdate(unsigned long arg

sync_supers();

- get_writeback_state(&wbs);
oldest_jif = jiffies - (dirty_expire_centisecs * HZ) / 100;
start_jif = jiffies;
next_jif = start_jif + (dirty_writeback_centisecs * HZ) / 100;
- nr_to_write = wbs.nr_dirty + wbs.nr_unstable +
+ nr_to_write = global_page_state(NR_DIRTY) + global_page_state(NR_UNSTABLE) +
(inodes_stat.nr_inodes - inodes_stat.nr_unused);
while (nr_to_write > 0) {
wbc.encountered_congestion = 0;

2005-12-20 22:03:05

by Christoph Lameter

Subject: Zoned counters V1 [ 4/14]: Convert nr_mapped

Convert nr_mapped

nr_mapped is important because it allows us to determine how many pages
of a zone are not mapped, which allows a more efficient means of
determining when we need to reclaim memory in a zone.
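
For example, the number of unmapped pages in a zone could be estimated
as follows (a sketch only; it assumes zone->present_pages holds the
number of pages present in the zone):

	static unsigned long zone_unmapped_pages(struct zone *zone)
	{
		long unmapped = zone->present_pages -
					zone_page_state(zone, NR_MAPPED);

		return unmapped > 0 ? unmapped : 0;
	}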

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.15-rc5-mm3/drivers/base/node.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/drivers/base/node.c 2005-12-03 21:10:42.000000000 -0800
+++ linux-2.6.15-rc5-mm3/drivers/base/node.c 2005-12-20 12:57:42.000000000 -0800
@@ -43,18 +43,18 @@ static ssize_t node_read_meminfo(struct
unsigned long inactive;
unsigned long active;
unsigned long free;
+ unsigned long nr_mapped;

si_meminfo_node(&i, nid);
get_page_state_node(&ps, nid);
__get_zone_counts(&active, &inactive, &free, NODE_DATA(nid));
+ nr_mapped = node_page_state(nid, NR_MAPPED);

/* Check for negative values in these approximate counters */
if ((long)ps.nr_dirty < 0)
ps.nr_dirty = 0;
if ((long)ps.nr_writeback < 0)
ps.nr_writeback = 0;
- if ((long)ps.nr_mapped < 0)
- ps.nr_mapped = 0;
if ((long)ps.nr_slab < 0)
ps.nr_slab = 0;

@@ -83,7 +83,7 @@ static ssize_t node_read_meminfo(struct
nid, K(i.freeram - i.freehigh),
nid, K(ps.nr_dirty),
nid, K(ps.nr_writeback),
- nid, K(ps.nr_mapped),
+ nid, K(nr_mapped),
nid, K(ps.nr_slab));
n += hugetlb_report_node_meminfo(nid, buf + n);
return n;
Index: linux-2.6.15-rc5-mm3/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/fs/proc/proc_misc.c 2005-12-16 11:44:08.000000000 -0800
+++ linux-2.6.15-rc5-mm3/fs/proc/proc_misc.c 2005-12-20 12:57:42.000000000 -0800
@@ -190,7 +190,7 @@ static int meminfo_read_proc(char *page,
K(i.freeswap),
K(ps.nr_dirty),
K(ps.nr_writeback),
- K(ps.nr_mapped),
+ K(global_page_state(NR_MAPPED)),
K(ps.nr_slab),
K(allowed),
K(committed),
Index: linux-2.6.15-rc5-mm3/mm/vmscan.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/vmscan.c 2005-12-16 11:44:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/vmscan.c 2005-12-20 12:57:42.000000000 -0800
@@ -1195,7 +1195,7 @@ int try_to_free_pages(struct zone **zone
}

for (priority = DEF_PRIORITY; priority >= 0; priority--) {
- sc.nr_mapped = read_page_state(nr_mapped);
+ sc.nr_mapped = global_page_state(NR_MAPPED);
sc.nr_scanned = 0;
sc.nr_reclaimed = 0;
sc.priority = priority;
@@ -1283,7 +1283,7 @@ loop_again:
total_reclaimed = 0;
sc.gfp_mask = GFP_KERNEL;
sc.may_writepage = 0;
- sc.nr_mapped = read_page_state(nr_mapped);
+ sc.nr_mapped = global_page_state(NR_MAPPED);

inc_page_state(pageoutrun);

Index: linux-2.6.15-rc5-mm3/mm/page-writeback.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/page-writeback.c 2005-12-16 11:44:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/page-writeback.c 2005-12-20 12:57:42.000000000 -0800
@@ -111,7 +111,7 @@ static void get_writeback_state(struct w
{
wbs->nr_dirty = read_page_state(nr_dirty);
wbs->nr_unstable = read_page_state(nr_unstable);
- wbs->nr_mapped = read_page_state(nr_mapped);
+ wbs->nr_mapped = global_page_state(NR_MAPPED);
wbs->nr_writeback = read_page_state(nr_writeback);
}

Index: linux-2.6.15-rc5-mm3/mm/page_alloc.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/page_alloc.c 2005-12-20 12:57:39.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/page_alloc.c 2005-12-20 12:57:51.000000000 -0800
@@ -1789,7 +1789,7 @@ void show_free_areas(void)
ps.nr_unstable,
nr_free_pages(),
ps.nr_slab,
- ps.nr_mapped,
+ global_page_state(NR_MAPPED),
ps.nr_page_table_pages);

for_each_zone(zone) {
@@ -2674,13 +2674,13 @@ struct seq_operations zoneinfo_op = {

static char *vmstat_text[] = {
/* Zoned VM counters */
+ "nr_mapped",

/* Page state */
"nr_dirty",
"nr_writeback",
"nr_unstable",
"nr_page_table_pages",
- "nr_mapped",
"nr_slab",

"pgpgin",
Index: linux-2.6.15-rc5-mm3/mm/rmap.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/rmap.c 2005-12-16 11:44:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/rmap.c 2005-12-20 12:57:42.000000000 -0800
@@ -455,7 +455,7 @@ static void __page_set_anon_rmap(struct
* nr_mapped state can be updated without turning off
* interrupts because it is not modified via interrupt.
*/
- __inc_page_state(nr_mapped);
+ __inc_zone_page_state(page, NR_MAPPED);
}

/**
@@ -502,7 +502,7 @@ void page_add_file_rmap(struct page *pag
BUG_ON(!pfn_valid(page_to_pfn(page)));

if (atomic_inc_and_test(&page->_mapcount))
- __inc_page_state(nr_mapped);
+ __inc_zone_page_state(page, NR_MAPPED);
}

/**
@@ -526,7 +526,7 @@ void page_remove_rmap(struct page *page)
*/
if (page_test_and_clear_dirty(page))
set_page_dirty(page);
- __dec_page_state(nr_mapped);
+ __dec_zone_page_state(page, NR_MAPPED);
}
}

Index: linux-2.6.15-rc5-mm3/include/linux/mmzone.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/mmzone.h 2005-12-20 12:57:37.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/mmzone.h 2005-12-20 12:57:42.000000000 -0800
@@ -45,6 +45,9 @@ struct zone_padding {
#endif

enum zone_stat_item {
+ NR_MAPPED, /* mapped into pagetables.
+ only modified from process context */
+
NR_STAT_ITEMS };

#ifdef CONFIG_SMP
Index: linux-2.6.15-rc5-mm3/include/linux/page-flags.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/page-flags.h 2005-12-20 12:57:37.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/page-flags.h 2005-12-20 12:57:42.000000000 -0800
@@ -95,8 +95,6 @@ struct page_state {
unsigned long nr_writeback; /* Pages under writeback */
unsigned long nr_unstable; /* NFS unstable pages */
unsigned long nr_page_table_pages;/* Pages used for pagetables */
- unsigned long nr_mapped; /* mapped into pagetables.
- * only modified from process context */
unsigned long nr_slab; /* In slab */
#define GET_PAGE_STATE_LAST nr_slab

2005-12-20 22:03:54

by Christoph Lameter

Subject: Zoned counters V1 [13/14]: Remove get_page_state functions

Remove obsolete page_state related functions

We can remove all the get_page_state related functions now that all the
basic page state variables have been moved to the zone based scheme.

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.15-rc5-mm3/include/linux/page-flags.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/page-flags.h 2005-12-20 13:15:44.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/page-flags.h 2005-12-20 13:15:45.000000000 -0800
@@ -91,8 +91,6 @@
* In this case, the field should be commented here.
*/
struct page_state {
-#define GET_PAGE_STATE_LAST xxx
-
/*
* The below are zeroed by get_page_state(). Use get_full_page_state()
* to add up all these.
@@ -144,8 +142,6 @@ struct page_state {
unsigned long pgrotated; /* pages rotated to tail of the LRU */
};

-extern void get_page_state(struct page_state *ret);
-extern void get_page_state_node(struct page_state *ret, int node);
extern void get_full_page_state(struct page_state *ret);
extern unsigned long read_page_state_offset(unsigned long offset);
extern void mod_page_state_offset(unsigned long offset, unsigned long delta);
Index: linux-2.6.15-rc5-mm3/drivers/base/node.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/drivers/base/node.c 2005-12-20 13:15:44.000000000 -0800
+++ linux-2.6.15-rc5-mm3/drivers/base/node.c 2005-12-20 13:15:45.000000000 -0800
@@ -39,7 +39,6 @@ static ssize_t node_read_meminfo(struct
int n;
int nid = dev->id;
struct sysinfo i;
- struct page_state ps;
unsigned long inactive;
unsigned long active;
unsigned long free;
@@ -47,7 +46,6 @@ static ssize_t node_read_meminfo(struct
unsigned long nr[NR_STAT_ITEMS];

si_meminfo_node(&i, nid);
- get_page_state_node(&ps, nid);
__get_zone_counts(&active, &inactive, &free, NODE_DATA(nid));
for (j = 0; j < NR_STAT_ITEMS; j++)
nr[j] = node_page_state(nid, j);
Index: linux-2.6.15-rc5-mm3/arch/i386/mm/pgtable.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/arch/i386/mm/pgtable.c 2005-12-20 12:59:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/arch/i386/mm/pgtable.c 2005-12-20 13:15:45.000000000 -0800
@@ -30,7 +30,6 @@ void show_mem(void)
struct page *page;
pg_data_t *pgdat;
unsigned long i;
- struct page_state ps;
unsigned long flags;

printk(KERN_INFO "Mem-info:\n");
@@ -58,7 +57,6 @@ void show_mem(void)
printk(KERN_INFO "%d pages shared\n", shared);
printk(KERN_INFO "%d pages swap cached\n", cached);

- get_page_state(&ps);
printk(KERN_INFO "%lu pages dirty\n", global_page_state(NR_DIRTY));
printk(KERN_INFO "%lu pages writeback\n", global_page_state(NR_WRITEBACK));
printk(KERN_INFO "%lu pages mapped\n", ps.nr_mapped);
Index: linux-2.6.15-rc5-mm3/mm/page_alloc.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/page_alloc.c 2005-12-20 13:15:44.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/page_alloc.c 2005-12-20 13:15:45.000000000 -0800
@@ -1603,28 +1603,6 @@ static void __get_page_state(struct page
}
}

-void get_page_state_node(struct page_state *ret, int node)
-{
- int nr;
- cpumask_t mask = node_to_cpumask(node);
-
- nr = offsetof(struct page_state, GET_PAGE_STATE_LAST);
- nr /= sizeof(unsigned long);
-
- __get_page_state(ret, nr+1, &mask);
-}
-
-void get_page_state(struct page_state *ret)
-{
- int nr;
- cpumask_t mask = CPU_MASK_ALL;
-
- nr = offsetof(struct page_state, GET_PAGE_STATE_LAST);
- nr /= sizeof(unsigned long);
-
- __get_page_state(ret, nr + 1, &mask);
-}
-
void get_full_page_state(struct page_state *ret)
{
cpumask_t mask = CPU_MASK_ALL;
@@ -1740,7 +1718,6 @@ void si_meminfo_node(struct sysinfo *val
*/
void show_free_areas(void)
{
- struct page_state ps;
int cpu, temperature;
unsigned long active;
unsigned long inactive;
@@ -1772,7 +1749,6 @@ void show_free_areas(void)
}
}

- get_page_state(&ps);
get_zone_counts(&active, &inactive, &free);

printk("Free pages: %11ukB (%ukB HighMem)\n",
Index: linux-2.6.15-rc5-mm3/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/fs/proc/proc_misc.c 2005-12-20 12:59:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/fs/proc/proc_misc.c 2005-12-20 13:15:45.000000000 -0800
@@ -120,7 +120,6 @@ static int meminfo_read_proc(char *page,
{
struct sysinfo i;
int len;
- struct page_state ps;
unsigned long inactive;
unsigned long active;
unsigned long free;
@@ -129,7 +128,6 @@ static int meminfo_read_proc(char *page,
struct vmalloc_info vmi;
long cached;

- get_page_state(&ps);
get_zone_counts(&active, &inactive, &free);

/*

2005-12-20 22:03:16

by Christoph Lameter

Subject: Zoned counters V1 [11/14]: Convert nr_unstable

Per zone unstable pages

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.15-rc5-mm3/fs/fs-writeback.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/fs/fs-writeback.c 2005-12-20 12:58:54.000000000 -0800
+++ linux-2.6.15-rc5-mm3/fs/fs-writeback.c 2005-12-20 12:59:17.000000000 -0800
@@ -471,7 +471,7 @@ void sync_inodes_sb(struct super_block *
.sync_mode = wait ? WB_SYNC_ALL : WB_SYNC_HOLD,
};
unsigned long nr_dirty = global_page_state(NR_DIRTY);
- unsigned long nr_unstable = read_page_state(nr_unstable);
+ unsigned long nr_unstable = global_page_state(NR_UNSTABLE);

wbc.nr_to_write = nr_dirty + nr_unstable +
(inodes_stat.nr_inodes - inodes_stat.nr_unused) +
Index: linux-2.6.15-rc5-mm3/mm/page_alloc.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/page_alloc.c 2005-12-20 12:59:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/page_alloc.c 2005-12-20 12:59:40.000000000 -0800
@@ -598,7 +598,8 @@ static int rmqueue_bulk(struct zone *zon
}

char *stat_item_descr[NR_STAT_ITEMS] = {
- "mapped","pagecache", "slab", "pagetable", "dirty", "writeback"
+ "mapped","pagecache", "slab", "pagetable", "dirty", "writeback",
+ "unstable"
};

/*
@@ -1784,7 +1785,7 @@ void show_free_areas(void)
inactive,
global_page_state(NR_DIRTY),
global_page_state(NR_WRITEBACK),
- ps.nr_unstable,
+ global_page_state(NR_UNSTABLE),
nr_free_pages(),
global_page_state(NR_SLAB),
global_page_state(NR_MAPPED),
@@ -2683,10 +2684,9 @@ static char *vmstat_text[] = {
"nr_page_table_pages",
"nr_dirty",
"nr_writeback",
-
- /* Page state */
"nr_unstable",

+ /* Page state */
"pgpgin",
"pgpgout",
"pswpin",
Index: linux-2.6.15-rc5-mm3/fs/nfs/write.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/fs/nfs/write.c 2005-12-20 12:58:54.000000000 -0800
+++ linux-2.6.15-rc5-mm3/fs/nfs/write.c 2005-12-20 12:59:17.000000000 -0800
@@ -489,7 +489,7 @@ nfs_mark_request_commit(struct nfs_page
nfs_list_add_request(req, &nfsi->commit);
nfsi->ncommit++;
spin_unlock(&nfsi->req_lock);
- inc_page_state(nr_unstable);
+ inc_zone_page_state(req->wb_page, NR_UNSTABLE);
mark_inode_dirty(inode);
}
#endif
@@ -1287,7 +1287,6 @@ void nfs_commit_done(struct rpc_task *ta
{
struct nfs_write_data *data = calldata;
struct nfs_page *req;
- int res = 0;

dprintk("NFS: %4d nfs_commit_done (status %d)\n",
task->tk_pid, task->tk_status);
@@ -1321,9 +1320,8 @@ void nfs_commit_done(struct rpc_task *ta
nfs_mark_request_dirty(req);
next:
nfs_clear_page_writeback(req);
- res++;
+ dec_zone_page_state(req->wb_page, NR_UNSTABLE);
}
- sub_page_state(nr_unstable,res);
}
#endif

Index: linux-2.6.15-rc5-mm3/include/linux/page-flags.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/page-flags.h 2005-12-20 12:59:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/page-flags.h 2005-12-20 12:59:17.000000000 -0800
@@ -91,8 +91,7 @@
* In this case, the field should be commented here.
*/
struct page_state {
- unsigned long nr_unstable; /* NFS unstable pages */
-#define GET_PAGE_STATE_LAST nr_unstable
+#define GET_PAGE_STATE_LAST xxx

/*
* The below are zeroed by get_page_state(). Use get_full_page_state()
Index: linux-2.6.15-rc5-mm3/mm/page-writeback.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/page-writeback.c 2005-12-20 12:59:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/page-writeback.c 2005-12-20 12:59:17.000000000 -0800
@@ -110,7 +110,7 @@ struct writeback_state
static void get_writeback_state(struct writeback_state *wbs)
{
wbs->nr_dirty = global_page_state(NR_DIRTY);
- wbs->nr_unstable = read_page_state(nr_unstable);
+ wbs->nr_unstable = global_page_state(NR_UNSTABLE);
wbs->nr_mapped = global_page_state(NR_MAPPED);
wbs->nr_writeback = global_page_state(NR_WRITEBACK);
}
Index: linux-2.6.15-rc5-mm3/include/linux/mmzone.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/mmzone.h 2005-12-20 12:59:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/mmzone.h 2005-12-20 12:59:17.000000000 -0800
@@ -52,6 +52,7 @@ enum zone_stat_item {
NR_PAGETABLE, /* used for pagetables */
NR_DIRTY,
NR_WRITEBACK,
+ NR_UNSTABLE, /* NFS unstable pages */
NR_STAT_ITEMS };

#ifdef CONFIG_SMP
Index: linux-2.6.15-rc5-mm3/drivers/base/node.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/drivers/base/node.c 2005-12-20 12:59:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/drivers/base/node.c 2005-12-20 12:59:17.000000000 -0800
@@ -65,6 +65,7 @@ static ssize_t node_read_meminfo(struct
"Node %d LowFree: %8lu kB\n"
"Node %d Dirty: %8lu kB\n"
"Node %d Writeback: %8lu kB\n"
+ "Node %d Unstable: %8lu kB\n"
"Node %d Mapped: %8lu kB\n"
"Node %d Pagecache: %8lu kB\n"
"Node %d Slab: %8lu kB\n",
@@ -79,6 +80,7 @@ static ssize_t node_read_meminfo(struct
nid, K(i.freeram - i.freehigh),
nid, K(nr[NR_DIRTY]),
nid, K(nr[NR_WRITEBACK]),
+ nid, K(nr[NR_UNSTABLE]),
nid, K(nr[NR_MAPPED]),
nid, K(nr[NR_PAGECACHE]),
nid, K(nr[NR_SLAB]));

2005-12-20 22:03:54

by Christoph Lameter

[permalink] [raw]
Subject: Zoned counters V1 [10/14]: Convert nr_writeback

Convert nr_writeback

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.15-rc5-mm3/drivers/base/node.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/drivers/base/node.c 2005-12-20 12:58:54.000000000 -0800
+++ linux-2.6.15-rc5-mm3/drivers/base/node.c 2005-12-20 12:59:09.000000000 -0800
@@ -52,9 +52,6 @@ static ssize_t node_read_meminfo(struct
for (j = 0; j < NR_STAT_ITEMS; j++)
nr[j] = node_page_state(nid, j);

- /* Check for negative values in these approximate counters */
- if ((long)ps.nr_writeback < 0)
- ps.nr_writeback = 0;

n = sprintf(buf, "\n"
"Node %d MemTotal: %8lu kB\n"
@@ -81,7 +78,7 @@ static ssize_t node_read_meminfo(struct
nid, K(i.totalram - i.totalhigh),
nid, K(i.freeram - i.freehigh),
nid, K(nr[NR_DIRTY]),
- nid, K(ps.nr_writeback),
+ nid, K(nr[NR_WRITEBACK]),
nid, K(nr[NR_MAPPED]),
nid, K(nr[NR_PAGECACHE]),
nid, K(nr[NR_SLAB]));
Index: linux-2.6.15-rc5-mm3/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/fs/proc/proc_misc.c 2005-12-20 12:58:54.000000000 -0800
+++ linux-2.6.15-rc5-mm3/fs/proc/proc_misc.c 2005-12-20 12:59:09.000000000 -0800
@@ -189,7 +189,7 @@ static int meminfo_read_proc(char *page,
K(i.totalswap),
K(i.freeswap),
K(global_page_state(NR_DIRTY)),
- K(ps.nr_writeback),
+ K(global_page_state(NR_WRITEBACK)),
K(global_page_state(NR_MAPPED)),
K(global_page_state(NR_SLAB)),
K(allowed),
Index: linux-2.6.15-rc5-mm3/arch/i386/mm/pgtable.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/arch/i386/mm/pgtable.c 2005-12-20 12:58:54.000000000 -0800
+++ linux-2.6.15-rc5-mm3/arch/i386/mm/pgtable.c 2005-12-20 12:59:09.000000000 -0800
@@ -60,7 +60,7 @@ void show_mem(void)

get_page_state(&ps);
printk(KERN_INFO "%lu pages dirty\n", global_page_state(NR_DIRTY));
- printk(KERN_INFO "%lu pages writeback\n", ps.nr_writeback);
+ printk(KERN_INFO "%lu pages writeback\n", global_page_state(NR_WRITEBACK));
printk(KERN_INFO "%lu pages mapped\n", ps.nr_mapped);
printk(KERN_INFO "%lu pages slab\n", ps.nr_slab);
printk(KERN_INFO "%lu pages pagetables\n", ps.nr_page_table_pages);
Index: linux-2.6.15-rc5-mm3/mm/page_alloc.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/page_alloc.c 2005-12-20 12:58:54.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/page_alloc.c 2005-12-20 12:59:09.000000000 -0800
@@ -598,7 +598,7 @@ static int rmqueue_bulk(struct zone *zon
}

char *stat_item_descr[NR_STAT_ITEMS] = {
- "mapped","pagecache", "slab", "pagetable", "dirty"
+ "mapped","pagecache", "slab", "pagetable", "dirty", "writeback"
};

/*
@@ -1783,7 +1783,7 @@ void show_free_areas(void)
active,
inactive,
global_page_state(NR_DIRTY),
- ps.nr_writeback,
+ global_page_state(NR_WRITEBACK),
ps.nr_unstable,
nr_free_pages(),
global_page_state(NR_SLAB),
@@ -2682,9 +2682,9 @@ static char *vmstat_text[] = {
"nr_slab",
"nr_page_table_pages",
"nr_dirty",
+ "nr_writeback",

/* Page state */
- "nr_writeback",
"nr_unstable",

"pgpgin",
Index: linux-2.6.15-rc5-mm3/include/linux/page-flags.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/page-flags.h 2005-12-20 12:58:54.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/page-flags.h 2005-12-20 12:59:09.000000000 -0800
@@ -91,7 +91,6 @@
* In this case, the field should be commented here.
*/
struct page_state {
- unsigned long nr_writeback; /* Pages under writeback */
unsigned long nr_unstable; /* NFS unstable pages */
#define GET_PAGE_STATE_LAST nr_unstable

@@ -322,7 +321,7 @@ void dec_zone_page_state(const struct pa
do { \
if (!test_and_set_bit(PG_writeback, \
&(page)->flags)) \
- inc_page_state(nr_writeback); \
+ inc_zone_page_state(page, NR_WRITEBACK); \
} while (0)
#define TestSetPageWriteback(page) \
({ \
@@ -330,14 +329,14 @@ void dec_zone_page_state(const struct pa
ret = test_and_set_bit(PG_writeback, \
&(page)->flags); \
if (!ret) \
- inc_page_state(nr_writeback); \
+ inc_zone_page_state(page, NR_WRITEBACK); \
ret; \
})
#define ClearPageWriteback(page) \
do { \
if (test_and_clear_bit(PG_writeback, \
&(page)->flags)) \
- dec_page_state(nr_writeback); \
+ dec_zone_page_state(page, NR_WRITEBACK); \
} while (0)
#define TestClearPageWriteback(page) \
({ \
@@ -345,7 +344,7 @@ void dec_zone_page_state(const struct pa
ret = test_and_clear_bit(PG_writeback, \
&(page)->flags); \
if (ret) \
- dec_page_state(nr_writeback); \
+ dec_zone_page_state(page, NR_WRITEBACK); \
ret; \
})

Index: linux-2.6.15-rc5-mm3/mm/page-writeback.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/page-writeback.c 2005-12-20 12:58:54.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/page-writeback.c 2005-12-20 12:59:09.000000000 -0800
@@ -112,7 +112,7 @@ static void get_writeback_state(struct w
wbs->nr_dirty = global_page_state(NR_DIRTY);
wbs->nr_unstable = read_page_state(nr_unstable);
wbs->nr_mapped = global_page_state(NR_MAPPED);
- wbs->nr_writeback = read_page_state(nr_writeback);
+ wbs->nr_writeback = global_page_state(NR_WRITEBACK);
}

/*
Index: linux-2.6.15-rc5-mm3/include/linux/mmzone.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/mmzone.h 2005-12-20 12:58:54.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/mmzone.h 2005-12-20 12:59:09.000000000 -0800
@@ -51,6 +51,7 @@ enum zone_stat_item {
NR_SLAB, /* used by slab allocator */
NR_PAGETABLE, /* used for pagetables */
NR_DIRTY,
+ NR_WRITEBACK,
NR_STAT_ITEMS };

#ifdef CONFIG_SMP

2005-12-20 22:05:11

by Christoph Lameter

[permalink] [raw]
Subject: Zoned counters V1 [12/14]: Convert nr_bounce

Per zone unstable pages

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.15-rc5-mm3/fs/fs-writeback.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/fs/fs-writeback.c 2005-12-20 12:58:54.000000000 -0800
+++ linux-2.6.15-rc5-mm3/fs/fs-writeback.c 2005-12-20 12:59:17.000000000 -0800
@@ -471,7 +471,7 @@ void sync_inodes_sb(struct super_block *
.sync_mode = wait ? WB_SYNC_ALL : WB_SYNC_HOLD,
};
unsigned long nr_dirty = global_page_state(NR_DIRTY);
- unsigned long nr_unstable = read_page_state(nr_unstable);
+ unsigned long nr_unstable = global_page_state(NR_UNSTABLE);

wbc.nr_to_write = nr_dirty + nr_unstable +
(inodes_stat.nr_inodes - inodes_stat.nr_unused) +

2005-12-20 22:05:11

by Christoph Lameter

[permalink] [raw]
Subject: Zoned counters V1 [ 5/14]: Convert nr_pagecache

Convert nr_pagecache

Currently a single atomic variable is used to track the size of the page
cache across the whole machine. The zoned VM counters use the same deferred
update method as the nr_pagecache code but also allow the pagecache size
to be determined per zone.

Remove the special implementation for nr_pagecache and make it a zoned
counter.

Updates of the page cache counters are always performed with interrupts off.
We can therefore use the __ variant here.
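
As an aside, the deferred update scheme that replaces pagecache_acct() can
be illustrated with a small self-contained userspace sketch. This is only
an illustration of the idea: the threshold, the names and the per-thread
variable are made up here, while the kernel keeps a small per-cpu,
per-zone differential instead:

	#include <stdatomic.h>
	#include <stdio.h>

	#define STAT_THRESHOLD 32		/* made-up spill threshold */

	static atomic_long global_pagecache;	/* stand-in for the global array slot */
	static _Thread_local signed char local_diff; /* stand-in for the per-cpu delta */

	/* Accumulate small deltas locally and fold them into the global
	 * counter only when the threshold is exceeded. */
	static void mod_pagecache(int delta)
	{
		local_diff += delta;
		if (local_diff > STAT_THRESHOLD || local_diff < -STAT_THRESHOLD) {
			atomic_fetch_add(&global_pagecache, (long)local_diff);
			local_diff = 0;
		}
	}

	int main(void)
	{
		for (int i = 0; i < 1000; i++)
			mod_pagecache(1);
		/* Reads are approximate: a small residue may still be local. */
		printf("global: %ld, local residue: %d\n",
		       (long)atomic_load(&global_pagecache), (int)local_diff);
		return 0;
	}

Reads of such a counter are cheap (an array lookup) at the cost of a bounded
inaccuracy, which is why the old get_page_cache_size() clamped negative
values before returning them.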

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.15-rc5-mm3/include/linux/pagemap.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/pagemap.h 2005-12-16 11:44:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/pagemap.h 2005-12-20 12:23:47.000000000 -0800
@@ -99,51 +99,6 @@ int add_to_page_cache_lru(struct page *p
extern void remove_from_page_cache(struct page *page);
extern void __remove_from_page_cache(struct page *page);

-extern atomic_t nr_pagecache;
-
-#ifdef CONFIG_SMP
-
-#define PAGECACHE_ACCT_THRESHOLD max(16, NR_CPUS * 2)
-DECLARE_PER_CPU(long, nr_pagecache_local);
-
-/*
- * pagecache_acct implements approximate accounting for pagecache.
- * vm_enough_memory() do not need high accuracy. Writers will keep
- * an offset in their per-cpu arena and will spill that into the
- * global count whenever the absolute value of the local count
- * exceeds the counter's threshold.
- *
- * MUST be protected from preemption.
- * current protection is mapping->page_lock.
- */
-static inline void pagecache_acct(int count)
-{
- long *local;
-
- local = &__get_cpu_var(nr_pagecache_local);
- *local += count;
- if (*local > PAGECACHE_ACCT_THRESHOLD || *local < -PAGECACHE_ACCT_THRESHOLD) {
- atomic_add(*local, &nr_pagecache);
- *local = 0;
- }
-}
-
-#else
-
-static inline void pagecache_acct(int count)
-{
- atomic_add(count, &nr_pagecache);
-}
-#endif
-
-static inline unsigned long get_page_cache_size(void)
-{
- int ret = atomic_read(&nr_pagecache);
- if (unlikely(ret < 0))
- ret = 0;
- return ret;
-}
-
/*
* Return byte-offset into filesystem object for page.
*/
Index: linux-2.6.15-rc5-mm3/mm/swap_state.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/swap_state.c 2005-12-16 11:44:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/swap_state.c 2005-12-20 12:23:47.000000000 -0800
@@ -85,7 +85,7 @@ static int __add_to_swap_cache(struct pa
SetPageSwapCache(page);
set_page_private(page, entry.val);
total_swapcache_pages++;
- pagecache_acct(1);
+ __inc_zone_page_state(page, NR_PAGECACHE);
}
write_unlock_irq(&swapper_space.tree_lock);
radix_tree_preload_end();
@@ -130,7 +130,7 @@ void __delete_from_swap_cache(struct pag
set_page_private(page, 0);
ClearPageSwapCache(page);
total_swapcache_pages--;
- pagecache_acct(-1);
+ __dec_zone_page_state(page, NR_PAGECACHE);
INC_CACHE_INFO(del_total);
}

Index: linux-2.6.15-rc5-mm3/mm/filemap.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/filemap.c 2005-12-16 11:44:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/filemap.c 2005-12-20 12:23:47.000000000 -0800
@@ -115,7 +115,7 @@ void __remove_from_page_cache(struct pag
radix_tree_delete(&mapping->page_tree, page->index);
page->mapping = NULL;
mapping->nrpages--;
- pagecache_acct(-1);
+ __dec_zone_page_state(page, NR_PAGECACHE);
}
EXPORT_SYMBOL(__remove_from_page_cache);

@@ -406,7 +406,7 @@ int add_to_page_cache(struct page *page,
page->mapping = mapping;
page->index = offset;
mapping->nrpages++;
- pagecache_acct(1);
+ __inc_zone_page_state(page, NR_PAGECACHE);
}
write_unlock_irq(&mapping->tree_lock);
radix_tree_preload_end();
Index: linux-2.6.15-rc5-mm3/mm/page_alloc.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/page_alloc.c 2005-12-20 12:19:28.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/page_alloc.c 2005-12-20 12:23:47.000000000 -0800
@@ -1575,12 +1575,6 @@ static void show_node(struct zone *zone)
*/
static DEFINE_PER_CPU(struct page_state, page_states) = {0};

-atomic_t nr_pagecache = ATOMIC_INIT(0);
-EXPORT_SYMBOL(nr_pagecache);
-#ifdef CONFIG_SMP
-DEFINE_PER_CPU(long, nr_pagecache_local) = 0;
-#endif
-
static void __get_page_state(struct page_state *ret, int nr, cpumask_t *cpumask)
{
int cpu = 0;
@@ -2675,6 +2669,7 @@ struct seq_operations zoneinfo_op = {
static char *vmstat_text[] = {
/* Zoned VM counters */
"nr_mapped",
+ "nr_pagecache",

/* Page state */
"nr_dirty",
Index: linux-2.6.15-rc5-mm3/mm/mmap.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/mmap.c 2005-12-03 21:10:42.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/mmap.c 2005-12-20 12:23:47.000000000 -0800
@@ -95,7 +95,7 @@ int __vm_enough_memory(long pages, int c
if (sysctl_overcommit_memory == OVERCOMMIT_GUESS) {
unsigned long n;

- free = get_page_cache_size();
+ free = global_page_state(NR_PAGECACHE);
free += nr_swap_pages;

/*
Index: linux-2.6.15-rc5-mm3/mm/nommu.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/nommu.c 2005-12-16 11:44:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/nommu.c 2005-12-20 12:23:47.000000000 -0800
@@ -1114,7 +1114,7 @@ int __vm_enough_memory(long pages, int c
if (sysctl_overcommit_memory == OVERCOMMIT_GUESS) {
unsigned long n;

- free = get_page_cache_size();
+ free = global_page_state(NR_PAGECACHE);
free += nr_swap_pages;

/*
Index: linux-2.6.15-rc5-mm3/arch/sparc64/kernel/sys_sunos32.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/arch/sparc64/kernel/sys_sunos32.c 2005-12-03 21:10:42.000000000 -0800
+++ linux-2.6.15-rc5-mm3/arch/sparc64/kernel/sys_sunos32.c 2005-12-20 12:23:47.000000000 -0800
@@ -154,7 +154,7 @@ asmlinkage int sunos_brk(u32 baddr)
* simple, it hopefully works in most obvious cases.. Easy to
* fool it, but this should catch most mistakes.
*/
- freepages = get_page_cache_size();
+ freepages = global_page_state(NR_PAGECACHE);
freepages >>= 1;
freepages += nr_free_pages();
freepages += nr_swap_pages;
Index: linux-2.6.15-rc5-mm3/arch/sparc/kernel/sys_sunos.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/arch/sparc/kernel/sys_sunos.c 2005-12-03 21:10:42.000000000 -0800
+++ linux-2.6.15-rc5-mm3/arch/sparc/kernel/sys_sunos.c 2005-12-20 12:23:47.000000000 -0800
@@ -195,7 +195,7 @@ asmlinkage int sunos_brk(unsigned long b
* simple, it hopefully works in most obvious cases.. Easy to
* fool it, but this should catch most mistakes.
*/
- freepages = get_page_cache_size();
+ freepages = global_page_state(NR_PAGECACHE);
freepages >>= 1;
freepages += nr_free_pages();
freepages += nr_swap_pages;
Index: linux-2.6.15-rc5-mm3/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/fs/proc/proc_misc.c 2005-12-20 12:19:10.000000000 -0800
+++ linux-2.6.15-rc5-mm3/fs/proc/proc_misc.c 2005-12-20 12:23:47.000000000 -0800
@@ -142,7 +142,7 @@ static int meminfo_read_proc(char *page,
allowed = ((totalram_pages - hugetlb_total_pages())
* sysctl_overcommit_ratio / 100) + total_swap_pages;

- cached = get_page_cache_size() - total_swapcache_pages - i.bufferram;
+ cached = global_page_state(NR_PAGECACHE) - total_swapcache_pages - i.bufferram;
if (cached < 0)
cached = 0;

Index: linux-2.6.15-rc5-mm3/include/linux/mmzone.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/mmzone.h 2005-12-20 12:23:14.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/mmzone.h 2005-12-20 12:26:32.000000000 -0800
@@ -47,7 +47,7 @@ struct zone_padding {
enum zone_stat_item {
NR_MAPPED, /* mapped into pagetables.
only modified from process context */
-
+ NR_PAGECACHE, /* file backed pages */
NR_STAT_ITEMS };

#ifdef CONFIG_SMP

2005-12-20 22:06:48

by Christoph Lameter

[permalink] [raw]
Subject: Zoned counters V1 [ 8/14]: Convert nr_page_table

Convert nr_page_table_pages

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.15-rc5-mm3/mm/memory.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/memory.c 2005-12-16 11:44:09.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/memory.c 2005-12-20 12:58:31.000000000 -0800
@@ -116,7 +116,7 @@ static void free_pte_range(struct mmu_ga
pmd_clear(pmd);
pte_lock_deinit(page);
pte_free_tlb(tlb, page);
- dec_page_state(nr_page_table_pages);
+ dec_zone_page_state(page, NR_PAGETABLE);
tlb->mm->nr_ptes--;
}

@@ -302,7 +302,7 @@ int __pte_alloc(struct mm_struct *mm, pm
pte_free(new);
} else {
mm->nr_ptes++;
- inc_page_state(nr_page_table_pages);
+ inc_zone_page_state(new, NR_PAGETABLE);
pmd_populate(mm, pmd, new);
}
spin_unlock(&mm->page_table_lock);
Index: linux-2.6.15-rc5-mm3/mm/page_alloc.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/page_alloc.c 2005-12-20 12:58:17.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/page_alloc.c 2005-12-20 12:58:47.000000000 -0800
@@ -597,7 +597,7 @@ static int rmqueue_bulk(struct zone *zon
return i;
}

-char *stat_item_descr[NR_STAT_ITEMS] = { "mapped","pagecache", "slab" };
+char *stat_item_descr[NR_STAT_ITEMS] = { "mapped","pagecache", "slab", "pagetable" };

/*
* Manage combined zone based / global counters
@@ -1786,7 +1786,7 @@ void show_free_areas(void)
nr_free_pages(),
global_page_state(NR_SLAB),
global_page_state(NR_MAPPED),
- ps.nr_page_table_pages);
+ global_page_state(NR_PAGETABLE));

for_each_zone(zone) {
int i;
@@ -2678,12 +2678,12 @@ static char *vmstat_text[] = {
"nr_mapped",
"nr_pagecache",
"nr_slab",
+ "nr_page_table_pages",

/* Page state */
"nr_dirty",
"nr_writeback",
"nr_unstable",
- "nr_page_table_pages",

"pgpgin",
"pgpgout",
Index: linux-2.6.15-rc5-mm3/include/linux/page-flags.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/page-flags.h 2005-12-20 12:58:02.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/page-flags.h 2005-12-20 12:58:31.000000000 -0800
@@ -94,8 +94,7 @@ struct page_state {
unsigned long nr_dirty; /* Dirty writeable pages */
unsigned long nr_writeback; /* Pages under writeback */
unsigned long nr_unstable; /* NFS unstable pages */
- unsigned long nr_page_table_pages;/* Pages used for pagetables */
-#define GET_PAGE_STATE_LAST nr_page_table_pages
+#define GET_PAGE_STATE_LAST nr_unstable

/*
* The below are zeroed by get_page_state(). Use get_full_page_state()
Index: linux-2.6.15-rc5-mm3/include/linux/mmzone.h
===================================================================
--- linux-2.6.15-rc5-mm3.orig/include/linux/mmzone.h 2005-12-20 12:58:02.000000000 -0800
+++ linux-2.6.15-rc5-mm3/include/linux/mmzone.h 2005-12-20 12:58:31.000000000 -0800
@@ -49,6 +49,7 @@ enum zone_stat_item {
only modified from process context */
NR_PAGECACHE, /* file backed pages */
NR_SLAB, /* used by slab allocator */
+ NR_PAGETABLE, /* used for pagetables */
NR_STAT_ITEMS };

#ifdef CONFIG_SMP
Index: linux-2.6.15-rc5-mm3/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/fs/proc/proc_misc.c 2005-12-20 12:58:02.000000000 -0800
+++ linux-2.6.15-rc5-mm3/fs/proc/proc_misc.c 2005-12-20 12:58:31.000000000 -0800
@@ -194,7 +194,7 @@ static int meminfo_read_proc(char *page,
K(global_page_state(NR_SLAB)),
K(allowed),
K(committed),
- K(ps.nr_page_table_pages),
+ K(global_page_state(NR_PAGETABLE)),
(unsigned long)VMALLOC_TOTAL >> 10,
vmi.used >> 10,
vmi.largest_chunk >> 10
Index: linux-2.6.15-rc5-mm3/drivers/base/node.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/drivers/base/node.c 2005-12-20 12:58:02.000000000 -0800
+++ linux-2.6.15-rc5-mm3/drivers/base/node.c 2005-12-20 12:58:31.000000000 -0800
@@ -57,8 +57,6 @@ static ssize_t node_read_meminfo(struct
ps.nr_dirty = 0;
if ((long)ps.nr_writeback < 0)
ps.nr_writeback = 0;
- if ((long)ps.nr_slab < 0)
- ps.nr_slab = 0;

n = sprintf(buf, "\n"
"Node %d MemTotal: %8lu kB\n"

2005-12-20 22:02:37

by Christoph Lameter

[permalink] [raw]
Subject: Zoned counters V1 [ 6/14]: Expanded node and zone statistics

- Extend zone, node and global statistics by printing all counters from
the vmstats arrays.

- Provide an array describing zoned VM counters (sample output sketched below)
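
With the descriptions in place, each zone's section of /proc/zoneinfo gains
one line per counter. For the two items converted so far the extra lines
would look roughly as follows (the values are invented for illustration):

	mapped   3462
	pagecache 21504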

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.15-rc5-mm3/drivers/base/node.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/drivers/base/node.c 2005-12-20 12:19:10.000000000 -0800
+++ linux-2.6.15-rc5-mm3/drivers/base/node.c 2005-12-20 12:28:48.000000000 -0800
@@ -43,12 +43,14 @@ static ssize_t node_read_meminfo(struct
unsigned long inactive;
unsigned long active;
unsigned long free;
- unsigned long nr_mapped;
+ int j;
+ unsigned long nr[NR_STAT_ITEMS];

si_meminfo_node(&i, nid);
get_page_state_node(&ps, nid);
__get_zone_counts(&active, &inactive, &free, NODE_DATA(nid));
- nr_mapped = node_page_state(nid, NR_MAPPED);
+ for (j = 0; j < NR_STAT_ITEMS; j++)
+ nr[j] = node_page_state(nid, j);

/* Check for negative values in these approximate counters */
if ((long)ps.nr_dirty < 0)
@@ -71,6 +73,7 @@ static ssize_t node_read_meminfo(struct
"Node %d Dirty: %8lu kB\n"
"Node %d Writeback: %8lu kB\n"
"Node %d Mapped: %8lu kB\n"
+ "Node %d Pagecache: %8lu kB\n"
"Node %d Slab: %8lu kB\n",
nid, K(i.totalram),
nid, K(i.freeram),
@@ -83,7 +86,8 @@ static ssize_t node_read_meminfo(struct
nid, K(i.freeram - i.freehigh),
nid, K(ps.nr_dirty),
nid, K(ps.nr_writeback),
- nid, K(nr_mapped),
+ nid, K(nr[NR_MAPPED]),
+ nid, K(nr[NR_PAGECACHE]),
nid, K(ps.nr_slab));
n += hugetlb_report_node_meminfo(nid, buf + n);
return n;
Index: linux-2.6.15-rc5-mm3/mm/page_alloc.c
===================================================================
--- linux-2.6.15-rc5-mm3.orig/mm/page_alloc.c 2005-12-20 12:23:47.000000000 -0800
+++ linux-2.6.15-rc5-mm3/mm/page_alloc.c 2005-12-20 12:28:48.000000000 -0800
@@ -597,6 +597,8 @@ static int rmqueue_bulk(struct zone *zon
return i;
}

+char *stat_item_descr[NR_STAT_ITEMS] = { "mapped","pagecache" };
+
/*
* Manage combined zone based / global counters
*/
@@ -2597,6 +2599,11 @@ static int zoneinfo_show(struct seq_file
zone->nr_scan_active, zone->nr_scan_inactive,
zone->spanned_pages,
zone->present_pages);
+ for(i = 0; i < NR_STAT_ITEMS; i++)
+ seq_printf(m, "\n %-8s %lu",
+ stat_item_descr[i],
+ zone_page_state(zone, i));
+
seq_printf(m,
"\n protection: (%lu",
zone->lowmem_reserve[0]);

2005-12-23 11:57:11

by Coywolf Qi Hunt

[permalink] [raw]
Subject: Re: Zoned counters V1 [14/14]: Remove wbs

2005/12/21, Christoph Lameter <[email protected]>:
> Remove writeback state
>
> We can remove some functions now that were needed to calculate the page
> state for writeback control since these statistics are now directly
> available.
>
> Signed-off-by: Christoph Lameter <[email protected]>
>
> Index: linux-2.6.15-rc5-mm3/mm/page-writeback.c
> ===================================================================
> --- linux-2.6.15-rc5-mm3.orig/mm/page-writeback.c 2005-12-20 12:59:17.000000000 -0800
> +++ linux-2.6.15-rc5-mm3/mm/page-writeback.c 2005-12-20 13:16:34.000000000 -0800
> @@ -99,22 +99,6 @@ EXPORT_SYMBOL(laptop_mode);
>
> static void background_writeout(unsigned long _min_pages);
>
> -struct writeback_state
> -{
> - unsigned long nr_dirty;
> - unsigned long nr_unstable;
> - unsigned long nr_mapped;
> - unsigned long nr_writeback;
> -};
> -
> -static void get_writeback_state(struct writeback_state *wbs)
> -{
> - wbs->nr_dirty = global_page_state(NR_DIRTY);
> - wbs->nr_unstable = global_page_state(NR_UNSTABLE);
> - wbs->nr_mapped = global_page_state(NR_MAPPED);
> - wbs->nr_writeback = global_page_state(NR_WRITEBACK);
> -}
> -
> /*
> * Work out the current dirty-memory clamping and background writeout
> * thresholds.
> @@ -133,8 +117,7 @@ static void get_writeback_state(struct w
> * clamping level.
> */
> static void
> -get_dirty_limits(struct writeback_state *wbs, long *pbackground, long *pdirty,
> - struct address_space *mapping)
> +get_dirty_limits(long *pbackground, long *pdirty, struct address_space *mapping)

Maybe get rid of the odd Hungarian naming too.

-- Coywolf

> {
> int background_ratio; /* Percentages */
> int dirty_ratio;
> @@ -144,8 +127,6 @@ get_dirty_limits(struct writeback_state
> unsigned long available_memory = total_pages;
> struct task_struct *tsk;
>
> - get_writeback_state(wbs);
> -
> #ifdef CONFIG_HIGHMEM
> /*
> * If this mapping can only allocate from low memory,
> @@ -156,7 +137,7 @@ get_dirty_limits(struct writeback_state
> #endif
>
>
> - unmapped_ratio = 100 - (wbs->nr_mapped * 100) / total_pages;
> + unmapped_ratio = 100 - (global_page_state(NR_MAPPED) * 100) / total_pages;
>
> dirty_ratio = vm_dirty_ratio;
> if (dirty_ratio > unmapped_ratio / 2)
> @@ -189,7 +170,6 @@ get_dirty_limits(struct writeback_state
> */
> static void balance_dirty_pages(struct address_space *mapping)
> {
> - struct writeback_state wbs;
> long nr_reclaimable;
> long background_thresh;
> long dirty_thresh;
> @@ -206,10 +186,9 @@ static void balance_dirty_pages(struct a
> .nr_to_write = write_chunk,
> };
>
> - get_dirty_limits(&wbs, &background_thresh,
> - &dirty_thresh, mapping);
> - nr_reclaimable = wbs.nr_dirty + wbs.nr_unstable;
> - if (nr_reclaimable + wbs.nr_writeback <= dirty_thresh)
> + get_dirty_limits(&background_thresh, &dirty_thresh, mapping);
> + nr_reclaimable = global_page_state(NR_DIRTY) + global_page_state(NR_UNSTABLE);
> + if (nr_reclaimable + global_page_state(NR_WRITEBACK) <= dirty_thresh)
> break;
>
> dirty_exceeded = 1;
> @@ -222,10 +201,9 @@ static void balance_dirty_pages(struct a
> */
> if (nr_reclaimable) {
> writeback_inodes(&wbc);
> - get_dirty_limits(&wbs, &background_thresh,
> - &dirty_thresh, mapping);
> - nr_reclaimable = wbs.nr_dirty + wbs.nr_unstable;
> - if (nr_reclaimable + wbs.nr_writeback <= dirty_thresh)
> + get_dirty_limits(&background_thresh, &dirty_thresh, mapping);
> + nr_reclaimable = global_page_state(NR_DIRTY) + global_page_state(NR_UNSTABLE);
> + if (nr_reclaimable + global_page_state(NR_WRITEBACK) <= dirty_thresh)
> break;
> pages_written += write_chunk - wbc.nr_to_write;
> if (pages_written >= write_chunk)
> @@ -234,7 +212,7 @@ static void balance_dirty_pages(struct a
> blk_congestion_wait(WRITE, HZ/10);
> }
>
> - if (nr_reclaimable + wbs.nr_writeback <= dirty_thresh)
> + if (nr_reclaimable + global_page_state(NR_WRITEBACK) <= dirty_thresh)
> dirty_exceeded = 0;
>
> if (writeback_in_progress(bdi))
> @@ -291,12 +269,11 @@ EXPORT_SYMBOL(balance_dirty_pages_rateli
>
> void throttle_vm_writeout(void)
> {
> - struct writeback_state wbs;
> long background_thresh;
> long dirty_thresh;
>
> for ( ; ; ) {
> - get_dirty_limits(&wbs, &background_thresh, &dirty_thresh, NULL);
> + get_dirty_limits(&background_thresh, &dirty_thresh, NULL);
>
> /*
> * Boost the allowable dirty threshold a bit for page
> @@ -304,7 +281,7 @@ void throttle_vm_writeout(void)
> */
> dirty_thresh += dirty_thresh / 10; /* wheeee... */
>
> - if (wbs.nr_unstable + wbs.nr_writeback <= dirty_thresh)
> + if (global_page_state(NR_UNSTABLE) + global_page_state(NR_WRITEBACK) <= dirty_thresh)
> break;
> blk_congestion_wait(WRITE, HZ/10);
> }
> @@ -327,12 +304,11 @@ static void background_writeout(unsigned
> };
>
> for ( ; ; ) {
> - struct writeback_state wbs;
> long background_thresh;
> long dirty_thresh;
>
> - get_dirty_limits(&wbs, &background_thresh, &dirty_thresh, NULL);
> - if (wbs.nr_dirty + wbs.nr_unstable < background_thresh
> + get_dirty_limits(&background_thresh, &dirty_thresh, NULL);
> + if (global_page_state(NR_DIRTY) + global_page_state(NR_UNSTABLE) < background_thresh
> && min_pages <= 0)
> break;
> wbc.encountered_congestion = 0;
> @@ -356,12 +332,8 @@ static void background_writeout(unsigned
> */
> int wakeup_pdflush(long nr_pages)
> {
> - if (nr_pages == 0) {
> - struct writeback_state wbs;
> -
> - get_writeback_state(&wbs);
> - nr_pages = wbs.nr_dirty + wbs.nr_unstable;
> - }
> + if (nr_pages == 0)
> + nr_pages = global_page_state(NR_DIRTY) + global_page_state(NR_UNSTABLE);
> return pdflush_operation(background_writeout, nr_pages);
> }
>
> @@ -392,7 +364,6 @@ static void wb_kupdate(unsigned long arg
> unsigned long start_jif;
> unsigned long next_jif;
> long nr_to_write;
> - struct writeback_state wbs;
> struct writeback_control wbc = {
> .bdi = NULL,
> .sync_mode = WB_SYNC_NONE,
> @@ -404,11 +375,10 @@ static void wb_kupdate(unsigned long arg
>
> sync_supers();
>
> - get_writeback_state(&wbs);
> oldest_jif = jiffies - (dirty_expire_centisecs * HZ) / 100;
> start_jif = jiffies;
> next_jif = start_jif + (dirty_writeback_centisecs * HZ) / 100;
> - nr_to_write = wbs.nr_dirty + wbs.nr_unstable +
> + nr_to_write = global_page_state(NR_DIRTY) + global_page_state(NR_UNSTABLE) +
> (inodes_stat.nr_inodes - inodes_stat.nr_unused);
> while (nr_to_write > 0) {
> wbc.encountered_congestion = 0;
> -

2005-12-23 17:27:46

by Christoph Lameter

[permalink] [raw]
Subject: Re: Zoned counters V1 [14/14]: Remove wbs

On Fri, 23 Dec 2005, Coywolf Qi Hunt wrote:

> > static void
> > -get_dirty_limits(struct writeback_state *wbs, long *pbackground, long *pdirty,
> > - struct address_space *mapping)
> > +get_dirty_limits(long *pbackground, long *pdirty, struct address_space *mapping)
>
> Maybe get rid of the odd Hungarian naming too.

s/pbackground/background s/pdirty/dirty ?
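
That is, the prototype would become something like:

	static void
	get_dirty_limits(long *background, long *dirty,
			 struct address_space *mapping);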