2003-07-09 08:32:08

by Nikita Danilov

[permalink] [raw]
Subject: [PATCH] 1/5 VM changes: zone-pressure.patch


Add zone->pressure field. It estimates scanning intensity for this zone and
is calculated as exponentially decaying average of the scanning priority
required to free enough pages in this zone (mm/vmscan.c:zone_adj_pressure()).

zone->pressure is supposed to be used in stead of scanning priority by
vmscan.c. This is used by later patches in a series.



diff -puN include/linux/mmzone.h~zone-pressure include/linux/mmzone.h
--- i386/include/linux/mmzone.h~zone-pressure Wed Jul 9 12:24:50 2003
+++ i386-god/include/linux/mmzone.h Wed Jul 9 12:24:50 2003
@@ -84,11 +84,23 @@ struct zone {
atomic_t refill_counter;
unsigned long nr_active;
unsigned long nr_inactive;
- int all_unreclaimable; /* All pages pinned */
+ int all_unreclaimable:1; /* All pages pinned */
unsigned long pages_scanned; /* since last reclaim */

ZONE_PADDING(_pad2_)

+ /*
+ * measure of scanning intensity for this zone. It is calculated
+ * as exponentially decaying average of the scanning priority
+ * required to free enough pages in this zone
+ * (zone_adj_pressure()).
+ *
+ * 0 --- low pressure
+ *
+ * (DEF_PRIORITY << 10) --- high pressure
+ */
+ int pressure;
+
/*
* free areas of different sizes
*/
diff -puN mm/vmscan.c~zone-pressure mm/vmscan.c
--- i386/mm/vmscan.c~zone-pressure Wed Jul 9 12:24:50 2003
+++ i386-god/mm/vmscan.c Wed Jul 9 12:24:50 2003
@@ -80,6 +80,22 @@ static long total_memory;
#endif

/*
+ * exponentially decaying average
+ */
+static inline int expavg(int avg, int val, int order)
+{
+ return ((val - avg) >> order) + avg;
+}
+
+static void zone_adj_pressure(struct zone * zone, int priority)
+{
+ int pass;
+
+ pass = DEF_PRIORITY - priority;
+ zone->pressure = expavg(zone->pressure, pass << 10, 1);
+}
+
+/*
* The list of shrinker callbacks used by to apply pressure to
* ageable caches.
*/
@@ -794,8 +810,10 @@ shrink_caches(struct zone *classzone, in
ret += shrink_zone(zone, max_scan, gfp_mask,
to_reclaim, &nr_mapped, ps, priority);
*total_scanned += max_scan + nr_mapped;
- if (ret >= nr_pages)
+ if (ret >= nr_pages) {
+ zone_adj_pressure(zone, priority);
break;
+ }
}
return ret;
}
@@ -824,6 +842,7 @@ int try_to_free_pages(struct zone *cz,
int ret = 0;
const int nr_pages = SWAP_CLUSTER_MAX;
int nr_reclaimed = 0;
+ struct zone *zone;
struct reclaim_state *reclaim_state = current->reclaim_state;

inc_page_state(allocstall);
@@ -860,6 +879,8 @@ int try_to_free_pages(struct zone *cz,
}
if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY))
out_of_memory();
+ for (zone = cz; zone >= cz->zone_pgdat->node_zones; -- zone)
+ zone_adj_pressure(zone, -1);
out:
return ret;
}
@@ -907,8 +928,10 @@ static int balance_pgdat(pg_data_t *pgda
to_reclaim = min(to_free, SWAP_CLUSTER_MAX*8);
} else { /* Zone balancing */
to_reclaim = zone->pages_high-zone->free_pages;
- if (to_reclaim <= 0)
+ if (to_reclaim <= 0) {
+ zone_adj_pressure(zone, priority);
continue;
+ }
}
all_zones_ok = 0;
max_scan = zone->nr_inactive >> priority;
@@ -932,6 +955,14 @@ static int balance_pgdat(pg_data_t *pgda
break;
blk_congestion_wait(WRITE, HZ/10);
}
+ if (priority < 0) {
+ for (i = 0; i < pgdat->nr_zones; i++) {
+ struct zone *zone = pgdat->node_zones + i;
+
+ if (zone->free_pages < zone->pages_high)
+ zone_adj_pressure(zone, -1);
+ }
+ }
return nr_pages - to_free;
}


_


2003-07-09 09:57:15

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] 1/5 VM changes: zone-pressure.patch

Nikita Danilov <[email protected]> wrote:
>
> Add zone->pressure field. It estimates scanning intensity for this zone and
> is calculated as exponentially decaying average of the scanning priority
> required to free enough pages in this zone (mm/vmscan.c:zone_adj_pressure()).
>
> zone->pressure is supposed to be used in stead of scanning priority by
> vmscan.c. This is used by later patches in a series.
>

OK, fixes a bug.

>
> diff -puN include/linux/mmzone.h~zone-pressure include/linux/mmzone.h
> --- i386/include/linux/mmzone.h~zone-pressure Wed Jul 9 12:24:50 2003
> +++ i386-god/include/linux/mmzone.h Wed Jul 9 12:24:50 2003
> @@ -84,11 +84,23 @@ struct zone {
> atomic_t refill_counter;
> unsigned long nr_active;
> unsigned long nr_inactive;
> - int all_unreclaimable; /* All pages pinned */
> + int all_unreclaimable:1; /* All pages pinned */

I'll revert this change. Once we start adding bitfields in there they all
need common locking and it gets messy. integers are simpler.

2003-07-09 10:11:23

by Nikita Danilov

[permalink] [raw]
Subject: Re: [PATCH] 1/5 VM changes: zone-pressure.patch

Andrew Morton writes:
> Nikita Danilov <[email protected]> wrote:
> >
> > Add zone->pressure field. It estimates scanning intensity for this zone and
> > is calculated as exponentially decaying average of the scanning priority
> > required to free enough pages in this zone (mm/vmscan.c:zone_adj_pressure()).
> >
> > zone->pressure is supposed to be used in stead of scanning priority by
> > vmscan.c. This is used by later patches in a series.
> >
>
> OK, fixes a bug.

What bug?

>
> >
> > diff -puN include/linux/mmzone.h~zone-pressure include/linux/mmzone.h
> > --- i386/include/linux/mmzone.h~zone-pressure Wed Jul 9 12:24:50 2003
> > +++ i386-god/include/linux/mmzone.h Wed Jul 9 12:24:50 2003
> > @@ -84,11 +84,23 @@ struct zone {
> > atomic_t refill_counter;
> > unsigned long nr_active;
> > unsigned long nr_inactive;
> > - int all_unreclaimable; /* All pages pinned */
> > + int all_unreclaimable:1; /* All pages pinned */
>
> I'll revert this change. Once we start adding bitfields in there they all
> need common locking and it gets messy. integers are simpler.

OK.

>

Nikita.

2003-07-09 10:12:57

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] 1/5 VM changes: zone-pressure.patch

Nikita Danilov <[email protected]> wrote:
>
> + if (priority < 0) {
> + for (i = 0; i < pgdat->nr_zones; i++) {
> + struct zone *zone = pgdat->node_zones + i;
> +
> + if (zone->free_pages < zone->pages_high)
> + zone_adj_pressure(zone, -1);
> + }
> + }

What is this bit doing?

2003-07-09 10:17:40

by Nikita Danilov

[permalink] [raw]
Subject: Re: [PATCH] 1/5 VM changes: zone-pressure.patch

Andrew Morton writes:
> Nikita Danilov <[email protected]> wrote:
> >
> > + if (priority < 0) {
> > + for (i = 0; i < pgdat->nr_zones; i++) {
> > + struct zone *zone = pgdat->node_zones + i;
> > +
> > + if (zone->free_pages < zone->pages_high)
> > + zone_adj_pressure(zone, -1);
> > + }
> > + }
>
> What is this bit doing?

kswapd failed to balance some zone after going up to the maximal
priority (0), increase ->pressure on this zone.

In other words: zone->pressure is average of the scanning priority
required to free enough pages in this zone. As no scanning priority was
enough to free pages, zone->pressure should be extra high.

Nikita.

2003-07-09 10:29:22

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] 1/5 VM changes: zone-pressure.patch

Nikita Danilov <[email protected]> wrote:
>
> > OK, fixes a bug.
>
> What bug?

Failing to consider mapped pages on the active list until the scanning
priority gets large.

I ran up your five patches on a 256MB box, running `qsbench -m 350'. It got
all slow then the machine seized up. I'll poke at it some.

2003-07-09 11:04:24

by Nikita Danilov

[permalink] [raw]
Subject: Re: [PATCH] 1/5 VM changes: zone-pressure.patch

Andrew Morton writes:
> Nikita Danilov <[email protected]> wrote:
> >
> > > OK, fixes a bug.
> >
> > What bug?
>
> Failing to consider mapped pages on the active list until the scanning
> priority gets large.
>
> I ran up your five patches on a 256MB box, running `qsbench -m 350'. It got
> all slow then the machine seized up. I'll poke at it some.
>

Thanks for trying them out.

Nikita.

2003-07-10 14:25:46

by Martin Schlemmer

[permalink] [raw]
Subject: Re: [PATCH] 1/5 VM changes: zone-pressure.patch

On Wed, 2003-07-09 at 12:42, Andrew Morton wrote:
> Nikita Danilov <[email protected]> wrote:
> >
> > > OK, fixes a bug.
> >
> > What bug?
>
> Failing to consider mapped pages on the active list until the scanning
> priority gets large.
>
> I ran up your five patches on a 256MB box, running `qsbench -m 350'. It got
> all slow then the machine seized up. I'll poke at it some.
>

P4 2.4HT with 1GB of ram - could be me, but things seem to start a
tad quicker. If I however push it with two 'make -j12' compiles that
normally works fine, it goes for about 10-15mins, and then OOM killer
kills X, evo, galeon, etc.

Must note that for most of the time free memory stays in the region of
450-650mb and then suddenly things goes for the worse. Also, after this
there is only about 80MB used ... If I then start X, etc again, same
thing after 10-15 mins - the box thus never died, just suddenly all free
memory was taken.

If you need any stats, etc, let me know.

BTW, kernel is 2.5.74-bk7.


Regards,

--
Martin Schlemmer


2003-07-17 13:35:53

by Nikita Danilov

[permalink] [raw]
Subject: Re: [PATCH] 1/5 VM changes: zone-pressure.patch

Andrew Morton writes:
> Nikita Danilov <[email protected]> wrote:
> >
> > > OK, fixes a bug.
> >
> > What bug?
>
> Failing to consider mapped pages on the active list until the scanning
> priority gets large.
>
> I ran up your five patches on a 256MB box, running `qsbench -m 350'. It got
> all slow then the machine seized up. I'll poke at it some.
>

My understanding of this is following:

1. kswapd/balance_pgdat runs through all priorities, but fails to rise
zone->free_pages above zone->pages_min. So it exet. Which is strange.

2. nobody is going to wake it up, because all memory allocators are
looping indefinitely in that newly introduced do_retry loop inside
__alloc_pages.

I think either wakeup_kswapd() should be called in __alloc_pages()
before goto rebalance, or kswapd should loop until it restores all zones
to the balance (actually, comment before balance_pgdat() says it does),
or both.

Nikita.