2012-11-25 22:45:26

by Johannes Weiner

Subject: Re: [PATCH] mm,vmscan: free pages if compaction_suitable tells us to

On Sun, Nov 25, 2012 at 01:29:50PM -0500, Rik van Riel wrote:
> On Sun, 25 Nov 2012 17:57:28 +0100
> Johannes Hirte <[email protected]> wrote:
>
> > With kernel 3.7-rc6 I've still problems with kswapd0 on my laptop
>
> > And this is most of the time. I've only observed this behavior on the
> > laptop. Other systems don't show this.
>
> This suggests it may have something to do with small memory zones,
> where we end up with the "funny" situation that the high watermark
> (+ balance gap) for a particular zone is less than the low watermark
> + 2<<order pages, which is the number of free pages required to keep
> compaction_suitable happy.
>
> Could you try this patch?

It's not quite enough because it's not reaching the conditions you
changed, see analysis in https://lkml.org/lkml/2012/11/20/567

But even fixing it up (by adding the compaction_suitable() test in
this preliminary scan over the zones and setting end_zone accordingly)
is not enough because no actual reclaim happens at priority 12 in a
small zone. So the number of free pages is not actually changing and
the compaction_suitable() checks keep the loop going.
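
Roughly, that preliminary-loop fixup amounts to something like the
following (a sketch of the idea only, not the exact diff I tested):

	for (i = pgdat->nr_zones - 1; i >= 0; i--) {
		struct zone *zone = pgdat->node_zones + i;

		if (!populated_zone(zone))
			continue;

		/* Treat "not enough free memory for compaction" like a
		 * failed watermark, so that end_zone covers this zone and
		 * it gets reclaimed by the main loop below. */
		if (!zone_watermark_ok_safe(zone, order,
					    high_wmark_pages(zone), 0, 0) ||
		    (COMPACTION_BUILD && order &&
		     compaction_suitable(zone, order) == COMPACT_SKIPPED)) {
			end_zone = i;
			break;
		}
	}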

The problem is fairly easy to reproduce, by the way. Just boot with
mem=800M to have a relatively small lowmem reserve in the DMA zone.
Fill it up with page cache, then allocate transparent huge pages.
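
For the "allocate transparent huge pages" part, faulting in a big
MADV_HUGEPAGE mapping is one easy way to do it.  A minimal sketch, with
arbitrary sizes, assuming THP is enabled:

	#include <string.h>
	#include <sys/mman.h>

	int main(void)
	{
		size_t len = 512UL << 20;	/* 512M of anonymous memory */
		char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			return 1;
		madvise(p, len, MADV_HUGEPAGE);	/* hint: back this with 2M pages */
		memset(p, 0xff, len);		/* fault it all in */
		return 0;
	}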

With your patch and my fix to the preliminary zone loop, there won't
be any hung task warnings anymore because kswapd actually calls
shrink_slab() and there is a rescheduling point in there, but it still
loops forever.

It also seems a bit aggressive to try to balance a small zone like DMA
for a huge page when it's not a GFP_DMA allocation, but none of these
checks actually take the classzone into account. Do we have any
agreement over what this whole thing is supposed to be doing?

diff --git a/mm/vmscan.c b/mm/vmscan.c
index b99ecba..f7e54df 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2412,6 +2412,9 @@ static void age_active_anon(struct zone *zone, struct scan_control *sc)
* would need to be at least 256M for it to be balance a whole node.
* Similarly, on x86-64 the Normal zone would need to be at least 1G
* to balance a node on its own. These seemed like reasonable ratios.
+ *
+ * The kswapd source code is brought to you by Advil®. "For today's
+ * tough pain, one might not be enough."
*/
static bool pgdat_balanced(pg_data_t *pgdat, unsigned long balanced_pages,
int classzone_idx)


2012-11-25 23:32:43

by Rik van Riel

Subject: Re: [PATCH] mm,vmscan: free pages if compaction_suitable tells us to

On 11/25/2012 05:44 PM, Johannes Weiner wrote:
> On Sun, Nov 25, 2012 at 01:29:50PM -0500, Rik van Riel wrote:
>> On Sun, 25 Nov 2012 17:57:28 +0100
>> Johannes Hirte <[email protected]> wrote:
>>
>>> With kernel 3.7-rc6 I've still problems with kswapd0 on my laptop
>>
>>> And this is most of the time. I've only observed this behavior on the
>>> laptop. Other systems don't show this.
>>
>> This suggests it may have something to do with small memory zones,
>> where we end up with the "funny" situation that the high watermark
>> (+ balance gap) for a particular zone is less than the low watermark
>> + 2<<order pages, which is the number of free pages required to keep
>> compaction_suitable happy.
>>
>> Could you try this patch?
>
> It's not quite enough because it's not reaching the conditions you
> changed, see analysis in https://lkml.org/lkml/2012/11/20/567

You are right, I forgot the preliminary loop in balance_pgdat().

> But even fixing it up (by adding the compaction_suitable() test in
> this preliminary scan over the zones and setting end_zone accordingly)
> is not enough because no actual reclaim happens at priority 12 in a
> small zone. So the number of free pages is not actually changing and
> the compaction_suitable() checks keep the loop going.

Indeed, it is a hairy situation. I tried to come up with a simple
patch, but apparently that is not enough...

> The problem is fairly easy to reproduce, by the way. Just boot with
> mem=800M to have a relatively small lowmem reserve in the DMA zone.
> Fill it up with page cache, then allocate transparent huge pages.
>
> With your patch and my fix to the preliminary zone loop, there won't
> be any hung task warnings anymore because kswapd actually calls
> shrink_slab() and there is a rescheduling point in there, but it still
> loops forever.
>
> It also seems a bit aggressive to try to balance a small zone like DMA
> for a huge page when it's not a GFP_DMA allocation, but none of these
> checks actually take the classzone into account. Do we have any
> agreement over what this whole thing is supposed to be doing?

It is supposed to free memory, in order to:
1) allow allocations to succeed, and
2) balance memory pressure between zones

I think the compaction_suitable check in the final loop
over the zones is backwards.

We need to loop back to the start if compaction_suitable
returns COMPACT_SKIPPED for _every_ zone in the pgdat.

Does that sound reasonable?

I'll whip up a patch.

--
All rights reversed

2012-11-26 00:17:44

by Rik van Riel

Subject: [PATCH] mm,vmscan: only loop back if compaction would fail in all zones

On Sun, 25 Nov 2012 17:44:33 -0500
Johannes Weiner <[email protected]> wrote:
> On Sun, Nov 25, 2012 at 01:29:50PM -0500, Rik van Riel wrote:

> > Could you try this patch?
>
> It's not quite enough because it's not reaching the conditions you
> changed, see analysis in https://lkml.org/lkml/2012/11/20/567

Johannes,

does the patch below fix your problem?

I suspect it would, because kswapd should only ever run into this
particular problem when we have a tiny memory zone in a pgdat,
and in that case we will also have a larger zone nearby, where
compaction would just succeed.

---8<---

Subject: mm,vmscan: only loop back if compaction would fail in all zones

Kswapd frees memory to satisfy two goals:
1) allow allocations to succeed, and
2) balance memory pressure between zones

Currently, kswapd has an issue where it will loop back to free
more memory if any memory zone in the pgdat does not have enough
free memory for compaction. This can lead to unnecessary overhead,
and even infinite loops in kswapd.

It is better to only loop back to free more memory if all of
the zones in the pgdat have insufficient free memory for
compaction. That satisfies both of kswapd's goals with less
overhead.

Signed-off-by: Rik van Riel <[email protected]>
---
mm/vmscan.c | 11 ++++++++---
1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index b99ecba..f0d111b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2790,6 +2790,7 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order,
*/
if (order) {
int zones_need_compaction = 1;
+ int compaction_needs_memory = 1;

for (i = 0; i <= end_zone; i++) {
struct zone *zone = pgdat->node_zones + i;
@@ -2801,10 +2802,10 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order,
sc.priority != DEF_PRIORITY)
continue;

- /* Would compaction fail due to lack of free memory? */
+ /* Is there enough memory for compaction? */
if (COMPACTION_BUILD &&
- compaction_suitable(zone, order) == COMPACT_SKIPPED)
- goto loop_again;
+ compaction_suitable(zone, order) != COMPACT_SKIPPED)
+ compaction_needs_memory = 0;

/* Confirm the zone is balanced for order-0 */
if (!zone_watermark_ok(zone, 0,
@@ -2822,6 +2823,10 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order,
zone_clear_flag(zone, ZONE_CONGESTED);
}

+ /* None of the zones had enough free memory for compaction. */
+ if (compaction_needs_memory)
+ goto loop_again;
+
if (zones_need_compaction)
compact_pgdat(pgdat, order);
}

2012-11-26 01:21:59

by Jaegeuk Hanse

Subject: Re: [PATCH] mm,vmscan: free pages if compaction_suitable tells us to

On 11/26/2012 06:44 AM, Johannes Weiner wrote:
> On Sun, Nov 25, 2012 at 01:29:50PM -0500, Rik van Riel wrote:
>> On Sun, 25 Nov 2012 17:57:28 +0100
>> Johannes Hirte <[email protected]> wrote:
>>
>>> With kernel 3.7-rc6 I've still problems with kswapd0 on my laptop
>>> And this is most of the time. I've only observed this behavior on the
>>> laptop. Other systems don't show this.
>> This suggests it may have something to do with small memory zones,
>> where we end up with the "funny" situation that the high watermark
>> (+ balance gap) for a particular zone is less than the low watermark
>> + 2<<order pages, which is the number of free pages required to keep
>> compaction_suitable happy.
>>
>> Could you try this patch?
> It's not quite enough because it's not reaching the conditions you
> changed, see analysis in https://lkml.org/lkml/2012/11/20/567
>
> But even fixing it up (by adding the compaction_suitable() test in
> this preliminary scan over the zones and setting end_zone accordingly)
> is not enough because no actual reclaim happens at priority 12 in a

The preliminary scan goes in the highmem->dma direction, so it will miss
a high zone that does not meet the compaction_suitable() test, rather
than the lowest zone.

> small zone. So the number of free pages is not actually changing and
> the compaction_suitable() checks keep the loop going.
>
> The problem is fairly easy to reproduce, by the way. Just boot with
> mem=800M to have a relatively small lowmem reserve in the DMA zone.
> Fill it up with page cache, then allocate transparent huge pages.
>
> With your patch and my fix to the preliminary zone loop, there won't
> be any hung task warnings anymore because kswapd actually calls
> shrink_slab() and there is a rescheduling point in there, but it still
> loops forever.
>
> It also seems a bit aggressive to try to balance a small zone like DMA
> for a huge page when it's not a GFP_DMA allocation, but none of these
> checks actually take the classzone into account. Do we have any
> agreement over what this whole thing is supposed to be doing?
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index b99ecba..f7e54df 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2412,6 +2412,9 @@ static void age_active_anon(struct zone *zone, struct scan_control *sc)
> * would need to be at least 256M for it to be balance a whole node.
> * Similarly, on x86-64 the Normal zone would need to be at least 1G
> * to balance a node on its own. These seemed like reasonable ratios.
> + *
> + * The kswapd source code is brought to you by Advil®. "For today's
> + * tough pain, one might not be enough."
> */
> static bool pgdat_balanced(pg_data_t *pgdat, unsigned long balanced_pages,
> int classzone_idx)
>

2012-11-26 03:16:11

by Johannes Weiner

Subject: Re: [PATCH] mm,vmscan: only loop back if compaction would fail in all zones

On Sun, Nov 25, 2012 at 07:16:45PM -0500, Rik van Riel wrote:
> On Sun, 25 Nov 2012 17:44:33 -0500
> Johannes Weiner <[email protected]> wrote:
> > On Sun, Nov 25, 2012 at 01:29:50PM -0500, Rik van Riel wrote:
>
> > > Could you try this patch?
> >
> > It's not quite enough because it's not reaching the conditions you
> > changed, see analysis in https://lkml.org/lkml/2012/11/20/567
>
> Johannes,
>
> does the patch below fix your problem?

I can not reproduce the problem anymore with my smoke test.

> I suspect it would, because kswapd should only ever run into this
> particular problem when we have a tiny memory zone in a pgdat,
> and in that case we will also have a larger zone nearby, where
> compaction would just succeed.

What if there is a higher order GFP_DMA allocation when the other
zones in the system meet the high watermark for this order?

There is something else that worries me: if the preliminary zone scan
finds the high watermark of all zones alright, end_zone is at its
initialization value, 0. The final compaction loop at `if (order)'
goes through all zones up to and including end_zone, which was never
really set to anything meaningful(?) and the only zone considered is
the DMA zone again. Very unlikely, granted, but if you'd ever hit
that race and kswapd gets stuck, this will be fun to debug...
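
For reference, the structure in question looks roughly like this
(simplified from the hunks quoted in this thread, not verbatim source):

	int end_zone = 0;	/* stays 0 if the scan finds all zones balanced */

	/* ... preliminary top-down watermark scan may raise end_zone,
	 * reclaim then runs over zones 0..end_zone ... */

	if (order) {
		for (i = 0; i <= end_zone; i++) {
			struct zone *zone = pgdat->node_zones + i;

			if (!populated_zone(zone))
				continue;

			/* Would compaction fail due to lack of free memory? */
			if (COMPACTION_BUILD &&
			    compaction_suitable(zone, order) == COMPACT_SKIPPED)
				goto loop_again;

			/* ... order-0 watermark check, ZONE_CONGESTED
			 * clearing etc. elided ... */
		}

		/* ... compact_pgdat() if zones still need compaction ... */
	}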

2012-11-26 04:11:34

by Johannes Weiner

Subject: Re: [PATCH] mm,vmscan: only loop back if compaction would fail in all zones

On Sun, Nov 25, 2012 at 10:15:18PM -0500, Johannes Weiner wrote:
> On Sun, Nov 25, 2012 at 07:16:45PM -0500, Rik van Riel wrote:
> > On Sun, 25 Nov 2012 17:44:33 -0500
> > Johannes Weiner <[email protected]> wrote:
> > > On Sun, Nov 25, 2012 at 01:29:50PM -0500, Rik van Riel wrote:
> >
> > > > Could you try this patch?
> > >
> > > It's not quite enough because it's not reaching the conditions you
> > > changed, see analysis in https://lkml.org/lkml/2012/11/20/567
> >
> > Johannes,
> >
> > does the patch below fix your problem?
>
> I can not reproduce the problem anymore with my smoke test.
>
> > I suspect it would, because kswapd should only ever run into this
> > particular problem when we have a tiny memory zone in a pgdat,
> > and in that case we will also have a larger zone nearby, where
> > compaction would just succeed.
>
> What if there is a higher order GFP_DMA allocation when the other
> zones in the system meet the high watermark for this order?
>
> There is something else that worries me: if the preliminary zone scan
> finds the high watermark of all zones alright, end_zone is at its
> initialization value, 0. The final compaction loop at `if (order)'
> goes through all zones up to and including end_zone, which was never
> really set to anything meaningful(?) and the only zone considered is
> the DMA zone again. Very unlikely, granted, but if you'd ever hit
> that race and kswapd gets stuck, this will be fun to debug...

I actually liked your first idea better: force reclaim until the
compaction watermark is met. The only problem was that not every
check in there agreed on when a zone was considered balanced, and
so no actual reclaim happened.

So how about making everybody agree? If the high watermark is met but
not the compaction one, keep doing reclaim AND don't consider the zone
balanced, AND don't make it contribute to balanced_pages etc.? This
makes sure reclaim really does not bail and that the node is never
considered alright when it's actually not according to compaction.
This patch fixes the problem too (at least for the smoke test so far)
and IMO makes the code a bit more understandable.

We may be able to drop some of the relooping conditions. We may also
be able to reduce the pressure from the DMA zone by passing the right
classzone_idx in there. Needs more thought.

---
From: Johannes Weiner <[email protected]>
Subject: [patch] mm: vmscan: fix endless loop in kswapd balancing

Kswapd does not in all places have the same criteria for when it
considers a zone balanced. This leads to zones not being reclaimed
because they are considered just fine, while the compaction checks keep
looping over the zonelist because the same zones are considered
unbalanced, causing kswapd to run forever.

Add a function, zone_balanced(), that checks the watermark and if
compaction has enough free memory to do its job. Then use it
uniformly for when kswapd needs to check if a zone is balanced.

Signed-off-by: Johannes Weiner <[email protected]>
---
mm/vmscan.c | 27 ++++++++++++++++++---------
1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 48550c6..3b0aef4 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2397,6 +2397,19 @@ static void age_active_anon(struct zone *zone, struct scan_control *sc)
} while (memcg);
}

+static bool zone_balanced(struct zone *zone, int order,
+ unsigned long balance_gap, int classzone_idx)
+{
+ if (!zone_watermark_ok_safe(zone, order, high_wmark_pages(zone) +
+ balance_gap, classzone_idx, 0))
+ return false;
+
+ if (COMPACTION_BUILD && order && !compaction_suitable(zone, order))
+ return false;
+
+ return true;
+}
+
/*
* pgdat_balanced is used when checking if a node is balanced for high-order
* allocations. Only zones that meet watermarks and are in a zone allowed
@@ -2475,8 +2488,7 @@ static bool prepare_kswapd_sleep(pg_data_t *pgdat, int order, long remaining,
continue;
}

- if (!zone_watermark_ok_safe(zone, order, high_wmark_pages(zone),
- i, 0))
+ if (!zone_balanced(zone, order, 0, i))
all_zones_ok = false;
else
balanced += zone->present_pages;
@@ -2585,8 +2597,7 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order,
break;
}

- if (!zone_watermark_ok_safe(zone, order,
- high_wmark_pages(zone), 0, 0)) {
+ if (!zone_balanced(zone, order, 0, 0)) {
end_zone = i;
break;
} else {
@@ -2662,9 +2673,8 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order,
testorder = 0;

if ((buffer_heads_over_limit && is_highmem_idx(i)) ||
- !zone_watermark_ok_safe(zone, testorder,
- high_wmark_pages(zone) + balance_gap,
- end_zone, 0)) {
+ !zone_balanced(zone, testorder,
+ balance_gap, end_zone)) {
shrink_zone(zone, &sc);

reclaim_state->reclaimed_slab = 0;
@@ -2691,8 +2701,7 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order,
continue;
}

- if (!zone_watermark_ok_safe(zone, testorder,
- high_wmark_pages(zone), end_zone, 0)) {
+ if (!zone_balanced(zone, testorder, 0, end_zone)) {
all_zones_ok = 0;
/*
* We are still under min water mark. This
--
1.7.11.7

2012-11-26 11:17:59

by Johannes Hirte

Subject: Re: [PATCH] mm,vmscan: only loop back if compaction would fail in all zones

On Sun, 25 Nov 2012 23:10:41 -0500,
Johannes Weiner <[email protected]> wrote:

> On Sun, Nov 25, 2012 at 10:15:18PM -0500, Johannes Weiner wrote:
> > On Sun, Nov 25, 2012 at 07:16:45PM -0500, Rik van Riel wrote:
> > > On Sun, 25 Nov 2012 17:44:33 -0500
> > > Johannes Weiner <[email protected]> wrote:
> > > > On Sun, Nov 25, 2012 at 01:29:50PM -0500, Rik van Riel wrote:
> > >
> > > > > Could you try this patch?
> > > >
> > > > It's not quite enough because it's not reaching the conditions
> > > > you changed, see analysis in
> > > > https://lkml.org/lkml/2012/11/20/567
> > >
> > > Johannes,
> > >
> > > does the patch below fix your problem?
> >
> > I can not reproduce the problem anymore with my smoke test.
> >
> > > I suspect it would, because kswapd should only ever run into this
> > > particular problem when we have a tiny memory zone in a pgdat,
> > > and in that case we will also have a larger zone nearby, where
> > > compaction would just succeed.
> >
> > What if there is a higher order GFP_DMA allocation when the other
> > zones in the system meet the high watermark for this order?
> >
> > There is something else that worries me: if the preliminary zone
> > scan finds the high watermark of all zones alright, end_zone is at
> > its initialization value, 0. The final compaction loop at `if
> > (order)' goes through all zones up to and including end_zone, which
> > was never really set to anything meaningful(?) and the only zone
> > considered is the DMA zone again. Very unlikely, granted, but if
> > you'd ever hit that race and kswapd gets stuck, this will be fun to
> > debug...
>
> I actually liked your first idea better: force reclaim until the
> compaction watermark is met. The only problem was that still not
> every check in there agreed when the zone was considered balanced and
> so no actual reclaim happened.
>
> So how about making everybody agree? If the high watermark is met but
> not the compaction one, keep doing reclaim AND don't consider the zone
> balanced, AND don't make it contribute to balanced_pages etc.? This
> makes sure reclaim really does not bail and that the node is never
> considered alright when it's actually not according to compaction.
> This patch fixes the problem too (at least for the smoke test so far)
> and IMO makes the code a bit more understandable.
>
> We may be able to drop some of the relooping conditions. We may also
> be able to reduce the pressure from the DMA zone by passing the right
> classzone_idx in there. Needs more thought.
>
> ---
> From: Johannes Weiner <[email protected]>
> Subject: [patch] mm: vmscan: fix endless loop in kswapd balancing
>
> Kswapd does not in all places have the same criteria for when it
> considers a zone balanced. This leads to zones being not reclaimed
> because they are considered just fine and the compaction checks to
> loop over the zonelist again because they are considered unbalanced,
> causing kswapd to run forever.
>
> Add a function, zone_balanced(), that checks the watermark and if
> compaction has enough free memory to do its job. Then use it
> uniformly for when kswapd needs to check if a zone is balanced.
>
> Signed-off-by: Johannes Weiner <[email protected]>
> ---
> mm/vmscan.c | 27 ++++++++++++++++++---------
> 1 file changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 48550c6..3b0aef4 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2397,6 +2397,19 @@ static void age_active_anon(struct zone *zone, struct scan_control *sc)
> } while (memcg);
> }
>
> +static bool zone_balanced(struct zone *zone, int order,
> + unsigned long balance_gap, int classzone_idx)
> +{
> + if (!zone_watermark_ok_safe(zone, order, high_wmark_pages(zone) +
> + balance_gap, classzone_idx, 0))
> + return false;
> +
> + if (COMPACTION_BUILD && order && !compaction_suitable(zone, order))
> + return false;
> +
> + return true;
> +}
> +
> /*
> * pgdat_balanced is used when checking if a node is balanced for high-order
> * allocations. Only zones that meet watermarks and are in a zone allowed
> @@ -2475,8 +2488,7 @@ static bool prepare_kswapd_sleep(pg_data_t *pgdat, int order, long remaining,
> continue;
> }
>
> - if (!zone_watermark_ok_safe(zone, order, high_wmark_pages(zone),
> - i, 0))
> + if (!zone_balanced(zone, order, 0, i))
> all_zones_ok = false;
> else
> balanced += zone->present_pages;
> @@ -2585,8 +2597,7 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order,
> break;
> }
>
> - if (!zone_watermark_ok_safe(zone, order,
> - high_wmark_pages(zone), 0, 0)) {
> + if (!zone_balanced(zone, order, 0, 0)) {
> end_zone = i;
> break;
> } else {
> @@ -2662,9 +2673,8 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order,
> testorder = 0;
>
> if ((buffer_heads_over_limit && is_highmem_idx(i)) ||
> - !zone_watermark_ok_safe(zone, testorder,
> - high_wmark_pages(zone) + balance_gap,
> - end_zone, 0)) {
> + !zone_balanced(zone, testorder,
> + balance_gap, end_zone)) {
> shrink_zone(zone, &sc);
>
> reclaim_state->reclaimed_slab = 0;
> @@ -2691,8 +2701,7 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order,
> continue;
> }
>
> - if (!zone_watermark_ok_safe(zone, testorder,
> - high_wmark_pages(zone), end_zone, 0)) {
> + if (!zone_balanced(zone, testorder, 0, end_zone)) {
> all_zones_ok = 0;
> /*
> * We are still under min water mark. This

I've tested both patches, this one and Rik's, and they both seem to fix
the problem. kswapd didn't come up again consuming that much CPU. Feel
free to add my tested-by.

regards,
Johannes

2012-11-26 15:33:11

by Rik van Riel

Subject: Re: [PATCH] mm,vmscan: only loop back if compaction would fail in all zones

On 11/25/2012 11:10 PM, Johannes Weiner wrote:

> From: Johannes Weiner <[email protected]>
> Subject: [patch] mm: vmscan: fix endless loop in kswapd balancing
>
> Kswapd does not in all places have the same criteria for when it
> considers a zone balanced. This leads to zones being not reclaimed
> because they are considered just fine and the compaction checks to
> loop over the zonelist again because they are considered unbalanced,
> causing kswapd to run forever.
>
> Add a function, zone_balanced(), that checks the watermark and if
> compaction has enough free memory to do its job. Then use it
> uniformly for when kswapd needs to check if a zone is balanced.
>
> Signed-off-by: Johannes Weiner <[email protected]>

Reviewed-by: Rik van Riel <[email protected]>


--
All rights reversed

2012-11-27 22:39:17

by Valdis Klētnieks

Subject: Re: [PATCH] mm,vmscan: only loop back if compaction would fail in all zones

On Sun, 25 Nov 2012 23:10:41 -0500, Johannes Weiner said:

> From: Johannes Weiner <[email protected]>
> Subject: [patch] mm: vmscan: fix endless loop in kswapd balancing
>
> Kswapd does not in all places have the same criteria for when it
> considers a zone balanced. This leads to zones being not reclaimed
> because they are considered just fine and the compaction checks to
> loop over the zonelist again because they are considered unbalanced,
> causing kswapd to run forever.
>
> Add a function, zone_balanced(), that checks the watermark and if
> compaction has enough free memory to do its job. Then use it
> uniformly for when kswapd needs to check if a zone is balanced.
>
> Signed-off-by: Johannes Weiner <[email protected]>
> ---
> mm/vmscan.c | 27 ++++++++++++++++++---------
> 1 file changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 48550c6..3b0aef4 100644

> + if (COMPACTION_BUILD && order && !compaction_suitable(zone, order))
> + return false;

Applying to next-20121117, I had to hand-patch around this other akpm patch:

./Next/merge.log:Applying: mm: use IS_ENABLED(CONFIG_COMPACTION) instead of COMPACTION_BUILD
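
After that conversion, the new check in zone_balanced() presumably ends
up roughly as follows (my guess, for reference):

	if (IS_ENABLED(CONFIG_COMPACTION) && order &&
	    !compaction_suitable(zone, order))
		return false;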

It probably won't be till tomorrow before I know if this worked; it seems
to take a while before the kswapd storms start hitting (it appears to be
a function of uptime - I see almost none for 8-16 hours, but after 24-30
hours I'll be having a spinning kswapd most of the time).

