2020-06-11 12:14:39

by Charan Teja Kalla

Subject: [PATCH] mm, page_alloc: skip ->watermark_boost for atomic order-0 allocations-fix

When boosting is enabled, the rate of atomic order-0 allocation failures
is observed to be high, because free levels in the system are checked
with the ->watermark_boost offset applied. This is not a problem for
sleepable allocations, but for atomic allocations it looks like a
regression.

This problem is seen frequently on a system running an Android kernel on
Snapdragon hardware with 4GB of RAM. When no extfrag event has occurred
in the system, ->watermark_boost is zero, so the watermark configuration
in the system is:
_watermark = (
[WMARK_MIN] = 1272, --> ~5MB
[WMARK_LOW] = 9067, --> ~36MB
[WMARK_HIGH] = 9385), --> ~38MB
watermark_boost = 0

After launching some memory-hungry applications in Android, extfrag
events can occur to such an extent that ->watermark_boost is set to its
maximum, which with the default boost factor is 150% of the high
watermark:
_watermark = (
[WMARK_MIN] = 1272, --> ~5MB
[WMARK_LOW] = 9067, --> ~36MB
[WMARK_HIGH] = 9385), --> ~38MB
watermark_boost = 14077, -->~57MB

With the default system configuration, ~2MB of free memory suffices for
an atomic order-0 allocation to succeed. But boosting raises the min
watermark to ~61MB, so an atomic order-0 allocation needs at least ~23MB
of free memory to succeed (from the calculations in zone_watermark_ok(),
min = 3/4(min/2)). As a result, failures are observed even though the
system has ~20MB of free memory. In testing, this is reproducible as
early as the first 300 seconds after boot, and with lower-RAM
configurations (<2GB) as early as the first 150 seconds after boot.
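
The arithmetic in the paragraph above can be checked with a small
user-space sketch. It assumes 4 KiB pages, a zero lowmem_reserve, and
that both the ALLOC_HIGH and ALLOC_HARDER reductions apply, as the
min = 3/4(min/2) note implies. The helper names are hypothetical and are
not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 4 KiB pages, used only to convert pages to KiB. */
#define SKETCH_PAGE_KIB 4L

/*
 * Effective threshold for an atomic order-0 request, in pages.
 * Mirrors the two reductions named in the changelog:
 * ALLOC_HIGH halves the mark, ALLOC_HARDER removes a further quarter.
 */
static long atomic_min_pages(long mark)
{
	long min = mark;

	assert(mark >= 0);
	min -= min / 2;	/* ALLOC_HIGH */
	min -= min / 4;	/* ALLOC_HARDER */
	return min;
}

/* Convert a page count to KiB under the page-size assumption above. */
static long pages_to_kib(long pages)
{
	return pages * SKETCH_PAGE_KIB;
}

/*
 * Order-0 check with lowmem_reserve assumed to be zero:
 * the request passes only if free pages exceed the reduced mark.
 */
static bool atomic_order0_ok(long free_pages, long mark)
{
	return free_pages > atomic_min_pages(mark);
}
```

Under these assumptions, atomic_min_pages(1272) gives 477 pages (~2MB)
and atomic_min_pages(1272 + 14077) gives 5757 pages (about 22.5 MiB, the
~23MB quoted above), so a zone with ~20MB free (5120 pages) passes the
unboosted check but fails the boosted one.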

These failures can be avoided by excluding ->watermark_boost from the
watermark calculations for atomic order-0 allocations.

Fix-suggested-by: Mel Gorman <[email protected]>
Signed-off-by: Charan Teja Reddy <[email protected]>
---

Change in linux-next: https://lore.kernel.org/patchwork/patch/1244272/

mm/page_alloc.c | 36 ++++++++++++++++++++----------------
1 file changed, 20 insertions(+), 16 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0c435b2..18f407e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3580,7 +3580,7 @@ bool zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,

static inline bool zone_watermark_fast(struct zone *z, unsigned int order,
unsigned long mark, int highest_zoneidx,
- unsigned int alloc_flags)
+ unsigned int alloc_flags, gfp_t gfp_mask)
{
long free_pages = zone_page_state(z, NR_FREE_PAGES);
long cma_pages = 0;
@@ -3602,8 +3602,23 @@ static inline bool zone_watermark_fast(struct zone *z, unsigned int order,
mark + z->lowmem_reserve[highest_zoneidx])
return true;

- return __zone_watermark_ok(z, order, mark, highest_zoneidx, alloc_flags,
- free_pages);
+ if (__zone_watermark_ok(z, order, mark, highest_zoneidx, alloc_flags,
+ free_pages))
+ return true;
+ /*
+ * Ignore watermark boosting for GFP_ATOMIC order-0 allocations
+ * when checking the min watermark. The min watermark is the
+ * point where boosting is ignored so that kswapd is woken up
+ * when below the low watermark.
+ */
+ if (unlikely(!order && (gfp_mask & __GFP_ATOMIC) && z->watermark_boost
+ && ((alloc_flags & ALLOC_WMARK_MASK) == WMARK_MIN))) {
+ mark = z->_watermark[WMARK_MIN];
+ return __zone_watermark_ok(z, order, mark, highest_zoneidx,
+ alloc_flags, free_pages);
+ }
+
+ return false;
}

bool zone_watermark_ok_safe(struct zone *z, unsigned int order,
@@ -3746,20 +3761,9 @@ static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
}

mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
- /*
- * Allow GFP_ATOMIC order-0 allocations to exclude the
- * zone->watermark_boost in their watermark calculations.
- * We rely on the ALLOC_ flags set for GFP_ATOMIC requests in
- * gfp_to_alloc_flags() for this. Reason not to use the
- * GFP_ATOMIC directly is that we want to fall back to slow path
- * thus wake up kswapd.
- */
- if (unlikely(!order && !(alloc_flags & ALLOC_WMARK_MASK) &&
- (alloc_flags & (ALLOC_HARDER | ALLOC_HIGH)))) {
- mark = zone->_watermark[WMARK_MIN];
- }
if (!zone_watermark_fast(zone, order, mark,
- ac->highest_zoneidx, alloc_flags)) {
+ ac->highest_zoneidx, alloc_flags,
+ gfp_mask)) {
int ret;

#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project


2020-06-12 16:17:56

by Vlastimil Babka

Subject: Re: [PATCH] mm, page_alloc: skip ->watermark_boost for atomic order-0 allocations-fix

On 6/11/20 2:09 PM, Charan Teja Kalla wrote:
> When boosting is enabled, the rate of atomic order-0 allocation failures
> is observed to be high, because free levels in the system are checked
> with the ->watermark_boost offset applied. This is not a problem for
> sleepable allocations, but for atomic allocations it looks like a
> regression.
>
> This problem is seen frequently on a system running an Android kernel on
> Snapdragon hardware with 4GB of RAM. When no extfrag event has occurred
> in the system, ->watermark_boost is zero, so the watermark configuration
> in the system is:
> _watermark = (
> [WMARK_MIN] = 1272, --> ~5MB
> [WMARK_LOW] = 9067, --> ~36MB
> [WMARK_HIGH] = 9385), --> ~38MB
> watermark_boost = 0
>
> After launching some memory-hungry applications in Android, extfrag
> events can occur to such an extent that ->watermark_boost is set to its
> maximum, which with the default boost factor is 150% of the high
> watermark:
> _watermark = (
> [WMARK_MIN] = 1272, --> ~5MB
> [WMARK_LOW] = 9067, --> ~36MB
> [WMARK_HIGH] = 9385), --> ~38MB
> watermark_boost = 14077, -->~57MB
>
> With the default system configuration, ~2MB of free memory suffices for
> an atomic order-0 allocation to succeed. But boosting raises the min
> watermark to ~61MB, so an atomic order-0 allocation needs at least ~23MB
> of free memory to succeed (from the calculations in zone_watermark_ok(),
> min = 3/4(min/2)). As a result, failures are observed even though the
> system has ~20MB of free memory. In testing, this is reproducible as
> early as the first 300 seconds after boot, and with lower-RAM
> configurations (<2GB) as early as the first 150 seconds after boot.
>
> These failures can be avoided by excluding ->watermark_boost from the
> watermark calculations for atomic order-0 allocations.
>
> Fix-suggested-by: Mel Gorman <[email protected]>
> Signed-off-by: Charan Teja Reddy <[email protected]>

For the patch+fix:

Acked-by: Vlastimil Babka <[email protected]>

The boost and highatomic stuff certainly made the whole thing more subtle.

> ---
>
> Change in linux-next: https://lore.kernel.org/patchwork/patch/1244272/
>
> mm/page_alloc.c | 36 ++++++++++++++++++++----------------
> 1 file changed, 20 insertions(+), 16 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 0c435b2..18f407e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3580,7 +3580,7 @@ bool zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
>
> static inline bool zone_watermark_fast(struct zone *z, unsigned int order,
> unsigned long mark, int highest_zoneidx,
> - unsigned int alloc_flags)
> + unsigned int alloc_flags, gfp_t gfp_mask)
> {
> long free_pages = zone_page_state(z, NR_FREE_PAGES);
> long cma_pages = 0;
> @@ -3602,8 +3602,23 @@ static inline bool zone_watermark_fast(struct zone *z, unsigned int order,
> mark + z->lowmem_reserve[highest_zoneidx])
> return true;
>
> - return __zone_watermark_ok(z, order, mark, highest_zoneidx, alloc_flags,
> - free_pages);
> + if (__zone_watermark_ok(z, order, mark, highest_zoneidx, alloc_flags,
> + free_pages))
> + return true;
> + /*
> + * Ignore watermark boosting for GFP_ATOMIC order-0 allocations
> + * when checking the min watermark. The min watermark is the
> + * point where boosting is ignored so that kswapd is woken up
> + * when below the low watermark.
> + */
> + if (unlikely(!order && (gfp_mask & __GFP_ATOMIC) && z->watermark_boost
> + && ((alloc_flags & ALLOC_WMARK_MASK) == WMARK_MIN))) {
> + mark = z->_watermark[WMARK_MIN];
> + return __zone_watermark_ok(z, order, mark, highest_zoneidx,
> + alloc_flags, free_pages);
> + }
> +
> + return false;
> }
>
> bool zone_watermark_ok_safe(struct zone *z, unsigned int order,
> @@ -3746,20 +3761,9 @@ static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
> }
>
> mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
> - /*
> - * Allow GFP_ATOMIC order-0 allocations to exclude the
> - * zone->watermark_boost in their watermark calculations.
> - * We rely on the ALLOC_ flags set for GFP_ATOMIC requests in
> - * gfp_to_alloc_flags() for this. Reason not to use the
> - * GFP_ATOMIC directly is that we want to fall back to slow path
> - * thus wake up kswapd.
> - */
> - if (unlikely(!order && !(alloc_flags & ALLOC_WMARK_MASK) &&
> - (alloc_flags & (ALLOC_HARDER | ALLOC_HIGH)))) {
> - mark = zone->_watermark[WMARK_MIN];
> - }
> if (!zone_watermark_fast(zone, order, mark,
> - ac->highest_zoneidx, alloc_flags)) {
> + ac->highest_zoneidx, alloc_flags,
> + gfp_mask)) {
> int ret;
>
> #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
>

2020-06-18 00:21:01

by Andrew Morton

Subject: Re: [PATCH] mm, page_alloc: skip ->watermark_boost for atomic order-0 allocations-fix

On Thu, 11 Jun 2020 17:39:47 +0530 Charan Teja Kalla <[email protected]> wrote:

> When boosting is enabled, the rate of atomic order-0 allocation failures
> is observed to be high, because free levels in the system are checked
> with the ->watermark_boost offset applied. This is not a problem for
> sleepable allocations, but for atomic allocations it looks like a
> regression.
>
> This problem is seen frequently on a system running an Android kernel on
> Snapdragon hardware with 4GB of RAM. When no extfrag event has occurred
> in the system, ->watermark_boost is zero, so the watermark configuration
> in the system is:
> _watermark = (
> [WMARK_MIN] = 1272, --> ~5MB
> [WMARK_LOW] = 9067, --> ~36MB
> [WMARK_HIGH] = 9385), --> ~38MB
> watermark_boost = 0
>
> After launching some memory-hungry applications in Android, extfrag
> events can occur to such an extent that ->watermark_boost is set to its
> maximum, which with the default boost factor is 150% of the high
> watermark:
> _watermark = (
> [WMARK_MIN] = 1272, --> ~5MB
> [WMARK_LOW] = 9067, --> ~36MB
> [WMARK_HIGH] = 9385), --> ~38MB
> watermark_boost = 14077, -->~57MB
>
> With the default system configuration, ~2MB of free memory suffices for
> an atomic order-0 allocation to succeed. But boosting raises the min
> watermark to ~61MB, so an atomic order-0 allocation needs at least ~23MB
> of free memory to succeed (from the calculations in zone_watermark_ok(),
> min = 3/4(min/2)). As a result, failures are observed even though the
> system has ~20MB of free memory. In testing, this is reproducible as
> early as the first 300 seconds after boot, and with lower-RAM
> configurations (<2GB) as early as the first 150 seconds after boot.
>
> These failures can be avoided by excluding ->watermark_boost from the
> watermark calculations for atomic order-0 allocations.
>

Some description of the changes in this version would help.

Below is the overall patch as it would land in mainline. For
reviewers, please.

From: Charan Teja Reddy <[email protected]>
Subject: mm, page_alloc: skip ->waternark_boost for atomic order-0 allocations

When boosting is enabled, the rate of atomic order-0 allocation failures
is observed to be high, because free levels in the system are checked
with the ->watermark_boost offset applied. This is not a problem for
sleepable allocations, but for atomic allocations it looks like a
regression.

This problem is seen frequently on a system running an Android kernel on
Snapdragon hardware with 4GB of RAM. When no extfrag event has occurred
in the system, ->watermark_boost is zero, so the watermark configuration
in the system is:

_watermark = (
[WMARK_MIN] = 1272, --> ~5MB
[WMARK_LOW] = 9067, --> ~36MB
[WMARK_HIGH] = 9385), --> ~38MB
watermark_boost = 0

After launching some memory-hungry applications in Android, extfrag
events can occur to such an extent that ->watermark_boost is set to its
maximum, which with the default boost factor is 150% of the high
watermark:

_watermark = (
[WMARK_MIN] = 1272, --> ~5MB
[WMARK_LOW] = 9067, --> ~36MB
[WMARK_HIGH] = 9385), --> ~38MB
watermark_boost = 14077, -->~57MB

With the default system configuration, ~2MB of free memory suffices for
an atomic order-0 allocation to succeed. But boosting raises the min
watermark to ~61MB, so an atomic order-0 allocation needs at least ~23MB
of free memory to succeed (from the calculations in zone_watermark_ok(),
min = 3/4(min/2)). As a result, failures are observed even though the
system has ~20MB of free memory. In testing, this is reproducible as
early as the first 300 seconds after boot, and with lower-RAM
configurations (<2GB) as early as the first 150 seconds after boot.

These failures can be avoided by excluding ->watermark_boost from the
watermark calculations for atomic order-0 allocations.

[[email protected]: fix suggested by Mel Gorman]
Link: http://lkml.kernel.org/r/[email protected]
[[email protected]: fix comment grammar, reflow comment]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Charan Teja Reddy <[email protected]>
Cc: Vinayak Menon <[email protected]>
Cc: Mel Gorman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---

mm/page_alloc.c | 24 ++++++++++++++++++++----
1 file changed, 20 insertions(+), 4 deletions(-)

--- a/mm/page_alloc.c~mm-page_alloc-skip-waternark_boost-for-atomic-order-0-allocations
+++ a/mm/page_alloc.c
@@ -3580,7 +3580,7 @@ bool zone_watermark_ok(struct zone *z, u

static inline bool zone_watermark_fast(struct zone *z, unsigned int order,
unsigned long mark, int highest_zoneidx,
- unsigned int alloc_flags)
+ unsigned int alloc_flags, gfp_t gfp_mask)
{
long free_pages = zone_page_state(z, NR_FREE_PAGES);
long cma_pages = 0;
@@ -3602,8 +3602,23 @@ static inline bool zone_watermark_fast(s
mark + z->lowmem_reserve[highest_zoneidx])
return true;

- return __zone_watermark_ok(z, order, mark, highest_zoneidx, alloc_flags,
- free_pages);
+ if (__zone_watermark_ok(z, order, mark, highest_zoneidx, alloc_flags,
+ free_pages))
+ return true;
+ /*
+ * Ignore watermark boosting for GFP_ATOMIC order-0 allocations
+ * when checking the min watermark. The min watermark is the
+ * point where boosting is ignored so that kswapd is woken up
+ * when below the low watermark.
+ */
+ if (unlikely(!order && (gfp_mask & __GFP_ATOMIC) && z->watermark_boost
+ && ((alloc_flags & ALLOC_WMARK_MASK) == WMARK_MIN))) {
+ mark = z->_watermark[WMARK_MIN];
+ return __zone_watermark_ok(z, order, mark, highest_zoneidx,
+ alloc_flags, free_pages);
+ }
+
+ return false;
}

bool zone_watermark_ok_safe(struct zone *z, unsigned int order,
@@ -3747,7 +3762,8 @@ retry:

mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
if (!zone_watermark_fast(zone, order, mark,
- ac->highest_zoneidx, alloc_flags)) {
+ ac->highest_zoneidx, alloc_flags,
+ gfp_mask)) {
int ret;

#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
_
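
The control flow that the combined patch above adds to
zone_watermark_fast() can be modelled in user space as a rough sketch.
This is not the kernel code: struct zone_model and both functions are
hypothetical stand-ins, lowmem_reserve is assumed to be zero, the
free-list scan for order > 0 is omitted, and the caller is assumed to be
checking against the min watermark (the ALLOC_WMARK_MASK == WMARK_MIN
condition in the patch).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few struct zone fields used here. */
struct zone_model {
	long wmark_min;		/* _watermark[WMARK_MIN], in pages */
	long watermark_boost;	/* current boost, in pages */
	long free_pages;	/* free pages in the zone */
};

/*
 * Simplified order-0 watermark check. Atomic requests get the
 * ALLOC_HIGH and ALLOC_HARDER reductions; others use the full mark.
 */
static bool model_watermark_ok(long free_pages, long mark, bool atomic)
{
	long min = mark;

	assert(mark >= 0);
	if (atomic) {
		min -= min / 2;	/* ALLOC_HIGH */
		min -= min / 4;	/* ALLOC_HARDER */
	}
	return free_pages > min;
}

/*
 * Model of the patched fast check: try the boosted mark first, then
 * retry without the boost only for atomic order-0 requests on a zone
 * that is currently boosted.
 */
static bool model_watermark_fast(const struct zone_model *z,
				 unsigned int order, bool atomic)
{
	long boosted_mark = z->wmark_min + z->watermark_boost;

	if (model_watermark_ok(z->free_pages, boosted_mark, atomic))
		return true;

	if (order == 0 && atomic && z->watermark_boost)
		return model_watermark_ok(z->free_pages, z->wmark_min,
					  atomic);

	return false;
}
```

With the changelog's numbers (min 1272, boost 14077) and 5120 free pages
(~20MB), the model lets an atomic order-0 request through via the
unboosted retry, while a non-atomic request still fails the check and is
left to the slow path.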

2020-10-19 18:41:59

by Ralph Siemsen

Subject: Re: [PATCH] mm, page_alloc: skip ->watermark_boost for atomic order-0 allocations-fix

Hi,

Please consider applying the patch from this thread to 5.8.y:
commit f80b08fc44536a311a9f3182e50f318b79076425

The fix should also go into 5.4.y, however the patch needs some minor
adjustments due to surrounding context differences. Attached below is a
version I have tested against 5.4.71.

This solves a "page allocation failure" error that can be reproduced
both on physical hardware, and also under qemu-system-arm. The test
consists of repeatedly running md5sum on a large file. In my tests the
file contains 1GB of random data, while the system has only 256MB RAM.
No other tasks are running or consuming significant memory.

After some time (between 1 and 200 iterations) the kernel reports a page
allocation failure. Additional failures occur fairly quickly thereafter.
The md5sum is correctly computed in each case. The OOM killer is not
invoked. The backtrace shows that an order-0 GFP_ATOMIC allocation was
requested and failed, even though quite a bit of memory was available.

A similar error occurs when "md5sum" is replaced by "scp" or "nc".
The backtrace again shows an order-0 GFP_ATOMIC allocation that fails,
with plenty of memory available according to the Mem-Info dump.

The problem does not occur under 4.9.y or 4.19.y. Bisection found that
the problem started to occur with 688fcbfc06e4 ("mm/vmalloc: modify
struct vmap_area to reduce its size") during the 5.4 dev cycle.

I can provide additional logs and details if interested.

Thanks,
Ralph

Below is the f80b08fc445 commit, tweaked to apply to 5.4.y.

From: Charan Teja Reddy <[email protected]>
Subject: [PATCH] mm, page_alloc: skip ->waternark_boost for atomic order-0
allocations

[upstream commit f80b08fc44536a311a9f3182e50f318b79076425
with context adjusted to match linux-5.4.y]

When boosting is enabled, the rate of atomic order-0 allocation failures
is observed to be high, because free levels in the system are checked
with the ->watermark_boost offset applied. This is not a problem for
sleepable allocations, but for atomic allocations it looks like a
regression.

This problem is seen frequently on a system running an Android kernel on
Snapdragon hardware with 4GB of RAM. When no extfrag event has occurred
in the system, ->watermark_boost is zero, so the watermark configuration
in the system is:

_watermark = (
[WMARK_MIN] = 1272, --> ~5MB
[WMARK_LOW] = 9067, --> ~36MB
[WMARK_HIGH] = 9385), --> ~38MB
watermark_boost = 0

After launching some memory-hungry applications in Android, extfrag
events can occur to such an extent that ->watermark_boost is set to its
maximum, which with the default boost factor is 150% of the high
watermark:

_watermark = (
[WMARK_MIN] = 1272, --> ~5MB
[WMARK_LOW] = 9067, --> ~36MB
[WMARK_HIGH] = 9385), --> ~38MB
watermark_boost = 14077, -->~57MB

With the default system configuration, ~2MB of free memory suffices for
an atomic order-0 allocation to succeed. But boosting raises the min
watermark to ~61MB, so an atomic order-0 allocation needs at least ~23MB
of free memory to succeed (from the calculations in zone_watermark_ok(),
min = 3/4(min/2)). As a result, failures are observed even though the
system has ~20MB of free memory. In testing, this is reproducible as
early as the first 300 seconds after boot, and with lower-RAM
configurations (<2GB) as early as the first 150 seconds after boot.

These failures can be avoided by excluding ->watermark_boost from the
watermark calculations for atomic order-0 allocations.

[[email protected]: fix comment grammar, reflow comment]
[[email protected]: fix suggested by Mel Gorman]
Link: http://lkml.kernel.org/r/[email protected]

Signed-off-by: Charan Teja Reddy <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: Vinayak Menon <[email protected]>
Cc: Mel Gorman <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Ralph Siemsen <[email protected]>
---
mm/page_alloc.c | 25 +++++++++++++++++++++----
1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index aff0bb4629bd..b0e9ea4c220e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3484,7 +3484,8 @@ bool zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
}

static inline bool zone_watermark_fast(struct zone *z, unsigned int order,
- unsigned long mark, int classzone_idx, unsigned int alloc_flags)
+ unsigned long mark, int classzone_idx,
+ unsigned int alloc_flags, gfp_t gfp_mask)
{
long free_pages = zone_page_state(z, NR_FREE_PAGES);
long cma_pages = 0;
@@ -3505,8 +3506,23 @@ static inline bool zone_watermark_fast(struct zone *z, unsigned int order,
if (!order && (free_pages - cma_pages) > mark + z->lowmem_reserve[classzone_idx])
return true;

- return __zone_watermark_ok(z, order, mark, classzone_idx, alloc_flags,
- free_pages);
+ if (__zone_watermark_ok(z, order, mark, classzone_idx, alloc_flags,
+ free_pages))
+ return true;
+ /*
+ * Ignore watermark boosting for GFP_ATOMIC order-0 allocations
+ * when checking the min watermark. The min watermark is the
+ * point where boosting is ignored so that kswapd is woken up
+ * when below the low watermark.
+ */
+ if (unlikely(!order && (gfp_mask & __GFP_ATOMIC) && z->watermark_boost
+ && ((alloc_flags & ALLOC_WMARK_MASK) == WMARK_MIN))) {
+ mark = z->_watermark[WMARK_MIN];
+ return __zone_watermark_ok(z, order, mark, classzone_idx,
+ alloc_flags, free_pages);
+ }
+
+ return false;
}

bool zone_watermark_ok_safe(struct zone *z, unsigned int order,
@@ -3647,7 +3663,8 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,

mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
if (!zone_watermark_fast(zone, order, mark,
- ac_classzone_idx(ac), alloc_flags)) {
+ ac_classzone_idx(ac), alloc_flags,
+ gfp_mask)) {
int ret;

#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
--
2.17.1

2020-11-23 09:54:26

by Greg KH

Subject: Re: [PATCH] mm, page_alloc: skip ->watermark_boost for atomic order-0 allocations-fix

On Mon, Oct 19, 2020 at 02:40:17PM -0400, Ralph Siemsen wrote:
> Hi,
>
> Please consider applying the patch from this thread to 5.8.y:
> commit f80b08fc44536a311a9f3182e50f318b79076425

5.8 is end-of-life, sorry.

Now queued up for 5.4.y.

thanks,

greg k-h