2023-01-09 16:10:15

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 0/6 v2] Discard __GFP_ATOMIC

Changelog since v1
o Split one patch (vbabka)
o Improve OOM reserve handling (vbabka)
o Fix __GFP_RECLAIM vs __GFP_DIRECT_RECLAIM (vbabka)

Neil's patch has been residing in mm-unstable as commit 2fafb4fe8f7a
("mm: discard __GFP_ATOMIC") for a long time and recently brought up
again. Most recently, I was worried that __GFP_HIGH allocations could
use high-order atomic reserves which is unintentional but there was no
response so lets revisit -- this series reworks how min reserves are used,
protects highorder reserves and then finishes with Neil's patch with very
minor modifications so it fits on top.

There was a review discussion on renaming __GFP_DIRECT_RECLAIM to
__GFP_ALLOW_BLOCKING but I didn't think it was that big an issue and is
ortogonal to the removal of __GFP_ATOMIC.

There were some concerns about how the gfp flags affect the min reserves
but it never reached a solid conclusion so I made my own attempt.

The series tries to iron out some of the details on how reserves are
used. ALLOC_HIGH becomes ALLOC_MIN_RESERVE and ALLOC_HARDER becomes
ALLOC_NON_BLOCK and documents how the reserves are affected. For example,
ALLOC_NON_BLOCK (no direct reclaim) on its own allows 25% of the min reserve.
ALLOC_MIN_RESERVE (__GFP_HIGH) allows 50% and both combined allows deeper
access again. ALLOC_OOM allows access to 75%.

High-order atomic allocations are explicitly handled with the caveat that
no __GFP_ATOMIC flag means that any high-order allocation that specifies
GFP_HIGH and cannot enter direct reclaim will be treated as if it was
GFP_ATOMIC.

Documentation/mm/balance.rst | 2 +-
drivers/iommu/tegra-smmu.c | 4 +-
include/linux/gfp_types.h | 12 ++--
include/trace/events/mmflags.h | 1 -
lib/test_printf.c | 8 +--
mm/internal.h | 15 ++++-
mm/page_alloc.c | 103 ++++++++++++++++++++-------------
tools/perf/builtin-kmem.c | 1 -
8 files changed, 86 insertions(+), 60 deletions(-)

--
2.35.3


2023-01-09 16:11:30

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 5/7] mm/page_alloc.c: Allow __GFP_NOFAIL requests deeper access to reserves

Currently __GFP_NOFAIL allocations without any other flags can access 25%
of the reserves but these requests imply that the system cannot make forward
progress until the allocation succeeds. Allow __GFP_NOFAIL access to 75%
of the min reserve.

Signed-off-by: Mel Gorman <[email protected]>
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6f41b84a97ac..d2df78f5baa2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5308,7 +5308,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
* could deplete whole memory reserves which would just make
* the situation worse
*/
- page = __alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_HARDER, ac);
+ page = __alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_MIN_RESERVE|ALLOC_HARDER, ac);
if (page)
goto got_pg;

--
2.35.3

2023-01-11 14:50:17

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH 5/7] mm/page_alloc.c: Allow __GFP_NOFAIL requests deeper access to reserves

On 1/9/23 16:16, Mel Gorman wrote:
> Currently __GFP_NOFAIL allocations without any other flags can access 25%
> of the reserves but these requests imply that the system cannot make forward
> progress until the allocation succeeds. Allow __GFP_NOFAIL access to 75%
> of the min reserve.
>
> Signed-off-by: Mel Gorman <[email protected]>

Acked-by: Vlastimil Babka <[email protected]>

> ---
> mm/page_alloc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 6f41b84a97ac..d2df78f5baa2 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5308,7 +5308,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> * could deplete whole memory reserves which would just make
> * the situation worse
> */
> - page = __alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_HARDER, ac);
> + page = __alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_MIN_RESERVE|ALLOC_HARDER, ac);
> if (page)
> goto got_pg;
>

2023-01-11 16:36:38

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH 5/7] mm/page_alloc.c: Allow __GFP_NOFAIL requests deeper access to reserves

On Mon 09-01-23 15:16:29, Mel Gorman wrote:
> Currently __GFP_NOFAIL allocations without any other flags can access 25%
> of the reserves but these requests imply that the system cannot make forward
> progress until the allocation succeeds. Allow __GFP_NOFAIL access to 75%
> of the min reserve.

I am not sure this is really needed. IIRC the original motivation for
allowing NOFAIL request to access access to memory reserves was
GFP_NOFS|__GFP_NOFAIL requests which do not invoke the OOM killer.
The amount of memory reserves granted was not really important. The
point was to allow to move forward. Giving more of the reserves is a
double edge sword. It can help in some cases but it can also prevent
other high priority users from fwd progress.

I would much rahter see such a change with an example where it really
made a difference.

> Signed-off-by: Mel Gorman <[email protected]>
> ---
> mm/page_alloc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 6f41b84a97ac..d2df78f5baa2 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5308,7 +5308,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> * could deplete whole memory reserves which would just make
> * the situation worse
> */
> - page = __alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_HARDER, ac);
> + page = __alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_MIN_RESERVE|ALLOC_HARDER, ac);
> if (page)
> goto got_pg;
>
> --
> 2.35.3

--
Michal Hocko
SUSE Labs

2023-01-12 09:54:55

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 5/7] mm/page_alloc.c: Allow __GFP_NOFAIL requests deeper access to reserves

On Wed, Jan 11, 2023 at 04:46:13PM +0100, Michal Hocko wrote:
> On Mon 09-01-23 15:16:29, Mel Gorman wrote:
> > Currently __GFP_NOFAIL allocations without any other flags can access 25%
> > of the reserves but these requests imply that the system cannot make forward
> > progress until the allocation succeeds. Allow __GFP_NOFAIL access to 75%
> > of the min reserve.
>
> I am not sure this is really needed. IIRC the original motivation for
> allowing NOFAIL request to access access to memory reserves was
> GFP_NOFS|__GFP_NOFAIL requests which do not invoke the OOM killer.
> The amount of memory reserves granted was not really important. The
> point was to allow to move forward. Giving more of the reserves is a
> double edge sword. It can help in some cases but it can also prevent
> other high priority users from fwd progress.
>
> I would much rahter see such a change with an example where it really
> made a difference.
>

Fair point but based on your review for "mm/page_alloc: Give GFP_ATOMIC
and non-blocking allocations access to reserves" and only allowing
non-blocking allocations to access reserves if __GFP_HIGH is also
specified, this patch becomes a no-op and can be dropped.

If GFP_NOFAIL requests really require deeper access to reserves, it'll
have to be explicitly handled in __zone_watermark_ok and __GFP_NOFAIL
would be added to the ALLOC_RESERVES collection of flags.

--
Mel Gorman
SUSE Labs