2023-10-13 11:31:47

by zhaoyang.huang

Subject: [RFC PATCH 1/1] mm: only use old generation and stable tier for madv_pageout

From: Zhaoyang Huang <[email protected]>

Dropping pages of young generation or unstable tier via madvise could
make the system experience heavy page thrashing and IO pressure.
Furthermore, it could lead to failure of the tier's PID controller, which
affects normal reclaim. I would like to suggest skipping these pages in
madv_pageout.

Signed-off-by: Zhaoyang Huang <[email protected]>
---
include/linux/swap.h | 1 +
mm/madvise.c | 12 ++++++++++++
mm/vmscan.c | 3 ++-
3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 493487ed7c38..d09c859ccc45 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -496,6 +496,7 @@ extern int init_swap_address_space(unsigned int type, unsigned long nr_pages);
 extern void exit_swap_address_space(unsigned int type);
 extern struct swap_info_struct *get_swap_device(swp_entry_t entry);
 sector_t swap_page_sector(struct page *page);
+extern int get_tier_idx(struct lruvec *lruvec, int type);

 static inline void put_swap_device(struct swap_info_struct *si)
 {
diff --git a/mm/madvise.c b/mm/madvise.c
index 4dded5d27e7e..324d76096ca5 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -452,6 +452,18 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 		if (!folio || folio_is_zone_device(folio))
 			continue;

+		if (lru_gen_enabled() && pageout) {
+			int gen = folio_lru_gen(folio);
+			struct lruvec *lruvec = folio_lruvec(folio);
+			int type = folio_is_file_lru(folio);
+			int refs = folio_lru_refs(folio);
+			int tier = lru_tier_from_refs(refs);
+			int tier_st = get_tier_idx(lruvec, type);
+
+			if (gen > lru_gen_from_seq(lruvec->lrugen.min_seq[type]) + 1
+					|| tier > tier_st)
+				continue;
+		}
 		/*
 		 * Creating a THP page is expensive so split it only if we
 		 * are sure it's worth. Split it if we are only owner.
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6f13394b112e..16900a8c13e0 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -5072,7 +5072,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
 	return isolated || !remaining ? scanned : 0;
 }

-static int get_tier_idx(struct lruvec *lruvec, int type)
+int get_tier_idx(struct lruvec *lruvec, int type)
 {
 	int tier;
 	struct ctrl_pos sp, pv;
@@ -5091,6 +5091,7 @@ static int get_tier_idx(struct lruvec *lruvec, int type)

 	return tier - 1;
 }
+EXPORT_SYMBOL_GPL(get_tier_idx);

 static int get_type_to_scan(struct lruvec *lruvec, int swappiness, int *tier_idx)
 {
--
2.25.1
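
For readers following the check added to madvise_cold_or_pageout_pte_range(),
here is a rough userspace model of the skip condition. It is only a sketch:
gen_from_seq() and skip_pageout() are hypothetical names, MAX_NR_GENS is
assumed to be 4, and the generation index is assumed to be the sequence
number modulo MAX_NR_GENS, as lru_gen_from_seq() does. No kernel data
structures are involved.

```c
#include <stdbool.h>

/* Assumption: mirrors the kernel's MAX_NR_GENS. */
#define MAX_NR_GENS 4

/* Hypothetical stand-in for lru_gen_from_seq(): index = seq % MAX_NR_GENS. */
static int gen_from_seq(unsigned long seq)
{
	return (int)(seq % MAX_NR_GENS);
}

/*
 * Hypothetical model of the patch's condition. Returns true when the
 * folio would be skipped by MADV_PAGEOUT: either its generation index is
 * more than one above the index of min_seq, or its tier is above the
 * tier index reported by the PID controller.
 */
static bool skip_pageout(int gen, unsigned long min_seq, int tier, int tier_idx)
{
	return gen > gen_from_seq(min_seq) + 1 || tier > tier_idx;
}
```

In this model, skip_pageout(3, 0, 0, 0) is true (generation 3 is more than
one above generation 0), while skip_pageout(0, 3, 0, 0) is false: the
comparison is made on wrapped indices, so once min_seq maps to index 3 no
generation index can exceed 3 + 1 and the generation test never fires.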


2023-10-13 15:39:08

by Matthew Wilcox

Subject: Re: [RFC PATCH 1/1] mm: only use old generation and stable tier for madv_pageout

On Fri, Oct 13, 2023 at 07:30:28PM +0800, zhaoyang.huang wrote:
> From: Zhaoyang Huang <[email protected]>
>
> Dropping pages of young generation or unstable tier via madvise could
> make the system experience heavy page thrashing and IO pressure.

... then userspace should not do that?

> @@ -5091,6 +5091,7 @@ static int get_tier_idx(struct lruvec *lruvec, int type)
>
> return tier - 1;
> }
> +EXPORT_SYMBOL_GPL(get_tier_idx);

Why would this need to be exported to modules in order to be used by
madvise? Is this patch just a trojan horse so you can use get_tier_idx
in your own module? NAK.