2021-01-22 22:10:37

by Yu Zhao

Subject: [PATCH v2 00/10] mm: lru related cleanups

The cleanups are intended to reduce the verbosity in lru list
operations and make them less error-prone. A typical example
would be how the patches change __activate_page():

static void __activate_page(struct page *page, struct lruvec *lruvec)
{
if (!PageActive(page) && !PageUnevictable(page)) {
- int lru = page_lru_base_type(page);
int nr_pages = thp_nr_pages(page);

- del_page_from_lru_list(page, lruvec, lru);
+ del_page_from_lru_list(page, lruvec);
SetPageActive(page);
- lru += LRU_ACTIVE;
- add_page_to_lru_list(page, lruvec, lru);
+ add_page_to_lru_list(page, lruvec);
trace_mm_lru_activate(page);

There are a few more places like __activate_page(), and they are
unnecessarily repetitive in figuring out which list a page should be
added to or deleted from. With the duplicated code removed, they are
easier to read, IMO.
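
For reference, this is the shape add_page_to_lru_list() takes once patch 3
lands; the list index is derived inside the helper by page_lru(), so callers
no longer pass it:

static __always_inline void add_page_to_lru_list(struct page *page,
				struct lruvec *lruvec)
{
	enum lru_list lru = page_lru(page);

	update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
	list_add(&page->lru, &lruvec->lists[lru]);
}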

Patches 1 to 5 basically cover the above. Patches 6 and 7 make the
code more robust by improving bug reporting. Patches 8, 9 and 10 take
care of some dangling helpers left in header files.

v1 -> v2:
dropped the last patch in this series based on the discussion here:
https://lore.kernel.org/patchwork/patch/1350552/#1550430

Yu Zhao (10):
mm: use add_page_to_lru_list()
mm: shuffle lru list addition and deletion functions
mm: don't pass "enum lru_list" to lru list addition functions
mm: don't pass "enum lru_list" to trace_mm_lru_insertion()
mm: don't pass "enum lru_list" to del_page_from_lru_list()
mm: add __clear_page_lru_flags() to replace page_off_lru()
mm: VM_BUG_ON lru page flags
mm: fold page_lru_base_type() into its sole caller
mm: fold __update_lru_size() into its sole caller
mm: make lruvec_lru_size() static

include/linux/mm_inline.h | 113 ++++++++++++++-------------------
include/linux/mmzone.h | 2 -
include/trace/events/pagemap.h | 11 ++--
mm/compaction.c | 2 +-
mm/mlock.c | 3 +-
mm/swap.c | 50 ++++++---------
mm/vmscan.c | 21 ++----
7 files changed, 77 insertions(+), 125 deletions(-)

--
2.30.0.280.ga3ce27912f-goog


2021-01-22 22:10:38

by Yu Zhao

Subject: [PATCH v2 04/10] mm: don't pass "enum lru_list" to trace_mm_lru_insertion()

The parameter is redundant in the sense that it can be correctly
extracted from the "struct page" parameter by page_lru().

Link: https://lore.kernel.org/linux-mm/[email protected]/
Signed-off-by: Yu Zhao <[email protected]>
Reviewed-by: Alex Shi <[email protected]>
---
include/trace/events/pagemap.h | 11 ++++-------
mm/swap.c | 5 +----
2 files changed, 5 insertions(+), 11 deletions(-)

diff --git a/include/trace/events/pagemap.h b/include/trace/events/pagemap.h
index 8fd1babae761..e1735fe7c76a 100644
--- a/include/trace/events/pagemap.h
+++ b/include/trace/events/pagemap.h
@@ -27,24 +27,21 @@

TRACE_EVENT(mm_lru_insertion,

- TP_PROTO(
- struct page *page,
- int lru
- ),
+ TP_PROTO(struct page *page),

- TP_ARGS(page, lru),
+ TP_ARGS(page),

TP_STRUCT__entry(
__field(struct page *, page )
__field(unsigned long, pfn )
- __field(int, lru )
+ __field(enum lru_list, lru )
__field(unsigned long, flags )
),

TP_fast_assign(
__entry->page = page;
__entry->pfn = page_to_pfn(page);
- __entry->lru = lru;
+ __entry->lru = page_lru(page);
__entry->flags = trace_pagemap_flags(page);
),

diff --git a/mm/swap.c b/mm/swap.c
index 4b058ef37add..56682c002db7 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -957,7 +957,6 @@ EXPORT_SYMBOL(__pagevec_release);

static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec)
{
- enum lru_list lru;
int was_unevictable = TestClearPageUnevictable(page);
int nr_pages = thp_nr_pages(page);

@@ -993,11 +992,9 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec)
smp_mb__after_atomic();

if (page_evictable(page)) {
- lru = page_lru(page);
if (was_unevictable)
__count_vm_events(UNEVICTABLE_PGRESCUED, nr_pages);
} else {
- lru = LRU_UNEVICTABLE;
ClearPageActive(page);
SetPageUnevictable(page);
if (!was_unevictable)
@@ -1005,7 +1002,7 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec)
}

add_page_to_lru_list(page, lruvec);
- trace_mm_lru_insertion(page, lru);
+ trace_mm_lru_insertion(page);
}

/*
--
2.30.0.280.ga3ce27912f-goog

2021-01-22 22:10:59

by Yu Zhao

Subject: [PATCH v2 03/10] mm: don't pass "enum lru_list" to lru list addition functions

The "enum lru_list" parameter to add_page_to_lru_list() and
add_page_to_lru_list_tail() is redundant in the sense that it can
be extracted from the "struct page" parameter by page_lru().

A caveat is that we need to make sure PageActive() or
PageUnevictable() is correctly set or cleared before calling
these two functions, and the existing callers indeed do so.
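
For illustration, the relevant lines of __activate_page() after this patch;
the flag update has to happen before the add because add_page_to_lru_list()
now derives the destination list from the page flags via page_lru():

	del_page_from_lru_list(page, lruvec, lru);
	SetPageActive(page);                /* flag set first... */
	add_page_to_lru_list(page, lruvec); /* ...so page_lru() picks the active list */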

Link: https://lore.kernel.org/linux-mm/[email protected]/
Signed-off-by: Yu Zhao <[email protected]>
---
include/linux/mm_inline.h | 8 ++++++--
mm/swap.c | 15 +++++++--------
mm/vmscan.c | 6 ++----
3 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 2889741f450a..130ba3201d3f 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -106,15 +106,19 @@ static __always_inline enum lru_list page_lru(struct page *page)
}

static __always_inline void add_page_to_lru_list(struct page *page,
- struct lruvec *lruvec, enum lru_list lru)
+ struct lruvec *lruvec)
{
+ enum lru_list lru = page_lru(page);
+
update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
list_add(&page->lru, &lruvec->lists[lru]);
}

static __always_inline void add_page_to_lru_list_tail(struct page *page,
- struct lruvec *lruvec, enum lru_list lru)
+ struct lruvec *lruvec)
{
+ enum lru_list lru = page_lru(page);
+
update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
list_add_tail(&page->lru, &lruvec->lists[lru]);
}
diff --git a/mm/swap.c b/mm/swap.c
index 490553f3f9ef..4b058ef37add 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -231,7 +231,7 @@ static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
if (!PageUnevictable(page)) {
del_page_from_lru_list(page, lruvec, page_lru(page));
ClearPageActive(page);
- add_page_to_lru_list_tail(page, lruvec, page_lru(page));
+ add_page_to_lru_list_tail(page, lruvec);
__count_vm_events(PGROTATED, thp_nr_pages(page));
}
}
@@ -313,8 +313,7 @@ static void __activate_page(struct page *page, struct lruvec *lruvec)

del_page_from_lru_list(page, lruvec, lru);
SetPageActive(page);
- lru += LRU_ACTIVE;
- add_page_to_lru_list(page, lruvec, lru);
+ add_page_to_lru_list(page, lruvec);
trace_mm_lru_activate(page);

__count_vm_events(PGACTIVATE, nr_pages);
@@ -543,14 +542,14 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec)
* It can make readahead confusing. But race window
* is _really_ small and it's non-critical problem.
*/
- add_page_to_lru_list(page, lruvec, lru);
+ add_page_to_lru_list(page, lruvec);
SetPageReclaim(page);
} else {
/*
* The page's writeback ends up during pagevec
* We moves tha page into tail of inactive.
*/
- add_page_to_lru_list_tail(page, lruvec, lru);
+ add_page_to_lru_list_tail(page, lruvec);
__count_vm_events(PGROTATED, nr_pages);
}

@@ -570,7 +569,7 @@ static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec)
del_page_from_lru_list(page, lruvec, lru + LRU_ACTIVE);
ClearPageActive(page);
ClearPageReferenced(page);
- add_page_to_lru_list(page, lruvec, lru);
+ add_page_to_lru_list(page, lruvec);

__count_vm_events(PGDEACTIVATE, nr_pages);
__count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE,
@@ -595,7 +594,7 @@ static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec)
* anonymous pages
*/
ClearPageSwapBacked(page);
- add_page_to_lru_list(page, lruvec, LRU_INACTIVE_FILE);
+ add_page_to_lru_list(page, lruvec);

__count_vm_events(PGLAZYFREE, nr_pages);
__count_memcg_events(lruvec_memcg(lruvec), PGLAZYFREE,
@@ -1005,7 +1004,7 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec)
__count_vm_events(UNEVICTABLE_PGCULLED, nr_pages);
}

- add_page_to_lru_list(page, lruvec, lru);
+ add_page_to_lru_list(page, lruvec);
trace_mm_lru_insertion(page, lru);
}

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 19875660e8f8..09e4f97488c9 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1867,7 +1867,7 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
* inhibits memcg migration).
*/
VM_BUG_ON_PAGE(!lruvec_holds_page_lru_lock(page, lruvec), page);
- add_page_to_lru_list(page, lruvec, page_lru(page));
+ add_page_to_lru_list(page, lruvec);
nr_pages = thp_nr_pages(page);
nr_moved += nr_pages;
if (PageActive(page))
@@ -4282,12 +4282,10 @@ void check_move_unevictable_pages(struct pagevec *pvec)

lruvec = relock_page_lruvec_irq(page, lruvec);
if (page_evictable(page) && PageUnevictable(page)) {
- enum lru_list lru = page_lru_base_type(page);
-
VM_BUG_ON_PAGE(PageActive(page), page);
ClearPageUnevictable(page);
del_page_from_lru_list(page, lruvec, LRU_UNEVICTABLE);
- add_page_to_lru_list(page, lruvec, lru);
+ add_page_to_lru_list(page, lruvec);
pgrescued += nr_pages;
}
SetPageLRU(page);
--
2.30.0.280.ga3ce27912f-goog

2021-01-22 22:11:12

by Yu Zhao

Subject: [PATCH v2 05/10] mm: don't pass "enum lru_list" to del_page_from_lru_list()

The parameter is redundant in the sense that it can be extracted
from the "struct page" parameter by page_lru(). We need to make sure
that an existing PageActive() or PageUnevictable() flag remains set
until the function returns. A few places don't conform, and simple
reordering fixes them.
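
For illustration, check_move_unevictable_pages() after the reordering in this
patch: the delete runs while PG_unevictable is still set, so page_lru() still
resolves to LRU_UNEVICTABLE, and the add runs after the flag is cleared.

	if (page_evictable(page) && PageUnevictable(page)) {
		del_page_from_lru_list(page, lruvec);	/* PG_unevictable still set here */
		ClearPageUnevictable(page);
		add_page_to_lru_list(page, lruvec);	/* now lands on an evictable list */
		pgrescued += nr_pages;
	}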

This patch may leave page_off_lru() looking a bit odd; the next
patch will take care of it.

Link: https://lore.kernel.org/linux-mm/[email protected]/
Signed-off-by: Yu Zhao <[email protected]>
---
include/linux/mm_inline.h | 5 +++--
mm/compaction.c | 2 +-
mm/mlock.c | 3 +--
mm/swap.c | 26 ++++++++++----------------
mm/vmscan.c | 4 ++--
5 files changed, 17 insertions(+), 23 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 130ba3201d3f..ffacc6273678 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -124,9 +124,10 @@ static __always_inline void add_page_to_lru_list_tail(struct page *page,
}

static __always_inline void del_page_from_lru_list(struct page *page,
- struct lruvec *lruvec, enum lru_list lru)
+ struct lruvec *lruvec)
{
list_del(&page->lru);
- update_lru_size(lruvec, lru, page_zonenum(page), -thp_nr_pages(page));
+ update_lru_size(lruvec, page_lru(page), page_zonenum(page),
+ -thp_nr_pages(page));
}
#endif
diff --git a/mm/compaction.c b/mm/compaction.c
index 6e21db7f51b3..71fab3f5938c 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1030,7 +1030,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
low_pfn += compound_nr(page) - 1;

/* Successfully isolated */
- del_page_from_lru_list(page, lruvec, page_lru(page));
+ del_page_from_lru_list(page, lruvec);
mod_node_page_state(page_pgdat(page),
NR_ISOLATED_ANON + page_is_file_lru(page),
thp_nr_pages(page));
diff --git a/mm/mlock.c b/mm/mlock.c
index 55b3b3672977..73960bb3464d 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -278,8 +278,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
*/
if (TestClearPageLRU(page)) {
lruvec = relock_page_lruvec_irq(page, lruvec);
- del_page_from_lru_list(page, lruvec,
- page_lru(page));
+ del_page_from_lru_list(page, lruvec);
continue;
} else
__munlock_isolation_failed(page);
diff --git a/mm/swap.c b/mm/swap.c
index 56682c002db7..94532799ed82 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -85,7 +85,8 @@ static void __page_cache_release(struct page *page)
lruvec = lock_page_lruvec_irqsave(page, &flags);
VM_BUG_ON_PAGE(!PageLRU(page), page);
__ClearPageLRU(page);
- del_page_from_lru_list(page, lruvec, page_off_lru(page));
+ del_page_from_lru_list(page, lruvec);
+ page_off_lru(page);
unlock_page_lruvec_irqrestore(lruvec, flags);
}
__ClearPageWaiters(page);
@@ -229,7 +230,7 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
{
if (!PageUnevictable(page)) {
- del_page_from_lru_list(page, lruvec, page_lru(page));
+ del_page_from_lru_list(page, lruvec);
ClearPageActive(page);
add_page_to_lru_list_tail(page, lruvec);
__count_vm_events(PGROTATED, thp_nr_pages(page));
@@ -308,10 +309,9 @@ void lru_note_cost_page(struct page *page)
static void __activate_page(struct page *page, struct lruvec *lruvec)
{
if (!PageActive(page) && !PageUnevictable(page)) {
- int lru = page_lru_base_type(page);
int nr_pages = thp_nr_pages(page);

- del_page_from_lru_list(page, lruvec, lru);
+ del_page_from_lru_list(page, lruvec);
SetPageActive(page);
add_page_to_lru_list(page, lruvec);
trace_mm_lru_activate(page);
@@ -518,8 +518,7 @@ void lru_cache_add_inactive_or_unevictable(struct page *page,
*/
static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec)
{
- int lru;
- bool active;
+ bool active = PageActive(page);
int nr_pages = thp_nr_pages(page);

if (PageUnevictable(page))
@@ -529,10 +528,7 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec)
if (page_mapped(page))
return;

- active = PageActive(page);
- lru = page_lru_base_type(page);
-
- del_page_from_lru_list(page, lruvec, lru + active);
+ del_page_from_lru_list(page, lruvec);
ClearPageActive(page);
ClearPageReferenced(page);

@@ -563,10 +559,9 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec)
static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec)
{
if (PageActive(page) && !PageUnevictable(page)) {
- int lru = page_lru_base_type(page);
int nr_pages = thp_nr_pages(page);

- del_page_from_lru_list(page, lruvec, lru + LRU_ACTIVE);
+ del_page_from_lru_list(page, lruvec);
ClearPageActive(page);
ClearPageReferenced(page);
add_page_to_lru_list(page, lruvec);
@@ -581,11 +576,9 @@ static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec)
{
if (PageAnon(page) && PageSwapBacked(page) &&
!PageSwapCache(page) && !PageUnevictable(page)) {
- bool active = PageActive(page);
int nr_pages = thp_nr_pages(page);

- del_page_from_lru_list(page, lruvec,
- LRU_INACTIVE_ANON + active);
+ del_page_from_lru_list(page, lruvec);
ClearPageActive(page);
ClearPageReferenced(page);
/*
@@ -919,7 +912,8 @@ void release_pages(struct page **pages, int nr)

VM_BUG_ON_PAGE(!PageLRU(page), page);
__ClearPageLRU(page);
- del_page_from_lru_list(page, lruvec, page_off_lru(page));
+ del_page_from_lru_list(page, lruvec);
+ page_off_lru(page);
}

__ClearPageWaiters(page);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 09e4f97488c9..7c65d47e6612 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1766,7 +1766,7 @@ int isolate_lru_page(struct page *page)

get_page(page);
lruvec = lock_page_lruvec_irq(page);
- del_page_from_lru_list(page, lruvec, page_lru(page));
+ del_page_from_lru_list(page, lruvec);
unlock_page_lruvec_irq(lruvec);
ret = 0;
}
@@ -4283,8 +4283,8 @@ void check_move_unevictable_pages(struct pagevec *pvec)
lruvec = relock_page_lruvec_irq(page, lruvec);
if (page_evictable(page) && PageUnevictable(page)) {
VM_BUG_ON_PAGE(PageActive(page), page);
+ del_page_from_lru_list(page, lruvec);
ClearPageUnevictable(page);
- del_page_from_lru_list(page, lruvec, LRU_UNEVICTABLE);
add_page_to_lru_list(page, lruvec);
pgrescued += nr_pages;
}
--
2.30.0.280.ga3ce27912f-goog

2021-01-22 22:11:24

by Yu Zhao

Subject: [PATCH v2 08/10] mm: fold page_lru_base_type() into its sole caller

We've removed all other references to this function.

Link: https://lore.kernel.org/linux-mm/[email protected]/
Signed-off-by: Yu Zhao <[email protected]>
Reviewed-by: Alex Shi <[email protected]>
---
include/linux/mm_inline.h | 27 ++++++---------------------
1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 6d907a4dd6ad..7183c7a03f09 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -45,21 +45,6 @@ static __always_inline void update_lru_size(struct lruvec *lruvec,
#endif
}

-/**
- * page_lru_base_type - which LRU list type should a page be on?
- * @page: the page to test
- *
- * Used for LRU list index arithmetic.
- *
- * Returns the base LRU type - file or anon - @page should be on.
- */
-static inline enum lru_list page_lru_base_type(struct page *page)
-{
- if (page_is_file_lru(page))
- return LRU_INACTIVE_FILE;
- return LRU_INACTIVE_ANON;
-}
-
/**
* __clear_page_lru_flags - clear page lru flags before releasing a page
* @page: the page that was on lru and now has a zero reference
@@ -92,12 +77,12 @@ static __always_inline enum lru_list page_lru(struct page *page)
VM_BUG_ON_PAGE(PageActive(page) && PageUnevictable(page), page);

if (PageUnevictable(page))
- lru = LRU_UNEVICTABLE;
- else {
- lru = page_lru_base_type(page);
- if (PageActive(page))
- lru += LRU_ACTIVE;
- }
+ return LRU_UNEVICTABLE;
+
+ lru = page_is_file_lru(page) ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
+ if (PageActive(page))
+ lru += LRU_ACTIVE;
+
return lru;
}

--
2.30.0.280.ga3ce27912f-goog

2021-01-22 22:12:23

by Yu Zhao

Subject: [PATCH v2 01/10] mm: use add_page_to_lru_list()

There is add_page_to_lru_list(), and move_pages_to_lru() should reuse
it, not duplicate it.

Link: https://lore.kernel.org/linux-mm/[email protected]/
Signed-off-by: Yu Zhao <[email protected]>
Reviewed-by: Alex Shi <[email protected]>
---
mm/vmscan.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 04509994aed4..19875660e8f8 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1823,7 +1823,6 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
int nr_pages, nr_moved = 0;
LIST_HEAD(pages_to_free);
struct page *page;
- enum lru_list lru;

while (!list_empty(list)) {
page = lru_to_page(list);
@@ -1868,11 +1867,8 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
* inhibits memcg migration).
*/
VM_BUG_ON_PAGE(!lruvec_holds_page_lru_lock(page, lruvec), page);
- lru = page_lru(page);
+ add_page_to_lru_list(page, lruvec, page_lru(page));
nr_pages = thp_nr_pages(page);
-
- update_lru_size(lruvec, lru, page_zonenum(page), nr_pages);
- list_add(&page->lru, &lruvec->lists[lru]);
nr_moved += nr_pages;
if (PageActive(page))
workingset_age_nonresident(lruvec, nr_pages);
--
2.30.0.280.ga3ce27912f-goog

2021-01-22 22:12:32

by Yu Zhao

Subject: [PATCH v2 10/10] mm: make lruvec_lru_size() static

All other references to the function were removed after
commit b910718a948a ("mm: vmscan: detect file thrashing at the reclaim
root").

Link: https://lore.kernel.org/linux-mm/[email protected]/
Signed-off-by: Yu Zhao <[email protected]>
Reviewed-by: Alex Shi <[email protected]>
---
include/linux/mmzone.h | 2 --
mm/vmscan.c | 3 ++-
2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index ae588b2f87ef..844bb93d2a1e 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -892,8 +892,6 @@ static inline struct pglist_data *lruvec_pgdat(struct lruvec *lruvec)
#endif
}

-extern unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru, int zone_idx);
-
#ifdef CONFIG_HAVE_MEMORYLESS_NODES
int local_memory_node(int node_id);
#else
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 348a90096550..fea6b43bc1f9 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -310,7 +310,8 @@ unsigned long zone_reclaimable_pages(struct zone *zone)
* @lru: lru to use
* @zone_idx: zones to consider (use MAX_NR_ZONES for the whole LRU list)
*/
-unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru, int zone_idx)
+static unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru,
+ int zone_idx)
{
unsigned long size = 0;
int zid;
--
2.30.0.280.ga3ce27912f-goog

2021-01-22 22:13:02

by Yu Zhao

Subject: [PATCH v2 09/10] mm: fold __update_lru_size() into its sole caller

All other references to the function were removed after
commit a892cb6b977f ("mm/vmscan.c: use update_lru_size() in
update_lru_sizes()").

Link: https://lore.kernel.org/linux-mm/[email protected]/
Signed-off-by: Yu Zhao <[email protected]>
Reviewed-by: Alex Shi <[email protected]>
---
include/linux/mm_inline.h | 9 +--------
1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 7183c7a03f09..355ea1ee32bd 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -24,7 +24,7 @@ static inline int page_is_file_lru(struct page *page)
return !PageSwapBacked(page);
}

-static __always_inline void __update_lru_size(struct lruvec *lruvec,
+static __always_inline void update_lru_size(struct lruvec *lruvec,
enum lru_list lru, enum zone_type zid,
int nr_pages)
{
@@ -33,13 +33,6 @@ static __always_inline void __update_lru_size(struct lruvec *lruvec,
__mod_lruvec_state(lruvec, NR_LRU_BASE + lru, nr_pages);
__mod_zone_page_state(&pgdat->node_zones[zid],
NR_ZONE_LRU_BASE + lru, nr_pages);
-}
-
-static __always_inline void update_lru_size(struct lruvec *lruvec,
- enum lru_list lru, enum zone_type zid,
- int nr_pages)
-{
- __update_lru_size(lruvec, lru, zid, nr_pages);
#ifdef CONFIG_MEMCG
mem_cgroup_update_lru_size(lruvec, lru, zid, nr_pages);
#endif
--
2.30.0.280.ga3ce27912f-goog

2021-01-22 22:13:04

by Yu Zhao

Subject: [PATCH v2 07/10] mm: VM_BUG_ON lru page flags

Move scattered VM_BUG_ONs to two essential places that cover all
lru list additions and deletions.

Link: https://lore.kernel.org/linux-mm/[email protected]/
Signed-off-by: Yu Zhao <[email protected]>
---
include/linux/mm_inline.h | 4 ++++
mm/swap.c | 2 --
mm/vmscan.c | 1 -
3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index ef3fd79222e5..6d907a4dd6ad 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -66,6 +66,8 @@ static inline enum lru_list page_lru_base_type(struct page *page)
*/
static __always_inline void __clear_page_lru_flags(struct page *page)
{
+ VM_BUG_ON_PAGE(!PageLRU(page), page);
+
__ClearPageLRU(page);

/* this shouldn't happen, so leave the flags to bad_page() */
@@ -87,6 +89,8 @@ static __always_inline enum lru_list page_lru(struct page *page)
{
enum lru_list lru;

+ VM_BUG_ON_PAGE(PageActive(page) && PageUnevictable(page), page);
+
if (PageUnevictable(page))
lru = LRU_UNEVICTABLE;
else {
diff --git a/mm/swap.c b/mm/swap.c
index 38900d672051..31b844d4ed94 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -83,7 +83,6 @@ static void __page_cache_release(struct page *page)
unsigned long flags;

lruvec = lock_page_lruvec_irqsave(page, &flags);
- VM_BUG_ON_PAGE(!PageLRU(page), page);
del_page_from_lru_list(page, lruvec);
__clear_page_lru_flags(page);
unlock_page_lruvec_irqrestore(lruvec, flags);
@@ -909,7 +908,6 @@ void release_pages(struct page **pages, int nr)
if (prev_lruvec != lruvec)
lock_batch = 0;

- VM_BUG_ON_PAGE(!PageLRU(page), page);
del_page_from_lru_list(page, lruvec);
__clear_page_lru_flags(page);
}
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 452dd3818aa3..348a90096550 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4281,7 +4281,6 @@ void check_move_unevictable_pages(struct pagevec *pvec)

lruvec = relock_page_lruvec_irq(page, lruvec);
if (page_evictable(page) && PageUnevictable(page)) {
- VM_BUG_ON_PAGE(PageActive(page), page);
del_page_from_lru_list(page, lruvec);
ClearPageUnevictable(page);
add_page_to_lru_list(page, lruvec);
--
2.30.0.280.ga3ce27912f-goog

2021-01-22 22:13:17

by Yu Zhao

Subject: [PATCH v2 06/10] mm: add __clear_page_lru_flags() to replace page_off_lru()

Similar to page_off_lru(), the new function does non-atomic clearing
of PageLRU(), in addition to PageActive() and PageUnevictable(), on a
page that has no references left.

If PageActive() and PageUnevictable() are both set, refuse to clear
either and leave them to bad_page(). This is a behavior change meant
to help debugging.

Link: https://lore.kernel.org/linux-mm/[email protected]/
Signed-off-by: Yu Zhao <[email protected]>
---
include/linux/mm_inline.h | 28 ++++++++++------------------
mm/swap.c | 6 ++----
mm/vmscan.c | 3 +--
3 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index ffacc6273678..ef3fd79222e5 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -61,27 +61,19 @@ static inline enum lru_list page_lru_base_type(struct page *page)
}

/**
- * page_off_lru - which LRU list was page on? clearing its lru flags.
- * @page: the page to test
- *
- * Returns the LRU list a page was on, as an index into the array of LRU
- * lists; and clears its Unevictable or Active flags, ready for freeing.
+ * __clear_page_lru_flags - clear page lru flags before releasing a page
+ * @page: the page that was on lru and now has a zero reference
*/
-static __always_inline enum lru_list page_off_lru(struct page *page)
+static __always_inline void __clear_page_lru_flags(struct page *page)
{
- enum lru_list lru;
+ __ClearPageLRU(page);

- if (PageUnevictable(page)) {
- __ClearPageUnevictable(page);
- lru = LRU_UNEVICTABLE;
- } else {
- lru = page_lru_base_type(page);
- if (PageActive(page)) {
- __ClearPageActive(page);
- lru += LRU_ACTIVE;
- }
- }
- return lru;
+ /* this shouldn't happen, so leave the flags to bad_page() */
+ if (PageActive(page) && PageUnevictable(page))
+ return;
+
+ __ClearPageActive(page);
+ __ClearPageUnevictable(page);
}

/**
diff --git a/mm/swap.c b/mm/swap.c
index 94532799ed82..38900d672051 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -84,9 +84,8 @@ static void __page_cache_release(struct page *page)

lruvec = lock_page_lruvec_irqsave(page, &flags);
VM_BUG_ON_PAGE(!PageLRU(page), page);
- __ClearPageLRU(page);
del_page_from_lru_list(page, lruvec);
- page_off_lru(page);
+ __clear_page_lru_flags(page);
unlock_page_lruvec_irqrestore(lruvec, flags);
}
__ClearPageWaiters(page);
@@ -911,9 +910,8 @@ void release_pages(struct page **pages, int nr)
lock_batch = 0;

VM_BUG_ON_PAGE(!PageLRU(page), page);
- __ClearPageLRU(page);
del_page_from_lru_list(page, lruvec);
- page_off_lru(page);
+ __clear_page_lru_flags(page);
}

__ClearPageWaiters(page);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 7c65d47e6612..452dd3818aa3 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1849,8 +1849,7 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
SetPageLRU(page);

if (unlikely(put_page_testzero(page))) {
- __ClearPageLRU(page);
- __ClearPageActive(page);
+ __clear_page_lru_flags(page);

if (unlikely(PageCompound(page))) {
spin_unlock_irq(&lruvec->lru_lock);
--
2.30.0.280.ga3ce27912f-goog

2021-01-22 22:14:33

by Yu Zhao

Subject: [PATCH v2 02/10] mm: shuffle lru list addition and deletion functions

These functions will call page_lru() in the following patches. Move
them below page_lru() to avoid the forward declaration.

Link: https://lore.kernel.org/linux-mm/[email protected]/
Signed-off-by: Yu Zhao <[email protected]>
---
include/linux/mm_inline.h | 42 +++++++++++++++++++--------------------
1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 8fc71e9d7bb0..2889741f450a 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -45,27 +45,6 @@ static __always_inline void update_lru_size(struct lruvec *lruvec,
#endif
}

-static __always_inline void add_page_to_lru_list(struct page *page,
- struct lruvec *lruvec, enum lru_list lru)
-{
- update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
- list_add(&page->lru, &lruvec->lists[lru]);
-}
-
-static __always_inline void add_page_to_lru_list_tail(struct page *page,
- struct lruvec *lruvec, enum lru_list lru)
-{
- update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
- list_add_tail(&page->lru, &lruvec->lists[lru]);
-}
-
-static __always_inline void del_page_from_lru_list(struct page *page,
- struct lruvec *lruvec, enum lru_list lru)
-{
- list_del(&page->lru);
- update_lru_size(lruvec, lru, page_zonenum(page), -thp_nr_pages(page));
-}
-
/**
* page_lru_base_type - which LRU list type should a page be on?
* @page: the page to test
@@ -125,4 +104,25 @@ static __always_inline enum lru_list page_lru(struct page *page)
}
return lru;
}
+
+static __always_inline void add_page_to_lru_list(struct page *page,
+ struct lruvec *lruvec, enum lru_list lru)
+{
+ update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
+ list_add(&page->lru, &lruvec->lists[lru]);
+}
+
+static __always_inline void add_page_to_lru_list_tail(struct page *page,
+ struct lruvec *lruvec, enum lru_list lru)
+{
+ update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
+ list_add_tail(&page->lru, &lruvec->lists[lru]);
+}
+
+static __always_inline void del_page_from_lru_list(struct page *page,
+ struct lruvec *lruvec, enum lru_list lru)
+{
+ list_del(&page->lru);
+ update_lru_size(lruvec, lru, page_zonenum(page), -thp_nr_pages(page));
+}
#endif
--
2.30.0.280.ga3ce27912f-goog

2021-01-27 19:49:56

by Vlastimil Babka

Subject: Re: [PATCH v2 02/10] mm: shuffle lru list addition and deletion functions

On 1/22/21 11:05 PM, Yu Zhao wrote:
> These functions will call page_lru() in the following patches. Move
> them below page_lru() to avoid the forward declaration.
>
> Link: https://lore.kernel.org/linux-mm/[email protected]/
> Signed-off-by: Yu Zhao <[email protected]>

Acked-by: Vlastimil Babka <[email protected]>

2021-01-27 19:50:10

by Vlastimil Babka

Subject: Re: [PATCH v2 03/10] mm: don't pass "enum lru_list" to lru list addition functions

On 1/22/21 11:05 PM, Yu Zhao wrote:
> The "enum lru_list" parameter to add_page_to_lru_list() and
> add_page_to_lru_list_tail() is redundant in the sense that it can
> be extracted from the "struct page" parameter by page_lru().

Okay, however, it means repeated extraction of a value that we already knew. The
result of compilation is rather sad. This is bloat-o-meter on mm/built-in.a
(without CONFIG_DEBUG_VM, btw) between patch 2 and 5:

add/remove: 0/0 grow/shrink: 10/5 up/down: 1837/-60 (1777)
Function old new delta
lru_deactivate_file_fn 932 1368 +436
lru_lazyfree_fn.part 629 953 +324
check_move_unevictable_pages 1171 1424 +253
__activate_page.part 735 984 +249
lru_deactivate_fn.part 593 822 +229
perf_trace_mm_lru_insertion 458 560 +102
trace_event_raw_event_mm_lru_insertion 412 500 +88
__page_cache_release 479 558 +79
release_pages 1430 1499 +69
pagevec_move_tail_fn.part 761 769 +8
isolate_lru_page 471 470 -1
__bpf_trace_mm_lru_insertion 7 5 -2
__traceiter_mm_lru_insertion 55 47 -8
isolate_migratepages_block 3200 3185 -15
__pagevec_lru_add_fn 1092 1058 -34


> A caveat is that we need to make sure PageActive() or
> PageUnevictable() is correctly set or cleared before calling
> these two functions. And they are indeed.
>
> Link: https://lore.kernel.org/linux-mm/[email protected]/
> Signed-off-by: Yu Zhao <[email protected]>
> ---
> include/linux/mm_inline.h | 8 ++++++--
> mm/swap.c | 15 +++++++--------
> mm/vmscan.c | 6 ++----
> 3 files changed, 15 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
> index 2889741f450a..130ba3201d3f 100644
> --- a/include/linux/mm_inline.h
> +++ b/include/linux/mm_inline.h
> @@ -106,15 +106,19 @@ static __always_inline enum lru_list page_lru(struct page *page)
> }
>
> static __always_inline void add_page_to_lru_list(struct page *page,
> - struct lruvec *lruvec, enum lru_list lru)
> + struct lruvec *lruvec)
> {
> + enum lru_list lru = page_lru(page);
> +
> update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
> list_add(&page->lru, &lruvec->lists[lru]);
> }
>
> static __always_inline void add_page_to_lru_list_tail(struct page *page,
> - struct lruvec *lruvec, enum lru_list lru)
> + struct lruvec *lruvec)
> {
> + enum lru_list lru = page_lru(page);
> +
> update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
> list_add_tail(&page->lru, &lruvec->lists[lru]);
> }
> diff --git a/mm/swap.c b/mm/swap.c
> index 490553f3f9ef..4b058ef37add 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -231,7 +231,7 @@ static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
> if (!PageUnevictable(page)) {
> del_page_from_lru_list(page, lruvec, page_lru(page));
> ClearPageActive(page);
> - add_page_to_lru_list_tail(page, lruvec, page_lru(page));
> + add_page_to_lru_list_tail(page, lruvec);
> __count_vm_events(PGROTATED, thp_nr_pages(page));
> }
> }
> @@ -313,8 +313,7 @@ static void __activate_page(struct page *page, struct lruvec *lruvec)
>
> del_page_from_lru_list(page, lruvec, lru);
> SetPageActive(page);
> - lru += LRU_ACTIVE;
> - add_page_to_lru_list(page, lruvec, lru);
> + add_page_to_lru_list(page, lruvec);
> trace_mm_lru_activate(page);
>
> __count_vm_events(PGACTIVATE, nr_pages);
> @@ -543,14 +542,14 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec)
> * It can make readahead confusing. But race window
> * is _really_ small and it's non-critical problem.
> */
> - add_page_to_lru_list(page, lruvec, lru);
> + add_page_to_lru_list(page, lruvec);
> SetPageReclaim(page);
> } else {
> /*
> * The page's writeback ends up during pagevec
> * We moves tha page into tail of inactive.
> */
> - add_page_to_lru_list_tail(page, lruvec, lru);
> + add_page_to_lru_list_tail(page, lruvec);
> __count_vm_events(PGROTATED, nr_pages);
> }
>
> @@ -570,7 +569,7 @@ static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec)
> del_page_from_lru_list(page, lruvec, lru + LRU_ACTIVE);
> ClearPageActive(page);
> ClearPageReferenced(page);
> - add_page_to_lru_list(page, lruvec, lru);
> + add_page_to_lru_list(page, lruvec);
>
> __count_vm_events(PGDEACTIVATE, nr_pages);
> __count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE,
> @@ -595,7 +594,7 @@ static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec)
> * anonymous pages
> */
> ClearPageSwapBacked(page);
> - add_page_to_lru_list(page, lruvec, LRU_INACTIVE_FILE);
> + add_page_to_lru_list(page, lruvec);
>
> __count_vm_events(PGLAZYFREE, nr_pages);
> __count_memcg_events(lruvec_memcg(lruvec), PGLAZYFREE,
> @@ -1005,7 +1004,7 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec)
> __count_vm_events(UNEVICTABLE_PGCULLED, nr_pages);
> }
>
> - add_page_to_lru_list(page, lruvec, lru);
> + add_page_to_lru_list(page, lruvec);
> trace_mm_lru_insertion(page, lru);
> }
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 19875660e8f8..09e4f97488c9 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1867,7 +1867,7 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
> * inhibits memcg migration).
> */
> VM_BUG_ON_PAGE(!lruvec_holds_page_lru_lock(page, lruvec), page);
> - add_page_to_lru_list(page, lruvec, page_lru(page));
> + add_page_to_lru_list(page, lruvec);
> nr_pages = thp_nr_pages(page);
> nr_moved += nr_pages;
> if (PageActive(page))
> @@ -4282,12 +4282,10 @@ void check_move_unevictable_pages(struct pagevec *pvec)
>
> lruvec = relock_page_lruvec_irq(page, lruvec);
> if (page_evictable(page) && PageUnevictable(page)) {
> - enum lru_list lru = page_lru_base_type(page);
> -
> VM_BUG_ON_PAGE(PageActive(page), page);
> ClearPageUnevictable(page);
> del_page_from_lru_list(page, lruvec, LRU_UNEVICTABLE);
> - add_page_to_lru_list(page, lruvec, lru);
> + add_page_to_lru_list(page, lruvec);
> pgrescued += nr_pages;
> }
> SetPageLRU(page);
>

2021-01-27 19:50:46

by Vlastimil Babka

Subject: Re: [PATCH v2 01/10] mm: use add_page_to_lru_list()

On 1/22/21 11:05 PM, Yu Zhao wrote:
> There is add_page_to_lru_list(), and move_pages_to_lru() should reuse
> it, not duplicate it.
>
> Link: https://lore.kernel.org/linux-mm/[email protected]/
> Signed-off-by: Yu Zhao <[email protected]>
> Reviewed-by: Alex Shi <[email protected]>

Acked-by: Vlastimil Babka <[email protected]>

> ---
> mm/vmscan.c | 6 +-----
> 1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 04509994aed4..19875660e8f8 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1823,7 +1823,6 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
> int nr_pages, nr_moved = 0;
> LIST_HEAD(pages_to_free);
> struct page *page;
> - enum lru_list lru;
>
> while (!list_empty(list)) {
> page = lru_to_page(list);
> @@ -1868,11 +1867,8 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
> * inhibits memcg migration).
> */
> VM_BUG_ON_PAGE(!lruvec_holds_page_lru_lock(page, lruvec), page);
> - lru = page_lru(page);
> + add_page_to_lru_list(page, lruvec, page_lru(page));
> nr_pages = thp_nr_pages(page);
> -
> - update_lru_size(lruvec, lru, page_zonenum(page), nr_pages);
> - list_add(&page->lru, &lruvec->lists[lru]);
> nr_moved += nr_pages;
> if (PageActive(page))
> workingset_age_nonresident(lruvec, nr_pages);
>

2021-01-27 19:57:20

by Yu Zhao

Subject: Re: [PATCH v2 03/10] mm: don't pass "enum lru_list" to lru list addition functions

On Tue, Jan 26, 2021 at 08:13:11PM +0100, Vlastimil Babka wrote:
> On 1/22/21 11:05 PM, Yu Zhao wrote:
> > The "enum lru_list" parameter to add_page_to_lru_list() and
> > add_page_to_lru_list_tail() is redundant in the sense that it can
> > be extracted from the "struct page" parameter by page_lru().
>
> Okay, however, it means repeated extraction of a value that we already knew. The
> result of compilation is rather sad. This is bloat-o-meter on mm/built-in.a
> (without CONFIG_DEBUG_VM, btw) between patch 2 and 5:

Thanks for noticing this, Vlastimil. Should I drop the rest of the
series except the first patch?

> add/remove: 0/0 grow/shrink: 10/5 up/down: 1837/-60 (1777)
> Function old new delta
> lru_deactivate_file_fn 932 1368 +436
> lru_lazyfree_fn.part 629 953 +324
> check_move_unevictable_pages 1171 1424 +253
> __activate_page.part 735 984 +249
> lru_deactivate_fn.part 593 822 +229
> perf_trace_mm_lru_insertion 458 560 +102
> trace_event_raw_event_mm_lru_insertion 412 500 +88
> __page_cache_release 479 558 +79
> release_pages 1430 1499 +69
> pagevec_move_tail_fn.part 761 769 +8
> isolate_lru_page 471 470 -1
> __bpf_trace_mm_lru_insertion 7 5 -2
> __traceiter_mm_lru_insertion 55 47 -8
> isolate_migratepages_block 3200 3185 -15
> __pagevec_lru_add_fn 1092 1058 -34
>
>
> > A caveat is that we need to make sure PageActive() or
> > PageUnevictable() is correctly set or cleared before calling
> > these two functions. And they are indeed.
> >
> > Link: https://lore.kernel.org/linux-mm/[email protected]/
> > Signed-off-by: Yu Zhao <[email protected]>
> > ---
> > include/linux/mm_inline.h | 8 ++++++--
> > mm/swap.c | 15 +++++++--------
> > mm/vmscan.c | 6 ++----
> > 3 files changed, 15 insertions(+), 14 deletions(-)
> >
> > diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
> > index 2889741f450a..130ba3201d3f 100644
> > --- a/include/linux/mm_inline.h
> > +++ b/include/linux/mm_inline.h
> > @@ -106,15 +106,19 @@ static __always_inline enum lru_list page_lru(struct page *page)
> > }
> >
> > static __always_inline void add_page_to_lru_list(struct page *page,
> > - struct lruvec *lruvec, enum lru_list lru)
> > + struct lruvec *lruvec)
> > {
> > + enum lru_list lru = page_lru(page);
> > +
> > update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
> > list_add(&page->lru, &lruvec->lists[lru]);
> > }
> >
> > static __always_inline void add_page_to_lru_list_tail(struct page *page,
> > - struct lruvec *lruvec, enum lru_list lru)
> > + struct lruvec *lruvec)
> > {
> > + enum lru_list lru = page_lru(page);
> > +
> > update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
> > list_add_tail(&page->lru, &lruvec->lists[lru]);
> > }
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 490553f3f9ef..4b058ef37add 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -231,7 +231,7 @@ static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
> > if (!PageUnevictable(page)) {
> > del_page_from_lru_list(page, lruvec, page_lru(page));
> > ClearPageActive(page);
> > - add_page_to_lru_list_tail(page, lruvec, page_lru(page));
> > + add_page_to_lru_list_tail(page, lruvec);
> > __count_vm_events(PGROTATED, thp_nr_pages(page));
> > }
> > }
> > @@ -313,8 +313,7 @@ static void __activate_page(struct page *page, struct lruvec *lruvec)
> >
> > del_page_from_lru_list(page, lruvec, lru);
> > SetPageActive(page);
> > - lru += LRU_ACTIVE;
> > - add_page_to_lru_list(page, lruvec, lru);
> > + add_page_to_lru_list(page, lruvec);
> > trace_mm_lru_activate(page);
> >
> > __count_vm_events(PGACTIVATE, nr_pages);
> > @@ -543,14 +542,14 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec)
> > * It can make readahead confusing. But race window
> > * is _really_ small and it's non-critical problem.
> > */
> > - add_page_to_lru_list(page, lruvec, lru);
> > + add_page_to_lru_list(page, lruvec);
> > SetPageReclaim(page);
> > } else {
> > /*
> > * The page's writeback ends up during pagevec
> > * We moves tha page into tail of inactive.
> > */
> > - add_page_to_lru_list_tail(page, lruvec, lru);
> > + add_page_to_lru_list_tail(page, lruvec);
> > __count_vm_events(PGROTATED, nr_pages);
> > }
> >
> > @@ -570,7 +569,7 @@ static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec)
> > del_page_from_lru_list(page, lruvec, lru + LRU_ACTIVE);
> > ClearPageActive(page);
> > ClearPageReferenced(page);
> > - add_page_to_lru_list(page, lruvec, lru);
> > + add_page_to_lru_list(page, lruvec);
> >
> > __count_vm_events(PGDEACTIVATE, nr_pages);
> > __count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE,
> > @@ -595,7 +594,7 @@ static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec)
> > * anonymous pages
> > */
> > ClearPageSwapBacked(page);
> > - add_page_to_lru_list(page, lruvec, LRU_INACTIVE_FILE);
> > + add_page_to_lru_list(page, lruvec);
> >
> > __count_vm_events(PGLAZYFREE, nr_pages);
> > __count_memcg_events(lruvec_memcg(lruvec), PGLAZYFREE,
> > @@ -1005,7 +1004,7 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec)
> > __count_vm_events(UNEVICTABLE_PGCULLED, nr_pages);
> > }
> >
> > - add_page_to_lru_list(page, lruvec, lru);
> > + add_page_to_lru_list(page, lruvec);
> > trace_mm_lru_insertion(page, lru);
> > }
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 19875660e8f8..09e4f97488c9 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1867,7 +1867,7 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
> > * inhibits memcg migration).
> > */
> > VM_BUG_ON_PAGE(!lruvec_holds_page_lru_lock(page, lruvec), page);
> > - add_page_to_lru_list(page, lruvec, page_lru(page));
> > + add_page_to_lru_list(page, lruvec);
> > nr_pages = thp_nr_pages(page);
> > nr_moved += nr_pages;
> > if (PageActive(page))
> > @@ -4282,12 +4282,10 @@ void check_move_unevictable_pages(struct pagevec *pvec)
> >
> > lruvec = relock_page_lruvec_irq(page, lruvec);
> > if (page_evictable(page) && PageUnevictable(page)) {
> > - enum lru_list lru = page_lru_base_type(page);
> > -
> > VM_BUG_ON_PAGE(PageActive(page), page);
> > ClearPageUnevictable(page);
> > del_page_from_lru_list(page, lruvec, LRU_UNEVICTABLE);
> > - add_page_to_lru_list(page, lruvec, lru);
> > + add_page_to_lru_list(page, lruvec);
> > pgrescued += nr_pages;
> > }
> > SetPageLRU(page);
> >
>

2021-01-27 19:57:42

by Matthew Wilcox

Subject: Re: [PATCH v2 03/10] mm: don't pass "enum lru_list" to lru list addition functions

On Fri, Jan 22, 2021 at 03:05:53PM -0700, Yu Zhao wrote:
> +++ b/mm/swap.c
> @@ -231,7 +231,7 @@ static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
> if (!PageUnevictable(page)) {
> del_page_from_lru_list(page, lruvec, page_lru(page));
> ClearPageActive(page);
> - add_page_to_lru_list_tail(page, lruvec, page_lru(page));
> + add_page_to_lru_list_tail(page, lruvec);
> __count_vm_events(PGROTATED, thp_nr_pages(page));
> }

Is it profitable to do ...

- del_page_from_lru_list(page, lruvec, page_lru(page));
+ enum lru_list lru = page_lru(page);
+ del_page_from_lru_list(page, lruvec, lru);
ClearPageActive(page);
- add_page_to_lru_list_tail(page, lruvec, page_lru(page));
+ lru &= ~LRU_ACTIVE;
+ add_page_to_lru_list_tail(page, lruvec, lru);

2021-01-27 19:58:47

by Yu Zhao

Subject: Re: [PATCH v2 03/10] mm: don't pass "enum lru_list" to lru list addition functions

On Tue, Jan 26, 2021 at 10:01:11PM +0000, Matthew Wilcox wrote:
> On Fri, Jan 22, 2021 at 03:05:53PM -0700, Yu Zhao wrote:
> > +++ b/mm/swap.c
> > @@ -231,7 +231,7 @@ static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
> > if (!PageUnevictable(page)) {
> > del_page_from_lru_list(page, lruvec, page_lru(page));
> > ClearPageActive(page);
> > - add_page_to_lru_list_tail(page, lruvec, page_lru(page));
> > + add_page_to_lru_list_tail(page, lruvec);
> > __count_vm_events(PGROTATED, thp_nr_pages(page));
> > }
>
> Is it profitable to do ...
>
> - del_page_from_lru_list(page, lruvec, page_lru(page));
> + enum lru_list lru = page_lru(page);
> + del_page_from_lru_list(page, lruvec, lru);
> ClearPageActive(page);
> - add_page_to_lru_list_tail(page, lruvec, page_lru(page));
> + lru &= ~LRU_ACTIVE;
> + add_page_to_lru_list_tail(page, lruvec, lru);

Ok, now we want to trade readability for size. Sure, I'll see how
much we could squeeze.

2021-01-27 20:59:11

by Miaohe Lin

Subject: Re: [PATCH v2 02/10] mm: shuffle lru list addition and deletion functions

On 2021/1/23 6:05, Yu Zhao wrote:
> These functions will call page_lru() in the following patches. Move
> them below page_lru() to avoid the forward declaration.
>
> Link: https://lore.kernel.org/linux-mm/[email protected]/
> Signed-off-by: Yu Zhao <[email protected]>

Reviewed-by: Miaohe Lin <[email protected]>

> ---
> include/linux/mm_inline.h | 42 +++++++++++++++++++--------------------
> 1 file changed, 21 insertions(+), 21 deletions(-)
>
> diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
> index 8fc71e9d7bb0..2889741f450a 100644
> --- a/include/linux/mm_inline.h
> +++ b/include/linux/mm_inline.h
> @@ -45,27 +45,6 @@ static __always_inline void update_lru_size(struct lruvec *lruvec,
> #endif
> }
>
> -static __always_inline void add_page_to_lru_list(struct page *page,
> - struct lruvec *lruvec, enum lru_list lru)
> -{
> - update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
> - list_add(&page->lru, &lruvec->lists[lru]);
> -}
> -
> -static __always_inline void add_page_to_lru_list_tail(struct page *page,
> - struct lruvec *lruvec, enum lru_list lru)
> -{
> - update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
> - list_add_tail(&page->lru, &lruvec->lists[lru]);
> -}
> -
> -static __always_inline void del_page_from_lru_list(struct page *page,
> - struct lruvec *lruvec, enum lru_list lru)
> -{
> - list_del(&page->lru);
> - update_lru_size(lruvec, lru, page_zonenum(page), -thp_nr_pages(page));
> -}
> -
> /**
> * page_lru_base_type - which LRU list type should a page be on?
> * @page: the page to test
> @@ -125,4 +104,25 @@ static __always_inline enum lru_list page_lru(struct page *page)
> }
> return lru;
> }
> +
> +static __always_inline void add_page_to_lru_list(struct page *page,
> + struct lruvec *lruvec, enum lru_list lru)
> +{
> + update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
> + list_add(&page->lru, &lruvec->lists[lru]);
> +}
> +
> +static __always_inline void add_page_to_lru_list_tail(struct page *page,
> + struct lruvec *lruvec, enum lru_list lru)
> +{
> + update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
> + list_add_tail(&page->lru, &lruvec->lists[lru]);
> +}
> +
> +static __always_inline void del_page_from_lru_list(struct page *page,
> + struct lruvec *lruvec, enum lru_list lru)
> +{
> + list_del(&page->lru);
> + update_lru_size(lruvec, lru, page_zonenum(page), -thp_nr_pages(page));
> +}
> #endif
>

2021-01-27 20:59:11

by Miaohe Lin

Subject: Re: [PATCH v2 01/10] mm: use add_page_to_lru_list()

On 2021/1/23 6:05, Yu Zhao wrote:
> There is add_page_to_lru_list(), and move_pages_to_lru() should reuse
> it, not duplicate it.
>
> Link: https://lore.kernel.org/linux-mm/[email protected]/
> Signed-off-by: Yu Zhao <[email protected]>
> Reviewed-by: Alex Shi <[email protected]>
> ---
> mm/vmscan.c | 6 +-----
> 1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 04509994aed4..19875660e8f8 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1823,7 +1823,6 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
> int nr_pages, nr_moved = 0;
> LIST_HEAD(pages_to_free);
> struct page *page;
> - enum lru_list lru;
>
> while (!list_empty(list)) {
> page = lru_to_page(list);
> @@ -1868,11 +1867,8 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
> * inhibits memcg migration).
> */
> VM_BUG_ON_PAGE(!lruvec_holds_page_lru_lock(page, lruvec), page);
> - lru = page_lru(page);
> + add_page_to_lru_list(page, lruvec, page_lru(page));
> nr_pages = thp_nr_pages(page);
> -
> - update_lru_size(lruvec, lru, page_zonenum(page), nr_pages);
> - list_add(&page->lru, &lruvec->lists[lru]);
> nr_moved += nr_pages;
> if (PageActive(page))
> workingset_age_nonresident(lruvec, nr_pages);
>

Reviewed-by: Miaohe Lin <[email protected]>

2021-01-27 22:14:00

by Vlastimil Babka

Subject: Re: [PATCH v2 03/10] mm: don't pass "enum lru_list" to lru list addition functions

On 1/26/21 10:34 PM, Yu Zhao wrote:
> On Tue, Jan 26, 2021 at 08:13:11PM +0100, Vlastimil Babka wrote:
>> On 1/22/21 11:05 PM, Yu Zhao wrote:
>> > The "enum lru_list" parameter to add_page_to_lru_list() and
>> > add_page_to_lru_list_tail() is redundant in the sense that it can
>> > be extracted from the "struct page" parameter by page_lru().
>>
>> Okay, however, it means repeated extraction of a value that we already knew. The
>> result of compilation is rather sad. This is bloat-o-meter on mm/built-in.a
>> (without CONFIG_DEBUG_VM, btw) between patch 2 and 5:
>
> Thanks for noticing this, Vlastimil. Should I drop the rest of the
> series except the first patch?

I didn't check how 6-10 look (and if they are still applicable without 3-5),
this was just between 2 and 5.

2021-02-24 00:49:15

by Andrew Morton

Subject: Re: [PATCH v2 03/10] mm: don't pass "enum lru_list" to lru list addition functions

On Tue, 26 Jan 2021 15:14:38 -0700 Yu Zhao <[email protected]> wrote:

> On Tue, Jan 26, 2021 at 10:01:11PM +0000, Matthew Wilcox wrote:
> > On Fri, Jan 22, 2021 at 03:05:53PM -0700, Yu Zhao wrote:
> > > +++ b/mm/swap.c
> > > @@ -231,7 +231,7 @@ static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
> > > if (!PageUnevictable(page)) {
> > > del_page_from_lru_list(page, lruvec, page_lru(page));
> > > ClearPageActive(page);
> > > - add_page_to_lru_list_tail(page, lruvec, page_lru(page));
> > > + add_page_to_lru_list_tail(page, lruvec);
> > > __count_vm_events(PGROTATED, thp_nr_pages(page));
> > > }
> >
> > Is it profitable to do ...
> >
> > - del_page_from_lru_list(page, lruvec, page_lru(page));
> > + enum lru_list lru = page_lru(page);
> > + del_page_from_lru_list(page, lruvec, lru);
> > ClearPageActive(page);
> > - add_page_to_lru_list_tail(page, lruvec, page_lru(page));
> > + lru &= ~LRU_ACTIVE;
> > + add_page_to_lru_list_tail(page, lruvec, lru);
>
> Ok, now we want to trade readability for size. Sure, I'll see how
> much we could squeeze.

As nothing has happened here and the code bloat issue remains, I'll
hold this series out of 5.12-rc1.

2021-02-24 07:52:09

by Yu Zhao

Subject: Re: [PATCH v2 03/10] mm: don't pass "enum lru_list" to lru list addition functions

On Tue, Feb 23, 2021 at 02:50:11PM -0800, Andrew Morton wrote:
> On Tue, 26 Jan 2021 15:14:38 -0700 Yu Zhao <[email protected]> wrote:
>
> > On Tue, Jan 26, 2021 at 10:01:11PM +0000, Matthew Wilcox wrote:
> > > On Fri, Jan 22, 2021 at 03:05:53PM -0700, Yu Zhao wrote:
> > > > +++ b/mm/swap.c
> > > > @@ -231,7 +231,7 @@ static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
> > > > if (!PageUnevictable(page)) {
> > > > del_page_from_lru_list(page, lruvec, page_lru(page));
> > > > ClearPageActive(page);
> > > > - add_page_to_lru_list_tail(page, lruvec, page_lru(page));
> > > > + add_page_to_lru_list_tail(page, lruvec);
> > > > __count_vm_events(PGROTATED, thp_nr_pages(page));
> > > > }
> > >
> > > Is it profitable to do ...
> > >
> > > - del_page_from_lru_list(page, lruvec, page_lru(page));
> > > + enum lru_list lru = page_lru(page);
> > > + del_page_from_lru_list(page, lruvec, lru);
> > > ClearPageActive(page);
> > > - add_page_to_lru_list_tail(page, lruvec, page_lru(page));
> > > + lru &= ~LRU_ACTIVE;
> > > + add_page_to_lru_list_tail(page, lruvec, lru);
> >
> > Ok, now we want to trade readability for size. Sure, I'll see how
> > much we could squeeze.
>
> As nothing has happened here and the code bloat issue remains, I'll
> hold this series out of 5.12-rc1.

Sorry for the slow response. I was trying to ascertain why
page_lru(), a tiny helper, could bloat vmlinux by O(KB). It turned out
compound_head() included in Page{Active,Unevictable} is a nuisance in
our case. Testing PG_{active,unevictable} against
compound_head(page)->flags is really unnecessary because all lru
operations are eventually done on page->lru not
compound_head(page)->lru. With the following change, which sacrifices
the readability a bit, we gain 998 bytes with Clang but lose 227 bytes
with GCC, which IMO is a win. (We use Clang by default.)
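
(A simplified sketch of the difference, not the exact macro expansion: each
Page{Active,Unevictable}() test behaves roughly like

	test_bit(PG_active, &compound_head(page)->flags);	/* extra compound_head() lookup */

while the rewritten page_lru() below does a single plain read of page->flags
and tests the bits on that value.)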


diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 355ea1ee32bd..ec0878a3cdfe 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -46,14 +46,12 @@ static __always_inline void __clear_page_lru_flags(struct page *page)
{
VM_BUG_ON_PAGE(!PageLRU(page), page);

- __ClearPageLRU(page);
-
/* this shouldn't happen, so leave the flags to bad_page() */
- if (PageActive(page) && PageUnevictable(page))
+ if ((page->flags & (BIT(PG_active) | BIT(PG_unevictable))) ==
+ (BIT(PG_active) | BIT(PG_unevictable)))
return;

- __ClearPageActive(page);
- __ClearPageUnevictable(page);
+ page->flags &= ~(BIT(PG_lru) | BIT(PG_active) | BIT(PG_unevictable));
}

/**
@@ -65,18 +63,12 @@ static __always_inline void __clear_page_lru_flags(struct page *page)
*/
static __always_inline enum lru_list page_lru(struct page *page)
{
- enum lru_list lru;
+ unsigned long flags = READ_ONCE(page->flags);

VM_BUG_ON_PAGE(PageActive(page) && PageUnevictable(page), page);

- if (PageUnevictable(page))
- return LRU_UNEVICTABLE;
-
- lru = page_is_file_lru(page) ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
- if (PageActive(page))
- lru += LRU_ACTIVE;
-
- return lru;
+ return (flags & BIT(PG_unevictable)) ? LRU_UNEVICTABLE :
+ (LRU_FILE * !(flags & BIT(PG_swapbacked)) + !!(flags & BIT(PG_active)));
}

static __always_inline void add_page_to_lru_list(struct page *page,


I'll post this as a separate patch. Below the bloat-o-meter collected
on top of c03c21ba6f4e.

$ ./scripts/bloat-o-meter ../vmlinux.clang.orig ../vmlinux.clang
add/remove: 0/1 grow/shrink: 7/10 up/down: 191/-1189 (-998)
Function old new delta
lru_lazyfree_fn 848 893 +45
lru_deactivate_file_fn 1037 1075 +38
perf_trace_mm_lru_insertion 515 548 +33
check_move_unevictable_pages 983 1006 +23
__activate_page 706 729 +23
trace_event_raw_event_mm_lru_insertion 476 497 +21
lru_deactivate_fn 691 699 +8
__bpf_trace_mm_lru_insertion 13 11 -2
__traceiter_mm_lru_insertion 67 62 -5
move_pages_to_lru 964 881 -83
__pagevec_lru_add_fn 665 581 -84
isolate_lru_page 524 419 -105
__munlock_pagevec 1609 1481 -128
isolate_migratepages_block 3370 3237 -133
__page_cache_release 556 413 -143
lruvec_lru_size 151 - -151
release_pages 1025 866 -159
pagevec_move_tail_fn 805 609 -196
Total: Before=19502982, After=19501984, chg -0.01%

$ ./scripts/bloat-o-meter ../vmlinux.gcc.orig ../vmlinux.gcc
add/remove: 0/1 grow/shrink: 9/9 up/down: 1010/-783 (227)
Function old new delta
shrink_lruvec 1690 1950 +260
lru_deactivate_file_fn 961 1128 +167
isolate_migratepages_block 3286 3427 +141
check_move_unevictable_pages 1042 1170 +128
lru_lazyfree_fn 709 822 +113
lru_deactivate_fn 665 724 +59
__activate_page 703 760 +57
trace_event_raw_event_mm_lru_insertion 432 478 +46
perf_trace_mm_lru_insertion 464 503 +39
__bpf_trace_mm_lru_insertion 13 11 -2
__traceiter_mm_lru_insertion 66 57 -9
isolate_lru_page 472 405 -67
__munlock_pagevec 1282 1212 -70
__pagevec_lru_add 976 893 -83
__page_cache_release 508 418 -90
release_pages 978 887 -91
move_pages_to_lru 954 853 -101
lruvec_lru_size 131 - -131
pagevec_move_tail_fn 770 631 -139
Total: Before=19237248, After=19237475, chg +0.00%

2021-02-24 08:44:15

by Yu Zhao

[permalink] [raw]
Subject: Re: [PATCH v2 03/10] mm: don't pass "enum lru_list" to lru list addition functions

On Wed, Feb 24, 2021 at 04:06:45PM +0800, Alex Shi wrote:
>
>
> On 2021/2/24 1:29 PM, Yu Zhao wrote:
> > On Tue, Feb 23, 2021 at 02:50:11PM -0800, Andrew Morton wrote:
> >> On Tue, 26 Jan 2021 15:14:38 -0700 Yu Zhao <[email protected]> wrote:
> >>
> >>> On Tue, Jan 26, 2021 at 10:01:11PM +0000, Matthew Wilcox wrote:
> >>>> On Fri, Jan 22, 2021 at 03:05:53PM -0700, Yu Zhao wrote:
> >>>>> +++ b/mm/swap.c
> >>>>> @@ -231,7 +231,7 @@ static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
> >>>>> if (!PageUnevictable(page)) {
> >>>>> del_page_from_lru_list(page, lruvec, page_lru(page));
> >>>>> ClearPageActive(page);
> >>>>> - add_page_to_lru_list_tail(page, lruvec, page_lru(page));
> >>>>> + add_page_to_lru_list_tail(page, lruvec);
> >>>>> __count_vm_events(PGROTATED, thp_nr_pages(page));
> >>>>> }
> >>>>
> >>>> Is it profitable to do ...
> >>>>
> >>>> - del_page_from_lru_list(page, lruvec, page_lru(page));
> >>>> + enum lru_list lru = page_lru(page);
> >>>> + del_page_from_lru_list(page, lruvec, lru);
> >>>> ClearPageActive(page);
> >>>> - add_page_to_lru_list_tail(page, lruvec, page_lru(page));
> >>>> + lru &= ~LRU_ACTIVE;
> >>>> + add_page_to_lru_list_tail(page, lruvec, lru);
> >>>
> >>> Ok, now we want to trade readability for size. Sure, I'll see how
> >>> much we could squeeze.
> >>
> >> As nothing has happened here and the code bloat issue remains, I'll
> >> hold this series out of 5.12-rc1.
> >
> > Sorry for the slow response. I was trying to ascertain why
> > page_lru(), a tiny helper, could bloat vmlinux by O(KB). It turned out
> > compound_head() included in Page{Active,Unevictable} is a nuisance in
> > our case. Testing PG_{active,unevictable} against
> > compound_head(page)->flags is really unnecessary because all lru
> > operations are eventually done on page->lru not
> > compound_head(page)->lru. With the following change, which sacrifices
> > the readability a bit, we gain 998 bytes with Clang but lose 227 bytes
> > with GCC, which IMO is a win. (We use Clang by default.)
> >
> >
> > diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
> > index 355ea1ee32bd..ec0878a3cdfe 100644
> > --- a/include/linux/mm_inline.h
> > +++ b/include/linux/mm_inline.h
> > @@ -46,14 +46,12 @@ static __always_inline void __clear_page_lru_flags(struct page *page)
> > {
> > VM_BUG_ON_PAGE(!PageLRU(page), page);
> >
> > - __ClearPageLRU(page);
> > -
> > /* this shouldn't happen, so leave the flags to bad_page() */
> > - if (PageActive(page) && PageUnevictable(page))
> > + if ((page->flags & (BIT(PG_active) | BIT(PG_unevictable))) ==
> > + (BIT(PG_active) | BIT(PG_unevictable)))
> > return;
> >
> > - __ClearPageActive(page);
> > - __ClearPageUnevictable(page);
> > + page->flags &= ~(BIT(PG_lru) | BIT(PG_active) | BIT(PG_unevictable));
> > }
> >
> > /**
> > @@ -65,18 +63,12 @@ static __always_inline void __clear_page_lru_flags(struct page *page)
> > */
> > static __always_inline enum lru_list page_lru(struct page *page)
> > {
> > - enum lru_list lru;
> > + unsigned long flags = READ_ONCE(page->flags);
> >
> > VM_BUG_ON_PAGE(PageActive(page) && PageUnevictable(page), page);
> >
> > - if (PageUnevictable(page))
> > - return LRU_UNEVICTABLE;
> > -
> > - lru = page_is_file_lru(page) ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
> > - if (PageActive(page))
> > - lru += LRU_ACTIVE;
> > -
> > - return lru;
> > + return (flags & BIT(PG_unevictable)) ? LRU_UNEVICTABLE :
> > + (LRU_FILE * !(flags & BIT(PG_swapbacked)) + !!(flags & BIT(PG_active)));
>
> Currently each page flag uses a different flags policy; does this mean the above flags could be
> changed to the PF_ANY policy?

That's a good question. Semantically, no because
PG_{active,unevictable} only apply to head pages. But practically,
I think the answer is yes, and the only place that needs to
explicitly call compound_head() is gather_stats() in
fs/proc/task_mmu.c, IIRC.
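
For context, the flags policies in question boil down to roughly the
following (abridged from include/linux/page-flags.h at this series'
baseline); PF_ANY and PF_ONLY_HEAD never call compound_head(), while
PF_HEAD and PF_NO_TAIL do:

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)
#define PF_HEAD(page, enforce)	PF_POISONED_CHECK(compound_head(page))
#define PF_ONLY_HEAD(page, enforce) ({					\
		VM_BUG_ON_PGFLAGS(PageTail(page), page);		\
		PF_POISONED_CHECK(page); })
#define PF_NO_TAIL(page, enforce) ({					\
		VM_BUG_ON_PGFLAGS(enforce && PageTail(page), page);	\
		PF_POISONED_CHECK(compound_head(page)); })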

>
> Thanks
> Alex
>
> > }
> >
> > static __always_inline void add_page_to_lru_list(struct page *page,
> >
> >
> > I'll post this as a separate patch. Below the bloat-o-meter collected
> > on top of c03c21ba6f4e.
> >
> > $ ./scripts/bloat-o-meter ../vmlinux.clang.orig ../vmlinux.clang
> > add/remove: 0/1 grow/shrink: 7/10 up/down: 191/-1189 (-998)
> > Function old new delta
> > lru_lazyfree_fn 848 893 +45
> > lru_deactivate_file_fn 1037 1075 +38
> > perf_trace_mm_lru_insertion 515 548 +33
> > check_move_unevictable_pages 983 1006 +23
> > __activate_page 706 729 +23
> > trace_event_raw_event_mm_lru_insertion 476 497 +21
> > lru_deactivate_fn 691 699 +8
> > __bpf_trace_mm_lru_insertion 13 11 -2
> > __traceiter_mm_lru_insertion 67 62 -5
> > move_pages_to_lru 964 881 -83
> > __pagevec_lru_add_fn 665 581 -84
> > isolate_lru_page 524 419 -105
> > __munlock_pagevec 1609 1481 -128
> > isolate_migratepages_block 3370 3237 -133
> > __page_cache_release 556 413 -143
> > lruvec_lru_size 151 - -151
> > release_pages 1025 866 -159
> > pagevec_move_tail_fn 805 609 -196
> > Total: Before=19502982, After=19501984, chg -0.01%
> >
> > $ ./scripts/bloat-o-meter ../vmlinux.gcc.orig ../vmlinux.gcc
> > add/remove: 0/1 grow/shrink: 9/9 up/down: 1010/-783 (227)
> > Function old new delta
> > shrink_lruvec 1690 1950 +260
> > lru_deactivate_file_fn 961 1128 +167
> > isolate_migratepages_block 3286 3427 +141
> > check_move_unevictable_pages 1042 1170 +128
> > lru_lazyfree_fn 709 822 +113
> > lru_deactivate_fn 665 724 +59
> > __activate_page 703 760 +57
> > trace_event_raw_event_mm_lru_insertion 432 478 +46
> > perf_trace_mm_lru_insertion 464 503 +39
> > __bpf_trace_mm_lru_insertion 13 11 -2
> > __traceiter_mm_lru_insertion 66 57 -9
> > isolate_lru_page 472 405 -67
> > __munlock_pagevec 1282 1212 -70
> > __pagevec_lru_add 976 893 -83
> > __page_cache_release 508 418 -90
> > release_pages 978 887 -91
> > move_pages_to_lru 954 853 -101
> > lruvec_lru_size 131 - -131
> > pagevec_move_tail_fn 770 631 -139
> > Total: Before=19237248, After=19237475, chg +0.00%
> >

2021-02-24 08:53:43

by Yu Zhao

[permalink] [raw]
Subject: [PATCH] mm: test page->flags directly in page_lru()

Currently page_lru() uses Page{Active,Unevictable} to determine which
lru list a page belongs to. Page{Active,Unevictable} contain
compound_head() and therefore page_lru() essentially tests
PG_{active,unevictable} against compound_head(page)->flags. Once an
lru list is determined, page->lru, rather than
compound_head(page)->lru, will be added to or deleted from it.

Though not a bug, having compound_head() in page_lru() increases the
size of vmlinux by O(KB) because page_lru() is inlined in many places.
And removing compound_head() entirely from Page{Active,Unevictable}
may not be the best option (for the moment) either because there
may be other cases that need compound_head(). This patch makes
page_lru() and __clear_page_lru_flags(), which are used immediately
before and after operations on page->lru, test
PG_{active,unevictable} directly against page->flags instead.

scripts/bloat-o-meter results before and after the entire series:
Clang: add/remove: 0/1 grow/shrink: 7/10 up/down: 191/-1189 (-998)
GCC: add/remove: 0/1 grow/shrink: 9/9 up/down: 1010/-783 (227)

Signed-off-by: Yu Zhao <[email protected]>
---
include/linux/mm_inline.h | 21 +++++++--------------
1 file changed, 7 insertions(+), 14 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 355ea1ee32bd..1b8df9e6f63f 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -46,14 +46,12 @@ static __always_inline void __clear_page_lru_flags(struct page *page)
{
VM_BUG_ON_PAGE(!PageLRU(page), page);

- __ClearPageLRU(page);
-
/* this shouldn't happen, so leave the flags to bad_page() */
- if (PageActive(page) && PageUnevictable(page))
+ if ((page->flags & (BIT(PG_active) | BIT(PG_unevictable))) ==
+ (BIT(PG_active) | BIT(PG_unevictable)))
return;

- __ClearPageActive(page);
- __ClearPageUnevictable(page);
+ page->flags &= ~(BIT(PG_lru) | BIT(PG_active) | BIT(PG_unevictable));
}

/**
@@ -65,18 +63,13 @@ static __always_inline void __clear_page_lru_flags(struct page *page)
*/
static __always_inline enum lru_list page_lru(struct page *page)
{
- enum lru_list lru;
+ unsigned long flags = READ_ONCE(page->flags);

VM_BUG_ON_PAGE(PageActive(page) && PageUnevictable(page), page);

- if (PageUnevictable(page))
- return LRU_UNEVICTABLE;
-
- lru = page_is_file_lru(page) ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
- if (PageActive(page))
- lru += LRU_ACTIVE;
-
- return lru;
+ /* test page->flags directly to avoid unnecessary compound_head() */
+ return (flags & BIT(PG_unevictable)) ? LRU_UNEVICTABLE :
+ (LRU_FILE * !(flags & BIT(PG_swapbacked)) + !!(flags & BIT(PG_active)));
}

static __always_inline void add_page_to_lru_list(struct page *page,
--
2.30.0.617.g56c4b15f3c-goog

2021-02-24 13:03:21

by Alex Shi

[permalink] [raw]
Subject: Re: [PATCH v2 03/10] mm: don't pass "enum lru_list" to lru list addition functions



On 2021/2/24 1:29 PM, Yu Zhao wrote:
> On Tue, Feb 23, 2021 at 02:50:11PM -0800, Andrew Morton wrote:
>> On Tue, 26 Jan 2021 15:14:38 -0700 Yu Zhao <[email protected]> wrote:
>>
>>> On Tue, Jan 26, 2021 at 10:01:11PM +0000, Matthew Wilcox wrote:
>>>> On Fri, Jan 22, 2021 at 03:05:53PM -0700, Yu Zhao wrote:
>>>>> +++ b/mm/swap.c
>>>>> @@ -231,7 +231,7 @@ static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
>>>>> if (!PageUnevictable(page)) {
>>>>> del_page_from_lru_list(page, lruvec, page_lru(page));
>>>>> ClearPageActive(page);
>>>>> - add_page_to_lru_list_tail(page, lruvec, page_lru(page));
>>>>> + add_page_to_lru_list_tail(page, lruvec);
>>>>> __count_vm_events(PGROTATED, thp_nr_pages(page));
>>>>> }
>>>>
>>>> Is it profitable to do ...
>>>>
>>>> - del_page_from_lru_list(page, lruvec, page_lru(page));
>>>> + enum lru_list lru = page_lru(page);
>>>> + del_page_from_lru_list(page, lruvec, lru);
>>>> ClearPageActive(page);
>>>> - add_page_to_lru_list_tail(page, lruvec, page_lru(page));
>>>> + lru &= ~LRU_ACTIVE;
>>>> + add_page_to_lru_list_tail(page, lruvec, lru);
>>>
>>> Ok, now we want to trade readability for size. Sure, I'll see how
>>> much we could squeeze.
>>
>> As nothing has happened here and the code bloat issue remains, I'll
>> hold this series out of 5.12-rc1.
>
> Sorry for the slow response. I was trying to ascertain why
> page_lru(), a tiny helper, could bloat vmlinux by O(KB). It turned out
> compound_head() included in Page{Active,Unevictable} is a nuisance in
> our case. Testing PG_{active,unevictable} against
> compound_head(page)->flags is really unnecessary because all lru
> operations are eventually done on page->lru not
> compound_head(page)->lru. With the following change, which sacrifices
> the readability a bit, we gain 998 bytes with Clang but lose 227 bytes
> with GCC, which IMO is a win. (We use Clang by default.)
>
>
> diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
> index 355ea1ee32bd..ec0878a3cdfe 100644
> --- a/include/linux/mm_inline.h
> +++ b/include/linux/mm_inline.h
> @@ -46,14 +46,12 @@ static __always_inline void __clear_page_lru_flags(struct page *page)
> {
> VM_BUG_ON_PAGE(!PageLRU(page), page);
>
> - __ClearPageLRU(page);
> -
> /* this shouldn't happen, so leave the flags to bad_page() */
> - if (PageActive(page) && PageUnevictable(page))
> + if ((page->flags & (BIT(PG_active) | BIT(PG_unevictable))) ==
> + (BIT(PG_active) | BIT(PG_unevictable)))
> return;
>
> - __ClearPageActive(page);
> - __ClearPageUnevictable(page);
> + page->flags &= ~(BIT(PG_lru) | BIT(PG_active) | BIT(PG_unevictable));
> }
>
> /**
> @@ -65,18 +63,12 @@ static __always_inline void __clear_page_lru_flags(struct page *page)
> */
> static __always_inline enum lru_list page_lru(struct page *page)
> {
> - enum lru_list lru;
> + unsigned long flags = READ_ONCE(page->flags);
>
> VM_BUG_ON_PAGE(PageActive(page) && PageUnevictable(page), page);
>
> - if (PageUnevictable(page))
> - return LRU_UNEVICTABLE;
> -
> - lru = page_is_file_lru(page) ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
> - if (PageActive(page))
> - lru += LRU_ACTIVE;
> -
> - return lru;
> + return (flags & BIT(PG_unevictable)) ? LRU_UNEVICTABLE :
> + (LRU_FILE * !(flags & BIT(PG_swapbacked)) + !!(flags & BIT(PG_active)));

Currently each page flag uses a different flags policy; does this mean the above flags could be
changed to the PF_ANY policy?

Thanks
Alex

> }
>
> static __always_inline void add_page_to_lru_list(struct page *page,
>
>
> I'll post this as a separate patch. Below the bloat-o-meter collected
> on top of c03c21ba6f4e.
>
> $ ./scripts/bloat-o-meter ../vmlinux.clang.orig ../vmlinux.clang
> add/remove: 0/1 grow/shrink: 7/10 up/down: 191/-1189 (-998)
> Function old new delta
> lru_lazyfree_fn 848 893 +45
> lru_deactivate_file_fn 1037 1075 +38
> perf_trace_mm_lru_insertion 515 548 +33
> check_move_unevictable_pages 983 1006 +23
> __activate_page 706 729 +23
> trace_event_raw_event_mm_lru_insertion 476 497 +21
> lru_deactivate_fn 691 699 +8
> __bpf_trace_mm_lru_insertion 13 11 -2
> __traceiter_mm_lru_insertion 67 62 -5
> move_pages_to_lru 964 881 -83
> __pagevec_lru_add_fn 665 581 -84
> isolate_lru_page 524 419 -105
> __munlock_pagevec 1609 1481 -128
> isolate_migratepages_block 3370 3237 -133
> __page_cache_release 556 413 -143
> lruvec_lru_size 151 - -151
> release_pages 1025 866 -159
> pagevec_move_tail_fn 805 609 -196
> Total: Before=19502982, After=19501984, chg -0.01%
>
> $ ./scripts/bloat-o-meter ../vmlinux.gcc.orig ../vmlinux.gcc
> add/remove: 0/1 grow/shrink: 9/9 up/down: 1010/-783 (227)
> Function old new delta
> shrink_lruvec 1690 1950 +260
> lru_deactivate_file_fn 961 1128 +167
> isolate_migratepages_block 3286 3427 +141
> check_move_unevictable_pages 1042 1170 +128
> lru_lazyfree_fn 709 822 +113
> lru_deactivate_fn 665 724 +59
> __activate_page 703 760 +57
> trace_event_raw_event_mm_lru_insertion 432 478 +46
> perf_trace_mm_lru_insertion 464 503 +39
> __bpf_trace_mm_lru_insertion 13 11 -2
> __traceiter_mm_lru_insertion 66 57 -9
> isolate_lru_page 472 405 -67
> __munlock_pagevec 1282 1212 -70
> __pagevec_lru_add 976 893 -83
> __page_cache_release 508 418 -90
> release_pages 978 887 -91
> move_pages_to_lru 954 853 -101
> lruvec_lru_size 131 - -131
> pagevec_move_tail_fn 770 631 -139
> Total: Before=19237248, After=19237475, chg +0.00%
>

2021-02-24 13:06:50

by Alex Shi

[permalink] [raw]
Subject: Re: [PATCH v2 03/10] mm: don't pass "enum lru_list" to lru list addition functions



On 2021/2/24 4:37 PM, Yu Zhao wrote:
>>> @@ -65,18 +63,12 @@ static __always_inline void __clear_page_lru_flags(struct page *page)
>>> */
>>> static __always_inline enum lru_list page_lru(struct page *page)
>>> {
>>> - enum lru_list lru;
>>> + unsigned long flags = READ_ONCE(page->flags);
>>>
>>> VM_BUG_ON_PAGE(PageActive(page) && PageUnevictable(page), page);
>>>
>>> - if (PageUnevictable(page))
>>> - return LRU_UNEVICTABLE;
>>> -
>>> - lru = page_is_file_lru(page) ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
>>> - if (PageActive(page))
>>> - lru += LRU_ACTIVE;
>>> -
>>> - return lru;
>>> + return (flags & BIT(PG_unevictable)) ? LRU_UNEVICTABLE :
>>> + (LRU_FILE * !(flags & BIT(PG_swapbacked)) + !!(flags & BIT(PG_active)));
>> Currently each page flag uses a different flags policy; does this mean the above flags could be
>> changed to the PF_ANY policy?
> That's a good question. Semantically, no because
> PG_{active,unevictable} only apply to head pages. But practically,
> I think the answer is yes, and the only place that needs to
> explicitly call compound_head() is gather_stats() in
> fs/proc/task_mmu.c, IIRC.
>


A quick test for your testing request:

# ll vmlinux vmlinux.new
-rwxr-xr-x 1 root root 62245304 Feb 24 16:57 vmlinux
-rwxr-xr-x 1 root root 62245280 Feb 24 16:55 vmlinux.new
# gcc --version
gcc (GCC) 8.3.1 20190311 (Red Hat 8.3.1-3)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

# scripts/bloat-o-meter vmlinux vmlinux.new
add/remove: 0/0 grow/shrink: 1/15 up/down: 1/-2008 (-2007)
Function old new delta
vermagic 37 38 +1
trace_event_raw_event_mm_lru_insertion 471 418 -53
perf_trace_mm_lru_insertion 526 473 -53
__munlock_pagevec 1134 1069 -65
isolate_migratepages_block 2623 2547 -76
isolate_lru_page 384 303 -81
__pagevec_lru_add 753 652 -101
release_pages 780 667 -113
__page_cache_release 429 276 -153
move_pages_to_lru 871 702 -169
lru_lazyfree_fn 712 539 -173
check_move_unevictable_pages 938 763 -175
__activate_page 665 488 -177
lru_deactivate_fn 636 452 -184
pagevec_move_tail_fn 597 411 -186
lru_deactivate_file_fn 1000 751 -249
Total: Before=17029652, After=17027645, chg -0.01%

2021-02-24 20:03:32

by Yu Zhao

[permalink] [raw]
Subject: Re: [PATCH] mm: test page->flags directly in page_lru()

On Wed, Feb 24, 2021 at 05:15:58AM -0800, Andrew Morton wrote:
> On Wed, 24 Feb 2021 01:48:07 -0700 Yu Zhao <[email protected]> wrote:
>
> > Currently page_lru() uses Page{Active,Unevictable} to determine which
> > lru list a page belongs to. Page{Active,Unevictable} contain
> > compound_head() and therefore page_lru() essentially tests
> > PG_{active,unevictable} against compound_head(page)->flags. Once an
> > lru list is determined, page->lru, rather than
> > compound_head(page)->lru, will be added to or deleted from it.
> >
> > Though not a bug, having compound_head() in page_lru() increases the
> > size of vmlinux by O(KB) because page_lru() is inlined in many places.
> > And removing compound_head() entirely from Page{Active,Unevictable}
> > may not be the best option (for the moment) either because there
> > may be other cases that need compound_head(). This patch makes
> > page_lru() and __clear_page_lru_flags(), which are used immediately
> > before and after operations on page->lru, test
> > PG_{active,unevictable} directly against page->flags instead.
>
> Oh geeze.
>
> > --- a/include/linux/mm_inline.h
> > +++ b/include/linux/mm_inline.h
> > @@ -46,14 +46,12 @@ static __always_inline void __clear_page_lru_flags(struct page *page)
> > {
> > VM_BUG_ON_PAGE(!PageLRU(page), page);
> >
> > - __ClearPageLRU(page);
> > -
> > /* this shouldn't happen, so leave the flags to bad_page() */
> > - if (PageActive(page) && PageUnevictable(page))
> > + if ((page->flags & (BIT(PG_active) | BIT(PG_unevictable))) ==
> > + (BIT(PG_active) | BIT(PG_unevictable)))
> > return;
>
> This isn't very nice. At the very least we should have (documented!)
> helper functions for this:

You are right. Now when I look at this, I s/dislike/hate/ it.

> /* comment goes here */
> static inline bool RawPageActive(struct page *page)
> {
> ...
> }
>
>
>
> However.
>
> Here's what the preprocessor produces for an allmodconfig version of
> PageActive():
>
> static inline __attribute__((__gnu_inline__)) __attribute__((__unused__)) __attribute__((no_instrument_function)) __attribute__((__always_inline__)) int PageActive(struct page *page)
> {
> return test_bit(PG_active, &({ do { if (__builtin_expect(!!(PagePoisoned(compound_head(page))), 0)) { dump_page(compound_head(page), "VM_BUG_ON_PAGE(" "PagePoisoned(compound_head(page))"")"); do { ({ asm volatile("%c0: nop\n\t" ".pushsection .discard.instr_begin\n\t" ".long %c0b - .\n\t" ".popsection\n\t" : : "i" (373)); }); do { asm volatile("1:\t" ".byte 0x0f, 0x0b" "\n" ".pushsection __bug_table,\"aw\"\n" "2:\t" ".long " "1b" " - 2b" "\t# bug_entry::bug_addr\n" "\t" ".long " "%c0" " - 2b" "\t# bug_entry::file\n" "\t.word %c1" "\t# bug_entry::line\n" "\t.word %c2" "\t# bug_entry::flags\n" "\t.org 2b+%c3\n" ".popsection" : : "i" ("./include/linux/page-flags.h"), "i" (338), "i" (0), "i" (sizeof(struct bug_entry))); } while (0); do { ({ asm volatile("%c0:\n\t" ".pushsection .discard.unreachable\n\t" ".long %c0b - .\n\t" ".popsection\n\t" : : "i" (374)); }); asm volatile(""); __builtin_unreachable(); } while (0); } while (0); } } while (0); compound_head(page); })->flags);
>
> }
>
> That's all to test a single bit!

I hear you. Let me spend a couple of days and focus on PG_{lru,active,
unevictable,swapbacked} first. They are mostly used with lru-related
operations and therefore can be switched to a compound_head()-free
policy easily. My estimate is we could save ~8KB by doing so :)

Weaning off compound_head() completely is a larger commitment that
neither Alex nor I am willing to make at the moment -- I did suggest this
to him last night when I asked him to help test the build with GCC, which is
their default compiler (we've switched to Clang).

Another good point he has raised is that they did see a slowdown on ARM64
after compound_head() was first introduced. My point is that there may be a
measurable performance benefit too if we could get rid of those
excessive calls to compound_head(). And I'd be happy to work with
somebody if they are interested in doing this.

Fair?

> Four calls to compound_head().
>
> Compiling this:
>
> int wibble(struct page *page)
> {
> return PageActive(page);
> }
>
>
> gives the assembly output below (allmodconfig build), and it appears that
> the compiler did not CSE these calls. Perhaps it would be beneficial
> to give it a bit of help.

Another interesting thing I've noticed is that the following change from
patch 10 also makes vmlinux a couple of hundred bytes larger with
my GCC 4.9.x.

-unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru, int zone_idx)
+static unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru,
+ int zone_idx)

> This is all nuts. How much of this inlining is really justifiable? Do
> we know we wouldn't get a better kernel if we did
>
> mv mm-inline.h mm-not-inline-any-more.c
>
> ?
>
> Methinks that mm-inline.c needs some serious work...

Agreed. I'll send another series of patches on top of the lru cleanup
series this week.

> .type wibble, @function
> wibble:
> 1: call __fentry__
> .section __mcount_loc, "a",@progbits
> .quad 1b
> .previous
> pushq %r12 #
> pushq %rbp #
> pushq %rbx #
> # mm/swap.c:1156: {
> movq %rdi, %rbx # page, page
> movq %rbx, %rbp # page, _14
> # ./include/linux/page-flags.h:184: unsigned long head = READ_ONCE(page->compound_head);
> call __sanitizer_cov_trace_pc #
> movq 8(%rbx), %r12 # page_2(D)->D.14210.D.14188.compound_head, _8
> # ./include/linux/page-flags.h:186: if (unlikely(head & 1))
> testb $1, %r12b #, _8
> je .L2945 #,
> # ./include/linux/page-flags.h:187: return (struct page *) (head - 1);
> call __sanitizer_cov_trace_pc #
> leaq -1(%r12), %rbp #, _14
> jmp .L2945 #
> .L2945:
> call __sanitizer_cov_trace_pc #
> # ./include/linux/page-flags.h:338: PAGEFLAG(Active, active, PF_HEAD) __CLEARPAGEFLAG(Active, active, PF_HEAD)
> cmpq $-1, 0(%rbp) #, MEM[(long unsigned int *)_15]
> jne .L2946 #,
> # ./include/linux/page-flags.h:184: unsigned long head = READ_ONCE(page->compound_head);
> call __sanitizer_cov_trace_pc #
> movq 8(%rbx), %rbp #, _16
> # ./include/linux/page-flags.h:186: if (unlikely(head & 1))
> testb $1, %bpl #, _16
> je .L2947 #,
> # ./include/linux/page-flags.h:187: return (struct page *) (head - 1);
> leaq -1(%rbp), %rbx #, page
> call __sanitizer_cov_trace_pc #
> .L2947:
> # ./include/linux/page-flags.h:338: PAGEFLAG(Active, active, PF_HEAD) __CLEARPAGEFLAG(Active, active, PF_HEAD)
> call __sanitizer_cov_trace_pc #
> movq $.LC20, %rsi #,
> movq %rbx, %rdi # page,
> call dump_page #
> #APP
> # 338 "./include/linux/page-flags.h" 1
> 373: nop #
> .pushsection .discard.instr_begin
> .long 373b - . #
> .popsection
>
> # 0 "" 2
> # 338 "./include/linux/page-flags.h" 1
> 1: .byte 0x0f, 0x0b
> .pushsection __bug_table,"aw"
> 2: .long 1b - 2b # bug_entry::bug_addr
> .long .LC3 - 2b # bug_entry::file #
> .word 338 # bug_entry::line #
> .word 0 # bug_entry::flags #
> .org 2b+12 #
> .popsection
> # 0 "" 2
> # 338 "./include/linux/page-flags.h" 1
> 374: #
> .pushsection .discard.unreachable
> .long 374b - . #
> .popsection
>
> # 0 "" 2
> #NO_APP
> .L2946:
> # ./include/linux/page-flags.h:184: unsigned long head = READ_ONCE(page->compound_head);
> call __sanitizer_cov_trace_pc #
> movq 8(%rbx), %rbp #, _28
> # ./include/linux/page-flags.h:186: if (unlikely(head & 1))
> testb $1, %bpl #, _28
> je .L2948 #,
> # ./include/linux/page-flags.h:187: return (struct page *) (head - 1);
> leaq -1(%rbp), %rbx #, page
> call __sanitizer_cov_trace_pc #
> .L2948:
> # ./arch/x86/include/asm/bitops.h:207: (addr[nr >> _BITOPS_LONG_SHIFT])) != 0;
> call __sanitizer_cov_trace_pc #
> movq (%rbx), %rax # MEM[(const long unsigned int *)_35], _24
> # mm/swap.c:1158: }
> popq %rbx #
> popq %rbp #
> # ./arch/x86/include/asm/bitops.h:207: (addr[nr >> _BITOPS_LONG_SHIFT])) != 0;
> shrq $5, %rax #, tmp107
> # ./include/linux/page-flags.h:338: PAGEFLAG(Active, active, PF_HEAD) __CLEARPAGEFLAG(Active, active, PF_HEAD)
> andl $1, %eax #, tmp106
> # mm/swap.c:1158: }
> popq %r12 #
> ret
>

2021-02-25 00:52:32

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] mm: test page->flags directly in page_lru()

On Wed, 24 Feb 2021 01:48:07 -0700 Yu Zhao <[email protected]> wrote:

> Currently page_lru() uses Page{Active,Unevictable} to determine which
> lru list a page belongs to. Page{Active,Unevictable} contain
> compound_head() and therefore page_lru() essentially tests
> PG_{active,unevictable} against compound_head(page)->flags. Once an
> lru list is determined, page->lru, rather than
> compound_head(page)->lru, will be added to or deleted from it.
>
> Though not a bug, having compound_head() in page_lru() increases the
> size of vmlinux by O(KB) because page_lru() is inlined in many places.
> And removing compound_head() entirely from Page{Active,Unevictable}
> may not be the best option (for the moment) either because there
> may be other cases that need compound_head(). This patch makes
> page_lru() and __clear_page_lru_flags(), which are used immediately
> before and after operations on page->lru, test
> PG_{active,unevictable} directly against page->flags instead.

Oh geeze.

> --- a/include/linux/mm_inline.h
> +++ b/include/linux/mm_inline.h
> @@ -46,14 +46,12 @@ static __always_inline void __clear_page_lru_flags(struct page *page)
> {
> VM_BUG_ON_PAGE(!PageLRU(page), page);
>
> - __ClearPageLRU(page);
> -
> /* this shouldn't happen, so leave the flags to bad_page() */
> - if (PageActive(page) && PageUnevictable(page))
> + if ((page->flags & (BIT(PG_active) | BIT(PG_unevictable))) ==
> + (BIT(PG_active) | BIT(PG_unevictable)))
> return;

This isn't very nice. At the very least we should have (documented!)
helper functions for this:

/* comment goes here */
static inline bool RawPageActive(struct page *page)
{
...
}
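
One possible minimal form of such a helper, as a sketch only (the name
follows the suggestion above; the exact semantics and comment would
still need to be nailed down):

/*
 * Test PG_active against this very page's flags, skipping the
 * compound_head() lookup. Only meaningful when the caller knows it
 * is not dealing with a tail page, e.g. a page it is about to put on
 * or take off an lru list.
 */
static inline bool RawPageActive(struct page *page)
{
	return test_bit(PG_active, &page->flags);
}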



However.

Here's what the preprocessor produces for an allmodconfig version of
PageActive():

static inline __attribute__((__gnu_inline__)) __attribute__((__unused__)) __attribute__((no_instrument_function)) __attribute__((__always_inline__)) int PageActive(struct page *page)
{
return test_bit(PG_active, &({ do { if (__builtin_expect(!!(PagePoisoned(compound_head(page))), 0)) { dump_page(compound_head(page), "VM_BUG_ON_PAGE(" "PagePoisoned(compound_head(page))"")"); do { ({ asm volatile("%c0: nop\n\t" ".pushsection .discard.instr_begin\n\t" ".long %c0b - .\n\t" ".popsection\n\t" : : "i" (373)); }); do { asm volatile("1:\t" ".byte 0x0f, 0x0b" "\n" ".pushsection __bug_table,\"aw\"\n" "2:\t" ".long " "1b" " - 2b" "\t# bug_entry::bug_addr\n" "\t" ".long " "%c0" " - 2b" "\t# bug_entry::file\n" "\t.word %c1" "\t# bug_entry::line\n" "\t.word %c2" "\t# bug_entry::flags\n" "\t.org 2b+%c3\n" ".popsection" : : "i" ("./include/linux/page-flags.h"), "i" (338), "i" (0), "i" (sizeof(struct bug_entry))); } while (0); do { ({ asm volatile("%c0:\n\t" ".pushsection .discard.unreachable\n\t" ".long %c0b - .\n\t" ".popsection\n\t" : : "i" (374)); }); asm volatile(""); __builtin_unreachable(); } while (0); } while (0); } } while (0); compound_head(page); })->flags);

}

That's all to test a single bit!

Four calls to compound_head().

Compiling this:

int wibble(struct page *page)
{
return PageActive(page);
}


gives the assembly output below (allmodconfig build), and it appears that
the compiler did not CSE these calls. Perhaps it would be beneficial
to give it a bit of help.


This is all nuts. How much of this inlining is really justifiable? Do
we know we wouldn't get a better kernel if we did

mv mm-inline.h mm-not-inline-any-more.c

?

Methinks that mm-inline.c needs some serious work...



.type wibble, @function
wibble:
1: call __fentry__
.section __mcount_loc, "a",@progbits
.quad 1b
.previous
pushq %r12 #
pushq %rbp #
pushq %rbx #
# mm/swap.c:1156: {
movq %rdi, %rbx # page, page
movq %rbx, %rbp # page, _14
# ./include/linux/page-flags.h:184: unsigned long head = READ_ONCE(page->compound_head);
call __sanitizer_cov_trace_pc #
movq 8(%rbx), %r12 # page_2(D)->D.14210.D.14188.compound_head, _8
# ./include/linux/page-flags.h:186: if (unlikely(head & 1))
testb $1, %r12b #, _8
je .L2945 #,
# ./include/linux/page-flags.h:187: return (struct page *) (head - 1);
call __sanitizer_cov_trace_pc #
leaq -1(%r12), %rbp #, _14
jmp .L2945 #
.L2945:
call __sanitizer_cov_trace_pc #
# ./include/linux/page-flags.h:338: PAGEFLAG(Active, active, PF_HEAD) __CLEARPAGEFLAG(Active, active, PF_HEAD)
cmpq $-1, 0(%rbp) #, MEM[(long unsigned int *)_15]
jne .L2946 #,
# ./include/linux/page-flags.h:184: unsigned long head = READ_ONCE(page->compound_head);
call __sanitizer_cov_trace_pc #
movq 8(%rbx), %rbp #, _16
# ./include/linux/page-flags.h:186: if (unlikely(head & 1))
testb $1, %bpl #, _16
je .L2947 #,
# ./include/linux/page-flags.h:187: return (struct page *) (head - 1);
leaq -1(%rbp), %rbx #, page
call __sanitizer_cov_trace_pc #
.L2947:
# ./include/linux/page-flags.h:338: PAGEFLAG(Active, active, PF_HEAD) __CLEARPAGEFLAG(Active, active, PF_HEAD)
call __sanitizer_cov_trace_pc #
movq $.LC20, %rsi #,
movq %rbx, %rdi # page,
call dump_page #
#APP
# 338 "./include/linux/page-flags.h" 1
373: nop #
.pushsection .discard.instr_begin
.long 373b - . #
.popsection

# 0 "" 2
# 338 "./include/linux/page-flags.h" 1
1: .byte 0x0f, 0x0b
.pushsection __bug_table,"aw"
2: .long 1b - 2b # bug_entry::bug_addr
.long .LC3 - 2b # bug_entry::file #
.word 338 # bug_entry::line #
.word 0 # bug_entry::flags #
.org 2b+12 #
.popsection
# 0 "" 2
# 338 "./include/linux/page-flags.h" 1
374: #
.pushsection .discard.unreachable
.long 374b - . #
.popsection

# 0 "" 2
#NO_APP
.L2946:
# ./include/linux/page-flags.h:184: unsigned long head = READ_ONCE(page->compound_head);
call __sanitizer_cov_trace_pc #
movq 8(%rbx), %rbp #, _28
# ./include/linux/page-flags.h:186: if (unlikely(head & 1))
testb $1, %bpl #, _28
je .L2948 #,
# ./include/linux/page-flags.h:187: return (struct page *) (head - 1);
leaq -1(%rbp), %rbx #, page
call __sanitizer_cov_trace_pc #
.L2948:
# ./arch/x86/include/asm/bitops.h:207: (addr[nr >> _BITOPS_LONG_SHIFT])) != 0;
call __sanitizer_cov_trace_pc #
movq (%rbx), %rax # MEM[(const long unsigned int *)_35], _24
# mm/swap.c:1158: }
popq %rbx #
popq %rbp #
# ./arch/x86/include/asm/bitops.h:207: (addr[nr >> _BITOPS_LONG_SHIFT])) != 0;
shrq $5, %rax #, tmp107
# ./include/linux/page-flags.h:338: PAGEFLAG(Active, active, PF_HEAD) __CLEARPAGEFLAG(Active, active, PF_HEAD)
andl $1, %eax #, tmp106
# mm/swap.c:1158: }
popq %r12 #
ret

2021-02-25 03:58:18

by Yu Zhao

[permalink] [raw]
Subject: Re: [PATCH] mm: test page->flags directly in page_lru()

On Wed, Feb 24, 2021 at 09:56:39PM +0000, Matthew Wilcox wrote:
> On Wed, Feb 24, 2021 at 05:15:58AM -0800, Andrew Morton wrote:
> > Here's what the preprocessor produces for an allmodconfig version of
> > PageActive():
> >
> > static inline __attribute__((__gnu_inline__)) __attribute__((__unused__)) __attribute__((no_instrument_function)) __attribute__((__always_inline__)) int PageActive(struct page *page)
> > {
> > return test_bit(PG_active, &({ do { if (__builtin_expect(!!(PagePoisoned(compound_head(page))), 0)) { dump_page(compound_head(page), "VM_BUG_ON_PAGE(" "PagePoisoned(compound_head(page))"")"); do { ({ asm volatile("%c0: nop\n\t" ".pushsection .discard.instr_begin\n\t" ".long %c0b - .\n\t" ".popsection\n\t" : : "i" (373)); }); do { asm volatile("1:\t" ".byte 0x0f, 0x0b" "\n" ".pushsection __bug_table,\"aw\"\n" "2:\t" ".long " "1b" " - 2b" "\t# bug_entry::bug_addr\n" "\t" ".long " "%c0" " - 2b" "\t# bug_entry::file\n" "\t.word %c1" "\t# bug_entry::line\n" "\t.word %c2" "\t# bug_entry::flags\n" "\t.org 2b+%c3\n" ".popsection" : : "i" ("./include/linux/page-flags.h"), "i" (338), "i" (0), "i" (sizeof(struct bug_entry))); } while (0); do { ({ asm volatile("%c0:\n\t" ".pushsection .discard.unreachable\n\t" ".long %c0b - .\n\t" ".popsection\n\t" : : "i" (374)); }); asm volatile(""); __builtin_unreachable(); } while (0); } while (0); } } while (0); compound_head(page); })->flags);
> >
> > }
> >
> > That's all to test a single bit!
> >
> > Four calls to compound_head().
>
> If only somebody were working on a patch series to get rid of
> all those calls to compound_head()! Some reviews on
> https://lore.kernel.org/linux-mm/[email protected]/
> would be nice.

I'm on board with the idea and have done some research in this
direction. We've found that the ideal *anon* page size for Chrome OS
is not 4KB or 2MB, but 32KB. I hope we could leverage the folio to
support flexible anon page size to reduce the number of page faults
(vs 4KB) or internal fragmentation (vs 2MB).

That being said, it seems to me this is a long term plan and right
now we need something smaller. So if you don't mind, I'll just go
ahead and remove compound_head() from Page{LRU,Active,Unevictable,
SwapBacked} first?

> So, I haven't done page_lru() yet in my folio tree. What I would do is:
>
> diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
> index 355ea1ee32bd..3895cfe6502b 100644
> --- a/include/linux/mm_inline.h
> +++ b/include/linux/mm_inline.h
> @@ -63,22 +63,27 @@ static __always_inline void __clear_page_lru_flags(struct page *page)
> * Returns the LRU list a page should be on, as an index
> * into the array of LRU lists.
> */
> -static __always_inline enum lru_list page_lru(struct page *page)
> +static __always_inline enum lru_list folio_lru(struct folio *folio)
> {
> enum lru_list lru;
>
> - VM_BUG_ON_PAGE(PageActive(page) && PageUnevictable(page), page);
> + VM_BUG_ON_PAGE(FolioActive(folio) && FolioUnevictable(folio), folio);
>
> - if (PageUnevictable(page))
> + if (FolioUnevictable(folio))
> return LRU_UNEVICTABLE;
>
> - lru = page_is_file_lru(page) ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
> - if (PageActive(page))
> + lru = page_is_file_lru(&folio->page) ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
> + if (FolioActive(folio))
> lru += LRU_ACTIVE;
>
> return lru;
> }
>
> +static __always_inline enum lru_list page_lru(struct page *page)
> +{
> + return folio_lru(page_folio(page));
> +}
> +
> static __always_inline void add_page_to_lru_list(struct page *page,
> struct lruvec *lruvec)
> {
>
> That would cause compound_head() to be called once instead of four times
> (assuming VM_BUG_ON is enabled). It can be reduced down to zero times
> when the callers are converted from being page-based to being folio-based.
>
> There is a further problem with PageFoo() being a READ_ONCE()
> of page->flags, so the compiler can't CSE it. I have ideas in that
> direction too; essentially ...
>
> unsigned long page_flags = PageFlags(page);
>
> if (PageFlagsUnevictable(flags))
> ...
> if (PageFlagsActive(flags))
> ...
>
> and we can generate the PageFlagsFoo macros with the same machinery in
> page-flags.h that generates PageFoo and FolioFoo. This strikes me as
> less critical than the folio work to remove all the unnecessary calls
> to compound_head().
>
> > movq %rbx, %rbp # page, _14
> > # ./include/linux/page-flags.h:184: unsigned long head = READ_ONCE(page->compound_head);
> > call __sanitizer_cov_trace_pc #
>
> It's a bit unfair to complain about code generation with a
> sanitizer-enabled build ...
>

2021-02-25 04:09:50

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH] mm: test page->flags directly in page_lru()

On Wed, Feb 24, 2021 at 03:34:16PM -0700, Yu Zhao wrote:
> > If only somebody were working on a patch series to get rid of
> > all those calls to compound_head()! Some reviews on
> > https://lore.kernel.org/linux-mm/[email protected]/
> > would be nice.
>
> I'm on board with the idea and have done some research in this
> direction. We've found that the ideal *anon* page size for Chrome OS
> is not 4KB or 2MB, but 32KB. I hope we could leverage the folio to
> support flexible anon page size to reduce the number of page faults
> (vs 4KB) or internal fragmentation (vs 2MB).
>
> That being said, it seems to me this is a long term plan and right
> now we need something smaller. So if you don't mind, I'll just go
> ahead and remove compound_head() from Page{LRU,Active,Unevictable,
> SwapBacked} first?

It's really not a big change I'm suggesting here. You need
https://lore.kernel.org/linux-mm/[email protected]/
https://lore.kernel.org/linux-mm/[email protected]/
https://lore.kernel.org/linux-mm/[email protected]/
and then the patch I sent above to create folio_lru().

Then any changes you want to make to use folios more broadly will
incrementally move us towards your goal of 32kB anon pages.

2021-02-25 06:21:39

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH] mm: test page->flags directly in page_lru()

On Wed, Feb 24, 2021 at 04:50:39PM -0700, Yu Zhao wrote:
> On Wed, Feb 24, 2021 at 10:48:46PM +0000, Matthew Wilcox wrote:
> > On Wed, Feb 24, 2021 at 03:34:16PM -0700, Yu Zhao wrote:
> > > > If only somebody were working on a patch series to get rid of
> > > > all those calls to compound_head()! Some reviews on
> > > > https://lore.kernel.org/linux-mm/[email protected]/
> > > > would be nice.
> > >
> > > I'm on board with the idea and have done some research in this
> > > direction. We've found that the ideal *anon* page size for Chrome OS
> > > is not 4KB or 2MB, but 32KB. I hope we could leverage the folio to
> > > support flexible anon page size to reduce the number of page faults
> > > (vs 4KB) or internal fragmentation (vs 2MB).
> > >
> > > That being said, it seems to me this is a long term plan and right
> > > now we need something smaller. So if you don't mind, I'll just go
> > > ahead and remove compound_head() from Page{LRU,Active,Unevictable,
> > > SwapBacked} first?
> >
> > It's really not a big change I'm suggesting here. You need
> > https://lore.kernel.org/linux-mm/[email protected]/
> > https://lore.kernel.org/linux-mm/[email protected]/
> > https://lore.kernel.org/linux-mm/[email protected]/
> > and then the patch I sent above to create folio_lru().
> >
> > Then any changes you want to make to use folios more broadly will
> > incrementally move us towards your goal of 32kB anon pages.
>
> Well, these patches introduce a new concept which I'm on board with.

It's not really a new concept ... it's a new type for an existing concept
(a head page).

> Assuming everybody else is too, it still seems to me it's overkill
> to employ the folio just to get rid of unnecessary compound_head()
> in page_lru() -- this is not a criticism but a compliment.

It's not overkill, that really is the point of a folio! If you
think about it, only head pages can be on the LRU list (because the
compound_head is in the union with the lru list_head). So it
always makes sense to talk about folios on the LRU list.
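
For reference, the overlap in question (struct page, abridged from
include/linux/mm_types.h): the tail page's compound_head word shares
storage with lru.next, so a page sitting on an lru list cannot be a
tail page.

struct page {
	unsigned long flags;
	union {
		struct {	/* Page cache and anonymous pages */
			struct list_head lru;	/* head (or order-0) pages only */
			/* ... */
		};
		struct {	/* Tail pages of compound page */
			unsigned long compound_head;	/* bit zero set; overlays lru.next */
			/* ... */
		};
		/* ... other users of this union ... */
	};
	/* ... */
};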

> Let me work out something *conceptually* smaller first, and if you
> think folio is absolutely more suitable even for this specific issue,
> I'll go review and test the four patches you listed. Sounds good?

Umm. It seems to me that no matter what you do, it'll be equivalent to
this, only without the type-safety?

2021-02-25 06:43:03

by Yu Zhao

[permalink] [raw]
Subject: Re: [PATCH] mm: test page->flags directly in page_lru()

On Thu, Feb 25, 2021 at 03:55:53AM +0000, Matthew Wilcox wrote:
> On Wed, Feb 24, 2021 at 04:50:39PM -0700, Yu Zhao wrote:
> > On Wed, Feb 24, 2021 at 10:48:46PM +0000, Matthew Wilcox wrote:
> > > On Wed, Feb 24, 2021 at 03:34:16PM -0700, Yu Zhao wrote:
> > > > > If only somebody were working on a patch series to get rid of
> > > > > all those calls to compound_head()! Some reviews on
> > > > > https://lore.kernel.org/linux-mm/[email protected]/
> > > > > would be nice.
> > > >
> > > > I'm on board with the idea and have done some research in this
> > > > direction. We've found that the ideal *anon* page size for Chrome OS
> > > > is not 4KB or 2MB, but 32KB. I hope we could leverage the folio to
> > > > support flexible anon page size to reduce the number of page faults
> > > > (vs 4KB) or internal fragmentation (vs 2MB).
> > > >
> > > > That being said, it seems to me this is a long term plan and right
> > > > now we need something smaller. So if you don't mind, I'll just go
> > > > ahead and remove compound_head() from Page{LRU,Active,Unevictable,
> > > > SwapBacked} first?
> > >
> > > It's really not a big change I'm suggesting here. You need
> > > https://lore.kernel.org/linux-mm/[email protected]/
> > > https://lore.kernel.org/linux-mm/[email protected]/
> > > https://lore.kernel.org/linux-mm/[email protected]/
> > > and then the patch I sent above to create folio_lru().
> > >
> > > Then any changes you want to make to use folios more broadly will
> > > incrementally move us towards your goal of 32kB anon pages.
> >
> > Well, these patches introduce a new concept which I'm on board with.
>
> It's not really a new concept ... it's a new type for an existing concept
> (a head page).
>
> > Assuming everybody else is too, it still seems to me it's overkill
> > to employ the folio just to get rid of unnecessary compound_head()
> > in page_lru() -- this is not a criticism but a compliment.
>
> It's not overkill, that really is the point of a folio! If you
> think about it, only head pages can be on the LRU list (because the
> compound_head is in the union with the lru list_head). So it
> always makes sense to talk about folios on the LRU list.
>
> > Let me work out something *conceptually* smaller first, and if you
> > think folio is absolutely more suitable even for this specific issue,
> > I'll go review and test the four patches you listed. Sounds good?
>
> Umm. It seems to me that no matter what you do, it'll be equivalent to
> this, only without the type-safety?

I'm thinking about something trivial but still very effective. So far
I've only tested it with PG_{active,unevictable}, and I'm already
seeing a 4KB gain less the 2KB loss from page_lru().

I didn't go with this at the beginning because it's also time-
consuming. I need to go over every single use of
PG_{active,unevictable,swapbacked,lru}.

add/remove: 0/0 grow/shrink: 1/37 up/down: 4/-4129 (-4125)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 3cec6fbef725..c866c363bb41 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1712,6 +1712,7 @@ static void gather_stats(struct page *page, struct numa_maps *md, int pte_dirty,
unsigned long nr_pages)
{
int count = page_mapcount(page);
+ struct page *head = compound_head(page);

md->pages += nr_pages;
if (pte_dirty || PageDirty(page))
@@ -1720,7 +1721,7 @@ static void gather_stats(struct page *page, struct numa_maps *md, int pte_dirty,
if (PageSwapCache(page))
md->swapcache += nr_pages;

- if (PageActive(page) || PageUnevictable(page))
+ if (PageActive(head) || PageUnevictable(head))
md->active += nr_pages;

if (PageWriteback(page))
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index db914477057b..35b3d272ab4c 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -335,8 +335,8 @@ PAGEFLAG(Dirty, dirty, PF_HEAD) TESTSCFLAG(Dirty, dirty, PF_HEAD)
__CLEARPAGEFLAG(Dirty, dirty, PF_HEAD)
PAGEFLAG(LRU, lru, PF_HEAD) __CLEARPAGEFLAG(LRU, lru, PF_HEAD)
TESTCLEARFLAG(LRU, lru, PF_HEAD)
-PAGEFLAG(Active, active, PF_HEAD) __CLEARPAGEFLAG(Active, active, PF_HEAD)
- TESTCLEARFLAG(Active, active, PF_HEAD)
+PAGEFLAG(Active, active, PF_ONLY_HEAD) __CLEARPAGEFLAG(Active, active, PF_ONLY_HEAD)
+ TESTCLEARFLAG(Active, active, PF_ONLY_HEAD)
PAGEFLAG(Workingset, workingset, PF_HEAD)
TESTCLEARFLAG(Workingset, workingset, PF_HEAD)
__PAGEFLAG(Slab, slab, PF_NO_TAIL)
@@ -407,9 +407,9 @@ CLEARPAGEFLAG(SwapCache, swapcache, PF_NO_TAIL)
PAGEFLAG_FALSE(SwapCache)
#endif

-PAGEFLAG(Unevictable, unevictable, PF_HEAD)
- __CLEARPAGEFLAG(Unevictable, unevictable, PF_HEAD)
- TESTCLEARFLAG(Unevictable, unevictable, PF_HEAD)
+PAGEFLAG(Unevictable, unevictable, PF_ONLY_HEAD)
+ __CLEARPAGEFLAG(Unevictable, unevictable, PF_ONLY_HEAD)
+ TESTCLEARFLAG(Unevictable, unevictable, PF_ONLY_HEAD)

#ifdef CONFIG_MMU
PAGEFLAG(Mlocked, mlocked, PF_NO_TAIL)

2021-02-25 07:58:57

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH] mm: test page->flags directly in page_lru()

On Wed, Feb 24, 2021 at 05:15:58AM -0800, Andrew Morton wrote:
> Here's what the preprocessor produces for an allmodconfig version of
> PageActive():
>
> static inline __attribute__((__gnu_inline__)) __attribute__((__unused__)) __attribute__((no_instrument_function)) __attribute__((__always_inline__)) int PageActive(struct page *page)
> {
> return test_bit(PG_active, &({ do { if (__builtin_expect(!!(PagePoisoned(compound_head(page))), 0)) { dump_page(compound_head(page), "VM_BUG_ON_PAGE(" "PagePoisoned(compound_head(page))"")"); do { ({ asm volatile("%c0: nop\n\t" ".pushsection .discard.instr_begin\n\t" ".long %c0b - .\n\t" ".popsection\n\t" : : "i" (373)); }); do { asm volatile("1:\t" ".byte 0x0f, 0x0b" "\n" ".pushsection __bug_table,\"aw\"\n" "2:\t" ".long " "1b" " - 2b" "\t# bug_entry::bug_addr\n" "\t" ".long " "%c0" " - 2b" "\t# bug_entry::file\n" "\t.word %c1" "\t# bug_entry::line\n" "\t.word %c2" "\t# bug_entry::flags\n" "\t.org 2b+%c3\n" ".popsection" : : "i" ("./include/linux/page-flags.h"), "i" (338), "i" (0), "i" (sizeof(struct bug_entry))); } while (0); do { ({ asm volatile("%c0:\n\t" ".pushsection .discard.unreachable\n\t" ".long %c0b - .\n\t" ".popsection\n\t" : : "i" (374)); }); asm volatile(""); __builtin_unreachable(); } while (0); } while (0); } } while (0); compound_head(page); })->flags);
>
> }
>
> That's all to test a single bit!
>
> Four calls to compound_head().

If only somebody were working on a patch series to get rid of
all those calls to compound_head()! Some reviews on
https://lore.kernel.org/linux-mm/[email protected]/
would be nice.

So, I haven't done page_lru() yet in my folio tree. What I would do is:

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 355ea1ee32bd..3895cfe6502b 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -63,22 +63,27 @@ static __always_inline void __clear_page_lru_flags(struct page *page)
* Returns the LRU list a page should be on, as an index
* into the array of LRU lists.
*/
-static __always_inline enum lru_list page_lru(struct page *page)
+static __always_inline enum lru_list folio_lru(struct folio *folio)
{
enum lru_list lru;

- VM_BUG_ON_PAGE(PageActive(page) && PageUnevictable(page), page);
+ VM_BUG_ON_PAGE(FolioActive(folio) && FolioUnevictable(folio), folio);

- if (PageUnevictable(page))
+ if (FolioUnevictable(folio))
return LRU_UNEVICTABLE;

- lru = page_is_file_lru(page) ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
- if (PageActive(page))
+ lru = page_is_file_lru(&folio->page) ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
+ if (FolioActive(folio))
lru += LRU_ACTIVE;

return lru;
}

+static __always_inline enum lru_list page_lru(struct page *page)
+{
+ return folio_lru(page_folio(page));
+}
+
static __always_inline void add_page_to_lru_list(struct page *page,
struct lruvec *lruvec)
{

That would cause compound_head() to be called once instead of four times
(assuming VM_BUG_ON is enabled). It can be reduced down to zero times
when the callers are converted from being page-based to being folio-based.

There is a further problem with PageFoo() being a READ_ONCE()
of page->flags, so the compiler can't CSE it. I have ideas in that
direction too; essentially ...

unsigned long page_flags = PageFlags(page);

if (PageFlagsUnevictable(flags))
...
if (PageFlagsActive(flags))
...

and we can generate the PageFlagsFoo macros with the same machinery in
page-flags.h that generates PageFoo and FolioFoo. This strikes me as
less critical than the folio work to remove all the unnecessary calls
to compound_head().
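
A rough sketch of what such generated helpers might look like; the
names here (TESTPAGEFLAGS, PageFlagsFoo, lru_from_flags) are purely
illustrative, and the anon/file distinction is omitted for brevity:

#define TESTPAGEFLAGS(uname, lname)					\
static __always_inline bool PageFlags##uname(unsigned long flags)	\
{									\
	return flags & BIT(PG_##lname);					\
}

TESTPAGEFLAGS(Active, active)
TESTPAGEFLAGS(Unevictable, unevictable)

static __always_inline enum lru_list lru_from_flags(unsigned long flags)
{
	/* the caller reads page->flags once and branches on the snapshot */
	if (PageFlagsUnevictable(flags))
		return LRU_UNEVICTABLE;
	return PageFlagsActive(flags) ? LRU_ACTIVE_ANON : LRU_INACTIVE_ANON;
}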

> movq %rbx, %rbp # page, _14
> # ./include/linux/page-flags.h:184: unsigned long head = READ_ONCE(page->compound_head);
> call __sanitizer_cov_trace_pc #

It's a bit unfair to complain about code generation with a
sanitizer-enabled build ...

2021-02-25 08:13:32

by Yu Zhao

[permalink] [raw]
Subject: Re: [PATCH] mm: test page->flags directly in page_lru()

On Wed, Feb 24, 2021 at 10:48:46PM +0000, Matthew Wilcox wrote:
> On Wed, Feb 24, 2021 at 03:34:16PM -0700, Yu Zhao wrote:
> > > If only somebody were working on a patch series to get rid of
> > > all those calls to compound_head()! Some reviews on
> > > https://lore.kernel.org/linux-mm/[email protected]/
> > > would be nice.
> >
> > I'm on board with the idea and have done some research in this
> > direction. We've found that the ideal *anon* page size for Chrome OS
> > is not 4KB or 2MB, but 32KB. I hope we could leverage the folio to
> > support flexible anon page size to reduce the number of page faults
> > (vs 4KB) or internal fragmentation (vs 2MB).
> >
> > That being said, it seems to me this is a long term plan and right
> > now we need something smaller. So if you don't mind, I'll just go
> > ahead and remove compound_head() from Page{LRU,Active,Unevictable,
> > SwapBacked} first?
>
> It's really not a big change I'm suggesting here. You need
> https://lore.kernel.org/linux-mm/[email protected]/
> https://lore.kernel.org/linux-mm/[email protected]/
> https://lore.kernel.org/linux-mm/[email protected]/
> and then the patch I sent above to create folio_lru().
>
> Then any changes you want to make to use folios more broadly will
> incrementally move us towards your goal of 32kB anon pages.

Well, these patches introduce a new concept which I'm on board with.
Assuming everybody else is too, it still seems to me it's overkill
to employ the folio just to get rid of unnecessary compound_head()
in page_lru() -- this is not a criticism but a compliment.

Let me work out something *conceptually* smaller first, and if you
think folio is absolutely more suitable even for this specific issue,
I'll go review and test the four patches you listed. Sounds good?

2021-02-25 12:14:43

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH] mm: test page->flags directly in page_lru()

On Wed, Feb 24, 2021 at 10:22:38PM -0700, Yu Zhao wrote:
> On Thu, Feb 25, 2021 at 03:55:53AM +0000, Matthew Wilcox wrote:
> > On Wed, Feb 24, 2021 at 04:50:39PM -0700, Yu Zhao wrote:
> > > Let me work out something *conceptually* smaller first, and if you
> > > think folio is absolutely more suitable even for this specific issue,
> > > I'll go review and test the four patches you listed. Sounds good?
> >
> > Umm. It seems to me that no matter what you do, it'll be equivalent to
> > this, only without the type-safety?
>
> I'm thinking about something trivial but still very effective. So far
> I've only tested it with PG_{active,unevictable}, and I'm already
> seeing a 4KB gain less the 2KB loss from page_lru().
>
> I didn't go with this at the beginning because it's also time-
> consuming. I need to go over every single use of
> PG_{active,unevictable,swapbacked,lru}.

Well, yes. If you went with the folio, it'd also be typesafe.
What you've done here makes it a runtime error, and it's only detected
if you enable CONFIG_DEBUG_VM_PGFLAGS, which people don't do, in general.

> +++ b/fs/proc/task_mmu.c
> @@ -1712,6 +1712,7 @@ static void gather_stats(struct page *page, struct numa_maps *md, int pte_dirty,
> unsigned long nr_pages)
> {
> int count = page_mapcount(page);
> + struct page *head = compound_head(page);
>
> md->pages += nr_pages;
> if (pte_dirty || PageDirty(page))

... if you went full-on folio in this function, you could also make this
FolioDirty, saving another call to compound_head.

> @@ -1720,7 +1721,7 @@ static void gather_stats(struct page *page, struct numa_maps *md, int pte_dirty,
> if (PageSwapCache(page))
... ditto ...
> md->swapcache += nr_pages;
>
> - if (PageActive(page) || PageUnevictable(page))
> + if (PageActive(head) || PageUnevictable(head))
> md->active += nr_pages;
>
> if (PageWriteback(page))
... ditto...

2021-02-26 09:22:40

by Yu Zhao

[permalink] [raw]
Subject: [PATCH v2 0/3] trim the uses of compound_head()

Patch series "mm: lru related cleanups" starting at commit 42895ea73bcd
("mm/vmscan.c: use add_page_to_lru_list()") bloated vmlinux by 1777
bytes, according to:
https://lore.kernel.org/linux-mm/[email protected]/

It turned out that many places inline Page{Active,Unevictable}, which in
turn include compound_head().

From the v1:
Removing compound_head() entirely from Page{Active,Unevictable} may
not be the best option (for the moment) because there may be other
cases that need compound_head().

In addition to picking a couple pieces of low-hanging fruit, this v2
removes compound_head() completely from Page{Active,Unevictable}.

bloat-o-meter result before and after the series:
add/remove: 0/0 grow/shrink: 6/92 up/down: 697/-7656 (-6959)

Yu Zhao (3):
mm: bypass compound_head() for PF_NO_TAIL when enforce=1
mm: use PF_NO_TAIL for PG_lru
mm: use PF_ONLY_HEAD for PG_active and PG_unevictable

fs/proc/task_mmu.c | 3 ++-
include/linux/page-flags.h | 16 ++++++++--------
2 files changed, 10 insertions(+), 9 deletions(-)

--
2.30.1.766.gb4fecdf3b7-goog

2021-02-26 09:23:32

by Yu Zhao

[permalink] [raw]
Subject: [PATCH v2 1/3] mm: bypass compound_head() for PF_NO_TAIL when enforce=1

When testing page flags with PF_NO_TAIL (enforce=0), tail pages are
legit and they are converted by compound_head(). When modifying page
flags (enforce=1), tail pages are not legit and they either trigger
VM_BUG_ON_PGFLAGS() or are "corrected" by compound_head().

There is no evidence such "correction" is beneficial in terms of
preventing a buggy kernel from crashing. But there is clear evidence
it's detrimental to the size of vmlinux because compound_head() is
redundantly inlined for all modifications of small and head page flags
using PF_NO_TAIL.

This patch makes PF_NO_TAIL skip compound_head() when modifying page
flags. There won't be any behavior change unless a piece of buggy code
tries to modify tail page flags using PF_NO_TAIL. Such code won't be
"corrected", if VM_BUG_ON_PGFLAGS() isn't there to catch it. Again,
there is no evidence such "correction" is beneficial.

bloat-o-meter result:
add/remove: 0/0 grow/shrink: 4/62 up/down: 309/-2851 (-2542)

Signed-off-by: Yu Zhao <[email protected]>
---
include/linux/page-flags.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index db914477057b..1995208a3763 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -248,7 +248,7 @@ static inline void page_init_poison(struct page *page, size_t size)
PF_POISONED_CHECK(page); })
#define PF_NO_TAIL(page, enforce) ({ \
VM_BUG_ON_PGFLAGS(enforce && PageTail(page), page); \
- PF_POISONED_CHECK(compound_head(page)); })
+ PF_POISONED_CHECK((enforce) ? (page) : compound_head(page)); })
#define PF_NO_COMPOUND(page, enforce) ({ \
VM_BUG_ON_PGFLAGS(enforce && PageCompound(page), page); \
PF_POISONED_CHECK(page); })
--
2.30.1.766.gb4fecdf3b7-goog
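
To make the effect concrete, a simplified sketch (helper names are
illustrative and the PF_POISONED_CHECK() is omitted) of what a
PF_NO_TAIL modifier such as SetPageMlocked() reduces to before and
after this change:

/* before: enforce=1 still funnels the page through compound_head() */
static __always_inline void set_flag_old(struct page *page, unsigned int nr)
{
	set_bit(nr, &compound_head(page)->flags);
}

/* after: enforce=1 uses the page as passed in; a buggy tail page is
 * only caught by VM_BUG_ON_PGFLAGS() with CONFIG_DEBUG_VM_PGFLAGS=y */
static __always_inline void set_flag_new(struct page *page, unsigned int nr)
{
	VM_BUG_ON_PGFLAGS(PageTail(page), page);
	set_bit(nr, &page->flags);
}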

2021-02-26 09:24:04

by Yu Zhao

[permalink] [raw]
Subject: [PATCH v2 3/3] mm: use PF_ONLY_HEAD for PG_active and PG_unevictable

All places but one test, set or clear PG_active and PG_unevictable on
small or head pages. Use compound_head() explicitly for that singleton
so the rest can get rid of redundant compound_head().

bloat-o-meter result:
add/remove: 0/0 grow/shrink: 3/38 up/down: 388/-4270 (-3882)

Signed-off-by: Yu Zhao <[email protected]>
---
fs/proc/task_mmu.c | 3 ++-
include/linux/page-flags.h | 10 +++++-----
2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 3cec6fbef725..c866c363bb41 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1712,6 +1712,7 @@ static void gather_stats(struct page *page, struct numa_maps *md, int pte_dirty,
unsigned long nr_pages)
{
int count = page_mapcount(page);
+ struct page *head = compound_head(page);

md->pages += nr_pages;
if (pte_dirty || PageDirty(page))
@@ -1720,7 +1721,7 @@ static void gather_stats(struct page *page, struct numa_maps *md, int pte_dirty,
if (PageSwapCache(page))
md->swapcache += nr_pages;

- if (PageActive(page) || PageUnevictable(page))
+ if (PageActive(head) || PageUnevictable(head))
md->active += nr_pages;

if (PageWriteback(page))
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index c9626e54df8d..b7fe855a6ee9 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -335,8 +335,8 @@ PAGEFLAG(Dirty, dirty, PF_HEAD) TESTSCFLAG(Dirty, dirty, PF_HEAD)
__CLEARPAGEFLAG(Dirty, dirty, PF_HEAD)
PAGEFLAG(LRU, lru, PF_NO_TAIL) __CLEARPAGEFLAG(LRU, lru, PF_NO_TAIL)
TESTCLEARFLAG(LRU, lru, PF_NO_TAIL)
-PAGEFLAG(Active, active, PF_HEAD) __CLEARPAGEFLAG(Active, active, PF_HEAD)
- TESTCLEARFLAG(Active, active, PF_HEAD)
+PAGEFLAG(Active, active, PF_ONLY_HEAD) __CLEARPAGEFLAG(Active, active, PF_ONLY_HEAD)
+ TESTCLEARFLAG(Active, active, PF_ONLY_HEAD)
PAGEFLAG(Workingset, workingset, PF_HEAD)
TESTCLEARFLAG(Workingset, workingset, PF_HEAD)
__PAGEFLAG(Slab, slab, PF_NO_TAIL)
@@ -407,9 +407,9 @@ CLEARPAGEFLAG(SwapCache, swapcache, PF_NO_TAIL)
PAGEFLAG_FALSE(SwapCache)
#endif

-PAGEFLAG(Unevictable, unevictable, PF_HEAD)
- __CLEARPAGEFLAG(Unevictable, unevictable, PF_HEAD)
- TESTCLEARFLAG(Unevictable, unevictable, PF_HEAD)
+PAGEFLAG(Unevictable, unevictable, PF_ONLY_HEAD)
+ __CLEARPAGEFLAG(Unevictable, unevictable, PF_ONLY_HEAD)
+ TESTCLEARFLAG(Unevictable, unevictable, PF_ONLY_HEAD)

#ifdef CONFIG_MMU
PAGEFLAG(Mlocked, mlocked, PF_NO_TAIL)
--
2.30.1.766.gb4fecdf3b7-goog

2021-02-26 09:24:04

by Yu Zhao

[permalink] [raw]
Subject: [PATCH v2 2/3] mm: use PF_NO_TAIL for PG_lru

Trying to set or clear PG_lru on tail pages has been considered buggy.
Enforce this rule by changing the policy for PG_lru from PF_HEAD to
PF_NO_TAIL. This means setting or clearing PG_lru on tail pages won't
be "corrected" by compound_head(). Such "correction" isn't helpful --
even if a piece of buggy code has gotten away with
compound_head(tail)->flags, it will run into trouble with lru list
addition and deletion because they use tail->lru rather than
compound_head(tail)->lru.
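
To make the last point concrete, a simplified sketch of the lru list
helper after the earlier cleanups (abridged from mm_inline.h):

static __always_inline void add_page_to_lru_list(struct page *page,
                                struct lruvec *lruvec)
{
        enum lru_list lru = page_lru(page);

        update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
        list_add(&page->lru, &lruvec->lists[lru]);
}

The list linkage always goes through the page actually passed in, so a
tail page sneaking in here runs into trouble no matter whether its flag
update was "corrected" to the head page.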

bloat-o-meter result:
add/remove: 0/0 grow/shrink: 0/11 up/down: 0/-535 (-535)

Signed-off-by: Yu Zhao <[email protected]>
---
include/linux/page-flags.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 1995208a3763..c9626e54df8d 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -333,8 +333,8 @@ PAGEFLAG(Referenced, referenced, PF_HEAD)
__SETPAGEFLAG(Referenced, referenced, PF_HEAD)
PAGEFLAG(Dirty, dirty, PF_HEAD) TESTSCFLAG(Dirty, dirty, PF_HEAD)
__CLEARPAGEFLAG(Dirty, dirty, PF_HEAD)
-PAGEFLAG(LRU, lru, PF_HEAD) __CLEARPAGEFLAG(LRU, lru, PF_HEAD)
- TESTCLEARFLAG(LRU, lru, PF_HEAD)
+PAGEFLAG(LRU, lru, PF_NO_TAIL) __CLEARPAGEFLAG(LRU, lru, PF_NO_TAIL)
+ TESTCLEARFLAG(LRU, lru, PF_NO_TAIL)
PAGEFLAG(Active, active, PF_HEAD) __CLEARPAGEFLAG(Active, active, PF_HEAD)
TESTCLEARFLAG(Active, active, PF_HEAD)
PAGEFLAG(Workingset, workingset, PF_HEAD)
--
2.30.1.766.gb4fecdf3b7-goog

2021-02-26 10:55:49

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] trim the uses of compound_head()

On 2/26/21 10:17 AM, Yu Zhao wrote:
> Patch series "mm: lru related cleanups" starting at commit 42895ea73bcd
> ("mm/vmscan.c: use add_page_to_lru_list()") bloated vmlinux by 1777
> bytes, according to:
> https://lore.kernel.org/linux-mm/[email protected]/

Huh, I thought Andrew didn't want to send it for 5.12:
https://lore.kernel.org/linux-mm/[email protected]/

> It turned out many places inline Page{Active,Unevictable} which in
> turn include compound_head().
>
> From the v1:
> Removing compound_head() entirely from Page{Active,Unevictable} may
> not be the best option (for the moment) because there may be other
> cases that need compound_head().
>
> In addition to picking a couple pieces of low-hanging fruit, this v2
> removes compound_head() completely from Page{Active,Unevictable}.
>
> bloat-o-meter result before and after the series:
> add/remove: 0/0 grow/shrink: 6/92 up/down: 697/-7656 (-6959)

Good that you found a way to more than undo the bloat then. But we need to be
careful that bugs are not introduced under pressure to avoid bloating 5.12.

IIRC Kirill introduced these macros so I'm CCing him.

> Yu Zhao (3):
> mm: bypass compound_head() for PF_NO_TAIL when enforce=1
> mm: use PF_NO_TAIL for PG_lru
> mm: use PF_ONLY_HEAD for PG_active and PG_unevictable
>
> fs/proc/task_mmu.c | 3 ++-
> include/linux/page-flags.h | 16 ++++++++--------
> 2 files changed, 10 insertions(+), 9 deletions(-)
>

2021-02-26 12:17:56

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] mm: use PF_ONLY_HEAD for PG_active and PG_unevictable

On Fri, Feb 26, 2021 at 02:17:18AM -0700, Yu Zhao wrote:
> All places but one test, set or clear PG_active and PG_unevictable on
> small or head pages. Use compound_head() explicitly for that singleton
> so the rest can rid of redundant compound_head().

How do you know it's only one place? I really wish you'd work with me
on folios. They make the compiler prove that it's not a tail page.

2021-02-26 19:07:08

by Yu Zhao

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] trim the uses of compound_head()

On Fri, Feb 26, 2021 at 11:52:03AM +0100, Vlastimil Babka wrote:
> On 2/26/21 10:17 AM, Yu Zhao wrote:
> > Patch series "mm: lru related cleanups" starting at commit 42895ea73bcd
> > ("mm/vmscan.c: use add_page_to_lru_list()") bloated vmlinux by 1777
> > bytes, according to:
> > https://lore.kernel.org/linux-mm/[email protected]/
>
> Huh, I thought Andrew didn't want to send it for 5.12:
> https://lore.kernel.org/linux-mm/[email protected]/
>
> > It turned out many places inline Page{Active,Unevictable} which in
> > turn include compound_head().
> >
> > From the v1:
> > Removing compound_head() entirely from Page{Active,Unevictable} may
> > not be the best option (for the moment) because there may be other
> > cases that need compound_head().
> >
> > In addition to picking a couple pieces of low-hanging fruit, this v2
> > removes compound_head() completely from Page{Active,Unevictable}.
> >
> > bloat-o-meter result before and after the series:
> > add/remove: 0/0 grow/shrink: 6/92 up/down: 697/-7656 (-6959)
>
> Good that you found a way to more than undo the bloat then. But we need to be
> careful so bugs are not introduced due to pressure to not have bloated 5.12.

I was very conservative and only picked a few pieces of low-hanging
fruit. The pressure is good -- if you hadn't noticed it and Andrew
hadn't been emphatic about it, it'd have been left for another time
and to another person, and so on.

2021-02-26 19:55:17

by Yu Zhao

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] mm: use PF_ONLY_HEAD for PG_active and PG_unevictable

On Fri, Feb 26, 2021 at 12:13:14PM +0000, Matthew Wilcox wrote:
> On Fri, Feb 26, 2021 at 02:17:18AM -0700, Yu Zhao wrote:
> > All places but one test, set or clear PG_active and PG_unevictable on
> > small or head pages. Use compound_head() explicitly for that singleton
> > so the rest can rid of redundant compound_head().
>
> How do you know it's only one place? I really wish you'd work with me
> on folios. They make the compiler prove that it's not a tail page.

I haven't been following the effort closely, so I'm rereading the very
first discussion "Are THPs the right model for the pagecache?" and
then I need to rewatch the recorded Zoom meeting. As I said I'm on
board with the idea, but I can't create a roadmap based on my current
rough understanding, unless you prefer me to just randomly throw some
reviewed-bys at your patches in the next few days. (Our long-term plan
for folios is to support arbitrary anon page sizes because anon memory
is more than 90% of total user memory on Android, Chrome OS and in our
data centers.)

That said, if you have something ready to test, we could do it for you
in our runtime environments immediately.

2021-02-26 20:32:25

by Yu Zhao

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] mm: use PF_NO_TAIL for PG_lru

On Fri, Feb 26, 2021 at 02:17:17AM -0700, Yu Zhao wrote:
> Trying to set or clear PG_lru on tail pages has been considered buggy.
> Enforce this rule by changing the policy for PG_lru from PF_HEAD to
> PF_NO_TAIL. This means setting or clearing PG_lru on tail pages won't
> be "corrected" by compound_page(). Such "correction" isn't helpful --
> even if a piece of buggy code has gotten away with
> compound_page(tail)->flags, it will run into trouble with lru list
> addition and deletion because they use tail->lru rather than
> compound_page(tail)->lru.
>
> bloat-o-meter result:
> add/remove: 0/0 grow/shrink: 0/11 up/down: 0/-535 (-535)
>
> Signed-off-by: Yu Zhao <[email protected]>
> ---
> include/linux/page-flags.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 1995208a3763..c9626e54df8d 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -333,8 +333,8 @@ PAGEFLAG(Referenced, referenced, PF_HEAD)
> __SETPAGEFLAG(Referenced, referenced, PF_HEAD)
> PAGEFLAG(Dirty, dirty, PF_HEAD) TESTSCFLAG(Dirty, dirty, PF_HEAD)
> __CLEARPAGEFLAG(Dirty, dirty, PF_HEAD)
> -PAGEFLAG(LRU, lru, PF_HEAD) __CLEARPAGEFLAG(LRU, lru, PF_HEAD)
> - TESTCLEARFLAG(LRU, lru, PF_HEAD)

As a side note, IMO, testing PG_lru on compound_head(tail)->flags is
a bug because it defeats the purpose of the following pattern when,
e.g., racing with compound page creations.

This pattern is intended to avoid dirtying the struct page cache line
when scanning PFNs speculatively in isolate_migratepages_block() and
page_idle_get_page(). Without compound_head(), it works well when the
scan lands on a tail page: the PageLRU() test fails and the page is
skipped cheaply. But with compound_head(), get_page_unless_zero() will
run unnecessarily on tail pages.

/*
 * Speculative check: skip the page unless it looks like an LRU page
 * and a reference can be taken on it.
 */
if (!PageLRU(page) || !get_page_unless_zero(page))
        continue;

/*
 * Re-check with the reference held; the page may have been removed
 * from the LRU in the meantime.
 */
if (!PageLRU(page)) {
        put_page(page);
        continue;
}

do_something();

2021-02-26 20:33:55

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] mm: use PF_ONLY_HEAD for PG_active and PG_unevictable

On Fri, Feb 26, 2021 at 12:49:41PM -0700, Yu Zhao wrote:
> On Fri, Feb 26, 2021 at 12:13:14PM +0000, Matthew Wilcox wrote:
> > On Fri, Feb 26, 2021 at 02:17:18AM -0700, Yu Zhao wrote:
> > > All places but one test, set or clear PG_active and PG_unevictable on
> > > small or head pages. Use compound_head() explicitly for that singleton
> > > so the rest can rid of redundant compound_head().
> >
> > How do you know it's only one place? I really wish you'd work with me
> > on folios. They make the compiler prove that it's not a tail page.
>
> I hasn't been following the effort closely, so I'm rereading the very
> first discussion "Are THPs the right model for the pagecache?" and
> then I need to rewatch the recorded Zoom meeting. As I said I'm on
> board with the idea, but I can't create a roadmap based on my current
> rough understanding, unless you prefer me to just randomly throw some
> reviewed-bys at your patches in the next few days. (Our long-term plan
> for folios is to support arbitrary anon page sizes because anon memory
> is more than 90% of total user memory on Android, Chrome OS and in our
> data centers.)
>
> That said, if you have something ready to test, we could do it for you
> in our runtime environments immediately.

I don't have anything ready to test for anonymous memory; indeed I have no
plans to work on anonymous memory myself. My focus is on the page cache.

But, once we get the folio _concept_ into the kernel (ie a struct page
which is definitely not a tail page), it can be used more widely than
the page cache, and independently from anything I'm working on. The
biggest risk is that we end up duplicating work ...

2021-03-01 11:54:21

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] mm: use PF_ONLY_HEAD for PG_active and PG_unevictable

On Fri, Feb 26, 2021 at 12:13:14PM +0000, Matthew Wilcox wrote:
> On Fri, Feb 26, 2021 at 02:17:18AM -0700, Yu Zhao wrote:
> > All places but one test, set or clear PG_active and PG_unevictable on
> > small or head pages. Use compound_head() explicitly for that singleton
> > so the rest can rid of redundant compound_head().
>
> How do you know it's only one place? I really wish you'd work with me
> on folios. They make the compiler prove that it's not a tail page.

+1 to this.

The problem with compound_head() is systemic, and an ad-hoc solution for a
few page flags will only complicate the picture.

--
Kirill A. Shutemov

2021-03-02 16:22:13

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] mm: use PF_ONLY_HEAD for PG_active and PG_unevictable

On Mon, Mar 01, 2021 at 12:16:19PM -0800, Hugh Dickins wrote:
> On Mon, 1 Mar 2021, Yu Zhao wrote:
> > On Mon, Mar 01, 2021 at 02:50:07PM +0300, Kirill A. Shutemov wrote:
> > > On Fri, Feb 26, 2021 at 12:13:14PM +0000, Matthew Wilcox wrote:
> > > > On Fri, Feb 26, 2021 at 02:17:18AM -0700, Yu Zhao wrote:
> > > > > All places but one test, set or clear PG_active and PG_unevictable on
> > > > > small or head pages. Use compound_head() explicitly for that singleton
> > > > > so the rest can rid of redundant compound_head().
> > > >
> > > > How do you know it's only one place? I really wish you'd work with me
> > > > on folios. They make the compiler prove that it's not a tail page.
> > >
> > > +1 to this.
> > >
> > > The problem with compound_head() is systemic and ad-hoc solution to few
> > > page flags will only complicate the picture.
> >
> > Well, I call it an incremental improvement, and how exactly does it
> > complicate the picture?
> >
> > I see your point: you prefer a complete replacement. But my point is
> > not about the preference; it's about presenting an option: I'm not
> > saying we have to go with this series; I'm saying if you don't want
> > to wait, here is something quick but not perfect.
>
> +1 to this.

page folios are here and ready to go. I'm doing another pass on them,
quantifying the improvements to text with each patch. So far I'm
at 4357 bytes of text saved, in the first 10 patches (many of which
look as if they're not going to produce any savings).

Yu Zhao's patches seem risky. The only way to know if any places have
been missed is by enabling CONFIG_DEBUG_VM_PGFLAGS, which we do not
recommend for production environments.

2021-03-04 05:06:01

by Yu Zhao

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] mm: use PF_ONLY_HEAD for PG_active and PG_unevictable

On Mon, Mar 01, 2021 at 02:50:07PM +0300, Kirill A. Shutemov wrote:
> On Fri, Feb 26, 2021 at 12:13:14PM +0000, Matthew Wilcox wrote:
> > On Fri, Feb 26, 2021 at 02:17:18AM -0700, Yu Zhao wrote:
> > > All places but one test, set or clear PG_active and PG_unevictable on
> > > small or head pages. Use compound_head() explicitly for that singleton
> > > so the rest can rid of redundant compound_head().
> >
> > How do you know it's only one place? I really wish you'd work with me
> > on folios. They make the compiler prove that it's not a tail page.
>
> +1 to this.
>
> The problem with compound_head() is systemic and ad-hoc solution to few
> page flags will only complicate the picture.

Well, I call it an incremental improvement, and how exactly does it
complicate the picture?

I see your point: you prefer a complete replacement. But my point is
not about the preference; it's about presenting an option: I'm not
saying we have to go with this series; I'm saying if you don't want
to wait, here is something quick but not perfect.

2021-03-04 05:14:27

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] mm: use PF_ONLY_HEAD for PG_active and PG_unevictable

On Mon, 1 Mar 2021, Yu Zhao wrote:
> On Mon, Mar 01, 2021 at 02:50:07PM +0300, Kirill A. Shutemov wrote:
> > On Fri, Feb 26, 2021 at 12:13:14PM +0000, Matthew Wilcox wrote:
> > > On Fri, Feb 26, 2021 at 02:17:18AM -0700, Yu Zhao wrote:
> > > > All places but one test, set or clear PG_active and PG_unevictable on
> > > > small or head pages. Use compound_head() explicitly for that singleton
> > > > so the rest can rid of redundant compound_head().
> > >
> > > How do you know it's only one place? I really wish you'd work with me
> > > on folios. They make the compiler prove that it's not a tail page.
> >
> > +1 to this.
> >
> > The problem with compound_head() is systemic and ad-hoc solution to few
> > page flags will only complicate the picture.
>
> Well, I call it an incremental improvement, and how exactly does it
> complicate the picture?
>
> I see your point: you prefer a complete replacement. But my point is
> not about the preference; it's about presenting an option: I'm not
> saying we have to go with this series; I'm saying if you don't want
> to wait, here is something quick but not perfect.

+1 to this.

Hugh