2020-12-07 22:14:09

by Yu Zhao

Subject: [PATCH 00/11] mm: lru related cleanups

The cleanups are intended to reduce the verbosity in lru list
operations and make them less error-prone. A typical example
would be how the patches change __activate_page():

 static void __activate_page(struct page *page, struct lruvec *lruvec)
 {
 	if (!PageActive(page) && !PageUnevictable(page)) {
-		int lru = page_lru_base_type(page);
 		int nr_pages = thp_nr_pages(page);

-		del_page_from_lru_list(page, lruvec, lru);
+		del_page_from_lru_list(page, lruvec);
 		SetPageActive(page);
-		lru += LRU_ACTIVE;
-		add_page_to_lru_list(page, lruvec, lru);
+		add_page_to_lru_list(page, lruvec);
 		trace_mm_lru_activate(page);

There are a few more places like __activate_page() and they are
unnecessarily repetitive in terms of figuring out which list a page
should be added onto or deleted from. And with the duplicated code
removed, they are easier to read, IMO.

Patches 1 to 5 basically cover the above. Patches 6 and 7 make the
code more robust by improving bug reporting. Patches 8, 9 and 10 take
care of some dangling helpers left in header files. Patch 11 isn't
strictly a cleanup, but it still seems relevant to include here.

Yu Zhao (11):
mm: use add_page_to_lru_list()
mm: shuffle lru list addition and deletion functions
mm: don't pass "enum lru_list" to lru list addition functions
mm: don't pass "enum lru_list" to trace_mm_lru_insertion()
mm: don't pass "enum lru_list" to del_page_from_lru_list()
mm: add __clear_page_lru_flags() to replace page_off_lru()
mm: VM_BUG_ON lru page flags
mm: fold page_lru_base_type() into its sole caller
mm: fold __update_lru_size() into its sole caller
mm: make lruvec_lru_size() static
mm: enlarge the "int nr_pages" parameter of update_lru_size()

include/linux/memcontrol.h | 10 +--
include/linux/mm_inline.h | 115 ++++++++++++++-------------------
include/linux/mmzone.h | 2 -
include/linux/vmstat.h | 6 +-
include/trace/events/pagemap.h | 11 ++--
mm/compaction.c | 2 +-
mm/memcontrol.c | 10 +--
mm/mlock.c | 3 +-
mm/swap.c | 50 ++++++--------
mm/vmscan.c | 21 ++----
10 files changed, 91 insertions(+), 139 deletions(-)

--
2.29.2.576.ga3fc446d84-goog


2020-12-07 22:14:10

by Yu Zhao

Subject: [PATCH 03/11] mm: don't pass "enum lru_list" to lru list addition functions

The "enum lru_list" parameter to add_page_to_lru_list() and
add_page_to_lru_list_tail() is redundant in the sense that it can
be extracted from the "struct page" parameter by page_lru().

A caveat is that we need to make sure PageActive() or
PageUnevictable() is correctly set or cleared before calling
these two functions, and the existing callers indeed do so.
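
To illustrate the caveat, a hypothetical mis-ordered caller (not taken
from this series) would file the page on the wrong list:

	/*
	 * page_lru() now runs inside add_page_to_lru_list(), so the page
	 * flags must already describe the destination list:
	 */
	add_page_to_lru_list(page, lruvec);	/* sees !PageActive(): inactive list */
	SetPageActive(page);			/* too late: flags and list now disagree */

The diffs below keep the flag updates ahead of the list additions, so
no such mismatch is introduced.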

Signed-off-by: Yu Zhao <[email protected]>
---
include/linux/mm_inline.h | 8 ++++++--
mm/swap.c | 15 +++++++--------
mm/vmscan.c | 6 ++----
3 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 2889741f450a..130ba3201d3f 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -106,15 +106,19 @@ static __always_inline enum lru_list page_lru(struct page *page)
}

static __always_inline void add_page_to_lru_list(struct page *page,
- struct lruvec *lruvec, enum lru_list lru)
+ struct lruvec *lruvec)
{
+ enum lru_list lru = page_lru(page);
+
update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
list_add(&page->lru, &lruvec->lists[lru]);
}

static __always_inline void add_page_to_lru_list_tail(struct page *page,
- struct lruvec *lruvec, enum lru_list lru)
+ struct lruvec *lruvec)
{
+ enum lru_list lru = page_lru(page);
+
update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
list_add_tail(&page->lru, &lruvec->lists[lru]);
}
diff --git a/mm/swap.c b/mm/swap.c
index 5022dfe388ad..136acabbfab5 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -231,7 +231,7 @@ static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
if (!PageUnevictable(page)) {
del_page_from_lru_list(page, lruvec, page_lru(page));
ClearPageActive(page);
- add_page_to_lru_list_tail(page, lruvec, page_lru(page));
+ add_page_to_lru_list_tail(page, lruvec);
__count_vm_events(PGROTATED, thp_nr_pages(page));
}
}
@@ -313,8 +313,7 @@ static void __activate_page(struct page *page, struct lruvec *lruvec)

del_page_from_lru_list(page, lruvec, lru);
SetPageActive(page);
- lru += LRU_ACTIVE;
- add_page_to_lru_list(page, lruvec, lru);
+ add_page_to_lru_list(page, lruvec);
trace_mm_lru_activate(page);

__count_vm_events(PGACTIVATE, nr_pages);
@@ -543,14 +542,14 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec)
* It can make readahead confusing. But race window
* is _really_ small and it's non-critical problem.
*/
- add_page_to_lru_list(page, lruvec, lru);
+ add_page_to_lru_list(page, lruvec);
SetPageReclaim(page);
} else {
/*
* The page's writeback ends up during pagevec
* We moves tha page into tail of inactive.
*/
- add_page_to_lru_list_tail(page, lruvec, lru);
+ add_page_to_lru_list_tail(page, lruvec);
__count_vm_events(PGROTATED, nr_pages);
}

@@ -570,7 +569,7 @@ static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec)
del_page_from_lru_list(page, lruvec, lru + LRU_ACTIVE);
ClearPageActive(page);
ClearPageReferenced(page);
- add_page_to_lru_list(page, lruvec, lru);
+ add_page_to_lru_list(page, lruvec);

__count_vm_events(PGDEACTIVATE, nr_pages);
__count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE,
@@ -595,7 +594,7 @@ static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec)
* anonymous pages
*/
ClearPageSwapBacked(page);
- add_page_to_lru_list(page, lruvec, LRU_INACTIVE_FILE);
+ add_page_to_lru_list(page, lruvec);

__count_vm_events(PGLAZYFREE, nr_pages);
__count_memcg_events(lruvec_memcg(lruvec), PGLAZYFREE,
@@ -1005,7 +1004,7 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec)
__count_vm_events(UNEVICTABLE_PGCULLED, nr_pages);
}

- add_page_to_lru_list(page, lruvec, lru);
+ add_page_to_lru_list(page, lruvec);
trace_mm_lru_insertion(page, lru);
}

diff --git a/mm/vmscan.c b/mm/vmscan.c
index a174594e40f8..8fc8f2c9d7ec 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1865,7 +1865,7 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
* inhibits memcg migration).
*/
VM_BUG_ON_PAGE(!lruvec_holds_page_lru_lock(page, lruvec), page);
- add_page_to_lru_list(page, lruvec, page_lru(page));
+ add_page_to_lru_list(page, lruvec);
nr_pages = thp_nr_pages(page);
nr_moved += nr_pages;
if (PageActive(page))
@@ -4280,12 +4280,10 @@ void check_move_unevictable_pages(struct pagevec *pvec)

lruvec = relock_page_lruvec_irq(page, lruvec);
if (page_evictable(page) && PageUnevictable(page)) {
- enum lru_list lru = page_lru_base_type(page);
-
VM_BUG_ON_PAGE(PageActive(page), page);
ClearPageUnevictable(page);
del_page_from_lru_list(page, lruvec, LRU_UNEVICTABLE);
- add_page_to_lru_list(page, lruvec, lru);
+ add_page_to_lru_list(page, lruvec);
pgrescued += nr_pages;
}
SetPageLRU(page);
--
2.29.2.576.ga3fc446d84-goog

2020-12-07 22:14:27

by Yu Zhao

Subject: [PATCH 09/11] mm: fold __update_lru_size() into its sole caller

All other references to the function were removed after
commit a892cb6b977f ("mm/vmscan.c: use update_lru_size() in update_lru_sizes()")

Signed-off-by: Yu Zhao <[email protected]>
---
include/linux/mm_inline.h | 9 +--------
1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 7183c7a03f09..355ea1ee32bd 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -24,7 +24,7 @@ static inline int page_is_file_lru(struct page *page)
return !PageSwapBacked(page);
}

-static __always_inline void __update_lru_size(struct lruvec *lruvec,
+static __always_inline void update_lru_size(struct lruvec *lruvec,
enum lru_list lru, enum zone_type zid,
int nr_pages)
{
@@ -33,13 +33,6 @@ static __always_inline void __update_lru_size(struct lruvec *lruvec,
__mod_lruvec_state(lruvec, NR_LRU_BASE + lru, nr_pages);
__mod_zone_page_state(&pgdat->node_zones[zid],
NR_ZONE_LRU_BASE + lru, nr_pages);
-}
-
-static __always_inline void update_lru_size(struct lruvec *lruvec,
- enum lru_list lru, enum zone_type zid,
- int nr_pages)
-{
- __update_lru_size(lruvec, lru, zid, nr_pages);
#ifdef CONFIG_MEMCG
mem_cgroup_update_lru_size(lruvec, lru, zid, nr_pages);
#endif
--
2.29.2.576.ga3fc446d84-goog

2020-12-07 22:14:50

by Yu Zhao

Subject: [PATCH 08/11] mm: fold page_lru_base_type() into its sole caller

We've removed all other references to this function.

Signed-off-by: Yu Zhao <[email protected]>
---
include/linux/mm_inline.h | 27 ++++++---------------------
1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 6d907a4dd6ad..7183c7a03f09 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -45,21 +45,6 @@ static __always_inline void update_lru_size(struct lruvec *lruvec,
#endif
}

-/**
- * page_lru_base_type - which LRU list type should a page be on?
- * @page: the page to test
- *
- * Used for LRU list index arithmetic.
- *
- * Returns the base LRU type - file or anon - @page should be on.
- */
-static inline enum lru_list page_lru_base_type(struct page *page)
-{
- if (page_is_file_lru(page))
- return LRU_INACTIVE_FILE;
- return LRU_INACTIVE_ANON;
-}
-
/**
* __clear_page_lru_flags - clear page lru flags before releasing a page
* @page: the page that was on lru and now has a zero reference
@@ -92,12 +77,12 @@ static __always_inline enum lru_list page_lru(struct page *page)
VM_BUG_ON_PAGE(PageActive(page) && PageUnevictable(page), page);

if (PageUnevictable(page))
- lru = LRU_UNEVICTABLE;
- else {
- lru = page_lru_base_type(page);
- if (PageActive(page))
- lru += LRU_ACTIVE;
- }
+ return LRU_UNEVICTABLE;
+
+ lru = page_is_file_lru(page) ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
+ if (PageActive(page))
+ lru += LRU_ACTIVE;
+
return lru;
}

--
2.29.2.576.ga3fc446d84-goog

2020-12-07 22:14:53

by Yu Zhao

Subject: [PATCH 05/11] mm: don't pass "enum lru_list" to del_page_from_lru_list()

The parameter is redundant in the sense that it can be extracted
from the "struct page" parameter by page_lru(). We need to make sure
that an existing PageActive() or PageUnevictable() flag remains set
until the function returns. A few places don't conform, and simple
reordering fixes them.

This patch may leave page_off_lru() looking odd, and we'll take
care of it in the next patch.
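
For example, the check_move_unevictable_pages() hunk below reorders
the delete ahead of the flag clearing; roughly (sketch only):

	del_page_from_lru_list(page, lruvec);	/* page_lru() still sees PageUnevictable() */
	ClearPageUnevictable(page);		/* clear the flag only afterwards */
	add_page_to_lru_list(page, lruvec);	/* now lands on an evictable list */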

Signed-off-by: Yu Zhao <[email protected]>
---
include/linux/mm_inline.h | 5 +++--
mm/compaction.c | 2 +-
mm/mlock.c | 3 +--
mm/swap.c | 26 ++++++++++----------------
mm/vmscan.c | 4 ++--
5 files changed, 17 insertions(+), 23 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 130ba3201d3f..ffacc6273678 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -124,9 +124,10 @@ static __always_inline void add_page_to_lru_list_tail(struct page *page,
}

static __always_inline void del_page_from_lru_list(struct page *page,
- struct lruvec *lruvec, enum lru_list lru)
+ struct lruvec *lruvec)
{
list_del(&page->lru);
- update_lru_size(lruvec, lru, page_zonenum(page), -thp_nr_pages(page));
+ update_lru_size(lruvec, page_lru(page), page_zonenum(page),
+ -thp_nr_pages(page));
}
#endif
diff --git a/mm/compaction.c b/mm/compaction.c
index 8049d3530812..fd2058330497 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1034,7 +1034,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
low_pfn += compound_nr(page) - 1;

/* Successfully isolated */
- del_page_from_lru_list(page, lruvec, page_lru(page));
+ del_page_from_lru_list(page, lruvec);
mod_node_page_state(page_pgdat(page),
NR_ISOLATED_ANON + page_is_file_lru(page),
thp_nr_pages(page));
diff --git a/mm/mlock.c b/mm/mlock.c
index 55b3b3672977..73960bb3464d 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -278,8 +278,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
*/
if (TestClearPageLRU(page)) {
lruvec = relock_page_lruvec_irq(page, lruvec);
- del_page_from_lru_list(page, lruvec,
- page_lru(page));
+ del_page_from_lru_list(page, lruvec);
continue;
} else
__munlock_isolation_failed(page);
diff --git a/mm/swap.c b/mm/swap.c
index e053b4db108a..d55a0c27d804 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -85,7 +85,8 @@ static void __page_cache_release(struct page *page)
lruvec = lock_page_lruvec_irqsave(page, &flags);
VM_BUG_ON_PAGE(!PageLRU(page), page);
__ClearPageLRU(page);
- del_page_from_lru_list(page, lruvec, page_off_lru(page));
+ del_page_from_lru_list(page, lruvec);
+ page_off_lru(page);
unlock_page_lruvec_irqrestore(lruvec, flags);
}
__ClearPageWaiters(page);
@@ -229,7 +230,7 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
{
if (!PageUnevictable(page)) {
- del_page_from_lru_list(page, lruvec, page_lru(page));
+ del_page_from_lru_list(page, lruvec);
ClearPageActive(page);
add_page_to_lru_list_tail(page, lruvec);
__count_vm_events(PGROTATED, thp_nr_pages(page));
@@ -308,10 +309,9 @@ void lru_note_cost_page(struct page *page)
static void __activate_page(struct page *page, struct lruvec *lruvec)
{
if (!PageActive(page) && !PageUnevictable(page)) {
- int lru = page_lru_base_type(page);
int nr_pages = thp_nr_pages(page);

- del_page_from_lru_list(page, lruvec, lru);
+ del_page_from_lru_list(page, lruvec);
SetPageActive(page);
add_page_to_lru_list(page, lruvec);
trace_mm_lru_activate(page);
@@ -518,8 +518,7 @@ void lru_cache_add_inactive_or_unevictable(struct page *page,
*/
static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec)
{
- int lru;
- bool active;
+ bool active = PageActive(page);
int nr_pages = thp_nr_pages(page);

if (PageUnevictable(page))
@@ -529,10 +528,7 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec)
if (page_mapped(page))
return;

- active = PageActive(page);
- lru = page_lru_base_type(page);
-
- del_page_from_lru_list(page, lruvec, lru + active);
+ del_page_from_lru_list(page, lruvec);
ClearPageActive(page);
ClearPageReferenced(page);

@@ -563,10 +559,9 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec)
static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec)
{
if (PageActive(page) && !PageUnevictable(page)) {
- int lru = page_lru_base_type(page);
int nr_pages = thp_nr_pages(page);

- del_page_from_lru_list(page, lruvec, lru + LRU_ACTIVE);
+ del_page_from_lru_list(page, lruvec);
ClearPageActive(page);
ClearPageReferenced(page);
add_page_to_lru_list(page, lruvec);
@@ -581,11 +576,9 @@ static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec)
{
if (PageAnon(page) && PageSwapBacked(page) &&
!PageSwapCache(page) && !PageUnevictable(page)) {
- bool active = PageActive(page);
int nr_pages = thp_nr_pages(page);

- del_page_from_lru_list(page, lruvec,
- LRU_INACTIVE_ANON + active);
+ del_page_from_lru_list(page, lruvec);
ClearPageActive(page);
ClearPageReferenced(page);
/*
@@ -919,7 +912,8 @@ void release_pages(struct page **pages, int nr)

VM_BUG_ON_PAGE(!PageLRU(page), page);
__ClearPageLRU(page);
- del_page_from_lru_list(page, lruvec, page_off_lru(page));
+ del_page_from_lru_list(page, lruvec);
+ page_off_lru(page);
}

__ClearPageWaiters(page);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 8fc8f2c9d7ec..49451899037c 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1764,7 +1764,7 @@ int isolate_lru_page(struct page *page)

get_page(page);
lruvec = lock_page_lruvec_irq(page);
- del_page_from_lru_list(page, lruvec, page_lru(page));
+ del_page_from_lru_list(page, lruvec);
unlock_page_lruvec_irq(lruvec);
ret = 0;
}
@@ -4281,8 +4281,8 @@ void check_move_unevictable_pages(struct pagevec *pvec)
lruvec = relock_page_lruvec_irq(page, lruvec);
if (page_evictable(page) && PageUnevictable(page)) {
VM_BUG_ON_PAGE(PageActive(page), page);
+ del_page_from_lru_list(page, lruvec);
ClearPageUnevictable(page);
- del_page_from_lru_list(page, lruvec, LRU_UNEVICTABLE);
add_page_to_lru_list(page, lruvec);
pgrescued += nr_pages;
}
--
2.29.2.576.ga3fc446d84-goog

2020-12-07 22:15:25

by Yu Zhao

Subject: [PATCH 06/11] mm: add __clear_page_lru_flags() to replace page_off_lru()

Similar to page_off_lru(), the new function does non-atomic clearing
of PageLRU() in addition to PageActive() and PageUnevictable(), on a
page that has no references left.

If PageActive() and PageUnevictable() are both set, refuse to clear
either and leave them to bad_page(). This is a behavior change
meant to help debugging.
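
A release path then looks roughly like this (a sketch; see the
mm/swap.c hunks below):

	del_page_from_lru_list(page, lruvec);	/* still needs the lru flags intact */
	__clear_page_lru_flags(page);		/* then clears LRU, Active, Unevictable */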

Signed-off-by: Yu Zhao <[email protected]>
---
include/linux/mm_inline.h | 28 ++++++++++------------------
mm/swap.c | 6 ++----
mm/vmscan.c | 3 +--
3 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index ffacc6273678..ef3fd79222e5 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -61,27 +61,19 @@ static inline enum lru_list page_lru_base_type(struct page *page)
}

/**
- * page_off_lru - which LRU list was page on? clearing its lru flags.
- * @page: the page to test
- *
- * Returns the LRU list a page was on, as an index into the array of LRU
- * lists; and clears its Unevictable or Active flags, ready for freeing.
+ * __clear_page_lru_flags - clear page lru flags before releasing a page
+ * @page: the page that was on lru and now has a zero reference
*/
-static __always_inline enum lru_list page_off_lru(struct page *page)
+static __always_inline void __clear_page_lru_flags(struct page *page)
{
- enum lru_list lru;
+ __ClearPageLRU(page);

- if (PageUnevictable(page)) {
- __ClearPageUnevictable(page);
- lru = LRU_UNEVICTABLE;
- } else {
- lru = page_lru_base_type(page);
- if (PageActive(page)) {
- __ClearPageActive(page);
- lru += LRU_ACTIVE;
- }
- }
- return lru;
+ /* this shouldn't happen, so leave the flags to bad_page() */
+ if (PageActive(page) && PageUnevictable(page))
+ return;
+
+ __ClearPageActive(page);
+ __ClearPageUnevictable(page);
}

/**
diff --git a/mm/swap.c b/mm/swap.c
index d55a0c27d804..a37c896a32b0 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -84,9 +84,8 @@ static void __page_cache_release(struct page *page)

lruvec = lock_page_lruvec_irqsave(page, &flags);
VM_BUG_ON_PAGE(!PageLRU(page), page);
- __ClearPageLRU(page);
del_page_from_lru_list(page, lruvec);
- page_off_lru(page);
+ __clear_page_lru_flags(page);
unlock_page_lruvec_irqrestore(lruvec, flags);
}
__ClearPageWaiters(page);
@@ -911,9 +910,8 @@ void release_pages(struct page **pages, int nr)
lock_batch = 0;

VM_BUG_ON_PAGE(!PageLRU(page), page);
- __ClearPageLRU(page);
del_page_from_lru_list(page, lruvec);
- page_off_lru(page);
+ __clear_page_lru_flags(page);
}

__ClearPageWaiters(page);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 49451899037c..e6bdfdfa2da1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1847,8 +1847,7 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
SetPageLRU(page);

if (unlikely(put_page_testzero(page))) {
- __ClearPageLRU(page);
- __ClearPageActive(page);
+ __clear_page_lru_flags(page);

if (unlikely(PageCompound(page))) {
spin_unlock_irq(&lruvec->lru_lock);
--
2.29.2.576.ga3fc446d84-goog

2020-12-07 22:15:34

by Yu Zhao

Subject: [PATCH 10/11] mm: make lruvec_lru_size() static

All other references to the function were removed after
commit b910718a948a ("mm: vmscan: detect file thrashing at the reclaim root")

Signed-off-by: Yu Zhao <[email protected]>
---
include/linux/mmzone.h | 2 --
mm/vmscan.c | 3 ++-
2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index b593316bff3d..2fc54e269eaf 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -872,8 +872,6 @@ static inline struct pglist_data *lruvec_pgdat(struct lruvec *lruvec)
#endif
}

-extern unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru, int zone_idx);
-
#ifdef CONFIG_HAVE_MEMORYLESS_NODES
int local_memory_node(int node_id);
#else
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 95e581c9d9af..fd0c2313bee4 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -310,7 +310,8 @@ unsigned long zone_reclaimable_pages(struct zone *zone)
* @lru: lru to use
* @zone_idx: zones to consider (use MAX_NR_ZONES for the whole LRU list)
*/
-unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru, int zone_idx)
+static unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru,
+ int zone_idx)
{
unsigned long size = 0;
int zid;
--
2.29.2.576.ga3fc446d84-goog

2020-12-07 22:15:38

by Yu Zhao

Subject: [PATCH 07/11] mm: VM_BUG_ON lru page flags

Move scattered VM_BUG_ONs to two essential places that cover all
lru list additions and deletions.

Signed-off-by: Yu Zhao <[email protected]>
---
include/linux/mm_inline.h | 4 ++++
mm/swap.c | 2 --
mm/vmscan.c | 1 -
3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index ef3fd79222e5..6d907a4dd6ad 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -66,6 +66,8 @@ static inline enum lru_list page_lru_base_type(struct page *page)
*/
static __always_inline void __clear_page_lru_flags(struct page *page)
{
+ VM_BUG_ON_PAGE(!PageLRU(page), page);
+
__ClearPageLRU(page);

/* this shouldn't happen, so leave the flags to bad_page() */
@@ -87,6 +89,8 @@ static __always_inline enum lru_list page_lru(struct page *page)
{
enum lru_list lru;

+ VM_BUG_ON_PAGE(PageActive(page) && PageUnevictable(page), page);
+
if (PageUnevictable(page))
lru = LRU_UNEVICTABLE;
else {
diff --git a/mm/swap.c b/mm/swap.c
index a37c896a32b0..09c4a48e0bcd 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -83,7 +83,6 @@ static void __page_cache_release(struct page *page)
unsigned long flags;

lruvec = lock_page_lruvec_irqsave(page, &flags);
- VM_BUG_ON_PAGE(!PageLRU(page), page);
del_page_from_lru_list(page, lruvec);
__clear_page_lru_flags(page);
unlock_page_lruvec_irqrestore(lruvec, flags);
@@ -909,7 +908,6 @@ void release_pages(struct page **pages, int nr)
if (prev_lruvec != lruvec)
lock_batch = 0;

- VM_BUG_ON_PAGE(!PageLRU(page), page);
del_page_from_lru_list(page, lruvec);
__clear_page_lru_flags(page);
}
diff --git a/mm/vmscan.c b/mm/vmscan.c
index e6bdfdfa2da1..95e581c9d9af 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4279,7 +4279,6 @@ void check_move_unevictable_pages(struct pagevec *pvec)

lruvec = relock_page_lruvec_irq(page, lruvec);
if (page_evictable(page) && PageUnevictable(page)) {
- VM_BUG_ON_PAGE(PageActive(page), page);
del_page_from_lru_list(page, lruvec);
ClearPageUnevictable(page);
add_page_to_lru_list(page, lruvec);
--
2.29.2.576.ga3fc446d84-goog

2020-12-07 22:16:11

by Yu Zhao

Subject: [PATCH 11/11] mm: enlarge the "int nr_pages" parameter of update_lru_size()

update_lru_sizes() defines an unsigned long argument and passes it as
nr_pages to update_lru_size(). Though this isn't causing any overflows
I'm aware of, it's a bad idea to go through the demotion given that we
have recently stumbled on a related type promotion problem fixed by
commit 2da9f6305f30 ("mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit")

Note that the underlying counters are already in long. This is another
reason we shouldn't have the demotion.

This patch enlarges all relevant parameters on the path to the final
underlying counters:
update_lru_size(int -> long)
  if memcg:
    __mod_lruvec_state(int -> long)
      if smp:
        __mod_node_page_state(long)
      else:
        __mod_node_page_state(int -> long)
      __mod_memcg_lruvec_state(int -> long)
        __mod_memcg_state(int -> long)
  else:
    __mod_lruvec_state(int -> long)
      if smp:
        __mod_node_page_state(long)
      else:
        __mod_node_page_state(int -> long)

  __mod_zone_page_state(long)

  if memcg:
    mem_cgroup_update_lru_size(int -> long)

Note that __mod_node_page_state() for the smp case and
__mod_zone_page_state() already use long. So this change also fixes
the inconsistency.
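
As a standalone illustration of the demotion being avoided (ordinary
userspace C with stand-in prototypes, not kernel code):

	#include <stdio.h>

	/* stand-ins for the old and new update_lru_size() prototypes */
	static void update_lru_size_old(int nr_pages)  { printf("int:  %d\n", nr_pages); }
	static void update_lru_size_new(long nr_pages) { printf("long: %ld\n", nr_pages); }

	int main(void)
	{
		/* update_lru_sizes() computes its count as unsigned long */
		unsigned long nr_pages = 3UL << 31;	/* larger than INT_MAX */

		update_lru_size_old(nr_pages);	/* demoted: typically prints -2147483648 */
		update_lru_size_new(nr_pages);	/* preserved: prints 6442450944 on LP64 */
		return 0;
	}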

Signed-off-by: Yu Zhao <[email protected]>
---
include/linux/memcontrol.h | 10 +++++-----
include/linux/mm_inline.h | 2 +-
include/linux/vmstat.h | 6 +++---
mm/memcontrol.c | 10 +++++-----
4 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 3febf64d1b80..1454201abb8d 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -810,7 +810,7 @@ static inline bool mem_cgroup_online(struct mem_cgroup *memcg)
int mem_cgroup_select_victim_node(struct mem_cgroup *memcg);

void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
- int zid, int nr_pages);
+ int zid, long nr_pages);

static inline
unsigned long mem_cgroup_get_zone_lru_size(struct lruvec *lruvec,
@@ -896,7 +896,7 @@ static inline unsigned long memcg_page_state_local(struct mem_cgroup *memcg,
return x;
}

-void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val);
+void __mod_memcg_state(struct mem_cgroup *memcg, int idx, long val);

/* idx can be of type enum memcg_stat_item or node_stat_item */
static inline void mod_memcg_state(struct mem_cgroup *memcg,
@@ -948,7 +948,7 @@ static inline unsigned long lruvec_page_state_local(struct lruvec *lruvec,
}

void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
- int val);
+ long val);
void __mod_lruvec_kmem_state(void *p, enum node_stat_item idx, int val);

static inline void mod_lruvec_kmem_state(void *p, enum node_stat_item idx,
@@ -1346,7 +1346,7 @@ static inline unsigned long memcg_page_state_local(struct mem_cgroup *memcg,

static inline void __mod_memcg_state(struct mem_cgroup *memcg,
int idx,
- int nr)
+ long nr)
{
}

@@ -1369,7 +1369,7 @@ static inline unsigned long lruvec_page_state_local(struct lruvec *lruvec,
}

static inline void __mod_memcg_lruvec_state(struct lruvec *lruvec,
- enum node_stat_item idx, int val)
+ enum node_stat_item idx, long val)
{
}

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 355ea1ee32bd..18e85071b44a 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -26,7 +26,7 @@ static inline int page_is_file_lru(struct page *page)

static __always_inline void update_lru_size(struct lruvec *lruvec,
enum lru_list lru, enum zone_type zid,
- int nr_pages)
+ long nr_pages)
{
struct pglist_data *pgdat = lruvec_pgdat(lruvec);

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 773135fc6e19..230922179ba0 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -310,7 +310,7 @@ static inline void __mod_zone_page_state(struct zone *zone,
}

static inline void __mod_node_page_state(struct pglist_data *pgdat,
- enum node_stat_item item, int delta)
+ enum node_stat_item item, long delta)
{
if (vmstat_item_in_bytes(item)) {
VM_WARN_ON_ONCE(delta & (PAGE_SIZE - 1));
@@ -453,7 +453,7 @@ static inline const char *vm_event_name(enum vm_event_item item)
#ifdef CONFIG_MEMCG

void __mod_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
- int val);
+ long val);

static inline void mod_lruvec_state(struct lruvec *lruvec,
enum node_stat_item idx, int val)
@@ -481,7 +481,7 @@ static inline void mod_lruvec_page_state(struct page *page,
#else

static inline void __mod_lruvec_state(struct lruvec *lruvec,
- enum node_stat_item idx, int val)
+ enum node_stat_item idx, long val)
{
__mod_node_page_state(lruvec_pgdat(lruvec), idx, val);
}
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index de17f02d27ad..c3fe5880c42d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -758,7 +758,7 @@ mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
* @idx: the stat item - can be enum memcg_stat_item or enum node_stat_item
* @val: delta to add to the counter, can be negative
*/
-void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val)
+void __mod_memcg_state(struct mem_cgroup *memcg, int idx, long val)
{
long x, threshold = MEMCG_CHARGE_BATCH;

@@ -796,7 +796,7 @@ parent_nodeinfo(struct mem_cgroup_per_node *pn, int nid)
}

void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
- int val)
+ long val)
{
struct mem_cgroup_per_node *pn;
struct mem_cgroup *memcg;
@@ -837,7 +837,7 @@ void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
* change of state at this level: per-node, per-cgroup, per-lruvec.
*/
void __mod_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
- int val)
+ long val)
{
/* Update node */
__mod_node_page_state(lruvec_pgdat(lruvec), idx, val);
@@ -1407,7 +1407,7 @@ struct lruvec *lock_page_lruvec_irqsave(struct page *page, unsigned long *flags)
* so as to allow it to check that lru_size 0 is consistent with list_empty).
*/
void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
- int zid, int nr_pages)
+ int zid, long nr_pages)
{
struct mem_cgroup_per_node *mz;
unsigned long *lru_size;
@@ -1424,7 +1424,7 @@ void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,

size = *lru_size;
if (WARN_ONCE(size < 0,
- "%s(%p, %d, %d): lru_size %ld\n",
+ "%s(%p, %d, %ld): lru_size %ld\n",
__func__, lruvec, lru, nr_pages, size)) {
VM_BUG_ON(1);
*lru_size = 0;
--
2.29.2.576.ga3fc446d84-goog

2020-12-07 22:27:36

by Matthew Wilcox

Subject: Re: [PATCH 07/11] mm: VM_BUG_ON lru page flags

On Mon, Dec 07, 2020 at 03:09:45PM -0700, Yu Zhao wrote:
> Move scattered VM_BUG_ONs to two essential places that cover all
> lru list additions and deletions.

I'd like to see these converted into VM_BUG_ON_PGFLAGS so you have
to take that extra CONFIG step to enable checking them.
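
For reference, the suggested form would look roughly like this (a
sketch, assuming VM_BUG_ON_PGFLAGS() from <linux/mmdebug.h> keeps its
current two-argument form and stays compiled out unless
CONFIG_DEBUG_VM_PGFLAGS is set):

	static __always_inline enum lru_list page_lru(struct page *page)
	{
		/* only checked when CONFIG_DEBUG_VM_PGFLAGS is enabled */
		VM_BUG_ON_PGFLAGS(PageActive(page) && PageUnevictable(page), page);
		...
	}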

2020-12-08 08:18:51

by Alex Shi

Subject: Re: [PATCH 03/11] mm: don't pass "enum lru_list" to lru list addition functions



On 2020/12/8 6:09 AM, Yu Zhao wrote:
>
> __count_vm_events(PGACTIVATE, nr_pages);
> @@ -543,14 +542,14 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec)
> * It can make readahead confusing. But race window
> * is _really_ small and it's non-critical problem.
> */
> - add_page_to_lru_list(page, lruvec, lru);
> + add_page_to_lru_list(page, lruvec);
> SetPageReclaim(page);
> } else {
> /*
> * The page's writeback ends up during pagevec
> * We moves tha page into tail of inactive.
> */
> - add_page_to_lru_list_tail(page, lruvec, lru);
> + add_page_to_lru_list_tail(page, lruvec);
> __count_vm_events(PGROTATED, nr_pages);
> }
>
> @@ -570,7 +569,7 @@ static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec)
> del_page_from_lru_list(page, lruvec, lru + LRU_ACTIVE);
> ClearPageActive(page);
> ClearPageReferenced(page);
> - add_page_to_lru_list(page, lruvec, lru);
> + add_page_to_lru_list(page, lruvec);
>
> __count_vm_events(PGDEACTIVATE, nr_pages);
> __count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE,

It seems that leaving the "lru = xxx" out costs 2 extra function calls in lru_deactivate_file_fn(); keeping it could save them. Is this right?

2020-12-08 08:28:40

by Alex Shi

Subject: Re: [PATCH 03/11] mm: don't pass "enum lru_list" to lru list addition functions



On 2020/12/8 6:09 AM, Yu Zhao wrote:
> The "enum lru_list" parameter to add_page_to_lru_list() and
> add_page_to_lru_list_tail() is redundant in the sense that it can
> be extracted from the "struct page" parameter by page_lru().
>
> A caveat is that we need to make sure PageActive() or
> PageUnevictable() is correctly set or cleared before calling
> these two functions. And they are indeed.
>
> Signed-off-by: Yu Zhao <[email protected]>
> ---
> include/linux/mm_inline.h | 8 ++++++--
> mm/swap.c | 15 +++++++--------
> mm/vmscan.c | 6 ++----
> 3 files changed, 15 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
> index 2889741f450a..130ba3201d3f 100644
> --- a/include/linux/mm_inline.h
> +++ b/include/linux/mm_inline.h
> @@ -106,15 +106,19 @@ static __always_inline enum lru_list page_lru(struct page *page)
> }
>
> static __always_inline void add_page_to_lru_list(struct page *page,
> - struct lruvec *lruvec, enum lru_list lru)
> + struct lruvec *lruvec)
> {
> + enum lru_list lru = page_lru(page);
> +
> update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
> list_add(&page->lru, &lruvec->lists[lru]);
> }
>
> static __always_inline void add_page_to_lru_list_tail(struct page *page,
> - struct lruvec *lruvec, enum lru_list lru)
> + struct lruvec *lruvec)
> {
> + enum lru_list lru = page_lru(page);
> +
> update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page));
> list_add_tail(&page->lru, &lruvec->lists[lru]);
> }
> diff --git a/mm/swap.c b/mm/swap.c
> index 5022dfe388ad..136acabbfab5 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -231,7 +231,7 @@ static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
> if (!PageUnevictable(page)) {
> del_page_from_lru_list(page, lruvec, page_lru(page));
> ClearPageActive(page);
> - add_page_to_lru_list_tail(page, lruvec, page_lru(page));
> + add_page_to_lru_list_tail(page, lruvec);
> __count_vm_events(PGROTATED, thp_nr_pages(page));
> }
> }
> @@ -313,8 +313,7 @@ static void __activate_page(struct page *page, struct lruvec *lruvec)
>
> del_page_from_lru_list(page, lruvec, lru);
> SetPageActive(page);
> - lru += LRU_ACTIVE;

Uh, actually, page-to-lru functions like page_lru(page) are always inline, so generally there is
no increase in instructions, except in a few places like here.


> - add_page_to_lru_list(page, lruvec, lru);
> + add_page_to_lru_list(page, lruvec);
> trace_mm_lru_activate(page);
>
> __count_vm_events(PGACTIVATE, nr_pages);
> @@ -543,14 +542,14 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec)
> * It can make readahead confusing. But race window
> * is _really_ small and it's non-critical problem.
> */
> - add_page_to_lru_list(page, lruvec, lru);
> + add_page_to_lru_list(page, lruvec);
> SetPageReclaim(page);
> } else {
> /*
> * The page's writeback ends up during pagevec
> * We moves tha page into tail of inactive.
> */
> - add_page_to_lru_list_tail(page, lruvec, lru);
> + add_page_to_lru_list_tail(page, lruvec);
> __count_vm_events(PGROTATED, nr_pages);
> }
>
> @@ -570,7 +569,7 @@ static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec)
> del_page_from_lru_list(page, lruvec, lru + LRU_ACTIVE);
> ClearPageActive(page);
> ClearPageReferenced(page);
> - add_page_to_lru_list(page, lruvec, lru);
> + add_page_to_lru_list(page, lruvec);
>
> __count_vm_events(PGDEACTIVATE, nr_pages);
> __count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE,
> @@ -595,7 +594,7 @@ static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec)
> * anonymous pages
> */
> ClearPageSwapBacked(page);
> - add_page_to_lru_list(page, lruvec, LRU_INACTIVE_FILE);
> + add_page_to_lru_list(page, lruvec);
>
> __count_vm_events(PGLAZYFREE, nr_pages);
> __count_memcg_events(lruvec_memcg(lruvec), PGLAZYFREE,
> @@ -1005,7 +1004,7 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec)
> __count_vm_events(UNEVICTABLE_PGCULLED, nr_pages);
> }
>
> - add_page_to_lru_list(page, lruvec, lru);
> + add_page_to_lru_list(page, lruvec);
> trace_mm_lru_insertion(page, lru);
> }
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index a174594e40f8..8fc8f2c9d7ec 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1865,7 +1865,7 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
> * inhibits memcg migration).
> */
> VM_BUG_ON_PAGE(!lruvec_holds_page_lru_lock(page, lruvec), page);
> - add_page_to_lru_list(page, lruvec, page_lru(page));
> + add_page_to_lru_list(page, lruvec);
> nr_pages = thp_nr_pages(page);
> nr_moved += nr_pages;
> if (PageActive(page))
> @@ -4280,12 +4280,10 @@ void check_move_unevictable_pages(struct pagevec *pvec)
>
> lruvec = relock_page_lruvec_irq(page, lruvec);
> if (page_evictable(page) && PageUnevictable(page)) {
> - enum lru_list lru = page_lru_base_type(page);
> -
> VM_BUG_ON_PAGE(PageActive(page), page);
> ClearPageUnevictable(page);
> del_page_from_lru_list(page, lruvec, LRU_UNEVICTABLE);
> - add_page_to_lru_list(page, lruvec, lru);

And here.


> + add_page_to_lru_list(page, lruvec);
> pgrescued += nr_pages;
> }
> SetPageLRU(page);
>

2020-12-08 09:07:43

by Alex Shi

Subject: Re: [PATCH 08/11] mm: fold page_lru_base_type() into its sole caller

Reviewed-by: Alex Shi <[email protected]>

On 2020/12/8 6:09 AM, Yu Zhao wrote:
> We've removed all other references to this function.

2020-12-08 09:11:39

by Alex Shi

Subject: Re: [PATCH 10/11] mm: make lruvec_lru_size() static

Reviewed-by: Alex Shi <[email protected]>

On 2020/12/8 6:09 AM, Yu Zhao wrote:
> All other references to the function were removed after
> commit b910718a948a ("mm: vmscan: detect file thrashing at the reclaim root")

2020-12-08 09:12:23

by Alex Shi

Subject: Re: [PATCH 09/11] mm: fold __update_lru_size() into its sole caller

Reviewed-by: Alex Shi <[email protected]>

On 2020/12/8 6:09 AM, Yu Zhao wrote:
> All other references to the function were removed after
> commit a892cb6b977f ("mm/vmscan.c: use update_lru_size() in update_lru_sizes()")

2020-12-09 00:56:09

by Alex Shi

Subject: Re: [PATCH 11/11] mm: enlarge the "int nr_pages" parameter of update_lru_size()

Reviewed-by: Alex Shi <[email protected]>

On 2020/12/8 6:09 AM, Yu Zhao wrote:
> update_lru_sizes() defines an unsigned long argument and passes it as
> nr_pages to update_lru_size(). Though this isn't causing any overflows
> I'm aware of, it's a bad idea to go through the demotion given that we
> have recently stumbled on a related type promotion problem fixed by
> commit 2da9f6305f30 ("mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit")

2020-12-10 09:31:34

by Alex Shi

Subject: Re: [PATCH 00/11] mm: lru related cleanups

Hi Yu,

By the way, after this patchset, cacheline alignment on each of the lru lists
becomes possible. Did you try that to see how performance changes?

Thanks
Alex

On 2020/12/8 6:09 AM, Yu Zhao wrote:
> The cleanups are intended to reduce the verbosity in lru list
> operations and make them less error-prone.

2020-12-14 21:55:26

by Hugh Dickins

Subject: Re: [PATCH 11/11] mm: enlarge the "int nr_pages" parameter of update_lru_size()

On Mon, 7 Dec 2020, Yu Zhao wrote:

> update_lru_sizes() defines an unsigned long argument and passes it as
> nr_pages to update_lru_size(). Though this isn't causing any overflows
> I'm aware of, it's a bad idea to go through the demotion given that we
> have recently stumbled on a related type promotion problem fixed by
> commit 2da9f6305f30 ("mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit")
>
> Note that the underlying counters are already in long. This is another
> reason we shouldn't have the demotion.
>
> This patch enlarges all relevant parameters on the path to the final
> underlying counters:
> update_lru_size(int -> long)
>   if memcg:
>     __mod_lruvec_state(int -> long)
>       if smp:
>         __mod_node_page_state(long)
>       else:
>         __mod_node_page_state(int -> long)
>       __mod_memcg_lruvec_state(int -> long)
>         __mod_memcg_state(int -> long)
>   else:
>     __mod_lruvec_state(int -> long)
>       if smp:
>         __mod_node_page_state(long)
>       else:
>         __mod_node_page_state(int -> long)
>
>   __mod_zone_page_state(long)
>
>   if memcg:
>     mem_cgroup_update_lru_size(int -> long)
>
> Note that __mod_node_page_state() for the smp case and
> __mod_zone_page_state() already use long. So this change also fixes
> the inconsistency.
>
> Signed-off-by: Yu Zhao <[email protected]>

NAK from me to this 11/11: I'm running happily with your 1-10 on top of
mmotm (I'll review them n a few days, but currently more concerned with
Rik's shmem huge gfp_mask), but had to leave this one out.

You think you are future-proofing with this, but it is present-breaking.

It looks plausible (though seems random: why these particular functions
use long but others not? why __mod_memcg_state() long, mod_memcg_state()
int?), and I was fooled; but fortunately was still testing with memcg
moving, for Alex's patchset.

Soon got stuck waiting in balance_dirty_pages(), /proc/vmstat showing
nr_anon_pages 2263142822377729
nr_mapped 125095217474159
nr_file_pages 225421358649526
nr_dirty 8589934592
nr_writeback 1202590842920
nr_shmem 40501541678768
nr_anon_transparent_hugepages 51539607554

That last (anon THPs) nothing to do with this patch, but illustrates
what Muchun is fixing in his 1/7 "mm: memcontrol: fix NR_ANON_THPS
accounting in charge moving".

The rest of them could be fixed by changing mem_cgroup_move_account()'s
"unsigned int nr_pages" to "long nr_pages" in this patch, but I think
it's safer just to drop the patch: the promotion of "unsigned int" to
"long" does not work as you would like it to.

I see that mm/vmscan.c contains several "unsigned int" counts of pages,
everything works fine at present so far as I know, and those appeared
to work even with your patch; but I am not confident in my test coverage,
and not confident in us being able to outlaw unsigned int page counts in
future.

Hugh

2020-12-16 00:21:08

by Yu Zhao

Subject: Re: [PATCH 11/11] mm: enlarge the "int nr_pages" parameter of update_lru_size()

On Mon, Dec 14, 2020 at 01:50:16PM -0800, Hugh Dickins wrote:
> On Mon, 7 Dec 2020, Yu Zhao wrote:
>
> > update_lru_sizes() defines an unsigned long argument and passes it as
> > nr_pages to update_lru_size(). Though this isn't causing any overflows
> > I'm aware of, it's a bad idea to go through the demotion given that we
> > have recently stumbled on a related type promotion problem fixed by
> > commit 2da9f6305f30 ("mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit")
> >
> > Note that the underlying counters are already in long. This is another
> > reason we shouldn't have the demotion.
> >
> > This patch enlarges all relevant parameters on the path to the final
> > underlying counters:
> > update_lru_size(int -> long)
> >   if memcg:
> >     __mod_lruvec_state(int -> long)
> >       if smp:
> >         __mod_node_page_state(long)
> >       else:
> >         __mod_node_page_state(int -> long)
> >       __mod_memcg_lruvec_state(int -> long)
> >         __mod_memcg_state(int -> long)
> >   else:
> >     __mod_lruvec_state(int -> long)
> >       if smp:
> >         __mod_node_page_state(long)
> >       else:
> >         __mod_node_page_state(int -> long)
> >
> >   __mod_zone_page_state(long)
> >
> >   if memcg:
> >     mem_cgroup_update_lru_size(int -> long)
> >
> > Note that __mod_node_page_state() for the smp case and
> > __mod_zone_page_state() already use long. So this change also fixes
> > the inconsistency.
> >
> > Signed-off-by: Yu Zhao <[email protected]>
>
> NAK from me to this 11/11: I'm running happily with your 1-10 on top of
> mmotm (I'll review them n a few days, but currently more concerned with
> Rik's shmem huge gfp_mask), but had to leave this one out.
>
> You think you are future-proofing with this, but it is present-breaking.
>
> It looks plausible (though seems random: why these particular functions
> use long but others not? why __mod_memcg_state() long, mod_memcg_state()
> int?), and I was fooled; but fortunately was still testing with memcg
> moving, for Alex's patchset.

My apologies. The patch was fully tested on 4.15. Apparently I didn't
pay enough attention to what changed in mem_cgroup_move_account(), nor
did I have any test coverage of it when rebasing this patch.

> Soon got stuck waiting in balance_dirty_pages(), /proc/vmstat showing
> nr_anon_pages 2263142822377729
> nr_mapped 125095217474159
> nr_file_pages 225421358649526
> nr_dirty 8589934592
> nr_writeback 1202590842920
> nr_shmem 40501541678768
> nr_anon_transparent_hugepages 51539607554
>
> That last (anon THPs) nothing to do with this patch, but illustrates
> what Muchun is fixing in his 1/7 "mm: memcontrol: fix NR_ANON_THPS
> accounting in charge moving".
>
> The rest of them could be fixed by changing mem_cgroup_move_account()'s
> "unsigned int nr_pages" to "long nr_pages" in this patch, but I think
> it's safer just to drop the patch: the promotion of "unsigned int" to
> "long" does not work as you would like it to.
>
> I see that mm/vmscan.c contains several "unsigned int" counts of pages,
> everything works fine at present so far as I know, and those appeared
> to work even with your patch; but I am not confident in my test coverage,
> and not confident in us being able to outlaw unsigned int page counts in
> future.

I'll just drop this one. Thanks.