2022-12-12 21:01:03

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH 5.10 001/106] mm/mlock: remove lru_lock on TestClearPageMlocked

On Mon, 12 Dec 2022, Greg Kroah-Hartman wrote:

> From: Alex Shi <[email protected]>
>
> [ Upstream commit 3db19aa39bac33f2e850fa1ddd67be29b192e51f ]
>
> In the func munlock_vma_page, comments mentained lru_lock needed for
> serialization with split_huge_pages. But the page must be PageLocked as
> well as pages in split_huge_page series funcs. Thus the PageLocked is
> enough to serialize both funcs.
>
> Further more, Hugh Dickins pointed: before splitting in
> split_huge_page_to_list, the page was unmap_page() to remove pmd/ptes
> which protect the page from munlock. Thus, no needs to guard
> __split_huge_page_tail for mlock clean, just keep the lru_lock there for
> isolation purpose.
>
> LKP found a preempt issue on __mod_zone_page_state which need change to
> mod_zone_page_state. Thanks!
>
> Link: https://lkml.kernel.org/r/[email protected]
> Signed-off-by: Alex Shi <[email protected]>
> Acked-by: Hugh Dickins <[email protected]>
> Acked-by: Johannes Weiner <[email protected]>
> Acked-by: Vlastimil Babka <[email protected]>
> Cc: Kirill A. Shutemov <[email protected]>
> Cc: Alexander Duyck <[email protected]>
> Cc: Andrea Arcangeli <[email protected]>
> Cc: Andrey Ryabinin <[email protected]>
> Cc: "Chen, Rong A" <[email protected]>
> Cc: Daniel Jordan <[email protected]>
> Cc: "Huang, Ying" <[email protected]>
> Cc: Jann Horn <[email protected]>
> Cc: Joonsoo Kim <[email protected]>
> Cc: Kirill A. Shutemov <[email protected]>
> Cc: Konstantin Khlebnikov <[email protected]>
> Cc: Matthew Wilcox (Oracle) <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Mika Penttilä <[email protected]>
> Cc: Minchan Kim <[email protected]>
> Cc: Shakeel Butt <[email protected]>
> Cc: Tejun Heo <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Vladimir Davydov <[email protected]>
> Cc: Wei Yang <[email protected]>
> Cc: Yang Shi <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
> Signed-off-by: Linus Torvalds <[email protected]>
> Stable-dep-of: 829ae0f81ce0 ("mm: migrate: fix THP's mapcount on isolation")
> Signed-off-by: Sasha Levin <[email protected]>

NAK from me to patches 001 through 007 here: 001 through 006 are a
risky subset of patches and followups to a per-memcg per-node lru_lock
series from Alex Shi, which made subtle changes to locking, memcg
charging, lru management, page migration etc.

The whole series could be backported to 5.10 (I did so myself for
internal usage), but cherry-picking parts of it into 5.10-stable is
misguided and contrary to stable principles.

Maybe there is in fact nothing wrong with the selection made:
but then give linux-mm guys two or three weeks to review and
test and give the thumbs up to that selection.

Much easier, quicker and safer would be to adjust 007 (I presume
the reason behind 001 through 006) to fit the 5.10-stable tree:
I can do that myself if you ask, but not until later this week.

Hugh

> ---
> mm/mlock.c | 26 +++++---------------------
> 1 file changed, 5 insertions(+), 21 deletions(-)
>
> diff --git a/mm/mlock.c b/mm/mlock.c
> index 884b1216da6a..796c726a0407 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -187,40 +187,24 @@ static void __munlock_isolation_failed(struct page *page)
> unsigned int munlock_vma_page(struct page *page)
> {
> int nr_pages;
> - pg_data_t *pgdat = page_pgdat(page);
>
> /* For try_to_munlock() and to serialize with page migration */
> BUG_ON(!PageLocked(page));
> -
> VM_BUG_ON_PAGE(PageTail(page), page);
>
> - /*
> - * Serialize with any parallel __split_huge_page_refcount() which
> - * might otherwise copy PageMlocked to part of the tail pages before
> - * we clear it in the head page. It also stabilizes thp_nr_pages().
> - */
> - spin_lock_irq(&pgdat->lru_lock);
> -
> if (!TestClearPageMlocked(page)) {
> /* Potentially, PTE-mapped THP: do not skip the rest PTEs */
> - nr_pages = 1;
> - goto unlock_out;
> + return 0;
> }
>
> nr_pages = thp_nr_pages(page);
> - __mod_zone_page_state(page_zone(page), NR_MLOCK, -nr_pages);
> + mod_zone_page_state(page_zone(page), NR_MLOCK, -nr_pages);
>
> - if (__munlock_isolate_lru_page(page, true)) {
> - spin_unlock_irq(&pgdat->lru_lock);
> + if (!isolate_lru_page(page))
> __munlock_isolated_page(page);
> - goto out;
> - }
> - __munlock_isolation_failed(page);
> -
> -unlock_out:
> - spin_unlock_irq(&pgdat->lru_lock);
> + else
> + __munlock_isolation_failed(page);
>
> -out:
> return nr_pages - 1;
> }
>
> --
> 2.35.1
>
>
>
>


2022-12-13 14:47:13

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 5.10 001/106] mm/mlock: remove lru_lock on TestClearPageMlocked

On Mon, Dec 12, 2022 at 12:35:57PM -0800, Hugh Dickins wrote:
> On Mon, 12 Dec 2022, Greg Kroah-Hartman wrote:
>
> > From: Alex Shi <[email protected]>
> >
> > [ Upstream commit 3db19aa39bac33f2e850fa1ddd67be29b192e51f ]
> >
> > In the func munlock_vma_page, comments mentained lru_lock needed for
> > serialization with split_huge_pages. But the page must be PageLocked as
> > well as pages in split_huge_page series funcs. Thus the PageLocked is
> > enough to serialize both funcs.
> >
> > Further more, Hugh Dickins pointed: before splitting in
> > split_huge_page_to_list, the page was unmap_page() to remove pmd/ptes
> > which protect the page from munlock. Thus, no needs to guard
> > __split_huge_page_tail for mlock clean, just keep the lru_lock there for
> > isolation purpose.
> >
> > LKP found a preempt issue on __mod_zone_page_state which need change to
> > mod_zone_page_state. Thanks!
> >
> > Link: https://lkml.kernel.org/r/[email protected]
> > Signed-off-by: Alex Shi <[email protected]>
> > Acked-by: Hugh Dickins <[email protected]>
> > Acked-by: Johannes Weiner <[email protected]>
> > Acked-by: Vlastimil Babka <[email protected]>
> > Cc: Kirill A. Shutemov <[email protected]>
> > Cc: Alexander Duyck <[email protected]>
> > Cc: Andrea Arcangeli <[email protected]>
> > Cc: Andrey Ryabinin <[email protected]>
> > Cc: "Chen, Rong A" <[email protected]>
> > Cc: Daniel Jordan <[email protected]>
> > Cc: "Huang, Ying" <[email protected]>
> > Cc: Jann Horn <[email protected]>
> > Cc: Joonsoo Kim <[email protected]>
> > Cc: Kirill A. Shutemov <[email protected]>
> > Cc: Konstantin Khlebnikov <[email protected]>
> > Cc: Matthew Wilcox (Oracle) <[email protected]>
> > Cc: Mel Gorman <[email protected]>
> > Cc: Michal Hocko <[email protected]>
> > Cc: Michal Hocko <[email protected]>
> > Cc: Mika Penttil? <[email protected]>
> > Cc: Minchan Kim <[email protected]>
> > Cc: Shakeel Butt <[email protected]>
> > Cc: Tejun Heo <[email protected]>
> > Cc: Thomas Gleixner <[email protected]>
> > Cc: Vladimir Davydov <[email protected]>
> > Cc: Wei Yang <[email protected]>
> > Cc: Yang Shi <[email protected]>
> > Signed-off-by: Andrew Morton <[email protected]>
> > Signed-off-by: Linus Torvalds <[email protected]>
> > Stable-dep-of: 829ae0f81ce0 ("mm: migrate: fix THP's mapcount on isolation")
> > Signed-off-by: Sasha Levin <[email protected]>
>
> NAK from me to patches 001 through 007 here: 001 through 006 are a
> risky subset of patches and followups to a per-memcg per-node lru_lock
> series from Alex Shi, which made subtle changes to locking, memcg
> charging, lru management, page migration etc.
>
> The whole series could be backported to 5.10 (I did so myself for
> internal usage), but cherry-picking parts of it into 5.10-stable is
> misguided and contrary to stable principles.
>
> Maybe there is in fact nothing wrong with the selection made:
> but then give linux-mm guys two or three weeks to review and
> test and give the thumbs up to that selection.
>
> Much easier, quicker and safer would be to adjust 007 (I presume
> the reason behind 001 through 006) to fit the 5.10-stable tree:
> I can do that myself if you ask, but not until later this week.

All now dropped, thanks.

greg k-h