2023-05-29 12:48:40

by mawupeng

[permalink] [raw]
Subject: [RFC PATCH stable 5.10/5.15] mm: Pass head page to clear_page_mlock for page_remove_rmap

From: Ma Wupeng <[email protected]>

Our syzbot report a mlock related problem. During exit_mm, tail page is
passed to clear_page_mlock which final lead to kernel panic.

During unmap_page_range, if compound is false, it means this page is
seen as a small page. This page is passed to isolate_lru_page if this
page is PageMlocked and finally lead to "trying to isolate tail page"
warning.

Here is the simplified calltrace:

unmap_page_range
zap_pte_range
page_remove_rmap(page, false); // compound is false means to handle
to small page not compound page
nr_pages = thp_nr_pages(page);
clear_page_mlock(page) // maybe tail page here
isolate_lru_page
WARN_RATELIMIT(PageTail(page), "trying to isolate tail page");

Since mlock is not supposed to handle tail, we pass head page to
clear_page_mlock() to slove this problem.

This bug can lead to multiple reports. Here ares the simplified reports:

------------[ cut here ]------------
trying to isolate tail page
WARNING: CPU: 1 PID: 24489 at mm/vmscan.c:2031 isolate_lru_page+0x574/0x660

page:fffffc000eb7a300 refcount:512 mapcount:0 mapping:0000000000000000 index:0x2008c pfn:0x3ede8c
head:fffffc000eb78000 order:9 compound_mapcount:0 compound_pincount:0
memcg:ffff0000d24bc000
anon flags: 0x37ffff80009080c(uptodate|dirty|arch_1|head|swapbacked|node=1|zone=2|lastcpupid=0xfffff)
raw: 037ffff800000800 fffffc000eb78001 fffffc000eb7a308 dead000000000400
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
head: 037ffff80009080c fffffc000eb70008 fffffc000e350708 ffff0003829eb839
head: 0000000000020000 0000000000000000 00000200ffffffff ffff0000d24bc000
page dumped because: VM_WARN_ON_ONCE_PAGE(!memcg && !mem_cgroup_disabled())
------------[ cut here ]------------
WARNING: CPU: 1 PID: 24489 at include/linux/memcontrol.h:767 lock_page_lruvec_irq+0x148/0x190

page:fffffc000eb7a300 refcount:0 mapcount:0 mapping:dead000000000400 index:0x0 pfn:0x3ede8c
failed to read mapping contents, not a valid kernel address?
flags: 0x37ffff800000800(arch_1|node=1|zone=2|lastcpupid=0xfffff)
raw: 037ffff800000800 dead000000000100 dead000000000122 dead000000000400
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: VM_BUG_ON_PAGE(((unsigned int) page_ref_count(page) + 127u <= 127u))
------------[ cut here ]------------
kernel BUG at include/linux/mm.h:1213!
Call trace:
lru_cache_add+0x2d4/0x2e8
putback_lru_page+0x2c/0x168
clear_page_mlock+0x254/0x318
page_remove_rmap+0x900/0x9c0
unmap_page_range+0xa78/0x16a0
unmap_single_vma+0x114/0x1a0
unmap_vmas+0x100/0x220
exit_mmap+0x120/0x410
mmput+0x174/0x498
exit_mm+0x33c/0x460
do_exit+0x3c0/0x1310
do_group_exit+0x98/0x170
get_signal+0x370/0x13d0
do_notify_resume+0x5a0/0x968
el0_da+0x154/0x188
el0t_64_sync_handler+0x88/0xb8
el0t_64_sync+0x1a0/0x1a4
Code: 912b0021 aa1503e0 910c0021 9401a49c (d4210000)

This bug can be reproduced in both linux-5.10.y & linux-5.15.y and maybe
fixed after commit 889a3747b3b7 ("mm/lru: Add folio LRU functions").
This patch turn page into folio for LRU related operations, all
operations to page is turn to folio which means head page after this
patch.

Fixes: d281ee614518 ("rmap: add argument to charge compound page")
Signed-off-by: Ma Wupeng <[email protected]>
---
mm/rmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 330b361a460e..8838f6a9d65d 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1372,7 +1372,7 @@ void page_remove_rmap(struct page *page, bool compound)
__dec_lruvec_page_state(page, NR_ANON_MAPPED);

if (unlikely(PageMlocked(page)))
- clear_page_mlock(page);
+ clear_page_mlock(compound_head(page));

if (PageTransCompound(page))
deferred_split_huge_page(compound_head(page));
--
2.25.1



2023-06-02 02:29:56

by Hugh Dickins

[permalink] [raw]
Subject: Re: [RFC PATCH stable 5.10/5.15] mm: Pass head page to clear_page_mlock for page_remove_rmap

On Mon, 29 May 2023, Wupeng Ma wrote:
> From: Ma Wupeng <[email protected]>
>
> Our syzbot report a mlock related problem. During exit_mm, tail page is
> passed to clear_page_mlock which final lead to kernel panic.
>
> During unmap_page_range, if compound is false, it means this page is
> seen as a small page. This page is passed to isolate_lru_page if this
> page is PageMlocked and finally lead to "trying to isolate tail page"
> warning.
>
> Here is the simplified calltrace:
>
> unmap_page_range
> zap_pte_range
> page_remove_rmap(page, false); // compound is false means to handle
> to small page not compound page
> nr_pages = thp_nr_pages(page);
> clear_page_mlock(page) // maybe tail page here
> isolate_lru_page
> WARN_RATELIMIT(PageTail(page), "trying to isolate tail page");
>
> Since mlock is not supposed to handle tail, we pass head page to
> clear_page_mlock() to slove this problem.

Your patch looks plausible for stable, and might even end up as the best
that can be done; but I think you have not root-caused the problem yet
(and until it's root-caused, there is likely to be other damage).

5.15 and 5.10 were releases with the PageDoubleMap flag, and the intention
then was that a compound page with PageDoubleMap set could not be Mlocked,
and PageMlocked had to be cleared when setting PageDoubleMap.

See, for example, the line in the old mlock_vma_page()
VM_BUG_ON_PAGE(PageCompound(page) && PageDoubleMap(page), page);
before it did the TestSetPageMlocked().

So it should have been impossible to find PageMlocked on a Tail page
(even with PageMlocked redirecting to the head page to look up the flag)
there; so unnecessary for clear_page_mlock() to use compound_head().

Since nobody reported this problem before, my suspicion is that a commit
has been backported to 5.15 and 5.10 stable, which does not belong there.
Or perhaps the stable trees are okay, but your own tree has an unsuitable
backport in it?

>
> This bug can lead to multiple reports. Here ares the simplified reports:
>
> ------------[ cut here ]------------
> trying to isolate tail page
> WARNING: CPU: 1 PID: 24489 at mm/vmscan.c:2031 isolate_lru_page+0x574/0x660
>
> page:fffffc000eb7a300 refcount:512 mapcount:0 mapping:0000000000000000 index:0x2008c pfn:0x3ede8c
> head:fffffc000eb78000 order:9 compound_mapcount:0 compound_pincount:0
> memcg:ffff0000d24bc000
> anon flags: 0x37ffff80009080c(uptodate|dirty|arch_1|head|swapbacked|node=1|zone=2|lastcpupid=0xfffff)
> raw: 037ffff800000800 fffffc000eb78001 fffffc000eb7a308 dead000000000400
> raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> head: 037ffff80009080c fffffc000eb70008 fffffc000e350708 ffff0003829eb839
> head: 0000000000020000 0000000000000000 00000200ffffffff ffff0000d24bc000
> page dumped because: VM_WARN_ON_ONCE_PAGE(!memcg && !mem_cgroup_disabled())
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 24489 at include/linux/memcontrol.h:767 lock_page_lruvec_irq+0x148/0x190
>
> page:fffffc000eb7a300 refcount:0 mapcount:0 mapping:dead000000000400 index:0x0 pfn:0x3ede8c
> failed to read mapping contents, not a valid kernel address?
> flags: 0x37ffff800000800(arch_1|node=1|zone=2|lastcpupid=0xfffff)
> raw: 037ffff800000800 dead000000000100 dead000000000122 dead000000000400
> raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> page dumped because: VM_BUG_ON_PAGE(((unsigned int) page_ref_count(page) + 127u <= 127u))
> ------------[ cut here ]------------
> kernel BUG at include/linux/mm.h:1213!
> Call trace:
> lru_cache_add+0x2d4/0x2e8
> putback_lru_page+0x2c/0x168
> clear_page_mlock+0x254/0x318
> page_remove_rmap+0x900/0x9c0
> unmap_page_range+0xa78/0x16a0
> unmap_single_vma+0x114/0x1a0
> unmap_vmas+0x100/0x220
> exit_mmap+0x120/0x410
> mmput+0x174/0x498
> exit_mm+0x33c/0x460
> do_exit+0x3c0/0x1310
> do_group_exit+0x98/0x170
> get_signal+0x370/0x13d0
> do_notify_resume+0x5a0/0x968
> el0_da+0x154/0x188
> el0t_64_sync_handler+0x88/0xb8
> el0t_64_sync+0x1a0/0x1a4
> Code: 912b0021 aa1503e0 910c0021 9401a49c (d4210000)
>
> This bug can be reproduced in both linux-5.10.y & linux-5.15.y and maybe
> fixed after commit 889a3747b3b7 ("mm/lru: Add folio LRU functions").
> This patch turn page into folio for LRU related operations, all
> operations to page is turn to folio which means head page after this
> patch.

No, that commit is not likely to have been a fix for this issue.
If there ever was such an issue in the 5.15 and 5.10 trees, it would
more likely have been fixed by the munlock changes in 5.18, or by the
removal of PageDoubleMap in 6.2.

>
> Fixes: d281ee614518 ("rmap: add argument to charge compound page")

Perhaps, but I think an inappropriate backport is more likely.

> Signed-off-by: Ma Wupeng <[email protected]>
> ---
> mm/rmap.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 330b361a460e..8838f6a9d65d 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1372,7 +1372,7 @@ void page_remove_rmap(struct page *page, bool compound)
> __dec_lruvec_page_state(page, NR_ANON_MAPPED);
>
> if (unlikely(PageMlocked(page)))
> - clear_page_mlock(page);
> + clear_page_mlock(compound_head(page));
>
> if (PageTransCompound(page))
> deferred_split_huge_page(compound_head(page));

And what about the clear_page_mlock() in page_remove_file_rmap()?

Thanks,
Hugh

2023-06-02 08:15:09

by mawupeng

[permalink] [raw]
Subject: Re: [RFC PATCH stable 5.10/5.15] mm: Pass head page to clear_page_mlock for page_remove_rmap



On 2023/6/2 10:04, Hugh Dickins wrote:
> On Mon, 29 May 2023, Wupeng Ma wrote:
>> From: Ma Wupeng <[email protected]>
>>
>> Our syzbot report a mlock related problem. During exit_mm, tail page is
>> passed to clear_page_mlock which final lead to kernel panic.
>>
>> During unmap_page_range, if compound is false, it means this page is
>> seen as a small page. This page is passed to isolate_lru_page if this
>> page is PageMlocked and finally lead to "trying to isolate tail page"
>> warning.
>>
>> Here is the simplified calltrace:
>>
>> unmap_page_range
>> zap_pte_range
>> page_remove_rmap(page, false); // compound is false means to handle
>> to small page not compound page
>> nr_pages = thp_nr_pages(page);
>> clear_page_mlock(page) // maybe tail page here
>> isolate_lru_page
>> WARN_RATELIMIT(PageTail(page), "trying to isolate tail page");
>>
>> Since mlock is not supposed to handle tail, we pass head page to
>> clear_page_mlock() to slove this problem.
>
> Your patch looks plausible for stable, and might even end up as the best
> that can be done; but I think you have not root-caused the problem yet
> (and until it's root-caused, there is likely to be other damage).

This I do agreed. The root cause of this problem is still unknown.

>
> 5.15 and 5.10 were releases with the PageDoubleMap flag, and the intention
> then was that a compound page with PageDoubleMap set could not be Mlocked,
> and PageMlocked had to be cleared when setting PageDoubleMap.
>
> See, for example, the line in the old mlock_vma_page()
> VM_BUG_ON_PAGE(PageCompound(page) && PageDoubleMap(page), page);
> before it did the TestSetPageMlocked().
>
> So it should have been impossible to find PageMlocked on a Tail page
> (even with PageMlocked redirecting to the head page to look up the flag)
> there; so unnecessary for clear_page_mlock() to use compound_head().
>
> Since nobody reported this problem before, my suspicion is that a commit
> has been backported to 5.15 and 5.10 stable, which does not belong there.
> Or perhaps the stable trees are okay, but your own tree has an unsuitable
> backport in it?

We are using the latest 5.10/5.15 for testing, without any our own patches.
The corresponding reproduction c file is attached as follow.

We are tesing it in ARM64 with the following config enabled:

CONFIG_KASAN=y
CONFIG_DEBUG_VM=y
CONFIG_DEBUG_LIST=y

>
>>
>> This bug can lead to multiple reports. Here ares the simplified reports:
>>
>> ------------[ cut here ]------------
>> trying to isolate tail page
>> WARNING: CPU: 1 PID: 24489 at mm/vmscan.c:2031 isolate_lru_page+0x574/0x660
>>
>> page:fffffc000eb7a300 refcount:512 mapcount:0 mapping:0000000000000000 index:0x2008c pfn:0x3ede8c
>> head:fffffc000eb78000 order:9 compound_mapcount:0 compound_pincount:0
>> memcg:ffff0000d24bc000
>> anon flags: 0x37ffff80009080c(uptodate|dirty|arch_1|head|swapbacked|node=1|zone=2|lastcpupid=0xfffff)
>> raw: 037ffff800000800 fffffc000eb78001 fffffc000eb7a308 dead000000000400
>> raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
>> head: 037ffff80009080c fffffc000eb70008 fffffc000e350708 ffff0003829eb839
>> head: 0000000000020000 0000000000000000 00000200ffffffff ffff0000d24bc000
>> page dumped because: VM_WARN_ON_ONCE_PAGE(!memcg && !mem_cgroup_disabled())
>> ------------[ cut here ]------------
>> WARNING: CPU: 1 PID: 24489 at include/linux/memcontrol.h:767 lock_page_lruvec_irq+0x148/0x190
>>
>> page:fffffc000eb7a300 refcount:0 mapcount:0 mapping:dead000000000400 index:0x0 pfn:0x3ede8c
>> failed to read mapping contents, not a valid kernel address?
>> flags: 0x37ffff800000800(arch_1|node=1|zone=2|lastcpupid=0xfffff)
>> raw: 037ffff800000800 dead000000000100 dead000000000122 dead000000000400
>> raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
>> page dumped because: VM_BUG_ON_PAGE(((unsigned int) page_ref_count(page) + 127u <= 127u))
>> ------------[ cut here ]------------
>> kernel BUG at include/linux/mm.h:1213!
>> Call trace:
>> lru_cache_add+0x2d4/0x2e8
>> putback_lru_page+0x2c/0x168
>> clear_page_mlock+0x254/0x318
>> page_remove_rmap+0x900/0x9c0
>> unmap_page_range+0xa78/0x16a0
>> unmap_single_vma+0x114/0x1a0
>> unmap_vmas+0x100/0x220
>> exit_mmap+0x120/0x410
>> mmput+0x174/0x498
>> exit_mm+0x33c/0x460
>> do_exit+0x3c0/0x1310
>> do_group_exit+0x98/0x170
>> get_signal+0x370/0x13d0
>> do_notify_resume+0x5a0/0x968
>> el0_da+0x154/0x188
>> el0t_64_sync_handler+0x88/0xb8
>> el0t_64_sync+0x1a0/0x1a4
>> Code: 912b0021 aa1503e0 910c0021 9401a49c (d4210000)
>>
>> This bug can be reproduced in both linux-5.10.y & linux-5.15.y and maybe
>> fixed after commit 889a3747b3b7 ("mm/lru: Add folio LRU functions").
>> This patch turn page into folio for LRU related operations, all
>> operations to page is turn to folio which means head page after this
>> patch.
>
> No, that commit is not likely to have been a fix for this issue.
> If there ever was such an issue in the 5.15 and 5.10 trees, it would
> more likely have been fixed by the munlock changes in 5.18, or by the
> removal of PageDoubleMap in 6.2.

Sorry, my bad, commit 889a3747b3b7 ("mm/lru: Add folio LRU functions")
only fix one warning, the real fix is commit b109b87050df, ("mm/munlock:
replace clear_page_mlock() by final clearance").

>
>>
>> Fixes: d281ee614518 ("rmap: add argument to charge compound page")
>
> Perhaps, but I think an inappropriate backport is more likely.
>
>> Signed-off-by: Ma Wupeng <[email protected]>
>> ---
>> mm/rmap.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/rmap.c b/mm/rmap.c
>> index 330b361a460e..8838f6a9d65d 100644
>> --- a/mm/rmap.c
>> +++ b/mm/rmap.c
>> @@ -1372,7 +1372,7 @@ void page_remove_rmap(struct page *page, bool compound)
>> __dec_lruvec_page_state(page, NR_ANON_MAPPED);
>>
>> if (unlikely(PageMlocked(page)))
>> - clear_page_mlock(page);
>> + clear_page_mlock(compound_head(page));
>>
>> if (PageTransCompound(page))
>> deferred_split_huge_page(compound_head(page));
>
> And what about the clear_page_mlock() in page_remove_file_rmap()?

According to the same logic, this should be fixed too.

Thanks for your reply,
Wupeng

>
> Thanks,
> Hugh


Attachments:
isolate_lru_page.c (6.13 kB)