2023-01-14 00:35:23

by Vishal Moola

[permalink] [raw]
Subject: [PATCH 2/2] mm/khugepaged: Convert release_pte_pages() to use folios

Converts release_pte_pages() to use folios instead of pages.

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
mm/khugepaged.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 4888e8688401..27d010431ece 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -509,20 +509,20 @@ static void release_pte_page(struct page *page)
static void release_pte_pages(pte_t *pte, pte_t *_pte,
struct list_head *compound_pagelist)
{
- struct page *page, *tmp;
+ struct folio *folio, *tmp;

while (--_pte >= pte) {
pte_t pteval = *_pte;

- page = pte_page(pteval);
+ folio = pfn_folio(pte_pfn(pteval));
if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
- !PageCompound(page))
- release_pte_page(page);
+ !folio_test_large(folio))
+ release_pte_folio(folio);
}

- list_for_each_entry_safe(page, tmp, compound_pagelist, lru) {
- list_del(&page->lru);
- release_pte_page(page);
+ list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
+ list_del(&folio->lru);
+ release_pte_folio(folio);
}
}

--
2.38.1


2023-02-13 08:53:20

by Marek Szyprowski

[permalink] [raw]
Subject: Re: [PATCH 2/2] mm/khugepaged: Convert release_pte_pages() to use folios

Hi,

On 14.01.2023 01:15, Vishal Moola (Oracle) wrote:
> Converts release_pte_pages() to use folios instead of pages.
>
> Signed-off-by: Vishal Moola (Oracle) <[email protected]>

This patch has been merged some time ago to linux-next as commit
9bdfeea46f49 ("mm/khugepaged: convert release_pte_pages() to use
folios"). It took me a while to bisect this (mainly because I was busy
with other things), but I finally found that this change is responsible
for the following kernel panic:

Unable to handle kernel paging request at virtual address fffffc0000000008
Mem abort info:
  ESR = 0x0000000096000006
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x06: level 2 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000006
  CM = 0, WnR = 0
swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000021efa000
[fffffc0000000008] pgd=10000000df05a003, p4d=10000000df05a003,
pud=10000000df059003, pmd=0000000000000000
Internal error: Oops: 0000000096000006 [#1] PREEMPT SMP
Modules linked in: ip_tables x_tables ipv6
CPU: 7 PID: 61 Comm: khugepaged Not tainted 6.2.0-rc4+ #13307
Hardware name: Samsung TM2E board (DT)
pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : hpage_collapse_scan_pmd+0x12ec/0x1a20
lr : hpage_collapse_scan_pmd+0x14b0/0x1a20
sp : ffff80000be13c20
x29: ffff80000be13c20 x28: 0000000000000001 x27: fffffc0000d3f5c0
x26: fffffc0000d3f600 x25: 00000000000001f9 x24: 0000000000000007
x23: ffff0000296f9dd8 x22: ffff800009e5b490 x21: 0000000000000000
x20: 000000000000000f x19: ffff80000a9d0000 x18: ffff80000af52e58
x17: 0000000000000028 x16: 0000000000009249 x15: ffff80000af971f8
x14: 0000000000000000 x13: 00000000000443a0 x12: 0000000000040000
x11: 000000000fffffff x10: ffff000024928880 x9 : ffff80000b5c6e98
x8 : ffff000024928000 x7 : 00000000b35d04b9 x6 : 0000000000000000
x5 : fffffc0000000000 x4 : ffff8000cbf2e000 x3 : 0000000000000000
x2 : 0000000000000000 x1 : 0000000000000000 x0 : fffffc0000000000
Call trace:
 hpage_collapse_scan_pmd+0x12ec/0x1a20
 khugepaged+0x7e0/0x8dc
 kthread+0x118/0x11c
 ret_from_fork+0x10/0x20
Code: d34cbc43 cb813061 d37ae421 8b050020 (f9400404)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops: Fatal exception
SMP: stopping secondary CPUs
Kernel Offset: disabled
CPU features: 0x8c000,41c78100,0000421b
Memory Limit: none
---[ end Kernel panic - not syncing: Oops: Fatal exception ]---


Reverting it on top of recent linux-next fixes the issue, so it looks
that some kind of a corner case is missing in this patch. I can
reproduce it usually during the system shutdown, 1 of 20 times on the
average.


> ---
> mm/khugepaged.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 4888e8688401..27d010431ece 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -509,20 +509,20 @@ static void release_pte_page(struct page *page)
> static void release_pte_pages(pte_t *pte, pte_t *_pte,
> struct list_head *compound_pagelist)
> {
> - struct page *page, *tmp;
> + struct folio *folio, *tmp;
>
> while (--_pte >= pte) {
> pte_t pteval = *_pte;
>
> - page = pte_page(pteval);
> + folio = pfn_folio(pte_pfn(pteval));
> if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
> - !PageCompound(page))
> - release_pte_page(page);
> + !folio_test_large(folio))
> + release_pte_folio(folio);
> }
>
> - list_for_each_entry_safe(page, tmp, compound_pagelist, lru) {
> - list_del(&page->lru);
> - release_pte_page(page);
> + list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
> + list_del(&folio->lru);
> + release_pte_folio(folio);
> }
> }
>

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland


2023-02-13 15:28:27

by Alexandre Ghiti

[permalink] [raw]
Subject: Re: [PATCH 2/2] mm/khugepaged: Convert release_pte_pages() to use folios

Hi,

On 2/13/23 09:53, Marek Szyprowski wrote:
> Hi,
>
> On 14.01.2023 01:15, Vishal Moola (Oracle) wrote:
>> Converts release_pte_pages() to use folios instead of pages.
>>
>> Signed-off-by: Vishal Moola (Oracle) <[email protected]>
> This patch has been merged some time ago to linux-next as commit
> 9bdfeea46f49 ("mm/khugepaged: convert release_pte_pages() to use
> folios"). It took me a while to bisect this (mainly because I was busy
> with other things), but I finally found that this change is responsible
> for the following kernel panic:
>
> Unable to handle kernel paging request at virtual address fffffc0000000008
> Mem abort info:
>   ESR = 0x0000000096000006
>   EC = 0x25: DABT (current EL), IL = 32 bits
>   SET = 0, FnV = 0
>   EA = 0, S1PTW = 0
>   FSC = 0x06: level 2 translation fault
> Data abort info:
>   ISV = 0, ISS = 0x00000006
>   CM = 0, WnR = 0
> swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000021efa000
> [fffffc0000000008] pgd=10000000df05a003, p4d=10000000df05a003,
> pud=10000000df059003, pmd=0000000000000000
> Internal error: Oops: 0000000096000006 [#1] PREEMPT SMP
> Modules linked in: ip_tables x_tables ipv6
> CPU: 7 PID: 61 Comm: khugepaged Not tainted 6.2.0-rc4+ #13307
> Hardware name: Samsung TM2E board (DT)
> pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : hpage_collapse_scan_pmd+0x12ec/0x1a20
> lr : hpage_collapse_scan_pmd+0x14b0/0x1a20
> sp : ffff80000be13c20
> x29: ffff80000be13c20 x28: 0000000000000001 x27: fffffc0000d3f5c0
> x26: fffffc0000d3f600 x25: 00000000000001f9 x24: 0000000000000007
> x23: ffff0000296f9dd8 x22: ffff800009e5b490 x21: 0000000000000000
> x20: 000000000000000f x19: ffff80000a9d0000 x18: ffff80000af52e58
> x17: 0000000000000028 x16: 0000000000009249 x15: ffff80000af971f8
> x14: 0000000000000000 x13: 00000000000443a0 x12: 0000000000040000
> x11: 000000000fffffff x10: ffff000024928880 x9 : ffff80000b5c6e98
> x8 : ffff000024928000 x7 : 00000000b35d04b9 x6 : 0000000000000000
> x5 : fffffc0000000000 x4 : ffff8000cbf2e000 x3 : 0000000000000000
> x2 : 0000000000000000 x1 : 0000000000000000 x0 : fffffc0000000000
> Call trace:
>  hpage_collapse_scan_pmd+0x12ec/0x1a20
>  khugepaged+0x7e0/0x8dc
>  kthread+0x118/0x11c
>  ret_from_fork+0x10/0x20
> Code: d34cbc43 cb813061 d37ae421 8b050020 (f9400404)
> ---[ end trace 0000000000000000 ]---
> Kernel panic - not syncing: Oops: Fatal exception
> SMP: stopping secondary CPUs
> Kernel Offset: disabled
> CPU features: 0x8c000,41c78100,0000421b
> Memory Limit: none
> ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---
>
>
> Reverting it on top of recent linux-next fixes the issue, so it looks
> that some kind of a corner case is missing in this patch. I can
> reproduce it usually during the system shutdown, 1 of 20 times on the
> average.


I have debugging this issue since this morning too!


>
>> ---
>> mm/khugepaged.c | 14 +++++++-------
>> 1 file changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index 4888e8688401..27d010431ece 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -509,20 +509,20 @@ static void release_pte_page(struct page *page)
>> static void release_pte_pages(pte_t *pte, pte_t *_pte,
>> struct list_head *compound_pagelist)
>> {
>> - struct page *page, *tmp;
>> + struct folio *folio, *tmp;
>>
>> while (--_pte >= pte) {
>> pte_t pteval = *_pte;
>>
>> - page = pte_page(pteval);
>> + folio = pfn_folio(pte_pfn(pteval));


The issue lies here: before using pteval in pfn_folio(), we should test
it. The following patch fixes the issue for me:

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index eb38bd1b1b2f..fef3414b481b 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -514,10 +514,12 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
        while (--_pte >= pte) {
                pte_t pteval = *_pte;

-               folio = pfn_folio(pte_pfn(pteval));
-               if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
-                               !folio_test_large(folio))
-                       release_pte_folio(folio);
+               if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval))) {
+                       folio = pfn_folio(pte_pfn(pteval));
+
+                       if (!folio_test_large(folio))
+                               release_pte_folio(folio);
+               }
        }

        list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {


@Marek: could you give it a try?

I can send a separate patch if needed, let me know.

Thanks,

Alex


>> if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
>> - !PageCompound(page))
>> - release_pte_page(page);
>> + !folio_test_large(folio))
>> + release_pte_folio(folio);
>> }
>>
>> - list_for_each_entry_safe(page, tmp, compound_pagelist, lru) {
>> - list_del(&page->lru);
>> - release_pte_page(page);
>> + list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
>> + list_del(&folio->lru);
>> + release_pte_folio(folio);
>> }
>> }
>>
> Best regards

2023-02-13 15:50:17

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 2/2] mm/khugepaged: Convert release_pte_pages() to use folios

On Mon, Feb 13, 2023 at 04:28:05PM +0100, Alexandre Ghiti wrote:
> The issue lies here: before using pteval in pfn_folio(), we should test it.
> The following patch fixes the issue for me:

Thanks for debugging it. I'd rather see this written as ...

pte_t pteval = *_pte;
+ unsigned long pfn;

+ if (pte_none(pteval))
+ continue;
+ pfn = pte_pfn(pteval);
+ if (is_zero_pfn(pfn))
+ continue;
+ folio = pfn_folio(pfn);
+ if (folio_test_large(folio))
+ continue;
release_pte_folio(folio);

makes sense?

> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index eb38bd1b1b2f..fef3414b481b 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -514,10 +514,12 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
> ??????? while (--_pte >= pte) {
> ??????????????? pte_t pteval = *_pte;
>
> -?????????????? folio = pfn_folio(pte_pfn(pteval));
> -?????????????? if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
> -?????????????????????????????? !folio_test_large(folio))
> -?????????????????????? release_pte_folio(folio);
> +?????????????? if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval))) {
> +?????????????????????? folio = pfn_folio(pte_pfn(pteval));
> +
> +?????????????????????? if (!folio_test_large(folio))
> +?????????????????????????????? release_pte_folio(folio);
> +?????????????? }
> ??????? }
>
> ??????? list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
>
>
> @Marek: could you give it a try?
>
> I can send a separate patch if needed, let me know.
>
> Thanks,
>
> Alex
>
>
> > > if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
> > > - !PageCompound(page))
> > > - release_pte_page(page);
> > > + !folio_test_large(folio))
> > > + release_pte_folio(folio);
> > > }
> > > - list_for_each_entry_safe(page, tmp, compound_pagelist, lru) {
> > > - list_del(&page->lru);
> > > - release_pte_page(page);
> > > + list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
> > > + list_del(&folio->lru);
> > > + release_pte_folio(folio);
> > > }
> > > }
> > Best regards
>

2023-02-13 15:55:38

by Alexandre Ghiti

[permalink] [raw]
Subject: Re: [PATCH 2/2] mm/khugepaged: Convert release_pte_pages() to use folios

Hi Matthew,

On 2/13/23 16:50, Matthew Wilcox wrote:
> On Mon, Feb 13, 2023 at 04:28:05PM +0100, Alexandre Ghiti wrote:
>> The issue lies here: before using pteval in pfn_folio(), we should test it.
>> The following patch fixes the issue for me:
> Thanks for debugging it. I'd rather see this written as ...
>
> pte_t pteval = *_pte;
> + unsigned long pfn;
>
> + if (pte_none(pteval))
> + continue;
> + pfn = pte_pfn(pteval);
> + if (is_zero_pfn(pfn))
> + continue;
> + folio = pfn_folio(pfn);
> + if (folio_test_large(folio))
> + continue;
> release_pte_folio(folio);
>
> makes sense?


Sure, that's fine by me, I can send that or I'll add my tested-by on
what you send, whatever suits you.

Alex


>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index eb38bd1b1b2f..fef3414b481b 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -514,10 +514,12 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
>>         while (--_pte >= pte) {
>>                 pte_t pteval = *_pte;
>>
>> -               folio = pfn_folio(pte_pfn(pteval));
>> -               if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
>> -                               !folio_test_large(folio))
>> -                       release_pte_folio(folio);
>> +               if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval))) {
>> +                       folio = pfn_folio(pte_pfn(pteval));
>> +
>> +                       if (!folio_test_large(folio))
>> +                               release_pte_folio(folio);
>> +               }
>>         }
>>
>>         list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
>>
>>
>> @Marek: could you give it a try?
>>
>> I can send a separate patch if needed, let me know.
>>
>> Thanks,
>>
>> Alex
>>
>>
>>>> if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
>>>> - !PageCompound(page))
>>>> - release_pte_page(page);
>>>> + !folio_test_large(folio))
>>>> + release_pte_folio(folio);
>>>> }
>>>> - list_for_each_entry_safe(page, tmp, compound_pagelist, lru) {
>>>> - list_del(&page->lru);
>>>> - release_pte_page(page);
>>>> + list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
>>>> + list_del(&folio->lru);
>>>> + release_pte_folio(folio);
>>>> }
>>>> }
>>> Best regards

2023-02-13 20:39:12

by Vishal Moola

[permalink] [raw]
Subject: Re: [PATCH 2/2] mm/khugepaged: Convert release_pte_pages() to use folios

On Mon, Feb 13, 2023 at 7:55 AM Alexandre Ghiti <[email protected]> wrote:
>
> Hi Matthew,
>
> On 2/13/23 16:50, Matthew Wilcox wrote:
> > On Mon, Feb 13, 2023 at 04:28:05PM +0100, Alexandre Ghiti wrote:
> >> The issue lies here: before using pteval in pfn_folio(), we should test it.
> >> The following patch fixes the issue for me:
> > Thanks for debugging it. I'd rather see this written as ...
> >
> > pte_t pteval = *_pte;
> > + unsigned long pfn;
> >
> > + if (pte_none(pteval))
> > + continue;
> > + pfn = pte_pfn(pteval);
> > + if (is_zero_pfn(pfn))
> > + continue;
> > + folio = pfn_folio(pfn);
> > + if (folio_test_large(folio))
> > + continue;
> > release_pte_folio(folio);
> >
> > makes sense?
>
>
> Sure, that's fine by me, I can send that or I'll add my tested-by on
> what you send, whatever suits you.

Thanks for debugging this! I'll send a fix patch using Matthew's
approach later today.

> Alex
>
>
> >
> >> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> >> index eb38bd1b1b2f..fef3414b481b 100644
> >> --- a/mm/khugepaged.c
> >> +++ b/mm/khugepaged.c
> >> @@ -514,10 +514,12 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
> >> while (--_pte >= pte) {
> >> pte_t pteval = *_pte;
> >>
> >> - folio = pfn_folio(pte_pfn(pteval));
> >> - if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
> >> - !folio_test_large(folio))
> >> - release_pte_folio(folio);
> >> + if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval))) {
> >> + folio = pfn_folio(pte_pfn(pteval));
> >> +
> >> + if (!folio_test_large(folio))
> >> + release_pte_folio(folio);
> >> + }
> >> }
> >>
> >> list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
> >>
> >>
> >> @Marek: could you give it a try?
> >>
> >> I can send a separate patch if needed, let me know.
> >>
> >> Thanks,
> >>
> >> Alex
> >>
> >>
> >>>> if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
> >>>> - !PageCompound(page))
> >>>> - release_pte_page(page);
> >>>> + !folio_test_large(folio))
> >>>> + release_pte_folio(folio);
> >>>> }
> >>>> - list_for_each_entry_safe(page, tmp, compound_pagelist, lru) {
> >>>> - list_del(&page->lru);
> >>>> - release_pte_page(page);
> >>>> + list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
> >>>> + list_del(&folio->lru);
> >>>> + release_pte_folio(folio);
> >>>> }
> >>>> }
> >>> Best regards