2022-08-12 08:52:17

by Haiyue Wang

Subject: [PATCH v1] mm: migration: fix the FOLL_GET failure on following huge page

Not all huge page APIs support the FOLL_GET option, so the __NR_move_pages
syscall will fail to get the page node information for huge pages.

This is a temporary solution to mitigate the side effect of the racing fix.

After following huge pages with FOLL_GET is supported, this fix can be
reverted safely.

Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
Signed-off-by: Haiyue Wang <[email protected]>
---
mm/migrate.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 6a1597c92261..581dfaad9257 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1848,6 +1848,7 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,

for (i = 0; i < nr_pages; i++) {
unsigned long addr = (unsigned long)(*pages);
+ unsigned int foll_flags = FOLL_DUMP;
struct vm_area_struct *vma;
struct page *page;
int err = -EFAULT;
@@ -1856,8 +1857,12 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
if (!vma)
goto set_status;

+ /* Not all huge page follow APIs support 'FOLL_GET' */
+ if (!is_vm_hugetlb_page(vma))
+ foll_flags |= FOLL_GET;
+
/* FOLL_DUMP to ignore special (like zero) pages */
- page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
+ page = follow_page(vma, addr, foll_flags);

err = PTR_ERR(page);
if (IS_ERR(page))
@@ -1865,7 +1870,8 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,

if (page && !is_zone_device_page(page)) {
err = page_to_nid(page);
- put_page(page);
+ if (foll_flags & FOLL_GET)
+ put_page(page);
} else {
err = -ENOENT;
}
--
2.37.1


2022-08-14 00:20:11

by Andrew Morton

Subject: Re: [PATCH v1] mm: migration: fix the FOLL_GET failure on following huge page

On Fri, 12 Aug 2022 16:49:21 +0800 Haiyue Wang <[email protected]> wrote:

> Not all huge page APIs support the FOLL_GET option, so the __NR_move_pages
> syscall will fail to get the page node information for huge pages.

Which ones need fixing?

What are the user-visible runtime effects of this bug?

Is a -stable backport warranted?

> This is a temporary solution to mitigate the side effect of the racing fix.
>
> After following huge pages with FOLL_GET is supported, this fix can be
> reverted safely.
>

2022-08-14 06:41:49

by Haiyue Wang

Subject: RE: [PATCH v1] mm: migration: fix the FOLL_GET failure on following huge page

> -----Original Message-----
> From: Andrew Morton <[email protected]>
> Sent: Sunday, August 14, 2022 07:29
> To: Wang, Haiyue <[email protected]>
> Cc: [email protected]; [email protected]; [email protected]; [email protected]; Huang,
> Ying <[email protected]>; [email protected]; [email protected]
> Subject: Re: [PATCH v1] mm: migration: fix the FOLL_GET failure on following huge page
>
> On Fri, 12 Aug 2022 16:49:21 +0800 Haiyue Wang <[email protected]> wrote:
>
> > Not all huge page APIs support the FOLL_GET option, so the __NR_move_pages
> > syscall will fail to get the page node information for huge pages.
>
> Which ones need fixing?

1. 'follow_huge_pud' in arch/s390/mm/hugetlbpage.c

2. 'follow_huge_addr' in arch/ia64/mm/hugetlbpage.c

3. 'follow_huge_pgd' in mm/hugetlb.c

And I found that only the 'pud' and 'pmd' lookup paths check 'is_vm_hugetlb_page', like:
    pud_huge(*pud) && is_vm_hugetlb_page(vma)
    pmd_huge(pmdval) && is_vm_hugetlb_page(vma)

So I'm not sure whether my patch can cover 2 & 3 for huge page use cases
other than hugetlbfs.
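
For reference, the 5.19-era lookup in mm/gup.c gates these huge page branches
roughly as below (a paraphrased sketch of follow_pud_mask() from memory, not
a verbatim quote; follow_pmd_mask() has the same structure around
follow_huge_pmd()):

	/* Only hugetlbfs VMAs take the follow_huge_pud() path, and the
	 * generic follow_huge_pud() returns NULL when FOLL_GET is set,
	 * which is exactly the failure described here. */
	if (pud_huge(*pud) && is_vm_hugetlb_page(vma)) {
		page = follow_huge_pud(mm, address, pud, flags);
		if (page)
			return page;
		return no_page_table(vma, flags);
	}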

>
> What are the user-visible runtime effects of this bug?
>

In my test, the '__NR_move_pages' system call returns '-2' (-ENOENT) for a
1GB huge page memory map when dumping the page node information. [Tested on
linux-5.19 stable]
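
For reference, a minimal sketch of such a test (assuming a 1GB hugetlb page
is reserved, e.g. booting with "hugepagesz=1G hugepages=1", and libnuma for
the move_pages() wrapper; build with 'gcc repro.c -lnuma'):

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <numaif.h>

#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26
#endif
#define MAP_HUGE_1GB (30 << MAP_HUGE_SHIFT)

int main(void)
{
	size_t len = 1UL << 30;
	void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_1GB,
			  -1, 0);
	int status = -1;

	if (addr == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	*(volatile char *)addr = 1;	/* fault the 1GB page in */

	/* nodes == NULL means query mode: the node id (or a negative
	 * errno) for each page is written into the status array */
	if (move_pages(0, 1, &addr, NULL, &status, 0))
		perror("move_pages");

	printf("status = %d\n", status);	/* prints -2 (-ENOENT) on 5.19 */

	munmap(addr, len);
	return 0;
}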

> Is a -stable backport warranted?

Yes.

Since mainline has introduced the new patch:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3218f8712d6bb

The backport needs a rebase for 5.19, like:

- if (page && !is_zone_device_page(page)) {
+ if (page) {

>
> > This is a temporary solution to mitigate the side effect of the racing fix.
> >
> > After following huge pages with FOLL_GET is supported, this fix can be
> > reverted safely.
> >

2022-08-14 07:19:57

by Haiyue Wang

Subject: RE: [PATCH v1] mm: migration: fix the FOLL_GET failure on following huge page

> -----Original Message-----
> From: Wang, Haiyue
> Sent: Sunday, August 14, 2022 14:20
> To: Andrew Morton <[email protected]>
> Cc: [email protected]; [email protected]; [email protected]; [email protected]; Huang,
> Ying <[email protected]>; [email protected]; [email protected]
> Subject: RE: [PATCH v1] mm: migration: fix the FOLL_GET failure on following huge page
>
> > -----Original Message-----
> > From: Andrew Morton <[email protected]>
> > Sent: Sunday, August 14, 2022 07:29
> > To: Wang, Haiyue <[email protected]>
> > Cc: [email protected]; [email protected]; [email protected]; [email protected]; Huang,
> > Ying <[email protected]>; [email protected]; [email protected]
> > Subject: Re: [PATCH v1] mm: migration: fix the FOLL_GET failure on following huge page
> >
> > On Fri, 12 Aug 2022 16:49:21 +0800 Haiyue Wang <[email protected]> wrote:
> >
> > > Not all huge page APIs support the FOLL_GET option, so the __NR_move_pages
> > > syscall will fail to get the page node information for huge pages.
> >


> > Is a -stable backport warranted?
>
> Yes.
>
> Since mainline has introduced the new patch:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3218f8712d6bb
>

Looks like the 'is_zone_device_page' check will cause a 'FOLL_GET' page
reference count issue: 'put_page' should be called before the branch jump.

try_grab_page -->
	if (flags & FOLL_GET)
		folio_ref_inc(folio);

Or have I misunderstood 'is_zone_device_page'? ;-)
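
In other words, with FOLL_GET a successful follow_page() hands back the page
with an extra reference, so every early-exit path after the lookup has to
drop it first. A minimal sketch of the pattern in question (not the final
patch):

	page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
	if (IS_ERR_OR_NULL(page))
		goto out;		/* no reference was taken */

	if (is_zone_device_page(page)) {
		put_page(page);		/* drop the FOLL_GET reference */
		goto out;
	}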

> The backport needs a rebase for 5.19, like:
>
> - if (page && !is_zone_device_page(page)) {
> + if (page) {
>
> >
> > >

2022-08-14 14:06:27

by Haiyue Wang

Subject: [PATCH v2 0/3] fix follow_page related issues

v2: Add the Non-LRU pages fix with two patches, so that
'mm: migration: fix the FOLL_GET' can be applied directly
on the linux-5.19 stable branch.

Haiyue Wang (3):
mm: revert handling Non-LRU pages returned by follow_page
mm: migration: fix the FOLL_GET failure on following huge page
mm: handling Non-LRU pages returned by follow_page

mm/huge_memory.c | 4 ++--
mm/ksm.c | 16 +++++++++++++---
mm/migrate.c | 20 +++++++++++++++-----
3 files changed, 30 insertions(+), 10 deletions(-)

--
2.37.2

2022-08-14 14:06:43

by Haiyue Wang

Subject: [PATCH v2 1/3] mm: revert handling Non-LRU pages returned by follow_page

Commit
3218f8712d6b ("mm: handling Non-LRU pages returned by vm_normal_pages")
doesn't handle follow_page() with the FOLL_GET flag correctly: FOLL_GET
takes a reference on the page with get_page(), so callers shouldn't just
return directly without put_page().

So revert the related fix to prepare for a clean patch to handle Non-LRU
pages returned by follow_page().

Signed-off-by: Haiyue Wang <[email protected]>
---
mm/huge_memory.c | 2 +-
mm/ksm.c | 6 +++---
mm/migrate.c | 4 ++--
3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8a7c1b344abe..2ee6d38a1426 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2963,7 +2963,7 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
/* FOLL_DUMP to ignore special (like zero) pages */
page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);

- if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
+ if (IS_ERR_OR_NULL(page))
continue;

if (!is_transparent_hugepage(page))
diff --git a/mm/ksm.c b/mm/ksm.c
index 42ab153335a2..fe3e0a39f73a 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -475,7 +475,7 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
cond_resched();
page = follow_page(vma, addr,
FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
- if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
+ if (IS_ERR_OR_NULL(page))
break;
if (PageKsm(page))
ret = handle_mm_fault(vma, addr,
@@ -560,7 +560,7 @@ static struct page *get_mergeable_page(struct rmap_item *rmap_item)
goto out;

page = follow_page(vma, addr, FOLL_GET);
- if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
+ if (IS_ERR_OR_NULL(page))
goto out;
if (PageAnon(page)) {
flush_anon_page(vma, page, addr);
@@ -2308,7 +2308,7 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
if (ksm_test_exit(mm))
break;
*page = follow_page(vma, ksm_scan.address, FOLL_GET);
- if (IS_ERR_OR_NULL(*page) || is_zone_device_page(*page)) {
+ if (IS_ERR_OR_NULL(*page)) {
ksm_scan.address += PAGE_SIZE;
cond_resched();
continue;
diff --git a/mm/migrate.c b/mm/migrate.c
index 6a1597c92261..3d5f0262ab60 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1672,7 +1672,7 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr,
goto out;

err = -ENOENT;
- if (!page || is_zone_device_page(page))
+ if (!page)
goto out;

err = 0;
@@ -1863,7 +1863,7 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
if (IS_ERR(page))
goto set_status;

- if (page && !is_zone_device_page(page)) {
+ if (page) {
err = page_to_nid(page);
put_page(page);
} else {
--
2.37.2

2022-08-14 14:15:14

by Haiyue Wang

Subject: [PATCH v2 2/3] mm: migration: fix the FOLL_GET failure on following huge page

Not all huge page APIs support the FOLL_GET option, so the __NR_move_pages
syscall will fail to get the page node information for huge pages.

This is a temporary solution to mitigate the side effect of the racing fix.

After following huge pages with FOLL_GET is supported, this fix can be
reverted safely.

Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
Signed-off-by: Haiyue Wang <[email protected]>
---
mm/migrate.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 3d5f0262ab60..5d304de3950b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1848,6 +1848,7 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,

for (i = 0; i < nr_pages; i++) {
unsigned long addr = (unsigned long)(*pages);
+ unsigned int foll_flags = FOLL_DUMP;
struct vm_area_struct *vma;
struct page *page;
int err = -EFAULT;
@@ -1856,8 +1857,12 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
if (!vma)
goto set_status;

+ /* Not all huge page follow APIs support 'FOLL_GET' */
+ if (!is_vm_hugetlb_page(vma))
+ foll_flags |= FOLL_GET;
+
/* FOLL_DUMP to ignore special (like zero) pages */
- page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
+ page = follow_page(vma, addr, foll_flags);

err = PTR_ERR(page);
if (IS_ERR(page))
@@ -1865,7 +1870,8 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,

if (page) {
err = page_to_nid(page);
- put_page(page);
+ if (foll_flags & FOLL_GET)
+ put_page(page);
} else {
err = -ENOENT;
}
--
2.37.2

2022-08-14 14:46:25

by Haiyue Wang

Subject: [PATCH v2 3/3] mm: handling Non-LRU pages returned by follow_page

Add the missing put_page() handling for Non-LRU pages returned by
follow_page() with the FOLL_GET flag set.

This is the second patch fixing commit
3218f8712d6b ("mm: handling Non-LRU pages returned by vm_normal_pages")

Signed-off-by: Haiyue Wang <[email protected]>
---
mm/huge_memory.c | 2 +-
mm/ksm.c | 10 ++++++++++
mm/migrate.c | 6 +++++-
3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2ee6d38a1426..b2ba17c3dcd7 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2966,7 +2966,7 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
if (IS_ERR_OR_NULL(page))
continue;

- if (!is_transparent_hugepage(page))
+ if (is_zone_device_page(page) || !is_transparent_hugepage(page))
goto next;

total++;
diff --git a/mm/ksm.c b/mm/ksm.c
index fe3e0a39f73a..1360bb52ada6 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -477,6 +477,10 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
if (IS_ERR_OR_NULL(page))
break;
+ if (is_zone_device_page(page)) {
+ put_page(page);
+ break;
+ }
if (PageKsm(page))
ret = handle_mm_fault(vma, addr,
FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE,
@@ -562,10 +566,13 @@ static struct page *get_mergeable_page(struct rmap_item *rmap_item)
page = follow_page(vma, addr, FOLL_GET);
if (IS_ERR_OR_NULL(page))
goto out;
+ if (is_zone_device_page(page))
+ goto out_putpage;
if (PageAnon(page)) {
flush_anon_page(vma, page, addr);
flush_dcache_page(page);
} else {
+out_putpage:
put_page(page);
out:
page = NULL;
@@ -2313,6 +2320,8 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
cond_resched();
continue;
}
+ if (is_zone_device_page(*page))
+ goto next_page;
if (PageAnon(*page)) {
flush_anon_page(vma, *page, ksm_scan.address);
flush_dcache_page(*page);
@@ -2327,6 +2336,7 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
mmap_read_unlock(mm);
return rmap_item;
}
+next_page:
put_page(*page);
ksm_scan.address += PAGE_SIZE;
cond_resched();
diff --git a/mm/migrate.c b/mm/migrate.c
index 5d304de3950b..fee12cd2f294 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1675,6 +1675,9 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr,
if (!page)
goto out;

+ if (is_zone_device_page(page))
+ goto out_putpage;
+
err = 0;
if (page_to_nid(page) == node)
goto out_putpage;
@@ -1869,7 +1872,8 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
goto set_status;

if (page) {
- err = page_to_nid(page);
+ err = !is_zone_device_page(page) ? page_to_nid(page)
+ : -ENOENT;
if (foll_flags & FOLL_GET)
put_page(page);
} else {
--
2.37.2

2022-08-14 17:00:53

by David Hildenbrand

Subject: Re: [PATCH v2 1/3] mm: revert handling Non-LRU pages returned by follow_page

On 14.08.22 16:05, Haiyue Wang wrote:
> Commit
> 3218f8712d6b ("mm: handling Non-LRU pages returned by vm_normal_pages")
> doesn't handle follow_page() with the FOLL_GET flag correctly: FOLL_GET
> takes a reference on the page with get_page(), so callers shouldn't just
> return directly without put_page().
>
> So revert the related fix to prepare for a clean patch to handle Non-LRU
> pages returned by follow_page().

What? Why?

Just fix it.

--
Thanks,

David / dhildenb

2022-08-14 17:11:49

by David Hildenbrand

Subject: Re: [PATCH v2 3/3] mm: handling Non-LRU pages returned by follow_page

On 14.08.22 16:05, Haiyue Wang wrote:
> Add the missing put_page() handling for Non-LRU pages returned by
> follow_page() with the FOLL_GET flag set.
>
> This is the second patch fixing commit
> 3218f8712d6b ("mm: handling Non-LRU pages returned by vm_normal_pages")
>
> Signed-off-by: Haiyue Wang <[email protected]>
> ---
> mm/huge_memory.c | 2 +-
> mm/ksm.c | 10 ++++++++++
> mm/migrate.c | 6 +++++-
> 3 files changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 2ee6d38a1426..b2ba17c3dcd7 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2966,7 +2966,7 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
> if (IS_ERR_OR_NULL(page))
> continue;
>
> - if (!is_transparent_hugepage(page))
> + if (is_zone_device_page(page) || !is_transparent_hugepage(page))
> goto next;
>
> total++;
> diff --git a/mm/ksm.c b/mm/ksm.c
> index fe3e0a39f73a..1360bb52ada6 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -477,6 +477,10 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
> FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
> if (IS_ERR_OR_NULL(page))
> break;
> + if (is_zone_device_page(page)) {
> + put_page(page);
> + break;
> + }

I think we can drop this check completely. While working on patches that
touch this code I realized that this check is completely useless: device
pages are never PageKsm pages, so there is no need to special-case them here.

If a zone device page could be PageKsm, we wouldn't handle it correctly
here and would fail to break ksm.

So just drop it.



--
Thanks,

David / dhildenb

2022-08-15 01:30:14

by Haiyue Wang

Subject: RE: [PATCH v2 3/3] mm: handling Non-LRU pages returned by follow_page

> -----Original Message-----
> From: David Hildenbrand <[email protected]>
> Sent: Monday, August 15, 2022 00:34
> To: Wang, Haiyue <[email protected]>; [email protected]; [email protected]
> Cc: [email protected]; [email protected]; Huang, Ying <[email protected]>;
> [email protected]; [email protected]; [email protected]
> Subject: Re: [PATCH v2 3/3] mm: handling Non-LRU pages returned by follow_page
>
> On 14.08.22 16:05, Haiyue Wang wrote:
> > Add the missing put_page() handling for Non-LRU pages returned by
> > follow_page() with the FOLL_GET flag set.
> >
> > This is the second patch fixing commit
> > 3218f8712d6b ("mm: handling Non-LRU pages returned by vm_normal_pages")
> >
> > Signed-off-by: Haiyue Wang <[email protected]>
> > ---
> > mm/huge_memory.c | 2 +-
> > mm/ksm.c | 10 ++++++++++
> > mm/migrate.c | 6 +++++-
> > 3 files changed, 16 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/ksm.c b/mm/ksm.c
> > index fe3e0a39f73a..1360bb52ada6 100644
> > --- a/mm/ksm.c
> > +++ b/mm/ksm.c
> > @@ -477,6 +477,10 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
> > FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
> > if (IS_ERR_OR_NULL(page))
> > break;
> > + if (is_zone_device_page(page)) {
> > + put_page(page);
> > + break;
> > + }
>
> I think we can drop this check completely. While working on patches that
> touch this code I realized that this check is completely useless: device
> pages are never PageKsm pages, so there is no need to special-case them here.
>
> If a zone device page could be PageKsm, we wouldn't handle it correctly
> here and would fail to break ksm.
>
> So just drop it.
>

Fixed in v3.

>
>
> --
> Thanks,
>
> David / dhildenb

2022-08-15 01:30:51

by Haiyue Wang

Subject: RE: [PATCH v2 1/3] mm: revert handling Non-LRU pages returned by follow_page

> -----Original Message-----
> From: David Hildenbrand <[email protected]>
> Sent: Monday, August 15, 2022 00:31
> To: Wang, Haiyue <[email protected]>; [email protected]; [email protected]
> Cc: [email protected]; [email protected]; Huang, Ying <[email protected]>;
> [email protected]; [email protected]; [email protected]
> Subject: Re: [PATCH v2 1/3] mm: revert handling Non-LRU pages returned by follow_page
>
> On 14.08.22 16:05, Haiyue Wang wrote:
> > Commit
> > 3218f8712d6b ("mm: handling Non-LRU pages returned by vm_normal_pages")
> > doesn't handle follow_page() with the FOLL_GET flag correctly: FOLL_GET
> > takes a reference on the page with get_page(), so callers shouldn't just
> > return directly without put_page().
> >
> > So revert the related fix to prepare for a clean patch to handle Non-LRU
> > pages returned by follow_page().
>
> What? Why?
>

Just as the cover letter said, this is so that PATCH 2/3 can be applied
directly on the linux-5.19 stable branch. I will drop this kind of fix,
and fix the issue directly in v3.

> Just fix it.
>
> --
> Thanks,
>
> David / dhildenb

2022-08-15 01:42:57

by Haiyue Wang

Subject: [PATCH v3 0/2] fix follow_page related issues

v3: Merge the fix for handling Non-LRU pages into one patch.
Drop the break_ksm zone device page check.

v2: Add the Non-LRU pages fix with two patches, so that
'mm: migration: fix the FOLL_GET' can be applied directly
on the linux-5.19 stable branch.

Haiyue Wang (2):
mm: migration: fix the FOLL_GET failure on following huge page
mm: fix the handling Non-LRU pages returned by follow_page

mm/huge_memory.c | 4 ++--
mm/ksm.c | 12 +++++++++---
mm/migrate.c | 20 +++++++++++++++-----
3 files changed, 26 insertions(+), 10 deletions(-)

--
2.37.2

2022-08-15 02:00:59

by Haiyue Wang

Subject: [PATCH v4 0/2] fix follow_page related issues

v4: add '()' to function names for readability.
add more words about the Non-LRU pages fix in the commit message.

v3: Merge the fix for handling Non-LRU pages into one patch.
Drop the break_ksm zone device page check.

v2: Add the Non-LRU pages fix with two patches, so that
'mm: migration: fix the FOLL_GET' can be applied directly
on the linux-5.19 stable branch.

Haiyue Wang (2):
mm: migration: fix the FOLL_GET failure on following huge page
mm: fix the handling Non-LRU pages returned by follow_page

mm/huge_memory.c | 4 ++--
mm/ksm.c | 12 +++++++++---
mm/migrate.c | 20 +++++++++++++++-----
3 files changed, 26 insertions(+), 10 deletions(-)

--
2.37.2

2022-08-15 02:32:53

by Haiyue Wang

Subject: [PATCH v4 1/2] mm: migration: fix the FOLL_GET failure on following huge page

Not all huge page APIs support the FOLL_GET option, so the __NR_move_pages
syscall will fail to get the page node information for huge pages.

This is a temporary solution to mitigate the side effect of the racing fix.

After following huge pages with FOLL_GET is supported, this fix can be
reverted safely.

Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
Signed-off-by: Haiyue Wang <[email protected]>
---
mm/migrate.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 6a1597c92261..581dfaad9257 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1848,6 +1848,7 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,

for (i = 0; i < nr_pages; i++) {
unsigned long addr = (unsigned long)(*pages);
+ unsigned int foll_flags = FOLL_DUMP;
struct vm_area_struct *vma;
struct page *page;
int err = -EFAULT;
@@ -1856,8 +1857,12 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
if (!vma)
goto set_status;

+ /* Not all huge page follow APIs support 'FOLL_GET' */
+ if (!is_vm_hugetlb_page(vma))
+ foll_flags |= FOLL_GET;
+
/* FOLL_DUMP to ignore special (like zero) pages */
- page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
+ page = follow_page(vma, addr, foll_flags);

err = PTR_ERR(page);
if (IS_ERR(page))
@@ -1865,7 +1870,8 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,

if (page && !is_zone_device_page(page)) {
err = page_to_nid(page);
- put_page(page);
+ if (foll_flags & FOLL_GET)
+ put_page(page);
} else {
err = -ENOENT;
}
--
2.37.2

2022-08-15 02:37:53

by Haiyue Wang

Subject: [PATCH v4 2/2] mm: fix the handling Non-LRU pages returned by follow_page

The handling of Non-LRU pages returned by follow_page() jumps out directly
without calling put_page() to drop the reference count, even though the
'FOLL_GET' flag for follow_page() means get_page() was called. Fix the zone
device page checks by handling the page reference count correctly before
returning.

And as David reviewed, "device pages are never PageKsm pages". Drop the
zone device page check from break_ksm().

Fixes: 3218f8712d6b ("mm: handling Non-LRU pages returned by vm_normal_pages")
Signed-off-by: Haiyue Wang <[email protected]>
---
mm/huge_memory.c | 4 ++--
mm/ksm.c | 12 +++++++++---
mm/migrate.c | 10 +++++++---
3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8a7c1b344abe..b2ba17c3dcd7 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2963,10 +2963,10 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
/* FOLL_DUMP to ignore special (like zero) pages */
page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);

- if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
+ if (IS_ERR_OR_NULL(page))
continue;

- if (!is_transparent_hugepage(page))
+ if (is_zone_device_page(page) || !is_transparent_hugepage(page))
goto next;

total++;
diff --git a/mm/ksm.c b/mm/ksm.c
index 42ab153335a2..e26f57fc1f0e 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -475,7 +475,7 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
cond_resched();
page = follow_page(vma, addr,
FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
- if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
+ if (IS_ERR_OR_NULL(page))
break;
if (PageKsm(page))
ret = handle_mm_fault(vma, addr,
@@ -560,12 +560,15 @@ static struct page *get_mergeable_page(struct rmap_item *rmap_item)
goto out;

page = follow_page(vma, addr, FOLL_GET);
- if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
+ if (IS_ERR_OR_NULL(page))
goto out;
+ if (is_zone_device_page(page))
+ goto out_putpage;
if (PageAnon(page)) {
flush_anon_page(vma, page, addr);
flush_dcache_page(page);
} else {
+out_putpage:
put_page(page);
out:
page = NULL;
@@ -2308,11 +2311,13 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
if (ksm_test_exit(mm))
break;
*page = follow_page(vma, ksm_scan.address, FOLL_GET);
- if (IS_ERR_OR_NULL(*page) || is_zone_device_page(*page)) {
+ if (IS_ERR_OR_NULL(*page)) {
ksm_scan.address += PAGE_SIZE;
cond_resched();
continue;
}
+ if (is_zone_device_page(*page))
+ goto next_page;
if (PageAnon(*page)) {
flush_anon_page(vma, *page, ksm_scan.address);
flush_dcache_page(*page);
@@ -2327,6 +2332,7 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
mmap_read_unlock(mm);
return rmap_item;
}
+next_page:
put_page(*page);
ksm_scan.address += PAGE_SIZE;
cond_resched();
diff --git a/mm/migrate.c b/mm/migrate.c
index 581dfaad9257..fee12cd2f294 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1672,9 +1672,12 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr,
goto out;

err = -ENOENT;
- if (!page || is_zone_device_page(page))
+ if (!page)
goto out;

+ if (is_zone_device_page(page))
+ goto out_putpage;
+
err = 0;
if (page_to_nid(page) == node)
goto out_putpage;
@@ -1868,8 +1871,9 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
if (IS_ERR(page))
goto set_status;

- if (page && !is_zone_device_page(page)) {
- err = page_to_nid(page);
+ if (page) {
+ err = !is_zone_device_page(page) ? page_to_nid(page)
+ : -ENOENT;
if (foll_flags & FOLL_GET)
put_page(page);
} else {
--
2.37.2

2022-08-15 04:38:11

by Alistair Popple

Subject: Re: [PATCH v4 1/2] mm: migration: fix the FOLL_GET failure on following huge page

On Monday, 15 August 2022 11:59:08 AM AEST Haiyue Wang wrote:
> Not all huge page APIs support the FOLL_GET option, so the __NR_move_pages
> syscall will fail to get the page node information for huge pages.

I think you should be explicit in the commit message about which functions do
not support FOLL_GET, as it's not obvious what support needs to be added
before this fix can be reverted.

Thanks.

- Alistair

> This is a temporary solution to mitigate the side effect of the racing fix.
>
> After following huge pages with FOLL_GET is supported, this fix can be
> reverted safely.
>
> Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
> Signed-off-by: Haiyue Wang <[email protected]>
> ---
> mm/migrate.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 6a1597c92261..581dfaad9257 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1848,6 +1848,7 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
>
> for (i = 0; i < nr_pages; i++) {
> unsigned long addr = (unsigned long)(*pages);
> + unsigned int foll_flags = FOLL_DUMP;
> struct vm_area_struct *vma;
> struct page *page;
> int err = -EFAULT;
> @@ -1856,8 +1857,12 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
> if (!vma)
> goto set_status;
>
> + /* Not all huge page follow APIs support 'FOLL_GET' */
> + if (!is_vm_hugetlb_page(vma))
> + foll_flags |= FOLL_GET;
> +
> /* FOLL_DUMP to ignore special (like zero) pages */
> - page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
> + page = follow_page(vma, addr, foll_flags);
>
> err = PTR_ERR(page);
> if (IS_ERR(page))
> @@ -1865,7 +1870,8 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
>
> if (page && !is_zone_device_page(page)) {
> err = page_to_nid(page);
> - put_page(page);
> + if (foll_flags & FOLL_GET)
> + put_page(page);
> } else {
> err = -ENOENT;
> }
>




2022-08-15 05:07:11

by Haiyue Wang

Subject: RE: [PATCH v4 1/2] mm: migration: fix the FOLL_GET failure on following huge page

> -----Original Message-----
> From: Alistair Popple <[email protected]>
> Sent: Monday, August 15, 2022 12:29
> To: [email protected]; [email protected]; Wang, Haiyue <[email protected]>
> Cc: [email protected]; [email protected]; [email protected]; Huang, Ying
> <[email protected]>; [email protected]; [email protected]; [email protected]; Wang,
> Haiyue <[email protected]>
> Subject: Re: [PATCH v4 1/2] mm: migration: fix the FOLL_GET failure on following huge page
>
> On Monday, 15 August 2022 11:59:08 AM AEST Haiyue Wang wrote:
> > Not all huge page APIs support the FOLL_GET option, so the __NR_move_pages
> > syscall will fail to get the page node information for huge pages.
>
> I think you should be explicit in the commit message about which functions do
> not support FOLL_GET, as it's not obvious what support needs to be added
> before this fix can be reverted.

Yes, makes sense, will add them in a new patch.

>
> Thanks.
>
> - Alistair
>
> > This is a temporary solution to mitigate the side effect of the racing fix.
> >
> > After following huge pages with FOLL_GET is supported, this fix can be
> > reverted safely.
> >
> > Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
> > Signed-off-by: Haiyue Wang <[email protected]>
> > ---
> > mm/migrate.c | 10 ++++++++--
> > 1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 6a1597c92261..581dfaad9257 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -1848,6 +1848,7 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
> >
> > for (i = 0; i < nr_pages; i++) {
> > unsigned long addr = (unsigned long)(*pages);
> > + unsigned int foll_flags = FOLL_DUMP;
> > struct vm_area_struct *vma;
> > struct page *page;
> > int err = -EFAULT;
> > @@ -1856,8 +1857,12 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
> > if (!vma)
> > goto set_status;
> >
> > + /* Not all huge page follow APIs support 'FOLL_GET' */
> > + if (!is_vm_hugetlb_page(vma))
> > + foll_flags |= FOLL_GET;
> > +
> > /* FOLL_DUMP to ignore special (like zero) pages */
> > - page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
> > + page = follow_page(vma, addr, foll_flags);
> >
> > err = PTR_ERR(page);
> > if (IS_ERR(page))
> > @@ -1865,7 +1870,8 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
> >
> > if (page && !is_zone_device_page(page)) {
> > err = page_to_nid(page);
> > - put_page(page);
> > + if (foll_flags & FOLL_GET)
> > + put_page(page);
> > } else {
> > err = -ENOENT;
> > }
> >
>
>
>

2022-08-15 05:33:05

by Haiyue Wang

Subject: RE: [PATCH v4 1/2] mm: migration: fix the FOLL_GET failure on following huge page

> -----Original Message-----
> From: Alistair Popple <[email protected]>
> Sent: Monday, August 15, 2022 13:17
> To: [email protected]; [email protected]; Wang, Haiyue <[email protected]>
> Cc: [email protected]; [email protected]; [email protected]; Huang, Ying
> <[email protected]>; [email protected]; [email protected]; [email protected]
> Subject: Re: [PATCH v4 1/2] mm: migration: fix the FOLL_GET failure on following huge page
>
> On Monday, 15 August 2022 2:40:48 PM AEST Wang, Haiyue wrote:
> > > -----Original Message-----
> > > From: Alistair Popple <[email protected]>
> > > Sent: Monday, August 15, 2022 12:29
> > > To: [email protected]; [email protected]; Wang, Haiyue <[email protected]>
> > > Cc: [email protected]; [email protected]; [email protected]; Huang, Ying
> > > <[email protected]>; [email protected]; [email protected]; [email protected]; Wang,
> > > Haiyue <[email protected]>
> > > Subject: Re: [PATCH v4 1/2] mm: migration: fix the FOLL_GET failure on
> > > following huge page
> > >
> > > On Monday, 15 August 2022 11:59:08 AM AEST Haiyue Wang wrote:
> > > > Not all huge page APIs support the FOLL_GET option, so the __NR_move_pages
> > > > syscall will fail to get the page node information for huge pages.
> > >
> > > I think you should be explicit in the commit message about which functions
> > > do not support FOLL_GET, as it's not obvious what support needs to be added
> > > before this fix can be reverted.
>
> Actually while you're at it I think it would be good to include a description
> of the impact of this failure in the commit message. Ie. You're answer to:
>
> > What are the user-visible runtime effects of this bug?
>
> As it documents what should be tested if this fix does actually ever get
> reverted.

A short example *.c program to capture the bug in the commit message?

>
> > >
> > > Thanks.
> > >
> > > - Alistair
> > >
> > > > This is a temporary solution to mitigate the side effect of the racing fix.
> > > >
> > > > After following huge pages with FOLL_GET is supported, this fix can be
> > > > reverted safely.
> > > >
> > > > Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
> > > > ---
> > > > mm/migrate.c | 10 ++++++++--
> > > > 1 file changed, 8 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > > index 6a1597c92261..581dfaad9257 100644
> > > > --- a/mm/migrate.c
> > > > +++ b/mm/migrate.c
> > > > @@ -1848,6 +1848,7 @@ static void do_pages_stat_array(struct mm_struct
> *mm,
> > > unsigned long nr_pages,
> > > >
> > > > for (i = 0; i < nr_pages; i++) {
> > > > unsigned long addr = (unsigned long)(*pages);
> > > > + unsigned int foll_flags = FOLL_DUMP;
> > > > struct vm_area_struct *vma;
> > > > struct page *page;
> > > > int err = -EFAULT;
> > > > @@ -1856,8 +1857,12 @@ static void do_pages_stat_array(struct mm_struct
> *mm,
> > > unsigned long nr_pages,
> > > > if (!vma)
> > > > goto set_status;
> > > >
> > > > + /* Not all huge page follow APIs support 'FOLL_GET' */
> > > > + if (!is_vm_hugetlb_page(vma))
> > > > + foll_flags |= FOLL_GET;
> > > > +
> > > > /* FOLL_DUMP to ignore special (like zero) pages */
> > > > - page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
> > > > + page = follow_page(vma, addr, foll_flags);
> > > >
> > > > err = PTR_ERR(page);
> > > > if (IS_ERR(page))
> > > > @@ -1865,7 +1870,8 @@ static void do_pages_stat_array(struct mm_struct
> *mm,
> > > unsigned long nr_pages,
> > > >
> > > > if (page && !is_zone_device_page(page)) {
> > > > err = page_to_nid(page);
> > > > - put_page(page);
> > > > + if (foll_flags & FOLL_GET)
> > > > + put_page(page);
> > > > } else {
> > > > err = -ENOENT;
> > > > }
> > > >
> > >
> > >
> > >
> >
> >
>
>
>

2022-08-15 05:42:43

by Alistair Popple

Subject: Re: [PATCH v4 1/2] mm: migration: fix the FOLL_GET failure on following huge page

On Monday, 15 August 2022 2:40:48 PM AEST Wang, Haiyue wrote:
> > -----Original Message-----
> > From: Alistair Popple <[email protected]>
> > Sent: Monday, August 15, 2022 12:29
> > To: [email protected]; [email protected]; Wang, Haiyue <[email protected]>
> > Cc: [email protected]; [email protected]; [email protected]; Huang, Ying
> > <[email protected]>; [email protected]; [email protected]; [email protected]; Wang,
> > Haiyue <[email protected]>
> > Subject: Re: [PATCH v4 1/2] mm: migration: fix the FOLL_GET failure on
> > following huge page
> >
> > On Monday, 15 August 2022 11:59:08 AM AEST Haiyue Wang wrote:
> > > Not all huge page APIs support the FOLL_GET option, so the __NR_move_pages
> > > syscall will fail to get the page node information for huge pages.
> >
> > I think you should be explicit in the commit message about which functions
> > do not support FOLL_GET, as it's not obvious what support needs to be added
> > before this fix can be reverted.
>
> Yes, makes sense, will add them in a new patch.

Actually while you're at it I think it would be good to include a description
of the impact of this failure in the commit message. I.e. your answer to:

> What are the user-visible runtime effects of this bug?

As it documents what should be tested if this fix does actually ever get
reverted.

> >
> > Thanks.
> >
> > - Alistair
> >
> > > This is a temporary solution to mitigate the side effect of the racing fix.
> > >
> > > After following huge pages with FOLL_GET is supported, this fix can be
> > > reverted safely.
> > >
> > > Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
> > > Signed-off-by: Haiyue Wang <[email protected]>
> > > ---
> > > mm/migrate.c | 10 ++++++++--
> > > 1 file changed, 8 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > index 6a1597c92261..581dfaad9257 100644
> > > --- a/mm/migrate.c
> > > +++ b/mm/migrate.c
> > > @@ -1848,6 +1848,7 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
> > >
> > > for (i = 0; i < nr_pages; i++) {
> > > unsigned long addr = (unsigned long)(*pages);
> > > + unsigned int foll_flags = FOLL_DUMP;
> > > struct vm_area_struct *vma;
> > > struct page *page;
> > > int err = -EFAULT;
> > > @@ -1856,8 +1857,12 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
> > > if (!vma)
> > > goto set_status;
> > >
> > > + /* Not all huge page follow APIs support 'FOLL_GET' */
> > > + if (!is_vm_hugetlb_page(vma))
> > > + foll_flags |= FOLL_GET;
> > > +
> > > /* FOLL_DUMP to ignore special (like zero) pages */
> > > - page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
> > > + page = follow_page(vma, addr, foll_flags);
> > >
> > > err = PTR_ERR(page);
> > > if (IS_ERR(page))
> > > @@ -1865,7 +1870,8 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
> > >
> > > if (page && !is_zone_device_page(page)) {
> > > err = page_to_nid(page);
> > > - put_page(page);
> > > + if (foll_flags & FOLL_GET)
> > > + put_page(page);
> > > } else {
> > > err = -ENOENT;
> > > }
> > >
> >
> >
> >
>
>




2022-08-15 05:47:58

by Alistair Popple

Subject: Re: [PATCH v4 1/2] mm: migration: fix the FOLL_GET failure on following huge page

On Monday, 15 August 2022 3:20:28 PM AEST Wang, Haiyue wrote:
> > -----Original Message-----
> > From: Alistair Popple <[email protected]>
> > Sent: Monday, August 15, 2022 13:17
> > To: [email protected]; [email protected]; Wang, Haiyue <[email protected]>
> > Cc: [email protected]; [email protected]; [email protected]; Huang, Ying
> > <[email protected]>; [email protected]; [email protected]
> > Subject: Re: [PATCH v4 1/2] mm: migration: fix the FOLL_GET failure on
> > following huge page
> >
> > On Monday, 15 August 2022 2:40:48 PM AEST Wang, Haiyue wrote:
> > > > -----Original Message-----
> > > > From: Alistair Popple <[email protected]>
> > > > Sent: Monday, August 15, 2022 12:29
> > > > To: [email protected]; [email protected]; Wang, Haiyue <[email protected]>
> > > > Cc: [email protected]; [email protected]; [email protected]; Huang, Ying
> > > > <[email protected]>; [email protected]; [email protected]; [email protected]; Wang,
> > > > Haiyue <[email protected]>
> > > > Subject: Re: [PATCH v4 1/2] mm: migration: fix the FOLL_GET failure on
> > > > following huge page
> > > >
> > > > On Monday, 15 August 2022 11:59:08 AM AEST Haiyue Wang wrote:
> > > > > Not all huge page APIs support the FOLL_GET option, so the
> > > > > __NR_move_pages syscall will fail to get the page node information
> > > > > for huge pages.
> > > >
> > > > I think you should be explicit in the commit message about which functions
> > > > do not support FOLL_GET, as it's not obvious what support needs to be added
> > > > before this fix can be reverted.
> > >
> > > Yes, makes sense, will add them in a new patch.
> >
> > Actually while you're at it I think it would be good to include a description
> > of the impact of this failure in the commit message. I.e. your answer to:
> >
> > > What are the user-visible runtime effects of this bug?
> >
> > As it documents what should be tested if this fix does actually ever get
> > reverted.
>
> A short example *.c program to capture the bug in the commit message?

That's probably overkill. Just being a bit more explicit about the
circumstances in which sys_move_pages() actually fails would be good. E.g.
something like this:

"Without this sys_move_pages() will return -ENOENT for 1GB huge page
memory map when dumping the page node information for nodes != NULL"

> >
> > > >
> > > > Thanks.
> > > >
> > > > - Alistair
> > > >
> > > > > This is a temporary solution to mitigate the side effect of the racing fix.
> > > > >
> > > > > After following huge pages with FOLL_GET is supported, this fix can be
> > > > > reverted safely.
> > > > >
> > > > > Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
> > > > > Signed-off-by: Haiyue Wang <[email protected]>
> > > > > ---
> > > > > mm/migrate.c | 10 ++++++++--
> > > > > 1 file changed, 8 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > > > index 6a1597c92261..581dfaad9257 100644
> > > > > --- a/mm/migrate.c
> > > > > +++ b/mm/migrate.c
> > > > > @@ -1848,6 +1848,7 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
> > > > >
> > > > > for (i = 0; i < nr_pages; i++) {
> > > > > unsigned long addr = (unsigned long)(*pages);
> > > > > + unsigned int foll_flags = FOLL_DUMP;
> > > > > struct vm_area_struct *vma;
> > > > > struct page *page;
> > > > > int err = -EFAULT;
> > > > > @@ -1856,8 +1857,12 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
> > > > > if (!vma)
> > > > > goto set_status;
> > > > >
> > > > > + /* Not all huge page follow APIs support 'FOLL_GET' */
> > > > > + if (!is_vm_hugetlb_page(vma))
> > > > > + foll_flags |= FOLL_GET;
> > > > > +
> > > > > /* FOLL_DUMP to ignore special (like zero) pages */
> > > > > - page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
> > > > > + page = follow_page(vma, addr, foll_flags);
> > > > >
> > > > > err = PTR_ERR(page);
> > > > > if (IS_ERR(page))
> > > > > @@ -1865,7 +1870,8 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
> > > > >
> > > > > if (page && !is_zone_device_page(page)) {
> > > > > err = page_to_nid(page);
> > > > > - put_page(page);
> > > > > + if (foll_flags & FOLL_GET)
> > > > > + put_page(page);
> > > > > } else {
> > > > > err = -ENOENT;
> > > > > }
> > > > >
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
> >
>
>




2022-08-15 06:00:56

by Haiyue Wang

Subject: RE: [PATCH v4 1/2] mm: migration: fix the FOLL_GET failure on following huge page

> -----Original Message-----
> From: Alistair Popple <[email protected]>
> Sent: Monday, August 15, 2022 13:35
> To: [email protected]; [email protected]; Wang, Haiyue <[email protected]>
> Cc: [email protected]; [email protected]; [email protected]; Huang, Ying
> <[email protected]>; [email protected]; [email protected]; [email protected]
> Subject: Re: [PATCH v4 1/2] mm: migration: fix the FOLL_GET failure on following huge page
>
> On Monday, 15 August 2022 3:20:28 PM AEST Wang, Haiyue wrote:
> > > -----Original Message-----
> > > From: Alistair Popple <[email protected]>
> > > Sent: Monday, August 15, 2022 13:17
> > > To: [email protected]; [email protected]; Wang, Haiyue <[email protected]>
> > > Cc: [email protected]; [email protected]; [email protected]; Huang, Ying
> > > <[email protected]>; [email protected]; [email protected]
> > > Subject: Re: [PATCH v4 1/2] mm: migration: fix the FOLL_GET failure on
> > > following huge page
> > >
> > > On Monday, 15 August 2022 2:40:48 PM AEST Wang, Haiyue wrote:
> > > > > -----Original Message-----
> > > > > From: Alistair Popple <[email protected]>
> > > > > Sent: Monday, August 15, 2022 12:29
> > > > > To: [email protected]; [email protected]; Wang, Haiyue <[email protected]>
> > > > > Cc: [email protected]; [email protected]; [email protected]; Huang, Ying
> > > > > <[email protected]>; [email protected]; [email protected]; [email protected]; Wang,
> > > > > Haiyue <[email protected]>
> > > > > Subject: Re: [PATCH v4 1/2] mm: migration: fix the FOLL_GET failure on
> > > > > following huge page
> > > > >
> > > > > On Monday, 15 August 2022 11:59:08 AM AEST Haiyue Wang wrote:
> > > > > > Not all huge page APIs support the FOLL_GET option, so the
> > > > > > __NR_move_pages syscall will fail to get the page node information
> > > > > > for huge pages.
> > > > >
> > > > > I think you should be explicit in the commit message about which functions
> > > > > do not support FOLL_GET, as it's not obvious what support needs to be added
> > > > > before this fix can be reverted.
> > > >
> > > > Yes, makes sense, will add them in a new patch.
> > >
> > > Actually while you're at it I think it would be good to include a description
> > > of the impact of this failure in the commit message. I.e. your answer to:
> > >
> > > > What are the user-visible runtime effects of this bug?
> > >
> > > As it documents what should be tested if this fix does actually ever get
> > > reverted.
> >
> > A short example *.c program to capture the bug in the commit message?
>
> That's probably overkill. Just being a bit more explicit about the
> circumstances in which sys_move_pages() actually fails would be good. E.g.
> something like this:
>
> "Without this sys_move_pages() will return -ENOENT for 1GB huge page
> memory map when dumping the page node information for nodes != NULL"

OK, understood, will try to add more information about the error. ;-)

>
> > >
> > > > >
> > > > > Thanks.
> > > > >
> > > > > - Alistair
> > > > >
> > > > > > This is a temporary solution to mitigate the side effect of the racing fix.
> > > > > >
> > > > > > After following huge pages with FOLL_GET is supported, this fix can be
> > > > > > reverted safely.
> > > > > >
> > > > > > Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
> > > > > > Signed-off-by: Haiyue Wang <[email protected]>
> > > > > > ---
> > > > > > mm/migrate.c | 10 ++++++++--
> > > > > > 1 file changed, 8 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > > > > index 6a1597c92261..581dfaad9257 100644
> > > > > > --- a/mm/migrate.c
> > > > > > +++ b/mm/migrate.c
> > > > > > @@ -1848,6 +1848,7 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
> > > > > >
> > > > > > for (i = 0; i < nr_pages; i++) {
> > > > > > unsigned long addr = (unsigned long)(*pages);
> > > > > > + unsigned int foll_flags = FOLL_DUMP;
> > > > > > struct vm_area_struct *vma;
> > > > > > struct page *page;
> > > > > > int err = -EFAULT;
> > > > > > @@ -1856,8 +1857,12 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
> > > > > > if (!vma)
> > > > > > goto set_status;
> > > > > >
> > > > > > + /* Not all huge page follow APIs support 'FOLL_GET' */
> > > > > > + if (!is_vm_hugetlb_page(vma))
> > > > > > + foll_flags |= FOLL_GET;
> > > > > > +
> > > > > > /* FOLL_DUMP to ignore special (like zero) pages */
> > > > > > - page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
> > > > > > + page = follow_page(vma, addr, foll_flags);
> > > > > >
> > > > > > err = PTR_ERR(page);
> > > > > > if (IS_ERR(page))
> > > > > > @@ -1865,7 +1870,8 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
> > > > > >
> > > > > > if (page && !is_zone_device_page(page)) {
> > > > > > err = page_to_nid(page);
> > > > > > - put_page(page);
> > > > > > + if (foll_flags & FOLL_GET)
> > > > > > + put_page(page);
> > > > > > } else {
> > > > > > err = -ENOENT;
> > > > > > }
> > > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> > >
> >
> >
>
>
>

2022-08-15 07:17:16

by Haiyue Wang

Subject: [PATCH v5 0/2] fix follow_page related issues

v5: reword the commit message for FOLL_GET with more information.

v4: add '()' to function names for readability.
add more words about the Non-LRU pages fix in the commit message.

v3: Merge the fix for handling Non-LRU pages into one patch.
Drop the break_ksm zone device page check.

v2: Add the Non-LRU pages fix with two patches, so that
'mm: migration: fix the FOLL_GET' can be applied directly
on the linux-5.19 stable branch.

Haiyue Wang (2):
mm: migration: fix the FOLL_GET failure on following huge page
mm: fix the handling Non-LRU pages returned by follow_page

mm/huge_memory.c | 4 ++--
mm/ksm.c | 12 +++++++++---
mm/migrate.c | 20 +++++++++++++++-----
3 files changed, 26 insertions(+), 10 deletions(-)

--
2.37.2

2022-08-15 07:17:46

by Haiyue Wang

Subject: [PATCH v5 2/2] mm: fix the handling Non-LRU pages returned by follow_page

The handling of Non-LRU pages returned by follow_page() jumps out directly
without calling put_page() to drop the reference count, even though the
'FOLL_GET' flag for follow_page() means get_page() was called. Fix the zone
device page checks by handling the page reference count correctly before
returning.

And as David reviewed, "device pages are never PageKsm pages". Drop the
zone device page check from break_ksm().

Fixes: 3218f8712d6b ("mm: handling Non-LRU pages returned by vm_normal_pages")
Signed-off-by: Haiyue Wang <[email protected]>
---
mm/huge_memory.c | 4 ++--
mm/ksm.c | 12 +++++++++---
mm/migrate.c | 10 +++++++---
3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8a7c1b344abe..b2ba17c3dcd7 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2963,10 +2963,10 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
/* FOLL_DUMP to ignore special (like zero) pages */
page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);

- if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
+ if (IS_ERR_OR_NULL(page))
continue;

- if (!is_transparent_hugepage(page))
+ if (is_zone_device_page(page) || !is_transparent_hugepage(page))
goto next;

total++;
diff --git a/mm/ksm.c b/mm/ksm.c
index 42ab153335a2..e26f57fc1f0e 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -475,7 +475,7 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
cond_resched();
page = follow_page(vma, addr,
FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
- if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
+ if (IS_ERR_OR_NULL(page))
break;
if (PageKsm(page))
ret = handle_mm_fault(vma, addr,
@@ -560,12 +560,15 @@ static struct page *get_mergeable_page(struct rmap_item *rmap_item)
goto out;

page = follow_page(vma, addr, FOLL_GET);
- if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
+ if (IS_ERR_OR_NULL(page))
goto out;
+ if (is_zone_device_page(page))
+ goto out_putpage;
if (PageAnon(page)) {
flush_anon_page(vma, page, addr);
flush_dcache_page(page);
} else {
+out_putpage:
put_page(page);
out:
page = NULL;
@@ -2308,11 +2311,13 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
if (ksm_test_exit(mm))
break;
*page = follow_page(vma, ksm_scan.address, FOLL_GET);
- if (IS_ERR_OR_NULL(*page) || is_zone_device_page(*page)) {
+ if (IS_ERR_OR_NULL(*page)) {
ksm_scan.address += PAGE_SIZE;
cond_resched();
continue;
}
+ if (is_zone_device_page(*page))
+ goto next_page;
if (PageAnon(*page)) {
flush_anon_page(vma, *page, ksm_scan.address);
flush_dcache_page(*page);
@@ -2327,6 +2332,7 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
mmap_read_unlock(mm);
return rmap_item;
}
+next_page:
put_page(*page);
ksm_scan.address += PAGE_SIZE;
cond_resched();
diff --git a/mm/migrate.c b/mm/migrate.c
index 581dfaad9257..fee12cd2f294 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1672,9 +1672,12 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr,
goto out;

err = -ENOENT;
- if (!page || is_zone_device_page(page))
+ if (!page)
goto out;

+ if (is_zone_device_page(page))
+ goto out_putpage;
+
err = 0;
if (page_to_nid(page) == node)
goto out_putpage;
@@ -1868,8 +1871,9 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
if (IS_ERR(page))
goto set_status;

- if (page && !is_zone_device_page(page)) {
- err = page_to_nid(page);
+ if (page) {
+ err = !is_zone_device_page(page) ? page_to_nid(page)
+ : -ENOENT;
if (foll_flags & FOLL_GET)
put_page(page);
} else {
--
2.37.2

2022-08-15 07:25:59

by Haiyue Wang

Subject: [PATCH v5 1/2] mm: migration: fix the FOLL_GET failure on following huge page

Not all huge page APIs support the FOLL_GET option, so the move_pages()
syscall will fail to get the page node information for some huge pages.

For example, on x86 with linux 5.19, the 1GB huge page API follow_huge_pud()
returns a NULL page for FOLL_GET, so calling the move_pages() syscall with a
NULL 'nodes' parameter leaves a '-2' (-ENOENT) error in the 'status' array.

Note: follow_huge_pud() now supports FOLL_GET in linux 6.0.
Link: https://lore.kernel.org/all/[email protected]

But these huge page APIs still don't support FOLL_GET:
1. follow_huge_pud() in arch/s390/mm/hugetlbpage.c
2. follow_huge_addr() in arch/ia64/mm/hugetlbpage.c
It will trigger a WARN_ON_ONCE for FOLL_GET.
3. follow_huge_pgd() in mm/hugetlb.c

This is a temporary solution to mitigate the side effect of the race
condition fix, which calls follow_page() with FOLL_GET set for huge pages.

After following huge pages with FOLL_GET is supported, this fix can be
reverted safely.

Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
Signed-off-by: Haiyue Wang <[email protected]>
---
mm/migrate.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 6a1597c92261..581dfaad9257 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1848,6 +1848,7 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,

for (i = 0; i < nr_pages; i++) {
unsigned long addr = (unsigned long)(*pages);
+ unsigned int foll_flags = FOLL_DUMP;
struct vm_area_struct *vma;
struct page *page;
int err = -EFAULT;
@@ -1856,8 +1857,12 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
if (!vma)
goto set_status;

+ /* Not all huge page follow APIs support 'FOLL_GET' */
+ if (!is_vm_hugetlb_page(vma))
+ foll_flags |= FOLL_GET;
+
/* FOLL_DUMP to ignore special (like zero) pages */
- page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
+ page = follow_page(vma, addr, foll_flags);

err = PTR_ERR(page);
if (IS_ERR(page))
@@ -1865,7 +1870,8 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,

if (page && !is_zone_device_page(page)) {
err = page_to_nid(page);
- put_page(page);
+ if (foll_flags & FOLL_GET)
+ put_page(page);
} else {
err = -ENOENT;
}
--
2.37.2

2022-08-15 07:57:50

by Huang, Ying

Subject: Re: [PATCH v5 2/2] mm: fix the handling Non-LRU pages returned by follow_page

Haiyue Wang <[email protected]> writes:

> The handling of Non-LRU pages returned by follow_page() jumps out directly
> without calling put_page() to drop the reference count, even though the
> 'FOLL_GET' flag for follow_page() means get_page() was called. Fix the zone
> device page checks by handling the page reference count correctly before
> returning.
>
> And as David reviewed, "device pages are never PageKsm pages". Drop the
> zone device page check from break_ksm().
>
> Fixes: 3218f8712d6b ("mm: handling Non-LRU pages returned by vm_normal_pages")
> Signed-off-by: Haiyue Wang <[email protected]>

LGTM, Thanks!

Reviewed-by: "Huang, Ying" <[email protected]>

> ---
> mm/huge_memory.c | 4 ++--
> mm/ksm.c | 12 +++++++++---
> mm/migrate.c | 10 +++++++---
> 3 files changed, 18 insertions(+), 8 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 8a7c1b344abe..b2ba17c3dcd7 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2963,10 +2963,10 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
> /* FOLL_DUMP to ignore special (like zero) pages */
> page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
>
> - if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
> + if (IS_ERR_OR_NULL(page))
> continue;
>
> - if (!is_transparent_hugepage(page))
> + if (is_zone_device_page(page) || !is_transparent_hugepage(page))
> goto next;
>
> total++;
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 42ab153335a2..e26f57fc1f0e 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -475,7 +475,7 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
> cond_resched();
> page = follow_page(vma, addr,
> FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
> - if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
> + if (IS_ERR_OR_NULL(page))
> break;
> if (PageKsm(page))
> ret = handle_mm_fault(vma, addr,
> @@ -560,12 +560,15 @@ static struct page *get_mergeable_page(struct rmap_item *rmap_item)
> goto out;
>
> page = follow_page(vma, addr, FOLL_GET);
> - if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
> + if (IS_ERR_OR_NULL(page))
> goto out;
> + if (is_zone_device_page(page))
> + goto out_putpage;
> if (PageAnon(page)) {
> flush_anon_page(vma, page, addr);
> flush_dcache_page(page);
> } else {
> +out_putpage:
> put_page(page);
> out:
> page = NULL;
> @@ -2308,11 +2311,13 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
> if (ksm_test_exit(mm))
> break;
> *page = follow_page(vma, ksm_scan.address, FOLL_GET);
> - if (IS_ERR_OR_NULL(*page) || is_zone_device_page(*page)) {
> + if (IS_ERR_OR_NULL(*page)) {
> ksm_scan.address += PAGE_SIZE;
> cond_resched();
> continue;
> }
> + if (is_zone_device_page(*page))
> + goto next_page;
> if (PageAnon(*page)) {
> flush_anon_page(vma, *page, ksm_scan.address);
> flush_dcache_page(*page);
> @@ -2327,6 +2332,7 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
> mmap_read_unlock(mm);
> return rmap_item;
> }
> +next_page:
> put_page(*page);
> ksm_scan.address += PAGE_SIZE;
> cond_resched();
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 581dfaad9257..fee12cd2f294 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1672,9 +1672,12 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr,
> goto out;
>
> err = -ENOENT;
> - if (!page || is_zone_device_page(page))
> + if (!page)
> goto out;
>
> + if (is_zone_device_page(page))
> + goto out_putpage;
> +
> err = 0;
> if (page_to_nid(page) == node)
> goto out_putpage;
> @@ -1868,8 +1871,9 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
> if (IS_ERR(page))
> goto set_status;
>
> - if (page && !is_zone_device_page(page)) {
> - err = page_to_nid(page);
> + if (page) {
> + err = !is_zone_device_page(page) ? page_to_nid(page)
> + : -ENOENT;
> if (foll_flags & FOLL_GET)
> put_page(page);
> } else {
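
Distilled from the hunks above, the rule this patch enforces is: a
successful follow_page() call with FOLL_GET takes a reference, so every
exit path that received a real page must drop it, including the early
exit for zone device pages. A minimal sketch of the corrected shape (a
fragment only; the surrounding function is elided):

	page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
	if (IS_ERR_OR_NULL(page))
		goto out;		/* error or no page: no reference taken */

	if (is_zone_device_page(page))
		goto out_putpage;	/* uninteresting, but a reference was taken */

	/* ... use the page ... */

out_putpage:
	put_page(page);			/* balances the FOLL_GET reference */
out:
	return;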

2022-08-15 08:03:33

by Huang, Ying

[permalink] [raw]
Subject: Re: [PATCH v5 1/2] mm: migration: fix the FOLL_GET failure on following huge page

Haiyue Wang <[email protected]> writes:

> Not all huge page APIs support the FOLL_GET option, so the move_pages()
> syscall will fail to get the page node information for some huge pages.
>
> For example, on x86 with Linux 5.19, the 1GB huge page API
> follow_huge_pud() returns a NULL page for FOLL_GET; calling the
> move_pages() syscall with a NULL 'nodes' parameter then leaves a '-2'
> error in the 'status' array for such pages.
>
> Note: follow_huge_pud() supports FOLL_GET as of Linux 6.0.
> Link: https://lore.kernel.org/all/[email protected]
>
> But these huge page APIs don't support FOLL_GET:
> 1. follow_huge_pud() in arch/s390/mm/hugetlbpage.c
> 2. follow_huge_addr() in arch/ia64/mm/hugetlbpage.c
>    It will trigger WARN_ON_ONCE() for FOLL_GET.
> 3. follow_huge_pgd() in mm/hugetlb.c
>
> This is a temporary solution to mitigate the side effect of the
> race-condition fix, which calls follow_page() with FOLL_GET set for
> huge pages.
>
> Once follow_page() with FOLL_GET is supported for huge pages, this fix
> can be reverted safely.
>
> Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
> Signed-off-by: Haiyue Wang <[email protected]>

LGTM, Thanks!

Reviewed-by: "Huang, Ying" <[email protected]>

> ---
> mm/migrate.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 6a1597c92261..581dfaad9257 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1848,6 +1848,7 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
>
> for (i = 0; i < nr_pages; i++) {
> unsigned long addr = (unsigned long)(*pages);
> + unsigned int foll_flags = FOLL_DUMP;
> struct vm_area_struct *vma;
> struct page *page;
> int err = -EFAULT;
> @@ -1856,8 +1857,12 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
> if (!vma)
> goto set_status;
>
> + /* Not all huge page follow APIs support 'FOLL_GET' */
> + if (!is_vm_hugetlb_page(vma))
> + foll_flags |= FOLL_GET;
> +
> /* FOLL_DUMP to ignore special (like zero) pages */
> - page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
> + page = follow_page(vma, addr, foll_flags);
>
> err = PTR_ERR(page);
> if (IS_ERR(page))
> @@ -1865,7 +1870,8 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
>
> if (page && !is_zone_device_page(page)) {
> err = page_to_nid(page);
> - put_page(page);
> + if (foll_flags & FOLL_GET)
> + put_page(page);
> } else {
> err = -ENOENT;
> }

2022-08-15 14:47:17

by Felix Kuehling

[permalink] [raw]
Subject: Re: [PATCH v5 2/2] mm: fix the handling Non-LRU pages returned by follow_page

On 2022-08-15 at 03:02, Haiyue Wang wrote:
> The handling of Non-LRU pages returned by follow_page() bails out
> directly without calling put_page() to drop the reference count, even
> though the 'FOLL_GET' flag for follow_page() means get_page() was
> called. Fix the zone device page check by handling the page reference
> count correctly before returning.
>
> And as David reviewed, "device pages are never PageKsm pages". Drop this
> zone device page check for break_ksm().
>
> Fixes: 3218f8712d6b ("mm: handling Non-LRU pages returned by vm_normal_pages")
> Signed-off-by: Haiyue Wang <[email protected]>

Thank you for catching this. The patch is

Reviewed-by: Felix Kuehling <[email protected]>


> ---
> mm/huge_memory.c | 4 ++--
> mm/ksm.c | 12 +++++++++---
> mm/migrate.c | 10 +++++++---
> 3 files changed, 18 insertions(+), 8 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 8a7c1b344abe..b2ba17c3dcd7 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2963,10 +2963,10 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
> /* FOLL_DUMP to ignore special (like zero) pages */
> page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
>
> - if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
> + if (IS_ERR_OR_NULL(page))
> continue;
>
> - if (!is_transparent_hugepage(page))
> + if (is_zone_device_page(page) || !is_transparent_hugepage(page))
> goto next;
>
> total++;
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 42ab153335a2..e26f57fc1f0e 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -475,7 +475,7 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
> cond_resched();
> page = follow_page(vma, addr,
> FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
> - if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
> + if (IS_ERR_OR_NULL(page))
> break;
> if (PageKsm(page))
> ret = handle_mm_fault(vma, addr,
> @@ -560,12 +560,15 @@ static struct page *get_mergeable_page(struct rmap_item *rmap_item)
> goto out;
>
> page = follow_page(vma, addr, FOLL_GET);
> - if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
> + if (IS_ERR_OR_NULL(page))
> goto out;
> + if (is_zone_device_page(page))
> + goto out_putpage;
> if (PageAnon(page)) {
> flush_anon_page(vma, page, addr);
> flush_dcache_page(page);
> } else {
> +out_putpage:
> put_page(page);
> out:
> page = NULL;
> @@ -2308,11 +2311,13 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
> if (ksm_test_exit(mm))
> break;
> *page = follow_page(vma, ksm_scan.address, FOLL_GET);
> - if (IS_ERR_OR_NULL(*page) || is_zone_device_page(*page)) {
> + if (IS_ERR_OR_NULL(*page)) {
> ksm_scan.address += PAGE_SIZE;
> cond_resched();
> continue;
> }
> + if (is_zone_device_page(*page))
> + goto next_page;
> if (PageAnon(*page)) {
> flush_anon_page(vma, *page, ksm_scan.address);
> flush_dcache_page(*page);
> @@ -2327,6 +2332,7 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
> mmap_read_unlock(mm);
> return rmap_item;
> }
> +next_page:
> put_page(*page);
> ksm_scan.address += PAGE_SIZE;
> cond_resched();
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 581dfaad9257..fee12cd2f294 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1672,9 +1672,12 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr,
> goto out;
>
> err = -ENOENT;
> - if (!page || is_zone_device_page(page))
> + if (!page)
> goto out;
>
> + if (is_zone_device_page(page))
> + goto out_putpage;
> +
> err = 0;
> if (page_to_nid(page) == node)
> goto out_putpage;
> @@ -1868,8 +1871,9 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
> if (IS_ERR(page))
> goto set_status;
>
> - if (page && !is_zone_device_page(page)) {
> - err = page_to_nid(page);
> + if (page) {
> + err = !is_zone_device_page(page) ? page_to_nid(page)
> + : -ENOENT;
> if (foll_flags & FOLL_GET)
> put_page(page);
> } else {

2022-08-16 04:17:34

by Alistair Popple

[permalink] [raw]
Subject: Re: [PATCH v5 2/2] mm: fix the handling Non-LRU pages returned by follow_page


Haiyue Wang <[email protected]> writes:

> The handling of Non-LRU pages returned by follow_page() bails out
> directly without calling put_page() to drop the reference count, even
> though the 'FOLL_GET' flag for follow_page() means get_page() was
> called. Fix the zone device page check by handling the page reference
> count correctly before returning.
>
> And as David reviewed, "device pages are never PageKsm pages". Drop this
> zone device page check for break_ksm().
>
> Fixes: 3218f8712d6b ("mm: handling Non-LRU pages returned by vm_normal_pages")
> Signed-off-by: Haiyue Wang <[email protected]>
> ---
> mm/huge_memory.c | 4 ++--
> mm/ksm.c | 12 +++++++++---
> mm/migrate.c | 10 +++++++---
> 3 files changed, 18 insertions(+), 8 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 8a7c1b344abe..b2ba17c3dcd7 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2963,10 +2963,10 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
> /* FOLL_DUMP to ignore special (like zero) pages */
> page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
>
> - if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
> + if (IS_ERR_OR_NULL(page))
> continue;
>
> - if (!is_transparent_hugepage(page))
> + if (is_zone_device_page(page) || !is_transparent_hugepage(page))
> goto next;
>
> total++;
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 42ab153335a2..e26f57fc1f0e 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -475,7 +475,7 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
> cond_resched();
> page = follow_page(vma, addr,
> FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
> - if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
> + if (IS_ERR_OR_NULL(page))
> break;
> if (PageKsm(page))
> ret = handle_mm_fault(vma, addr,
> @@ -560,12 +560,15 @@ static struct page *get_mergeable_page(struct rmap_item *rmap_item)
> goto out;
>
> page = follow_page(vma, addr, FOLL_GET);
> - if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
> + if (IS_ERR_OR_NULL(page))
> goto out;
> + if (is_zone_device_page(page))

Same as for break_ksm() I think we should be able to drop the
is_zone_device_page() check here because scan_get_next_rmap_item()
already filters out zone device pages.

> + goto out_putpage;
> if (PageAnon(page)) {
> flush_anon_page(vma, page, addr);
> flush_dcache_page(page);
> } else {
> +out_putpage:
> put_page(page);
> out:
> page = NULL;
> @@ -2308,11 +2311,13 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
> if (ksm_test_exit(mm))
> break;
> *page = follow_page(vma, ksm_scan.address, FOLL_GET);
> - if (IS_ERR_OR_NULL(*page) || is_zone_device_page(*page)) {
> + if (IS_ERR_OR_NULL(*page)) {
> ksm_scan.address += PAGE_SIZE;
> cond_resched();
> continue;
> }
> + if (is_zone_device_page(*page))
> + goto next_page;
> if (PageAnon(*page)) {
> flush_anon_page(vma, *page, ksm_scan.address);
> flush_dcache_page(*page);
> @@ -2327,6 +2332,7 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
> mmap_read_unlock(mm);
> return rmap_item;
> }
> +next_page:
> put_page(*page);
> ksm_scan.address += PAGE_SIZE;
> cond_resched();
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 581dfaad9257..fee12cd2f294 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1672,9 +1672,12 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr,
> goto out;
>
> err = -ENOENT;
> - if (!page || is_zone_device_page(page))
> + if (!page)
> goto out;
>
> + if (is_zone_device_page(page))
> + goto out_putpage;
> +
> err = 0;
> if (page_to_nid(page) == node)
> goto out_putpage;
> @@ -1868,8 +1871,9 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
> if (IS_ERR(page))
> goto set_status;
>
> - if (page && !is_zone_device_page(page)) {
> - err = page_to_nid(page);
> + if (page) {
> + err = !is_zone_device_page(page) ? page_to_nid(page)
> + : -ENOENT;

Can we remove the multiple layers of conditionals here? Something like
this is cleaner and easier to understand IMHO:

- if (page && !is_zone_device_page(page)) {
- err = page_to_nid(page);
- if (foll_flags & FOLL_GET)
- put_page(page);
- } else {
+ if (!page) {
err = -ENOENT;
+ goto set_status;
}
+
+ if (is_zone_device_page(page))
+ err = -ENOENT;
+ else
+ err = page_to_nid(page);
+
+ if (foll_flags & FOLL_GET)
+ put_page(page);

Thanks.

- Alistair

> if (foll_flags & FOLL_GET)
> put_page(page);
> } else {

2022-08-16 06:40:55

by Haiyue Wang

[permalink] [raw]
Subject: RE: [PATCH v5 2/2] mm: fix the handling Non-LRU pages returned by follow_page

> -----Original Message-----
> From: Alistair Popple <[email protected]>
> Sent: Tuesday, August 16, 2022 08:01
> To: Wang, Haiyue <[email protected]>
> Cc: [email protected]; [email protected]; [email protected]; [email protected];
> [email protected]; Huang, Ying <[email protected]>; [email protected];
> [email protected]; [email protected]; Felix Kuehling <[email protected]>
> Subject: Re: [PATCH v5 2/2] mm: fix the handling Non-LRU pages returned by follow_page
>
>
> Haiyue Wang <[email protected]> writes:
>
> > The handling of Non-LRU pages returned by follow_page() bails out
> > directly without calling put_page() to drop the reference count, even
> > though the 'FOLL_GET' flag for follow_page() means get_page() was
> > called. Fix the zone device page check by handling the page reference
> > count correctly before returning.
> >
> > And as David reviewed, "device pages are never PageKsm pages". Drop this
> > zone device page check for break_ksm().
> >
> > Fixes: 3218f8712d6b ("mm: handling Non-LRU pages returned by vm_normal_pages")
> > Signed-off-by: Haiyue Wang <[email protected]>
> > ---
> > mm/huge_memory.c | 4 ++--
> > mm/ksm.c | 12 +++++++++---
> > mm/migrate.c | 10 +++++++---
> > 3 files changed, 18 insertions(+), 8 deletions(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 8a7c1b344abe..b2ba17c3dcd7 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -2963,10 +2963,10 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
> > /* FOLL_DUMP to ignore special (like zero) pages */
> > page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
> >
> > - if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
> > + if (IS_ERR_OR_NULL(page))
> > continue;
> >
> > - if (!is_transparent_hugepage(page))
> > + if (is_zone_device_page(page) || !is_transparent_hugepage(page))
> > goto next;
> >
> > total++;
> > diff --git a/mm/ksm.c b/mm/ksm.c
> > index 42ab153335a2..e26f57fc1f0e 100644
> > --- a/mm/ksm.c
> > +++ b/mm/ksm.c
> > @@ -475,7 +475,7 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
> > cond_resched();
> > page = follow_page(vma, addr,
> > FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
> > - if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
> > + if (IS_ERR_OR_NULL(page))
> > break;
> > if (PageKsm(page))
> > ret = handle_mm_fault(vma, addr,
> > @@ -560,12 +560,15 @@ static struct page *get_mergeable_page(struct rmap_item *rmap_item)
> > goto out;
> >
> > page = follow_page(vma, addr, FOLL_GET);
> > - if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
> > + if (IS_ERR_OR_NULL(page))
> > goto out;
> > + if (is_zone_device_page(page))
>
> Same as for break_ksm() I think we should be able to drop the
> is_zone_device_page() check here because scan_get_next_rmap_item()
> already filters out zone device pages.
>

The 'page' for scan_get_next_rmap_item() comes from a 'vma' only after the loop has skipped the ones which are NOT MERGEABLE:
for (; vma; vma = vma->vm_next) {
if (!(vma->vm_flags & VM_MERGEABLE))
continue;

The 'page' for get_mergeable_page() is from a 'vma' which find_mergeable_vma() has confirmed is MERGEABLE.

So they may be different, and unstable_tree_search_insert() shows the logic:

'page' vs 'tree_page':

tree_page = get_mergeable_page(tree_rmap_item);
if (!tree_page)
return NULL;

/*
* Don't substitute a ksm page for a forked page.
*/
if (page == tree_page) {
put_page(tree_page);
return NULL;
}

ret = memcmp_pages(page, tree_page);


> > + goto out_putpage;
> > if (PageAnon(page)) {
> > flush_anon_page(vma, page, addr);
> > flush_dcache_page(page);
> > } else {
> > +out_putpage:
> > put_page(page);
> > out:
> > page = NULL;
> > @@ -2308,11 +2311,13 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
> > if (ksm_test_exit(mm))
> > break;
> > *page = follow_page(vma, ksm_scan.address, FOLL_GET);
> > - if (IS_ERR_OR_NULL(*page) || is_zone_device_page(*page)) {
> > + if (IS_ERR_OR_NULL(*page)) {
> > ksm_scan.address += PAGE_SIZE;
> > cond_resched();
> > continue;
> > }
> > + if (is_zone_device_page(*page))
> > + goto next_page;
> > if (PageAnon(*page)) {
> > flush_anon_page(vma, *page, ksm_scan.address);
> > flush_dcache_page(*page);
> > @@ -2327,6 +2332,7 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
> > mmap_read_unlock(mm);
> > return rmap_item;
> > }
> > +next_page:
> > put_page(*page);
> > ksm_scan.address += PAGE_SIZE;
> > cond_resched();
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 581dfaad9257..fee12cd2f294 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -1672,9 +1672,12 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr,
> > goto out;
> >
> > err = -ENOENT;
> > - if (!page || is_zone_device_page(page))
> > + if (!page)
> > goto out;
> >
> > + if (is_zone_device_page(page))
> > + goto out_putpage;
> > +
> > err = 0;
> > if (page_to_nid(page) == node)
> > goto out_putpage;
> > @@ -1868,8 +1871,9 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
> > if (IS_ERR(page))
> > goto set_status;
> >
> > - if (page && !is_zone_device_page(page)) {
> > - err = page_to_nid(page);
> > + if (page) {
> > + err = !is_zone_device_page(page) ? page_to_nid(page)
> > + : -ENOENT;
>
> Can we remove the multiple layers of conditionals here? Something like
> this is cleaner and easier to understand IMHO:

OK, I will try it in new patch.

>
> - if (page && !is_zone_device_page(page)) {
> - err = page_to_nid(page);
> - if (foll_flags & FOLL_GET)
> - put_page(page);
> - } else {
> + if (!page) {
> err = -ENOENT;
> + goto set_status;
> }
> +
> + if (is_zone_device_page(page))
> + err = -ENOENT;
> + else
>> + err = page_to_nid(page);
> +
> + if (foll_flags & FOLL_GET)
> + put_page(page);
>
> Thanks.
>
> - Alistair
>
> > if (foll_flags & FOLL_GET)
> > put_page(page);
> > } else {

2022-08-16 07:54:28

by Alistair Popple

[permalink] [raw]
Subject: Re: [PATCH v5 2/2] mm: fix the handling Non-LRU pages returned by follow_page


"Wang, Haiyue" <[email protected]> writes:

>> -----Original Message-----
>> From: Alistair Popple <[email protected]>
>> Sent: Tuesday, August 16, 2022 08:01
>> To: Wang, Haiyue <[email protected]>
>> Cc: [email protected]; [email protected]; [email protected]; [email protected];
>> [email protected]; Huang, Ying <[email protected]>; [email protected];
>> [email protected]; [email protected]; Felix Kuehling <[email protected]>
>> Subject: Re: [PATCH v5 2/2] mm: fix the handling Non-LRU pages returned by follow_page
>>
>>
>> Haiyue Wang <[email protected]> writes:
>>
>> > The handling of Non-LRU pages returned by follow_page() bails out
>> > directly without calling put_page() to drop the reference count, even
>> > though the 'FOLL_GET' flag for follow_page() means get_page() was
>> > called. Fix the zone device page check by handling the page reference
>> > count correctly before returning.
>> >
>> > And as David reviewed, "device pages are never PageKsm pages". Drop this
>> > zone device page check for break_ksm().
>> >
>> > Fixes: 3218f8712d6b ("mm: handling Non-LRU pages returned by vm_normal_pages")
>> > Signed-off-by: Haiyue Wang <[email protected]>
>> > ---
>> > mm/huge_memory.c | 4 ++--
>> > mm/ksm.c | 12 +++++++++---
>> > mm/migrate.c | 10 +++++++---
>> > 3 files changed, 18 insertions(+), 8 deletions(-)
>> >
>> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> > index 8a7c1b344abe..b2ba17c3dcd7 100644
>> > --- a/mm/huge_memory.c
>> > +++ b/mm/huge_memory.c
>> > @@ -2963,10 +2963,10 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
>> > /* FOLL_DUMP to ignore special (like zero) pages */
>> > page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
>> >
>> > - if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
>> > + if (IS_ERR_OR_NULL(page))
>> > continue;
>> >
>> > - if (!is_transparent_hugepage(page))
>> > + if (is_zone_device_page(page) || !is_transparent_hugepage(page))
>> > goto next;
>> >
>> > total++;
>> > diff --git a/mm/ksm.c b/mm/ksm.c
>> > index 42ab153335a2..e26f57fc1f0e 100644
>> > --- a/mm/ksm.c
>> > +++ b/mm/ksm.c
>> > @@ -475,7 +475,7 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
>> > cond_resched();
>> > page = follow_page(vma, addr,
>> > FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
>> > - if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
>> > + if (IS_ERR_OR_NULL(page))
>> > break;
>> > if (PageKsm(page))
>> > ret = handle_mm_fault(vma, addr,
>> > @@ -560,12 +560,15 @@ static struct page *get_mergeable_page(struct rmap_item *rmap_item)
>> > goto out;
>> >
>> > page = follow_page(vma, addr, FOLL_GET);
>> > - if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
>> > + if (IS_ERR_OR_NULL(page))
>> > goto out;
>> > + if (is_zone_device_page(page))
>>
>> Same as for break_ksm() I think we should be able to drop the
>> is_zone_device_page() check here because scan_get_next_rmap_item()
>> already filters out zone device pages.
>>
>
> The 'page' for scan_get_next_rmap_item() comes from a 'vma' only after the loop has skipped the ones which are NOT MERGEABLE:
> for (; vma; vma = vma->vm_next) {
> if (!(vma->vm_flags & VM_MERGEABLE))
> continue;
>
> The 'page' for get_mergeable_page() is from a 'vma' which find_mergeable_vma() has confirmed is MERGEABLE.

Oh, ok. I'm actually not too familiar with KSM, but I think I follow, so
if you think we need to keep the check, by all means do so.

> So they may be different, and unstable_tree_search_insert() shows the logic:
>
> 'page' vs 'tree_page':
>
> tree_page = get_mergeable_page(tree_rmap_item);
> if (!tree_page)
> return NULL;
>
> /*
> * Don't substitute a ksm page for a forked page.
> */
> if (page == tree_page) {
> put_page(tree_page);
> return NULL;
> }
>
> ret = memcmp_pages(page, tree_page);
>
>
>> > + goto out_putpage;
>> > if (PageAnon(page)) {
>> > flush_anon_page(vma, page, addr);
>> > flush_dcache_page(page);
>> > } else {
>> > +out_putpage:
>> > put_page(page);
>> > out:
>> > page = NULL;
>> > @@ -2308,11 +2311,13 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
>> > if (ksm_test_exit(mm))
>> > break;
>> > *page = follow_page(vma, ksm_scan.address, FOLL_GET);
>> > - if (IS_ERR_OR_NULL(*page) || is_zone_device_page(*page)) {
>> > + if (IS_ERR_OR_NULL(*page)) {
>> > ksm_scan.address += PAGE_SIZE;
>> > cond_resched();
>> > continue;
>> > }
>> > + if (is_zone_device_page(*page))
>> > + goto next_page;
>> > if (PageAnon(*page)) {
>> > flush_anon_page(vma, *page, ksm_scan.address);
>> > flush_dcache_page(*page);
>> > @@ -2327,6 +2332,7 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
>> > mmap_read_unlock(mm);
>> > return rmap_item;
>> > }
>> > +next_page:
>> > put_page(*page);
>> > ksm_scan.address += PAGE_SIZE;
>> > cond_resched();
>> > diff --git a/mm/migrate.c b/mm/migrate.c
>> > index 581dfaad9257..fee12cd2f294 100644
>> > --- a/mm/migrate.c
>> > +++ b/mm/migrate.c
>> > @@ -1672,9 +1672,12 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr,
>> > goto out;
>> >
>> > err = -ENOENT;
>> > - if (!page || is_zone_device_page(page))
>> > + if (!page)
>> > goto out;
>> >
>> > + if (is_zone_device_page(page))
>> > + goto out_putpage;
>> > +
>> > err = 0;
>> > if (page_to_nid(page) == node)
>> > goto out_putpage;
>> > @@ -1868,8 +1871,9 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
>> > if (IS_ERR(page))
>> > goto set_status;
>> >
>> > - if (page && !is_zone_device_page(page)) {
>> > - err = page_to_nid(page);
>> > + if (page) {
>> > + err = !is_zone_device_page(page) ? page_to_nid(page)
>> > + : -ENOENT;
>>
>> Can we remove the multiple layers of conditionals here? Something like
>> this is cleaner and easier to understand IMHO:
>
> OK, I will try it in new patch.

Thanks.

>>
>> - if (page && !is_zone_device_page(page)) {
>> - err = page_to_nid(page);
>> - if (foll_flags & FOLL_GET)
>> - put_page(page);
>> - } else {
>> + if (!page) {
>> err = -ENOENT;
>> + goto set_status;
>> }
>> +
>> + if (is_zone_device_page(page))
>> + err = -ENOENT;
>> + else
>> + err = page_to_nid(page);
>> +
>> + if (foll_flags & FOLL_GET)
>> + put_page(page);
>>
>> Thanks.
>>
>> - Alistair
>>
>> > if (foll_flags & FOLL_GET)
>> > put_page(page);
>> > } else {

2022-08-16 07:57:59

by Haiyue Wang

[permalink] [raw]
Subject: [PATCH v6 0/2] fix follow_page related issues

v6: Simplify the multiple layers of conditionals for the if {} block

- if (page) {
- err = !is_zone_device_page(page) ? page_to_nid(page)
- : -ENOENT;
- if (foll_flags & FOLL_GET)
- put_page(page);
- } else {
- err = -ENOENT;
- }
+ err = -ENOENT;
+ if (!page)
+ goto set_status;
+
+ if (!is_zone_device_page(page))
+ err = page_to_nid(page);
+
+ if (foll_flags & FOLL_GET)
+ put_page(page);

v5: reword the commit message for FOLL_GET with more information.

v4: add '()' to the function names for readability.
    add more words about the Non-LRU pages fix in the commit message.

v3: Merge the fix for handling Non-LRU pages into one patch.
    Drop the break_ksm() zone device page check.

v2: Add the Non-LRU pages fix with two patches, so that
    'mm: migration: fix the FOLL_GET' can be applied directly
    on the linux-5.19 stable branch.

Haiyue Wang (2):
mm: migration: fix the FOLL_GET failure on following huge page
mm: fix the handling Non-LRU pages returned by follow_page

mm/huge_memory.c | 4 ++--
mm/ksm.c | 12 +++++++++---
mm/migrate.c | 23 +++++++++++++++++------
3 files changed, 28 insertions(+), 11 deletions(-)

--
2.37.2
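
Taken together, the two patches leave the do_pages_stat_array() loop body
looking roughly as sketched below (a consolidation of the hunks quoted in
this thread; the surrounding loop, the vma lookup, and the status
write-back are elided):

	unsigned int foll_flags = FOLL_DUMP;

	/* Not all huge page follow APIs support 'FOLL_GET' */
	if (!is_vm_hugetlb_page(vma))
		foll_flags |= FOLL_GET;

	/* FOLL_DUMP to ignore special (like zero) pages */
	page = follow_page(vma, addr, foll_flags);

	err = PTR_ERR(page);
	if (IS_ERR(page))
		goto set_status;

	err = -ENOENT;
	if (!page)
		goto set_status;

	if (!is_zone_device_page(page))
		err = page_to_nid(page);

	if (foll_flags & FOLL_GET)
		put_page(page);
set_status:
	/* err now holds the node id, or a negative errno */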