2022-05-30 15:19:25

by Anshuman Khandual

Subject: [RFC] mm/page_isolation: Fix an infinite loop in isolate_single_pageblock()

HugeTLB allocation (32MB pages with a 4K base page size) via sysfs on an
arm64 platform gets stuck in an infinite loop in isolate_single_pageblock().
head_pfn evaluates to the same value on every iteration, so pfn does too,
and the outer loop never exits. Dropping the relevant code block, which
seems redundant, makes the problem go away.

Cc: Andrew Morton <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: [email protected]
Cc: [email protected]
Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock granularity")
Signed-off-by: Anshuman Khandual <[email protected]>
---
I am not sure about this fix, and did not find much time today to debug
any further. There have been many code changes around this function
recently. The problem is present on the latest mainline kernel.

- Anshuman

mm/page_isolation.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 6021f8444b5a..b0922fee75c1 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -389,10 +389,6 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
 			struct page *head = compound_head(page);
 			unsigned long head_pfn = page_to_pfn(head);
 
-			if (head_pfn + nr_pages <= boundary_pfn) {
-				pfn = head_pfn + nr_pages;
-				continue;
-			}
 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
 			/*
 			 * hugetlb, lru compound (THP), and movable compound pages
--
2.20.1



2022-05-30 18:29:34

by Zi Yan

Subject: Re: [RFC] mm/page_isolation: Fix an infinite loop in isolate_single_pageblock()

On 30 May 2022, at 7:50, Anshuman Khandual wrote:

> HugeTLB allocation (32MB pages with a 4K base page size) via sysfs on an
> arm64 platform gets stuck in an infinite loop in isolate_single_pageblock().
> head_pfn evaluates to the same value on every iteration, so pfn does too,
> and the outer loop never exits. Dropping the relevant code block, which
> seems redundant, makes the problem go away.

Thanks for the report.

>
> Cc: Andrew Morton <[email protected]>
> Cc: Zi Yan <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock granularity")
> Signed-off-by: Anshuman Khandual <[email protected]>
> ---
> I am not sure about this fix, and did not find much time today to debug
> any further. There have been many code changes around this function
> recently. The problem is present on the latest mainline kernel.
>
> - Anshuman
>
> mm/page_isolation.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 6021f8444b5a..b0922fee75c1 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -389,10 +389,6 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>  			struct page *head = compound_head(page);
>  			unsigned long head_pfn = page_to_pfn(head);
>
> -			if (head_pfn + nr_pages <= boundary_pfn) {
> -				pfn = head_pfn + nr_pages;
> -				continue;
> -			}
>  #if defined CONFIG_COMPACTION || defined CONFIG_CMA
>  			/*
>  			 * hugetlb, lru compound (THP), and movable compound pages
> --
> 2.20.1

Can you try the patch below to see if it fixes the issue? Thanks.

diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 6021f8444b5a..d200d41ad0d3 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -385,9 +385,9 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
 		 * above do the rest. If migration is not possible, just fail.
 		 */
 		if (PageCompound(page)) {
-			unsigned long nr_pages = compound_nr(page);
 			struct page *head = compound_head(page);
 			unsigned long head_pfn = page_to_pfn(head);
+			unsigned long nr_pages = compound_nr(head);
 
 			if (head_pfn + nr_pages <= boundary_pfn) {
 				pfn = head_pfn + nr_pages;


--
Best Regards,
Yan, Zi

