2020-10-29 20:09:21

by Zi Yan

[permalink] [raw]
Subject: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

From: Zi Yan <[email protected]>

In isolate_migratepages_block, when cc->alloc_contig is true, we are
able to isolate compound pages, nr_migratepages and nr_isolated did not
count compound pages correctly, causing us to isolate more pages than we
thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped
in too_many_isolated while loop, since the actual isolated pages can go
up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,
since we stop isolation after cc->nr_migratepages reaches to
COMPACT_CLUSTER_MAX.

In addition, after we fix the issue above, cc->nr_migratepages could
never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated,
thus page isolation could not stop as we intended. Change the isolation
stop condition to >=.

Signed-off-by: Zi Yan <[email protected]>
---
mm/compaction.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index ee1f8439369e..0683a4999581 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,

isolate_success:
list_add(&page->lru, &cc->migratepages);
- cc->nr_migratepages++;
- nr_isolated++;
+ cc->nr_migratepages += thp_nr_pages(page);
+ nr_isolated += thp_nr_pages(page);

/*
* Avoid isolating too much unless this block is being
@@ -1021,7 +1021,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
* or a lock is contended. For contention, isolate quickly to
* potentially remove one source of contention.
*/
- if (cc->nr_migratepages == COMPACT_CLUSTER_MAX &&
+ if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX &&
!cc->rescan && !cc->contended) {
++low_pfn;
break;
@@ -1132,7 +1132,7 @@ isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn,
if (!pfn)
break;

- if (cc->nr_migratepages == COMPACT_CLUSTER_MAX)
+ if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX)
break;
}

--
2.28.0


2020-10-29 21:18:16

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

On Thu, Oct 29, 2020 at 1:04 PM Zi Yan <[email protected]> wrote:
>
> From: Zi Yan <[email protected]>
>
> In isolate_migratepages_block, when cc->alloc_contig is true, we are
> able to isolate compound pages, nr_migratepages and nr_isolated did not
> count compound pages correctly, causing us to isolate more pages than we
> thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped
> in too_many_isolated while loop, since the actual isolated pages can go
> up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,

Is it that easy to run into? 16384 doesn't seem like too many pages, just 64MB.

> since we stop isolation after cc->nr_migratepages reaches to
> COMPACT_CLUSTER_MAX.
>
> In addition, after we fix the issue above, cc->nr_migratepages could
> never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated,
> thus page isolation could not stop as we intended. Change the isolation
> stop condition to >=.

The fix looks sane to me. Reviewed-by: Yang Shi <[email protected]>

Shall you add Fixes tag to commit
1da2f328fa643bd72197dfed0c655148af31e4eb? And may cc stable.

>
> Signed-off-by: Zi Yan <[email protected]>
> ---
> mm/compaction.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index ee1f8439369e..0683a4999581 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>
> isolate_success:
> list_add(&page->lru, &cc->migratepages);
> - cc->nr_migratepages++;
> - nr_isolated++;
> + cc->nr_migratepages += thp_nr_pages(page);
> + nr_isolated += thp_nr_pages(page);
>
> /*
> * Avoid isolating too much unless this block is being
> @@ -1021,7 +1021,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> * or a lock is contended. For contention, isolate quickly to
> * potentially remove one source of contention.
> */
> - if (cc->nr_migratepages == COMPACT_CLUSTER_MAX &&
> + if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX &&
> !cc->rescan && !cc->contended) {
> ++low_pfn;
> break;
> @@ -1132,7 +1132,7 @@ isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn,
> if (!pfn)
> break;
>
> - if (cc->nr_migratepages == COMPACT_CLUSTER_MAX)
> + if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX)
> break;
> }
>
> --
> 2.28.0
>
>

2020-10-29 21:41:50

by Zi Yan

[permalink] [raw]
Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

On 29 Oct 2020, at 17:14, Yang Shi wrote:

> On Thu, Oct 29, 2020 at 1:04 PM Zi Yan <[email protected]> wrote:
>>
>> From: Zi Yan <[email protected]>
>>
>> In isolate_migratepages_block, when cc->alloc_contig is true, we are
>> able to isolate compound pages, nr_migratepages and nr_isolated did not
>> count compound pages correctly, causing us to isolate more pages than we
>> thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped
>> in too_many_isolated while loop, since the actual isolated pages can go
>> up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,
>
> Is it that easy to run into? 16384 doesn't seem like too many pages, just 64MB.

I hit this when I was running oom01 from ltp to test my PUD THP patchset, which
allocates PUD THPs from CMA regions and splits them into PMD THPs due to memory
pressure. I am not sure if it is common that in the upstream kernel PMD THPs will
be allocated in CMA regions due to allocation fallback.

>
>> since we stop isolation after cc->nr_migratepages reaches to
>> COMPACT_CLUSTER_MAX.
>>
>> In addition, after we fix the issue above, cc->nr_migratepages could
>> never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated,
>> thus page isolation could not stop as we intended. Change the isolation
>> stop condition to >=.
>
> The fix looks sane to me. Reviewed-by: Yang Shi <[email protected]>

Thanks.

>
> Shall you add Fixes tag to commit
> 1da2f328fa643bd72197dfed0c655148af31e4eb? And may cc stable.

Sure.

Fixes: 1da2f328fa64 (“mm,thp,compaction,cma: allow THP migration for CMA allocations”)

stable cc’ed.

>
>>
>> Signed-off-by: Zi Yan <[email protected]>
>> ---
>> mm/compaction.c | 8 ++++----
>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index ee1f8439369e..0683a4999581 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>
>> isolate_success:
>> list_add(&page->lru, &cc->migratepages);
>> - cc->nr_migratepages++;
>> - nr_isolated++;
>> + cc->nr_migratepages += thp_nr_pages(page);
>> + nr_isolated += thp_nr_pages(page);
>>
>> /*
>> * Avoid isolating too much unless this block is being
>> @@ -1021,7 +1021,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>> * or a lock is contended. For contention, isolate quickly to
>> * potentially remove one source of contention.
>> */
>> - if (cc->nr_migratepages == COMPACT_CLUSTER_MAX &&
>> + if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX &&
>> !cc->rescan && !cc->contended) {
>> ++low_pfn;
>> break;
>> @@ -1132,7 +1132,7 @@ isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn,
>> if (!pfn)
>> break;
>>
>> - if (cc->nr_migratepages == COMPACT_CLUSTER_MAX)
>> + if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX)
>> break;
>> }
>>
>> --
>> 2.28.0
>>
>>



Best Regards,
Yan Zi


Attachments:
signature.asc (871.00 B)
OpenPGP digital signature

2020-10-30 00:30:04

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

On Thu, 29 Oct 2020 17:31:28 -0400 Zi Yan <[email protected]> wrote:

> >
> > Shall you add Fixes tag to commit
> > 1da2f328fa643bd72197dfed0c655148af31e4eb? And may cc stable.
>
> Sure.
>
> Fixes: 1da2f328fa64 (“mm,thp,compaction,cma: allow THP migration for CMA allocations”)
>
> stable cc'ed.

A think a cc:stable really requires a description of the end-user
visible effects of the bug. Could you please provide that?

2020-10-30 01:25:20

by Zi Yan

[permalink] [raw]
Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

On 29 Oct 2020, at 20:28, Andrew Morton wrote:

> On Thu, 29 Oct 2020 17:31:28 -0400 Zi Yan <[email protected]> wrote:
>
>>>
>>> Shall you add Fixes tag to commit
>>> 1da2f328fa643bd72197dfed0c655148af31e4eb? And may cc stable.
>>
>> Sure.
>>
>> Fixes: 1da2f328fa64 (“mm,thp,compaction,cma: allow THP migration for CMA allocations”)
>>
>> stable cc'ed.
>
> A think a cc:stable really requires a description of the end-user
> visible effects of the bug. Could you please provide that?

Sure.

For example, in a system with 16GB memory and an 8GB CMA region reserved by hugetlb_cma,
if we first allocate 10GB THPs and mlock them (so some THPs are allocated in the CMA
region and mlocked), reserving 6 1GB hugetlb pages via
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages will get stuck (looping
in too_many_isolated function) until we kill either task. With the patch applied,
oom will kill the application with 10GB THPs and let hugetlb page reservation finish.


Best Regards,
Yan Zi


Attachments:
signature.asc (871.00 B)
OpenPGP digital signature

2020-10-30 09:45:35

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

[Cc Vlastimil]

On Thu 29-10-20 16:04:35, Zi Yan wrote:
> From: Zi Yan <[email protected]>
>
> In isolate_migratepages_block, when cc->alloc_contig is true, we are
> able to isolate compound pages, nr_migratepages and nr_isolated did not
> count compound pages correctly, causing us to isolate more pages than we
> thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped
> in too_many_isolated while loop, since the actual isolated pages can go
> up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,
> since we stop isolation after cc->nr_migratepages reaches to
> COMPACT_CLUSTER_MAX.
>
> In addition, after we fix the issue above, cc->nr_migratepages could
> never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated,
> thus page isolation could not stop as we intended. Change the isolation
> stop condition to >=.
>
> Signed-off-by: Zi Yan <[email protected]>
> ---
> mm/compaction.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index ee1f8439369e..0683a4999581 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>
> isolate_success:
> list_add(&page->lru, &cc->migratepages);
> - cc->nr_migratepages++;
> - nr_isolated++;
> + cc->nr_migratepages += thp_nr_pages(page);
> + nr_isolated += thp_nr_pages(page);

Does thp_nr_pages work for __PageMovable pages?

--
Michal Hocko
SUSE Labs

2020-10-30 12:23:57

by Zi Yan

[permalink] [raw]
Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

On 30 Oct 2020, at 5:43, Michal Hocko wrote:

> [Cc Vlastimil]
>
> On Thu 29-10-20 16:04:35, Zi Yan wrote:
>> From: Zi Yan <[email protected]>
>>
>> In isolate_migratepages_block, when cc->alloc_contig is true, we are
>> able to isolate compound pages, nr_migratepages and nr_isolated did not
>> count compound pages correctly, causing us to isolate more pages than we
>> thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped
>> in too_many_isolated while loop, since the actual isolated pages can go
>> up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,
>> since we stop isolation after cc->nr_migratepages reaches to
>> COMPACT_CLUSTER_MAX.
>>
>> In addition, after we fix the issue above, cc->nr_migratepages could
>> never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated,
>> thus page isolation could not stop as we intended. Change the isolation
>> stop condition to >=.
>>
>> Signed-off-by: Zi Yan <[email protected]>
>> ---
>> mm/compaction.c | 8 ++++----
>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index ee1f8439369e..0683a4999581 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>
>> isolate_success:
>> list_add(&page->lru, &cc->migratepages);
>> - cc->nr_migratepages++;
>> - nr_isolated++;
>> + cc->nr_migratepages += thp_nr_pages(page);
>> + nr_isolated += thp_nr_pages(page);
>
> Does thp_nr_pages work for __PageMovable pages?

Yes. It is the same as compound_nr() but compiled
to 1 when THP is not enabled.


Best Regards,
Yan Zi


Attachments:
signature.asc (871.00 B)
OpenPGP digital signature

2020-10-30 13:38:38

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

On Fri 30-10-20 08:20:50, Zi Yan wrote:
> On 30 Oct 2020, at 5:43, Michal Hocko wrote:
>
> > [Cc Vlastimil]
> >
> > On Thu 29-10-20 16:04:35, Zi Yan wrote:
> >> From: Zi Yan <[email protected]>
> >>
> >> In isolate_migratepages_block, when cc->alloc_contig is true, we are
> >> able to isolate compound pages, nr_migratepages and nr_isolated did not
> >> count compound pages correctly, causing us to isolate more pages than we
> >> thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped
> >> in too_many_isolated while loop, since the actual isolated pages can go
> >> up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,
> >> since we stop isolation after cc->nr_migratepages reaches to
> >> COMPACT_CLUSTER_MAX.
> >>
> >> In addition, after we fix the issue above, cc->nr_migratepages could
> >> never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated,
> >> thus page isolation could not stop as we intended. Change the isolation
> >> stop condition to >=.
> >>
> >> Signed-off-by: Zi Yan <[email protected]>
> >> ---
> >> mm/compaction.c | 8 ++++----
> >> 1 file changed, 4 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/mm/compaction.c b/mm/compaction.c
> >> index ee1f8439369e..0683a4999581 100644
> >> --- a/mm/compaction.c
> >> +++ b/mm/compaction.c
> >> @@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> >>
> >> isolate_success:
> >> list_add(&page->lru, &cc->migratepages);
> >> - cc->nr_migratepages++;
> >> - nr_isolated++;
> >> + cc->nr_migratepages += thp_nr_pages(page);
> >> + nr_isolated += thp_nr_pages(page);
> >
> > Does thp_nr_pages work for __PageMovable pages?
>
> Yes. It is the same as compound_nr() but compiled
> to 1 when THP is not enabled.

I am sorry but I do not follow. First of all the implementation of the
two is different and also I was asking about __PageMovable which should
never be THP IIRC. Can they be compound though?

--
Michal Hocko
SUSE Labs

2020-10-30 14:39:56

by Zi Yan

[permalink] [raw]
Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

On 30 Oct 2020, at 9:36, Michal Hocko wrote:

> On Fri 30-10-20 08:20:50, Zi Yan wrote:
>> On 30 Oct 2020, at 5:43, Michal Hocko wrote:
>>
>>> [Cc Vlastimil]
>>>
>>> On Thu 29-10-20 16:04:35, Zi Yan wrote:
>>>> From: Zi Yan <[email protected]>
>>>>
>>>> In isolate_migratepages_block, when cc->alloc_contig is true, we are
>>>> able to isolate compound pages, nr_migratepages and nr_isolated did not
>>>> count compound pages correctly, causing us to isolate more pages than we
>>>> thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped
>>>> in too_many_isolated while loop, since the actual isolated pages can go
>>>> up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,
>>>> since we stop isolation after cc->nr_migratepages reaches to
>>>> COMPACT_CLUSTER_MAX.
>>>>
>>>> In addition, after we fix the issue above, cc->nr_migratepages could
>>>> never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated,
>>>> thus page isolation could not stop as we intended. Change the isolation
>>>> stop condition to >=.
>>>>
>>>> Signed-off-by: Zi Yan <[email protected]>
>>>> ---
>>>> mm/compaction.c | 8 ++++----
>>>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>>> index ee1f8439369e..0683a4999581 100644
>>>> --- a/mm/compaction.c
>>>> +++ b/mm/compaction.c
>>>> @@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>>>
>>>> isolate_success:
>>>> list_add(&page->lru, &cc->migratepages);
>>>> - cc->nr_migratepages++;
>>>> - nr_isolated++;
>>>> + cc->nr_migratepages += thp_nr_pages(page);
>>>> + nr_isolated += thp_nr_pages(page);
>>>
>>> Does thp_nr_pages work for __PageMovable pages?
>>
>> Yes. It is the same as compound_nr() but compiled
>> to 1 when THP is not enabled.
>
> I am sorry but I do not follow. First of all the implementation of the
> two is different and also I was asking about __PageMovable which should
> never be THP IIRC. Can they be compound though?

__PageMovable, non-lru movable pages, can be compound and thp_nr_page cannot
be used for it, since when THP is off, thp_nr_page will return the wrong number.
I got confused by its name, sorry.

But __PageMovable is irrelevant to this patch, since we are using
__isolate_lru_page to isolate pages. non-lru __PageMovable should not appear
after isolate_succes. thp_nr_pages can be used here.


Best Regards,
Yan Zi


Attachments:
signature.asc (871.00 B)
OpenPGP digital signature

2020-10-30 14:53:19

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

On Fri 30-10-20 10:35:43, Zi Yan wrote:
> On 30 Oct 2020, at 9:36, Michal Hocko wrote:
>
> > On Fri 30-10-20 08:20:50, Zi Yan wrote:
> >> On 30 Oct 2020, at 5:43, Michal Hocko wrote:
> >>
> >>> [Cc Vlastimil]
> >>>
> >>> On Thu 29-10-20 16:04:35, Zi Yan wrote:
> >>>> From: Zi Yan <[email protected]>
> >>>>
> >>>> In isolate_migratepages_block, when cc->alloc_contig is true, we are
> >>>> able to isolate compound pages, nr_migratepages and nr_isolated did not
> >>>> count compound pages correctly, causing us to isolate more pages than we
> >>>> thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped
> >>>> in too_many_isolated while loop, since the actual isolated pages can go
> >>>> up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,
> >>>> since we stop isolation after cc->nr_migratepages reaches to
> >>>> COMPACT_CLUSTER_MAX.
> >>>>
> >>>> In addition, after we fix the issue above, cc->nr_migratepages could
> >>>> never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated,
> >>>> thus page isolation could not stop as we intended. Change the isolation
> >>>> stop condition to >=.
> >>>>
> >>>> Signed-off-by: Zi Yan <[email protected]>
> >>>> ---
> >>>> mm/compaction.c | 8 ++++----
> >>>> 1 file changed, 4 insertions(+), 4 deletions(-)
> >>>>
> >>>> diff --git a/mm/compaction.c b/mm/compaction.c
> >>>> index ee1f8439369e..0683a4999581 100644
> >>>> --- a/mm/compaction.c
> >>>> +++ b/mm/compaction.c
> >>>> @@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> >>>>
> >>>> isolate_success:
> >>>> list_add(&page->lru, &cc->migratepages);
> >>>> - cc->nr_migratepages++;
> >>>> - nr_isolated++;
> >>>> + cc->nr_migratepages += thp_nr_pages(page);
> >>>> + nr_isolated += thp_nr_pages(page);
> >>>
> >>> Does thp_nr_pages work for __PageMovable pages?
> >>
> >> Yes. It is the same as compound_nr() but compiled
> >> to 1 when THP is not enabled.
> >
> > I am sorry but I do not follow. First of all the implementation of the
> > two is different and also I was asking about __PageMovable which should
> > never be THP IIRC. Can they be compound though?
>
> __PageMovable, non-lru movable pages, can be compound and thp_nr_page cannot
> be used for it, since when THP is off, thp_nr_page will return the wrong number.
> I got confused by its name, sorry.

OK, this matches my understanding. Good we are on the same page.

> But __PageMovable is irrelevant to this patch, since we are using
> __isolate_lru_page to isolate pages. non-lru __PageMovable should not appear
> after isolate_succes. thp_nr_pages can be used here.

But this is still not clear to me. __PageMovable pages are isolated by
isolate_movable_page and then jump to isolate_succes. Does that somehow
changes the nature of the page being compound or tat thp_nr_page would
start working on those pages.
--
Michal Hocko
SUSE Labs

2020-10-30 14:54:00

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

On 10/29/20 9:04 PM, Zi Yan wrote:
> From: Zi Yan <[email protected]>
>
> In isolate_migratepages_block, when cc->alloc_contig is true, we are
> able to isolate compound pages, nr_migratepages and nr_isolated did not
> count compound pages correctly, causing us to isolate more pages than we
> thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped
> in too_many_isolated while loop, since the actual isolated pages can go
> up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,
> since we stop isolation after cc->nr_migratepages reaches to
> COMPACT_CLUSTER_MAX.

I wonder if a better fix would be to adjust the too_many_isolated() check so
that if we have non-zero cc->nr_migratepages, we bail out from further isolation
and migrate what we have immediately, instead of looping.

Because I can also imagine a hypothetical situation where multiple threads in
parallel cause too_many_isolated() to be true, and will all loop there forever.
The proposed fix should prevent such situation as well, AFAICT.

> In addition, after we fix the issue above, cc->nr_migratepages could
> never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated,
> thus page isolation could not stop as we intended. Change the isolation
> stop condition to >=.
>
> Signed-off-by: Zi Yan <[email protected]>
> ---
> mm/compaction.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index ee1f8439369e..0683a4999581 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>
> isolate_success:
> list_add(&page->lru, &cc->migratepages);
> - cc->nr_migratepages++;
> - nr_isolated++;
> + cc->nr_migratepages += thp_nr_pages(page);
> + nr_isolated += thp_nr_pages(page);
>
> /*
> * Avoid isolating too much unless this block is being
> @@ -1021,7 +1021,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> * or a lock is contended. For contention, isolate quickly to
> * potentially remove one source of contention.
> */
> - if (cc->nr_migratepages == COMPACT_CLUSTER_MAX &&
> + if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX &&
> !cc->rescan && !cc->contended) {
> ++low_pfn;
> break;
> @@ -1132,7 +1132,7 @@ isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn,
> if (!pfn)
> break;
>
> - if (cc->nr_migratepages == COMPACT_CLUSTER_MAX)
> + if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX)
> break;
> }
>
>

2020-10-30 14:55:47

by Zi Yan

[permalink] [raw]
Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

On 30 Oct 2020, at 10:49, Michal Hocko wrote:

> On Fri 30-10-20 10:35:43, Zi Yan wrote:
>> On 30 Oct 2020, at 9:36, Michal Hocko wrote:
>>
>>> On Fri 30-10-20 08:20:50, Zi Yan wrote:
>>>> On 30 Oct 2020, at 5:43, Michal Hocko wrote:
>>>>
>>>>> [Cc Vlastimil]
>>>>>
>>>>> On Thu 29-10-20 16:04:35, Zi Yan wrote:
>>>>>> From: Zi Yan <[email protected]>
>>>>>>
>>>>>> In isolate_migratepages_block, when cc->alloc_contig is true, we are
>>>>>> able to isolate compound pages, nr_migratepages and nr_isolated did not
>>>>>> count compound pages correctly, causing us to isolate more pages than we
>>>>>> thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped
>>>>>> in too_many_isolated while loop, since the actual isolated pages can go
>>>>>> up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,
>>>>>> since we stop isolation after cc->nr_migratepages reaches to
>>>>>> COMPACT_CLUSTER_MAX.
>>>>>>
>>>>>> In addition, after we fix the issue above, cc->nr_migratepages could
>>>>>> never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated,
>>>>>> thus page isolation could not stop as we intended. Change the isolation
>>>>>> stop condition to >=.
>>>>>>
>>>>>> Signed-off-by: Zi Yan <[email protected]>
>>>>>> ---
>>>>>> mm/compaction.c | 8 ++++----
>>>>>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>>>>> index ee1f8439369e..0683a4999581 100644
>>>>>> --- a/mm/compaction.c
>>>>>> +++ b/mm/compaction.c
>>>>>> @@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>>>>>
>>>>>> isolate_success:
>>>>>> list_add(&page->lru, &cc->migratepages);
>>>>>> - cc->nr_migratepages++;
>>>>>> - nr_isolated++;
>>>>>> + cc->nr_migratepages += thp_nr_pages(page);
>>>>>> + nr_isolated += thp_nr_pages(page);
>>>>>
>>>>> Does thp_nr_pages work for __PageMovable pages?
>>>>
>>>> Yes. It is the same as compound_nr() but compiled
>>>> to 1 when THP is not enabled.
>>>
>>> I am sorry but I do not follow. First of all the implementation of the
>>> two is different and also I was asking about __PageMovable which should
>>> never be THP IIRC. Can they be compound though?
>>
>> __PageMovable, non-lru movable pages, can be compound and thp_nr_page cannot
>> be used for it, since when THP is off, thp_nr_page will return the wrong number.
>> I got confused by its name, sorry.
>
> OK, this matches my understanding. Good we are on the same page.
>
>> But __PageMovable is irrelevant to this patch, since we are using
>> __isolate_lru_page to isolate pages. non-lru __PageMovable should not appear
>> after isolate_succes. thp_nr_pages can be used here.
>
> But this is still not clear to me. __PageMovable pages are isolated by
> isolate_movable_page and then jump to isolate_succes. Does that somehow
> changes the nature of the page being compound or tat thp_nr_page would
> start working on those pages.

Ah, I missed that part of the code. If __PageMovable can reach the code, we
should use compound_nr instead. Will send v2 to fix it. Thanks.


Best Regards,
Yan Zi


Attachments:
signature.asc (871.00 B)
OpenPGP digital signature

2020-10-30 14:57:48

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

On 10/30/20 3:49 PM, Michal Hocko wrote:
> On Fri 30-10-20 10:35:43, Zi Yan wrote:
>> On 30 Oct 2020, at 9:36, Michal Hocko wrote:
>>
>> > On Fri 30-10-20 08:20:50, Zi Yan wrote:
>> >> On 30 Oct 2020, at 5:43, Michal Hocko wrote:
>> >>
>> >>> [Cc Vlastimil]
>> >>>
>> >>> On Thu 29-10-20 16:04:35, Zi Yan wrote:
>> >>>
>> >>> Does thp_nr_pages work for __PageMovable pages?
>> >>
>> >> Yes. It is the same as compound_nr() but compiled
>> >> to 1 when THP is not enabled.
>> >
>> > I am sorry but I do not follow. First of all the implementation of the
>> > two is different and also I was asking about __PageMovable which should
>> > never be THP IIRC. Can they be compound though?
>>
>> __PageMovable, non-lru movable pages, can be compound and thp_nr_page cannot
>> be used for it, since when THP is off, thp_nr_page will return the wrong number.
>> I got confused by its name, sorry.
>
> OK, this matches my understanding. Good we are on the same page.
>
>> But __PageMovable is irrelevant to this patch, since we are using
>> __isolate_lru_page to isolate pages. non-lru __PageMovable should not appear
>> after isolate_succes. thp_nr_pages can be used here.
>
> But this is still not clear to me. __PageMovable pages are isolated by
> isolate_movable_page and then jump to isolate_succes. Does that somehow
> changes the nature of the page being compound or tat thp_nr_page would
> start working on those pages.

Agreed that page movable can appear after isolate_success. compound_nr() should
work for both.
Note that too_many_isolated() doesn't see __PageMovable isolated pages, as they
are not counted as NR_ISOLATED_FILE/NR_ISOLATED_ANON, AFAIK. So in that sense
they are irrelevant to the bug at hand... for now.

2020-10-30 15:20:09

by Zi Yan

[permalink] [raw]
Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

On 30 Oct 2020, at 10:50, Vlastimil Babka wrote:

> On 10/29/20 9:04 PM, Zi Yan wrote:
>> From: Zi Yan <[email protected]>
>>
>> In isolate_migratepages_block, when cc->alloc_contig is true, we are
>> able to isolate compound pages, nr_migratepages and nr_isolated did not
>> count compound pages correctly, causing us to isolate more pages than we
>> thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped
>> in too_many_isolated while loop, since the actual isolated pages can go
>> up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,
>> since we stop isolation after cc->nr_migratepages reaches to
>> COMPACT_CLUSTER_MAX.
>
> I wonder if a better fix would be to adjust the too_many_isolated() check so that if we have non-zero cc->nr_migratepages, we bail out from further isolation and migrate what we have immediately, instead of looping.

I just tested your fix and it works too. The difference is that with
your fix alloc_contig_range will fail quickly without killing the user
application mlocking THPs in the CMA region (for more context, please
see my other email to Andrew explaining how to reproduce in userspace),
whereas my fix will oom the user application and make alloc_contig_range
successful at the end.

Anyway, I will add your fix below and send v2:

diff --git a/mm/compaction.c b/mm/compaction.c
index ee1f8439369e..8fa11637ccfd 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -817,6 +817,9 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
* delay for some time until fewer pages are isolated
*/
while (unlikely(too_many_isolated(pgdat))) {
+ /* stop isolation if there are still pages not migrated */
+ if (cc->nr_migratepages)
+ return 0;
/* async migration should just abort */
if (cc->mode == MIGRATE_ASYNC)
return 0;

>
> Because I can also imagine a hypothetical situation where multiple threads in parallel cause too_many_isolated() to be true, and will all loop there forever. The proposed fix should prevent such situation as well, AFAICT.

Yes. oom01 from ltp tests the multi-threaded situation and my fix works there too.



Best Regards,
Yan Zi


Attachments:
signature.asc (871.00 B)
OpenPGP digital signature

2020-10-30 18:38:35

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

On Fri, Oct 30, 2020 at 6:36 AM Michal Hocko <[email protected]> wrote:
>
> On Fri 30-10-20 08:20:50, Zi Yan wrote:
> > On 30 Oct 2020, at 5:43, Michal Hocko wrote:
> >
> > > [Cc Vlastimil]
> > >
> > > On Thu 29-10-20 16:04:35, Zi Yan wrote:
> > >> From: Zi Yan <[email protected]>
> > >>
> > >> In isolate_migratepages_block, when cc->alloc_contig is true, we are
> > >> able to isolate compound pages, nr_migratepages and nr_isolated did not
> > >> count compound pages correctly, causing us to isolate more pages than we
> > >> thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped
> > >> in too_many_isolated while loop, since the actual isolated pages can go
> > >> up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,
> > >> since we stop isolation after cc->nr_migratepages reaches to
> > >> COMPACT_CLUSTER_MAX.
> > >>
> > >> In addition, after we fix the issue above, cc->nr_migratepages could
> > >> never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated,
> > >> thus page isolation could not stop as we intended. Change the isolation
> > >> stop condition to >=.
> > >>
> > >> Signed-off-by: Zi Yan <[email protected]>
> > >> ---
> > >> mm/compaction.c | 8 ++++----
> > >> 1 file changed, 4 insertions(+), 4 deletions(-)
> > >>
> > >> diff --git a/mm/compaction.c b/mm/compaction.c
> > >> index ee1f8439369e..0683a4999581 100644
> > >> --- a/mm/compaction.c
> > >> +++ b/mm/compaction.c
> > >> @@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> > >>
> > >> isolate_success:
> > >> list_add(&page->lru, &cc->migratepages);
> > >> - cc->nr_migratepages++;
> > >> - nr_isolated++;
> > >> + cc->nr_migratepages += thp_nr_pages(page);
> > >> + nr_isolated += thp_nr_pages(page);
> > >
> > > Does thp_nr_pages work for __PageMovable pages?
> >
> > Yes. It is the same as compound_nr() but compiled
> > to 1 when THP is not enabled.
>
> I am sorry but I do not follow. First of all the implementation of the
> two is different and also I was asking about __PageMovable which should
> never be THP IIRC. Can they be compound though?

I have the same question, can they be compound? If they can be
compound, PageTransHuge() can't tell from THP and compound movable
page, right?

>
> --
> Michal Hocko
> SUSE Labs
>

2020-10-30 18:41:00

by Zi Yan

[permalink] [raw]
Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

On 30 Oct 2020, at 14:33, Yang Shi wrote:

> On Fri, Oct 30, 2020 at 6:36 AM Michal Hocko <[email protected]> wrote:
>>
>> On Fri 30-10-20 08:20:50, Zi Yan wrote:
>>> On 30 Oct 2020, at 5:43, Michal Hocko wrote:
>>>
>>>> [Cc Vlastimil]
>>>>
>>>> On Thu 29-10-20 16:04:35, Zi Yan wrote:
>>>>> From: Zi Yan <[email protected]>
>>>>>
>>>>> In isolate_migratepages_block, when cc->alloc_contig is true, we are
>>>>> able to isolate compound pages, nr_migratepages and nr_isolated did not
>>>>> count compound pages correctly, causing us to isolate more pages than we
>>>>> thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped
>>>>> in too_many_isolated while loop, since the actual isolated pages can go
>>>>> up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,
>>>>> since we stop isolation after cc->nr_migratepages reaches to
>>>>> COMPACT_CLUSTER_MAX.
>>>>>
>>>>> In addition, after we fix the issue above, cc->nr_migratepages could
>>>>> never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated,
>>>>> thus page isolation could not stop as we intended. Change the isolation
>>>>> stop condition to >=.
>>>>>
>>>>> Signed-off-by: Zi Yan <[email protected]>
>>>>> ---
>>>>> mm/compaction.c | 8 ++++----
>>>>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>>>> index ee1f8439369e..0683a4999581 100644
>>>>> --- a/mm/compaction.c
>>>>> +++ b/mm/compaction.c
>>>>> @@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>>>>
>>>>> isolate_success:
>>>>> list_add(&page->lru, &cc->migratepages);
>>>>> - cc->nr_migratepages++;
>>>>> - nr_isolated++;
>>>>> + cc->nr_migratepages += thp_nr_pages(page);
>>>>> + nr_isolated += thp_nr_pages(page);
>>>>
>>>> Does thp_nr_pages work for __PageMovable pages?
>>>
>>> Yes. It is the same as compound_nr() but compiled
>>> to 1 when THP is not enabled.
>>
>> I am sorry but I do not follow. First of all the implementation of the
>> two is different and also I was asking about __PageMovable which should
>> never be THP IIRC. Can they be compound though?
>
> I have the same question, can they be compound? If they can be
> compound, PageTransHuge() can't tell from THP and compound movable
> page, right?

Right. I have updated the patch and use compound_nr instead.


Best Regards,
Yan Zi


Attachments:
signature.asc (871.00 B)
OpenPGP digital signature

2020-10-30 18:59:41

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

On Fri, Oct 30, 2020 at 11:39 AM Zi Yan <[email protected]> wrote:
>
> On 30 Oct 2020, at 14:33, Yang Shi wrote:
>
> > On Fri, Oct 30, 2020 at 6:36 AM Michal Hocko <[email protected]> wrote:
> >>
> >> On Fri 30-10-20 08:20:50, Zi Yan wrote:
> >>> On 30 Oct 2020, at 5:43, Michal Hocko wrote:
> >>>
> >>>> [Cc Vlastimil]
> >>>>
> >>>> On Thu 29-10-20 16:04:35, Zi Yan wrote:
> >>>>> From: Zi Yan <[email protected]>
> >>>>>
> >>>>> In isolate_migratepages_block, when cc->alloc_contig is true, we are
> >>>>> able to isolate compound pages, nr_migratepages and nr_isolated did not
> >>>>> count compound pages correctly, causing us to isolate more pages than we
> >>>>> thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped
> >>>>> in too_many_isolated while loop, since the actual isolated pages can go
> >>>>> up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,
> >>>>> since we stop isolation after cc->nr_migratepages reaches to
> >>>>> COMPACT_CLUSTER_MAX.
> >>>>>
> >>>>> In addition, after we fix the issue above, cc->nr_migratepages could
> >>>>> never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated,
> >>>>> thus page isolation could not stop as we intended. Change the isolation
> >>>>> stop condition to >=.
> >>>>>
> >>>>> Signed-off-by: Zi Yan <[email protected]>
> >>>>> ---
> >>>>> mm/compaction.c | 8 ++++----
> >>>>> 1 file changed, 4 insertions(+), 4 deletions(-)
> >>>>>
> >>>>> diff --git a/mm/compaction.c b/mm/compaction.c
> >>>>> index ee1f8439369e..0683a4999581 100644
> >>>>> --- a/mm/compaction.c
> >>>>> +++ b/mm/compaction.c
> >>>>> @@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> >>>>>
> >>>>> isolate_success:
> >>>>> list_add(&page->lru, &cc->migratepages);
> >>>>> - cc->nr_migratepages++;
> >>>>> - nr_isolated++;
> >>>>> + cc->nr_migratepages += thp_nr_pages(page);
> >>>>> + nr_isolated += thp_nr_pages(page);
> >>>>
> >>>> Does thp_nr_pages work for __PageMovable pages?
> >>>
> >>> Yes. It is the same as compound_nr() but compiled
> >>> to 1 when THP is not enabled.
> >>
> >> I am sorry but I do not follow. First of all the implementation of the
> >> two is different and also I was asking about __PageMovable which should
> >> never be THP IIRC. Can they be compound though?
> >
> > I have the same question, can they be compound? If they can be
> > compound, PageTransHuge() can't tell from THP and compound movable
> > page, right?
>
> Right. I have updated the patch and use compound_nr instead.

Thanks. Actually I'm wondering what kind of movable page could be
compound. Any real examples?

>
> —
> Best Regards,
> Yan Zi

2020-11-02 13:05:45

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

On 10/30/20 7:55 PM, Yang Shi wrote:
> On Fri, Oct 30, 2020 at 11:39 AM Zi Yan <[email protected]> wrote:
>>
>> On 30 Oct 2020, at 14:33, Yang Shi wrote:
>>
>> > On Fri, Oct 30, 2020 at 6:36 AM Michal Hocko <[email protected]> wrote:
>> >>
>> >> On Fri 30-10-20 08:20:50, Zi Yan wrote:
>> >>> On 30 Oct 2020, at 5:43, Michal Hocko wrote:
>> >>>
>> >>>> [Cc Vlastimil]
>> >>>>
>> >>>> On Thu 29-10-20 16:04:35, Zi Yan wrote:
>> >>>>> From: Zi Yan <[email protected]>
>> >>>>>
>> >>>>> In isolate_migratepages_block, when cc->alloc_contig is true, we are
>> >>>>> able to isolate compound pages, nr_migratepages and nr_isolated did not
>> >>>>> count compound pages correctly, causing us to isolate more pages than we
>> >>>>> thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped
>> >>>>> in too_many_isolated while loop, since the actual isolated pages can go
>> >>>>> up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,
>> >>>>> since we stop isolation after cc->nr_migratepages reaches to
>> >>>>> COMPACT_CLUSTER_MAX.
>> >>>>>
>> >>>>> In addition, after we fix the issue above, cc->nr_migratepages could
>> >>>>> never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated,
>> >>>>> thus page isolation could not stop as we intended. Change the isolation
>> >>>>> stop condition to >=.
>> >>>>>
>> >>>>> Signed-off-by: Zi Yan <[email protected]>
>> >>>>> ---
>> >>>>> mm/compaction.c | 8 ++++----
>> >>>>> 1 file changed, 4 insertions(+), 4 deletions(-)
>> >>>>>
>> >>>>> diff --git a/mm/compaction.c b/mm/compaction.c
>> >>>>> index ee1f8439369e..0683a4999581 100644
>> >>>>> --- a/mm/compaction.c
>> >>>>> +++ b/mm/compaction.c
>> >>>>> @@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>> >>>>>
>> >>>>> isolate_success:
>> >>>>> list_add(&page->lru, &cc->migratepages);
>> >>>>> - cc->nr_migratepages++;
>> >>>>> - nr_isolated++;
>> >>>>> + cc->nr_migratepages += thp_nr_pages(page);
>> >>>>> + nr_isolated += thp_nr_pages(page);
>> >>>>
>> >>>> Does thp_nr_pages work for __PageMovable pages?
>> >>>
>> >>> Yes. It is the same as compound_nr() but compiled
>> >>> to 1 when THP is not enabled.
>> >>
>> >> I am sorry but I do not follow. First of all the implementation of the
>> >> two is different and also I was asking about __PageMovable which should
>> >> never be THP IIRC. Can they be compound though?
>> >
>> > I have the same question, can they be compound? If they can be
>> > compound, PageTransHuge() can't tell from THP and compound movable
>> > page, right?
>>
>> Right. I have updated the patch and use compound_nr instead.
>
> Thanks. Actually I'm wondering what kind of movable page could be
> compound. Any real examples?

Looks like there's currently none. Compaction also wouldn't work properly with
movable pages with order>0 as the free page scanner looks for order-0 pages
only. But it won't hurt to use compound_nr() anyway.

>>
>> —
>> Best Regards,
>> Yan Zi
>

2020-11-02 16:42:52

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

On Mon, Nov 2, 2020 at 5:03 AM Vlastimil Babka <[email protected]> wrote:
>
> On 10/30/20 7:55 PM, Yang Shi wrote:
> > On Fri, Oct 30, 2020 at 11:39 AM Zi Yan <[email protected]> wrote:
> >>
> >> On 30 Oct 2020, at 14:33, Yang Shi wrote:
> >>
> >> > On Fri, Oct 30, 2020 at 6:36 AM Michal Hocko <[email protected]> wrote:
> >> >>
> >> >> On Fri 30-10-20 08:20:50, Zi Yan wrote:
> >> >>> On 30 Oct 2020, at 5:43, Michal Hocko wrote:
> >> >>>
> >> >>>> [Cc Vlastimil]
> >> >>>>
> >> >>>> On Thu 29-10-20 16:04:35, Zi Yan wrote:
> >> >>>>> From: Zi Yan <[email protected]>
> >> >>>>>
> >> >>>>> In isolate_migratepages_block, when cc->alloc_contig is true, we are
> >> >>>>> able to isolate compound pages, nr_migratepages and nr_isolated did not
> >> >>>>> count compound pages correctly, causing us to isolate more pages than we
> >> >>>>> thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped
> >> >>>>> in too_many_isolated while loop, since the actual isolated pages can go
> >> >>>>> up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,
> >> >>>>> since we stop isolation after cc->nr_migratepages reaches to
> >> >>>>> COMPACT_CLUSTER_MAX.
> >> >>>>>
> >> >>>>> In addition, after we fix the issue above, cc->nr_migratepages could
> >> >>>>> never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated,
> >> >>>>> thus page isolation could not stop as we intended. Change the isolation
> >> >>>>> stop condition to >=.
> >> >>>>>
> >> >>>>> Signed-off-by: Zi Yan <[email protected]>
> >> >>>>> ---
> >> >>>>> mm/compaction.c | 8 ++++----
> >> >>>>> 1 file changed, 4 insertions(+), 4 deletions(-)
> >> >>>>>
> >> >>>>> diff --git a/mm/compaction.c b/mm/compaction.c
> >> >>>>> index ee1f8439369e..0683a4999581 100644
> >> >>>>> --- a/mm/compaction.c
> >> >>>>> +++ b/mm/compaction.c
> >> >>>>> @@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> >> >>>>>
> >> >>>>> isolate_success:
> >> >>>>> list_add(&page->lru, &cc->migratepages);
> >> >>>>> - cc->nr_migratepages++;
> >> >>>>> - nr_isolated++;
> >> >>>>> + cc->nr_migratepages += thp_nr_pages(page);
> >> >>>>> + nr_isolated += thp_nr_pages(page);
> >> >>>>
> >> >>>> Does thp_nr_pages work for __PageMovable pages?
> >> >>>
> >> >>> Yes. It is the same as compound_nr() but compiled
> >> >>> to 1 when THP is not enabled.
> >> >>
> >> >> I am sorry but I do not follow. First of all the implementation of the
> >> >> two is different and also I was asking about __PageMovable which should
> >> >> never be THP IIRC. Can they be compound though?
> >> >
> >> > I have the same question, can they be compound? If they can be
> >> > compound, PageTransHuge() can't tell from THP and compound movable
> >> > page, right?
> >>
> >> Right. I have updated the patch and use compound_nr instead.
> >
> > Thanks. Actually I'm wondering what kind of movable page could be
> > compound. Any real examples?
>
> Looks like there's currently none. Compaction also wouldn't work properly with
> movable pages with order>0 as the free page scanner looks for order-0 pages
> only. But it won't hurt to use compound_nr() anyway.

Thanks, yes this is what I thought otherwise we have troubles in
migration path too.

>
> >>
> >> —
> >> Best Regards,
> >> Yan Zi
> >
>