2024-06-04 09:27:45

by yangge1116

[permalink] [raw]
Subject: [PATCH] mm/page_alloc: skip THP-sized PCP list when allocating non-CMA THP-sized page

From: yangge <[email protected]>

Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for
THP-sized allocations") no longer differentiates the migration type
of pages in THP-sized PCP list, it's possible to get a CMA page from
the list, in some cases, it's not acceptable, for example, allocating
a non-CMA page with PF_MEMALLOC_PIN flag returns a CMA page.

The patch forbids allocating non-CMA THP-sized page from THP-sized
PCP list to avoid the issue above.

Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for THP-sized allocations")
Signed-off-by: yangge <[email protected]>
---
mm/page_alloc.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2e22ce5..0bdf471 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2987,10 +2987,20 @@ struct page *rmqueue(struct zone *preferred_zone,
WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));

if (likely(pcp_allowed_order(order))) {
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+ if (!IS_ENABLED(CONFIG_CMA) || alloc_flags & ALLOC_CMA ||
+ order != HPAGE_PMD_ORDER) {
+ page = rmqueue_pcplist(preferred_zone, zone, order,
+ migratetype, alloc_flags);
+ if (likely(page))
+ goto out;
+ }
+#else
page = rmqueue_pcplist(preferred_zone, zone, order,
migratetype, alloc_flags);
if (likely(page))
goto out;
+#endif
}

page = rmqueue_buddy(preferred_zone, zone, order, alloc_flags,
--
2.7.4



2024-06-04 12:02:08

by Baolin Wang

[permalink] [raw]
Subject: Re: [PATCH] mm/page_alloc: skip THP-sized PCP list when allocating non-CMA THP-sized page

Cc Johannes, Zi and Vlastimil.

On 2024/6/4 17:14, [email protected] wrote:
> From: yangge <[email protected]>
>
> Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for
> THP-sized allocations") no longer differentiates the migration type
> of pages in THP-sized PCP list, it's possible to get a CMA page from
> the list, in some cases, it's not acceptable, for example, allocating
> a non-CMA page with PF_MEMALLOC_PIN flag returns a CMA page.
>
> The patch forbids allocating non-CMA THP-sized page from THP-sized
> PCP list to avoid the issue above.
>
> Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for THP-sized allocations")
> Signed-off-by: yangge <[email protected]>
> ---
> mm/page_alloc.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 2e22ce5..0bdf471 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2987,10 +2987,20 @@ struct page *rmqueue(struct zone *preferred_zone,
> WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
>
> if (likely(pcp_allowed_order(order))) {
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> + if (!IS_ENABLED(CONFIG_CMA) || alloc_flags & ALLOC_CMA ||
> + order != HPAGE_PMD_ORDER) {

Seems you will also miss the non-CMA THP from the PCP, so I wonder if we
can add a migratetype comparison in __rmqueue_pcplist(), and if it's not
suitable, then fallback to buddy?

> + page = rmqueue_pcplist(preferred_zone, zone, order,
> + migratetype, alloc_flags);
> + if (likely(page))
> + goto out;
> + }
> +#else
> page = rmqueue_pcplist(preferred_zone, zone, order,
> migratetype, alloc_flags);
> if (likely(page))
> goto out;
> +#endif
> }
>
> page = rmqueue_buddy(preferred_zone, zone, order, alloc_flags,

2024-06-04 12:39:16

by yangge1116

[permalink] [raw]
Subject: Re: [PATCH] mm/page_alloc: skip THP-sized PCP list when allocating non-CMA THP-sized page



在 2024/6/4 下午8:01, Baolin Wang 写道:
> Cc Johannes, Zi and Vlastimil.
>
> On 2024/6/4 17:14, [email protected] wrote:
>> From: yangge <[email protected]>
>>
>> Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for
>> THP-sized allocations") no longer differentiates the migration type
>> of pages in THP-sized PCP list, it's possible to get a CMA page from
>> the list, in some cases, it's not acceptable, for example, allocating
>> a non-CMA page with PF_MEMALLOC_PIN flag returns a CMA page.
>>
>> The patch forbids allocating non-CMA THP-sized page from THP-sized
>> PCP list to avoid the issue above.
>>
>> Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for
>> THP-sized allocations")
>> Signed-off-by: yangge <[email protected]>
>> ---
>>   mm/page_alloc.c | 10 ++++++++++
>>   1 file changed, 10 insertions(+)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 2e22ce5..0bdf471 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -2987,10 +2987,20 @@ struct page *rmqueue(struct zone *preferred_zone,
>>       WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
>>       if (likely(pcp_allowed_order(order))) {
>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> +        if (!IS_ENABLED(CONFIG_CMA) || alloc_flags & ALLOC_CMA ||
>> +                        order != HPAGE_PMD_ORDER) {
>
> Seems you will also miss the non-CMA THP from the PCP, so I wonder if we
> can add a migratetype comparison in __rmqueue_pcplist(), and if it's not
> suitable, then fallback to buddy?

Yes, we may miss some non-CMA THPs in the PCP. But, if add a migratetype
comparison in __rmqueue_pcplist(), we may need to compare many times
because of pcp batch.


2024-06-06 03:06:42

by Baolin Wang

[permalink] [raw]
Subject: Re: [PATCH] mm/page_alloc: skip THP-sized PCP list when allocating non-CMA THP-sized page



On 2024/6/4 20:36, yangge1116 wrote:
>
>
> 在 2024/6/4 下午8:01, Baolin Wang 写道:
>> Cc Johannes, Zi and Vlastimil.
>>
>> On 2024/6/4 17:14, [email protected] wrote:
>>> From: yangge <[email protected]>
>>>
>>> Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for
>>> THP-sized allocations") no longer differentiates the migration type
>>> of pages in THP-sized PCP list, it's possible to get a CMA page from
>>> the list, in some cases, it's not acceptable, for example, allocating
>>> a non-CMA page with PF_MEMALLOC_PIN flag returns a CMA page.
>>>
>>> The patch forbids allocating non-CMA THP-sized page from THP-sized
>>> PCP list to avoid the issue above.
>>>
>>> Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for
>>> THP-sized allocations")
>>> Signed-off-by: yangge <[email protected]>
>>> ---
>>>   mm/page_alloc.c | 10 ++++++++++
>>>   1 file changed, 10 insertions(+)
>>>
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index 2e22ce5..0bdf471 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -2987,10 +2987,20 @@ struct page *rmqueue(struct zone
>>> *preferred_zone,
>>>       WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
>>>       if (likely(pcp_allowed_order(order))) {
>>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>> +        if (!IS_ENABLED(CONFIG_CMA) || alloc_flags & ALLOC_CMA ||
>>> +                        order != HPAGE_PMD_ORDER) {
>>
>> Seems you will also miss the non-CMA THP from the PCP, so I wonder if
>> we can add a migratetype comparison in __rmqueue_pcplist(), and if
>> it's not suitable, then fallback to buddy?
>
> Yes, we may miss some non-CMA THPs in the PCP. But, if add a migratetype
> comparison in __rmqueue_pcplist(), we may need to compare many times
> because of pcp batch.

I mean we can only compare once, focusing on CMA pages.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3734fe7e67c0..960a3b5744d8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2973,6 +2973,11 @@ struct page *__rmqueue_pcplist(struct zone *zone,
unsigned int order,
}

page = list_first_entry(list, struct page, pcp_list);
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+ if (order == HPAGE_PMD_ORDER &&
!is_migrate_movable(migratetype) &&
+ is_migrate_cma(get_pageblock_migratetype(page)))
+ return NULL;
+#endif
list_del(&page->pcp_list);
pcp->count -= 1 << order;
} while (check_new_pages(page, order));

2024-06-06 09:45:12

by yangge1116

[permalink] [raw]
Subject: Re: [PATCH] mm/page_alloc: skip THP-sized PCP list when allocating non-CMA THP-sized page



在 2024/6/6 上午11:06, Baolin Wang 写道:
>
>
> On 2024/6/4 20:36, yangge1116 wrote:
>>
>>
>> 在 2024/6/4 下午8:01, Baolin Wang 写道:
>>> Cc Johannes, Zi and Vlastimil.
>>>
>>> On 2024/6/4 17:14, [email protected] wrote:
>>>> From: yangge <[email protected]>
>>>>
>>>> Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for
>>>> THP-sized allocations") no longer differentiates the migration type
>>>> of pages in THP-sized PCP list, it's possible to get a CMA page from
>>>> the list, in some cases, it's not acceptable, for example, allocating
>>>> a non-CMA page with PF_MEMALLOC_PIN flag returns a CMA page.
>>>>
>>>> The patch forbids allocating non-CMA THP-sized page from THP-sized
>>>> PCP list to avoid the issue above.
>>>>
>>>> Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for
>>>> THP-sized allocations")
>>>> Signed-off-by: yangge <[email protected]>
>>>> ---
>>>>   mm/page_alloc.c | 10 ++++++++++
>>>>   1 file changed, 10 insertions(+)
>>>>
>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>> index 2e22ce5..0bdf471 100644
>>>> --- a/mm/page_alloc.c
>>>> +++ b/mm/page_alloc.c
>>>> @@ -2987,10 +2987,20 @@ struct page *rmqueue(struct zone
>>>> *preferred_zone,
>>>>       WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
>>>>       if (likely(pcp_allowed_order(order))) {
>>>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>>> +        if (!IS_ENABLED(CONFIG_CMA) || alloc_flags & ALLOC_CMA ||
>>>> +                        order != HPAGE_PMD_ORDER) {
>>>
>>> Seems you will also miss the non-CMA THP from the PCP, so I wonder if
>>> we can add a migratetype comparison in __rmqueue_pcplist(), and if
>>> it's not suitable, then fallback to buddy?
>>
>> Yes, we may miss some non-CMA THPs in the PCP. But, if add a
>> migratetype comparison in __rmqueue_pcplist(), we may need to compare
>> many times because of pcp batch.
>
> I mean we can only compare once, focusing on CMA pages.

pcp_list may contains CMA and no-CMA pages, why only compare once, just
increase one chance of using the pcp_list?


>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3734fe7e67c0..960a3b5744d8 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2973,6 +2973,11 @@ struct page *__rmqueue_pcplist(struct zone *zone,
> unsigned int order,
>                 }
>
>                 page = list_first_entry(list, struct page, pcp_list);
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +               if (order == HPAGE_PMD_ORDER &&
> !is_migrate_movable(migratetype) &&
> +                   is_migrate_cma(get_pageblock_migratetype(page)))
> +                       return NULL;
> +#endif
>                 list_del(&page->pcp_list);
>                 pcp->count -= 1 << order;
>         } while (check_new_pages(page, order));