From: yangge <[email protected]>
Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for
THP-sized allocations") no longer differentiates the migration type
of pages in THP-sized PCP list, it's possible to get a CMA page from
the list, in some cases, it's not acceptable, for example, allocating
a non-CMA page with PF_MEMALLOC_PIN flag returns a CMA page.
The patch forbids allocating non-CMA THP-sized page from THP-sized
PCP list to avoid the issue above.
Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for THP-sized allocations")
Signed-off-by: yangge <[email protected]>
---
mm/page_alloc.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2e22ce5..0bdf471 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2987,10 +2987,20 @@ struct page *rmqueue(struct zone *preferred_zone,
WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
if (likely(pcp_allowed_order(order))) {
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+ if (!IS_ENABLED(CONFIG_CMA) || alloc_flags & ALLOC_CMA ||
+ order != HPAGE_PMD_ORDER) {
+ page = rmqueue_pcplist(preferred_zone, zone, order,
+ migratetype, alloc_flags);
+ if (likely(page))
+ goto out;
+ }
+#else
page = rmqueue_pcplist(preferred_zone, zone, order,
migratetype, alloc_flags);
if (likely(page))
goto out;
+#endif
}
page = rmqueue_buddy(preferred_zone, zone, order, alloc_flags,
--
2.7.4
Cc Johannes, Zi and Vlastimil.
On 2024/6/4 17:14, [email protected] wrote:
> From: yangge <[email protected]>
>
> Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for
> THP-sized allocations") no longer differentiates the migration type
> of pages in THP-sized PCP list, it's possible to get a CMA page from
> the list, in some cases, it's not acceptable, for example, allocating
> a non-CMA page with PF_MEMALLOC_PIN flag returns a CMA page.
>
> The patch forbids allocating non-CMA THP-sized page from THP-sized
> PCP list to avoid the issue above.
>
> Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for THP-sized allocations")
> Signed-off-by: yangge <[email protected]>
> ---
> mm/page_alloc.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 2e22ce5..0bdf471 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2987,10 +2987,20 @@ struct page *rmqueue(struct zone *preferred_zone,
> WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
>
> if (likely(pcp_allowed_order(order))) {
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> + if (!IS_ENABLED(CONFIG_CMA) || alloc_flags & ALLOC_CMA ||
> + order != HPAGE_PMD_ORDER) {
Seems you will also miss the non-CMA THP from the PCP, so I wonder if we
can add a migratetype comparison in __rmqueue_pcplist(), and if it's not
suitable, then fallback to buddy?
> + page = rmqueue_pcplist(preferred_zone, zone, order,
> + migratetype, alloc_flags);
> + if (likely(page))
> + goto out;
> + }
> +#else
> page = rmqueue_pcplist(preferred_zone, zone, order,
> migratetype, alloc_flags);
> if (likely(page))
> goto out;
> +#endif
> }
>
> page = rmqueue_buddy(preferred_zone, zone, order, alloc_flags,
在 2024/6/4 下午8:01, Baolin Wang 写道:
> Cc Johannes, Zi and Vlastimil.
>
> On 2024/6/4 17:14, [email protected] wrote:
>> From: yangge <[email protected]>
>>
>> Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for
>> THP-sized allocations") no longer differentiates the migration type
>> of pages in THP-sized PCP list, it's possible to get a CMA page from
>> the list, in some cases, it's not acceptable, for example, allocating
>> a non-CMA page with PF_MEMALLOC_PIN flag returns a CMA page.
>>
>> The patch forbids allocating non-CMA THP-sized page from THP-sized
>> PCP list to avoid the issue above.
>>
>> Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for
>> THP-sized allocations")
>> Signed-off-by: yangge <[email protected]>
>> ---
>> mm/page_alloc.c | 10 ++++++++++
>> 1 file changed, 10 insertions(+)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 2e22ce5..0bdf471 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -2987,10 +2987,20 @@ struct page *rmqueue(struct zone *preferred_zone,
>> WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
>> if (likely(pcp_allowed_order(order))) {
>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> + if (!IS_ENABLED(CONFIG_CMA) || alloc_flags & ALLOC_CMA ||
>> + order != HPAGE_PMD_ORDER) {
>
> Seems you will also miss the non-CMA THP from the PCP, so I wonder if we
> can add a migratetype comparison in __rmqueue_pcplist(), and if it's not
> suitable, then fallback to buddy?
Yes, we may miss some non-CMA THPs in the PCP. But, if add a migratetype
comparison in __rmqueue_pcplist(), we may need to compare many times
because of pcp batch.
On 2024/6/4 20:36, yangge1116 wrote:
>
>
> 在 2024/6/4 下午8:01, Baolin Wang 写道:
>> Cc Johannes, Zi and Vlastimil.
>>
>> On 2024/6/4 17:14, [email protected] wrote:
>>> From: yangge <[email protected]>
>>>
>>> Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for
>>> THP-sized allocations") no longer differentiates the migration type
>>> of pages in THP-sized PCP list, it's possible to get a CMA page from
>>> the list, in some cases, it's not acceptable, for example, allocating
>>> a non-CMA page with PF_MEMALLOC_PIN flag returns a CMA page.
>>>
>>> The patch forbids allocating non-CMA THP-sized page from THP-sized
>>> PCP list to avoid the issue above.
>>>
>>> Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for
>>> THP-sized allocations")
>>> Signed-off-by: yangge <[email protected]>
>>> ---
>>> mm/page_alloc.c | 10 ++++++++++
>>> 1 file changed, 10 insertions(+)
>>>
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index 2e22ce5..0bdf471 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -2987,10 +2987,20 @@ struct page *rmqueue(struct zone
>>> *preferred_zone,
>>> WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
>>> if (likely(pcp_allowed_order(order))) {
>>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>> + if (!IS_ENABLED(CONFIG_CMA) || alloc_flags & ALLOC_CMA ||
>>> + order != HPAGE_PMD_ORDER) {
>>
>> Seems you will also miss the non-CMA THP from the PCP, so I wonder if
>> we can add a migratetype comparison in __rmqueue_pcplist(), and if
>> it's not suitable, then fallback to buddy?
>
> Yes, we may miss some non-CMA THPs in the PCP. But, if add a migratetype
> comparison in __rmqueue_pcplist(), we may need to compare many times
> because of pcp batch.
I mean we can only compare once, focusing on CMA pages.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3734fe7e67c0..960a3b5744d8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2973,6 +2973,11 @@ struct page *__rmqueue_pcplist(struct zone *zone,
unsigned int order,
}
page = list_first_entry(list, struct page, pcp_list);
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+ if (order == HPAGE_PMD_ORDER &&
!is_migrate_movable(migratetype) &&
+ is_migrate_cma(get_pageblock_migratetype(page)))
+ return NULL;
+#endif
list_del(&page->pcp_list);
pcp->count -= 1 << order;
} while (check_new_pages(page, order));
在 2024/6/6 上午11:06, Baolin Wang 写道:
>
>
> On 2024/6/4 20:36, yangge1116 wrote:
>>
>>
>> 在 2024/6/4 下午8:01, Baolin Wang 写道:
>>> Cc Johannes, Zi and Vlastimil.
>>>
>>> On 2024/6/4 17:14, [email protected] wrote:
>>>> From: yangge <[email protected]>
>>>>
>>>> Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for
>>>> THP-sized allocations") no longer differentiates the migration type
>>>> of pages in THP-sized PCP list, it's possible to get a CMA page from
>>>> the list, in some cases, it's not acceptable, for example, allocating
>>>> a non-CMA page with PF_MEMALLOC_PIN flag returns a CMA page.
>>>>
>>>> The patch forbids allocating non-CMA THP-sized page from THP-sized
>>>> PCP list to avoid the issue above.
>>>>
>>>> Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for
>>>> THP-sized allocations")
>>>> Signed-off-by: yangge <[email protected]>
>>>> ---
>>>> mm/page_alloc.c | 10 ++++++++++
>>>> 1 file changed, 10 insertions(+)
>>>>
>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>> index 2e22ce5..0bdf471 100644
>>>> --- a/mm/page_alloc.c
>>>> +++ b/mm/page_alloc.c
>>>> @@ -2987,10 +2987,20 @@ struct page *rmqueue(struct zone
>>>> *preferred_zone,
>>>> WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
>>>> if (likely(pcp_allowed_order(order))) {
>>>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>>> + if (!IS_ENABLED(CONFIG_CMA) || alloc_flags & ALLOC_CMA ||
>>>> + order != HPAGE_PMD_ORDER) {
>>>
>>> Seems you will also miss the non-CMA THP from the PCP, so I wonder if
>>> we can add a migratetype comparison in __rmqueue_pcplist(), and if
>>> it's not suitable, then fallback to buddy?
>>
>> Yes, we may miss some non-CMA THPs in the PCP. But, if add a
>> migratetype comparison in __rmqueue_pcplist(), we may need to compare
>> many times because of pcp batch.
>
> I mean we can only compare once, focusing on CMA pages.
pcp_list may contains CMA and no-CMA pages, why only compare once, just
increase one chance of using the pcp_list?
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3734fe7e67c0..960a3b5744d8 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2973,6 +2973,11 @@ struct page *__rmqueue_pcplist(struct zone *zone,
> unsigned int order,
> }
>
> page = list_first_entry(list, struct page, pcp_list);
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> + if (order == HPAGE_PMD_ORDER &&
> !is_migrate_movable(migratetype) &&
> + is_migrate_cma(get_pageblock_migratetype(page)))
> + return NULL;
> +#endif
> list_del(&page->pcp_list);
> pcp->count -= 1 << order;
> } while (check_new_pages(page, order));