2024-06-04 05:53:36

by xu.xin16

Subject: [PATCH linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

From: Ran Xiaokai <[email protected]>

While running a large-folio split test, the WARNING
"[ 5059.122759][ T166] Cannot split file folio to non-0 order"
was triggered, even though the test cases only operate on anonymous
folios, and mapping_large_folio_support() is only meaningful for page
cache folios.

In split_huge_page_to_list_to_order(), the folio passed to
mapping_large_folio_support() may be an anonymous folio; the
folio_test_anon() check is missing, so splitting an anonymous THP to a
non-zero order fails. The same applies to the shmem_mapping() check,
so add a folio_test_anon() check covering both. The shmem_mapping()
call in __split_huge_page() is not affected: for anonymous folios the
end parameter is set to -1, so (head[i].index >= end) is always false
and shmem_mapping() is never called there.

Verified with /sys/kernel/debug/split_huge_pages: with this patch, a
large anonymous THP is split successfully and the warning no longer
appears.

Signed-off-by: Ran Xiaokai <[email protected]>
Cc: xu xin <[email protected]>
Cc: Yang Yang <[email protected]>
---
mm/huge_memory.c | 38 ++++++++++++++++++++------------------
1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 317de2afd371..4c9c7e5ea20c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
         if (new_order >= folio_order(folio))
                 return -EINVAL;

-        /* Cannot split anonymous THP to order-1 */
-        if (new_order == 1 && folio_test_anon(folio)) {
-                VM_WARN_ONCE(1, "Cannot split to order-1 folio");
-                return -EINVAL;
-        }
-
         if (new_order) {
                 /* Only swapping a whole PMD-mapped folio is supported */
                 if (folio_test_swapcache(folio))
                         return -EINVAL;
-                /* Split shmem folio to non-zero order not supported */
-                if (shmem_mapping(folio->mapping)) {
-                        VM_WARN_ONCE(1,
-                                "Cannot split shmem folio to non-0 order");
-                        return -EINVAL;
-                }
-                /* No split if the file system does not support large folio */
-                if (!mapping_large_folio_support(folio->mapping)) {
-                        VM_WARN_ONCE(1,
-                                "Cannot split file folio to non-0 order");
-                        return -EINVAL;
+
+                if (folio_test_anon(folio)) {
+                        /* Cannot split anonymous THP to order-1 */
+                        if (new_order == 1) {
+                                VM_WARN_ONCE(1, "Cannot split to order-1 folio");
+                                return -EINVAL;
+                        }
+                } else {
+                        /* Split shmem folio to non-zero order not supported */
+                        if (shmem_mapping(folio->mapping)) {
+                                VM_WARN_ONCE(1,
+                                        "Cannot split shmem folio to non-0 order");
+                                return -EINVAL;
+                        }
+                        /* No split if the file system does not support large folio */
+                        if (!mapping_large_folio_support(folio->mapping)) {
+                                VM_WARN_ONCE(1,
+                                        "Cannot split file folio to non-0 order");
+                                return -EINVAL;
+                        }
                 }
         }

-
         is_hzp = is_huge_zero_folio(folio);
         if (is_hzp) {
                 pr_warn_ratelimited("Called split_huge_page for huge zero page\n");
--
2.15.2


2024-06-04 07:59:58

by David Hildenbrand

Subject: Re: [PATCH linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

On 04.06.24 07:47, [email protected] wrote:
> From: Ran Xiaokai <[email protected]>
>
> When I did a large folios split test, a WARNING
> "[ 5059.122759][ T166] Cannot split file folio to non-0 order"
> was triggered. But my test cases are only for anonmous folios.
> while mapping_large_folio_support() is only reasonable for page
> cache folios.

Agreed.

I wonder if mapping_large_folio_support() should either

a) Complain if used for anon folios, so we can detect the wrong use more
easily. (VM_WARN_ON_ONCE())

b) Return "true" for anonymous mappings, although that's more debatable.

>
> In split_huge_page_to_list_to_order(), the folio passed to
> mapping_large_folio_support() maybe anonmous folio. The
> folio_test_anon() check is missing. So the split of the anonmous THP
> is failed. This is also the same for shmem_mapping(). We'd better add
> a check for both. But the shmem_mapping() in __split_huge_page() is
> not involved, as for anonmous folios, the end parameter is set to -1, so
> (head[i].index >= end) is always false. shmem_mapping() is not called.
>
> Using /sys/kernel/debug/split_huge_pages to verify this, with this
> patch, large anon THP is successfully split and the warning is ceased.
>
> Signed-off-by: Ran Xiaokai <[email protected]>
> Cc: xu xin <[email protected]>
> Cc: Yang Yang <[email protected]>
> ---
> mm/huge_memory.c | 38 ++++++++++++++++++++------------------
> 1 file changed, 20 insertions(+), 18 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 317de2afd371..4c9c7e5ea20c 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> if (new_order >= folio_order(folio))
> return -EINVAL;
>
> - /* Cannot split anonymous THP to order-1 */
> - if (new_order == 1 && folio_test_anon(folio)) {
> - VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> - return -EINVAL;
> - }
> -
> if (new_order) {
> /* Only swapping a whole PMD-mapped folio is supported */
> if (folio_test_swapcache(folio))
> return -EINVAL;
> - /* Split shmem folio to non-zero order not supported */
> - if (shmem_mapping(folio->mapping)) {
> - VM_WARN_ONCE(1,
> - "Cannot split shmem folio to non-0 order");
> - return -EINVAL;
> - }
> - /* No split if the file system does not support large folio */
> - if (!mapping_large_folio_support(folio->mapping)) {
> - VM_WARN_ONCE(1,
> - "Cannot split file folio to non-0 order");
> - return -EINVAL;
> +
> + if (folio_test_anon(folio)) {
> + /* Cannot split anonymous THP to order-1 */
> + if (new_order == 1) {
> + VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> + return -EINVAL;
> + }
> + } else {
> + /* Split shmem folio to non-zero order not supported */
> + if (shmem_mapping(folio->mapping)) {
> + VM_WARN_ONCE(1,
> + "Cannot split shmem folio to non-0 order");
> + return -EINVAL;
> + }
> + /* No split if the file system does not support large folio */
> + if (!mapping_large_folio_support(folio->mapping)) {
> + VM_WARN_ONCE(1,
> + "Cannot split file folio to non-0 order");
> + return -EINVAL;
> + }
> }
> }

What about the following sequence:

if (folio_test_anon(folio)) {
        if (new_order == 1)
                ...
} else if (new_order) {
        if (shmem_mapping(...))
                ...
        ...
}

if (folio_test_swapcache(folio) && new_order)
        return -EINVAL;

Should result in less churn and reduce indentation level.

--
Cheers,

David / dhildenb
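For reference, filling in David's sketch with the checks and warning messages
already present in the patch hunks above gives roughly the following; the
exact placement of the swapcache check and the messages here are an
illustration of the suggested ordering, not final wording:

if (folio_test_anon(folio)) {
        /* Cannot split anonymous THP to order-1 */
        if (new_order == 1) {
                VM_WARN_ONCE(1, "Cannot split to order-1 folio");
                return -EINVAL;
        }
} else if (new_order) {
        /* Split shmem folio to non-zero order not supported */
        if (shmem_mapping(folio->mapping)) {
                VM_WARN_ONCE(1, "Cannot split shmem folio to non-0 order");
                return -EINVAL;
        }
        /* No split if the file system does not support large folio */
        if (!mapping_large_folio_support(folio->mapping)) {
                VM_WARN_ONCE(1, "Cannot split file folio to non-0 order");
                return -EINVAL;
        }
}

/* Only swapping a whole PMD-mapped folio is supported */
if (folio_test_swapcache(folio) && new_order)
        return -EINVAL;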


2024-06-04 13:54:48

by Zi Yan

Subject: Re: [PATCH linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

On 4 Jun 2024, at 0:57, David Hildenbrand wrote:

> On 04.06.24 07:47, [email protected] wrote:
>> From: Ran Xiaokai <[email protected]>
>>
>> When I did a large folios split test, a WARNING
>> "[ 5059.122759][ T166] Cannot split file folio to non-0 order"
>> was triggered. But my test cases are only for anonmous folios.
>> while mapping_large_folio_support() is only reasonable for page
>> cache folios.
>
> Agreed.
>
> I wonder if mapping_large_folio_support() should either
>
> a) Complain if used for anon folios, so we can detect the wrong use more easily. (VM_WARN_ON_ONCE())

This is much better.

>
> b) Return "true" for anonymous mappings, although that's more debatable.

This might fix the warning here, but the function could still easily be misused elsewhere.

>
>>
>> In split_huge_page_to_list_to_order(), the folio passed to
>> mapping_large_folio_support() maybe anonmous folio. The
>> folio_test_anon() check is missing. So the split of the anonmous THP
>> is failed. This is also the same for shmem_mapping(). We'd better add
>> a check for both. But the shmem_mapping() in __split_huge_page() is
>> not involved, as for anonmous folios, the end parameter is set to -1, so
>> (head[i].index >= end) is always false. shmem_mapping() is not called.
>>
>> Using /sys/kernel/debug/split_huge_pages to verify this, with this
>> patch, large anon THP is successfully split and the warning is ceased.
>>
>> Signed-off-by: Ran Xiaokai <[email protected]>
>> Cc: xu xin <[email protected]>
>> Cc: Yang Yang <[email protected]>
>> ---
>> mm/huge_memory.c | 38 ++++++++++++++++++++------------------
>> 1 file changed, 20 insertions(+), 18 deletions(-)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 317de2afd371..4c9c7e5ea20c 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>> if (new_order >= folio_order(folio))
>> return -EINVAL;
>>
>> - /* Cannot split anonymous THP to order-1 */
>> - if (new_order == 1 && folio_test_anon(folio)) {
>> - VM_WARN_ONCE(1, "Cannot split to order-1 folio");
>> - return -EINVAL;
>> - }
>> -
>> if (new_order) {
>> /* Only swapping a whole PMD-mapped folio is supported */
>> if (folio_test_swapcache(folio))
>> return -EINVAL;
>> - /* Split shmem folio to non-zero order not supported */
>> - if (shmem_mapping(folio->mapping)) {
>> - VM_WARN_ONCE(1,
>> - "Cannot split shmem folio to non-0 order");
>> - return -EINVAL;
>> - }
>> - /* No split if the file system does not support large folio */
>> - if (!mapping_large_folio_support(folio->mapping)) {
>> - VM_WARN_ONCE(1,
>> - "Cannot split file folio to non-0 order");
>> - return -EINVAL;
>> +
>> + if (folio_test_anon(folio)) {
>> + /* Cannot split anonymous THP to order-1 */
>> + if (new_order == 1) {
>> + VM_WARN_ONCE(1, "Cannot split to order-1 folio");
>> + return -EINVAL;
>> + }
>> + } else {
>> + /* Split shmem folio to non-zero order not supported */
>> + if (shmem_mapping(folio->mapping)) {
>> + VM_WARN_ONCE(1,
>> + "Cannot split shmem folio to non-0 order");
>> + return -EINVAL;
>> + }
>> + /* No split if the file system does not support large folio */
>> + if (!mapping_large_folio_support(folio->mapping)) {
>> + VM_WARN_ONCE(1,
>> + "Cannot split file folio to non-0 order");
>> + return -EINVAL;
>> + }
>> }
>> }
>
> What about the following sequence:
>
> if (folio_test_anon(folio)) {
> if (new_order == 1)
> ...
> } else if (new_order) {
> if (shmem_mapping(...))
> ...
> ...
> }
>
> if (folio_test_swapcache(folio) && new_order)
> return -EINVAL;
>
> Should result in less churn and reduce indentation level.

Yeah, this looks better to me.

Best Regards,
Yan, Zi



2024-06-04 13:58:20

by Zi Yan

Subject: Re: [PATCH linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

+Luis and Pankaj, who are working on enabling bs > ps in XFS and touching split_huge_page_to_list_to_order().


On 4 Jun 2024, at 6:52, Zi Yan wrote:

> On 4 Jun 2024, at 0:57, David Hildenbrand wrote:
>
>> On 04.06.24 07:47, [email protected] wrote:
>>> From: Ran Xiaokai <[email protected]>
>>>
>>> When I did a large folios split test, a WARNING
>>> "[ 5059.122759][ T166] Cannot split file folio to non-0 order"
>>> was triggered. But my test cases are only for anonmous folios.
>>> while mapping_large_folio_support() is only reasonable for page
>>> cache folios.
>>
>> Agreed.
>>
>> I wonder if mapping_large_folio_support() should either
>>
>> a) Complain if used for anon folios, so we can detect the wrong use more easily. (VM_WARN_ON_ONCE())
>
> This is much better.
>
>>
>> b) Return "true" for anonymous mappings, although that's more debatable.
>
> This might fix the warning here, but the function might get wrong uses easily.
>
>>
>>>
>>> In split_huge_page_to_list_to_order(), the folio passed to
>>> mapping_large_folio_support() maybe anonmous folio. The
>>> folio_test_anon() check is missing. So the split of the anonmous THP
>>> is failed. This is also the same for shmem_mapping(). We'd better add
>>> a check for both. But the shmem_mapping() in __split_huge_page() is
>>> not involved, as for anonmous folios, the end parameter is set to -1, so
>>> (head[i].index >= end) is always false. shmem_mapping() is not called.
>>>
>>> Using /sys/kernel/debug/split_huge_pages to verify this, with this
>>> patch, large anon THP is successfully split and the warning is ceased.
>>>
>>> Signed-off-by: Ran Xiaokai <[email protected]>
>>> Cc: xu xin <[email protected]>
>>> Cc: Yang Yang <[email protected]>
>>> ---
>>> mm/huge_memory.c | 38 ++++++++++++++++++++------------------
>>> 1 file changed, 20 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>> index 317de2afd371..4c9c7e5ea20c 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>>> if (new_order >= folio_order(folio))
>>> return -EINVAL;
>>>
>>> - /* Cannot split anonymous THP to order-1 */
>>> - if (new_order == 1 && folio_test_anon(folio)) {
>>> - VM_WARN_ONCE(1, "Cannot split to order-1 folio");
>>> - return -EINVAL;
>>> - }
>>> -
>>> if (new_order) {
>>> /* Only swapping a whole PMD-mapped folio is supported */
>>> if (folio_test_swapcache(folio))
>>> return -EINVAL;
>>> - /* Split shmem folio to non-zero order not supported */
>>> - if (shmem_mapping(folio->mapping)) {
>>> - VM_WARN_ONCE(1,
>>> - "Cannot split shmem folio to non-0 order");
>>> - return -EINVAL;
>>> - }
>>> - /* No split if the file system does not support large folio */
>>> - if (!mapping_large_folio_support(folio->mapping)) {
>>> - VM_WARN_ONCE(1,
>>> - "Cannot split file folio to non-0 order");
>>> - return -EINVAL;
>>> +
>>> + if (folio_test_anon(folio)) {
>>> + /* Cannot split anonymous THP to order-1 */
>>> + if (new_order == 1) {
>>> + VM_WARN_ONCE(1, "Cannot split to order-1 folio");
>>> + return -EINVAL;
>>> + }
>>> + } else {
>>> + /* Split shmem folio to non-zero order not supported */
>>> + if (shmem_mapping(folio->mapping)) {
>>> + VM_WARN_ONCE(1,
>>> + "Cannot split shmem folio to non-0 order");
>>> + return -EINVAL;
>>> + }
>>> + /* No split if the file system does not support large folio */
>>> + if (!mapping_large_folio_support(folio->mapping)) {
>>> + VM_WARN_ONCE(1,
>>> + "Cannot split file folio to non-0 order");
>>> + return -EINVAL;
>>> + }
>>> }
>>> }
>>
>> What about the following sequence:
>>
>> if (folio_test_anon(folio)) {
>> if (new_order == 1)
>> ...
>> } else if (new_order) {
>> if (shmem_mapping(...))
>> ...
>> ...
>> }
>>
>> if (folio_test_swapcache(folio) && new_order)
>> return -EINVAL;
>>
>> Should result in less churn and reduce indentation level.
>
> Yeah, this looks better to me.
>
> Best Regards,
> Yan, Zi


Best Regards,
Yan, Zi



2024-06-05 02:22:46

by ran xiaokai

Subject: Re: [PATCH linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

> On 04.06.24 07:47, [email protected] wrote:
> > From: Ran Xiaokai <[email protected]>
> >
> > When I did a large folios split test, a WARNING
> > "[ 5059.122759][ T166] Cannot split file folio to non-0 order"
> > was triggered. But my test cases are only for anonmous folios.
> > while mapping_large_folio_support() is only reasonable for page
> > cache folios.
>
> Agreed.
>
> I wonder if mapping_large_folio_support() should either
>
> a) Complain if used for anon folios, so we can detect the wrong use more
> easily. (VM_WARN_ON_ONCE())

> b) Return "true" for anonymous mappings, although that's more debatable.
>

Hi David,
Thanks for the review.
I think a) is better.
But then we would have to add a new "folio" parameter to mapping_large_folio_support(), right?
Something like mapping_large_folio_support(struct address_space *mapping, struct folio *folio)?
But in the __filemap_get_folio() path,

__filemap_get_folio()
no_page:
        ....
        if (!mapping_large_folio_support(mapping))

the folio has not been allocated yet, has it?
Or do you mean there is some other way to do this?

> >
> > In split_huge_page_to_list_to_order(), the folio passed to
> > mapping_large_folio_support() maybe anonmous folio. The
> > folio_test_anon() check is missing. So the split of the anonmous THP
> > is failed. This is also the same for shmem_mapping(). We'd better add
> > a check for both. But the shmem_mapping() in __split_huge_page() is
> > not involved, as for anonmous folios, the end parameter is set to -1, so
> > (head[i].index >= end) is always false. shmem_mapping() is not called.
> >
> > Using /sys/kernel/debug/split_huge_pages to verify this, with this
> > patch, large anon THP is successfully split and the warning is ceased.
> >
> > Signed-off-by: Ran Xiaokai <[email protected]>
> > Cc: xu xin <[email protected]>
> > Cc: Yang Yang <[email protected]>
> > ---
> > mm/huge_memory.c | 38 ++++++++++++++++++++------------------
> > 1 file changed, 20 insertions(+), 18 deletions(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 317de2afd371..4c9c7e5ea20c 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> > if (new_order >= folio_order(folio))
> > return -EINVAL;
> >
> > - /* Cannot split anonymous THP to order-1 */
> > - if (new_order == 1 && folio_test_anon(folio)) {
> > - VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> > - return -EINVAL;
> > - }
> > -
> > if (new_order) {
> > /* Only swapping a whole PMD-mapped folio is supported */
> > if (folio_test_swapcache(folio))
> > return -EINVAL;
> > - /* Split shmem folio to non-zero order not supported */
> > - if (shmem_mapping(folio->mapping)) {
> > - VM_WARN_ONCE(1,
> > - "Cannot split shmem folio to non-0 order");
> > - return -EINVAL;
> > - }
> > - /* No split if the file system does not support large folio */
> > - if (!mapping_large_folio_support(folio->mapping)) {
> > - VM_WARN_ONCE(1,
> > - "Cannot split file folio to non-0 order");
> > - return -EINVAL;
> > +
> > + if (folio_test_anon(folio)) {
> > + /* Cannot split anonymous THP to order-1 */
> > + if (new_order == 1) {
> > + VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> > + return -EINVAL;
> > + }
> > + } else {
> > + /* Split shmem folio to non-zero order not supported */
> > + if (shmem_mapping(folio->mapping)) {
> > + VM_WARN_ONCE(1,
> > + "Cannot split shmem folio to non-0 order");
> > + return -EINVAL;
> > + }
> > + /* No split if the file system does not support large folio */
> > + if (!mapping_large_folio_support(folio->mapping)) {
> > + VM_WARN_ONCE(1,
> > + "Cannot split file folio to non-0 order");
> > + return -EINVAL;
> > + }
> > }
> > }
>
> What about the following sequence:
>
> if (folio_test_anon(folio)) {
> if (new_order == 1)
> ...
> } else if (new_order) {
> if (shmem_mapping(...))
> ...
> ...
> }
>
> if (folio_test_swapcache(folio) && new_order)
> return -EINVAL;
>
> Should result in less churn and reduce indentation level.
>

Thanks.
The code is cleaner in this way.

> --
> Cheers,
>
> David / dhildenb


2024-06-05 02:58:09

by ran xiaokai

Subject: Re: [PATCH linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

> On 4 Jun 2024, at 0:57, David Hildenbrand wrote:
>
> > On 04.06.24 07:47, [email protected] wrote:
> >> From: Ran Xiaokai <[email protected]>
> >>
> >> When I did a large folios split test, a WARNING
> >> "[ 5059.122759][ T166] Cannot split file folio to non-0 order"
> >> was triggered. But my test cases are only for anonmous folios.
> >> while mapping_large_folio_support() is only reasonable for page
> >> cache folios.
> >
> > Agreed.
> >
> > I wonder if mapping_large_folio_support() should either
> >
> > a) Complain if used for anon folios, so we can detect the wrong use more easily. (VM_WARN_ON_ONCE())
>
> This is much better.
>
> >
> > b) Return "true" for anonymous mappings, although that's more debatable.
>
> This might fix the warning here, but the function might get wrong uses easily.

Yes, and maybe we should rename mapping_large_folio_support() if we choose b).

> >
> >>
> >> In split_huge_page_to_list_to_order(), the folio passed to
> >> mapping_large_folio_support() maybe anonmous folio. The
> >> folio_test_anon() check is missing. So the split of the anonmous THP
> >> is failed. This is also the same for shmem_mapping(). We'd better add
> >> a check for both. But the shmem_mapping() in __split_huge_page() is
> >> not involved, as for anonmous folios, the end parameter is set to -1, so
> >> (head[i].index >= end) is always false. shmem_mapping() is not called.
> >>
> >> Using /sys/kernel/debug/split_huge_pages to verify this, with this
> >> patch, large anon THP is successfully split and the warning is ceased.
> >>
> >> Signed-off-by: Ran Xiaokai <[email protected]>
> >> Cc: xu xin <[email protected]>
> >> Cc: Yang Yang <[email protected]>
> >> ---
> >> mm/huge_memory.c | 38 ++++++++++++++++++++------------------
> >> 1 file changed, 20 insertions(+), 18 deletions(-)
> >>
> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >> index 317de2afd371..4c9c7e5ea20c 100644
> >> --- a/mm/huge_memory.c
> >> +++ b/mm/huge_memory.c
> >> @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> >> if (new_order >= folio_order(folio))
> >> return -EINVAL;
> >>
> >> - /* Cannot split anonymous THP to order-1 */
> >> - if (new_order == 1 && folio_test_anon(folio)) {
> >> - VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> >> - return -EINVAL;
> >> - }
> >> -
> >> if (new_order) {
> >> /* Only swapping a whole PMD-mapped folio is supported */
> >> if (folio_test_swapcache(folio))
> >> return -EINVAL;
> >> - /* Split shmem folio to non-zero order not supported */
> >> - if (shmem_mapping(folio->mapping)) {
> >> - VM_WARN_ONCE(1,
> >> - "Cannot split shmem folio to non-0 order");
> >> - return -EINVAL;
> >> - }
> >> - /* No split if the file system does not support large folio */
> >> - if (!mapping_large_folio_support(folio->mapping)) {
> >> - VM_WARN_ONCE(1,
> >> - "Cannot split file folio to non-0 order");
> >> - return -EINVAL;
> >> +
> >> + if (folio_test_anon(folio)) {
> >> + /* Cannot split anonymous THP to order-1 */
> >> + if (new_order == 1) {
> >> + VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> >> + return -EINVAL;
> >> + }
> >> + } else {
> >> + /* Split shmem folio to non-zero order not supported */
> >> + if (shmem_mapping(folio->mapping)) {
> >> + VM_WARN_ONCE(1,
> >> + "Cannot split shmem folio to non-0 order");
> >> + return -EINVAL;
> >> + }
> >> + /* No split if the file system does not support large folio */
> >> + if (!mapping_large_folio_support(folio->mapping)) {
> >> + VM_WARN_ONCE(1,
> >> + "Cannot split file folio to non-0 order");
> >> + return -EINVAL;
> >> + }
> >> }
> >> }
> >
> > What about the following sequence:
> >
> > if (folio_test_anon(folio)) {
> > if (new_order == 1)
> > ...
> > } else if (new_order) {
> > if (shmem_mapping(...))
> > ...
> > ...
> > }
> >
> > if (folio_test_swapcache(folio) && new_order)
> > return -EINVAL;
> >
> > Should result in less churn and reduce indentation level.
>
> Yeah, this looks better to me.
>
> Best Regards,
> Yan, Zi


2024-06-05 07:26:07

by David Hildenbrand

Subject: Re: [PATCH linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

On 05.06.24 04:20, ran xiaokai wrote:
>> On 04.06.24 07:47, [email protected] wrote:
>>> From: Ran Xiaokai <[email protected]>
>>>
>>> When I did a large folios split test, a WARNING
>>> "[ 5059.122759][ T166] Cannot split file folio to non-0 order"
>>> was triggered. But my test cases are only for anonmous folios.
>>> while mapping_large_folio_support() is only reasonable for page
>>> cache folios.
>>
>> Agreed.
>>
>> I wonder if mapping_large_folio_support() should either
>>
>> a) Complain if used for anon folios, so we can detect the wrong use more
>> easily. (VM_WARN_ON_ONCE())
>
>> b) Return "true" for anonymous mappings, although that's more debatable.
>>
>
> Hi, David,
> Thanks for the review.
> I think a) is better.
> But we have to add a new parameter "folio" to mapping_large_folio_support(), right ?
> something like mapping_large_folio_support(struct address_space *mapping, struct folio *folio) ?
> But in the __filemap_get_folio() path,
>
> __filemap_get_folio()
> no_page:
> ....
> if (!mapping_large_folio_support(mapping))
>
> the folio is not allocated yet, yes ?
> Or do you mean there is some other way to do this ?

If we really pass unmodified folio->mapping, you can do what
folio_test_anon() would and make sure PAGE_MAPPING_ANON is not set.

--
Cheers,

David / dhildenb


2024-06-05 08:32:39

by ran xiaokai

Subject: Re: [PATCH linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

> On 05.06.24 04:20, ran xiaokai wrote:
> >> On 04.06.24 07:47, [email protected] wrote:
> >>> From: Ran Xiaokai <[email protected]>
> >>>
> >>> When I did a large folios split test, a WARNING
> >>> "[ 5059.122759][ T166] Cannot split file folio to non-0 order"
> >>> was triggered. But my test cases are only for anonmous folios.
> >>> while mapping_large_folio_support() is only reasonable for page
> >>> cache folios.
> >>
> >> Agreed.
> >>
> >> I wonder if mapping_large_folio_support() should either
> >>
> >> a) Complain if used for anon folios, so we can detect the wrong use more
> >> easily. (VM_WARN_ON_ONCE())
> >
> >> b) Return "true" for anonymous mappings, although that's more debatable.
> >>
> >
> > Hi, David,
> > Thanks for the review.
> > I think a) is better.
> > But we have to add a new parameter "folio" to mapping_large_folio_support(), right ?
> > something like mapping_large_folio_support(struct address_space *mapping, struct folio *folio) ?
> > But in the __filemap_get_folio() path,
> >
> > __filemap_get_folio()
> > no_page:
> > ....
> > if (!mapping_large_folio_support(mapping))
> >
> > the folio is not allocated yet, yes ?
> > Or do you mean there is some other way to do this ?
>
> If we really pass unmodified folio->mapping, you can do what
> folio_test_anon() would and make sure PAGE_MAPPING_ANON is not set.

I think I just misunderstood your suggestion.
How about this?

static inline bool mapping_large_folio_support(struct address_space *mapping)
{
        VM_WARN_ONCE((unsigned long)mapping & PAGE_MAPPING_ANON,
                     "Anonymous mapping always supports large folio");

        return IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
                test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
}


2024-06-05 08:34:01

by David Hildenbrand

Subject: Re: [PATCH linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

On 05.06.24 10:30, ran xiaokai wrote:
>> On 05.06.24 04:20, ran xiaokai wrote:
>>>> On 04.06.24 07:47, [email protected] wrote:
>>>>> From: Ran Xiaokai <[email protected]>
>>>>>
>>>>> When I did a large folios split test, a WARNING
>>>>> "[ 5059.122759][ T166] Cannot split file folio to non-0 order"
>>>>> was triggered. But my test cases are only for anonmous folios.
>>>>> while mapping_large_folio_support() is only reasonable for page
>>>>> cache folios.
>>>>
>>>> Agreed.
>>>>
>>>> I wonder if mapping_large_folio_support() should either
>>>>
>>>> a) Complain if used for anon folios, so we can detect the wrong use more
>>>> easily. (VM_WARN_ON_ONCE())
>>>
>>>> b) Return "true" for anonymous mappings, although that's more debatable.
>>>>
>>>
>>> Hi, David,
>>> Thanks for the review.
>>> I think a) is better.
>>> But we have to add a new parameter "folio" to mapping_large_folio_support(), right ?
>>> something like mapping_large_folio_support(struct address_space *mapping, struct folio *folio) ?
>>> But in the __filemap_get_folio() path,
>>>
>>> __filemap_get_folio()
>>> no_page:
>>> ....
>>> if (!mapping_large_folio_support(mapping))
>>>
>>> the folio is not allocated yet, yes ?
>>> Or do you mean there is some other way to do this ?
>>
>> If we really pass unmodified folio->mapping, you can do what
>> folio_test_anon() would and make sure PAGE_MAPPING_ANON is not set.
>
> I think I just misunderstood your suggestion.

Likely my use of "folio" was confusing.

> How about this ?
>
> static inline bool mapping_large_folio_support(struct address_space *mapping)
> {
> VM_WARN_ONCE((unsigned long)mapping & PAGE_MAPPING_ANON,
> "Anonymous mapping always supports large folio");
>
> return IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
> test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
> }

Yes, and we should likely document that this is not supposed to be used
with mappings from anonymous folios.

--
Cheers,

David / dhildenb


2024-06-05 09:07:19

by Barry Song

Subject: Re: [PATCH linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

On Tue, Jun 4, 2024 at 5:47 PM <[email protected]> wrote:
>
> From: Ran Xiaokai <[email protected]>
>
> When I did a large folios split test, a WARNING
> "[ 5059.122759][ T166] Cannot split file folio to non-0 order"
> was triggered. But my test cases are only for anonmous folios.
> while mapping_large_folio_support() is only reasonable for page
> cache folios.
>
> In split_huge_page_to_list_to_order(), the folio passed to
> mapping_large_folio_support() maybe anonmous folio. The
> folio_test_anon() check is missing. So the split of the anonmous THP
> is failed. This is also the same for shmem_mapping(). We'd better add
> a check for both. But the shmem_mapping() in __split_huge_page() is
> not involved, as for anonmous folios, the end parameter is set to -1, so
> (head[i].index >= end) is always false. shmem_mapping() is not called.
>
> Using /sys/kernel/debug/split_huge_pages to verify this, with this
> patch, large anon THP is successfully split and the warning is ceased.
>
> Signed-off-by: Ran Xiaokai <[email protected]>
> Cc: xu xin <[email protected]>
> Cc: Yang Yang <[email protected]>
> ---
> mm/huge_memory.c | 38 ++++++++++++++++++++------------------
> 1 file changed, 20 insertions(+), 18 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 317de2afd371..4c9c7e5ea20c 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> if (new_order >= folio_order(folio))
> return -EINVAL;
>
> - /* Cannot split anonymous THP to order-1 */
> - if (new_order == 1 && folio_test_anon(folio)) {
> - VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> - return -EINVAL;
> - }
> -
> if (new_order) {
> /* Only swapping a whole PMD-mapped folio is supported */
> if (folio_test_swapcache(folio))
> return -EINVAL;
> - /* Split shmem folio to non-zero order not supported */
> - if (shmem_mapping(folio->mapping)) {
> - VM_WARN_ONCE(1,
> - "Cannot split shmem folio to non-0 order");
> - return -EINVAL;
> - }
> - /* No split if the file system does not support large folio */
> - if (!mapping_large_folio_support(folio->mapping)) {
> - VM_WARN_ONCE(1,
> - "Cannot split file folio to non-0 order");
> - return -EINVAL;
> +
> + if (folio_test_anon(folio)) {
> + /* Cannot split anonymous THP to order-1 */
> + if (new_order == 1) {
> + VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> + return -EINVAL;
> + }
> + } else {
> + /* Split shmem folio to non-zero order not supported */
> + if (shmem_mapping(folio->mapping)) {
> + VM_WARN_ONCE(1,
> + "Cannot split shmem folio to non-0 order");
> + return -EINVAL;
> + }
> + /* No split if the file system does not support large folio */
> + if (!mapping_large_folio_support(folio->mapping)) {
> + VM_WARN_ONCE(1,
> + "Cannot split file folio to non-0 order");
> + return -EINVAL;
> + }

Am I missing something? If the file system doesn't support large folios,
how could the large folio come to exist in the first place while its
mapping points to a file that doesn't support large folios?

> }
> }
>
> -
> is_hzp = is_huge_zero_folio(folio);
> if (is_hzp) {
> pr_warn_ratelimited("Called split_huge_page for huge zero page\n");
> --
> 2.15.2
>

Thanks
Barry

2024-06-05 09:55:16

by ran xiaokai

Subject: Re: [PATCH linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

> On Tue, Jun 4, 2024 at 5:47 PM <[email protected]> wrote:
> >
> > From: Ran Xiaokai <[email protected]>
> >
> > When I did a large folios split test, a WARNING
> > "[ 5059.122759][ T166] Cannot split file folio to non-0 order"
> > was triggered. But my test cases are only for anonmous folios.
> > while mapping_large_folio_support() is only reasonable for page
> > cache folios.
> >
> > In split_huge_page_to_list_to_order(), the folio passed to
> > mapping_large_folio_support() maybe anonmous folio. The
> > folio_test_anon() check is missing. So the split of the anonmous THP
> > is failed. This is also the same for shmem_mapping(). We'd better add
> > a check for both. But the shmem_mapping() in __split_huge_page() is
> > not involved, as for anonmous folios, the end parameter is set to -1, so
> > (head[i].index >= end) is always false. shmem_mapping() is not called.
> >
> > Using /sys/kernel/debug/split_huge_pages to verify this, with this
> > patch, large anon THP is successfully split and the warning is ceased.
> >
> > Signed-off-by: Ran Xiaokai <[email protected]>
> > Cc: xu xin <[email protected]>
> > Cc: Yang Yang <[email protected]>
> > ---
> > mm/huge_memory.c | 38 ++++++++++++++++++++------------------
> > 1 file changed, 20 insertions(+), 18 deletions(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 317de2afd371..4c9c7e5ea20c 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> > if (new_order >= folio_order(folio))
> > return -EINVAL;
> >
> > - /* Cannot split anonymous THP to order-1 */
> > - if (new_order == 1 && folio_test_anon(folio)) {
> > - VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> > - return -EINVAL;
> > - }
> > -
> > if (new_order) {
> > /* Only swapping a whole PMD-mapped folio is supported */
> > if (folio_test_swapcache(folio))
> > return -EINVAL;
> > - /* Split shmem folio to non-zero order not supported */
> > - if (shmem_mapping(folio->mapping)) {
> > - VM_WARN_ONCE(1,
> > - "Cannot split shmem folio to non-0 order");
> > - return -EINVAL;
> > - }
> > - /* No split if the file system does not support large folio */
> > - if (!mapping_large_folio_support(folio->mapping)) {
> > - VM_WARN_ONCE(1,
> > - "Cannot split file folio to non-0 order");
> > - return -EINVAL;
> > +
> > + if (folio_test_anon(folio)) {
> > + /* Cannot split anonymous THP to order-1 */
> > + if (new_order == 1) {
> > + VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> > + return -EINVAL;
> > + }
> > + } else {
> > + /* Split shmem folio to non-zero order not supported */
> > + if (shmem_mapping(folio->mapping)) {
> > + VM_WARN_ONCE(1,
> > + "Cannot split shmem folio to non-0 order");
> > + return -EINVAL;
> > + }
> > + /* No split if the file system does not support large folio */
> > + if (!mapping_large_folio_support(folio->mapping)) {
> > + VM_WARN_ONCE(1,
> > + "Cannot split file folio to non-0 order");
> > + return -EINVAL;
> > + }
>
> Am I missing something? if file system doesn't support large folio,
> how could the large folio start to exist from the first place while its
> mapping points to a file which doesn't support large folio?

I think it is the CONFIG_READ_ONLY_THP_FOR_FS case.
khugepaged will try to collapse read-only file-backed pages to 2M THP.


2024-06-05 14:21:15

by Zi Yan

Subject: Re: [PATCH linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

On 5 Jun 2024, at 2:54, ran xiaokai wrote:

>> On Tue, Jun 4, 2024 at 5:47 PM <[email protected]> wrote:
>>>
>>> From: Ran Xiaokai <[email protected]>
>>>
>>> When I did a large folios split test, a WARNING
>>> "[ 5059.122759][ T166] Cannot split file folio to non-0 order"
>>> was triggered. But my test cases are only for anonmous folios.
>>> while mapping_large_folio_support() is only reasonable for page
>>> cache folios.
>>>
>>> In split_huge_page_to_list_to_order(), the folio passed to
>>> mapping_large_folio_support() maybe anonmous folio. The
>>> folio_test_anon() check is missing. So the split of the anonmous THP
>>> is failed. This is also the same for shmem_mapping(). We'd better add
>>> a check for both. But the shmem_mapping() in __split_huge_page() is
>>> not involved, as for anonmous folios, the end parameter is set to -1, so
>>> (head[i].index >= end) is always false. shmem_mapping() is not called.
>>>
>>> Using /sys/kernel/debug/split_huge_pages to verify this, with this
>>> patch, large anon THP is successfully split and the warning is ceased.
>>>
>>> Signed-off-by: Ran Xiaokai <[email protected]>
>>> Cc: xu xin <[email protected]>
>>> Cc: Yang Yang <[email protected]>
>>> ---
>>> mm/huge_memory.c | 38 ++++++++++++++++++++------------------
>>> 1 file changed, 20 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>> index 317de2afd371..4c9c7e5ea20c 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>>> if (new_order >= folio_order(folio))
>>> return -EINVAL;
>>>
>>> - /* Cannot split anonymous THP to order-1 */
>>> - if (new_order == 1 && folio_test_anon(folio)) {
>>> - VM_WARN_ONCE(1, "Cannot split to order-1 folio");
>>> - return -EINVAL;
>>> - }
>>> -
>>> if (new_order) {
>>> /* Only swapping a whole PMD-mapped folio is supported */
>>> if (folio_test_swapcache(folio))
>>> return -EINVAL;
>>> - /* Split shmem folio to non-zero order not supported */
>>> - if (shmem_mapping(folio->mapping)) {
>>> - VM_WARN_ONCE(1,
>>> - "Cannot split shmem folio to non-0 order");
>>> - return -EINVAL;
>>> - }
>>> - /* No split if the file system does not support large folio */
>>> - if (!mapping_large_folio_support(folio->mapping)) {
>>> - VM_WARN_ONCE(1,
>>> - "Cannot split file folio to non-0 order");
>>> - return -EINVAL;
>>> +
>>> + if (folio_test_anon(folio)) {
>>> + /* Cannot split anonymous THP to order-1 */
>>> + if (new_order == 1) {
>>> + VM_WARN_ONCE(1, "Cannot split to order-1 folio");
>>> + return -EINVAL;
>>> + }
>>> + } else {
>>> + /* Split shmem folio to non-zero order not supported */
>>> + if (shmem_mapping(folio->mapping)) {
>>> + VM_WARN_ONCE(1,
>>> + "Cannot split shmem folio to non-0 order");
>>> + return -EINVAL;
>>> + }
>>> + /* No split if the file system does not support large folio */
>>> + if (!mapping_large_folio_support(folio->mapping)) {
>>> + VM_WARN_ONCE(1,
>>> + "Cannot split file folio to non-0 order");
>>> + return -EINVAL;
>>> + }
>>
>> Am I missing something? if file system doesn't support large folio,
>> how could the large folio start to exist from the first place while its
>> mapping points to a file which doesn't support large folio?
>
> I think it is the CONFIG_READ_ONLY_THP_FOR_FS case.
> khugepaged will try to collapse read-only file-backed pages to 2M THP.

Can you add this information to the commit log in your next version?

Best Regards,
Yan, Zi



2024-06-05 14:52:21

by David Hildenbrand

Subject: Re: [PATCH linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

On 05.06.24 16:08, Zi Yan wrote:
> On 5 Jun 2024, at 2:54, ran xiaokai wrote:
>
>>> On Tue, Jun 4, 2024 at 5:47 PM <[email protected]> wrote:
>>>>
>>>> From: Ran Xiaokai <[email protected]>
>>>>
>>>> When I did a large folios split test, a WARNING
>>>> "[ 5059.122759][ T166] Cannot split file folio to non-0 order"
>>>> was triggered. But my test cases are only for anonmous folios.
>>>> while mapping_large_folio_support() is only reasonable for page
>>>> cache folios.
>>>>
>>>> In split_huge_page_to_list_to_order(), the folio passed to
>>>> mapping_large_folio_support() maybe anonmous folio. The
>>>> folio_test_anon() check is missing. So the split of the anonmous THP
>>>> is failed. This is also the same for shmem_mapping(). We'd better add
>>>> a check for both. But the shmem_mapping() in __split_huge_page() is
>>>> not involved, as for anonmous folios, the end parameter is set to -1, so
>>>> (head[i].index >= end) is always false. shmem_mapping() is not called.
>>>>
>>>> Using /sys/kernel/debug/split_huge_pages to verify this, with this
>>>> patch, large anon THP is successfully split and the warning is ceased.
>>>>
>>>> Signed-off-by: Ran Xiaokai <[email protected]>
>>>> Cc: xu xin <[email protected]>
>>>> Cc: Yang Yang <[email protected]>
>>>> ---
>>>> mm/huge_memory.c | 38 ++++++++++++++++++++------------------
>>>> 1 file changed, 20 insertions(+), 18 deletions(-)
>>>>
>>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>>> index 317de2afd371..4c9c7e5ea20c 100644
>>>> --- a/mm/huge_memory.c
>>>> +++ b/mm/huge_memory.c
>>>> @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>>>> if (new_order >= folio_order(folio))
>>>> return -EINVAL;
>>>>
>>>> - /* Cannot split anonymous THP to order-1 */
>>>> - if (new_order == 1 && folio_test_anon(folio)) {
>>>> - VM_WARN_ONCE(1, "Cannot split to order-1 folio");
>>>> - return -EINVAL;
>>>> - }
>>>> -
>>>> if (new_order) {
>>>> /* Only swapping a whole PMD-mapped folio is supported */
>>>> if (folio_test_swapcache(folio))
>>>> return -EINVAL;
>>>> - /* Split shmem folio to non-zero order not supported */
>>>> - if (shmem_mapping(folio->mapping)) {
>>>> - VM_WARN_ONCE(1,
>>>> - "Cannot split shmem folio to non-0 order");
>>>> - return -EINVAL;
>>>> - }
>>>> - /* No split if the file system does not support large folio */
>>>> - if (!mapping_large_folio_support(folio->mapping)) {
>>>> - VM_WARN_ONCE(1,
>>>> - "Cannot split file folio to non-0 order");
>>>> - return -EINVAL;
>>>> +
>>>> + if (folio_test_anon(folio)) {
>>>> + /* Cannot split anonymous THP to order-1 */
>>>> + if (new_order == 1) {
>>>> + VM_WARN_ONCE(1, "Cannot split to order-1 folio");
>>>> + return -EINVAL;
>>>> + }
>>>> + } else {
>>>> + /* Split shmem folio to non-zero order not supported */
>>>> + if (shmem_mapping(folio->mapping)) {
>>>> + VM_WARN_ONCE(1,
>>>> + "Cannot split shmem folio to non-0 order");
>>>> + return -EINVAL;
>>>> + }
>>>> + /* No split if the file system does not support large folio */
>>>> + if (!mapping_large_folio_support(folio->mapping)) {
>>>> + VM_WARN_ONCE(1,
>>>> + "Cannot split file folio to non-0 order");
>>>> + return -EINVAL;
>>>> + }
>>>
>>> Am I missing something? if file system doesn't support large folio,
>>> how could the large folio start to exist from the first place while its
>>> mapping points to a file which doesn't support large folio?
>>
>> I think it is the CONFIG_READ_ONLY_THP_FOR_FS case.
>> khugepaged will try to collapse read-only file-backed pages to 2M THP.
>
> Can you add this information to the commit log in your next version?

Can we also add that as a comment to that function, like "Note that we
might still have THPs in such mappings due to CONFIG_READ_ONLY_THP_FOR_FS.
But in that case, the mapping does not actually support large folios
properly." Or something like that.

--
Cheers,

David / dhildenb
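Pulling this sub-thread together, the helper could end up looking roughly like
the sketch below. The function body is the draft ran posted earlier in the
thread; the comment text is only a paraphrase of David's suggestion, not final
wording:

/*
 * Note: anonymous mappings always support large folios, and this helper must
 * not be used for them. Mappings that do not set AS_LARGE_FOLIO_SUPPORT may
 * still contain THPs collapsed via CONFIG_READ_ONLY_THP_FOR_FS, but such a
 * mapping does not actually support large folios properly.
 */
static inline bool mapping_large_folio_support(struct address_space *mapping)
{
        /* Complain if this is used for an anonymous mapping. */
        VM_WARN_ONCE((unsigned long)mapping & PAGE_MAPPING_ANON,
                     "Anonymous mapping always supports large folio");

        return IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
                test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
}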


2024-06-06 01:34:39

by Barry Song

Subject: Re: [PATCH linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

On Thu, Jun 6, 2024 at 2:42 AM David Hildenbrand <[email protected]> wrote:
>
> On 05.06.24 16:08, Zi Yan wrote:
> > On 5 Jun 2024, at 2:54, ran xiaokai wrote:
> >
> >>> On Tue, Jun 4, 2024 at 5:47 PM <[email protected]> wrote:
> >>>>
> >>>> From: Ran Xiaokai <[email protected]>
> >>>>
> >>>> When I did a large folios split test, a WARNING
> >>>> "[ 5059.122759][ T166] Cannot split file folio to non-0 order"
> >>>> was triggered. But my test cases are only for anonmous folios.
> >>>> while mapping_large_folio_support() is only reasonable for page
> >>>> cache folios.
> >>>>
> >>>> In split_huge_page_to_list_to_order(), the folio passed to
> >>>> mapping_large_folio_support() maybe anonmous folio. The
> >>>> folio_test_anon() check is missing. So the split of the anonmous THP
> >>>> is failed. This is also the same for shmem_mapping(). We'd better add
> >>>> a check for both. But the shmem_mapping() in __split_huge_page() is
> >>>> not involved, as for anonmous folios, the end parameter is set to -1, so
> >>>> (head[i].index >= end) is always false. shmem_mapping() is not called.
> >>>>
> >>>> Using /sys/kernel/debug/split_huge_pages to verify this, with this
> >>>> patch, large anon THP is successfully split and the warning is ceased.
> >>>>
> >>>> Signed-off-by: Ran Xiaokai <[email protected]>
> >>>> Cc: xu xin <[email protected]>
> >>>> Cc: Yang Yang <[email protected]>
> >>>> ---
> >>>> mm/huge_memory.c | 38 ++++++++++++++++++++------------------
> >>>> 1 file changed, 20 insertions(+), 18 deletions(-)
> >>>>
> >>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >>>> index 317de2afd371..4c9c7e5ea20c 100644
> >>>> --- a/mm/huge_memory.c
> >>>> +++ b/mm/huge_memory.c
> >>>> @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> >>>> if (new_order >= folio_order(folio))
> >>>> return -EINVAL;
> >>>>
> >>>> - /* Cannot split anonymous THP to order-1 */
> >>>> - if (new_order == 1 && folio_test_anon(folio)) {
> >>>> - VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> >>>> - return -EINVAL;
> >>>> - }
> >>>> -
> >>>> if (new_order) {
> >>>> /* Only swapping a whole PMD-mapped folio is supported */
> >>>> if (folio_test_swapcache(folio))
> >>>> return -EINVAL;
> >>>> - /* Split shmem folio to non-zero order not supported */
> >>>> - if (shmem_mapping(folio->mapping)) {
> >>>> - VM_WARN_ONCE(1,
> >>>> - "Cannot split shmem folio to non-0 order");
> >>>> - return -EINVAL;
> >>>> - }
> >>>> - /* No split if the file system does not support large folio */
> >>>> - if (!mapping_large_folio_support(folio->mapping)) {
> >>>> - VM_WARN_ONCE(1,
> >>>> - "Cannot split file folio to non-0 order");
> >>>> - return -EINVAL;
> >>>> +
> >>>> + if (folio_test_anon(folio)) {
> >>>> + /* Cannot split anonymous THP to order-1 */
> >>>> + if (new_order == 1) {
> >>>> + VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> >>>> + return -EINVAL;
> >>>> + }
> >>>> + } else {
> >>>> + /* Split shmem folio to non-zero order not supported */
> >>>> + if (shmem_mapping(folio->mapping)) {
> >>>> + VM_WARN_ONCE(1,
> >>>> + "Cannot split shmem folio to non-0 order");
> >>>> + return -EINVAL;
> >>>> + }
> >>>> + /* No split if the file system does not support large folio */
> >>>> + if (!mapping_large_folio_support(folio->mapping)) {
> >>>> + VM_WARN_ONCE(1,
> >>>> + "Cannot split file folio to non-0 order");
> >>>> + return -EINVAL;
> >>>> + }
> >>>
> >>> Am I missing something? if file system doesn't support large folio,
> >>> how could the large folio start to exist from the first place while its
> >>> mapping points to a file which doesn't support large folio?
> >>
> >> I think it is the CONFIG_READ_ONLY_THP_FOR_FS case.
> >> khugepaged will try to collapse read-only file-backed pages to 2M THP.
> >
> > Can you add this information to the commit log in your next version?
>
> Can we also add that as a comment to that function, like "Note that we
> might still
> have THPs in such mappings due to CONFIG_READ_ONLY_THP_FOR_FS. But in
> that case,
> the mapping does not actually support large folios properly.
> "Or sth. like that.

+1. Otherwise, the code appears quite confusing.
Would using "#ifdef" help to further clarify it?

#ifdef CONFIG_READ_ONLY_THP_FOR_FS
        if (!mapping_large_folio_support(folio->mapping)) {
                ....
        }
#endif

>
> --
> Cheers,
>
> David / dhildenb
>

Thanks
Barry

2024-06-06 07:18:45

by David Hildenbrand

Subject: Re: [PATCH linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

On 06.06.24 03:34, Barry Song wrote:
> On Thu, Jun 6, 2024 at 2:42 AM David Hildenbrand <[email protected]> wrote:
>>
>> On 05.06.24 16:08, Zi Yan wrote:
>>> On 5 Jun 2024, at 2:54, ran xiaokai wrote:
>>>
>>>>> On Tue, Jun 4, 2024 at 5:47 PM <[email protected]> wrote:
>>>>>>
>>>>>> From: Ran Xiaokai <[email protected]>
>>>>>>
>>>>>> When I did a large folios split test, a WARNING
>>>>>> "[ 5059.122759][ T166] Cannot split file folio to non-0 order"
>>>>>> was triggered. But my test cases are only for anonmous folios.
>>>>>> while mapping_large_folio_support() is only reasonable for page
>>>>>> cache folios.
>>>>>>
>>>>>> In split_huge_page_to_list_to_order(), the folio passed to
>>>>>> mapping_large_folio_support() maybe anonmous folio. The
>>>>>> folio_test_anon() check is missing. So the split of the anonmous THP
>>>>>> is failed. This is also the same for shmem_mapping(). We'd better add
>>>>>> a check for both. But the shmem_mapping() in __split_huge_page() is
>>>>>> not involved, as for anonmous folios, the end parameter is set to -1, so
>>>>>> (head[i].index >= end) is always false. shmem_mapping() is not called.
>>>>>>
>>>>>> Using /sys/kernel/debug/split_huge_pages to verify this, with this
>>>>>> patch, large anon THP is successfully split and the warning is ceased.
>>>>>>
>>>>>> Signed-off-by: Ran Xiaokai <[email protected]>
>>>>>> Cc: xu xin <[email protected]>
>>>>>> Cc: Yang Yang <[email protected]>
>>>>>> ---
>>>>>> mm/huge_memory.c | 38 ++++++++++++++++++++------------------
>>>>>> 1 file changed, 20 insertions(+), 18 deletions(-)
>>>>>>
>>>>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>>>>> index 317de2afd371..4c9c7e5ea20c 100644
>>>>>> --- a/mm/huge_memory.c
>>>>>> +++ b/mm/huge_memory.c
>>>>>> @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>>>>>> if (new_order >= folio_order(folio))
>>>>>> return -EINVAL;
>>>>>>
>>>>>> - /* Cannot split anonymous THP to order-1 */
>>>>>> - if (new_order == 1 && folio_test_anon(folio)) {
>>>>>> - VM_WARN_ONCE(1, "Cannot split to order-1 folio");
>>>>>> - return -EINVAL;
>>>>>> - }
>>>>>> -
>>>>>> if (new_order) {
>>>>>> /* Only swapping a whole PMD-mapped folio is supported */
>>>>>> if (folio_test_swapcache(folio))
>>>>>> return -EINVAL;
>>>>>> - /* Split shmem folio to non-zero order not supported */
>>>>>> - if (shmem_mapping(folio->mapping)) {
>>>>>> - VM_WARN_ONCE(1,
>>>>>> - "Cannot split shmem folio to non-0 order");
>>>>>> - return -EINVAL;
>>>>>> - }
>>>>>> - /* No split if the file system does not support large folio */
>>>>>> - if (!mapping_large_folio_support(folio->mapping)) {
>>>>>> - VM_WARN_ONCE(1,
>>>>>> - "Cannot split file folio to non-0 order");
>>>>>> - return -EINVAL;
>>>>>> +
>>>>>> + if (folio_test_anon(folio)) {
>>>>>> + /* Cannot split anonymous THP to order-1 */
>>>>>> + if (new_order == 1) {
>>>>>> + VM_WARN_ONCE(1, "Cannot split to order-1 folio");
>>>>>> + return -EINVAL;
>>>>>> + }
>>>>>> + } else {
>>>>>> + /* Split shmem folio to non-zero order not supported */
>>>>>> + if (shmem_mapping(folio->mapping)) {
>>>>>> + VM_WARN_ONCE(1,
>>>>>> + "Cannot split shmem folio to non-0 order");
>>>>>> + return -EINVAL;
>>>>>> + }
>>>>>> + /* No split if the file system does not support large folio */
>>>>>> + if (!mapping_large_folio_support(folio->mapping)) {
>>>>>> + VM_WARN_ONCE(1,
>>>>>> + "Cannot split file folio to non-0 order");
>>>>>> + return -EINVAL;
>>>>>> + }
>>>>>
>>>>> Am I missing something? if file system doesn't support large folio,
>>>>> how could the large folio start to exist from the first place while its
>>>>> mapping points to a file which doesn't support large folio?
>>>>
>>>> I think it is the CONFIG_READ_ONLY_THP_FOR_FS case.
>>>> khugepaged will try to collapse read-only file-backed pages to 2M THP.
>>>
>>> Can you add this information to the commit log in your next version?
>>
>> Can we also add that as a comment to that function, like "Note that we
>> might still
>> have THPs in such mappings due to CONFIG_READ_ONLY_THP_FOR_FS. But in
>> that case,
>> the mapping does not actually support large folios properly.
>> "Or sth. like that.
>
> +1. Otherwise, the code appears quite confusing.
> Would using "#ifdef" help to further clarify it?
>
> #ifdef CONFIG_READ_ONLY_THP_FOR_FS
> if (!mapping_large_folio_support(folio->mapping)) {
> ....
> }

if (IS_ENABLED()) ...

Agreed.

--
Cheers,

David / dhildenb
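For completeness, the IS_ENABLED() form David is alluding to would look roughly
like the sketch below, reusing the check from the patch; this is an
illustration of the suggestion in this sub-thread, not an agreed-upon hunk:

/*
 * Per the discussion above: without CONFIG_READ_ONLY_THP_FOR_FS, khugepaged
 * should never have collapsed a THP into a page cache mapping that lacks
 * large folio support, so the check only matters when that option is enabled.
 */
if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
    !mapping_large_folio_support(folio->mapping)) {
        VM_WARN_ONCE(1, "Cannot split file folio to non-0 order");
        return -EINVAL;
}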