On 13.11.23 11:45, Baolin Wang wrote:
> Currently, file pages already support large folios, and support for
> anonymous pages is also under discussion[1]. Moreover, the NUMA balancing
> code was converted to use folios by a previous series[2], and the
> migrate_pages() function already supports large folio migration.
>
> So I see no reason to continue restricting NUMA balancing for
> large folios.
>
> [1] https://lkml.org/lkml/2023/9/29/342
> [2] https://lore.kernel.org/all/[email protected]/T/#md9d10fe34587229a72801f0d731f7457ab3f4a6e
> Signed-off-by: Baolin Wang <[email protected]>
> ---
I'll note that another piece is missing, and I'd be curious how you
tested your patch set or what I am missing. (no anonymous pages?)
change_pte_range() contains:
	if (prot_numa) {
		...
		/* Also skip shared copy-on-write pages */
		if (is_cow_mapping(vma->vm_flags) &&
		    folio_ref_count(folio) != 1)
			continue;
So we'll never end up mapping an anon PTE-mapped THP prot-none (well, unless a
single PTE remains) and consequently never trigger NUMA hinting faults.
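To spell out why (pulling the check out as a helper; just a sketch, and the
refcount reasoning in the comment is my reading of it): every PTE mapping a
subpage of an anon folio contributes one reference, so an exclusive, fully
PTE-mapped large folio never passes the "== 1" test.

static bool numa_skips_cow_folio(struct vm_area_struct *vma,
				 struct folio *folio)
{
	/*
	 * An exclusive order-4 anon folio mapped by 16 PTEs already has a
	 * refcount of at least 16, so this always fires for it.
	 */
	return is_cow_mapping(vma->vm_flags) &&
	       folio_ref_count(folio) != 1;
}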
Now, that change has some history [1], but the original problem has been
sorted out in the meantime. But we should consider Linus' original feedback.
For pte-mapped THP, we might want to do something like the following
(completely untested):
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 81991102f785..c4e6b9032e40 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -129,7 +129,8 @@ static long change_pte_range(struct mmu_gather *tlb,

 				/* Also skip shared copy-on-write pages */
 				if (is_cow_mapping(vma->vm_flags) &&
-				    folio_ref_count(folio) != 1)
+				    (folio_maybe_dma_pinned(folio) ||
+				     folio_estimated_sharers(folio) != 1))
 					continue;
Another note: the above introduces some possible imprecision that might or
might not be tolerable ;)
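For context, folio_estimated_sharers() only looks at the mapcount of the
folio's first subpage (at least in my reading of current mm.h), which is
where that imprecision comes from:

static inline int folio_estimated_sharers(struct folio *folio)
{
	return page_mapcount(folio_page(folio, 0));
}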
[1] https://bugzilla.kernel.org/show_bug.cgi?id=215616
--
Cheers,
David / dhildenb
On 15.11.23 11:46, David Hildenbrand wrote:
> On 13.11.23 11:45, Baolin Wang wrote:
>> Currently, file pages already support large folios, and support for
>> anonymous pages is also under discussion[1]. Moreover, the NUMA balancing
>> code was converted to use folios by a previous series[2], and the
>> migrate_pages() function already supports large folio migration.
>>
>> So I see no reason to continue restricting NUMA balancing for
>> large folios.
>>
>> [1] https://lkml.org/lkml/2023/9/29/342
>> [2] https://lore.kernel.org/all/[email protected]/T/#md9d10fe34587229a72801f0d731f7457ab3f4a6e
>> Signed-off-by: Baolin Wang <[email protected]>
>> ---
>
> I'll note that another piece is missing, and I'd be curious how you
> tested your patch set or what I am missing. (no anonymous pages?)
>
> change_pte_range() contains:
>
> 	if (prot_numa) {
> 		...
> 		/* Also skip shared copy-on-write pages */
> 		if (is_cow_mapping(vma->vm_flags) &&
> 		    folio_ref_count(folio) != 1)
> 			continue;
>
> So we'll never end up mapping an anon PTE-mapped THP prot-none (well, unless a
> single PTE remains) and consequently never trigger NUMA hinting faults.
>
> Now, that change has some history [1], but the original problem has been
> sorted out in the meantime. But we should consider Linus' original feedback.
>
> For pte-mapped THP, we might want to do something like the following
> (completely untested):
>
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index 81991102f785..c4e6b9032e40 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -129,7 +129,8 @@ static long change_pte_range(struct mmu_gather *tlb,
>
>  				/* Also skip shared copy-on-write pages */
>  				if (is_cow_mapping(vma->vm_flags) &&
> -				    folio_ref_count(folio) != 1)
> +				    (folio_maybe_dma_pinned(folio) ||
> +				     folio_estimated_sharers(folio) != 1))
Actually, "> 1" might be better if the first subpage is not mapped; it's a
mess.
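IOW, something like this (still completely untested, and still relying on
the first-subpage heuristic):

				/* Also skip shared copy-on-write pages */
				if (is_cow_mapping(vma->vm_flags) &&
				    (folio_maybe_dma_pinned(folio) ||
				     folio_estimated_sharers(folio) > 1))
					continue;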
--
Cheers,
David / dhildenb
On 11/15/2023 6:47 PM, David Hildenbrand wrote:
> On 15.11.23 11:46, David Hildenbrand wrote:
>> On 13.11.23 11:45, Baolin Wang wrote:
>>> Currently, file pages already support large folios, and support for
>>> anonymous pages is also under discussion[1]. Moreover, the NUMA
>>> balancing code was converted to use folios by a previous series[2],
>>> and the migrate_pages() function already supports large folio
>>> migration.
>>>
>>> So I see no reason to continue restricting NUMA balancing for large
>>> folios.
>>>
>>> [1] https://lkml.org/lkml/2023/9/29/342
>>> [2]
>>> https://lore.kernel.org/all/[email protected]/T/#md9d10fe34587229a72801f0d731f7457ab3f4a6e
>>> Signed-off-by: Baolin Wang <[email protected]>
>>> ---
>>
>> I'll note that another piece is missing, and I'd be curious how you
>> tested your patch set or what I am missing. (no anonymous pages?)
I tested it with file large folios (order = 4) created on an XFS filesystem.
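A minimal sketch of that kind of setup (file path and size below are just
placeholders, not the exact test): map a file on an XFS mount and touch
every page, so the range ends up backed by large folios via readahead and
NUMA balancing can later protect and migrate them.

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	const size_t len = 64UL << 20;	/* 64 MiB, placeholder */
	int fd = open("/mnt/xfs/testfile", O_RDONLY);
	char *p;

	if (fd < 0) {
		perror("open");
		return 1;
	}

	p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Touch each page so the whole range is faulted in. */
	for (size_t off = 0; off < len; off += 4096)
		(void)*(volatile char *)(p + off);

	/* Keep the mapping alive so NUMA balancing has time to act. */
	pause();
	return 0;
}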
>> change_pte_range() contains:
>>
>> 	if (prot_numa) {
>> 		...
>> 		/* Also skip shared copy-on-write pages */
>> 		if (is_cow_mapping(vma->vm_flags) &&
>> 		    folio_ref_count(folio) != 1)
>> 			continue;
>>
>> So we'll never end up mapping an anon PTE-mapped THP prot-none (well,
>> unless a
>> single PTE remains) and consequently never trigger NUMA hinting faults.
>>
>> Now, that change has some history [1], but the original problem has been
>> sorted out in the meantime. But we should consider Linus' original
>> feedback.
>>
>> For pte-mapped THP, we might want to do something like the following
>> (completely untested):
Thanks for pointing that out. I have not tried PTE-mapped THP yet and will
look at it in detail.
>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>> index 81991102f785..c4e6b9032e40 100644
>> --- a/mm/mprotect.c
>> +++ b/mm/mprotect.c
>> @@ -129,7 +129,8 @@ static long change_pte_range(struct mmu_gather *tlb,
>>  				/* Also skip shared copy-on-write pages */
>>  				if (is_cow_mapping(vma->vm_flags) &&
>> -				    folio_ref_count(folio) != 1)
>> +				    (folio_maybe_dma_pinned(folio) ||
>> +				     folio_estimated_sharers(folio) != 1))
>
> Actually, "> 1" might be better if the first subpage is not mapped; it's a
> mess.
>