On 13.11.23 11:45, Baolin Wang wrote:
> Currently, file pages already support large folios, and support for
> anonymous pages is also under discussion[1]. Moreover, the NUMA balancing
> code was converted to use folios by a previous series[2], and the
> migrate_pages() function already supports large folio migration.
>
> So I see no reason to continue restricting NUMA balancing for
> large folios.
>
> [1] https://lkml.org/lkml/2023/9/29/342
> [2] https://lore.kernel.org/all/[email protected]/T/#md9d10fe34587229a72801f0d731f7457ab3f4a6e
> Signed-off-by: Baolin Wang <[email protected]>
> ---
I'll note that another piece is missing, and I'd be curious how you
tested your patch set or what I am missing. (no anonymous pages?)
change_pte_range() contains:
	if (prot_numa) {
		...
		/* Also skip shared copy-on-write pages */
		if (is_cow_mapping(vma->vm_flags) &&
		    folio_ref_count(folio) != 1)
			continue;
So we'll never end up mapping an anon PTE-mapped THP prot-none (well, unless a
single PTE remains) and consequently never trigger NUMA hinting faults.
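To spell out why (pulling the check out as a helper; just a sketch, and the
refcount reasoning in the comment is my reading of it): every PTE mapping a
subpage of an anon folio contributes one reference, so an exclusive, fully
PTE-mapped large folio never passes the "== 1" test.

static bool numa_skips_cow_folio(struct vm_area_struct *vma,
				 struct folio *folio)
{
	/*
	 * An exclusive order-4 anon folio mapped by 16 PTEs already has a
	 * refcount of at least 16, so this always fires for it.
	 */
	return is_cow_mapping(vma->vm_flags) &&
	       folio_ref_count(folio) != 1;
}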
Now, that change has some history [1], but the original problem has been
sorted out in the meantime. But we should consider Linus' original feedback.
For pte-mapped THP, we might want to do something like the following
(completely untested):
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 81991102f785..c4e6b9032e40 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -129,7 +129,8 @@ static long change_pte_range(struct mmu_gather *tlb,

 				/* Also skip shared copy-on-write pages */
 				if (is_cow_mapping(vma->vm_flags) &&
-				    folio_ref_count(folio) != 1)
+				    (folio_maybe_dma_pinned(folio) ||
+				     folio_estimated_sharers(folio) != 1))
 					continue;
Another note: the above introduces some possible imprecision that might or
might not be tolerable ;)
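For context, folio_estimated_sharers() only looks at the mapcount of the
folio's first subpage (at least in my reading of current mm.h), which is
where that imprecision comes from:

static inline int folio_estimated_sharers(struct folio *folio)
{
	return page_mapcount(folio_page(folio, 0));
}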
[1] https://bugzilla.kernel.org/show_bug.cgi?id=215616
--
Cheers,
David / dhildenb
On 15.11.23 11:46, David Hildenbrand wrote:
> On 13.11.23 11:45, Baolin Wang wrote:
>> Currently, file pages already support large folios, and support for
>> anonymous pages is also under discussion[1]. Moreover, the NUMA balancing
>> code was converted to use folios by a previous series[2], and the
>> migrate_pages() function already supports large folio migration.
>>
>> So I see no reason to continue restricting NUMA balancing for
>> large folios.
>>
>> [1] https://lkml.org/lkml/2023/9/29/342
>> [2] https://lore.kernel.org/all/[email protected]/T/#md9d10fe34587229a72801f0d731f7457ab3f4a6e
>> Signed-off-by: Baolin Wang <[email protected]>
>> ---
>
> I'll note that another piece is missing, and I'd be curious how you
> tested your patch set or what I am missing. (no anonymous pages?)
>
> change_pte_range() contains:
>
> 	if (prot_numa) {
> 		...
> 		/* Also skip shared copy-on-write pages */
> 		if (is_cow_mapping(vma->vm_flags) &&
> 		    folio_ref_count(folio) != 1)
> 			continue;
>
> So we'll never end up mapping an anon PTE-mapped THP prot-none (well, unless a
> single PTE remains) and consequently never trigger NUMA hinting faults.
>
> Now, that change has some history [1], but the original problem has been
> sorted out in the meantime. But we should consider Linus' original feedback.
>
> For pte-mapped THP, we might want to do something like the following
> (completely untested):
>
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index 81991102f785..c4e6b9032e40 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -129,7 +129,8 @@ static long change_pte_range(struct mmu_gather *tlb,
>
>  				/* Also skip shared copy-on-write pages */
>  				if (is_cow_mapping(vma->vm_flags) &&
> -				    folio_ref_count(folio) != 1)
> +				    (folio_maybe_dma_pinned(folio) ||
> +				     folio_estimated_sharers(folio) != 1))
Actually, "> 1" might be better if the first subpage is not mapped; it's a
mess.
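IOW, something like this (still completely untested, and still relying on
the first-subpage heuristic):

				/* Also skip shared copy-on-write pages */
				if (is_cow_mapping(vma->vm_flags) &&
				    (folio_maybe_dma_pinned(folio) ||
				     folio_estimated_sharers(folio) > 1))
					continue;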
--
Cheers,
David / dhildenb
On 11/15/2023 6:47 PM, David Hildenbrand wrote:
> On 15.11.23 11:46, David Hildenbrand wrote:
>> On 13.11.23 11:45, Baolin Wang wrote:
>>> Currently, file pages already support large folios, and support for
>>> anonymous pages is also under discussion[1]. Moreover, the NUMA
>>> balancing code was converted to use folios by a previous series[2],
>>> and the migrate_pages() function already supports large folio
>>> migration.
>>>
>>> So I see no reason to continue restricting NUMA balancing for large
>>> folios.
>>>
>>> [1] https://lkml.org/lkml/2023/9/29/342
>>> [2]
>>> https://lore.kernel.org/all/[email protected]/T/#md9d10fe34587229a72801f0d731f7457ab3f4a6e
>>> Signed-off-by: Baolin Wang <[email protected]>
>>> ---
>>
>> I'll note that another piece is missing, and I'd be curious how you
>> tested your patch set or what I am missing. (no anonymous pages?)
I tested it with file large folios (order = 4) created on an XFS filesystem.
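A minimal sketch of that kind of setup (file path and size below are just
placeholders, not the exact test): map a file on an XFS mount and touch
every page, so the range ends up backed by large folios via readahead and
NUMA balancing can later protect and migrate them.

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	const size_t len = 64UL << 20;	/* 64 MiB, placeholder */
	int fd = open("/mnt/xfs/testfile", O_RDONLY);
	char *p;

	if (fd < 0) {
		perror("open");
		return 1;
	}

	p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Touch each page so the whole range is faulted in. */
	for (size_t off = 0; off < len; off += 4096)
		(void)*(volatile char *)(p + off);

	/* Keep the mapping alive so NUMA balancing has time to act. */
	pause();
	return 0;
}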
>> change_pte_range() contains:
>>
>> 	if (prot_numa) {
>> 		...
>> 		/* Also skip shared copy-on-write pages */
>> 		if (is_cow_mapping(vma->vm_flags) &&
>> 		    folio_ref_count(folio) != 1)
>> 			continue;
>>
>> So we'll never end up mapping an anon PTE-mapped THP prot-none (well,
>> unless a
>> single PTE remains) and consequently never trigger NUMA hinting faults.
>>
>> Now, that change has some history [1], but the original problem has been
>> sorted out in the meantime. But we should consider Linus' original
>> feedback.
>>
>> For pte-mapped THP, we might want to do something like the following
>> (completely untested):
Thanks for pointing that out. I have not tried PTE-mapped THP yet and will
look at it in detail.
>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>> index 81991102f785..c4e6b9032e40 100644
>> --- a/mm/mprotect.c
>> +++ b/mm/mprotect.c
>> @@ -129,7 +129,8 @@ static long change_pte_range(struct mmu_gather *tlb,
>>  				/* Also skip shared copy-on-write pages */
>>  				if (is_cow_mapping(vma->vm_flags) &&
>> -				    folio_ref_count(folio) != 1)
>> +				    (folio_maybe_dma_pinned(folio) ||
>> +				     folio_estimated_sharers(folio) != 1))
>
> Actually, "> 1" might be better if the first subpage is not mapped; it's a
> mess.
>