On arm64, the contiguous flag indicates that a contiguous address range
shares a single TLB entry. Flushing one entry in that range therefore
covers the whole range, so there is no need to flush the entire
contiguous range that a CONT_PMD/PTE mapping points to.

Signed-off-by: Kaihao Bai <[email protected]>
---
arch/arm64/mm/hugetlbpage.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 95364e8bdc19..9213072ce9c7 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -213,7 +213,7 @@ static pte_t get_clear_contig_flush(struct mm_struct *mm,
 	pte_t orig_pte = get_clear_contig(mm, addr, ptep, pgsize, ncontig);
 	struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);

-	flush_tlb_range(&vma, addr, addr + (pgsize * ncontig));
+	flush_tlb_page(&vma, addr);
 	return orig_pte;
 }

@@ -238,7 +238,7 @@ static void clear_flush(struct mm_struct *mm,
 	for (i = 0; i < ncontig; i++, addr += pgsize, ptep++)
 		pte_clear(mm, addr, ptep);

-	flush_tlb_range(&vma, saddr, addr);
+	flush_tlb_page(&vma, saddr);
 }

 static inline struct folio *hugetlb_swap_entry_to_folio(swp_entry_t entry)
--
2.27.0
On Tue, Feb 07, 2023 at 07:09:41PM +0800, Kaihao Bai wrote:
> On arm64, the contiguous flag indicates that a contiguous address range
> shares a single TLB entry. Flushing one entry in that range therefore
> covers the whole range, so there is no need to flush the entire
> contiguous range that a CONT_PMD/PTE mapping points to.
This doesn't work. The contiguous bit is a hint, so the CPU may not
coalesce multiple PTEs into a single TLB entry.
--
Catalin
On 2023/2/8 2:21, Catalin Marinas wrote:
> On Tue, Feb 07, 2023 at 07:09:41PM +0800, Kaihao Bai wrote:
>> On arm64, the contiguous flag indicates that a contiguous address range
>> shares a single TLB entry. Flushing one entry in that range therefore
>> covers the whole range, so there is no need to flush the entire
>> contiguous range that a CONT_PMD/PTE mapping points to.
>
> This doesn't work. The contiguous bit is a hint, so the CPU may not
> coalesce multiple PTEs into a single TLB entry.
>

Sorry, I misunderstood how the contiguous bit works. I re-checked and
found that "TLB maintenance must be performed based on the size of the
underlying translation table entries, to avoid TLB coherency issues."
Thanks for the clarification!
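
To make the point above concrete, here is a rough userspace sketch of
why a single-page flush is not enough. It is only a toy model, not
kernel code: struct toy_tlb, the toy_* helpers, and the NCONTIG/PGSIZE
values are all hypothetical names chosen for illustration. The
"coalesced" field stands in for whether the CPU happened to honor the
contiguous hint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCONTIG 16        /* illustrative: 16 base pages in one contiguous range */
#define PGSIZE  4096UL    /* illustrative base page size */

/* Toy TLB covering a single contiguous range (hypothetical model). */
struct toy_tlb {
	bool valid[NCONTIG];  /* one cached translation per base page */
	bool coalesced;       /* true if the "CPU" merged the range into one entry */
};

/* Populate translations for the whole range. */
static void toy_tlb_fill(struct toy_tlb *tlb, bool coalesced)
{
	for (size_t i = 0; i < NCONTIG; i++)
		tlb->valid[i] = true;
	tlb->coalesced = coalesced;
}

/* Invalidate by a single address, as a one-page flush would. */
static void toy_flush_page(struct toy_tlb *tlb, unsigned long base,
			   unsigned long addr)
{
	size_t idx = (addr - base) / PGSIZE;

	assert(idx < NCONTIG);
	if (tlb->coalesced) {
		/* One merged entry: hitting any page drops the whole range. */
		for (size_t i = 0; i < NCONTIG; i++)
			tlb->valid[i] = false;
	} else {
		/* Hint ignored: only the entry for this page is dropped. */
		tlb->valid[idx] = false;
	}
}

/* Invalidate every base page in [start, end). */
static void toy_flush_range(struct toy_tlb *tlb, unsigned long base,
			    unsigned long start, unsigned long end)
{
	for (unsigned long addr = start; addr < end; addr += PGSIZE)
		toy_flush_page(tlb, base, addr);
}

/* Count translations that are still cached. */
static size_t toy_stale_entries(const struct toy_tlb *tlb)
{
	size_t n = 0;

	for (size_t i = 0; i < NCONTIG; i++)
		n += tlb->valid[i];
	return n;
}
```

In this model, toy_flush_page() on the first address leaves no stale
entries only when "coalesced" is true; when the hint is ignored, the
other NCONTIG - 1 translations survive. toy_flush_range() over the
whole range is safe in both cases, which is the property the original
flush_tlb_range() calls provide.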