by Michael Ellerman

Subject: Re: [PATCH v2 3/5] mmu_notifiers: Call invalidate_range() when invalidating TLBs

Alistair Popple <[email protected]> writes:
> The invalidate_range() is going to become an architecture specific mmu
> notifier used to keep the TLB of secondary MMUs such as an IOMMU in
> sync with the CPU page tables. Currently it is called from separate
> code paths to the main CPU TLB invalidations. This can lead to a
> secondary TLB not getting invalidated when required and makes it hard
> to reason about when exactly the secondary TLB is invalidated.
>
> To fix this move the notifier call to the architecture specific TLB
> maintenance functions for architectures that have secondary MMUs
> requiring explicit software invalidations.
>
> This fixes a SMMU bug on ARM64. On ARM64 PTE permission upgrades
> require a TLB invalidation. This invalidation is done by the
> architecutre specific ptep_set_access_flags() which calls
^
architecture

> flush_tlb_page() if required. However this doesn't call the notifier
> resulting in infinite faults being generated by devices using the SMMU
> if it has previously cached a read-only PTE in its TLB.
>
> Moving the invalidations into the TLB invalidation functions ensures
> all invalidations happen at the same time as the CPU invalidation. The
> architecture specific flush_tlb_all() routines do not call the
> notifier as none of the IOMMUs require this.
>
> Signed-off-by: Alistair Popple <[email protected]>
> Suggested-by: Jason Gunthorpe <[email protected]>
>
...
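
To make the failure mode in the commit message concrete, here is a minimal userspace sketch of a secondary TLB that keeps a stale read-only entry after a permission upgrade. It is a toy model only: every name in it (toy_mm, toy_flush_tlb_page(), toy_device_write(), the notify_secondary flag) is hypothetical and none of it is kernel or SMMU code. The flag stands in for "the flush path also calls the notifier".

```c
#include <stdbool.h>

/* Toy model only: these types and helpers are illustrative, not kernel APIs. */
struct toy_pte { bool writable; };
struct toy_tlb { bool valid; bool writable; };

struct toy_mm {
	struct toy_pte pte;      /* the CPU page table entry */
	struct toy_tlb cpu_tlb;  /* primary (CPU) TLB entry */
	struct toy_tlb smmu_tlb; /* secondary (device) TLB entry */
};

/* Refill a TLB entry from the current page table entry. */
static void toy_tlb_fill(struct toy_tlb *tlb, const struct toy_pte *pte)
{
	tlb->valid = true;
	tlb->writable = pte->writable;
}

/*
 * Flush the CPU TLB entry. When notify_secondary is set, the secondary
 * TLB entry is dropped in the same step, which is the behaviour the
 * patch description asks for.
 */
static void toy_flush_tlb_page(struct toy_mm *mm, bool notify_secondary)
{
	mm->cpu_tlb.valid = false;
	if (notify_secondary)
		mm->smmu_tlb.valid = false;
}

/*
 * Model a device write. Each failed attempt "faults": the handler
 * upgrades the PTE to writable and flushes. Returns the number of
 * faults taken before the write succeeded, or -1 if the device was
 * still faulting after max_tries attempts.
 */
static int toy_device_write(struct toy_mm *mm, bool notify_secondary,
			    int max_tries)
{
	for (int faults = 0; faults < max_tries; faults++) {
		if (!mm->smmu_tlb.valid)
			toy_tlb_fill(&mm->smmu_tlb, &mm->pte);
		if (mm->smmu_tlb.writable)
			return faults;
		mm->pte.writable = true; /* permission upgrade */
		toy_flush_tlb_page(mm, notify_secondary);
	}
	return -1;
}
```

Starting from a read-only entry already cached in smmu_tlb, calling toy_device_write() with notify_secondary false never succeeds, because the stale entry is never dropped. With notify_secondary true the device takes one fault, refills from the upgraded PTE, and proceeds.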

> diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
> index 0bd4866..9724b26 100644
> --- a/arch/powerpc/mm/book3s64/radix_tlb.c
> +++ b/arch/powerpc/mm/book3s64/radix_tlb.c
> @@ -752,6 +752,8 @@ void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned long vmadd
>  		return radix__local_flush_hugetlb_page(vma, vmaddr);
>  #endif
>  	radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, mmu_virtual_psize);
> +	mmu_notifier_invalidate_range(vma->vm_mm, vmaddr,
> +					vmaddr + mmu_virtual_psize);
>  }
> EXPORT_SYMBOL(radix__local_flush_tlb_page);

I think we can skip calling the notifier there? It's explicitly a local flush.

cheers

2023-07-25 07:34:15

by Alistair Popple

Subject: Re: [PATCH v2 3/5] mmu_notifiers: Call invalidate_range() when invalidating TLBs

Michael Ellerman <[email protected]> writes:

> Alistair Popple <[email protected]> writes:
>> The invalidate_range() is going to become an architecture specific mmu
>> notifier used to keep the TLB of secondary MMUs such as an IOMMU in
>> sync with the CPU page tables. Currently it is called from separate
>> code paths to the main CPU TLB invalidations. This can lead to a
>> secondary TLB not getting invalidated when required and makes it hard
>> to reason about when exactly the secondary TLB is invalidated.
>>
>> To fix this move the notifier call to the architecture specific TLB
>> maintenance functions for architectures that have secondary MMUs
>> requiring explicit software invalidations.
>>
>> This fixes a SMMU bug on ARM64. On ARM64 PTE permission upgrades
>> require a TLB invalidation. This invalidation is done by the
>> architecutre specific ptep_set_access_flags() which calls
> ^
> architecture

Oh. I'd forgotten to apt install codespell ;-)

>> flush_tlb_page() if required. However this doesn't call the notifier
>> resulting in infinite faults being generated by devices using the SMMU
>> if it has previously cached a read-only PTE in its TLB.
>>
>> Moving the invalidations into the TLB invalidation functions ensures
>> all invalidations happen at the same time as the CPU invalidation. The
>> architecture specific flush_tlb_all() routines do not call the
>> notifier as none of the IOMMUs require this.
>>
>> Signed-off-by: Alistair Popple <[email protected]>
>> Suggested-by: Jason Gunthorpe <[email protected]>
>>
> ...
>
>> diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
>> index 0bd4866..9724b26 100644
>> --- a/arch/powerpc/mm/book3s64/radix_tlb.c
>> +++ b/arch/powerpc/mm/book3s64/radix_tlb.c
>> @@ -752,6 +752,8 @@ void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned long vmadd
>>  		return radix__local_flush_hugetlb_page(vma, vmaddr);
>>  #endif
>>  	radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, mmu_virtual_psize);
>> +	mmu_notifier_invalidate_range(vma->vm_mm, vmaddr,
>> +					vmaddr + mmu_virtual_psize);
>>  }
>> EXPORT_SYMBOL(radix__local_flush_tlb_page);
>
> I think we can skip calling the notifier there? It's explicitly a local flush.

I suspect you're correct. It's been a while since I last worked on the PPC
TLB invalidation code, though, and it's changed a fair bit since then, so I
was being conservative and would appreciate any comments there. I was
worried I may have missed some clever optimisation that detects when a local
flush is all that's needed, but I see OCXL calls mm_context_add_copro(), so
that should be ok. Will respin and drop it.

> cheers
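
The reasoning above, that a purely local flush never needs the notifier once a coprocessor has been attached, can be sketched as a small userspace model. This is an assumption-laden illustration, not the powerpc implementation: the toy_* names are hypothetical, and the rule that the first attached coprocessor is counted like an extra active CPU is my reading of what mm_context_add_copro() achieves, not something confirmed in this thread.

```c
#include <stdbool.h>

/* Toy model only: field and function names are illustrative assumptions. */
struct toy_mm_context {
	int active_cpus; /* CPUs that may hold translations for this mm */
	int copros;      /* attached coprocessors (e.g. an OCXL device) */
};

enum toy_flush { TOY_FLUSH_LOCAL, TOY_FLUSH_GLOBAL };

/*
 * Attach a coprocessor. In this model the first attachment is counted
 * as one more active "CPU", so the mm stops looking thread-local.
 */
static void toy_add_copro(struct toy_mm_context *ctx)
{
	if (++ctx->copros == 1)
		ctx->active_cpus++;
}

/* An mm is treated as local when at most one user is active. */
static bool toy_mm_is_local(const struct toy_mm_context *ctx)
{
	return ctx->active_cpus <= 1;
}

/*
 * Choose the flush type. Only the global path would need to reach a
 * secondary TLB, so only that path would call the notifier.
 */
static enum toy_flush toy_pick_flush(const struct toy_mm_context *ctx)
{
	return toy_mm_is_local(ctx) ? TOY_FLUSH_LOCAL : TOY_FLUSH_GLOBAL;
}
```

In this model a single-threaded mm with no coprocessor picks TOY_FLUSH_LOCAL, and the same mm picks TOY_FLUSH_GLOBAL as soon as toy_add_copro() has been called once, which is why dropping the notifier call from the local path loses nothing here.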