2022-12-15 14:46:19

by Vladimir Oltean

Subject: [PATCH v2 1/2] arm64: dts: ls1028a: declare cache-coherent page table walk feature for IOMMU

The SMMUv2 driver for MMU-500 reads the ARM_SMMU_GR0_ID0 register at
probe time and tries to determine based on the CTTW (Coherent
Translation Table Walk) bit whether this feature is supported.

Unfortunately, it looks like the SMMU integration in the NXP LS1028A has
wrongly tied the cfg_cttw signal to 0, even though the SoC documentation
specifies that "The SMMU supports cache coherency for page table walks
and DVM transactions for page table cache maintenance operations."

Device tree provides the option of overriding the ID register via the
dma-coherent property since commit bae2c2d421cd ("iommu/arm-smmu: Sort
out coherency"), and that's what we do here.

Telling struct io_pgtable_cfg that the SMMU page table walks are
coherent with the CPU caches brings performance benefits, because it
avoids certain operations such as __arm_lpae_sync_pte() for PTE updates.

Link: https://lore.kernel.org/linux-iommu/[email protected]/
Suggested-by: Robin Murphy <[email protected]>
Signed-off-by: Vladimir Oltean <[email protected]>
---
v1->v2: reword commit message, drop Fixes: tag

vfio's problem with arm_smmu_capable(IOMMU_CAP_CACHE_COHERENCY) should
be resolved independently; I'm not claiming that this patch is the only
fix for that.

v1 at:
https://lore.kernel.org/linux-iommu/[email protected]/
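
For readers unfamiliar with the probe-time logic described in the commit
message, the following is a minimal standalone sketch of it, not the
driver code itself. The names model_smmu, model_coherent_walk(),
model_cttw_overridden() and model_pte_syncs() are hypothetical and exist
only for illustration. It assumes that CTTW is bit 14 of
ARM_SMMU_GR0_ID0, that the firmware description (dma-coherent) takes
precedence over the ID register, and that a non-coherent walker needs
one cache sync per PTE update.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: CTTW is bit 14 of ARM_SMMU_GR0_ID0. */
#define MODEL_ID0_CTTW (UINT32_C(1) << 14)

/* Hypothetical model of the two inputs to the coherency decision. */
struct model_smmu {
	uint32_t id0;         /* value read from the ID0 register */
	bool fw_dma_coherent; /* true if the DT node has "dma-coherent" */
};

/* Assumption: the firmware description wins over the ID register. */
static bool model_coherent_walk(const struct model_smmu *smmu)
{
	return smmu->fw_dma_coherent;
}

/* True when the register and the firmware description disagree. */
static bool model_cttw_overridden(const struct model_smmu *smmu)
{
	bool cttw_reg = (smmu->id0 & MODEL_ID0_CTTW) != 0;

	return cttw_reg != smmu->fw_dma_coherent;
}

/*
 * Simplified cost model: a non-coherent walker needs one cache sync
 * per PTE update, a coherent walker needs none.
 */
static unsigned int model_pte_syncs(const struct model_smmu *smmu,
				    unsigned int n_updates)
{
	return model_coherent_walk(smmu) ? 0 : n_updates;
}
```

With id0 = 0 (cfg_cttw tied low, as on the LS1028A) and
fw_dma_coherent = true, the model reports a coherent walk, flags the
register as overridden, and needs no syncs for PTE updates.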

 arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
index ac1c3a7e5f7a..9be0b4b7babf 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
@@ -712,6 +712,7 @@ smmu: iommu@5000000 {
 			reg = <0 0x5000000 0 0x800000>;
 			#global-interrupts = <8>;
 			#iommu-cells = <1>;
+			dma-coherent;
 			stream-match-mask = <0x7c00>;
 			/* global secure fault */
 			interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>,
--
2.34.1


2022-12-31 14:58:45

by Shawn Guo

Subject: Re: [PATCH v2 1/2] arm64: dts: ls1028a: declare cache-coherent page table walk feature for IOMMU

On Thu, Dec 15, 2022 at 03:56:35PM +0200, Vladimir Oltean wrote:
> The SMMUv2 driver for MMU-500 reads the ARM_SMMU_GR0_ID0 register at
> probe time and tries to determine based on the CTTW (Coherent
> Translation Table Walk) bit whether this feature is supported.
>
> Unfortunately, it looks like the SMMU integration in the NXP LS1028A has
> wrongly tied the cfg_cttw signal to 0, even though the SoC documentation
> specifies that "The SMMU supports cache coherency for page table walks
> and DVM transactions for page table cache maintenance operations."
>
> Device tree provides the option of overriding the ID register via the
> dma-coherent property since commit bae2c2d421cd ("iommu/arm-smmu: Sort
> out coherency"), and that's what we do here.
>
> Telling struct io_pgtable_cfg that the SMMU page table walks are
> coherent with the CPU caches brings performance benefits, because it
> avoids certain operations such as __arm_lpae_sync_pte() for PTE updates.
>
> Link: https://lore.kernel.org/linux-iommu/[email protected]/
> Suggested-by: Robin Murphy <[email protected]>
> Signed-off-by: Vladimir Oltean <[email protected]>

Applied both, thanks!