NVIDIA's Xavier (Tegra194) SOC has three ARM SMMU(MMU-500) instances.
Two of the SMMU instances are used to interleave IOVA accesses across them.
The IOVA accesses from HW devices are interleaved across these two SMMU instances
and they need to be programmed identical.
The existing ARM SMMU driver can't be used in its current form for programming the
two SMMU instances identically. But, most of the code can be shared between ARM SMMU
driver and Tegra194 SMMU driver.
Page fault handling and TLB sync operations need to know about specific instance
of SMMU for correct fault handling and optimal TLB sync wait. Rest of the code doesn't
need to know about number of SMMU instances. Based on this fact, The patch series here
rearranges the arm-smmu.c code to allow sharing most of the ARM SMMU programming/iommu_ops
code between ARM SMMU driver and Tegra194 SMMU driver and transparently
handles programming of two SMMU instances.
The third SMMU instance would use the existing ARM SMMU driver.
Changes in v2:
* Added CONFIG_ARM_SMMU_TEGRA to protect Tegra194 SMMU driver compilation
* Enabled CONFIG_ARM_SMMU_TEGRA in defconfig
* Added SMMU nodes in Tegra194 device tree
Krishna Reddy (5):
iommu/arm-smmu: rearrange arm-smmu.c code
iommu/arm-smmu: Prepare fault, probe, sync functions for sharing code
iommu/tegra194_smmu: Add Tegra194 SMMU driver
arm64: defconfig: Enable ARM_SMMU_TEGRA
arm64: tegra: Add SMMU nodes to Tegra194 device tree
arch/arm64/boot/dts/nvidia/tegra194.dtsi | 148 ++
arch/arm64/configs/defconfig | 1 +
drivers/iommu/Kconfig | 10 +
drivers/iommu/Makefile | 1 +
drivers/iommu/arm-smmu-common.c | 1971 +++++++++++++++++++++++++++
drivers/iommu/arm-smmu-common.h | 256 ++++
drivers/iommu/arm-smmu.c | 2167 +-----------------------------
drivers/iommu/tegra194-smmu.c | 201 +++
8 files changed, 2595 insertions(+), 2160 deletions(-)
create mode 100644 drivers/iommu/arm-smmu-common.c
create mode 100644 drivers/iommu/arm-smmu-common.h
create mode 100644 drivers/iommu/tegra194-smmu.c
--
2.1.4
Hi Krishna,
On Wed, Oct 31, 2018 at 04:43:45PM -0700, Krishna Reddy wrote:
> NVIDIA's Xavier (Tegra194) SOC has three ARM SMMU(MMU-500) instances.
> Two of the SMMU instances are used to interleave IOVA accesses across them.
> The IOVA accesses from HW devices are interleaved across these two SMMU instances
> and they need to be programmed identical.
>
> The existing ARM SMMU driver can't be used in its current form for programming the
> two SMMU instances identically. But, most of the code can be shared between ARM SMMU
> driver and Tegra194 SMMU driver.
> Page fault handling and TLB sync operations need to know about specific instance
> of SMMU for correct fault handling and optimal TLB sync wait. Rest of the code doesn't
> need to know about number of SMMU instances. Based on this fact, The patch series here
> rearranges the arm-smmu.c code to allow sharing most of the ARM SMMU programming/iommu_ops
> code between ARM SMMU driver and Tegra194 SMMU driver and transparently
> handles programming of two SMMU instances.
Based on what I can see, it seems that you're trying to describe two
pieces of hardware with only one device in the DT. That seems like an
odd choice. Also, it seems like all three SMMUs share the same interrupt
line? That's somewhat suboptimal IMHO, but harder to change.
Why not instantiate both of them and create a reference between then
such that the TLB and sync ops are done on both of them in the native
driver? I.e. two arm_smmu structs and a pointer between then (i.e. add a
"next shared smmu" pointer in the struct arm_smmu).
As long as devices only references one of them, locking only that one should
provide suitable protection as well.
Seems like a simpler approach than adding a new layer to the driver, but maybe
I am missing some complexity here?
-Olof
Hi Olof,
Thanks for the comments!
>>Based on what I can see, it seems that you're trying to describe two
pieces of hardware with only one device in the DT. That seems like an
odd choice.
T194 SOC HW is designed to use Two ARM SMMU's together like one SMMU.
The IOVA accesses from the HW controllers clients is interleaved across these Two
SMMU's by Memory controller HW.
Even though, there are two ARM SMMU's, HW design requires SW programming
these two ARM SMMU instances identically.
>>Also, it seems like all three SMMUs share the same interrupt
line? That's somewhat suboptimal IMHO, but harder to change.
This is by HW design. SMMU interrupts are only expected if there is an illegal IOVA
access, which shouldn't happen in practical. The SMMU interrupts are only used
to log the faults and for debug purpose on T194.
>>Why not instantiate both of them and create a reference between then
such that the TLB and sync ops are done on both of them in the native
driver? I.e. two arm_smmu structs and a pointer between then (i.e. add a
"next shared smmu" pointer in the struct arm_smmu).
DMA framework supports only one SMMU driver instance per device.
SMMU driver setup iommu_ops pointer in the device struct during the boot when
the platform device is added to the bus.
There is no support to allow a platform device register with two different SMMU
devices.
Two different SMMU instances is 2x page table memory and 2x map/unmap overhead.
With SMMU driver knowing about two smmu instances, page table memory usage, map/unmap
Overhead would be efficient.
>>As long as devices only references one of them, locking only that one should
provide suitable protection as well.
Earlier proposal on creating one Master SMMU node and slave SMMU nodes within isn't liked.
Will has proposed creating a library and reusing the code. I have made the patches
to share the code.
https://www.spinics.net/lists/linux-tegra/msg35539.html
https://www.spinics.net/lists/linux-tegra/msg35494.html
>>Seems like a simpler approach than adding a new layer to the driver, but maybe
>>I am missing some complexity here?
Programming both SMMU instances identically with existing ARM SMMU driver is not
possible with multiple DT nodes.
In the following patch, by writel* macro override, both ARM SMMU instances are
programmed identically in Tegra194 SMMU driver transparently. Only tlb sync and
interrupt handling need to know about specific instance.
https://lkml.org/lkml/2018/11/1/573
+#define writel_relaxed writel_relaxed_all
+#define writeq_relaxed writeq_relaxed_all
+#define writel writel_all
-KR