2018-10-15 12:09:44

by Hanna Hawa

Subject: [PATCH 0/4] Add system mmu support for Armada-806

From: Hanna Hawa <[email protected]>

This series adds IOMMU support for the AP806, including a workaround
for accessing the 64-bit registers of the ARM SMMU.
The AP-806 can't access SMMU registers with 64-bit width, so these patches
split each readq/writeq into two 32-bit accesses, due to erratum #582743.

Hanna Hawa (4):
iommu/arm-smmu: introduce wrapper for writeq/readq
iommu/arm-smmu: Workaround for Marvell Armada-AP806 SoC erratum
#582743
dt-bindings: iommu/arm,smmu: add compatible string for Marvell
arm64: dts: marvell: add smmu node for Armada-AP806

Documentation/arm64/silicon-errata.txt | 2 +
.../devicetree/bindings/iommu/arm,smmu.txt | 1 +
arch/arm64/boot/dts/marvell/armada-ap806.dtsi | 18 ++++++
drivers/iommu/arm-smmu.c | 65 ++++++++++++++++++----
4 files changed, 75 insertions(+), 11 deletions(-)

--
1.9.1



2018-10-15 12:07:30

by Hanna Hawa

Subject: [PATCH 4/4] arm64: dts: marvell: add smmu node for Armada-AP806

From: Hanna Hawa <[email protected]>

Add an SMMU node for the Marvell Armada-AP806 SoC.

Signed-off-by: Hanna Hawa <[email protected]>
---
arch/arm64/boot/dts/marvell/armada-ap806.dtsi | 17 +++++++++++++++++
1 file changed, 17 insertions(+)

diff --git a/arch/arm64/boot/dts/marvell/armada-ap806.dtsi b/arch/arm64/boot/dts/marvell/armada-ap806.dtsi
index 176e38d..b5758b6 100644
--- a/arch/arm64/boot/dts/marvell/armada-ap806.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-ap806.dtsi
@@ -97,6 +97,23 @@
interrupts = <17>;
};

+ smmu: iommu@5000000 {
+ compatible = "marvell,mmu-500";
+ reg = <0x100000 0x100000>;
+ dma-coherent;
+ #iommu-cells = <1>;
+ #global-interrupts = <1>;
+ interrupts = <GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>;
+ };
+
odmi: odmi@300000 {
compatible = "marvell,odmi-controller";
interrupt-controller;
--
1.9.1


2018-10-15 12:09:03

by Hanna Hawa

Subject: [PATCH 2/4] iommu/arm-smmu: Workaround for Marvell Armada-AP806 SoC erratum #582743

From: Hanna Hawa <[email protected]>

Due to erratum #582743, the Marvell Armada-AP806 can't perform 64-bit
accesses to the ARM SMMUv2 registers.
This patch splits each writeq/readq into two writel/readl accesses.

Note that splitting the writes/reads in two is not a problem with regard to
atomicity, because the driver uses readq/writeq while initializing the SMMU,
when reporting SMMU faults, and under a spinlock in one case (iova_to_phys).

Signed-off-by: Hanna Hawa <[email protected]>
---
Documentation/arm64/silicon-errata.txt | 2 ++
drivers/iommu/arm-smmu.c | 33 +++++++++++++++++++++++++++++----
2 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt
index 3b2f2dd..fc3f2a0 100644
--- a/Documentation/arm64/silicon-errata.txt
+++ b/Documentation/arm64/silicon-errata.txt
@@ -67,6 +67,8 @@ stable kernels.
| Cavium | ThunderX2 SMMUv3| #74 | N/A |
| Cavium | ThunderX2 SMMUv3| #126 | N/A |
| | | | |
+| Marvell | ARM-MMU-500 | #582743 | N/A |
+| | | | |
| Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 |
| | | | |
| Hisilicon | Hip0{5,6,7} | #161010101 | HISILICON_ERRATUM_161010101 |
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index fccb1d4..d64f892 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -119,6 +119,7 @@ enum arm_smmu_arch_version {
enum arm_smmu_implementation {
GENERIC_SMMU,
ARM_MMU500,
+ MRVL_MMU500,
CAVIUM_SMMUV2,
};

@@ -276,13 +277,35 @@ static inline void smmu_writeq_relaxed(struct arm_smmu_device *smmu,
u64 val,
void __iomem *addr)
{
- writeq_relaxed(val, addr);
+ /*
+ * Marvell Armada-AP806 erratum #582743.
+ * Split all the writeq to double writel
+ */
+ if (smmu->model != MRVL_MMU500) {
+ writeq_relaxed(val, addr);
+ return;
+ }
+
+ writel_relaxed(upper_32_bits(val), addr + 4);
+ writel_relaxed(lower_32_bits(val), addr);
}

static inline u64 smmu_readq_relaxed(struct arm_smmu_device *smmu,
void __iomem *addr)
{
- return readq_relaxed(addr);
+ u64 val;
+
+ /*
+ * Marvell Armada-AP806 erratum #582743.
+ * Split all the readq to double readl
+ */
+ if (smmu->model != MRVL_MMU500)
+ return readq_relaxed(addr);
+
+ val = (u64)readl_relaxed(addr + 4) << 32;
+ val |= readl_relaxed(addr);
+
+ return val;
}

static void parse_driver_options(struct arm_smmu_device *smmu)
@@ -1611,7 +1634,7 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
for (i = 0; i < smmu->num_mapping_groups; ++i)
arm_smmu_write_sme(smmu, i);

- if (smmu->model == ARM_MMU500) {
+ if (smmu->model == ARM_MMU500 || smmu->model == MRVL_MMU500) {
/*
* Before clearing ARM_MMU500_ACTLR_CPRE, need to
* clear CACHE_LOCK bit of ACR first. And, CACHE_LOCK
@@ -1640,7 +1663,7 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
* Disable MMU-500's not-particularly-beneficial next-page
* prefetcher for the sake of errata #841119 and #826419.
*/
- if (smmu->model == ARM_MMU500) {
+ if (smmu->model == ARM_MMU500 || smmu->model == MRVL_MMU500) {
reg = readl_relaxed(cb_base + ARM_SMMU_CB_ACTLR);
reg &= ~ARM_MMU500_ACTLR_CPRE;
writel_relaxed(reg, cb_base + ARM_SMMU_CB_ACTLR);
@@ -1923,6 +1946,7 @@ struct arm_smmu_match_data {
ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU);
ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, GENERIC_SMMU);
ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
+ARM_SMMU_MATCH_DATA(mrvl_mmu500, ARM_SMMU_V2, MRVL_MMU500);
ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);

static const struct of_device_id arm_smmu_of_match[] = {
@@ -1931,6 +1955,7 @@ struct arm_smmu_match_data {
{ .compatible = "arm,mmu-400", .data = &smmu_generic_v1 },
{ .compatible = "arm,mmu-401", .data = &arm_mmu401 },
{ .compatible = "arm,mmu-500", .data = &arm_mmu500 },
+ { .compatible = "marvell,mmu-500", .data = &mrvl_mmu500 },
{ .compatible = "cavium,smmu-v2", .data = &cavium_smmuv2 },
{ },
};
--
1.9.1


2018-10-15 12:09:03

by Hanna Hawa

Subject: [PATCH 1/4] iommu/arm-smmu: introduce wrapper for writeq/readq

From: Hanna Hawa <[email protected]>

This patch introduces the smmu_writeq_relaxed/smmu_readq_relaxed
helpers, in preparation for adding a Marvell-specific workaround for
accessing the 64-bit registers of the ARM SMMU.

Signed-off-by: Hanna Hawa <[email protected]>
---
drivers/iommu/arm-smmu.c | 36 +++++++++++++++++++++++++++---------
1 file changed, 27 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index fd1b80e..fccb1d4 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -88,9 +88,11 @@
* therefore this actually makes more sense than it might first appear.
*/
#ifdef CONFIG_64BIT
-#define smmu_write_atomic_lq writeq_relaxed
+#define smmu_write_atomic_lq(smmu, val, reg) \
+ smmu_writeq_relaxed(smmu, val, reg)
#else
-#define smmu_write_atomic_lq writel_relaxed
+#define smmu_write_atomic_lq(smmu, val, reg) \
+ writel_relaxed(val, reg)
#endif

/* Translation context bank */
@@ -270,6 +272,19 @@ static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
return container_of(dom, struct arm_smmu_domain, domain);
}

+static inline void smmu_writeq_relaxed(struct arm_smmu_device *smmu,
+ u64 val,
+ void __iomem *addr)
+{
+ writeq_relaxed(val, addr);
+}
+
+static inline u64 smmu_readq_relaxed(struct arm_smmu_device *smmu,
+ void __iomem *addr)
+{
+ return readq_relaxed(addr);
+}
+
static void parse_driver_options(struct arm_smmu_device *smmu)
{
int i = 0;
@@ -465,6 +480,7 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
size_t granule, bool leaf, void *cookie)
{
struct arm_smmu_domain *smmu_domain = cookie;
+ struct arm_smmu_device *smmu = smmu_domain->smmu;
struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
bool stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS;
void __iomem *reg = ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx);
@@ -483,7 +499,7 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
iova >>= 12;
iova |= (u64)cfg->asid << 48;
do {
- writeq_relaxed(iova, reg);
+ smmu_writeq_relaxed(smmu, iova, reg);
iova += granule >> 12;
} while (size -= granule);
}
@@ -492,7 +508,7 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
ARM_SMMU_CB_S2_TLBIIPAS2;
iova >>= 12;
do {
- smmu_write_atomic_lq(iova, reg);
+ smmu_write_atomic_lq(smmu, iova, reg);
iova += granule >> 12;
} while (size -= granule);
}
@@ -548,7 +564,7 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
return IRQ_NONE;

fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
- iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
+ iova = smmu_readq_relaxed(smmu, cb_base + ARM_SMMU_CB_FAR);

dev_err_ratelimited(smmu->dev,
"Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d\n",
@@ -698,9 +714,11 @@ static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
writel_relaxed(cb->ttbr[0], cb_base + ARM_SMMU_CB_TTBR0);
writel_relaxed(cb->ttbr[1], cb_base + ARM_SMMU_CB_TTBR1);
} else {
- writeq_relaxed(cb->ttbr[0], cb_base + ARM_SMMU_CB_TTBR0);
+ smmu_writeq_relaxed(smmu, cb->ttbr[0],
+ cb_base + ARM_SMMU_CB_TTBR0);
if (stage1)
- writeq_relaxed(cb->ttbr[1], cb_base + ARM_SMMU_CB_TTBR1);
+ smmu_writeq_relaxed(smmu, cb->ttbr[1],
+ cb_base + ARM_SMMU_CB_TTBR1);
}

/* MAIRs (stage-1 only) */
@@ -1279,7 +1297,7 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
/* ATS1 registers can only be written atomically */
va = iova & ~0xfffUL;
if (smmu->version == ARM_SMMU_V2)
- smmu_write_atomic_lq(va, cb_base + ARM_SMMU_CB_ATS1PR);
+ smmu_write_atomic_lq(smmu, va, cb_base + ARM_SMMU_CB_ATS1PR);
else /* Register is only 32-bit in v1 */
writel_relaxed(va, cb_base + ARM_SMMU_CB_ATS1PR);

@@ -1292,7 +1310,7 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
return ops->iova_to_phys(ops, iova);
}

- phys = readq_relaxed(cb_base + ARM_SMMU_CB_PAR);
+ phys = smmu_readq_relaxed(smmu, cb_base + ARM_SMMU_CB_PAR);
spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
if (phys & CB_PAR_F) {
dev_err(dev, "translation fault!\n");
--
1.9.1


2018-10-15 12:09:09

by Hanna Hawa

Subject: [PATCH 3/4] dt-bindings: iommu/arm,smmu: add compatible string for Marvell

From: Hanna Hawa <[email protected]>

Add a specific compatible string for Marvell, needed because of an erratum
affecting accesses to the 64-bit registers of the ARM SMMU in the AP806.

The AP806 SoC uses the generic ARM MMU-500 and there is no Marvell-specific
implementation; this compatible string is used for the erratum only.

Signed-off-by: Hanna Hawa <[email protected]>
---
Documentation/devicetree/bindings/iommu/arm,smmu.txt | 1 +
1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index 8a6ffce..92d7263 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -16,6 +16,7 @@ conditions.
"arm,mmu-400"
"arm,mmu-401"
"arm,mmu-500"
+ "marvell,mmu-500"
"cavium,smmu-v2"

depending on the particular implementation and/or the
--
1.9.1


2018-10-15 13:01:06

by Robin Murphy

Subject: Re: [PATCH 2/4] iommu/arm-smmu: Workaround for Marvell Armada-AP806 SoC erratum #582743

Hi Hanna,

On 15/10/18 13:00, [email protected] wrote:
> From: Hanna Hawa <[email protected]>
>
> Due to erratum #582743, the Marvell Armada-AP806 can't access 64bit
> to ARM SMMUv2 registers.
> This patch split the writeq/readq to two accesses of writel/readl.
>
> Note that separate writes/reads to 2 is not problem regards to atomicity,
> because the driver use the readq/writeq while initialize the SMMU, report
> for SMMU fault, and use spinlock in one case (iova_to_phys).

In general, this doesn't work. Here's what the SMMU spec says about
SMMU_CBn_TLBIVA, but others are similar:

"If SMMU_CBA2Rn.VA64 is one, then AArch64 format is selected. The
programmer should use 64 bit accesses to this register. If 32-bit
accesses are used then writes to the top 32 bits are ignored and writes
to the lower 32 bits are zero extended."

If your interconnect won't let 64-bit transactions through, then you
can't use AArch64 format at stage 1 at all, since there's no way to
invalidate entries with the correct ASID, and you'll have to restrict
stage 2 formats to at most 44-bit IOVAs in order for TLBIIPAS2{L} not to
invalidate the wrong thing.
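
To make the quoted behaviour concrete, here is a minimal userspace sketch of it. It is a model built only from the wording quoted above, not driver code: struct model_reg and every model_* helper are hypothetical names, and the operand layout (VA >> 12 with the ASID at bit 48) is taken from the TLBI loop in patch 1/4.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 64-bit TLBI register in AArch64 format. */
struct model_reg {
	uint64_t value;
};

/* A single 64-bit write reaches the register unchanged. */
static void model_write64(struct model_reg *r, uint64_t val)
{
	r->value = val;
}

/*
 * A 32-bit write, per the quoted text: a write to the top half
 * (offset 4) is ignored; a write to the bottom half (offset 0) is
 * zero-extended to 64 bits.
 */
static void model_write32(struct model_reg *r, unsigned int offset,
			  uint32_t val)
{
	if (offset == 0)
		r->value = val;
}

/* The split write from the patch: upper word first, then lower word. */
static void model_split_write(struct model_reg *r, uint64_t val)
{
	model_write32(r, 4, (uint32_t)(val >> 32));
	model_write32(r, 0, (uint32_t)val);
}

/* Stage 1 operand as built in the driver: VA >> 12 with ASID at bit 48. */
static uint64_t tlbiva_operand(uint64_t iova, uint16_t asid)
{
	return (iova >> 12) | ((uint64_t)asid << 48);
}

/* ASID field the modelled register would match against. */
static uint16_t model_asid(const struct model_reg *r)
{
	return (uint16_t)(r->value >> 48);
}

/* Self-check: the 64-bit write keeps the ASID, the split write loses it. */
static void model_selfcheck(void)
{
	struct model_reg r = { 0 };
	uint64_t op = tlbiva_operand(0x12345000ULL, 5);

	model_write64(&r, op);
	assert(model_asid(&r) == 5);

	model_split_write(&r, op);
	assert(model_asid(&r) == 0);
}
```

In this model the split write always leaves the ASID field at zero, which is why the invalidation cannot target the intended ASID.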

> Signed-off-by: Hanna Hawa <[email protected]>
> ---
> Documentation/arm64/silicon-errata.txt | 2 ++
> drivers/iommu/arm-smmu.c | 33 +++++++++++++++++++++++++++++----
> 2 files changed, 31 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt
> index 3b2f2dd..fc3f2a0 100644
> --- a/Documentation/arm64/silicon-errata.txt
> +++ b/Documentation/arm64/silicon-errata.txt
> @@ -67,6 +67,8 @@ stable kernels.
> | Cavium | ThunderX2 SMMUv3| #74 | N/A |
> | Cavium | ThunderX2 SMMUv3| #126 | N/A |
> | | | | |
> +| Marvell | ARM-MMU-500 | #582743 | N/A |
> +| | | | |

Nit: the convention here seems to be at least alphabetically sorted by
Implementer.

> | Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 |
> | | | | |
> | Hisilicon | Hip0{5,6,7} | #161010101 | HISILICON_ERRATUM_161010101 |
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index fccb1d4..d64f892 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -119,6 +119,7 @@ enum arm_smmu_arch_version {
> enum arm_smmu_implementation {
> GENERIC_SMMU,
> ARM_MMU500,
> + MRVL_MMU500,

Is there any actual modification to the MMU-500 RTL itself here, or is
the problem just in the interconnect in front of the SMMU programming
interface? I would normally assume the latter, in which case treating it
as a separate implementation isn't really accurate, and I'd much rather
handle any workaround via smmu->options, just like the secure access
workaround (which is a similar integration issue).
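
As a rough illustration of keying the workaround off an option bit rather than the implementation enum, here is a self-contained userspace sketch. Everything in it is an assumption made for illustration: SKETCH_OPT_SPLIT_64BIT_ACCESS, struct sketch_smmu and struct sketch_reg are hypothetical names, and the register is a plain struct that counts accesses instead of real MMIO. It only shows where the decision is made and does not address the TLBI concern above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical option bit; the name is invented for this sketch. */
#define SKETCH_OPT_SPLIT_64BIT_ACCESS	(1u << 0)

/* Stand-in for the driver's device structure; only "options" matters. */
struct sketch_smmu {
	uint32_t options;
};

/* Stand-in for a 64-bit register that counts bus accesses. */
struct sketch_reg {
	uint32_t lo;
	uint32_t hi;
	unsigned int accesses;
};

static uint64_t sketch_readq(const struct sketch_smmu *smmu,
			     struct sketch_reg *reg)
{
	uint64_t val;

	if (!(smmu->options & SKETCH_OPT_SPLIT_64BIT_ACCESS)) {
		/* Normal integration: one 64-bit access. */
		reg->accesses += 1;
		return ((uint64_t)reg->hi << 32) | reg->lo;
	}

	/* Affected integration: two 32-bit accesses, upper word first. */
	reg->accesses += 1;
	val = (uint64_t)reg->hi << 32;
	reg->accesses += 1;
	val |= reg->lo;

	return val;
}

/* Self-check: same value either way, but a different access count. */
static void sketch_selfcheck(void)
{
	struct sketch_smmu plain = { 0 };
	struct sketch_smmu split = { SKETCH_OPT_SPLIT_64BIT_ACCESS };
	struct sketch_reg reg = { 0x89abcdefu, 0x01234567u, 0 };

	assert(sketch_readq(&plain, &reg) == 0x0123456789abcdefULL);
	assert(reg.accesses == 1);
	assert(sketch_readq(&split, &reg) == 0x0123456789abcdefULL);
	assert(reg.accesses == 3);
}
```

With this shape the model can stay ARM_MMU500, so the existing MMU-500 checks in arm_smmu_device_reset() would not need a second comparison.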

Robin.

> CAVIUM_SMMUV2,
> };
>
> @@ -276,13 +277,35 @@ static inline void smmu_writeq_relaxed(struct arm_smmu_device *smmu,
> u64 val,
> void __iomem *addr)
> {
> - writeq_relaxed(val, addr);
> + /*
> + * Marvell Armada-AP806 erratum #582743.
> + * Split all the writeq to double writel
> + */
> + if (smmu->model != MRVL_MMU500) {
> + writeq_relaxed(val, addr);
> + return;
> + }
> +
> + writel_relaxed(upper_32_bits(val), addr + 4);
> + writel_relaxed(lower_32_bits(val), addr);
> }
>
> static inline u64 smmu_readq_relaxed(struct arm_smmu_device *smmu,
> void __iomem *addr)
> {
> - return readq_relaxed(addr);
> + u64 val;
> +
> + /*
> + * Marvell Armada-AP806 erratum #582743.
> + * Split all the readq to double readl
> + */
> + if (smmu->model != MRVL_MMU500)
> + return readq_relaxed(addr);
> +
> + val = (u64)readl_relaxed(addr + 4) << 32;
> + val |= readl_relaxed(addr);
> +
> + return val;
> }
>
> static void parse_driver_options(struct arm_smmu_device *smmu)
> @@ -1611,7 +1634,7 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
> for (i = 0; i < smmu->num_mapping_groups; ++i)
> arm_smmu_write_sme(smmu, i);
>
> - if (smmu->model == ARM_MMU500) {
> + if (smmu->model == ARM_MMU500 || smmu->model == MRVL_MMU500) {
> /*
> * Before clearing ARM_MMU500_ACTLR_CPRE, need to
> * clear CACHE_LOCK bit of ACR first. And, CACHE_LOCK
> @@ -1640,7 +1663,7 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
> * Disable MMU-500's not-particularly-beneficial next-page
> * prefetcher for the sake of errata #841119 and #826419.
> */
> - if (smmu->model == ARM_MMU500) {
> + if (smmu->model == ARM_MMU500 || smmu->model == MRVL_MMU500) {
> reg = readl_relaxed(cb_base + ARM_SMMU_CB_ACTLR);
> reg &= ~ARM_MMU500_ACTLR_CPRE;
> writel_relaxed(reg, cb_base + ARM_SMMU_CB_ACTLR);
> @@ -1923,6 +1946,7 @@ struct arm_smmu_match_data {
> ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU);
> ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, GENERIC_SMMU);
> ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
> +ARM_SMMU_MATCH_DATA(mrvl_mmu500, ARM_SMMU_V2, MRVL_MMU500);
> ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);
>
> static const struct of_device_id arm_smmu_of_match[] = {
> @@ -1931,6 +1955,7 @@ struct arm_smmu_match_data {
> { .compatible = "arm,mmu-400", .data = &smmu_generic_v1 },
> { .compatible = "arm,mmu-401", .data = &arm_mmu401 },
> { .compatible = "arm,mmu-500", .data = &arm_mmu500 },
> + { .compatible = "marvell,mmu-500", .data = &mrvl_mmu500 },
> { .compatible = "cavium,smmu-v2", .data = &cavium_smmuv2 },
> { },
> };
>

2018-10-15 13:12:38

by Robin Murphy

Subject: Re: [PATCH 3/4] dt-bindings: iommu/arm, smmu: add compatible string for Marvell

On 15/10/18 13:00, [email protected] wrote:
> From: Hanna Hawa <[email protected]>
>
> Add specific compatible string for Marvell usage due errata of
> accessing 64bit registers of ARM SMMU, in AP806.
>
> AP806 SOC use the generic ARM-MMU500, and there's no specific
> implementation of Marvell, this compatible is used for errata only.

Given that, I think something more specific like:

"marvell,ap806-smmu", "arm,mmu-500";

would be most appropriate. Otherwise, if some future Marvell SoC were to
ever come out with a *different* MMU-500 integration problem, you'd
already have painted yourself into a corner.

Alternatively (or additionally), we could perhaps consider a separate
property like "marvell,32bit-config-access", to mirror the existing
handling of the secure integration bug.

Robin.

> Signed-off-by: Hanna Hawa <[email protected]>
> ---
> Documentation/devicetree/bindings/iommu/arm,smmu.txt | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> index 8a6ffce..92d7263 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> @@ -16,6 +16,7 @@ conditions.
> "arm,mmu-400"
> "arm,mmu-401"
> "arm,mmu-500"
> + "marvell,mmu-500"
> "cavium,smmu-v2"
>
> depending on the particular implementation and/or the
>

2018-10-16 08:37:51

by Hanna Hawa

Subject: Re: [PATCH 2/4] iommu/arm-smmu: Workaround for Marvell Armada-AP806 SoC erratum #582743

Hi Robin,


On 10/15/2018 04:00 PM, Robin Murphy wrote:
> Hi Hanna,
>
> On 15/10/18 13:00, [email protected] wrote:
>> From: Hanna Hawa <[email protected]>
>>
>> Due to erratum #582743, the Marvell Armada-AP806 can't access 64bit
>> to ARM SMMUv2 registers.
>> This patch split the writeq/readq to two accesses of writel/readl.
>>
>> Note that separate writes/reads to 2 is not problem regards to atomicity,
>> because the driver use the readq/writeq while initialize the SMMU, report
>> for SMMU fault, and use spinlock in one case (iova_to_phys).
>
> In general, this doesn't work. Here's what the SMMU spec says about
> SMMU_CBn_TLBIVA, but others are similar:
>
> "If SMMU_CBA2Rn.VA64 is one, then AArch64 format is selected. The
> programmer should use 64 bit accesses to this register. If 32-bit
> accesses are used then writes to the top 32 bits are ignored and writes
> to the lower 32 bits are zero extended."
>
> If your interconnect won't let 64-bit transactions through, then you
> can't use AArch64 format at stage 1 at all, since there's no way to
> invalidate entries with the correct ASID, and you'll have to restrict
> stage 2 formats to at most 44-bit IOVAs in order for TLBIIPAS2{L} not to
> invalidate the wrong thing.
Thanks for your suggestion.

To restrict the IOVAs I need to add another workaround to the driver to
limit the va_size. Is that acceptable?

What is the difference in the driver between AARCH32_L and AARCH32_S?

>
>> Signed-off-by: Hanna Hawa <[email protected]>
>> ---
>> Documentation/arm64/silicon-errata.txt | 2 ++
>> drivers/iommu/arm-smmu.c | 33
>> +++++++++++++++++++++++++++++----
>> 2 files changed, 31 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/arm64/silicon-errata.txt
>> b/Documentation/arm64/silicon-errata.txt
>> index 3b2f2dd..fc3f2a0 100644
>> --- a/Documentation/arm64/silicon-errata.txt
>> +++ b/Documentation/arm64/silicon-errata.txt
>> @@ -67,6 +67,8 @@ stable kernels.
>> | Cavium | ThunderX2 SMMUv3| #74 |
>> N/A |
>> | Cavium | ThunderX2 SMMUv3| #126 |
>> N/A |
>> | | |
>> | |
>> +| Marvell | ARM-MMU-500 | #582743 |
>> N/A |
>> +| | |
>> | |
>
> Nit: the convention here seems to be at least alphabetically sorted by
> Implementer.
>
>> | Freescale/NXP | LS2080A/LS1043A | A-008585 |
>> FSL_ERRATUM_A008585 |
>> | | |
>> | |
>> | Hisilicon | Hip0{5,6,7} | #161010101 |
>> HISILICON_ERRATUM_161010101 |
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index fccb1d4..d64f892 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -119,6 +119,7 @@ enum arm_smmu_arch_version {
>> enum arm_smmu_implementation {
>> GENERIC_SMMU,
>> ARM_MMU500,
>> + MRVL_MMU500,
>
> Is there any actually modification to the MMU-500 RTL itself here, or is
> the problem just in the interconnect in front of the SMMU programming
> interface? I would normally assume the latter, in which case treating it
> as a separate implementation isn't really accurate, and I'd much rather
> handle any workaround via smmu->options, just like the secure access
> workaround (which is a similar integration issue).
There is no actual modification to the RTL; I'll use smmu->options.
Thanks for your review and suggestions.

Hanna
>
> Robin.
>
>> CAVIUM_SMMUV2,
>> };
>> @@ -276,13 +277,35 @@ static inline void smmu_writeq_relaxed(struct
>> arm_smmu_device *smmu,
>> u64 val,
>> void __iomem *addr)
>> {
>> - writeq_relaxed(val, addr);
>> + /*
>> + * Marvell Armada-AP806 erratum #582743.
>> + * Split all the writeq to double writel
>> + */
>> + if (smmu->model != MRVL_MMU500) {
>> + writeq_relaxed(val, addr);
>> + return;
>> + }
>> +
>> + writel_relaxed(upper_32_bits(val), addr + 4);
>> + writel_relaxed(lower_32_bits(val), addr);
>> }
>> static inline u64 smmu_readq_relaxed(struct arm_smmu_device *smmu,
>> void __iomem *addr)
>> {
>> - return readq_relaxed(addr);
>> + u64 val;
>> +
>> + /*
>> + * Marvell Armada-AP806 erratum #582743.
>> + * Split all the readq to double readl
>> + */
>> + if (smmu->model != MRVL_MMU500)
>> + return readq_relaxed(addr);
>> +
>> + val = (u64)readl_relaxed(addr + 4) << 32;
>> + val |= readl_relaxed(addr);
>> +
>> + return val;
>> }
>> static void parse_driver_options(struct arm_smmu_device *smmu)
>> @@ -1611,7 +1634,7 @@ static void arm_smmu_device_reset(struct
>> arm_smmu_device *smmu)
>> for (i = 0; i < smmu->num_mapping_groups; ++i)
>> arm_smmu_write_sme(smmu, i);
>> - if (smmu->model == ARM_MMU500) {
>> + if (smmu->model == ARM_MMU500 || smmu->model == MRVL_MMU500) {
>> /*
>> * Before clearing ARM_MMU500_ACTLR_CPRE, need to
>> * clear CACHE_LOCK bit of ACR first. And, CACHE_LOCK
>> @@ -1640,7 +1663,7 @@ static void arm_smmu_device_reset(struct
>> arm_smmu_device *smmu)
>> * Disable MMU-500's not-particularly-beneficial next-page
>> * prefetcher for the sake of errata #841119 and #826419.
>> */
>> - if (smmu->model == ARM_MMU500) {
>> + if (smmu->model == ARM_MMU500 || smmu->model == MRVL_MMU500) {
>> reg = readl_relaxed(cb_base + ARM_SMMU_CB_ACTLR);
>> reg &= ~ARM_MMU500_ACTLR_CPRE;
>> writel_relaxed(reg, cb_base + ARM_SMMU_CB_ACTLR);
>> @@ -1923,6 +1946,7 @@ struct arm_smmu_match_data {
>> ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU);
>> ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, GENERIC_SMMU);
>> ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
>> +ARM_SMMU_MATCH_DATA(mrvl_mmu500, ARM_SMMU_V2, MRVL_MMU500);
>> ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);
>> static const struct of_device_id arm_smmu_of_match[] = {
>> @@ -1931,6 +1955,7 @@ struct arm_smmu_match_data {
>> { .compatible = "arm,mmu-400", .data = &smmu_generic_v1 },
>> { .compatible = "arm,mmu-401", .data = &arm_mmu401 },
>> { .compatible = "arm,mmu-500", .data = &arm_mmu500 },
>> + { .compatible = "marvell,mmu-500", .data = &mrvl_mmu500 },
>> { .compatible = "cavium,smmu-v2", .data = &cavium_smmuv2 },
>> { },
>> };
>>

2018-10-18 16:09:32

by Robin Murphy

Subject: Re: [PATCH 2/4] iommu/arm-smmu: Workaround for Marvell Armada-AP806 SoC erratum #582743

On 16/10/18 09:25, Hanna Hawa wrote:
> Hi Robin,
>
>
> On 10/15/2018 04:00 PM, Robin Murphy wrote:
>> Hi Hanna,
>>
>> On 15/10/18 13:00, [email protected] wrote:
>>> From: Hanna Hawa <[email protected]>
>>>
>>> Due to erratum #582743, the Marvell Armada-AP806 can't access 64bit
>>> to ARM SMMUv2 registers.
>>> This patch split the writeq/readq to two accesses of writel/readl.
>>>
>>> Note that separate writes/reads to 2 is not problem regards to
>>> atomicity,
>>> because the driver use the readq/writeq while initialize the SMMU,
>>> report
>>> for SMMU fault, and use spinlock in one case (iova_to_phys).
>>
>> In general, this doesn't work. Here's what the SMMU spec says about
>> SMMU_CBn_TLBIVA, but others are similar:
>>
>> "If SMMU_CBA2Rn.VA64 is one, then AArch64 format is selected. The
>> programmer should use 64 bit accesses to this register. If 32-bit
>> accesses are used then writes to the top 32 bits are ignored and writes
>> to the lower 32 bits are zero extended."
>>
>> If your interconnect won't let 64-bit transactions through, then you
>> can't use AArch64 format at stage 1 at all, since there's no way to
>> invalidate entries with the correct ASID, and you'll have to restrict
>> stage 2 formats to at most 44-bit IOVAs in order for TLBIIPAS2{L} not to
>> invalidate the wrong thing.
> Thanks for your suggestion.
>
> To restrict the IOVAs i need to add another work-around to the driver to
> limit the va_size, is that acceptable?

Yeah, constraining AArch64 stage 2 to 44 bits should just be a case of
adjusting smmu->ipa_size at probe time, but you'd still need to add the
writel()-based TLBI path to take advantage of it.
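
The arithmetic behind the 44-bit figure can be sketched as follows. This is a standalone illustration, not driver code: the SKETCH_ and sketch_ names are hypothetical, and it assumes the TLBIIPAS2 operand is the address shifted right by 12, as in the TLBI loop of patch 1/4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Widest stage 2 input size whose (IPA >> 12) still fits in 32 bits. */
#define SKETCH_MAX_IPA_BITS_32BIT_TLBI	44u

/* Clamp a probed IPA size (in bits), as one might do at probe time. */
static unsigned int sketch_clamp_ipa_size(unsigned int ipa_size)
{
	return ipa_size > SKETCH_MAX_IPA_BITS_32BIT_TLBI ?
	       SKETCH_MAX_IPA_BITS_32BIT_TLBI : ipa_size;
}

/* True if the TLBI operand for this IPA survives a 32-bit write intact. */
static bool sketch_tlbi_operand_fits(uint64_t ipa)
{
	return (ipa >> 12) <= UINT32_MAX;
}

/* Self-check: 44 bits is exactly the boundary. */
static void sketch_ipa_selfcheck(void)
{
	assert(sketch_clamp_ipa_size(48) == 44);
	assert(sketch_clamp_ipa_size(40) == 40);
	assert(sketch_tlbi_operand_fits((1ULL << 44) - 1));
	assert(!sketch_tlbi_operand_fits(1ULL << 44));
}
```

Any address below 2^44 yields an operand that a zero-extended 32-bit write can carry unchanged; one more bit and the top of the operand would be dropped.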

How big is the physical memory map on these SoCs? If everything fits
into 40 bits then I think you could get away with simply hiding the
SMMU_IDR2.PTFSv8 fields to sidestep the AArch64 formats altogether, and
everything else should fall out in the wash. Otherwise, you'll have to
just disable stage 1 support in addition to the stage 2 workaround as
above.

> What the different in the driver between AARCH32_L & AARCH32_S?

AARCH32_L is the 3-level LPAE format, which gives you 32-bit
input/40-bit output at stage 1 and 40-bit input/40-bit output at stage
2. AARCH32_S is the legacy 2-level short-descriptor format which only
supports stage 1 and is limited to 32-bit output addresses - MMU-500
does support it, but you probably want to avoid it if possible ;)

Robin.

2018-10-18 21:11:06

by Rob Herring (Arm)

Subject: Re: [PATCH 3/4] dt-bindings: iommu/arm, smmu: add compatible string for Marvell

On Mon, Oct 15, 2018 at 02:11:52PM +0100, Robin Murphy wrote:
> On 15/10/18 13:00, [email protected] wrote:
> > From: Hanna Hawa <[email protected]>
> >
> > Add specific compatible string for Marvell usage due errata of
> > accessing 64bit registers of ARM SMMU, in AP806.
> >
> > AP806 SOC use the generic ARM-MMU500, and there's no specific
> > implementation of Marvell, this compatible is used for errata only.
>
> Given that, I think something more specific like:
>
> "marvell,ap806-smmu", "arm,mmu-500";
>
> would be most appropriate. Otherwise, if some future Marvell SoC were to
> ever come out with a *different* MMU-500 integration problem, you'd already
> have painted yourself into a corner.
>
> Alternatively (or additionally), we could perhaps consider a separate
> property like "marvell,32bit-config-access", to mirror the existing handling
> of the secure integration bug.

The former please. We have learned our lesson there (though for some
reason, that was the *only* SMMU problem in Calxeda Midway ;) ).

Rob