2020-11-02 17:16:28

by Jordan Crouse

[permalink] [raw]
Subject: [PATCH v18 0/4] iommu/arm-smmu: Add adreno-smmu implementation and bindings

(resend with expanded CC list so everybody can see all the patches)

This short series adds support for the adreno-smmu implementation of the
arm-smmu driver and the device-tree bindings to turn on the implementation
for the sm845 and sc7180 GPUs. These changes are the last ones needed to enable
per-instance pagetables in the drm/msm driver.

No deltas in this patchset since the last go-around for 5.10 [1].

[1] https://patchwork.freedesktop.org/series/81393/

Jordan Crouse (3):
iommu/arm-smmu-qcom: Add implementation for the adreno GPU SMMU
dt-bindings: arm-smmu: Add compatible string for Adreno GPU SMMU
arm: dts: qcom: sm845: Set the compatible string for the GPU SMMU

Rob Clark (1):
iommu/arm-smmu: Add a way for implementations to influence SCTLR

.../devicetree/bindings/iommu/arm,smmu.yaml | 9 +-
arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi | 9 +
arch/arm64/boot/dts/qcom/sdm845.dtsi | 2 +-
drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 3 +
drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 157 +++++++++++++++++-
drivers/iommu/arm/arm-smmu/arm-smmu.c | 3 +
drivers/iommu/arm/arm-smmu/arm-smmu.h | 4 +
7 files changed, 182 insertions(+), 5 deletions(-)

--
2.25.1


2020-11-02 17:16:32

by Jordan Crouse

[permalink] [raw]
Subject: [PATCH v18 4/4] arm: dts: qcom: sm845: Set the compatible string for the GPU SMMU

Set the qcom,adreno-smmu compatible string for the GPU SMMU to enable
split pagetables and per-instance pagetables for drm/msm.

Signed-off-by: Jordan Crouse <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Bjorn Andersson <[email protected]>
---

arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi | 9 +++++++++
arch/arm64/boot/dts/qcom/sdm845.dtsi | 2 +-
2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi b/arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi
index 64fc1bfd66fa..39f23cdcbd02 100644
--- a/arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi
@@ -633,6 +633,15 @@ &mdss_mdp {
status = "okay";
};

+/*
+ * Cheza fw does not properly program the GPU aperture to allow the
+ * GPU to update the SMMU pagetables for context switches. Work
+ * around this by dropping the "qcom,adreno-smmu" compat string.
+ */
+&adreno_smmu {
+ compatible = "qcom,sdm845-smmu-v2", "qcom,smmu-v2";
+};
+
&mss_pil {
iommus = <&apps_smmu 0x781 0x0>,
<&apps_smmu 0x724 0x3>;
diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index 40e8c11f23ab..0508e86140bd 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -4103,7 +4103,7 @@ opp-257000000 {
};

adreno_smmu: iommu@5040000 {
- compatible = "qcom,sdm845-smmu-v2", "qcom,smmu-v2";
+ compatible = "qcom,sdm845-smmu-v2", "qcom,adreno-smmu", "qcom,smmu-v2";
reg = <0 0x5040000 0 0x10000>;
#iommu-cells = <1>;
#global-interrupts = <2>;
--
2.25.1

2020-11-02 17:17:25

by Jordan Crouse

[permalink] [raw]
Subject: [PATCH v18 2/4] iommu/arm-smmu: Add a way for implementations to influence SCTLR

From: Rob Clark <[email protected]>

For the Adreno GPU's SMMU, we want SCTLR.HUPCF set to ensure that
pending translations are not terminated on iova fault. Otherwise
a terminated CP read could hang the GPU by returning invalid
command-stream data.

Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Bjorn Andersson <[email protected]>
Signed-off-by: Jordan Crouse <[email protected]>
---

drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 6 ++++++
drivers/iommu/arm/arm-smmu/arm-smmu.c | 3 +++
drivers/iommu/arm/arm-smmu/arm-smmu.h | 3 +++
3 files changed, 12 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index 1e942eed2dfc..0663d7d26908 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -129,6 +129,12 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
(smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH64))
pgtbl_cfg->quirks |= IO_PGTABLE_QUIRK_ARM_TTBR1;

+ /*
+ * On the GPU device we want to process subsequent transactions after a
+ * fault to keep the GPU from hanging
+ */
+ smmu_domain->cfg.sctlr_set |= ARM_SMMU_SCTLR_HUPCF;
+
/*
* Initialize private interface with GPU:
*/
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index dad7fa86fbd4..1f06ab219819 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -617,6 +617,9 @@ void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
reg |= ARM_SMMU_SCTLR_E;

+ reg |= cfg->sctlr_set;
+ reg &= ~cfg->sctlr_clr;
+
arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
}

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index 6c5ff9999eae..ddf2ca4c923d 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -144,6 +144,7 @@ enum arm_smmu_cbar_type {
#define ARM_SMMU_CB_SCTLR 0x0
#define ARM_SMMU_SCTLR_S1_ASIDPNE BIT(12)
#define ARM_SMMU_SCTLR_CFCFG BIT(7)
+#define ARM_SMMU_SCTLR_HUPCF BIT(8)
#define ARM_SMMU_SCTLR_CFIE BIT(6)
#define ARM_SMMU_SCTLR_CFRE BIT(5)
#define ARM_SMMU_SCTLR_E BIT(4)
@@ -341,6 +342,8 @@ struct arm_smmu_cfg {
u16 asid;
u16 vmid;
};
+ u32 sctlr_set; /* extra bits to set in SCTLR */
+ u32 sctlr_clr; /* bits to mask in SCTLR */
enum arm_smmu_cbar_type cbar;
enum arm_smmu_context_fmt fmt;
};
--
2.25.1

2020-11-02 17:17:43

by Jordan Crouse

[permalink] [raw]
Subject: [PATCH v18 1/4] iommu/arm-smmu-qcom: Add implementation for the adreno GPU SMMU

Add a special implementation for the SMMU attached to most Adreno GPU
target triggered from the qcom,adreno-smmu compatible string.

The new Adreno SMMU implementation will enable split pagetables
(TTBR1) for the domain attached to the GPU device (SID 0) and
hard code it context bank 0 so the GPU hardware can implement
per-instance pagetables.

Co-developed-by: Rob Clark <[email protected]>
Signed-off-by: Jordan Crouse <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Bjorn Andersson <[email protected]>
---

drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 3 +
drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 151 ++++++++++++++++++++-
drivers/iommu/arm/arm-smmu/arm-smmu.h | 1 +
3 files changed, 153 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
index 88f17cc33023..d199b4bff15d 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
@@ -223,6 +223,9 @@ struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
of_device_is_compatible(np, "qcom,sm8250-smmu-500"))
return qcom_smmu_impl_init(smmu);

+ if (of_device_is_compatible(smmu->dev->of_node, "qcom,adreno-smmu"))
+ return qcom_adreno_smmu_impl_init(smmu);
+
if (of_device_is_compatible(np, "marvell,ap806-smmu-500"))
smmu->impl = &mrvl_mmu500_impl;

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index be4318044f96..1e942eed2dfc 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -3,6 +3,7 @@
* Copyright (c) 2019, The Linux Foundation. All rights reserved.
*/

+#include <linux/adreno-smmu-priv.h>
#include <linux/of_device.h>
#include <linux/qcom_scm.h>

@@ -12,6 +13,134 @@ struct qcom_smmu {
struct arm_smmu_device smmu;
};

+#define QCOM_ADRENO_SMMU_GPU_SID 0
+
+static bool qcom_adreno_smmu_is_gpu_device(struct device *dev)
+{
+ struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+ int i;
+
+ /*
+ * The GPU will always use SID 0 so that is a handy way to uniquely
+ * identify it and configure it for per-instance pagetables
+ */
+ for (i = 0; i < fwspec->num_ids; i++) {
+ u16 sid = FIELD_GET(ARM_SMMU_SMR_ID, fwspec->ids[i]);
+
+ if (sid == QCOM_ADRENO_SMMU_GPU_SID)
+ return true;
+ }
+
+ return false;
+}
+
+static const struct io_pgtable_cfg *qcom_adreno_smmu_get_ttbr1_cfg(
+ const void *cookie)
+{
+ struct arm_smmu_domain *smmu_domain = (void *)cookie;
+ struct io_pgtable *pgtable =
+ io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
+ return &pgtable->cfg;
+}
+
+/*
+ * Local implementation to configure TTBR0 with the specified pagetable config.
+ * The GPU driver will call this to enable TTBR0 when per-instance pagetables
+ * are active
+ */
+
+static int qcom_adreno_smmu_set_ttbr0_cfg(const void *cookie,
+ const struct io_pgtable_cfg *pgtbl_cfg)
+{
+ struct arm_smmu_domain *smmu_domain = (void *)cookie;
+ struct io_pgtable *pgtable = io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
+ struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+ struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
+
+ /* The domain must have split pagetables already enabled */
+ if (cb->tcr[0] & ARM_SMMU_TCR_EPD1)
+ return -EINVAL;
+
+ /* If the pagetable config is NULL, disable TTBR0 */
+ if (!pgtbl_cfg) {
+ /* Do nothing if it is already disabled */
+ if ((cb->tcr[0] & ARM_SMMU_TCR_EPD0))
+ return -EINVAL;
+
+ /* Set TCR to the original configuration */
+ cb->tcr[0] = arm_smmu_lpae_tcr(&pgtable->cfg);
+ cb->ttbr[0] = FIELD_PREP(ARM_SMMU_TTBRn_ASID, cb->cfg->asid);
+ } else {
+ u32 tcr = cb->tcr[0];
+
+ /* Don't call this again if TTBR0 is already enabled */
+ if (!(cb->tcr[0] & ARM_SMMU_TCR_EPD0))
+ return -EINVAL;
+
+ tcr |= arm_smmu_lpae_tcr(pgtbl_cfg);
+ tcr &= ~(ARM_SMMU_TCR_EPD0 | ARM_SMMU_TCR_EPD1);
+
+ cb->tcr[0] = tcr;
+ cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
+ cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID, cb->cfg->asid);
+ }
+
+ arm_smmu_write_context_bank(smmu_domain->smmu, cb->cfg->cbndx);
+
+ return 0;
+}
+
+static int qcom_adreno_smmu_alloc_context_bank(struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_device *smmu,
+ struct device *dev, int start)
+{
+ int count;
+
+ /*
+ * Assign context bank 0 to the GPU device so the GPU hardware can
+ * switch pagetables
+ */
+ if (qcom_adreno_smmu_is_gpu_device(dev)) {
+ start = 0;
+ count = 1;
+ } else {
+ start = 1;
+ count = smmu->num_context_banks;
+ }
+
+ return __arm_smmu_alloc_bitmap(smmu->context_map, start, count);
+}
+
+static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
+ struct io_pgtable_cfg *pgtbl_cfg, struct device *dev)
+{
+ struct adreno_smmu_priv *priv;
+
+ /* Only enable split pagetables for the GPU device (SID 0) */
+ if (!qcom_adreno_smmu_is_gpu_device(dev))
+ return 0;
+
+ /*
+ * All targets that use the qcom,adreno-smmu compatible string *should*
+ * be AARCH64 stage 1 but double check because the arm-smmu code assumes
+ * that is the case when the TTBR1 quirk is enabled
+ */
+ if ((smmu_domain->stage == ARM_SMMU_DOMAIN_S1) &&
+ (smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH64))
+ pgtbl_cfg->quirks |= IO_PGTABLE_QUIRK_ARM_TTBR1;
+
+ /*
+ * Initialize private interface with GPU:
+ */
+
+ priv = dev_get_drvdata(dev);
+ priv->cookie = smmu_domain;
+ priv->get_ttbr1_cfg = qcom_adreno_smmu_get_ttbr1_cfg;
+ priv->set_ttbr0_cfg = qcom_adreno_smmu_set_ttbr0_cfg;
+
+ return 0;
+}
+
static const struct of_device_id qcom_smmu_client_of_match[] __maybe_unused = {
{ .compatible = "qcom,adreno" },
{ .compatible = "qcom,mdp4" },
@@ -65,7 +194,15 @@ static const struct arm_smmu_impl qcom_smmu_impl = {
.reset = qcom_smmu500_reset,
};

-struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu)
+static const struct arm_smmu_impl qcom_adreno_smmu_impl = {
+ .init_context = qcom_adreno_smmu_init_context,
+ .def_domain_type = qcom_smmu_def_domain_type,
+ .reset = qcom_smmu500_reset,
+ .alloc_context_bank = qcom_adreno_smmu_alloc_context_bank,
+};
+
+static struct arm_smmu_device *qcom_smmu_create(struct arm_smmu_device *smmu,
+ const struct arm_smmu_impl *impl)
{
struct qcom_smmu *qsmmu;

@@ -75,8 +212,18 @@ struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu)

qsmmu->smmu = *smmu;

- qsmmu->smmu.impl = &qcom_smmu_impl;
+ qsmmu->smmu.impl = impl;
devm_kfree(smmu->dev, smmu);

return &qsmmu->smmu;
}
+
+struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu)
+{
+ return qcom_smmu_create(smmu, &qcom_smmu_impl);
+}
+
+struct arm_smmu_device *qcom_adreno_smmu_impl_init(struct arm_smmu_device *smmu)
+{
+ return qcom_smmu_create(smmu, &qcom_adreno_smmu_impl);
+}
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index 1a746476927c..6c5ff9999eae 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -520,6 +520,7 @@ static inline void arm_smmu_writeq(struct arm_smmu_device *smmu, int page,
struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu);
struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu);
struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu);
+struct arm_smmu_device *qcom_adreno_smmu_impl_init(struct arm_smmu_device *smmu);

void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx);
int arm_mmu500_reset(struct arm_smmu_device *smmu);
--
2.25.1

2020-11-02 17:19:10

by Jordan Crouse

[permalink] [raw]
Subject: [PATCH v18 3/4] dt-bindings: arm-smmu: Add compatible string for Adreno GPU SMMU

Every Qcom Adreno GPU has an embedded SMMU for its own use. These
devices depend on unique features such as split pagetables,
different stall/halt requirements and other settings. Identify them
with a compatible string so that they can be identified in the
arm-smmu implementation specific code.

Signed-off-by: Jordan Crouse <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Bjorn Andersson <[email protected]>
---

Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
index 503160a7b9a0..3b63f2ae24db 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
@@ -28,8 +28,6 @@ properties:
- enum:
- qcom,msm8996-smmu-v2
- qcom,msm8998-smmu-v2
- - qcom,sc7180-smmu-v2
- - qcom,sdm845-smmu-v2
- const: qcom,smmu-v2

- description: Qcom SoCs implementing "arm,mmu-500"
@@ -40,6 +38,13 @@ properties:
- qcom,sm8150-smmu-500
- qcom,sm8250-smmu-500
- const: arm,mmu-500
+ - description: Qcom Adreno GPUs implementing "arm,smmu-v2"
+ items:
+ - enum:
+ - qcom,sc7180-smmu-v2
+ - qcom,sdm845-smmu-v2
+ - const: qcom,adreno-smmu
+ - const: qcom,smmu-v2
- description: Marvell SoCs implementing "arm,mmu-500"
items:
- const: marvell,ap806-smmu-500
--
2.25.1

2020-11-02 18:13:21

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH v18 1/4] iommu/arm-smmu-qcom: Add implementation for the adreno GPU SMMU

On 2020-11-02 17:14, Jordan Crouse wrote:
> Add a special implementation for the SMMU attached to most Adreno GPU
> target triggered from the qcom,adreno-smmu compatible string.
>
> The new Adreno SMMU implementation will enable split pagetables
> (TTBR1) for the domain attached to the GPU device (SID 0) and
> hard code it context bank 0 so the GPU hardware can implement
> per-instance pagetables.
>
> Co-developed-by: Rob Clark <[email protected]>
> Signed-off-by: Jordan Crouse <[email protected]>
> Signed-off-by: Rob Clark <[email protected]>
> Reviewed-by: Bjorn Andersson <[email protected]>
> ---
>
> drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 3 +
> drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 151 ++++++++++++++++++++-
> drivers/iommu/arm/arm-smmu/arm-smmu.h | 1 +
> 3 files changed, 153 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> index 88f17cc33023..d199b4bff15d 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> @@ -223,6 +223,9 @@ struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
> of_device_is_compatible(np, "qcom,sm8250-smmu-500"))
> return qcom_smmu_impl_init(smmu);
>
> + if (of_device_is_compatible(smmu->dev->of_node, "qcom,adreno-smmu"))
> + return qcom_adreno_smmu_impl_init(smmu);
> +
> if (of_device_is_compatible(np, "marvell,ap806-smmu-500"))
> smmu->impl = &mrvl_mmu500_impl;
>
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> index be4318044f96..1e942eed2dfc 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> @@ -3,6 +3,7 @@
> * Copyright (c) 2019, The Linux Foundation. All rights reserved.
> */
>
> +#include <linux/adreno-smmu-priv.h>
> #include <linux/of_device.h>
> #include <linux/qcom_scm.h>
>
> @@ -12,6 +13,134 @@ struct qcom_smmu {
> struct arm_smmu_device smmu;
> };
>
> +#define QCOM_ADRENO_SMMU_GPU_SID 0
> +
> +static bool qcom_adreno_smmu_is_gpu_device(struct device *dev)
> +{
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> + int i;
> +
> + /*
> + * The GPU will always use SID 0 so that is a handy way to uniquely
> + * identify it and configure it for per-instance pagetables
> + */
> + for (i = 0; i < fwspec->num_ids; i++) {
> + u16 sid = FIELD_GET(ARM_SMMU_SMR_ID, fwspec->ids[i]);
> +
> + if (sid == QCOM_ADRENO_SMMU_GPU_SID)
> + return true;
> + }
> +
> + return false;
> +}
> +
> +static const struct io_pgtable_cfg *qcom_adreno_smmu_get_ttbr1_cfg(
> + const void *cookie)
> +{
> + struct arm_smmu_domain *smmu_domain = (void *)cookie;
> + struct io_pgtable *pgtable =
> + io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
> + return &pgtable->cfg;
> +}
> +
> +/*
> + * Local implementation to configure TTBR0 with the specified pagetable config.
> + * The GPU driver will call this to enable TTBR0 when per-instance pagetables
> + * are active
> + */
> +
> +static int qcom_adreno_smmu_set_ttbr0_cfg(const void *cookie,
> + const struct io_pgtable_cfg *pgtbl_cfg)
> +{
> + struct arm_smmu_domain *smmu_domain = (void *)cookie;
> + struct io_pgtable *pgtable = io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
> + struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> + struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
> +
> + /* The domain must have split pagetables already enabled */
> + if (cb->tcr[0] & ARM_SMMU_TCR_EPD1)
> + return -EINVAL;
> +
> + /* If the pagetable config is NULL, disable TTBR0 */
> + if (!pgtbl_cfg) {
> + /* Do nothing if it is already disabled */
> + if ((cb->tcr[0] & ARM_SMMU_TCR_EPD0))
> + return -EINVAL;
> +
> + /* Set TCR to the original configuration */
> + cb->tcr[0] = arm_smmu_lpae_tcr(&pgtable->cfg);
> + cb->ttbr[0] = FIELD_PREP(ARM_SMMU_TTBRn_ASID, cb->cfg->asid);
> + } else {
> + u32 tcr = cb->tcr[0];
> +
> + /* Don't call this again if TTBR0 is already enabled */
> + if (!(cb->tcr[0] & ARM_SMMU_TCR_EPD0))
> + return -EINVAL;
> +
> + tcr |= arm_smmu_lpae_tcr(pgtbl_cfg);
> + tcr &= ~(ARM_SMMU_TCR_EPD0 | ARM_SMMU_TCR_EPD1);
> +
> + cb->tcr[0] = tcr;
> + cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
> + cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID, cb->cfg->asid);
> + }
> +
> + arm_smmu_write_context_bank(smmu_domain->smmu, cb->cfg->cbndx);
> +
> + return 0;
> +}
> +
> +static int qcom_adreno_smmu_alloc_context_bank(struct arm_smmu_domain *smmu_domain,
> + struct arm_smmu_device *smmu,
> + struct device *dev, int start)
> +{
> + int count;
> +
> + /*
> + * Assign context bank 0 to the GPU device so the GPU hardware can
> + * switch pagetables
> + */
> + if (qcom_adreno_smmu_is_gpu_device(dev)) {
> + start = 0;
> + count = 1;
> + } else {
> + start = 1;
> + count = smmu->num_context_banks;
> + }
> +
> + return __arm_smmu_alloc_bitmap(smmu->context_map, start, count);
> +}
> +static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
> + struct io_pgtable_cfg *pgtbl_cfg, struct device *dev)
> +{
> + struct adreno_smmu_priv *priv;
> +
> + /* Only enable split pagetables for the GPU device (SID 0) */
> + if (!qcom_adreno_smmu_is_gpu_device(dev))
> + return 0;
> +
> + /*
> + * All targets that use the qcom,adreno-smmu compatible string *should*
> + * be AARCH64 stage 1 but double check because the arm-smmu code assumes
> + * that is the case when the TTBR1 quirk is enabled
> + */
> + if ((smmu_domain->stage == ARM_SMMU_DOMAIN_S1) &&
> + (smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH64))
> + pgtbl_cfg->quirks |= IO_PGTABLE_QUIRK_ARM_TTBR1;
> +
> + /*
> + * Initialize private interface with GPU:
> + */
> +
> + priv = dev_get_drvdata(dev);
> + priv->cookie = smmu_domain;
> + priv->get_ttbr1_cfg = qcom_adreno_smmu_get_ttbr1_cfg;
> + priv->set_ttbr0_cfg = qcom_adreno_smmu_set_ttbr0_cfg;


I still think it would have been logical to reserve context bank 0
outright in cfg_probe, then just swizzle cbndx/irptndx at this point
once everything else has proven that this is to be the One Special
Domain. I guess this way at least you don't have to intervene in
domain_free, but by the same token that means you never get to clean up
the dangling pointer in priv->cookie, which is a little bit yuck. Oh well...

Acked-by: Robin Murphy <[email protected]>

Thanks,
Robin.

> +
> + return 0;
> +}
> +
> static const struct of_device_id qcom_smmu_client_of_match[] __maybe_unused = {
> { .compatible = "qcom,adreno" },
> { .compatible = "qcom,mdp4" },
> @@ -65,7 +194,15 @@ static const struct arm_smmu_impl qcom_smmu_impl = {
> .reset = qcom_smmu500_reset,
> };
>
> -struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu)
> +static const struct arm_smmu_impl qcom_adreno_smmu_impl = {
> + .init_context = qcom_adreno_smmu_init_context,
> + .def_domain_type = qcom_smmu_def_domain_type,
> + .reset = qcom_smmu500_reset,
> + .alloc_context_bank = qcom_adreno_smmu_alloc_context_bank,
> +};
> +
> +static struct arm_smmu_device *qcom_smmu_create(struct arm_smmu_device *smmu,
> + const struct arm_smmu_impl *impl)
> {
> struct qcom_smmu *qsmmu;
>
> @@ -75,8 +212,18 @@ struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu)
>
> qsmmu->smmu = *smmu;
>
> - qsmmu->smmu.impl = &qcom_smmu_impl;
> + qsmmu->smmu.impl = impl;
> devm_kfree(smmu->dev, smmu);
>
> return &qsmmu->smmu;
> }
> +
> +struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu)
> +{
> + return qcom_smmu_create(smmu, &qcom_smmu_impl);
> +}
> +
> +struct arm_smmu_device *qcom_adreno_smmu_impl_init(struct arm_smmu_device *smmu)
> +{
> + return qcom_smmu_create(smmu, &qcom_adreno_smmu_impl);
> +}
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> index 1a746476927c..6c5ff9999eae 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> @@ -520,6 +520,7 @@ static inline void arm_smmu_writeq(struct arm_smmu_device *smmu, int page,
> struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu);
> struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu);
> struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu);
> +struct arm_smmu_device *qcom_adreno_smmu_impl_init(struct arm_smmu_device *smmu);
>
> void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx);
> int arm_mmu500_reset(struct arm_smmu_device *smmu);
>

2020-11-02 18:21:38

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH v18 2/4] iommu/arm-smmu: Add a way for implementations to influence SCTLR

On 2020-11-02 17:14, Jordan Crouse wrote:
> From: Rob Clark <[email protected]>
>
> For the Adreno GPU's SMMU, we want SCTLR.HUPCF set to ensure that
> pending translations are not terminated on iova fault. Otherwise
> a terminated CP read could hang the GPU by returning invalid
> command-stream data.
>
> Signed-off-by: Rob Clark <[email protected]>
> Reviewed-by: Bjorn Andersson <[email protected]>
> Signed-off-by: Jordan Crouse <[email protected]>
> ---
>
> drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 6 ++++++
> drivers/iommu/arm/arm-smmu/arm-smmu.c | 3 +++
> drivers/iommu/arm/arm-smmu/arm-smmu.h | 3 +++
> 3 files changed, 12 insertions(+)
>
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> index 1e942eed2dfc..0663d7d26908 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> @@ -129,6 +129,12 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
> (smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH64))
> pgtbl_cfg->quirks |= IO_PGTABLE_QUIRK_ARM_TTBR1;
>
> + /*
> + * On the GPU device we want to process subsequent transactions after a
> + * fault to keep the GPU from hanging
> + */
> + smmu_domain->cfg.sctlr_set |= ARM_SMMU_SCTLR_HUPCF;
> +
> /*
> * Initialize private interface with GPU:
> */
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index dad7fa86fbd4..1f06ab219819 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -617,6 +617,9 @@ void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
> if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
> reg |= ARM_SMMU_SCTLR_E;
>
> + reg |= cfg->sctlr_set;
> + reg &= ~cfg->sctlr_clr;

Since we now have a write_s2cr hook, I'm inclined to think that the
consistency of a write_sctlr hook that could similarly apply its own
arbitrary tweaks would make sense for this. Does anyone have any strong
opinions?

Robin.

> +
> arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
> }
>
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> index 6c5ff9999eae..ddf2ca4c923d 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> @@ -144,6 +144,7 @@ enum arm_smmu_cbar_type {
> #define ARM_SMMU_CB_SCTLR 0x0
> #define ARM_SMMU_SCTLR_S1_ASIDPNE BIT(12)
> #define ARM_SMMU_SCTLR_CFCFG BIT(7)
> +#define ARM_SMMU_SCTLR_HUPCF BIT(8)
> #define ARM_SMMU_SCTLR_CFIE BIT(6)
> #define ARM_SMMU_SCTLR_CFRE BIT(5)
> #define ARM_SMMU_SCTLR_E BIT(4)
> @@ -341,6 +342,8 @@ struct arm_smmu_cfg {
> u16 asid;
> u16 vmid;
> };
> + u32 sctlr_set; /* extra bits to set in SCTLR */
> + u32 sctlr_clr; /* bits to mask in SCTLR */
> enum arm_smmu_cbar_type cbar;
> enum arm_smmu_context_fmt fmt;
> };
>

2020-11-02 18:25:50

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH v18 3/4] dt-bindings: arm-smmu: Add compatible string for Adreno GPU SMMU

On 2020-11-02 17:14, Jordan Crouse wrote:
> Every Qcom Adreno GPU has an embedded SMMU for its own use. These
> devices depend on unique features such as split pagetables,
> different stall/halt requirements and other settings. Identify them
> with a compatible string so that they can be identified in the
> arm-smmu implementation specific code.
>
> Signed-off-by: Jordan Crouse <[email protected]>
> Reviewed-by: Rob Herring <[email protected]>
> Signed-off-by: Rob Clark <[email protected]>
> Reviewed-by: Bjorn Andersson <[email protected]>
> ---
>
> Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> index 503160a7b9a0..3b63f2ae24db 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> @@ -28,8 +28,6 @@ properties:
> - enum:
> - qcom,msm8996-smmu-v2
> - qcom,msm8998-smmu-v2
> - - qcom,sc7180-smmu-v2
> - - qcom,sdm845-smmu-v2

What about the "Apps SMMU" instances? Those are distinct and don't
have/need the GPU special behaviour, right?

Robin.

> - const: qcom,smmu-v2
>
> - description: Qcom SoCs implementing "arm,mmu-500"
> @@ -40,6 +38,13 @@ properties:
> - qcom,sm8150-smmu-500
> - qcom,sm8250-smmu-500
> - const: arm,mmu-500
> + - description: Qcom Adreno GPUs implementing "arm,smmu-v2"
> + items:
> + - enum:
> + - qcom,sc7180-smmu-v2
> + - qcom,sdm845-smmu-v2
> + - const: qcom,adreno-smmu
> + - const: qcom,smmu-v2
> - description: Marvell SoCs implementing "arm,mmu-500"
> items:
> - const: marvell,ap806-smmu-500
>

2020-11-02 18:34:29

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH v18 3/4] dt-bindings: arm-smmu: Add compatible string for Adreno GPU SMMU

On 2020-11-02 18:22, Robin Murphy wrote:
> On 2020-11-02 17:14, Jordan Crouse wrote:
>> Every Qcom Adreno GPU has an embedded SMMU for its own use. These
>> devices depend on unique features such as split pagetables,
>> different stall/halt requirements and other settings. Identify them
>> with a compatible string so that they can be identified in the
>> arm-smmu implementation specific code.
>>
>> Signed-off-by: Jordan Crouse <[email protected]>
>> Reviewed-by: Rob Herring <[email protected]>
>> Signed-off-by: Rob Clark <[email protected]>
>> Reviewed-by: Bjorn Andersson <[email protected]>
>> ---
>>
>>   Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 9 +++++++--
>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
>> b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
>> index 503160a7b9a0..3b63f2ae24db 100644
>> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
>> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
>> @@ -28,8 +28,6 @@ properties:
>>             - enum:
>>                 - qcom,msm8996-smmu-v2
>>                 - qcom,msm8998-smmu-v2
>> -              - qcom,sc7180-smmu-v2
>> -              - qcom,sdm845-smmu-v2
>
> What about the "Apps SMMU" instances? Those are distinct and don't
> have/need the GPU special behaviour, right?

Oh, having looked at patch #4, which prompted me go and look at the 845
DTSI in context, now I realise the subtlety I overlooked. So I guess it
really was worth resending, ha! Sorry for being thick :)

Reviewed-by: Robin Murphy <[email protected]>

>
> Robin.
>
>>             - const: qcom,smmu-v2
>>         - description: Qcom SoCs implementing "arm,mmu-500"
>> @@ -40,6 +38,13 @@ properties:
>>                 - qcom,sm8150-smmu-500
>>                 - qcom,sm8250-smmu-500
>>             - const: arm,mmu-500
>> +      - description: Qcom Adreno GPUs implementing "arm,smmu-v2"
>> +        items:
>> +          - enum:
>> +              - qcom,sc7180-smmu-v2
>> +              - qcom,sdm845-smmu-v2
>> +          - const: qcom,adreno-smmu
>> +          - const: qcom,smmu-v2
>>         - description: Marvell SoCs implementing "arm,mmu-500"
>>           items:
>>             - const: marvell,ap806-smmu-500
>>

2020-11-03 17:30:17

by Jordan Crouse

[permalink] [raw]
Subject: Re: [PATCH v18 2/4] iommu/arm-smmu: Add a way for implementations to influence SCTLR

On Mon, Nov 02, 2020 at 06:18:45PM +0000, Robin Murphy wrote:
> On 2020-11-02 17:14, Jordan Crouse wrote:
> >From: Rob Clark <[email protected]>
> >
> >For the Adreno GPU's SMMU, we want SCTLR.HUPCF set to ensure that
> >pending translations are not terminated on iova fault. Otherwise
> >a terminated CP read could hang the GPU by returning invalid
> >command-stream data.
> >
> >Signed-off-by: Rob Clark <[email protected]>
> >Reviewed-by: Bjorn Andersson <[email protected]>
> >Signed-off-by: Jordan Crouse <[email protected]>
> >---
> >
> > drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 6 ++++++
> > drivers/iommu/arm/arm-smmu/arm-smmu.c | 3 +++
> > drivers/iommu/arm/arm-smmu/arm-smmu.h | 3 +++
> > 3 files changed, 12 insertions(+)
> >
> >diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> >index 1e942eed2dfc..0663d7d26908 100644
> >--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> >+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> >@@ -129,6 +129,12 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
> > (smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH64))
> > pgtbl_cfg->quirks |= IO_PGTABLE_QUIRK_ARM_TTBR1;
> >+ /*
> >+ * On the GPU device we want to process subsequent transactions after a
> >+ * fault to keep the GPU from hanging
> >+ */
> >+ smmu_domain->cfg.sctlr_set |= ARM_SMMU_SCTLR_HUPCF;
> >+
> > /*
> > * Initialize private interface with GPU:
> > */
> >diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> >index dad7fa86fbd4..1f06ab219819 100644
> >--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> >+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> >@@ -617,6 +617,9 @@ void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
> > if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
> > reg |= ARM_SMMU_SCTLR_E;
> >+ reg |= cfg->sctlr_set;
> >+ reg &= ~cfg->sctlr_clr;
>
> Since we now have a write_s2cr hook, I'm inclined to think that the
> consistency of a write_sctlr hook that could similarly apply its own
> arbitrary tweaks would make sense for this. Does anyone have any strong
> opinions?

None from me. That would make an eventual stall-on-fault implementation easier
too.

Jordan

> Robin.
>
> >+
> > arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
> > }
> >diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> >index 6c5ff9999eae..ddf2ca4c923d 100644
> >--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
> >+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> >@@ -144,6 +144,7 @@ enum arm_smmu_cbar_type {
> > #define ARM_SMMU_CB_SCTLR 0x0
> > #define ARM_SMMU_SCTLR_S1_ASIDPNE BIT(12)
> > #define ARM_SMMU_SCTLR_CFCFG BIT(7)
> >+#define ARM_SMMU_SCTLR_HUPCF BIT(8)
> > #define ARM_SMMU_SCTLR_CFIE BIT(6)
> > #define ARM_SMMU_SCTLR_CFRE BIT(5)
> > #define ARM_SMMU_SCTLR_E BIT(4)
> >@@ -341,6 +342,8 @@ struct arm_smmu_cfg {
> > u16 asid;
> > u16 vmid;
> > };
> >+ u32 sctlr_set; /* extra bits to set in SCTLR */
> >+ u32 sctlr_clr; /* bits to mask in SCTLR */
> > enum arm_smmu_cbar_type cbar;
> > enum arm_smmu_context_fmt fmt;
> > };
> >

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

2020-11-03 18:17:34

by Bjorn Andersson

[permalink] [raw]
Subject: Re: [PATCH v18 2/4] iommu/arm-smmu: Add a way for implementations to influence SCTLR

On Mon 02 Nov 12:18 CST 2020, Robin Murphy wrote:

> On 2020-11-02 17:14, Jordan Crouse wrote:
> > From: Rob Clark <[email protected]>
> >
> > For the Adreno GPU's SMMU, we want SCTLR.HUPCF set to ensure that
> > pending translations are not terminated on iova fault. Otherwise
> > a terminated CP read could hang the GPU by returning invalid
> > command-stream data.
> >
> > Signed-off-by: Rob Clark <[email protected]>
> > Reviewed-by: Bjorn Andersson <[email protected]>
> > Signed-off-by: Jordan Crouse <[email protected]>
> > ---
> >
> > drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 6 ++++++
> > drivers/iommu/arm/arm-smmu/arm-smmu.c | 3 +++
> > drivers/iommu/arm/arm-smmu/arm-smmu.h | 3 +++
> > 3 files changed, 12 insertions(+)
> >
> > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > index 1e942eed2dfc..0663d7d26908 100644
> > --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > @@ -129,6 +129,12 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
> > (smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH64))
> > pgtbl_cfg->quirks |= IO_PGTABLE_QUIRK_ARM_TTBR1;
> > + /*
> > + * On the GPU device we want to process subsequent transactions after a
> > + * fault to keep the GPU from hanging
> > + */
> > + smmu_domain->cfg.sctlr_set |= ARM_SMMU_SCTLR_HUPCF;
> > +
> > /*
> > * Initialize private interface with GPU:
> > */
> > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > index dad7fa86fbd4..1f06ab219819 100644
> > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > @@ -617,6 +617,9 @@ void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
> > if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
> > reg |= ARM_SMMU_SCTLR_E;
> > + reg |= cfg->sctlr_set;
> > + reg &= ~cfg->sctlr_clr;
>
> Since we now have a write_s2cr hook, I'm inclined to think that the
> consistency of a write_sctlr hook that could similarly apply its own
> arbitrary tweaks would make sense for this. Does anyone have any strong
> opinions?
>

I like it.

Regards,
Bjorn

> Robin.
>
> > +
> > arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
> > }
> > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > index 6c5ff9999eae..ddf2ca4c923d 100644
> > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > @@ -144,6 +144,7 @@ enum arm_smmu_cbar_type {
> > #define ARM_SMMU_CB_SCTLR 0x0
> > #define ARM_SMMU_SCTLR_S1_ASIDPNE BIT(12)
> > #define ARM_SMMU_SCTLR_CFCFG BIT(7)
> > +#define ARM_SMMU_SCTLR_HUPCF BIT(8)
> > #define ARM_SMMU_SCTLR_CFIE BIT(6)
> > #define ARM_SMMU_SCTLR_CFRE BIT(5)
> > #define ARM_SMMU_SCTLR_E BIT(4)
> > @@ -341,6 +342,8 @@ struct arm_smmu_cfg {
> > u16 asid;
> > u16 vmid;
> > };
> > + u32 sctlr_set; /* extra bits to set in SCTLR */
> > + u32 sctlr_clr; /* bits to mask in SCTLR */
> > enum arm_smmu_cbar_type cbar;
> > enum arm_smmu_context_fmt fmt;
> > };
> >

2020-11-06 12:37:06

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v18 2/4] iommu/arm-smmu: Add a way for implementations to influence SCTLR

On Tue, Nov 03, 2020 at 10:28:13AM -0700, Jordan Crouse wrote:
> On Mon, Nov 02, 2020 at 06:18:45PM +0000, Robin Murphy wrote:
> > On 2020-11-02 17:14, Jordan Crouse wrote:
> > >From: Rob Clark <[email protected]>
> > >
> > >For the Adreno GPU's SMMU, we want SCTLR.HUPCF set to ensure that
> > >pending translations are not terminated on iova fault. Otherwise
> > >a terminated CP read could hang the GPU by returning invalid
> > >command-stream data.
> > >
> > >Signed-off-by: Rob Clark <[email protected]>
> > >Reviewed-by: Bjorn Andersson <[email protected]>
> > >Signed-off-by: Jordan Crouse <[email protected]>
> > >---
> > >
> > > drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 6 ++++++
> > > drivers/iommu/arm/arm-smmu/arm-smmu.c | 3 +++
> > > drivers/iommu/arm/arm-smmu/arm-smmu.h | 3 +++
> > > 3 files changed, 12 insertions(+)
> > >
> > >diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > >index 1e942eed2dfc..0663d7d26908 100644
> > >--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > >+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > >@@ -129,6 +129,12 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
> > > (smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH64))
> > > pgtbl_cfg->quirks |= IO_PGTABLE_QUIRK_ARM_TTBR1;
> > >+ /*
> > >+ * On the GPU device we want to process subsequent transactions after a
> > >+ * fault to keep the GPU from hanging
> > >+ */
> > >+ smmu_domain->cfg.sctlr_set |= ARM_SMMU_SCTLR_HUPCF;
> > >+
> > > /*
> > > * Initialize private interface with GPU:
> > > */
> > >diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > >index dad7fa86fbd4..1f06ab219819 100644
> > >--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > >+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > >@@ -617,6 +617,9 @@ void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
> > > if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
> > > reg |= ARM_SMMU_SCTLR_E;
> > >+ reg |= cfg->sctlr_set;
> > >+ reg &= ~cfg->sctlr_clr;
> >
> > Since we now have a write_s2cr hook, I'm inclined to think that the
> > consistency of a write_sctlr hook that could similarly apply its own
> > arbitrary tweaks would make sense for this. Does anyone have any strong
> > opinions?
>
> None from me. That would make an eventual stall-on-fault implementation easier
> too.

Sounds like people like this idea, so please can you spin a new version with
that so that I can queue the first three patches for 5.11?

Cheers,

Will