2023-12-07 21:32:08

by Rob Clark

Subject: [PATCH] iommu/arm-smmu-qcom: Add missing GMU entry to match table

From: Rob Clark <[email protected]>

We also want the default domain for the GMU to be an identity domain,
so it does not get a context bank assigned. Without this, both
of_dma_configure() and drm/msm's iommu_domain_attach() will trigger
allocating and configuring a context bank. So GMU ends up attached
to both cbndx 1 and cbndx 2. This arrangement seemingly confounds
and surprises the firmware if the GPU later triggers a translation
fault, resulting (on sc8280xp / lenovo x13s, at least) in the SMMU
getting wedged and the GPU stuck without memory access.

Signed-off-by: Rob Clark <[email protected]>
---
drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index 549ae4dba3a6..d326fa230b96 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -243,6 +243,7 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,

 static const struct of_device_id qcom_smmu_client_of_match[] __maybe_unused = {
 	{ .compatible = "qcom,adreno" },
+	{ .compatible = "qcom,adreno-gmu" },
 	{ .compatible = "qcom,mdp4" },
 	{ .compatible = "qcom,mdss" },
 	{ .compatible = "qcom,sc7180-mdss" },
--
2.43.0
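
For context on why a one-line table entry changes the GMU's default domain: the table is consulted when the driver chooses a default domain type for a client device, and a match requests an identity domain. Below is a minimal userspace model of that lookup. It is only a sketch: it assumes a plain string comparison in place of the kernel's of_match_device(), the enum values are illustrative, and every model_* name is hypothetical rather than kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the kernel's domain types; the values are illustrative only. */
enum model_domain_type {
	MODEL_DOMAIN_DEFAULT = 0,	/* no preference expressed by the driver */
	MODEL_DOMAIN_IDENTITY = 1,	/* per the patch description: no context bank assigned */
};

/* Subset of the match table from the patch, reduced to compatible strings. */
static const char *const model_client_match[] = {
	"qcom,adreno",
	"qcom,adreno-gmu",	/* the entry this patch adds */
	"qcom,mdp4",
	"qcom,mdss",
	"qcom,sc7180-mdss",
	NULL,			/* sentinel */
};

/* Request an identity domain for any device whose compatible is listed. */
static enum model_domain_type
model_def_domain_type(const char *const *table, const char *compatible)
{
	assert(table && compatible);

	for (size_t i = 0; table[i]; i++)
		if (strcmp(table[i], compatible) == 0)
			return MODEL_DOMAIN_IDENTITY;

	return MODEL_DOMAIN_DEFAULT;
}
```

In this model, model_def_domain_type(model_client_match, "qcom,adreno-gmu") yields MODEL_DOMAIN_IDENTITY only because the new entry is present; a compatible string that is not listed falls through to MODEL_DOMAIN_DEFAULT.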


2023-12-08 08:18:46

by Johan Hovold

Subject: Re: [PATCH] iommu/arm-smmu-qcom: Add missing GMU entry to match table

On Thu, Dec 07, 2023 at 01:24:39PM -0800, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> We also want the default domain for the GMU to be an identity domain,
> so it does not get a context bank assigned. Without this, both
> of_dma_configure() and drm/msm's iommu_domain_attach() will trigger
> allocating and configuring a context bank. So GMU ends up attached
> to both cbndx 1 and cbndx 2. This arrangement seemingly confounds
> and surprises the firmware if the GPU later triggers a translation
> fault, resulting (on sc8280xp / lenovo x13s, at least) in the SMMU
> getting wedged and the GPU stuck without memory access.

This sounds like something that should be backported. Should you add a
Fixes and CC-stable tag?

> Signed-off-by: Rob Clark <[email protected]>

Johan

2023-12-08 11:49:53

by Robin Murphy

Subject: Re: [PATCH] iommu/arm-smmu-qcom: Add missing GMU entry to match table

On 07/12/2023 9:24 pm, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> We also want the default domain for the GMU to be an identity domain,
> so it does not get a context bank assigned. Without this, both
> of_dma_configure() and drm/msm's iommu_domain_attach() will trigger
> allocating and configuring a context bank. So GMU ends up attached
> to both cbndx 1 and cbndx 2.

I can't help but read this as implying that it gets attached to both *at
the same time*, which would be indicative of a far more serious problem
in the main driver and/or IOMMU core code.

However, from what we discussed on IRC last night, it sounds like the
key point here is more straightforwardly that firmware expects the GMU
to be using context bank 1, in a vaguely similar fashion to how context
bank 0 is special for the GPU. Clarifying that would help explain why
we're just doing this as a trick to influence the allocator (i.e. unlike
some of the other devices in this list we don't actually need the
properties of the identity domain itself).

In future it might be nice to reserve this explicitly on platforms which
need it and extend qcom_adreno_smmu_alloc_context_bank() to handle the
GMU as well, but I don't object to this patch as an immediate quick fix
for now, especially as something nice and easy for stable (I'd agree
with Johan in that regard).

Thanks,
Robin.

> This arrangement seemingly confounds
> and surprises the firmware if the GPU later triggers a translation
> fault, resulting (on sc8280xp / lenovo x13s, at least) in the SMMU
> getting wedged and the GPU stuck without memory access.
>
> Signed-off-by: Rob Clark <[email protected]>
> ---
> drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> index 549ae4dba3a6..d326fa230b96 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> @@ -243,6 +243,7 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
>
>  static const struct of_device_id qcom_smmu_client_of_match[] __maybe_unused = {
>  	{ .compatible = "qcom,adreno" },
> +	{ .compatible = "qcom,adreno-gmu" },
>  	{ .compatible = "qcom,mdp4" },
>  	{ .compatible = "qcom,mdss" },
>  	{ .compatible = "qcom,sc7180-mdss" },
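
Robin's point about the patch being "a trick to influence the allocator" can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the driver's code: it assumes context banks are handed out first-free, that bank 0 is kept for the GPU so other devices search from bank 1, and that a non-identity default domain keeps its bank after the driver attaches its own domain. All model_* names and the bank count are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NUM_CBS 8		/* arbitrary bank count for the model */

struct model_smmu {
	bool cb_in_use[MODEL_NUM_CBS];
};

/* First-free allocation starting at 'start'; returns the index, or -1 if full. */
static int model_alloc_cb(struct model_smmu *smmu, int start)
{
	assert(smmu && start >= 0);

	for (int i = start; i < MODEL_NUM_CBS; i++) {
		if (!smmu->cb_in_use[i]) {
			smmu->cb_in_use[i] = true;
			return i;
		}
	}
	return -1;
}

/*
 * Returns the context bank that the domain attached later by the driver
 * ends up with, depending on whether the default domain is identity.
 */
static int model_gmu_attach(bool default_is_identity)
{
	struct model_smmu smmu = { 0 };

	/* A non-identity default domain claims a bank before the driver attaches. */
	if (!default_is_identity)
		model_alloc_cb(&smmu, 1);

	/* Bank 0 is left for the GPU, so the search starts at 1. */
	return model_alloc_cb(&smmu, 1);
}
```

Under these assumptions, model_gmu_attach(false) lands the driver's domain on bank 2, while model_gmu_attach(true) lands it on bank 1, which is the bank the firmware is said to expect for the GMU.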