2023-03-14 10:59:36

by Manivannan Sadhasivam

[permalink] [raw]
Subject: [PATCH v2] iommu/arm-smmu-qcom: Rework the logic finding the bypass quirk

The logic used to find the quirky firmware that intercepts the writes to
S2CR register to replace bypass type streams with a fault, and ignore the
fault type, is not working with the firmware on newer SoCs like SC8280XP.

The current logic uses the last stream mapping group (num_mapping_groups
- 1) as an index for finding quirky firmware. But on SC8280XP, NUSMRG
reports a value of 162 (possibly emulated by the hypervisor) and logic is
not working for stream mapping groups > 128. (Note that the ARM SMMU
architecture specification defines NUMSMRG in the range of 0-127).

So the current logic that checks the (162-1)th S2CR entry fails to detect
the quirky firmware on these devices and SMMU triggers invalid context
fault for bypass streams.

To fix this issue, rework the logic to find the first non-valid (free)
stream mapping register group (SMR) within 128 groups and use that index
to access S2CR for detecting the bypass quirk. If no free groups are
available, then just skip the quirk detection.

While at it, let's move the quirk detection logic to a separate function
and change the local variable name from last_s2cr to free_s2cr.

Reviewed-by: Bjorn Andersson <[email protected]>
Signed-off-by: Manivannan Sadhasivam <[email protected]>
---

Changes in v2:

* Limited the check to 128 groups as per ARM SMMU spec's NUMSMRG range
* Moved the quirk handling to its own function
* Collected review tag from Bjorn

drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 48 ++++++++++++++++++----
1 file changed, 40 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index d1b296b95c86..48362d7ef451 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -266,25 +266,49 @@ static int qcom_smmu_init_context(struct arm_smmu_domain *smmu_domain,
return 0;
}

-static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu)
+static void qcom_smmu_bypass_quirk(struct arm_smmu_device *smmu)
{
- unsigned int last_s2cr = ARM_SMMU_GR0_S2CR(smmu->num_mapping_groups - 1);
struct qcom_smmu *qsmmu = to_qcom_smmu(smmu);
- u32 reg;
- u32 smr;
+ u32 free_s2cr;
+ u32 reg, smr;
int i;

+ /*
+ * Find the first non-valid (free) stream mapping register group and
+ * use that index to access S2CR for detecting the bypass quirk.
+ *
+ * Note that only the first 128 stream mapping groups are considered for
+ * the check. This is because the ARM SMMU architecture specification
+ * defines NUMSMRG (Number of Stream Mapping Register Groups) in the
+ * range of 0-127, but some Qcom platforms emulate more stream mapping
+ * groups with the help of hypervisor. And those groups don't exhibit
+ * the quirky behavior.
+ */
+ for (i = 0; i < 128; i++) {
+ smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(i));
+
+ if (!FIELD_GET(ARM_SMMU_SMR_VALID, smr))
+ break;
+ }
+
+ /* If no free stream mapping register group is available, skip the check */
+ if (i == 128)
+ return;
+
+ free_s2cr = ARM_SMMU_GR0_S2CR(i);
+
/*
* With some firmware versions writes to S2CR of type FAULT are
* ignored, and writing BYPASS will end up written as FAULT in the
- * register. Perform a write to S2CR to detect if this is the case and
- * if so reserve a context bank to emulate bypass streams.
+ * register. Perform a write to the first free S2CR to detect if
+ * this is the case and if so reserve a context bank to emulate
+ * bypass streams.
*/
reg = FIELD_PREP(ARM_SMMU_S2CR_TYPE, S2CR_TYPE_BYPASS) |
FIELD_PREP(ARM_SMMU_S2CR_CBNDX, 0xff) |
FIELD_PREP(ARM_SMMU_S2CR_PRIVCFG, S2CR_PRIVCFG_DEFAULT);
- arm_smmu_gr0_write(smmu, last_s2cr, reg);
- reg = arm_smmu_gr0_read(smmu, last_s2cr);
+ arm_smmu_gr0_write(smmu, free_s2cr, reg);
+ reg = arm_smmu_gr0_read(smmu, free_s2cr);
if (FIELD_GET(ARM_SMMU_S2CR_TYPE, reg) != S2CR_TYPE_BYPASS) {
qsmmu->bypass_quirk = true;
qsmmu->bypass_cbndx = smmu->num_context_banks - 1;
@@ -296,6 +320,14 @@ static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu)
reg = FIELD_PREP(ARM_SMMU_CBAR_TYPE, CBAR_TYPE_S1_TRANS_S2_BYPASS);
arm_smmu_gr1_write(smmu, ARM_SMMU_GR1_CBAR(qsmmu->bypass_cbndx), reg);
}
+}
+
+static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu)
+{
+ u32 smr;
+ int i;
+
+ qcom_smmu_bypass_quirk(smmu);

for (i = 0; i < smmu->num_mapping_groups; i++) {
smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(i));
--
2.25.1



2023-03-14 11:17:38

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH v2] iommu/arm-smmu-qcom: Rework the logic finding the bypass quirk

On Tue, Mar 14, 2023 at 04:29:05PM +0530, Manivannan Sadhasivam wrote:
> The logic used to find the quirky firmware that intercepts the writes to
> S2CR register to replace bypass type streams with a fault, and ignore the
> fault type, is not working with the firmware on newer SoCs like SC8280XP.
>
> The current logic uses the last stream mapping group (num_mapping_groups
> - 1) as an index for finding quirky firmware. But on SC8280XP, NUSMRG
> reports a value of 162 (possibly emulated by the hypervisor) and logic is
> not working for stream mapping groups > 128. (Note that the ARM SMMU
> architecture specification defines NUMSMRG in the range of 0-127).
>
> So the current logic that checks the (162-1)th S2CR entry fails to detect
> the quirky firmware on these devices and SMMU triggers invalid context
> fault for bypass streams.
>
> To fix this issue, rework the logic to find the first non-valid (free)
> stream mapping register group (SMR) within 128 groups and use that index
> to access S2CR for detecting the bypass quirk. If no free groups are
> available, then just skip the quirk detection.
>
> While at it, let's move the quirk detection logic to a separate function
> and change the local variable name from last_s2cr to free_s2cr.
>
> Reviewed-by: Bjorn Andersson <[email protected]>
> Signed-off-by: Manivannan Sadhasivam <[email protected]>
> ---
>
> Changes in v2:
>
> * Limited the check to 128 groups as per ARM SMMU spec's NUMSMRG range
> * Moved the quirk handling to its own function
> * Collected review tag from Bjorn
>
> drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 48 ++++++++++++++++++----
> 1 file changed, 40 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> index d1b296b95c86..48362d7ef451 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> @@ -266,25 +266,49 @@ static int qcom_smmu_init_context(struct arm_smmu_domain *smmu_domain,
> return 0;
> }
>
> -static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu)
> +static void qcom_smmu_bypass_quirk(struct arm_smmu_device *smmu)
> {
> - unsigned int last_s2cr = ARM_SMMU_GR0_S2CR(smmu->num_mapping_groups - 1);
> struct qcom_smmu *qsmmu = to_qcom_smmu(smmu);
> - u32 reg;
> - u32 smr;
> + u32 free_s2cr;
> + u32 reg, smr;
> int i;
>
> + /*
> + * Find the first non-valid (free) stream mapping register group and
> + * use that index to access S2CR for detecting the bypass quirk.
> + *
> + * Note that only the first 128 stream mapping groups are considered for
> + * the check. This is because the ARM SMMU architecture specification
> + * defines NUMSMRG (Number of Stream Mapping Register Groups) in the
> + * range of 0-127, but some Qcom platforms emulate more stream mapping
> + * groups with the help of hypervisor. And those groups don't exhibit
> + * the quirky behavior.
> + */
> + for (i = 0; i < 128; i++) {

This may now access registers beyond smmu->num_mapping_groups. Should
you not use the minimum of these two values here (and below)?

> + smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(i));
> +
> + if (!FIELD_GET(ARM_SMMU_SMR_VALID, smr))
> + break;
> + }
> +
> + /* If no free stream mapping register group is available, skip the check */
> + if (i == 128)
> + return;

Johan

2023-03-14 11:26:44

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH v2] iommu/arm-smmu-qcom: Rework the logic finding the bypass quirk

On Tue, Mar 14, 2023 at 12:17:38PM +0100, Johan Hovold wrote:
> On Tue, Mar 14, 2023 at 04:29:05PM +0530, Manivannan Sadhasivam wrote:
> > The logic used to find the quirky firmware that intercepts the writes to
> > S2CR register to replace bypass type streams with a fault, and ignore the
> > fault type, is not working with the firmware on newer SoCs like SC8280XP.
> >
> > The current logic uses the last stream mapping group (num_mapping_groups
> > - 1) as an index for finding quirky firmware. But on SC8280XP, NUSMRG
> > reports a value of 162 (possibly emulated by the hypervisor) and logic is
> > not working for stream mapping groups > 128. (Note that the ARM SMMU
> > architecture specification defines NUMSMRG in the range of 0-127).
> >
> > So the current logic that checks the (162-1)th S2CR entry fails to detect
> > the quirky firmware on these devices and SMMU triggers invalid context
> > fault for bypass streams.
> >
> > To fix this issue, rework the logic to find the first non-valid (free)
> > stream mapping register group (SMR) within 128 groups and use that index
> > to access S2CR for detecting the bypass quirk. If no free groups are
> > available, then just skip the quirk detection.
> >
> > While at it, let's move the quirk detection logic to a separate function
> > and change the local variable name from last_s2cr to free_s2cr.
> >
> > Reviewed-by: Bjorn Andersson <[email protected]>
> > Signed-off-by: Manivannan Sadhasivam <[email protected]>
> > ---
> >
> > Changes in v2:
> >
> > * Limited the check to 128 groups as per ARM SMMU spec's NUMSMRG range
> > * Moved the quirk handling to its own function
> > * Collected review tag from Bjorn
> >
> > drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 48 ++++++++++++++++++----
> > 1 file changed, 40 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > index d1b296b95c86..48362d7ef451 100644
> > --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > @@ -266,25 +266,49 @@ static int qcom_smmu_init_context(struct arm_smmu_domain *smmu_domain,
> > return 0;
> > }
> >
> > -static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu)
> > +static void qcom_smmu_bypass_quirk(struct arm_smmu_device *smmu)
> > {
> > - unsigned int last_s2cr = ARM_SMMU_GR0_S2CR(smmu->num_mapping_groups - 1);
> > struct qcom_smmu *qsmmu = to_qcom_smmu(smmu);
> > - u32 reg;
> > - u32 smr;
> > + u32 free_s2cr;
> > + u32 reg, smr;
> > int i;
> >
> > + /*
> > + * Find the first non-valid (free) stream mapping register group and
> > + * use that index to access S2CR for detecting the bypass quirk.
> > + *
> > + * Note that only the first 128 stream mapping groups are considered for
> > + * the check. This is because the ARM SMMU architecture specification
> > + * defines NUMSMRG (Number of Stream Mapping Register Groups) in the
> > + * range of 0-127, but some Qcom platforms emulate more stream mapping
> > + * groups with the help of hypervisor. And those groups don't exhibit
> > + * the quirky behavior.
> > + */
> > + for (i = 0; i < 128; i++) {
>
> This may now access registers beyond smmu->num_mapping_groups. Should
> you not use the minimum of these two values here (and below)?
>

Doh! yeah, you're right. Will fix it in v3.

Thanks,
Mani

> > + smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(i));
> > +
> > + if (!FIELD_GET(ARM_SMMU_SMR_VALID, smr))
> > + break;
> > + }
> > +
> > + /* If no free stream mapping register group is available, skip the check */
> > + if (i == 128)
> > + return;
>
> Johan

--
மணிவண்ணன் சதாசிவம்

2023-03-14 13:20:43

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH v2] iommu/arm-smmu-qcom: Rework the logic finding the bypass quirk

On 2023-03-14 11:26, Manivannan Sadhasivam wrote:
> On Tue, Mar 14, 2023 at 12:17:38PM +0100, Johan Hovold wrote:
>> On Tue, Mar 14, 2023 at 04:29:05PM +0530, Manivannan Sadhasivam wrote:
>>> The logic used to find the quirky firmware that intercepts the writes to
>>> S2CR register to replace bypass type streams with a fault, and ignore the
>>> fault type, is not working with the firmware on newer SoCs like SC8280XP.
>>>
>>> The current logic uses the last stream mapping group (num_mapping_groups
>>> - 1) as an index for finding quirky firmware. But on SC8280XP, NUSMRG
>>> reports a value of 162 (possibly emulated by the hypervisor) and logic is
>>> not working for stream mapping groups > 128. (Note that the ARM SMMU
>>> architecture specification defines NUMSMRG in the range of 0-127).
>>>
>>> So the current logic that checks the (162-1)th S2CR entry fails to detect
>>> the quirky firmware on these devices and SMMU triggers invalid context
>>> fault for bypass streams.
>>>
>>> To fix this issue, rework the logic to find the first non-valid (free)
>>> stream mapping register group (SMR) within 128 groups and use that index
>>> to access S2CR for detecting the bypass quirk. If no free groups are
>>> available, then just skip the quirk detection.
>>>
>>> While at it, let's move the quirk detection logic to a separate function
>>> and change the local variable name from last_s2cr to free_s2cr.
>>>
>>> Reviewed-by: Bjorn Andersson <[email protected]>
>>> Signed-off-by: Manivannan Sadhasivam <[email protected]>
>>> ---
>>>
>>> Changes in v2:
>>>
>>> * Limited the check to 128 groups as per ARM SMMU spec's NUMSMRG range
>>> * Moved the quirk handling to its own function
>>> * Collected review tag from Bjorn
>>>
>>> drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 48 ++++++++++++++++++----
>>> 1 file changed, 40 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
>>> index d1b296b95c86..48362d7ef451 100644
>>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
>>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
>>> @@ -266,25 +266,49 @@ static int qcom_smmu_init_context(struct arm_smmu_domain *smmu_domain,
>>> return 0;
>>> }
>>>
>>> -static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu)
>>> +static void qcom_smmu_bypass_quirk(struct arm_smmu_device *smmu)
>>> {
>>> - unsigned int last_s2cr = ARM_SMMU_GR0_S2CR(smmu->num_mapping_groups - 1);
>>> struct qcom_smmu *qsmmu = to_qcom_smmu(smmu);
>>> - u32 reg;
>>> - u32 smr;
>>> + u32 free_s2cr;
>>> + u32 reg, smr;
>>> int i;
>>>
>>> + /*
>>> + * Find the first non-valid (free) stream mapping register group and
>>> + * use that index to access S2CR for detecting the bypass quirk.
>>> + *
>>> + * Note that only the first 128 stream mapping groups are considered for
>>> + * the check. This is because the ARM SMMU architecture specification
>>> + * defines NUMSMRG (Number of Stream Mapping Register Groups) in the
>>> + * range of 0-127, but some Qcom platforms emulate more stream mapping
>>> + * groups with the help of hypervisor. And those groups don't exhibit
>>> + * the quirky behavior.
>>> + */
>>> + for (i = 0; i < 128; i++) {
>>
>> This may now access registers beyond smmu->num_mapping_groups. Should
>> you not use the minimum of these two values here (and below)?
>>
>
> Doh! yeah, you're right. Will fix it in v3.

FWIW I'd say it's probably best if the cfg_probe hook clamps
smmu->num_mapping_groups to the architectural maximum straight away, to
also prevent the main driver iterating off into the nonsensical area in
arm_smmu_device_reset() or the SMR allocator itself.

(Note that we don't support the weird EXSMRGS extension that appeared in
a late version of the architecture, but even if we did, that still
reports 128 for IDR0.NUMSMRG, and the extra extended SMRs live somewhere
completely different.)

Thanks,
Robin.

2023-03-14 13:24:16

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH v2] iommu/arm-smmu-qcom: Rework the logic finding the bypass quirk

On Tue, Mar 14, 2023 at 11:58:24AM +0000, Robin Murphy wrote:
> On 2023-03-14 11:26, Manivannan Sadhasivam wrote:
> > On Tue, Mar 14, 2023 at 12:17:38PM +0100, Johan Hovold wrote:
> > > On Tue, Mar 14, 2023 at 04:29:05PM +0530, Manivannan Sadhasivam wrote:
> > > > The logic used to find the quirky firmware that intercepts the writes to
> > > > S2CR register to replace bypass type streams with a fault, and ignore the
> > > > fault type, is not working with the firmware on newer SoCs like SC8280XP.
> > > >
> > > > The current logic uses the last stream mapping group (num_mapping_groups
> > > > - 1) as an index for finding quirky firmware. But on SC8280XP, NUSMRG
> > > > reports a value of 162 (possibly emulated by the hypervisor) and logic is
> > > > not working for stream mapping groups > 128. (Note that the ARM SMMU
> > > > architecture specification defines NUMSMRG in the range of 0-127).
> > > >
> > > > So the current logic that checks the (162-1)th S2CR entry fails to detect
> > > > the quirky firmware on these devices and SMMU triggers invalid context
> > > > fault for bypass streams.
> > > >
> > > > To fix this issue, rework the logic to find the first non-valid (free)
> > > > stream mapping register group (SMR) within 128 groups and use that index
> > > > to access S2CR for detecting the bypass quirk. If no free groups are
> > > > available, then just skip the quirk detection.
> > > >
> > > > While at it, let's move the quirk detection logic to a separate function
> > > > and change the local variable name from last_s2cr to free_s2cr.
> > > >
> > > > Reviewed-by: Bjorn Andersson <[email protected]>
> > > > Signed-off-by: Manivannan Sadhasivam <[email protected]>
> > > > ---
> > > >
> > > > Changes in v2:
> > > >
> > > > * Limited the check to 128 groups as per ARM SMMU spec's NUMSMRG range
> > > > * Moved the quirk handling to its own function
> > > > * Collected review tag from Bjorn
> > > >
> > > > drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 48 ++++++++++++++++++----
> > > > 1 file changed, 40 insertions(+), 8 deletions(-)
> > > >
> > > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > > > index d1b296b95c86..48362d7ef451 100644
> > > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > > > @@ -266,25 +266,49 @@ static int qcom_smmu_init_context(struct arm_smmu_domain *smmu_domain,
> > > > return 0;
> > > > }
> > > > -static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu)
> > > > +static void qcom_smmu_bypass_quirk(struct arm_smmu_device *smmu)
> > > > {
> > > > - unsigned int last_s2cr = ARM_SMMU_GR0_S2CR(smmu->num_mapping_groups - 1);
> > > > struct qcom_smmu *qsmmu = to_qcom_smmu(smmu);
> > > > - u32 reg;
> > > > - u32 smr;
> > > > + u32 free_s2cr;
> > > > + u32 reg, smr;
> > > > int i;
> > > > + /*
> > > > + * Find the first non-valid (free) stream mapping register group and
> > > > + * use that index to access S2CR for detecting the bypass quirk.
> > > > + *
> > > > + * Note that only the first 128 stream mapping groups are considered for
> > > > + * the check. This is because the ARM SMMU architecture specification
> > > > + * defines NUMSMRG (Number of Stream Mapping Register Groups) in the
> > > > + * range of 0-127, but some Qcom platforms emulate more stream mapping
> > > > + * groups with the help of hypervisor. And those groups don't exhibit
> > > > + * the quirky behavior.
> > > > + */
> > > > + for (i = 0; i < 128; i++) {
> > >
> > > This may now access registers beyond smmu->num_mapping_groups. Should
> > > you not use the minimum of these two values here (and below)?
> > >
> >
> > Doh! yeah, you're right. Will fix it in v3.
>
> FWIW I'd say it's probably best if the cfg_probe hook clamps
> smmu->num_mapping_groups to the architectural maximum straight away, to also
> prevent the main driver iterating off into the nonsensical area in
> arm_smmu_device_reset() or the SMR allocator itself.
>

We considered that also but Qcom purposefully extended the NUMSMRG for
virtualization usecase and we do not have a clear picture of it. That's the
reason we settled with capping the value only for the quirk detection.

Thanks,
Mani

> (Note that we don't support the weird EXSMRGS extension that appeared in a
> late version of the architecture, but even if we did, that still reports 128
> for IDR0.NUMSMRG, and the extra extended SMRs live somewhere completely
> different.)
>
> Thanks,
> Robin.

--
மணிவண்ணன் சதாசிவம்

2023-03-14 13:45:48

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH v2] iommu/arm-smmu-qcom: Rework the logic finding the bypass quirk

On 2023-03-14 13:20, Manivannan Sadhasivam wrote:
> On Tue, Mar 14, 2023 at 11:58:24AM +0000, Robin Murphy wrote:
>> On 2023-03-14 11:26, Manivannan Sadhasivam wrote:
>>> On Tue, Mar 14, 2023 at 12:17:38PM +0100, Johan Hovold wrote:
>>>> On Tue, Mar 14, 2023 at 04:29:05PM +0530, Manivannan Sadhasivam wrote:
>>>>> The logic used to find the quirky firmware that intercepts the writes to
>>>>> S2CR register to replace bypass type streams with a fault, and ignore the
>>>>> fault type, is not working with the firmware on newer SoCs like SC8280XP.
>>>>>
>>>>> The current logic uses the last stream mapping group (num_mapping_groups
>>>>> - 1) as an index for finding quirky firmware. But on SC8280XP, NUSMRG
>>>>> reports a value of 162 (possibly emulated by the hypervisor) and logic is
>>>>> not working for stream mapping groups > 128. (Note that the ARM SMMU
>>>>> architecture specification defines NUMSMRG in the range of 0-127).
>>>>>
>>>>> So the current logic that checks the (162-1)th S2CR entry fails to detect
>>>>> the quirky firmware on these devices and SMMU triggers invalid context
>>>>> fault for bypass streams.
>>>>>
>>>>> To fix this issue, rework the logic to find the first non-valid (free)
>>>>> stream mapping register group (SMR) within 128 groups and use that index
>>>>> to access S2CR for detecting the bypass quirk. If no free groups are
>>>>> available, then just skip the quirk detection.
>>>>>
>>>>> While at it, let's move the quirk detection logic to a separate function
>>>>> and change the local variable name from last_s2cr to free_s2cr.
>>>>>
>>>>> Reviewed-by: Bjorn Andersson <[email protected]>
>>>>> Signed-off-by: Manivannan Sadhasivam <[email protected]>
>>>>> ---
>>>>>
>>>>> Changes in v2:
>>>>>
>>>>> * Limited the check to 128 groups as per ARM SMMU spec's NUMSMRG range
>>>>> * Moved the quirk handling to its own function
>>>>> * Collected review tag from Bjorn
>>>>>
>>>>> drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 48 ++++++++++++++++++----
>>>>> 1 file changed, 40 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
>>>>> index d1b296b95c86..48362d7ef451 100644
>>>>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
>>>>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
>>>>> @@ -266,25 +266,49 @@ static int qcom_smmu_init_context(struct arm_smmu_domain *smmu_domain,
>>>>> return 0;
>>>>> }
>>>>> -static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu)
>>>>> +static void qcom_smmu_bypass_quirk(struct arm_smmu_device *smmu)
>>>>> {
>>>>> - unsigned int last_s2cr = ARM_SMMU_GR0_S2CR(smmu->num_mapping_groups - 1);
>>>>> struct qcom_smmu *qsmmu = to_qcom_smmu(smmu);
>>>>> - u32 reg;
>>>>> - u32 smr;
>>>>> + u32 free_s2cr;
>>>>> + u32 reg, smr;
>>>>> int i;
>>>>> + /*
>>>>> + * Find the first non-valid (free) stream mapping register group and
>>>>> + * use that index to access S2CR for detecting the bypass quirk.
>>>>> + *
>>>>> + * Note that only the first 128 stream mapping groups are considered for
>>>>> + * the check. This is because the ARM SMMU architecture specification
>>>>> + * defines NUMSMRG (Number of Stream Mapping Register Groups) in the
>>>>> + * range of 0-127, but some Qcom platforms emulate more stream mapping
>>>>> + * groups with the help of hypervisor. And those groups don't exhibit
>>>>> + * the quirky behavior.
>>>>> + */
>>>>> + for (i = 0; i < 128; i++) {
>>>>
>>>> This may now access registers beyond smmu->num_mapping_groups. Should
>>>> you not use the minimum of these two values here (and below)?
>>>>
>>>
>>> Doh! yeah, you're right. Will fix it in v3.
>>
>> FWIW I'd say it's probably best if the cfg_probe hook clamps
>> smmu->num_mapping_groups to the architectural maximum straight away, to also
>> prevent the main driver iterating off into the nonsensical area in
>> arm_smmu_device_reset() or the SMR allocator itself.
>>
>
> We considered that also but Qcom purposefully extended the NUMSMRG for
> virtualization usecase and we do not have a clear picture of it.

Whatever that supposed use-case may be, Linux does not support it, and
clearly isn't going to support it any time soon if we don't even know
what it is. Therefore Linux does not need to accommodate this weirdness
for the foreseeable future, beyond simply making sure it doesn't cause
any problems for what Linux *does* support. It's bad enough that the
emulation of "normal" SMRs continues to violate the architecture, but
I'm even more uncomfortable letting the generic architecture driver poke
at completely non-architectural registers which don't even have the same
behaviour as the ones they're supposedly extending.

Thanks,
Robin.

> That's the
> reason we settled with capping the value only for the quirk detection.
>
> Thanks,
> Mani
>
>> (Note that we don't support the weird EXSMRGS extension that appeared in a
>> late version of the architecture, but even if we did, that still reports 128
>> for IDR0.NUMSMRG, and the extra extended SMRs live somewhere completely
>> different.)
>>
>> Thanks,
>> Robin.
>

2023-03-14 14:08:18

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH v2] iommu/arm-smmu-qcom: Rework the logic finding the bypass quirk

On Tue, Mar 14, 2023 at 01:41:56PM +0000, Robin Murphy wrote:
> On 2023-03-14 13:20, Manivannan Sadhasivam wrote:
> > On Tue, Mar 14, 2023 at 11:58:24AM +0000, Robin Murphy wrote:
> > > On 2023-03-14 11:26, Manivannan Sadhasivam wrote:
> > > > On Tue, Mar 14, 2023 at 12:17:38PM +0100, Johan Hovold wrote:
> > > > > On Tue, Mar 14, 2023 at 04:29:05PM +0530, Manivannan Sadhasivam wrote:
> > > > > > The logic used to find the quirky firmware that intercepts the writes to
> > > > > > S2CR register to replace bypass type streams with a fault, and ignore the
> > > > > > fault type, is not working with the firmware on newer SoCs like SC8280XP.
> > > > > >
> > > > > > The current logic uses the last stream mapping group (num_mapping_groups
> > > > > > - 1) as an index for finding quirky firmware. But on SC8280XP, NUSMRG
> > > > > > reports a value of 162 (possibly emulated by the hypervisor) and logic is
> > > > > > not working for stream mapping groups > 128. (Note that the ARM SMMU
> > > > > > architecture specification defines NUMSMRG in the range of 0-127).
> > > > > >
> > > > > > So the current logic that checks the (162-1)th S2CR entry fails to detect
> > > > > > the quirky firmware on these devices and SMMU triggers invalid context
> > > > > > fault for bypass streams.
> > > > > >
> > > > > > To fix this issue, rework the logic to find the first non-valid (free)
> > > > > > stream mapping register group (SMR) within 128 groups and use that index
> > > > > > to access S2CR for detecting the bypass quirk. If no free groups are
> > > > > > available, then just skip the quirk detection.
> > > > > >
> > > > > > While at it, let's move the quirk detection logic to a separate function
> > > > > > and change the local variable name from last_s2cr to free_s2cr.
> > > > > >
> > > > > > Reviewed-by: Bjorn Andersson <[email protected]>
> > > > > > Signed-off-by: Manivannan Sadhasivam <[email protected]>
> > > > > > ---
> > > > > >
> > > > > > Changes in v2:
> > > > > >
> > > > > > * Limited the check to 128 groups as per ARM SMMU spec's NUMSMRG range
> > > > > > * Moved the quirk handling to its own function
> > > > > > * Collected review tag from Bjorn
> > > > > >
> > > > > > drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 48 ++++++++++++++++++----
> > > > > > 1 file changed, 40 insertions(+), 8 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > > > > > index d1b296b95c86..48362d7ef451 100644
> > > > > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > > > > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > > > > > @@ -266,25 +266,49 @@ static int qcom_smmu_init_context(struct arm_smmu_domain *smmu_domain,
> > > > > > return 0;
> > > > > > }
> > > > > > -static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu)
> > > > > > +static void qcom_smmu_bypass_quirk(struct arm_smmu_device *smmu)
> > > > > > {
> > > > > > - unsigned int last_s2cr = ARM_SMMU_GR0_S2CR(smmu->num_mapping_groups - 1);
> > > > > > struct qcom_smmu *qsmmu = to_qcom_smmu(smmu);
> > > > > > - u32 reg;
> > > > > > - u32 smr;
> > > > > > + u32 free_s2cr;
> > > > > > + u32 reg, smr;
> > > > > > int i;
> > > > > > + /*
> > > > > > + * Find the first non-valid (free) stream mapping register group and
> > > > > > + * use that index to access S2CR for detecting the bypass quirk.
> > > > > > + *
> > > > > > + * Note that only the first 128 stream mapping groups are considered for
> > > > > > + * the check. This is because the ARM SMMU architecture specification
> > > > > > + * defines NUMSMRG (Number of Stream Mapping Register Groups) in the
> > > > > > + * range of 0-127, but some Qcom platforms emulate more stream mapping
> > > > > > + * groups with the help of hypervisor. And those groups don't exhibit
> > > > > > + * the quirky behavior.
> > > > > > + */
> > > > > > + for (i = 0; i < 128; i++) {
> > > > >
> > > > > This may now access registers beyond smmu->num_mapping_groups. Should
> > > > > you not use the minimum of these two values here (and below)?
> > > > >
> > > >
> > > > Doh! yeah, you're right. Will fix it in v3.
> > >
> > > FWIW I'd say it's probably best if the cfg_probe hook clamps
> > > smmu->num_mapping_groups to the architectural maximum straight away, to also
> > > prevent the main driver iterating off into the nonsensical area in
> > > arm_smmu_device_reset() or the SMR allocator itself.
> > >
> >
> > We considered that also but Qcom purposefully extended the NUMSMRG for
> > virtualization usecase and we do not have a clear picture of it.
>
> Whatever that supposed use-case may be, Linux does not support it, and
> clearly isn't going to support it any time soon if we don't even know what
> it is. Therefore Linux does not need to accommodate this weirdness for the
> foreseeable future, beyond simply making sure it doesn't cause any problems
> for what Linux *does* support. It's bad enough that the emulation of
> "normal" SMRs continues to violate the architecture, but I'm even more
> uncomfortable letting the generic architecture driver poke at completely
> non-architectural registers which don't even have the same behaviour as the
> ones they're supposedly extending.
>

Okay then. I'll cap it to 128.

Thanks,
Mani

> Thanks,
> Robin.
>
> > That's the
> > reason we settled with capping the value only for the quirk detection.
> >
> > Thanks,
> > Mani
> >
> > > (Note that we don't support the weird EXSMRGS extension that appeared in a
> > > late version of the architecture, but even if we did, that still reports 128
> > > for IDR0.NUMSMRG, and the extra extended SMRs live somewhere completely
> > > different.)
> > >
> > > Thanks,
> > > Robin.
> >

--
மணிவண்ணன் சதாசிவம்

2023-03-14 14:30:29

by Bjorn Andersson

[permalink] [raw]
Subject: Re: [PATCH v2] iommu/arm-smmu-qcom: Rework the logic finding the bypass quirk

On Tue, Mar 14, 2023 at 01:41:56PM +0000, Robin Murphy wrote:
> On 2023-03-14 13:20, Manivannan Sadhasivam wrote:
> > On Tue, Mar 14, 2023 at 11:58:24AM +0000, Robin Murphy wrote:
> > > On 2023-03-14 11:26, Manivannan Sadhasivam wrote:
> > > > On Tue, Mar 14, 2023 at 12:17:38PM +0100, Johan Hovold wrote:
> > > > > On Tue, Mar 14, 2023 at 04:29:05PM +0530, Manivannan Sadhasivam wrote:
> > > > > > The logic used to find the quirky firmware that intercepts the writes to
> > > > > > S2CR register to replace bypass type streams with a fault, and ignore the
> > > > > > fault type, is not working with the firmware on newer SoCs like SC8280XP.
> > > > > >
> > > > > > The current logic uses the last stream mapping group (num_mapping_groups
> > > > > > - 1) as an index for finding quirky firmware. But on SC8280XP, NUSMRG
> > > > > > reports a value of 162 (possibly emulated by the hypervisor) and logic is
> > > > > > not working for stream mapping groups > 128. (Note that the ARM SMMU
> > > > > > architecture specification defines NUMSMRG in the range of 0-127).
> > > > > >
> > > > > > So the current logic that checks the (162-1)th S2CR entry fails to detect
> > > > > > the quirky firmware on these devices and SMMU triggers invalid context
> > > > > > fault for bypass streams.
> > > > > >
> > > > > > To fix this issue, rework the logic to find the first non-valid (free)
> > > > > > stream mapping register group (SMR) within 128 groups and use that index
> > > > > > to access S2CR for detecting the bypass quirk. If no free groups are
> > > > > > available, then just skip the quirk detection.
> > > > > >
> > > > > > While at it, let's move the quirk detection logic to a separate function
> > > > > > and change the local variable name from last_s2cr to free_s2cr.
> > > > > >
> > > > > > Reviewed-by: Bjorn Andersson <[email protected]>
> > > > > > Signed-off-by: Manivannan Sadhasivam <[email protected]>
> > > > > > ---
> > > > > >
> > > > > > Changes in v2:
> > > > > >
> > > > > > * Limited the check to 128 groups as per ARM SMMU spec's NUMSMRG range
> > > > > > * Moved the quirk handling to its own function
> > > > > > * Collected review tag from Bjorn
> > > > > >
> > > > > > drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 48 ++++++++++++++++++----
> > > > > > 1 file changed, 40 insertions(+), 8 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > > > > > index d1b296b95c86..48362d7ef451 100644
> > > > > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > > > > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > > > > > @@ -266,25 +266,49 @@ static int qcom_smmu_init_context(struct arm_smmu_domain *smmu_domain,
> > > > > > return 0;
> > > > > > }
> > > > > > -static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu)
> > > > > > +static void qcom_smmu_bypass_quirk(struct arm_smmu_device *smmu)
> > > > > > {
> > > > > > - unsigned int last_s2cr = ARM_SMMU_GR0_S2CR(smmu->num_mapping_groups - 1);
> > > > > > struct qcom_smmu *qsmmu = to_qcom_smmu(smmu);
> > > > > > - u32 reg;
> > > > > > - u32 smr;
> > > > > > + u32 free_s2cr;
> > > > > > + u32 reg, smr;
> > > > > > int i;
> > > > > > + /*
> > > > > > + * Find the first non-valid (free) stream mapping register group and
> > > > > > + * use that index to access S2CR for detecting the bypass quirk.
> > > > > > + *
> > > > > > + * Note that only the first 128 stream mapping groups are considered for
> > > > > > + * the check. This is because the ARM SMMU architecture specification
> > > > > > + * defines NUMSMRG (Number of Stream Mapping Register Groups) in the
> > > > > > + * range of 0-127, but some Qcom platforms emulate more stream mapping
> > > > > > + * groups with the help of hypervisor. And those groups don't exhibit
> > > > > > + * the quirky behavior.
> > > > > > + */
> > > > > > + for (i = 0; i < 128; i++) {
> > > > >
> > > > > This may now access registers beyond smmu->num_mapping_groups. Should
> > > > > you not use the minimum of these two values here (and below)?
> > > > >
> > > >
> > > > Doh! yeah, you're right. Will fix it in v3.
> > >
> > > FWIW I'd say it's probably best if the cfg_probe hook clamps
> > > smmu->num_mapping_groups to the architectural maximum straight away, to also
> > > prevent the main driver iterating off into the nonsensical area in
> > > arm_smmu_device_reset() or the SMR allocator itself.
> > >
> >
> > We considered that also but Qcom purposefully extended the NUMSMRG for
> > virtualization usecase and we do not have a clear picture of it.
>
> Whatever that supposed use-case may be, Linux does not support it, and
> clearly isn't going to support it any time soon if we don't even know what

Can you please elaborate on what it is that would prevent Linux to
handle hardware with more than 128 SMRs?

> it is. Therefore Linux does not need to accommodate this weirdness for the
> foreseeable future, beyond simply making sure it doesn't cause any problems
> for what Linux *does* support. It's bad enough that the emulation of
> "normal" SMRs continues to violate the architecture, but I'm even more
> uncomfortable letting the generic architecture driver poke at completely
> non-architectural registers which don't even have the same behaviour as the
> ones they're supposedly extending.

Afaict there's nothing special about the SMRs beyond 128 on this
platform...

Thanks,
Bjorn

>
> Thanks,
> Robin.
>
> > That's the
> > reason we settled with capping the value only for the quirk detection.
> >
> > Thanks,
> > Mani
> >
> > > (Note that we don't support the weird EXSMRGS extension that appeared in a
> > > late version of the architecture, but even if we did, that still reports 128
> > > for IDR0.NUMSMRG, and the extra extended SMRs live somewhere completely
> > > different.)
> > >
> > > Thanks,
> > > Robin.
> >

2023-03-14 15:25:37

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH v2] iommu/arm-smmu-qcom: Rework the logic finding the bypass quirk

On 2023-03-14 14:32, Bjorn Andersson wrote:
> On Tue, Mar 14, 2023 at 01:41:56PM +0000, Robin Murphy wrote:
>> On 2023-03-14 13:20, Manivannan Sadhasivam wrote:
>>> On Tue, Mar 14, 2023 at 11:58:24AM +0000, Robin Murphy wrote:
>>>> On 2023-03-14 11:26, Manivannan Sadhasivam wrote:
>>>>> On Tue, Mar 14, 2023 at 12:17:38PM +0100, Johan Hovold wrote:
>>>>>> On Tue, Mar 14, 2023 at 04:29:05PM +0530, Manivannan Sadhasivam wrote:
>>>>>>> The logic used to find the quirky firmware that intercepts the writes to
>>>>>>> S2CR register to replace bypass type streams with a fault, and ignore the
>>>>>>> fault type, is not working with the firmware on newer SoCs like SC8280XP.
>>>>>>>
>>>>>>> The current logic uses the last stream mapping group (num_mapping_groups
>>>>>>> - 1) as an index for finding quirky firmware. But on SC8280XP, NUSMRG
>>>>>>> reports a value of 162 (possibly emulated by the hypervisor) and logic is
>>>>>>> not working for stream mapping groups > 128. (Note that the ARM SMMU
>>>>>>> architecture specification defines NUMSMRG in the range of 0-127).
>>>>>>>
>>>>>>> So the current logic that checks the (162-1)th S2CR entry fails to detect
>>>>>>> the quirky firmware on these devices and SMMU triggers invalid context
>>>>>>> fault for bypass streams.
>>>>>>>
>>>>>>> To fix this issue, rework the logic to find the first non-valid (free)
>>>>>>> stream mapping register group (SMR) within 128 groups and use that index
>>>>>>> to access S2CR for detecting the bypass quirk. If no free groups are
>>>>>>> available, then just skip the quirk detection.
>>>>>>>
>>>>>>> While at it, let's move the quirk detection logic to a separate function
>>>>>>> and change the local variable name from last_s2cr to free_s2cr.
>>>>>>>
>>>>>>> Reviewed-by: Bjorn Andersson <[email protected]>
>>>>>>> Signed-off-by: Manivannan Sadhasivam <[email protected]>
>>>>>>> ---
>>>>>>>
>>>>>>> Changes in v2:
>>>>>>>
>>>>>>> * Limited the check to 128 groups as per ARM SMMU spec's NUMSMRG range
>>>>>>> * Moved the quirk handling to its own function
>>>>>>> * Collected review tag from Bjorn
>>>>>>>
>>>>>>> drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 48 ++++++++++++++++++----
>>>>>>> 1 file changed, 40 insertions(+), 8 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
>>>>>>> index d1b296b95c86..48362d7ef451 100644
>>>>>>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
>>>>>>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
>>>>>>> @@ -266,25 +266,49 @@ static int qcom_smmu_init_context(struct arm_smmu_domain *smmu_domain,
>>>>>>> return 0;
>>>>>>> }
>>>>>>> -static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu)
>>>>>>> +static void qcom_smmu_bypass_quirk(struct arm_smmu_device *smmu)
>>>>>>> {
>>>>>>> - unsigned int last_s2cr = ARM_SMMU_GR0_S2CR(smmu->num_mapping_groups - 1);
>>>>>>> struct qcom_smmu *qsmmu = to_qcom_smmu(smmu);
>>>>>>> - u32 reg;
>>>>>>> - u32 smr;
>>>>>>> + u32 free_s2cr;
>>>>>>> + u32 reg, smr;
>>>>>>> int i;
>>>>>>> + /*
>>>>>>> + * Find the first non-valid (free) stream mapping register group and
>>>>>>> + * use that index to access S2CR for detecting the bypass quirk.
>>>>>>> + *
>>>>>>> + * Note that only the first 128 stream mapping groups are considered for
>>>>>>> + * the check. This is because the ARM SMMU architecture specification
>>>>>>> + * defines NUMSMRG (Number of Stream Mapping Register Groups) in the
>>>>>>> + * range of 0-127, but some Qcom platforms emulate more stream mapping
>>>>>>> + * groups with the help of hypervisor. And those groups don't exhibit
>>>>>>> + * the quirky behavior.
>>>>>>> + */
>>>>>>> + for (i = 0; i < 128; i++) {
>>>>>>
>>>>>> This may now access registers beyond smmu->num_mapping_groups. Should
>>>>>> you not use the minimum of these two values here (and below)?
>>>>>>
>>>>>
>>>>> Doh! yeah, you're right. Will fix it in v3.
>>>>
>>>> FWIW I'd say it's probably best if the cfg_probe hook clamps
>>>> smmu->num_mapping_groups to the architectural maximum straight away, to also
>>>> prevent the main driver iterating off into the nonsensical area in
>>>> arm_smmu_device_reset() or the SMR allocator itself.
>>>>
>>>
>>> We considered that also but Qcom purposefully extended the NUMSMRG for
>>> virtualization usecase and we do not have a clear picture of it.
>>
>> Whatever that supposed use-case may be, Linux does not support it, and
>> clearly isn't going to support it any time soon if we don't even know what
>
> Can you please elaborate on what it is that would prevent Linux to
> handle hardware with more than 128 SMRs?

https://developer.arm.com/documentation/ihi0062/latest

I would expect actual hardware to follow the architecture (because it
would need to pass validation suites etc.). The architecture defines an
extension for supporting up to 1024 stream mapping groups, but that
works very differently.

The SC8280XP DT claims this SMMU is compatible with a standard Arm
MMU-500, which definitely does not support more than 128 SMRs, so I have
no idea what the hypervisor might be up to.

>> it is. Therefore Linux does not need to accommodate this weirdness for the
>> foreseeable future, beyond simply making sure it doesn't cause any problems
>> for what Linux *does* support. It's bad enough that the emulation of
>> "normal" SMRs continues to violate the architecture, but I'm even more
>> uncomfortable letting the generic architecture driver poke at completely
>> non-architectural registers which don't even have the same behaviour as the
>> ones they're supposedly extending.
>
> Afaict there's nothing special about the SMRs beyond 128 on this
> platform...

If that were true then why would this patch be a thing at all? :/

Thanks,
Robin.