2024-05-13 23:14:24

by Doug Anderson

Subject: [PATCH] iommu/arm-smmu: Don't disable next-page prefetcher on devices it works on

On sc7180 trogdor devices we get a scary warning at bootup:
arm-smmu 15000000.iommu:
Failed to disable prefetcher [errata #841119 and #826419], check ACR.CACHE_LOCK

We spent some time trying to figure out how we were going to fix these
errata and whether we needed to do a firmware update. Upon closer
inspection, however, we realized that the errata don't apply to us.
Specifically, the errata document says that for these errata:
* Found in: r0p0
* Fixed in: r2p2

...and trogdor devices appear to be running r2p4. That means that they
are unaffected despite the scary warning.

The issue is that the kernel unconditionally tries to disable the
prefetcher even on unaffected devices and then warns when it's unable
to.

Let's change the kernel to only disable the prefetcher on affected
devices, which will get rid of the scary warning on devices that are
unaffected. As per the comment the prefetcher is
"not-particularly-beneficial" but it shouldn't hurt to leave it on for
devices where it doesn't cause problems.

Fixes: f87f6e5b4539 ("iommu/arm-smmu: Warn once when the perfetcher errata patch fails to apply")
Signed-off-by: Douglas Anderson <[email protected]>
---

drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
index 9dc772f2cbb2..d9b38b0db0d4 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
@@ -109,7 +109,7 @@ static struct arm_smmu_device *cavium_smmu_impl_init(struct arm_smmu_device *smm

 int arm_mmu500_reset(struct arm_smmu_device *smmu)
 {
-	u32 reg, major;
+	u32 reg, major, minor;
 	int i;
 	/*
 	 * On MMU-500 r2p0 onwards we need to clear ACR.CACHE_LOCK before
@@ -118,6 +118,7 @@ int arm_mmu500_reset(struct arm_smmu_device *smmu)
 	 */
 	reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_ID7);
 	major = FIELD_GET(ARM_SMMU_ID7_MAJOR, reg);
+	minor = FIELD_GET(ARM_SMMU_ID7_MINOR, reg);
 	reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sACR);
 	if (major >= 2)
 		reg &= ~ARM_MMU500_ACR_CACHE_LOCK;
@@ -131,14 +132,18 @@ int arm_mmu500_reset(struct arm_smmu_device *smmu)
 	/*
 	 * Disable MMU-500's not-particularly-beneficial next-page
 	 * prefetcher for the sake of errata #841119 and #826419.
+	 * These errata only affect r0p0 through r2p1 (fixed in r2p2).
 	 */
-	for (i = 0; i < smmu->num_context_banks; ++i) {
-		reg = arm_smmu_cb_read(smmu, i, ARM_SMMU_CB_ACTLR);
-		reg &= ~ARM_MMU500_ACTLR_CPRE;
-		arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_ACTLR, reg);
-		reg = arm_smmu_cb_read(smmu, i, ARM_SMMU_CB_ACTLR);
-		if (reg & ARM_MMU500_ACTLR_CPRE)
-			dev_warn_once(smmu->dev, "Failed to disable prefetcher [errata #841119 and #826419], check ACR.CACHE_LOCK\n");
+	if (major < 2 || (major == 2 && minor < 2)) {
+		for (i = 0; i < smmu->num_context_banks; ++i) {
+			reg = arm_smmu_cb_read(smmu, i, ARM_SMMU_CB_ACTLR);
+			reg &= ~ARM_MMU500_ACTLR_CPRE;
+			arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_ACTLR, reg);
+			reg = arm_smmu_cb_read(smmu, i, ARM_SMMU_CB_ACTLR);
+			if (reg & ARM_MMU500_ACTLR_CPRE)
+				dev_warn_once(smmu->dev,
+					      "Failed to disable prefetcher [errata #841119 and #826419], check ACR.CACHE_LOCK\n");
+		}
 	}
 
 	return 0;
--
2.45.0.rc1.225.g2a3ae87e7f-goog
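
For anyone wanting to sanity-check the revision test outside the driver, here is a minimal standalone sketch of the same comparison. The helper name mmu500_prefetch_errata_apply() is hypothetical and not part of the kernel; it only mirrors the "major < 2 || (major == 2 && minor < 2)" condition from the patch and assumes the caller has already extracted the major and minor numbers from ID7.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): returns true when an
 * MMU-500 revision r<major>p<minor> predates r2p2, the revision in
 * which errata #841119 and #826419 are listed as fixed.
 */
static bool mmu500_prefetch_errata_apply(uint32_t major, uint32_t minor)
{
	/* Anything before r2 is affected; within r2, only p0 and p1. */
	return major < 2 || (major == 2 && minor < 2);
}
```

With this sketch, r0p0 and r2p1 are reported as affected, while r2p2, r2p4 (the revision trogdor appears to run) and r3p0 are not.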



2024-05-14 17:15:57

by Robin Murphy

Subject: Re: [PATCH] iommu/arm-smmu: Don't disable next-page prefetcher on devices it works on

Hi Doug,

On 2024-05-14 12:13 am, Douglas Anderson wrote:
> On sc7180 trogdor devices we get a scary warning at bootup:
> arm-smmu 15000000.iommu:
> Failed to disable prefetcher [errata #841119 and #826419], check ACR.CACHE_LOCK
>
> We spent some time trying to figure out how we were going to fix these
> errata and whether we needed to do a firmware update. Upon closer
> inspection, however, we realized that the errata don't apply to us.
> Specifically, the errata document says that for these errata:
> * Found in: r0p0
> * Fixed in: r2p2
>
> ...and trogdor devices appear to be running r2p4. That means that they
> are unaffected despite the scary warning.
>
> The issue is that the kernel unconditionally tries to disable the
> prefetcher even on unaffected devices and then warns when it's unable
> to.
>
> Let's change the kernel to only disable the prefetcher on affected
> devices, which will get rid of the scary warning on devices that are
> unaffected. As per the comment the prefetcher is
> "not-particularly-beneficial" but it shouldn't hurt to leave it on for
> devices where it doesn't cause problems.

Unfortunately by now there are also at least #562869 and #1047329, plus
a small possibility of further corners of systemic brokenness in the
prefetcher yet to be discovered (or at least characterised sufficiently
to be reported as an erratum). One could argue that we're not currently
meeting the conditions for #1047329 yet, but with the IOMMUFD APIs
finally falling into place, and potential pKVM use-cases on the horizon
too, there's a distinct chance that someone will be interested in
nesting support for SMMUv2 sooner or later.

Thanks,
Robin.

> Fixes: f87f6e5b4539 ("iommu/arm-smmu: Warn once when the perfetcher errata patch fails to apply")
> Signed-off-by: Douglas Anderson <[email protected]>
> ---
>
> drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 21 +++++++++++++--------
> 1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> index 9dc772f2cbb2..d9b38b0db0d4 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> @@ -109,7 +109,7 @@ static struct arm_smmu_device *cavium_smmu_impl_init(struct arm_smmu_device *smm
>
>  int arm_mmu500_reset(struct arm_smmu_device *smmu)
>  {
> -	u32 reg, major;
> +	u32 reg, major, minor;
>  	int i;
>  	/*
>  	 * On MMU-500 r2p0 onwards we need to clear ACR.CACHE_LOCK before
> @@ -118,6 +118,7 @@ int arm_mmu500_reset(struct arm_smmu_device *smmu)
>  	 */
>  	reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_ID7);
>  	major = FIELD_GET(ARM_SMMU_ID7_MAJOR, reg);
> +	minor = FIELD_GET(ARM_SMMU_ID7_MINOR, reg);
>  	reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sACR);
>  	if (major >= 2)
>  		reg &= ~ARM_MMU500_ACR_CACHE_LOCK;
> @@ -131,14 +132,18 @@ int arm_mmu500_reset(struct arm_smmu_device *smmu)
>  	/*
>  	 * Disable MMU-500's not-particularly-beneficial next-page
>  	 * prefetcher for the sake of errata #841119 and #826419.
> +	 * These errata only affect r0p0 through r2p1 (fixed in r2p2).
>  	 */
> -	for (i = 0; i < smmu->num_context_banks; ++i) {
> -		reg = arm_smmu_cb_read(smmu, i, ARM_SMMU_CB_ACTLR);
> -		reg &= ~ARM_MMU500_ACTLR_CPRE;
> -		arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_ACTLR, reg);
> -		reg = arm_smmu_cb_read(smmu, i, ARM_SMMU_CB_ACTLR);
> -		if (reg & ARM_MMU500_ACTLR_CPRE)
> -			dev_warn_once(smmu->dev, "Failed to disable prefetcher [errata #841119 and #826419], check ACR.CACHE_LOCK\n");
> +	if (major < 2 || (major == 2 && minor < 2)) {
> +		for (i = 0; i < smmu->num_context_banks; ++i) {
> +			reg = arm_smmu_cb_read(smmu, i, ARM_SMMU_CB_ACTLR);
> +			reg &= ~ARM_MMU500_ACTLR_CPRE;
> +			arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_ACTLR, reg);
> +			reg = arm_smmu_cb_read(smmu, i, ARM_SMMU_CB_ACTLR);
> +			if (reg & ARM_MMU500_ACTLR_CPRE)
> +				dev_warn_once(smmu->dev,
> +					      "Failed to disable prefetcher [errata #841119 and #826419], check ACR.CACHE_LOCK\n");
> +		}
>  	}
>
>  	return 0;

2024-05-17 16:38:17

by Will Deacon

Subject: Re: [PATCH] iommu/arm-smmu: Don't disable next-page prefetcher on devices it works on

Hi Doug,

On Mon, May 13, 2024 at 04:13:47PM -0700, Douglas Anderson wrote:
> On sc7180 trogdor devices we get a scary warning at bootup:
> arm-smmu 15000000.iommu:
> Failed to disable prefetcher [errata #841119 and #826419], check ACR.CACHE_LOCK
>
> We spent some time trying to figure out how we were going to fix these
> errata and whether we needed to do a firmware update. Upon closer
> inspection, however, we realized that the errata don't apply to us.
> Specifically, the errata document says that for these errata:
> * Found in: r0p0
> * Fixed in: r2p2
>
> ...and trogdor devices appear to be running r2p4. That means that they
> are unaffected despite the scary warning.
>
> The issue is that the kernel unconditionally tries to disable the
> prefetcher even on unaffected devices and then warns when it's unable
> to.
>
> Let's change the kernel to only disable the prefetcher on affected
> devices, which will get rid of the scary warning on devices that are
> unaffected. As per the comment the prefetcher is
> "not-particularly-beneficial" but it shouldn't hurt to leave it on for
> devices where it doesn't cause problems.
>
> Fixes: f87f6e5b4539 ("iommu/arm-smmu: Warn once when the perfetcher errata patch fails to apply")
> Signed-off-by: Douglas Anderson <[email protected]>
> ---
>
> drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 21 +++++++++++++--------
> 1 file changed, 13 insertions(+), 8 deletions(-)


Just curious, but did you see any performance impact (good or bad) as a
result of this patch? The next-page prefetcher has always looked a little
naive to me and, with a tendency for tiny TLBs in some implementations,
there's a possibility it could do more harm than good.

Will

2024-05-17 17:20:00

by Doug Anderson

Subject: Re: [PATCH] iommu/arm-smmu: Don't disable next-page prefetcher on devices it works on

Hi,

On Fri, May 17, 2024 at 9:37 AM Will Deacon <[email protected]> wrote:
>
> Hi Doug,
>
> On Mon, May 13, 2024 at 04:13:47PM -0700, Douglas Anderson wrote:
> > On sc7180 trogdor devices we get a scary warning at bootup:
> > arm-smmu 15000000.iommu:
> > Failed to disable prefetcher [errata #841119 and #826419], check ACR.CACHE_LOCK
> >
> > We spent some time trying to figure out how we were going to fix these
> > errata and whether we needed to do a firmware update. Upon closer
> > inspection, however, we realized that the errata don't apply to us.
> > Specifically, the errata document says that for these errata:
> > * Found in: r0p0
> > * Fixed in: r2p2
> >
> > ...and trogdor devices appear to be running r2p4. That means that they
> > are unaffected despite the scary warning.
> >
> > The issue is that the kernel unconditionally tries to disable the
> > prefetcher even on unaffected devices and then warns when it's unable
> > to.
> >
> > Let's change the kernel to only disable the prefetcher on affected
> > devices, which will get rid of the scary warning on devices that are
> > unaffected. As per the comment the prefetcher is
> > "not-particularly-beneficial" but it shouldn't hurt to leave it on for
> > devices where it doesn't cause problems.
> >
> > Fixes: f87f6e5b4539 ("iommu/arm-smmu: Warn once when the perfetcher errata patch fails to apply")
> > Signed-off-by: Douglas Anderson <[email protected]>
> > ---
> >
> > drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 21 +++++++++++++--------
> > 1 file changed, 13 insertions(+), 8 deletions(-)
>
>
> Just curious, but did you see any performance impact (good or bad) as a
> result of this patch? The next-page prefetcher has always looked a little
> naive to me and, with a tendency for tiny TLBs in some implementations,
> there's a possibility it could do more harm than good.

This patch actually makes no difference on trogdor today other than
getting rid of the scary warning. Specifically on trogdor the
ACR.CACHE_LOCK bit seems to be set so the kernel is unable to change
the setting anyway and has never been able to. We are working on
figuring out how to fix the firmware and then we have to get a
firmware spin before we can really see any changes. I'll keep an eye
out to see if performance numbers change when the firmware uprevs.

BTW: any idea how big of a deal these errata are? We're _just_
finishing a firmware uprev process and there is always pushback
against kicking off a new one unless the issue is important. Given
that we've been living with this issue since devices shipped I'm going
to assume we don't need to rush a firmware update, but if this is
really scary and needs to be addressed sooner we can figure that out.

-Doug
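
To illustrate why the warning fires when firmware leaves ACR.CACHE_LOCK set, here is a small userspace sketch of the read-modify-write-readback pattern the driver uses. Everything in it is an assumption made for illustration: struct sim_bank, SIM_CPRE and the sim_*() helpers are hypothetical stand-ins, not driver or hardware definitions, and the "locked" flag simply models writes to ACTLR not sticking, as the comment in arm_mmu500_reset() describes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the prefetch-enable bit in ACTLR. */
#define SIM_CPRE (1u << 1)

/* Simulated context bank: one control register plus a lock flag. */
struct sim_bank {
	uint32_t actlr;	/* simulated ACTLR contents */
	bool locked;	/* models firmware leaving ACR.CACHE_LOCK set */
};

static uint32_t sim_read(const struct sim_bank *b)
{
	return b->actlr;
}

static void sim_write(struct sim_bank *b, uint32_t val)
{
	/* Assumption: while locked, writes are silently dropped. */
	if (!b->locked)
		b->actlr = val;
}

/*
 * Try to clear the prefetch bit and read it back to confirm.
 * Returns true if the bit is clear afterwards, false if the
 * write did not stick (the case in which the driver warns).
 */
static bool sim_disable_prefetch(struct sim_bank *b)
{
	uint32_t reg = sim_read(b);

	reg &= ~SIM_CPRE;
	sim_write(b, reg);

	return !(sim_read(b) & SIM_CPRE);
}
```

In this model, a bank initialised with SIM_CPRE set and locked = true makes sim_disable_prefetch() return false, which corresponds to the situation described above where the kernel has never been able to change the setting. The same bank with locked = false returns true and ends up with the bit cleared.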