2017-04-07 12:40:43

by Joerg Roedel

[permalink] [raw]
Subject: [PATCH] PCI: Add ATS-disable quirk for AMD Stoney GPUs

From: Joerg Roedel <[email protected]>

ATS is broken on this hardware and causes IOMMU stalls and
system failure. Disable ATS on these devices to make them
usable again with IOMMU enabled.

Note that the commit in the Fixes-tag is not buggy, it
just uncovers the problem in the hardware by increasing
the ATS-flush rate.

Fixes: b1516a14657a ('iommu/amd: Implement flush queue')
Signed-off-by: Joerg Roedel <[email protected]>
---
drivers/pci/quirks.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 6736836..3bc9856 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4634,3 +4634,22 @@ static void quirk_no_aersid(struct pci_dev *pdev)
DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2031, quirk_no_aersid);
DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2032, quirk_no_aersid);
DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2033, quirk_no_aersid);
+
+/*
+ * Some devices have a broken ATS implementation causing IOMMU stalls.
+ * Don't use ATS for those devices.
+ */
+static void quirk_disable_ats(struct pci_dev *pdev)
+{
+#ifdef CONFIG_PCI_ATS
+ /*
+ * Set pdev->ats_cap = 0 to make pci_enable_ats() bail out
+ * early.
+ */
+ dev_info(&pdev->dev, "QUIRK: Disabling ATS");
+ pdev->ats_cap = 0;
+#endif
+}
+
+/* AMD Stoney platform GPU */
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x98e4, quirk_disable_ats);
--
1.9.1


2017-04-07 12:53:27

by David Woodhouse

[permalink] [raw]
Subject: Re: [PATCH] PCI: Add ATS-disable quirk for AMD Stoney GPUs

On Fri, 2017-04-07 at 14:40 +0200, Joerg Roedel wrote:
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 6736836..3bc9856 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -4634,3 +4634,22 @@ static void quirk_no_aersid(struct pci_dev *pdev)
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2031, quirk_no_aersid);
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2032, quirk_no_aersid);
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2033, quirk_no_aersid);
> +
> +/*
> + * Some devices have a broken ATS implementation causing IOMMU stalls.
> + * Don't use ATS for those devices.
> + */
> +static void quirk_disable_ats(struct pci_dev *pdev)
> +{
> +#ifdef CONFIG_PCI_ATS
> +       /*
> +        * Set pdev->ats_cap = 0 to make pci_enable_ats() bail out
> +        * early.
> +        */
> +       dev_info(&pdev->dev, "QUIRK: Disabling ATS");
> +       pdev->ats_cap = 0;
> +#endif
> +}
> +
> +/* AMD Stoney platform GPU */
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x98e4, quirk_disable_ats);

Why not put the #ifdef around *all* of the above?


Attachments:
smime.p7s (4.82 kB)

2017-04-07 12:59:11

by Jörg Rödel

[permalink] [raw]
Subject: Re: [PATCH] PCI: Add ATS-disable quirk for AMD Stoney GPUs

On Fri, Apr 07, 2017 at 02:53:05PM +0200, David Woodhouse wrote:
> On Fri, 2017-04-07 at 14:40 +0200, Joerg Roedel wrote:
> >
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index 6736836..3bc9856 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -4634,3 +4634,22 @@ static void quirk_no_aersid(struct pci_dev *pdev)
> > ?DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2031, quirk_no_aersid);
> > ?DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2032, quirk_no_aersid);
> > ?DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2033, quirk_no_aersid);
> > +
> > +/*
> > + * Some devices have a broken ATS implementation causing IOMMU stalls.
> > + * Don't use ATS for those devices.
> > + */
> > +static void quirk_disable_ats(struct pci_dev *pdev)
> > +{
> > +#ifdef CONFIG_PCI_ATS
> > +???????/*
> > +??????? * Set pdev->ats_cap = 0 to make pci_enable_ats() bail out
> > +??????? * early.
> > +??????? */
> > +???????dev_info(&pdev->dev, "QUIRK: Disabling ATS");
> > +???????pdev->ats_cap = 0;
> > +#endif
> > +}
> > +
> > +/* AMD Stoney platform GPU */
> > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x98e4, quirk_disable_ats);
>
> Why not put the #ifdef around *all* of the above?

Good point, I'll change that.