Using vfio-pci on a combination of cn8xxx and some PCI devices results in
a kernel panic. This is triggered by issuing a bus or a slot reset
on the PCI device.
The solution is to prevent the reset. I've dropped the vfio patch from the
previous series as vfio-pci already checks in the reset path for
pci_bus_resetable() and pci_slot_resetable().
With this series both checks report that the reset is not possible,
which prevents the kernel panic.
David Daney (2):
PCI: Allow PCI_DEV_FLAGS_NO_BUS_RESET to be used on bus device
PCI: Avoid bus reset for Cavium cn8xxx root ports
Jan Glauber (1):
PCI: Avoid slot reset for Cavium cn8xxx root ports
drivers/pci/pci.c | 4 ++++
drivers/pci/quirks.c | 24 ++++++++++++++++++++++++
2 files changed, 28 insertions(+)
--
2.9.0.rc0.21.g7777322
From: David Daney <[email protected]>
When checking to see if a PCI bus can safely be reset, we check to see
if any of the children have their PCI_DEV_FLAGS_NO_BUS_RESET flag set,
as these devices are known not to behave well after a bus reset.
Some PCIe root port bridges also do not behave well after a bus reset,
sometimes causing the devices behind the bridge to become unusable.
Add a check for the PCI_DEV_FLAGS_NO_BUS_RESET flag being set in the
bridge device to allow these bridges to be flagged, and prevent their
buses from being reset.
A follow-on patch will add a quirk for this type of bridge.
Signed-off-by: David Daney <[email protected]>
[[email protected]: fixed typo]
Signed-off-by: Jan Glauber <[email protected]>
---
drivers/pci/pci.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index af0cc34..d9abbc9 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4290,6 +4290,10 @@ static bool pci_bus_resetable(struct pci_bus *bus)
 {
 	struct pci_dev *dev;

+
+	if (bus->self && (bus->self->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET))
+		return false;
+
 	list_for_each_entry(dev, &bus->devices, bus_list) {
 		if (dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET ||
 		    (dev->subordinate && !pci_bus_resetable(dev->subordinate)))
--
2.9.0.rc0.21.g7777322
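For readers following along, the effect of the added check can be tried out in user space. The following is only a minimal sketch: `model_dev`, `model_bus`, `MODEL_NO_BUS_RESET` and the fixed-size child array are names and simplifications made up for illustration, not the kernel's `struct pci_dev` / `struct pci_bus` (the kernel walks a linked list instead).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fields the check needs. */
#define MODEL_NO_BUS_RESET (1u << 6)	/* illustrative bit value (assumption) */
#define MODEL_MAX_CHILDREN 4

struct model_bus;

struct model_dev {
	unsigned int dev_flags;
	struct model_bus *subordinate;	/* secondary bus if this device is a bridge */
};

struct model_bus {
	struct model_dev *self;		/* bridge leading to this bus, NULL on a root bus */
	struct model_dev *devices[MODEL_MAX_CHILDREN];
	size_t ndevices;
};

/* Same order as the patched check: bridge flag first, then children, recursively. */
bool model_bus_resetable(const struct model_bus *bus)
{
	size_t i;

	assert(bus != NULL && bus->ndevices <= MODEL_MAX_CHILDREN);

	if (bus->self && (bus->self->dev_flags & MODEL_NO_BUS_RESET))
		return false;

	for (i = 0; i < bus->ndevices; i++) {
		const struct model_dev *dev = bus->devices[i];

		if ((dev->dev_flags & MODEL_NO_BUS_RESET) ||
		    (dev->subordinate && !model_bus_resetable(dev->subordinate)))
			return false;
	}
	return true;
}
```

With a flagged bridge in `bus->self`, `model_bus_resetable()` returns false no matter what the children look like, which is the property the quirk in the follow-on patch relies on.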
Root ports of cn8xxx do not function after a slot reset when used with
some e1000e and LSI HBA devices. Add a quirk to prevent slot reset on
these root ports.
Signed-off-by: Jan Glauber <[email protected]>
---
drivers/pci/quirks.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 85191b8..6679971 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -845,6 +845,22 @@ static void quirk_cavium_sriov_rnm_link(struct pci_dev *dev)
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CAVIUM, 0xa018, quirk_cavium_sriov_rnm_link);
 #endif

+/*
+ * Root port on some Cavium CN8xxx chips do not successfully complete
+ * a bus reset when used with certain types of child devices. Config
+ * space access to the child may quit responding. Flag all devices under
+ * the secondary bus as non-resettable.
+ */
+static void quirk_CN8xxx_secondary_bus(struct pci_dev *dev)
+{
+	struct pci_dev *pdev;
+
+	dev_warn(&dev->dev, "Cavium CN8xxx quirk detected; reset for devices on secondary bus disabled\n");
+	list_for_each_entry(pdev, &dev->subordinate->devices, bus_list)
+		pdev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CAVIUM, 0xa100, quirk_CN8xxx_secondary_bus);
+
 /*
  * Some settings of MMRBC can lead to data corruption so block changes.
  * See AMD 8131 HyperTransport PCI-X Tunnel Revision Guide
--
2.9.0.rc0.21.g7777322
From: David Daney <[email protected]>
Root ports of cn8xxx do not function after bus reset when used with
some e1000e and LSI HBA devices. Add a quirk to prevent bus reset on
these root ports.
Signed-off-by: David Daney <[email protected]>
[[email protected]: fixed typo and whitespaces]
Signed-off-by: Jan Glauber <[email protected]>
---
drivers/pci/quirks.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 6967c6b..85191b8 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3364,6 +3364,14 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0032, quirk_no_bus_reset);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x003c, quirk_no_bus_reset);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0033, quirk_no_bus_reset);

+/*
+ * Root port on some Cavium CN8xxx chips do not successfully complete
+ * a bus reset when used with certain types of child devices. Config
+ * space access to the child may quit responding. Flag the root port
+ * as not supporting bus reset.
+ */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_CAVIUM, 0xa100, quirk_no_bus_reset);
+
 static void quirk_no_pm_reset(struct pci_dev *dev)
 {
 	/*
--
2.9.0.rc0.21.g7777322
On Wed, 30 Aug 2017 16:24:54 +0200
Jan Glauber <[email protected]> wrote:
> Root ports of cn8xxx do not function after a slot reset when used with
> some e1000e and LSI HBA devices. Add a quirk to prevent slot reset on
> these root ports.
>
> Signed-off-by: Jan Glauber <[email protected]>
> ---
> drivers/pci/quirks.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 85191b8..6679971 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -845,6 +845,22 @@ static void quirk_cavium_sriov_rnm_link(struct pci_dev *dev)
> DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CAVIUM, 0xa018, quirk_cavium_sriov_rnm_link);
> #endif
>
> +/*
> + * Root port on some Cavium CN8xxx chips do not successfully complete
> + * a bus reset when used with certain types of child devices. Config
> + * space access to the child may quit responding. Flag all devices under
> + * the secondary bus as non-resettable.
> + */
> +static void quirk_CN8xxx_secondary_bus(struct pci_dev *dev)
> +{
> + struct pci_dev *pdev;
> +
> + dev_warn(&dev->dev, "Cavium CN8xxx quirk detected; reset for devices on secondary bus disabled\n");
> + list_for_each_entry(pdev, &dev->subordinate->devices, bus_list)
> + pdev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;
> +}
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CAVIUM, 0xa100, quirk_CN8xxx_secondary_bus);
> +
> /*
> * Some settings of MMRBC can lead to data corruption so block changes.
> * See AMD 8131 HyperTransport PCI-X Tunnel Revision Guide
This doesn't seem reliable, doesn't the user just need to remove and
reprobe the slot and the device would re-appear without this flag set?
Thanks,
Alex
On Wed, Aug 30, 2017 at 08:40:12AM -0600, Alex Williamson wrote:
> This doesn't seem reliable, doesn't the user just need to remove and
> reprobe the slot and the device would re-appear without this flag set?
No, I tried before to disable the slot with "echo 0 > /sys/bus/pci/slots/3/power"
but that does not work as it is not supported.
I'm not familiar with the quirk types, would another one be better
suited here (even if we don't have the problem you described)?
thanks,
Jan
> Thanks,
>
> Alex
On Thu, 31 Aug 2017 11:40:52 +0200
Jan Glauber <[email protected]> wrote:
> On Wed, Aug 30, 2017 at 08:40:12AM -0600, Alex Williamson wrote:
> > This doesn't seem reliable, doesn't the user just need to remove and
> > reprobe the slot and the device would re-appear without this flag set?
>
> No, I tried before to disable the slot with "echo 0 > /sys/bus/pci/slots/3/power"
> but that does not work as it is not supported.
>
> I'm not familiar with the quirk types, would another one be better
> suited here (even if we don't have the problem you described)?
The scenario I'm mentioning is to "echo 1 > /sys/bus/pci/devices/<some
device under the slot>/remove", then "echo <that device address> >
/sys/bus/pci/rescan". This would break the ordering implicit in using
a fixup defined for the root port. It seems like it'd make a lot more
sense to add a test on the parent bridge more similar to how the bus
reset works. It's not the subordinate devices imposing the
no-bus-reset flag, it's the bridge device and the objects and code
should support and reflect that. Thanks,
Alex
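The ordering problem described above can be illustrated with a small user-space model. This is a sketch under stated assumptions, not kernel code: all `model_*` names are hypothetical, the flag value is illustrative, and remove plus rescan is modelled simply as the child coming back as a fresh object while the root port's own fixup is not run again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NO_BUS_RESET (1u << 6)	/* illustrative bit value (assumption) */

struct model_dev {
	unsigned int dev_flags;
};

struct model_port {
	struct model_dev self;	/* the root port (bridge) itself */
	struct model_dev child;	/* one device on the secondary bus */
};

/* Approach of the posted slot patch: the port's fixup flags the children present right now. */
void model_fixup_mark_children(struct model_port *port)
{
	assert(port != NULL);
	port->child.dev_flags |= MODEL_NO_BUS_RESET;
}

/* Approach suggested in the review: the flag lives on the bridge. */
void model_fixup_mark_bridge(struct model_port *port)
{
	assert(port != NULL);
	port->self.dev_flags |= MODEL_NO_BUS_RESET;
}

/* Assumed model of remove + rescan: the child is re-enumerated with clean flags. */
void model_remove_and_rescan_child(struct model_port *port)
{
	assert(port != NULL);
	port->child = (struct model_dev){ 0 };
}

/* Reset is refused if either the bridge or the child carries the flag. */
bool model_reset_allowed(const struct model_port *port)
{
	assert(port != NULL);
	return !((port->self.dev_flags | port->child.dev_flags) & MODEL_NO_BUS_RESET);
}
```

In this model, calling `model_fixup_mark_children()` followed by `model_remove_and_rescan_child()` leaves `model_reset_allowed()` returning true again, whereas the flag set by `model_fixup_mark_bridge()` survives the rescan.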
On Thu, Aug 31, 2017 at 10:01:30AM -0600, Alex Williamson wrote:
> On Thu, 31 Aug 2017 11:40:52 +0200
> Jan Glauber <[email protected]> wrote:
>
> > On Wed, Aug 30, 2017 at 08:40:12AM -0600, Alex Williamson wrote:
> > > This doesn't seem reliable, doesn't the user just need to remove and
> > > reprobe the slot and the device would re-appear without this flag set?
> >
> > No, I tried before to disable the slot with "echo 0 > /sys/bus/pci/slots/3/power"
> > but that does not work as it is not supported.
> >
> > I'm not familiar with the quirk types, would another one be better
> > suited here (even if we don't have the problem you described)?
>
> The scenario I'm mentioning is to "echo 1 > /sys/bus/pci/devices/<some
> device under the slot>/remove", then "echo <that device address> >
> /sys/bus/pci/rescan". This would break the ordering implicit in using
> a fixup defined for the root port. It seems like it'd make a lot more
> sense to add a test on the parent bridge more similar to how the bus
> reset works. It's not the subordinate devices imposing the
> no-bus-reset flag, it's the bridge device and the objects and code
> should support and reflect that. Thanks,
Doing "echo <that device address> > /sys/bus/pci/rescan" after the
remove did not work for me, but maybe the format of the device address
needs to be different. Anyway, the sequence
echo 1 > /sys/bus/pci/devices/<some device under the slot>/remove
echo 1 > /sys/bus/pci/rescan
still triggers the panic as you mentioned above.
I agree that the subordinate devices are not causing the issue, still
I need to make pci_slot_resetable() return false in our case.
So what if we add an additional check like:
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index fdf65a6..389db4b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4389,6 +4389,9 @@ static bool pci_slot_resetable(struct pci_slot *slot)
 {
 	struct pci_dev *dev;

+	if (slot->bus->self & PCI_DEV_FLAGS_NO_BUS_RESET)
+		return false;
+
 	list_for_each_entry(dev, &slot->bus->devices, bus_list) {
 		if (!dev->slot || dev->slot != slot)
 			continue;
--Jan
On Thu, Sep 07, 2017 at 09:40:11AM +0200, Jan Glauber wrote:
> So what if we add an additional check like:
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index fdf65a6..389db4b 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4389,6 +4389,9 @@ static bool pci_slot_resetable(struct pci_slot *slot)
> {
> struct pci_dev *dev;
>
> + if (slot->bus->self & PCI_DEV_FLAGS_NO_BUS_RESET)
> + return false;
> +
> list_for_each_entry(dev, &slot->bus->devices, bus_list) {
> if (!dev->slot || dev->slot != slot)
> continue;
Obviously I meant:
if (slot->bus->self->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET)
--Jan
On Thu, 7 Sep 2017 09:49:04 +0200
Jan Glauber <[email protected]> wrote:
> On Thu, Sep 07, 2017 at 09:40:11AM +0200, Jan Glauber wrote:
> > So what if we add an additional check like:
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index fdf65a6..389db4b 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -4389,6 +4389,9 @@ static bool pci_slot_resetable(struct pci_slot *slot)
> > {
> > struct pci_dev *dev;
> >
> > + if (slot->bus->self & PCI_DEV_FLAGS_NO_BUS_RESET)
> > + return false;
> > +
> > list_for_each_entry(dev, &slot->bus->devices, bus_list) {
> > if (!dev->slot || dev->slot != slot)
> > continue;
>
> Obviously I meant:
> if (slot->bus->self->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET)
Much better, perhaps even incorporate the bus->self check for good
measure... is it possible to have a slot on a root bus? Taking
different approaches for bus vs slot reset should have been a giant red
flag that something is wrong. Thanks,
Alex
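A possible shape for the slot check with the extra `bus->self` guard, written as a stand-alone user-space sketch. The `model_*` types and the flag value are hypothetical stand-ins rather than the kernel's definitions, and the function covers only the bridge test, not the per-device loop of `pci_slot_resetable()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NO_BUS_RESET (1u << 6)	/* illustrative bit value (assumption) */

struct model_dev {
	unsigned int dev_flags;
};

struct model_bus {
	struct model_dev *self;	/* bridge leading to this bus, NULL on a root bus */
};

struct model_slot {
	struct model_bus *bus;
};

/*
 * Returns true when the bridge above the slot forbids a reset.
 * The NULL test keeps a slot on a root bus (no bridge) from
 * dereferencing a NULL pointer.
 */
bool model_slot_bridge_blocks_reset(const struct model_slot *slot)
{
	const struct model_dev *bridge;

	assert(slot != NULL && slot->bus != NULL);
	bridge = slot->bus->self;

	return bridge != NULL && (bridge->dev_flags & MODEL_NO_BUS_RESET) != 0;
}
```

Written this way the bridge test has the same form as the one added to the bus reset path earlier in the series, so the bus and slot paths take the same approach.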