From: Sean V Kelley <[email protected]>
Changes since v7 [1]:
- No functional changes.
- Reword bridge patch.
- Noted testing below for #non-native/no RCEC case
(Jonathan Cameron)
- Separate out pci_walk_bus() into pci_walk_bridge() change.
- Put remaining dev to bridge name changes in the separate patch from v7.
(Bjorn Helgaas)
[1] https://lore.kernel.org/lkml/[email protected]/
Root Complex Event Collectors (RCEC) provide support for terminating error
and PME messages from Root Complex Integrated Endpoints (RCiEPs). An RCEC
resides on a Bus in the Root Complex. Multiple RCECs can in fact reside on
a single bus. An RCEC will explicitly declare supported RCiEPs through the
Root Complex Endpoint Association Extended Capability.
(See PCIe 5.0-1, sections 1.3.2.3 (RCiEP), and 7.9.10 (RCEC Ext. Cap.))
The kernel lacks handling for these RCECs and the error messages received
from their respective associated RCiEPs. More recently, a new CPU
interconnect, Compute eXpress Link (CXL) depends on RCEC capabilities for
purposes of error messaging from CXL 1.1 supported RCiEP devices.
DocLink: https://www.computeexpresslink.org/
This use case is not limited to CXL. Existing hardware today includes
support for RCECs, such as the Denverton microserver product
family. Future hardware will be forthcoming.
(See Intel Document, Order number: 33061-003US)
So services such as AER or PME could be associated with an RCEC driver.
In the case of CXL, if an RCiEP (i.e., CXL 1.1 device) is associated with a
platform's RCEC it shall signal PME and AER error conditions through that
RCEC.
Towards the above use cases, add the missing RCEC class and extend the
PCIe Root Port and service drivers to allow association of RCiEPs to their
respective parent RCEC and facilitate handling of terminating error and PME
messages.
Tested-by: Jonathan Cameron <[email protected]> #non-native/no RCEC
Jonathan Cameron (1):
PCI/AER: Extend AER error handling to RCECs
Qiuxu Zhuo (5):
PCI/RCEC: Add RCEC class code and extended capability
PCI/RCEC: Bind RCEC devices to the Root Port driver
PCI/AER: Apply function level reset to RCiEP on fatal error
PCI/RCEC: Add RCiEP's linked RCEC to AER/ERR
PCI/AER: Add RCEC AER error injection support
Sean V Kelley (8):
PCI/RCEC: Cache RCEC capabilities in pci_init_capabilities()
PCI/ERR: Rename reset_link() to reset_subordinate_device()
PCI/ERR: Use "bridge" for clarity in pcie_do_recovery()
PCI/ERR: Add pci_walk_bridge() to pcie_do_recovery()
PCI/ERR: Limit AER resets in pcie_do_recovery()
PCI/RCEC: Add pcie_link_rcec() to associate RCiEPs
PCI/AER: Add pcie_walk_rcec() to RCEC AER handling
PCI/PME: Add pcie_walk_rcec() to RCEC PME handling
drivers/pci/pci.h | 25 ++++-
drivers/pci/pcie/Makefile | 2 +-
drivers/pci/pcie/aer.c | 36 ++++--
drivers/pci/pcie/aer_inject.c | 5 +-
drivers/pci/pcie/err.c | 109 +++++++++++++++----
drivers/pci/pcie/pme.c | 15 ++-
drivers/pci/pcie/portdrv_core.c | 8 +-
drivers/pci/pcie/portdrv_pci.c | 8 +-
drivers/pci/pcie/rcec.c | 187 ++++++++++++++++++++++++++++++++
drivers/pci/probe.c | 2 +
include/linux/pci.h | 5 +
include/linux/pci_ids.h | 1 +
include/uapi/linux/pci_regs.h | 7 ++
13 files changed, 367 insertions(+), 43 deletions(-)
create mode 100644 drivers/pci/pcie/rcec.c
--
2.28.0
From: Qiuxu Zhuo <[email protected]>
A PCIe Root Complex Event Collector (RCEC) has the base class 0x08,
sub-class 0x07, and programming interface 0x00. Add the class code
0x0807 to identify RCEC devices and add the defines for the RCEC
Endpoint Association Extended Capability.
See PCI Express Base Specification, version 5.0-1, section "1.3.4
Root Complex Event Collector" and section "7.9.10 Root Complex
Event Collector Endpoint Association Extended Capability"
Signed-off-by: Qiuxu Zhuo <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
---
include/linux/pci_ids.h | 1 +
include/uapi/linux/pci_regs.h | 7 +++++++
2 files changed, 8 insertions(+)
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 1ab1e24bcbce..d8156a5dbee8 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -81,6 +81,7 @@
#define PCI_CLASS_SYSTEM_RTC 0x0803
#define PCI_CLASS_SYSTEM_PCI_HOTPLUG 0x0804
#define PCI_CLASS_SYSTEM_SDHCI 0x0805
+#define PCI_CLASS_SYSTEM_RCEC 0x0807
#define PCI_CLASS_SYSTEM_OTHER 0x0880
#define PCI_BASE_CLASS_INPUT 0x09
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index f9701410d3b5..f335f65f65d6 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -828,6 +828,13 @@
#define PCI_PWR_CAP_BUDGET(x) ((x) & 1) /* Included in system budget */
#define PCI_EXT_CAP_PWR_SIZEOF 16
+/* Root Complex Event Collector Endpoint Association */
+#define PCI_RCEC_RCIEP_BITMAP 4 /* Associated Bitmap for RCiEPs */
+#define PCI_RCEC_BUSN 8 /* RCEC Associated Bus Numbers */
+#define PCI_RCEC_BUSN_REG_VER 0x02 /* Least capability version that BUSN present */
+#define PCI_RCEC_BUSN_NEXT(x) (((x) >> 8) & 0xff)
+#define PCI_RCEC_BUSN_LAST(x) (((x) >> 16) & 0xff)
+
/* Vendor-Specific (VSEC, PCI_EXT_CAP_ID_VNDR) */
#define PCI_VNDR_HEADER 4 /* Vendor-Specific Header */
#define PCI_VNDR_HEADER_ID(x) ((x) & 0xffff)
--
2.28.0
From: Sean V Kelley <[email protected]>
Extend support for Root Complex Event Collectors by decoding and
caching the RCEC Endpoint Association Extended Capabilities when
enumerating. Use that cached information for later error source
reporting. See PCI Express Base Specification, version 5.0-1,
section 7.9.10.
Suggested-by: Bjorn Helgaas <[email protected]>
Co-developed-by: Qiuxu Zhuo <[email protected]>
Signed-off-by: Qiuxu Zhuo <[email protected]>
Signed-off-by: Sean V Kelley <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
---
drivers/pci/pci.h | 17 +++++++++++
drivers/pci/pcie/Makefile | 2 +-
drivers/pci/pcie/rcec.c | 59 +++++++++++++++++++++++++++++++++++++++
drivers/pci/probe.c | 2 ++
include/linux/pci.h | 4 +++
5 files changed, 83 insertions(+), 1 deletion(-)
create mode 100644 drivers/pci/pcie/rcec.c
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index fa12f7cbc1a0..0e332a218d75 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -449,6 +449,15 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info);
void aer_print_error(struct pci_dev *dev, struct aer_err_info *info);
#endif /* CONFIG_PCIEAER */
+#ifdef CONFIG_PCIEPORTBUS
+/* Cached RCEC Endpoint Association */
+struct rcec_ea {
+ u8 nextbusn;
+ u8 lastbusn;
+ u32 bitmap;
+};
+#endif
+
#ifdef CONFIG_PCIE_DPC
void pci_save_dpc_state(struct pci_dev *dev);
void pci_restore_dpc_state(struct pci_dev *dev);
@@ -461,6 +470,14 @@ static inline void pci_restore_dpc_state(struct pci_dev *dev) {}
static inline void pci_dpc_init(struct pci_dev *pdev) {}
#endif
+#ifdef CONFIG_PCIEPORTBUS
+int pci_rcec_init(struct pci_dev *dev);
+void pci_rcec_exit(struct pci_dev *dev);
+#else
+static inline int pci_rcec_init(struct pci_dev *dev) { return 0; }
+static inline void pci_rcec_exit(struct pci_dev *dev) {}
+#endif
+
#ifdef CONFIG_PCI_ATS
/* Address Translation Service */
void pci_ats_init(struct pci_dev *dev);
diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
index 68da9280ff11..d9697892fa3e 100644
--- a/drivers/pci/pcie/Makefile
+++ b/drivers/pci/pcie/Makefile
@@ -2,7 +2,7 @@
#
# Makefile for PCI Express features and port driver
-pcieportdrv-y := portdrv_core.o portdrv_pci.o err.o
+pcieportdrv-y := portdrv_core.o portdrv_pci.o err.o rcec.o
obj-$(CONFIG_PCIEPORTBUS) += pcieportdrv.o
diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
new file mode 100644
index 000000000000..da02b0af442d
--- /dev/null
+++ b/drivers/pci/pcie/rcec.c
@@ -0,0 +1,59 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Root Complex Event Collector Support
+ *
+ * Authors:
+ * Sean V Kelley <[email protected]>
+ * Qiuxu Zhuo <[email protected]>
+ *
+ * Copyright (C) 2020 Intel Corp.
+ */
+
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/pci_regs.h>
+
+#include "../pci.h"
+
+int pci_rcec_init(struct pci_dev *dev)
+{
+ struct rcec_ea *rcec_ea;
+ u32 rcec, hdr, busn;
+ u8 ver;
+
+ /* Only for Root Complex Event Collectors */
+ if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC)
+ return 0;
+
+ rcec = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_RCEC);
+ if (!rcec)
+ return 0;
+
+ rcec_ea = kzalloc(sizeof(*rcec_ea), GFP_KERNEL);
+ if (!rcec_ea)
+ return -ENOMEM;
+ dev->rcec_ea = rcec_ea;
+
+ pci_read_config_dword(dev, rcec + PCI_RCEC_RCIEP_BITMAP, &rcec_ea->bitmap);
+
+ /* Check whether RCEC BUSN register is present */
+ pci_read_config_dword(dev, rcec, &hdr);
+ ver = PCI_EXT_CAP_VER(hdr);
+ if (ver < PCI_RCEC_BUSN_REG_VER) {
+ /* Avoid later ver check by setting nextbusn */
+ rcec_ea->nextbusn = 0xff;
+ return 0;
+ }
+
+ pci_read_config_dword(dev, rcec + PCI_RCEC_BUSN, &busn);
+ rcec_ea->nextbusn = PCI_RCEC_BUSN_NEXT(busn);
+ rcec_ea->lastbusn = PCI_RCEC_BUSN_LAST(busn);
+
+ return 0;
+}
+
+void pci_rcec_exit(struct pci_dev *dev)
+{
+ kfree(dev->rcec_ea);
+ dev->rcec_ea = NULL;
+}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 03d37128a24f..25f01f841f2d 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2201,6 +2201,7 @@ static void pci_configure_device(struct pci_dev *dev)
static void pci_release_capabilities(struct pci_dev *dev)
{
pci_aer_exit(dev);
+ pci_rcec_exit(dev);
pci_vpd_release(dev);
pci_iov_release(dev);
pci_free_cap_save_buffers(dev);
@@ -2400,6 +2401,7 @@ static void pci_init_capabilities(struct pci_dev *dev)
pci_ptm_init(dev); /* Precision Time Measurement */
pci_aer_init(dev); /* Advanced Error Reporting */
pci_dpc_init(dev); /* Downstream Port Containment */
+ pci_rcec_init(dev); /* Root Complex Event Collector */
pcie_report_downtraining(dev);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 835530605c0d..2290439e8bc0 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -304,6 +304,7 @@ struct pcie_link_state;
struct pci_vpd;
struct pci_sriov;
struct pci_p2pdma;
+struct rcec_ea;
/* The pci_dev structure describes PCI devices */
struct pci_dev {
@@ -326,6 +327,9 @@ struct pci_dev {
#ifdef CONFIG_PCIEAER
u16 aer_cap; /* AER capability offset */
struct aer_stats *aer_stats; /* AER stats for this device */
+#endif
+#ifdef CONFIG_PCIEPORTBUS
+ struct rcec_ea *rcec_ea; /* RCEC cached endpoint association */
#endif
u8 pcie_cap; /* PCIe capability offset */
u8 msi_cap; /* MSI capability offset */
--
2.28.0
From: Sean V Kelley <[email protected]>
Consolidate subordinate bus checks with pci_walk_bus()
into pci_walk_bridge() for walking below potentially
AER affected bridges.
Suggested-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Sean V Kelley <[email protected]>
---
drivers/pci/pcie/err.c | 29 ++++++++++++++++++++++-------
1 file changed, 22 insertions(+), 7 deletions(-)
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index e68ea5243ff2..9b2130725ab6 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -146,12 +146,28 @@ static int report_resume(struct pci_dev *dev, void *data)
return 0;
}
+/**
+ * pci_walk_bridge - walk bridges potentially AER affected
+ * @bridge bridge which may be a Port.
+ * @cb callback to be called for each device found
+ * @userdata arbitrary pointer to be passed to callback.
+ *
+ * If the device provided is a bridge, walk the subordinate bus,
+ * including any bridged devices on buses under this bus.
+ * Call the provided callback on each device found.
+ */
+static void pci_walk_bridge(struct pci_dev *bridge, int (*cb)(struct pci_dev *, void *),
+ void *userdata)
+{
+ if (bridge->subordinate)
+ pci_walk_bus(bridge->subordinate, cb, userdata);
+}
+
pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_channel_state_t state,
pci_ers_result_t (*reset_subordinate_devices)(struct pci_dev *pdev))
{
pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
- struct pci_bus *bus;
struct pci_dev *bridge;
int type;
@@ -167,23 +183,22 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
else
bridge = pci_upstream_bridge(dev);
- bus = bridge->subordinate;
pci_dbg(dev, "broadcast error_detected message\n");
if (state == pci_channel_io_frozen) {
- pci_walk_bus(bus, report_frozen_detected, &status);
+ pci_walk_bridge(bridge, report_frozen_detected, &status);
status = reset_subordinate_device(bridge);
if (status != PCI_ERS_RESULT_RECOVERED) {
pci_warn(dev, "subordinate device reset failed\n");
goto failed;
}
} else {
- pci_walk_bus(bus, report_normal_detected, &status);
+ pci_walk_bridge(bridge, report_normal_detected, &status);
}
if (status == PCI_ERS_RESULT_CAN_RECOVER) {
status = PCI_ERS_RESULT_RECOVERED;
pci_dbg(dev, "broadcast mmio_enabled message\n");
- pci_walk_bus(bus, report_mmio_enabled, &status);
+ pci_walk_bridge(bridge, report_mmio_enabled, &status);
}
if (status == PCI_ERS_RESULT_NEED_RESET) {
@@ -194,14 +209,14 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
*/
status = PCI_ERS_RESULT_RECOVERED;
pci_dbg(dev, "broadcast slot_reset message\n");
- pci_walk_bus(bus, report_slot_reset, &status);
+ pci_walk_bridge(bridge, report_slot_reset, &status);
}
if (status != PCI_ERS_RESULT_RECOVERED)
goto failed;
pci_dbg(dev, "broadcast resume message\n");
- pci_walk_bus(bus, report_resume, &status);
+ pci_walk_bridge(bridge, report_resume, &status);
if (pcie_aer_is_native(bridge))
pcie_clear_device_status(bridge);
--
2.28.0
From: Sean V Kelley <[email protected]>
reset_link() appears to be misnamed. The point is to really
reset any devices below a given bridge. So rename it to
reset_subordinate_devices() to make it clear that we are
passing a bridge with the intent to reset the devices below it.
Suggested-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Sean V Kelley <[email protected]>
Acked-by: Jonathan Cameron <[email protected]>
---
drivers/pci/pci.h | 2 +-
drivers/pci/pcie/err.c | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 0e332a218d75..98ec87ef780d 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -574,7 +574,7 @@ static inline int pci_dev_specific_disable_acs_redir(struct pci_dev *dev)
/* PCI error reporting and recovery */
pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_channel_state_t state,
- pci_ers_result_t (*reset_link)(struct pci_dev *pdev));
+ pci_ers_result_t (*reset_subordinate_devices)(struct pci_dev *pdev));
bool pcie_wait_for_link(struct pci_dev *pdev, bool active);
#ifdef CONFIG_PCIEASPM
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index c543f419d8f9..950612342f1c 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -148,7 +148,7 @@ static int report_resume(struct pci_dev *dev, void *data)
pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_channel_state_t state,
- pci_ers_result_t (*reset_link)(struct pci_dev *pdev))
+ pci_ers_result_t (*reset_subordinate_devices)(struct pci_dev *pdev))
{
pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
struct pci_bus *bus;
@@ -165,9 +165,9 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_dbg(dev, "broadcast error_detected message\n");
if (state == pci_channel_io_frozen) {
pci_walk_bus(bus, report_frozen_detected, &status);
- status = reset_link(dev);
+ status = reset_subordinate_device(dev);
if (status != PCI_ERS_RESULT_RECOVERED) {
- pci_warn(dev, "link reset failed\n");
+ pci_warn(dev, "subordinate device reset failed\n");
goto failed;
}
} else {
--
2.28.0
From: Sean V Kelley <[email protected]>
A Root Complex Event Collector provides support for
terminating error and PME messages from associated RCiEPs.
Make use of the RCEC Endpoint Association Extended Capability
to identify associated RCiEPs. Link the associated RCiEPs as
the RCECs are enumerated.
Co-developed-by: Qiuxu Zhuo <[email protected]>
Signed-off-by: Qiuxu Zhuo <[email protected]>
Signed-off-by: Sean V Kelley <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
---
drivers/pci/pci.h | 2 +
drivers/pci/pcie/portdrv_pci.c | 3 ++
drivers/pci/pcie/rcec.c | 91 ++++++++++++++++++++++++++++++++++
include/linux/pci.h | 1 +
4 files changed, 97 insertions(+)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 98ec87ef780d..ea5716d48b68 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -473,9 +473,11 @@ static inline void pci_dpc_init(struct pci_dev *pdev) {}
#ifdef CONFIG_PCIEPORTBUS
int pci_rcec_init(struct pci_dev *dev);
void pci_rcec_exit(struct pci_dev *dev);
+void pcie_link_rcec(struct pci_dev *rcec);
#else
static inline int pci_rcec_init(struct pci_dev *dev) { return 0; }
static inline void pci_rcec_exit(struct pci_dev *dev) {}
+static inline void pcie_link_rcec(struct pci_dev *rcec) {}
#endif
#ifdef CONFIG_PCI_ATS
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 4d880679b9b1..dbeb0155c2c3 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -110,6 +110,9 @@ static int pcie_portdrv_probe(struct pci_dev *dev,
(pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC)))
return -ENODEV;
+ if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC)
+ pcie_link_rcec(dev);
+
status = pcie_port_device_register(dev);
if (status)
return status;
diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
index da02b0af442d..9ba74d8064e9 100644
--- a/drivers/pci/pcie/rcec.c
+++ b/drivers/pci/pcie/rcec.c
@@ -15,6 +15,97 @@
#include "../pci.h"
+struct walk_rcec_data {
+ struct pci_dev *rcec;
+ int (*user_callback)(struct pci_dev *dev, void *data);
+ void *user_data;
+};
+
+static bool rcec_assoc_rciep(struct pci_dev *rcec, struct pci_dev *rciep)
+{
+ unsigned long bitmap = rcec->rcec_ea->bitmap;
+ unsigned int devn;
+
+ /* An RCiEP found on a different bus in range */
+ if (rcec->bus->number != rciep->bus->number)
+ return true;
+
+ /* Same bus, so check bitmap */
+ for_each_set_bit(devn, &bitmap, 32)
+ if (devn == rciep->devfn)
+ return true;
+
+ return false;
+}
+
+static int link_rcec_helper(struct pci_dev *dev, void *data)
+{
+ struct walk_rcec_data *rcec_data = data;
+ struct pci_dev *rcec = rcec_data->rcec;
+
+ if ((pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END) && rcec_assoc_rciep(rcec, dev)) {
+ dev->rcec = rcec;
+ pci_dbg(dev, "PME & error events reported via %s\n", pci_name(rcec));
+ }
+
+ return 0;
+}
+
+static void walk_rcec(int (*cb)(struct pci_dev *dev, void *data), void *userdata)
+{
+ struct walk_rcec_data *rcec_data = userdata;
+ struct pci_dev *rcec = rcec_data->rcec;
+ u8 nextbusn, lastbusn;
+ struct pci_bus *bus;
+ unsigned int bnr;
+
+ if (!rcec->rcec_ea)
+ return;
+
+ /* Walk own bus for bitmap based association */
+ pci_walk_bus(rcec->bus, cb, rcec_data);
+
+ nextbusn = rcec->rcec_ea->nextbusn;
+ lastbusn = rcec->rcec_ea->lastbusn;
+
+ /* All RCiEP devices are on the same bus as the RCEC */
+ if (nextbusn == 0xff && lastbusn == 0x00)
+ return;
+
+ for (bnr = nextbusn; bnr <= lastbusn; bnr++) {
+ /* No association indicated (PCIe 5.0-1, 7.9.10.3) */
+ if (bnr == rcec->bus->number)
+ continue;
+
+ bus = pci_find_bus(pci_domain_nr(rcec->bus), bnr);
+ if (!bus)
+ continue;
+
+ /* Find RCiEP devices on the given bus ranges */
+ pci_walk_bus(bus, cb, rcec_data);
+ }
+}
+
+/**
+ * pcie_link_rcec - Link RCiEP devices associating with RCEC.
+ * @rcec RCEC whose RCiEP devices should be linked.
+ *
+ * Link the given RCEC to each RCiEP device found.
+ */
+void pcie_link_rcec(struct pci_dev *rcec)
+{
+ struct walk_rcec_data rcec_data;
+
+ if (!rcec->rcec_ea)
+ return;
+
+ rcec_data.rcec = rcec;
+ rcec_data.user_callback = NULL;
+ rcec_data.user_data = NULL;
+
+ walk_rcec(link_rcec_helper, &rcec_data);
+}
+
int pci_rcec_init(struct pci_dev *dev)
{
struct rcec_ea *rcec_ea;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 2290439e8bc0..e546b16b13c1 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -330,6 +330,7 @@ struct pci_dev {
#endif
#ifdef CONFIG_PCIEPORTBUS
struct rcec_ea *rcec_ea; /* RCEC cached endpoint association */
+ struct pci_dev *rcec; /* Associated RCEC device */
#endif
u8 pcie_cap; /* PCIe capability offset */
u8 msi_cap; /* MSI capability offset */
--
2.28.0
From: Qiuxu Zhuo <[email protected]>
Attempt to do a function level reset for an RCiEP on fatal error.
Signed-off-by: Qiuxu Zhuo <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
---
drivers/pci/pcie/err.c | 31 ++++++++++++++++++++++---------
1 file changed, 22 insertions(+), 9 deletions(-)
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index c4ceca42a3bf..38abd7984996 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -169,6 +169,17 @@ static void pci_walk_bridge(struct pci_dev *bridge, int (*cb)(struct pci_dev *,
cb(bridge, userdata);
}
+static pci_ers_result_t flr_on_rciep(struct pci_dev *dev)
+{
+ if (!pcie_has_flr(dev))
+ return PCI_ERS_RESULT_DISCONNECT;
+
+ if (pcie_flr(dev))
+ return PCI_ERS_RESULT_DISCONNECT;
+
+ return PCI_ERS_RESULT_RECOVERED;
+}
+
pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_channel_state_t state,
pci_ers_result_t (*reset_subordinate_devices)(struct pci_dev *pdev))
@@ -195,15 +206,17 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
if (state == pci_channel_io_frozen) {
pci_walk_bridge(bridge, report_frozen_detected, &status);
if (type == PCI_EXP_TYPE_RC_END) {
- pci_warn(dev, "subordinate device reset not possible for RCiEP\n");
- status = PCI_ERS_RESULT_NONE;
- goto failed;
- }
-
- status = reset_subordinate_devices(bridge);
- if (status != PCI_ERS_RESULT_RECOVERED) {
- pci_warn(dev, "subordinate device reset failed\n");
- goto failed;
+ status = flr_on_rciep(dev);
+ if (status != PCI_ERS_RESULT_RECOVERED) {
+ pci_warn(dev, "function level reset failed\n");
+ goto failed;
+ }
+ } else {
+ status = reset_subordinate_devices(bridge);
+ if (status != PCI_ERS_RESULT_RECOVERED) {
+ pci_warn(dev, "subordinate device reset failed\n");
+ goto failed;
+ }
}
} else {
pci_walk_bridge(bridge, report_normal_detected, &status);
--
2.28.0
From: Qiuxu Zhuo <[email protected]>
If a Root Complex Integrated Endpoint (RCiEP) is implemented, errors may
optionally be sent to a corresponding Root Complex Event Collector (RCEC).
Each RCiEP must be associated with no more than one RCEC. Interface errors
are reported to the OS by RCECs.
For an RCEC (technically not a Bridge), error messages "received" from
associated RCiEPs must be enabled for "transmission" in order to cause a
System Error via the Root Control register or (when the Advanced Error
Reporting Capability is present) reporting via the Root Error Command
register and logging in the Root Error Status register and Error Source
Identification register.
Given the commonality with Root Ports and the need to also support AER
and PME services for RCECs, extend the Root Port driver to support RCEC
devices through the addition of the RCEC Class ID to the driver
structure.
Co-developed-by: Sean V Kelley <[email protected]>
Signed-off-by: Sean V Kelley <[email protected]>
Signed-off-by: Qiuxu Zhuo <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
---
drivers/pci/pcie/portdrv_pci.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 3a3ce40ae1ab..4d880679b9b1 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -106,7 +106,8 @@ static int pcie_portdrv_probe(struct pci_dev *dev,
if (!pci_is_pcie(dev) ||
((pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) &&
(pci_pcie_type(dev) != PCI_EXP_TYPE_UPSTREAM) &&
- (pci_pcie_type(dev) != PCI_EXP_TYPE_DOWNSTREAM)))
+ (pci_pcie_type(dev) != PCI_EXP_TYPE_DOWNSTREAM) &&
+ (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC)))
return -ENODEV;
status = pcie_port_device_register(dev);
@@ -195,6 +196,8 @@ static const struct pci_device_id port_pci_ids[] = {
{ PCI_DEVICE_CLASS(((PCI_CLASS_BRIDGE_PCI << 8) | 0x00), ~0) },
/* subtractive decode PCI-to-PCI bridge, class type is 060401h */
{ PCI_DEVICE_CLASS(((PCI_CLASS_BRIDGE_PCI << 8) | 0x01), ~0) },
+ /* handle any Root Complex Event Collector */
+ { PCI_DEVICE_CLASS(((PCI_CLASS_SYSTEM_RCEC << 8) | 0x00), ~0) },
{ },
};
--
2.28.0
From: Sean V Kelley <[email protected]>
A generic term such as "bridge" may be used for something with
a subordinate bus. The mix of ports would benefit from a use of
the term. Further clarity can be had in pcie_do_recovery()
with use of pci_upstream_bridge() in place of dev->bus->self.
Reverse the pcie_do_recovery() conditional logic and replace
use of "dev" with "bridge".
Suggested-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Sean V Kelley <[email protected]>
Acked-by: Jonathan Cameron <[email protected]>
---
drivers/pci/pcie/err.c | 28 +++++++++++++++++-----------
1 file changed, 17 insertions(+), 11 deletions(-)
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 950612342f1c..e68ea5243ff2 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -152,20 +152,26 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
{
pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
struct pci_bus *bus;
+ struct pci_dev *bridge;
+ int type;
/*
- * Error recovery runs on all subordinates of the first downstream port.
- * If the downstream port detected the error, it is cleared at the end.
+ * Error recovery runs on all subordinates of the first downstream
+ * bridge. If the downstream bridge detected the error, it is
+ * cleared at the end.
*/
- if (!(pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
- pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM))
- dev = dev->bus->self;
- bus = dev->subordinate;
-
+ type = pci_pcie_type(dev);
+ if (type == PCI_EXP_TYPE_ROOT_PORT ||
+ type == PCI_EXP_TYPE_DOWNSTREAM)
+ bridge = dev;
+ else
+ bridge = pci_upstream_bridge(dev);
+
+ bus = bridge->subordinate;
pci_dbg(dev, "broadcast error_detected message\n");
if (state == pci_channel_io_frozen) {
pci_walk_bus(bus, report_frozen_detected, &status);
- status = reset_subordinate_device(dev);
+ status = reset_subordinate_device(bridge);
if (status != PCI_ERS_RESULT_RECOVERED) {
pci_warn(dev, "subordinate device reset failed\n");
goto failed;
@@ -197,9 +203,9 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_dbg(dev, "broadcast resume message\n");
pci_walk_bus(bus, report_resume, &status);
- if (pcie_aer_is_native(dev))
- pcie_clear_device_status(dev);
- pci_aer_clear_nonfatal_status(dev);
+ if (pcie_aer_is_native(bridge))
+ pcie_clear_device_status(bridge);
+ pci_aer_clear_nonfatal_status(bridge);
pci_info(dev, "device recovery successful\n");
return status;
--
2.28.0
From: Sean V Kelley <[email protected]>
Root Complex Event Collectors (RCEC) appear as peers of Root Ports
and also have the PME capability. As with AER, there is a need to be
able to walk the RCiEPs associated with their RCEC for purposes of
acting upon them with callbacks. So add RCEC support through the use
of pcie_walk_rcec() to the current PME service driver and attach the
PME service driver to the RCEC device.
Co-developed-by: Qiuxu Zhuo <[email protected]>
Signed-off-by: Qiuxu Zhuo <[email protected]>
Signed-off-by: Sean V Kelley <[email protected]>
---
drivers/pci/pcie/pme.c | 15 +++++++++++----
drivers/pci/pcie/portdrv_core.c | 8 ++++----
2 files changed, 15 insertions(+), 8 deletions(-)
diff --git a/drivers/pci/pcie/pme.c b/drivers/pci/pcie/pme.c
index 6a32970bb731..87799166c96a 100644
--- a/drivers/pci/pcie/pme.c
+++ b/drivers/pci/pcie/pme.c
@@ -310,7 +310,10 @@ static int pcie_pme_can_wakeup(struct pci_dev *dev, void *ign)
static void pcie_pme_mark_devices(struct pci_dev *port)
{
pcie_pme_can_wakeup(port, NULL);
- if (port->subordinate)
+
+ if (pci_pcie_type(port) == PCI_EXP_TYPE_RC_EC)
+ pcie_walk_rcec(port, pcie_pme_can_wakeup, NULL);
+ else if (port->subordinate)
pci_walk_bus(port->subordinate, pcie_pme_can_wakeup, NULL);
}
@@ -320,10 +323,15 @@ static void pcie_pme_mark_devices(struct pci_dev *port)
*/
static int pcie_pme_probe(struct pcie_device *srv)
{
- struct pci_dev *port;
+ struct pci_dev *port = srv->port;
struct pcie_pme_service_data *data;
int ret;
+ /* Limit to Root Ports or Root Complex Event Collectors */
+ if ((pci_pcie_type(port) != PCI_EXP_TYPE_RC_EC) &&
+ (pci_pcie_type(port) != PCI_EXP_TYPE_ROOT_PORT))
+ return -ENODEV;
+
data = kzalloc(sizeof(*data), GFP_KERNEL);
if (!data)
return -ENOMEM;
@@ -333,7 +341,6 @@ static int pcie_pme_probe(struct pcie_device *srv)
data->srv = srv;
set_service_data(srv, data);
- port = srv->port;
pcie_pme_interrupt_enable(port, false);
pcie_clear_root_pme_status(port);
@@ -445,7 +452,7 @@ static void pcie_pme_remove(struct pcie_device *srv)
static struct pcie_port_service_driver pcie_pme_driver = {
.name = "pcie_pme",
- .port_type = PCI_EXP_TYPE_ROOT_PORT,
+ .port_type = PCIE_ANY_PORT,
.service = PCIE_PORT_SERVICE_PME,
.probe = pcie_pme_probe,
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 50a9522ab07d..99769c636775 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -234,11 +234,11 @@ static int get_port_device_capability(struct pci_dev *dev)
#endif
/*
- * Root ports are capable of generating PME too. Root Complex
- * Event Collectors can also generate PMEs, but we don't handle
- * those yet.
+ * Root ports and Root Complex Event Collectors are capable
+ * of generating PME.
*/
- if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT &&
+ if ((pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
+ pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC) &&
(pcie_ports_native || host->native_pme)) {
services |= PCIE_PORT_SERVICE_PME;
--
2.28.0
From: Qiuxu Zhuo <[email protected]>
When attempting error recovery for an RCiEP associated with an RCEC device,
there needs to be a way to update the Root Error Status, the Uncorrectable
Error Status and the Uncorrectable Error Severity of the parent RCEC.
In some non-native cases in which there is no OS visible device
associated with the RCiEP, there is nothing to act upon as the firmware
is acting before the OS. So add handling for the linked 'rcec' in AER/ERR
while taking into account non-native cases.
Co-developed-by: Sean V Kelley <[email protected]>
Signed-off-by: Sean V Kelley <[email protected]>
Signed-off-by: Qiuxu Zhuo <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
---
drivers/pci/pcie/aer.c | 9 +++++----
drivers/pci/pcie/err.c | 39 ++++++++++++++++++++++++++++-----------
2 files changed, 33 insertions(+), 15 deletions(-)
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 65dff5f3457a..dccdba60b5d9 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1358,17 +1358,18 @@ static int aer_probe(struct pcie_device *dev)
static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
{
int aer = dev->aer_cap;
+ int rc = 0;
u32 reg32;
- int rc;
-
/* Disable Root's interrupt in response to error messages */
pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32);
reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
- rc = pci_bus_error_reset(dev);
- pci_info(dev, "Root Port link has been reset\n");
+ if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC) {
+ rc = pci_bus_error_reset(dev);
+ pci_info(dev, "Root Port link has been reset\n");
+ }
/* Clear Root Error Status */
pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, ®32);
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 38abd7984996..956ad4c86d53 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -149,7 +149,8 @@ static int report_resume(struct pci_dev *dev, void *data)
/**
* pci_walk_bridge - walk bridges potentially AER affected
* @bridge bridge which may be an RCEC with associated RCiEPs,
- * an RCiEP associated with an RCEC, or a Port.
+ * or a Port.
+ * @dev an RCiEP lacking an associated RCEC.
* @cb callback to be called for each device found
* @userdata arbitrary pointer to be passed to callback.
*
@@ -160,13 +161,20 @@ static int report_resume(struct pci_dev *dev, void *data)
* If the device provided has no subordinate bus, call the provided
* callback on the device itself.
*/
-static void pci_walk_bridge(struct pci_dev *bridge, int (*cb)(struct pci_dev *, void *),
+static void pci_walk_bridge(struct pci_dev *bridge, struct pci_dev *dev,
+ int (*cb)(struct pci_dev *, void *),
void *userdata)
{
- if (bridge->subordinate)
+ /*
+ * In a non-native case where there is no OS-visible reporting
+ * device the bridge will be NULL, i.e., no RCEC, no PORT.
+ */
+ if (bridge && bridge->subordinate)
pci_walk_bus(bridge->subordinate, cb, userdata);
- else
+ else if (bridge)
cb(bridge, userdata);
+ else
+ cb(dev, userdata);
}
static pci_ers_result_t flr_on_rciep(struct pci_dev *dev)
@@ -196,16 +204,25 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
type = pci_pcie_type(dev);
if (type == PCI_EXP_TYPE_ROOT_PORT ||
type == PCI_EXP_TYPE_DOWNSTREAM ||
- type == PCI_EXP_TYPE_RC_EC ||
- type == PCI_EXP_TYPE_RC_END)
+ type == PCI_EXP_TYPE_RC_EC)
bridge = dev;
+ else if (type == PCI_EXP_TYPE_RC_END)
+ bridge = dev->rcec;
else
bridge = pci_upstream_bridge(dev);
pci_dbg(dev, "broadcast error_detected message\n");
if (state == pci_channel_io_frozen) {
- pci_walk_bridge(bridge, report_frozen_detected, &status);
+ pci_walk_bridge(bridge, dev, report_frozen_detected, &status);
if (type == PCI_EXP_TYPE_RC_END) {
+ /*
+ * The callback only clears the Root Error Status
+ * of the RCEC (see aer.c). Only perform this for the
+ * native case, i.e., an RCEC is present.
+ */
+ if (bridge)
+ reset_subordinate_devices(bridge);
+
status = flr_on_rciep(dev);
if (status != PCI_ERS_RESULT_RECOVERED) {
pci_warn(dev, "function level reset failed\n");
@@ -219,13 +236,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
}
}
} else {
- pci_walk_bridge(bridge, report_normal_detected, &status);
+ pci_walk_bridge(bridge, dev, report_normal_detected, &status);
}
if (status == PCI_ERS_RESULT_CAN_RECOVER) {
status = PCI_ERS_RESULT_RECOVERED;
pci_dbg(dev, "broadcast mmio_enabled message\n");
- pci_walk_bridge(bridge, report_mmio_enabled, &status);
+ pci_walk_bridge(bridge, dev, report_mmio_enabled, &status);
}
if (status == PCI_ERS_RESULT_NEED_RESET) {
@@ -236,14 +253,14 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
*/
status = PCI_ERS_RESULT_RECOVERED;
pci_dbg(dev, "broadcast slot_reset message\n");
- pci_walk_bridge(bridge, report_slot_reset, &status);
+ pci_walk_bridge(bridge, dev, report_slot_reset, &status);
}
if (status != PCI_ERS_RESULT_RECOVERED)
goto failed;
pci_dbg(dev, "broadcast resume message\n");
- pci_walk_bridge(bridge, report_resume, &status);
+ pci_walk_bridge(bridge, dev, report_resume, &status);
if (type == PCI_EXP_TYPE_ROOT_PORT ||
type == PCI_EXP_TYPE_DOWNSTREAM ||
--
2.28.0
From: Jonathan Cameron <[email protected]>
Currently the kernel does not handle AER errors for Root Complex
integrated End Points (RCiEPs)[0]. These devices sit on a root bus within
the Root Complex (RC). AER handling is performed by a Root Complex Event
Collector (RCEC) [1] which is a effectively a type of RCiEP on the same
root bus.
For an RCEC (technically not a Bridge), error messages "received" from
associated RCiEPs must be enabled for "transmission" in order to cause a
System Error via the Root Control register or (when the Advanced Error
Reporting Capability is present) reporting via the Root Error Command
register and logging in the Root Error Status register and Error Source
Identification register.
In addition to the defined OS level handling of the reset flow for the
associated RCiEPs of an RCEC, it is possible to also have non-native
handling. In that case there is no need to take any actions on the RCEC
because the firmware is responsible for them. This is true where APEI [2]
is used to report the AER errors via a GHES[v2] HEST entry [3] and
relevant AER CPER record [4] and non-native handling is in use.
We effectively end up with two different types of discovery for
purposes of handling AER errors:
1) Normal bus walk - we pass the downstream port above a bus to which
the device is attached and it walks everything below that point.
2) An RCiEP with no visible association with an RCEC as there is no need
to walk devices. In that case, the flow is to just call the callbacks for
the actual device, which in turn references its associated RCEC.
Modify pci_walk_bridge() to handle devices which lack a subordinate bus.
If the device does not then it will call the function on that device
alone.
[0] ACPI PCI Express Base Specification 5.0-1 1.3.2.3 Root Complex
Integrated Endpoint Rules.
[1] ACPI PCI Express Base Specification 5.0-1 6.2 Error Signalling and
Logging
[2] ACPI Specification 6.3 Chapter 18 ACPI Platform Error Interface (APEI)
[3] ACPI Specification 6.3 18.2.3.7 Generic Hardware Error Source
[4] UEFI Specification 2.8, N.2.7 PCI Express Error Section
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Sean V Kelley <[email protected]>
---
drivers/pci/pcie/err.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 5ff1afa4763d..c4ceca42a3bf 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -148,19 +148,25 @@ static int report_resume(struct pci_dev *dev, void *data)
/**
* pci_walk_bridge - walk bridges potentially AER affected
- * @bridge bridge which may be a Port.
+ * @bridge bridge which may be an RCEC with associated RCiEPs,
+ * an RCiEP associated with an RCEC, or a Port.
* @cb callback to be called for each device found
* @userdata arbitrary pointer to be passed to callback.
*
* If the device provided is a bridge, walk the subordinate bus,
* including any bridged devices on buses under this bus.
* Call the provided callback on each device found.
+ *
+ * If the device provided has no subordinate bus, call the provided
+ * callback on the device itself.
*/
static void pci_walk_bridge(struct pci_dev *bridge, int (*cb)(struct pci_dev *, void *),
void *userdata)
{
if (bridge->subordinate)
pci_walk_bus(bridge->subordinate, cb, userdata);
+ else
+ cb(bridge, userdata);
}
pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
@@ -174,11 +180,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
/*
* Error recovery runs on all subordinates of the first downstream
* bridge. If the downstream bridge detected the error, it is
- * cleared at the end.
+ * cleared at the end. For RCiEPs we should reset just the RCiEP itself.
*/
type = pci_pcie_type(dev);
if (type == PCI_EXP_TYPE_ROOT_PORT ||
- type == PCI_EXP_TYPE_DOWNSTREAM)
+ type == PCI_EXP_TYPE_DOWNSTREAM ||
+ type == PCI_EXP_TYPE_RC_EC ||
+ type == PCI_EXP_TYPE_RC_END)
bridge = dev;
else
bridge = pci_upstream_bridge(dev);
@@ -186,7 +194,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_dbg(dev, "broadcast error_detected message\n");
if (state == pci_channel_io_frozen) {
pci_walk_bridge(bridge, report_frozen_detected, &status);
- status = reset_subordinate_device(bridge);
+ if (type == PCI_EXP_TYPE_RC_END) {
+ pci_warn(dev, "subordinate device reset not possible for RCiEP\n");
+ status = PCI_ERS_RESULT_NONE;
+ goto failed;
+ }
+
+ status = reset_subordinate_devices(bridge);
if (status != PCI_ERS_RESULT_RECOVERED) {
pci_warn(dev, "subordinate device reset failed\n");
goto failed;
@@ -219,7 +233,8 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_walk_bridge(bridge, report_resume, &status);
if (type == PCI_EXP_TYPE_ROOT_PORT ||
- type == PCI_EXP_TYPE_DOWNSTREAM) {
+ type == PCI_EXP_TYPE_DOWNSTREAM ||
+ type == PCI_EXP_TYPE_RC_EC) {
if (pcie_aer_is_native(bridge))
pcie_clear_device_status(bridge);
pci_aer_clear_nonfatal_status(bridge);
--
2.28.0
From: Sean V Kelley <[email protected]>
In some cases a bridge may not exist as the hardware
controlling may be handled only by firmware and so is
not visible to the OS. This scenario is also possible
in future use cases involving non-native use of RCECs
by firmware. So explicitly apply conditional logic
around these resets by limiting them to root ports and
downstream ports.
Signed-off-by: Sean V Kelley <[email protected]>
Acked-by: Jonathan Cameron <[email protected]>
---
drivers/pci/pcie/err.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 9b2130725ab6..5ff1afa4763d 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -218,9 +218,12 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_dbg(dev, "broadcast resume message\n");
pci_walk_bridge(bridge, report_resume, &status);
- if (pcie_aer_is_native(bridge))
- pcie_clear_device_status(bridge);
- pci_aer_clear_nonfatal_status(bridge);
+ if (type == PCI_EXP_TYPE_ROOT_PORT ||
+ type == PCI_EXP_TYPE_DOWNSTREAM) {
+ if (pcie_aer_is_native(bridge))
+ pcie_clear_device_status(bridge);
+ pci_aer_clear_nonfatal_status(bridge);
+ }
pci_info(dev, "device recovery successful\n");
return status;
--
2.28.0
From: Sean V Kelley <[email protected]>
Root Complex Event Collectors (RCEC) appear as peers to Root Ports
and also have the AER capability. In addition, actions need to be taken
for associated RCiEPs. In such cases the RCECs will need to be walked in
order to find and act upon their respective RCiEPs. Extend the existing
ability to link the RCECs with a walking function pcie_walk_rcec(). Add
RCEC support to the current AER service driver and attach the AER service
driver to the RCEC device.
Co-developed-by: Qiuxu Zhuo <[email protected]>
Signed-off-by: Qiuxu Zhuo <[email protected]>
Signed-off-by: Sean V Kelley <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
---
drivers/pci/pci.h | 4 ++++
drivers/pci/pcie/aer.c | 27 ++++++++++++++++++++-------
drivers/pci/pcie/rcec.c | 37 +++++++++++++++++++++++++++++++++++++
3 files changed, 61 insertions(+), 7 deletions(-)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index ea5716d48b68..73fe09355e21 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -474,10 +474,14 @@ static inline void pci_dpc_init(struct pci_dev *pdev) {}
int pci_rcec_init(struct pci_dev *dev);
void pci_rcec_exit(struct pci_dev *dev);
void pcie_link_rcec(struct pci_dev *rcec);
+void pcie_walk_rcec(struct pci_dev *rcec, int (*cb)(struct pci_dev *, void *),
+ void *userdata);
#else
static inline int pci_rcec_init(struct pci_dev *dev) { return 0; }
static inline void pci_rcec_exit(struct pci_dev *dev) {}
static inline void pcie_link_rcec(struct pci_dev *rcec) {}
+static inline void pcie_walk_rcec(struct pci_dev *rcec, int (*cb)(struct pci_dev *, void *),
+ void *userdata) {}
#endif
#ifdef CONFIG_PCI_ATS
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index dccdba60b5d9..3cde646f71c0 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -300,7 +300,7 @@ int pci_aer_raw_clear_status(struct pci_dev *dev)
return -EIO;
port_type = pci_pcie_type(dev);
- if (port_type == PCI_EXP_TYPE_ROOT_PORT) {
+ if (port_type == PCI_EXP_TYPE_ROOT_PORT || port_type == PCI_EXP_TYPE_RC_EC) {
pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, &status);
pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, status);
}
@@ -595,7 +595,8 @@ static umode_t aer_stats_attrs_are_visible(struct kobject *kobj,
if ((a == &dev_attr_aer_rootport_total_err_cor.attr ||
a == &dev_attr_aer_rootport_total_err_fatal.attr ||
a == &dev_attr_aer_rootport_total_err_nonfatal.attr) &&
- pci_pcie_type(pdev) != PCI_EXP_TYPE_ROOT_PORT)
+ ((pci_pcie_type(pdev) != PCI_EXP_TYPE_ROOT_PORT) &&
+ (pci_pcie_type(pdev) != PCI_EXP_TYPE_RC_EC)))
return 0;
return a->mode;
@@ -916,7 +917,10 @@ static bool find_source_device(struct pci_dev *parent,
if (result)
return true;
- pci_walk_bus(parent->subordinate, find_device_iter, e_info);
+ if (pci_pcie_type(parent) == PCI_EXP_TYPE_RC_EC)
+ pcie_walk_rcec(parent, find_device_iter, e_info);
+ else
+ pci_walk_bus(parent->subordinate, find_device_iter, e_info);
if (!e_info->error_dev_num) {
pci_info(parent, "can't find device of ID%04x\n", e_info->id);
@@ -1053,6 +1057,7 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
if (!(info->status & ~info->mask))
return 0;
} else if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
+ pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC ||
pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM ||
info->severity == AER_NONFATAL) {
@@ -1205,6 +1210,7 @@ static int set_device_error_reporting(struct pci_dev *dev, void *data)
int type = pci_pcie_type(dev);
if ((type == PCI_EXP_TYPE_ROOT_PORT) ||
+ (type == PCI_EXP_TYPE_RC_EC) ||
(type == PCI_EXP_TYPE_UPSTREAM) ||
(type == PCI_EXP_TYPE_DOWNSTREAM)) {
if (enable)
@@ -1229,9 +1235,11 @@ static void set_downstream_devices_error_reporting(struct pci_dev *dev,
{
set_device_error_reporting(dev, &enable);
- if (!dev->subordinate)
- return;
- pci_walk_bus(dev->subordinate, set_device_error_reporting, &enable);
+ if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC)
+ pcie_walk_rcec(dev, set_device_error_reporting, &enable);
+ else if (dev->subordinate)
+ pci_walk_bus(dev->subordinate, set_device_error_reporting, &enable);
+
}
/**
@@ -1329,6 +1337,11 @@ static int aer_probe(struct pcie_device *dev)
struct device *device = &dev->device;
struct pci_dev *port = dev->port;
+ /* Limit to Root Ports or Root Complex Event Collectors */
+ if ((pci_pcie_type(port) != PCI_EXP_TYPE_RC_EC) &&
+ (pci_pcie_type(port) != PCI_EXP_TYPE_ROOT_PORT))
+ return -ENODEV;
+
rpc = devm_kzalloc(device, sizeof(struct aer_rpc), GFP_KERNEL);
if (!rpc)
return -ENOMEM;
@@ -1385,7 +1398,7 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
static struct pcie_port_service_driver aerdriver = {
.name = "aer",
- .port_type = PCI_EXP_TYPE_ROOT_PORT,
+ .port_type = PCIE_ANY_PORT,
.service = PCIE_PORT_SERVICE_AER,
.probe = aer_probe,
diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
index 9ba74d8064e9..a3ca3bbfac93 100644
--- a/drivers/pci/pcie/rcec.c
+++ b/drivers/pci/pcie/rcec.c
@@ -51,6 +51,17 @@ static int link_rcec_helper(struct pci_dev *dev, void *data)
return 0;
}
+static int walk_rcec_helper(struct pci_dev *dev, void *data)
+{
+ struct walk_rcec_data *rcec_data = data;
+ struct pci_dev *rcec = rcec_data->rcec;
+
+ if ((pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END) && rcec_assoc_rciep(rcec, dev))
+ rcec_data->user_callback(dev, rcec_data->user_data);
+
+ return 0;
+}
+
static void walk_rcec(int (*cb)(struct pci_dev *dev, void *data), void *userdata)
{
struct walk_rcec_data *rcec_data = userdata;
@@ -106,6 +117,32 @@ void pcie_link_rcec(struct pci_dev *rcec)
walk_rcec(link_rcec_helper, &rcec_data);
}
+/**
+ * pcie_walk_rcec - Walk RCiEP devices associating with RCEC and call callback.
+ * @rcec RCEC whose RCiEP devices should be walked.
+ * @cb Callback to be called for each RCiEP device found.
+ * @userdata Arbitrary pointer to be passed to callback.
+ *
+ * Walk the given RCEC. Call the provided callback on each RCiEP device found.
+ *
+ * We check the return of @cb each time. If it returns anything
+ * other than 0, we break out.
+ */
+void pcie_walk_rcec(struct pci_dev *rcec, int (*cb)(struct pci_dev *, void *),
+ void *userdata)
+{
+ struct walk_rcec_data rcec_data;
+
+ if (!rcec->rcec_ea)
+ return;
+
+ rcec_data.rcec = rcec;
+ rcec_data.user_callback = cb;
+ rcec_data.user_data = userdata;
+
+ walk_rcec(walk_rcec_helper, &rcec_data);
+}
+
int pci_rcec_init(struct pci_dev *dev)
{
struct rcec_ea *rcec_ea;
--
2.28.0
From: Qiuxu Zhuo <[email protected]>
The Root Complex Event Collectors (RCEC) appear as peers to Root Ports
and also have the AER capability. So add RCEC support to the current AER
error injection driver.
Signed-off-by: Qiuxu Zhuo <[email protected]>
Co-developed-by: Sean V Kelley <[email protected]>
Signed-off-by: Sean V Kelley <[email protected]>
---
drivers/pci/pcie/aer_inject.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pcie/aer_inject.c b/drivers/pci/pcie/aer_inject.c
index c2cbf425afc5..011a6c54b4e3 100644
--- a/drivers/pci/pcie/aer_inject.c
+++ b/drivers/pci/pcie/aer_inject.c
@@ -333,8 +333,11 @@ static int aer_inject(struct aer_error_inj *einj)
if (!dev)
return -ENODEV;
rpdev = pcie_find_root_port(dev);
+ /* If Root port not found, try to find an RCEC */
+ if (!rpdev)
+ rpdev = dev->rcec;
if (!rpdev) {
- pci_err(dev, "Root port not found\n");
+ pci_err(dev, "Neither root port nor RCEC found\n");
ret = -ENODEV;
goto out_put;
}
--
2.28.0
On Fri, Oct 02, 2020 at 11:47:21AM -0700, Sean V Kelley wrote:
> From: Sean V Kelley <[email protected]>
>
> Changes since v7 [1]:
>
> - No functional changes.
>
> - Reword bridge patch.
> - Noted testing below for #non-native/no RCEC case
> (Jonathan Cameron)
>
> - Separate out pci_walk_bus() into pci_walk_bridge() change.
> - Put remaining dev to bridge name changes in the separate patch from v7.
> (Bjorn Helgaas)
>
> [1] https://lore.kernel.org/lkml/[email protected]/
>
> Root Complex Event Collectors (RCEC) provide support for terminating error
> and PME messages from Root Complex Integrated Endpoints (RCiEPs). An RCEC
> resides on a Bus in the Root Complex. Multiple RCECs can in fact reside on
> a single bus. An RCEC will explicitly declare supported RCiEPs through the
> Root Complex Endpoint Association Extended Capability.
>
> (See PCIe 5.0-1, sections 1.3.2.3 (RCiEP), and 7.9.10 (RCEC Ext. Cap.))
>
> The kernel lacks handling for these RCECs and the error messages received
> from their respective associated RCiEPs. More recently, a new CPU
> interconnect, Compute eXpress Link (CXL) depends on RCEC capabilities for
> purposes of error messaging from CXL 1.1 supported RCiEP devices.
>
> DocLink: https://www.computeexpresslink.org/
>
> This use case is not limited to CXL. Existing hardware today includes
> support for RCECs, such as the Denverton microserver product
> family. Future hardware will be forthcoming.
>
> (See Intel Document, Order number: 33061-003US)
>
> So services such as AER or PME could be associated with an RCEC driver.
> In the case of CXL, if an RCiEP (i.e., CXL 1.1 device) is associated with a
> platform's RCEC it shall signal PME and AER error conditions through that
> RCEC.
>
> Towards the above use cases, add the missing RCEC class and extend the
> PCIe Root Port and service drivers to allow association of RCiEPs to their
> respective parent RCEC and facilitate handling of terminating error and PME
> messages.
>
> Tested-by: Jonathan Cameron <[email protected]> #non-native/no RCEC
>
>
> Jonathan Cameron (1):
> PCI/AER: Extend AER error handling to RCECs
>
> Qiuxu Zhuo (5):
> PCI/RCEC: Add RCEC class code and extended capability
> PCI/RCEC: Bind RCEC devices to the Root Port driver
> PCI/AER: Apply function level reset to RCiEP on fatal error
> PCI/RCEC: Add RCiEP's linked RCEC to AER/ERR
> PCI/AER: Add RCEC AER error injection support
>
> Sean V Kelley (8):
> PCI/RCEC: Cache RCEC capabilities in pci_init_capabilities()
> PCI/ERR: Rename reset_link() to reset_subordinate_device()
> PCI/ERR: Use "bridge" for clarity in pcie_do_recovery()
> PCI/ERR: Add pci_walk_bridge() to pcie_do_recovery()
> PCI/ERR: Limit AER resets in pcie_do_recovery()
> PCI/RCEC: Add pcie_link_rcec() to associate RCiEPs
> PCI/AER: Add pcie_walk_rcec() to RCEC AER handling
> PCI/PME: Add pcie_walk_rcec() to RCEC PME handling
>
> drivers/pci/pci.h | 25 ++++-
> drivers/pci/pcie/Makefile | 2 +-
> drivers/pci/pcie/aer.c | 36 ++++--
> drivers/pci/pcie/aer_inject.c | 5 +-
> drivers/pci/pcie/err.c | 109 +++++++++++++++----
> drivers/pci/pcie/pme.c | 15 ++-
> drivers/pci/pcie/portdrv_core.c | 8 +-
> drivers/pci/pcie/portdrv_pci.c | 8 +-
> drivers/pci/pcie/rcec.c | 187 ++++++++++++++++++++++++++++++++
> drivers/pci/probe.c | 2 +
> include/linux/pci.h | 5 +
> include/linux/pci_ids.h | 1 +
> include/uapi/linux/pci_regs.h | 7 ++
> 13 files changed, 367 insertions(+), 43 deletions(-)
> create mode 100644 drivers/pci/pcie/rcec.c
Thank you very much for your work and patience with this series!
Applied to pci/err for v5.10 with the following changes:
- Make pci_rcec_init() void since return value was unused.
- Reorder pci_rcec_init() so rcec_ea is filled in before publishing
in dev->rcec_ea.
- Split pcie_do_recovery() patches up a little more. My hope was to
make the "Use 'bridge' for clarity" patch more of a pure rename
patch and easier to review. Not sure I accomplished that.
- Log messages and uevents with "bridge", not "dev", in
pcie_do_recovery() to preserve previous behavior.
- Rename reset_subordinate_devices() to reset_subordinates() for
brevity.
- Fix kerneldoc issues (reported with "make W=1").
- Fix whitespace (lines didn't use the full width or > 80 columns,
etc).
Please let me know if I botched anything.
Hi Bjorn,
On 9 Oct 2020, at 10:57, Bjorn Helgaas wrote:
> On Fri, Oct 02, 2020 at 11:47:32AM -0700, Sean V Kelley wrote:
>> From: Qiuxu Zhuo <[email protected]>
>>
>> When attempting error recovery for an RCiEP associated with an RCEC
>> device,
>> there needs to be a way to update the Root Error Status, the
>> Uncorrectable
>> Error Status and the Uncorrectable Error Severity of the parent RCEC.
>> In some non-native cases in which there is no OS visible device
>> associated with the RCiEP, there is nothing to act upon as the
>> firmware
>> is acting before the OS. So add handling for the linked 'rcec' in
>> AER/ERR
>> while taking into account non-native cases.
>>
>> Co-developed-by: Sean V Kelley <[email protected]>
>> Signed-off-by: Sean V Kelley <[email protected]>
>> Signed-off-by: Qiuxu Zhuo <[email protected]>
>> Reviewed-by: Jonathan Cameron <[email protected]>
>> ---
>> drivers/pci/pcie/aer.c | 9 +++++----
>> drivers/pci/pcie/err.c | 39 ++++++++++++++++++++++++++++-----------
>> 2 files changed, 33 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>> index 65dff5f3457a..dccdba60b5d9 100644
>> --- a/drivers/pci/pcie/aer.c
>> +++ b/drivers/pci/pcie/aer.c
>> @@ -1358,17 +1358,18 @@ static int aer_probe(struct pcie_device *dev)
>> static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
>> {
>> int aer = dev->aer_cap;
>> + int rc = 0;
>> u32 reg32;
>> - int rc;
>> -
>>
>> /* Disable Root's interrupt in response to error messages */
>> pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32);
>> reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
>> pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
>>
>> - rc = pci_bus_error_reset(dev);
>> - pci_info(dev, "Root Port link has been reset\n");
>> + if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC) {
>> + rc = pci_bus_error_reset(dev);
>> + pci_info(dev, "Root Port link has been reset\n");
>> + }
>>
>> /* Clear Root Error Status */
>> pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, ®32);
>> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
>> index 38abd7984996..956ad4c86d53 100644
>> --- a/drivers/pci/pcie/err.c
>> +++ b/drivers/pci/pcie/err.c
>> @@ -149,7 +149,8 @@ static int report_resume(struct pci_dev *dev,
>> void *data)
>> /**
>> * pci_walk_bridge - walk bridges potentially AER affected
>> * @bridge bridge which may be an RCEC with associated RCiEPs,
>> - * an RCiEP associated with an RCEC, or a Port.
>> + * or a Port.
>> + * @dev an RCiEP lacking an associated RCEC.
>> * @cb callback to be called for each device found
>> * @userdata arbitrary pointer to be passed to callback.
>> *
>> @@ -160,13 +161,20 @@ static int report_resume(struct pci_dev *dev,
>> void *data)
>> * If the device provided has no subordinate bus, call the provided
>> * callback on the device itself.
>> */
>> -static void pci_walk_bridge(struct pci_dev *bridge, int (*cb)(struct
>> pci_dev *, void *),
>> +static void pci_walk_bridge(struct pci_dev *bridge, struct pci_dev
>> *dev,
>> + int (*cb)(struct pci_dev *, void *),
>> void *userdata)
>> {
>> - if (bridge->subordinate)
>> + /*
>> + * In a non-native case where there is no OS-visible reporting
>> + * device the bridge will be NULL, i.e., no RCEC, no PORT.
>> + */
>> + if (bridge && bridge->subordinate)
>> pci_walk_bus(bridge->subordinate, cb, userdata);
>> - else
>> + else if (bridge)
>> cb(bridge, userdata);
>> + else
>> + cb(dev, userdata);
>> }
>>
>> static pci_ers_result_t flr_on_rciep(struct pci_dev *dev)
>> @@ -196,16 +204,25 @@ pci_ers_result_t pcie_do_recovery(struct
>> pci_dev *dev,
>> type = pci_pcie_type(dev);
>> if (type == PCI_EXP_TYPE_ROOT_PORT ||
>> type == PCI_EXP_TYPE_DOWNSTREAM ||
>> - type == PCI_EXP_TYPE_RC_EC ||
>> - type == PCI_EXP_TYPE_RC_END)
>> + type == PCI_EXP_TYPE_RC_EC)
>> bridge = dev;
>> + else if (type == PCI_EXP_TYPE_RC_END)
>> + bridge = dev->rcec;
>> else
>> bridge = pci_upstream_bridge(dev);
>>
>> pci_dbg(dev, "broadcast error_detected message\n");
>> if (state == pci_channel_io_frozen) {
>> - pci_walk_bridge(bridge, report_frozen_detected, &status);
>> + pci_walk_bridge(bridge, dev, report_frozen_detected, &status);
>> if (type == PCI_EXP_TYPE_RC_END) {
>> + /*
>> + * The callback only clears the Root Error Status
>> + * of the RCEC (see aer.c). Only perform this for the
>> + * native case, i.e., an RCEC is present.
>> + */
>> + if (bridge)
>> + reset_subordinate_devices(bridge);
>
> Help me understand this. There are lots of callbacks in this picture,
> but I guess this "callback only clears Root Error Status" must refer
> to aer_root_reset(), i.e., the reset_subordinate_devices pointer?
The ‘bridge’ in this case will always be dev->rcec, the event
collector for the associated RC_END. And that’s what’s being cleared
in aer.c via aer_root_reset() as the callback. It’s also being checked
for native/non-native here.
>
> Of course, the caller of pcie_do_recovery() supplied that pointer.
> And we can infer that it must be aer_root_reset(), not
> dpc_reset_link(), because RCECs and RCiEPs are not allowed to
> implement DPC.
Correct.
>
> I wish we didn't have either this assumption about what
> reset_subordinate_devices points to, or the assumption about what
> aer_root_reset() does. They both seem a little bit tenuous.
Agree. It’s the relationship between the RC_END and the RC_EC.
>
> We already made aer_root_reset() smart enough to check for RCECs. Can
> we put the FLR there, too? Then we wouldn't have this weird situation
> where reset_subordinate_devices() does a reset and clears error
> status, EXCEPT for this case where it only clears error status and we
> do the reset here?
We could add the smarts to aer_root_reset() to check for an RC_END in
that callback and perform the clear there on its RC_EC. We just
wouldn’t map ‘bridge’ to dev->rcec for RC_END in
pcie_do_recovery() which would simplify things.
Further, the FLR in the case of flr_on_rciep() below is specific to the
RCiEP itself. So it could be performed either in aer_root_reset() or
remain in the pcie_do_recovery().
That should work.
Sean
>
>> status = flr_on_rciep(dev);
>> if (status != PCI_ERS_RESULT_RECOVERED) {
>> pci_warn(dev, "function level reset failed\n");
>> @@ -219,13 +236,13 @@ pci_ers_result_t pcie_do_recovery(struct
>> pci_dev *dev,
>> }
>> }
>> } else {
>> - pci_walk_bridge(bridge, report_normal_detected, &status);
>> + pci_walk_bridge(bridge, dev, report_normal_detected, &status);
>> }
>>
>> if (status == PCI_ERS_RESULT_CAN_RECOVER) {
>> status = PCI_ERS_RESULT_RECOVERED;
>> pci_dbg(dev, "broadcast mmio_enabled message\n");
>> - pci_walk_bridge(bridge, report_mmio_enabled, &status);
>> + pci_walk_bridge(bridge, dev, report_mmio_enabled, &status);
>> }
>>
>> if (status == PCI_ERS_RESULT_NEED_RESET) {
>> @@ -236,14 +253,14 @@ pci_ers_result_t pcie_do_recovery(struct
>> pci_dev *dev,
>> */
>> status = PCI_ERS_RESULT_RECOVERED;
>> pci_dbg(dev, "broadcast slot_reset message\n");
>> - pci_walk_bridge(bridge, report_slot_reset, &status);
>> + pci_walk_bridge(bridge, dev, report_slot_reset, &status);
>> }
>>
>> if (status != PCI_ERS_RESULT_RECOVERED)
>> goto failed;
>>
>> pci_dbg(dev, "broadcast resume message\n");
>> - pci_walk_bridge(bridge, report_resume, &status);
>> + pci_walk_bridge(bridge, dev, report_resume, &status);
>>
>> if (type == PCI_EXP_TYPE_ROOT_PORT ||
>> type == PCI_EXP_TYPE_DOWNSTREAM ||
>> --
>> 2.28.0
>>
On 9 Oct 2020, at 11:34, Sean V Kelley wrote:
> On 9 Oct 2020, at 11:26, Sean V Kelley wrote:
>
>> Hi Bjorn,
>>
>> On 9 Oct 2020, at 10:57, Bjorn Helgaas wrote:
>>
>>> On Fri, Oct 02, 2020 at 11:47:32AM -0700, Sean V Kelley wrote:
>>>> From: Qiuxu Zhuo <[email protected]>
>>>>
>>>> When attempting error recovery for an RCiEP associated with an RCEC
>>>> device,
>>>> there needs to be a way to update the Root Error Status, the
>>>> Uncorrectable
>>>> Error Status and the Uncorrectable Error Severity of the parent
>>>> RCEC.
>>>> In some non-native cases in which there is no OS visible device
>>>> associated with the RCiEP, there is nothing to act upon as the
>>>> firmware
>>>> is acting before the OS. So add handling for the linked 'rcec' in
>>>> AER/ERR
>>>> while taking into account non-native cases.
>>>>
>>>> Co-developed-by: Sean V Kelley <[email protected]>
>>>> Signed-off-by: Sean V Kelley <[email protected]>
>>>> Signed-off-by: Qiuxu Zhuo <[email protected]>
>>>> Reviewed-by: Jonathan Cameron <[email protected]>
>>>> ---
>>>> drivers/pci/pcie/aer.c | 9 +++++----
>>>> drivers/pci/pcie/err.c | 39
>>>> ++++++++++++++++++++++++++++-----------
>>>> 2 files changed, 33 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>>> index 65dff5f3457a..dccdba60b5d9 100644
>>>> --- a/drivers/pci/pcie/aer.c
>>>> +++ b/drivers/pci/pcie/aer.c
>>>> @@ -1358,17 +1358,18 @@ static int aer_probe(struct pcie_device
>>>> *dev)
>>>> static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
>>>> {
>>>> int aer = dev->aer_cap;
>>>> + int rc = 0;
>>>> u32 reg32;
>>>> - int rc;
>>>> -
>>>>
>>>> /* Disable Root's interrupt in response to error messages */
>>>> pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32);
>>>> reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
>>>> pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
>>>>
>>>> - rc = pci_bus_error_reset(dev);
>>>> - pci_info(dev, "Root Port link has been reset\n");
>>>> + if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC) {
>>>> + rc = pci_bus_error_reset(dev);
>>>> + pci_info(dev, "Root Port link has been reset\n");
>>>> + }
>>>>
>>>> /* Clear Root Error Status */
>>>> pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, ®32);
>>>> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
>>>> index 38abd7984996..956ad4c86d53 100644
>>>> --- a/drivers/pci/pcie/err.c
>>>> +++ b/drivers/pci/pcie/err.c
>>>> @@ -149,7 +149,8 @@ static int report_resume(struct pci_dev *dev,
>>>> void *data)
>>>> /**
>>>> * pci_walk_bridge - walk bridges potentially AER affected
>>>> * @bridge bridge which may be an RCEC with associated RCiEPs,
>>>> - * an RCiEP associated with an RCEC, or a Port.
>>>> + * or a Port.
>>>> + * @dev an RCiEP lacking an associated RCEC.
>>>> * @cb callback to be called for each device found
>>>> * @userdata arbitrary pointer to be passed to callback.
>>>> *
>>>> @@ -160,13 +161,20 @@ static int report_resume(struct pci_dev *dev,
>>>> void *data)
>>>> * If the device provided has no subordinate bus, call the
>>>> provided
>>>> * callback on the device itself.
>>>> */
>>>> -static void pci_walk_bridge(struct pci_dev *bridge, int
>>>> (*cb)(struct pci_dev *, void *),
>>>> +static void pci_walk_bridge(struct pci_dev *bridge, struct pci_dev
>>>> *dev,
>>>> + int (*cb)(struct pci_dev *, void *),
>>>> void *userdata)
>>>> {
>>>> - if (bridge->subordinate)
>>>> + /*
>>>> + * In a non-native case where there is no OS-visible reporting
>>>> + * device the bridge will be NULL, i.e., no RCEC, no PORT.
>>>> + */
>>>> + if (bridge && bridge->subordinate)
>>>> pci_walk_bus(bridge->subordinate, cb, userdata);
>>>> - else
>>>> + else if (bridge)
>>>> cb(bridge, userdata);
>>>> + else
>>>> + cb(dev, userdata);
>>>> }
>>>>
>>>> static pci_ers_result_t flr_on_rciep(struct pci_dev *dev)
>>>> @@ -196,16 +204,25 @@ pci_ers_result_t pcie_do_recovery(struct
>>>> pci_dev *dev,
>>>> type = pci_pcie_type(dev);
>>>> if (type == PCI_EXP_TYPE_ROOT_PORT ||
>>>> type == PCI_EXP_TYPE_DOWNSTREAM ||
>>>> - type == PCI_EXP_TYPE_RC_EC ||
>>>> - type == PCI_EXP_TYPE_RC_END)
>>>> + type == PCI_EXP_TYPE_RC_EC)
>>>> bridge = dev;
>>>> + else if (type == PCI_EXP_TYPE_RC_END)
>>>> + bridge = dev->rcec;
>>>> else
>>>> bridge = pci_upstream_bridge(dev);
>>>>
>>>> pci_dbg(dev, "broadcast error_detected message\n");
>>>> if (state == pci_channel_io_frozen) {
>>>> - pci_walk_bridge(bridge, report_frozen_detected, &status);
>>>> + pci_walk_bridge(bridge, dev, report_frozen_detected, &status);
>>>> if (type == PCI_EXP_TYPE_RC_END) {
>>>> + /*
>>>> + * The callback only clears the Root Error Status
>>>> + * of the RCEC (see aer.c). Only perform this for the
>>>> + * native case, i.e., an RCEC is present.
>>>> + */
>>>> + if (bridge)
>>>> + reset_subordinate_devices(bridge);
>>>
>>> Help me understand this. There are lots of callbacks in this
>>> picture,
>>> but I guess this "callback only clears Root Error Status" must refer
>>> to aer_root_reset(), i.e., the reset_subordinate_devices pointer?
>>
>> The ‘bridge’ in this case will always be dev->rcec, the event
>> collector for the associated RC_END. And that’s what’s being
>> cleared in aer.c via aer_root_reset() as the callback. It’s also
>> being checked for native/non-native here.
>>
>>>
>>> Of course, the caller of pcie_do_recovery() supplied that pointer.
>>> And we can infer that it must be aer_root_reset(), not
>>> dpc_reset_link(), because RCECs and RCiEPs are not allowed to
>>> implement DPC.
>>
>> Correct.
>>
>>>
>>> I wish we didn't have either this assumption about what
>>> reset_subordinate_devices points to, or the assumption about what
>>> aer_root_reset() does. They both seem a little bit tenuous.
>>
>> Agree. It’s the relationship between the RC_END and the RC_EC.
>>
>>>
>>> We already made aer_root_reset() smart enough to check for RCECs.
>>> Can
>>> we put the FLR there, too? Then we wouldn't have this weird
>>> situation
>>> where reset_subordinate_devices() does a reset and clears error
>>> status, EXCEPT for this case where it only clears error status and
>>> we
>>> do the reset here?
>>
>> We could add the smarts to aer_root_reset() to check for an RC_END in
>> that callback and perform the clear there on its RC_EC. We just
>> wouldn’t map ‘bridge’ to dev->rcec for RC_END in
>> pcie_do_recovery() which would simplify things.
>>
>> Further, the FLR in the case of flr_on_rciep() below is specific to
>> the RCiEP itself. So it could be performed either in aer_root_reset()
>> or remain in the pcie_do_recovery().
>>
>> That should work.
>>
>> Sean
>
> Thinking more on this, you could still pass dev->rcec to the callback
> (eventually aer_root_reset()), but you won’t have the ability to
> handle the FLR there without the pointer to the RC_END. That’s why I
> suggested passing the RC_END and checking for its RC_EC in
> aer_root_reset() if you want to also handle FLR there too from below.
Where you get into trouble is that you don’t want to do anything in
the non-native case and shouldn’t be going to the callback. So this
would add complexity to aer_root_reset() for checking on dev->rcec ==
NULL…where before you never would have invoked
reset_subordinate_devices() in the non-native case.
>
> Sean
>
>>
>>
>>>
>>>> status = flr_on_rciep(dev);
>>>> if (status != PCI_ERS_RESULT_RECOVERED) {
>>>> pci_warn(dev, "function level reset failed\n");
>>>> @@ -219,13 +236,13 @@ pci_ers_result_t pcie_do_recovery(struct
>>>> pci_dev *dev,
>>>> }
>>>> }
>>>> } else {
>>>> - pci_walk_bridge(bridge, report_normal_detected, &status);
>>>> + pci_walk_bridge(bridge, dev, report_normal_detected, &status);
>>>> }
>>>>
>>>> if (status == PCI_ERS_RESULT_CAN_RECOVER) {
>>>> status = PCI_ERS_RESULT_RECOVERED;
>>>> pci_dbg(dev, "broadcast mmio_enabled message\n");
>>>> - pci_walk_bridge(bridge, report_mmio_enabled, &status);
>>>> + pci_walk_bridge(bridge, dev, report_mmio_enabled, &status);
>>>> }
>>>>
>>>> if (status == PCI_ERS_RESULT_NEED_RESET) {
>>>> @@ -236,14 +253,14 @@ pci_ers_result_t pcie_do_recovery(struct
>>>> pci_dev *dev,
>>>> */
>>>> status = PCI_ERS_RESULT_RECOVERED;
>>>> pci_dbg(dev, "broadcast slot_reset message\n");
>>>> - pci_walk_bridge(bridge, report_slot_reset, &status);
>>>> + pci_walk_bridge(bridge, dev, report_slot_reset, &status);
>>>> }
>>>>
>>>> if (status != PCI_ERS_RESULT_RECOVERED)
>>>> goto failed;
>>>>
>>>> pci_dbg(dev, "broadcast resume message\n");
>>>> - pci_walk_bridge(bridge, report_resume, &status);
>>>> + pci_walk_bridge(bridge, dev, report_resume, &status);
>>>>
>>>> if (type == PCI_EXP_TYPE_ROOT_PORT ||
>>>> type == PCI_EXP_TYPE_DOWNSTREAM ||
>>>> --
>>>> 2.28.0
>>>>
On 9 Oct 2020, at 11:53, Sean V Kelley wrote:
> On 9 Oct 2020, at 11:34, Sean V Kelley wrote:
>
>> On 9 Oct 2020, at 11:26, Sean V Kelley wrote:
>>
>>> Hi Bjorn,
>>>
>>> On 9 Oct 2020, at 10:57, Bjorn Helgaas wrote:
>>>
>>>> On Fri, Oct 02, 2020 at 11:47:32AM -0700, Sean V Kelley wrote:
>>>>> From: Qiuxu Zhuo <[email protected]>
>>>>>
>>>>> When attempting error recovery for an RCiEP associated with an
>>>>> RCEC device,
>>>>> there needs to be a way to update the Root Error Status, the
>>>>> Uncorrectable
>>>>> Error Status and the Uncorrectable Error Severity of the parent
>>>>> RCEC.
>>>>> In some non-native cases in which there is no OS visible device
>>>>> associated with the RCiEP, there is nothing to act upon as the
>>>>> firmware
>>>>> is acting before the OS. So add handling for the linked 'rcec' in
>>>>> AER/ERR
>>>>> while taking into account non-native cases.
>>>>>
>>>>> Co-developed-by: Sean V Kelley <[email protected]>
>>>>> Signed-off-by: Sean V Kelley <[email protected]>
>>>>> Signed-off-by: Qiuxu Zhuo <[email protected]>
>>>>> Reviewed-by: Jonathan Cameron <[email protected]>
>>>>> ---
>>>>> drivers/pci/pcie/aer.c | 9 +++++----
>>>>> drivers/pci/pcie/err.c | 39
>>>>> ++++++++++++++++++++++++++++-----------
>>>>> 2 files changed, 33 insertions(+), 15 deletions(-)
>>>>>
>>>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>>>> index 65dff5f3457a..dccdba60b5d9 100644
>>>>> --- a/drivers/pci/pcie/aer.c
>>>>> +++ b/drivers/pci/pcie/aer.c
>>>>> @@ -1358,17 +1358,18 @@ static int aer_probe(struct pcie_device
>>>>> *dev)
>>>>> static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
>>>>> {
>>>>> int aer = dev->aer_cap;
>>>>> + int rc = 0;
>>>>> u32 reg32;
>>>>> - int rc;
>>>>> -
>>>>>
>>>>> /* Disable Root's interrupt in response to error messages */
>>>>> pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32);
>>>>> reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
>>>>> pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
>>>>>
>>>>> - rc = pci_bus_error_reset(dev);
>>>>> - pci_info(dev, "Root Port link has been reset\n");
>>>>> + if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC) {
>>>>> + rc = pci_bus_error_reset(dev);
>>>>> + pci_info(dev, "Root Port link has been reset\n");
>>>>> + }
>>>>>
>>>>> /* Clear Root Error Status */
>>>>> pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, ®32);
>>>>> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
>>>>> index 38abd7984996..956ad4c86d53 100644
>>>>> --- a/drivers/pci/pcie/err.c
>>>>> +++ b/drivers/pci/pcie/err.c
>>>>> @@ -149,7 +149,8 @@ static int report_resume(struct pci_dev *dev,
>>>>> void *data)
>>>>> /**
>>>>> * pci_walk_bridge - walk bridges potentially AER affected
>>>>> * @bridge bridge which may be an RCEC with associated RCiEPs,
>>>>> - * an RCiEP associated with an RCEC, or a Port.
>>>>> + * or a Port.
>>>>> + * @dev an RCiEP lacking an associated RCEC.
>>>>> * @cb callback to be called for each device found
>>>>> * @userdata arbitrary pointer to be passed to callback.
>>>>> *
>>>>> @@ -160,13 +161,20 @@ static int report_resume(struct pci_dev
>>>>> *dev, void *data)
>>>>> * If the device provided has no subordinate bus, call the
>>>>> provided
>>>>> * callback on the device itself.
>>>>> */
>>>>> -static void pci_walk_bridge(struct pci_dev *bridge, int
>>>>> (*cb)(struct pci_dev *, void *),
>>>>> +static void pci_walk_bridge(struct pci_dev *bridge, struct
>>>>> pci_dev *dev,
>>>>> + int (*cb)(struct pci_dev *, void *),
>>>>> void *userdata)
>>>>> {
>>>>> - if (bridge->subordinate)
>>>>> + /*
>>>>> + * In a non-native case where there is no OS-visible reporting
>>>>> + * device the bridge will be NULL, i.e., no RCEC, no PORT.
>>>>> + */
>>>>> + if (bridge && bridge->subordinate)
>>>>> pci_walk_bus(bridge->subordinate, cb, userdata);
>>>>> - else
>>>>> + else if (bridge)
>>>>> cb(bridge, userdata);
>>>>> + else
>>>>> + cb(dev, userdata);
>>>>> }
>>>>>
>>>>> static pci_ers_result_t flr_on_rciep(struct pci_dev *dev)
>>>>> @@ -196,16 +204,25 @@ pci_ers_result_t pcie_do_recovery(struct
>>>>> pci_dev *dev,
>>>>> type = pci_pcie_type(dev);
>>>>> if (type == PCI_EXP_TYPE_ROOT_PORT ||
>>>>> type == PCI_EXP_TYPE_DOWNSTREAM ||
>>>>> - type == PCI_EXP_TYPE_RC_EC ||
>>>>> - type == PCI_EXP_TYPE_RC_END)
>>>>> + type == PCI_EXP_TYPE_RC_EC)
>>>>> bridge = dev;
>>>>> + else if (type == PCI_EXP_TYPE_RC_END)
>>>>> + bridge = dev->rcec;
>>>>> else
>>>>> bridge = pci_upstream_bridge(dev);
>>>>>
>>>>> pci_dbg(dev, "broadcast error_detected message\n");
>>>>> if (state == pci_channel_io_frozen) {
>>>>> - pci_walk_bridge(bridge, report_frozen_detected, &status);
>>>>> + pci_walk_bridge(bridge, dev, report_frozen_detected, &status);
>>>>> if (type == PCI_EXP_TYPE_RC_END) {
>>>>> + /*
>>>>> + * The callback only clears the Root Error Status
>>>>> + * of the RCEC (see aer.c). Only perform this for the
>>>>> + * native case, i.e., an RCEC is present.
>>>>> + */
>>>>> + if (bridge)
>>>>> + reset_subordinate_devices(bridge);
>>>>
>>>> Help me understand this. There are lots of callbacks in this
>>>> picture,
>>>> but I guess this "callback only clears Root Error Status" must
>>>> refer
>>>> to aer_root_reset(), i.e., the reset_subordinate_devices pointer?
>>>
>>> The ‘bridge’ in this case will always be dev->rcec, the event
>>> collector for the associated RC_END. And that’s what’s being
>>> cleared in aer.c via aer_root_reset() as the callback. It’s also
>>> being checked for native/non-native here.
>>>
>>>>
>>>> Of course, the caller of pcie_do_recovery() supplied that pointer.
>>>> And we can infer that it must be aer_root_reset(), not
>>>> dpc_reset_link(), because RCECs and RCiEPs are not allowed to
>>>> implement DPC.
>>>
>>> Correct.
>>>
>>>>
>>>> I wish we didn't have either this assumption about what
>>>> reset_subordinate_devices points to, or the assumption about what
>>>> aer_root_reset() does. They both seem a little bit tenuous.
>>>
>>> Agree. It’s the relationship between the RC_END and the RC_EC.
>>>
>>>>
>>>> We already made aer_root_reset() smart enough to check for RCECs.
>>>> Can
>>>> we put the FLR there, too? Then we wouldn't have this weird
>>>> situation
>>>> where reset_subordinate_devices() does a reset and clears error
>>>> status, EXCEPT for this case where it only clears error status and
>>>> we
>>>> do the reset here?
>>>
>>> We could add the smarts to aer_root_reset() to check for an RC_END
>>> in that callback and perform the clear there on its RC_EC. We just
>>> wouldn’t map ‘bridge’ to dev->rcec for RC_END in
>>> pcie_do_recovery() which would simplify things.
>>>
>>> Further, the FLR in the case of flr_on_rciep() below is specific to
>>> the RCiEP itself. So it could be performed either in
>>> aer_root_reset() or remain in the pcie_do_recovery().
>>>
>>> That should work.
>>>
>>> Sean
>>
>> Thinking more on this, you could still pass dev->rcec to the callback
>> (eventually aer_root_reset()), but you won’t have the ability to
>> handle the FLR there without the pointer to the RC_END. That’s why
>> I suggested passing the RC_END and checking for its RC_EC in
>> aer_root_reset() if you want to also handle FLR there too from below.
>
> Where you get into trouble is that you don’t want to do anything in
> the non-native case and shouldn’t be going to the callback. So this
> would add complexity to aer_root_reset() for checking on dev->rcec ==
> NULL…where before you never would have invoked
> reset_subordinate_devices() in the non-native case.
>
>
Unmapping of bridge to dev->rcec in pcie_do_recovery() also causes
problems for pci_walk_bridge() which pays attention to the non-native
case, even making available the dev for when there is no ‘bridge’
like device.
I think we may be just shifting things elsewhere despite the constraints
remaining the same. So while it should work, I believe that I prefer the
current approach.
Sean
>>
>> Sean
>>
>>>
>>>
>>>>
>>>>> status = flr_on_rciep(dev);
>>>>> if (status != PCI_ERS_RESULT_RECOVERED) {
>>>>> pci_warn(dev, "function level reset failed\n");
>>>>> @@ -219,13 +236,13 @@ pci_ers_result_t pcie_do_recovery(struct
>>>>> pci_dev *dev,
>>>>> }
>>>>> }
>>>>> } else {
>>>>> - pci_walk_bridge(bridge, report_normal_detected, &status);
>>>>> + pci_walk_bridge(bridge, dev, report_normal_detected, &status);
>>>>> }
>>>>>
>>>>> if (status == PCI_ERS_RESULT_CAN_RECOVER) {
>>>>> status = PCI_ERS_RESULT_RECOVERED;
>>>>> pci_dbg(dev, "broadcast mmio_enabled message\n");
>>>>> - pci_walk_bridge(bridge, report_mmio_enabled, &status);
>>>>> + pci_walk_bridge(bridge, dev, report_mmio_enabled, &status);
>>>>> }
>>>>>
>>>>> if (status == PCI_ERS_RESULT_NEED_RESET) {
>>>>> @@ -236,14 +253,14 @@ pci_ers_result_t pcie_do_recovery(struct
>>>>> pci_dev *dev,
>>>>> */
>>>>> status = PCI_ERS_RESULT_RECOVERED;
>>>>> pci_dbg(dev, "broadcast slot_reset message\n");
>>>>> - pci_walk_bridge(bridge, report_slot_reset, &status);
>>>>> + pci_walk_bridge(bridge, dev, report_slot_reset, &status);
>>>>> }
>>>>>
>>>>> if (status != PCI_ERS_RESULT_RECOVERED)
>>>>> goto failed;
>>>>>
>>>>> pci_dbg(dev, "broadcast resume message\n");
>>>>> - pci_walk_bridge(bridge, report_resume, &status);
>>>>> + pci_walk_bridge(bridge, dev, report_resume, &status);
>>>>>
>>>>> if (type == PCI_EXP_TYPE_ROOT_PORT ||
>>>>> type == PCI_EXP_TYPE_DOWNSTREAM ||
>>>>> --
>>>>> 2.28.0
>>>>>
On Fri, Oct 02, 2020 at 11:47:32AM -0700, Sean V Kelley wrote:
> From: Qiuxu Zhuo <[email protected]>
>
> When attempting error recovery for an RCiEP associated with an RCEC device,
> there needs to be a way to update the Root Error Status, the Uncorrectable
> Error Status and the Uncorrectable Error Severity of the parent RCEC.
> In some non-native cases in which there is no OS visible device
> associated with the RCiEP, there is nothing to act upon as the firmware
> is acting before the OS. So add handling for the linked 'rcec' in AER/ERR
> while taking into account non-native cases.
>
> Co-developed-by: Sean V Kelley <[email protected]>
> Signed-off-by: Sean V Kelley <[email protected]>
> Signed-off-by: Qiuxu Zhuo <[email protected]>
> Reviewed-by: Jonathan Cameron <[email protected]>
> ---
> drivers/pci/pcie/aer.c | 9 +++++----
> drivers/pci/pcie/err.c | 39 ++++++++++++++++++++++++++++-----------
> 2 files changed, 33 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 65dff5f3457a..dccdba60b5d9 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -1358,17 +1358,18 @@ static int aer_probe(struct pcie_device *dev)
> static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
> {
> int aer = dev->aer_cap;
> + int rc = 0;
> u32 reg32;
> - int rc;
> -
>
> /* Disable Root's interrupt in response to error messages */
> pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32);
> reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
> pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
>
> - rc = pci_bus_error_reset(dev);
> - pci_info(dev, "Root Port link has been reset\n");
> + if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC) {
> + rc = pci_bus_error_reset(dev);
> + pci_info(dev, "Root Port link has been reset\n");
> + }
>
> /* Clear Root Error Status */
> pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, ®32);
> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> index 38abd7984996..956ad4c86d53 100644
> --- a/drivers/pci/pcie/err.c
> +++ b/drivers/pci/pcie/err.c
> @@ -149,7 +149,8 @@ static int report_resume(struct pci_dev *dev, void *data)
> /**
> * pci_walk_bridge - walk bridges potentially AER affected
> * @bridge bridge which may be an RCEC with associated RCiEPs,
> - * an RCiEP associated with an RCEC, or a Port.
> + * or a Port.
> + * @dev an RCiEP lacking an associated RCEC.
> * @cb callback to be called for each device found
> * @userdata arbitrary pointer to be passed to callback.
> *
> @@ -160,13 +161,20 @@ static int report_resume(struct pci_dev *dev, void *data)
> * If the device provided has no subordinate bus, call the provided
> * callback on the device itself.
> */
> -static void pci_walk_bridge(struct pci_dev *bridge, int (*cb)(struct pci_dev *, void *),
> +static void pci_walk_bridge(struct pci_dev *bridge, struct pci_dev *dev,
> + int (*cb)(struct pci_dev *, void *),
> void *userdata)
> {
> - if (bridge->subordinate)
> + /*
> + * In a non-native case where there is no OS-visible reporting
> + * device the bridge will be NULL, i.e., no RCEC, no PORT.
> + */
> + if (bridge && bridge->subordinate)
> pci_walk_bus(bridge->subordinate, cb, userdata);
> - else
> + else if (bridge)
> cb(bridge, userdata);
> + else
> + cb(dev, userdata);
> }
>
> static pci_ers_result_t flr_on_rciep(struct pci_dev *dev)
> @@ -196,16 +204,25 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> type = pci_pcie_type(dev);
> if (type == PCI_EXP_TYPE_ROOT_PORT ||
> type == PCI_EXP_TYPE_DOWNSTREAM ||
> - type == PCI_EXP_TYPE_RC_EC ||
> - type == PCI_EXP_TYPE_RC_END)
> + type == PCI_EXP_TYPE_RC_EC)
> bridge = dev;
> + else if (type == PCI_EXP_TYPE_RC_END)
> + bridge = dev->rcec;
> else
> bridge = pci_upstream_bridge(dev);
>
> pci_dbg(dev, "broadcast error_detected message\n");
> if (state == pci_channel_io_frozen) {
> - pci_walk_bridge(bridge, report_frozen_detected, &status);
> + pci_walk_bridge(bridge, dev, report_frozen_detected, &status);
> if (type == PCI_EXP_TYPE_RC_END) {
> + /*
> + * The callback only clears the Root Error Status
> + * of the RCEC (see aer.c). Only perform this for the
> + * native case, i.e., an RCEC is present.
> + */
> + if (bridge)
> + reset_subordinate_devices(bridge);
Help me understand this. There are lots of callbacks in this picture,
but I guess this "callback only clears Root Error Status" must refer
to aer_root_reset(), i.e., the reset_subordinate_devices pointer?
Of course, the caller of pcie_do_recovery() supplied that pointer.
And we can infer that it must be aer_root_reset(), not
dpc_reset_link(), because RCECs and RCiEPs are not allowed to
implement DPC.
I wish we didn't have either this assumption about what
reset_subordinate_devices points to, or the assumption about what
aer_root_reset() does. They both seem a little bit tenuous.
We already made aer_root_reset() smart enough to check for RCECs. Can
we put the FLR there, too? Then we wouldn't have this weird situation
where reset_subordinate_devices() does a reset and clears error
status, EXCEPT for this case where it only clears error status and we
do the reset here?
> status = flr_on_rciep(dev);
> if (status != PCI_ERS_RESULT_RECOVERED) {
> pci_warn(dev, "function level reset failed\n");
> @@ -219,13 +236,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> }
> }
> } else {
> - pci_walk_bridge(bridge, report_normal_detected, &status);
> + pci_walk_bridge(bridge, dev, report_normal_detected, &status);
> }
>
> if (status == PCI_ERS_RESULT_CAN_RECOVER) {
> status = PCI_ERS_RESULT_RECOVERED;
> pci_dbg(dev, "broadcast mmio_enabled message\n");
> - pci_walk_bridge(bridge, report_mmio_enabled, &status);
> + pci_walk_bridge(bridge, dev, report_mmio_enabled, &status);
> }
>
> if (status == PCI_ERS_RESULT_NEED_RESET) {
> @@ -236,14 +253,14 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> */
> status = PCI_ERS_RESULT_RECOVERED;
> pci_dbg(dev, "broadcast slot_reset message\n");
> - pci_walk_bridge(bridge, report_slot_reset, &status);
> + pci_walk_bridge(bridge, dev, report_slot_reset, &status);
> }
>
> if (status != PCI_ERS_RESULT_RECOVERED)
> goto failed;
>
> pci_dbg(dev, "broadcast resume message\n");
> - pci_walk_bridge(bridge, report_resume, &status);
> + pci_walk_bridge(bridge, dev, report_resume, &status);
>
> if (type == PCI_EXP_TYPE_ROOT_PORT ||
> type == PCI_EXP_TYPE_DOWNSTREAM ||
> --
> 2.28.0
>
On 9 Oct 2020, at 11:26, Sean V Kelley wrote:
> Hi Bjorn,
>
> On 9 Oct 2020, at 10:57, Bjorn Helgaas wrote:
>
>> On Fri, Oct 02, 2020 at 11:47:32AM -0700, Sean V Kelley wrote:
>>> From: Qiuxu Zhuo <[email protected]>
>>>
>>> When attempting error recovery for an RCiEP associated with an RCEC
>>> device,
>>> there needs to be a way to update the Root Error Status, the
>>> Uncorrectable
>>> Error Status and the Uncorrectable Error Severity of the parent
>>> RCEC.
>>> In some non-native cases in which there is no OS visible device
>>> associated with the RCiEP, there is nothing to act upon as the
>>> firmware
>>> is acting before the OS. So add handling for the linked 'rcec' in
>>> AER/ERR
>>> while taking into account non-native cases.
>>>
>>> Co-developed-by: Sean V Kelley <[email protected]>
>>> Signed-off-by: Sean V Kelley <[email protected]>
>>> Signed-off-by: Qiuxu Zhuo <[email protected]>
>>> Reviewed-by: Jonathan Cameron <[email protected]>
>>> ---
>>> drivers/pci/pcie/aer.c | 9 +++++----
>>> drivers/pci/pcie/err.c | 39 ++++++++++++++++++++++++++++-----------
>>> 2 files changed, 33 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>> index 65dff5f3457a..dccdba60b5d9 100644
>>> --- a/drivers/pci/pcie/aer.c
>>> +++ b/drivers/pci/pcie/aer.c
>>> @@ -1358,17 +1358,18 @@ static int aer_probe(struct pcie_device
>>> *dev)
>>> static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
>>> {
>>> int aer = dev->aer_cap;
>>> + int rc = 0;
>>> u32 reg32;
>>> - int rc;
>>> -
>>>
>>> /* Disable Root's interrupt in response to error messages */
>>> pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32);
>>> reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
>>> pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
>>>
>>> - rc = pci_bus_error_reset(dev);
>>> - pci_info(dev, "Root Port link has been reset\n");
>>> + if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC) {
>>> + rc = pci_bus_error_reset(dev);
>>> + pci_info(dev, "Root Port link has been reset\n");
>>> + }
>>>
>>> /* Clear Root Error Status */
>>> pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, ®32);
>>> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
>>> index 38abd7984996..956ad4c86d53 100644
>>> --- a/drivers/pci/pcie/err.c
>>> +++ b/drivers/pci/pcie/err.c
>>> @@ -149,7 +149,8 @@ static int report_resume(struct pci_dev *dev,
>>> void *data)
>>> /**
>>> * pci_walk_bridge - walk bridges potentially AER affected
>>> * @bridge bridge which may be an RCEC with associated RCiEPs,
>>> - * an RCiEP associated with an RCEC, or a Port.
>>> + * or a Port.
>>> + * @dev an RCiEP lacking an associated RCEC.
>>> * @cb callback to be called for each device found
>>> * @userdata arbitrary pointer to be passed to callback.
>>> *
>>> @@ -160,13 +161,20 @@ static int report_resume(struct pci_dev *dev,
>>> void *data)
>>> * If the device provided has no subordinate bus, call the provided
>>> * callback on the device itself.
>>> */
>>> -static void pci_walk_bridge(struct pci_dev *bridge, int
>>> (*cb)(struct pci_dev *, void *),
>>> +static void pci_walk_bridge(struct pci_dev *bridge, struct pci_dev
>>> *dev,
>>> + int (*cb)(struct pci_dev *, void *),
>>> void *userdata)
>>> {
>>> - if (bridge->subordinate)
>>> + /*
>>> + * In a non-native case where there is no OS-visible reporting
>>> + * device the bridge will be NULL, i.e., no RCEC, no PORT.
>>> + */
>>> + if (bridge && bridge->subordinate)
>>> pci_walk_bus(bridge->subordinate, cb, userdata);
>>> - else
>>> + else if (bridge)
>>> cb(bridge, userdata);
>>> + else
>>> + cb(dev, userdata);
>>> }
>>>
>>> static pci_ers_result_t flr_on_rciep(struct pci_dev *dev)
>>> @@ -196,16 +204,25 @@ pci_ers_result_t pcie_do_recovery(struct
>>> pci_dev *dev,
>>> type = pci_pcie_type(dev);
>>> if (type == PCI_EXP_TYPE_ROOT_PORT ||
>>> type == PCI_EXP_TYPE_DOWNSTREAM ||
>>> - type == PCI_EXP_TYPE_RC_EC ||
>>> - type == PCI_EXP_TYPE_RC_END)
>>> + type == PCI_EXP_TYPE_RC_EC)
>>> bridge = dev;
>>> + else if (type == PCI_EXP_TYPE_RC_END)
>>> + bridge = dev->rcec;
>>> else
>>> bridge = pci_upstream_bridge(dev);
>>>
>>> pci_dbg(dev, "broadcast error_detected message\n");
>>> if (state == pci_channel_io_frozen) {
>>> - pci_walk_bridge(bridge, report_frozen_detected, &status);
>>> + pci_walk_bridge(bridge, dev, report_frozen_detected, &status);
>>> if (type == PCI_EXP_TYPE_RC_END) {
>>> + /*
>>> + * The callback only clears the Root Error Status
>>> + * of the RCEC (see aer.c). Only perform this for the
>>> + * native case, i.e., an RCEC is present.
>>> + */
>>> + if (bridge)
>>> + reset_subordinate_devices(bridge);
>>
>> Help me understand this. There are lots of callbacks in this
>> picture,
>> but I guess this "callback only clears Root Error Status" must refer
>> to aer_root_reset(), i.e., the reset_subordinate_devices pointer?
>
> The ‘bridge’ in this case will always be dev->rcec, the event
> collector for the associated RC_END. And that’s what’s being
> cleared in aer.c via aer_root_reset() as the callback. It’s also
> being checked for native/non-native here.
>
>>
>> Of course, the caller of pcie_do_recovery() supplied that pointer.
>> And we can infer that it must be aer_root_reset(), not
>> dpc_reset_link(), because RCECs and RCiEPs are not allowed to
>> implement DPC.
>
> Correct.
>
>>
>> I wish we didn't have either this assumption about what
>> reset_subordinate_devices points to, or the assumption about what
>> aer_root_reset() does. They both seem a little bit tenuous.
>
> Agree. It’s the relationship between the RC_END and the RC_EC.
>
>>
>> We already made aer_root_reset() smart enough to check for RCECs.
>> Can
>> we put the FLR there, too? Then we wouldn't have this weird
>> situation
>> where reset_subordinate_devices() does a reset and clears error
>> status, EXCEPT for this case where it only clears error status and we
>> do the reset here?
>
> We could add the smarts to aer_root_reset() to check for an RC_END in
> that callback and perform the clear there on its RC_EC. We just
> wouldn’t map ‘bridge’ to dev->rcec for RC_END in
> pcie_do_recovery() which would simplify things.
>
> Further, the FLR in the case of flr_on_rciep() below is specific to
> the RCiEP itself. So it could be performed either in aer_root_reset()
> or remain in the pcie_do_recovery().
>
> That should work.
>
> Sean
Thinking more on this, you could still pass dev->rcec to the callback
(eventually aer_root_reset()), but you won’t have the ability to
handle the FLR there without the pointer to the RC_END. That’s why I
suggested passing the RC_END and checking for its RC_EC in
aer_root_reset() if you want to also handle FLR there too from below.
Sean
>
>
>>
>>> status = flr_on_rciep(dev);
>>> if (status != PCI_ERS_RESULT_RECOVERED) {
>>> pci_warn(dev, "function level reset failed\n");
>>> @@ -219,13 +236,13 @@ pci_ers_result_t pcie_do_recovery(struct
>>> pci_dev *dev,
>>> }
>>> }
>>> } else {
>>> - pci_walk_bridge(bridge, report_normal_detected, &status);
>>> + pci_walk_bridge(bridge, dev, report_normal_detected, &status);
>>> }
>>>
>>> if (status == PCI_ERS_RESULT_CAN_RECOVER) {
>>> status = PCI_ERS_RESULT_RECOVERED;
>>> pci_dbg(dev, "broadcast mmio_enabled message\n");
>>> - pci_walk_bridge(bridge, report_mmio_enabled, &status);
>>> + pci_walk_bridge(bridge, dev, report_mmio_enabled, &status);
>>> }
>>>
>>> if (status == PCI_ERS_RESULT_NEED_RESET) {
>>> @@ -236,14 +253,14 @@ pci_ers_result_t pcie_do_recovery(struct
>>> pci_dev *dev,
>>> */
>>> status = PCI_ERS_RESULT_RECOVERED;
>>> pci_dbg(dev, "broadcast slot_reset message\n");
>>> - pci_walk_bridge(bridge, report_slot_reset, &status);
>>> + pci_walk_bridge(bridge, dev, report_slot_reset, &status);
>>> }
>>>
>>> if (status != PCI_ERS_RESULT_RECOVERED)
>>> goto failed;
>>>
>>> pci_dbg(dev, "broadcast resume message\n");
>>> - pci_walk_bridge(bridge, report_resume, &status);
>>> + pci_walk_bridge(bridge, dev, report_resume, &status);
>>>
>>> if (type == PCI_EXP_TYPE_ROOT_PORT ||
>>> type == PCI_EXP_TYPE_DOWNSTREAM ||
>>> --
>>> 2.28.0
>>>
On Fri, Oct 09, 2020 at 12:57:45PM -0500, Bjorn Helgaas wrote:
> On Fri, Oct 02, 2020 at 11:47:32AM -0700, Sean V Kelley wrote:
> > From: Qiuxu Zhuo <[email protected]>
> >
> > When attempting error recovery for an RCiEP associated with an RCEC device,
> > there needs to be a way to update the Root Error Status, the Uncorrectable
> > Error Status and the Uncorrectable Error Severity of the parent RCEC.
> > In some non-native cases in which there is no OS visible device
> > associated with the RCiEP, there is nothing to act upon as the firmware
> > is acting before the OS. So add handling for the linked 'rcec' in AER/ERR
> > while taking into account non-native cases.
> >
> > Co-developed-by: Sean V Kelley <[email protected]>
> > Signed-off-by: Sean V Kelley <[email protected]>
> > Signed-off-by: Qiuxu Zhuo <[email protected]>
> > Reviewed-by: Jonathan Cameron <[email protected]>
> > ---
> > drivers/pci/pcie/aer.c | 9 +++++----
> > drivers/pci/pcie/err.c | 39 ++++++++++++++++++++++++++++-----------
> > 2 files changed, 33 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > index 65dff5f3457a..dccdba60b5d9 100644
> > --- a/drivers/pci/pcie/aer.c
> > +++ b/drivers/pci/pcie/aer.c
> > @@ -1358,17 +1358,18 @@ static int aer_probe(struct pcie_device *dev)
> > static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
> > {
> > int aer = dev->aer_cap;
> > + int rc = 0;
> > u32 reg32;
> > - int rc;
> > -
> >
> > /* Disable Root's interrupt in response to error messages */
> > pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32);
> > reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
> > pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
> >
> > - rc = pci_bus_error_reset(dev);
> > - pci_info(dev, "Root Port link has been reset\n");
> > + if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC) {
> > + rc = pci_bus_error_reset(dev);
> > + pci_info(dev, "Root Port link has been reset\n");
> > + }
> >
> > /* Clear Root Error Status */
> > pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, ®32);
> > diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> > index 38abd7984996..956ad4c86d53 100644
> > --- a/drivers/pci/pcie/err.c
> > +++ b/drivers/pci/pcie/err.c
> > @@ -149,7 +149,8 @@ static int report_resume(struct pci_dev *dev, void *data)
> > /**
> > * pci_walk_bridge - walk bridges potentially AER affected
> > * @bridge bridge which may be an RCEC with associated RCiEPs,
> > - * an RCiEP associated with an RCEC, or a Port.
> > + * or a Port.
> > + * @dev an RCiEP lacking an associated RCEC.
> > * @cb callback to be called for each device found
> > * @userdata arbitrary pointer to be passed to callback.
> > *
> > @@ -160,13 +161,20 @@ static int report_resume(struct pci_dev *dev, void *data)
> > * If the device provided has no subordinate bus, call the provided
> > * callback on the device itself.
> > */
> > -static void pci_walk_bridge(struct pci_dev *bridge, int (*cb)(struct pci_dev *, void *),
> > +static void pci_walk_bridge(struct pci_dev *bridge, struct pci_dev *dev,
> > + int (*cb)(struct pci_dev *, void *),
> > void *userdata)
> > {
> > - if (bridge->subordinate)
> > + /*
> > + * In a non-native case where there is no OS-visible reporting
> > + * device the bridge will be NULL, i.e., no RCEC, no PORT.
> > + */
> > + if (bridge && bridge->subordinate)
> > pci_walk_bus(bridge->subordinate, cb, userdata);
> > - else
> > + else if (bridge)
> > cb(bridge, userdata);
> > + else
> > + cb(dev, userdata);
> > }
> >
> > static pci_ers_result_t flr_on_rciep(struct pci_dev *dev)
> > @@ -196,16 +204,25 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> > type = pci_pcie_type(dev);
> > if (type == PCI_EXP_TYPE_ROOT_PORT ||
> > type == PCI_EXP_TYPE_DOWNSTREAM ||
> > - type == PCI_EXP_TYPE_RC_EC ||
> > - type == PCI_EXP_TYPE_RC_END)
> > + type == PCI_EXP_TYPE_RC_EC)
> > bridge = dev;
> > + else if (type == PCI_EXP_TYPE_RC_END)
> > + bridge = dev->rcec;
> > else
> > bridge = pci_upstream_bridge(dev);
> >
> > pci_dbg(dev, "broadcast error_detected message\n");
> > if (state == pci_channel_io_frozen) {
> > - pci_walk_bridge(bridge, report_frozen_detected, &status);
> > + pci_walk_bridge(bridge, dev, report_frozen_detected, &status);
> > if (type == PCI_EXP_TYPE_RC_END) {
> > + /*
> > + * The callback only clears the Root Error Status
> > + * of the RCEC (see aer.c). Only perform this for the
> > + * native case, i.e., an RCEC is present.
> > + */
> > + if (bridge)
> > + reset_subordinate_devices(bridge);
>
> Help me understand this. There are lots of callbacks in this picture,
> but I guess this "callback only clears Root Error Status" must refer
> to aer_root_reset(), i.e., the reset_subordinate_devices pointer?
>
> Of course, the caller of pcie_do_recovery() supplied that pointer.
> And we can infer that it must be aer_root_reset(), not
> dpc_reset_link(), because RCECs and RCiEPs are not allowed to
> implement DPC.
>
> I wish we didn't have either this assumption about what
> reset_subordinate_devices points to, or the assumption about what
> aer_root_reset() does. They both seem a little bit tenuous.
>
> We already made aer_root_reset() smart enough to check for RCECs. Can
> we put the FLR there, too? Then we wouldn't have this weird situation
> where reset_subordinate_devices() does a reset and clears error
> status, EXCEPT for this case where it only clears error status and we
> do the reset here?
Just as an example to make this concrete. Doesn't even compile.
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index d6927e6535e5..e389db7cbba6 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1372,28 +1372,45 @@ static int aer_probe(struct pcie_device *dev)
*/
static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
{
- int aer = dev->aer_cap;
+ int type = pci_pcie_type(dev);
+ struct pci_dev *root;
+ int aer = 0;
int rc = 0;
u32 reg32;
- /* Disable Root's interrupt in response to error messages */
- pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32);
- reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
- pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
+ if (type == PCI_EXP_TYPE_RC_END)
+ root = dev->rcec;
+ else
+ root = dev;
+
+ if (root)
+ aer = root->aer_cap;
- if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC) {
+ if (aer) {
+ /* Disable Root's interrupt in response to error messages */
+ pci_read_config_dword(root, aer + PCI_ERR_ROOT_COMMAND, ®32);
+ reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
+ pci_write_config_dword(root, aer + PCI_ERR_ROOT_COMMAND, reg32);
+ }
+
+ if (type == PCI_EXP_TYPE_RC_EC || type == PCI_EXP_TYPE_RC_END) {
+ rc = flr_on_rciep(dev);
+ pci_info(dev, "has been reset (%d)\n", rc);
+ } else {
rc = pci_bus_error_reset(dev);
- pci_info(dev, "Root Port link has been reset\n");
+ pci_info(dev, "Root Port link has been reset (%d)\n", rc);
}
- /* Clear Root Error Status */
- pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, ®32);
- pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, reg32);
+ if (aer) {
+ /* Clear Root Error Status */
+ pci_read_config_dword(root, aer + PCI_ERR_ROOT_STATUS, ®32);
+ pci_write_config_dword(root, aer + PCI_ERR_ROOT_STATUS, reg32);
- /* Enable Root Port's interrupt in response to error messages */
- pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32);
- reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
- pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
+ /* Enable Root Port's interrupt in response to error messages */
+ pci_read_config_dword(root, aer + PCI_ERR_ROOT_COMMAND, ®32);
+ reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
+ pci_write_config_dword(root, aer + PCI_ERR_ROOT_COMMAND, reg32);
+ }
return rc ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
}
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 79ae1356141d..08976034a89c 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -203,36 +203,19 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
*/
if (type == PCI_EXP_TYPE_ROOT_PORT ||
type == PCI_EXP_TYPE_DOWNSTREAM ||
- type == PCI_EXP_TYPE_RC_EC)
+ type == PCI_EXP_TYPE_RC_EC ||
+ type == PCI_EXP_TYPE_RC_END)
bridge = dev;
- else if (type == PCI_EXP_TYPE_RC_END)
- bridge = dev->rcec;
else
bridge = pci_upstream_bridge(dev);
pci_dbg(bridge, "broadcast error_detected message\n");
if (state == pci_channel_io_frozen) {
pci_walk_bridge(bridge, dev, report_frozen_detected, &status);
- if (type == PCI_EXP_TYPE_RC_END) {
- /*
- * The callback only clears the Root Error Status
- * of the RCEC (see aer.c). Only perform this for the
- * native case, i.e., an RCEC is present.
- */
- if (bridge)
- reset_subordinates(bridge);
-
- status = flr_on_rciep(dev);
- if (status != PCI_ERS_RESULT_RECOVERED) {
- pci_warn(dev, "Function Level Reset failed\n");
- goto failed;
- }
- } else {
- status = reset_subordinates(bridge);
- if (status != PCI_ERS_RESULT_RECOVERED) {
- pci_warn(dev, "subordinate device reset failed\n");
- goto failed;
- }
+ status = reset_subordinates(bridge);
+ if (status != PCI_ERS_RESULT_RECOVERED) {
+ pci_warn(bridge, "subordinate device reset failed\n");
+ goto failed;
}
} else {
pci_walk_bridge(bridge, dev, report_normal_detected, &status);
On 9 Oct 2020, at 14:27, Bjorn Helgaas wrote:
> On Fri, Oct 02, 2020 at 11:47:29AM -0700, Sean V Kelley wrote:
>> From: Jonathan Cameron <[email protected]>
>>
>> Currently the kernel does not handle AER errors for Root Complex
>> integrated End Points (RCiEPs)[0]. These devices sit on a root bus
>> within
>> the Root Complex (RC). AER handling is performed by a Root Complex
>> Event
>> Collector (RCEC) [1] which is a effectively a type of RCiEP on the
>> same
>> root bus.
>>
>> For an RCEC (technically not a Bridge), error messages "received"
>> from
>> associated RCiEPs must be enabled for "transmission" in order to
>> cause a
>> System Error via the Root Control register or (when the Advanced
>> Error
>> Reporting Capability is present) reporting via the Root Error Command
>> register and logging in the Root Error Status register and Error
>> Source
>> Identification register.
>>
>> In addition to the defined OS level handling of the reset flow for
>> the
>> associated RCiEPs of an RCEC, it is possible to also have non-native
>> handling. In that case there is no need to take any actions on the
>> RCEC
>> because the firmware is responsible for them. This is true where APEI
>> [2]
>> is used to report the AER errors via a GHES[v2] HEST entry [3] and
>> relevant AER CPER record [4] and non-native handling is in use.
>>
>> We effectively end up with two different types of discovery for
>> purposes of handling AER errors:
>>
>> 1) Normal bus walk - we pass the downstream port above a bus to which
>> the device is attached and it walks everything below that point.
>>
>> 2) An RCiEP with no visible association with an RCEC as there is no
>> need
>> to walk devices. In that case, the flow is to just call the callbacks
>> for
>> the actual device, which in turn references its associated RCEC.
>>
>> Modify pci_walk_bridge() to handle devices which lack a subordinate
>> bus.
>> If the device does not then it will call the function on that device
>> alone.
>>
>> [0] ACPI PCI Express Base Specification 5.0-1 1.3.2.3 Root Complex
>> Integrated Endpoint Rules.
>> [1] ACPI PCI Express Base Specification 5.0-1 6.2 Error Signalling
>> and
>> Logging
>> [2] ACPI Specification 6.3 Chapter 18 ACPI Platform Error Interface
>> (APEI)
>> [3] ACPI Specification 6.3 18.2.3.7 Generic Hardware Error Source
>> [4] UEFI Specification 2.8, N.2.7 PCI Express Error Section
>>
>> Signed-off-by: Jonathan Cameron <[email protected]>
>> Signed-off-by: Sean V Kelley <[email protected]>
>> ---
>> drivers/pci/pcie/err.c | 25 ++++++++++++++++++++-----
>> 1 file changed, 20 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
>> index 5ff1afa4763d..c4ceca42a3bf 100644
>> --- a/drivers/pci/pcie/err.c
>> +++ b/drivers/pci/pcie/err.c
>> @@ -148,19 +148,25 @@ static int report_resume(struct pci_dev *dev,
>> void *data)
>>
>> /**
>> * pci_walk_bridge - walk bridges potentially AER affected
>> - * @bridge bridge which may be a Port.
>> + * @bridge bridge which may be an RCEC with associated RCiEPs,
>> + * an RCiEP associated with an RCEC, or a Port.
>> * @cb callback to be called for each device found
>> * @userdata arbitrary pointer to be passed to callback.
>> *
>> * If the device provided is a bridge, walk the subordinate bus,
>> * including any bridged devices on buses under this bus.
>> * Call the provided callback on each device found.
>> + *
>> + * If the device provided has no subordinate bus, call the provided
>> + * callback on the device itself.
>> */
>> static void pci_walk_bridge(struct pci_dev *bridge, int (*cb)(struct
>> pci_dev *, void *),
>> void *userdata)
>> {
>> if (bridge->subordinate)
>> pci_walk_bus(bridge->subordinate, cb, userdata);
>> + else
>> + cb(bridge, userdata);
>> }
>>
>> pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
>> @@ -174,11 +180,13 @@ pci_ers_result_t pcie_do_recovery(struct
>> pci_dev *dev,
>> /*
>> * Error recovery runs on all subordinates of the first downstream
>> * bridge. If the downstream bridge detected the error, it is
>> - * cleared at the end.
>> + * cleared at the end. For RCiEPs we should reset just the RCiEP
>> itself.
>> */
>> type = pci_pcie_type(dev);
>> if (type == PCI_EXP_TYPE_ROOT_PORT ||
>> - type == PCI_EXP_TYPE_DOWNSTREAM)
>> + type == PCI_EXP_TYPE_DOWNSTREAM ||
>> + type == PCI_EXP_TYPE_RC_EC ||
>
> What is the case where an RCEC is passed to pcie_do_recovery()? I
> guess it's the case where an RCEC is reporting an error that it logged
> itself, i.e., no RCiEP is involved at all? In that case I guess we
> should try an FLR on the RCEC and clear its status?
>
> (I don't think the current series attempts the FLR.)
Correct, we can get errors reported by that same RCEC that are not
related to the associated RCiEPs. Further, an RCiEP in reporting an
error will trigger logging to the Root Error Command/Status and Error
Source Identification registers of the associated RCEC. The assumption
in pcie_do_recovery() here is that I can cover both scenarios.
We could attempt an FLR on the RCEC itself as well.
>
>> + type == PCI_EXP_TYPE_RC_END)
>> bridge = dev;
>> else
>> bridge = pci_upstream_bridge(dev);
>> @@ -186,7 +194,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev
>> *dev,
>> pci_dbg(dev, "broadcast error_detected message\n");
>> if (state == pci_channel_io_frozen) {
>> pci_walk_bridge(bridge, report_frozen_detected, &status);
>> - status = reset_subordinate_device(bridge);
>> + if (type == PCI_EXP_TYPE_RC_END) {
>> + pci_warn(dev, "subordinate device reset not possible for
>> RCiEP\n");
>> + status = PCI_ERS_RESULT_NONE;
>> + goto failed;
>> + }
>> +
>> + status = reset_subordinate_devices(bridge);
>> if (status != PCI_ERS_RESULT_RECOVERED) {
>> pci_warn(dev, "subordinate device reset failed\n");
>> goto failed;
>> @@ -219,7 +233,8 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev
>> *dev,
>> pci_walk_bridge(bridge, report_resume, &status);
>>
>> if (type == PCI_EXP_TYPE_ROOT_PORT ||
>> - type == PCI_EXP_TYPE_DOWNSTREAM) {
>> + type == PCI_EXP_TYPE_DOWNSTREAM ||
>> + type == PCI_EXP_TYPE_RC_EC) {
>> if (pcie_aer_is_native(bridge))
>> pcie_clear_device_status(bridge);
>> pci_aer_clear_nonfatal_status(bridge);
>> --
>> 2.28.0
>>
On Fri, 2020-10-09 at 15:07 -0700, Sean V Kelley wrote:
> On 9 Oct 2020, at 14:30, Bjorn Helgaas wrote:
>
> > On Fri, Oct 09, 2020 at 12:57:45PM -0500, Bjorn Helgaas wrote:
> > > On Fri, Oct 02, 2020 at 11:47:32AM -0700, Sean V Kelley wrote:
> > > > From: Qiuxu Zhuo <[email protected]>
> > > >
> > > > When attempting error recovery for an RCiEP associated with an
> > > > RCEC
> > > > device,
> > > > there needs to be a way to update the Root Error Status, the
> > > > Uncorrectable
> > > > Error Status and the Uncorrectable Error Severity of the
> > > > parent
> > > > RCEC.
> > > > In some non-native cases in which there is no OS visible device
> > > > associated with the RCiEP, there is nothing to act upon as the
> > > > firmware
> > > > is acting before the OS. So add handling for the linked 'rcec'
> > > > in
> > > > AER/ERR
> > > > while taking into account non-native cases.
> > > >
> > > > Co-developed-by: Sean V Kelley <[email protected]>
> > > > Signed-off-by: Sean V Kelley <[email protected]>
> > > > Signed-off-by: Qiuxu Zhuo <[email protected]>
> > > > Reviewed-by: Jonathan Cameron <[email protected]>
> > > > ---
> > > > drivers/pci/pcie/aer.c | 9 +++++----
> > > > drivers/pci/pcie/err.c | 39 ++++++++++++++++++++++++++++----
> > > > -------
> > > > 2 files changed, 33 insertions(+), 15 deletions(-)
> > > >
> > > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > > > index 65dff5f3457a..dccdba60b5d9 100644
> > > > --- a/drivers/pci/pcie/aer.c
> > > > +++ b/drivers/pci/pcie/aer.c
> > > > @@ -1358,17 +1358,18 @@ static int aer_probe(struct
> > > > pcie_device
> > > > *dev)
> > > > static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
> > > > {
> > > > int aer = dev->aer_cap;
> > > > + int rc = 0;
> > > > u32 reg32;
> > > > - int rc;
> > > > -
> > > >
> > > > /* Disable Root's interrupt in response to error
> > > > messages */
> > > > pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND,
> > > > ®32);
> > > > reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
> > > > pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND,
> > > > reg32);
> > > >
> > > > - rc = pci_bus_error_reset(dev);
> > > > - pci_info(dev, "Root Port link has been reset\n");
> > > > + if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC) {
> > > > + rc = pci_bus_error_reset(dev);
> > > > + pci_info(dev, "Root Port link has been
> > > > reset\n");
> > > > + }
> > > >
> > > > /* Clear Root Error Status */
> > > > pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS,
> > > > ®32);
> > > > diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> > > > index 38abd7984996..956ad4c86d53 100644
> > > > --- a/drivers/pci/pcie/err.c
> > > > +++ b/drivers/pci/pcie/err.c
> > > > @@ -149,7 +149,8 @@ static int report_resume(struct pci_dev
> > > > *dev,
> > > > void *data)
> > > > /**
> > > > * pci_walk_bridge - walk bridges potentially AER affected
> > > > * @bridge bridge which may be an RCEC with associated
> > > > RCiEPs,
> > > > - * an RCiEP associated with an RCEC, or a Port.
> > > > + * or a Port.
> > > > + * @dev an RCiEP lacking an associated RCEC.
> > > > * @cb callback to be called for each device found
> > > > * @userdata arbitrary pointer to be passed to callback.
> > > > *
> > > > @@ -160,13 +161,20 @@ static int report_resume(struct pci_dev
> > > > *dev,
> > > > void *data)
> > > > * If the device provided has no subordinate bus, call the
> > > > provided
> > > > * callback on the device itself.
> > > > */
> > > > -static void pci_walk_bridge(struct pci_dev *bridge, int
> > > > (*cb)(struct pci_dev *, void *),
> > > > +static void pci_walk_bridge(struct pci_dev *bridge, struct
> > > > pci_dev
> > > > *dev,
> > > > + int (*cb)(struct pci_dev *, void
> > > > *),
> > > > void *userdata)
> > > > {
> > > > - if (bridge->subordinate)
> > > > + /*
> > > > + * In a non-native case where there is no OS-visible
> > > > reporting
> > > > + * device the bridge will be NULL, i.e., no RCEC, no
> > > > PORT.
> > > > + */
> > > > + if (bridge && bridge->subordinate)
> > > > pci_walk_bus(bridge->subordinate, cb,
> > > > userdata);
> > > > - else
> > > > + else if (bridge)
> > > > cb(bridge, userdata);
> > > > + else
> > > > + cb(dev, userdata);
> > > > }
> > > >
> > > > static pci_ers_result_t flr_on_rciep(struct pci_dev *dev)
> > > > @@ -196,16 +204,25 @@ pci_ers_result_t pcie_do_recovery(struct
> > > > pci_dev *dev,
> > > > type = pci_pcie_type(dev);
> > > > if (type == PCI_EXP_TYPE_ROOT_PORT ||
> > > > type == PCI_EXP_TYPE_DOWNSTREAM ||
> > > > - type == PCI_EXP_TYPE_RC_EC ||
> > > > - type == PCI_EXP_TYPE_RC_END)
> > > > + type == PCI_EXP_TYPE_RC_EC)
> > > > bridge = dev;
> > > > + else if (type == PCI_EXP_TYPE_RC_END)
> > > > + bridge = dev->rcec;
> > > > else
> > > > bridge = pci_upstream_bridge(dev);
> > > >
> > > > pci_dbg(dev, "broadcast error_detected message\n");
> > > > if (state == pci_channel_io_frozen) {
> > > > - pci_walk_bridge(bridge, report_frozen_detected,
> > > > &status);
> > > > + pci_walk_bridge(bridge, dev,
> > > > report_frozen_detected, &status);
> > > > if (type == PCI_EXP_TYPE_RC_END) {
> > > > + /*
> > > > + * The callback only clears the Root
> > > > Error Status
> > > > + * of the RCEC (see aer.c). Only
> > > > perform this for the
> > > > + * native case, i.e., an RCEC is
> > > > present.
> > > > + */
> > > > + if (bridge)
> > > > + reset_subordinate_devices(bridg
> > > > e);
> > >
> > > Help me understand this. There are lots of callbacks in this
> > > picture,
> > > but I guess this "callback only clears Root Error Status" must
> > > refer
> > > to aer_root_reset(), i.e., the reset_subordinate_devices pointer?
> > >
> > > Of course, the caller of pcie_do_recovery() supplied that
> > > pointer.
> > > And we can infer that it must be aer_root_reset(), not
> > > dpc_reset_link(), because RCECs and RCiEPs are not allowed to
> > > implement DPC.
> > >
> > > I wish we didn't have either this assumption about what
> > > reset_subordinate_devices points to, or the assumption about what
> > > aer_root_reset() does. They both seem a little bit tenuous.
> > >
> > > We already made aer_root_reset() smart enough to check for
> > > RCECs.
> > > Can
> > > we put the FLR there, too? Then we wouldn't have this weird
> > > situation
> > > where reset_subordinate_devices() does a reset and clears error
> > > status, EXCEPT for this case where it only clears error status
> > > and we
> > > do the reset here?
> >
> > Just as an example to make this concrete. Doesn't even compile.
>
> Okay, I tried something similar. Comments below:
>
> >
> > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > index d6927e6535e5..e389db7cbba6 100644
> > --- a/drivers/pci/pcie/aer.c
> > +++ b/drivers/pci/pcie/aer.c
> > @@ -1372,28 +1372,45 @@ static int aer_probe(struct pcie_device
> > *dev)
> > */
> > static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
> > {
> > - int aer = dev->aer_cap;
> > + int type = pci_pcie_type(dev);
> > + struct pci_dev *root;
> > + int aer = 0;
> > int rc = 0;
> > u32 reg32;
> >
> > - /* Disable Root's interrupt in response to error messages */
> > - pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32);
> > - reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
> > - pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
> > + if (type == PCI_EXP_TYPE_RC_END)
> > + root = dev->rcec;
> > + else
> > + root = dev;
> > +
> > + if (root)
> > + aer = root->aer_cap;
>
> Except here dev->rcec may be NULL for the non-native case. So I had
> a
> goto label to bypass what follows. I left the FLR in err.c.
>
>
> > - if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC) {
> > + if (aer) {
> > + /* Disable Root's interrupt in response to error
> > messages */
> > + pci_read_config_dword(root, aer + PCI_ERR_ROOT_COMMAND,
> > ®32);
> > + reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
> > + pci_write_config_dword(root, aer +
> > PCI_ERR_ROOT_COMMAND, reg32);
> > + }
> > +
> > + if (type == PCI_EXP_TYPE_RC_EC || type == PCI_EXP_TYPE_RC_END)
> > {
> > + rc = flr_on_rciep(dev);
> > + pci_info(dev, "has been reset (%d)\n", rc);
> > + } else {
> > rc = pci_bus_error_reset(dev);
> > - pci_info(dev, "Root Port link has been reset\n");
> > + pci_info(dev, "Root Port link has been reset (%d)\n",
> > rc);
> > }
> >
> > - /* Clear Root Error Status */
> > - pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, ®32);
> > - pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, reg32);
> > + if (aer) {
> > + /* Clear Root Error Status */
> > + pci_read_config_dword(root, aer + PCI_ERR_ROOT_STATUS,
> > ®32);
> > + pci_write_config_dword(root, aer + PCI_ERR_ROOT_STATUS,
> > reg32);
> >
> > - /* Enable Root Port's interrupt in response to error messages
> > */
> > - pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32);
> > - reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
> > - pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
> > + /* Enable Root Port's interrupt in response to error
> > messages */
> > + pci_read_config_dword(root, aer + PCI_ERR_ROOT_COMMAND,
> > ®32);
> > + reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
> > + pci_write_config_dword(root, aer +
> > PCI_ERR_ROOT_COMMAND, reg32);
> > + }
> >
> > return rc ? PCI_ERS_RESULT_DISCONNECT :
> > PCI_ERS_RESULT_RECOVERED;
> > }
> > diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> > index 79ae1356141d..08976034a89c 100644
> > --- a/drivers/pci/pcie/err.c
> > +++ b/drivers/pci/pcie/err.c
> > @@ -203,36 +203,19 @@ pci_ers_result_t pcie_do_recovery(struct
> > pci_dev
> > *dev,
> > */
> > if (type == PCI_EXP_TYPE_ROOT_PORT ||
> > type == PCI_EXP_TYPE_DOWNSTREAM ||
> > - type == PCI_EXP_TYPE_RC_EC)
> > + type == PCI_EXP_TYPE_RC_EC ||
> > + type == PCI_EXP_TYPE_RC_END)
> > bridge = dev;
> > - else if (type == PCI_EXP_TYPE_RC_END)
> > - bridge = dev->rcec;
> > else
> > bridge = pci_upstream_bridge(dev);
> >
> > pci_dbg(bridge, "broadcast error_detected message\n");
> > if (state == pci_channel_io_frozen) {
> > pci_walk_bridge(bridge, dev, report_frozen_detected,
> > &status);
>
> We’d need to walk back the dev addition to the pci_walk_bridge() and
> revert to prior to the checks on dev->rcec.
>
> > - if (type == PCI_EXP_TYPE_RC_END) {
> > - /*
> > - * The callback only clears the Root Error
> > Status
> > - * of the RCEC (see aer.c). Only perform this
> > for the
> > - * native case, i.e., an RCEC is present.
> > - */
> > - if (bridge)
> > - reset_subordinates(bridge);
> > -
> > - status = flr_on_rciep(dev);
> > - if (status != PCI_ERS_RESULT_RECOVERED) {
> > - pci_warn(dev, "Function Level Reset
> > failed\n");
> > - goto failed;
> > - }
> > - } else {
> > - status = reset_subordinates(bridge);
> > - if (status != PCI_ERS_RESULT_RECOVERED) {
> > - pci_warn(dev, "subordinate device reset
> > failed\n");
> > - goto failed;
> > - }
> > + status = reset_subordinates(bridge);
> > + if (status != PCI_ERS_RESULT_RECOVERED) {
> > + pci_warn(bridge, "subordinate device reset
> > failed\n");
> > + goto failed;
> > }
> > } else {
> > pci_walk_bridge(bridge, dev, report_normal_detected,
> > &status);
>
> So this is what I experimented with for the most part but with the
> changes that I highlighted - addressing non-native case in the
> callback
> and also in the bridge walk.
>
> I can test it out.
So I tested the following out, including your moving flr to aer.c:
- Renamed flr_on_rciep() to flr_on_rc() for RC devices (RC_END and
RC_EC)
- Moved check on dev->rcec into aer_root_reset() including the FLR.
- Reworked pci_walk_bridge() to drop extra dev argument and check
locally for the bridge->rcec. Maybe should also check on type when
checking on bridge->rcec.
Note I didn't use the check on aer_cap existence because I think you
had added that for simply being able to skip over for the non-native
case and I handle that with the single goto at the beginning which
takes you to the FLR.
So this is rough, compiled, tested with AER injections but that's it...
Please take a look.
Thanks,
Sean
drivers/pci/pcie/aer.c | 57 ++++++++++++++++++++++++++++++--------
drivers/pci/pcie/err.c | 62 ++++++++++++------------------------------
2 files changed, 63 insertions(+), 56 deletions(-)
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 3cde646f71c0..35b32309a97b 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1362,6 +1362,23 @@ static int aer_probe(struct pcie_device *dev)
return 0;
}
+/**
+ * flr_on_rc - function levl reset on root complex device
+ * @dev: pointer to root complex device's pci_dev data structure
+ *
+ * Invoked by aer_root_reset() when performing link reset.
+ */
+static pci_ers_result_t flr_on_rc(struct pci_dev *dev)
+{
+ if (!pcie_has_flr(dev))
+ return PCI_ERS_RESULT_DISCONNECT;
+
+ if (pcie_flr(dev))
+ return PCI_ERS_RESULT_DISCONNECT;
+
+ return PCI_ERS_RESULT_RECOVERED;
+}
+
/**
* aer_root_reset - reset link on Root Port
* @dev: pointer to Root Port's pci_dev data structure
@@ -1370,28 +1387,46 @@ static int aer_probe(struct pcie_device *dev)
*/
static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
{
+ int type = pci_pcie_type(dev);
+ struct pci_dev *root = dev;
int aer = dev->aer_cap;
int rc = 0;
u32 reg32;
+ if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END) {
+ /*
+ * The reset should only clear the Root Error Status
+ * of the RCEC. Only perform this for the
+ * native case, i.e., an RCEC is present.
+ */
+ if (!(dev->rcec))
+ goto non_native;
+ else
+ root = dev->rcec;
+ }
+
/* Disable Root's interrupt in response to error messages */
- pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32);
+ pci_read_config_dword(root, aer + PCI_ERR_ROOT_COMMAND,
®32);
reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
- pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
-
- if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC) {
- rc = pci_bus_error_reset(dev);
- pci_info(dev, "Root Port link has been reset\n");
- }
+ pci_write_config_dword(root, aer + PCI_ERR_ROOT_COMMAND,
reg32);
/* Clear Root Error Status */
- pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, ®32);
- pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, reg32);
+ pci_read_config_dword(root, aer + PCI_ERR_ROOT_STATUS, ®32);
+ pci_write_config_dword(root, aer + PCI_ERR_ROOT_STATUS, reg32);
/* Enable Root Port's interrupt in response to error messages
*/
- pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32);
+ pci_read_config_dword(root, aer + PCI_ERR_ROOT_COMMAND,
®32);
reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
- pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
+ pci_write_config_dword(root, aer + PCI_ERR_ROOT_COMMAND,
reg32);
+
+non_native:
+ if ((type == PCI_EXP_TYPE_RC_EC) || (type ==
PCI_EXP_TYPE_RC_END)) {
+ rc = flr_on_rc(root);
+ pci_info(dev, "has been reset (%d)\n", rc);
+ } else {
+ rc = pci_bus_error_reset(root);
+ pci_info(dev, "Root Port link has been reset (%d)\n",
rc);
+ }
return rc ? PCI_ERS_RESULT_DISCONNECT :
PCI_ERS_RESULT_RECOVERED;
}
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 956ad4c86d53..81bbf8a4006d 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -150,7 +150,6 @@ static int report_resume(struct pci_dev *dev, void
*data)
* pci_walk_bridge - walk bridges potentially AER affected
* @bridge bridge which may be an RCEC with associated RCiEPs,
* or a Port.
- * @dev an RCiEP lacking an associated RCEC.
* @cb callback to be called for each device found
* @userdata arbitrary pointer to be passed to callback.
*
@@ -161,7 +160,7 @@ static int report_resume(struct pci_dev *dev, void
*data)
* If the device provided has no subordinate bus, call the provided
* callback on the device itself.
*/
-static void pci_walk_bridge(struct pci_dev *bridge, struct pci_dev
*dev,
+static void pci_walk_bridge(struct pci_dev *bridge,
int (*cb)(struct pci_dev *, void *),
void *userdata)
{
@@ -169,23 +168,12 @@ static void pci_walk_bridge(struct pci_dev
*bridge, struct pci_dev *dev,
* In a non-native case where there is no OS-visible reporting
* device the bridge will be NULL, i.e., no RCEC, no PORT.
*/
- if (bridge && bridge->subordinate)
+ if (bridge->subordinate)
pci_walk_bus(bridge->subordinate, cb, userdata);
- else if (bridge)
- cb(bridge, userdata);
+ else if (bridge->rcec)
+ cb(bridge->rcec, userdata);
else
- cb(dev, userdata);
-}
-
-static pci_ers_result_t flr_on_rciep(struct pci_dev *dev)
-{
- if (!pcie_has_flr(dev))
- return PCI_ERS_RESULT_DISCONNECT;
-
- if (pcie_flr(dev))
- return PCI_ERS_RESULT_DISCONNECT;
-
- return PCI_ERS_RESULT_RECOVERED;
+ cb(bridge, userdata);
}
pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
@@ -204,45 +192,29 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev
*dev,
type = pci_pcie_type(dev);
if (type == PCI_EXP_TYPE_ROOT_PORT ||
type == PCI_EXP_TYPE_DOWNSTREAM ||
- type == PCI_EXP_TYPE_RC_EC)
+ type == PCI_EXP_TYPE_RC_EC ||
+ type == PCI_EXP_TYPE_RC_END)
bridge = dev;
- else if (type == PCI_EXP_TYPE_RC_END)
- bridge = dev->rcec;
else
bridge = pci_upstream_bridge(dev);
pci_dbg(dev, "broadcast error_detected message\n");
if (state == pci_channel_io_frozen) {
- pci_walk_bridge(bridge, dev, report_frozen_detected,
&status);
- if (type == PCI_EXP_TYPE_RC_END) {
- /*
- * The callback only clears the Root Error
Status
- * of the RCEC (see aer.c). Only perform this
for the
- * native case, i.e., an RCEC is present.
- */
- if (bridge)
- reset_subordinate_devices(bridge);
-
- status = flr_on_rciep(dev);
- if (status != PCI_ERS_RESULT_RECOVERED) {
- pci_warn(dev, "function level reset
failed\n");
- goto failed;
- }
- } else {
- status = reset_subordinate_devices(bridge);
- if (status != PCI_ERS_RESULT_RECOVERED) {
- pci_warn(dev, "subordinate device reset
failed\n");
- goto failed;
- }
+ pci_walk_bridge(bridge, report_frozen_detected,
&status);
+ status = reset_subordinate_devices(bridge);
+ if (status != PCI_ERS_RESULT_RECOVERED) {
+ pci_warn(dev, "subordinate device reset
failed\n");
+ goto failed;
}
+
} else {
- pci_walk_bridge(bridge, dev, report_normal_detected,
&status);
+ pci_walk_bridge(bridge, report_normal_detected,
&status);
}
if (status == PCI_ERS_RESULT_CAN_RECOVER) {
status = PCI_ERS_RESULT_RECOVERED;
pci_dbg(dev, "broadcast mmio_enabled message\n");
- pci_walk_bridge(bridge, dev, report_mmio_enabled,
&status);
+ pci_walk_bridge(bridge, report_mmio_enabled, &status);
}
if (status == PCI_ERS_RESULT_NEED_RESET) {
@@ -253,14 +225,14 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev
*dev,
*/
status = PCI_ERS_RESULT_RECOVERED;
pci_dbg(dev, "broadcast slot_reset message\n");
- pci_walk_bridge(bridge, dev, report_slot_reset,
&status);
+ pci_walk_bridge(bridge, report_slot_reset, &status);
}
if (status != PCI_ERS_RESULT_RECOVERED)
goto failed;
pci_dbg(dev, "broadcast resume message\n");
- pci_walk_bridge(bridge, dev, report_resume, &status);
+ pci_walk_bridge(bridge, report_resume, &status);
if (type == PCI_EXP_TYPE_ROOT_PORT ||
type == PCI_EXP_TYPE_DOWNSTREAM ||
--
2.28.0
>
> Thanks,
>
> Sean
On Fri, Oct 02, 2020 at 11:47:29AM -0700, Sean V Kelley wrote:
> From: Jonathan Cameron <[email protected]>
>
> Currently the kernel does not handle AER errors for Root Complex
> integrated End Points (RCiEPs)[0]. These devices sit on a root bus within
> the Root Complex (RC). AER handling is performed by a Root Complex Event
> Collector (RCEC) [1] which is a effectively a type of RCiEP on the same
> root bus.
>
> For an RCEC (technically not a Bridge), error messages "received" from
> associated RCiEPs must be enabled for "transmission" in order to cause a
> System Error via the Root Control register or (when the Advanced Error
> Reporting Capability is present) reporting via the Root Error Command
> register and logging in the Root Error Status register and Error Source
> Identification register.
>
> In addition to the defined OS level handling of the reset flow for the
> associated RCiEPs of an RCEC, it is possible to also have non-native
> handling. In that case there is no need to take any actions on the RCEC
> because the firmware is responsible for them. This is true where APEI [2]
> is used to report the AER errors via a GHES[v2] HEST entry [3] and
> relevant AER CPER record [4] and non-native handling is in use.
>
> We effectively end up with two different types of discovery for
> purposes of handling AER errors:
>
> 1) Normal bus walk - we pass the downstream port above a bus to which
> the device is attached and it walks everything below that point.
>
> 2) An RCiEP with no visible association with an RCEC as there is no need
> to walk devices. In that case, the flow is to just call the callbacks for
> the actual device, which in turn references its associated RCEC.
>
> Modify pci_walk_bridge() to handle devices which lack a subordinate bus.
> If the device does not then it will call the function on that device
> alone.
>
> [0] ACPI PCI Express Base Specification 5.0-1 1.3.2.3 Root Complex
> Integrated Endpoint Rules.
> [1] ACPI PCI Express Base Specification 5.0-1 6.2 Error Signalling and
> Logging
> [2] ACPI Specification 6.3 Chapter 18 ACPI Platform Error Interface (APEI)
> [3] ACPI Specification 6.3 18.2.3.7 Generic Hardware Error Source
> [4] UEFI Specification 2.8, N.2.7 PCI Express Error Section
>
> Signed-off-by: Jonathan Cameron <[email protected]>
> Signed-off-by: Sean V Kelley <[email protected]>
> ---
> drivers/pci/pcie/err.c | 25 ++++++++++++++++++++-----
> 1 file changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> index 5ff1afa4763d..c4ceca42a3bf 100644
> --- a/drivers/pci/pcie/err.c
> +++ b/drivers/pci/pcie/err.c
> @@ -148,19 +148,25 @@ static int report_resume(struct pci_dev *dev, void *data)
>
> /**
> * pci_walk_bridge - walk bridges potentially AER affected
> - * @bridge bridge which may be a Port.
> + * @bridge bridge which may be an RCEC with associated RCiEPs,
> + * an RCiEP associated with an RCEC, or a Port.
> * @cb callback to be called for each device found
> * @userdata arbitrary pointer to be passed to callback.
> *
> * If the device provided is a bridge, walk the subordinate bus,
> * including any bridged devices on buses under this bus.
> * Call the provided callback on each device found.
> + *
> + * If the device provided has no subordinate bus, call the provided
> + * callback on the device itself.
> */
> static void pci_walk_bridge(struct pci_dev *bridge, int (*cb)(struct pci_dev *, void *),
> void *userdata)
> {
> if (bridge->subordinate)
> pci_walk_bus(bridge->subordinate, cb, userdata);
> + else
> + cb(bridge, userdata);
> }
>
> pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> @@ -174,11 +180,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> /*
> * Error recovery runs on all subordinates of the first downstream
> * bridge. If the downstream bridge detected the error, it is
> - * cleared at the end.
> + * cleared at the end. For RCiEPs we should reset just the RCiEP itself.
> */
> type = pci_pcie_type(dev);
> if (type == PCI_EXP_TYPE_ROOT_PORT ||
> - type == PCI_EXP_TYPE_DOWNSTREAM)
> + type == PCI_EXP_TYPE_DOWNSTREAM ||
> + type == PCI_EXP_TYPE_RC_EC ||
What is the case where an RCEC is passed to pcie_do_recovery()? I
guess it's the case where an RCEC is reporting an error that it logged
itself, i.e., no RCiEP is involved at all? In that case I guess we
should try an FLR on the RCEC and clear its status?
(I don't think the current series attempts the FLR.)
> + type == PCI_EXP_TYPE_RC_END)
> bridge = dev;
> else
> bridge = pci_upstream_bridge(dev);
> @@ -186,7 +194,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> pci_dbg(dev, "broadcast error_detected message\n");
> if (state == pci_channel_io_frozen) {
> pci_walk_bridge(bridge, report_frozen_detected, &status);
> - status = reset_subordinate_device(bridge);
> + if (type == PCI_EXP_TYPE_RC_END) {
> + pci_warn(dev, "subordinate device reset not possible for RCiEP\n");
> + status = PCI_ERS_RESULT_NONE;
> + goto failed;
> + }
> +
> + status = reset_subordinate_devices(bridge);
> if (status != PCI_ERS_RESULT_RECOVERED) {
> pci_warn(dev, "subordinate device reset failed\n");
> goto failed;
> @@ -219,7 +233,8 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> pci_walk_bridge(bridge, report_resume, &status);
>
> if (type == PCI_EXP_TYPE_ROOT_PORT ||
> - type == PCI_EXP_TYPE_DOWNSTREAM) {
> + type == PCI_EXP_TYPE_DOWNSTREAM ||
> + type == PCI_EXP_TYPE_RC_EC) {
> if (pcie_aer_is_native(bridge))
> pcie_clear_device_status(bridge);
> pci_aer_clear_nonfatal_status(bridge);
> --
> 2.28.0
>
On 9 Oct 2020, at 14:30, Bjorn Helgaas wrote:
> On Fri, Oct 09, 2020 at 12:57:45PM -0500, Bjorn Helgaas wrote:
>> On Fri, Oct 02, 2020 at 11:47:32AM -0700, Sean V Kelley wrote:
>>> From: Qiuxu Zhuo <[email protected]>
>>>
>>> When attempting error recovery for an RCiEP associated with an RCEC
>>> device,
>>> there needs to be a way to update the Root Error Status, the
>>> Uncorrectable
>>> Error Status and the Uncorrectable Error Severity of the parent
>>> RCEC.
>>> In some non-native cases in which there is no OS visible device
>>> associated with the RCiEP, there is nothing to act upon as the
>>> firmware
>>> is acting before the OS. So add handling for the linked 'rcec' in
>>> AER/ERR
>>> while taking into account non-native cases.
>>>
>>> Co-developed-by: Sean V Kelley <[email protected]>
>>> Signed-off-by: Sean V Kelley <[email protected]>
>>> Signed-off-by: Qiuxu Zhuo <[email protected]>
>>> Reviewed-by: Jonathan Cameron <[email protected]>
>>> ---
>>> drivers/pci/pcie/aer.c | 9 +++++----
>>> drivers/pci/pcie/err.c | 39 ++++++++++++++++++++++++++++-----------
>>> 2 files changed, 33 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>> index 65dff5f3457a..dccdba60b5d9 100644
>>> --- a/drivers/pci/pcie/aer.c
>>> +++ b/drivers/pci/pcie/aer.c
>>> @@ -1358,17 +1358,18 @@ static int aer_probe(struct pcie_device
>>> *dev)
>>> static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
>>> {
>>> int aer = dev->aer_cap;
>>> + int rc = 0;
>>> u32 reg32;
>>> - int rc;
>>> -
>>>
>>> /* Disable Root's interrupt in response to error messages */
>>> pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32);
>>> reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
>>> pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
>>>
>>> - rc = pci_bus_error_reset(dev);
>>> - pci_info(dev, "Root Port link has been reset\n");
>>> + if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC) {
>>> + rc = pci_bus_error_reset(dev);
>>> + pci_info(dev, "Root Port link has been reset\n");
>>> + }
>>>
>>> /* Clear Root Error Status */
>>> pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, ®32);
>>> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
>>> index 38abd7984996..956ad4c86d53 100644
>>> --- a/drivers/pci/pcie/err.c
>>> +++ b/drivers/pci/pcie/err.c
>>> @@ -149,7 +149,8 @@ static int report_resume(struct pci_dev *dev,
>>> void *data)
>>> /**
>>> * pci_walk_bridge - walk bridges potentially AER affected
>>> * @bridge bridge which may be an RCEC with associated RCiEPs,
>>> - * an RCiEP associated with an RCEC, or a Port.
>>> + * or a Port.
>>> + * @dev an RCiEP lacking an associated RCEC.
>>> * @cb callback to be called for each device found
>>> * @userdata arbitrary pointer to be passed to callback.
>>> *
>>> @@ -160,13 +161,20 @@ static int report_resume(struct pci_dev *dev,
>>> void *data)
>>> * If the device provided has no subordinate bus, call the provided
>>> * callback on the device itself.
>>> */
>>> -static void pci_walk_bridge(struct pci_dev *bridge, int
>>> (*cb)(struct pci_dev *, void *),
>>> +static void pci_walk_bridge(struct pci_dev *bridge, struct pci_dev
>>> *dev,
>>> + int (*cb)(struct pci_dev *, void *),
>>> void *userdata)
>>> {
>>> - if (bridge->subordinate)
>>> + /*
>>> + * In a non-native case where there is no OS-visible reporting
>>> + * device the bridge will be NULL, i.e., no RCEC, no PORT.
>>> + */
>>> + if (bridge && bridge->subordinate)
>>> pci_walk_bus(bridge->subordinate, cb, userdata);
>>> - else
>>> + else if (bridge)
>>> cb(bridge, userdata);
>>> + else
>>> + cb(dev, userdata);
>>> }
>>>
>>> static pci_ers_result_t flr_on_rciep(struct pci_dev *dev)
>>> @@ -196,16 +204,25 @@ pci_ers_result_t pcie_do_recovery(struct
>>> pci_dev *dev,
>>> type = pci_pcie_type(dev);
>>> if (type == PCI_EXP_TYPE_ROOT_PORT ||
>>> type == PCI_EXP_TYPE_DOWNSTREAM ||
>>> - type == PCI_EXP_TYPE_RC_EC ||
>>> - type == PCI_EXP_TYPE_RC_END)
>>> + type == PCI_EXP_TYPE_RC_EC)
>>> bridge = dev;
>>> + else if (type == PCI_EXP_TYPE_RC_END)
>>> + bridge = dev->rcec;
>>> else
>>> bridge = pci_upstream_bridge(dev);
>>>
>>> pci_dbg(dev, "broadcast error_detected message\n");
>>> if (state == pci_channel_io_frozen) {
>>> - pci_walk_bridge(bridge, report_frozen_detected, &status);
>>> + pci_walk_bridge(bridge, dev, report_frozen_detected, &status);
>>> if (type == PCI_EXP_TYPE_RC_END) {
>>> + /*
>>> + * The callback only clears the Root Error Status
>>> + * of the RCEC (see aer.c). Only perform this for the
>>> + * native case, i.e., an RCEC is present.
>>> + */
>>> + if (bridge)
>>> + reset_subordinate_devices(bridge);
>>
>> Help me understand this. There are lots of callbacks in this
>> picture,
>> but I guess this "callback only clears Root Error Status" must refer
>> to aer_root_reset(), i.e., the reset_subordinate_devices pointer?
>>
>> Of course, the caller of pcie_do_recovery() supplied that pointer.
>> And we can infer that it must be aer_root_reset(), not
>> dpc_reset_link(), because RCECs and RCiEPs are not allowed to
>> implement DPC.
>>
>> I wish we didn't have either this assumption about what
>> reset_subordinate_devices points to, or the assumption about what
>> aer_root_reset() does. They both seem a little bit tenuous.
>>
>> We already made aer_root_reset() smart enough to check for RCECs.
>> Can
>> we put the FLR there, too? Then we wouldn't have this weird
>> situation
>> where reset_subordinate_devices() does a reset and clears error
>> status, EXCEPT for this case where it only clears error status and we
>> do the reset here?
>
> Just as an example to make this concrete. Doesn't even compile.
Okay, I tried something similar. Comments below:
>
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index d6927e6535e5..e389db7cbba6 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -1372,28 +1372,45 @@ static int aer_probe(struct pcie_device *dev)
> */
> static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
> {
> - int aer = dev->aer_cap;
> + int type = pci_pcie_type(dev);
> + struct pci_dev *root;
> + int aer = 0;
> int rc = 0;
> u32 reg32;
>
> - /* Disable Root's interrupt in response to error messages */
> - pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32);
> - reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
> - pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
> + if (type == PCI_EXP_TYPE_RC_END)
> + root = dev->rcec;
> + else
> + root = dev;
> +
> + if (root)
> + aer = root->aer_cap;
Except here dev->rcec may be NULL for the non-native case. So I had a
goto label to bypass what follows. I left the FLR in err.c.
>
> - if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC) {
> + if (aer) {
> + /* Disable Root's interrupt in response to error messages */
> + pci_read_config_dword(root, aer + PCI_ERR_ROOT_COMMAND, ®32);
> + reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
> + pci_write_config_dword(root, aer + PCI_ERR_ROOT_COMMAND, reg32);
> + }
> +
> + if (type == PCI_EXP_TYPE_RC_EC || type == PCI_EXP_TYPE_RC_END) {
> + rc = flr_on_rciep(dev);
> + pci_info(dev, "has been reset (%d)\n", rc);
> + } else {
> rc = pci_bus_error_reset(dev);
> - pci_info(dev, "Root Port link has been reset\n");
> + pci_info(dev, "Root Port link has been reset (%d)\n", rc);
> }
>
> - /* Clear Root Error Status */
> - pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, ®32);
> - pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, reg32);
> + if (aer) {
> + /* Clear Root Error Status */
> + pci_read_config_dword(root, aer + PCI_ERR_ROOT_STATUS, ®32);
> + pci_write_config_dword(root, aer + PCI_ERR_ROOT_STATUS, reg32);
>
> - /* Enable Root Port's interrupt in response to error messages */
> - pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, ®32);
> - reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
> - pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
> + /* Enable Root Port's interrupt in response to error messages */
> + pci_read_config_dword(root, aer + PCI_ERR_ROOT_COMMAND, ®32);
> + reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
> + pci_write_config_dword(root, aer + PCI_ERR_ROOT_COMMAND, reg32);
> + }
>
> return rc ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
> }
> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> index 79ae1356141d..08976034a89c 100644
> --- a/drivers/pci/pcie/err.c
> +++ b/drivers/pci/pcie/err.c
> @@ -203,36 +203,19 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev
> *dev,
> */
> if (type == PCI_EXP_TYPE_ROOT_PORT ||
> type == PCI_EXP_TYPE_DOWNSTREAM ||
> - type == PCI_EXP_TYPE_RC_EC)
> + type == PCI_EXP_TYPE_RC_EC ||
> + type == PCI_EXP_TYPE_RC_END)
> bridge = dev;
> - else if (type == PCI_EXP_TYPE_RC_END)
> - bridge = dev->rcec;
> else
> bridge = pci_upstream_bridge(dev);
>
> pci_dbg(bridge, "broadcast error_detected message\n");
> if (state == pci_channel_io_frozen) {
> pci_walk_bridge(bridge, dev, report_frozen_detected, &status);
We’d need to walk back the dev addition to the pci_walk_bridge() and
revert to prior to the checks on dev->rcec.
> - if (type == PCI_EXP_TYPE_RC_END) {
> - /*
> - * The callback only clears the Root Error Status
> - * of the RCEC (see aer.c). Only perform this for the
> - * native case, i.e., an RCEC is present.
> - */
> - if (bridge)
> - reset_subordinates(bridge);
> -
> - status = flr_on_rciep(dev);
> - if (status != PCI_ERS_RESULT_RECOVERED) {
> - pci_warn(dev, "Function Level Reset failed\n");
> - goto failed;
> - }
> - } else {
> - status = reset_subordinates(bridge);
> - if (status != PCI_ERS_RESULT_RECOVERED) {
> - pci_warn(dev, "subordinate device reset failed\n");
> - goto failed;
> - }
> + status = reset_subordinates(bridge);
> + if (status != PCI_ERS_RESULT_RECOVERED) {
> + pci_warn(bridge, "subordinate device reset failed\n");
> + goto failed;
> }
> } else {
> pci_walk_bridge(bridge, dev, report_normal_detected, &status);
So this is what I experimented with for the most part but with the
changes that I highlighted - addressing non-native case in the callback
and also in the bridge walk.
I can test it out.
Thanks,
Sean
On Fri, Oct 09, 2020 at 11:51:39PM +0000, Kelley, Sean V wrote:
> On Fri, 2020-10-09 at 15:07 -0700, Sean V Kelley wrote:
> So I tested the following out, including your moving flr to aer.c:
>
> - Renamed flr_on_rciep() to flr_on_rc() for RC devices (RC_END and
> RC_EC)
>
> - Moved check on dev->rcec into aer_root_reset() including the FLR.
>
> - Reworked pci_walk_bridge() to drop extra dev argument and check
> locally for the bridge->rcec. Maybe should also check on type when
> checking on bridge->rcec.
>
> Note I didn't use the check on aer_cap existence because I think you
> had added that for simply being able to skip over for the non-native
> case and I handle that with the single goto at the beginning which
> takes you to the FLR.
Right. Well, my thinking was that "root" would be a device with the
AER Root Error Command and Root Error Status registers, i.e., a Root
Port or RCEC. IIUC that basically means the APEI case where firmware
gives us an error record.
Isn't the existing v5.9 code buggy in that it unconditionally pokes
these registers? I think the APEI path can end up here, and firmware
probably has not granted us control over AER.
Somewhat related question: I'm a little skeptical about the fact that
aer_root_reset() currently does:
- clear ROOT_PORT_INTR_ON_MESG_MASK
- do reset
- clear PCI_ERR_ROOT_STATUS
- enable ROOT_PORT_INTR_ON_MESG_MASK
In the APEI path all this AER register manipulation must be done by
firmware before passing the error record to the OS. So in the native
case where the OS does own the AER registers, why can't the OS do that
manipulation in the same order, i.e., all before doing the reset?
> So this is rough, compiled, tested with AER injections but that's it...
I couldn't actually apply the patch below because it seems to be
whitespace-damaged, but I think I like it.
- It would be nice to be able to just call pcie_flr() and not have
to add flr_on_rc(). I can't remember why we need the
pcie_has_flr() / pcie_flr() dance. It seems racy and ugly, but I
have a vague recollection that there actually is some reason for
it.
- I would *rather* consolidate the AER register updates and test for
the non-native case once instead of treating it like a completely
separate path with a "goto". But maybe not possible. Not a big
deal either way.
- Getting rid of the extra "dev" argument to pci_walk_bridge() is a
great side-effect. I didn't even notice that.
- If we can simplify that "state == pci_channel_io_frozen" case as
this does, that is a *big* deal because there are other patches
just waiting to touch that reset and it will be much simpler if
there's only one reset_subordinate_devices() call there.
If you do work this up, I'd really appreciate it if you can start with
my pci/err branch so I don't have to re-do all the tweaks I've already
done:
https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?h=pci/err
Bjorn
On Mon, 2020-10-12 at 17:58 -0500, Bjorn Helgaas wrote:
> On Fri, Oct 09, 2020 at 11:51:39PM +0000, Kelley, Sean V wrote:
> > On Fri, 2020-10-09 at 15:07 -0700, Sean V Kelley wrote:
> > So I tested the following out, including your moving flr to aer.c:
> >
> > - Renamed flr_on_rciep() to flr_on_rc() for RC devices (RC_END and
> > RC_EC)
> >
> > - Moved check on dev->rcec into aer_root_reset() including the FLR.
> >
> > - Reworked pci_walk_bridge() to drop extra dev argument and check
> > locally for the bridge->rcec. Maybe should also check on type when
> > checking on bridge->rcec.
> >
> > Note I didn't use the check on aer_cap existence because I think
> > you
> > had added that for simply being able to skip over for the non-
> > native
> > case and I handle that with the single goto at the beginning which
> > takes you to the FLR.
>
> Right. Well, my thinking was that "root" would be a device with the
> AER Root Error Command and Root Error Status registers, i.e., a Root
> Port or RCEC. IIUC that basically means the APEI case where firmware
> gives us an error record.
Got it.
>
> Isn't the existing v5.9 code buggy in that it unconditionally pokes
> these registers? I think the APEI path can end up here, and firmware
> probably has not granted us control over AER.
Yes, APEI path can end up here even in the absence of AER control.
>
> Somewhat related question: I'm a little skeptical about the fact that
> aer_root_reset() currently does:
>
> - clear ROOT_PORT_INTR_ON_MESG_MASK
> - do reset
> - clear PCI_ERR_ROOT_STATUS
> - enable ROOT_PORT_INTR_ON_MESG_MASK
It's a bit of a mix and growing with RC_END handling.
>
> In the APEI path all this AER register manipulation must be done by
> firmware before passing the error record to the OS. So in the native
> case where the OS does own the AER registers, why can't the OS do
> that
> manipulation in the same order, i.e., all before doing the reset?
And you're right, the mix here imposes additional complexity for native
versus non-native. If you're not actively engaged with the code, it's
not obvious. So, yes moving it out would make more sense.
>
> > So this is rough, compiled, tested with AER injections but that's
> > it...
>
> I couldn't actually apply the patch below because it seems to be
> whitespace-damaged, but I think I like it.
Yes, it was a quick copy-paste to an existing email. Will work with
your branch.
>
> - It would be nice to be able to just call pcie_flr() and not have
> to add flr_on_rc(). I can't remember why we need the
> pcie_has_flr() / pcie_flr() dance. It seems racy and ugly, but I
> have a vague recollection that there actually is some reason for
> it.
I'll have a look.
>
> - I would *rather* consolidate the AER register updates and test
> for
> the non-native case once instead of treating it like a completely
> separate path with a "goto". But maybe not possible. Not a big
> deal either way.
Following your line of reasoning above, I think we can better
consolidate the AER register updates.
>
> - Getting rid of the extra "dev" argument to pci_walk_bridge() is a
> great side-effect. I didn't even notice that.
>
> - If we can simplify that "state == pci_channel_io_frozen" case as
> this does, that is a *big* deal because there are other patches
> just waiting to touch that reset and it will be much simpler if
> there's only one reset_subordinate_devices() call there.
Agreed.
>
> If you do work this up, I'd really appreciate it if you can start
> with
> my pci/err branch so I don't have to re-do all the tweaks I've
> already
> done:
>
>
> https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?h=pci/err
>
Will do.
Thanks,
Sean
> Bjorn