2024-01-25 09:41:08

by Wang, Qingshun

[permalink] [raw]
Subject: [PATCH v2 0/4] PCI/AER: Handle Advisory Non-Fatal properly

According to PCIe Base Specification Revision 6.1, Sections 6.2.3.4
and 6.2.4.3, certain uncorrectable errors will signal ERR_COR instead
of ERR_NONFATAL, logged as Advisory Non-Fatal Error, and set bits in
both Correctable Error Status register and Uncorrectable Error Status
register. Currently, when handling AER events the kernel will only look
at CE status or UE status, but never both. In the Advisory
Non-Fatal Error case, bits set in the UE status register will not be
reported and cleared until the next Fatal/Non-Fatal error arrives.

For instance, previously, when the kernel receives an ANFE with Poisoned
TLP in OS native AER mode, only the status of CE will be reported and
cleared:

AER: Corrected error received: 0000:b7:02.0
PCIe Bus Error: severity=Corrected, type=Transaction Layer, (Receiver ID)
device [8086:0db0] error status/mask=00002000/00000000
[13] NonFatalErr

If the kernel receives a Malformed TLP after that, two UE will be
reported, which is unexpected. The Malformed TLP Header was lost since
the previous ANFE gated the TLP header logs:

PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, (Receiver ID)
device [8086:0db0] error status/mask=00041000/00180020
[12] TLP (First)
[18] MalfTLP

To handle this case properly, add additional fields in aer_err_info to
track both CE and UE status/mask, UE severity, and Device Status.
Use this information to determine the status bits that need to be cleared.

Now, in the previous scenario, both CE status and related UE status will
be reported and cleared after ANFE:

AER: Corrected error received: 0000:b7:02.0
PCIe Bus Error: severity=Corrected, type=Transaction Layer, (Receiver ID)
device [8086:0db0] error status/mask=00002000/00000000
[13] NonFatalErr
Uncorrectable errors that may cause Advisory Non-Fatal:
[18] TLP

Additionally, add more data to aer_event tracepoint, which would help
to better understand ANFE and other errors for external observation.

Note:
checkpatch.pl will produce following errors and warnings on PATCH 4:

ERROR: space prohibited after that open parenthesis '('
#103: FILE: include/ras/ras_event.h:319:
+ __field( u16, link_status )

...similar errors omitted...

WARNING: quoted string split across lines
#126: FILE: include/ras/ras_event.h:342:
+ TP_printk("%s PCIe Bus Error: severity=%s, %s, TLP Header=%s, "
+ "Correctable Error Status=0x%08x, "

...similar warnings omitted...

For readability reasons, these errors and warnings are not fixed,
following the code style of existing examples in the kernel source tree.

Change log:
v2:
- Reference to the latest PCIe Specification in both commit messages
and comments, as suggested by Bjorn Helgaas.
- Describe the reason for storing additional information in
aer_err_info in the commit message of PATCH 1, as suggested by Bjorn
Helgaas.
- Add more details of behavior changes in the commit message of PATCH
2, as suggested by Bjorn Helgaas.

v1: https://lore.kernel.org/linux-pci/[email protected]/

Wang, Qingshun (4):
PCI/AER: Store more information in aer_err_info
PCI/AER: Handle Advisory Non-Fatal properly
PCI/AER: Fetch information for FTrace
RAS: Trace more information in aer_event

drivers/acpi/apei/ghes.c | 16 ++-
drivers/cxl/core/pci.c | 15 ++-
drivers/pci/pci.h | 12 ++-
drivers/pci/pcie/aer.c | 191 +++++++++++++++++++++++++++-------
include/linux/aer.h | 6 +-
include/linux/pci.h | 27 +++++
include/ras/ras_event.h | 48 ++++++++-
include/uapi/linux/pci_regs.h | 1 +
8 files changed, 269 insertions(+), 47 deletions(-)


base-commit: 6613476e225e090cc9aad49be7fa504e290dd33d
--
2.42.0



2024-01-25 10:23:39

by Wang, Qingshun

[permalink] [raw]
Subject: [PATCH v2 2/4] PCI/AER: Handle Advisory Non-Fatal properly

When processing an Advisory Non-Fatal error, ideally both correctable
error status and uncorrectable error status should be cleared. However,
there is no way to fully identify the UE associated with ANFE. Even
worse, a Fatal/Non-Fatal error may set the same UE status bit as ANFE.
Assuming an ANFE is FE/NFE is kind of bad, but assuming a FE/NFE is an
ANFE is usually unacceptable. To avoid clearing UEs that are not ANFE by
accident, the most conservative route is taken here: If any of the
Fatal/Non-Fatal Error Detected bits is set in Device Status, do not
touch UE status, they should be cleared later by the UE handler.
Otherwise, a specific set of UEs that may be raised as ANFE according to
the PCIe specification will be cleared if their corresponding severity
is non-fatal. Additionally, log UEs that will be cleared.

For instance, previously when kernel receives an ANFE with Poisoned TLP
in OS native AER mode, only status of CE will be reported and cleared:

AER: Corrected error received: 0000:b7:02.0
PCIe Bus Error: severity=Corrected, type=Transaction Layer, (Receiver ID)
device [8086:0db0] error status/mask=00002000/00000000
[13] NonFatalErr

If the kernel receives a Malformed TLP after that, two UE will be
reported, which is unexpected. Malformed TLP Header was lost since
the previous ANF gated the TLP header logs:

PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, (Receiver ID)
device [8086:0db0] error status/mask=00041000/00180020
[12] TLP (First)
[18] MalfTLP

Now, in the same scenario, both CE status and related UE status will be
reported and cleared after ANFE:

AER: Corrected error received: 0000:b7:02.0
PCIe Bus Error: severity=Corrected, type=Transaction Layer, (Receiver ID)
device [8086:0db0] error status/mask=00002000/00000000
[13] NonFatalErr
Uncorrectable errors that may cause Advisory Non-Fatal:
[18] TLP

Signed-off-by: "Wang, Qingshun" <[email protected]>
---
drivers/pci/pcie/aer.c | 61 +++++++++++++++++++++++++++++++++++++++++-
1 file changed, 60 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 6583dcf50977..713cbf625d3f 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -107,6 +107,12 @@ struct aer_stats {
PCI_ERR_ROOT_MULTI_COR_RCV | \
PCI_ERR_ROOT_MULTI_UNCOR_RCV)

+#define AER_ERR_ANFE_UNC_MASK (PCI_ERR_UNC_POISON_TLP | \
+ PCI_ERR_UNC_COMP_TIME | \
+ PCI_ERR_UNC_COMP_ABORT | \
+ PCI_ERR_UNC_UNX_COMP | \
+ PCI_ERR_UNC_UNSUP)
+
static int pcie_aer_disable;
static pci_ers_result_t aer_root_reset(struct pci_dev *dev);

@@ -612,6 +618,32 @@ const struct attribute_group aer_stats_attr_group = {
.is_visible = aer_stats_attrs_are_visible,
};

+static int anfe_get_related_err(struct aer_err_info *info)
+{
+ /*
+ * Take the most conservative route here. If there are
+ * Non-Fatal/Fatal errors detected, do not assume any
+ * bit in uncor_status is set by ANFE.
+ */
+ if (info->device_status & (PCI_EXP_DEVSTA_NFED | PCI_EXP_DEVSTA_FED))
+ return 0;
+ /*
+ * According to PCIe Base Specification Revision 6.1,
+ * Section 6.2.3.2.4, if an UNCOR error is rasied as
+ * Advisory Non-Fatal error, it will match the following
+ * conditions:
+ * a. The severity of the error is Non-Fatal.
+ * b. The error is one of the following:
+ * 1. Poisoned TLP
+ * 2. Completion Timeout
+ * 3. Completer Abort
+ * 4. Unexpected Completion
+ * 5. Unsupported Request
+ */
+ return info->uncor_status & ~info->uncor_mask
+ & AER_ERR_ANFE_UNC_MASK & ~info->severity;
+}
+
static void pci_dev_aer_stats_incr(struct pci_dev *pdev,
struct aer_err_info *info)
{
@@ -678,6 +710,7 @@ static void __aer_print_error(struct pci_dev *dev,
struct aer_err_info *info)
{
unsigned long status;
+ unsigned long anfe_status;
const char **strings;
const char *level, *errmsg;
int i;
@@ -700,6 +733,21 @@ static void __aer_print_error(struct pci_dev *dev,
pci_printk(level, dev, " [%2d] %-22s%s\n", i, errmsg,
info->first_error == i ? " (First)" : "");
}
+
+ if (info->severity == AER_CORRECTABLE && (status & PCI_ERR_COR_ADV_NFAT)) {
+ anfe_status = anfe_get_related_err(info);
+ if (anfe_status) {
+ pci_printk(level, dev, "Uncorrectable errors that may cause Advisory Non-Fatal:");
+ for_each_set_bit(i, &anfe_status, 32) {
+ errmsg = aer_uncorrectable_error_string[i];
+ if (!errmsg)
+ errmsg = "Unknown Error Bit";
+
+ pci_printk(level, dev, " [%2d] %-22s\n", i, errmsg);
+ }
+ }
+ }
+
pci_dev_aer_stats_incr(dev, info);
}

@@ -1097,6 +1145,14 @@ static inline void cxl_rch_handle_error(struct pci_dev *dev,
struct aer_err_info *info) { }
#endif

+static void handle_advisory_nonfatal(struct pci_dev *dev, struct aer_err_info *info)
+{
+ int aer = dev->aer_cap;
+
+ pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS,
+ anfe_get_related_err(info));
+}
+
/**
* pci_aer_handle_error - handle logging error into an event log
* @dev: pointer to pci_dev data structure of error source device
@@ -1113,9 +1169,12 @@ static void pci_aer_handle_error(struct pci_dev *dev, struct aer_err_info *info)
* Correctable error does not need software intervention.
* No need to go through error recovery process.
*/
- if (aer)
+ if (aer) {
pci_write_config_dword(dev, aer + PCI_ERR_COR_STATUS,
info->cor_status);
+ if (info->cor_status & PCI_ERR_COR_ADV_NFAT)
+ handle_advisory_nonfatal(dev, info);
+ }
if (pcie_aer_is_native(dev)) {
struct pci_driver *pdrv = dev->driver;

--
2.42.0


2024-01-25 11:21:10

by Wang, Qingshun

[permalink] [raw]
Subject: [PATCH v2 4/4] RAS: Trace more information in aer_event

Add following fields in aer_event to better understand Advisory
Non-Fatal and other errors for external observation:

- cor_status (Correctable Error Status)
- cor_mask (Correctable Error Mask)
- uncor_status (Uncorrectable Error Status)
- uncor_severity (Uncorrectable Error Severity)
- uncor_mask (Uncorrectable Error Mask)
- aer_cap_ctrl (AER Capabilities and Control)
- link_status (Link Status)
- device_status (Device Status)
- device_control_2 (Device Control 2)

In addition to the raw register value, value of following fields are
extracted and logged for better observability:

- "First Error Pointer" and "Completion Timeout Prefix/Header Log
Capable" from "AER Capabilities and Control"
- "Completion Timeout Value" and "Completion Timeout Disable"
from "Device Control 2"

Signed-off-by: "Wang, Qingshun" <[email protected]>
---
drivers/pci/pcie/aer.c | 17 +++++++++++--
include/ras/ras_event.h | 48 ++++++++++++++++++++++++++++++++---
include/uapi/linux/pci_regs.h | 1 +
3 files changed, 60 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index eec3406f727a..2f5639f6c40f 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -757,6 +757,7 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
int layer, agent;
int id = pci_dev_id(dev);
const char *level;
+ struct aer_capability_regs aer_caps;

if (info->severity == AER_CORRECTABLE) {
status = info->cor_status;
@@ -793,8 +794,18 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
if (info->id && info->error_dev_num > 1 && info->id == id)
pci_err(dev, " Error of this Agent is reported first\n");

+ aer_caps = (struct aer_capability_regs) {
+ .cor_status = info->cor_status,
+ .cor_mask = info->cor_mask,
+ .uncor_status = info->uncor_status,
+ .uncor_severity = info->uncor_severity,
+ .uncor_mask = info->uncor_mask,
+ .cap_control = info->aer_cap_ctrl
+ };
trace_aer_event(dev_name(&dev->dev), (status & ~mask),
- info->severity, info->tlp_header_valid, &info->tlp);
+ info->severity, info->tlp_header_valid, &info->tlp,
+ &aer_caps, info->link_status,
+ info->device_status, info->device_control_2);
}

static void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
@@ -870,7 +881,9 @@ void pci_print_aer(struct pci_dev *dev, int aer_severity,
__print_tlp_header(dev, &aer->header_log);

trace_aer_event(dev_name(&dev->dev), (status & ~mask),
- aer_severity, tlp_header_valid, &aer->header_log);
+ aer_severity, tlp_header_valid, &aer->header_log,
+ aer, info.link_status,
+ info.device_status, info.device_control_2);
}
EXPORT_SYMBOL_NS_GPL(pci_print_aer, CXL);

diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
index cbd3ddd7c33d..a94997073d90 100644
--- a/include/ras/ras_event.h
+++ b/include/ras/ras_event.h
@@ -300,9 +300,14 @@ TRACE_EVENT(aer_event,
const u32 status,
const u8 severity,
const u8 tlp_header_valid,
- struct aer_header_log_regs *tlp),
+ struct aer_header_log_regs *tlp,
+ struct aer_capability_regs *aer_caps,
+ const u16 link_status,
+ const u16 device_status,
+ const u16 device_control_2),

- TP_ARGS(dev_name, status, severity, tlp_header_valid, tlp),
+ TP_ARGS(dev_name, status, severity, tlp_header_valid, tlp,
+ aer_caps, link_status, device_status, device_control_2),

TP_STRUCT__entry(
__string( dev_name, dev_name )
@@ -310,6 +315,10 @@ TRACE_EVENT(aer_event,
__field( u8, severity )
__field( u8, tlp_header_valid)
__array( u32, tlp_header, 4 )
+ __field_struct(struct aer_capability_regs, aer_caps)
+ __field( u16, link_status )
+ __field( u16, device_status )
+ __field( u16, device_control_2)
),

TP_fast_assign(
@@ -317,6 +326,10 @@ TRACE_EVENT(aer_event,
__entry->status = status;
__entry->severity = severity;
__entry->tlp_header_valid = tlp_header_valid;
+ __entry->aer_caps = *aer_caps;
+ __entry->link_status = link_status;
+ __entry->device_status = device_status;
+ __entry->device_control_2 = device_control_2;
if (tlp_header_valid) {
__entry->tlp_header[0] = tlp->dw0;
__entry->tlp_header[1] = tlp->dw1;
@@ -325,7 +338,20 @@ TRACE_EVENT(aer_event,
}
),

- TP_printk("%s PCIe Bus Error: severity=%s, %s, TLP Header=%s\n",
+ TP_printk("%s PCIe Bus Error: severity=%s, %s, TLP Header=%s, "
+ "Correctable Error Status=0x%08x, "
+ "Correctable Error Mask=0x%08x, "
+ "Uncorrectable Error Status=0x%08x, "
+ "Uncorrectable Error Severity=0x%08x, "
+ "Uncorrectable Error Mask=0x%08x, "
+ "AER Capability and Control=0x%08x, "
+ "First Error Pointer=0x%x, "
+ "Completion Timeout Prefix/Header Log Capable=%s, "
+ "Link Status=0x%04x, "
+ "Device Status=0x%04x, "
+ "Device Control 2=0x%04x, "
+ "Completion Timeout Value=0x%x, "
+ "Completion Timeout Disable=%sn",
__get_str(dev_name),
__entry->severity == AER_CORRECTABLE ? "Corrected" :
__entry->severity == AER_FATAL ?
@@ -335,7 +361,21 @@ TRACE_EVENT(aer_event,
__print_flags(__entry->status, "|", aer_uncorrectable_errors),
__entry->tlp_header_valid ?
__print_array(__entry->tlp_header, 4, 4) :
- "Not available")
+ "Not available",
+ __entry->aer_caps.cor_status,
+ __entry->aer_caps.cor_mask,
+ __entry->aer_caps.uncor_status,
+ __entry->aer_caps.uncor_severity,
+ __entry->aer_caps.uncor_mask,
+ __entry->aer_caps.cap_control,
+ PCI_ERR_CAP_FEP(__entry->aer_caps.cap_control),
+ __entry->aer_caps.cap_control & PCI_ERR_CAP_CTO_LOGC ? "True" : "False",
+ __entry->link_status,
+ __entry->device_status,
+ __entry->device_control_2,
+ __entry->device_control_2 & PCI_EXP_DEVCTL2_COMP_TIMEOUT,
+ __entry->device_control_2 & PCI_EXP_DEVCTL2_COMP_TMOUT_DIS ?
+ "True" : "False")
);

/*
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index a39193213ff2..54160ed2a8c9 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -787,6 +787,7 @@
#define PCI_ERR_CAP_ECRC_GENE 0x00000040 /* ECRC Generation Enable */
#define PCI_ERR_CAP_ECRC_CHKC 0x00000080 /* ECRC Check Capable */
#define PCI_ERR_CAP_ECRC_CHKE 0x00000100 /* ECRC Check Enable */
+#define PCI_ERR_CAP_CTO_LOGC 0x00001000 /* Completion Timeout Prefix/Header Log Capable */
#define PCI_ERR_HEADER_LOG 0x1c /* Header Log Register (16 bytes) */
#define PCI_ERR_ROOT_COMMAND 0x2c /* Root Error Command */
#define PCI_ERR_ROOT_CMD_COR_EN 0x00000001 /* Correctable Err Reporting Enable */
--
2.42.0


2024-01-25 11:32:59

by Wang, Qingshun

[permalink] [raw]
Subject: [PATCH v2 1/4] PCI/AER: Store more information in aer_err_info

When Advisory Non-Fatal errors are raised, both correctable and
uncorrectable error statuses will be set. The current kernel code cannot
store both statuses at the same time, thus failing to handle ANFE properly.
In addition, to avoid clearing UEs that are not ANFE by accident, UE
severity and Device Status also need to be recorded: any fatal UE cannot
be ANFE, and if Fatal/Non-Fatal Error Detected is set in Device Status, do
not take any assumption and let UE handler to clear UE status.

Store status and mask of both correctable and uncorrectable errors in
aer_err_info. The severity of UEs and the values of the Device Status
register are also recorded, which will be used to determine UEs that should
be handled by the ANFE handler. Refactor the rest of the code to use
cor/uncor_status and cor/uncor_mask fields instead of status and mask
fields.

Signed-off-by: "Wang, Qingshun" <[email protected]>
---
drivers/acpi/apei/ghes.c | 10 ++++-
drivers/cxl/core/pci.c | 6 ++-
drivers/pci/pci.h | 8 +++-
drivers/pci/pcie/aer.c | 93 ++++++++++++++++++++++++++--------------
include/linux/aer.h | 4 +-
include/linux/pci.h | 27 ++++++++++++
6 files changed, 111 insertions(+), 37 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 7b7c605166e0..6034039d5cff 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -593,6 +593,8 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)

if (pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID &&
pcie_err->validation_bits & CPER_PCIE_VALID_AER_INFO) {
+ struct pcie_capability_regs *pcie_caps;
+ u16 device_status = 0;
unsigned int devfn;
int aer_severity;
u8 *aer_info;
@@ -615,11 +617,17 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
return;
memcpy(aer_info, pcie_err->aer_info, sizeof(struct aer_capability_regs));

+ if (pcie_err->validation_bits & CPER_PCIE_VALID_CAPABILITY) {
+ pcie_caps = (struct pcie_capability_regs *)pcie_err->capability;
+ device_status = pcie_caps->device_status;
+ }
+
aer_recover_queue(pcie_err->device_id.segment,
pcie_err->device_id.bus,
devfn, aer_severity,
(struct aer_capability_regs *)
- aer_info);
+ aer_info,
+ device_status);
}
#endif
}
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 6c9c8d92f8f7..9111a4415a63 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -903,6 +903,7 @@ static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds)
struct aer_capability_regs aer_regs;
struct cxl_dport *dport;
struct cxl_port *port;
+ u16 device_status;
int severity;

port = cxl_pci_find_port(pdev, &dport);
@@ -917,7 +918,10 @@ static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds)
if (!cxl_rch_get_aer_severity(&aer_regs, &severity))
return;

- pci_print_aer(pdev, severity, &aer_regs);
+ if (pcie_capability_read_word(pdev, PCI_EXP_DEVSTA, &device_status))
+ return;
+
+ pci_print_aer(pdev, severity, &aer_regs, device_status);

if (severity == AER_CORRECTABLE)
cxl_handle_rdport_cor_ras(cxlds, dport);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 2336a8d1edab..f881a1b42f14 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -407,8 +407,12 @@ struct aer_err_info {
unsigned int __pad2:2;
unsigned int tlp_header_valid:1;

- unsigned int status; /* COR/UNCOR Error Status */
- unsigned int mask; /* COR/UNCOR Error Mask */
+ u32 cor_mask; /* COR Error Mask */
+ u32 cor_status; /* COR Error Status */
+ u32 uncor_mask; /* UNCOR Error Mask */
+ u32 uncor_status; /* UNCOR Error Status */
+ u32 uncor_severity; /* UNCOR Error Severity */
+ u16 device_status;
struct aer_header_log_regs tlp; /* TLP Header */
};

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 05fc30bb5134..6583dcf50977 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -615,7 +615,7 @@ const struct attribute_group aer_stats_attr_group = {
static void pci_dev_aer_stats_incr(struct pci_dev *pdev,
struct aer_err_info *info)
{
- unsigned long status = info->status & ~info->mask;
+ unsigned long status;
int i, max = -1;
u64 *counter = NULL;
struct aer_stats *aer_stats = pdev->aer_stats;
@@ -625,16 +625,19 @@ static void pci_dev_aer_stats_incr(struct pci_dev *pdev,

switch (info->severity) {
case AER_CORRECTABLE:
+ status = info->cor_status & ~info->cor_mask;
aer_stats->dev_total_cor_errs++;
counter = &aer_stats->dev_cor_errs[0];
max = AER_MAX_TYPEOF_COR_ERRS;
break;
case AER_NONFATAL:
+ status = info->uncor_status & ~info->uncor_mask;
aer_stats->dev_total_nonfatal_errs++;
counter = &aer_stats->dev_nonfatal_errs[0];
max = AER_MAX_TYPEOF_UNCOR_ERRS;
break;
case AER_FATAL:
+ status = info->uncor_status & ~info->uncor_mask;
aer_stats->dev_total_fatal_errs++;
counter = &aer_stats->dev_fatal_errs[0];
max = AER_MAX_TYPEOF_UNCOR_ERRS;
@@ -674,15 +677,17 @@ static void __print_tlp_header(struct pci_dev *dev,
static void __aer_print_error(struct pci_dev *dev,
struct aer_err_info *info)
{
+ unsigned long status;
const char **strings;
- unsigned long status = info->status & ~info->mask;
const char *level, *errmsg;
int i;

if (info->severity == AER_CORRECTABLE) {
+ status = info->cor_status & ~info->cor_mask;
strings = aer_correctable_error_string;
level = KERN_WARNING;
} else {
+ status = info->uncor_status & ~info->uncor_mask;
strings = aer_uncorrectable_error_string;
level = KERN_ERR;
}
@@ -700,18 +705,27 @@ static void __aer_print_error(struct pci_dev *dev,

void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
{
+ u32 status, mask;
int layer, agent;
int id = pci_dev_id(dev);
const char *level;

- if (!info->status) {
+ if (info->severity == AER_CORRECTABLE) {
+ status = info->cor_status;
+ mask = info->cor_mask;
+ } else {
+ status = info->uncor_status;
+ mask = info->uncor_mask;
+ }
+
+ if (!status) {
pci_err(dev, "PCIe Bus Error: severity=%s, type=Inaccessible, (Unregistered Agent ID)\n",
aer_error_severity_string[info->severity]);
goto out;
}

- layer = AER_GET_LAYER_ERROR(info->severity, info->status);
- agent = AER_GET_AGENT(info->severity, info->status);
+ layer = AER_GET_LAYER_ERROR(info->severity, status);
+ agent = AER_GET_AGENT(info->severity, status);

level = (info->severity == AER_CORRECTABLE) ? KERN_WARNING : KERN_ERR;

@@ -720,7 +734,7 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
aer_error_layer[layer], aer_agent_string[agent]);

pci_printk(level, dev, " device [%04x:%04x] error status/mask=%08x/%08x\n",
- dev->vendor, dev->device, info->status, info->mask);
+ dev->vendor, dev->device, status, mask);

__aer_print_error(dev, info);

@@ -731,7 +745,7 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
if (info->id && info->error_dev_num > 1 && info->id == id)
pci_err(dev, " Error of this Agent is reported first\n");

- trace_aer_event(dev_name(&dev->dev), (info->status & ~info->mask),
+ trace_aer_event(dev_name(&dev->dev), (status & ~mask),
info->severity, info->tlp_header_valid, &info->tlp);
}

@@ -763,7 +777,7 @@ EXPORT_SYMBOL_GPL(cper_severity_to_aer);
#endif

void pci_print_aer(struct pci_dev *dev, int aer_severity,
- struct aer_capability_regs *aer)
+ struct aer_capability_regs *aer, u16 device_status)
{
int layer, agent, tlp_header_valid = 0;
u32 status, mask;
@@ -783,8 +797,12 @@ void pci_print_aer(struct pci_dev *dev, int aer_severity,

memset(&info, 0, sizeof(info));
info.severity = aer_severity;
- info.status = status;
- info.mask = mask;
+ info.cor_status = aer->cor_status;
+ info.cor_mask = aer->cor_mask;
+ info.uncor_status = aer->uncor_status;
+ info.uncor_severity = aer->uncor_severity;
+ info.uncor_mask = aer->uncor_mask;
+ info.device_status = device_status;
info.first_error = PCI_ERR_CAP_FEP(aer->cap_control);

pci_err(dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n", status, mask);
@@ -996,9 +1014,9 @@ static bool cxl_error_is_native(struct pci_dev *dev)
static bool is_internal_error(struct aer_err_info *info)
{
if (info->severity == AER_CORRECTABLE)
- return info->status & PCI_ERR_COR_INTERNAL;
+ return info->cor_status & PCI_ERR_COR_INTERNAL;

- return info->status & PCI_ERR_UNC_INTN;
+ return info->uncor_status & PCI_ERR_UNC_INTN;
}

static int cxl_rch_handle_error_iter(struct pci_dev *dev, void *data)
@@ -1097,7 +1115,7 @@ static void pci_aer_handle_error(struct pci_dev *dev, struct aer_err_info *info)
*/
if (aer)
pci_write_config_dword(dev, aer + PCI_ERR_COR_STATUS,
- info->status);
+ info->cor_status);
if (pcie_aer_is_native(dev)) {
struct pci_driver *pdrv = dev->driver;

@@ -1128,6 +1146,7 @@ struct aer_recover_entry {
u8 devfn;
u16 domain;
int severity;
+ u16 device_status;
struct aer_capability_regs *regs;
};

@@ -1148,7 +1167,7 @@ static void aer_recover_work_func(struct work_struct *work)
PCI_SLOT(entry.devfn), PCI_FUNC(entry.devfn));
continue;
}
- pci_print_aer(pdev, entry.severity, entry.regs);
+ pci_print_aer(pdev, entry.severity, entry.regs, entry.device_status);
/*
* Memory for aer_capability_regs(entry.regs) is being allocated from the
* ghes_estatus_pool to protect it from overwriting when multiple sections
@@ -1177,7 +1196,7 @@ static DEFINE_SPINLOCK(aer_recover_ring_lock);
static DECLARE_WORK(aer_recover_work, aer_recover_work_func);

void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
- int severity, struct aer_capability_regs *aer_regs)
+ int severity, struct aer_capability_regs *aer_regs, u16 device_status)
{
struct aer_recover_entry entry = {
.bus = bus,
@@ -1185,6 +1204,7 @@ void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
.domain = domain,
.severity = severity,
.regs = aer_regs,
+ .device_status = device_status,
};

if (kfifo_in_spinlocked(&aer_recover_ring, &entry, 1,
@@ -1213,38 +1233,49 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
int temp;

/* Must reset in this function */
- info->status = 0;
+ info->cor_status = 0;
+ info->uncor_status = 0;
+ info->uncor_severity = 0;
info->tlp_header_valid = 0;

/* The device might not support AER */
if (!aer)
return 0;

- if (info->severity == AER_CORRECTABLE) {
+ if (info->severity == AER_CORRECTABLE ||
+ info->severity == AER_NONFATAL ||
+ type == PCI_EXP_TYPE_ROOT_PORT ||
+ type == PCI_EXP_TYPE_RC_EC ||
+ type == PCI_EXP_TYPE_DOWNSTREAM) {
+ /* Link is healthy for IO reads */
pci_read_config_dword(dev, aer + PCI_ERR_COR_STATUS,
- &info->status);
+ &info->cor_status);
pci_read_config_dword(dev, aer + PCI_ERR_COR_MASK,
- &info->mask);
- if (!(info->status & ~info->mask))
- return 0;
- } else if (type == PCI_EXP_TYPE_ROOT_PORT ||
- type == PCI_EXP_TYPE_RC_EC ||
- type == PCI_EXP_TYPE_DOWNSTREAM ||
- info->severity == AER_NONFATAL) {
-
- /* Link is still healthy for IO reads */
+ &info->cor_mask);
pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS,
- &info->status);
+ &info->uncor_status);
+ pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_SEVER,
+ &info->uncor_severity);
pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK,
- &info->mask);
- if (!(info->status & ~info->mask))
+ &info->uncor_mask);
+ pcie_capability_read_word(dev, PCI_EXP_DEVSTA,
+ &info->device_status);
+ } else {
+ return 1;
+ }
+
+ if (info->severity == AER_CORRECTABLE) {
+ if (!(info->cor_status & ~info->cor_mask))
+ return 0;
+ } else {
+ if (!(info->uncor_status & ~info->uncor_mask))
return 0;

/* Get First Error Pointer */
pci_read_config_dword(dev, aer + PCI_ERR_CAP, &temp);
info->first_error = PCI_ERR_CAP_FEP(temp);

- if (info->status & AER_LOG_TLP_MASKS) {
+ if (info->uncor_status & AER_LOG_TLP_MASKS) {
info->tlp_header_valid = 1;
pci_read_config_dword(dev,
aer + PCI_ERR_HEADER_LOG, &info->tlp.dw0);
diff --git a/include/linux/aer.h b/include/linux/aer.h
index ae0fae70d4bd..38ac802250ac 100644
--- a/include/linux/aer.h
+++ b/include/linux/aer.h
@@ -52,9 +52,9 @@ static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
#endif

void pci_print_aer(struct pci_dev *dev, int aer_severity,
- struct aer_capability_regs *aer);
+ struct aer_capability_regs *aer, u16 device_status);
int cper_severity_to_aer(int cper_severity);
void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
- int severity, struct aer_capability_regs *aer_regs);
+ int severity, struct aer_capability_regs *aer_regs, u16 device_status);
#endif //_AER_H_

diff --git a/include/linux/pci.h b/include/linux/pci.h
index add9368e6314..259812620d4d 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -318,6 +318,33 @@ struct pci_sriov;
struct pci_p2pdma;
struct rcec_ea;

+struct pcie_capability_regs {
+ u8 pcie_cap_id;
+ u8 next_cap_ptr;
+ u16 pcie_caps;
+ u32 device_caps;
+ u16 device_control;
+ u16 device_status;
+ u32 link_caps;
+ u16 link_control;
+ u16 link_status;
+ u32 slot_caps;
+ u16 slot_control;
+ u16 slot_status;
+ u16 root_control;
+ u16 root_caps;
+ u32 root_status;
+ u32 device_caps_2;
+ u16 device_control_2;
+ u16 device_status_2;
+ u32 link_caps_2;
+ u16 link_control_2;
+ u16 link_status_2;
+ u32 slot_caps_2;
+ u16 slot_control_2;
+ u16 slot_status_2;
+};
+
/* The pci_dev structure describes PCI devices */
struct pci_dev {
struct list_head bus_list; /* Node in per-bus list */
--
2.42.0


2024-01-25 12:31:17

by Wang, Qingshun

[permalink] [raw]
Subject: [PATCH v2 3/4] PCI/AER: Fetch information for FTrace

Fetch and store the data of 3 more registers: "Link Status", "Device
Control 2", and "Advanced Error Capabilities and Control". This data is
needed for external observation to better understand ANFE.

Signed-off-by: "Wang, Qingshun" <[email protected]>
---
drivers/acpi/apei/ghes.c | 8 +++++++-
drivers/cxl/core/pci.c | 11 ++++++++++-
drivers/pci/pci.h | 4 ++++
drivers/pci/pcie/aer.c | 26 ++++++++++++++++++++------
include/linux/aer.h | 6 ++++--
5 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 6034039d5cff..047cc01be68c 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -594,7 +594,9 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
if (pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID &&
pcie_err->validation_bits & CPER_PCIE_VALID_AER_INFO) {
struct pcie_capability_regs *pcie_caps;
+ u16 device_control_2 = 0;
u16 device_status = 0;
+ u16 link_status = 0;
unsigned int devfn;
int aer_severity;
u8 *aer_info;
@@ -619,7 +621,9 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)

if (pcie_err->validation_bits & CPER_PCIE_VALID_CAPABILITY) {
pcie_caps = (struct pcie_capability_regs *)pcie_err->capability;
+ device_control_2 = pcie_caps->device_control_2;
device_status = pcie_caps->device_status;
+ link_status = pcie_caps->link_status;
}

aer_recover_queue(pcie_err->device_id.segment,
@@ -627,7 +631,9 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
devfn, aer_severity,
(struct aer_capability_regs *)
aer_info,
- device_status);
+ device_status,
+ link_status,
+ device_control_2);
}
#endif
}
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 9111a4415a63..3aa57fe8db42 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -903,7 +903,9 @@ static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds)
struct aer_capability_regs aer_regs;
struct cxl_dport *dport;
struct cxl_port *port;
+ u16 device_control_2;
u16 device_status;
+ u16 link_status;
int severity;

port = cxl_pci_find_port(pdev, &dport);
@@ -918,10 +920,17 @@ static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds)
if (!cxl_rch_get_aer_severity(&aer_regs, &severity))
return;

+ if (pcie_capability_read_word(pdev, PCI_EXP_DEVCTL2, &device_control_2))
+ return;
+
if (pcie_capability_read_word(pdev, PCI_EXP_DEVSTA, &device_status))
return;

- pci_print_aer(pdev, severity, &aer_regs, device_status);
+ if (pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &link_status))
+ return;
+
+ pci_print_aer(pdev, severity, &aer_regs, device_status,
+ link_status, device_control_2);

if (severity == AER_CORRECTABLE)
cxl_handle_rdport_cor_ras(cxlds, dport);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index f881a1b42f14..5788a94b4e95 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -412,7 +412,11 @@ struct aer_err_info {
u32 uncor_mask; /* UNCOR Error Mask */
u32 uncor_status; /* UNCOR Error Status */
u32 uncor_severity; /* UNCOR Error Severity */
+
+ u16 link_status;
+ u32 aer_cap_ctrl; /* AER Capabilities and Control */
u16 device_status;
+ u16 device_control_2;
struct aer_header_log_regs tlp; /* TLP Header */
};

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 713cbf625d3f..eec3406f727a 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -825,7 +825,8 @@ EXPORT_SYMBOL_GPL(cper_severity_to_aer);
#endif

void pci_print_aer(struct pci_dev *dev, int aer_severity,
- struct aer_capability_regs *aer, u16 device_status)
+ struct aer_capability_regs *aer, u16 device_status,
+ u16 link_status, u16 device_control_2)
{
int layer, agent, tlp_header_valid = 0;
u32 status, mask;
@@ -850,7 +851,10 @@ void pci_print_aer(struct pci_dev *dev, int aer_severity,
info.uncor_status = aer->uncor_status;
info.uncor_severity = aer->uncor_severity;
info.uncor_mask = aer->uncor_mask;
+ info.link_status = link_status;
+ info.aer_cap_ctrl = aer->cap_control;
info.device_status = device_status;
+ info.device_control_2 = device_control_2;
info.first_error = PCI_ERR_CAP_FEP(aer->cap_control);

pci_err(dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n", status, mask);
@@ -1205,7 +1209,9 @@ struct aer_recover_entry {
u8 devfn;
u16 domain;
int severity;
+ u16 link_status;
u16 device_status;
+ u16 device_control_2;
struct aer_capability_regs *regs;
};

@@ -1226,7 +1232,8 @@ static void aer_recover_work_func(struct work_struct *work)
PCI_SLOT(entry.devfn), PCI_FUNC(entry.devfn));
continue;
}
- pci_print_aer(pdev, entry.severity, entry.regs, entry.device_status);
+ pci_print_aer(pdev, entry.severity, entry.regs, entry.device_status,
+ entry.link_status, entry.device_control_2);
/*
* Memory for aer_capability_regs(entry.regs) is being allocated from the
* ghes_estatus_pool to protect it from overwriting when multiple sections
@@ -1255,7 +1262,8 @@ static DEFINE_SPINLOCK(aer_recover_ring_lock);
static DECLARE_WORK(aer_recover_work, aer_recover_work_func);

void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
- int severity, struct aer_capability_regs *aer_regs, u16 device_status)
+ int severity, struct aer_capability_regs *aer_regs, u16 device_status,
+ u16 link_status, u16 device_control_2)
{
struct aer_recover_entry entry = {
.bus = bus,
@@ -1263,7 +1271,9 @@ void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
.domain = domain,
.severity = severity,
.regs = aer_regs,
+ .link_status = link_status,
.device_status = device_status,
+ .device_control_2 = device_control_2,
};

if (kfifo_in_spinlocked(&aer_recover_ring, &entry, 1,
@@ -1289,7 +1299,6 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
{
int type = pci_pcie_type(dev);
int aer = dev->aer_cap;
- int temp;

/* Must reset in this function */
info->cor_status = 0;
@@ -1317,8 +1326,14 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
&info->uncor_severity);
pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK,
&info->uncor_mask);
+ pci_read_config_dword(dev, aer + PCI_ERR_CAP,
+ &info->aer_cap_ctrl);
+ pcie_capability_read_word(dev, PCI_EXP_LNKSTA,
+ &info->link_status);
pcie_capability_read_word(dev, PCI_EXP_DEVSTA,
&info->device_status);
+ pcie_capability_read_word(dev, PCI_EXP_DEVCTL2,
+ &info->device_control_2);
} else {
return 1;
}
@@ -1331,8 +1346,7 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
return 0;

/* Get First Error Pointer */
- pci_read_config_dword(dev, aer + PCI_ERR_CAP, &temp);
- info->first_error = PCI_ERR_CAP_FEP(temp);
+ info->first_error = PCI_ERR_CAP_FEP(info->aer_cap_ctrl);

if (info->uncor_status & AER_LOG_TLP_MASKS) {
info->tlp_header_valid = 1;
diff --git a/include/linux/aer.h b/include/linux/aer.h
index 38ac802250ac..327ebec1e4b3 100644
--- a/include/linux/aer.h
+++ b/include/linux/aer.h
@@ -52,9 +52,11 @@ static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
#endif

void pci_print_aer(struct pci_dev *dev, int aer_severity,
- struct aer_capability_regs *aer, u16 device_status);
+ struct aer_capability_regs *aer, u16 device_status,
+ u16 link_status, u16 device_control_2);
int cper_severity_to_aer(int cper_severity);
void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
- int severity, struct aer_capability_regs *aer_regs, u16 device_status);
+ int severity, struct aer_capability_regs *aer_regs, u16 device_status,
+ u16 link_status, u16 device_control_2);
#endif //_AER_H_

--
2.42.0


Subject: Re: [PATCH v2 1/4] PCI/AER: Store more information in aer_err_info


On 1/24/24 10:27 PM, Wang, Qingshun wrote:
> When Advisory Non-Fatal errors are raised, both correctable and

Maybe you can start with same info about what Advisory Non-FataL
errors are and the specification reference. I know that you included
it in cover letter. But it is good to include it in commit log.

> uncorrectable error statuses will be set. The current kernel code cannot
> store both statuses at the same time, thus failing to handle ANFE properly.
> In addition, to avoid clearing UEs that are not ANFE by accident, UE

Please add some details about the impact of not clearing them.
> severity and Device Status also need to be recorded: any fatal UE cannot
> be ANFE, and if Fatal/Non-Fatal Error Detected is set in Device Status, do
> not take any assumption and let UE handler to clear UE status.
>
> Store status and mask of both correctable and uncorrectable errors in
> aer_err_info. The severity of UEs and the values of the Device Status
> register are also recorded, which will be used to determine UEs that should
> be handled by the ANFE handler. Refactor the rest of the code to use
> cor/uncor_status and cor/uncor_mask fields instead of status and mask
> fields.
>
> Signed-off-by: "Wang, Qingshun" <[email protected]>
> ---
> drivers/acpi/apei/ghes.c | 10 ++++-
> drivers/cxl/core/pci.c | 6 ++-
> drivers/pci/pci.h | 8 +++-
> drivers/pci/pcie/aer.c | 93 ++++++++++++++++++++++++++--------------
> include/linux/aer.h | 4 +-
> include/linux/pci.h | 27 ++++++++++++
> 6 files changed, 111 insertions(+), 37 deletions(-)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 7b7c605166e0..6034039d5cff 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -593,6 +593,8 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
>
> if (pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID &&
> pcie_err->validation_bits & CPER_PCIE_VALID_AER_INFO) {
> + struct pcie_capability_regs *pcie_caps;
> + u16 device_status = 0;
> unsigned int devfn;
> int aer_severity;
> u8 *aer_info;
> @@ -615,11 +617,17 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
> return;
> memcpy(aer_info, pcie_err->aer_info, sizeof(struct aer_capability_regs));
>
> + if (pcie_err->validation_bits & CPER_PCIE_VALID_CAPABILITY) {
> + pcie_caps = (struct pcie_capability_regs *)pcie_err->capability;
> + device_status = pcie_caps->device_status;
> + }
> +
> aer_recover_queue(pcie_err->device_id.segment,
> pcie_err->device_id.bus,
> devfn, aer_severity,
> (struct aer_capability_regs *)
> - aer_info);
> + aer_info,
> + device_status);
> }
> #endif
> }
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 6c9c8d92f8f7..9111a4415a63 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -903,6 +903,7 @@ static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds)
> struct aer_capability_regs aer_regs;
> struct cxl_dport *dport;
> struct cxl_port *port;
> + u16 device_status;
> int severity;
>
> port = cxl_pci_find_port(pdev, &dport);
> @@ -917,7 +918,10 @@ static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds)
> if (!cxl_rch_get_aer_severity(&aer_regs, &severity))
> return;
>
> - pci_print_aer(pdev, severity, &aer_regs);
> + if (pcie_capability_read_word(pdev, PCI_EXP_DEVSTA, &device_status))
> + return;
> +
> + pci_print_aer(pdev, severity, &aer_regs, device_status);
>
> if (severity == AER_CORRECTABLE)
> cxl_handle_rdport_cor_ras(cxlds, dport);
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 2336a8d1edab..f881a1b42f14 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -407,8 +407,12 @@ struct aer_err_info {
> unsigned int __pad2:2;
> unsigned int tlp_header_valid:1;
>
> - unsigned int status; /* COR/UNCOR Error Status */
> - unsigned int mask; /* COR/UNCOR Error Mask */
> + u32 cor_mask; /* COR Error Mask */
> + u32 cor_status; /* COR Error Status */
> + u32 uncor_mask; /* UNCOR Error Mask */
> + u32 uncor_status; /* UNCOR Error Status */
> + u32 uncor_severity; /* UNCOR Error Severity */
> + u16 device_status;
> struct aer_header_log_regs tlp; /* TLP Header */
> };
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 05fc30bb5134..6583dcf50977 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -615,7 +615,7 @@ const struct attribute_group aer_stats_attr_group = {
> static void pci_dev_aer_stats_incr(struct pci_dev *pdev,
> struct aer_err_info *info)
> {
> - unsigned long status = info->status & ~info->mask;
> + unsigned long status;
> int i, max = -1;
> u64 *counter = NULL;
> struct aer_stats *aer_stats = pdev->aer_stats;
> @@ -625,16 +625,19 @@ static void pci_dev_aer_stats_incr(struct pci_dev *pdev,
>
> switch (info->severity) {
> case AER_CORRECTABLE:
> + status = info->cor_status & ~info->cor_mask;
> aer_stats->dev_total_cor_errs++;
> counter = &aer_stats->dev_cor_errs[0];
> max = AER_MAX_TYPEOF_COR_ERRS;
> break;
> case AER_NONFATAL:
> + status = info->uncor_status & ~info->uncor_mask;
> aer_stats->dev_total_nonfatal_errs++;
> counter = &aer_stats->dev_nonfatal_errs[0];
> max = AER_MAX_TYPEOF_UNCOR_ERRS;
> break;
> case AER_FATAL:
> + status = info->uncor_status & ~info->uncor_mask;
> aer_stats->dev_total_fatal_errs++;
> counter = &aer_stats->dev_fatal_errs[0];
> max = AER_MAX_TYPEOF_UNCOR_ERRS;
> @@ -674,15 +677,17 @@ static void __print_tlp_header(struct pci_dev *dev,
> static void __aer_print_error(struct pci_dev *dev,
> struct aer_err_info *info)
> {
> + unsigned long status;
> const char **strings;
> - unsigned long status = info->status & ~info->mask;
> const char *level, *errmsg;
> int i;
>
> if (info->severity == AER_CORRECTABLE) {
> + status = info->cor_status & ~info->cor_mask;
> strings = aer_correctable_error_string;
> level = KERN_WARNING;
> } else {
> + status = info->uncor_status & ~info->uncor_mask;
> strings = aer_uncorrectable_error_string;
> level = KERN_ERR;
> }
> @@ -700,18 +705,27 @@ static void __aer_print_error(struct pci_dev *dev,
>
> void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
> {
> + u32 status, mask;
> int layer, agent;
> int id = pci_dev_id(dev);
> const char *level;
>
> - if (!info->status) {
> + if (info->severity == AER_CORRECTABLE) {
> + status = info->cor_status;
> + mask = info->cor_mask;
> + } else {
> + status = info->uncor_status;
> + mask = info->uncor_mask;
> + }
> +
> + if (!status) {
> pci_err(dev, "PCIe Bus Error: severity=%s, type=Inaccessible, (Unregistered Agent ID)\n",
> aer_error_severity_string[info->severity]);
> goto out;
> }
>
> - layer = AER_GET_LAYER_ERROR(info->severity, info->status);
> - agent = AER_GET_AGENT(info->severity, info->status);
> + layer = AER_GET_LAYER_ERROR(info->severity, status);
> + agent = AER_GET_AGENT(info->severity, status);
>
> level = (info->severity == AER_CORRECTABLE) ? KERN_WARNING : KERN_ERR;
>
> @@ -720,7 +734,7 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
> aer_error_layer[layer], aer_agent_string[agent]);
>
> pci_printk(level, dev, " device [%04x:%04x] error status/mask=%08x/%08x\n",
> - dev->vendor, dev->device, info->status, info->mask);
> + dev->vendor, dev->device, status, mask);
>
> __aer_print_error(dev, info);
>
> @@ -731,7 +745,7 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
> if (info->id && info->error_dev_num > 1 && info->id == id)
> pci_err(dev, " Error of this Agent is reported first\n");
>
> - trace_aer_event(dev_name(&dev->dev), (info->status & ~info->mask),
> + trace_aer_event(dev_name(&dev->dev), (status & ~mask),
> info->severity, info->tlp_header_valid, &info->tlp);
> }
>
> @@ -763,7 +777,7 @@ EXPORT_SYMBOL_GPL(cper_severity_to_aer);
> #endif
>
> void pci_print_aer(struct pci_dev *dev, int aer_severity,
> - struct aer_capability_regs *aer)
> + struct aer_capability_regs *aer, u16 device_status)
> {
> int layer, agent, tlp_header_valid = 0;
> u32 status, mask;
> @@ -783,8 +797,12 @@ void pci_print_aer(struct pci_dev *dev, int aer_severity,
>
> memset(&info, 0, sizeof(info));
> info.severity = aer_severity;
> - info.status = status;
> - info.mask = mask;
> + info.cor_status = aer->cor_status;
> + info.cor_mask = aer->cor_mask;
> + info.uncor_status = aer->uncor_status;
> + info.uncor_severity = aer->uncor_severity;
> + info.uncor_mask = aer->uncor_mask;
> + info.device_status = device_status;
> info.first_error = PCI_ERR_CAP_FEP(aer->cap_control);
>
> pci_err(dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n", status, mask);
> @@ -996,9 +1014,9 @@ static bool cxl_error_is_native(struct pci_dev *dev)
> static bool is_internal_error(struct aer_err_info *info)
> {
> if (info->severity == AER_CORRECTABLE)
> - return info->status & PCI_ERR_COR_INTERNAL;
> + return info->cor_status & PCI_ERR_COR_INTERNAL;
>
> - return info->status & PCI_ERR_UNC_INTN;
> + return info->uncor_status & PCI_ERR_UNC_INTN;
> }
>
> static int cxl_rch_handle_error_iter(struct pci_dev *dev, void *data)
> @@ -1097,7 +1115,7 @@ static void pci_aer_handle_error(struct pci_dev *dev, struct aer_err_info *info)
> */
> if (aer)
> pci_write_config_dword(dev, aer + PCI_ERR_COR_STATUS,
> - info->status);
> + info->cor_status);
> if (pcie_aer_is_native(dev)) {
> struct pci_driver *pdrv = dev->driver;
>
> @@ -1128,6 +1146,7 @@ struct aer_recover_entry {
> u8 devfn;
> u16 domain;
> int severity;
> + u16 device_status;
> struct aer_capability_regs *regs;
> };
>
> @@ -1148,7 +1167,7 @@ static void aer_recover_work_func(struct work_struct *work)
> PCI_SLOT(entry.devfn), PCI_FUNC(entry.devfn));
> continue;
> }
> - pci_print_aer(pdev, entry.severity, entry.regs);
> + pci_print_aer(pdev, entry.severity, entry.regs, entry.device_status);
> /*
> * Memory for aer_capability_regs(entry.regs) is being allocated from the
> * ghes_estatus_pool to protect it from overwriting when multiple sections
> @@ -1177,7 +1196,7 @@ static DEFINE_SPINLOCK(aer_recover_ring_lock);
> static DECLARE_WORK(aer_recover_work, aer_recover_work_func);
>
> void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
> - int severity, struct aer_capability_regs *aer_regs)
> + int severity, struct aer_capability_regs *aer_regs, u16 device_status)
> {
> struct aer_recover_entry entry = {
> .bus = bus,
> @@ -1185,6 +1204,7 @@ void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
> .domain = domain,
> .severity = severity,
> .regs = aer_regs,
> + .device_status = device_status,
> };
>
> if (kfifo_in_spinlocked(&aer_recover_ring, &entry, 1,
> @@ -1213,38 +1233,49 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
> int temp;
>
> /* Must reset in this function */
> - info->status = 0;
> + info->cor_status = 0;
> + info->uncor_status = 0;
> + info->uncor_severity = 0;
> info->tlp_header_valid = 0;
>
> /* The device might not support AER */
> if (!aer)
> return 0;
>
> - if (info->severity == AER_CORRECTABLE) {
> + if (info->severity == AER_CORRECTABLE ||
> + info->severity == AER_NONFATAL ||
> + type == PCI_EXP_TYPE_ROOT_PORT ||
> + type == PCI_EXP_TYPE_RC_EC ||
> + type == PCI_EXP_TYPE_DOWNSTREAM) {


It looks like you are reading both uncorrectable and correctable status
by default for both NONFATAL and CORRECTABLE errors. Why not do
it conditionally only for ANFE errors?


> + /* Link is healthy for IO reads */
> pci_read_config_dword(dev, aer + PCI_ERR_COR_STATUS,
> - &info->status);
> + &info->cor_status);
> pci_read_config_dword(dev, aer + PCI_ERR_COR_MASK,
> - &info->mask);
> - if (!(info->status & ~info->mask))
> - return 0;
> - } else if (type == PCI_EXP_TYPE_ROOT_PORT ||
> - type == PCI_EXP_TYPE_RC_EC ||
> - type == PCI_EXP_TYPE_DOWNSTREAM ||
> - info->severity == AER_NONFATAL) {
> -
> - /* Link is still healthy for IO reads */
> + &info->cor_mask);
> pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS,
> - &info->status);
> + &info->uncor_status);
> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_SEVER,
> + &info->uncor_severity);
> pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK,
> - &info->mask);
> - if (!(info->status & ~info->mask))
> + &info->uncor_mask);
> + pcie_capability_read_word(dev, PCI_EXP_DEVSTA,
> + &info->device_status);
> + } else {
> + return 1;
> + }
> +
> + if (info->severity == AER_CORRECTABLE) {
> + if (!(info->cor_status & ~info->cor_mask))
> + return 0;
> + } else {
> + if (!(info->uncor_status & ~info->uncor_mask))
> return 0;
>
> /* Get First Error Pointer */
> pci_read_config_dword(dev, aer + PCI_ERR_CAP, &temp);
> info->first_error = PCI_ERR_CAP_FEP(temp);
>
> - if (info->status & AER_LOG_TLP_MASKS) {
> + if (info->uncor_status & AER_LOG_TLP_MASKS) {
> info->tlp_header_valid = 1;
> pci_read_config_dword(dev,
> aer + PCI_ERR_HEADER_LOG, &info->tlp.dw0);
> diff --git a/include/linux/aer.h b/include/linux/aer.h
> index ae0fae70d4bd..38ac802250ac 100644
> --- a/include/linux/aer.h
> +++ b/include/linux/aer.h
> @@ -52,9 +52,9 @@ static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
> #endif
>
> void pci_print_aer(struct pci_dev *dev, int aer_severity,
> - struct aer_capability_regs *aer);
> + struct aer_capability_regs *aer, u16 device_status);
> int cper_severity_to_aer(int cper_severity);
> void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
> - int severity, struct aer_capability_regs *aer_regs);
> + int severity, struct aer_capability_regs *aer_regs, u16 device_status);
> #endif //_AER_H_
>
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index add9368e6314..259812620d4d 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -318,6 +318,33 @@ struct pci_sriov;
> struct pci_p2pdma;
> struct rcec_ea;
>
> +struct pcie_capability_regs {
> + u8 pcie_cap_id;
> + u8 next_cap_ptr;
> + u16 pcie_caps;
> + u32 device_caps;
> + u16 device_control;
> + u16 device_status;
> + u32 link_caps;
> + u16 link_control;
> + u16 link_status;
> + u32 slot_caps;
> + u16 slot_control;
> + u16 slot_status;
> + u16 root_control;
> + u16 root_caps;
> + u32 root_status;
> + u32 device_caps_2;
> + u16 device_control_2;
> + u16 device_status_2;
> + u32 link_caps_2;
> + u16 link_control_2;
> + u16 link_status_2;
> + u32 slot_caps_2;
> + u16 slot_control_2;
> + u16 slot_status_2;
> +};
> +
IIUC, this struct is only used drivers/acpi/apei/ghes.c . Why not define it in that file?
> /* The pci_dev structure describes PCI devices */
> struct pci_dev {
> struct list_head bus_list; /* Node in per-bus list */

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


2024-01-31 09:21:15

by Wang, Qingshun

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] PCI/AER: Store more information in aer_err_info

On Tue, Jan 30, 2024 at 06:26:39PM -0800, Kuppuswamy Sathyanarayanan wrote:
>
> On 1/24/24 10:27 PM, Wang, Qingshun wrote:
> > When Advisory Non-Fatal errors are raised, both correctable and
>
> Maybe you can start with same info about what Advisory Non-FataL
> errors are and the specification reference. I know that you included
> it in cover letter. But it is good to include it in commit log.

Good idea, thanks!

>
> > uncorrectable error statuses will be set. The current kernel code cannot
> > store both statuses at the same time, thus failing to handle ANFE properly.
> > In addition, to avoid clearing UEs that are not ANFE by accident, UE
>
> Please add some details about the impact of not clearing them.

Makes sense, will do.

> > severity and Device Status also need to be recorded: any fatal UE cannot
> > be ANFE, and if Fatal/Non-Fatal Error Detected is set in Device Status, do
> > not take any assumption and let UE handler to clear UE status.
> >
> > Store status and mask of both correctable and uncorrectable errors in
> > aer_err_info. The severity of UEs and the values of the Device Status
> > register are also recorded, which will be used to determine UEs that should
> > be handled by the ANFE handler. Refactor the rest of the code to use
> > cor/uncor_status and cor/uncor_mask fields instead of status and mask
> > fields.
> >
> > Signed-off-by: "Wang, Qingshun" <[email protected]>
> > ---
> > drivers/acpi/apei/ghes.c | 10 ++++-
> > drivers/cxl/core/pci.c | 6 ++-
> > drivers/pci/pci.h | 8 +++-
> > drivers/pci/pcie/aer.c | 93 ++++++++++++++++++++++++++--------------
> > include/linux/aer.h | 4 +-
> > include/linux/pci.h | 27 ++++++++++++
> > 6 files changed, 111 insertions(+), 37 deletions(-)
> >
> > ......
> >
> > @@ -1213,38 +1233,49 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
> > int temp;
> >
> > /* Must reset in this function */
> > - info->status = 0;
> > + info->cor_status = 0;
> > + info->uncor_status = 0;
> > + info->uncor_severity = 0;
> > info->tlp_header_valid = 0;
> >
> > /* The device might not support AER */
> > if (!aer)
> > return 0;
> >
> > - if (info->severity == AER_CORRECTABLE) {
> > + if (info->severity == AER_CORRECTABLE ||
> > + info->severity == AER_NONFATAL ||
> > + type == PCI_EXP_TYPE_ROOT_PORT ||
> > + type == PCI_EXP_TYPE_RC_EC ||
> > + type == PCI_EXP_TYPE_DOWNSTREAM) {
>
>
> It looks like you are reading both uncorrectable and correctable status
> by default for both NONFATAL and CORRECTABLE errors. Why not do
> it conditionally only for ANFE errors?
>
>

My initial purpose was the value will be used in aer_event trace in
PATCH 4 under both conditions, but I can also add checks here to reduce
unnecessary IO and remove the checks in PATCH 4.

> > + /* Link is healthy for IO reads */
> > pci_read_config_dword(dev, aer + PCI_ERR_COR_STATUS,
> > - &info->status);
> > + &info->cor_status);
> > pci_read_config_dword(dev, aer + PCI_ERR_COR_MASK,
> > - &info->mask);
> >
> > ......
> >
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index add9368e6314..259812620d4d 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -318,6 +318,33 @@ struct pci_sriov;
> > struct pci_p2pdma;
> > struct rcec_ea;
> >
> > +struct pcie_capability_regs {
> > + u8 pcie_cap_id;
> > + u8 next_cap_ptr;
> > + u16 pcie_caps;
> > + u32 device_caps;
> > + u16 device_control;
> > + u16 device_status;
> > + u32 link_caps;
> > + u16 link_control;
> > + u16 link_status;
> > + u32 slot_caps;
> > + u16 slot_control;
> > + u16 slot_status;
> > + u16 root_control;
> > + u16 root_caps;
> > + u32 root_status;
> > + u32 device_caps_2;
> > + u16 device_control_2;
> > + u16 device_status_2;
> > + u32 link_caps_2;
> > + u16 link_control_2;
> > + u16 link_status_2;
> > + u32 slot_caps_2;
> > + u16 slot_control_2;
> > + u16 slot_status_2;
> > +};
> > +
> IIUC, this struct is only used drivers/acpi/apei/ghes.c . Why not define it in that file?

You are right. Whenever we need it elsewhere, we can move it to the
header file anyway.

> > /* The pci_dev structure describes PCI devices */
> > struct pci_dev {
> > struct list_head bus_list; /* Node in per-bus list */
>
> --
> Sathyanarayanan Kuppuswamy
> Linux Kernel Developer
>

--
Best regards,
Wang, Qingshun

2024-02-02 18:03:35

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] PCI/AER: Fetch information for FTrace

Wang, Qingshun wrote:
> Fetch and store the data of 3 more registers: "Link Status", "Device
> Control 2", and "Advanced Error Capabilities and Control". This data is
> needed for external observation to better understand ANFE.
>
> Signed-off-by: "Wang, Qingshun" <[email protected]>
> ---
> drivers/acpi/apei/ghes.c | 8 +++++++-
> drivers/cxl/core/pci.c | 11 ++++++++++-
> drivers/pci/pci.h | 4 ++++
> drivers/pci/pcie/aer.c | 26 ++++++++++++++++++++------
> include/linux/aer.h | 6 ++++--
> 5 files changed, 45 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 6034039d5cff..047cc01be68c 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -594,7 +594,9 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
> if (pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID &&
> pcie_err->validation_bits & CPER_PCIE_VALID_AER_INFO) {
> struct pcie_capability_regs *pcie_caps;
> + u16 device_control_2 = 0;
> u16 device_status = 0;
> + u16 link_status = 0;
> unsigned int devfn;
> int aer_severity;
> u8 *aer_info;
> @@ -619,7 +621,9 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
>
> if (pcie_err->validation_bits & CPER_PCIE_VALID_CAPABILITY) {
> pcie_caps = (struct pcie_capability_regs *)pcie_err->capability;
> + device_control_2 = pcie_caps->device_control_2;
> device_status = pcie_caps->device_status;
> + link_status = pcie_caps->link_status;
> }
>
> aer_recover_queue(pcie_err->device_id.segment,
> @@ -627,7 +631,9 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
> devfn, aer_severity,
> (struct aer_capability_regs *)
> aer_info,
> - device_status);
> + device_status,
> + link_status,
> + device_control_2);
> }
> #endif
> }
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 9111a4415a63..3aa57fe8db42 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -903,7 +903,9 @@ static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds)
> struct aer_capability_regs aer_regs;
> struct cxl_dport *dport;
> struct cxl_port *port;
> + u16 device_control_2;
> u16 device_status;
> + u16 link_status;
> int severity;
>
> port = cxl_pci_find_port(pdev, &dport);
> @@ -918,10 +920,17 @@ static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds)
> if (!cxl_rch_get_aer_severity(&aer_regs, &severity))
> return;
>
> + if (pcie_capability_read_word(pdev, PCI_EXP_DEVCTL2, &device_control_2))
> + return;
> +
> if (pcie_capability_read_word(pdev, PCI_EXP_DEVSTA, &device_status))
> return;
>
> - pci_print_aer(pdev, severity, &aer_regs, device_status);
> + if (pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &link_status))
> + return;
> +
> + pci_print_aer(pdev, severity, &aer_regs, device_status,
> + link_status, device_control_2);

Rather than complicate the calling convention of pci_print_aer(), update
the internals of pci_print_aer() to get these extra registers, or
provide a new wrapper interface that satisfies the dependencies and
switch users over to that. Otherwise multiple touches of the same code
path in one patch set is indicative of the need for a higher level
helper.

2024-02-03 05:00:25

by Wang, Qingshun

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] PCI/AER: Fetch information for FTrace

On Fri, Feb 02, 2024 at 10:01:40AM -0800, Dan Williams wrote:
> Wang, Qingshun wrote:
> > Fetch and store the data of 3 more registers: "Link Status", "Device
> > Control 2", and "Advanced Error Capabilities and Control". This data is
> > needed for external observation to better understand ANFE.
> >
> > Signed-off-by: "Wang, Qingshun" <[email protected]>
> > ---
> > drivers/acpi/apei/ghes.c | 8 +++++++-
> > drivers/cxl/core/pci.c | 11 ++++++++++-
> > drivers/pci/pci.h | 4 ++++
> > drivers/pci/pcie/aer.c | 26 ++++++++++++++++++++------
> > include/linux/aer.h | 6 ++++--
> > 5 files changed, 45 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> > index 6034039d5cff..047cc01be68c 100644
> > --- a/drivers/acpi/apei/ghes.c
> > +++ b/drivers/acpi/apei/ghes.c
> > @@ -594,7 +594,9 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
> > if (pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID &&
> > pcie_err->validation_bits & CPER_PCIE_VALID_AER_INFO) {
> > struct pcie_capability_regs *pcie_caps;
> > + u16 device_control_2 = 0;
> > u16 device_status = 0;
> > + u16 link_status = 0;
> > unsigned int devfn;
> > int aer_severity;
> > u8 *aer_info;
> > @@ -619,7 +621,9 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
> >
> > if (pcie_err->validation_bits & CPER_PCIE_VALID_CAPABILITY) {
> > pcie_caps = (struct pcie_capability_regs *)pcie_err->capability;
> > + device_control_2 = pcie_caps->device_control_2;
> > device_status = pcie_caps->device_status;
> > + link_status = pcie_caps->link_status;
> > }
> >
> > aer_recover_queue(pcie_err->device_id.segment,
> > @@ -627,7 +631,9 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
> > devfn, aer_severity,
> > (struct aer_capability_regs *)
> > aer_info,
> > - device_status);
> > + device_status,
> > + link_status,
> > + device_control_2);
> > }
> > #endif
> > }
> > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> > index 9111a4415a63..3aa57fe8db42 100644
> > --- a/drivers/cxl/core/pci.c
> > +++ b/drivers/cxl/core/pci.c
> > @@ -903,7 +903,9 @@ static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds)
> > struct aer_capability_regs aer_regs;
> > struct cxl_dport *dport;
> > struct cxl_port *port;
> > + u16 device_control_2;
> > u16 device_status;
> > + u16 link_status;
> > int severity;
> >
> > port = cxl_pci_find_port(pdev, &dport);
> > @@ -918,10 +920,17 @@ static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds)
> > if (!cxl_rch_get_aer_severity(&aer_regs, &severity))
> > return;
> >
> > + if (pcie_capability_read_word(pdev, PCI_EXP_DEVCTL2, &device_control_2))
> > + return;
> > +
> > if (pcie_capability_read_word(pdev, PCI_EXP_DEVSTA, &device_status))
> > return;
> >
> > - pci_print_aer(pdev, severity, &aer_regs, device_status);
> > + if (pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &link_status))
> > + return;
> > +
> > + pci_print_aer(pdev, severity, &aer_regs, device_status,
> > + link_status, device_control_2);
>
> Rather than complicate the calling convention of pci_print_aer(), update
> the internals of pci_print_aer() to get these extra registers, or
> provide a new wrapper interface that satisfies the dependencies and
> switch users over to that. Otherwise multiple touches of the same code
> path in one patch set is indicative of the need for a higher level
> helper.

Thanks for the advice, it does make sense. Will reconsider the
implementation.

--
Best regards,
Wang, Qingshun

2024-02-05 23:12:54

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] PCI/AER: Store more information in aer_err_info

On Thu, Jan 25, 2024 at 02:27:59PM +0800, Wang, Qingshun wrote:
> When Advisory Non-Fatal errors are raised, both correctable and
> uncorrectable error statuses will be set. The current kernel code cannot
> store both statuses at the same time, thus failing to handle ANFE properly.
> In addition, to avoid clearing UEs that are not ANFE by accident, UE
> severity and Device Status also need to be recorded: any fatal UE cannot
> be ANFE, and if Fatal/Non-Fatal Error Detected is set in Device Status, do
> not take any assumption and let UE handler to clear UE status.
>
> Store status and mask of both correctable and uncorrectable errors in
> aer_err_info. The severity of UEs and the values of the Device Status
> register are also recorded, which will be used to determine UEs that should
> be handled by the ANFE handler. Refactor the rest of the code to use
> cor/uncor_status and cor/uncor_mask fields instead of status and mask
> fields.

There's a lot going on in this patch. Could it possibly be split up a
bit, e.g., first tease apart aer_err_info.status/.mask into
cor_status/mask and .uncor_status/mask, then add .uncor_severity,
then add the device_status bit separately? If it could be split up, I
think the ANFE case would be easier to see.

Thanks a lot for working on this area!

Bjorn

2024-02-05 23:29:22

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] PCI/AER: Handle Advisory Non-Fatal properly

In the subject, "properly" really doesn't convey information. I think
this patch does two things:

- Prints error bits that might be ANFE
- Clears UNCOR_STATUS bits that were previously not cleared

Maybe the subject line could say something about those (clearing
UNCOR_STATUS might be more important, or maybe this could even be
split into two patches so we could see both).

On Thu, Jan 25, 2024 at 02:28:00PM +0800, Wang, Qingshun wrote:
> When processing an Advisory Non-Fatal error, ideally both correctable
> error status and uncorrectable error status should be cleared. However,
> there is no way to fully identify the UE associated with ANFE. Even
> worse, a Fatal/Non-Fatal error may set the same UE status bit as ANFE.
> Assuming an ANFE is FE/NFE is kind of bad, but assuming a FE/NFE is an
> ANFE is usually unacceptable. To avoid clearing UEs that are not ANFE by
> accident, the most conservative route is taken here: If any of the
> Fatal/Non-Fatal Error Detected bits is set in Device Status, do not
> touch UE status, they should be cleared later by the UE handler.
> Otherwise, a specific set of UEs that may be raised as ANFE according to
> the PCIe specification will be cleared if their corresponding severity
> is non-fatal. Additionally, log UEs that will be cleared.
>
> For instance, previously when kernel receives an ANFE with Poisoned TLP
> in OS native AER mode, only status of CE will be reported and cleared:
>
> AER: Corrected error received: 0000:b7:02.0
> PCIe Bus Error: severity=Corrected, type=Transaction Layer, (Receiver ID)
> device [8086:0db0] error status/mask=00002000/00000000
> [13] NonFatalErr
>
> If the kernel receives a Malformed TLP after that, two UE will be
> reported, which is unexpected. Malformed TLP Header was lost since
> the previous ANF gated the TLP header logs:
>
> PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, (Receiver ID)
> device [8086:0db0] error status/mask=00041000/00180020
> [12] TLP (First)
> [18] MalfTLP
>
> Now, in the same scenario, both CE status and related UE status will be
> reported and cleared after ANFE:
>
> AER: Corrected error received: 0000:b7:02.0
> PCIe Bus Error: severity=Corrected, type=Transaction Layer, (Receiver ID)
> device [8086:0db0] error status/mask=00002000/00000000
> [13] NonFatalErr
> Uncorrectable errors that may cause Advisory Non-Fatal:
> [18] TLP
>
> Signed-off-by: "Wang, Qingshun" <[email protected]>
> ---
> drivers/pci/pcie/aer.c | 61 +++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 60 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 6583dcf50977..713cbf625d3f 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -107,6 +107,12 @@ struct aer_stats {
> PCI_ERR_ROOT_MULTI_COR_RCV | \
> PCI_ERR_ROOT_MULTI_UNCOR_RCV)
>
> +#define AER_ERR_ANFE_UNC_MASK (PCI_ERR_UNC_POISON_TLP | \
> + PCI_ERR_UNC_COMP_TIME | \
> + PCI_ERR_UNC_COMP_ABORT | \
> + PCI_ERR_UNC_UNX_COMP | \
> + PCI_ERR_UNC_UNSUP)
> +
> static int pcie_aer_disable;
> static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
>
> @@ -612,6 +618,32 @@ const struct attribute_group aer_stats_attr_group = {
> .is_visible = aer_stats_attrs_are_visible,
> };
>
> +static int anfe_get_related_err(struct aer_err_info *info)
> +{
> + /*
> + * Take the most conservative route here. If there are
> + * Non-Fatal/Fatal errors detected, do not assume any
> + * bit in uncor_status is set by ANFE.
> + */
> + if (info->device_status & (PCI_EXP_DEVSTA_NFED | PCI_EXP_DEVSTA_FED))
> + return 0;
> + /*
> + * According to PCIe Base Specification Revision 6.1,
> + * Section 6.2.3.2.4, if an UNCOR error is rasied as
> + * Advisory Non-Fatal error, it will match the following
> + * conditions:
> + * a. The severity of the error is Non-Fatal.
> + * b. The error is one of the following:
> + * 1. Poisoned TLP
> + * 2. Completion Timeout
> + * 3. Completer Abort
> + * 4. Unexpected Completion
> + * 5. Unsupported Request
> + */
> + return info->uncor_status & ~info->uncor_mask
> + & AER_ERR_ANFE_UNC_MASK & ~info->severity;
> +}
> +
> static void pci_dev_aer_stats_incr(struct pci_dev *pdev,
> struct aer_err_info *info)
> {
> @@ -678,6 +710,7 @@ static void __aer_print_error(struct pci_dev *dev,
> struct aer_err_info *info)
> {
> unsigned long status;
> + unsigned long anfe_status;
> const char **strings;
> const char *level, *errmsg;
> int i;
> @@ -700,6 +733,21 @@ static void __aer_print_error(struct pci_dev *dev,
> pci_printk(level, dev, " [%2d] %-22s%s\n", i, errmsg,
> info->first_error == i ? " (First)" : "");
> }
> +
> + if (info->severity == AER_CORRECTABLE && (status & PCI_ERR_COR_ADV_NFAT)) {
> + anfe_status = anfe_get_related_err(info);
> + if (anfe_status) {
> + pci_printk(level, dev, "Uncorrectable errors that may cause Advisory Non-Fatal:");
> + for_each_set_bit(i, &anfe_status, 32) {
> + errmsg = aer_uncorrectable_error_string[i];
> + if (!errmsg)
> + errmsg = "Unknown Error Bit";
> +
> + pci_printk(level, dev, " [%2d] %-22s\n", i, errmsg);
> + }
> + }
> + }
> +
> pci_dev_aer_stats_incr(dev, info);
> }
>
> @@ -1097,6 +1145,14 @@ static inline void cxl_rch_handle_error(struct pci_dev *dev,
> struct aer_err_info *info) { }
> #endif
>
> +static void handle_advisory_nonfatal(struct pci_dev *dev, struct aer_err_info *info)
> +{
> + int aer = dev->aer_cap;
> +
> + pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS,
> + anfe_get_related_err(info));
> +}
> +
> /**
> * pci_aer_handle_error - handle logging error into an event log
> * @dev: pointer to pci_dev data structure of error source device
> @@ -1113,9 +1169,12 @@ static void pci_aer_handle_error(struct pci_dev *dev, struct aer_err_info *info)
> * Correctable error does not need software intervention.
> * No need to go through error recovery process.
> */
> - if (aer)
> + if (aer) {
> pci_write_config_dword(dev, aer + PCI_ERR_COR_STATUS,
> info->cor_status);
> + if (info->cor_status & PCI_ERR_COR_ADV_NFAT)
> + handle_advisory_nonfatal(dev, info);
> + }
> if (pcie_aer_is_native(dev)) {
> struct pci_driver *pdrv = dev->driver;
>
> --
> 2.42.0
>

2024-02-06 16:47:10

by Wang, Qingshun

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] PCI/AER: Handle Advisory Non-Fatal properly

On Mon, Feb 05, 2024 at 05:26:16PM -0600, Bjorn Helgaas wrote:
> In the subject, "properly" really doesn't convey information. I think
> this patch does two things:
>
> - Prints error bits that might be ANFE
> - Clears UNCOR_STATUS bits that were previously not cleared
>
> Maybe the subject line could say something about those (clearing
> UNCOR_STATUS might be more important, or maybe this could even be
> split into two patches so we could see both).
>

Good idea, thanks. I think splitting it into two patches would be the
better approach.

2024-02-06 17:24:02

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] PCI/AER: Store more information in aer_err_info

On Wed, Feb 07, 2024 at 12:41:41AM +0800, Wang, Qingshun wrote:
> On Mon, Feb 05, 2024 at 05:12:31PM -0600, Bjorn Helgaas wrote:
> > On Thu, Jan 25, 2024 at 02:27:59PM +0800, Wang, Qingshun wrote:
> > > When Advisory Non-Fatal errors are raised, both correctable and
> > > uncorrectable error statuses will be set. The current kernel code cannot
> > > store both statuses at the same time, thus failing to handle ANFE properly.
> > > In addition, to avoid clearing UEs that are not ANFE by accident, UE
> > > severity and Device Status also need to be recorded: any fatal UE cannot
> > > be ANFE, and if Fatal/Non-Fatal Error Detected is set in Device Status, do
> > > not take any assumption and let UE handler to clear UE status.
> > >
> > > Store status and mask of both correctable and uncorrectable errors in
> > > aer_err_info. The severity of UEs and the values of the Device Status
> > > register are also recorded, which will be used to determine UEs that should
> > > be handled by the ANFE handler. Refactor the rest of the code to use
> > > cor/uncor_status and cor/uncor_mask fields instead of status and mask
> > > fields.
> >
> > There's a lot going on in this patch. Could it possibly be split up a
> > bit, e.g., first tease apart aer_err_info.status/.mask into
> > .cor_status/mask and .uncor_status/mask, then add .uncor_severity,
> > then add the device_status bit separately? If it could be split up, I
> > think the ANFE case would be easier to see.
>
> Thanks for the feedback! Will split it up into two pacthes in the next
> version.

Or even three:

1) tease apart aer_err_info.status/.mask into .cor_status/mask and
.uncor_status/mask

2) add .uncor_severity

3) add device_status

Looking at this again, I'm a little confused about 2) and 3). I see
the new read of PCI_ERR_UNCOR_SEVER into .uncor_severity, but there's
no actual *use* of it.

Same for 3), I see the new read of PCI_EXP_DEVSTA, but AFAICS there's
no use of that value.

We should have the addition of these new values in the same patch
that *uses* them.

Bjorn

2024-02-06 22:41:32

by Wang, Qingshun

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] PCI/AER: Store more information in aer_err_info

On Mon, Feb 05, 2024 at 05:12:31PM -0600, Bjorn Helgaas wrote:
> On Thu, Jan 25, 2024 at 02:27:59PM +0800, Wang, Qingshun wrote:
> > When Advisory Non-Fatal errors are raised, both correctable and
> > uncorrectable error statuses will be set. The current kernel code cannot
> > store both statuses at the same time, thus failing to handle ANFE properly.
> > In addition, to avoid clearing UEs that are not ANFE by accident, UE
> > severity and Device Status also need to be recorded: any fatal UE cannot
> > be ANFE, and if Fatal/Non-Fatal Error Detected is set in Device Status, do
> > not take any assumption and let UE handler to clear UE status.
> >
> > Store status and mask of both correctable and uncorrectable errors in
> > aer_err_info. The severity of UEs and the values of the Device Status
> > register are also recorded, which will be used to determine UEs that should
> > be handled by the ANFE handler. Refactor the rest of the code to use
> > cor/uncor_status and cor/uncor_mask fields instead of status and mask
> > fields.
>
> There's a lot going on in this patch. Could it possibly be split up a
> bit, e.g., first tease apart aer_err_info.status/.mask into
> .cor_status/mask and .uncor_status/mask, then add .uncor_severity,
> then add the device_status bit separately? If it could be split up, I
> think the ANFE case would be easier to see.
>
> Thanks a lot for working on this area!
>
> Bjorn

Thanks for the feedback! Will split it up into two pacthes in the next
version.

--
Best regards,
Wang, Qingshun

2024-02-08 16:16:55

by Wang, Qingshun

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] PCI/AER: Store more information in aer_err_info

On Tue, Feb 06, 2024 at 11:23:35AM -0600, Bjorn Helgaas wrote:
> On Wed, Feb 07, 2024 at 12:41:41AM +0800, Wang, Qingshun wrote:
> > On Mon, Feb 05, 2024 at 05:12:31PM -0600, Bjorn Helgaas wrote:
> > > On Thu, Jan 25, 2024 at 02:27:59PM +0800, Wang, Qingshun wrote:
> > > > When Advisory Non-Fatal errors are raised, both correctable and
> > > > uncorrectable error statuses will be set. The current kernel code cannot
> > > > store both statuses at the same time, thus failing to handle ANFE properly.
> > > > In addition, to avoid clearing UEs that are not ANFE by accident, UE
> > > > severity and Device Status also need to be recorded: any fatal UE cannot
> > > > be ANFE, and if Fatal/Non-Fatal Error Detected is set in Device Status, do
> > > > not take any assumption and let UE handler to clear UE status.
> > > >
> > > > Store status and mask of both correctable and uncorrectable errors in
> > > > aer_err_info. The severity of UEs and the values of the Device Status
> > > > register are also recorded, which will be used to determine UEs that should
> > > > be handled by the ANFE handler. Refactor the rest of the code to use
> > > > cor/uncor_status and cor/uncor_mask fields instead of status and mask
> > > > fields.
> > >
> > > There's a lot going on in this patch. Could it possibly be split up a
> > > bit, e.g., first tease apart aer_err_info.status/.mask into
> > > .cor_status/mask and .uncor_status/mask, then add .uncor_severity,
> > > then add the device_status bit separately? If it could be split up, I
> > > think the ANFE case would be easier to see.
> >
> > Thanks for the feedback! Will split it up into two pacthes in the next
> > version.
>
> Or even three:
>
> 1) tease apart aer_err_info.status/.mask into .cor_status/mask and
> .uncor_status/mask
>
> 2) add .uncor_severity
>
> 3) add device_status
>
> Looking at this again, I'm a little confused about 2) and 3). I see
> the new read of PCI_ERR_UNCOR_SEVER into .uncor_severity, but there's
> no actual *use* of it.
>
> Same for 3), I see the new read of PCI_EXP_DEVSTA, but AFAICS there's
> no use of that value.
>

Both 2) and 3) are used in PATCH 2 and traced in PATCH 4. I can separate
the logic for reading these values from PATCH 1 and merge it with PATCH
2.

--
Best regards,
Wang, Qingshun