2024-05-09 08:51:35

by Duan, Zhenzhong

[permalink] [raw]
Subject: [PATCH v4 0/3] PCI/AER: Handle Advisory Non-Fatal error

Hi,

This is a relay work of Qingshun's v2 [1], but changed to focus on ANFE
processing as subject suggests and drops trace-event for now. I think it's
a bit heavy to do extra IOes to get PCIe registers only for trace purpose
and not see it a community request for now.

According to PCIe Base Specification Revision 6.1, Sections 6.2.3.2.4 and
6.2.4.3, certain uncorrectable errors will signal ERR_COR instead of
ERR_NONFATAL, logged as Advisory Non-Fatal Error(ANFE), and set bits in
both Correctable Error(CE) Status register and Uncorrectable Error(UE)
Status register. Currently, when handling AER events the kernel will only
look at CE status or UE status, but never both. In the ANFE case, bits set
in the UE status register will not be reported and cleared until the next
FE/NFE arrives.

For instance, previously, when the kernel receives an ANFE with Poisoned
TLP in OS native AER mode, only the status of CE will be reported and
cleared:

AER: Correctable error message received from 0000:b7:02.0
PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
device [8086:0db0] error status/mask=00002000/00000000
[13] NonFatalErr

If the kernel receives a Malformed TLP after that, two UEs will be
reported, which is unexpected. The Malformed TLP Header is lost since
the previous ANFE gated the TLP header logs:

PCIe Bus Error: severity="Uncorrectable (Fatal), type=Transaction Layer, (Receiver ID)
device [8086:0db0] error status/mask=00041000/00180020
[12] TLP (First)
[18] MalfTLP

To handle this case properly, calculate potential ANFE related status bits
and save in aer_err_info. Use this information to determine the status bits
that need to be cleared.

Now, for the previous scenario, both CE status and related UE status will
be reported and cleared after ANFE:

AER: Correctable error message received from 0000:b7:02.0
PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
device [8086:0db0] error status/mask=00002000/00000000
[13] NonFatalErr
Uncorrectable errors that may cause Advisory Non-Fatal:
[18] TLP

Note:
checkpatch.pl will produce following warnings on PATCH2/3:

WARNING: 'UE' may be misspelled - perhaps 'USE'?
#22:
uncorrectable error(UE) status should be cleared. However, there is no

..similar warnings omitted...

This is a false-positive, so not fixed.

WARNING: Prefer a maximum 75 chars per line (possible unwrapped commit description?)
#10:
PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)

..similar warnings omitted...

For readability reasons, these warnings are not fixed.



[1] https://lore.kernel.org/linux-pci/[email protected]

Thanks
Qingshun, Zhenzhong

Changelog:
v4:
- Fix a race in anfe_get_uc_status() (Jonathan)
- Add a comment to explain side effect of processing ANFE as NFE (Jonathan)
- Drop the check for PCI_EXP_DEVSTA_NFED

v3:
- Split ANFE print and processing to two patches (Bjorn)
- Simplify ANFE handling, drop trace event
- Polish comments and patch description
- Add Tested-by

v2:
- Reference to the latest PCIe Specification in both commit messages
and comments, as suggested by Bjorn Helgaas.
- Describe the reason for storing additional information in
aer_err_info in the commit message of PATCH 1, as suggested by Bjorn
Helgaas.
- Add more details of behavior changes in the commit message of PATCH
2, as suggested by Bjorn Helgaas.

v3: https://lore.kernel.org/lkml/[email protected]
v2: https://lore.kernel.org/linux-pci/[email protected]
v1: https://lore.kernel.org/linux-pci/[email protected]

Zhenzhong Duan (3):
PCI/AER: Store UNCOR_STATUS bits that might be ANFE in aer_err_info
PCI/AER: Print UNCOR_STATUS bits that might be ANFE
PCI/AER: Clear UNCOR_STATUS bits that might be ANFE

drivers/pci/pci.h | 1 +
drivers/pci/pcie/aer.c | 75 +++++++++++++++++++++++++++++++++++++++++-
2 files changed, 75 insertions(+), 1 deletion(-)

--
2.34.1



2024-05-09 08:51:50

by Duan, Zhenzhong

[permalink] [raw]
Subject: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that might be ANFE in aer_err_info

In some cases the detector of a Non-Fatal Error(NFE) is not the most
appropriate agent to determine the type of the error. For example,
when software performs a configuration read from a non-existent
device or Function, completer will send an ERR_NONFATAL Message.
On some platforms, ERR_NONFATAL results in a System Error, which
breaks normal software probing.

Advisory Non-Fatal Error(ANFE) is a special case that can be used
in above scenario. It is predominantly determined by the role of the
detecting agent (Requester, Completer, or Receiver) and the specific
error. In such cases, an agent with AER signals the NFE (if enabled)
by sending an ERR_COR Message as an advisory to software, instead of
sending ERR_NONFATAL.

When processing an ANFE, ideally both correctable error(CE) status and
uncorrectable error(UE) status should be cleared. However, there is no
way to fully identify the UE associated with ANFE. Even worse, Non-Fatal
Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as
NFE will reproduce above mentioned issue, i.e., breaking softwore probing;
treating NFE as ANFE will make us ignoring some UEs which need active
recover operation. To avoid clearing UEs that are not ANFE by accident,
the most conservative route is taken here: If any of the NFE Detected
bits is set in Device Status, do not touch UE status, they should be
cleared later by the UE handler. Otherwise, a specific set of UEs that
may be raised as ANFE according to the PCIe specification will be cleared
if their corresponding severity is Non-Fatal.

To achieve above purpose, store UNCOR_STATUS bits that might be ANFE
in aer_err_info.anfe_status. So that those bits could be printed and
processed later.

Tested-by: Yudong Wang <[email protected]>
Co-developed-by: "Wang, Qingshun" <[email protected]>
Signed-off-by: "Wang, Qingshun" <[email protected]>
Signed-off-by: Zhenzhong Duan <[email protected]>
---
drivers/pci/pci.h | 1 +
drivers/pci/pcie/aer.c | 53 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 54 insertions(+)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 17fed1846847..3f9eb807f9fd 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -412,6 +412,7 @@ struct aer_err_info {

unsigned int status; /* COR/UNCOR Error Status */
unsigned int mask; /* COR/UNCOR Error Mask */
+ unsigned int anfe_status; /* UNCOR Error Status for ANFE */
struct pcie_tlp_log tlp; /* TLP Header */
};

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index ac6293c24976..f2839b51321a 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -107,6 +107,12 @@ struct aer_stats {
PCI_ERR_ROOT_MULTI_COR_RCV | \
PCI_ERR_ROOT_MULTI_UNCOR_RCV)

+#define AER_ERR_ANFE_UNC_MASK (PCI_ERR_UNC_POISON_TLP | \
+ PCI_ERR_UNC_COMP_TIME | \
+ PCI_ERR_UNC_COMP_ABORT | \
+ PCI_ERR_UNC_UNX_COMP | \
+ PCI_ERR_UNC_UNSUP)
+
static int pcie_aer_disable;
static pci_ers_result_t aer_root_reset(struct pci_dev *dev);

@@ -1196,6 +1202,49 @@ void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
EXPORT_SYMBOL_GPL(aer_recover_queue);
#endif

+static void anfe_get_uc_status(struct pci_dev *dev, struct aer_err_info *info)
+{
+ u32 uncor_mask, uncor_status, anfe_status;
+ u16 device_status;
+ int aer = dev->aer_cap;
+
+ pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, &uncor_status);
+ pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK, &uncor_mask);
+ /*
+ * According to PCIe Base Specification Revision 6.1,
+ * Section 6.2.3.2.4, if an UNCOR error is raised as
+ * Advisory Non-Fatal error, it will match the following
+ * conditions:
+ * a. The severity of the error is Non-Fatal.
+ * b. The error is one of the following:
+ * 1. Poisoned TLP (Section 6.2.3.2.4.3)
+ * 2. Completion Timeout (Section 6.2.3.2.4.4)
+ * 3. Completer Abort (Section 6.2.3.2.4.1)
+ * 4. Unexpected Completion (Section 6.2.3.2.4.5)
+ * 5. Unsupported Request (Section 6.2.3.2.4.1)
+ */
+ anfe_status = uncor_status & ~uncor_mask & ~info->severity &
+ AER_ERR_ANFE_UNC_MASK;
+
+ if (pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &device_status))
+ return;
+ /*
+ * Take the most conservative route here. If there are Non-Fatal errors
+ * detected, do not assume any bit in uncor_status is set by ANFE.
+ */
+ if (device_status & PCI_EXP_DEVSTA_NFED)
+ return;
+
+ /*
+ * If there is another ANFE between reading uncor_status and clearing
+ * PCI_ERR_COR_ADV_NFAT bit in cor_status register, that ANFE isn't
+ * recorded in info->anfe_status. It will be read out as NFE in
+ * following uncor_status register reading and processed by NFE
+ * handler.
+ */
+ info->anfe_status = anfe_status;
+}
+
/**
* aer_get_device_error_info - read error status from dev and store it to info
* @dev: pointer to the device expected to have a error record
@@ -1213,6 +1262,7 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)

/* Must reset in this function */
info->status = 0;
+ info->anfe_status = 0;
info->tlp_header_valid = 0;

/* The device might not support AER */
@@ -1226,6 +1276,9 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
&info->mask);
if (!(info->status & ~info->mask))
return 0;
+
+ if (info->status & PCI_ERR_COR_ADV_NFAT)
+ anfe_get_uc_status(dev, info);
} else if (type == PCI_EXP_TYPE_ROOT_PORT ||
type == PCI_EXP_TYPE_RC_EC ||
type == PCI_EXP_TYPE_DOWNSTREAM ||
--
2.34.1


2024-05-09 08:52:21

by Duan, Zhenzhong

[permalink] [raw]
Subject: [PATCH v4 3/3] PCI/AER: Clear UNCOR_STATUS bits that might be ANFE

When processing an ANFE, ideally both correctable error(CE) status and
uncorrectable error(UE) status should be cleared. However, there is no
way to fully identify the UE associated with ANFE. Even worse, Non-Fatal
Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as
NFE will bring some issues, i.e., breaking softwore probing; treating
NFE as ANFE will make us ignoring some UEs which need active recover
operation. To avoid clearing UEs that are not ANFE by accident, the
most conservative route is taken here: If any of the NFE Detected bits
is set in Device Status, do not touch UE status, they should be cleared
later by the UE handler. Otherwise, a specific set of UEs that may be
raised as ANFE according to the PCIe specification will be cleared if
their corresponding severity is Non-Fatal.

For instance, previously when kernel receives an ANFE with Poisoned TLP
in OS native AER mode, only status of CE will be reported and cleared:

AER: Correctable error message received from 0000:b7:02.0
PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
device [8086:0db0] error status/mask=00002000/00000000
[13] NonFatalErr

If the kernel receives a Malformed TLP after that, two UEs will be
reported, which is unexpected. Malformed TLP Header is lost since
the previous ANFE gated the TLP header logs:

PCIe Bus Error: severity="Uncorrectable (Fatal), type=Transaction Layer, (Receiver ID)
device [8086:0db0] error status/mask=00041000/00180020
[12] TLP (First)
[18] MalfTLP

Now, for the same scenario, both CE status and related UE status will be
reported and cleared after ANFE:

AER: Correctable error message received from 0000:b7:02.0
PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
device [8086:0db0] error status/mask=00002000/00000000
[13] NonFatalErr
Uncorrectable errors that may cause Advisory Non-Fatal:
[18] TLP

Tested-by: Yudong Wang <[email protected]>
Co-developed-by: "Wang, Qingshun" <[email protected]>
Signed-off-by: "Wang, Qingshun" <[email protected]>
Signed-off-by: Zhenzhong Duan <[email protected]>
---
drivers/pci/pcie/aer.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index ed435f09ac27..6a6a3a40569a 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1115,9 +1115,14 @@ static void pci_aer_handle_error(struct pci_dev *dev, struct aer_err_info *info)
* Correctable error does not need software intervention.
* No need to go through error recovery process.
*/
- if (aer)
+ if (aer) {
pci_write_config_dword(dev, aer + PCI_ERR_COR_STATUS,
info->status);
+ if (info->anfe_status)
+ pci_write_config_dword(dev,
+ aer + PCI_ERR_UNCOR_STATUS,
+ info->anfe_status);
+ }
if (pcie_aer_is_native(dev)) {
struct pci_driver *pdrv = dev->driver;

--
2.34.1


2024-05-09 08:52:42

by Duan, Zhenzhong

[permalink] [raw]
Subject: [PATCH v4 2/3] PCI/AER: Print UNCOR_STATUS bits that might be ANFE

When an Advisory Non-Fatal error(ANFE) triggers, both correctable error(CE)
status and ANFE related uncorrectable error(UE) status will be printed:

AER: Correctable error message received from 0000:b7:02.0
PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
device [8086:0db0] error status/mask=00002000/00000000
[13] NonFatalErr
Uncorrectable errors that may cause Advisory Non-Fatal:
[18] TLP

Tested-by: Yudong Wang <[email protected]>
Co-developed-by: "Wang, Qingshun" <[email protected]>
Signed-off-by: "Wang, Qingshun" <[email protected]>
Signed-off-by: Zhenzhong Duan <[email protected]>
---
drivers/pci/pcie/aer.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index f2839b51321a..ed435f09ac27 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -681,6 +681,7 @@ static void __aer_print_error(struct pci_dev *dev,
{
const char **strings;
unsigned long status = info->status & ~info->mask;
+ unsigned long anfe_status = info->anfe_status;
const char *level, *errmsg;
int i;

@@ -701,6 +702,20 @@ static void __aer_print_error(struct pci_dev *dev,
info->first_error == i ? " (First)" : "");
}
pci_dev_aer_stats_incr(dev, info);
+
+ if (!anfe_status)
+ return;
+
+ strings = aer_uncorrectable_error_string;
+ pci_printk(level, dev, "Uncorrectable errors that may cause Advisory Non-Fatal:\n");
+
+ for_each_set_bit(i, &anfe_status, 32) {
+ errmsg = strings[i];
+ if (!errmsg)
+ errmsg = "Unknown Error Bit";
+
+ pci_printk(level, dev, " [%2d] %s\n", i, errmsg);
+ }
}

void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
--
2.34.1


2024-05-29 05:33:11

by Duan, Zhenzhong

[permalink] [raw]
Subject: RE: [PATCH v4 0/3] PCI/AER: Handle Advisory Non-Fatal error

Hi,

Kindly ping.
Appreciate comments and suggestions so I could go ahead.

Thanks
Zhenzhong

>-----Original Message-----
>From: Duan, Zhenzhong <[email protected]>
>Subject: [PATCH v4 0/3] PCI/AER: Handle Advisory Non-Fatal error
>
>Hi,
>
>This is a relay work of Qingshun's v2 [1], but changed to focus on ANFE
>processing as subject suggests and drops trace-event for now. I think it's
>a bit heavy to do extra IOes to get PCIe registers only for trace purpose
>and not see it a community request for now.
>
>According to PCIe Base Specification Revision 6.1, Sections 6.2.3.2.4 and
>6.2.4.3, certain uncorrectable errors will signal ERR_COR instead of
>ERR_NONFATAL, logged as Advisory Non-Fatal Error(ANFE), and set bits in
>both Correctable Error(CE) Status register and Uncorrectable Error(UE)
>Status register. Currently, when handling AER events the kernel will only
>look at CE status or UE status, but never both. In the ANFE case, bits set
>in the UE status register will not be reported and cleared until the next
>FE/NFE arrives.
>
>For instance, previously, when the kernel receives an ANFE with Poisoned
>TLP in OS native AER mode, only the status of CE will be reported and
>cleared:
>
> AER: Correctable error message received from 0000:b7:02.0
> PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
> device [8086:0db0] error status/mask=00002000/00000000
> [13] NonFatalErr
>
>If the kernel receives a Malformed TLP after that, two UEs will be
>reported, which is unexpected. The Malformed TLP Header is lost since
>the previous ANFE gated the TLP header logs:
>
> PCIe Bus Error: severity="Uncorrectable (Fatal), type=Transaction Layer,
>(Receiver ID)
> device [8086:0db0] error status/mask=00041000/00180020
> [12] TLP (First)
> [18] MalfTLP
>
>To handle this case properly, calculate potential ANFE related status bits
>and save in aer_err_info. Use this information to determine the status bits
>that need to be cleared.
>
>Now, for the previous scenario, both CE status and related UE status will
>be reported and cleared after ANFE:
>
> AER: Correctable error message received from 0000:b7:02.0
> PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
> device [8086:0db0] error status/mask=00002000/00000000
> [13] NonFatalErr
> Uncorrectable errors that may cause Advisory Non-Fatal:
> [18] TLP
>
>Note:
>checkpatch.pl will produce following warnings on PATCH2/3:
>
>WARNING: 'UE' may be misspelled - perhaps 'USE'?
>#22:
>uncorrectable error(UE) status should be cleared. However, there is no
>
>...similar warnings omitted...
>
>This is a false-positive, so not fixed.
>
>WARNING: Prefer a maximum 75 chars per line (possible unwrapped commit
>description?)
>#10:
> PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
>
>...similar warnings omitted...
>
>For readability reasons, these warnings are not fixed.
>
>
>
>[1] https://lore.kernel.org/linux-pci/20240125062802.50819-1-
>[email protected]
>
>Thanks
>Qingshun, Zhenzhong
>
>Changelog:
>v4:
> - Fix a race in anfe_get_uc_status() (Jonathan)
> - Add a comment to explain side effect of processing ANFE as NFE (Jonathan)
> - Drop the check for PCI_EXP_DEVSTA_NFED
>
>v3:
> - Split ANFE print and processing to two patches (Bjorn)
> - Simplify ANFE handling, drop trace event
> - Polish comments and patch description
> - Add Tested-by
>
>v2:
> - Reference to the latest PCIe Specification in both commit messages
> and comments, as suggested by Bjorn Helgaas.
> - Describe the reason for storing additional information in
> aer_err_info in the commit message of PATCH 1, as suggested by Bjorn
> Helgaas.
> - Add more details of behavior changes in the commit message of PATCH
> 2, as suggested by Bjorn Helgaas.
>
>v3: https://lore.kernel.org/lkml/20240417061407.1491361-1-
>[email protected]
>v2: https://lore.kernel.org/linux-pci/20240125062802.50819-1-
>[email protected]
>v1: https://lore.kernel.org/linux-pci/20240111073227.31488-1-
>[email protected]
>
>Zhenzhong Duan (3):
> PCI/AER: Store UNCOR_STATUS bits that might be ANFE in aer_err_info
> PCI/AER: Print UNCOR_STATUS bits that might be ANFE
> PCI/AER: Clear UNCOR_STATUS bits that might be ANFE
>
> drivers/pci/pci.h | 1 +
> drivers/pci/pcie/aer.c | 75
>+++++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 75 insertions(+), 1 deletion(-)
>
>--
>2.34.1


Subject: Re: [PATCH v4 2/3] PCI/AER: Print UNCOR_STATUS bits that might be ANFE


On 5/9/24 1:48 AM, Zhenzhong Duan wrote:
> When an Advisory Non-Fatal error(ANFE) triggers, both correctable error(CE)
> status and ANFE related uncorrectable error(UE) status will be printed:
>
> AER: Correctable error message received from 0000:b7:02.0
> PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
> device [8086:0db0] error status/mask=00002000/00000000
> [13] NonFatalErr
> Uncorrectable errors that may cause Advisory Non-Fatal:
> [18] TLP
>
> Tested-by: Yudong Wang <[email protected]>
> Co-developed-by: "Wang, Qingshun" <[email protected]>
> Signed-off-by: "Wang, Qingshun" <[email protected]>
> Signed-off-by: Zhenzhong Duan <[email protected]>
> ---

LGTM

Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>

> drivers/pci/pcie/aer.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index f2839b51321a..ed435f09ac27 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -681,6 +681,7 @@ static void __aer_print_error(struct pci_dev *dev,
> {
> const char **strings;
> unsigned long status = info->status & ~info->mask;
> + unsigned long anfe_status = info->anfe_status;
> const char *level, *errmsg;
> int i;
>
> @@ -701,6 +702,20 @@ static void __aer_print_error(struct pci_dev *dev,
> info->first_error == i ? " (First)" : "");
> }
> pci_dev_aer_stats_incr(dev, info);
> +
> + if (!anfe_status)
> + return;
> +
> + strings = aer_uncorrectable_error_string;
> + pci_printk(level, dev, "Uncorrectable errors that may cause Advisory Non-Fatal:\n");
> +
> + for_each_set_bit(i, &anfe_status, 32) {
> + errmsg = strings[i];
> + if (!errmsg)
> + errmsg = "Unknown Error Bit";
> +
> + pci_printk(level, dev, " [%2d] %s\n", i, errmsg);
> + }
> }
>
> void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


2024-06-14 02:40:04

by Duan, Zhenzhong

[permalink] [raw]
Subject: RE: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that might be ANFE in aer_err_info

Hi

>-----Original Message-----
>From: Kuppuswamy Sathyanarayanan
><[email protected]>
>Subject: Re: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that might
>be ANFE in aer_err_info
>
>Hi,
>
>On 5/9/24 1:48 AM, Zhenzhong Duan wrote:
>> In some cases the detector of a Non-Fatal Error(NFE) is not the most
>> appropriate agent to determine the type of the error. For example,
>> when software performs a configuration read from a non-existent
>> device or Function, completer will send an ERR_NONFATAL Message.
>> On some platforms, ERR_NONFATAL results in a System Error, which
>> breaks normal software probing.
>>
>> Advisory Non-Fatal Error(ANFE) is a special case that can be used
>> in above scenario. It is predominantly determined by the role of the
>> detecting agent (Requester, Completer, or Receiver) and the specific
>> error. In such cases, an agent with AER signals the NFE (if enabled)
>> by sending an ERR_COR Message as an advisory to software, instead of
>> sending ERR_NONFATAL.
>>
>> When processing an ANFE, ideally both correctable error(CE) status and
>> uncorrectable error(UE) status should be cleared. However, there is no
>> way to fully identify the UE associated with ANFE. Even worse, Non-Fatal
>> Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as
>> NFE will reproduce above mentioned issue, i.e., breaking softwore probing;
>> treating NFE as ANFE will make us ignoring some UEs which need active
>> recover operation. To avoid clearing UEs that are not ANFE by accident,
>> the most conservative route is taken here: If any of the NFE Detected
>> bits is set in Device Status, do not touch UE status, they should be
>> cleared later by the UE handler. Otherwise, a specific set of UEs that
>> may be raised as ANFE according to the PCIe specification will be cleared
>> if their corresponding severity is Non-Fatal.
>>
>> To achieve above purpose, store UNCOR_STATUS bits that might be ANFE
>> in aer_err_info.anfe_status. So that those bits could be printed and
>> processed later.
>>
>> Tested-by: Yudong Wang <[email protected]>
>> Co-developed-by: "Wang, Qingshun" <[email protected]>
>> Signed-off-by: "Wang, Qingshun" <[email protected]>
>> Signed-off-by: Zhenzhong Duan <[email protected]>
>> ---
>> drivers/pci/pci.h | 1 +
>> drivers/pci/pcie/aer.c | 53
>++++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 54 insertions(+)
>>
>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>> index 17fed1846847..3f9eb807f9fd 100644
>> --- a/drivers/pci/pci.h
>> +++ b/drivers/pci/pci.h
>> @@ -412,6 +412,7 @@ struct aer_err_info {
>>
>> unsigned int status; /* COR/UNCOR Error Status */
>> unsigned int mask; /* COR/UNCOR Error Mask */
>> + unsigned int anfe_status; /* UNCOR Error Status for ANFE */
>> struct pcie_tlp_log tlp; /* TLP Header */
>> };
>>
>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>> index ac6293c24976..f2839b51321a 100644
>> --- a/drivers/pci/pcie/aer.c
>> +++ b/drivers/pci/pcie/aer.c
>> @@ -107,6 +107,12 @@ struct aer_stats {
>> PCI_ERR_ROOT_MULTI_COR_RCV |
> \
>> PCI_ERR_ROOT_MULTI_UNCOR_RCV)
>>
>> +#define AER_ERR_ANFE_UNC_MASK
> (PCI_ERR_UNC_POISON_TLP | \
>> + PCI_ERR_UNC_COMP_TIME |
> \
>> + PCI_ERR_UNC_COMP_ABORT |
> \
>> + PCI_ERR_UNC_UNX_COMP |
> \
>> + PCI_ERR_UNC_UNSUP)
>> +
>> static int pcie_aer_disable;
>> static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
>>
>> @@ -1196,6 +1202,49 @@ void aer_recover_queue(int domain, unsigned
>int bus, unsigned int devfn,
>> EXPORT_SYMBOL_GPL(aer_recover_queue);
>> #endif
>>
>> +static void anfe_get_uc_status(struct pci_dev *dev, struct aer_err_info
>*info)
>> +{
>> + u32 uncor_mask, uncor_status, anfe_status;
>> + u16 device_status;
>> + int aer = dev->aer_cap;
>> +
>> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS,
>&uncor_status);
>> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK,
>&uncor_mask);
>> + /*
>> + * According to PCIe Base Specification Revision 6.1,
>> + * Section 6.2.3.2.4, if an UNCOR error is raised as
>> + * Advisory Non-Fatal error, it will match the following
>> + * conditions:
>> + * a. The severity of the error is Non-Fatal.
>> + * b. The error is one of the following:
>> + * 1. Poisoned TLP (Section 6.2.3.2.4.3)
>> + * 2. Completion Timeout (Section 6.2.3.2.4.4)
>> + * 3. Completer Abort (Section 6.2.3.2.4.1)
>> + * 4. Unexpected Completion (Section 6.2.3.2.4.5)
>> + * 5. Unsupported Request (Section 6.2.3.2.4.1)
>> + */
>> + anfe_status = uncor_status & ~uncor_mask & ~info->severity &
>> + AER_ERR_ANFE_UNC_MASK;
>> +
>> + if (pcie_capability_read_word(dev, PCI_EXP_DEVSTA,
>&device_status))
>> + return;
>> + /*
>> + * Take the most conservative route here. If there are Non-Fatal
>errors
>> + * detected, do not assume any bit in uncor_status is set by ANFE.
>> + */
>> + if (device_status & PCI_EXP_DEVSTA_NFED)
>> + return;
>
>You can move this check to the top of the function. You don't need to check
>the rest if NFE error is detected in device status.

The v3 just worked that way. Jonathan pointed a race that NFE triggered after
the check will be treated as ANFE and cleared. Check it after reading UNCOR_STATUS
can avoid the race.

See https://lkml.org/lkml/2024/4/22/1011 for discussion details.

Thanks
Zhenzhong

>
>> +
>> + /*
>> + * If there is another ANFE between reading uncor_status and
>clearing
>> + * PCI_ERR_COR_ADV_NFAT bit in cor_status register, that ANFE
>isn't
>> + * recorded in info->anfe_status. It will be read out as NFE in
>> + * following uncor_status register reading and processed by NFE
>> + * handler.
>> + */
>> + info->anfe_status = anfe_status;
>> +}
>> +
>> /**
>> * aer_get_device_error_info - read error status from dev and store it to
>info
>> * @dev: pointer to the device expected to have a error record
>> @@ -1213,6 +1262,7 @@ int aer_get_device_error_info(struct pci_dev
>*dev, struct aer_err_info *info)
>>
>> /* Must reset in this function */
>> info->status = 0;
>> + info->anfe_status = 0;
>> info->tlp_header_valid = 0;
>>
>> /* The device might not support AER */
>> @@ -1226,6 +1276,9 @@ int aer_get_device_error_info(struct pci_dev
>*dev, struct aer_err_info *info)
>> &info->mask);
>> if (!(info->status & ~info->mask))
>> return 0;
>> +
>> + if (info->status & PCI_ERR_COR_ADV_NFAT)
>> + anfe_get_uc_status(dev, info);
>> } else if (type == PCI_EXP_TYPE_ROOT_PORT ||
>> type == PCI_EXP_TYPE_RC_EC ||
>> type == PCI_EXP_TYPE_DOWNSTREAM ||
>
>--
>Sathyanarayanan Kuppuswamy
>Linux Kernel Developer

2024-06-14 02:40:58

by Duan, Zhenzhong

[permalink] [raw]
Subject: RE: [PATCH v4 3/3] PCI/AER: Clear UNCOR_STATUS bits that might be ANFE



>-----Original Message-----
>From: Kuppuswamy, Sathyanarayanan
>Subject: Re: [PATCH v4 3/3] PCI/AER: Clear UNCOR_STATUS bits that might
>be ANFE
>
>
>On 5/9/24 1:48 AM, Zhenzhong Duan wrote:
>> When processing an ANFE, ideally both correctable error(CE) status and
>> uncorrectable error(UE) status should be cleared. However, there is no
>> way to fully identify the UE associated with ANFE. Even worse, Non-Fatal
>> Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as
>> NFE will bring some issues, i.e., breaking softwore probing; treating
>/s/softwore/software

Good catch, will fix. It's strange 'checkpatch --codespell' doesn't catch this.

>
>May be this is already discussed. But can you explain why treating
>AFNE as non-fatal error will bring probing issues?

Copied below from spec 6.1, 6.2.3.2.4, says it can results in a System Error.

In some cases the detector of a non-fatal error is not the most appropriate agent to determine whether the error is
recoverable or not, or if it even needs any recovery action at all. For example, if software attempts to perform a
configuration read from a non-existent device or Function, the resulting UR Status in the Completion will signal the error
to software, and software does not need for the Completer in addition to signal the error by sending an ERR_NONFATAL
Message. In fact, on some platforms, signaling the error with ERR_NONFATAL results in a System Error, which breaks
normal software probing.

>> NFE as ANFE will make us ignoring some UEs which need active recover
>/s/ignoring/ignore

Will fix.

>> operation. To avoid clearing UEs that are not ANFE by accident, the
>> most conservative route is taken here: If any of the NFE Detected bits
>> is set in Device Status, do not touch UE status, they should be cleared
>> later by the UE handler. Otherwise, a specific set of UEs that may be
>> raised as ANFE according to the PCIe specification will be cleared if
>> their corresponding severity is Non-Fatal.
>>
>> For instance, previously when kernel receives an ANFE with Poisoned TLP
>> in OS native AER mode, only status of CE will be reported and cleared:
>>
>> AER: Correctable error message received from 0000:b7:02.0
>> PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
>> device [8086:0db0] error status/mask=00002000/00000000
>> [13] NonFatalErr
>>
>> If the kernel receives a Malformed TLP after that, two UEs will be
>> reported, which is unexpected. Malformed TLP Header is lost since
>> the previous ANFE gated the TLP header logs:
>>
>> PCIe Bus Error: severity="Uncorrectable (Fatal), type=Transaction Layer,
>(Receiver ID)
>> device [8086:0db0] error status/mask=00041000/00180020
>> [12] TLP (First)
>> [18] MalfTLP
>>
>> Now, for the same scenario, both CE status and related UE status will be
>> reported and cleared after ANFE:
>>
>> AER: Correctable error message received from 0000:b7:02.0
>> PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
>> device [8086:0db0] error status/mask=00002000/00000000
>> [13] NonFatalErr
>> Uncorrectable errors that may cause Advisory Non-Fatal:
>> [18] TLP
>>
>> Tested-by: Yudong Wang <[email protected]>
>> Co-developed-by: "Wang, Qingshun" <[email protected]>
>> Signed-off-by: "Wang, Qingshun" <[email protected]>
>> Signed-off-by: Zhenzhong Duan <[email protected]>
>> ---
>> drivers/pci/pcie/aer.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>> index ed435f09ac27..6a6a3a40569a 100644
>> --- a/drivers/pci/pcie/aer.c
>> +++ b/drivers/pci/pcie/aer.c
>> @@ -1115,9 +1115,14 @@ static void pci_aer_handle_error(struct
>pci_dev *dev, struct aer_err_info *info)
>> * Correctable error does not need software intervention.
>> * No need to go through error recovery process.
>> */
>> - if (aer)
>> + if (aer) {
>> pci_write_config_dword(dev, aer +
>PCI_ERR_COR_STATUS,
>> info->status);
>> + if (info->anfe_status)
>> + pci_write_config_dword(dev,
>> + aer +
>PCI_ERR_UNCOR_STATUS,
>> + info->anfe_status);
>> + }
>
>Why split the handling part and storing part into two patches? Why not
>merge
>this part of patch 1/3.

This is based on Bjorn's suggestion at https://www.spinics.net/lists/linux-pci/msg149012.html,
clearing UNCOR_STATUS might be more important, deserve to raise out.

Thanks
Zhenzhong

Subject: Re: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that might be ANFE in aer_err_info


On 6/13/24 7:39 PM, Duan, Zhenzhong wrote:
> Hi
>
>> -----Original Message-----
>> From: Kuppuswamy Sathyanarayanan
>> <[email protected]>
>> Subject: Re: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that might
>> be ANFE in aer_err_info
>>
>> Hi,
>>
>> On 5/9/24 1:48 AM, Zhenzhong Duan wrote:
>>> In some cases the detector of a Non-Fatal Error(NFE) is not the most
>>> appropriate agent to determine the type of the error. For example,
>>> when software performs a configuration read from a non-existent
>>> device or Function, completer will send an ERR_NONFATAL Message.
>>> On some platforms, ERR_NONFATAL results in a System Error, which
>>> breaks normal software probing.
>>>
>>> Advisory Non-Fatal Error(ANFE) is a special case that can be used
>>> in above scenario. It is predominantly determined by the role of the
>>> detecting agent (Requester, Completer, or Receiver) and the specific
>>> error. In such cases, an agent with AER signals the NFE (if enabled)
>>> by sending an ERR_COR Message as an advisory to software, instead of
>>> sending ERR_NONFATAL.
>>>
>>> When processing an ANFE, ideally both correctable error(CE) status and
>>> uncorrectable error(UE) status should be cleared. However, there is no
>>> way to fully identify the UE associated with ANFE. Even worse, Non-Fatal
>>> Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as
>>> NFE will reproduce above mentioned issue, i.e., breaking softwore probing;
>>> treating NFE as ANFE will make us ignoring some UEs which need active
>>> recover operation. To avoid clearing UEs that are not ANFE by accident,
>>> the most conservative route is taken here: If any of the NFE Detected
>>> bits is set in Device Status, do not touch UE status, they should be
>>> cleared later by the UE handler. Otherwise, a specific set of UEs that
>>> may be raised as ANFE according to the PCIe specification will be cleared
>>> if their corresponding severity is Non-Fatal.
>>>
>>> To achieve above purpose, store UNCOR_STATUS bits that might be ANFE
>>> in aer_err_info.anfe_status. So that those bits could be printed and
>>> processed later.
>>>
>>> Tested-by: Yudong Wang <[email protected]>
>>> Co-developed-by: "Wang, Qingshun" <[email protected]>
>>> Signed-off-by: "Wang, Qingshun" <[email protected]>
>>> Signed-off-by: Zhenzhong Duan <[email protected]>
>>> ---
>>> drivers/pci/pci.h | 1 +
>>> drivers/pci/pcie/aer.c | 53
>> ++++++++++++++++++++++++++++++++++++++++++
>>> 2 files changed, 54 insertions(+)
>>>
>>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>>> index 17fed1846847..3f9eb807f9fd 100644
>>> --- a/drivers/pci/pci.h
>>> +++ b/drivers/pci/pci.h
>>> @@ -412,6 +412,7 @@ struct aer_err_info {
>>>
>>> unsigned int status; /* COR/UNCOR Error Status */
>>> unsigned int mask; /* COR/UNCOR Error Mask */
>>> + unsigned int anfe_status; /* UNCOR Error Status for ANFE */
>>> struct pcie_tlp_log tlp; /* TLP Header */
>>> };
>>>
>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>> index ac6293c24976..f2839b51321a 100644
>>> --- a/drivers/pci/pcie/aer.c
>>> +++ b/drivers/pci/pcie/aer.c
>>> @@ -107,6 +107,12 @@ struct aer_stats {
>>> PCI_ERR_ROOT_MULTI_COR_RCV |
>> \
>>> PCI_ERR_ROOT_MULTI_UNCOR_RCV)
>>>
>>> +#define AER_ERR_ANFE_UNC_MASK
>> (PCI_ERR_UNC_POISON_TLP | \
>>> + PCI_ERR_UNC_COMP_TIME |
>> \
>>> + PCI_ERR_UNC_COMP_ABORT |
>> \
>>> + PCI_ERR_UNC_UNX_COMP |
>> \
>>> + PCI_ERR_UNC_UNSUP)
>>> +
>>> static int pcie_aer_disable;
>>> static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
>>>
>>> @@ -1196,6 +1202,49 @@ void aer_recover_queue(int domain, unsigned
>> int bus, unsigned int devfn,
>>> EXPORT_SYMBOL_GPL(aer_recover_queue);
>>> #endif
>>>
>>> +static void anfe_get_uc_status(struct pci_dev *dev, struct aer_err_info
>> *info)
>>> +{
>>> + u32 uncor_mask, uncor_status, anfe_status;
>>> + u16 device_status;
>>> + int aer = dev->aer_cap;
>>> +
>>> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS,
>> &uncor_status);
>>> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK,
>> &uncor_mask);
>>> + /*
>>> + * According to PCIe Base Specification Revision 6.1,
>>> + * Section 6.2.3.2.4, if an UNCOR error is raised as
>>> + * Advisory Non-Fatal error, it will match the following
>>> + * conditions:
>>> + * a. The severity of the error is Non-Fatal.
>>> + * b. The error is one of the following:
>>> + * 1. Poisoned TLP (Section 6.2.3.2.4.3)
>>> + * 2. Completion Timeout (Section 6.2.3.2.4.4)
>>> + * 3. Completer Abort (Section 6.2.3.2.4.1)
>>> + * 4. Unexpected Completion (Section 6.2.3.2.4.5)
>>> + * 5. Unsupported Request (Section 6.2.3.2.4.1)
>>> + */
>>> + anfe_status = uncor_status & ~uncor_mask & ~info->severity &
>>> + AER_ERR_ANFE_UNC_MASK;
>>> +
>>> + if (pcie_capability_read_word(dev, PCI_EXP_DEVSTA,
>> &device_status))
>>> + return;
>>> + /*
>>> + * Take the most conservative route here. If there are Non-Fatal
>> errors
>>> + * detected, do not assume any bit in uncor_status is set by ANFE.
>>> + */
>>> + if (device_status & PCI_EXP_DEVSTA_NFED)
>>> + return;
>> You can move this check to the top of the function. You don't need to check
>> the rest if NFE error is detected in device status.
> The v3 just worked that way. Jonathan pointed a race that NFE triggered after
> the check will be treated as ANFE and cleared. Check it after reading UNCOR_STATUS
> can avoid the race.
>
> See https://lkml.org/lkml/2024/4/22/1011 for discussion details.

Got it. I would recommend adding a comment about it in handler. May be
some thing like,

/*
 * To avoid race between device status read and error status register read, cache
 * uncorrectable error status before checking for NFE in device status * register. */
>
> Thanks
> Zhenzhong
>
>>> +
>>> + /*
>>> + * If there is another ANFE between reading uncor_status and
>> clearing
>>> + * PCI_ERR_COR_ADV_NFAT bit in cor_status register, that ANFE
>> isn't
>>> + * recorded in info->anfe_status. It will be read out as NFE in
>>> + * following uncor_status register reading and processed by NFE
>>> + * handler.
>>> + */
>>> + info->anfe_status = anfe_status;
>>> +}
>>> +
>>> /**
>>> * aer_get_device_error_info - read error status from dev and store it to
>> info
>>> * @dev: pointer to the device expected to have a error record
>>> @@ -1213,6 +1262,7 @@ int aer_get_device_error_info(struct pci_dev
>> *dev, struct aer_err_info *info)
>>> /* Must reset in this function */
>>> info->status = 0;
>>> + info->anfe_status = 0;
>>> info->tlp_header_valid = 0;
>>>
>>> /* The device might not support AER */
>>> @@ -1226,6 +1276,9 @@ int aer_get_device_error_info(struct pci_dev
>> *dev, struct aer_err_info *info)
>>> &info->mask);
>>> if (!(info->status & ~info->mask))
>>> return 0;
>>> +
>>> + if (info->status & PCI_ERR_COR_ADV_NFAT)
>>> + anfe_get_uc_status(dev, info);
>>> } else if (type == PCI_EXP_TYPE_ROOT_PORT ||
>>> type == PCI_EXP_TYPE_RC_EC ||
>>> type == PCI_EXP_TYPE_DOWNSTREAM ||
>> --
>> Sathyanarayanan Kuppuswamy
>> Linux Kernel Developer

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


2024-06-14 03:14:19

by Duan, Zhenzhong

[permalink] [raw]
Subject: RE: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that might be ANFE in aer_err_info



>-----Original Message-----
>From: Kuppuswamy Sathyanarayanan
><[email protected]>
>Subject: Re: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that might
>be ANFE in aer_err_info
>
>
>On 6/13/24 7:39 PM, Duan, Zhenzhong wrote:
>> Hi
>>
>>> -----Original Message-----
>>> From: Kuppuswamy Sathyanarayanan
>>> <[email protected]>
>>> Subject: Re: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that
>might
>>> be ANFE in aer_err_info
>>>
>>> Hi,
>>>
>>> On 5/9/24 1:48 AM, Zhenzhong Duan wrote:
>>>> In some cases the detector of a Non-Fatal Error(NFE) is not the most
>>>> appropriate agent to determine the type of the error. For example,
>>>> when software performs a configuration read from a non-existent
>>>> device or Function, completer will send an ERR_NONFATAL Message.
>>>> On some platforms, ERR_NONFATAL results in a System Error, which
>>>> breaks normal software probing.
>>>>
>>>> Advisory Non-Fatal Error(ANFE) is a special case that can be used
>>>> in above scenario. It is predominantly determined by the role of the
>>>> detecting agent (Requester, Completer, or Receiver) and the specific
>>>> error. In such cases, an agent with AER signals the NFE (if enabled)
>>>> by sending an ERR_COR Message as an advisory to software, instead of
>>>> sending ERR_NONFATAL.
>>>>
>>>> When processing an ANFE, ideally both correctable error(CE) status and
>>>> uncorrectable error(UE) status should be cleared. However, there is no
>>>> way to fully identify the UE associated with ANFE. Even worse, Non-Fatal
>>>> Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as
>>>> NFE will reproduce above mentioned issue, i.e., breaking softwore
>probing;
>>>> treating NFE as ANFE will make us ignoring some UEs which need active
>>>> recover operation. To avoid clearing UEs that are not ANFE by accident,
>>>> the most conservative route is taken here: If any of the NFE Detected
>>>> bits is set in Device Status, do not touch UE status, they should be
>>>> cleared later by the UE handler. Otherwise, a specific set of UEs that
>>>> may be raised as ANFE according to the PCIe specification will be cleared
>>>> if their corresponding severity is Non-Fatal.
>>>>
>>>> To achieve above purpose, store UNCOR_STATUS bits that might be
>ANFE
>>>> in aer_err_info.anfe_status. So that those bits could be printed and
>>>> processed later.
>>>>
>>>> Tested-by: Yudong Wang <[email protected]>
>>>> Co-developed-by: "Wang, Qingshun" <[email protected]>
>>>> Signed-off-by: "Wang, Qingshun" <[email protected]>
>>>> Signed-off-by: Zhenzhong Duan <[email protected]>
>>>> ---
>>>> drivers/pci/pci.h | 1 +
>>>> drivers/pci/pcie/aer.c | 53
>>> ++++++++++++++++++++++++++++++++++++++++++
>>>> 2 files changed, 54 insertions(+)
>>>>
>>>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>>>> index 17fed1846847..3f9eb807f9fd 100644
>>>> --- a/drivers/pci/pci.h
>>>> +++ b/drivers/pci/pci.h
>>>> @@ -412,6 +412,7 @@ struct aer_err_info {
>>>>
>>>> unsigned int status; /* COR/UNCOR Error Status */
>>>> unsigned int mask; /* COR/UNCOR Error Mask */
>>>> + unsigned int anfe_status; /* UNCOR Error Status for ANFE */
>>>> struct pcie_tlp_log tlp; /* TLP Header */
>>>> };
>>>>
>>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>>> index ac6293c24976..f2839b51321a 100644
>>>> --- a/drivers/pci/pcie/aer.c
>>>> +++ b/drivers/pci/pcie/aer.c
>>>> @@ -107,6 +107,12 @@ struct aer_stats {
>>>> PCI_ERR_ROOT_MULTI_COR_RCV |
>>> \
>>>> PCI_ERR_ROOT_MULTI_UNCOR_RCV)
>>>>
>>>> +#define AER_ERR_ANFE_UNC_MASK
>>> (PCI_ERR_UNC_POISON_TLP | \
>>>> + PCI_ERR_UNC_COMP_TIME |
>>> \
>>>> + PCI_ERR_UNC_COMP_ABORT |
>>> \
>>>> + PCI_ERR_UNC_UNX_COMP |
>>> \
>>>> + PCI_ERR_UNC_UNSUP)
>>>> +
>>>> static int pcie_aer_disable;
>>>> static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
>>>>
>>>> @@ -1196,6 +1202,49 @@ void aer_recover_queue(int domain,
>unsigned
>>> int bus, unsigned int devfn,
>>>> EXPORT_SYMBOL_GPL(aer_recover_queue);
>>>> #endif
>>>>
>>>> +static void anfe_get_uc_status(struct pci_dev *dev, struct aer_err_info
>>> *info)
>>>> +{
>>>> + u32 uncor_mask, uncor_status, anfe_status;
>>>> + u16 device_status;
>>>> + int aer = dev->aer_cap;
>>>> +
>>>> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS,
>>> &uncor_status);
>>>> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK,
>>> &uncor_mask);
>>>> + /*
>>>> + * According to PCIe Base Specification Revision 6.1,
>>>> + * Section 6.2.3.2.4, if an UNCOR error is raised as
>>>> + * Advisory Non-Fatal error, it will match the following
>>>> + * conditions:
>>>> + * a. The severity of the error is Non-Fatal.
>>>> + * b. The error is one of the following:
>>>> + * 1. Poisoned TLP (Section 6.2.3.2.4.3)
>>>> + * 2. Completion Timeout (Section 6.2.3.2.4.4)
>>>> + * 3. Completer Abort (Section 6.2.3.2.4.1)
>>>> + * 4. Unexpected Completion (Section 6.2.3.2.4.5)
>>>> + * 5. Unsupported Request (Section 6.2.3.2.4.1)
>>>> + */
>>>> + anfe_status = uncor_status & ~uncor_mask & ~info->severity &
>>>> + AER_ERR_ANFE_UNC_MASK;
>>>> +
>>>> + if (pcie_capability_read_word(dev, PCI_EXP_DEVSTA,
>>> &device_status))
>>>> + return;
>>>> + /*
>>>> + * Take the most conservative route here. If there are Non-Fatal
>>> errors
>>>> + * detected, do not assume any bit in uncor_status is set by ANFE.
>>>> + */
>>>> + if (device_status & PCI_EXP_DEVSTA_NFED)
>>>> + return;
>>> You can move this check to the top of the function. You don't need to
>check
>>> the rest if NFE error is detected in device status.
>> The v3 just worked that way. Jonathan pointed a race that NFE triggered
>after
>> the check will be treated as ANFE and cleared. Check it after reading
>UNCOR_STATUS
>> can avoid the race.
>>
>> See https://lkml.org/lkml/2024/4/22/1011 for discussion details.
>
>Got it. I would recommend adding a comment about it in handler. May be
>some thing like,
>
>/*
> * To avoid race between device status read and error status register read,
>cache
> * uncorrectable error status before checking for NFE in device status *
>register. */

Good suggestion, will add.

Thanks
Zhenzhong

Subject: Re: [PATCH v4 3/3] PCI/AER: Clear UNCOR_STATUS bits that might be ANFE


On 6/13/24 7:40 PM, Duan, Zhenzhong wrote:
>
>> -----Original Message-----
>> From: Kuppuswamy, Sathyanarayanan
>> Subject: Re: [PATCH v4 3/3] PCI/AER: Clear UNCOR_STATUS bits that might
>> be ANFE
>>
>>
>> On 5/9/24 1:48 AM, Zhenzhong Duan wrote:
>>> When processing an ANFE, ideally both correctable error(CE) status and
>>> uncorrectable error(UE) status should be cleared. However, there is no
>>> way to fully identify the UE associated with ANFE. Even worse, Non-Fatal
>>> Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as
>>> NFE will bring some issues, i.e., breaking softwore probing; treating
>> /s/softwore/software
> Good catch, will fix. It's strange 'checkpatch --codespell' doesn't catch this.
>
>> May be this is already discussed. But can you explain why treating
>> AFNE as non-fatal error will bring probing issues?
> Copied below from spec 6.1, 6.2.3.2.4, says it can results in a System Error.
>
> In some cases the detector of a non-fatal error is not the most appropriate agent to determine whether the error is
> recoverable or not, or if it even needs any recovery action at all. For example, if software attempts to perform a
> configuration read from a non-existent device or Function, the resulting UR Status in the Completion will signal the error
> to software, and software does not need for the Completer in addition to signal the error by sending an ERR_NONFATAL
> Message. In fact, on some platforms, signaling the error with ERR_NONFATAL results in a System Error, which breaks
> normal software probing.
>
>>> NFE as ANFE will make us ignoring some UEs which need active recover
>> /s/ignoring/ignore
> Will fix.
>
>>> operation. To avoid clearing UEs that are not ANFE by accident, the
>>> most conservative route is taken here: If any of the NFE Detected bits
>>> is set in Device Status, do not touch UE status, they should be cleared
>>> later by the UE handler. Otherwise, a specific set of UEs that may be
>>> raised as ANFE according to the PCIe specification will be cleared if
>>> their corresponding severity is Non-Fatal.
>>>
>>> For instance, previously when kernel receives an ANFE with Poisoned TLP
>>> in OS native AER mode, only status of CE will be reported and cleared:
>>>
>>> AER: Correctable error message received from 0000:b7:02.0
>>> PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
>>> device [8086:0db0] error status/mask=00002000/00000000
>>> [13] NonFatalErr
>>>
>>> If the kernel receives a Malformed TLP after that, two UEs will be
>>> reported, which is unexpected. Malformed TLP Header is lost since
>>> the previous ANFE gated the TLP header logs:
>>>
>>> PCIe Bus Error: severity="Uncorrectable (Fatal), type=Transaction Layer,
>> (Receiver ID)
>>> device [8086:0db0] error status/mask=00041000/00180020
>>> [12] TLP (First)
>>> [18] MalfTLP
>>>
>>> Now, for the same scenario, both CE status and related UE status will be
>>> reported and cleared after ANFE:
>>>
>>> AER: Correctable error message received from 0000:b7:02.0
>>> PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
>>> device [8086:0db0] error status/mask=00002000/00000000
>>> [13] NonFatalErr
>>> Uncorrectable errors that may cause Advisory Non-Fatal:
>>> [18] TLP
>>>
>>> Tested-by: Yudong Wang <[email protected]>
>>> Co-developed-by: "Wang, Qingshun" <[email protected]>
>>> Signed-off-by: "Wang, Qingshun" <[email protected]>
>>> Signed-off-by: Zhenzhong Duan <[email protected]>
>>> ---
>>> drivers/pci/pcie/aer.c | 7 ++++++-
>>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>> index ed435f09ac27..6a6a3a40569a 100644
>>> --- a/drivers/pci/pcie/aer.c
>>> +++ b/drivers/pci/pcie/aer.c
>>> @@ -1115,9 +1115,14 @@ static void pci_aer_handle_error(struct
>> pci_dev *dev, struct aer_err_info *info)
>>> * Correctable error does not need software intervention.
>>> * No need to go through error recovery process.
>>> */
>>> - if (aer)
>>> + if (aer) {
>>> pci_write_config_dword(dev, aer +
>> PCI_ERR_COR_STATUS,
>>> info->status);
>>> + if (info->anfe_status)
>>> + pci_write_config_dword(dev,
>>> + aer +
>> PCI_ERR_UNCOR_STATUS,
>>> + info->anfe_status);
>>> + }
>> Why split the handling part and storing part into two patches? Why not
>> merge
>> this part of patch 1/3.
> This is based on Bjorn's suggestion at https://www.spinics.net/lists/linux-pci/msg149012.html,
> clearing UNCOR_STATUS might be more important, deserve to raise out.

I think Bjorn's suggestion is to divide it into two logical patches.
One for printing the error and another to clear the UNCOR_STATUS
properly. But currently you have split the UNCOR_STATUS status caching and
clearing process into two patches. IMO, your first patch can store ANFE
status and clear it. You can add print support in the second patch.


Code wise it looks fine to me. You can add my Reviewed-by after fixing
the typos

Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>

>
> Thanks
> Zhenzhong

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


Subject: Re: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that might be ANFE in aer_err_info

Hi,

On 5/9/24 1:48 AM, Zhenzhong Duan wrote:
> In some cases the detector of a Non-Fatal Error(NFE) is not the most
> appropriate agent to determine the type of the error. For example,
> when software performs a configuration read from a non-existent
> device or Function, completer will send an ERR_NONFATAL Message.
> On some platforms, ERR_NONFATAL results in a System Error, which
> breaks normal software probing.
>
> Advisory Non-Fatal Error(ANFE) is a special case that can be used
> in above scenario. It is predominantly determined by the role of the
> detecting agent (Requester, Completer, or Receiver) and the specific
> error. In such cases, an agent with AER signals the NFE (if enabled)
> by sending an ERR_COR Message as an advisory to software, instead of
> sending ERR_NONFATAL.
>
> When processing an ANFE, ideally both correctable error(CE) status and
> uncorrectable error(UE) status should be cleared. However, there is no
> way to fully identify the UE associated with ANFE. Even worse, Non-Fatal
> Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as
> NFE will reproduce above mentioned issue, i.e., breaking softwore probing;
> treating NFE as ANFE will make us ignoring some UEs which need active
> recover operation. To avoid clearing UEs that are not ANFE by accident,
> the most conservative route is taken here: If any of the NFE Detected
> bits is set in Device Status, do not touch UE status, they should be
> cleared later by the UE handler. Otherwise, a specific set of UEs that
> may be raised as ANFE according to the PCIe specification will be cleared
> if their corresponding severity is Non-Fatal.
>
> To achieve above purpose, store UNCOR_STATUS bits that might be ANFE
> in aer_err_info.anfe_status. So that those bits could be printed and
> processed later.
>
> Tested-by: Yudong Wang <[email protected]>
> Co-developed-by: "Wang, Qingshun" <[email protected]>
> Signed-off-by: "Wang, Qingshun" <[email protected]>
> Signed-off-by: Zhenzhong Duan <[email protected]>
> ---
> drivers/pci/pci.h | 1 +
> drivers/pci/pcie/aer.c | 53 ++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 54 insertions(+)
>
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 17fed1846847..3f9eb807f9fd 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -412,6 +412,7 @@ struct aer_err_info {
>
> unsigned int status; /* COR/UNCOR Error Status */
> unsigned int mask; /* COR/UNCOR Error Mask */
> + unsigned int anfe_status; /* UNCOR Error Status for ANFE */
> struct pcie_tlp_log tlp; /* TLP Header */
> };
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index ac6293c24976..f2839b51321a 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -107,6 +107,12 @@ struct aer_stats {
> PCI_ERR_ROOT_MULTI_COR_RCV | \
> PCI_ERR_ROOT_MULTI_UNCOR_RCV)
>
> +#define AER_ERR_ANFE_UNC_MASK (PCI_ERR_UNC_POISON_TLP | \
> + PCI_ERR_UNC_COMP_TIME | \
> + PCI_ERR_UNC_COMP_ABORT | \
> + PCI_ERR_UNC_UNX_COMP | \
> + PCI_ERR_UNC_UNSUP)
> +
> static int pcie_aer_disable;
> static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
>
> @@ -1196,6 +1202,49 @@ void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
> EXPORT_SYMBOL_GPL(aer_recover_queue);
> #endif
>
> +static void anfe_get_uc_status(struct pci_dev *dev, struct aer_err_info *info)
> +{
> + u32 uncor_mask, uncor_status, anfe_status;
> + u16 device_status;
> + int aer = dev->aer_cap;
> +
> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, &uncor_status);
> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK, &uncor_mask);
> + /*
> + * According to PCIe Base Specification Revision 6.1,
> + * Section 6.2.3.2.4, if an UNCOR error is raised as
> + * Advisory Non-Fatal error, it will match the following
> + * conditions:
> + * a. The severity of the error is Non-Fatal.
> + * b. The error is one of the following:
> + * 1. Poisoned TLP (Section 6.2.3.2.4.3)
> + * 2. Completion Timeout (Section 6.2.3.2.4.4)
> + * 3. Completer Abort (Section 6.2.3.2.4.1)
> + * 4. Unexpected Completion (Section 6.2.3.2.4.5)
> + * 5. Unsupported Request (Section 6.2.3.2.4.1)
> + */
> + anfe_status = uncor_status & ~uncor_mask & ~info->severity &
> + AER_ERR_ANFE_UNC_MASK;
> +
> + if (pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &device_status))
> + return;
> + /*
> + * Take the most conservative route here. If there are Non-Fatal errors
> + * detected, do not assume any bit in uncor_status is set by ANFE.
> + */
> + if (device_status & PCI_EXP_DEVSTA_NFED)
> + return;

You can move this check to the top of the function. You don't need to check
the rest if NFE error is detected in device status.

> +
> + /*
> + * If there is another ANFE between reading uncor_status and clearing
> + * PCI_ERR_COR_ADV_NFAT bit in cor_status register, that ANFE isn't
> + * recorded in info->anfe_status. It will be read out as NFE in
> + * following uncor_status register reading and processed by NFE
> + * handler.
> + */
> + info->anfe_status = anfe_status;
> +}
> +
> /**
> * aer_get_device_error_info - read error status from dev and store it to info
> * @dev: pointer to the device expected to have a error record
> @@ -1213,6 +1262,7 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
>
> /* Must reset in this function */
> info->status = 0;
> + info->anfe_status = 0;
> info->tlp_header_valid = 0;
>
> /* The device might not support AER */
> @@ -1226,6 +1276,9 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
> &info->mask);
> if (!(info->status & ~info->mask))
> return 0;
> +
> + if (info->status & PCI_ERR_COR_ADV_NFAT)
> + anfe_get_uc_status(dev, info);
> } else if (type == PCI_EXP_TYPE_ROOT_PORT ||
> type == PCI_EXP_TYPE_RC_EC ||
> type == PCI_EXP_TYPE_DOWNSTREAM ||

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


2024-06-14 03:32:28

by Duan, Zhenzhong

[permalink] [raw]
Subject: RE: [PATCH v4 3/3] PCI/AER: Clear UNCOR_STATUS bits that might be ANFE



>-----Original Message-----
>From: Kuppuswamy Sathyanarayanan
><[email protected]>
>Subject: Re: [PATCH v4 3/3] PCI/AER: Clear UNCOR_STATUS bits that might
>be ANFE
>
>
>On 6/13/24 7:40 PM, Duan, Zhenzhong wrote:
>>
>>> -----Original Message-----
>>> From: Kuppuswamy, Sathyanarayanan
>>> Subject: Re: [PATCH v4 3/3] PCI/AER: Clear UNCOR_STATUS bits that
>might
>>> be ANFE
>>>
>>>
>>> On 5/9/24 1:48 AM, Zhenzhong Duan wrote:
>>>> When processing an ANFE, ideally both correctable error(CE) status and
>>>> uncorrectable error(UE) status should be cleared. However, there is no
>>>> way to fully identify the UE associated with ANFE. Even worse, Non-Fatal
>>>> Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as
>>>> NFE will bring some issues, i.e., breaking softwore probing; treating
>>> /s/softwore/software
>> Good catch, will fix. It's strange 'checkpatch --codespell' doesn't catch this.
>>
>>> May be this is already discussed. But can you explain why treating
>>> AFNE as non-fatal error will bring probing issues?
>> Copied below from spec 6.1, 6.2.3.2.4, says it can results in a System Error.
>>
>> In some cases the detector of a non-fatal error is not the most appropriate
>agent to determine whether the error is
>> recoverable or not, or if it even needs any recovery action at all. For
>example, if software attempts to perform a
>> configuration read from a non-existent device or Function, the resulting UR
>Status in the Completion will signal the error
>> to software, and software does not need for the Completer in addition to
>signal the error by sending an ERR_NONFATAL
>> Message. In fact, on some platforms, signaling the error with
>ERR_NONFATAL results in a System Error, which breaks
>> normal software probing.
>>
>>>> NFE as ANFE will make us ignoring some UEs which need active recover
>>> /s/ignoring/ignore
>> Will fix.
>>
>>>> operation. To avoid clearing UEs that are not ANFE by accident, the
>>>> most conservative route is taken here: If any of the NFE Detected bits
>>>> is set in Device Status, do not touch UE status, they should be cleared
>>>> later by the UE handler. Otherwise, a specific set of UEs that may be
>>>> raised as ANFE according to the PCIe specification will be cleared if
>>>> their corresponding severity is Non-Fatal.
>>>>
>>>> For instance, previously when kernel receives an ANFE with Poisoned
>TLP
>>>> in OS native AER mode, only status of CE will be reported and cleared:
>>>>
>>>> AER: Correctable error message received from 0000:b7:02.0
>>>> PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver
>ID)
>>>> device [8086:0db0] error status/mask=00002000/00000000
>>>> [13] NonFatalErr
>>>>
>>>> If the kernel receives a Malformed TLP after that, two UEs will be
>>>> reported, which is unexpected. Malformed TLP Header is lost since
>>>> the previous ANFE gated the TLP header logs:
>>>>
>>>> PCIe Bus Error: severity="Uncorrectable (Fatal), type=Transaction Layer,
>>> (Receiver ID)
>>>> device [8086:0db0] error status/mask=00041000/00180020
>>>> [12] TLP (First)
>>>> [18] MalfTLP
>>>>
>>>> Now, for the same scenario, both CE status and related UE status will be
>>>> reported and cleared after ANFE:
>>>>
>>>> AER: Correctable error message received from 0000:b7:02.0
>>>> PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver
>ID)
>>>> device [8086:0db0] error status/mask=00002000/00000000
>>>> [13] NonFatalErr
>>>> Uncorrectable errors that may cause Advisory Non-Fatal:
>>>> [18] TLP
>>>>
>>>> Tested-by: Yudong Wang <[email protected]>
>>>> Co-developed-by: "Wang, Qingshun" <[email protected]>
>>>> Signed-off-by: "Wang, Qingshun" <[email protected]>
>>>> Signed-off-by: Zhenzhong Duan <[email protected]>
>>>> ---
>>>> drivers/pci/pcie/aer.c | 7 ++++++-
>>>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>>> index ed435f09ac27..6a6a3a40569a 100644
>>>> --- a/drivers/pci/pcie/aer.c
>>>> +++ b/drivers/pci/pcie/aer.c
>>>> @@ -1115,9 +1115,14 @@ static void pci_aer_handle_error(struct
>>> pci_dev *dev, struct aer_err_info *info)
>>>> * Correctable error does not need software intervention.
>>>> * No need to go through error recovery process.
>>>> */
>>>> - if (aer)
>>>> + if (aer) {
>>>> pci_write_config_dword(dev, aer +
>>> PCI_ERR_COR_STATUS,
>>>> info->status);
>>>> + if (info->anfe_status)
>>>> + pci_write_config_dword(dev,
>>>> + aer +
>>> PCI_ERR_UNCOR_STATUS,
>>>> + info->anfe_status);
>>>> + }
>>> Why split the handling part and storing part into two patches? Why not
>>> merge
>>> this part of patch 1/3.
>> This is based on Bjorn's suggestion at https://www.spinics.net/lists/linux-
>pci/msg149012.html,
>> clearing UNCOR_STATUS might be more important, deserve to raise out.
>
>I think Bjorn's suggestion is to divide it into two logical patches.
>One for printing the error and another to clear the UNCOR_STATUS
>properly. But currently you have split the UNCOR_STATUS status caching and
>clearing process into two patches. IMO, your first patch can store ANFE
>status and clear it. You can add print support in the second patch.

OK, I'll merge patch1 with patch3 in next version.

>
>
>Code wise it looks fine to me. You can add my Reviewed-by after fixing
>the typos
>
>Reviewed-by: Kuppuswamy Sathyanarayanan
><[email protected]>

Thanks
zhenzhong

>
>>
>> Thanks
>> Zhenzhong
>
>--
>Sathyanarayanan Kuppuswamy
>Linux Kernel Developer