2023-06-21 19:14:41

by Smita Koralahalli

Subject: [PATCH v3 0/2] PCI: pciehp: Add support for native AER and DPC handling on async remove

This series of patches adds native support to handle AER and DPC events
that occur as a side-effect of an async remove.

Link to v2:
https://lore.kernel.org/all/[email protected]/

Smita Koralahalli (2):
PCI: pciehp: Add support for async hotplug with native AER and DPC/EDR
PCI: pciehp: Clear the optional capabilities in DEVCTL2 on a hot-plug

drivers/pci/hotplug/pciehp_pci.c | 4 +++
drivers/pci/pcie/dpc.c | 58 ++++++++++++++++++++++++++++++++
include/uapi/linux/pci_regs.h | 1 +
3 files changed, 63 insertions(+)

--
2.17.1



2023-06-21 19:17:29

by Smita Koralahalli

Subject: [PATCH v3 1/2] PCI: pciehp: Add support for async hotplug with native AER and DPC/EDR

According to Section 6.7.6 of PCIe Base Specification [1], async removal
with DPC may result in a Surprise Down error. This error is expected and
is just a side-effect of async remove.

Add support to handle the Surprise Down error generated as a side-effect
of async remove. Typically, this error is benign because the pciehp
handler, invoked by PDC and/or DLLSC alongside DPC, de-enumerates and
brings down the device appropriately. But the error messages might confuse
users. Get rid of these irritating log messages, along with the 1s delay
incurred while pciehp waits for DPC recovery.

The implementation is as follows: On an async remove, a DPC is triggered
along with a Presence Detect State change and/or DLL State Change.
Determine whether it is an async remove by checking that both DPC Trigger
Status in the DPC Status Register and Surprise Down Error Status in the
AER Uncorrectable Error Status Register are non-zero. If so, treat the DPC
event as a side-effect of async remove, clear the error status registers
and continue with the hot-plug tear down routines. If not, follow the
existing routine to handle AER and DPC errors.
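
As a rough user-space illustration of the check described above (not the
kernel code itself), the two status values can be tested against their bit
masks. The mask values are taken from include/uapi/linux/pci_regs.h; the
helper name looks_like_async_remove() is hypothetical and exists only for
this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit definitions as found in include/uapi/linux/pci_regs.h */
#define PCI_EXP_DPC_STATUS_TRIGGER 0x0001     /* DPC Trigger Status */
#define PCI_ERR_UNC_SURPDN         0x00000020 /* Surprise Down Error Status */

/*
 * Hypothetical helper: returns true when both the DPC Trigger Status bit
 * and the AER Surprise Down Error Status bit are set, which the commit
 * message treats as the signature of an async remove.
 */
static bool looks_like_async_remove(uint16_t dpc_status,
				    uint32_t aer_uncor_status)
{
	if (!(dpc_status & PCI_EXP_DPC_STATUS_TRIGGER))
		return false;

	return (aer_uncor_status & PCI_ERR_UNC_SURPDN) != 0;
}
```

With the values from the "Dmesg before" log of this patch (DPC status
0x1f01, AER error status 0x00000020), the helper returns true.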

Note that masking Surprise Down Errors was explored as an alternative
approach, but was rejected because masking only avoids the interrupt
while the error is still recorded, per PCIe r6.0.1 Section 6.2.3.2.2.
That stale error would then be reported the next time some error other
than Surprise Down is handled.
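
The masking behavior can be pictured with a toy model. This is a
simplified sketch, assuming only a status/mask register pair and ignoring
everything else AER does (First Error Pointer, header log, severity); the
struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_ERR_UNC_SURPDN 0x00000020 /* Surprise Down, from pci_regs.h */

/* Toy model of the AER Uncorrectable Error Status/Mask register pair */
struct aer_model {
	uint32_t uncor_status;
	uint32_t uncor_mask;
};

/*
 * Record an error the way the spec section cited above describes it:
 * the status bit is set regardless of the mask, and the mask only
 * decides whether the error is signaled. Returns true if signaled.
 */
static bool aer_model_detect(struct aer_model *aer, uint32_t err)
{
	aer->uncor_status |= err;

	return !(aer->uncor_mask & err);
}
```

In this model, a masked Surprise Down is not signaled, yet its status bit
stays set until software clears it, which is how a stale error could show
up later alongside an unrelated one.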

Dmesg before:

pcieport 0000:00:01.4: DPC: containment event, status:0x1f01 source:0x0000
pcieport 0000:00:01.4: DPC: unmasked uncorrectable error detected
pcieport 0000:00:01.4: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, (Receiver ID)
pcieport 0000:00:01.4: device [1022:14ab] error status/mask=00000020/04004000
pcieport 0000:00:01.4: [ 5] SDES (First)
nvme nvme2: frozen state error detected, reset controller
pcieport 0000:00:01.4: DPC: Data Link Layer Link Active not set in 1000 msec
pcieport 0000:00:01.4: AER: subordinate device reset failed
pcieport 0000:00:01.4: AER: device recovery failed
pcieport 0000:00:01.4: pciehp: Slot(16): Link Down
nvme2n1: detected capacity change from 1953525168 to 0
pci 0000:04:00.0: Removing from iommu group 49

Dmesg after:

pcieport 0000:00:01.4: pciehp: Slot(16): Link Down
nvme1n1: detected capacity change from 1953525168 to 0
pci 0000:04:00.0: Removing from iommu group 37

[1] PCI Express Base Specification Revision 6.0, Dec 16 2021.
https://members.pcisig.com/wg/PCI-SIG/document/16609

Signed-off-by: Smita Koralahalli <[email protected]>
---
v2:
Indentation is taken care of. (Bjorn)
Irrelevant dmesg logs are removed. (Bjorn)
Rephrased commit message, to be clear on native vs FW-First
handling. (Bjorn and Sathyanarayanan)
Prefix changed from pciehp_ to dpc_. (Lukas)
Clearing ARI and AtomicOp Requester are performed as a part of
(de-)enumeration in pciehp_unconfigure_device(). (Lukas)
Changed to clearing all optional capabilities in DEVCTL2.
OS-First -> native. (Sathyanarayanan)

v3:
Added error message when root port fails to become inactive.
Modified commit description to add more details.
Rearranged code comments and function calls with no functional
change.
Additional check for is_hotplug_bridge.
dpc_completed_waitqueue to wakeup pciehp handler.
Cleared only Fatal error detected in DEVSTA.
---
drivers/pci/pcie/dpc.c | 58 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 58 insertions(+)

diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
index 3ceed8e3de41..5153ac8ea91c 100644
--- a/drivers/pci/pcie/dpc.c
+++ b/drivers/pci/pcie/dpc.c
@@ -292,10 +292,68 @@ void dpc_process_error(struct pci_dev *pdev)
 	}
 }

+static void pci_clear_surpdn_errors(struct pci_dev *pdev)
+{
+	u16 reg16;
+	u32 reg32;
+
+	pci_read_config_dword(pdev, pdev->dpc_cap + PCI_EXP_DPC_RP_PIO_STATUS, &reg32);
+	pci_write_config_dword(pdev, pdev->dpc_cap + PCI_EXP_DPC_RP_PIO_STATUS, reg32);
+
+	pci_read_config_word(pdev, PCI_STATUS, &reg16);
+	pci_write_config_word(pdev, PCI_STATUS, reg16);
+
+	pcie_capability_write_word(pdev, PCI_EXP_DEVSTA, PCI_EXP_DEVSTA_FED);
+}
+
+static void dpc_handle_surprise_removal(struct pci_dev *pdev)
+{
+	if (pdev->dpc_rp_extensions && dpc_wait_rp_inactive(pdev)) {
+		pci_err(pdev, "failed to retrieve DPC root port on async remove\n");
+		goto out;
+	}
+
+	pci_aer_raw_clear_status(pdev);
+	pci_clear_surpdn_errors(pdev);
+
+	pci_write_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_STATUS,
+			      PCI_EXP_DPC_STATUS_TRIGGER);
+
+out:
+	clear_bit(PCI_DPC_RECOVERED, &pdev->priv_flags);
+	wake_up_all(&dpc_completed_waitqueue);
+}
+
+static bool dpc_is_surprise_removal(struct pci_dev *pdev)
+{
+	u16 status;
+
+	pci_read_config_word(pdev, pdev->aer_cap + PCI_ERR_UNCOR_STATUS, &status);
+
+	if (!pdev->is_hotplug_bridge)
+		return false;
+
+	if (!(status & PCI_ERR_UNC_SURPDN))
+		return false;
+
+	return true;
+}
+
 static irqreturn_t dpc_handler(int irq, void *context)
 {
 	struct pci_dev *pdev = context;

+	/*
+	 * According to Section 6.7.6 of the PCIe Base Spec 6.0, since async
+	 * removal might be unexpected, errors might be reported as a side
+	 * effect of the event and software should handle them as an expected
+	 * part of this event.
+	 */
+	if (dpc_is_surprise_removal(pdev)) {
+		dpc_handle_surprise_removal(pdev);
+		return IRQ_HANDLED;
+	}
+
 	dpc_process_error(pdev);

 	/* We configure DPC so it only triggers on ERR_FATAL */
--
2.17.1


2023-06-22 09:31:22

by Lukas Wunner

Subject: Re: [PATCH v3 1/2] PCI: pciehp: Add support for async hotplug with native AER and DPC/EDR

On Wed, Jun 21, 2023 at 06:51:51PM +0000, Smita Koralahalli wrote:
> --- a/drivers/pci/pcie/dpc.c
> +++ b/drivers/pci/pcie/dpc.c
> @@ -292,10 +292,68 @@ void dpc_process_error(struct pci_dev *pdev)
> }
> }
>
> +static void pci_clear_surpdn_errors(struct pci_dev *pdev)
> +{
> + u16 reg16;
> + u32 reg32;
> +
> + pci_read_config_dword(pdev, pdev->dpc_cap + PCI_EXP_DPC_RP_PIO_STATUS, &reg32);
> + pci_write_config_dword(pdev, pdev->dpc_cap + PCI_EXP_DPC_RP_PIO_STATUS, reg32);

Make this read+write conditional on "if (pdev->dpc_rp_extensions)"
as the register otherwise doesn't exist.

Wrap to 80 chars per line.


> + pci_read_config_word(pdev, PCI_STATUS, &reg16);
> + pci_write_config_word(pdev, PCI_STATUS, reg16);
> +
> + pcie_capability_write_word(pdev, PCI_EXP_DEVSTA, PCI_EXP_DEVSTA_FED);
> +}

A code comment might be useful here saying that in practice,
Surprise Down errors have been observed to also set error bits
in the Status Register as well as the Fatal Error Detected bit
in the Device Status Register.


> +static void dpc_handle_surprise_removal(struct pci_dev *pdev)
> +{

I'm wondering if we also need

	if (!pcie_wait_for_link(pdev, false)) {
		pci_info(pdev, "Data Link Layer Link Active not cleared in 1000 msec\n");
		goto out;
	}

here, similar to dpc_reset_link() and in accordance with PCIe r6.1
sec 6.2.11:

"To ensure that the LTSSM has time to reach the Disabled state
or at least to bring the Link down under a variety of error
conditions, software must leave the Downstream Port in DPC
until the Data Link Layer Link Active bit in the Link Status
Register reads 0b; otherwise, the result is undefined."


> + if (pdev->dpc_rp_extensions && dpc_wait_rp_inactive(pdev)) {
> + pci_err(pdev, "failed to retrieve DPC root port on async remove\n");
> + goto out;
> + }

I don't think pci_err() is needed here as dpc_wait_rp_inactive()
already emits a message. (I think I mistakenly gave the advice
to emit an error here in an earlier review comment -- sorry!)


> +
> + pci_aer_raw_clear_status(pdev);
> + pci_clear_surpdn_errors(pdev);
> +
> + pci_write_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_STATUS,
> + PCI_EXP_DPC_STATUS_TRIGGER);
> +
> +out:
> + clear_bit(PCI_DPC_RECOVERED, &pdev->priv_flags);
> + wake_up_all(&dpc_completed_waitqueue);
> +}
> +
> +static bool dpc_is_surprise_removal(struct pci_dev *pdev)
> +{
> + u16 status;
> +
> + pci_read_config_word(pdev, pdev->aer_cap + PCI_ERR_UNCOR_STATUS, &status);

Wrap to 80 chars.


> +
> + if (!pdev->is_hotplug_bridge)
> + return false;

Move this if-clause to the beginning of the function so that
you omit the unnecessary register read if it's not a hotplug
bridge.
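
A user-space sketch of that reordering, with a mocked device so the number
of register accesses can be observed. struct mock_dev and the mock_*
functions are hypothetical stand-ins, not kernel APIs; only the
PCI_ERR_UNC_SURPDN value comes from pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_ERR_UNC_SURPDN 0x00000020 /* Surprise Down, from pci_regs.h */

/* Hypothetical stand-in for the fields of struct pci_dev used here */
struct mock_dev {
	bool is_hotplug_bridge;
	uint32_t uncor_status;  /* pretend AER Uncorrectable Error Status */
	unsigned int cfg_reads; /* counts simulated config space reads */
};

/* Simulated config read, so the number of accesses can be observed */
static uint32_t mock_read_uncor_status(struct mock_dev *dev)
{
	dev->cfg_reads++;
	return dev->uncor_status;
}

static bool mock_is_surprise_removal(struct mock_dev *dev)
{
	/* Cheap flag test first: no register access for non-hotplug ports */
	if (!dev->is_hotplug_bridge)
		return false;

	return (mock_read_uncor_status(dev) & PCI_ERR_UNC_SURPDN) != 0;
}
```

For a device that is not a hotplug bridge, cfg_reads stays at 0 after the
call; for a hotplug bridge it is incremented exactly once.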


> +
> + if (!(status & PCI_ERR_UNC_SURPDN))
> + return false;
> +
> + return true;
> +}
> +
> static irqreturn_t dpc_handler(int irq, void *context)
> {
> struct pci_dev *pdev = context;
>
> + /*
> + * According to Section 6.7.6 of the PCIe Base Spec 6.0, since async
> + * removal might be unexpected, errors might be reported as a side
> + * effect of the event and software should handle them as an expected
> + * part of this event.
> + */

I think the usual way to reference the spec is "PCIe r6.0 sec 6.7.6".

Maybe that's just me but I find the code comment a little difficult
to parse. Maybe something like the following?

	/*
	 * According to PCIe r6.0 sec 6.7.6, errors are an expected side effect
	 * of async removal and should be ignored by software.
	 */

Thanks,

Lukas

> + if (dpc_is_surprise_removal(pdev)) {
> + dpc_handle_surprise_removal(pdev);
> + return IRQ_HANDLED;
> + }
> +
> dpc_process_error(pdev);
>
> /* We configure DPC so it only triggers on ERR_FATAL */
> --
> 2.17.1
>

2023-06-22 21:53:28

by Lukas Wunner

Subject: Re: [PATCH v3 1/2] PCI: pciehp: Add support for async hotplug with native AER and DPC/EDR

On Thu, Jun 22, 2023 at 02:02:03PM -0700, Smita Koralahalli wrote:
> On 6/22/2023 2:04 AM, Lukas Wunner wrote:
> > On Wed, Jun 21, 2023 at 06:51:51PM +0000, Smita Koralahalli wrote:
> > > --- a/drivers/pci/pcie/dpc.c
> > > +++ b/drivers/pci/pcie/dpc.c
> > > @@ -292,10 +292,68 @@ void dpc_process_error(struct pci_dev *pdev)
> > > }
> > > }
> > > +static void pci_clear_surpdn_errors(struct pci_dev *pdev)
> > > +{
> > > + u16 reg16;
> > > + u32 reg32;
> > > +
> > > + pci_read_config_dword(pdev, pdev->dpc_cap + PCI_EXP_DPC_RP_PIO_STATUS, &reg32);
> > > + pci_write_config_dword(pdev, pdev->dpc_cap + PCI_EXP_DPC_RP_PIO_STATUS, reg32);
> >
> > Make this read+write conditional on "if (pdev->dpc_rp_extensions)"
> > as the register otherwise doesn't exist.
>
> I'm checking for pdev->dpc_rp_extensions inside
> dpc_handle_surprise_removal() before calling pci_clear_surpdn_errors().
> Should I recheck it once again here?

Yes.


> > > + pci_read_config_word(pdev, PCI_STATUS, &reg16);
> > > + pci_write_config_word(pdev, PCI_STATUS, reg16);
> > > +
> > > + pcie_capability_write_word(pdev, PCI_EXP_DEVSTA, PCI_EXP_DEVSTA_FED);
> > > +}
> >
> > A code comment might be useful here saying that in practice,
> > Surprise Down errors have been observed to also set error bits
> > in the Status Register as well as the Fatal Error Detected bit
> > in the Device Status Register.
>
> And probably move this code comment below to where this function is called
> inside dpc_handle_surprise_removal()..?

No, right here would be good because that's the piece of code to which
the code comment would pertain.


> > if (!pcie_wait_for_link(pdev, false)) {
> > pci_info(pdev, "Data Link Layer Link Active not cleared in 1000 msec\n");
> > goto out;
> > }
> >
> > here, similar to dpc_reset_link() and in accordance with PCIe r6.1
> > sec 6.2.11:
> >
> > "To ensure that the LTSSM has time to reach the Disabled state
> > or at least to bring the Link down under a variety of error
> > conditions, software must leave the Downstream Port in DPC
> > until the Data Link Layer Link Active bit in the Link Status
> > Register reads 0b; otherwise, the result is undefined."
>
> And include the above comment in code..

I'd say that's optional. dpc_reset_link() doesn't have a code comment
for that either, so...

Thanks,

Lukas

2023-06-22 21:54:45

by Smita Koralahalli

Subject: Re: [PATCH v3 1/2] PCI: pciehp: Add support for async hotplug with native AER and DPC/EDR

On 6/22/2023 2:04 AM, Lukas Wunner wrote:
> On Wed, Jun 21, 2023 at 06:51:51PM +0000, Smita Koralahalli wrote:
>> --- a/drivers/pci/pcie/dpc.c
>> +++ b/drivers/pci/pcie/dpc.c
>> @@ -292,10 +292,68 @@ void dpc_process_error(struct pci_dev *pdev)
>> }
>> }
>>
>> +static void pci_clear_surpdn_errors(struct pci_dev *pdev)
>> +{
>> + u16 reg16;
>> + u32 reg32;
>> +
>> + pci_read_config_dword(pdev, pdev->dpc_cap + PCI_EXP_DPC_RP_PIO_STATUS, &reg32);
>> + pci_write_config_dword(pdev, pdev->dpc_cap + PCI_EXP_DPC_RP_PIO_STATUS, reg32);
>
> Make this read+write conditional on "if (pdev->dpc_rp_extensions)"
> as the register otherwise doesn't exist.

I'm checking for pdev->dpc_rp_extensions inside
dpc_handle_surprise_removal() before calling pci_clear_surpdn_errors().
Should I recheck it once again here?

>
> Wrap to 80 chars per line.

Okay.

>
>
>> + pci_read_config_word(pdev, PCI_STATUS, &reg16);
>> + pci_write_config_word(pdev, PCI_STATUS, reg16);
>> +
>> + pcie_capability_write_word(pdev, PCI_EXP_DEVSTA, PCI_EXP_DEVSTA_FED);
>> +}
>
> A code comment might be useful here saying that in practice,
> Surprise Down errors have been observed to also set error bits
> in the Status Register as well as the Fatal Error Detected bit
> in the Device Status Register.

And probably move this code comment below to where this function is
called inside dpc_handle_surprise_removal()..?

>
>
>> +static void dpc_handle_surprise_removal(struct pci_dev *pdev)
>> +{
>
> I'm wondering if we also need
>
> if (!pcie_wait_for_link(pdev, false)) {
> pci_info(pdev, "Data Link Layer Link Active not cleared in 1000 msec\n");
> goto out;
> }
>
> here, similar to dpc_reset_link() and in accordance with PCIe r6.1
> sec 6.2.11:
>
> "To ensure that the LTSSM has time to reach the Disabled state
> or at least to bring the Link down under a variety of error
> conditions, software must leave the Downstream Port in DPC
> until the Data Link Layer Link Active bit in the Link Status
> Register reads 0b; otherwise, the result is undefined."

And include the above comment in the code?
>
>
>> + if (pdev->dpc_rp_extensions && dpc_wait_rp_inactive(pdev)) {
>> + pci_err(pdev, "failed to retrieve DPC root port on async remove\n");
>> + goto out;
>> + }
>
> I don't think pci_err() is needed here as dpc_wait_rp_inactive()
> already emits a message. (I think I mistakenly gave the advice
> to emit an error here in an earlier review comment -- sorry!)

:)

Will take care of other comments below as well.

Thanks,
Smita

>>
>> /* We configure DPC so it only triggers on ERR_FATAL */
>> --
>> 2.17.1
>>


Subject: Re: [PATCH v3 1/2] PCI: pciehp: Add support for async hotplug with native AER and DPC/EDR



On 6/21/23 11:51 AM, Smita Koralahalli wrote:
> According to Section 6.7.6 of PCIe Base Specification [1], async removal
> with DPC may result in surprise down error. This error is expected and
> is just a side-effect of async remove.
>
> Add support to handle the surprise down error generated as a side-effect
> of async remove. Typically, this error is benign as the pciehp handler
> invoked by PDC or/and DLLSC alongside DPC, de-enumerates and brings down
> the device appropriately. But the error messages might confuse users. Get
> rid of these irritating log messages with a 1s delay while pciehp waits
> for dpc recovery.
>
> The implementation is as follows: On an async remove a DPC is triggered
> along with a Presence Detect State change and/or DLL State Change.
> Determine it's an async remove by checking for DPC Trigger Status in DPC
> Status Register and Surprise Down Error Status in AER Uncorrected Error
> Status to be non-zero. If true, treat the DPC event as a side-effect of
> async remove, clear the error status registers and continue with hot-plug
> tear down routines. If not, follow the existing routine to handle AER and
> DPC errors.
>
> Please note that, masking Surprise Down Errors was explored as an
> alternative approach, but left due to the odd behavior that masking only
> avoids the interrupt, but still records an error per PCIe r6.0.1 Section
> 6.2.3.2.2. That stale error is going to be reported the next time some
> error other than Surprise Down is handled.

I think this fix is applicable to the EDR code path as well.

>
> Dmesg before:
>
> pcieport 0000:00:01.4: DPC: containment event, status:0x1f01 source:0x0000
> pcieport 0000:00:01.4: DPC: unmasked uncorrectable error detected
> pcieport 0000:00:01.4: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, (Receiver ID)
> pcieport 0000:00:01.4: device [1022:14ab] error status/mask=00000020/04004000
> pcieport 0000:00:01.4: [ 5] SDES (First)
> nvme nvme2: frozen state error detected, reset controller
> pcieport 0000:00:01.4: DPC: Data Link Layer Link Active not set in 1000 msec
> pcieport 0000:00:01.4: AER: subordinate device reset failed
> pcieport 0000:00:01.4: AER: device recovery failed
> pcieport 0000:00:01.4: pciehp: Slot(16): Link Down
> nvme2n1: detected capacity change from 1953525168 to 0
> pci 0000:04:00.0: Removing from iommu group 49
>
> Dmesg after:
>
> pcieport 0000:00:01.4: pciehp: Slot(16): Link Down
> nvme1n1: detected capacity change from 1953525168 to 0
> pci 0000:04:00.0: Removing from iommu group 37
>
> [1] PCI Express Base Specification Revision 6.0, Dec 16 2021.
> https://members.pcisig.com/wg/PCI-SIG/document/16609
>
> Signed-off-by: Smita Koralahalli <[email protected]>
> ---
> v2:
> Indentation is taken care. (Bjorn)
> Unrelevant dmesg logs are removed. (Bjorn)
> Rephrased commit message, to be clear on native vs FW-First
> handling. (Bjorn and Sathyanarayanan)
> Prefix changed from pciehp_ to dpc_. (Lukas)
> Clearing ARI and AtomicOp Requester are performed as a part of
> (de-)enumeration in pciehp_unconfigure_device(). (Lukas)
> Changed to clearing all optional capabilities in DEVCTL2.
> OS-First -> native. (Sathyanarayanan)
>
> v3:
> Added error message when root port become inactive.
> Modified commit description to add more details.
> Rearranged code comments and function calls with no functional
> change.
> Additional check for is_hotplug_bridge.
> dpc_completed_waitqueue to wakeup pciehp handler.
> Cleared only Fatal error detected in DEVSTA.
> ---
> drivers/pci/pcie/dpc.c | 58 ++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 58 insertions(+)
>
> diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
> index 3ceed8e3de41..5153ac8ea91c 100644
> --- a/drivers/pci/pcie/dpc.c
> +++ b/drivers/pci/pcie/dpc.c
> @@ -292,10 +292,68 @@ void dpc_process_error(struct pci_dev *pdev)
> }
> }
>
> +static void pci_clear_surpdn_errors(struct pci_dev *pdev)
> +{
> + u16 reg16;
> + u32 reg32;
> +
> + pci_read_config_dword(pdev, pdev->dpc_cap + PCI_EXP_DPC_RP_PIO_STATUS, &reg32);
> + pci_write_config_dword(pdev, pdev->dpc_cap + PCI_EXP_DPC_RP_PIO_STATUS, reg32);

It is not clear why you want to clear it.

> +
> + pci_read_config_word(pdev, PCI_STATUS, &reg16);
> + pci_write_config_word(pdev, PCI_STATUS, reg16);

Same as above. Can you add some comment about why you are clearing it?

> +
> + pcie_capability_write_word(pdev, PCI_EXP_DEVSTA, PCI_EXP_DEVSTA_FED);
> +}
> +
> +static void dpc_handle_surprise_removal(struct pci_dev *pdev)
> +{
> + if (pdev->dpc_rp_extensions && dpc_wait_rp_inactive(pdev)) {
> + pci_err(pdev, "failed to retrieve DPC root port on async remove\n");
> + goto out;
> + }
> +
> + pci_aer_raw_clear_status(pdev);
> + pci_clear_surpdn_errors(pdev);
> +
> + pci_write_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_STATUS,
> + PCI_EXP_DPC_STATUS_TRIGGER);

Don't you need to wait for the link to go down?

> +
> +out:
> + clear_bit(PCI_DPC_RECOVERED, &pdev->priv_flags);
> + wake_up_all(&dpc_completed_waitqueue);
> +}
> +
> +static bool dpc_is_surprise_removal(struct pci_dev *pdev)
> +{
> + u16 status;
> +
> + pci_read_config_word(pdev, pdev->aer_cap + PCI_ERR_UNCOR_STATUS, &status);
> +
> + if (!pdev->is_hotplug_bridge)
> + return false;
> +
> + if (!(status & PCI_ERR_UNC_SURPDN))
> + return false;
> +
> + return true;
> +}
> +
> static irqreturn_t dpc_handler(int irq, void *context)
> {
> struct pci_dev *pdev = context;
>
> + /*
> + * According to Section 6.7.6 of the PCIe Base Spec 6.0, since async
> + * removal might be unexpected, errors might be reported as a side
> + * effect of the event and software should handle them as an expected
> + * part of this event.
> + */
> + if (dpc_is_surprise_removal(pdev)) {
> + dpc_handle_surprise_removal(pdev);
> + return IRQ_HANDLED;
> + }
> +
> dpc_process_error(pdev);
>
> /* We configure DPC so it only triggers on ERR_FATAL */

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

2023-06-27 18:03:14

by Smita Koralahalli

Subject: Re: [PATCH v3 1/2] PCI: pciehp: Add support for async hotplug with native AER and DPC/EDR

On 6/22/2023 4:22 PM, Sathyanarayanan Kuppuswamy wrote:
>
>
> On 6/21/23 11:51 AM, Smita Koralahalli wrote:
>> According to Section 6.7.6 of PCIe Base Specification [1], async removal
>> with DPC may result in surprise down error. This error is expected and
>> is just a side-effect of async remove.
>>
>> Add support to handle the surprise down error generated as a side-effect
>> of async remove. Typically, this error is benign as the pciehp handler
>> invoked by PDC or/and DLLSC alongside DPC, de-enumerates and brings down
>> the device appropriately. But the error messages might confuse users. Get
>> rid of these irritating log messages with a 1s delay while pciehp waits
>> for dpc recovery.
>>
>> The implementation is as follows: On an async remove a DPC is triggered
>> along with a Presence Detect State change and/or DLL State Change.
>> Determine it's an async remove by checking for DPC Trigger Status in DPC
>> Status Register and Surprise Down Error Status in AER Uncorrected Error
>> Status to be non-zero. If true, treat the DPC event as a side-effect of
>> async remove, clear the error status registers and continue with hot-plug
>> tear down routines. If not, follow the existing routine to handle AER and
>> DPC errors.
>>
>> Please note that, masking Surprise Down Errors was explored as an
>> alternative approach, but left due to the odd behavior that masking only
>> avoids the interrupt, but still records an error per PCIe r6.0.1 Section
>> 6.2.3.2.2. That stale error is going to be reported the next time some
>> error other than Surprise Down is handled.
>
> I think this fix is applicable to the EDR code path as well.
>
>>
>> Dmesg before:
>>
>> pcieport 0000:00:01.4: DPC: containment event, status:0x1f01 source:0x0000
>> pcieport 0000:00:01.4: DPC: unmasked uncorrectable error detected
>> pcieport 0000:00:01.4: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, (Receiver ID)
>> pcieport 0000:00:01.4: device [1022:14ab] error status/mask=00000020/04004000
>> pcieport 0000:00:01.4: [ 5] SDES (First)
>> nvme nvme2: frozen state error detected, reset controller
>> pcieport 0000:00:01.4: DPC: Data Link Layer Link Active not set in 1000 msec
>> pcieport 0000:00:01.4: AER: subordinate device reset failed
>> pcieport 0000:00:01.4: AER: device recovery failed
>> pcieport 0000:00:01.4: pciehp: Slot(16): Link Down
>> nvme2n1: detected capacity change from 1953525168 to 0
>> pci 0000:04:00.0: Removing from iommu group 49
>>
>> Dmesg after:
>>
>> pcieport 0000:00:01.4: pciehp: Slot(16): Link Down
>> nvme1n1: detected capacity change from 1953525168 to 0
>> pci 0000:04:00.0: Removing from iommu group 37
>>
>> [1] PCI Express Base Specification Revision 6.0, Dec 16 2021.
>> https://members.pcisig.com/wg/PCI-SIG/document/16609
>>
>> Signed-off-by: Smita Koralahalli <[email protected]>
>> ---
>> v2:
>> Indentation is taken care. (Bjorn)
>> Unrelevant dmesg logs are removed. (Bjorn)
>> Rephrased commit message, to be clear on native vs FW-First
>> handling. (Bjorn and Sathyanarayanan)
>> Prefix changed from pciehp_ to dpc_. (Lukas)
>> Clearing ARI and AtomicOp Requester are performed as a part of
>> (de-)enumeration in pciehp_unconfigure_device(). (Lukas)
>> Changed to clearing all optional capabilities in DEVCTL2.
>> OS-First -> native. (Sathyanarayanan)
>>
>> v3:
>> Added error message when root port become inactive.
>> Modified commit description to add more details.
>> Rearranged code comments and function calls with no functional
>> change.
>> Additional check for is_hotplug_bridge.
>> dpc_completed_waitqueue to wakeup pciehp handler.
>> Cleared only Fatal error detected in DEVSTA.
>> ---
>> drivers/pci/pcie/dpc.c | 58 ++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 58 insertions(+)
>>
>> diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
>> index 3ceed8e3de41..5153ac8ea91c 100644
>> --- a/drivers/pci/pcie/dpc.c
>> +++ b/drivers/pci/pcie/dpc.c
>> @@ -292,10 +292,68 @@ void dpc_process_error(struct pci_dev *pdev)
>> }
>> }
>>
>> +static void pci_clear_surpdn_errors(struct pci_dev *pdev)
>> +{
>> + u16 reg16;
>> + u32 reg32;
>> +
>> + pci_read_config_dword(pdev, pdev->dpc_cap + PCI_EXP_DPC_RP_PIO_STATUS, &reg32);
>> + pci_write_config_dword(pdev, pdev->dpc_cap + PCI_EXP_DPC_RP_PIO_STATUS, reg32);
>
> It is not clear why you want to clear it.

We observe that Surprise Down Errors set error bits in these status
registers and also set the Fatal Error Detected bit in DEVSTA. Hence, we
clear them so that no stale indication of an error is left behind.
Will add appropriate code comments in v4.
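
The read-then-write-back pattern relies on these error bits being
write-1-to-clear (RW1C). A toy model, assuming a register made up purely
of RW1C bits (real registers such as PCI_STATUS also contain read-only
bits, which this sketch ignores); the rw1c_* helpers are hypothetical:

```c
#include <stdint.h>

#define PCI_EXP_DEVSTA_FED 0x0004 /* Fatal Error Detected, from pci_regs.h */

/* Toy RW1C register: writing 1 to a bit clears it, writing 0 leaves it */
static void rw1c_write(uint16_t *reg, uint16_t val)
{
	*reg &= (uint16_t)~val;
}

/* Read the register and write the value back: clears every bit that was set */
static void rw1c_clear_all(uint16_t *reg)
{
	uint16_t val = *reg;

	rw1c_write(reg, val);
}
```

Writing only PCI_EXP_DEVSTA_FED to a modeled DEVSTA of 0x000f leaves
0x000b, so the other error bits survive, whereas rw1c_clear_all() leaves 0.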
>
>> +
>> + pci_read_config_word(pdev, PCI_STATUS, &reg16);
>> + pci_write_config_word(pdev, PCI_STATUS, reg16);
>
> Same as above. Can you add some comment about why you are clearing it?

Will add.
>
>> +
>> + pcie_capability_write_word(pdev, PCI_EXP_DEVSTA, PCI_EXP_DEVSTA_FED);
>> +}
>> +
>> +static void dpc_handle_surprise_removal(struct pci_dev *pdev)
>> +{
>> + if (pdev->dpc_rp_extensions && dpc_wait_rp_inactive(pdev)) {
>> + pci_err(pdev, "failed to retrieve DPC root port on async remove\n");
>> + goto out;
>> + }
>> +
>> + pci_aer_raw_clear_status(pdev);
>> + pci_clear_surpdn_errors(pdev);
>> +
>> + pci_write_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_STATUS,
>> + PCI_EXP_DPC_STATUS_TRIGGER);
>
> Don't you need to wait for the link to go down?

Yes, will include pcie_wait_for_link().

Should this check be here or at the beginning of the function before we
check pdev->dpc_rp_extensions?

Thanks,
Smita

>
>> +
>> +out:
>> + clear_bit(PCI_DPC_RECOVERED, &pdev->priv_flags);
>> + wake_up_all(&dpc_completed_waitqueue);
>> +}
>> +
>> +static bool dpc_is_surprise_removal(struct pci_dev *pdev)
>> +{
>> + u16 status;
>> +
>> + pci_read_config_word(pdev, pdev->aer_cap + PCI_ERR_UNCOR_STATUS, &status);
>> +
>> + if (!pdev->is_hotplug_bridge)
>> + return false;
>> +
>> + if (!(status & PCI_ERR_UNC_SURPDN))
>> + return false;
>> +
>> + return true;
>> +}
>> +
>> static irqreturn_t dpc_handler(int irq, void *context)
>> {
>> struct pci_dev *pdev = context;
>>
>> + /*
>> + * According to Section 6.7.6 of the PCIe Base Spec 6.0, since async
>> + * removal might be unexpected, errors might be reported as a side
>> + * effect of the event and software should handle them as an expected
>> + * part of this event.
>> + */
>> + if (dpc_is_surprise_removal(pdev)) {
>> + dpc_handle_surprise_removal(pdev);
>> + return IRQ_HANDLED;
>> + }
>> +
>> dpc_process_error(pdev);
>>
>> /* We configure DPC so it only triggers on ERR_FATAL */
>


2023-06-28 13:41:01

by Lukas Wunner

Subject: Re: [PATCH v3 1/2] PCI: pciehp: Add support for async hotplug with native AER and DPC/EDR

On Tue, Jun 27, 2023 at 10:48:37AM -0700, Smita Koralahalli wrote:
> On 6/22/2023 4:22 PM, Sathyanarayanan Kuppuswamy wrote:
> > On 6/21/23 11:51 AM, Smita Koralahalli wrote:
> > > +static void dpc_handle_surprise_removal(struct pci_dev *pdev)
> > > +{
> > > + if (pdev->dpc_rp_extensions && dpc_wait_rp_inactive(pdev)) {
> > > + pci_err(pdev, "failed to retrieve DPC root port on async remove\n");
> > > + goto out;
> > > + }
> > > +
> > > + pci_aer_raw_clear_status(pdev);
> > > + pci_clear_surpdn_errors(pdev);
> > > +
> > > + pci_write_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_STATUS,
> > > + PCI_EXP_DPC_STATUS_TRIGGER);
> >
> > Don't you need to wait for the link to go down?
>
> Yes will include, pcie_wait_for_link()..
>
> Should this check be here or at the beginning of the function before we
> check pdev->dpc_rp_extensions?

I'd just use the same order as dpc_reset_link(), i.e. checking for
!pcie_wait_for_link() happens before the check for pdev->dpc_rp_extensions.
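
A user-space sketch of that ordering, under the assumption that the
handler bails out when the link does not go down (as in the snippet
suggested earlier in this thread). struct mock_port and its fields are
hypothetical stand-ins for the results the real helpers would return;
this is not the v4 code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state the real helpers would probe */
struct mock_port {
	bool link_went_down; /* result of pcie_wait_for_link(pdev, false) */
	bool rp_extensions;  /* mirrors pdev->dpc_rp_extensions */
	bool rp_busy;        /* dpc_wait_rp_inactive() reported failure */
	bool status_cleared; /* set once error and trigger status are cleared */
	bool waiters_woken;  /* set once the pciehp waiter is released */
};

static void mock_handle_surprise_removal(struct mock_port *port)
{
	/* 1. Leave the port in DPC until the link is reported down */
	if (!port->link_went_down)
		goto out;

	/* 2. Only then check the RP extensions, in the same order as dpc_reset_link() */
	if (port->rp_extensions && port->rp_busy)
		goto out;

	/* 3. Clear AER/DPC status, which releases the port from DPC */
	port->status_cleared = true;

out:
	/* 4. Always wake whoever waits for DPC handling to complete */
	port->waiters_woken = true;
}
```

In every path the waiter is woken; the status registers are only cleared
when the link went down and the root port is not busy.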

Thanks,

Lukas