2024-02-22 22:15:42

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH v2 0/3] PCI/DPC: Clean up DPC vs AER/EDR ownership and Kconfig

From: Bjorn Helgaas <[email protected]>

Previously we could request control of DPC without AER, which is illegal
per spec. Also, we could enable CONFIG_PCIE_DPC without CONFIG_PCIE_EDR,
which is also illegal. This series addresses both.

Bjorn Helgaas (3):
PCI/DPC: Request DPC only if also requesting AER
PCI/DPC: Remove CONFIG_PCIE_EDR
PCI/DPC: Encapsulate pci_acpi_add_edr_notifier()

drivers/acpi/pci_root.c | 22 ++++++++++++----------
drivers/pci/pci.h | 4 ++++
drivers/pci/pcie/Kconfig | 14 ++++----------
drivers/pci/pcie/Makefile | 5 ++++-
drivers/pci/pcie/dpc.c | 10 ----------
include/linux/pci-acpi.h | 8 --------
6 files changed, 24 insertions(+), 39 deletions(-)

--
2.34.1



2024-02-22 22:15:58

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH v2 1/3] PCI/DPC: Request DPC only if also requesting AER

From: Bjorn Helgaas <[email protected]>

When booting with "pci=noaer", we don't request control of AER, but we
previously *did* request control of DPC, as in the dmesg log attached at
the bugzilla below:

Command line: ... pci=noaer
acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]

That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which
says:

If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it
must also set bit 7 of the Support field (indicating support for Error
Disconnect Recover notifications) and bits 3 and 4 of the Control field
(requesting control of PCI Express Advanced Error Reporting and the PCI
Express Capability Structure).

Request DPC control only if we have also requested AER control.

Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218491#c12
Signed-off-by: Bjorn Helgaas <[email protected]>
Cc: <[email protected]> # v5.7+
Cc: Kuppuswamy Sathyanarayanan <[email protected]>
Cc: Matthew W Carlis <[email protected]>
Cc: Keith Busch <[email protected]>
Cc: Lukas Wunner <[email protected]>
Cc: Mika Westerberg <[email protected]>
Cc: Jesse Brandeburg <[email protected]>
---
drivers/acpi/pci_root.c | 20 +++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index 58b89b8d950e..efc292b6214e 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -518,17 +518,19 @@ static u32 calculate_control(void)
if (IS_ENABLED(CONFIG_HOTPLUG_PCI_SHPC))
control |= OSC_PCI_SHPC_NATIVE_HP_CONTROL;

- if (pci_aer_available())
+ if (pci_aer_available()) {
control |= OSC_PCI_EXPRESS_AER_CONTROL;

- /*
- * Per the Downstream Port Containment Related Enhancements ECN to
- * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
- * OSC_PCI_EXPRESS_DPC_CONTROL indicates the OS supports both DPC
- * and EDR.
- */
- if (IS_ENABLED(CONFIG_PCIE_DPC) && IS_ENABLED(CONFIG_PCIE_EDR))
- control |= OSC_PCI_EXPRESS_DPC_CONTROL;
+ /*
+ * Per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, the
+ * OS can request DPC control only if it has advertised
+ * OSC_PCI_EDR_SUPPORT and requested both
+ * OSC_PCI_EXPRESS_CAPABILITY_CONTROL and
+ * OSC_PCI_EXPRESS_AER_CONTROL.
+ */
+ if (IS_ENABLED(CONFIG_PCIE_DPC))
+ control |= OSC_PCI_EXPRESS_DPC_CONTROL;
+ }

return control;
}
--
2.34.1


2024-02-22 22:16:10

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH v2 2/3] PCI/DPC: Remove CONFIG_PCIE_EDR

From: Bjorn Helgaas <[email protected]>

Previous Kconfig allowed the possibility of enabling CONFIG_PCIE_DPC
without CONFIG_PCIE_EDR. The PCI Firmware Spec, r3.3, sec 4.5.1,
table 4-5, says an ACPI OS that requests control of DPC must also
advertise support for EDR.

Remove CONFIG_PCIE_EDR and enable the EDR code with CONFIG_PCIE_DPC so that
enabling DPC also enables EDR for ACPI systems. Since EDR is an ACPI
feature, build edr.o only when CONFIG_ACPI is enabled. Stubs cover the
case when DPC is enabled without ACPI.

Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/acpi/pci_root.c | 2 +-
drivers/pci/pcie/Kconfig | 14 ++++----------
drivers/pci/pcie/Makefile | 5 ++++-
drivers/pci/pcie/dpc.c | 10 ----------
include/linux/pci-acpi.h | 4 ++--
5 files changed, 11 insertions(+), 24 deletions(-)

diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index efc292b6214e..bcaf3d3a5e05 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -448,7 +448,7 @@ static u32 calculate_support(void)
support |= OSC_PCI_ASPM_SUPPORT | OSC_PCI_CLOCK_PM_SUPPORT;
if (pci_msi_enabled())
support |= OSC_PCI_MSI_SUPPORT;
- if (IS_ENABLED(CONFIG_PCIE_EDR))
+ if (IS_ENABLED(CONFIG_PCIE_DPC)) /* DPC => EDR support */
support |= OSC_PCI_EDR_SUPPORT;

return support;
diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig
index 8999fcebde6a..21e98289fbe9 100644
--- a/drivers/pci/pcie/Kconfig
+++ b/drivers/pci/pcie/Kconfig
@@ -137,6 +137,10 @@ config PCIE_DPC
have this capability or you do not want to use this feature,
it is safe to answer N.

+ On ACPI systems, this includes Error Disconnect Recover support,
+ the hybrid model that uses both firmware and OS to implement DPC,
+ as specified in the PCI Firmware Specification r3.3.
+
config PCIE_PTM
bool "PCI Express Precision Time Measurement support"
help
@@ -145,13 +149,3 @@ config PCIE_PTM

This is only useful if you have devices that support PTM, but it
is safe to enable even if you don't.
-
-config PCIE_EDR
- bool "PCI Express Error Disconnect Recover support"
- depends on PCIE_DPC && ACPI
- help
- This option adds Error Disconnect Recover support as specified
- in the Downstream Port Containment Related Enhancements ECN to
- the PCI Firmware Specification r3.2. Enable this if you want to
- support hybrid DPC model which uses both firmware and OS to
- implement DPC.
diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
index 8de4ed5f98f1..72657f780c33 100644
--- a/drivers/pci/pcie/Makefile
+++ b/drivers/pci/pcie/Makefile
@@ -12,4 +12,7 @@ obj-$(CONFIG_PCIEAER_INJECT) += aer_inject.o
obj-$(CONFIG_PCIE_PME) += pme.o
obj-$(CONFIG_PCIE_DPC) += dpc.o
obj-$(CONFIG_PCIE_PTM) += ptm.o
-obj-$(CONFIG_PCIE_EDR) += edr.o
+
+ifdef CONFIG_ACPI
+obj-$(CONFIG_PCIE_DPC) += edr.o
+endif
diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
index 94111e438241..0aa79581250b 100644
--- a/drivers/pci/pcie/dpc.c
+++ b/drivers/pci/pcie/dpc.c
@@ -102,19 +102,9 @@ static bool dpc_completed(struct pci_dev *pdev)
*/
bool pci_dpc_recovered(struct pci_dev *pdev)
{
- struct pci_host_bridge *host;
-
if (!pdev->dpc_cap)
return false;

- /*
- * Synchronization between hotplug and DPC is not supported
- * if DPC is owned by firmware and EDR is not enabled.
- */
- host = pci_find_host_bridge(pdev->bus);
- if (!host->native_dpc && !IS_ENABLED(CONFIG_PCIE_EDR))
- return false;
-
/*
* Need a timeout in case DPC never completes due to failure of
* dpc_wait_rp_inactive(). The spec doesn't mandate a time limit,
diff --git a/include/linux/pci-acpi.h b/include/linux/pci-acpi.h
index 078225b514d4..92e196ba0249 100644
--- a/include/linux/pci-acpi.h
+++ b/include/linux/pci-acpi.h
@@ -122,13 +122,13 @@ extern const guid_t pci_acpi_dsm_guid;
#define DSM_PCI_POWER_ON_RESET_DELAY 0x08
#define DSM_PCI_DEVICE_READINESS_DURATIONS 0x09

-#ifdef CONFIG_PCIE_EDR
+#ifdef CONFIG_PCIE_DPC
void pci_acpi_add_edr_notifier(struct pci_dev *pdev);
void pci_acpi_remove_edr_notifier(struct pci_dev *pdev);
#else
static inline void pci_acpi_add_edr_notifier(struct pci_dev *pdev) { }
static inline void pci_acpi_remove_edr_notifier(struct pci_dev *pdev) { }
-#endif /* CONFIG_PCIE_EDR */
+#endif /* CONFIG_PCIE_DPC */

int pci_acpi_set_companion_lookup_hook(struct acpi_device *(*func)(struct pci_dev *));
void pci_acpi_clear_companion_lookup_hook(void);
--
2.34.1


2024-02-22 22:17:45

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH v2 3/3] PCI/DPC: Encapsulate pci_acpi_add_edr_notifier()

From: Bjorn Helgaas <[email protected]>

pci_acpi_add_edr_notifier() and pci_acpi_remove_edr_notifier() are only
referenced inside drivers/pci/. Move their declarations from
include/linux/pci-acpi.h to drivers/pci/pci.h so they're not visible
outside drivers/pci/.

Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/pci/pci.h | 4 ++++
include/linux/pci-acpi.h | 8 --------
2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 2336a8d1edab..03bf2776d73b 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -432,11 +432,15 @@ void pci_dpc_init(struct pci_dev *pdev);
void dpc_process_error(struct pci_dev *pdev);
pci_ers_result_t dpc_reset_link(struct pci_dev *pdev);
bool pci_dpc_recovered(struct pci_dev *pdev);
+void pci_acpi_add_edr_notifier(struct pci_dev *pdev);
+void pci_acpi_remove_edr_notifier(struct pci_dev *pdev);
#else
static inline void pci_save_dpc_state(struct pci_dev *dev) { }
static inline void pci_restore_dpc_state(struct pci_dev *dev) { }
static inline void pci_dpc_init(struct pci_dev *pdev) { }
static inline bool pci_dpc_recovered(struct pci_dev *pdev) { return false; }
+static inline void pci_acpi_add_edr_notifier(struct pci_dev *pdev) { }
+static inline void pci_acpi_remove_edr_notifier(struct pci_dev *pdev) { }
#endif

#ifdef CONFIG_PCIEPORTBUS
diff --git a/include/linux/pci-acpi.h b/include/linux/pci-acpi.h
index 92e196ba0249..f447ce215adf 100644
--- a/include/linux/pci-acpi.h
+++ b/include/linux/pci-acpi.h
@@ -122,14 +122,6 @@ extern const guid_t pci_acpi_dsm_guid;
#define DSM_PCI_POWER_ON_RESET_DELAY 0x08
#define DSM_PCI_DEVICE_READINESS_DURATIONS 0x09

-#ifdef CONFIG_PCIE_DPC
-void pci_acpi_add_edr_notifier(struct pci_dev *pdev);
-void pci_acpi_remove_edr_notifier(struct pci_dev *pdev);
-#else
-static inline void pci_acpi_add_edr_notifier(struct pci_dev *pdev) { }
-static inline void pci_acpi_remove_edr_notifier(struct pci_dev *pdev) { }
-#endif /* CONFIG_PCIE_DPC */
-
int pci_acpi_set_companion_lookup_hook(struct acpi_device *(*func)(struct pci_dev *));
void pci_acpi_clear_companion_lookup_hook(void);

--
2.34.1


Subject: Re: [PATCH v2 2/3] PCI/DPC: Remove CONFIG_PCIE_EDR

Hi Bjorn,

On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <[email protected]>
>
> Previous Kconfig allowed the possibility of enabling CONFIG_PCIE_DPC
> without CONFIG_PCIE_EDR. The PCI Firmware Spec, r3.3, sec 4.5.1,
> table 4-5, says an ACPI OS that requests control of DPC must also
> advertise support for EDR.
>
> Remove CONFIG_PCIE_EDR and enable the EDR code with CONFIG_PCIE_DPC so that
> enabling DPC also enables EDR for ACPI systems. Since EDR is an ACPI
> feature, build edr.o only when CONFIG_ACPI is enabled. Stubs cover the
> case when DPC is enabled without ACPI.
>
> Signed-off-by: Bjorn Helgaas <[email protected]>
> ---
> drivers/acpi/pci_root.c | 2 +-
> drivers/pci/pcie/Kconfig | 14 ++++----------
> drivers/pci/pcie/Makefile | 5 ++++-
> drivers/pci/pcie/dpc.c | 10 ----------
> include/linux/pci-acpi.h | 4 ++--
> 5 files changed, 11 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index efc292b6214e..bcaf3d3a5e05 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -448,7 +448,7 @@ static u32 calculate_support(void)
> support |= OSC_PCI_ASPM_SUPPORT | OSC_PCI_CLOCK_PM_SUPPORT;
> if (pci_msi_enabled())
> support |= OSC_PCI_MSI_SUPPORT;
> - if (IS_ENABLED(CONFIG_PCIE_EDR))
> + if (IS_ENABLED(CONFIG_PCIE_DPC)) /* DPC => EDR support */
> support |= OSC_PCI_EDR_SUPPORT;

Since EDR internally touches AER registers, I still think we should make sure
OS enables AER support before advertising the EDR support.

>
> return support;
> diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig
> index 8999fcebde6a..21e98289fbe9 100644
> --- a/drivers/pci/pcie/Kconfig
> +++ b/drivers/pci/pcie/Kconfig
> @@ -137,6 +137,10 @@ config PCIE_DPC
> have this capability or you do not want to use this feature,
> it is safe to answer N.
>
> + On ACPI systems, this includes Error Disconnect Recover support,
> + the hybrid model that uses both firmware and OS to implement DPC,
> + as specified in the PCI Firmware Specification r3.3.

Nit: Include some section reference?

> +
> config PCIE_PTM
> bool "PCI Express Precision Time Measurement support"
> help
> @@ -145,13 +149,3 @@ config PCIE_PTM
>
> This is only useful if you have devices that support PTM, but it
> is safe to enable even if you don't.
> -
> -config PCIE_EDR
> - bool "PCI Express Error Disconnect Recover support"
> - depends on PCIE_DPC && ACPI
> - help
> - This option adds Error Disconnect Recover support as specified
> - in the Downstream Port Containment Related Enhancements ECN to
> - the PCI Firmware Specification r3.2. Enable this if you want to
> - support hybrid DPC model which uses both firmware and OS to
> - implement DPC.
> diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
> index 8de4ed5f98f1..72657f780c33 100644
> --- a/drivers/pci/pcie/Makefile
> +++ b/drivers/pci/pcie/Makefile
> @@ -12,4 +12,7 @@ obj-$(CONFIG_PCIEAER_INJECT) += aer_inject.o
> obj-$(CONFIG_PCIE_PME) += pme.o
> obj-$(CONFIG_PCIE_DPC) += dpc.o
> obj-$(CONFIG_PCIE_PTM) += ptm.o
> -obj-$(CONFIG_PCIE_EDR) += edr.o
> +
> +ifdef CONFIG_ACPI
> +obj-$(CONFIG_PCIE_DPC) += edr.o
> +endif
> diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
> index 94111e438241..0aa79581250b 100644
> --- a/drivers/pci/pcie/dpc.c
> +++ b/drivers/pci/pcie/dpc.c
> @@ -102,19 +102,9 @@ static bool dpc_completed(struct pci_dev *pdev)
> */
> bool pci_dpc_recovered(struct pci_dev *pdev)
> {
> - struct pci_host_bridge *host;
> -
> if (!pdev->dpc_cap)
> return false;
>
> - /*
> - * Synchronization between hotplug and DPC is not supported
> - * if DPC is owned by firmware and EDR is not enabled.
> - */
> - host = pci_find_host_bridge(pdev->bus);
> - if (!host->native_dpc && !IS_ENABLED(CONFIG_PCIE_EDR))
> - return false;
> -
> /*
> * Need a timeout in case DPC never completes due to failure of
> * dpc_wait_rp_inactive(). The spec doesn't mandate a time limit,
> diff --git a/include/linux/pci-acpi.h b/include/linux/pci-acpi.h
> index 078225b514d4..92e196ba0249 100644
> --- a/include/linux/pci-acpi.h
> +++ b/include/linux/pci-acpi.h
> @@ -122,13 +122,13 @@ extern const guid_t pci_acpi_dsm_guid;
> #define DSM_PCI_POWER_ON_RESET_DELAY 0x08
> #define DSM_PCI_DEVICE_READINESS_DURATIONS 0x09
>
> -#ifdef CONFIG_PCIE_EDR
> +#ifdef CONFIG_PCIE_DPC
> void pci_acpi_add_edr_notifier(struct pci_dev *pdev);
> void pci_acpi_remove_edr_notifier(struct pci_dev *pdev);
> #else
> static inline void pci_acpi_add_edr_notifier(struct pci_dev *pdev) { }
> static inline void pci_acpi_remove_edr_notifier(struct pci_dev *pdev) { }
> -#endif /* CONFIG_PCIE_EDR */
> +#endif /* CONFIG_PCIE_DPC */
>
> int pci_acpi_set_companion_lookup_hook(struct acpi_device *(*func)(struct pci_dev *));
> void pci_acpi_clear_companion_lookup_hook(void);

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


Subject: Re: [PATCH v2 3/3] PCI/DPC: Encapsulate pci_acpi_add_edr_notifier()


On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <[email protected]>
>
> pci_acpi_add_edr_notifier() and pci_acpi_remove_edr_notifier() are only
> referenced inside drivers/pci/. Move their declarations from
> include/linux/pci-acpi.h to drivers/pci/pci.h so they're not visible
> outside drivers/pci/.
>
> Signed-off-by: Bjorn Helgaas <[email protected]>
> ---
> drivers/pci/pci.h | 4 ++++
> include/linux/pci-acpi.h | 8 --------
> 2 files changed, 4 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 2336a8d1edab..03bf2776d73b 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -432,11 +432,15 @@ void pci_dpc_init(struct pci_dev *pdev);
> void dpc_process_error(struct pci_dev *pdev);
> pci_ers_result_t dpc_reset_link(struct pci_dev *pdev);
> bool pci_dpc_recovered(struct pci_dev *pdev);
> +void pci_acpi_add_edr_notifier(struct pci_dev *pdev);
> +void pci_acpi_remove_edr_notifier(struct pci_dev *pdev);

Protect them with CONFIG_ACPI?

> #else
> static inline void pci_save_dpc_state(struct pci_dev *dev) { }
> static inline void pci_restore_dpc_state(struct pci_dev *dev) { }
> static inline void pci_dpc_init(struct pci_dev *pdev) { }
> static inline bool pci_dpc_recovered(struct pci_dev *pdev) { return false; }
> +static inline void pci_acpi_add_edr_notifier(struct pci_dev *pdev) { }
> +static inline void pci_acpi_remove_edr_notifier(struct pci_dev *pdev) { }
> #endif
>
> #ifdef CONFIG_PCIEPORTBUS
> diff --git a/include/linux/pci-acpi.h b/include/linux/pci-acpi.h
> index 92e196ba0249..f447ce215adf 100644
> --- a/include/linux/pci-acpi.h
> +++ b/include/linux/pci-acpi.h
> @@ -122,14 +122,6 @@ extern const guid_t pci_acpi_dsm_guid;
> #define DSM_PCI_POWER_ON_RESET_DELAY 0x08
> #define DSM_PCI_DEVICE_READINESS_DURATIONS 0x09
>
> -#ifdef CONFIG_PCIE_DPC
> -void pci_acpi_add_edr_notifier(struct pci_dev *pdev);
> -void pci_acpi_remove_edr_notifier(struct pci_dev *pdev);
> -#else
> -static inline void pci_acpi_add_edr_notifier(struct pci_dev *pdev) { }
> -static inline void pci_acpi_remove_edr_notifier(struct pci_dev *pdev) { }
> -#endif /* CONFIG_PCIE_DPC */
> -
> int pci_acpi_set_companion_lookup_hook(struct acpi_device *(*func)(struct pci_dev *));
> void pci_acpi_clear_companion_lookup_hook(void);
>

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


Subject: Re: [PATCH v2 1/3] PCI/DPC: Request DPC only if also requesting AER


On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <[email protected]>
>
> When booting with "pci=noaer", we don't request control of AER, but we
> previously *did* request control of DPC, as in the dmesg log attached at
> the bugzilla below:
>
> Command line: ... pci=noaer
> acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
> acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]
>
> That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which
> says:
>
> If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it
> must also set bit 7 of the Support field (indicating support for Error
> Disconnect Recover notifications) and bits 3 and 4 of the Control field
> (requesting control of PCI Express Advanced Error Reporting and the PCI
> Express Capability Structure).

IIUC, this dependency is discussed in sec 4.5.2.4. "Dependencies Between
_OSC Control Bits".

Because handling of Downstream Port Containment has a dependency on Advanced Error
Reporting, the operating system is required to request control over Advanced Error Reporting (bit 3
of the Control field) while requesting control over Downstream Port Containment Configuration
(bit 7 of the Control field). If the operating system attempts to claim control of Downstream Port
Containment Configuration without also claiming control over Advanced Error Reporting, firmware
is required to refuse control of the feature being illegally claimed and mask the corresponding bit.
Firmware is required to maintain ownership of Advanced Error Reporting if it retains ownership of
Downstream Port Containment Configuration.
If the operating system sets bit 7 of the Control field, it must set bit 7 of the Support field, indicating
support for the Error Disconnect Recover event.

>
> Request DPC control only if we have also requested AER control.
>
> Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support")
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218491#c12
> Signed-off-by: Bjorn Helgaas <[email protected]>
> Cc: <[email protected]> # v5.7+
> Cc: Kuppuswamy Sathyanarayanan <[email protected]>
> Cc: Matthew W Carlis <[email protected]>
> Cc: Keith Busch <[email protected]>
> Cc: Lukas Wunner <[email protected]>
> Cc: Mika Westerberg <[email protected]>
> Cc: Jesse Brandeburg <[email protected]>
> ---
Code wise it looks fine to me.

Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
> drivers/acpi/pci_root.c | 20 +++++++++++---------
> 1 file changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index 58b89b8d950e..efc292b6214e 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -518,17 +518,19 @@ static u32 calculate_control(void)
> if (IS_ENABLED(CONFIG_HOTPLUG_PCI_SHPC))
> control |= OSC_PCI_SHPC_NATIVE_HP_CONTROL;
>
> - if (pci_aer_available())
> + if (pci_aer_available()) {
> control |= OSC_PCI_EXPRESS_AER_CONTROL;
>
> - /*
> - * Per the Downstream Port Containment Related Enhancements ECN to
> - * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
> - * OSC_PCI_EXPRESS_DPC_CONTROL indicates the OS supports both DPC
> - * and EDR.
> - */
> - if (IS_ENABLED(CONFIG_PCIE_DPC) && IS_ENABLED(CONFIG_PCIE_EDR))
> - control |= OSC_PCI_EXPRESS_DPC_CONTROL;
> + /*
> + * Per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, the
> + * OS can request DPC control only if it has advertised
> + * OSC_PCI_EDR_SUPPORT and requested both
> + * OSC_PCI_EXPRESS_CAPABILITY_CONTROL and
> + * OSC_PCI_EXPRESS_AER_CONTROL.
> + */
> + if (IS_ENABLED(CONFIG_PCIE_DPC))
> + control |= OSC_PCI_EXPRESS_DPC_CONTROL;
> + }
>
> return control;
> }

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


2024-02-26 15:26:21

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] PCI/DPC: Encapsulate pci_acpi_add_edr_notifier()

On Sun, Feb 25, 2024 at 12:06:52PM -0800, Kuppuswamy Sathyanarayanan wrote:
>
> On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
> > From: Bjorn Helgaas <[email protected]>
> >
> > pci_acpi_add_edr_notifier() and pci_acpi_remove_edr_notifier() are only
> > referenced inside drivers/pci/. Move their declarations from
> > include/linux/pci-acpi.h to drivers/pci/pci.h so they're not visible
> > outside drivers/pci/.
> >
> > Signed-off-by: Bjorn Helgaas <[email protected]>
> > ---
> > drivers/pci/pci.h | 4 ++++
> > include/linux/pci-acpi.h | 8 --------
> > 2 files changed, 4 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > index 2336a8d1edab..03bf2776d73b 100644
> > --- a/drivers/pci/pci.h
> > +++ b/drivers/pci/pci.h
> > @@ -432,11 +432,15 @@ void pci_dpc_init(struct pci_dev *pdev);
> > void dpc_process_error(struct pci_dev *pdev);
> > pci_ers_result_t dpc_reset_link(struct pci_dev *pdev);
> > bool pci_dpc_recovered(struct pci_dev *pdev);
> > +void pci_acpi_add_edr_notifier(struct pci_dev *pdev);
> > +void pci_acpi_remove_edr_notifier(struct pci_dev *pdev);
>
> Protect them with CONFIG_ACPI?

Good idea, thanks! They're called only from pci-acpi.c, so I moved
them inside the #ifdef CONFIG_ACPI above:

#ifdef CONFIG_ACPI
int pci_acpi_program_hp_params(struct pci_dev *dev);
...
#ifdef CONFIG_PCIE_DPC
void pci_acpi_add_edr_notifier(struct pci_dev *pdev);
void pci_acpi_remove_edr_notifier(struct pci_dev *pdev);
#endif
#else
static inline int pci_acpi_program_hp_params(struct pci_dev *dev)
{
return -ENODEV;
}
#ifdef CONFIG_PCIE_DPC
static inline void pci_acpi_add_edr_notifier(struct pci_dev *pdev) { }
static inline void pci_acpi_remove_edr_notifier(struct pci_dev *pdev) { }
#endif
#endif

Bjorn

2024-02-26 15:36:14

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] PCI/DPC: Request DPC only if also requesting AER

On Sun, Feb 25, 2024 at 11:46:07AM -0800, Kuppuswamy Sathyanarayanan wrote:
>
> On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
> > From: Bjorn Helgaas <[email protected]>
> >
> > When booting with "pci=noaer", we don't request control of AER, but we
> > previously *did* request control of DPC, as in the dmesg log attached at
> > the bugzilla below:
> >
> > Command line: ... pci=noaer
> > acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
> > acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]
> >
> > That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which
> > says:
> >
> > If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it
> > must also set bit 7 of the Support field (indicating support for Error
> > Disconnect Recover notifications) and bits 3 and 4 of the Control field
> > (requesting control of PCI Express Advanced Error Reporting and the PCI
> > Express Capability Structure).
>
> IIUC, this dependency is discussed in sec 4.5.2.4. "Dependencies
> Between _OSC Control Bits".
>
> Because handling of Downstream Port Containment has a dependency on
> Advanced Error Reporting, the operating system is required to
> request control over Advanced Error Reporting (bit 3 of the Control
> field) while requesting control over Downstream Port Containment
> Configuration (bit 7 of the Control field). If the operating system
> attempts to claim control of Downstream Port Containment
> Configuration without also claiming control over Advanced Error
> Reporting, firmware is required to refuse control of the feature
> being illegally claimed and mask the corresponding bit. Firmware is
> required to maintain ownership of Advanced Error Reporting if it
> retains ownership of Downstream Port Containment Configuration. If
> the operating system sets bit 7 of the Control field, it must set
> bit 7 of the Support field, indicating support for the Error
> Disconnect Recover event.

So I guess you're suggesting that there are two defects here?

1) Linux requested DPC control without requesting AER control.

2) Platform granted DPC control when it shouldn't have.

I do agree with that, but obviously we can only fix 1) in Linux.

> > Request DPC control only if we have also requested AER control.
> >
> > Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support")
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218491#c12
> > Signed-off-by: Bjorn Helgaas <[email protected]>
> > Cc: <[email protected]> # v5.7+
> > Cc: Kuppuswamy Sathyanarayanan <[email protected]>
> > Cc: Matthew W Carlis <[email protected]>
> > Cc: Keith Busch <[email protected]>
> > Cc: Lukas Wunner <[email protected]>
> > Cc: Mika Westerberg <[email protected]>
> > Cc: Jesse Brandeburg <[email protected]>
> > ---
> Code wise it looks fine to me.
>
> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
> > drivers/acpi/pci_root.c | 20 +++++++++++---------
> > 1 file changed, 11 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> > index 58b89b8d950e..efc292b6214e 100644
> > --- a/drivers/acpi/pci_root.c
> > +++ b/drivers/acpi/pci_root.c
> > @@ -518,17 +518,19 @@ static u32 calculate_control(void)
> > if (IS_ENABLED(CONFIG_HOTPLUG_PCI_SHPC))
> > control |= OSC_PCI_SHPC_NATIVE_HP_CONTROL;
> >
> > - if (pci_aer_available())
> > + if (pci_aer_available()) {
> > control |= OSC_PCI_EXPRESS_AER_CONTROL;
> >
> > - /*
> > - * Per the Downstream Port Containment Related Enhancements ECN to
> > - * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
> > - * OSC_PCI_EXPRESS_DPC_CONTROL indicates the OS supports both DPC
> > - * and EDR.
> > - */
> > - if (IS_ENABLED(CONFIG_PCIE_DPC) && IS_ENABLED(CONFIG_PCIE_EDR))
> > - control |= OSC_PCI_EXPRESS_DPC_CONTROL;
> > + /*
> > + * Per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, the
> > + * OS can request DPC control only if it has advertised
> > + * OSC_PCI_EDR_SUPPORT and requested both
> > + * OSC_PCI_EXPRESS_CAPABILITY_CONTROL and
> > + * OSC_PCI_EXPRESS_AER_CONTROL.
> > + */
> > + if (IS_ENABLED(CONFIG_PCIE_DPC))
> > + control |= OSC_PCI_EXPRESS_DPC_CONTROL;
> > + }
> >
> > return control;
> > }
>
> --
> Sathyanarayanan Kuppuswamy
> Linux Kernel Developer
>

Subject: Re: [PATCH v2 1/3] PCI/DPC: Request DPC only if also requesting AER


On 2/26/24 7:18 AM, Bjorn Helgaas wrote:
> On Sun, Feb 25, 2024 at 11:46:07AM -0800, Kuppuswamy Sathyanarayanan wrote:
>> On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
>>> From: Bjorn Helgaas <[email protected]>
>>>
>>> When booting with "pci=noaer", we don't request control of AER, but we
>>> previously *did* request control of DPC, as in the dmesg log attached at
>>> the bugzilla below:
>>>
>>> Command line: ... pci=noaer
>>> acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
>>> acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]
>>>
>>> That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which
>>> says:
>>>
>>> If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it
>>> must also set bit 7 of the Support field (indicating support for Error
>>> Disconnect Recover notifications) and bits 3 and 4 of the Control field
>>> (requesting control of PCI Express Advanced Error Reporting and the PCI
>>> Express Capability Structure).
>> IIUC, this dependency is discussed in sec 4.5.2.4. "Dependencies
>> Between _OSC Control Bits".
>>
>> Because handling of Downstream Port Containment has a dependency on
>> Advanced Error Reporting, the operating system is required to
>> request control over Advanced Error Reporting (bit 3 of the Control
>> field) while requesting control over Downstream Port Containment
>> Configuration (bit 7 of the Control field). If the operating system
>> attempts to claim control of Downstream Port Containment
>> Configuration without also claiming control over Advanced Error
>> Reporting, firmware is required to refuse control of the feature
>> being illegally claimed and mask the corresponding bit. Firmware is
>> required to maintain ownership of Advanced Error Reporting if it
>> retains ownership of Downstream Port Containment Configuration. If
>> the operating system sets bit 7 of the Control field, it must set
>> bit 7 of the Support field, indicating support for the Error
>> Disconnect Recover event.
> So I guess you're suggesting that there are two defects here?
>
> 1) Linux requested DPC control without requesting AER control.
>
> 2) Platform granted DPC control when it shouldn't have.
>
> I do agree with that, but obviously we can only fix 1) in Linux.

Sorry, maybe my comment was not clear. I was just suggesting
to change the  spec reference from r3.3, sec 4.5.1, table 4-5
to  r3.3, sec 4.5.2.4 "Dependencies Between _OSC Control Bits". I agree that we cannot do much if firmware misbehaves here.

>>> Request DPC control only if we have also requested AER control.
>>>
>>> Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support")
>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218491#c12
>>> Signed-off-by: Bjorn Helgaas <[email protected]>
>>> Cc: <[email protected]> # v5.7+
>>> Cc: Kuppuswamy Sathyanarayanan <[email protected]>
>>> Cc: Matthew W Carlis <[email protected]>
>>> Cc: Keith Busch <[email protected]>
>>> Cc: Lukas Wunner <[email protected]>
>>> Cc: Mika Westerberg <[email protected]>
>>> Cc: Jesse Brandeburg <[email protected]>
>>> ---
>> Code wise it looks fine to me.
>>
>> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
>>> drivers/acpi/pci_root.c | 20 +++++++++++---------
>>> 1 file changed, 11 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
>>> index 58b89b8d950e..efc292b6214e 100644
>>> --- a/drivers/acpi/pci_root.c
>>> +++ b/drivers/acpi/pci_root.c
>>> @@ -518,17 +518,19 @@ static u32 calculate_control(void)
>>> if (IS_ENABLED(CONFIG_HOTPLUG_PCI_SHPC))
>>> control |= OSC_PCI_SHPC_NATIVE_HP_CONTROL;
>>>
>>> - if (pci_aer_available())
>>> + if (pci_aer_available()) {
>>> control |= OSC_PCI_EXPRESS_AER_CONTROL;
>>>
>>> - /*
>>> - * Per the Downstream Port Containment Related Enhancements ECN to
>>> - * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
>>> - * OSC_PCI_EXPRESS_DPC_CONTROL indicates the OS supports both DPC
>>> - * and EDR.
>>> - */
>>> - if (IS_ENABLED(CONFIG_PCIE_DPC) && IS_ENABLED(CONFIG_PCIE_EDR))
>>> - control |= OSC_PCI_EXPRESS_DPC_CONTROL;
>>> + /*
>>> + * Per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, the
>>> + * OS can request DPC control only if it has advertised
>>> + * OSC_PCI_EDR_SUPPORT and requested both
>>> + * OSC_PCI_EXPRESS_CAPABILITY_CONTROL and
>>> + * OSC_PCI_EXPRESS_AER_CONTROL.
>>> + */
>>> + if (IS_ENABLED(CONFIG_PCIE_DPC))
>>> + control |= OSC_PCI_EXPRESS_DPC_CONTROL;
>>> + }
>>>
>>> return control;
>>> }
>> --
>> Sathyanarayanan Kuppuswamy
>> Linux Kernel Developer
>>
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


2024-02-26 16:33:31

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] PCI/DPC: Request DPC only if also requesting AER

On Mon, Feb 26, 2024 at 07:46:05AM -0800, Kuppuswamy Sathyanarayanan wrote:
>
> On 2/26/24 7:18 AM, Bjorn Helgaas wrote:
> > On Sun, Feb 25, 2024 at 11:46:07AM -0800, Kuppuswamy Sathyanarayanan wrote:
> >> On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
> >>> From: Bjorn Helgaas <[email protected]>
> >>>
> >>> When booting with "pci=noaer", we don't request control of AER, but we
> >>> previously *did* request control of DPC, as in the dmesg log attached at
> >>> the bugzilla below:
> >>>
> >>> Command line: ... pci=noaer
> >>> acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
> >>> acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]
> >>>
> >>> That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which
> >>> says:
> >>>
> >>> If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it
> >>> must also set bit 7 of the Support field (indicating support for Error
> >>> Disconnect Recover notifications) and bits 3 and 4 of the Control field
> >>> (requesting control of PCI Express Advanced Error Reporting and the PCI
> >>> Express Capability Structure).
> >>
> >> IIUC, this dependency is discussed in sec 4.5.2.4. "Dependencies
> >> Between _OSC Control Bits".
> >>
> >> Because handling of Downstream Port Containment has a dependency on
> >> Advanced Error Reporting, the operating system is required to
> >> request control over Advanced Error Reporting (bit 3 of the Control
> >> field) while requesting control over Downstream Port Containment
> >> Configuration (bit 7 of the Control field). If the operating system
> >> attempts to claim control of Downstream Port Containment
> >> Configuration without also claiming control over Advanced Error
> >> Reporting, firmware is required to refuse control of the feature
> >> being illegally claimed and mask the corresponding bit. Firmware is
> >> required to maintain ownership of Advanced Error Reporting if it
> >> retains ownership of Downstream Port Containment Configuration. If
> >> the operating system sets bit 7 of the Control field, it must set
> >> bit 7 of the Support field, indicating support for the Error
> >> Disconnect Recover event.
> >
> > So I guess you're suggesting that there are two defects here?
> >
> > 1) Linux requested DPC control without requesting AER control.
> >
> > 2) Platform granted DPC control when it shouldn't have.
> >
> > I do agree with that, but obviously we can only fix 1) in Linux.
>
> Sorry, maybe my comment was not clear. I was just suggesting to
> change the spec reference from r3.3, sec 4.5.1, table 4-5 to r3.3,
> sec 4.5.2.4 "Dependencies Between _OSC Control Bits".

The requirement that the OS request AER control whenever it requests
DPC control is mentioned in both sec 4.5.1 and sec 4.5.2.4. IMO sec
4.5.2.4 should not exist because the per-bit table in sec 4.5.1 is a
better place for implementation guidance. 4.5.2.4 is easy to miss,
mostly redundant, and hard to integrate with the 4.5.1 table.

What advantage do you see for citing 4.5.2.4 instead of 4.5.1? The
only real difference I see is that it also points out a firmware
problem. I don't think the extra text is worth it since it doesn't
motivate the Linux change.

Bjorn

Subject: Re: [PATCH v2 1/3] PCI/DPC: Request DPC only if also requesting AER


On 2/26/24 8:33 AM, Bjorn Helgaas wrote:
> On Mon, Feb 26, 2024 at 07:46:05AM -0800, Kuppuswamy Sathyanarayanan wrote:
>> On 2/26/24 7:18 AM, Bjorn Helgaas wrote:
>>> On Sun, Feb 25, 2024 at 11:46:07AM -0800, Kuppuswamy Sathyanarayanan wrote:
>>>> On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
>>>>> From: Bjorn Helgaas <[email protected]>
>>>>>
>>>>> When booting with "pci=noaer", we don't request control of AER, but we
>>>>> previously *did* request control of DPC, as in the dmesg log attached at
>>>>> the bugzilla below:
>>>>>
>>>>> Command line: ... pci=noaer
>>>>> acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
>>>>> acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]
>>>>>
>>>>> That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which
>>>>> says:
>>>>>
>>>>> If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it
>>>>> must also set bit 7 of the Support field (indicating support for Error
>>>>> Disconnect Recover notifications) and bits 3 and 4 of the Control field
>>>>> (requesting control of PCI Express Advanced Error Reporting and the PCI
>>>>> Express Capability Structure).
>>>> IIUC, this dependency is discussed in sec 4.5.2.4. "Dependencies
>>>> Between _OSC Control Bits".
>>>>
>>>> Because handling of Downstream Port Containment has a dependency on
>>>> Advanced Error Reporting, the operating system is required to
>>>> request control over Advanced Error Reporting (bit 3 of the Control
>>>> field) while requesting control over Downstream Port Containment
>>>> Configuration (bit 7 of the Control field). If the operating system
>>>> attempts to claim control of Downstream Port Containment
>>>> Configuration without also claiming control over Advanced Error
>>>> Reporting, firmware is required to refuse control of the feature
>>>> being illegally claimed and mask the corresponding bit. Firmware is
>>>> required to maintain ownership of Advanced Error Reporting if it
>>>> retains ownership of Downstream Port Containment Configuration. If
>>>> the operating system sets bit 7 of the Control field, it must set
>>>> bit 7 of the Support field, indicating support for the Error
>>>> Disconnect Recover event.
>>> So I guess you're suggesting that there are two defects here?
>>>
>>> 1) Linux requested DPC control without requesting AER control.
>>>
>>> 2) Platform granted DPC control when it shouldn't have.
>>>
>>> I do agree with that, but obviously we can only fix 1) in Linux.
>> Sorry, maybe my comment was not clear. I was just suggesting to
>> change the spec reference from r3.3, sec 4.5.1, table 4-5 to r3.3,
>> sec 4.5.2.4 "Dependencies Between _OSC Control Bits".
> The requirement that the OS request AER control whenever it requests
> DPC control is mentioned in both sec 4.5.1 and sec 4.5.2.4. IMO sec
> 4.5.2.4 should not exist because the per-bit table in sec 4.5.1 is a
> better place for implementation guidance. 4.5.2.4 is easy to miss,
> mostly redundant, and hard to integrate with the 4.5.1 table.
>
> What advantage do you see for citing 4.5.2.4 instead of 4.5.1? The
> only real difference I see is that it also points out a firmware
> problem. I don't think the extra text is worth it since it doesn't
> motivate the Linux change.


My thinking is, since the fix is related to the dependency between
_OSC control bits (AER & DPC), and there is a special section in the
spec which discuss it, I thought it is better to  quote it.

But I get your point. I think either if fine.

>
> Bjorn

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


Subject: Re: [PATCH v2 0/3] PCI/DPC: Clean up DPC vs AER/EDR ownership and Kconfig

Hi,

On 2/26/24 10:18 PM, Ethan Zhao wrote:
> On 2/23/2024 6:15 AM, Bjorn Helgaas wrote:
>> From: Bjorn Helgaas <[email protected]>
>>
>> Previously we could request control of DPC without AER, which is illegal
>> per spec.  Also, we could enable CONFIG_PCIE_DPC without CONFIG_PCIE_EDR,
>> which is also illegal.  This series addresses both.
>
> I have a question here, how to understand the relationship EDR & AER ?
> somewhere EDR touches AER status without checking _OSC granted bits,
> such as
>    pci_aer_raw_clear_status(edev);


Which_OSC bits?

EDR code will only get triggered if OS advertises the EDR support (which
also means OS supports AER and DPC), and both AER and DPC is owned by
the firmware. During the EDR notification, the OS is allowed to touch AER
and DPC registers. So there is no problem with EDR code using AER routines.


>
> sometimes EDR calling AER with host->native_aer checked, like
>
> pcie_do_recovery()
> {
>  ...
>  if (host->native_aer || pcie_ports_native) {
>         pcie_clear_device_status(dev);
>         pci_aer_clear_nonfatal_status(dev);
>     }
>  ...
> }
>
> That is really confusing. could we do some cleanup to eliminate it ?
> such as seperate AER code into common code and runtime part.
>
>
> Thanks,
> Ethan
>  
>
>>
>> Bjorn Helgaas (3):
>>    PCI/DPC: Request DPC only if also requesting AER
>>    PCI/DPC: Remove CONFIG_PCIE_EDR
>>    PCI/DPC: Encapsulate pci_acpi_add_edr_notifier()
>>
>>   drivers/acpi/pci_root.c   | 22 ++++++++++++----------
>>   drivers/pci/pci.h         |  4 ++++
>>   drivers/pci/pcie/Kconfig  | 14 ++++----------
>>   drivers/pci/pcie/Makefile |  5 ++++-
>>   drivers/pci/pcie/dpc.c    | 10 ----------
>>   include/linux/pci-acpi.h  |  8 --------
>>   6 files changed, 24 insertions(+), 39 deletions(-)
>>
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


2024-02-27 07:13:01

by Ethan Zhao

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] PCI/DPC: Clean up DPC vs AER/EDR ownership and Kconfig

On 2/27/2024 2:35 PM, Kuppuswamy Sathyanarayanan wrote:
> Hi,
>
> On 2/26/24 10:18 PM, Ethan Zhao wrote:
>> On 2/23/2024 6:15 AM, Bjorn Helgaas wrote:
>>> From: Bjorn Helgaas <[email protected]>
>>>
>>> Previously we could request control of DPC without AER, which is illegal
>>> per spec.  Also, we could enable CONFIG_PCIE_DPC without CONFIG_PCIE_EDR,
>>> which is also illegal.  This series addresses both.
>> I have a question here, how to understand the relationship EDR & AER ?
>> somewhere EDR touches AER status without checking _OSC granted bits,
>> such as
>>    pci_aer_raw_clear_status(edev);
>
> Which_OSC bits?
>
> EDR code will only get triggered if OS advertises the EDR support (which
> also means OS supports AER and DPC), and both AER and DPC is owned by
> the firmware. During the EDR notification, the OS is allowed to touch AER

Means no need to check if host->native_aer ? why checked in
pcie_do_recovery() ?

Thanks,
Ethan

> and DPC registers. So there is no problem with EDR code using AER routines.
>
>
>> sometimes EDR calling AER with host->native_aer checked, like
>>
>> pcie_do_recovery()
>> {
>>  ...
>>  if (host->native_aer || pcie_ports_native) {
>>         pcie_clear_device_status(dev);
>>         pci_aer_clear_nonfatal_status(dev);
>>     }
>>  ...
>> }
>>
>> That is really confusing. could we do some cleanup to eliminate it ?
>> such as seperate AER code into common code and runtime part.
>>
>>
>> Thanks,
>> Ethan
>>
>>
>>> Bjorn Helgaas (3):
>>>    PCI/DPC: Request DPC only if also requesting AER
>>>    PCI/DPC: Remove CONFIG_PCIE_EDR
>>>    PCI/DPC: Encapsulate pci_acpi_add_edr_notifier()
>>>
>>>   drivers/acpi/pci_root.c   | 22 ++++++++++++----------
>>>   drivers/pci/pci.h         |  4 ++++
>>>   drivers/pci/pcie/Kconfig  | 14 ++++----------
>>>   drivers/pci/pcie/Makefile |  5 ++++-
>>>   drivers/pci/pcie/dpc.c    | 10 ----------
>>>   include/linux/pci-acpi.h  |  8 --------
>>>   6 files changed, 24 insertions(+), 39 deletions(-)
>>>

2024-02-27 09:56:39

by Ethan Zhao

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] PCI/DPC: Clean up DPC vs AER/EDR ownership and Kconfig

On 2/23/2024 6:15 AM, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <[email protected]>
>
> Previously we could request control of DPC without AER, which is illegal
> per spec. Also, we could enable CONFIG_PCIE_DPC without CONFIG_PCIE_EDR,
> which is also illegal. This series addresses both.

I have a question here, how to understand the relationship EDR & AER ?
somewhere EDR touches AER status without checking _OSC granted bits,
such as
pci_aer_raw_clear_status(edev);

sometimes EDR calling AER with host->native_aer checked, like

pcie_do_recovery()
{
...
if (host->native_aer || pcie_ports_native) {
pcie_clear_device_status(dev);
pci_aer_clear_nonfatal_status(dev);
}
...
}

That is really confusing. could we do some cleanup to eliminate it ?
such as seperate AER code into common code and runtime part.


Thanks,
Ethan


>
> Bjorn Helgaas (3):
> PCI/DPC: Request DPC only if also requesting AER
> PCI/DPC: Remove CONFIG_PCIE_EDR
> PCI/DPC: Encapsulate pci_acpi_add_edr_notifier()
>
> drivers/acpi/pci_root.c | 22 ++++++++++++----------
> drivers/pci/pci.h | 4 ++++
> drivers/pci/pcie/Kconfig | 14 ++++----------
> drivers/pci/pcie/Makefile | 5 ++++-
> drivers/pci/pcie/dpc.c | 10 ----------
> include/linux/pci-acpi.h | 8 --------
> 6 files changed, 24 insertions(+), 39 deletions(-)
>

Subject: Re: [PATCH v2 0/3] PCI/DPC: Clean up DPC vs AER/EDR ownership and Kconfig


On 2/26/24 11:12 PM, Ethan Zhao wrote:
> On 2/27/2024 2:35 PM, Kuppuswamy Sathyanarayanan wrote:
>> Hi,
>>
>> On 2/26/24 10:18 PM, Ethan Zhao wrote:
>>> On 2/23/2024 6:15 AM, Bjorn Helgaas wrote:
>>>> From: Bjorn Helgaas <[email protected]>
>>>>
>>>> Previously we could request control of DPC without AER, which is illegal
>>>> per spec.  Also, we could enable CONFIG_PCIE_DPC without CONFIG_PCIE_EDR,
>>>> which is also illegal.  This series addresses both.
>>> I have a question here, how to understand the relationship EDR & AER ?
>>> somewhere EDR touches AER status without checking _OSC granted bits,
>>> such as
>>>     pci_aer_raw_clear_status(edev);
>>
>> Which_OSC bits?
>>
>> EDR code will only get triggered if OS advertises the EDR support (which
>> also means OS supports AER and DPC), and both AER and DPC is owned by
>> the firmware. During the EDR notification, the OS is allowed to touch AER
>
> Means no need to check if host->native_aer ? why checked in
> pcie_do_recovery() ?

pcie_do_recovery() can be called form EDR (FF) or native path. That check is used
to skip device status clearing in FF mode. You can find details about it in

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/pci/pcie/err.c?id=068c29a248b6ddbfdf7bb150b547569759620d36



>
> Thanks,
> Ethan
>
>> and DPC registers. So there is no problem with EDR code using AER routines.
>>
>>
>>> sometimes EDR calling AER with host->native_aer checked, like
>>>
>>> pcie_do_recovery()
>>> {
>>>   ...
>>>   if (host->native_aer || pcie_ports_native) {
>>>          pcie_clear_device_status(dev);
>>>          pci_aer_clear_nonfatal_status(dev);
>>>      }
>>>   ...
>>> }
>>>
>>> That is really confusing. could we do some cleanup to eliminate it ?
>>> such as seperate AER code into common code and runtime part.
>>>
>>>
>>> Thanks,
>>> Ethan
>>>  
>>>> Bjorn Helgaas (3):
>>>>     PCI/DPC: Request DPC only if also requesting AER
>>>>     PCI/DPC: Remove CONFIG_PCIE_EDR
>>>>     PCI/DPC: Encapsulate pci_acpi_add_edr_notifier()
>>>>
>>>>    drivers/acpi/pci_root.c   | 22 ++++++++++++----------
>>>>    drivers/pci/pci.h         |  4 ++++
>>>>    drivers/pci/pcie/Kconfig  | 14 ++++----------
>>>>    drivers/pci/pcie/Makefile |  5 ++++-
>>>>    drivers/pci/pcie/dpc.c    | 10 ----------
>>>>    include/linux/pci-acpi.h  |  8 --------
>>>>    6 files changed, 24 insertions(+), 39 deletions(-)
>>>>
>
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


2024-03-01 23:06:18

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] PCI/DPC: Remove CONFIG_PCIE_EDR

On Sun, Feb 25, 2024 at 12:05:12PM -0800, Kuppuswamy Sathyanarayanan wrote:
> On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
> > From: Bjorn Helgaas <[email protected]>
> >
> > Previous Kconfig allowed the possibility of enabling CONFIG_PCIE_DPC
> > without CONFIG_PCIE_EDR. The PCI Firmware Spec, r3.3, sec 4.5.1,
> > table 4-5, says an ACPI OS that requests control of DPC must also
> > advertise support for EDR.
> >
> > Remove CONFIG_PCIE_EDR and enable the EDR code with CONFIG_PCIE_DPC so that
> > enabling DPC also enables EDR for ACPI systems. Since EDR is an ACPI
> > feature, build edr.o only when CONFIG_ACPI is enabled. Stubs cover the
> > case when DPC is enabled without ACPI.
> >
> > Signed-off-by: Bjorn Helgaas <[email protected]>
> > ---
> > drivers/acpi/pci_root.c | 2 +-
> > drivers/pci/pcie/Kconfig | 14 ++++----------
> > drivers/pci/pcie/Makefile | 5 ++++-
> > drivers/pci/pcie/dpc.c | 10 ----------
> > include/linux/pci-acpi.h | 4 ++--
> > 5 files changed, 11 insertions(+), 24 deletions(-)
> >
> > diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> > index efc292b6214e..bcaf3d3a5e05 100644
> > --- a/drivers/acpi/pci_root.c
> > +++ b/drivers/acpi/pci_root.c
> > @@ -448,7 +448,7 @@ static u32 calculate_support(void)
> > support |= OSC_PCI_ASPM_SUPPORT | OSC_PCI_CLOCK_PM_SUPPORT;
> > if (pci_msi_enabled())
> > support |= OSC_PCI_MSI_SUPPORT;
> > - if (IS_ENABLED(CONFIG_PCIE_EDR))
> > + if (IS_ENABLED(CONFIG_PCIE_DPC)) /* DPC => EDR support */
> > support |= OSC_PCI_EDR_SUPPORT;
>
> Since EDR internally touches AER registers, I still think we should
> make sure OS enables AER support before advertising the EDR support.

I guess you're suggesting that we should make it look like this?

if (host_bridge->native_aer && IS_ENABLED(CONFIG_PCIE_DPC))

That doesn't seem right to me because the implementation note in PCI
Firmware r3.3, sec 4.6.12, shows the EDR flow when firmware maintains
control of AER and DPC, i.e., "host_bridge->native_aer" would be
false.

> > return support;
> > diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig
> > index 8999fcebde6a..21e98289fbe9 100644
> > --- a/drivers/pci/pcie/Kconfig
> > +++ b/drivers/pci/pcie/Kconfig
> > @@ -137,6 +137,10 @@ config PCIE_DPC
> > have this capability or you do not want to use this feature,
> > it is safe to answer N.
> >
> > + On ACPI systems, this includes Error Disconnect Recover support,
> > + the hybrid model that uses both firmware and OS to implement DPC,
> > + as specified in the PCI Firmware Specification r3.3.
>
> Nit: Include some section reference?

I basically copied this from the PCIE_EDR help and updated the
revision number. But I don't think the firmware spec is a very good
reference because EDR is defined by ACPI. There's very little text in
the ACPI spec about EDR, but the firmware spec does assume you know
what *is* there. And the ACPI spec is available to anybody, unlike
the PCI firmware spec.

+ On ACPI systems, this includes Error Disconnect Recover support,
+ the hybrid model that uses both firmware and OS to implement DPC,
+ as specified in ACPI r6.5, sec 5.6.6.

> > config PCIE_PTM
> > bool "PCI Express Precision Time Measurement support"
> > help
> > @@ -145,13 +149,3 @@ config PCIE_PTM
> >
> > This is only useful if you have devices that support PTM, but it
> > is safe to enable even if you don't.
> > -
> > -config PCIE_EDR
> > - bool "PCI Express Error Disconnect Recover support"
> > - depends on PCIE_DPC && ACPI
> > - help
> > - This option adds Error Disconnect Recover support as specified
> > - in the Downstream Port Containment Related Enhancements ECN to
> > - the PCI Firmware Specification r3.2. Enable this if you want to
> > - support hybrid DPC model which uses both firmware and OS to
> > - implement DPC.

Subject: Re: [PATCH v2 2/3] PCI/DPC: Remove CONFIG_PCIE_EDR


On 3/1/24 3:06 PM, Bjorn Helgaas wrote:
> On Sun, Feb 25, 2024 at 12:05:12PM -0800, Kuppuswamy Sathyanarayanan wrote:
>> On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
>>> From: Bjorn Helgaas <[email protected]>
>>>
>>> Previous Kconfig allowed the possibility of enabling CONFIG_PCIE_DPC
>>> without CONFIG_PCIE_EDR. The PCI Firmware Spec, r3.3, sec 4.5.1,
>>> table 4-5, says an ACPI OS that requests control of DPC must also
>>> advertise support for EDR.
>>>
>>> Remove CONFIG_PCIE_EDR and enable the EDR code with CONFIG_PCIE_DPC so that
>>> enabling DPC also enables EDR for ACPI systems. Since EDR is an ACPI
>>> feature, build edr.o only when CONFIG_ACPI is enabled. Stubs cover the
>>> case when DPC is enabled without ACPI.
>>>
>>> Signed-off-by: Bjorn Helgaas <[email protected]>
>>> ---
>>> drivers/acpi/pci_root.c | 2 +-
>>> drivers/pci/pcie/Kconfig | 14 ++++----------
>>> drivers/pci/pcie/Makefile | 5 ++++-
>>> drivers/pci/pcie/dpc.c | 10 ----------
>>> include/linux/pci-acpi.h | 4 ++--
>>> 5 files changed, 11 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
>>> index efc292b6214e..bcaf3d3a5e05 100644
>>> --- a/drivers/acpi/pci_root.c
>>> +++ b/drivers/acpi/pci_root.c
>>> @@ -448,7 +448,7 @@ static u32 calculate_support(void)
>>> support |= OSC_PCI_ASPM_SUPPORT | OSC_PCI_CLOCK_PM_SUPPORT;
>>> if (pci_msi_enabled())
>>> support |= OSC_PCI_MSI_SUPPORT;
>>> - if (IS_ENABLED(CONFIG_PCIE_EDR))
>>> + if (IS_ENABLED(CONFIG_PCIE_DPC)) /* DPC => EDR support */
>>> support |= OSC_PCI_EDR_SUPPORT;
>> Since EDR internally touches AER registers, I still think we should
>> make sure OS enables AER support before advertising the EDR support.
> I guess you're suggesting that we should make it look like this?
>
> if (host_bridge->native_aer && IS_ENABLED(CONFIG_PCIE_DPC))
>
> That doesn't seem right to me because the implementation note in PCI
> Firmware r3.3, sec 4.6.12, shows the EDR flow when firmware maintains
> control of AER and DPC, i.e., "host_bridge->native_aer" would be
> false.

No, my idea is to check for something like below:

if (IS_ENABLED(CONFIG_PCIEAER) && IS_ENABLED(CONFIG_PCIE_DPC)) or if (pci_aer_available() && IS_ENABLED(CONFIG_PCIE_DPC) to ensure AER is not disabled by noaer command line option. Since EDR handler depends on AER routines (like pci_aer_raw_clear_status()) to |clear AER registers, we need to ensure AER is enabled in kernel before advertising suppor for EDR.


>
>>> return support;
>>> diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig
>>> index 8999fcebde6a..21e98289fbe9 100644
>>> --- a/drivers/pci/pcie/Kconfig
>>> +++ b/drivers/pci/pcie/Kconfig
>>> @@ -137,6 +137,10 @@ config PCIE_DPC
>>> have this capability or you do not want to use this feature,
>>> it is safe to answer N.
>>>
>>> + On ACPI systems, this includes Error Disconnect Recover support,
>>> + the hybrid model that uses both firmware and OS to implement DPC,
>>> + as specified in the PCI Firmware Specification r3.3.
>> Nit: Include some section reference?
> I basically copied this from the PCIE_EDR help and updated the
> revision number. But I don't think the firmware spec is a very good
> reference because EDR is defined by ACPI. There's very little text in
> the ACPI spec about EDR, but the firmware spec does assume you know
> what *is* there. And the ACPI spec is available to anybody, unlike
> the PCI firmware spec.
>
> + On ACPI systems, this includes Error Disconnect Recover support,
> + the hybrid model that uses both firmware and OS to implement DPC,
> + as specified in ACPI r6.5, sec 5.6.6.
>
>>> config PCIE_PTM
>>> bool "PCI Express Precision Time Measurement support"
>>> help
>>> @@ -145,13 +149,3 @@ config PCIE_PTM
>>>
>>> This is only useful if you have devices that support PTM, but it
>>> is safe to enable even if you don't.
>>> -
>>> -config PCIE_EDR
>>> - bool "PCI Express Error Disconnect Recover support"
>>> - depends on PCIE_DPC && ACPI
>>> - help
>>> - This option adds Error Disconnect Recover support as specified
>>> - in the Downstream Port Containment Related Enhancements ECN to
>>> - the PCI Firmware Specification r3.2. Enable this if you want to
>>> - support hybrid DPC model which uses both firmware and OS to
>>> - implement DPC.

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer