2021-04-27 02:29:52

by Shanker Donthineni

Subject: [PATCH v2 1/2] PCI: Add support for a function level reset based on _RST method

_RST is a standard method defined in the ACPI specification. It
provides a function level reset when it is present in the acpi_device
context associated with a PCI device.

Implement a new reset function, pci_dev_acpi_reset(), that probes for
the _RST method and executes it if the firmware defines it.

The ACPI-based reset is called after the device-specific reset and
before the standard PCI hardware resets.

Signed-off-by: Shanker Donthineni <[email protected]>
---
drivers/pci/pci.c | 35 +++++++++++++++++++++++++++++++++++
1 file changed, 35 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 16a17215f633..6dadb19848c2 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5054,6 +5054,35 @@ static void pci_dev_restore(struct pci_dev *dev)
 		err_handler->reset_done(dev);
 }
 
+/**
+ * pci_dev_acpi_reset - do a function level reset using _RST method
+ * @dev: device to reset
+ * @probe: check if _RST method is included in the acpi_device context.
+ */
+static int pci_dev_acpi_reset(struct pci_dev *dev, int probe)
+{
+#ifdef CONFIG_ACPI
+	acpi_handle handle = ACPI_HANDLE(&dev->dev);
+
+	/* Return -ENOTTY if _RST method is not included in the dev context */
+	if (!handle || !acpi_has_method(handle, "_RST"))
+		return -ENOTTY;
+
+	/* Return 0 for probe phase indicating that we can reset this device */
+	if (probe)
+		return 0;
+
+	/* Invoke _RST() method to perform a function level reset */
+	if (ACPI_FAILURE(acpi_evaluate_object(handle, "_RST", NULL, NULL))) {
+		pci_warn(dev, "Failed to reset the device\n");
+		return -EINVAL;
+	}
+	return 0;
+#else
+	return -ENOTTY;
+#endif
+}
+
 /**
  * __pci_reset_function_locked - reset a PCI device function while holding
  * the @dev mutex lock.
@@ -5089,6 +5118,9 @@ int __pci_reset_function_locked(struct pci_dev *dev)
 	 * reset mechanisms might be broken on the device.
 	 */
 	rc = pci_dev_specific_reset(dev, 0);
+	if (rc != -ENOTTY)
+		return rc;
+	rc = pci_dev_acpi_reset(dev, 0);
 	if (rc != -ENOTTY)
 		return rc;
 	if (pcie_has_flr(dev)) {
@@ -5127,6 +5159,9 @@ int pci_probe_reset_function(struct pci_dev *dev)
 	might_sleep();
 
 	rc = pci_dev_specific_reset(dev, 1);
+	if (rc != -ENOTTY)
+		return rc;
+	rc = pci_dev_acpi_reset(dev, 1);
 	if (rc != -ENOTTY)
 		return rc;
 	if (pcie_has_flr(dev))
--
2.17.1
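
The ordering described in the commit message can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel code: struct fake_dev, the three stub reset functions, and try_reset() are hypothetical names invented for illustration. It only models the convention the patch relies on, where -ENOTTY means "method not supported, try the next one" and any other return value ends the search.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev; only what the sketch needs. */
struct fake_dev {
	int has_specific_reset;	/* a device-specific reset quirk exists */
	int has_acpi_rst;	/* firmware describes an _RST method */
	int has_flr;		/* PCIe FLR is available */
};

typedef int (*reset_fn)(struct fake_dev *dev, int probe);

/* Each stub returns -ENOTTY when its method does not apply to the device. */
static int specific_reset(struct fake_dev *dev, int probe)
{
	(void)probe;
	return dev->has_specific_reset ? 0 : -ENOTTY;
}

static int acpi_reset(struct fake_dev *dev, int probe)
{
	(void)probe;
	return dev->has_acpi_rst ? 0 : -ENOTTY;
}

static int flr_reset(struct fake_dev *dev, int probe)
{
	(void)probe;
	return dev->has_flr ? 0 : -ENOTTY;
}

struct reset_method {
	const char *name;
	reset_fn fn;
};

/* Same order as in the patch: device-specific, then ACPI, then FLR. */
static const struct reset_method methods[] = {
	{ "device_specific", specific_reset },
	{ "acpi", acpi_reset },
	{ "flr", flr_reset },
};

/*
 * Walk the methods in order. The first one that returns anything other
 * than -ENOTTY decides the result; *used reports which method that was,
 * or NULL if no method applies.
 */
static int try_reset(struct fake_dev *dev, int probe, const char **used)
{
	size_t i;

	for (i = 0; i < sizeof(methods) / sizeof(methods[0]); i++) {
		int rc = methods[i].fn(dev, probe);

		if (rc != -ENOTTY) {
			*used = methods[i].name;
			return rc;
		}
	}
	*used = NULL;
	return -ENOTTY;
}
```

With this model, a device that has both _RST and FLR but no device-specific quirk is reset through the ACPI method, which matches the precedence the patch gives pci_dev_acpi_reset() over pcie_has_flr().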


2021-04-27 02:30:42

by Shanker Donthineni

Subject: [PATCH v2 2/2] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs

On select platforms, some Nvidia GPU devices do not work with
Secondary Bus Reset (SBR). Triggering SBR leaves the device inoperable
for the current system boot, and only a hard reboot of the system
returns the GPU to normal operation. Enable the NO_BUS_RESET quirk for
the affected devices to avoid the issue.

This issue will be fixed in the next generation of hardware.

Signed-off-by: Shanker Donthineni <[email protected]>
---
Changes since v1:
- Split the patch in two: code for handling _RST, and the SBR-specific quirk
- The _RST-based reset is called as a first-class mechanism in the reset code path

drivers/pci/quirks.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 653660e3ba9e..1da80e772ee1 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3913,6 +3913,18 @@ static int delay_250ms_after_flr(struct pci_dev *dev, int probe)
 	return 0;
 }
 
+/*
+ * Some Nvidia GPU devices do not work with bus reset, so SBR must be
+ * prevented for the affected devices.
+ */
+static void quirk_nvidia_no_bus_reset(struct pci_dev *dev)
+{
+	if ((dev->device & 0xffc0) == 0x2340)
+		dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
+			 quirk_nvidia_no_bus_reset);
+
 static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
 	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF,
 		 reset_intel_82599_sfp_virtfn },
--
2.17.1
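
The device ID check in the quirk uses a mask, which is easy to misread. Below is a small userspace sketch that assumes nothing beyond the two constants in the patch; the helper name nvidia_id_in_quirk_range() is hypothetical. Masking with 0xffc0 clears the low six bits of the ID, so the comparison matches the 64 device IDs from 0x2340 through 0x237f.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Mirrors the test in quirk_nvidia_no_bus_reset(): the low six bits of
 * the device ID are ignored, so every ID in 0x2340..0x237f matches.
 * The function name is made up for this sketch.
 */
static bool nvidia_id_in_quirk_range(uint16_t device)
{
	return (device & 0xffc0) == 0x2340;
}
```

IDs just outside the range, such as 0x233f and 0x2380, do not match, so the quirk leaves bus reset enabled for them.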

2021-04-27 14:58:55

by Shanker Donthineni

Subject: Re: [PATCH v2 1/2] PCI: Add support for a function level reset based on _RST method

Typo in the commit text; will post v3.
