2021-04-29 00:51:19

by Shanker Donthineni

[permalink] [raw]
Subject: [PATCH v4 2/2] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs

On select platforms, some Nvidia GPU devices do not work with SBR.
Triggering SBR would leave the device inoperable for the current
system boot. It requires a system hard-reboot to get the GPU device
back to normal operating condition post-SBR. For the affected
devices, enable NO_BUS_RESET quirk to fix the issue.

This issue will be fixed in the next generation of hardware.

Signed-off-by: Shanker Donthineni <[email protected]>
---
Changes since v1:
- Split patch into 2, code for handling _RST and SBR specific quirk
- The RST based reset is called as a first-class mechanism in the reset code path

drivers/pci/quirks.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 8f47d139c381..e1216a8165df 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3910,6 +3910,18 @@ static int delay_250ms_after_flr(struct pci_dev *dev, int probe)
return 0;
}

+/*
+ * Some Nvidia GPU devices do not work with bus reset, SBR needs to be
+ * prevented for those affected devices.
+ */
+static void quirk_nvidia_no_bus_reset(struct pci_dev *dev)
+{
+ if ((dev->device & 0xffc0) == 0x2340)
+ dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
+ quirk_nvidia_no_bus_reset);
+
static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF,
reset_intel_82599_sfp_virtfn },
--
2.17.1


2021-04-30 17:04:07

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs

On Wed, Apr 28, 2021 at 07:49:07PM -0500, Shanker Donthineni wrote:
> On select platforms, some Nvidia GPU devices do not work with SBR.
> Triggering SBR would leave the device inoperable for the current
> system boot. It requires a system hard-reboot to get the GPU device
> back to normal operating condition post-SBR. For the affected
> devices, enable NO_BUS_RESET quirk to fix the issue.

Since 1/2 adds _RST support, should I infer that _RST works on these
Nvidia GPUs even though SBR does not? If so, how does _RST do the
reset?

Do you have a root cause for why SBR doesn't work? I'm not super
confident that we perform resets correctly in general, and if the
problem is an issue in Linux, it'd be nice to fix that.

> This issue will be fixed in the next generation of hardware.
>
> Signed-off-by: Shanker Donthineni <[email protected]>
> ---
> Changes since v1:
> - Split patch into 2, code for handling _RST and SBR specific quirk
> - The RST based reset is called as a first-class mechanism in the reset code path
>
> drivers/pci/quirks.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 8f47d139c381..e1216a8165df 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3910,6 +3910,18 @@ static int delay_250ms_after_flr(struct pci_dev *dev, int probe)
> return 0;
> }
>
> +/*
> + * Some Nvidia GPU devices do not work with bus reset, SBR needs to be
> + * prevented for those affected devices.
> + */
> +static void quirk_nvidia_no_bus_reset(struct pci_dev *dev)
> +{
> + if ((dev->device & 0xffc0) == 0x2340)
> + dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;
> +}
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
> + quirk_nvidia_no_bus_reset);

Can you move this next to the existing quirk_no_bus_reset(), and maybe
even just call quirk_no_bus_reset(), e.g.,

if ((dev->device & 0xffc0) == 0x2340)
quirk_no_bus_reset(dev);

It doesn't look connected to this spot.

> static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
> { PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF,
> reset_intel_82599_sfp_virtfn },
> --
> 2.17.1
>

2021-04-30 22:12:52

by Shanker Donthineni

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs

Thanks Bjorn for reviewing patch.

On 4/30/21 12:01 PM, Bjorn Helgaas wrote:
> External email: Use caution opening links or attachments
>
>
> On Wed, Apr 28, 2021 at 07:49:07PM -0500, Shanker Donthineni wrote:
>> On select platforms, some Nvidia GPU devices do not work with SBR.
>> Triggering SBR would leave the device inoperable for the current
>> system boot. It requires a system hard-reboot to get the GPU device
>> back to normal operating condition post-SBR. For the affected
>> devices, enable NO_BUS_RESET quirk to fix the issue.
> Since 1/2 adds _RST support, should I infer that _RST works on these
> Nvidia GPUs even though SBR does not? If so, how does _RST do the
> reset?
Yes, _RST method works but not SBR. The _RST method in DSDT-AML uses
platform-specific initialization steps outside of the GPU BARs for resetting
the GPU device.
> Do you have a root cause for why SBR doesn't work?
It is a hardware implementation specific issue. GPU end-point device
is inoperative after receiving SBR from the RP/SwitchPort. This quirk is
to prevent SBR.

> I'm not super
> confident that we perform resets correctly in general, and if the
> problem is an issue in Linux, it'd be nice to fix that.
We have not seen any issue with Linux SBR implementation.
>
>> This issue will be fixed in the next generation of hardware.
>>
>> Signed-off-by: Shanker Donthineni <[email protected]>
>> ---
>> Changes since v1:
>> - Split patch into 2, code for handling _RST and SBR specific quirk
>> - The RST based reset is called as a first-class mechanism in the reset code path
>>
>> drivers/pci/quirks.c | 12 ++++++++++++
>> 1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> index 8f47d139c381..e1216a8165df 100644
>> --- a/drivers/pci/quirks.c
>> +++ b/drivers/pci/quirks.c
>> @@ -3910,6 +3910,18 @@ static int delay_250ms_after_flr(struct pci_dev *dev, int probe)
>> return 0;
>> }
>>
>> +/*
>> + * Some Nvidia GPU devices do not work with bus reset, SBR needs to be
>> + * prevented for those affected devices.
>> + */
>> +static void quirk_nvidia_no_bus_reset(struct pci_dev *dev)
>> +{
>> + if ((dev->device & 0xffc0) == 0x2340)
>> + dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;
>> +}
>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
>> + quirk_nvidia_no_bus_reset);
> Can you move this next to the existing quirk_no_bus_reset(), and maybe
> even just call quirk_no_bus_reset(), e.g.,
>
> if ((dev->device & 0xffc0) == 0x2340)
> quirk_no_bus_reset(dev);
>
> It doesn't look connected to this spot.
>
Thanks, I will move next to the existing NO_BUS_RESET quirks.
>> static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
>> { PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF,
>> reset_intel_82599_sfp_virtfn },
>> --
>> 2.17.1
>>

2021-05-03 22:44:13

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs

On Fri, Apr 30, 2021 at 05:11:23PM -0500, Shanker R Donthineni wrote:
> Thanks Bjorn for reviewing patch.
>
> On 4/30/21 12:01 PM, Bjorn Helgaas wrote:
> > External email: Use caution opening links or attachments
> >
> >
> > On Wed, Apr 28, 2021 at 07:49:07PM -0500, Shanker Donthineni wrote:
> >> On select platforms, some Nvidia GPU devices do not work with SBR.
> >> Triggering SBR would leave the device inoperable for the current
> >> system boot. It requires a system hard-reboot to get the GPU device
> >> back to normal operating condition post-SBR. For the affected
> >> devices, enable NO_BUS_RESET quirk to fix the issue.
> > Since 1/2 adds _RST support, should I infer that _RST works on these
> > Nvidia GPUs even though SBR does not? If so, how does _RST do the
> > reset?
> Yes, _RST method works but not SBR. The _RST method in DSDT-AML uses
> platform-specific initialization steps outside of the GPU BARs for resetting
> the GPU device.

Obviously _RST only works for built-in devices, since there's no AML
for plug-in devices, right? So if there's a plug-in card with this
GPU, neither SBR nor _RST will work?

> > Do you have a root cause for why SBR doesn't work?
> It is a hardware implementation specific issue. GPU end-point device
> is inoperative after receiving SBR from the RP/SwitchPort. This quirk is
> to prevent SBR.

That's not a root cause; it's basically still "it doesn't work." But
OK.

I'm wondering if we should log something to dmesg in
quirk_no_bus_reset(), quirk_no_pm_reset(), quirk_no_flr(), etc., just
so we have a hint about the fact that resets won't work quite as
expected on these devices.

Bjorn

2021-05-04 02:08:05

by Shanker Donthineni

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs



On 5/3/21 5:42 PM, Bjorn Helgaas wrote:
> Obviously _RST only works for built-in devices, since there's no AML
> for plug-in devices, right? So if there's a plug-in card with this
> GPU, neither SBR nor _RST will work?
These are not plug-in PCIe GPU cards, will exist on upcoming server
baseboards. ACPI-reset should wok for plug-in devices as well as long
as firmware has _RST method defined in ACPI-device associated with
the PCIe hot-plug slot.

I've verified PCIe plug-in feature using SYSFS interface.

1) Remove device using sysfs interface
  root@test:/sys/bus/pci# echo 1 > devices/0005:01:00.0/remove
  root@test:/sys/bus/pci# lspci -s 0005:01:00.0
 
2) Rescan PCI bus using sysfs interface
  root@test:/sys/bus/pci# echo 1 > devices/0005:00:00.0/rescan
  root@test:/sys/bus/pci# lspci -s 0005:01:00.0
  0005:01:00.0 3D controller: NVIDIA Corporation Device 2341 (rev a1)

3) List current reset methods
  root@jetson:/sys/bus/pci# cat devices/0005:01:00.0/reset_method
  acpi,flr

Example AML code:
 // Device definition for slot/devfn
  Device(GPU0) {
     Name(_ADR,0x00000000)
     Method (_RST, 0)
     {
        printf("Entering ACPI _RST method")
        // RESET code
        printf("Exiting ACPI _RST method")
     }
  }

4) Issue device reset from the userspace
 root@test:/sys/bus/pci# echo 1 > devices/0005:01:00.0/reset

dmesg:
 [ 6156.426303] ACPI Debug:  "Entering PCI9 _RST method"
 [ 6156.427007] ACPI Debug:  "Exiting PCI9 _RST method"

> I'm wondering if we should log something to dmesg in
> quirk_no_bus_reset(), quirk_no_pm_reset(), quirk_no_flr(), etc., just
> so we have a hint about the fact that resets won't work quite as
> expected on these devices.
Yes, it would be very useful to know what PCI quirks were applied during boot.
Should I create a separate patch for adding pci_info() or include as part of this
patch?
 
 --- a/drivers/pci/quirks.c
 +++ b/drivers/pci/quirks.c
 @@ -3556,6 +3556,7 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_MELLANOX, PCI_ANY_ID,
  static void quirk_no_bus_reset(struct pci_dev *dev)
  {
         dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;
       +pci_info(dev, "Applied NO_BUS_RESET quirk\n");
  }

  /*
 @@ -3598,6 +3599,7 @@ static void quirk_no_pm_reset(struct pci_dev *dev)
          */
         if (!pci_is_root_bus(dev->bus))
                 dev->dev_flags |= PCI_DEV_FLAGS_NO_PM_RESET;
        +pci_info(dev, "Applied NO_PM_RESET quirk\n");
  }

  /*
 @@ -5138,6 +5140,7 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap);
  static void quirk_no_flr(struct pci_dev *dev)
  {
         dev->dev_flags |= PCI_DEV_FLAGS_NO_FLR_RESET;
        +pci_info(dev, "Applied NO_FLR_RESET quirk\n");
  }


2021-05-05 02:54:22

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs

On Mon, May 03, 2021 at 09:07:11PM -0500, Shanker R Donthineni wrote:
> On 5/3/21 5:42 PM, Bjorn Helgaas wrote:
> > Obviously _RST only works for built-in devices, since there's no AML
> > for plug-in devices, right? So if there's a plug-in card with this
> > GPU, neither SBR nor _RST will work?
> These are not plug-in PCIe GPU cards, will exist on upcoming server
> baseboards. ACPI-reset should wok for plug-in devices as well as long
> as firmware has _RST method defined in ACPI-device associated with
> the PCIe hot-plug slot.

Maybe I'm missing something, but I don't see how _RST can work for
plug-in devices. _RST is part of the system firmware, and that
firmware knows nothing about what will be plugged into the slot. So
if system firmware supplies _RST that knows how to reset the Nvidia
GPU, it's not going to do the right thing if you plug in an NVMe
device instead.

Can you elaborate on how _RST would work for plug-in devices? My only
point here is that IF this GPU is ever on a plug-in card, neither _RST
nor SBR would work, so we'd have to use whatever other reset methods
*do* work (I guess only FLR?)

> I've verified PCIe plug-in feature using SYSFS interface.
>
> 1) Remove device using sysfs interface
> ? root@test:/sys/bus/pci# echo 1 > devices/0005:01:00.0/remove
> ? root@test:/sys/bus/pci# lspci -s 0005:01:00.0
> ?
> 2) Rescan PCI bus using sysfs interface
> ? root@test:/sys/bus/pci# echo 1 > devices/0005:00:00.0/rescan
> ? root@test:/sys/bus/pci# lspci -s 0005:01:00.0
> ? 0005:01:00.0 3D controller: NVIDIA Corporation Device 2341 (rev a1)
>
> 3) List current reset methods
> ? root@jetson:/sys/bus/pci# cat devices/0005:01:00.0/reset_method
> ? acpi,flr
>
> Example AML code:
> ?// Device definition for slot/devfn
> ? Device(GPU0) {
> ???? Name(_ADR,0x00000000)
> ???? Method (_RST, 0)
> ???? {
> ??????? printf("Entering ACPI _RST method")
> ??????? // RESET code
> ??????? printf("Exiting ACPI _RST method")
> ???? }
> ? }
>
> 4) Issue device reset from the userspace
> ?root@test:/sys/bus/pci# echo 1 > devices/0005:01:00.0/reset
>
> dmesg:
> ?[ 6156.426303] ACPI Debug:? "Entering PCI9 _RST method"
> ?[ 6156.427007] ACPI Debug:? "Exiting PCI9 _RST method"
>
> > I'm wondering if we should log something to dmesg in
> > quirk_no_bus_reset(), quirk_no_pm_reset(), quirk_no_flr(), etc., just
> > so we have a hint about the fact that resets won't work quite as
> > expected on these devices.
> Yes, it would be very useful to know what PCI quirks were applied
> during boot. Should I create a separate patch for adding pci_info()
> or include as part of this patch?

Don't include it as part of this patch. It's a separate logical
change so should be a separate patch. We can worry about that later.

> ?--- a/drivers/pci/quirks.c
> ?+++ b/drivers/pci/quirks.c
> ?@@ -3556,6 +3556,7 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_MELLANOX, PCI_ANY_ID,
> ? static void quirk_no_bus_reset(struct pci_dev *dev)
> ? {
> ? ? ???? dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;
> ?????? +pci_info(dev, "Applied NO_BUS_RESET quirk\n");
> ? }
>
> ? /*
> ?@@ -3598,6 +3599,7 @@ static void quirk_no_pm_reset(struct pci_dev *dev)
> ?? ?????? */
> ? ? ???? if (!pci_is_root_bus(dev->bus))
> ?? ? ??????????? dev->dev_flags |= PCI_DEV_FLAGS_NO_PM_RESET;
> ??????? +pci_info(dev, "Applied NO_PM_RESET quirk\n");
> ? }
>
> ? /*
> ?@@ -5138,6 +5140,7 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap);
> ? static void quirk_no_flr(struct pci_dev *dev)
> ? {
> ? ? ???? dev->dev_flags |= PCI_DEV_FLAGS_NO_FLR_RESET;
> ??????? +pci_info(dev, "Applied NO_FLR_RESET quirk\n");
> ? }
>
>

2021-05-05 03:57:55

by Oliver O'Halloran

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs

On Wed, May 5, 2021 at 12:50 PM Bjorn Helgaas <[email protected]> wrote:
>
> On Mon, May 03, 2021 at 09:07:11PM -0500, Shanker R Donthineni wrote:
> > On 5/3/21 5:42 PM, Bjorn Helgaas wrote:
> > > Obviously _RST only works for built-in devices, since there's no AML
> > > for plug-in devices, right? So if there's a plug-in card with this
> > > GPU, neither SBR nor _RST will work?
> > These are not plug-in PCIe GPU cards, will exist on upcoming server
> > baseboards. ACPI-reset should wok for plug-in devices as well as long
> > as firmware has _RST method defined in ACPI-device associated with
> > the PCIe hot-plug slot.
>
> Maybe I'm missing something, but I don't see how _RST can work for
> plug-in devices. _RST is part of the system firmware, and that
> firmware knows nothing about what will be plugged into the slot. So
> if system firmware supplies _RST that knows how to reset the Nvidia
> GPU, it's not going to do the right thing if you plug in an NVMe
> device instead.
>
> Can you elaborate on how _RST would work for plug-in devices?

Power cycling the slot or just re-asserting #PERST probably. IBM has
been doing that on Power boxes since forever and it mostly works.
Mostly.

2021-05-05 04:47:04

by Shanker Donthineni

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs


Hi Bjorn,

On 5/4/21 9:12 PM, Bjorn Helgaas wrote:
> Maybe I'm missing something, but I don't see how _RST can work for
> plug-in devices. _RST is part of the system firmware, and that
> firmware knows nothing about what will be plugged into the slot. So
> if system firmware supplies _RST that knows how to reset the Nvidia
> GPU, it's not going to do the right thing if you plug in an NVMe
> device instead.
>
> Can you elaborate on how _RST would work for plug-in devices? My only
> point here is that IF this GPU is ever on a plug-in card, neither _RST
> nor SBR would work, so we'd have to use whatever other reset methods
> *do* work (I guess only FLR?)
Sorry, if my explanation was not clear earlier.

These NVIDIA GPUs which need SBR quirk are not hot-plug-able devices, they
will exist only on server baseboards and directly attached to RP/SwitchPorts.
In this case ACPI based reset will be applied to GPUs always.

Agree the RST method is a platform specific firmware implementation and
can't be used for other devices without probing the device PCI config space.
It would be possible to enhance RST implementation, probe config space
using ECAM, check vendorID/deviceID, and do FLR if the non-GPU device is
plugged-in.



2021-05-05 12:17:54

by Pali Rohár

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs

On Friday 30 April 2021 17:11:23 Shanker R Donthineni wrote:
> Thanks Bjorn for reviewing patch.
>
> On 4/30/21 12:01 PM, Bjorn Helgaas wrote:
> > External email: Use caution opening links or attachments
> >
> >
> > On Wed, Apr 28, 2021 at 07:49:07PM -0500, Shanker Donthineni wrote:
> >> On select platforms, some Nvidia GPU devices do not work with SBR.
> >> Triggering SBR would leave the device inoperable for the current
> >> system boot. It requires a system hard-reboot to get the GPU device
> >> back to normal operating condition post-SBR. For the affected
> >> devices, enable NO_BUS_RESET quirk to fix the issue.
> > Since 1/2 adds _RST support, should I infer that _RST works on these
> > Nvidia GPUs even though SBR does not? If so, how does _RST do the
> > reset?
> Yes, _RST method works but not SBR. The _RST method in DSDT-AML uses
> platform-specific initialization steps outside of the GPU BARs for resetting
> the GPU device.

Hello! If I understood this "reset" issue correctly, it means that
affected PCIe GPU device cannot be reset via PCI Secondary Bus Reset
(PCIe Warm Reset) and some special, platform specific reset type needs
to be issued.

And code for this platform specific reset is included in ACPI DSDT
table.

But because ACPI DSDT table is part of BIOS/firmware and not part of the
PCIe GPU device itself, it means that this kind of reset is available to
linux kernel only in the case when vendor of motherboard (or who burn
BIOS/firmware into motherboard EEPROM) includes this specific code into
HW. Am I Right?

So if this PCIe GPU device is connected to other motherboard or other
system then this special platform reset in ACPI DSDT is not available.

What is doing default APCI _RST() method on motherboards without this
special platform reset hook? It probably would not be able to reset
these PCIe GPU devices if standard SBR cannot reset them.

Would not be better to include for these PCIe devices "native" linux
code for resetting them?

Please correct me if I'm wrong in my assumption or if I understood this
issue incorrectly.

> > Do you have a root cause for why SBR doesn't work?
> It is a hardware implementation specific issue. GPU end-point device
> is inoperative after receiving SBR from the RP/SwitchPort. This quirk is
> to prevent SBR.
>
> > I'm not super
> > confident that we perform resets correctly in general, and if the
> > problem is an issue in Linux, it'd be nice to fix that.
> We have not seen any issue with Linux SBR implementation.
> >
> >> This issue will be fixed in the next generation of hardware.

2021-05-05 15:40:30

by Shanker Donthineni

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs

Hi Pali,

On 5/5/21 7:15 AM, Pali Rohár wrote:
> Hello! If I understood this "reset" issue correctly, it means that
> affected PCIe GPU device cannot be reset via PCI Secondary Bus Reset
> (PCIe Warm Reset) and some special, platform specific reset type needs
> to be issued.
>
> And code for this platform specific reset is included in ACPI DSDT
> table.
Yes, correct.
> But because ACPI DSDT table is part of BIOS/firmware and not part of the
> PCIe GPU device itself, it means that this kind of reset is available to
> linux kernel only in the case when vendor of motherboard (or who burn
> BIOS/firmware into motherboard EEPROM) includes this specific code into
> HW. Am I Right?
ACPI specification provides a standard mechanism for a function level reset
using _RST method and should work for any OSPM not just Linux.

https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/resetting-and-recovering-a-device
ACPI firmware: Function-level reset
To support function-level device reset, there must be an _RST method defined inside the Device scope. If present, this method will override the bus driver's implementation of function-level device reset (if present) for that device. When executed, the _RST method must reset only that device, and must not affect other devices. In addition, the device must stay connected on the bus.
> So if this PCIe GPU device is connected to other motherboard or other
> system then this special platform reset in ACPI DSDT is not available.
PCI hw resets won't work. only way to reset the device using platform specific code.
> What is doing default APCI _RST() method on motherboards without this
> special platform reset hook? It probably would not be able to reset
> these PCIe GPU devices if standard SBR cannot reset them.
Yes, BIOS/firmware has to support where these affected  GPU devices are attached.
These GPU devices are not plug-in PCIe cards, only exist on server baseboards and
directly attached to PCIe fabric. 
> Would not be better to include for these PCIe devices "native" linux
> code for resetting them?
It requires complicated code sequence and has to access many platform specific
registers. We're taking advantage of OS independent standard ACPI-RST reset
mechanism for resting the GPU device.
> Please correct me if I'm wrong in my assumption or if I understood this
> issue incorrectly.
The GPU has side effects after triggering the SBR, it requires the system reboot to
bring the device back to the operating state, This workaround is to prevent SBR.

2021-05-05 19:15:20

by Alex Williamson

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs

On Wed, 5 May 2021 23:10:32 +0530
Amey Narkhede <[email protected]> wrote:

> On 21/05/05 01:56PM, Oliver O'Halloran wrote:
> > On Wed, May 5, 2021 at 12:50 PM Bjorn Helgaas <[email protected]> wrote:
> > >
> > > On Mon, May 03, 2021 at 09:07:11PM -0500, Shanker R Donthineni wrote:
> > > > On 5/3/21 5:42 PM, Bjorn Helgaas wrote:
> > > > > Obviously _RST only works for built-in devices, since there's no AML
> > > > > for plug-in devices, right? So if there's a plug-in card with this
> > > > > GPU, neither SBR nor _RST will work?
> > > > These are not plug-in PCIe GPU cards, will exist on upcoming server
> > > > baseboards. ACPI-reset should wok for plug-in devices as well as long
> > > > as firmware has _RST method defined in ACPI-device associated with
> > > > the PCIe hot-plug slot.
> > >
> > > Maybe I'm missing something, but I don't see how _RST can work for
> > > plug-in devices. _RST is part of the system firmware, and that
> > > firmware knows nothing about what will be plugged into the slot. So
> > > if system firmware supplies _RST that knows how to reset the Nvidia
> > > GPU, it's not going to do the right thing if you plug in an NVMe
> > > device instead.
> > >
> > > Can you elaborate on how _RST would work for plug-in devices?

I'm not sure I really understand these concerns about plug-in devices.
In this case I believe we're dealing with an embedded GPU, there is no
case where one of these GPUs would be a discrete device on a plug-in
card. I'm also assuming all SoCs integrating this GPU will provide a
_RST method, but we're also disabling SBR in this series to avoid the
only other generic reset option we'd have for this device.

In the more general case, I'd expect that system firmware isn't going
to implement an _RST method for a pluggable slot, so we'll lookup the
ACPI handle, fail to find a _RST method and drop to the next option.
For a PCI/e slot, at best the _RST method might be included in the _PRR
scope rather than the device scope to indicate it affects the entire
slot. That could be something like the #PERST below or a warm reset. I
don't think we're enabling that here, are we?

Otherwise system firmware would need to dynamically provide a _RST
method if it recognized and had support for the plugin card.

> > Power cycling the slot or just re-asserting #PERST probably. IBM has
> > been doing that on Power boxes since forever and it mostly works.
> > Mostly.
> According to ACPI spec v6.3 section 7.3.25, _RST just performs normal
> FLR in most cases but if the device supports _PRR(Power Resource for Reset)
> then reset operation causes the device to be reported as missing from the bus
> that indicates that it affects all the devices on the bus.

We're only looking for _RST on the device handle, so I think we're
limited to the device context limitations. Per the referenced section:

7.3.25 _RST (Device Reset)

This object executes a reset on the associated device or devices. If
^^
included in a device context, the reset must not affect any other
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ACPI-described devices; if included in a power resource for reset
^^^^^^^^^^^^^^^^^^^^^^
(_PRR, Section 7.3.26) the reset must affect all ACPI-described
devices that reference it.

When this object is described in a device context, it executes a
function level reset that only affects the device it is associated
with; neither parent nor children should be affected by the execution
of this reset. Executing this must only result in this device
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
resetting without the device appearing as if it has been removed from
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
the bus altogether, to prevent OSPM re-enumeration of devices on
^^^^^^^^^^^^^^^^^^
hot-pluggable buses (e.g. USB).

If a device reset is supported by the platform, but cannot meet the
function level and bus requirement, the device should instead
implement a _PRR (Section 7.3.26).

Devices can define both a _RST and a _PRR if supported by the
hardware.

Arguments: Non

Return Value: None


It's a bit unfortunate that they use the phrase "function level reset",
but since this method is not specific to a PCI device, I think this
could just as easily be replaced with "individual device scope reset".
The implementation of that could be an PCI FLR, or any number of device
or platform specific operations. To me this reads like a system
firmware provided, device specific reset. Thanks,

Alex

2021-05-05 21:02:10

by Amey Narkhede

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs

On 21/05/05 01:56PM, Oliver O'Halloran wrote:
> On Wed, May 5, 2021 at 12:50 PM Bjorn Helgaas <[email protected]> wrote:
> >
> > On Mon, May 03, 2021 at 09:07:11PM -0500, Shanker R Donthineni wrote:
> > > On 5/3/21 5:42 PM, Bjorn Helgaas wrote:
> > > > Obviously _RST only works for built-in devices, since there's no AML
> > > > for plug-in devices, right? So if there's a plug-in card with this
> > > > GPU, neither SBR nor _RST will work?
> > > These are not plug-in PCIe GPU cards, will exist on upcoming server
> > > baseboards. ACPI-reset should wok for plug-in devices as well as long
> > > as firmware has _RST method defined in ACPI-device associated with
> > > the PCIe hot-plug slot.
> >
> > Maybe I'm missing something, but I don't see how _RST can work for
> > plug-in devices. _RST is part of the system firmware, and that
> > firmware knows nothing about what will be plugged into the slot. So
> > if system firmware supplies _RST that knows how to reset the Nvidia
> > GPU, it's not going to do the right thing if you plug in an NVMe
> > device instead.
> >
> > Can you elaborate on how _RST would work for plug-in devices?
>
> Power cycling the slot or just re-asserting #PERST probably. IBM has
> been doing that on Power boxes since forever and it mostly works.
> Mostly.
According to ACPI spec v6.3 section 7.3.25, _RST just performs normal
FLR in most cases but if the device supports _PRR(Power Resource for Reset)
then reset operation causes the device to be reported as missing from the bus
that indicates that it affects all the devices on the bus.

Thanks,
Amey

2021-05-05 22:55:13

by Shanker Donthineni

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs

Thanks Alex for the detailed explanation.

On 5/5/21 2:13 PM, Alex Williamson wrote:
> I'm also assuming all SoCs integrating this GPU will provide a
> _RST method, but we're also disabling SBR in this series to avoid the
> only other generic reset option we'd have for this device.
All the platforms/SoCs which contain these GPUs will provide ACPI/firmware with
_RST method.
 
> In the more general case, I'd expect that system firmware isn't going
> to implement an _RST method for a pluggable slot, so we'll lookup the
> ACPI handle, fail to find a _RST method and drop to the next option.
> For a PCI/e slot, at best the _RST method might be included in the _PRR
> scope rather than the device scope to indicate it affects the entire
> slot. That could be something like the #PERST below or a warm reset. I
> don't think we're enabling that here, are we?
No, our_RST method will be included only in a device context (not _PRP).

2021-05-05 22:59:42

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs

On Wed, May 05, 2021 at 01:13:57PM -0600, Alex Williamson wrote:
> On Wed, 5 May 2021 23:10:32 +0530
> Amey Narkhede <[email protected]> wrote:
> > On 21/05/05 01:56PM, Oliver O'Halloran wrote:
> > > On Wed, May 5, 2021 at 12:50 PM Bjorn Helgaas <[email protected]> wrote:
> > > > On Mon, May 03, 2021 at 09:07:11PM -0500, Shanker R Donthineni wrote:
> > > > > On 5/3/21 5:42 PM, Bjorn Helgaas wrote:
> > > > > > Obviously _RST only works for built-in devices, since there's no AML
> > > > > > for plug-in devices, right? So if there's a plug-in card with this
> > > > > > GPU, neither SBR nor _RST will work?
> > > > > These are not plug-in PCIe GPU cards, will exist on upcoming server
> > > > > baseboards. ACPI-reset should wok for plug-in devices as well as long
> > > > > as firmware has _RST method defined in ACPI-device associated with
> > > > > the PCIe hot-plug slot.
> > > >
> > > > Maybe I'm missing something, but I don't see how _RST can work for
> > > > plug-in devices. _RST is part of the system firmware, and that
> > > > firmware knows nothing about what will be plugged into the slot. So
> > > > if system firmware supplies _RST that knows how to reset the Nvidia
> > > > GPU, it's not going to do the right thing if you plug in an NVMe
> > > > device instead.
> > > >
> > > > Can you elaborate on how _RST would work for plug-in devices?
>
> I'm not sure I really understand these concerns about plug-in devices.

I'm not really concerned about plug-in devices. Shanker said above
that _RST would work for them:

ACPI-reset should work for plug-in devices as well as long as
firmware has _RST method defined in ACPI-device associated with the
PCIe hot-plug slot.

and I disagreed. _RST *cannot* work for plug-in devices because
firmware doesn't know what device will be plugged in and therefore
cannot provide the required device-specific _RST.

That's all I wanted to clarify.

Bjorn