2023-04-24 06:05:13

by Kai-Heng Feng

Subject: [PATCH v4 2/3] PCI/AER: Disable AER interrupt on suspend

PCIe service that shares IRQ with PME may cause spurious wakeup on
system suspend.

PCIe Base Spec 5.0, section 5.2 "Link State Power Management" states
that TLP and DLLP transmission is disabled for a Link in L2/L3 Ready
(D3hot), L2 (D3cold with aux power) and L3 (D3cold), so we don't lose
much here to disable AER during system suspend.

This is very similar to previous attempts to suspend AER and DPC [1],
but with a different reason.

[1] https://lore.kernel.org/linux-pci/[email protected]/
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216295

Reviewed-by: Mika Westerberg <[email protected]>
Signed-off-by: Kai-Heng Feng <[email protected]>
---
drivers/pci/pcie/aer.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 1420e1f27105..9c07fdbeb52d 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1356,6 +1356,26 @@ static int aer_probe(struct pcie_device *dev)
 	return 0;
 }

+static int aer_suspend(struct pcie_device *dev)
+{
+	struct aer_rpc *rpc = get_service_data(dev);
+	struct pci_dev *pdev = rpc->rpd;
+
+	aer_disable_irq(pdev);
+
+	return 0;
+}
+
+static int aer_resume(struct pcie_device *dev)
+{
+	struct aer_rpc *rpc = get_service_data(dev);
+	struct pci_dev *pdev = rpc->rpd;
+
+	aer_enable_irq(pdev);
+
+	return 0;
+}
+
 /**
  * aer_root_reset - reset Root Port hierarchy, RCEC, or RCiEP
  * @dev: pointer to Root Port, RCEC, or RCiEP
@@ -1420,6 +1440,8 @@ static struct pcie_port_service_driver aerdriver = {
 	.service	= PCIE_PORT_SERVICE_AER,

 	.probe		= aer_probe,
+	.suspend	= aer_suspend,
+	.resume		= aer_resume,
 	.remove		= aer_remove,
 };

--
2.34.1
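
Editorial sketch (not part of the patch): the diff above only calls
aer_disable_irq() and aer_enable_irq(), whose bodies are not shown in
this message. Assuming those helpers toggle the three interrupt-enable
bits of the AER Root Error Command register, the effect of the new
suspend/resume callbacks can be modelled in plain userspace C as below.
The struct and every model_* name are hypothetical stand-ins; only the
PCI_ERR_ROOT_CMD_* bit names and values are taken from pci_regs.h.

```c
#include <assert.h>
#include <stdint.h>

/* Interrupt-enable bits of the AER Root Error Command register
 * (names and values as in include/uapi/linux/pci_regs.h). */
#define PCI_ERR_ROOT_CMD_COR_EN		0x00000001
#define PCI_ERR_ROOT_CMD_NONFATAL_EN	0x00000002
#define PCI_ERR_ROOT_CMD_FATAL_EN	0x00000004

/* Hypothetical mask name for this sketch. */
#define MODEL_ROOT_INTR_MASK	(PCI_ERR_ROOT_CMD_COR_EN | \
				 PCI_ERR_ROOT_CMD_NONFATAL_EN | \
				 PCI_ERR_ROOT_CMD_FATAL_EN)

/* Hypothetical stand-in for a Root Port: one field models the register. */
struct model_root_port {
	uint32_t root_command;
};

/* Assumed behaviour of aer_disable_irq(): clear only the enable bits. */
static void model_aer_disable_irq(struct model_root_port *rp)
{
	rp->root_command &= ~(uint32_t)MODEL_ROOT_INTR_MASK;
}

/* Assumed behaviour of aer_enable_irq(): set the enable bits again. */
static void model_aer_enable_irq(struct model_root_port *rp)
{
	rp->root_command |= MODEL_ROOT_INTR_MASK;
}

/* Mirrors the shape of aer_suspend() in the patch. */
static int model_aer_suspend(struct model_root_port *rp)
{
	model_aer_disable_irq(rp);
	return 0;
}

/* Mirrors the shape of aer_resume() in the patch. */
static int model_aer_resume(struct model_root_port *rp)
{
	model_aer_enable_irq(rp);
	return 0;
}
```

In this model, errors can still be logged while suspended; only the
interrupt generation is switched off, and bits outside the mask are
left untouched.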


Subject: Re: [PATCH v4 2/3] PCI/AER: Disable AER interrupt on suspend



On 4/23/23 10:52 PM, Kai-Heng Feng wrote:
> PCIe service that shares IRQ with PME may cause spurious wakeup on
> system suspend.
>
> PCIe Base Spec 5.0, section 5.2 "Link State Power Management" states
> that TLP and DLLP transmission is disabled for a Link in L2/L3 Ready
> (D3hot), L2 (D3cold with aux power) and L3 (D3cold), so we don't lose
> much here to disable AER during system suspend.
>
> This is very similar to previous attempts to suspend AER and DPC [1],
> but with a different reason.
>
> [1] https://lore.kernel.org/linux-pci/[email protected]/
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216295
>
> Reviewed-by: Mika Westerberg <[email protected]>
> Signed-off-by: Kai-Heng Feng <[email protected]>
> ---

IIUC, you encounter AER errors during the suspend/resume process, which
results in AER IRQ. Because AER and PME share an IRQ, it is regarded as a
spurious wake-up IRQ. So to fix it, you want to disable AER reporting,
right?

It looks like it is harmless to disable the AER during the suspend/resume
path. But, I am wondering why we get these errors? Did you check what errors
you get during the suspend/resume path? Are these errors valid?


> drivers/pci/pcie/aer.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 1420e1f27105..9c07fdbeb52d 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -1356,6 +1356,26 @@ static int aer_probe(struct pcie_device *dev)
> return 0;
> }
>
> +static int aer_suspend(struct pcie_device *dev)
> +{
> + struct aer_rpc *rpc = get_service_data(dev);
> + struct pci_dev *pdev = rpc->rpd;
> +
> + aer_disable_irq(pdev);
> +
> + return 0;
> +}
> +
> +static int aer_resume(struct pcie_device *dev)
> +{
> + struct aer_rpc *rpc = get_service_data(dev);
> + struct pci_dev *pdev = rpc->rpd;
> +
> + aer_enable_irq(pdev);
> +
> + return 0;
> +}
> +
> /**
> * aer_root_reset - reset Root Port hierarchy, RCEC, or RCiEP
> * @dev: pointer to Root Port, RCEC, or RCiEP
> @@ -1420,6 +1440,8 @@ static struct pcie_port_service_driver aerdriver = {
> .service = PCIE_PORT_SERVICE_AER,
>
> .probe = aer_probe,
> + .suspend = aer_suspend,
> + .resume = aer_resume,
> .remove = aer_remove,
> };
>

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

Subject: Re: [PATCH v4 2/3] PCI/AER: Disable AER interrupt on suspend

Hi,

On 4/24/23 10:55 PM, Kai-Heng Feng wrote:
> On Tue, Apr 25, 2023 at 7:47 AM Sathyanarayanan Kuppuswamy
> <[email protected]> wrote:
>>
>>
>>
>> On 4/23/23 10:52 PM, Kai-Heng Feng wrote:
>>> PCIe service that shares IRQ with PME may cause spurious wakeup on
>>> system suspend.
>>>
>>> PCIe Base Spec 5.0, section 5.2 "Link State Power Management" states
>>> that TLP and DLLP transmission is disabled for a Link in L2/L3 Ready
>>> (D3hot), L2 (D3cold with aux power) and L3 (D3cold), so we don't lose
>>> much here to disable AER during system suspend.
>>>
>>> This is very similar to previous attempts to suspend AER and DPC [1],
>>> but with a different reason.
>>>
>>> [1] https://lore.kernel.org/linux-pci/[email protected]/
>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216295
>>>
>>> Reviewed-by: Mika Westerberg <[email protected]>
>>> Signed-off-by: Kai-Heng Feng <[email protected]>
>>> ---
>>
>> IIUC, you encounter AER errors during the suspend/resume process, which
>> results in AER IRQ. Because AER and PME share an IRQ, it is regarded as a
>> spurious wake-up IRQ. So to fix it, you want to disable AER reporting,
>> right?
>
> Yes. That's exactly what happened.
>
>>
>> It looks like it is harmless to disable the AER during the suspend/resume
>> path. But, I am wondering why we get these errors? Did you check what errors
>> you get during the suspend/resume path? Are these errors valid?
>
> I really don't know. I think it's similar to the reasoning in commit
> b07461a8e45b ("PCI/AER: Clear error status registers during
> enumeration and restore"): "AER errors might be recorded when
> powering-on devices. These errors can be ignored, ...".
> For this case, it happens when powering-off the device (D3cold) via
> turning off power resources.

Got it.

Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>

>
> Kai-Heng
>
>>
>>
>>> drivers/pci/pcie/aer.c | 22 ++++++++++++++++++++++
>>> 1 file changed, 22 insertions(+)
>>>
>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>> index 1420e1f27105..9c07fdbeb52d 100644
>>> --- a/drivers/pci/pcie/aer.c
>>> +++ b/drivers/pci/pcie/aer.c
>>> @@ -1356,6 +1356,26 @@ static int aer_probe(struct pcie_device *dev)
>>> return 0;
>>> }
>>>
>>> +static int aer_suspend(struct pcie_device *dev)
>>> +{
>>> + struct aer_rpc *rpc = get_service_data(dev);
>>> + struct pci_dev *pdev = rpc->rpd;
>>> +
>>> + aer_disable_irq(pdev);
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static int aer_resume(struct pcie_device *dev)
>>> +{
>>> + struct aer_rpc *rpc = get_service_data(dev);
>>> + struct pci_dev *pdev = rpc->rpd;
>>> +
>>> + aer_enable_irq(pdev);
>>> +
>>> + return 0;
>>> +}
>>> +
>>> /**
>>> * aer_root_reset - reset Root Port hierarchy, RCEC, or RCiEP
>>> * @dev: pointer to Root Port, RCEC, or RCiEP
>>> @@ -1420,6 +1440,8 @@ static struct pcie_port_service_driver aerdriver = {
>>> .service = PCIE_PORT_SERVICE_AER,
>>>
>>> .probe = aer_probe,
>>> + .suspend = aer_suspend,
>>> + .resume = aer_resume,
>>> .remove = aer_remove,
>>> };
>>>
>>
>> --
>> Sathyanarayanan Kuppuswamy
>> Linux Kernel Developer

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

2023-04-25 06:04:52

by Kai-Heng Feng

Subject: Re: [PATCH v4 2/3] PCI/AER: Disable AER interrupt on suspend

On Tue, Apr 25, 2023 at 7:47 AM Sathyanarayanan Kuppuswamy
<[email protected]> wrote:
>
>
>
> On 4/23/23 10:52 PM, Kai-Heng Feng wrote:
> > PCIe service that shares IRQ with PME may cause spurious wakeup on
> > system suspend.
> >
> > PCIe Base Spec 5.0, section 5.2 "Link State Power Management" states
> > that TLP and DLLP transmission is disabled for a Link in L2/L3 Ready
> > (D3hot), L2 (D3cold with aux power) and L3 (D3cold), so we don't lose
> > much here to disable AER during system suspend.
> >
> > This is very similar to previous attempts to suspend AER and DPC [1],
> > but with a different reason.
> >
> > [1] https://lore.kernel.org/linux-pci/[email protected]/
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=216295
> >
> > Reviewed-by: Mika Westerberg <[email protected]>
> > Signed-off-by: Kai-Heng Feng <[email protected]>
> > ---
>
> IIUC, you encounter AER errors during the suspend/resume process, which
> results in AER IRQ. Because AER and PME share an IRQ, it is regarded as a
> spurious wake-up IRQ. So to fix it, you want to disable AER reporting,
> right?

Yes. That's exactly what happened.

>
> It looks like it is harmless to disable the AER during the suspend/resume
> path. But, I am wondering why we get these errors? Did you check what errors
> you get during the suspend/resume path? Are these errors valid?

I really don't know. I think it's similar to the reasoning in commit
b07461a8e45b ("PCI/AER: Clear error status registers during
enumeration and restore"): "AER errors might be recorded when
powering-on devices. These errors can be ignored, ...".
For this case, it happens when powering-off the device (D3cold) via
turning off power resources.

Kai-Heng

>
>
> > drivers/pci/pcie/aer.c | 22 ++++++++++++++++++++++
> > 1 file changed, 22 insertions(+)
> >
> > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > index 1420e1f27105..9c07fdbeb52d 100644
> > --- a/drivers/pci/pcie/aer.c
> > +++ b/drivers/pci/pcie/aer.c
> > @@ -1356,6 +1356,26 @@ static int aer_probe(struct pcie_device *dev)
> > return 0;
> > }
> >
> > +static int aer_suspend(struct pcie_device *dev)
> > +{
> > + struct aer_rpc *rpc = get_service_data(dev);
> > + struct pci_dev *pdev = rpc->rpd;
> > +
> > + aer_disable_irq(pdev);
> > +
> > + return 0;
> > +}
> > +
> > +static int aer_resume(struct pcie_device *dev)
> > +{
> > + struct aer_rpc *rpc = get_service_data(dev);
> > + struct pci_dev *pdev = rpc->rpd;
> > +
> > + aer_enable_irq(pdev);
> > +
> > + return 0;
> > +}
> > +
> > /**
> > * aer_root_reset - reset Root Port hierarchy, RCEC, or RCiEP
> > * @dev: pointer to Root Port, RCEC, or RCiEP
> > @@ -1420,6 +1440,8 @@ static struct pcie_port_service_driver aerdriver = {
> > .service = PCIE_PORT_SERVICE_AER,
> >
> > .probe = aer_probe,
> > + .suspend = aer_suspend,
> > + .resume = aer_resume,
> > .remove = aer_remove,
> > };
> >
>
> --
> Sathyanarayanan Kuppuswamy
> Linux Kernel Developer
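
Editorial sketch (not from the thread): the exchange above confirms
that an AER interrupt raised while the device is being powered off is
taken for a wakeup because AER and PME share one IRQ. A deliberately
simplified userspace model of that reasoning follows. It is not kernel
code; the struct, its fields, and the model_* functions are all
hypothetical, and the single error injected during suspend stands in
for the errors reported when power resources are turned off.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one IRQ shared by the PME and AER services. */
struct model_shared_irq {
	bool wake_armed;	/* IRQ is armed as a wakeup source (for PME) */
	bool aer_irq_enabled;	/* Root Port interrupts on error messages */
	bool wakeup_pending;	/* armed IRQ fired during suspend */
};

/*
 * An error message reaches the Root Port.
 * Returns true if it raised the shared IRQ.
 */
static bool model_report_aer_error(struct model_shared_irq *irq)
{
	if (!irq->aer_irq_enabled)
		return false;	/* no interrupt is generated */

	/* The IRQ cannot tell AER from PME, so it counts as a wakeup. */
	if (irq->wake_armed)
		irq->wakeup_pending = true;
	return true;
}

/*
 * Returns true if the suspend completes, false if it is aborted by a
 * pending wakeup. disable_aer_first models the patch being applied.
 */
static bool model_suspend(struct model_shared_irq *irq, bool disable_aer_first)
{
	irq->wakeup_pending = false;
	irq->wake_armed = true;

	if (disable_aer_first)
		irq->aer_irq_enabled = false;

	/* Powering the device off produces an error in this scenario. */
	model_report_aer_error(irq);

	return !irq->wakeup_pending;
}
```

With the AER interrupt left enabled the modelled suspend is aborted;
with it disabled first, the same error no longer reaches the shared
IRQ and the suspend completes.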

2023-05-05 19:31:32

by Bjorn Helgaas

Subject: Re: [PATCH v4 2/3] PCI/AER: Disable AER interrupt on suspend

On Mon, Apr 24, 2023 at 01:52:48PM +0800, Kai-Heng Feng wrote:
> PCIe service that shares IRQ with PME may cause spurious wakeup on
> system suspend.
>
> PCIe Base Spec 5.0, section 5.2 "Link State Power Management" states
> that TLP and DLLP transmission is disabled for a Link in L2/L3 Ready
> (D3hot), L2 (D3cold with aux power) and L3 (D3cold), so we don't lose
> much here to disable AER during system suspend.
>
> This is very similar to previous attempts to suspend AER and DPC [1],
> but with a different reason.

What is the reason? I assume it's something to do with the bugzilla
below, but the commit log should outline the user-visible problem this
fixes. The commit log basically makes the case for "why should we
merge this patch."

I assume it's along the lines of "I tried to suspend this system, but
it immediately woke up again because of an AER interrupt, and
disabling AER during suspend avoids this problem. And disabling
the AER interrupt is not a problem because X"

> [1] https://lore.kernel.org/linux-pci/[email protected]/
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216295
>
> Reviewed-by: Mika Westerberg <[email protected]>
> Signed-off-by: Kai-Heng Feng <[email protected]>
> ---
> drivers/pci/pcie/aer.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 1420e1f27105..9c07fdbeb52d 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -1356,6 +1356,26 @@ static int aer_probe(struct pcie_device *dev)
> return 0;
> }
>
> +static int aer_suspend(struct pcie_device *dev)
> +{
> + struct aer_rpc *rpc = get_service_data(dev);
> + struct pci_dev *pdev = rpc->rpd;
> +
> + aer_disable_irq(pdev);
> +
> + return 0;
> +}
> +
> +static int aer_resume(struct pcie_device *dev)
> +{
> + struct aer_rpc *rpc = get_service_data(dev);
> + struct pci_dev *pdev = rpc->rpd;
> +
> + aer_enable_irq(pdev);
> +
> + return 0;
> +}
> +
> /**
> * aer_root_reset - reset Root Port hierarchy, RCEC, or RCiEP
> * @dev: pointer to Root Port, RCEC, or RCiEP
> @@ -1420,6 +1440,8 @@ static struct pcie_port_service_driver aerdriver = {
> .service = PCIE_PORT_SERVICE_AER,
>
> .probe = aer_probe,
> + .suspend = aer_suspend,
> + .resume = aer_resume,
> .remove = aer_remove,
> };
>
> --
> 2.34.1
>

2023-05-11 13:00:33

by Kai-Heng Feng

Subject: Re: [PATCH v4 2/3] PCI/AER: Disable AER interrupt on suspend

On Sat, May 6, 2023 at 3:22 AM Bjorn Helgaas <[email protected]> wrote:
>
> On Mon, Apr 24, 2023 at 01:52:48PM +0800, Kai-Heng Feng wrote:
> > PCIe service that shares IRQ with PME may cause spurious wakeup on
> > system suspend.
> >
> > PCIe Base Spec 5.0, section 5.2 "Link State Power Management" states
> > that TLP and DLLP transmission is disabled for a Link in L2/L3 Ready
> > (D3hot), L2 (D3cold with aux power) and L3 (D3cold), so we don't lose
> > much here to disable AER during system suspend.
> >
> > This is very similar to previous attempts to suspend AER and DPC [1],
> > but with a different reason.
>
> What is the reason? I assume it's something to do with the bugzilla
> below, but the commit log should outline the user-visible problem this
> fixes. The commit log basically makes the case for "why should we
> merge this patch."
>
> I assume it's along the lines of "I tried to suspend this system, but
> it immediately woke up again because of an AER interrupt, and
> disabling AER during suspend avoids this problem. And disabling
> the AER interrupt is not a problem because X"

Yes that's the reason :)
Will update the message to better reflect what's going on.

Kai-Heng

>
> > [1] https://lore.kernel.org/linux-pci/[email protected]/
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=216295
> >
> > Reviewed-by: Mika Westerberg <[email protected]>
> > Signed-off-by: Kai-Heng Feng <[email protected]>
> > ---
> > drivers/pci/pcie/aer.c | 22 ++++++++++++++++++++++
> > 1 file changed, 22 insertions(+)
> >
> > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > index 1420e1f27105..9c07fdbeb52d 100644
> > --- a/drivers/pci/pcie/aer.c
> > +++ b/drivers/pci/pcie/aer.c
> > @@ -1356,6 +1356,26 @@ static int aer_probe(struct pcie_device *dev)
> > return 0;
> > }
> >
> > +static int aer_suspend(struct pcie_device *dev)
> > +{
> > + struct aer_rpc *rpc = get_service_data(dev);
> > + struct pci_dev *pdev = rpc->rpd;
> > +
> > + aer_disable_irq(pdev);
> > +
> > + return 0;
> > +}
> > +
> > +static int aer_resume(struct pcie_device *dev)
> > +{
> > + struct aer_rpc *rpc = get_service_data(dev);
> > + struct pci_dev *pdev = rpc->rpd;
> > +
> > + aer_enable_irq(pdev);
> > +
> > + return 0;
> > +}
> > +
> > /**
> > * aer_root_reset - reset Root Port hierarchy, RCEC, or RCiEP
> > * @dev: pointer to Root Port, RCEC, or RCiEP
> > @@ -1420,6 +1440,8 @@ static struct pcie_port_service_driver aerdriver = {
> > .service = PCIE_PORT_SERVICE_AER,
> >
> > .probe = aer_probe,
> > + .suspend = aer_suspend,
> > + .resume = aer_resume,
> > .remove = aer_remove,
> > };
> >
> > --
> > 2.34.1
> >

2023-05-11 13:42:10

by Kai-Heng Feng

Subject: Re: [PATCH v4 2/3] PCI/AER: Disable AER interrupt on suspend

On Tue, Apr 25, 2023 at 7:47 AM Sathyanarayanan Kuppuswamy
<[email protected]> wrote:
>
>
>
> On 4/23/23 10:52 PM, Kai-Heng Feng wrote:
> > PCIe service that shares IRQ with PME may cause spurious wakeup on
> > system suspend.
> >
> > PCIe Base Spec 5.0, section 5.2 "Link State Power Management" states
> > that TLP and DLLP transmission is disabled for a Link in L2/L3 Ready
> > (D3hot), L2 (D3cold with aux power) and L3 (D3cold), so we don't lose
> > much here to disable AER during system suspend.
> >
> > This is very similar to previous attempts to suspend AER and DPC [1],
> > but with a different reason.
> >
> > [1] https://lore.kernel.org/linux-pci/[email protected]/
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=216295
> >
> > Reviewed-by: Mika Westerberg <[email protected]>
> > Signed-off-by: Kai-Heng Feng <[email protected]>
> > ---
>
> IIUC, you encounter AER errors during the suspend/resume process, which
> results in AER IRQ. Because AER and PME share an IRQ, it is regarded as a
> spurious wake-up IRQ. So to fix it, you want to disable AER reporting,
> right?
>
> It looks like it is harmless to disable the AER during the suspend/resume
> path. But, I am wondering why we get these errors? Did you check what errors
> you get during the suspend/resume path? Are these errors valid?

AFAIK those errors come from the firmware/hardware side, especially when
the device is put into D3hot/D3cold.

Kai-Heng

>
>
> > drivers/pci/pcie/aer.c | 22 ++++++++++++++++++++++
> > 1 file changed, 22 insertions(+)
> >
> > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > index 1420e1f27105..9c07fdbeb52d 100644
> > --- a/drivers/pci/pcie/aer.c
> > +++ b/drivers/pci/pcie/aer.c
> > @@ -1356,6 +1356,26 @@ static int aer_probe(struct pcie_device *dev)
> > return 0;
> > }
> >
> > +static int aer_suspend(struct pcie_device *dev)
> > +{
> > + struct aer_rpc *rpc = get_service_data(dev);
> > + struct pci_dev *pdev = rpc->rpd;
> > +
> > + aer_disable_irq(pdev);
> > +
> > + return 0;
> > +}
> > +
> > +static int aer_resume(struct pcie_device *dev)
> > +{
> > + struct aer_rpc *rpc = get_service_data(dev);
> > + struct pci_dev *pdev = rpc->rpd;
> > +
> > + aer_enable_irq(pdev);
> > +
> > + return 0;
> > +}
> > +
> > /**
> > * aer_root_reset - reset Root Port hierarchy, RCEC, or RCiEP
> > * @dev: pointer to Root Port, RCEC, or RCiEP
> > @@ -1420,6 +1440,8 @@ static struct pcie_port_service_driver aerdriver = {
> > .service = PCIE_PORT_SERVICE_AER,
> >
> > .probe = aer_probe,
> > + .suspend = aer_suspend,
> > + .resume = aer_resume,
> > .remove = aer_remove,
> > };
> >
>
> --
> Sathyanarayanan Kuppuswamy
> Linux Kernel Developer