2022-05-14 03:02:53

by Wysocki, Rafael J

[permalink] [raw]
Subject: Re: [PATCH v5 2/2] PCI/PM: Fix pci_pm_suspend_noirq() to disable PTM

On 4/29/2022 11:05 PM, Rajvi Jingar wrote:
> For the PCIe devices (like nvme) that do not go into D3 state still need to
> disable PTM to allow the port to enter a lower-power PM state and the SoC
> to reach a lower-power idle state as a whole. Move the pci_disable_ptm()
> out of pci_prepare_to_sleep() as this code path is not followed for devices
> that do not go into D3. This fixes the issue seen on Dell XPS 9300 with
> Ice Lake CPU and Dell Precision 5530 with Coffee Lake CPU platforms to get
> improved residency in low power idle states.
>
> Also, on receiving a PTM Request from a downstream device, if PTM is
> disabled on the root port, as per PCIe r6.0, sec 6.21.3, such a request
> would cause an Unsupported Request error. So it must first disable PTM in
> any downstream devices.
>
> Fixes: a697f072f5da ("PCI: Disable PTM during suspend to save power")
> Signed-off-by: Rajvi Jingar <[email protected]>
> Suggested-by: David E. Box <[email protected]>
> ---
> v1 -> v2: add Fixes tag in commit message
> v2 -> v3: move changelog after "---" marker
> v3 -> v4: add "---" marker after changelog
> v4 -> v5: move pci_disable_ptm() out of the pci_dev->state_saved check.
> disable PTM for all devices, not just root ports.
> ---
> drivers/pci/pci-driver.c | 28 +++++++++++++++++++---------
> drivers/pci/pci.c | 10 ----------
> 2 files changed, 19 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 8b55a90126a2..400dd18a9cf5 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -813,6 +813,7 @@ static int pci_pm_suspend_late(struct device *dev)
>
> static int pci_pm_suspend_noirq(struct device *dev)
> {
> + unsigned int dev_state_saved;
> struct pci_dev *pci_dev = to_pci_dev(dev);
> const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
>
> @@ -845,16 +846,25 @@ static int pci_pm_suspend_noirq(struct device *dev)
> }
> }
>
> - if (!pci_dev->state_saved) {
> + dev_state_saved = pci_dev->state_saved;

If pci_dev->state_saved is set here, the device may be in D3cold already
and disabling PTM for it will not work.  Of course, it is not necessary
to disable PTM for it then, but this case need to be taken care of.

> + if (!dev_state_saved)
> pci_save_state(pci_dev);
> - /*
> - * If the device is a bridge with a child in D0 below it, it needs to
> - * stay in D0, so check skip_bus_pm to avoid putting it into a
> - * low-power state in that case.
> - */
> - if (!pci_dev->skip_bus_pm && pci_power_manageable(pci_dev))
> - pci_prepare_to_sleep(pci_dev);
> - }
> +
> + /*
> + * There are systems (for example, Intel mobile chips since Coffee
> + * Lake) where the power drawn while suspended can be significantly
> + * reduced by disabling PTM as this allows the SoC to reach a
> + * lower-power idle state as a whole.
> + */

Something like this should suffice IMV:

if (!dev_state_saved || pci_dev->current_state != PCI_D3cold)

        pci_disable_ptm(pci_dev);


> + pci_disable_ptm(pci_dev);
> +
> + /* If the device is a bridge with a child in D0 below it, it needs to
> + * stay in D0, so check skip_bus_pm to avoid putting it into a
> + * low-power state in that case.
> + */
> + if (!dev_state_saved && !pci_dev->skip_bus_pm &&
> + pci_power_manageable(pci_dev))
> + pci_prepare_to_sleep(pci_dev);
>
> pci_dbg(pci_dev, "PCI PM: Suspend power state: %s\n",
> pci_power_name(pci_dev->current_state));
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 9ecce435fb3f..f8768672c064 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2660,16 +2660,6 @@ int pci_prepare_to_sleep(struct pci_dev *dev)
> if (target_state == PCI_POWER_ERROR)
> return -EIO;
>
> - /*
> - * There are systems (for example, Intel mobile chips since Coffee
> - * Lake) where the power drawn while suspended can be significantly
> - * reduced by disabling PTM on PCIe root ports as this allows the
> - * port to enter a lower-power PM state and the SoC to reach a
> - * lower-power idle state as a whole.
> - */
> - if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
> - pci_disable_ptm(dev);
> -
> pci_enable_wake(dev, target_state, wakeup);
>
> error = pci_set_power_state(dev, target_state);

And the restoration of the PTM state on errors in pci_prepare_to_sleep()
is not needed any more.




2022-05-14 03:22:09

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v5 2/2] PCI/PM: Fix pci_pm_suspend_noirq() to disable PTM

Hi Rajvi,

I received your v1, v2, v3, v4, v5 postings because they were sent
directly to [email protected], but for some reason vger doesn't like
them so they don't show up on the mailing list:

https://lore.kernel.org/all/?q=a%3Arajvi.jingar

I looked at the ones I received directly and don't see an obvious
problem. Maybe there's a hint here?

http://vger.kernel.org/majordomo-info.html

All patches should appear on the linux-pci mailing list before
applying them, so we need to figure this out somehow. In fact, I read
and review patches from linux-pci, so I often don't even see things
that are just sent directly to [email protected].

On Thu, May 12, 2022 at 03:49:18PM +0200, Rafael J. Wysocki wrote:
> On 4/29/2022 11:05 PM, Rajvi Jingar wrote:
> > For the PCIe devices (like nvme) that do not go into D3 state still need to
> > disable PTM to allow the port to enter a lower-power PM state and the SoC
> > to reach a lower-power idle state as a whole. Move the pci_disable_ptm()
> > out of pci_prepare_to_sleep() as this code path is not followed for devices
> > that do not go into D3. This fixes the issue seen on Dell XPS 9300 with
> > Ice Lake CPU and Dell Precision 5530 with Coffee Lake CPU platforms to get
> > improved residency in low power idle states.

I think the paragraph above is a distraction, and the real reason is
the paragraph below.

> > Also, on receiving a PTM Request from a downstream device, if PTM is
> > disabled on the root port, as per PCIe r6.0, sec 6.21.3, such a request
> > would cause an Unsupported Request error. So it must first disable PTM in
> > any downstream devices.
> >
> > Fixes: a697f072f5da ("PCI: Disable PTM during suspend to save power")
> > Signed-off-by: Rajvi Jingar <[email protected]>
> > Suggested-by: David E. Box <[email protected]>
> > ---
> > v1 -> v2: add Fixes tag in commit message
> > v2 -> v3: move changelog after "---" marker
> > v3 -> v4: add "---" marker after changelog
> > v4 -> v5: move pci_disable_ptm() out of the pci_dev->state_saved check.
> > disable PTM for all devices, not just root ports.
> > ---
> > drivers/pci/pci-driver.c | 28 +++++++++++++++++++---------
> > drivers/pci/pci.c | 10 ----------
> > 2 files changed, 19 insertions(+), 19 deletions(-)
> >
> > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > index 8b55a90126a2..400dd18a9cf5 100644
> > --- a/drivers/pci/pci-driver.c
> > +++ b/drivers/pci/pci-driver.c
> > @@ -813,6 +813,7 @@ static int pci_pm_suspend_late(struct device *dev)
> > static int pci_pm_suspend_noirq(struct device *dev)
> > {
> > + unsigned int dev_state_saved;
> > struct pci_dev *pci_dev = to_pci_dev(dev);
> > const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
> > @@ -845,16 +846,25 @@ static int pci_pm_suspend_noirq(struct device *dev)
> > }
> > }
> > - if (!pci_dev->state_saved) {
> > + dev_state_saved = pci_dev->state_saved;
>
> If pci_dev->state_saved is set here, the device may be in D3cold already and
> disabling PTM for it will not work.? Of course, it is not necessary to
> disable PTM for it then, but this case need to be taken care of.
>
> > + if (!dev_state_saved)
> > pci_save_state(pci_dev);
> > - /*
> > - * If the device is a bridge with a child in D0 below it, it needs to
> > - * stay in D0, so check skip_bus_pm to avoid putting it into a
> > - * low-power state in that case.
> > - */
> > - if (!pci_dev->skip_bus_pm && pci_power_manageable(pci_dev))
> > - pci_prepare_to_sleep(pci_dev);
> > - }
> > +
> > + /*
> > + * There are systems (for example, Intel mobile chips since Coffee
> > + * Lake) where the power drawn while suspended can be significantly
> > + * reduced by disabling PTM as this allows the SoC to reach a
> > + * lower-power idle state as a whole.

I think the argument for disabling PTM is that:

- If a PTM Requester is put in a low-power state, a PTM Responder
upstream from it may also be put in a low-power state.

- Putting a Port in D1, D2, or D3hot does not prohibit it from
sending or responding to PTM Requests (I'd be glad to be corrected
about this).

- We want to disable PTM on Responders when they are in a low-power
state.

- Per 6.21.3, a PTM Requester must not be enabled when the upstream
PTM Responder is disabled.

- Therefore, we must disable all PTM on all downstream PTM
Requesters before disabling it on the PTM Responder, e.g., a Root
Port.

This has nothing specifically to do with Coffee Lake or other Intel
chips, so I think the comment should be merely something to the
effect that "disabling PTM reduces power consumption."

> Something like this should suffice IMV:
>
> if (!dev_state_saved || pci_dev->current_state != PCI_D3cold)
>
> ??????? pci_disable_ptm(pci_dev);

It makes sense to me that we needn't disable PTM if the device is in
D3cold. But the "!dev_state_saved" condition depends on what the
driver did. Why is that important? Why should we not do the
following?

if (pci_dev->current_state != PCI_D3cold)
pci_disable_ptm(pci_dev);

Bjorn