2022-05-14 00:18:13

by Bjorn Helgaas

Subject: Re: [PATCH v5 2/2] PCI/PM: Fix pci_pm_suspend_noirq() to disable PTM

On Thu, May 12, 2022 at 07:52:36PM +0200, Rafael J. Wysocki wrote:
> On Thu, May 12, 2022 at 7:42 PM Bjorn Helgaas <[email protected]> wrote:
> > On Thu, May 12, 2022 at 03:49:18PM +0200, Rafael J. Wysocki wrote:

> > > Something like this should suffice IMV:
> > >
> > >     if (!dev_state_saved || pci_dev->current_state != PCI_D3cold)
> > >             pci_disable_ptm(pci_dev);
> >
> > It makes sense to me that we needn't disable PTM if the device is in
> > D3cold. But the "!dev_state_saved" condition depends on what the
> > driver did. Why is that important? Why should we not do the
> > following?
> >
> >     if (pci_dev->current_state != PCI_D3cold)
> >             pci_disable_ptm(pci_dev);
>
> We can do this too. I thought we could skip the power state check if
> dev_state_saved was unset, because then we would know that the power
> state was not D3cold. It probably isn't worth the hassle though.

Ah, thanks. IMHO it's easier to analyze for correctness if we only
check the power state.

Bjorn


2022-05-14 00:49:57

by Jingar, Rajvi

Subject: RE: [PATCH v5 2/2] PCI/PM: Fix pci_pm_suspend_noirq() to disable PTM


> -----Original Message-----
> From: Bjorn Helgaas <[email protected]>
> Sent: Thursday, May 12, 2022 11:36 AM
> To: Rafael J. Wysocki <[email protected]>
> Cc: Jingar, Rajvi <[email protected]>; Wysocki, Rafael J
> <[email protected]>; Bjorn Helgaas <[email protected]>; David Box
> <[email protected]>; Linux PCI <[email protected]>; Linux
> Kernel Mailing List <[email protected]>; Linux PM <linux-
> [email protected]>
> Subject: Re: [PATCH v5 2/2] PCI/PM: Fix pci_pm_suspend_noirq() to disable PTM
>
> On Thu, May 12, 2022 at 07:52:36PM +0200, Rafael J. Wysocki wrote:
> > On Thu, May 12, 2022 at 7:42 PM Bjorn Helgaas <[email protected]> wrote:
> > > On Thu, May 12, 2022 at 03:49:18PM +0200, Rafael J. Wysocki wrote:
>
> > > > Something like this should suffice IMV:
> > > >
> > > >     if (!dev_state_saved || pci_dev->current_state != PCI_D3cold)
> > > >             pci_disable_ptm(pci_dev);
> > >
> > > It makes sense to me that we needn't disable PTM if the device is in
> > > D3cold. But the "!dev_state_saved" condition depends on what the
> > > driver did. Why is that important? Why should we not do the
> > > following?
> > >
> > >     if (pci_dev->current_state != PCI_D3cold)
> > >             pci_disable_ptm(pci_dev);
> >
> > We can do this too. I thought we could skip the power state check if
> > dev_state_saved was unset, because then we would know that the power
> > state was not D3cold. It probably isn't worth the hassle though.
>

We see an issue on certain platforms where checking only whether the
device is in D3cold is not enough; the !dev_state_saved check is also
needed when disabling PTM. A device like NVMe relies on ASPM: it stays
in D0, but its state is saved. Touching the config space wakes up the
device, which prevents the system from entering a low-power state.

The following would fix the issue:

    if (!pci_dev->state_saved) {
            pci_save_state(pci_dev);

            pci_disable_ptm(pci_dev);

            if (!pci_dev->skip_bus_pm && pci_power_manageable(pci_dev))
                    pci_prepare_to_sleep(pci_dev);
    }

> Ah, thanks. IMHO it's easier to analyze for correctness if we only
> check the power state.
>
> Bjorn

2022-05-15 10:33:20

by Rafael J. Wysocki

Subject: Re: [PATCH v5 2/2] PCI/PM: Fix pci_pm_suspend_noirq() to disable PTM

On Sat, May 14, 2022 at 12:01 AM Jingar, Rajvi <[email protected]> wrote:
>
>
> >
> > On Thu, May 12, 2022 at 07:52:36PM +0200, Rafael J. Wysocki wrote:
> > > On Thu, May 12, 2022 at 7:42 PM Bjorn Helgaas <[email protected]> wrote:
> > > > On Thu, May 12, 2022 at 03:49:18PM +0200, Rafael J. Wysocki wrote:
> >
> > > > > Something like this should suffice IMV:
> > > > >
> > > > >     if (!dev_state_saved || pci_dev->current_state != PCI_D3cold)
> > > > >             pci_disable_ptm(pci_dev);
> > > >
> > > > It makes sense to me that we needn't disable PTM if the device is in
> > > > D3cold. But the "!dev_state_saved" condition depends on what the
> > > > driver did. Why is that important? Why should we not do the
> > > > following?
> > > >
> > > >     if (pci_dev->current_state != PCI_D3cold)
> > > >             pci_disable_ptm(pci_dev);
> > >
> > > We can do this too. I thought we could skip the power state check if
> > > dev_state_saved was unset, because then we would know that the power
> > > state was not D3cold. It probably isn't worth the hassle though.
> >
>
> We see issue with certain platforms where only checking if device power
> state in D3Cold is not enough and the !dev_state_saved check is needed
> when disabling PTM. Device like nvme is relying on ASPM, it stays in D0 but
> state is saved. Touching the config space wakes up the device which
> prevents the system from entering into low power state.
>
> Following would fix the issue:
>
>     if (!pci_dev->state_saved) {
>             pci_save_state(pci_dev);
>
>             pci_disable_ptm(pci_dev);
>
>             if (!pci_dev->skip_bus_pm && pci_power_manageable(pci_dev))
>                     pci_prepare_to_sleep(pci_dev);
>     }

Well, the point is to also disable PTM for devices that were put into
D3 by their drivers.

In addition to D3cold, the check could also skip D0, that is:

    if (pci_dev->current_state > PCI_D0 && pci_dev->current_state < PCI_D3cold)
            pci_disable_ptm(pci_dev);

> > Ah, thanks. IMHO it's easier to analyze for correctness if we only
> > check the power state.
> >
> > Bjorn

2022-05-17 00:09:47

by Bjorn Helgaas

Subject: Re: [PATCH v5 2/2] PCI/PM: Fix pci_pm_suspend_noirq() to disable PTM

On Fri, May 13, 2022 at 10:00:48PM +0000, Jingar, Rajvi wrote:
>
> >
> > On Thu, May 12, 2022 at 07:52:36PM +0200, Rafael J. Wysocki wrote:
> > > On Thu, May 12, 2022 at 7:42 PM Bjorn Helgaas <[email protected]> wrote:
> > > > On Thu, May 12, 2022 at 03:49:18PM +0200, Rafael J. Wysocki wrote:
> >
> > > > > Something like this should suffice IMV:
> > > > >
> > > > >     if (!dev_state_saved || pci_dev->current_state != PCI_D3cold)
> > > > >             pci_disable_ptm(pci_dev);
> > > >
> > > > It makes sense to me that we needn't disable PTM if the device is in
> > > > D3cold. But the "!dev_state_saved" condition depends on what the
> > > > driver did. Why is that important? Why should we not do the
> > > > following?
> > > >
> > > >     if (pci_dev->current_state != PCI_D3cold)
> > > >             pci_disable_ptm(pci_dev);
> > >
> > > We can do this too. I thought we could skip the power state
> > > check if dev_state_saved was unset, because then we would know
> > > that the power state was not D3cold. It probably isn't worth
> > > the hassle though.
>
> We see issue with certain platforms where only checking if device
> power state in D3Cold is not enough and the !dev_state_saved check
> is needed when disabling PTM. Device like nvme is relying on ASPM,
> it stays in D0 but state is saved. Touching the config space wakes
> up the device which prevents the system from entering into low power
> state.

Correct me if I'm wrong: for NVMe devices, nvme_suspend() has already
saved state and put the device in some low-power state. Disabling PTM
here is functionally OK but prevents a system low power state, so you
want to leave PTM enabled.

But I must be missing something because pci_prepare_to_sleep()
currently disables PTM for Root Ports. If we leave PTM enabled on
NVMe but disable it on the Root Port above it, any PTM Request from
NVMe will cause an Unsupported Request error.

Disabling PTM must be coordinated across PTM Requesters and PTM
Responders. That means the decision to disable cannot depend on
driver-specific things like whether the driver has saved state.
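
The coordination point above can be illustrated with a toy user-space
model (a sketch only, not kernel code). The struct, field, and function
names below are hypothetical; the one rule it encodes, that a PTM
Request arriving at an upstream port with PTM disabled is an error, is
taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device in a PCIe hierarchy. */
struct ptm_node {
	const char *name;
	const struct ptm_node *upstream; /* NULL for a Root Port */
	bool ptm_enabled;
};

/*
 * Returns true if the PTM settings along the path are consistent:
 * either the device issues no PTM Requests at all, or every port
 * above it still has PTM enabled and can respond.
 */
static bool ptm_path_consistent(const struct ptm_node *dev)
{
	if (!dev->ptm_enabled)
		return true; /* a disabled Requester sends nothing */

	for (const struct ptm_node *p = dev->upstream; p; p = p->upstream)
		if (!p->ptm_enabled)
			return false; /* request would hit a disabled port */

	return true;
}
```

With the Root Port disabled and the endpoint still enabled, the check
fails; disabling both, or leaving both enabled, passes.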

> Following would fix the issue:
>
>     if (!pci_dev->state_saved) {
>             pci_save_state(pci_dev);
>
>             pci_disable_ptm(pci_dev);
>
>             if (!pci_dev->skip_bus_pm && pci_power_manageable(pci_dev))
>                     pci_prepare_to_sleep(pci_dev);
>     }
>
> > Ah, thanks. IMHO it's easier to analyze for correctness if we only
> > check the power state.
> >
> > Bjorn

2022-05-17 02:55:45

by Rafael J. Wysocki

Subject: Re: [PATCH v5 2/2] PCI/PM: Fix pci_pm_suspend_noirq() to disable PTM

On Mon, May 16, 2022 at 10:09 PM Bjorn Helgaas <[email protected]> wrote:
>
> On Fri, May 13, 2022 at 10:00:48PM +0000, Jingar, Rajvi wrote:
> >
> > >
> > > On Thu, May 12, 2022 at 07:52:36PM +0200, Rafael J. Wysocki wrote:
> > > > On Thu, May 12, 2022 at 7:42 PM Bjorn Helgaas <[email protected]> wrote:
> > > > > On Thu, May 12, 2022 at 03:49:18PM +0200, Rafael J. Wysocki wrote:
> > >
> > > > > > Something like this should suffice IMV:
> > > > > >
> > > > > >     if (!dev_state_saved || pci_dev->current_state != PCI_D3cold)
> > > > > >             pci_disable_ptm(pci_dev);
> > > > >
> > > > > It makes sense to me that we needn't disable PTM if the device is in
> > > > > D3cold. But the "!dev_state_saved" condition depends on what the
> > > > > driver did. Why is that important? Why should we not do the
> > > > > following?
> > > > >
> > > > >     if (pci_dev->current_state != PCI_D3cold)
> > > > >             pci_disable_ptm(pci_dev);
> > > >
> > > > We can do this too. I thought we could skip the power state
> > > > check if dev_state_saved was unset, because then we would know
> > > > that the power state was not D3cold. It probably isn't worth
> > > > the hassle though.
> >
> > We see issue with certain platforms where only checking if device
> > power state in D3Cold is not enough and the !dev_state_saved check
> > is needed when disabling PTM. Device like nvme is relying on ASPM,
> > it stays in D0 but state is saved. Touching the config space wakes
> > up the device which prevents the system from entering into low power
> > state.
>
> Correct me if I'm wrong: for NVMe devices, nvme_suspend() has already
> saved state and put the device in some low-power state. Disabling PTM
> here is functionally OK but prevents a system low power state, so you
> want to leave PTM enabled.
>
> But I must be missing something because pci_prepare_to_sleep()
> currently disables PTM for Root Ports. If we leave PTM enabled on
> NVMe but disable it on the Root Port above it, any PTM Request from
> NVMe will cause an Unsupported Request error.
>
> Disabling PTM must be coordinated across PTM Requesters and PTM
> Responders. That means the decision to disable cannot depend on
> driver-specific things like whether the driver has saved state.

Setting state_saved generally informs pci_pm_suspend_noirq() that the
device has already been handled and it doesn't need to do anything to
it.

But you are right that PTM should be disabled on downstream devices as
well as on the ports that those devices are connected to and it can be
done even if the given device has already been handled, so the
state_saved value is technically irrelevant.

That's why I suggested checking whether the power state is strictly
between D0 and D3cold and disabling PTM only in that case. It is
pointless to disable PTM for devices in D3cold, and it may be harmful
for devices that are left in D0.
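
To compare the conditions discussed so far, here is a small stand-alone
C sketch (user-space, with stand-in types; none of these helpers exist
in the kernel). It only evaluates each candidate condition for a given
power state and state_saved value; the enum ordering is assumed to
mirror PCI_D0 < PCI_D1 < PCI_D2 < PCI_D3hot < PCI_D3cold.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for pci_power_t; only the ordering matters here. */
enum pm_state { PM_D0, PM_D1, PM_D2, PM_D3HOT, PM_D3COLD };

/* Hypothetical snapshot of a device when suspend_noirq runs. */
struct pm_case {
	enum pm_state current_state;
	bool state_saved; /* set if the driver already saved state */
};

/* First suggestion: !dev_state_saved || state != D3cold */
static bool cond_not_saved_or_not_d3cold(const struct pm_case *c)
{
	return !c->state_saved || c->current_state != PM_D3COLD;
}

/* Power state only: state != D3cold */
static bool cond_not_d3cold(const struct pm_case *c)
{
	return c->current_state != PM_D3COLD;
}

/* Driver-dependent: disable only if state was not saved */
static bool cond_not_saved(const struct pm_case *c)
{
	return !c->state_saved;
}

/* Range check: D0 < state < D3cold */
static bool cond_between_d0_and_d3cold(const struct pm_case *c)
{
	return c->current_state > PM_D0 && c->current_state < PM_D3COLD;
}
```

For a device left in D0 with state saved (the NVMe case described in
this thread), only the last two conditions skip the PTM disable. For a
device that its driver put into D3hot, the state_saved-only condition
skips it as well, which is the case that should still be covered. Only
the range check handles all three cases as intended.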

2022-05-18 03:40:04

by Bjorn Helgaas

Subject: Re: [PATCH v5 2/2] PCI/PM: Fix pci_pm_suspend_noirq() to disable PTM

On Mon, May 16, 2022 at 10:59:32PM +0200, Rafael J. Wysocki wrote:
> On Mon, May 16, 2022 at 10:09 PM Bjorn Helgaas <[email protected]> wrote:
> > On Fri, May 13, 2022 at 10:00:48PM +0000, Jingar, Rajvi wrote:
> > > >
> > > > On Thu, May 12, 2022 at 07:52:36PM +0200, Rafael J. Wysocki wrote:
> > > > > On Thu, May 12, 2022 at 7:42 PM Bjorn Helgaas <[email protected]> wrote:
> > > > > > On Thu, May 12, 2022 at 03:49:18PM +0200, Rafael J. Wysocki wrote:
> > > >
> > > > > > > Something like this should suffice IMV:
> > > > > > >
> > > > > > >     if (!dev_state_saved || pci_dev->current_state != PCI_D3cold)
> > > > > > >             pci_disable_ptm(pci_dev);
> > > > > >
> > > > > > It makes sense to me that we needn't disable PTM if the device is in
> > > > > > D3cold. But the "!dev_state_saved" condition depends on what the
> > > > > > driver did. Why is that important? Why should we not do the
> > > > > > following?
> > > > > >
> > > > > >     if (pci_dev->current_state != PCI_D3cold)
> > > > > >             pci_disable_ptm(pci_dev);
> > > > >
> > > > > We can do this too. I thought we could skip the power state
> > > > > check if dev_state_saved was unset, because then we would know
> > > > > that the power state was not D3cold. It probably isn't worth
> > > > > the hassle though.
> > >
> > > We see issue with certain platforms where only checking if device
> > > power state in D3Cold is not enough and the !dev_state_saved check
> > > is needed when disabling PTM. Device like nvme is relying on ASPM,
> > > it stays in D0 but state is saved. Touching the config space wakes
> > > up the device which prevents the system from entering into low power
> > > state.
> >
> > Correct me if I'm wrong: for NVMe devices, nvme_suspend() has already
> > saved state and put the device in some low-power state. Disabling PTM
> > here is functionally OK but prevents a system low power state, so you
> > want to leave PTM enabled.
> >
> > But I must be missing something because pci_prepare_to_sleep()
> > currently disables PTM for Root Ports. If we leave PTM enabled on
> > NVMe but disable it on the Root Port above it, any PTM Request from
> > NVMe will cause an Unsupported Request error.
> >
> > Disabling PTM must be coordinated across PTM Requesters and PTM
> > Responders. That means the decision to disable cannot depend on
> > driver-specific things like whether the driver has saved state.
>
> Setting state_saved generally informs pci_pm_suspend_noirq() that the
> device has already been handled and it doesn't need to do anything to
> it.
>
> But you are right that PTM should be disabled on downstream devices as
> well as on the ports that those devices are connected to and it can be
> done even if the given device has already been handled, so the
> state_saved value is technically irrelevant.
>
> That's why I suggested to check if the power state is between D0 and
> D3cold (exclusive) and only disable PTM if that is the case. It is
> pointless to disable PTM for devices in D3cold and it may be harmful
> for devices that are left in D0.

"... it may be harmful for devices that are left in D0" -- I want to
understand this better. It sounds like nvme_suspend() leaves the
device in some device-specific low-power flavor of D0, and subsequent
config accesses take it out of that low-power situation?

If that's the case, it sounds a little brittle. I don't think it's
obvious that "pci_dev->state_saved was set by the driver" means "no
config accesses allowed in pci_pm_suspend_noirq()." And
pci_pm_suspend_noirq() calls quirks via pci_fixup_device(), which are
very likely to do config accesses.

Maybe PTM needs to be disabled earlier, e.g., in pci_pm_suspend()? I
don't think PTM uses any interrupts, so there's probably no reason
interrupts need to be disabled before disabling PTM.
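
The ordering question can be sketched with a toy model (user-space C,
hypothetical names throughout, not the kernel's suspend path). It
assumes, as reported earlier in this thread, that any config access
takes the device out of the low-power condition its driver put it in.
Placing the PTM disable before the driver callback is one possible
reading of "earlier" and is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state tracked across the suspend sequence. */
struct lp_dev {
	bool ptm_enabled;
	bool low_power; /* device-specific low-power flavor of D0 */
};

/* Disabling PTM is a config write; assumed here to wake the device. */
static void lp_disable_ptm(struct lp_dev *dev)
{
	dev->low_power = false;
	dev->ptm_enabled = false;
}

/* Stand-in for a driver suspend callback such as nvme_suspend(). */
static void lp_driver_suspend(struct lp_dev *dev)
{
	dev->low_power = true;
}

/* Ordering A: PTM is disabled in the noirq phase, after the driver ran. */
static struct lp_dev lp_disable_in_noirq(void)
{
	struct lp_dev dev = { .ptm_enabled = true, .low_power = false };

	lp_driver_suspend(&dev);
	lp_disable_ptm(&dev);
	return dev;
}

/* Ordering B: PTM is disabled before the driver callback runs. */
static struct lp_dev lp_disable_before_driver(void)
{
	struct lp_dev dev = { .ptm_enabled = true, .low_power = false };

	lp_disable_ptm(&dev);
	lp_driver_suspend(&dev);
	return dev;
}
```

In this model both orderings end with PTM disabled, but only ordering B
leaves the device in its low-power condition.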