[BUGFIX 1/4] PCI/PM: enable D3/D3cold by default for most devices
[BUGFIX 2/4] PCI/PM: Keep parent bridge active when probing device
[BUGFIX 3/4] PCI/PM: Fix config reg access for D3cold and bridge suspending
[PATCH 4/4] PCI/PM: Add ABI document for sysfs file d3cold_allowed
Best Regards,
Huang Ying
This patch fixes the following bug:
http://marc.info/?l=linux-pci&m=134338059022620&w=2
In that report, lspci does not work properly if a device and its
parent bridge (such as a PCIe port) are suspended. This is because the
device's configuration space registers are not accessible while the
parent bridge is suspended or the device is in the D3cold state.
To solve the issue, the bridge/PCIe port connected to the device is
put into the active state before configuration space registers are
read or written. If the device is in the D3cold state, it is put into
the active state too.
To avoid resuming and suspending the PCIe port for every configuration
register read/write, a short delay is added before the PCIe port is
suspended.
Reported-by: Bjorn Mork <[email protected]>
Signed-off-by: Huang Ying <[email protected]>
---
drivers/pci/pci-sysfs.c | 37 +++++++++++++++++++++++++++++++++++++
drivers/pci/pcie/portdrv_pci.c | 9 +++++++++
2 files changed, 46 insertions(+)
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -458,6 +458,35 @@ boot_vga_show(struct device *dev, struct
}
struct device_attribute vga_attr = __ATTR_RO(boot_vga);
+static void
+pci_config_pm_runtime_get(struct pci_dev *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct device *parent = dev->parent;
+
+ if (parent)
+ pm_runtime_get_sync(parent);
+ pm_runtime_get_noresume(dev);
+ /*
+ * pdev->current_state is set to PCI_D3cold during suspending,
+ * so wait until suspending completes
+ */
+ pm_runtime_barrier(dev);
+ if (pdev->current_state == PCI_D3cold)
+ pm_runtime_resume(dev);
+}
+
+static void
+pci_config_pm_runtime_put(struct pci_dev *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct device *parent = dev->parent;
+
+ pm_runtime_put(dev);
+ if (parent)
+ pm_runtime_put(parent);
+}
+
static ssize_t
pci_read_config(struct file *filp, struct kobject *kobj,
struct bin_attribute *bin_attr,
@@ -484,6 +513,8 @@ pci_read_config(struct file *filp, struc
size = count;
}
+ pci_config_pm_runtime_get(dev);
+
if ((off & 1) && size) {
u8 val;
pci_user_read_config_byte(dev, off, &val);
@@ -529,6 +560,8 @@ pci_read_config(struct file *filp, struc
--size;
}
+ pci_config_pm_runtime_put(dev);
+
return count;
}
@@ -549,6 +582,8 @@ pci_write_config(struct file* filp, stru
count = size;
}
+ pci_config_pm_runtime_get(dev);
+
if ((off & 1) && size) {
pci_user_write_config_byte(dev, off, data[off - init_off]);
off++;
@@ -587,6 +622,8 @@ pci_write_config(struct file* filp, stru
--size;
}
+ pci_config_pm_runtime_put(dev);
+
return count;
}
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -140,9 +140,17 @@ static int pcie_port_runtime_resume(stru
{
return 0;
}
+
+static int pcie_port_runtime_idle(struct device *dev)
+{
+ /* Delay for a short while to prevent too frequent suspend/resume */
+ pm_schedule_suspend(dev, 10);
+ return -EBUSY;
+}
#else
#define pcie_port_runtime_suspend NULL
#define pcie_port_runtime_resume NULL
+#define pcie_port_runtime_idle NULL
#endif
static const struct dev_pm_ops pcie_portdrv_pm_ops = {
@@ -155,6 +163,7 @@ static const struct dev_pm_ops pcie_port
.resume_noirq = pcie_port_resume_noirq,
.runtime_suspend = pcie_port_runtime_suspend,
.runtime_resume = pcie_port_runtime_resume,
+ .runtime_idle = pcie_port_runtime_idle,
};
#define PCIE_PORTDRV_PM_OPS (&pcie_portdrv_pm_ops)
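The get/put bracketing in the patch above can be modelled in user space. The following is a minimal sketch, not kernel code: struct model_dev, the model_* functions and the immediate suspend on the last put are all assumptions made for illustration. The real pm_runtime_put() is asynchronous, and the port additionally delays its suspend.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified power states; not the kernel's pci_power_t. */
enum model_power { MODEL_D0, MODEL_D3HOT, MODEL_D3COLD };

/* Hypothetical stand-in for struct device plus struct pci_dev. */
struct model_dev {
	struct model_dev *parent;     /* upstream bridge, or NULL */
	int usage;                    /* runtime PM usage counter */
	enum model_power power;       /* current power state */
	enum model_power idle_power;  /* state entered when usage drops to 0 */
};

/* Like pm_runtime_get_sync(): take a reference and resume. */
static void model_get_sync(struct model_dev *d)
{
	d->usage++;
	d->power = MODEL_D0;
}

/* Like pm_runtime_get_noresume(): take a reference only. */
static void model_get_noresume(struct model_dev *d)
{
	d->usage++;
}

/* Drop a reference; the model suspends at once when the count hits 0. */
static void model_put(struct model_dev *d)
{
	if (--d->usage == 0)
		d->power = d->idle_power;
}

/* Config space needs an active parent bridge and a device not in D3cold. */
bool model_config_accessible(const struct model_dev *d)
{
	if (d->parent && d->parent->power != MODEL_D0)
		return false;
	return d->power != MODEL_D3COLD;
}

/* Bracket one config access the way the patch brackets pci_read_config(). */
bool model_guarded_access(struct model_dev *d)
{
	bool ok;

	if (d->parent)
		model_get_sync(d->parent);
	model_get_noresume(d);
	if (d->power == MODEL_D3COLD)
		d->power = MODEL_D0;  /* stands in for pm_runtime_resume() */

	ok = model_config_accessible(d);

	model_put(d);
	if (d->parent)
		model_put(d->parent);
	return ok;
}
```

With a suspended bridge and a device in D3cold, model_config_accessible() reports failure while model_guarded_access() succeeds, and both usage counters are back at zero afterwards.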
This patch adds ABI documentation for the following sysfs file:
/sys/bus/pci/devices/.../d3cold_allowed
Signed-off-by: Huang Ying <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
---
Documentation/ABI/testing/sysfs-bus-pci | 12 ++++++++++++
1 file changed, 12 insertions(+)
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -210,3 +210,15 @@ Users:
firmware assigned instance number of the PCI
device that can help in understanding the firmware
intended order of the PCI device.
+
+What: /sys/bus/pci/devices/.../d3cold_allowed
+Date: July 2012
+Contact: Huang Ying <[email protected]>
+Description:
+ d3cold_allowed is a bit to control whether the corresponding PCI
+ device can be put into D3Cold state. If it is cleared, the
+ device will never be put into D3Cold state. If it is set, the
+ device may be put into D3Cold state if other requirements are
+ satisfied too. Reading this attribute will show the current
+ value of d3cold_allowed bit. Writing this attribute will set
+ the value of d3cold_allowed bit.
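For reference, a user-space program could read and set this attribute roughly as follows. This is only a sketch under assumptions: the helper names d3cold_allowed_read() and d3cold_allowed_write() are hypothetical, the value is assumed to be exchanged as the text "0" or "1", and the caller supplies the full path to the attribute file (writing normally requires root).

```c
#include <stdio.h>

/*
 * Read the attribute at 'path'.
 * Returns 1 if the bit is set, 0 if it is cleared, -1 on any error.
 */
int d3cold_allowed_read(const char *path)
{
	FILE *f = fopen(path, "r");
	unsigned int val;
	int parsed;

	if (!f)
		return -1;
	parsed = (fscanf(f, "%u", &val) == 1);
	fclose(f);
	if (!parsed)
		return -1;
	return val ? 1 : 0;
}

/*
 * Write "1" (allow D3cold) or "0" (forbid D3cold) to the attribute.
 * Returns 0 on success, -1 on any error.
 */
int d3cold_allowed_write(const char *path, int allowed)
{
	FILE *f = fopen(path, "w");
	int rc = 0;

	if (!f)
		return -1;
	if (fputs(allowed ? "1\n" : "0\n", f) < 0)
		rc = -1;
	if (fclose(f) != 0)
		rc = -1;
	return rc;
}
```

Clearing the bit only forbids D3cold. As the description says, setting it does not force the device into D3cold, it merely permits it when the other requirements are met.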
This patch fixes the following bug:
http://marc.info/?l=linux-usb&m=134318961120825&w=2
Originally, the device low-power states were D1, D2 and D3. Later,
D3 was further divided into D3hot and D3cold. To support both scenarios
safely, the original D3 is mapped to D3cold.
When D3cold support was added, D3cold was disabled by default out of
concern that some devices might have broken D3cold support. This
disabled D3 on the original platforms too. But some of those platforms
may have a working D3 and no working D1 or D2. That is the root cause
of the above bug.
To deal with this, this patch enables D3/D3cold by default for most
devices, which restores the original behavior. For devices that are
suspected to have broken D3cold support, such as PCIe ports, D3cold is
disabled by default.
Reported-by: Bjorn Mork <[email protected]>
Signed-off-by: Huang Ying <[email protected]>
---
drivers/pci/pci.c | 1 +
drivers/pci/pcie/portdrv_pci.c | 5 +++++
2 files changed, 6 insertions(+)
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1941,6 +1941,7 @@ void pci_pm_init(struct pci_dev *dev)
dev->pm_cap = pm;
dev->d3_delay = PCI_PM_D3_WAIT;
dev->d3cold_delay = PCI_PM_D3COLD_WAIT;
+ dev->d3cold_allowed = true;
dev->d1_support = false;
dev->d2_support = false;
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -200,6 +200,11 @@ static int __devinit pcie_portdrv_probe(
return status;
pci_save_state(dev);
+ /*
+ * D3cold may not work properly on some PCIe port, so disable
+ * it by default.
+ */
+ dev->d3cold_allowed = false;
if (!pci_match_id(port_runtime_pm_black_list, dev))
pm_runtime_put_noidle(&dev->dev);
This patch fixes the following bug:
http://marc.info/?l=linux-pci&m=134329923124234&w=2
The root cause of the bug is as follows.
If a device is not bound to a driver, runtime PM is disabled for the
device and the device is put into the suspended state. As a result,
the bridge/PCIe port connected to it may be put into the suspended,
low-power state. When the device is probed later, I/O access to the
device may fail because the bridge/PCIe port connected to it is in a
low-power state.
To solve the issue, the bridge/PCIe port connected to the device is
put into the active state before probing.
Reported-by: Bjorn Mork <[email protected]>
Signed-off-by: Huang Ying <[email protected]>
---
drivers/pci/pci-driver.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -280,8 +280,12 @@ static long local_pci_probe(void *_ddi)
{
struct drv_dev_and_id *ddi = _ddi;
struct device *dev = &ddi->dev->dev;
+ struct device *parent = dev->parent;
int rc;
+ /* The parent bridge must be in active state when probing */
+ if (parent)
+ pm_runtime_get_sync(parent);
/* Unbound PCI devices are always set to disabled and suspended.
* During probe, the device is set to enabled and active and the
* usage count is incremented. If the driver supports runtime PM,
@@ -298,6 +302,8 @@ static long local_pci_probe(void *_ddi)
pm_runtime_set_suspended(dev);
pm_runtime_put_noidle(dev);
}
+ if (parent)
+ pm_runtime_put(parent);
return rc;
}
On Fri, 3 Aug 2012, Huang Ying wrote:
> This patch fixes the following bug:
>
> http://marc.info/?l=linux-pci&m=134338059022620&w=2
>
> Where lspci does not work properly if a device and the corresponding
> parent bridge (such as PCIe port) is suspended. This is because the
> device configuration space registers will be not accessible if the
> corresponding parent bridge is suspended or the device is put into
> D3cold state.
>
> To solve the issue, the bridge/PCIe port connected to the device is
> put into active state before read/write configuration space registers.
> If the device is in D3cold state, it will be put into active state
> too.
>
> To avoid resume/suspend PCIe port for each configuration register
> read/write, a small delay is added before the PCIe port to go
> suspended.
> +static void
> +pci_config_pm_runtime_put(struct pci_dev *pdev)
> +{
> + struct device *dev = &pdev->dev;
> + struct device *parent = dev->parent;
> +
> + pm_runtime_put(dev);
> + if (parent)
> + pm_runtime_put(parent);
> +}
This is just the sort of thing Rafael and I have been talking about.
Why do an asynchronous put, going to all the trouble of using the
workqueue, if the idle routine is just going to call
pm_schedule_suspend()?
Why not call pm_runtime_put_sync() instead?
Alan Stern
On Friday, August 03, 2012, Alan Stern wrote:
> On Fri, 3 Aug 2012, Huang Ying wrote:
>
> > This patch fixes the following bug:
> >
> > http://marc.info/?l=linux-pci&m=134338059022620&w=2
> >
> > Where lspci does not work properly if a device and the corresponding
> > parent bridge (such as PCIe port) is suspended. This is because the
> > device configuration space registers will be not accessible if the
> > corresponding parent bridge is suspended or the device is put into
> > D3cold state.
> >
> > To solve the issue, the bridge/PCIe port connected to the device is
> > put into active state before read/write configuration space registers.
> > If the device is in D3cold state, it will be put into active state
> > too.
> >
> > To avoid resume/suspend PCIe port for each configuration register
> > read/write, a small delay is added before the PCIe port to go
> > suspended.
>
>
> > +static void
> > +pci_config_pm_runtime_put(struct pci_dev *pdev)
> > +{
> > + struct device *dev = &pdev->dev;
> > + struct device *parent = dev->parent;
> > +
> > + pm_runtime_put(dev);
> > + if (parent)
> > + pm_runtime_put(parent);
> > +}
>
> This is just the sort of thing Rafael and I have been talking about.
> Why do an asynchronous put, going to all the trouble of using the
> workqueue, if the idle routine is just going to call
> pm_schedule_suspend()?
If that's PCI, it will call pm_runtime_suspend(). That probably _should_ be
pm_schedule_suspend(), but it isn't at the moment.
> Why not call pm_runtime_put_sync() instead?
I guess because the caller doesn't care whether or not the devices will be
suspended immediately and we seem to have agreed already that the added
workqueue overhead is minimal.
If the _idle() routine were to call pm_schedule_suspend(), though, I'd
agree that the overhead would be absolutely unnecessary.
Thanks,
Rafael
On Saturday, August 04, 2012, Rafael J. Wysocki wrote:
> On Friday, August 03, 2012, Alan Stern wrote:
> > On Fri, 3 Aug 2012, Huang Ying wrote:
> >
> > > This patch fixes the following bug:
> > >
> > > http://marc.info/?l=linux-pci&m=134338059022620&w=2
> > >
> > > Where lspci does not work properly if a device and the corresponding
> > > parent bridge (such as PCIe port) is suspended. This is because the
> > > device configuration space registers will be not accessible if the
> > > corresponding parent bridge is suspended or the device is put into
> > > D3cold state.
> > >
> > > To solve the issue, the bridge/PCIe port connected to the device is
> > > put into active state before read/write configuration space registers.
> > > If the device is in D3cold state, it will be put into active state
> > > too.
> > >
> > > To avoid resume/suspend PCIe port for each configuration register
> > > read/write, a small delay is added before the PCIe port to go
> > > suspended.
> >
> >
> > > +static void
> > > +pci_config_pm_runtime_put(struct pci_dev *pdev)
> > > +{
> > > + struct device *dev = &pdev->dev;
> > > + struct device *parent = dev->parent;
> > > +
> > > + pm_runtime_put(dev);
> > > + if (parent)
> > > + pm_runtime_put(parent);
> > > +}
> >
> > This is just the sort of thing Rafael and I have been talking about.
> > Why do an asynchronous put, going to all the trouble of using the
> > workqueue, if the idle routine is just going to call
> > pm_schedule_suspend()?
>
> If that's PCI, it will call pm_runtime_suspend(). That probably _should_ be
> pm_schedule_suspend(), but it isn't at the moment.
>
> > Why not call pm_runtime_put_sync() instead?
>
> I guess because the caller doesn't care whether or not the devices will be
> suspended immediately and we seem to have agreed already that the added
> workqueue overhead is minimal.
>
> If the _idle() routine were to call pm_schedule_suspend(), though, I'd
> agree that the overhead would be absolutely unnecessary.
Sorry, I should have had a closer look at pcie_port_runtime_idle() before
replying.
You're right, pm_runtime_put_sync() should be used for the parent.
Thanks,
Rafael
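The difference between the two put variants discussed above can be sketched with a small user-space model. Everything here is an assumption for illustration: struct model_port and the model_* functions are hypothetical, and the counters merely record which steps each variant would go through when the idle callback only arms a delayed suspend, as pcie_port_runtime_idle() does via pm_schedule_suspend().

```c
/* Hypothetical model of a port's runtime PM bookkeeping. */
struct model_port {
	int usage;           /* runtime PM usage counter */
	int workqueue_hops;  /* times idle handling was deferred to a workqueue */
	int timers_armed;    /* delayed suspends scheduled by the idle callback */
};

/* Mirrors the shape of pcie_port_runtime_idle(): only arm a delayed suspend. */
static void model_idle(struct model_port *p)
{
	p->timers_armed++;
}

/* Like pm_runtime_put(): the idle notification goes through a workqueue. */
void model_put_async(struct model_port *p)
{
	if (--p->usage == 0) {
		p->workqueue_hops++;  /* queue the idle request ... */
		model_idle(p);        /* ... which later runs the idle callback */
	}
}

/* Like pm_runtime_put_sync(): the idle callback runs in the caller's context. */
void model_put_sync(struct model_port *p)
{
	if (--p->usage == 0)
		model_idle(p);
}
```

In this model both variants end with exactly one delayed suspend armed, but only the asynchronous one pays for a workqueue hop first, which is the overhead the synchronous put for the parent avoids.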
On Friday, August 03, 2012, Huang Ying wrote:
> This patch fixes the following bug:
>
> http://marc.info/?l=linux-usb&m=134318961120825&w=2
>
> Originally, device lower power states include D1, D2, D3. After that,
> D3 is further divided into D3hot and D3cold. To support both scenario
> safely, original D3 is mapped to D3cold.
>
> When adding D3cold support, because worry about some device may have
> broken D3cold support, D3cold is disabled by default. This disable D3
> on original platform too. But some original platform may only have
> working D3, but no working D1, D2. The root cause of the above bug is
> it too.
>
> To deal with this, this patch enables D3/D3cold by default for most
> devices. This restores the original behavior. For some devices that
> suspected to have broken D3cold support, such as PCIe port, D3cold is
> disabled by default.
>
> Reported-by: Bjorn Mork <[email protected]>
> Signed-off-by: Huang Ying <[email protected]>
Reviewed-by: Rafael J. Wysocki <[email protected]>
> ---
> drivers/pci/pci.c | 1 +
> drivers/pci/pcie/portdrv_pci.c | 5 +++++
> 2 files changed, 6 insertions(+)
>
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1941,6 +1941,7 @@ void pci_pm_init(struct pci_dev *dev)
> dev->pm_cap = pm;
> dev->d3_delay = PCI_PM_D3_WAIT;
> dev->d3cold_delay = PCI_PM_D3COLD_WAIT;
> + dev->d3cold_allowed = true;
>
> dev->d1_support = false;
> dev->d2_support = false;
> --- a/drivers/pci/pcie/portdrv_pci.c
> +++ b/drivers/pci/pcie/portdrv_pci.c
> @@ -200,6 +200,11 @@ static int __devinit pcie_portdrv_probe(
> return status;
>
> pci_save_state(dev);
> + /*
> + * D3cold may not work properly on some PCIe port, so disable
> + * it by default.
> + */
> + dev->d3cold_allowed = false;
> if (!pci_match_id(port_runtime_pm_black_list, dev))
> pm_runtime_put_noidle(&dev->dev);
On Friday, August 03, 2012, Huang Ying wrote:
> This patch fixes the following bug:
>
> http://marc.info/?l=linux-pci&m=134329923124234&w=2
>
> The root cause of the bug is as follow.
>
> If a device is not bound with the corresponding driver, the device
> runtime PM will be disabled and the device will be put into suspended
> state. So that, the bridge/PCIe port connected to it may be put into
> suspended and low power state. When do probing for the device later,
> because the bridge/PCIe port connected to it is in low power state,
> the IO access to device may fail.
>
> To solve the issue, the bridge/PCIe port connected to the device is
> put into active state before probing.
>
> Reported-by: Bjorn Mork <[email protected]>
> Signed-off-by: Huang Ying <[email protected]>
Reviewed-by: Rafael J. Wysocki <[email protected]>
> ---
> drivers/pci/pci-driver.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -280,8 +280,12 @@ static long local_pci_probe(void *_ddi)
> {
> struct drv_dev_and_id *ddi = _ddi;
> struct device *dev = &ddi->dev->dev;
> + struct device *parent = dev->parent;
> int rc;
>
> + /* The parent bridge must be in active state when probing */
> + if (parent)
> + pm_runtime_get_sync(parent);
> /* Unbound PCI devices are always set to disabled and suspended.
> * During probe, the device is set to enabled and active and the
> * usage count is incremented. If the driver supports runtime PM,
> @@ -298,6 +302,8 @@ static long local_pci_probe(void *_ddi)
> pm_runtime_set_suspended(dev);
> pm_runtime_put_noidle(dev);
> }
> + if (parent)
> + pm_runtime_put(parent);
> return rc;
> }
Huang Ying <[email protected]> writes:
> [BUGFIX 1/4] PCI/PM: enable D3/D3cold by default for most devices
> [BUGFIX 2/4] PCI/PM: Keep parent bridge active when probing device
> [BUGFIX 3/4] PCI/PM: Fix config reg access for D3cold and bridge suspending
> [PATCH 4/4] PCI/PM: Add ABI document for sysfs file d3cold_allowed
Hello,
I am hoping these patches will appear in 3.6? They fix real problems in
3.6-rc1 for me. If it helps in any way, feel free to add
Tested-by: Bjørn Mork <[email protected]>
to the 3 bugfix patches, including version 2 of patch #3.
Bjørn
Hi, Bjorn,
Could you please merge this patchset? They fix real bugs.
Best Regards,
Huang Ying
On Sun, Aug 19, 2012 at 6:35 PM, Bjørn Mork <[email protected]> wrote:
> Huang Ying <[email protected]> writes:
>
>> [BUGFIX 1/4] PCI/PM: enable D3/D3cold by default for most devices
>> [BUGFIX 2/4] PCI/PM: Keep parent bridge active when probing device
>> [BUGFIX 3/4] PCI/PM: Fix config reg access for D3cold and bridge suspending
>> [PATCH 4/4] PCI/PM: Add ABI document for sysfs file d3cold_allowed
>
> Hello,
>
> I am hoping these patches will appear in 3.6? They fix real problems in
> 3.6-rc1 for me. If it helps in any way, feel free to add
>
> Tested-by: Bjørn Mork <[email protected]>
>
> to the 3 bugfix patches, including version 2 of patch #3.
>
>
> Bjørn
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
On Sun, Aug 19, 2012 at 6:09 PM, huang ying
<[email protected]> wrote:
> Hi, Bjorn,
>
> Could you please merge this patchset? They fix real bugs.
I assume you wanted the updated "[PATCH 3/4] PCI/PM: Fix config reg
access ..." patch posted Aug 15.
I merged these (with the updated 3/4 patch) to my "for-linus" branch.
After it's in linux-next for a couple days, I'll ask Linus to pull it.
> On Sun, Aug 19, 2012 at 6:35 PM, Bjørn Mork <[email protected]> wrote:
>> Huang Ying <[email protected]> writes:
>>
>>> [BUGFIX 1/4] PCI/PM: enable D3/D3cold by default for most devices
>>> [BUGFIX 2/4] PCI/PM: Keep parent bridge active when probing device
>>> [BUGFIX 3/4] PCI/PM: Fix config reg access for D3cold and bridge suspending
>>> [PATCH 4/4] PCI/PM: Add ABI document for sysfs file d3cold_allowed
>>
>> Hello,
>>
>> I am hoping these patches will appear in 3.6? They fix real problems in
>> 3.6-rc1 for me. If it helps in any way, feel free to add
>>
>> Tested-by: Bjørn Mork <[email protected]>
>>
>> to the 3 bugfix patches, including version 2 of patch #3.
>>
>>
>> Bjørn