2004-10-26 05:05:26

by Shaohua Li

Subject: [Proposal]Another way to save/restore PCI config space for suspend/resume

Hi,
We have suffered from PCI config space issues for a long time, and they
keep many systems from resuming correctly. The current Linux mechanism
isn't sufficient. Here is another idea:
Record all PCI config writes in the Linux kernel, and redo all the writes
in order after resume. The idea assumes the firmware will restore all PCI
config space to the boot-time state, which is true at least for IA32.
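
A minimal user-space sketch of the record-and-replay idea, for illustration only. It is not kernel code: `sim_dev`, `cfg_write_byte`, `firmware_resume` and `replay_writes` are hypothetical names, the config space is a plain byte array, and the firmware is assumed to restore the boot-time image exactly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CFG_SIZE 256   /* bytes of config space per simulated device */
#define LOG_MAX  1024  /* arbitrary cap on recorded writes */

/* One simulated device: live config space plus its boot-time image. */
struct sim_dev {
    uint8_t cfg[CFG_SIZE];
    uint8_t boot[CFG_SIZE];
};

/* One recorded config write. */
struct cfg_write {
    struct sim_dev *dev;
    uint16_t off;
    uint8_t val;
};

static struct cfg_write write_log[LOG_MAX];
static size_t write_log_len;

/* Every config write goes through here so that it gets recorded.
 * Returns 0 on success, -1 if the log is full (write not performed). */
static int cfg_write_byte(struct sim_dev *dev, uint16_t off, uint8_t val)
{
    assert(off < CFG_SIZE);
    if (write_log_len == LOG_MAX)
        return -1;
    write_log[write_log_len++] = (struct cfg_write){ dev, off, val };
    dev->cfg[off] = val;
    return 0;
}

/* Stand-in for the firmware: put the device back into boot-time state. */
static void firmware_resume(struct sim_dev *dev)
{
    memcpy(dev->cfg, dev->boot, CFG_SIZE);
}

/* Redo every recorded write, in the original order. */
static void replay_writes(void)
{
    for (size_t i = 0; i < write_log_len; i++)
        write_log[i].dev->cfg[write_log[i].off] = write_log[i].val;
}
```

Calling `firmware_resume()` and then `replay_writes()` leaves each register holding the last value written to it before suspend. Note that the log grows with every write, which is the cost concern raised in the replies.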

Reasons:
1. The current PCI save/restore routines only cover the first 64 bytes.
2. There is currently no PCI bridge driver.
3. Some special devices can't save/restore their config space with the
current model, or only with difficulty. For example, the PCI link device
is a sysdev, but its resume code can't be invoked with IRQs disabled.
4. ACPI may change the config space of special devices, such as the host
bridge or LPC bridge. These special devices are generally vendor specific,
and may never have a driver.

We may have to consider other factors:
1. Tracking writes alone will not be enough. Some PCI devices may have
restrictions, for example that something has to be written only after it
has been read, and the like. We should still be able to handle this if we
trace all PCI reads and writes and repeat them at restore.
2. To support hotplug, add a callback for hotplug PCI remove. When a
device is removed, all records about it are removed.
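
The hotplug-remove callback in point 2 could be sketched roughly as follows. This is a user-space illustration under stated assumptions: `cfg_record`, `cfg_log_add` and `cfg_log_forget` are hypothetical names, and a device is identified by a plain integer handle rather than a real `struct pci_dev`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOG_MAX 64  /* arbitrary cap for the sketch */

/* One recorded config access; dev_id is a hypothetical device handle. */
struct cfg_record {
    int dev_id;
    uint16_t off;
    uint8_t val;
};

static struct cfg_record cfg_log[LOG_MAX];
static size_t cfg_log_len;

/* Append one record. Returns 0 on success, -1 if the log is full. */
static int cfg_log_add(int dev_id, uint16_t off, uint8_t val)
{
    if (cfg_log_len == LOG_MAX)
        return -1;
    cfg_log[cfg_log_len++] = (struct cfg_record){ dev_id, off, val };
    return 0;
}

/* Hot-remove callback: drop every record that belongs to dev_id while
 * keeping the remaining records in their original order.
 * Returns the number of records dropped. */
static size_t cfg_log_forget(int dev_id)
{
    size_t kept = 0;

    for (size_t i = 0; i < cfg_log_len; i++)
        if (cfg_log[i].dev_id != dev_id)
            cfg_log[kept++] = cfg_log[i];

    size_t dropped = cfg_log_len - kept;
    cfg_log_len = kept;
    return dropped;
}
```

Compacting in place keeps the relative order of the surviving records, which matters because the proposal replays the log in order.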
What are your opinions?

Thanks,
Shaohua


2004-10-26 05:23:04

by Andi Kleen

Subject: Re: [ACPI] [Proposal]Another way to save/restore PCI config space for suspend/resume

On Tue, Oct 26, 2004 at 12:50:57PM +0800, Li Shaohua wrote:
> Hi,
> We suffer from PCI config space issue for a long time, which causes many
> system can't correctly resume. Current Linux mechanism isn't sufficient.
> Here is a another idea:
> Record all PCI writes in Linux kernel, and redo all the write after
> resume in order. The idea assumes Firmware will restore all PCI config

This won't work very well for some cases. E.g. on AMD x86-64 the
IOMMU is flushed by setting/clearing a bit in PCI config space.
AGP implementations work similarly. You really don't want to track
all these flushes; it would be far too costly.

> space to the boot time state, which is true at least for IA32.
>
> Reason:
> 1. Current PCI save/restore routines only cover first 64 bytes

The driver could set a flag if it wants more.

> 2. No PCI bridge driver currently.

That could be fixed I guess?

> 3. Some special devices can't or are difficult to save/restore config
> space with current model. Such as PCI link device, it's a sysdev, but
> its resume code can't be invoked with irq disabled.

In this case it would be IMHO better to have specialized suspend/resume
functions in the drivers for these oddball devices.

Most likely they will require some special handling anyways
(like special delays etc.) that can't be done by the generic code

> 4. ACPI possibly changes special devices' config space, such as host
> bridge or LPC bridge. The special devices generally are vender specific,
> and possibly will not have a driver forever.

I didn't get that one.

-Andi

2004-10-26 06:12:05

by Brown, Len

Subject: Re: [ACPI] [Proposal]Another way to save/restore PCI config space for suspend/resume

What this comes down to is that extended config space is device-specific.
Generic solutions will fail. Only device drivers will work.

If there are no drivers for PCI bridges to properly save/restore
their config space, then we should create them, even if this is all the
drivers do.

-Len


2004-10-26 08:42:48

by Arjan van de Ven

Subject: Re: [ACPI] [Proposal]Another way to save/restore PCI config space for suspend/resume

On Tue, 2004-10-26 at 02:11 -0400, Len Brown wrote:
> What this comes down to is that extended config space is device-specific.
> Generic solutions will fail. Only device drivers will work.
>
> If there are no drivers for PCI bridges to properly save/restore
> their config space, then should create them, even if this is all the
> drivers do.

Note that by default, if there is no driver, the first 64 bytes of
config space are saved/restored.
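
A small user-space sketch of what saving only the first 64 bytes means in practice. The names (`sim_dev`, `save_header`, `restore_header`, `power_cycle`) are hypothetical, the config space is simulated as a byte array, and a suspend is modelled, as an assumption, by zeroing every register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CFG_SIZE   256  /* conventional PCI config space size in bytes */
#define STD_HEADER 64   /* the part the default save/restore covers */

struct sim_dev {
    uint8_t cfg[CFG_SIZE];
};

/* Save the first 64 bytes of config space into a caller buffer. */
static void save_header(const struct sim_dev *dev, uint8_t saved[STD_HEADER])
{
    assert(dev && saved);
    memcpy(saved, dev->cfg, STD_HEADER);
}

/* Write the saved 64 bytes back; offsets 0x40 and above are untouched. */
static void restore_header(struct sim_dev *dev, const uint8_t saved[STD_HEADER])
{
    assert(dev && saved);
    memcpy(dev->cfg, saved, STD_HEADER);
}

/* Stand-in for a suspend/resume cycle in which the device loses state. */
static void power_cycle(struct sim_dev *dev)
{
    memset(dev->cfg, 0, CFG_SIZE);
}
```

After `save_header()`, `power_cycle()` and `restore_header()`, a register at offset 0x04 gets its value back, while a device-specific register at, say, offset 0x60 stays at its reset value.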
--

2004-10-26 09:06:33

by Karol Kozimor

Subject: Re: [ACPI] [Proposal]Another way to save/restore PCI config space for suspend/resume

Thus wrote Arjan van de Ven:
> > What this comes down to is that extended config space is device-specific.
> > Generic solutions will fail. Only device drivers will work.
> >
> > If there are no drivers for PCI bridges to properly save/restore
> > their config space, then should create them, even if this is all the
> > drivers do.
> note that by default, if there is no driver, the first 64 bytes of
> config space are saved/restored.

That's not enough -- some devices with no drivers (think LPC bridges) might
need more (see http://bugme.osdl.org/show_bug.cgi?id=3609).
Best regards,

--
Karol 'sziwan' Kozimor
[email protected]

2004-10-26 09:17:54

by Shaohua Li

Subject: Re: [ACPI] [Proposal]Another way to save/restore PCI config space for suspend/resume

On Tue, 2004-10-26 at 13:11, Andi Kleen wrote:
> On Tue, Oct 26, 2004 at 12:50:57PM +0800, Li Shaohua wrote:
> > Hi,
> > We suffer from PCI config space issue for a long time, which causes many
> > system can't correctly resume. Current Linux mechanism isn't sufficient.
> > Here is a another idea:
> > Record all PCI writes in Linux kernel, and redo all the write after
> > resume in order. The idea assumes Firmware will restore all PCI config
>
> This won't work very well for some cases. e.g. on AMD x86-64 the
> IOMMU is flushed by setting/clearing a bit in PCI config space.
> AGP implementations work similar. You really don't want to track
> all these flushes, it would be far too costly.
We could consider some optimizations, such as letting a driver
explicitly disable the 'pci record'. The main problem we have encountered
is the correctness of suspend/resume.
>
> > space to the boot time state, which is true at least for IA32.
> >
> > Reason:
> > 1. Current PCI save/restore routines only cover first 64 bytes
>
> The driver could set a flag if it wants more.
Extended PCI config space is device specific; generic code can't handle it.

> > 2. No PCI bridge driver currently.
>
> That could be fixed I guess?
If all PCI devices and bridges had drivers, this could be fixed. But I
think that is far away. Another issue here is the hierarchy of devices. A
device lower in the device tree doesn't necessarily have to be resumed
later. A PCI IRQ router, for example, must resume before all PCI devices.

>
> > 3. Some special devices can't or are difficult to save/restore config
> > space with current model. Such as PCI link device, it's a sysdev, but
> > its resume code can't be invoked with irq disabled.
>
> In this case it would be IMHO better to have specialized suspend/resume
> functions in the drivers for these oddball devices.
>
> Most likely they will require some special handling anyways
> (like special delays etc.) that can't be done by the generic code
>
> > 4. ACPI possibly changes special devices' config space, such as host
> > bridge or LPC bridge. The special devices generally are vender specific,
> > and possibly will not have a driver forever.
>
> I didn't get that one.
One case here is that, on some systems, the ASL code will disable/enable
EHCI depending on the current OS (for example, disabling EHCI if the OS
isn't Windows). The disable/enable bit is in the ICH.

Thanks,
Shaohua

2004-10-26 09:21:26

by Pavel Machek

Subject: Re: [Proposal]Another way to save/restore PCI config space for suspend/resume

Hi!

> We suffer from PCI config space issue for a long time, which causes many
> system can't correctly resume. Current Linux mechanism isn't sufficient.
> Here is a another idea:
> Record all PCI writes in Linux kernel, and redo all the write after
> resume in order. The idea assumes Firmware will restore all PCI config
> space to the boot time state, which is true at least for IA32.

That looks extremely ugly to me. If you want to do something special
in resume function, just do it there. It will probably share a lot of
code with your init function, anyway.
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2004-10-26 09:58:31

by Matthew Garrett

Subject: Re: [ACPI] [Proposal]Another way to save/restore PCI config space for suspend/resume

On Tue, 2004-10-26 at 10:42 +0200, Arjan van de Ven wrote:
> On Tue, 2004-10-26 at 02:11 -0400, Len Brown wrote:
> > What this comes down to is that extended config space is device-specific.
> > Generic solutions will fail. Only device drivers will work.
> >
> > If there are no drivers for PCI bridges to properly save/restore
> > their config space, then should create them, even if this is all the
> > drivers do.
>
> note that by default, if there is no driver, the first 64 bytes of
> config space are saved/restored.

On one of my machines, doing this causes the cardbus bridge to explode
on resume (every other character of printks suddenly starts getting left
out, and then the machine hangs). This happens even if I've never loaded
the yenta driver. The naive approach certainly doesn't seem to be safe
on all hardware.

--
Matthew Garrett | [email protected]

2004-10-27 00:58:16

by Shaohua Li

Subject: Re: [ACPI] Re: [Proposal]Another way to save/restore PCI config space for suspend/resume

On Tue, 2004-10-26 at 17:21, Pavel Machek wrote:
> Hi!
>
> > We suffer from PCI config space issue for a long time, which causes many
> > system can't correctly resume. Current Linux mechanism isn't sufficient.
> > Here is a another idea:
> > Record all PCI writes in Linux kernel, and redo all the write after
> > resume in order. The idea assumes Firmware will restore all PCI config
> > space to the boot time state, which is true at least for IA32.
>
> That looks extremely ugly to me. If you want to do something special
> in resume function, just do it there. It will probably share a lot of
> code with your init function, anyway.
How can you handle devices without a driver? And how do you save/restore
config space for special devices, such as the LPC bridge and host bridge?

-Shaohua

2004-10-27 01:33:08

by Hiroshi 2 Itoh

Subject: Re: [ACPI] [Proposal]Another way to save/restore PCI config space for suspend/resume





Hi,

[email protected] wrote on 2004/10/26 13:50:57:

> Hi,
> We suffer from PCI config space issue for a long time, which causes many
> system can't correctly resume. Current Linux mechanism isn't sufficient.
> Here is a another idea:
> Record all PCI writes in Linux kernel, and redo all the write after
> resume in order. The idea assumes Firmware will restore all PCI config
> space to the boot time state, which is true at least for IA32.
>

I think a basic problem of the current Linux device model is that there is
no effective message path from sibling devices to their root device.
Although the message direction from a root device to sibling devices is
natural from the viewpoint of device enumeration, the direction from
sibling devices to a root device is required for effective arbitration of
device configuration and power management.

The Windows driver model mainly uses the direction from sibling drivers to
a root bus driver, i.e. sibling drivers are layered on a root bus driver.
We instead need a kind of callback mechanism from PCI (sibling) devices to
the PCI bus (root) device, because their normal call interface is from a
root device to sibling devices.

- Hiro.

2004-10-27 02:02:43

by Shaohua Li

Subject: Re: [ACPI] [Proposal]Another way to save/restore PCI config space for suspend/resume

On Wed, 2004-10-27 at 09:32, Hiroshi 2 Itoh wrote:
>
>
> Hi,
>
> [email protected] wrote on 2004/10/26 13:50:57:
>
> > Hi,
> > We suffer from PCI config space issue for a long time, which causes many
> > system can't correctly resume. Current Linux mechanism isn't sufficient.
> > Here is a another idea:
> > Record all PCI writes in Linux kernel, and redo all the write after
> > resume in order. The idea assumes Firmware will restore all PCI config
> > space to the boot time state, which is true at least for IA32.
> >
>
> I think a basic problem of current Linux device model is that there is no
> effective message path from sibling devices to their root device.
> Although the message direction from a root device to sibling devices is
> natural from the viewpoint of device enumeration, the direction from
> sibling devices to a root device is required for effective arbitration for
> device configuration and power management.
>
> The Windows driver model uses the direction from sibling drivers to a root
> bus driver mainly, i.e. sibling drivers are layered on a root bus driver.
> While we need a kind of callback mechanism from PCI (sibling) devices to
> PCI bus (root) device instead because their normal call interface is from a
> root device to sibling devices.
Hiro-san,
I don't really understand how this is related to suspend/resume. Could
you please explain it more clearly?

Thanks,
Shaohua


2004-10-27 02:26:55

by Hiroshi 2 Itoh

Subject: Re: [ACPI] [Proposal]Another way to save/restore PCI config space for suspend/resume





> >
> > I think a basic problem of current Linux device model is that there is no
> > effective message path from sibling devices to their root device.
> > Although the message direction from a root device to sibling devices is
> > natural from the viewpoint of device enumeration, the direction from
> > sibling devices to a root device is required for effective arbitration for
> > device configuration and power management.
> >
> > The Windows driver model uses the direction from sibling drivers to a root
> > bus driver mainly, i.e. sibling drivers are layered on a root bus driver.
> > While we need a kind of callback mechanism from PCI (sibling) devices to
> > PCI bus (root) device instead because their normal call interface is from a
> > root device to sibling devices.
> Hiro-san,
> I don't really understand why this is related with suspend/resume. Could
> you please explain it more clearly?
>
Hi,

What I mean is that:

There are some bridge devices to be supported by a driver for various
device types and vendors. In the long run, PCI drivers will have power
dependencies on one another. It is natural that a PCI bridge driver
reports its child devices and their dependencies to the PCI core, because
the PCI core driver needs to know the power up/down sequence at
suspend/resume time. So I think callbacks from bridge to core are useful.
They are especially useful for letting the core get the exact timing to
power up the next driver if some devices have a certain latency to power up.
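
One way to picture the dependency reporting described above is a core that resumes each device only after the device it depends on. The sketch below is a user-space illustration, not an existing kernel interface: `pm_dev`, `resume_one` and `resume_all` are hypothetical names, each device is assumed to report at most one dependency, and the dependency graph is assumed to contain no cycles.

```c
#include <assert.h>
#include <stddef.h>

#define NO_DEP (-1)

struct pm_dev {
    const char *name;
    int depends_on;  /* index of the device to power up first, or NO_DEP */
    int resumed;     /* set once the device has been resumed */
};

/* Resume device idx after everything it depends on, and append its index
 * to order[]. Assumes the dependency chain has no cycles. */
static void resume_one(struct pm_dev *devs, int idx, int *order, size_t *n)
{
    assert(idx >= 0);
    if (devs[idx].resumed)
        return;
    if (devs[idx].depends_on != NO_DEP)
        resume_one(devs, devs[idx].depends_on, order, n);
    devs[idx].resumed = 1;   /* a real core would power the device up here */
    order[(*n)++] = idx;
}

/* Resume all devices; order[] receives the indices in resume order.
 * Returns the number of devices resumed. */
static size_t resume_all(struct pm_dev *devs, size_t count, int *order)
{
    size_t n = 0;

    for (size_t i = 0; i < count; i++)
        resume_one(devs, (int)i, order, &n);
    return n;
}
```

With a table in which a child depends on a bridge and the bridge depends on a root device, `resume_all()` produces root, bridge, child regardless of the order of the table entries. A per-device power-up latency could be honoured at the point marked in `resume_one()`.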

- Hiro.

2004-10-27 20:24:31

by Rajesh Shah

Subject: Re: [Proposal]Another way to save/restore PCI config space for suspend/resume

On Tue, Oct 26, 2004 at 12:50:57PM +0800, Li Shaohua wrote:
> Hi,
> We suffer from PCI config space issue for a long time, which causes many
> system can't correctly resume. Current Linux mechanism isn't sufficient.
> Here is a another idea:
> Record all PCI writes in Linux kernel, and redo all the write after
> resume in order. The idea assumes Firmware will restore all PCI config
> space to the boot time state, which is true at least for IA32.
>
A large percentage of the recorded writes may be irrelevant with respect
to suspend/resume (e.g. PCI device tree and resource scan...). Apart
from the performance problems, generic code doing device specific
config accesses may have correctness problems. For example, you
will not be able to capture/replay config cycles or other device
specific initialization (e.g. using memory mapped IO) that BIOS may
have done before boot. This may be required for correct access to
device specific areas. The same thing is true about device specific
config accesses that may have been done by the corresponding
driver after boot. Without device specific knowledge, we may see
unpredictable behavior and non-repeatable and hard to debug problems.

I don't see how generic code can do the right thing for device
specific accesses. Devices like p2p bridges that have standard
definitions can be handled separately.

Rajesh