2005-02-28 23:17:35

by Pavel Machek

Subject: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown

Hi!

In `subj` kernel, machine no longer powers down at the end of
swsusp. 2.6.11-rc5-pavel works ok, as does 2.6.11-bk.

Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


2005-03-01 06:39:12

by Laurent Riffard

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown

On 01.03.2005 00:17, Pavel Machek wrote:
> Hi!
>
> In `subj` kernel, machine no longer powers down at the end of
> swsusp. 2.6.11-rc5-pavel works ok, as does 2.6.11-bk.
>
> Pavel

Hello,

I noticed this behaviour, too. Can't remember if it came with
2.6.11-rc3-mm2 or with 2.6.11-rc4-mm1. Didn't try another kernel.

I was able to work around this problem by doing
"echo platform > /sys/power/disk"
before
"echo disk > /sys/power/state"

The box is a desktop with an ASUS A7V133 motherboard (VIA 82Cxxx chipset),
Athlon XP 1600+ CPU and NVidia GeForce2 MX400 graphics.

~~
laurent



2005-03-01 09:53:29

by Andrew Morton

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown

Pavel Machek wrote:
>
> In `subj` kernel, machine no longer powers down at the end of
> swsusp. 2.6.11-rc5-pavel works ok, as does 2.6.11-bk.

Binary searching indicates that this is due to
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc5/2.6.11-rc5-mm1/broken-out/acpi_power_off-bug-fix.patch.

I'll drop it. That patch is pretty ugly-looking anyway (ACPI code in
drivers/base/power/?).

Perhaps someone who is hitting the problem which that patch addresses could
raise a bugzilla entry.

Oh. It has one. http://bugme.osdl.org/show_bug.cgi?id=4041

Anyway. It needs more work.

2005-03-01 10:07:53

by Andrew Morton

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown


btw, suspend is a bit messy. The disk spins down. Then up. Then down
again. And:



Stopping tasks: ==========================================|
Freeing memory... done (7069 pages freed)
swsusp: Need to copy 7847 pages
swsusp: critical section/: done (7879 pages copied)
swsusp: Restoring Highmem
Debug: sleeping function called from invalid context at mm/slab.c:2082
in_atomic():0, irqs_disabled():1
[<c010318d>] dump_stack+0x19/0x20
[<c0111731>] __might_sleep+0x91/0x9c
[<c01365df>] kmem_cache_alloc+0x23/0x84
[<c0232d50>] acpi_evaluate_integer+0x3c/0xac
[<c024b3d9>] acpi_bus_get_status+0x39/0x94
[<c024ca99>] acpi_pci_link_set+0x16d/0x1e8
[<c024ce65>] acpi_pci_link_resume+0x1d/0x28
[<c024ce8a>] irqrouter_resume+0x1a/0x38
[<c0281e3c>] sysdev_resume+0x2c/0xae
[<c0285ea8>] device_power_up+0x8/0x11
[<c012a873>] swsusp_suspend+0x4b/0x58
[<c012ac35>] pm_suspend_disk+0x35/0x74
[<c01292ea>] enter_state+0x2e/0x70
[<c0129336>] software_suspend+0xa/0x10
[<c024a8a7>] acpi_system_write_sleep+0x73/0x98
[<c0149f1b>] vfs_write+0xaf/0x118
[<c014a028>] sys_write+0x3c/0x68
[<c0102c05>] sysenter_past_esp+0x52/0x75
PCI: Setting latency timer of device 0000:00:1f.2 to 64
ACPI: PCI interrupt 0000:00:1f.5[B] -> GSI 9 (level, low) -> IRQ 9
PCI: Setting latency timer of device 0000:00:1f.5 to 64
ACPI: PCI interrupt 0000:01:00.0[A] -> GSI 11 (level, low) -> IRQ 11
ehci_hcd 0000:02:01.2: USB 2.0 restarted, EHCI 0.95, driver 10 Dec 2004
ACPI: PCI interrupt 0000:02:0c.0[A] -> GSI 9 (level, low) -> IRQ 9
e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex
Writing data to swap (7879 pages)... done
Writing pagedir (31 pages)
S|
Powering off system
Debug: sleeping function called from invalid context at include/linux/rwsem.h:66
in_atomic():0, irqs_disabled():1
[<c010318d>] dump_stack+0x19/0x20
[<c0111731>] __might_sleep+0x91/0x9c
[<c0285872>] device_shutdown+0x16/0x82
[<c012aa97>] power_down+0x47/0x74
[<c012ac5a>] pm_suspend_disk+0x5a/0x74
[<c01292ea>] enter_state+0x2e/0x70
[<c0129336>] software_suspend+0xa/0x10
[<c024a8a7>] acpi_system_write_sleep+0x73/0x98
[<c0149f1b>] vfs_write+0xaf/0x118
[<c014a028>] sys_write+0x3c/0x68
[<c0102c05>] sysenter_past_esp+0x52/0x75
Synchronizing SCSI cache for disk sda:
Shutdown: hda
acpi_power_off called

2005-03-01 10:21:42

by Andrew Morton

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown


Resume on SMP locks up.


Relocating pagedir |
Reading image data (8157 pages): 100% 8157 done.
Stopping tasks: ====|
Freeing memory... done (0 pages freed)
Freezing CPUs (at 1)...Sleeping in:
[<c0103c1d>] dump_stack+0x19/0x20
[<c0133c7f>] smp_pause+0x1f/0x54
[<c010ee27>] smp_call_function_interrupt+0x3b/0x60
[<c01037d4>] call_function_interrupt+0x1c/0x24
[<c0101111>] cpu_idle+0x55/0x64
[<c05929ed>] start_secondary+0x71/0x78
[<00000000>] 0x0
[<cffa5fbc>] 0xcffa5fbc
ok
double fault, gdt at c1203260 [255 bytes]
NMI Watchdog detected LOCKUP on CPU1, eip c0133c96, registers:
Modules linked in: video thermal processor pcc_acpi fan button battery ac
CPU: 1
EIP: 0060:[<c0133c96>] Not tainted VLI
EFLAGS: 00000002 (2.6.11-rc5)
EIP is at smp_pause+0x36/0x54
eax: 00000001 ebx: cffa5f20 ecx: fffbe4e6 edx: cffa5f20
esi: cffa4000 edi: 00000080 ebp: cffa5f58 esp: cffa5f1c
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=cffa4000 task=c18ac540)
Stack: 00000000 0000007b 00680000 80050033 00000000 005d3000 000006f0 00ff0001
c120b260 07ff5f4c c0577000 00000088 cffa0080 c011eed4 cffa5f68 cffa5f68
c010ee27 00000000 00000001 cffa5fa4 c01037d4 00000001 c120b260 fffbe4e5
Call Trace:
[<c0103bf7>] show_stack+0x7b/0x88
[<c0103d36>] show_registers+0x112/0x188
[<c01046f1>] die_nmi+0x41/0x74
[<c010fcb4>] nmi_watchdog_tick+0x54/0xcc
[<c0104797>] default_do_nmi+0x73/0xfc
[<c0104865>] do_nmi+0x39/0x4c
[<c010395c>] nmi_stack_correct+0x1d/0x2a
[<c010ee27>] smp_call_function_interrupt+0x3b/0x60
[<c01037d4>] call_function_interrupt+0x1c/0x24
[<c0101111>] cpu_idle+0x55/0x64
[<c05929ed>] start_secondary+0x71/0x78
[<00000000>] 0x0
[<cffa5fbc>] 0xcffa5fbc
Code: e8 60 e0 24 00 68 0c 7a 40 c0 e8 c2 68 fe ff e8 85 ff fc ff 83 c4 08 f0 ff 05 4c 20 5e c0 a1 50 20 5e c0 89 da 85 c0 74 0b f3
console shuts up ...

2005-03-01 10:48:27

by Pavel Machek

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown

Hi!

> btw, suspend is a bit messy. The disk spins down. Then up. Then down
> again. And:

Yes, that's known; pm_message_t needs to become a struct to solve the
disk ping-pong properly.

> Debug: sleeping function called from invalid context at mm/slab.c:2082
> in_atomic():0, irqs_disabled():1
> [<c010318d>] dump_stack+0x19/0x20
> [<c0111731>] __might_sleep+0x91/0x9c
> [<c01365df>] kmem_cache_alloc+0x23/0x84
> [<c0232d50>] acpi_evaluate_integer+0x3c/0xac
> [<c024b3d9>] acpi_bus_get_status+0x39/0x94
> [<c024ca99>] acpi_pci_link_set+0x16d/0x1e8
> [<c024ce65>] acpi_pci_link_resume+0x1d/0x28
> [<c024ce8a>] irqrouter_resume+0x1a/0x38
> [<c0281e3c>] sysdev_resume+0x2c/0xae
> [<c0285ea8>] device_power_up+0x8/0x11
> [<c012a873>] swsusp_suspend+0x4b/0x58
> [<c012ac35>] pm_suspend_disk+0x35/0x74
> [<c01292ea>] enter_state+0x2e/0x70
> [<c0129336>] software_suspend+0xa/0x10
> [<c024a8a7>] acpi_system_write_sleep+0x73/0x98
> [<c0149f1b>] vfs_write+0xaf/0x118
> [<c014a028>] sys_write+0x3c/0x68
> [<c0102c05>] sysenter_past_esp+0x52/0x75

ACPI problem; patches are available (s/GFP_KERNEL/GFP_ATOMIC/), but Len
claims a better solution is ready... OTOH he has been claiming that for
half a year already, so we may push him a bit (added to cc).
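
As a rough illustration of why swapping the allocation flag silences the
warning quoted above, here is a userspace sketch, not kernel code. Every
name in it (mock_irqs_disabled, mock_warnings, MOCK_GFP_KERNEL,
MOCK_GFP_ATOMIC, mock_alloc) is made up for the example; it only models
the idea that an allocation allowed to sleep is reported when interrupts
are off, while an atomic one is not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for kernel flags and state; not the real API. */
enum mock_gfp { MOCK_GFP_KERNEL, MOCK_GFP_ATOMIC };

static bool mock_irqs_disabled;  /* models irqs_disabled() */
static int mock_warnings;        /* counts "sleeping function" reports */

/* Models the debug check: a call that may sleep is a bug with IRQs off. */
static void mock_might_sleep_if(bool may_sleep)
{
	if (may_sleep && mock_irqs_disabled)
		mock_warnings++;
}

/* Models an allocator in which only MOCK_GFP_KERNEL is allowed to sleep. */
static void *mock_alloc(size_t size, enum mock_gfp flags)
{
	assert(size > 0);
	mock_might_sleep_if(flags == MOCK_GFP_KERNEL);
	return malloc(size);
}
```

With mock_irqs_disabled set, a MOCK_GFP_KERNEL request bumps the warning
counter and a MOCK_GFP_ATOMIC request does not. The sketch says nothing
about whether the atomic allocation is the right long-term fix, which is
the open question in this message.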

> Powering off system
> Debug: sleeping function called from invalid context at include/linux/rwsem.h:66
> in_atomic():0, irqs_disabled():1
> [<c010318d>] dump_stack+0x19/0x20
> [<c0111731>] __might_sleep+0x91/0x9c
> [<c0285872>] device_shutdown+0x16/0x82
> [<c012aa97>] power_down+0x47/0x74
> [<c012ac5a>] pm_suspend_disk+0x5a/0x74
> [<c01292ea>] enter_state+0x2e/0x70
> [<c0129336>] software_suspend+0xa/0x10
> [<c024a8a7>] acpi_system_write_sleep+0x73/0x98
> [<c0149f1b>] vfs_write+0xaf/0x118
> [<c014a028>] sys_write+0x3c/0x68
> [<c0102c05>] sysenter_past_esp+0x52/0x75

I'll look at this one.
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-03-01 10:55:08

by Pavel Machek

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown

Hi!

> > In `subj` kernel, machine no longer powers down at the end of
> > swsusp. 2.6.11-rc5-pavel works ok, as does 2.6.11-bk.
>
> Binary searching indicates that this is due to
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc5/2.6.11-rc5-mm1/broken-out/acpi_power_off-bug-fix.patch.
>
> I'll drop it. That patch is pretty ugly-looking anyway (ACPI code in
> drivers/base/power/?).
>
> Perhaps someone who is hitting the problem which that patch addresses could
> raise a bugzilla entry.
>
> Oh. It has one. http://bugme.osdl.org/show_bug.cgi?id=4041
>
> Anyway. It needs more work.

Yes, the patch is very ugly. If something like this needs to be done,
then perhaps acpi should properly register into driver model and do
the work there. This will also mean code will be called consistently.

Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-03-01 10:57:01

by Pavel Machek

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown

Hi!

> Resume on SMP locks up.

Does it work with a UP kernel on the same hardware? The NMI watchdog is
a problem for suspend, because the various phases take a long time. Can
you disable it for testing?
Pavel

> Relocating pagedir |
> Reading image data (8157 pages): 100% 8157 done.
> Stopping tasks: ====|
> Freeing memory... done (0 pages freed)
> Freezing CPUs (at 1)...Sleeping in:
> [<c0103c1d>] dump_stack+0x19/0x20
> [<c0133c7f>] smp_pause+0x1f/0x54
> [<c010ee27>] smp_call_function_interrupt+0x3b/0x60
> [<c01037d4>] call_function_interrupt+0x1c/0x24
> [<c0101111>] cpu_idle+0x55/0x64
> [<c05929ed>] start_secondary+0x71/0x78
> [<00000000>] 0x0
> [<cffa5fbc>] 0xcffa5fbc
> ok
> double fault, gdt at c1203260 [255 bytes]
> NMI Watchdog detected LOCKUP on CPU1, eip c0133c96, registers:
> Modules linked in: video thermal processor pcc_acpi fan button battery ac
> CPU: 1
> EIP: 0060:[<c0133c96>] Not tainted VLI
> EFLAGS: 00000002 (2.6.11-rc5)
> EIP is at smp_pause+0x36/0x54
> eax: 00000001 ebx: cffa5f20 ecx: fffbe4e6 edx: cffa5f20
> esi: cffa4000 edi: 00000080 ebp: cffa5f58 esp: cffa5f1c
> ds: 007b es: 007b ss: 0068
> Process swapper (pid: 0, threadinfo=cffa4000 task=c18ac540)
> Stack: 00000000 0000007b 00680000 80050033 00000000 005d3000 000006f0 00ff0001
> c120b260 07ff5f4c c0577000 00000088 cffa0080 c011eed4 cffa5f68 cffa5f68
> c010ee27 00000000 00000001 cffa5fa4 c01037d4 00000001 c120b260 fffbe4e5
> Call Trace:
> [<c0103bf7>] show_stack+0x7b/0x88
> [<c0103d36>] show_registers+0x112/0x188
> [<c01046f1>] die_nmi+0x41/0x74
> [<c010fcb4>] nmi_watchdog_tick+0x54/0xcc
> [<c0104797>] default_do_nmi+0x73/0xfc
> [<c0104865>] do_nmi+0x39/0x4c
> [<c010395c>] nmi_stack_correct+0x1d/0x2a
> [<c010ee27>] smp_call_function_interrupt+0x3b/0x60
> [<c01037d4>] call_function_interrupt+0x1c/0x24
> [<c0101111>] cpu_idle+0x55/0x64
> [<c05929ed>] start_secondary+0x71/0x78
> [<00000000>] 0x0
> [<cffa5fbc>] 0xcffa5fbc
> Code: e8 60 e0 24 00 68 0c 7a 40 c0 e8 c2 68 fe ff e8 85 ff fc ff 83 c4 08 f0 ff 05 4c 20 5e c0 a1 50 20 5e c0 89 da 85 c0 74 0b f3
> console shuts up ...
>

--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-03-01 11:11:55

by Eric W. Biederman

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown

Andrew Morton writes:

> Pavel Machek wrote:
> >
> > In `subj` kernel, machine no longer powers down at the end of
> > swsusp. 2.6.11-rc5-pavel works ok, as does 2.6.11-bk.
>
> Binary searching indicates that this is due to
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc5/2.6.11-rc5-mm1/broken-out/acpi_power_off-bug-fix.patch.
>
>
> I'll drop it. That patch is pretty ugly-looking anyway (ACPI code in
> drivers/base/power/?).
>
> Perhaps someone who is hitting the problem which that patch addresses could
> raise a bugzilla entry.
>
> Oh. It has one. http://bugme.osdl.org/show_bug.cgi?id=4041
>
> Anyway. It needs more work.

Agreed.

I threw it together to test a specific code path, and the fact it
fails in software suspend is actually almost confirmation that I am on
the right track. This actually fixed the case I was testing.

In this case the failure is simply because system_state is
not set to SYSTEM_POWER_OFF before
kernel/power/disk.c:power_down() calls device_shutdown().
The appropriate reboot notifier is also not called.

So to fix this properly all of the places
that call machine_power_off now need to call a wrapper
that does all of the appropriate things and then calls
machine_power_off.

Likewise with the other reboot functions.

In addition, we need a clean way to get device_shutdown() to
call acpi_power_off_prepare() at roughly the location where
I have it hard coded.

The fundamental issue this patch was starting to address
before I ran out of steam is that acpi_power_off_prepare()
must be called with interrupts enabled, and after we have shut down
the system devices (i.e. the interrupt controllers) we can't
guarantee that interrupts are working.

I don't know how much earlier it is safe to call
acpi_power_off_prepare(). But mostly I think we need to
throw in a fake device to attach acpi_power_off_prepare() to.
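
The failure mode described in this message can be sketched in userspace
C. This is an illustration only: it assumes the hard-coded hook in
device_shutdown() is gated on system_state, which is what the
explanation implies but which the patch text in this thread does not
show. All mock_ names are hypothetical.

```c
#include <assert.h>

/* Hypothetical model of the kernel's system_state variable. */
enum mock_system_state { MOCK_SYSTEM_RUNNING, MOCK_SYSTEM_POWER_OFF };

static enum mock_system_state mock_system_state = MOCK_SYSTEM_RUNNING;
static int mock_prepare_calls;

/* Stand-in for acpi_power_off_prepare(); only valid on the power-off path. */
static void mock_acpi_power_off_prepare(void)
{
	assert(mock_system_state == MOCK_SYSTEM_POWER_OFF);
	mock_prepare_calls++;
}

/* Stand-in for device_shutdown() with the assumed gated hook. */
static void mock_device_shutdown(void)
{
	/* ordinary devices would be shut down here */
	if (mock_system_state == MOCK_SYSTEM_POWER_OFF)
		mock_acpi_power_off_prepare();
	/* system devices would be shut down here */
}

/* Models the swsusp path: system_state is never updated first. */
static void mock_swsusp_power_down(void)
{
	mock_device_shutdown();
}

/* Models the reboot path: system_state is set before the shutdown. */
static void mock_reboot_power_off(void)
{
	mock_system_state = MOCK_SYSTEM_POWER_OFF;
	mock_device_shutdown();
}
```

Calling mock_swsusp_power_down() leaves mock_prepare_calls at zero, while
mock_reboot_power_off() runs the hook once, which mirrors the difference
between the two paths reported in the thread.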

Eric

2005-03-01 11:16:12

by Eric W. Biederman

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown

Pavel Machek writes:

> Yes, the patch is very ugly. If something like this needs to be done,
> then perhaps acpi should properly register into driver model and do
> the work there. This will also mean code will be called consistently.

I totally agree. Do you have an example of how a non-device
can do this?

In particular something that gets as close to shutting down
the system devices as possible. But gets called before that.

Or perhaps acpi should simply be setup to be the first system device?

Eric

2005-03-01 12:03:13

by Pavel Machek

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown

Hi!

> > Yes, the patch is very ugly. If something like this needs to be done,
> > then perhaps acpi should properly register into driver model and do
> > the work there. This will also mean code will be called consistently.
>
> I totally agree. Do you have an example of how a non-device
> can do this?
>
> In particular something that gets as close to shutting down
> the system devices as possible. But gets called before that.
>
> Or perhaps acpi should simply be setup to be the first system device?

I believe that's the preferred solution.
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-03-01 12:09:20

by Pavel Machek

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown

Hi!

> > > In `subj` kernel, machine no longer powers down at the end of
> > > swsusp. 2.6.11-rc5-pavel works ok, as does 2.6.11-bk.
> >
> > Binary searching indicates that this is due to
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc5/2.6.11-rc5-mm1/broken-out/acpi_power_off-bug-fix.patch.
> >
> >
> > I'll drop it. That patch is pretty ugly-looking anyway (ACPI code in
> > drivers/base/power/?).
> >
> > Perhaps someone who is hitting the problem which that patch addresses could
> > raise a bugzilla entry.
> >
> > Oh. It has one. http://bugme.osdl.org/show_bug.cgi?id=4041
> >
> > Anyway. It needs more work.
>
> Agreed.
>
> I threw it together to test a specific code path, and the fact it
> fails in software suspend is actually almost confirmation that I am on
> the right track. This actually fixed the case I was testing.
>
> In this case the failure is simply because system_state is
> not set to SYSTEM_POWER_OFF before
> kernel/power/disk.c:power_down() calls device_shutdown().
> The appropriate reboot notifier is also not called.

Can you suggest a patch to do it right? Or perhaps there should be a
just_plain_power_machine_down() that does all the necessary
trickery?
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-03-01 13:11:49

by Pavel Machek

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown

Hi!

> btw, suspend is a bit messy. The disk spins down. Then up. Then down
> again. And:

Yes, this is going to be properly solved by switching pm_message_t to a
struct (preview patch attached; EVENT will become .event, this is just
for me). I could do some hack to make the disk not go up-down-up (and
will need to do it for SUSE 9.3 anyway), but I do not think that would
belong in mainline.

> Powering off system
> Debug: sleeping function called from invalid context at include/linux/rwsem.h:66
> in_atomic():0, irqs_disabled():1
> [<c010318d>] dump_stack+0x19/0x20
> [<c0111731>] __might_sleep+0x91/0x9c
> [<c0285872>] device_shutdown+0x16/0x82
> [<c012aa97>] power_down+0x47/0x74
> [<c012ac5a>] pm_suspend_disk+0x5a/0x74
> [<c01292ea>] enter_state+0x2e/0x70
> [<c0129336>] software_suspend+0xa/0x10
> [<c024a8a7>] acpi_system_write_sleep+0x73/0x98
> [<c0149f1b>] vfs_write+0xaf/0x118
> [<c014a028>] sys_write+0x3c/0x68
> [<c0102c05>] sysenter_past_esp+0x52/0x75
> Synchronizing SCSI cache for disk sda:
> Shutdown: hda
> acpi_power_off called

Hmm, device_shutdown is confused. Should it be called with interrupts
enabled or disabled? It uses an rwsem, which suggests interrupts enabled,
but I do not think sysdev_shutdown with interrupts enabled is a good
idea (and the comment suggests it should be called with interrupts disabled).

Pavel

/**
 * We handle system devices differently - we suspend and shut them
 * down last and resume them first. That way, we don't do anything stupid like
 * shutting down the interrupt controller before any devices..
 *
 * Note that there are not different stages for power management calls -
 * they only get one called once when interrupts are disabled.
 */

extern int sysdev_shutdown(void);

/**
 * device_shutdown - call ->shutdown() on each device to shutdown.
 */
void device_shutdown(void)
{
	struct device * dev;

	down_write(&devices_subsys.rwsem);
	list_for_each_entry_reverse(dev, &devices_subsys.kset.list, kobj.entry) {
		pr_debug("shutting down %s: ", dev->bus_id);
		if (dev->driver && dev->driver->shutdown) {
			pr_debug("Ok\n");
			dev->driver->shutdown(dev);
		} else
			pr_debug("Ignored.\n");
	}
	up_write(&devices_subsys.rwsem);

	sysdev_shutdown();
}



--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-03-01 13:16:39

by Pavel Machek

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown

Hi!

> btw, suspend is a bit messy. The disk spins down. Then up. Then down
> again. And:

Here's a preview patch to make the disk not do the stupid yo-yo. Please
do not apply (it will probably not apply cleanly anyway).

I can fix the disk going yo-yo without switching pm_message_t to a struct,
but will have to back out parts of that later. Do you want the patch?
Pavel

--- clean/drivers/base/power/resume.c 2004-12-25 13:34:59.000000000 +0100
+++ linux/drivers/base/power/resume.c 2005-02-28 15:38:51.000000000 +0100
@@ -41,7 +41,7 @@
list_add_tail(entry, &dpm_active);

up(&dpm_list_sem);
- if (!dev->power.prev_state)
+ if (!dev->power.prev_state EVENT)
resume_device(dev);
down(&dpm_list_sem);
put_device(dev);
--- clean/drivers/base/power/runtime.c 2005-01-12 11:07:39.000000000 +0100
+++ linux/drivers/base/power/runtime.c 2005-02-28 15:42:10.000000000 +0100
@@ -13,10 +13,10 @@
static void runtime_resume(struct device * dev)
{
dev_dbg(dev, "resuming\n");
- if (!dev->power.power_state)
+ if (!dev->power.power_state EVENT)
return;
if (!resume_device(dev))
- dev->power.power_state = 0;
+ dev->power.power_state = PMSG_ON;
}


@@ -49,10 +49,10 @@
int error = 0;

down(&dpm_sem);
- if (dev->power.power_state == state)
+ if (dev->power.power_state EVENT == state EVENT)
goto Done;

- if (dev->power.power_state)
+ if (dev->power.power_state EVENT)
runtime_resume(dev);

if (!(error = suspend_device(dev, state)))
--- clean/drivers/base/power/shutdown.c 2004-08-15 19:14:55.000000000 +0200
+++ linux/drivers/base/power/shutdown.c 2005-01-12 10:57:23.000000000 +0100
@@ -29,7 +29,8 @@
dev->driver->shutdown(dev);
return 0;
}
- return dpm_runtime_suspend(dev, dev->detach_state);
+ /* FIXME */
+ return dpm_runtime_suspend(dev, PMSG_FREEZE);
}


--- clean/drivers/base/power/suspend.c 2005-01-12 11:07:39.000000000 +0100
+++ linux/drivers/base/power/suspend.c 2005-02-28 21:30:13.000000000 +0100
@@ -43,7 +43,7 @@

dev->power.prev_state = dev->power.power_state;

- if (dev->bus && dev->bus->suspend && !dev->power.power_state)
+ if (dev->bus && dev->bus->suspend && (!dev->power.power_state EVENT))
error = dev->bus->suspend(dev, state);

return error;
@@ -134,6 +134,8 @@
Done:
return error;
Error:
+ printk(KERN_ERR "Could not power down device %s: "
+ "error %d\n", kobject_name(&dev->kobj), error);
dpm_power_up();
goto Done;
}
--- clean/drivers/base/power/sysfs.c 2004-08-15 19:14:55.000000000 +0200
+++ linux/drivers/base/power/sysfs.c 2005-02-28 15:43:57.000000000 +0100
@@ -26,19 +26,20 @@

static ssize_t state_show(struct device * dev, char * buf)
{
- return sprintf(buf, "%u\n", dev->power.power_state);
+ return sprintf(buf, "%u\n", dev->power.power_state EVENT);
}

static ssize_t state_store(struct device * dev, const char * buf, size_t n)
{
- u32 state;
+ pm_message_t state;
char * rest;
int error = 0;

- state = simple_strtoul(buf, &rest, 10);
+ state EVENT = simple_strtoul(buf, &rest, 10);
+// state.flags = PFL_RUNTIME;
if (*rest)
return -EINVAL;
- if (state)
+ if (state EVENT)
error = dpm_runtime_suspend(dev, state);
else
dpm_runtime_resume(dev);
--- clean/drivers/ide/ide-disk.c 2005-02-14 14:12:21.000000000 +0100
+++ linux/drivers/ide/ide-disk.c 2005-02-14 22:34:43.000000000 +0100
@@ -872,7 +872,7 @@
{
switch (rq->pm->pm_step) {
case idedisk_pm_flush_cache: /* Suspend step 1 (flush cache) complete */
- if (rq->pm->pm_state == 4)
+ if (rq->pm->pm_state == EVENT_FREEZE)
rq->pm->pm_step = ide_pm_state_completed;
else
rq->pm->pm_step = idedisk_pm_standby;
@@ -1155,8 +1155,7 @@
return;
}

- printk("Shutdown: %s\n", drive->name);
- dev->bus->suspend(dev, PM_SUSPEND_STANDBY);
+ dev->bus->suspend(dev, PMSG_SUSPEND);
}

/*
--- clean/drivers/ide/ide.c 2005-02-28 00:50:42.000000000 +0100
+++ linux/drivers/ide/ide.c 2005-02-28 15:48:21.000000000 +0100
@@ -1398,7 +1398,7 @@
rq.special = &args;
rq.pm = &rqpm;
rqpm.pm_step = ide_pm_state_start_suspend;
- rqpm.pm_state = state;
+ rqpm.pm_state = state EVENT;

return ide_do_drive_cmd(drive, &rq, ide_wait);
}
@@ -1417,7 +1417,7 @@
rq.special = &args;
rq.pm = &rqpm;
rqpm.pm_step = ide_pm_state_start_resume;
- rqpm.pm_state = 0;
+ rqpm.pm_state = EVENT_ON;

return ide_do_drive_cmd(drive, &rq, ide_head_wait);
}
--- clean/drivers/pci/pci.c 2005-02-28 00:50:43.000000000 +0100
+++ linux/drivers/pci/pci.c 2005-02-28 15:54:24.000000000 +0100
@@ -312,22 +312,27 @@
/**
* pci_choose_state - Choose the power state of a PCI device
* @dev: PCI device to be suspended
- * @state: target sleep state for the whole system
+ * @state: target sleep state for the whole system. This is the value
+ * that is passed to suspend() function.
*
* Returns PCI power state suitable for given device and given system
* message.
*/

-pci_power_t pci_choose_state(struct pci_dev *dev, u32 state)
+pci_power_t pci_choose_state(struct pci_dev *dev, pm_message_t state)
{
if (!pci_find_capability(dev, PCI_CAP_ID_PM))
return PCI_D0;

- switch (state) {
- case 0: return PCI_D0;
- case 2: return PCI_D2;
- case 3: return PCI_D3hot;
- default: BUG();
+ switch (state EVENT) {
+ case EVENT_ON:
+ case EVENT_FREEZE:
+ return PCI_D0;
+ case EVENT_SUSPEND:
+ return PCI_D3hot;
+ default:
+ printk("They asked me for state %d\n", state EVENT);
+ BUG();
}
return PCI_D0;
}
--- clean/drivers/usb/core/usb.c 2005-01-22 21:24:52.000000000 +0100
+++ linux/drivers/usb/core/usb.c 2005-02-28 16:01:01.000000000 +0100
@@ -1364,7 +1364,7 @@
driver = to_usb_driver(dev->driver);

/* there's only one USB suspend state */
- if (intf->dev.power.power_state)
+ if (intf->dev.power.power_state EVENT)
return 0;

if (driver->suspend)
--- clean/drivers/usb/host/ehci-dbg.c 2005-01-12 11:07:40.000000000 +0100
+++ linux/drivers/usb/host/ehci-dbg.c 2005-02-14 22:35:42.000000000 +0100
@@ -641,7 +641,7 @@

spin_lock_irqsave (&ehci->lock, flags);

- if (bus->controller->power.power_state) {
+ if (bus->controller->power.power_state.event) {
size = scnprintf (next, size,
"bus %s, device %s (driver " DRIVER_VERSION ")\n"
"SUSPENDED (no register access)\n",
--- clean/drivers/usb/host/ohci-dbg.c 2005-01-12 11:07:40.000000000 +0100
+++ linux/drivers/usb/host/ohci-dbg.c 2005-02-14 22:35:42.000000000 +0100
@@ -625,7 +625,7 @@
hcd->self.controller->bus_id,
hcd_name);

- if (bus->controller->power.power_state) {
+ if (bus->controller->power.power_state.event) {
size -= scnprintf (next, size,
"SUSPENDED (no register access)\n");
goto done;
--- clean/drivers/video/aty/atyfb_base.c 2005-02-28 00:50:43.000000000 +0100
+++ linux/drivers/video/aty/atyfb_base.c 2005-02-28 00:50:54.000000000 +0100
@@ -2070,12 +2070,12 @@
struct fb_info *info = pci_get_drvdata(pdev);
struct atyfb_par *par = (struct atyfb_par *) info->par;

- if (pdev->dev.power.power_state == 0)
+ if (pdev->dev.power.power_state.event == EVENT_ON)
return 0;

acquire_console_sem();

- if (pdev->dev.power.power_state == 2)
+ if (pdev->dev.power.power_state.event == 2)
aty_power_mgmt(0, par);
par->asleep = 0;

@@ -2091,7 +2091,7 @@

release_console_sem();

- pdev->dev.power.power_state = 0;
+ pdev->dev.power.power_state = PMSG_ON;

return 0;
}
--- clean/drivers/video/aty/radeon_pm.c 2005-02-28 00:50:43.000000000 +0100
+++ linux/drivers/video/aty/radeon_pm.c 2005-02-28 16:06:12.000000000 +0100
@@ -2501,31 +2501,25 @@
}


-static/*extern*/ int susdisking = 0;
-
-int radeonfb_pci_suspend(struct pci_dev *pdev, u32 state)
+int radeonfb_pci_suspend(struct pci_dev *pdev, pm_message_t state)
{
struct fb_info *info = pci_get_drvdata(pdev);
struct radeonfb_info *rinfo = info->par;
int i;

- if (state == pdev->dev.power.power_state)
+ if (state EVENT == pdev->dev.power.power_state EVENT)
return 0;

printk(KERN_DEBUG "radeonfb (%s): suspending to state: %d...\n",
- pci_name(pdev), state);
+ pci_name(pdev), state EVENT);

/* For suspend-to-disk, we cheat here. We don't suspend anything and
* let fbcon continue drawing until we are all set. That shouldn't
* really cause any problem at this point, provided that the wakeup
* code knows that any state in memory may not match the HW
*/
- if (state != PM_SUSPEND_MEM)
- goto done;
- if (susdisking) {
- printk("suspending to disk but state = %d\n", state);
+ if (state EVENT == EVENT_FREEZE)
goto done;
- }

acquire_console_sem();

@@ -2596,7 +2590,7 @@
struct radeonfb_info *rinfo = info->par;
int rc = 0;

- if (pdev->dev.power.power_state == 0)
+ if (pdev->dev.power.power_state EVENT == EVENT_ON)
return 0;

if (rinfo->no_schedule) {
@@ -2606,7 +2600,7 @@
acquire_console_sem();

printk(KERN_DEBUG "radeonfb (%s): resuming from state: %d...\n",
- pci_name(pdev), pdev->dev.power.power_state);
+ pci_name(pdev), pdev->dev.power.power_state EVENT);


if (pci_enable_device(pdev)) {
@@ -2617,7 +2611,7 @@
}
pci_set_master(pdev);

- if (pdev->dev.power.power_state == PM_SUSPEND_MEM) {
+ if (pdev->dev.power.power_state EVENT == EVENT_SUSPEND) {
/* Wakeup chip. Check from config space if we were powered off
* (todo: additionally, check CLK_PIN_CNTL too)
*/
@@ -2663,7 +2657,7 @@
else if (rinfo->dynclk == 0)
radeon_pm_disable_dynamic_mode(rinfo);

- pdev->dev.power.power_state = 0;
+ pdev->dev.power.power_state = PMSG_ON;

bail:
release_console_sem();
--- clean/drivers/video/i810/i810_main.c 2005-01-22 21:24:52.000000000 +0100
+++ linux/drivers/video/i810/i810_main.c 2005-01-30 23:53:29.000000000 +0100
@@ -1492,18 +1492,18 @@
/***********************************************************************
* Power Management *
***********************************************************************/
-static int i810fb_suspend(struct pci_dev *dev, u32 state)
+static int i810fb_suspend(struct pci_dev *dev, pm_message_t state)
{
struct fb_info *info = pci_get_drvdata(dev);
struct i810fb_par *par = (struct i810fb_par *) info->par;
int blank = 0, prev_state = par->cur_state;

- if (state == prev_state)
+ if (state.event == prev_state)
return 0;

- par->cur_state = state;
+ par->cur_state = state.event;

- switch (state) {
+ switch (state.event) {
case 1:
blank = VESA_VSYNC_SUSPEND;
break;
--- clean/include/linux/pm.h 2005-01-12 11:07:40.000000000 +0100
+++ linux/include/linux/pm.h 2005-02-28 18:08:20.000000000 +0100
@@ -195,7 +195,11 @@

struct device;

-typedef u32 __bitwise pm_message_t;
+#if 1
+typedef struct pm_message {
+ int event;
+ int flags;
+} pm_message_t;

/*
* There are 4 important states driver can be in:
@@ -215,9 +219,32 @@
* or something similar soon.
*/

-#define PMSG_FREEZE ((__force pm_message_t) 3)
-#define PMSG_SUSPEND ((__force pm_message_t) 3)
-#define PMSG_ON ((__force pm_message_t) 0)
+#define EVENT_ON 0
+#define EVENT_FREEZE 1
+#define EVENT_SUSPEND 2
+
+#define PFL_RUNTIME 1
+
+#define PMSG_FREEZE ({struct pm_message m; m.event = EVENT_FREEZE; m.flags = 0; m; })
+#define PMSG_SUSPEND ({struct pm_message m; m.event = EVENT_SUSPEND; m.flags = 0; m; })
+#define PMSG_ON ({struct pm_message m; m.event = EVENT_ON; m.flags = 0; m; })
+#define EVENT .event
+#else
+
+typedef u32 pm_message_t;
+
+#define EVENT_ON 0
+#define EVENT_FREEZE 2
+#define EVENT_SUSPEND 3
+
+#define PFL_RUNTIME 1
+
+#define PMSG_FREEZE EVENT_FREEZE
+#define PMSG_SUSPEND EVENT_SUSPEND
+#define PMSG_ON EVENT_ON
+#define EVENT
+#endif
+

struct dev_pm_info {
pm_message_t power_state;
--- clean/kernel/power/main.c 2005-02-03 22:27:26.000000000 +0100
+++ linux/kernel/power/main.c 2005-02-28 01:16:02.000000000 +0100
@@ -65,8 +65,10 @@
goto Thaw;
}

- if ((error = device_suspend(PMSG_SUSPEND)))
+ if ((error = device_suspend(PMSG_SUSPEND))) {
+ printk(KERN_ERR "Some devices failed to suspend\n");
goto Finish;
+ }
return 0;
Finish:
if (pm_ops->finish)
@@ -85,8 +87,10 @@

local_irq_save(flags);

- if ((error = device_power_down(PMSG_SUSPEND)))
+ if ((error = device_power_down(PMSG_SUSPEND))) {
+ printk(KERN_ERR "Some devices failed to power down\n");
goto Done;
+ }
error = pm_ops->enter(state);
device_power_up();
Done:
--- clean/kernel/sys.c 2005-01-12 11:07:40.000000000 +0100
+++ linux/kernel/sys.c 2005-01-12 11:12:10.000000000 +0100
@@ -402,6 +402,7 @@
case LINUX_REBOOT_CMD_HALT:
notifier_call_chain(&reboot_notifier_list, SYS_HALT, NULL);
system_state = SYSTEM_HALT;
+ device_suspend(PMSG_SUSPEND);
device_shutdown();
printk(KERN_EMERG "System halted.\n");
machine_halt();
@@ -412,6 +413,7 @@
case LINUX_REBOOT_CMD_POWER_OFF:
notifier_call_chain(&reboot_notifier_list, SYS_POWER_OFF, NULL);
system_state = SYSTEM_POWER_OFF;
+ device_suspend(PMSG_SUSPEND);
device_shutdown();
printk(KERN_EMERG "Power down.\n");
machine_power_off();
@@ -428,6 +430,7 @@

notifier_call_chain(&reboot_notifier_list, SYS_RESTART, buffer);
system_state = SYSTEM_RESTART;
+ device_suspend(PMSG_FREEZE);
device_shutdown();
printk(KERN_EMERG "Restarting system with command '%s'.\n", buffer);
machine_restart(buffer);
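
For readers skimming the diff, the core idea can be condensed into a
small userspace sketch. In the header being replaced, PMSG_FREEZE and
PMSG_SUSPEND were both defined as 3, so a driver could not tell them
apart; with distinct event codes it can. The event values and the struct
layout below follow the struct branch of the preview patch. Everything
prefixed mock_ is a hypothetical stand-in (the patch builds messages
with PMSG_* macros, and the real pci_choose_state() also checks for the
PM capability first).

```c
#include <assert.h>

/* Event codes as in the struct branch of the preview patch. */
#define EVENT_ON      0
#define EVENT_FREEZE  1
#define EVENT_SUSPEND 2

typedef struct pm_message {
	int event;
	int flags;
} pm_message_t;

/* Hypothetical helper; the patch uses PMSG_* macros for this instead. */
static pm_message_t mock_pmsg(int event)
{
	pm_message_t m = { .event = event, .flags = 0 };
	return m;
}

/* Hypothetical stand-ins for PCI_D0 and PCI_D3hot. */
enum mock_pci_power { MOCK_PCI_D0, MOCK_PCI_D3HOT };

/* Follows the switch in the patched pci_choose_state(). */
static enum mock_pci_power mock_choose_state(pm_message_t state)
{
	switch (state.event) {
	case EVENT_ON:
	case EVENT_FREEZE:
		return MOCK_PCI_D0;
	case EVENT_SUSPEND:
		return MOCK_PCI_D3HOT;
	default:
		assert(0 && "unknown event");  /* the patch calls BUG() here */
		return MOCK_PCI_D0;
	}
}

/* Mirrors the ide-disk hunk: on freeze, the standby step is skipped. */
static int mock_disk_skips_standby(pm_message_t state)
{
	return state.event == EVENT_FREEZE;
}
```

A freeze message maps to D0 and skips the disk standby step, while a
suspend message maps to D3hot and does not, which is the distinction the
patch relies on to stop the disk from spinning down before the image is
written.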


--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-03-01 17:38:08

by Eric W. Biederman

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown

Pavel Machek writes:

> > I threw it together to test a specific code path, and the fact it
> > fails in software suspend is actually almost confirmation that I am on
> > the right track. This actually fixed the case I was testing.
> >
> > In this case the failure is simply because system_state is
> > not set to SYSTEM_POWER_OFF before
> > kernel/power/disk.c:power_down() calls device_shutdown().
> > The appropriate reboot notifier is also not called.
>
> Can you suggest a patch to do it right? Or perhaps there should be a
> just_plain_power_machine_down() that does all the necessary
> trickery?

I would call it kernel_power_down() and that
is what I am suggesting is the right fix.

We have it open coded in kernel/sys.c:sys_reboot()
in the switch case for: LINUX_REBOOT_CMD_POWER_OFF

So after the code gets factored out from there all
of the cases that call machine_power_off() and pm_power_off()
directly need to be updated.

There are similar cases for machine_restart() and machine_halt().
But the power off case seems to be the most acute.
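
A rough userspace sketch of what such a helper might look like, mirroring the LINUX_REBOOT_CMD_POWER_OFF case quoted in the patch. Everything here is a mock: the stubs only record the order of calls, notify_power_off() stands in for the notifier_call_chain() call, and swsusp_power_down_as_described() is a hypothetical name that models the failing path as described above, on the assumption that it goes on to call machine_power_off().

```c
#include <assert.h>
#include <string.h>

/* Userspace mock: none of these are the real kernel definitions. */
enum system_states { SYSTEM_RUNNING, SYSTEM_POWER_OFF };
static enum system_states system_state = SYSTEM_RUNNING;

static char trace[128];	/* records the order in which the stubs run */

static void record(const char *step)
{
	/* Bounded append so repeated calls cannot overflow the buffer. */
	strncat(trace, step, sizeof(trace) - strlen(trace) - 1);
}

/* Stub for notifier_call_chain(&reboot_notifier_list, SYS_POWER_OFF, NULL). */
static void notify_power_off(void)
{
	record("notify;");
}

/* Stub: reports which system_state the drivers would observe. */
static void device_shutdown(void)
{
	record(system_state == SYSTEM_POWER_OFF ? "shutdown(off);"
						: "shutdown(running);");
}

static void machine_power_off(void)
{
	record("power_off;");
}

/* Hypothetical helper: the sequence open coded in sys_reboot(). */
static void kernel_power_down(void)
{
	notify_power_off();
	system_state = SYSTEM_POWER_OFF;
	device_shutdown();
	machine_power_off();
}

/* The swsusp path as described: no notifier, system_state left untouched. */
static void swsusp_power_down_as_described(void)
{
	device_shutdown();
	machine_power_off();
}
```

With a single helper, callers that currently invoke machine_power_off() directly would get the notifier and the state change without repeating the sequence.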

My biggest problem with this is that I get into the recursive code
cleanup problem, where I fix one piece and a bug is exposed somewhere
else, which then requires investigation and fixing.

Fixing the callers of machine_power_off() is about the fifth bug
fix down the chain triggered by disabling UP interrupts in
device_shutdown() (SMP interrupts have always been disabled). The
first bug fix was to create system devices in the device tree.

I haven't a clue where fixing this one will lead. Recursive
code fixes are a hard thing to schedule :(

Eric

2005-03-01 20:39:07

by Andrew Morton

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown

Pavel Machek wrote:
>
> Hi!
>
> > Resume on SMP locks up.
>
> Does it work on a UP kernel on the same hardware?

yup.

> The NMI watchdog is a problem
> for suspend, since the various phases take a long time. Can you disable it
> for testing?

Will try to remember to do that.

> > Relocating pagedir |
> > Reading image data (8157 pages): 100% 8157 done.
> > Stopping tasks: ====|
> > Freeing memory... done (0 pages freed)
> > Freezing CPUs (at 1)...Sleeping in:
> > [<c0103c1d>] dump_stack+0x19/0x20
> > [<c0133c7f>] smp_pause+0x1f/0x54
> > [<c010ee27>] smp_call_function_interrupt+0x3b/0x60
> > [<c01037d4>] call_function_interrupt+0x1c/0x24
> > [<c0101111>] cpu_idle+0x55/0x64
> > [<c05929ed>] start_secondary+0x71/0x78
> > [<00000000>] 0x0
> > [<cffa5fbc>] 0xcffa5fbc
> > ok
> > double fault, gdt at c1203260 [255 bytes]
> > NMI Watchdog detected LOCKUP on CPU1, eip c0133c96, registers:

Note the double fault.

2005-03-01 20:42:05

by Andrew Morton

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown

Pavel Machek wrote:
>
> I can fix the disk going yo-yo without switching pm_message_t to a struct,
> but will have to back out parts of that later. Do you want a patch?

No thanks, I was just pointing it out. It sounds like you have it under
control.

2005-03-01 23:40:59

by Pavel Machek

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown

Hi!

> > > Relocating pagedir |
> > > Reading image data (8157 pages): 100% 8157 done.
> > > Stopping tasks: ====|
> > > Freeing memory... done (0 pages freed)
> > > Freezing CPUs (at 1)...Sleeping in:
> > > [<c0103c1d>] dump_stack+0x19/0x20
> > > [<c0133c7f>] smp_pause+0x1f/0x54
> > > [<c010ee27>] smp_call_function_interrupt+0x3b/0x60
> > > [<c01037d4>] call_function_interrupt+0x1c/0x24
> > > [<c0101111>] cpu_idle+0x55/0x64
> > > [<c05929ed>] start_secondary+0x71/0x78
> > > [<00000000>] 0x0
> > > [<cffa5fbc>] 0xcffa5fbc
> > > ok
> > > double fault, gdt at c1203260 [255 bytes]
> > > NMI Watchdog detected LOCKUP on CPU1, eip c0133c96, registers:
>
> Note the double fault.

Yes, I can see it, and it scares me. SMP swsusp is not in a good state
because I do not have easy access to SMP or HT hardware. I guess I'll
just have to get into SUSE at night and steal some P4 ;-).

Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-03-02 23:03:52

by Jindrich Makovicka

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown

Pavel Machek wrote:
> Hi!
>
> In `subj` kernel, machine no longer powers down at the end of
> swsusp. 2.6.11-rc5-pavel works ok, as does 2.6.11-bk.

For me, power down stopped working since the introduction of softlockup
detection. After disabling CONFIG_DETECT_SOFTLOCKUP, powerdown works fine.

--
Jindrich Makovicka

2005-03-03 00:35:55

by Andrew Morton

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown


(Please do reply-to-all)

Jindrich Makovicka wrote:
>
> Pavel Machek wrote:
> > Hi!
> >
> > In `subj` kernel, machine no longer powers down at the end of
> > swsusp. 2.6.11-rc5-pavel works ok, as does 2.6.11-bk.
>
> For me, power down stopped working since the introduction of softlockup
> detection. After disabling CONFIG_DETECT_SOFTLOCKUP, powerdown works fine.

Could you send the output which CONFIG_DETECT_SOFTLOCKUP generates?

I had one CONFIG_DETECT_SOFTLOCKUP failure with suspend, on SMP. The
machine was stuck somewhere under mce_work_fn(). Perhaps in the
smp_call_function(). It only happened the once.

2005-03-03 12:46:47

by Jindrich Makovicka

Subject: Re: 2.6.11-rc4-mm1: something is wrong with swsusp powerdown

Andrew Morton wrote:
> (Please do reply-to-all)
>
> Jindrich Makovicka wrote:
>
>>Pavel Machek wrote:
>>
>>>Hi!
>>>
>>>In `subj` kernel, machine no longer powers down at the end of
>>>swsusp. 2.6.11-rc5-pavel works ok, as does 2.6.11-bk.
>>
>>For me, power down stopped working since the introduction of softlockup
>>detection. After disabling CONFIG_DETECT_SOFTLOCKUP, powerdown works fine.
>
>
> Could you send the output which CONFIG_DETECT_SOFTLOCKUP generates?
>
> I had one CONFIG_DETECT_SOFTLOCKUP failure with suspend, on SMP. The
> machine was stuck somewhere under mce_work_fn(). Perhaps in the
> smp_call_function(). It only happened the once.

Strangely enough, softlockup produces no additional output. The kernel just
prints "acpi_power_off called" and freezes. Without softlockup detection
compiled in, it turns off normally.

At first I was under the impression that this was caused by the
acpi_power_off-bug-fix.patch mentioned above, but unfortunately removing
it didn't actually solve the problem. Later I found I had missed that
softlockup detection had sneaked in, turned on by default, and disabling it
made power off work again.

Power down via APM produced some softlockup output, but I am not sure
whether APM ever worked on my machine before. I only tried APM to see
whether it works when ACPI doesn't, and didn't bother taking a snapshot. I can
recompile an APM kernel with softlockup enabled and disabled and test it, if
that would help.

--
Jindrich Makovicka