2010-02-03 09:25:32

by Corey Wright

[permalink] [raw]
Subject: regression in 2.6.27.45 with usb and suspend-to-disk

2.6.27.45 will only suspend-to-disk once without rebooting. every attempt
to suspend-to-disk after the first attempt results in "pm_op():
usb_dev_freeze +0x0/0xa returns -16" and "PM: Device usb1 failed to freeze:
error -16". reverting commit 4fbebe26d7a5eed6c1900bf2142b68af7df38462
(http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.27.y.git;a=commit;h=4fbebe26d7a5eed6c1900bf2142b68af7df38462)
fixes the problem.

this problem exhibits itself on both a x86-64 desktop and i386 netbook
(though the reverted commit was only tested on the x86-64 desktop). the
affected computers are different architectures, different distributions
(debian lenny vs ubuntu hardy), and different kernel configs (but unchanged
since 2.6.27.23), so i am fairly certain this is a regression in 2.6.27.45
and not the result of an environment change.

i normally use the pm-hibernate command to suspend to disk, but to isolate
the problem i used the kernel interfaces directly (as demonstrated in the
transcript below).

i can provide any other information necessary (eg .config, dmesg, lsusb)
and i am willing to test patches.

transcript of problem:

root@computer:~# uname -a
Linux computer 2.6.27.45+1-k8 #1 SMP Wed Feb 3 07:06:38 UTC 2010 x86_64
GNU/Linux
root@computer:~# cat /sys/power/disk
[platform] test testproc shutdown reboot
root@computer:~# cat /sys/power/state
standby mem disk
root@computer:~# echo disk >/sys/power/state
<suspends to disk, powers off, and is manually powered back on>
root@computer:~# echo disk >/sys/power/state
bash: echo: write error: Device or resource busy
root@computer:~# tail /var/log/kern.log | grep -i pm
Feb 3 01:45:25 computer kernel: [ 263.666343] PM: Syncing filesystems ...
done.
Feb 3 01:45:25 computer kernel: [ 263.887504] PM: Shrinking memory...
done (4347 pages freed)
Feb 3 01:45:25 computer kernel: [ 268.023515] PM: Freed 17388 kbytes in
4.11 seconds (4.23 MB/s)
Feb 3 01:45:25 computer kernel: [ 268.061753] pm_op(): usb_dev_freeze
+0x0/0xa returns -16
Feb 3 01:45:25 computer kernel: [ 268.061769] PM: Device usb1 failed to
freeze: error -16
<revert commit 4fbebe26d7a5eed6c1900bf2142b68af7df38462, rebuild kernel,
install, and reboot>
root@computer:~# uname -a
Linux computer 2.6.27.45+2-k8-revert-as1321 #1 SMP Wed Feb 3 07:56:46 UTC
2010 x86_64 GNU/Linux
root@computer:~# cat /sys/power/disk
[platform] test testproc shutdown reboot
root@computer:~# cat /sys/power/state
standby mem disk
root@computer:~# echo disk >/sys/power/state
<suspends to disk, powers off, and is manually powered back on>
root@computer:~# echo disk >/sys/power/state
<suspends to disk, powers off, and is manually powered back on>

thanks for maintaining and contributing to the 2.6.27 stable kernel series!

corey
--
[email protected]


2010-02-03 15:05:29

by Alan Stern

[permalink] [raw]
Subject: Re: regression in 2.6.27.45 with usb and suspend-to-disk

On Wed, 3 Feb 2010, Corey Wright wrote:

> 2.6.27.45 will only suspend-to-disk once without rebooting. every attempt
> to suspend-to-disk after the first attempt results in "pm_op():
> usb_dev_freeze +0x0/0xa returns -16" and "PM: Device usb1 failed to freeze:
> error -16". reverting commit 4fbebe26d7a5eed6c1900bf2142b68af7df38462
> (http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.27.y.git;a=commit;h=4fbebe26d7a5eed6c1900bf2142b68af7df38462)
> fixes the problem.
>
> this problem exhibits itself on both a x86-64 desktop and i386 netbook
> (though the reverted commit was only tested on the x86-64 desktop). the
> affected computers are different architectures, different distributions
> (debian lenny vs ubuntu hardy), and different kernel configs (but unchanged
> since 2.6.27.23), so i am fairly certain this is a regression in 2.6.27.45
> and not the result of an environment change.
>
> i normally use the pm-hibernate command to suspend to disk, but to isolate
> the problem i used the kernel interfaces directly (as demonstrated in the
> transcript below).
>
> i can provide any other information necessary (eg .config, dmesg, lsusb)
> and i am willing to test patches.

Please build a kernel with CONFIG_USB_DEBUG enabled and post a dmesg
log showing the problem during the second hibernation attempt.

Alan Stern

2010-02-03 18:00:15

by Corey Wright

[permalink] [raw]
Subject: Re: regression in 2.6.27.45 with usb and suspend-to-disk

On Wed, 3 Feb 2010 10:05:27 -0500 (EST)
Alan Stern <[email protected]> wrote:

> On Wed, 3 Feb 2010, Corey Wright wrote:
>
> > 2.6.27.45 will only suspend-to-disk once without rebooting. every
> > attempt to suspend-to-disk after the first attempt results in "pm_op():
> > usb_dev_freeze +0x0/0xa returns -16" and "PM: Device usb1 failed to
> > freeze: error -16". reverting commit
> > 4fbebe26d7a5eed6c1900bf2142b68af7df38462
> > (http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.27.y.git;a=commit;h=4fbebe26d7a5eed6c1900bf2142b68af7df38462)
> > fixes the problem.
> >
> > this problem exhibits itself on both a x86-64 desktop and i386 netbook
> > (though the reverted commit was only tested on the x86-64 desktop). the
> > affected computers are different architectures, different distributions
> > (debian lenny vs ubuntu hardy), and different kernel configs (but
> > unchanged since 2.6.27.23), so i am fairly certain this is a regression
> > in 2.6.27.45 and not the result of an environment change.
> >
> > i normally use the pm-hibernate command to suspend to disk, but to
> > isolate the problem i used the kernel interfaces directly (as
> > demonstrated in the transcript below).
> >
> > i can provide any other information necessary (eg .config, dmesg, lsusb)
> > and i am willing to test patches.
>
> Please build a kernel with CONFIG_USB_DEBUG enabled and post a dmesg
> log showing the problem during the second hibernation attempt.

[ 675.443474] PM: Syncing filesystems ... done.
[ 675.576353] Freezing user space processes ... (elapsed 0.00 seconds)
done.
[ 675.606968] Freezing remaining freezable tasks ... (elapsed 0.00
seconds) done.
[ 675.629249] PM: Shrinking memory... -\|done (3129 pages freed)
[ 679.547522] PM: Freed 12516 kbytes in 3.90 seconds (3.20 MB/s)
[ 679.565065] Suspending console(s) (use no_console_suspend to debug)
[ 679.585425] uhci_hcd 0000:00:10.0: release dev 2 ep81-INT, period 8,
phase 4, 118 us
[ 679.585453] hub 2-0:1.0: hub_suspend
[ 679.585460] usb usb2: bus suspend
[ 679.585463] usb usb2: suspend_rh
[ 679.585534] hub 1-0:1.0: hub_suspend
[ 679.585538] usb usb1: bus suspend
[ 679.585542] ehci_hcd 0000:00:10.4: suspend root hub
[ 679.585547] ehci_hcd 0000:00:10.4: suspend failed because port 8 is
resuming
[ 679.585551] usb usb1: bus suspend fail, err -16
[ 679.585554] hub 1-0:1.0: hub_resume
[ 679.585585] hub 1-0:1.0: port 7: status 0503 change 0000
[ 679.585590] hub 1-0:1.0: port 8: status 0503 change 0000
[ 679.585595] pm_op(): usb_dev_freeze+0x0/0xa returns -16
[ 679.585607] PM: Device usb1 failed to freeze: error -16
[ 679.585696] usb usb2: usb resume
[ 679.585699] usb usb2: wakeup_rh
[ 679.624016] hub 2-0:1.0: hub_resume
[ 679.624026] hub 2-0:1.0: port 1: status 0303 change 0000
[ 679.624051] uhci_hcd 0000:00:10.0: reserve dev 2 ep81-INT, period 8,
phase 4, 118 us
[ 679.722862] Restarting tasks ... <7>hub 1-0:1.0: state 7 ports 8 chg
0000 evt 0000
[ 679.747221] hub 2-0:1.0: state 7 ports 2 chg 0000 evt 0000
[ 679.747251] done.

corey
--
[email protected]

2010-02-03 18:09:48

by Alan Stern

[permalink] [raw]
Subject: Re: regression in 2.6.27.45 with usb and suspend-to-disk

On Wed, 3 Feb 2010, Corey Wright wrote:

> > Please build a kernel with CONFIG_USB_DEBUG enabled and post a dmesg
> > log showing the problem during the second hibernation attempt.
>
> [ 675.443474] PM: Syncing filesystems ... done.
> [ 675.576353] Freezing user space processes ... (elapsed 0.00 seconds)
> done.
> [ 675.606968] Freezing remaining freezable tasks ... (elapsed 0.00
> seconds) done.
> [ 675.629249] PM: Shrinking memory... -\|done (3129 pages freed)
> [ 679.547522] PM: Freed 12516 kbytes in 3.90 seconds (3.20 MB/s)
> [ 679.565065] Suspending console(s) (use no_console_suspend to debug)
> [ 679.585425] uhci_hcd 0000:00:10.0: release dev 2 ep81-INT, period 8,
> phase 4, 118 us
> [ 679.585453] hub 2-0:1.0: hub_suspend
> [ 679.585460] usb usb2: bus suspend
> [ 679.585463] usb usb2: suspend_rh
> [ 679.585534] hub 1-0:1.0: hub_suspend
> [ 679.585538] usb usb1: bus suspend
> [ 679.585542] ehci_hcd 0000:00:10.4: suspend root hub
> [ 679.585547] ehci_hcd 0000:00:10.4: suspend failed because port 8 is
> resuming

There you have it. What is the USB device attached to port 8 on bus 1
(i.e., /sys/bus/usb/devices/1-8)?

And what happens if you do:

echo disabled >/sys/bus/usb/devices/1-8/power/wakeup

before trying to hibernate?

Alan Stern

2010-02-03 19:48:16

by Corey Wright

[permalink] [raw]
Subject: Re: regression in 2.6.27.45 with usb and suspend-to-disk

On Wed, 3 Feb 2010 13:09:45 -0500 (EST)
Alan Stern <[email protected]> wrote:

> On Wed, 3 Feb 2010, Corey Wright wrote:
>
> > > Please build a kernel with CONFIG_USB_DEBUG enabled and post a dmesg
> > > log showing the problem during the second hibernation attempt.
> >
> > [ 675.443474] PM: Syncing filesystems ... done.
> > [ 675.576353] Freezing user space processes ... (elapsed 0.00 seconds)
> > done.
> > [ 675.606968] Freezing remaining freezable tasks ... (elapsed 0.00
> > seconds) done.
> > [ 675.629249] PM: Shrinking memory... -\|done (3129 pages freed)
> > [ 679.547522] PM: Freed 12516 kbytes in 3.90 seconds (3.20 MB/s)
> > [ 679.565065] Suspending console(s) (use no_console_suspend to debug)
> > [ 679.585425] uhci_hcd 0000:00:10.0: release dev 2 ep81-INT, period 8,
> > phase 4, 118 us
> > [ 679.585453] hub 2-0:1.0: hub_suspend
> > [ 679.585460] usb usb2: bus suspend
> > [ 679.585463] usb usb2: suspend_rh
> > [ 679.585534] hub 1-0:1.0: hub_suspend
> > [ 679.585538] usb usb1: bus suspend
> > [ 679.585542] ehci_hcd 0000:00:10.4: suspend root hub
> > [ 679.585547] ehci_hcd 0000:00:10.4: suspend failed because port 8 is
> > resuming
>
> There you have it. What is the USB device attached to port 8 on bus 1
> (i.e., /sys/bus/usb/devices/1-8)?

the device is a usb SD reader w/ SD card in it.

# cat /sys/bus/usb/devices/1-8/product
USB 2.0 SD MMC READER

i removed the media reader after the failed suspend (it is only used at
boot-up to hold the LUKS key material to decrypt the filesystem) but the
message is the same:

[ 4002.585329] ehci_hcd 0000:00:10.4: suspend root hub
[ 4002.585334] ehci_hcd 0000:00:10.4: suspend failed because port 8 is
resuming
[ 4002.585338] usb usb1: bus suspend fail, err -16

and /sys/bus/usb/devices/1-8 no longer exists (little
alone /sys/bus/usb/devices/1-8/power/wakeup):

# ls -1 /sys/bus/usb/devices/
1-0:1.0
1-7
1-7:1.0
2-0:1.0
2-1
2-1:1.0
3-0:1.0
4-0:1.0
5-0:1.0
usb1
usb2
usb3
usb4
usb5

> And what happens if you do:
>
> echo disabled >/sys/bus/usb/devices/1-8/power/wakeup
>
> before trying to hibernate?

it's the same immediately after a reboot (and before suspending), a
successful suspend, or a failed suspend:

# echo disabled >/sys/bus/usb/devices/1-8/power/wakeup
bash: echo: write error: Invalid argument

and as previously said, that sysfs entry does not exist if the media reader
is removed, though the error appears to continue to refer to that
port/device.

corey
--
[email protected]

2010-02-03 20:31:55

by Alan Stern

[permalink] [raw]
Subject: Re: regression in 2.6.27.45 with usb and suspend-to-disk

On Wed, 3 Feb 2010, Corey Wright wrote:

> the device is a usb SD reader w/ SD card in it.
>
> # cat /sys/bus/usb/devices/1-8/product
> USB 2.0 SD MMC READER
>
> i removed the media reader after the failed suspend (it is only used at
> boot-up to hold the LUKS key material to decrypt the filesystem) but the
> message is the same:
>
> [ 4002.585329] ehci_hcd 0000:00:10.4: suspend root hub
> [ 4002.585334] ehci_hcd 0000:00:10.4: suspend failed because port 8 is
> resuming
> [ 4002.585338] usb usb1: bus suspend fail, err -16
>
> and /sys/bus/usb/devices/1-8 no longer exists (little
> alone /sys/bus/usb/devices/1-8/power/wakeup):
>
> # ls -1 /sys/bus/usb/devices/
> 1-0:1.0
> 1-7
> 1-7:1.0
> 2-0:1.0
> 2-1
> 2-1:1.0
> 3-0:1.0
> 4-0:1.0
> 5-0:1.0
> usb1
> usb2
> usb3
> usb4
> usb5
>
> > And what happens if you do:
> >
> > echo disabled >/sys/bus/usb/devices/1-8/power/wakeup
> >
> > before trying to hibernate?
>
> it's the same immediately after a reboot (and before suspending), a
> successful suspend, or a failed suspend:
>
> # echo disabled >/sys/bus/usb/devices/1-8/power/wakeup
> bash: echo: write error: Invalid argument
>
> and as previously said, that sysfs entry does not exist if the media reader
> is removed, though the error appears to continue to refer to that
> port/device.

Okay, thanks for testing. That narrows it down, and I believe the
patch below will fix the bug. Let me know how it works.

Alan Stern


Index: 2.6.27.45/drivers/usb/host/ehci-hub.c
===================================================================
--- 2.6.27.45.orig/drivers/usb/host/ehci-hub.c
+++ 2.6.27.45/drivers/usb/host/ehci-hub.c
@@ -255,7 +255,6 @@ static int ehci_bus_resume (struct usb_h
temp &= ~(PORT_RWC_BITS | PORT_WAKE_BITS);
if (test_bit(i, &ehci->bus_suspended) &&
(temp & PORT_SUSPEND)) {
- ehci->reset_done [i] = jiffies + msecs_to_jiffies (20);
temp |= PORT_RESUME;
}
ehci_writel(ehci, temp, &ehci->regs->port_status [i]);

2010-02-04 07:47:12

by Corey Wright

[permalink] [raw]
Subject: Re: regression in 2.6.27.45 with usb and suspend-to-disk

On Wed, 3 Feb 2010 15:31:49 -0500 (EST)
Alan Stern <[email protected]> wrote:

> On Wed, 3 Feb 2010, Corey Wright wrote:
>
> > the device is a usb SD reader w/ SD card in it.
> >
> > # cat /sys/bus/usb/devices/1-8/product
> > USB 2.0 SD MMC READER
> >
> > i removed the media reader after the failed suspend (it is only used at
> > boot-up to hold the LUKS key material to decrypt the filesystem) but the
> > message is the same:
> >
> > [ 4002.585329] ehci_hcd 0000:00:10.4: suspend root hub
> > [ 4002.585334] ehci_hcd 0000:00:10.4: suspend failed because port 8 is
> > resuming
> > [ 4002.585338] usb usb1: bus suspend fail, err -16
> >
> > and /sys/bus/usb/devices/1-8 no longer exists (little
> > alone /sys/bus/usb/devices/1-8/power/wakeup):
> >
> > # ls -1 /sys/bus/usb/devices/
> > 1-0:1.0
> > 1-7
> > 1-7:1.0
> > 2-0:1.0
> > 2-1
> > 2-1:1.0
> > 3-0:1.0
> > 4-0:1.0
> > 5-0:1.0
> > usb1
> > usb2
> > usb3
> > usb4
> > usb5
> >
> > > And what happens if you do:
> > >
> > > echo disabled >/sys/bus/usb/devices/1-8/power/wakeup
> > >
> > > before trying to hibernate?
> >
> > it's the same immediately after a reboot (and before suspending), a
> > successful suspend, or a failed suspend:
> >
> > # echo disabled >/sys/bus/usb/devices/1-8/power/wakeup
> > bash: echo: write error: Invalid argument
> >
> > and as previously said, that sysfs entry does not exist if the media
> > reader is removed, though the error appears to continue to refer to that
> > port/device.
>
> Okay, thanks for testing. That narrows it down, and I believe the
> patch below will fix the bug. Let me know how it works.

after applying the patch the system suspended successfully three times in a
row (before i got tired of watching it resume, which takes minutes due to
bios hardware initialization).

i looked in dmesg after a suspend and all the usb debug messages looked
innocuous.

the patch appears to have fixed the problem. i need to rebuild the
netbook's kernel with the new patch and see if that fixes it, too. if it
doesn't fix the netboot suspending, then i'll follow-up.

thanks for your time debugging the problem and for the patch!

corey
--
[email protected]

2010-02-04 19:15:13

by Alan Stern

[permalink] [raw]
Subject: Re: regression in 2.6.27.45 with usb and suspend-to-disk

On Thu, 4 Feb 2010, Corey Wright wrote:

> after applying the patch the system suspended successfully three times in a
> row (before i got tired of watching it resume, which takes minutes due to
> bios hardware initialization).
>
> i looked in dmesg after a suspend and all the usb debug messages looked
> innocuous.
>
> the patch appears to have fixed the problem. i need to rebuild the
> netbook's kernel with the new patch and see if that fixes it, too. if it
> doesn't fix the netboot suspending, then i'll follow-up.
>
> thanks for your time debugging the problem and for the patch!

Below is a slightly improved version of the patch. It's what I intend
to submit for 2.6.27.stable. I'd appreciate your testing it to make
sure that it behaves properly on both machines.

Be sure to set CONFIG_USB_DEBUG in the netbook's kernel too. If the
reason for its failure turns out to be different, we'll need the extra
information.

Alan Stern


-----------------------------------------------------------------------

This patch fixes a bug caused by backporting commit
cec3a53c7fe794237b582e8e77fc0e48465e65ee (USB: EHCI & UHCI: fix race
between root-hub suspend and port resume) to 2.6.27.stable without
also backporting commit eafe5b99f2135488b21cf17a262c54997c44f784 (USB:
EHCI: fix remote-wakeup support for ARC/TDI core). This extracts the
necessary changes from the earlier patch and backports them.

The symptom of the bug is that the system will fail to suspend more
than once. The problem is caused by setting ehci->reset_done[i] but
never clearing it. When ehci_bus_suspend() sees a nonzero value
there, it assumes this means the port is in the middle of resuming so
it aborts the bus suspend.

Signed-off-by: Alan Stern <[email protected]>

---

Index: 2.6.27.45/drivers/usb/host/ehci-hub.c
===================================================================
--- 2.6.27.45.orig/drivers/usb/host/ehci-hub.c
+++ 2.6.27.45/drivers/usb/host/ehci-hub.c
@@ -254,10 +254,8 @@ static int ehci_bus_resume (struct usb_h
temp = ehci_readl(ehci, &ehci->regs->port_status [i]);
temp &= ~(PORT_RWC_BITS | PORT_WAKE_BITS);
if (test_bit(i, &ehci->bus_suspended) &&
- (temp & PORT_SUSPEND)) {
- ehci->reset_done [i] = jiffies + msecs_to_jiffies (20);
+ (temp & PORT_SUSPEND))
temp |= PORT_RESUME;
- }
ehci_writel(ehci, temp, &ehci->regs->port_status [i]);
}
i = HCS_N_PORTS (ehci->hcs_params);
@@ -752,6 +750,9 @@ static int ehci_hub_control (
ehci_readl(ehci, status_reg));
}

+ if (!(temp & (PORT_RESUME|PORT_RESET)))
+ ehci->reset_done[wIndex] = 0;
+
/* transfer dedicated ports to the companion hc */
if ((temp & PORT_CONNECT) &&
test_bit(wIndex, &ehci->companion_ports)) {

2010-02-05 02:52:40

by Corey Wright

[permalink] [raw]
Subject: Re: regression in 2.6.27.45 with usb and suspend-to-disk

On Thu, 4 Feb 2010 14:15:09 -0500 (EST)
Alan Stern <[email protected]> wrote:

> Below is a slightly improved version of the patch. It's what I intend
> to submit for 2.6.27.stable. I'd appreciate your testing it to make
> sure that it behaves properly on both machines.

rebuilt 2.6.27.45 with only your previously posted patch applied on both
the x86-64 desktop and i386 netbook and both computers suspended to disk
successfully three times in a row (when i stopped and declared success).

> Be sure to set CONFIG_USB_DEBUG in the netbook's kernel too. If the
> reason for its failure turns out to be different, we'll need the extra
> information.

i set CONFIG_USB_DEBUG but thankfully didn't need it.

again, thanks for your work!

corey
--
[email protected]