Hi!
I noticed empty suspend stopped working around 2.6.23-rc4, and it is
still present in 2.6.23-rc6. To reproduce
swapoff -a
echo disk > /sys/power/state
echo disk > /sys/power/state
Unsetting
CONFIG_IEEE1394=y
solves the problem.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Ben Collins wrote:
> On Tue, 2007-09-11 at 21:00 +0200, Pavel Machek wrote:
>> I noticed empty suspend stopped working around 2.6.23-rc4, and it is
>> still present in 2.6.23-rc6. To reproduce
>>
>> swapoff -a
>> echo disk > /sys/power/state
>> echo disk > /sys/power/state
>>
>> Unsetting
>>
>> CONFIG_IEEE1394=y
>>
>> solves the problem.
>
> Since this is easy to reproduce, can you bisect it down?
I will check in a minute what I pushed out after 2.6.23-rc1, may save
Pavel some time.
--
Stefan Richter
-=====-=-=== =--= -=-==
http://arcgraph.de/sr/
On Tue, 2007-09-11 at 21:00 +0200, Pavel Machek wrote:
> Hi!
>
> I noticed empty suspend stopped working around 2.6.23-rc4, and it is
> still present in 2.6.23-rc6. To reproduce
>
> swapoff -a
> echo disk > /sys/power/state
> echo disk > /sys/power/state
>
> Unsetting
>
> CONFIG_IEEE1394=y
>
> solves the problem.
Since this is easy to reproduce, can you bisect it down?
--
Ubuntu : http://www.ubuntu.com/
Linux1394: http://wiki.linux1394.org/
SwissDisk: http://www.swissdisk.com/
>> On Tue, 2007-09-11 at 21:00 +0200, Pavel Machek wrote:
>>> I noticed empty suspend stopped working around 2.6.23-rc4, and it is
>>> still present in 2.6.23-rc6.
...
>>> Unsetting
>>>
>>> CONFIG_IEEE1394=y
>>>
>>> solves the problem.
...
Between -rc3 and -rc4:
ieee1394: sbp2: fix sbp2_remove_device for error cases
a2ee3f9bbb0ce57102dad8928d54f59acdc4b8f7
should not occur in suspend path
Between -rc1 and -rc2:
ieee1394: sbp2: more correct Kconfig dependencies
e4f8cac5e07528f7e0bc21d3682c16c9de993ecb
unrelated
ieee1394: revert "sbp2: enforce 32bit DMA mapping"
a9c2f18800753c82c45fc13b27bdc148849bdbb2
unrelated
(not via linux1394-2.6.git)
raw1394 __user annotation
5b26e64ea39e45802c5736c8261bf8a8704d212f
unrelated
So it must be something older which was somehow uncovered.
Dmesg with the oops or/and bisection would be good.
--
Stefan Richter
-=====-=-=== =--= -=-==
http://arcgraph.de/sr/
Hi!
> >> On Tue, 2007-09-11 at 21:00 +0200, Pavel Machek wrote:
> >>> I noticed empty suspend stopped working around 2.6.23-rc4, and it is
> >>> still present in 2.6.23-rc6.
> ...
> >>> Unsetting
> >>>
> >>> CONFIG_IEEE1394=y
> >>>
> >>> solves the problem.
> ...
>
> Between -rc3 and -rc4:
>
> ieee1394: sbp2: fix sbp2_remove_device for error cases
> a2ee3f9bbb0ce57102dad8928d54f59acdc4b8f7
> should not occur in suspend path
>
> Between -rc1 and -rc2:
>
> ieee1394: sbp2: more correct Kconfig dependencies
> e4f8cac5e07528f7e0bc21d3682c16c9de993ecb
> unrelated
>
> ieee1394: revert "sbp2: enforce 32bit DMA mapping"
> a9c2f18800753c82c45fc13b27bdc148849bdbb2
> unrelated
>
> (not via linux1394-2.6.git)
> raw1394 __user annotation
> 5b26e64ea39e45802c5736c8261bf8a8704d212f
> unrelated
>
> So it must be something older which was somehow uncovered.
> Dmesg with the oops or/and bisection would be good.
Sorry, had to hand-copy. It is oops at virtual adddress 6b6b6b7b --
looks like slab poison to me?
EIP is in task_rq_lock, backtrace is
try_to_wake_up
highlevel_host_reset
ohci_irq_handler
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
On Tue 2007-09-11 21:45:58, Stefan Richter wrote:
> >> On Tue, 2007-09-11 at 21:00 +0200, Pavel Machek wrote:
> >>> I noticed empty suspend stopped working around 2.6.23-rc4, and it is
> >>> still present in 2.6.23-rc6.
> ...
> >>> Unsetting
> >>>
> >>> CONFIG_IEEE1394=y
> >>>
> >>> solves the problem.
> ...
>
> Between -rc3 and -rc4:
>
> ieee1394: sbp2: fix sbp2_remove_device for error cases
> a2ee3f9bbb0ce57102dad8928d54f59acdc4b8f7
> should not occur in suspend path
Plus I do not have firewire attached disk here, so sbp2 should not be
used, right?
> Between -rc1 and -rc2:
>
> ieee1394: sbp2: more correct Kconfig dependencies
> e4f8cac5e07528f7e0bc21d3682c16c9de993ecb
> unrelated
>
> ieee1394: revert "sbp2: enforce 32bit DMA mapping"
> a9c2f18800753c82c45fc13b27bdc148849bdbb2
> unrelated
>
> (not via linux1394-2.6.git)
> raw1394 __user annotation
> 5b26e64ea39e45802c5736c8261bf8a8704d212f
> unrelated
>
> So it must be something older which was somehow uncovered.
> Dmesg with the oops or/and bisection would be good.
The others look even more innocent. I wonder if this may have been
responsible?
commit 831441862956fffa17b9801db37e6ea1650b0f69
tree b0334921341f8f1734bdd3243de76d676329d21c
parent 787d2214c19bcc9b6ac48af0ce098277a801eded
author Rafael J. Wysocki <[email protected]> Tue, 17 Jul 2007 04:03:35 -0700
committer Linus Torvalds <[email protected]> Tue, 17
Jul 2007 10:23:02 -0700
Freezer: make kernel threads nonfreezable by default
Or this one?
commit 20c2df83d25c6a95affe6157a4c9cac4cf5ffaac
tree 415c4453d2b17a50abe7a3e515177e1fa337bd67
parent 64fb98fc40738ae1a98bcea9ca3145b89fb71524
author Paul Mundt <[email protected]> Fri, 20 Jul 2007 10:11:58
+0900
committer Paul Mundt <[email protected]> Fri, 20 Jul 2007 10:11:58
+0900
mm: Remove slab destructors from kmem_cache_create().
Slab destructors were no longer supported after Christoph's
c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
BUGs for both slab and slub, and slob never supported them
either.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
(Adding Cc: Rafael, Ingo)
Pavel Machek wrote:
>> Dmesg with the oops or/and bisection would be good.
>
> Sorry, had to hand-copy. It is oops at virtual adddress 6b6b6b7b --
> looks like slab poison to me?
>
> EIP is in task_rq_lock, backtrace is
> try_to_wake_up
> highlevel_host_reset
> ohci_irq_handler
In reverse order, this trace is most certainly
drivers/ieee1394/ohci1394.c::ohci_irq_handler
drivers/ieee1394/highlevel.c::highlevel_host_reset
drivers/ieee1394/highlevel.c::nodemgr_highlevel.host_reset
== nodemgr_host_reset
kernel/sched.c::wake_up_process(hi->thread);
with hi->thread being the kthread which executes
drivers/ieee1394/nodemgr.c::nodemgr_host_thread of the ieee1394 core driver.
The ieee1394 core has two (or more) threads: One nodemgr_host_thread
alias [knodemgrd_*] for each card, and one hpsbpkt_thread alias
[khpsbpkt] for all cards. The knodemgrd should be frozen during suspend
or hibernate, while khpsbpkt should not be frozen in order to let
transactions to go on in the case of saving the hibernation image to a
FireWire disk.
Quoting your other post:
> On Tue 2007-09-11 21:45:58, Stefan Richter wrote:
>> Between -rc3 and -rc4:
>>
>> ieee1394: sbp2: fix sbp2_remove_device for error cases
>> a2ee3f9bbb0ce57102dad8928d54f59acdc4b8f7
>> should not occur in suspend path
>
> Plus I do not have firewire attached disk here, so sbp2 should not be
> used, right?
Sbp2's host_reset handler would do something if it was loaded even
without any SBP-2 devices attached,but it wouldn't under any
circumstances call try_to_wake_up. Actually with no devices attached it
would simply "iterate" over an empty list.
>> Between -rc1 and -rc2:
>>
>> ieee1394: sbp2: more correct Kconfig dependencies
>> e4f8cac5e07528f7e0bc21d3682c16c9de993ecb
>> unrelated
>>
>> ieee1394: revert "sbp2: enforce 32bit DMA mapping"
>> a9c2f18800753c82c45fc13b27bdc148849bdbb2
>> unrelated
>>
>> (not via linux1394-2.6.git)
>> raw1394 __user annotation
>> 5b26e64ea39e45802c5736c8261bf8a8704d212f
>> unrelated
>>
>> So it must be something older which was somehow uncovered.
>> Dmesg with the oops or/and bisection would be good.
>
> The others look even more innocent. I wonder if this may have been
> responsible?
>
> commit 831441862956fffa17b9801db37e6ea1650b0f69
> tree b0334921341f8f1734bdd3243de76d676329d21c
> parent 787d2214c19bcc9b6ac48af0ce098277a801eded
> author Rafael J. Wysocki <[email protected]> Tue, 17 Jul 2007 04:03:35 -0700
> committer Linus Torvalds <[email protected]> Tue, 17
> Jul 2007 10:23:02 -0700
>
> Freezer: make kernel threads nonfreezable by default
>
> Or this one?
>
> commit 20c2df83d25c6a95affe6157a4c9cac4cf5ffaac
> tree 415c4453d2b17a50abe7a3e515177e1fa337bd67
> parent 64fb98fc40738ae1a98bcea9ca3145b89fb71524
> author Paul Mundt <[email protected]> Fri, 20 Jul 2007 10:11:58
> +0900
> committer Paul Mundt <[email protected]> Fri, 20 Jul 2007 10:11:58
> +0900
>
> mm: Remove slab destructors from kmem_cache_create().
>
> Slab destructors were no longer supported after Christoph's
> c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
> BUGs for both slab and slub, and slob never supported them
> either.
Both of these went in during the merge window between 2.6.22 and 2.6.23-rc1.
The bug could have been introduced by a change outside of the ieee1394
subsystem, like freezer or scheduler. The try_to_wake_up in your
backtrace would be directed towards a thread which should be frozen at
some point.
--
Stefan Richter
-=====-=-=== =--= =----
http://arcgraph.de/sr/
On Sunday, 16 September 2007 20:41, Stefan Richter wrote:
> (Adding Cc: Rafael, Ingo)
>
> Pavel Machek wrote:
> >> Dmesg with the oops or/and bisection would be good.
> >
> > Sorry, had to hand-copy. It is oops at virtual adddress 6b6b6b7b --
> > looks like slab poison to me?
> >
> > EIP is in task_rq_lock, backtrace is
> > try_to_wake_up
> > highlevel_host_reset
> > ohci_irq_handler
>
> In reverse order, this trace is most certainly
>
> drivers/ieee1394/ohci1394.c::ohci_irq_handler
> drivers/ieee1394/highlevel.c::highlevel_host_reset
> drivers/ieee1394/highlevel.c::nodemgr_highlevel.host_reset
> == nodemgr_host_reset
> kernel/sched.c::wake_up_process(hi->thread);
>
> with hi->thread being the kthread which executes
> drivers/ieee1394/nodemgr.c::nodemgr_host_thread of the ieee1394 core driver.
>
> The ieee1394 core has two (or more) threads: One nodemgr_host_thread
> alias [knodemgrd_*] for each card, and one hpsbpkt_thread alias
> [khpsbpkt] for all cards. The knodemgrd should be frozen during suspend
> or hibernate, while khpsbpkt should not be frozen in order to let
> transactions to go on in the case of saving the hibernation image to a
> FireWire disk.
>
>
> Quoting your other post:
> > On Tue 2007-09-11 21:45:58, Stefan Richter wrote:
> >> Between -rc3 and -rc4:
> >>
> >> ieee1394: sbp2: fix sbp2_remove_device for error cases
> >> a2ee3f9bbb0ce57102dad8928d54f59acdc4b8f7
> >> should not occur in suspend path
> >
> > Plus I do not have firewire attached disk here, so sbp2 should not be
> > used, right?
>
> Sbp2's host_reset handler would do something if it was loaded even
> without any SBP-2 devices attached,but it wouldn't under any
> circumstances call try_to_wake_up. Actually with no devices attached it
> would simply "iterate" over an empty list.
>
> >> Between -rc1 and -rc2:
> >>
> >> ieee1394: sbp2: more correct Kconfig dependencies
> >> e4f8cac5e07528f7e0bc21d3682c16c9de993ecb
> >> unrelated
> >>
> >> ieee1394: revert "sbp2: enforce 32bit DMA mapping"
> >> a9c2f18800753c82c45fc13b27bdc148849bdbb2
> >> unrelated
> >>
> >> (not via linux1394-2.6.git)
> >> raw1394 __user annotation
> >> 5b26e64ea39e45802c5736c8261bf8a8704d212f
> >> unrelated
> >>
> >> So it must be something older which was somehow uncovered.
> >> Dmesg with the oops or/and bisection would be good.
> >
> > The others look even more innocent. I wonder if this may have been
> > responsible?
> >
> > commit 831441862956fffa17b9801db37e6ea1650b0f69
> > tree b0334921341f8f1734bdd3243de76d676329d21c
> > parent 787d2214c19bcc9b6ac48af0ce098277a801eded
> > author Rafael J. Wysocki <[email protected]> Tue, 17 Jul 2007 04:03:35 -0700
> > committer Linus Torvalds <[email protected]> Tue, 17
> > Jul 2007 10:23:02 -0700
> >
> > Freezer: make kernel threads nonfreezable by default
Well, I don't think so.
nodemgr_host_thread() calls set_freezable() as it should.
Greetings,
Rafael
Rafael J. Wysocki wrote:
>> Pavel Machek wrote:
>>> commit 831441862956fffa17b9801db37e6ea1650b0f69
>>> tree b0334921341f8f1734bdd3243de76d676329d21c
>>> parent 787d2214c19bcc9b6ac48af0ce098277a801eded
>>> author Rafael J. Wysocki <[email protected]> Tue, 17 Jul 2007 04:03:35 -0700
>>> committer Linus Torvalds <[email protected]> Tue, 17
>>> Jul 2007 10:23:02 -0700
>>>
>>> Freezer: make kernel threads nonfreezable by default
>
> Well, I don't think so.
>
> nodemgr_host_thread() calls set_freezable() as it should.
This commit is certainly OK, as it should merely preserve status quo.
Also note that Pavel wrote in his initial post that the problem became
apparent way after -rc1. Full quote:
| I noticed empty suspend stopped working around 2.6.23-rc4, and it is
| still present in 2.6.23-rc6. To reproduce
|
| swapoff -a
| echo disk > /sys/power/state
| echo disk > /sys/power/state
|
| Unsetting
|
| CONFIG_IEEE1394=y
|
| solves the problem.
--
Stefan Richter
-=====-=-=== =--= =----
http://arcgraph.de/sr/
On Sunday, 16 September 2007 21:58, Stefan Richter wrote:
> Rafael J. Wysocki wrote:
> >> Pavel Machek wrote:
> >>> commit 831441862956fffa17b9801db37e6ea1650b0f69
> >>> tree b0334921341f8f1734bdd3243de76d676329d21c
> >>> parent 787d2214c19bcc9b6ac48af0ce098277a801eded
> >>> author Rafael J. Wysocki <[email protected]> Tue, 17 Jul 2007 04:03:35 -0700
> >>> committer Linus Torvalds <[email protected]> Tue, 17
> >>> Jul 2007 10:23:02 -0700
> >>>
> >>> Freezer: make kernel threads nonfreezable by default
> >
> > Well, I don't think so.
> >
> > nodemgr_host_thread() calls set_freezable() as it should.
>
> This commit is certainly OK, as it should merely preserve status quo.
> Also note that Pavel wrote in his initial post that the problem became
> apparent way after -rc1. Full quote:
>
> | I noticed empty suspend stopped working around 2.6.23-rc4, and it is
> | still present in 2.6.23-rc6. To reproduce
> |
> | swapoff -a
> | echo disk > /sys/power/state
> | echo disk > /sys/power/state
> |
> | Unsetting
> |
> | CONFIG_IEEE1394=y
> |
> | solves the problem.
Yes.
I thought that there might be a later change that exposed a bug in it.
Hm. Pavel, can you do
# echo test > /sys/power/disk
# echo disk > /sys/power/state
and see what happens?
Greetings,
Rafael
Hi!
> > This commit is certainly OK, as it should merely preserve status quo.
> > Also note that Pavel wrote in his initial post that the problem became
> > apparent way after -rc1. Full quote:
> >
> > | I noticed empty suspend stopped working around 2.6.23-rc4, and it is
> > | still present in 2.6.23-rc6. To reproduce
> > |
> > | swapoff -a
> > | echo disk > /sys/power/state
> > | echo disk > /sys/power/state
> > |
> > | Unsetting
> > |
> > | CONFIG_IEEE1394=y
> > |
> > | solves the problem.
>
> Yes.
>
> I thought that there might be a later change that exposed a bug in it.
>
> Hm. Pavel, can you do
>
> # echo test > /sys/power/disk
> # echo disk > /sys/power/state
>
> and see what happens?
root@amd:~# echo test > /sys/power/disk
root@amd:~# echo disk > /sys/power/state
root@amd:~# echo disk > /sys/power/state
Produces nothing interesting... I also did few hibernation/resume
cycles, and everything seems to work ok. But when I do swapoff then
try to hibernate, it fails on second try. Weird.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
On Friday, 5 October 2007 09:08, Pavel Machek wrote:
> Hi!
>
> > > This commit is certainly OK, as it should merely preserve status quo.
> > > Also note that Pavel wrote in his initial post that the problem became
> > > apparent way after -rc1. Full quote:
> > >
> > > | I noticed empty suspend stopped working around 2.6.23-rc4, and it is
> > > | still present in 2.6.23-rc6. To reproduce
> > > |
> > > | swapoff -a
> > > | echo disk > /sys/power/state
> > > | echo disk > /sys/power/state
> > > |
> > > | Unsetting
> > > |
> > > | CONFIG_IEEE1394=y
> > > |
> > > | solves the problem.
> >
> > Yes.
> >
> > I thought that there might be a later change that exposed a bug in it.
> >
> > Hm. Pavel, can you do
> >
> > # echo test > /sys/power/disk
> > # echo disk > /sys/power/state
> >
> > and see what happens?
>
> root@amd:~# echo test > /sys/power/disk
> root@amd:~# echo disk > /sys/power/state
> root@amd:~# echo disk > /sys/power/state
>
> Produces nothing interesting... I also did few hibernation/resume
> cycles, and everything seems to work ok. But when I do swapoff then
> try to hibernate, it fails on second try. Weird.
Weird indeed, but it means that there's some history that causes things to
break on the second attempt.
To summarize, after running swapoff the suspending of devices during the second
attempt to hibernate fails (100% of the time) unless CONFIG_IEEE1394 is unset?
What happens for CONFIG_IEEE1394=m?
Greetings,
Rafael