2005-02-08 20:40:22

by martin f krafft

Subject: swsusp logic error?

I am trying to get swsusp working on a 2.6.10 Debian kernel
(2.6.10-1-686, custom compile, enabling only CONFIG_SOFTWARE_SUSPEND
and leaving CONFIG_PM_STD_PARTITION empty) on this Sony Vaio Z1RSP
Centrino 1.7 Pentium M laptop... without much success. Whenever
I enter swsusp mode, the kernel reports that it cannot find the swap
space and aborts.

I checked the code and found the following problem:

swsusp_swap_check() calls is_resume_device(..), which compares the
device specified in CONFIG_PM_STD_PARTITION and overridden by the
'resume' kernel boot parameter with the list of available swap
partitions.

IMHO, the problem is not with the swap partitions, but rather with
the handling of the resume_file variable. A dev_t is just an
integer, and to compare the devices, is_resume_device(..) converts
the device node of each swap file to a dev_t, using the MKDEV(..)
macro. For me, the swap partition is hda2, and MKDEV correctly
returns the dev_t for 3:2.
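
As a rough illustration (not kernel code), the dev_t encoding can be
sketched in userspace C. This assumes the 2.6-era layout from
include/linux/kdev_t.h, with the minor number in the low 20 bits; all
sketch_ names are made up for this example.

```c
#include <assert.h>

/* Userspace replica of the assumed 2.6-era encoding; the sketch_ prefix
 * marks these names as illustrative, not kernel API. */
#define SKETCH_MINORBITS 20
#define SKETCH_MINORMASK ((1U << SKETCH_MINORBITS) - 1)

typedef unsigned int sketch_dev_t;

/* Pack a major:minor pair into one integer, major in the high bits. */
static sketch_dev_t sketch_mkdev(unsigned int major, unsigned int minor)
{
    assert(minor <= SKETCH_MINORMASK);
    return (major << SKETCH_MINORBITS) | minor;
}

/* Recover the major number from a packed device number. */
static unsigned int sketch_major(sketch_dev_t dev)
{
    return dev >> SKETCH_MINORBITS;
}

/* Recover the minor number from a packed device number. */
static unsigned int sketch_minor(sketch_dev_t dev)
{
    return dev & SKETCH_MINORMASK;
}
```

Under that assumption, sketch_mkdev(3, 2) for hda2 yields 0x300002,
while the 0:0 device is plain 0.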

However, in is_resume_device, the resume_device variable is 0, which
translates to the 0:0 device. On inspection, this is no surprise:

resume_device is a static in swsusp.c. However, it is only ever
written once: in swsusp_read(), which is called to restore a memory
image from swap. That image can never be created because
is_resume_device(..) will always fail due to the comparison against
the (uninitialised) static resume_device.
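
The failure mode described above can be sketched as a small userspace
model. The names are hypothetical stand-ins, not the functions in
swsusp.c; the only point is that a static dev_t that is never assigned
stays 0 and so never matches a real swap device.

```c
#include <assert.h>

typedef unsigned int sketch_dev_t;

/* Static storage is zero-initialised in C, so this reads as 0:0
 * until something assigns it. */
static sketch_dev_t sketch_resume_device;

/* Hypothetical stand-in for the check described above:
 * does this swap device match the resume device? */
static int sketch_is_resume_device(sketch_dev_t swap_dev)
{
    return swap_dev == sketch_resume_device;
}

/* Walk the active swap devices; return the index of the one that
 * matches the resume device, or -1 if none does. */
static int sketch_swap_check(const sketch_dev_t *swaps, int n)
{
    assert(swaps != 0 || n == 0);
    for (int i = 0; i < n; i++)
        if (sketch_is_resume_device(swaps[i]))
            return i;
    return -1;
}

/* Models the one assignment that has to happen before the check. */
static void sketch_set_resume_device(sketch_dev_t dev)
{
    sketch_resume_device = dev;
}
```

With a single swap device 3:2 and no prior assignment,
sketch_swap_check() returns -1; once sketch_set_resume_device() has
been called with the same value, it returns index 0.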

I tried to rectify the situation by duplicating the line

resume_device = name_to_dev_t(resume_file);

to the beginning of the swsusp_swap_check() function, so that it
gets set to the dev_t corresponding to the device identified in
resume_file before is_resume_device(..) is called.

However, name_to_dev_t(..) does more than convert a name to a
dev_t... in particular, it crashes the kernel when called
from swsusp_swap_check(). If I execute

echo platform >| disk; echo disk >| state

from the shell (zsh), the kernel reports a crash in the zsh
process; the top of the trace is

[<c0134780>] swsusp_swap_check+0x30/0x100

and the corresponding disassembly is available from

http://rafb.net/paste/results/HV8eCI97.txt

The Code at the bottom of the crash dump is 2.5 lines of 'cc cc
...', and I am being told that

<6>zsh[6632] exited with preempt_count 1.

The machine is then pretty much dead. The network interface reports
too much work at the interrupt, and I can still switch virtual
consoles, but I cannot type, and sysrq does not work.

Anyway, I have no more time to work on this, unfortunately.
Hopefully my analysis helps to solve that problem.

--
martin; (greetings from the heart of the sun.)
\____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck

invalid/expired pgp subkeys? use subkeys.pgp.net as keyserver!
spamtraps: [email protected]

"there are two major products that come out of berkeley: lsd and unix.
we don't believe this to be a coincidence."
-- jeremy s. anderson



2005-02-09 22:14:07

by Pavel Machek

Subject: Re: swsusp logic error?

Hi!

> I am trying to get swsusp working on a 2.6.10 Debian kernel
> (2.6.10-1-686, custom compile, enabling only CONFIG_SOFTWARE_SUSPEND
> and leaving CONFIG_PM_STD_PARTITION empty) on this Sony Vaio Z1RSP
> Centrino 1.7 Pentium M laptop... without much success. Whenever
> I enter swsusp mode, the kernel reports that it cannot find the swap
> space and aborts.

Try doing it on vanilla, just one swapfile, and pass
resume=/dev/your_swapdevice.

Oh, and cc me next time if you want faster reply...
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-02-27 10:50:42

by martin f krafft

Subject: Re: swsusp logic error?

Sorry for the late reply, I've been strung up with work. I tried
your suggestion on another machine, with a vanilla 2.6.10 kernel and
a single swap device, twice the size of the physical RAM; I get
exactly the same result. The swap device cannot be found.

What to try next?

--
martin; (greetings from the heart of the sun.)
\____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck

invalid/expired pgp subkeys? use subkeys.pgp.net as keyserver!
spamtraps: [email protected]

to err is human - to moo, bovine



2005-02-27 17:04:51

by Pavel Machek

Subject: Re: swsusp logic error?

Hi!

> Sorry for the late reply, I've been strung up with work. I tried
> your suggestion on another machine, with a vanilla 2.6.10 kernel and
> a single swap device, twice the size of the physical RAM; I get
> exactly the same result. The swap device cannot be found.
>
> What to try next?

Ugh, too late, I already forgot what went wrong for you. Anyway try
reading Documentation/power/swsusp.txt and/or going to 2.6.11-rc4. If
that does not help, debug with printk :-).
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-02-27 17:54:15

by Pavel Machek

Subject: Re: swsusp logic error?

Hi!

> > Ugh, too late, I already forgot what went wrong for you. Anyway
> > try reading Documentation/power/swsusp.txt and/or going to
> > 2.6.11-rc4. If that does not help, debug with printk :-).
>
> I already did the first two. I will try 2.6.11-rc4 now.
>
> Please check my first post, if you have the time:
>
> http://marc.theaimsgroup.com/?l=linux-kernel&m=110789536921510&w=2

Ok, this one.

I do not know what is going wrong. swsusp seems to work for
people... or at least it works for me. Here's my .config, perhaps you
have something unusual?

I do have CONFIG_PM_STD_PARTITION="/dev/hda1", perhaps that's
necessary?
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-02-27 17:54:16

by martin f krafft

Subject: Re: swsusp logic error?

also sprach Pavel Machek <[email protected]> [2005.02.27.1804 +0100]:
> Ugh, too late, I already forgot what went wrong for you. Anyway
> try reading Documentation/power/swsusp.txt and/or going to
> 2.6.11-rc4. If that does not help, debug with printk :-).

I already did the first two. I will try 2.6.11-rc4 now.

Please check my first post, if you have the time:

http://marc.theaimsgroup.com/?l=linux-kernel&m=110789536921510&w=2

--
martin; (greetings from the heart of the sun.)
\____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck

invalid/expired pgp subkeys? use subkeys.pgp.net as keyserver!
spamtraps: [email protected]

"it always takes longer than you expect, even when
you take into account hofstadter's law."
-- douglas hofstadter



2005-02-27 18:27:57

by Rafael J. Wysocki

Subject: Re: swsusp logic error?

On Sunday, 27 of February 2005 18:50, Pavel Machek wrote:
> Hi!
>
> > > Ugh, too late, I already forgot what went wrong for you. Anyway
> > > try reading Documentation/power/swsusp.txt and/or going to
> > > 2.6.11-rc4. If that does not help, debug with printk :-).
> >
> > I already did the first two. I will try 2.6.11-rc4 now.
> >
> > Please check my first post, if you have the time:
> >
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=110789536921510&w=2
>
> Ok, this one.
>
> I do not know what is going wrong. swsusp seems to work for
> people... or at least it works for me. Here's my .config, perhaps you
> have something unusual?
>
> I do have CONFIG_PM_STD_PARTITION="/dev/hda1", perhaps that's
> necessary?

I don't set CONFIG_PM_STD_PARTITION, but I pass the "resume" parameter
to the kernel and it works (no fuss, on x86-64 and i386).

Greets,
Rafael


--
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"

2005-02-28 01:58:34

by Barry K. Nathan

Subject: Re: swsusp logic error?

On Sun, Feb 27, 2005 at 07:27:39PM +0100, Rafael J. Wysocki wrote:
> On Sunday, 27 of February 2005 18:50, Pavel Machek wrote:
[snip]
> > Ok, this one.
> >
> > I do not know what is going wrong. swsusp seems to work for
> > people... or at least it works for me. Here's my .config, perhaps you
> > have something unusual?
> >
> > I do have CONFIG_PM_STD_PARTITION="/dev/hda1", perhaps that's
> > necessary?
>
> I don't set CONFIG_PM_STD_PARTITION, but I pass the "resume" parameter
> to the kernel and it works (no fuss, on x86-64 and i386).

I have the same setup as Rafael, on i386 boxes. swsusp was very
messed-up for me in earlier 2.6.11-rc, but with -rc4 (or maybe it's one
of the -bk snapshots between -rc4 and -rc5) it works for me again.
Specifically, in the failing releases, swsusp would never succeed in
suspending the machine.

Since the problem is gone now, I think I have better uses for my time
than figuring out when the problem started and when it was fixed, but I
just wanted to mention that in fact there are problems in earlier
2.6.11-rc releases that seem to be fixed later on.

-Barry K. Nathan <[email protected]>

2005-02-28 14:00:24

by martin f krafft

Subject: Re: swsusp logic error?

also sprach martin f krafft <[email protected]> [2005.02.27.1843 +0100]:
> Please check my first post, if you have the time:
>
> http://marc.theaimsgroup.com/?l=linux-kernel&m=110789536921510&w=2

There is also

http://thread.gmane.org/gmane.linux.acpi.devel/12540

with the same conclusion.

Maybe 2.6.11-rcX fixes this.

--
martin; (greetings from the heart of the sun.)
\____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck

invalid/expired pgp subkeys? use subkeys.pgp.net as keyserver!
spamtraps: [email protected]

"security here. yes, ma'am. yes. groucho glasses. yes, we're on it.
c'mon, guys. somebody gave an aardvark a nose-cut: somebody who
can't deal with deconstructionist humor. code blue."
-- http://azure.humbug.org.au/~aj/armadillos.txt



2005-02-28 14:34:27

by Rafael J. Wysocki

Subject: Re: swsusp logic error?

On Monday, 28 of February 2005 14:56, martin f krafft wrote:
> also sprach martin f krafft <[email protected]> [2005.02.27.1843 +0100]:
> > Please check my first post, if you have the time:
> >
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=110789536921510&w=2
>
> There is also
>
> http://thread.gmane.org/gmane.linux.acpi.devel/12540
>
> with the same conclusion.
>
> Maybe 2.6.11-rcX fixes this.

Could you, please, verify that you don't need to load any modules
from initrd for your swap partition to work? It won't work if you do.

Greets,
Rafael


--
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"

2005-02-28 14:25:08

by Pavel Machek

Subject: Re: swsusp logic error?

Hi!

On Mon 28-02-05 14:56:04, martin f krafft wrote:
> also sprach martin f krafft <[email protected]> [2005.02.27.1843 +0100]:
> > Please check my first post, if you have the time:
> >
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=110789536921510&w=2
>
> There is also
>
> http://thread.gmane.org/gmane.linux.acpi.devel/12540
>
> with the same conclusion.
>
> Maybe 2.6.11-rcX fixes this.

It was resolved -- modular IDE was the problem. Indeed see the thread above.



--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-02-28 14:50:03

by Pavel Machek

Subject: Re: swsusp logic error?

Hi!

> also sprach Rafael J. Wysocki <[email protected]> [2005.02.28.1533 +0100]:
> > Could you, please, verify that you don't need to load any modules
> > from initrd for your swap partition to work? It won't work if you do.
>
> this makes perfect sense to me when you talk about resuming. does it
> also apply to suspending?

As kernel is the same for suspend and resume... Yes, it seems it makes
sense.

Of course, it was a mistake, not design, but failed suspend is better
than failed resume.
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-02-28 14:49:09

by martin f krafft

Subject: Re: swsusp logic error?

also sprach Rafael J. Wysocki <[email protected]> [2005.02.28.1533 +0100]:
> Could you, please, verify that you don't need to load any modules
> from initrd for your swap partition to work? It won't work if you do.

this makes perfect sense to me when you talk about resuming. does it
also apply to suspending?

--
martin; (greetings from the heart of the sun.)
\____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck

invalid/expired pgp subkeys? use subkeys.pgp.net as keyserver!
spamtraps: [email protected]

"i never travel without my diary. one should always have something
sensational to read on the train."
-- oscar wilde



2005-03-01 08:24:48

by martin f krafft

Subject: Re: swsusp logic error?

also sprach Pavel Machek <pavel () ucw ! cz>
> > > Could you, please, verify that you don't need to load any
> > > modules from initrd for your swap partition to work? It won't
> > > work if you do.
> >
> > this makes perfect sense to me when you talk about resuming.
> > does it also apply to suspending?
>
> As kernel is the same for suspend and resume... Yes, it seems it
> makes sense.

But before the suspend, the IDE modules are loaded, so the swap
drive is accessible, no? Or are IDE modules (yes, they are modules
here) unloaded just before writing to swap?

--
martin; (greetings from the heart of the sun.)
\____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck

invalid/expired pgp subkeys? use subkeys.pgp.net as keyserver!
spamtraps: [email protected]

man muss noch chaos in sich haben
um einen tanzenden stern zu gebären.
-- friedrich nietzsche



2005-03-01 09:05:34

by Nigel Cunningham

Subject: Re: swsusp logic error?

Hi Martin.

On Tue, 2005-03-01 at 19:22, martin f krafft wrote:
> also sprach Pavel Machek <pavel () ucw ! cz>
> > > > Could you, please, verify that you don't need to load any
> > > > modules from initrd for your swap partition to work? It won't
> > > > work if you do.
> > >
> > > this makes perfect sense to me when you talk about resuming.
> > > does it also apply to suspending?
> >
> > As kernel is the same for suspend and resume... Yes, it seems it
> > makes sense.
>
> But before the suspend, the IDE modules are loaded, so the swap
> drive is accessible, no? Or are IDE modules (yes, they are modules
> here) unloaded just before writing to swap?

I think Pavel got a bit confused somewhere here! The IDE modules will
always be loaded when you're doing the suspend, right to the very end.
At resume time, they need to be loaded before swsusp attempts to parse the
resume= parameter, so that it can actually succeed in doing that.
Suspend2 works with IDE made as modules because it allows you to delay
that parsing until after the modules are loaded (you put echo >
/proc/software_suspend/do_resume in your initrd after modules are loaded
but before you mount filesystems). Last time I looked, swsusp didn't
have that capability and thus required IDE to be built in. Pavel, has
that changed?
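
A toy userspace model of that ordering problem, under the assumption
that resolving the resume= name amounts to a lookup among the block
devices registered so far. The registry and all sketch_ names are
invented for illustration and are not kernel interfaces.

```c
#include <assert.h>
#include <string.h>

typedef unsigned int sketch_dev_t;

struct sketch_blkdev {
    const char *name;
    sketch_dev_t dev;
};

#define SKETCH_MAX_DEVS 8
static struct sketch_blkdev sketch_registry[SKETCH_MAX_DEVS];
static int sketch_ndevs;

/* Models a driver (built in, or loaded as a module) announcing a disk. */
static void sketch_register(const char *name, sketch_dev_t dev)
{
    assert(sketch_ndevs < SKETCH_MAX_DEVS);
    sketch_registry[sketch_ndevs].name = name;
    sketch_registry[sketch_ndevs].dev = dev;
    sketch_ndevs++;
}

/* Models resolving the resume= name; 0 means "no such device yet". */
static sketch_dev_t sketch_lookup(const char *name)
{
    for (int i = 0; i < sketch_ndevs; i++)
        if (strcmp(sketch_registry[i].name, name) == 0)
            return sketch_registry[i].dev;
    return 0;
}
```

In this model, looking up /dev/hda2 before the IDE driver has
registered the disk returns 0, and the same lookup after registration
returns the real device number, which is why the parsing has to happen
after the modules are loaded.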

Regards,

Nigel
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028; Mob: +61 (417) 100 574

Maintainer of Suspend2 Kernel Patches http://softwaresuspend.berlios.de


2005-03-01 10:31:05

by Pavel Machek

Subject: Re: swsusp logic error?

Hi!

> > > > > Could you, please, verify that you don't need to load any
> > > > > modules from initrd for your swap partition to work? It won't
> > > > > work if you do.
> > > >
> > > > this makes perfect sense to me when you talk about resuming.
> > > > does it also apply to suspending?
> > >
> > > As kernel is the same for suspend and resume... Yes, it seems it
> > > makes sense.
> >
> > But before the suspend, the IDE modules are loaded, so the swap
> > drive is accessible, no? Or are IDE modules (yes, they are modules
> > here) unloaded just before writing to swap?
>
> I think Pavel got a bit confused somewhere here! The IDE modules will
> always be loaded when you're doing the suspend, right to the very end.
> At resume time, they need to be loaded before swsusp attempts to parse the
> resume= parameter, so that it can actually succeed in doing that.
> Suspend2 works with IDE made as modules because it allows you to delay
> that parsing until after the modules are loaded (you put echo >
> /proc/software_suspend/do_resume in your initrd after modules are loaded
> but before you mount filesystems). Last time I looked, swsusp didn't
> have that capability and thus required IDE to be built in. Pavel, has
> that changed?

No, it has not changed for mainline.

SuSE has patch to allow resume from modular IDE.
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-03-01 10:30:32

by Pavel Machek

Subject: Re: swsusp logic error?

Hi!

> > > > Could you, please, verify that you don't need to load any
> > > > modules from initrd for your swap partition to work? It won't
> > > > work if you do.
> > >
> > > this makes perfect sense to me when you talk about resuming.
> > > does it also apply to suspending?
> >
> > As kernel is the same for suspend and resume... Yes, it seems it
> > makes sense.
>
> But before the suspend, the IDE modules are loaded, so the swap
> drive is accessible, no? Or are IDE modules (yes, they are modules
> here) unloaded just before writing to swap?

Yes, IDE modules are loaded and swap drive is accessible during
suspend. But you want to resume some time later, and you want to
resume with same kernel, right?
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!