Distro: SuSE Linux 9.2
Kernel: 2.6.8 (kernel-default-2.6.8-24.11), also 2.6.11.5
Hardware: Dell Inspiron 6000d, Intel Pentium-M, 915PM chipset,
disc is Fujitsu MHT2040AH, SATA via ata_piix driver
Kernel cmdline: root=/dev/sda3 vga=0x317 selinux=0 resume=/dev/sda5 \
desktop elevator=as showopts
I have the same symptoms as seen in numerous complaints on the web: I do
"echo disk > /sys/power/state" or run /sbin/swsusp or powersave -U. The
kernel suspends all the way, then immediately wakes up, having
accomplished nothing. On 2.6.11.5 I can read an error message: "swsusp:
FATAL: cannot find swap device, try swapon -a!" Yes, the swap device is
recognized in /proc/swaps.
I put some printk's into 2.6.11.5 and found out the reason for this
behavior: in kernel/power/swsusp.c, static resume_device == 0. The
reason it's 0 is that swsusp_read uses name_to_dev_t to interpret
resume=/dev/sda5, a bogus block device name. The reason it's bogus is
that its driver is modular and will be loaded in the future from the
initrd. Thus if the image were written there, it could not be read
by swsusp_read, and swsusp_swap_check correctly aborts.
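The lookup failure described above can be checked from user space. What follows is a minimal sketch, not kernel code. It assumes the kernel resolves a name such as /dev/sda5 through sysfs, where an entry like /sys/block/sda/sda5/dev appears only once the disk driver has registered the device. The function names resume_param and lookup_devnum are made up for this illustration.

```shell
# Print the value of the resume= parameter found in a kernel command line.
# $1: command line text, e.g. the contents of /proc/cmdline
# Returns 1 if no resume= parameter is present.
resume_param() {
    for word in $1; do
        case "$word" in
            resume=*) printf '%s\n' "${word#resume=}"; return 0 ;;
        esac
    done
    return 1
}

# Print "major:minor" for a device, using sysfs only (assumed layout).
# $1: sysfs mount point (normally /sys)
# $2: device path such as /dev/sda5
# Returns 1 if sysfs has no entry, e.g. because the driver is not loaded.
lookup_devnum() {
    sysfs=$1
    name=${2#/dev/}
    # Whole disks sit directly under block/, partitions one level deeper.
    for f in "$sysfs/block/$name/dev" "$sysfs"/block/*/"$name"/dev; do
        if [ -r "$f" ]; then
            cat "$f"
            return 0
        fi
    done
    return 1
}
```

On a running system this could be called as lookup_devnum /sys "$(resume_param "$(cat /proc/cmdline)")". Before the SATA module is loaded the lookup would be expected to fail, which matches the resume_device == 0 observation.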
Formerly I had a laptop on which software suspend worked. It
had an IDE disc, and in this distro the ide modules are hardwired in the
kernel; hence /dev/hda2 was not bogus and could be resumed from and
(therefore) suspended to.
Obvious workaround: recompile the kernel myself and hardwire ata_piix
and its scsi friends. I'll bet we actually see SuSE do this in version
9.3, to shut up the expensive customer support calls from frustrated
users with new laptops (all the new Dell laptops and desktops use SATA
discs). But this is not a general solution, since the next
technological advance (infiniband?) will require yet another driver, and
you're now defeating the whole purpose of modules. Also, if I go to my
own kernel I cut myself off from my distro's security patches.
So I'm hoping someone has an idea how to make software_resume happen
_after_ the initrd has been run and its modules are in place, which
might make it into whatever kernel is being used in SuSE 9.3.
James F. Carter Voice 310 825 2897 FAX 310 206 6673
UCLA-Mathnet; 6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA 90095-1555
http://www.math.ucla.edu/~jimc (q.v. for PGP key)
Hi!
> Distro: SuSE Linux 9.2
> Kernel: 2.6.8 (kernel-default-2.6.8-24.11), also 2.6.11.5
> Hardware: Dell Inspiron 6000d, Intel Pentium-M, 915PM chipset,
> disc is Fujitsu MHT2040AH, SATA via ata_piix driver
> Kernel cmdline: root=/dev/sda3 vga=0x317 selinux=0 resume=/dev/sda5 \
> desktop elevator=as showopts
>
> I have the same symptoms as seen in numerous complaints on the web: I do
> "echo disk > /sys/power/state" or run /sbin/swsusp or powersave -U. The
> kernel suspends all the way, then immediately wakes up, having
> accomplished nothing. On 2.6.11.5 I can read an error message: "swsusp:
> FATAL: cannot find swap device, try swapon -a!" Yes, the swap device is
> recognized in /proc/swaps.
>
> I put some printk's into 2.6.11.5 and found out the reason for this
> behavior: in kernel/power/swsusp.c, static resume_device == 0. The
> reason it's 0 is that swsusp_read uses name_to_dev_t to interpret
> resume=/dev/sda5, a bogus block device name. The reason it's bogus
> is
...
> So I'm hoping someone has an idea how to make software_resume happen
> _after_ the initrd has been run and its modules are in place, which
> might make it into whatever kernel is being used in SuSE 9.3.
This is WONTFIX for 2.6.11, but you can be pretty sure it is going to
be fixed for SuSE 9.3, and the patch is already in 2.6.12-rc1. Feel free
to beta-test SuSE 9.3 ;-).
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
On Wed, 23 Mar 2005, Pavel Machek wrote:
> > I put some printk's into 2.6.11.5 and found out the reason for this
> > behavior: in kernel/power/swsusp.c, static resume_device == 0. The
> > reason it's 0 is that swsusp_read uses name_to_dev_t to interpret
> > resume=/dev/sda5, a bogus block device name. The reason it's bogus
> > is
> ...
> This is WONTFIX for 2.6.11, but you can be pretty sure it is going to
> be fixed for SuSE 9.3, and patch is already in 2.6.12-rc1. Feel free
> to betatest SuSE 9.3 ;-).
Many thanks! Definitely I'll beta-test SuSE 9.3, and download 2.6.12-rc1
for a self-compiled kernel. SuSE rocks; open source rocks!
James F. Carter Voice 310 825 2897 FAX 310 206 6673
On Wed, 23 Mar 2005, Pavel Machek wrote:
> This is WONTFIX for 2.6.11, but you can be pretty sure it is going to
> be fixed for SuSE 9.3, and patch is already in 2.6.12-rc1. Feel free
> to betatest SuSE 9.3 ;-).
Unfortunately the celebration was premature. I compiled 2.6.12-rc1,
noticing the new feature that you can see or alter the swap device
number in /sys/power/resume. So I'm able to suspend... but not to
resume, since the driver still isn't loaded at the time of resuming.
I tried some cowboy programming like this: in kernel/power/disk.c I
changed software_resume to be not static (i.e. extern) and not a
late_initcall. In init/main.c, in init(), just after do_basic_setup(),
I inserted a call to software_resume(). This did not even cause a
kernel panic as I had expected; there was no sign on the console, in
/var/log/boot.msg or anywhere else that software_resume had ever been
called, even with a suspended image in the swap partition.
It was worth a try, but not much more, since I'm sure there are
contingencies that I'm not taking into account. For example, the real
root filesystem is mounted (readonly), and if that makes a problem
for resuming, how can we squeeze software_resume after the initrd and
before mounting the root disc?
James F. Carter Voice 310 825 2897 FAX 310 206 6673
Hi!
> On Wed, 23 Mar 2005, Pavel Machek wrote:
> > This is WONTFIX for 2.6.11, but you can be pretty sure it is going to
> > be fixed for SuSE 9.3, and patch is already in 2.6.12-rc1. Feel free
> > to betatest SuSE 9.3 ;-).
>
> Unfortunately the celebration was premature. I compiled 2.6.12-rc1,
> noticing the new feature that you can see or alter the swap device
> number in /sys/power/resume. So I'm able to suspend... but not to
> resume, since the driver still isn't loaded at the time of resuming.
There's another feature that enables you to start resume manually with
some echo to /sys... Perhaps it needs to be documented better, I'm
looking for a patch ;-).
Pavel
Pavel Machek wrote:
> Hi!
> There's another feature that enables you to start resume manually with
> some echo to /sys... Perhaps it needs to be documented better, I'm
> looking for a patch ;-).
HANNES, where are you?
;-)
Stefan
On Fri, 25 Mar 2005, Pavel Machek wrote:
> There's another feature that enables you to start resume manually with
> some echo to /sys... Perhaps it needs to be documented better, I'm
> looking for a patch ;-).
But how can it resume from a swap device for which it has no driver?
Even if you copied the needed module(s) onto the swap device, the kernel
needs the modules to be loaded before it can read anything. The driver
would be there if resuming happened after the initrd loaded it. But
I wasn't able to make that actually work.
James F. Carter Voice 310 825 2897 FAX 310 206 6673
Hi!
> > There's another feature that enables you to start resume manually with
> > some echo to /sys... Perhaps it needs to be documented better, I'm
> > looking for a patch ;-).
>
> But how can it resume from a swap device for which it has no driver?
You insmod the driver for your swap device, then you echo the device
numbers to /sys... then initiate resume.
> Even if you copied the needed module(s) onto the swap device, the kernel
> needs the modules to be loaded before it can read anything. The driver
> would be there if resuming happened after the initrd loaded it. But
> I wasn't able to make that actually work.
It should be possible; SuSE 9.3 does that...
Pavel
On Tue, 29 Mar 2005, Pavel Machek wrote:
> You insmod driver for your swap device, then you echo device numbers
> to /sys... then initiate resume.
So you're saying, let the machine come all the way up, log in as root,
"echo 8:5 > /sys/power/resume" (I think that was the name), then "echo
resume > /sys/power/state"? Hmm, you would have to bypass "swapon -a",
e.g. boot with the -b kernel parameter.
Or I'll bet one could do something equivalent in the initrd -- much more
user friendly. But the friendliest of all would be if the swsusp resume
call were not a late_initcall but rather were called just before the root
was mounted, after the initrd (if any) had loaded whatever modules. I
think you're confirming that that approach would not blow up the kernel --
if it will work with the root mounted and user space in full roar (well,
skimpy roar with the -b switch), then it's got to be OK at the earlier
time.
I'll see what I can do.
James F. Carter Voice 310 825 2897 FAX 310 206 6673
Hi!
> > You insmod driver for your swap device, then you echo device numbers
> > to /sys... then initiate resume.
>
> So you're saying, let the machine come all the way up, log in as root,
> "echo 8:5 > /sys/power/resume" (I think that was the name), then "echo
> resume > /sys/power/state"? Hmm, you would have to bypass "swapon -a",
> e.g. boot with the -b kernel parameter.
Well, basically yes, but do that without any writing to the filesystem,
or it is "bye bye data".
> Or I'll bet one could do something equivalent in the initrd -- much more
> user friendly. But the friendliest of all would be if the swsusp resume
> call were not a late_initcall but rather were called just before the root
> was mounted, after the initrd (if any) had loaded whatever modules. I
> think you're confirming that that approach would not blow up the kernel --
> if it will work with the root mounted and user space in full roar (well,
> skimpy roar with the -b switch), then it's got to be OK at the earlier
> time.
You do not want to mount journaling filesystems; they tend to write to
disks even during read-only mounts... But doing it from initrd should
be okay. ext2 and init=/bin/bash should do the trick, too.
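One way to act on this warning is to check the mount table before attempting a resume. The sketch below is only an illustration: the function name check_no_journaled_mounts is invented here, and the list of filesystem types (ext3, reiserfs, jfs, xfs) is an assumption about which journaling filesystems are likely to be in use. It reads a file in /proc/mounts format.

```shell
# Succeed only if no journaling filesystem is currently mounted.
# $1: path to a mount table in /proc/mounts format
#     (fields: device, mount point, type, options, ...)
# Prints "device mountpoint type" for each offending entry.
check_no_journaled_mounts() {
    awk '$3 ~ /^(ext3|reiserfs|jfs|xfs)$/ { print $1, $2, $3; found = 1 }
         END { if (found) exit 1 }' "$1"
}
```

A caller would run check_no_journaled_mounts /proc/mounts and skip the resume if it fails. Note that this only sees filesystems that are mounted at the moment of the check; it cannot detect one that was mounted earlier and then unmounted.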
Pavel
On Wed, 30 Mar 2005, Pavel Machek wrote:
> You do not want to mount journaling filesystems; they tend to write to
> disks even during read-only mounts... But doing it from initrd should
> be okay. ext2 and init=/bin/bash should do the trick, too.
I did give it a try -- successfully.
For reference I recite the original issue: the driver for my primary
disc is in the initrd, not hardwired. (It's ata_piix and friends, but
the same issue happens if you boot from RAID or other weird devices. As
modern systems tend to have a SATA disc, more and more people are
complaining on the web that software suspend has stopped working after
they upgraded their machines.) software_suspend would suspend all the
way, then immediately wake up having accomplished nothing (but broken
nothing either). In kernel 2.6.12-rc1 but not 2.6.8 it complains "can't
find swap device". If this safety check is unwisely overridden so a
suspend image is written, and you then resume (providing the device by
number), it fails to read the image using the driver which it hasn't
loaded yet.
This patch makes software_resume not a late_initcall but rather an
external subroutine similar to software_suspend, and calls it at the
beginning of mount_root (in init/do_mounts.c), just _after_ the initrd
(if any) and its driver have been seen. This buried placement is needed
because there are several flow paths that call mount_root, and otherwise
each path would need to be monkeyed with.
The initrd contents at the time of resuming are lost, but the initrd
contents at initial boot, if mounted at that time on /initrd, are still
there.
I have been running with this patch for over a week, with several
suspends per day (and much more than the usual number of reboots, due to
driver debugging). I have had only two system crashes in that time. In
one, I was trying code in ata_piix connected with ATAPI DMA that was
wrong for my kernel version, and it hung in driver initialization before
software_resume was even called. In the other, I was trying the CVS
version of X-Windows, specifically DRI. The rough edges showed clearly,
and I suspect it stored corrupt data somewhere. After reverting to the
production X-Windows I suspended, and I got a null pointer dereference
upon resuming. I suspect (but can't prove) that module reinit would
have failed exactly the same way with the original or patched calls to
software_resume. In other words, I think the patched version is doing
its part of the job perfectly.
So, what do you think? Can we bring the benefit of software suspend to
systems with SATA or RAID boot discs?
James F. Carter Voice 310 825 2897 FAX 310 206 6673
Patch relative to 2.6.12-rc1
--- init/do_mounts.c.orig 2005-03-17 17:34:09.000000000 -0800
+++ init/do_mounts.c 2005-04-01 19:29:23.000000000 -0800
@@ -362,6 +362,16 @@
void __init mount_root(void)
{
+#ifdef CONFIG_SOFTWARE_SUSPEND
+ /*
+ * Must resume after initrd has loaded the device for the root filesys,
+ * presumed same as the one with the swap partition with the resume
+ * image, but before mounting anything, which resuming would smear.
+ */
+ software_resume();
+#endif
#ifdef CONFIG_ROOT_NFS
if (MAJOR(ROOT_DEV) == UNNAMED_MAJOR) {
if (mount_nfs_root())
--- include/linux/suspend.h.orig 2005-03-17 17:34:07.000000000 -0800
+++ include/linux/suspend.h 2005-04-01 19:39:35.000000000 -0800
@@ -48,6 +48,7 @@
#ifdef CONFIG_PM
/* kernel/power/swsusp.c */
extern int software_suspend(void);
+extern int software_resume(void);
extern int pm_prepare_console(void);
extern void pm_restore_console(void);
@@ -58,6 +59,10 @@
printk("Warning: fake suspend called\n");
return -EPERM;
}
+static inline int software_resume(void)
+{
+ return 0;
+}
#endif
#ifdef CONFIG_SMP
--- kernel/power/disk.c.orig 2005-03-26 14:16:25.000000000 -0800
+++ kernel/power/disk.c 2005-04-01 21:14:01.029535791 -0800
@@ -229,7 +229,7 @@
*
*/
-static int software_resume(void)
+int software_resume(void)
{
int error;
@@ -243,12 +243,15 @@
pr_debug("PM: Checking swsusp image.\n");
- if ((error = swsusp_check()))
+ if ((error = swsusp_check())) {
+ pr_debug("PM: No swsusp image, skipping.\n");
goto Done;
+ }
pr_debug("PM: Preparing processes for restore.\n");
if ((error = prepare_processes())) {
+ pr_debug("PM: Problem preparing processes, not restoring.\n");
swsusp_close();
goto Cleanup;
}
@@ -278,8 +281,6 @@
return 0;
}
-late_initcall(software_resume);
-
static char * pm_disk_modes[] = {
[PM_DISK_FIRMWARE] = "firmware",
Hi!
On Sun 10-04-05 16:14:52, Jim Carter wrote:
> On Wed, 30 Mar 2005, Pavel Machek wrote:
> > You do not want to mount journaling filesystems; they tend to write to
> > disks even during read-only mounts... But doing it from initrd should
> > be okay. ext2 and init=/bin/bash should do the trick, too.
>
> I did give it a try -- successfully.
>
> For reference I recite the original issue: the driver for my primary
> disc is in the initrd, not hardwired. (It's ata_piix and friends, but
> the same issue happens if you boot from RAID or other weird devices. As
> modern systems tend to have a SATA disc, more and more people are
> complaining on the web that software suspend has stopped working after
> they upgraded their machines.) software_suspend would suspend all the
> way, then immediately wake up having accomplished nothing (but broken
> nothing either). In kernel 2.6.12-rc1 but not 2.6.8 it complains "can't
> find swap device". If this safety check is unwisely overridden so a
> suspend image is written, and you then resume (providing the device by
> number), it fails to read the image using the driver which it hasn't
> loaded yet.
Yep.
> This patch makes software_resume not a late_initcall but rather an
> external subroutine similar to software_suspend, and calls it at the
> beginning of mount_root (in init/do_mounts.c), just _after_ the initrd
> (if any) and its driver have been seen. This buried placement is needed
> because there are several flow paths that call mount_root, and otherwise
> each path would need to be monkeyed with.
But the patch is very dangerous. Unsuspecting users will see their
systems resumed after an unsafe initrd has run. It is okay for you,
though.
What you want to do is to audit your initrd, then add an echo to
/sys/power/resume at the end...
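As a rough illustration of what such a final initrd step might look like, here is a hedged sketch. The function name request_resume is hypothetical. It assumes the partition's major:minor can be read from a sysfs entry like /sys/block/sda/sda5/dev, and that the disk name is the partition name with trailing digits removed, which does not hold for every driver. The only kernel interface used is the /sys/power/resume file named in this thread.

```shell
# Ask the kernel to resume from a given swap partition.
# $1: sysfs mount point (normally /sys)
# $2: partition name without the /dev/ prefix, e.g. sda5
# Returns 1 without writing anything if the kernel does not know the device.
request_resume() {
    sysfs=$1
    part=$2
    # Assumed naming scheme: sda5 lives on disk sda.
    disk=$(printf '%s' "$part" | sed 's/[0-9]*$//')
    devfile="$sysfs/block/$disk/$part/dev"
    if [ ! -r "$devfile" ]; then
        echo "resume: $part not known to the kernel, booting normally" >&2
        return 1
    fi
    # Hand the major:minor pair (for example 8:5) to the kernel.
    cat "$devfile" > "$sysfs/power/resume"
}
```

In an initrd this would be called as request_resume /sys sda5 after the disk modules have been loaded and before anything is mounted from the real disks. If a valid image is found the call would not be expected to return; otherwise boot simply continues.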
> So, what do you think? Can we bring the benefit of software suspend to
> systems with SATA or RAID boot discs?
Yes, but not this way. -rc2 already contains
/sys/power/resume... (Better documentation would be needed, though).
Pavel
--
Boycott Kodak -- for their patent abuse against Java.
On Wed, 13 Apr 2005, Pavel Machek wrote:
> > This patch makes software_resume not a late_initcall but rather an
> > external subroutine similar to software_suspend, and calls it at the
> > beginning of mount_root (in init/do_mounts.c), just _after_ the initrd
> > (if any) and its driver have been seen....
>
> But the patch is very dangerous. Unsuspecting users will see their
> systems resumed after an unsafe initrd has run. It is okay for you,
> though.
>
> What you want to do is to audit your initrd, then add an echo to
> /sys/power/resume at the end...
I think you expressed similar reservations earlier but I'm not sure if I
understand what your issue is. Are you saying (please let me know which if
any of these threats are the ones that concern you):
1. If the initrd mounted any filesystem read-write, or any journalled
filesystem, and left it mounted, that would be bad.
2. If the initrd started an ordinary process (or a kernel thread?) and
left it running, that would be bad. The ata_piix driver really does
create a kernel thread, though I believe it exists only during actual data
transfer.
3. The initrd is copied into a ramdisc which is then mounted. It's still
mounted when software_resume is called (as the patch is arranged presently,
or if the initrd does "echo resume > /sys/power/state"), and jimc isn't too
sure where the ramdisc's memory goes at that point.
I was hoping to have a single point in the boot-up code where
software_resume happened, rather than multiple places or, heaven forbid,
calling it repeatedly at various steps in the boot process. I think we can
justify some effort to avoid the situation where software_resume is called
before initrd loading, and it sometimes refuses to load the image, and then
is called again by the initrd.
Suppose software_resume (after the initrd) does an audit, and refuses to
load the image if there's a problem? Or even better, if all it needs is to
unmount filesystems, it should just do that and load the image. Is this
the right way to proceed?
James F. Carter Voice 310 825 2897 FAX 310 206 6673
Hi!
> > > This patch makes software_resume not a late_initcall but rather an
> > > external subroutine similar to software_suspend, and calls it at the
> > > beginning of mount_root (in init/do_mounts.c), just _after_ the initrd
> > > (if any) and its driver have been seen....
> >
> > But the patch is very dangerous. Unsuspecting users will see their
> > systems resumed after an unsafe initrd has run. It is okay for you,
> > though.
> >
> > What you want to do is to audit your initrd, then add an echo to
> > /sys/power/resume at the end...
>
> I think you expressed similar reservations earlier but I'm not sure if I
> understand what your issue is. Are you saying (please let me know which if
> any of these threats are the ones that concern you):
>
> 1. If the initrd mounted any filesystem read-write, or any journalled
> filesystem, and left it mounted, that would be bad.
Yes. (Note that mounting in the first place is the problem. Even if
you umount it, you already did some changes on disk, BAD).
> 2. If the initrd started an ordinary process (or a kernel thread?) and
> left it running, that would be bad. The ata_piix driver really does
> create a kernel thread, though I believe it exists only during actual data
> transfer.
Should not be a problem with new refrigerator setup.
> 3. The initrd is copied into a ramdisc which is then mounted. It's still
> mounted when software_resume is called (as the patch is arranged presently,
> or if the initrd does "echo resume > /sys/power/state"), and jimc isn't too
> sure where the ramdisc's memory goes at that point.
Should not be a problem.
> I was hoping to have a single point in the boot-up code where
> software_resume happened, rather than multiple places or, heaven forbid,
> calling it repeatedly at various steps in the boot process. I think we can
> justify some effort to avoid the situation where software_resume is called
> before initrd loading, and it sometimes refuses to load the image, and then
> is called again by the initrd.
I don't see why calling it repeatedly would be that bad. In fact, it
is what we are doing just now. You can try to resume as many times as
you wish (by echo resume...), but obviously at most one of those will
succeed.
Pavel
On Thu, 14 Apr 2005, Pavel Machek wrote:
> > 1. If the initrd mounted any filesystem read-write, or any journalled
> > filesystem, and left it mounted, that would be bad.
>
> Yes. (Note that mounting in the first place is the problem. Even if
> you umount it, you already did some changes on disk, BAD).
(additional points snipped)
I've been quiet for the last few weeks because I've been installing and
working on SuSE 9.3. Kernel 2.6.11.4 has a fatal problem blamed on the
SATA driver (see below), but with 2.6.12-rc3, software suspend has been
reliable for me, with the initrd ending in "echo resume >
/sys/power/state".
I've come to agree with your position, that if the initrd runs then there
must be a positive action (like echo resume...) in the initrd. The kernel
should not try to clean up and resume by itself, because the initrd could
have done anything. Think of the SuSE rescue initrd.
There are only two rough edges: /sys/power/disk seems to change randomly to
"reboot", whereupon the BIOS reboots from the original boot disc, not going
through Grub or whatever is in the MBR. If you echo shutdown >
/sys/power/disk just before every suspend, it's a lot more reliable. The
other item involves the SATA driver and I'll copy you on that message.
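The workaround for the first item can be wrapped in a small helper so the mode is always set immediately before suspending. This is a sketch only; the function name suspend_to_disk is invented here, and the two writes are exactly the ones quoted in this thread (shutdown to /sys/power/disk, then disk to /sys/power/state).

```shell
# Force the "shutdown" mode, then start a suspend to disk.
# $1: sysfs mount point; defaults to /sys
suspend_to_disk() {
    sysfs=${1:-/sys}
    # Reset the mode every time, in case something changed it to "reboot".
    echo shutdown > "$sysfs/power/disk" || return 1
    # Trigger the suspend itself.
    echo disk > "$sysfs/power/state"
}
```

Run as root with no argument, this would write to the real /sys; passing a scratch directory instead makes it possible to try the helper without suspending the machine.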
Thank you very much for working on software suspend. I use the feature a
lot on my laptop and it greatly adds to the convenience of the O.S.
James F. Carter Voice 310 825 2897 FAX 310 206 6673
Hi!
> There are only two rough edges: /sys/power/disk seems to change randomly to
> "reboot", whereupon the BIOS reboots from the original boot disc, not going
> through Grub or whatever is in the MBR. If you echo shutdown >
> /sys/power/disk just before every suspend, it's a lot more reliable. The
> other item involves the SATA driver and I'll copy you on that
> message.
I'm not sure who changes /sys/power/disk from under you... I do not
think kernel does it. SATA is being worked on. Actually Jens has a
solution by now.
Pavel