2005-02-17 04:46:38

by Dmitry Torokhov

Subject: Swsusp, resume and kernel versions

Pavel,

First of all I must say that swsusp has progressed a lot and now works
very reliably, at least for my configuration, and I use it a lot. Great
job!

But I think there is one pretty severe issue present - even if swsusp
is not enabled, the kernel should check if there is an image in swap and
erase it. Today I had a somewhat unpleasant experience - after suspending
I accidentally loaded a vendor kernel. I was in a hurry and decided that
resume had just failed for some reason, so I did a couple of things and left
the box running. In the evening I realized that I was running the vendor kernel
and decided to reboot into my devel. version. What I did not expect was for
the kernel to find a valid suspend image and restore it. As you might
imagine, this messed up my disk somewhat.

Any chance this can be done?

--
Dmitry


2005-02-17 05:13:42

by Nigel Cunningham

Subject: Re: Swsusp, resume and kernel versions

Hi Dmitry.

On Thu, 2005-02-17 at 15:46, Dmitry Torokhov wrote:
> Pavel,
>
> First of all I must say that swsusp has progressed a lot and now works
> very reliably, at least for my configuration, and I use it a lot. Great
> job!
>
> But I think there is one pretty severe issue present - even if swsusp
> is not enabled, the kernel should check if there is an image in swap and
> erase it. Today I had a somewhat unpleasant experience - after suspending
> I accidentally loaded a vendor kernel. I was in a hurry and decided that
> resume had just failed for some reason, so I did a couple of things and left
> the box running. In the evening I realized that I was running the vendor kernel
> and decided to reboot into my devel. version. What I did not expect was for
> the kernel to find a valid suspend image and restore it. As you might
> imagine, this messed up my disk somewhat.
>
> Any chance this can be done?

One of my suspend2 users had the same thing yesterday. Unfortunately
there's no easy way for us to detect that another kernel has been
booted. The simplest solution we've found is to add commands in your
init scripts to mkswap the place where your image is stored when
rebooting. This will stop both of the implementations seeing the image
and resuming at a later time.

Regards,

Nigel
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com

Ph: +61 (2) 6292 8028 Mob: +61 (417) 100 574

2005-02-17 05:38:34

by Dmitry Torokhov

Subject: Re: Swsusp, resume and kernel versions

Hi Nigel,

On Thursday 17 February 2005 00:15, Nigel Cunningham wrote:
> Hi Dmitry.
>
> On Thu, 2005-02-17 at 15:46, Dmitry Torokhov wrote:
> > Pavel,
> >
> > First of all I must say that swsusp has progressed a lot and now works
> > very reliably, at least for my configuration, and I use it a lot. Great
> > job!
> >
> > But I think there is one pretty severe issue present - even if swsusp
> > is not enabled, the kernel should check if there is an image in swap and
> > erase it. Today I had a somewhat unpleasant experience - after suspending
> > I accidentally loaded a vendor kernel. I was in a hurry and decided that
> > resume had just failed for some reason, so I did a couple of things and left
> > the box running. In the evening I realized that I was running the vendor kernel
> > and decided to reboot into my devel. version. What I did not expect was for
> > the kernel to find a valid suspend image and restore it. As you might
> > imagine, this messed up my disk somewhat.
> >
> > Any chance this can be done?
>
> One of my suspend2 users had the same thing yesterday. Unfortunately
> there's no easy way for us to detect that another kernel has been
> booted.

What do you mean? I thought it already compares signatures of the booting
kernel and suspend image. Just wipe it out if it does not match, or, even
better, just stop if signature does not match unless one boots with
"nosuspend". This way even if I start booting wrong image I have a chance
to select right one and avoid fsck.

--
Dmitry

2005-02-17 08:07:51

by Nigel Cunningham

Subject: Re: Swsusp, resume and kernel versions

Hi.

On Thu, 2005-02-17 at 16:38, Dmitry Torokhov wrote:
> On Thursday 17 February 2005 00:15, Nigel Cunningham wrote:
> > On Thu, 2005-02-17 at 15:46, Dmitry Torokhov wrote:
> > > But I think there is one pretty severe issue present - even if swsusp
> > > is not enabled, the kernel should check if there is an image in swap and
> > > erase it. Today I had a somewhat unpleasant experience - after suspending
> > > I accidentally loaded a vendor kernel. I was in a hurry and decided that
> > > resume had just failed for some reason, so I did a couple of things and left
> > > the box running. In the evening I realized that I was running the vendor kernel
> > > and decided to reboot into my devel. version. What I did not expect was for
> > > the kernel to find a valid suspend image and restore it. As you might
> > > imagine, this messed up my disk somewhat.
> > >
> > > Any chance this can be done?
> >
> > One of my suspend2 users had the same thing yesterday. Unfortunately
> > there's no easy way for us to detect that another kernel has been
> > booted.
>
> What do you mean? I thought it already compares signatures of the booting
> kernel and suspend image. Just wipe it out if it does not match, or, even
> better, just stop if signature does not match unless one boots with
> "nosuspend". This way even if I start booting wrong image I have a chance
> to select right one and avoid fsck.

That would work if the alternate kernel is suspend-enabled. Suspend2
handles that case nicely and I'm sure swsusp will handle it as well
(although exactly what it does, I'm not sure. It used to panic IIRC).

If the mistakenly booted kernel isn't suspend enabled, however, you need
a more generic method of removing the image, such as mkswapping the
storage device. This is what I was speaking of.

Regards,

Nigel
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com

Ph: +61 (2) 6292 8028 Mob: +61 (417) 100 574

2005-02-17 11:09:13

by Pavel Machek

Subject: Re: Swsusp, resume and kernel versions

Hi!

> First of all I must say that swsusp has progressed a lot and now works
> very reliably, at least for my configuration, and I use it a lot. Great
> job!
>
> But I think there is one pretty severe issue present - even if swsusp
> is not enabled, the kernel should check if there is an image in swap and
> erase it. Today I had a somewhat unpleasant experience - after suspending
> I accidentally loaded a vendor kernel. I was in a hurry and decided that
> resume had just failed for some reason, so I did a couple of things and left
> the box running. In the evening I realized that I was running the vendor kernel
> and decided to reboot into my devel. version. What I did not expect was for
> the kernel to find a valid suspend image and restore it. As you might
> imagine, this messed up my disk somewhat.

When all the vendor's kernels have swsusp, it will magically kill the
signature. Or stick mkswap /dev/XXX in your init scripts.
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-02-17 16:29:19

by John M Flinchbaugh

Subject: Re: Swsusp, resume and kernel versions

On Thu, Feb 17, 2005 at 12:07:31PM +0100, Pavel Machek wrote:
> When all the vendor's kernels have swsusp, it will magically kill the
> signature. Or stick mkswap /dev/XXX in your init scripts.

This is what I've done in some instances. There should be no harm in
sticking that mkswap into your init scripts right before the swapon -a,
and then you have a nice userspace solution.

It's safe to reinitialize swap on any clean boot. A resume will not get
into the init scripts.

Just remember you're doing the mkswap if you decide to rearrange your
partitions at all, or code a script smart enough to grep your swap
partitions out of your fstab.
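
A minimal sketch of the "grep your swap partitions out of your fstab" idea, assuming the usual whitespace-separated fstab layout with the filesystem type in the third field. The function name list_swap_devices is made up for illustration, and entries given as LABEL= or UUID= rather than a device path would need extra handling before being passed to mkswap.

```shell
#!/bin/sh
# Print the device field of every fstab entry whose type is "swap".
# The fstab path is a parameter so the function can be tried on a copy.
# (list_swap_devices is a hypothetical helper, not part of any distribution.)
list_swap_devices() {
    fstab="${1:-/etc/fstab}"
    while read -r dev mountpoint fstype rest; do
        case "$dev" in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        [ "$fstype" = "swap" ] && echo "$dev"
    done < "$fstab"
    return 0
}

# Hypothetical use in an init script, just before "swapon -a":
#   for dev in $(list_swap_devices); do mkswap "$dev"; done
```

The block only defines the function; the destructive mkswap call is left as a comment so nothing is reinitialized by accident.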

--
John M Flinchbaugh
[email protected]



2005-02-17 17:30:58

by Dmitry Torokhov

Subject: Re: Swsusp, resume and kernel versions

On Thu, 17 Feb 2005 11:28:47 -0500, John M Flinchbaugh <[email protected]> wrote:
> On Thu, Feb 17, 2005 at 12:07:31PM +0100, Pavel Machek wrote:
> > When all the vendor's kernels have swsusp, it will magically kill the
> > signature. Or stick mkswap /dev/XXX in your init scripts.
>
> This is what I've done in some instances. There should be no harm in
> sticking that mkswap into your init scripts right before the swapon -a,
> and then you have a nice userspace solution.
>
> It's safe to reinitialize swap on any clean boot. A resume will not get
> into the init scripts.
>
> Just remember you're doing the mkswap if you decide to rearrange your
> partitions at all, or code a script smart enough to grep your swap
> partitions out of your fstab.

It could be a workaround. Still, it will cause loss of unsaved work if
I happen to load the wrong kernel. Given that the code checking for a swsusp
image can be marked __init, I don't understand the reasons against doing
it.

--
Dmitry

2005-02-17 20:00:57

by Pavel Machek

Subject: Re: Swsusp, resume and kernel versions

Hi!

> > Just remember you're doing the mkswap if you decide to rearrange your
> > partitions at all, or code a script smart enough to grep your swap
> > partitions out of your fstab.
>
> It could be a workaround. Still, it will cause loss of unsaved work if
> I happen to load the wrong kernel. Given that the code checking for a swsusp
> image can be marked __init, I don't understand the reasons against doing
> it.

How do you know which partitions to check? swsusp gets it from resume= parameter,
but if you do not have it compiled, you probably have wrong cmdline, too.
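
To illustrate where that information lives: a userspace script can read the same resume= parameter from the running kernel's command line. This is only a sketch, and it helps only when the booted kernel was actually given resume=, which is exactly the doubt raised above. The function name resume_device is hypothetical, and letting the last occurrence win is an assumption of this sketch.

```shell
#!/bin/sh
# Print the value of the resume= parameter from a kernel command line.
# Reads /proc/cmdline by default; a string may be passed in for testing.
# (resume_device is a hypothetical helper name.)
resume_device() {
    cmdline="${1-$(cat /proc/cmdline)}"
    found=""
    for arg in $cmdline; do
        case "$arg" in
            resume=*) found="${arg#resume=}" ;;   # last occurrence wins
        esac
    done
    # Exit status is non-zero when no resume= parameter was present.
    [ -n "$found" ] && echo "$found"
}
```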

--
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms

2005-02-17 20:10:16

by Dmitry Torokhov

Subject: Re: Swsusp, resume and kernel versions

On Thu, 17 Feb 2005 20:56:52 +0100, Pavel Machek <[email protected]> wrote:
> Hi!
>
> > > Just remember you're doing the mkswap if you decide to rearrange your
> > > partitions at all, or code a script smart enough to grep your swap
> > > partitions out of your fstab.
> >
> > It could be a workaround. Still, it will cause loss of unsaved work if
> > I happen to load the wrong kernel. Given that the code checking for a swsusp
> > image can be marked __init, I don't understand the reasons against doing
> > it.
>
> How do you know which partitions to check? swsusp gets it from resume= parameter,
> but if you do not have it compiled, you probably have wrong cmdline, too.
>

Ok, that makes sense. I guess I should just stop pulling vendor
kernels with the rest of updates since I am not using them anyway.

Sorry for the noise.

--
Dmitry

2005-02-18 02:02:30

by Bernard Blackham

Subject: Re: Swsusp, resume and kernel versions

On Thu, Feb 17, 2005 at 08:56:52PM +0100, Pavel Machek wrote:
> > > Just remember you're doing the mkswap if you decide to rearrange your
> > > partitions at all, or code a script smart enough to grep your swap
> > > partitions out of your fstab.
> >
> > It could be a workaround. Still, it will cause loss of unsaved work if
> > I happen to load the wrong kernel. Given that the code checking for a swsusp
> > image can be marked __init, I don't understand the reasons against doing
> > it.
>
> How do you know which partitions to check? swsusp gets it from resume= parameter,
> but if you do not have it compiled, you probably have wrong cmdline, too.

In many cases, you might have added the resume= line to every kernel
that's booted (eg, LILO's global append= parameter, or Debian GRUB's
magic kopts gear). Alternately (or additionally), you could examine
the signature when sys_swapon is called on a swap partition (though
the code couldn't be __init then).

Together, I would claim, these would catch many of these cases, and
any effort to avoid severe filesystem corruption is a good thing.

Bernard.

--
Bernard Blackham <bernard at blackham dot com dot au>

2005-02-18 09:06:52

by Stefan Seyfried

Subject: Re: Swsusp, resume and kernel versions

Nigel Cunningham wrote:

> If the mistakenly booted kernel isn't suspend enabled, however, you need
> a more generic method of removing the image, such as mkswapping the
> storage device. This is what I was speaking of.

The following code is used in the SUSE bootscripts to do exactly this:

----------------------------------------------------
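get_swap_id() {
    local line
    # Print the device name of every partition that fdisk reports as Linux swap.
    fdisk -l | while read line; do
        case "$line" in
            /*Linux\ [sS]wap*) echo "${line%% *}" ;;
        esac
    done
}

check_swap_sig () {
    local part="$(get_swap_id)"
    local where what type rest p c
    while read where what type rest ; do
        test "$type" = "swap" || continue
        # Only touch fstab swap entries that fdisk also lists as swap partitions.
        c=continue
        for p in $part ; do
            test "$p" = "$where" && c=true
        done
        $c
        # The signature sits in the last 10 bytes of the first 4096-byte page;
        # the first 6 of those bytes are compared here.
        case "$(dd if="$where" bs=1 count=6 skip=4086 2>/dev/null)" in
            S1SUSP|S2SUSP) mkswap "$where" ;;
        esac
    done < /etc/fstab
}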
---------------------------------------------------------------------

This invalidates the suspend signature if the kernel has not already
done it. It probably does not cover the softwaresuspend2 signature but
that should be trivial to add.
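
The signature test can be tried without touching a real partition. The sketch below applies the same dd read to an arbitrary file or device and only reports what it finds. It assumes a 4096-byte page size. The function name swap_signature_kind and its output labels are invented for illustration. The swsusp strings are the ones matched by the script above, and SWAP-SPACE and SWAPSPACE2 are the regular swap signatures.

```shell
#!/bin/sh
# Classify the 10-byte signature at the end of the first 4096-byte page.
# (swap_signature_kind and its output labels are hypothetical.)
swap_signature_kind() {
    # NUL bytes are stripped so the shell can hold the result in a variable.
    sig=$(dd if="$1" bs=1 skip=4086 count=10 2>/dev/null | tr -d '\000')
    case "$sig" in
        S1SUSP*|S2SUSP*)       echo "suspend-image" ;;
        SWAPSPACE2|SWAP-SPACE) echo "swap" ;;
        *)                     echo "unknown" ;;
    esac
}
```

Running it against an image file built with dd is a harmless way to check the offsets before wiring a mkswap call into a boot script.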

Regards,

Stefan

2005-02-18 10:33:33

by Nigel Cunningham

Subject: Re: Swsusp, resume and kernel versions

Hi Stefan.

For Suspend2, we also put a device id in the space, so there's only room
for one character, which is a lower or upper case Z. (We also validate
the device ID, so a random Z won't cause an oops).

Thanks for the code. With your/Suse's permission, I'll ask Bernard
(cc'd) to include the script in the docs somewhere with the appropriate
credit.

Thanks and regards,

Nigel

On Fri, 2005-02-18 at 08:05, Stefan Seyfried wrote:
> Nigel Cunningham wrote:
>
> > If the mistakenly booted kernel isn't suspend enabled, however, you need
> > a more generic method of removing the image, such as mkswapping the
> > storage device. This is what I was speaking of.
>
> The following code is used in the SUSE bootscripts to do exactly this:
>
> ----------------------------------------------------
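> get_swap_id() {
>     local line
>     # Print the device name of every partition that fdisk reports as Linux swap.
>     fdisk -l | while read line; do
>         case "$line" in
>             /*Linux\ [sS]wap*) echo "${line%% *}" ;;
>         esac
>     done
> }
>
> check_swap_sig () {
>     local part="$(get_swap_id)"
>     local where what type rest p c
>     while read where what type rest ; do
>         test "$type" = "swap" || continue
>         # Only touch fstab swap entries that fdisk also lists as swap partitions.
>         c=continue
>         for p in $part ; do
>             test "$p" = "$where" && c=true
>         done
>         $c
>         # The signature sits in the last 10 bytes of the first 4096-byte page;
>         # the first 6 of those bytes are compared here.
>         case "$(dd if="$where" bs=1 count=6 skip=4086 2>/dev/null)" in
>             S1SUSP|S2SUSP) mkswap "$where" ;;
>         esac
>     done < /etc/fstab
> }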
> ---------------------------------------------------------------------
>
> This invalidates the suspend signature if the kernel has not already
> done it. It probably does not cover the softwaresuspend2 signature but
> that should be trivial to add.
>
> Regards,
>
> Stefan
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com

Ph: +61 (2) 6292 8028 Mob: +61 (417) 100 574

2005-02-18 11:24:28

by Pavel Machek

Subject: Re: Swsusp, resume and kernel versions

Hi!

> > > > Just remember you're doing the mkswap if you decide to rearrange your
> > > > partitions at all, or code a script smart enough to grep your swap
> > > > partitions out of your fstab.
> > >
> > > It could be a workaround. Still, it will cause loss of unsaved work if
> > > I happen to load the wrong kernel. Given that the code checking for a swsusp
> > > image can be marked __init, I don't understand the reasons against doing
> > > it.
> >
> > How do you know which partitions to check? swsusp gets it from resume= parameter,
> > but if you do not have it compiled, you probably have wrong cmdline, too.
>
> In many cases, you might have added the resume= line to every kernel
> that's booted (eg, LILO's global append= parameter, or Debian GRUB's
> magic kopts gear). Alternately (or additionally), you could examine
> the signature when sys_swapon is called on a swap partition (though
> the code couldn't be __init then).
>
> Together, I would claim, these would catch many of these cases, and
> any effort to avoid severe filesystem corruption is a good thing.

Messing up the kernel to avoid fs corruption in some cases is a bad
idea.

If you want to be 100% safe, add support to LILO/GRUB: just do not
allow selecting the wrong kernel if the last action was suspend. The
bootloader knows; it has seen the command lines.

Or simply use Stefan's userland code.
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-02-18 11:25:08

by Stefan Seyfried

Subject: Re: Swsusp, resume and kernel versions

Nigel Cunningham wrote:
> Hi Stefan.
>
> For Suspend2, we also put a device id in the space, so there's only room
> for one character, which is a lower or upper case Z. (We also validate
> the device ID, so a random Z won't cause an oops).
>
> Thanks for the code. With your/Suse's permission, I'll ask Bernard
> (cc'd) to include the script in the docs somewhere with the appropriate
> credit.

I have consulted our license guy and the original author of
/etc/init.d/boot.swap and am glad to say that it is GPL'd ;-)

Have fun,

Stefan

--
Stefan Seyfried, QA / R&D Team Mobile Devices, SUSE LINUX, Nürnberg.

"Any ideas, John?"
"Well, surrounding them's out."

2005-02-18 12:26:36

by Bernard Blackham

Subject: Re: Swsusp, resume and kernel versions

On Fri, Feb 18, 2005 at 12:24:09PM +0100, Pavel Machek wrote:
> If you want to be 100% safe, add support to LILO/GRUB: just do not
> allow selecting the wrong kernel if the last action was suspend. The
> bootloader knows; it has seen the command lines.

That's a very good point/solution indeed. The hibernate script
available from the Software Suspend 2 homepage already has options
to reconfigure LILO/GRUB upon suspending. I'd forgotten about them!

Bernard.

--
Bernard Blackham <bernard at blackham dot com dot au>