Hi Pavel,
hi Rafael,
after a quick search i couldn't find anything dealing with the topic in the
subject line so here we go:
One sometimes can mix up (and by one i mean me) the
kernel images one boots after having suspended the machine previously. There can
be at least two reasons for that:
1. too many kernels in grub and having forgotten which one i suspended with.
2. compile and install a new kernel and forget about it, suspend in the evening
and then boot with the new kernel;
in both cases you end up staring at fsck since the filesystems haven't been unmounted,
of course. Or at least you see the warning message of some journal recovery whisk by.
In order to alleviate that, one could, imho, write into the swsusp_header
the version of the kernel which suspended the machine (UTS_RELEASE) alongside
SWSUSP_SIG and check that against the version of the kernel just booting.
If they match then all is well, if not, one could
a) issue a BIG FAT WARNING and reboot telling the user to select the proper
image
b) ask the user what to do:
- proceed as if "noresume" has been entered on the kernel command line
- reboot after printing the kernel version which suspended the machine
-
c)...
In case you guys think something like that might be of use i can come up with a
patch in the next coupla days...
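To make the idea a bit more concrete, here is a rough userspace sketch of the check, not a patch. The struct layout, the demo_* names and the return codes are all made up for illustration; the real swsusp_header looks different, and DEMO_SIG merely stands in for SWSUSP_SIG.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_SIG     "S1SUSPEND"  /* stand-in for SWSUSP_SIG */
#define DEMO_REL_LEN 65           /* room for a UTS_RELEASE-style string */

/* Hypothetical header: signature plus release of the suspending kernel. */
struct demo_swsusp_header {
	char sig[10];
	char release[DEMO_REL_LEN];
};

/* At suspend time: record the signature and the running kernel's release. */
static void demo_mark_header(struct demo_swsusp_header *h, const char *release)
{
	memset(h, 0, sizeof(*h));
	strncpy(h->sig, DEMO_SIG, sizeof(h->sig) - 1);
	strncpy(h->release, release, sizeof(h->release) - 1);
}

/*
 * At boot time: returns 0 if the image was written by the booting release,
 * 1 if it was written by a different release, -1 if there is no image.
 */
static int demo_check_header(const struct demo_swsusp_header *h,
			     const char *booting)
{
	if (strncmp(h->sig, DEMO_SIG, sizeof(h->sig)) != 0)
		return -1;
	if (strncmp(h->release, booting, sizeof(h->release)) != 0) {
		fprintf(stderr, "WARNING: image written by %s, now booting %s\n",
			h->release, booting);
		return 1;
	}
	return 0;
}
```

What happens on a return value of 1 is exactly the a)/b)/c) policy question above; the sketch only prints the warning.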
--
Regards/Gruß,
Boris.
On Thursday, 6 of December 2007, Borislav Petkov wrote:
> Hi Pavel,
> hi Rafael,
>
> after a quick search i couldn't find anything dealing with the topic in the
> subject line so here we go:
>
> One sometimes can mix up (and by one i mean me) the
> kernel images one boots after having suspended the machine previously. There can
> be at least two reasons for that:
>
> 1. too many kernels in grub and having forgotten with which i suspended.
> 2. compile and install a new kernel and forget about it, suspend in the evening
> and then boot with the new kernel;
>
> in both cases you end up staring at fsck since they filesystems haven't been unmounted,
> of course. Or at least see the warning message of some journal recovery whisk away.
>
> In order to alleviate that, one could probably go, imho, and write in the swsusp_header
> the kernel version which suspended the machine (UTS_RELEASE) alongside
> SWSUSP_SIG and check that against the kernel version of the image just booting.
> If they match then all is well, if not, one could
>
> a) issue a BIG FAT WARNING and reboot telling the user to select the proper
> image
> b) ask the user what to do:
> - proceed as if "noresume" has been entered on the kernel command line
> - reboot after issuing the kernel version which suspended the machine
> -
> c)...
>
> In case you guys think something like that might be of use i can come up with a
> patch in the next coupla days...
Well, there's a patchset in the current mainline that allows you to use
an arbitrary (sufficiently new) kernel to load the image and then restore the
image kernel. So, you can hibernate 2.6.24-rc3 and use 2.6.24-rc2 to restore
it, for example.
I'm going to do that for i386 too.
Greetings,
Rafael
On Thu, Dec 06, 2007 at 10:46:48PM +0100, Rafael J. Wysocki wrote:
> On Thursday, 6 of December 2007, Borislav Petkov wrote:
> > Hi Pavel,
> > hi Rafael,
> >
> > after a quick search i couldn't find anything dealing with the topic in the
> > subject line so here we go:
> >
> > One sometimes can mix up (and by one i mean me) the
> > kernel images one boots after having suspended the machine previously. There can
> > be at least two reasons for that:
> >
> > 1. too many kernels in grub and having forgotten with which i suspended.
> > 2. compile and install a new kernel and forget about it, suspend in the evening
> > and then boot with the new kernel;
> >
> > in both cases you end up staring at fsck since they filesystems haven't been unmounted,
> > of course. Or at least see the warning message of some journal recovery whisk away.
> >
> > In order to alleviate that, one could probably go, imho, and write in the swsusp_header
> > the kernel version which suspended the machine (UTS_RELEASE) alongside
> > SWSUSP_SIG and check that against the kernel version of the image just booting.
> > If they match then all is well, if not, one could
> >
> > a) issue a BIG FAT WARNING and reboot telling the user to select the proper
> > image
> > b) ask the user what to do:
> > - proceed as if "noresume" has been entered on the kernel command line
> > - reboot after issuing the kernel version which suspended the machine
> > -
> > c)...
> >
> > In case you guys think something like that might be of use i can come up with a
> > patch in the next coupla days...
>
> Well, there's a patchset in the current mainline that allows you to use
> arbitrary (sufficiently new) kernel to load the image and then restore the
> image kernel. So, you can hibernate 2.6.24-rc3 and use 2.6.24-rc2 to restore
> it, for example.
>
> I'm going to do that for i386 too.
right, this is d307c4a8e826c44f9633bd3f7e60d0491e7d885a (Hibernation: Arbitrary
boot kernel support - generic code), i should've seen that. What's the status of
those bits? From a quick scan it seems they still need some rewiring (Kconfig, e.g.
CONFIG_ARCH_HIBERNATION_HEADER etc.) and arch-specific save and restore
functions?
--
Regards/Gruß,
Boris.
On Friday, 7 of December 2007, Borislav Petkov wrote:
> On Thu, Dec 06, 2007 at 10:46:48PM +0100, Rafael J. Wysocki wrote:
> > On Thursday, 6 of December 2007, Borislav Petkov wrote:
> > > Hi Pavel,
> > > hi Rafael,
> > >
> > > after a quick search i couldn't find anything dealing with the topic in the
> > > subject line so here we go:
> > >
> > > One sometimes can mix up (and by one i mean me) the
> > > kernel images one boots after having suspended the machine previously. There can
> > > be at least two reasons for that:
> > >
> > > 1. too many kernels in grub and having forgotten with which i suspended.
> > > 2. compile and install a new kernel and forget about it, suspend in the evening
> > > and then boot with the new kernel;
> > >
> > > in both cases you end up staring at fsck since they filesystems haven't been unmounted,
> > > of course. Or at least see the warning message of some journal recovery whisk away.
> > >
> > > In order to alleviate that, one could probably go, imho, and write in the swsusp_header
> > > the kernel version which suspended the machine (UTS_RELEASE) alongside
> > > SWSUSP_SIG and check that against the kernel version of the image just booting.
> > > If they match then all is well, if not, one could
> > >
> > > a) issue a BIG FAT WARNING and reboot telling the user to select the proper
> > > image
> > > b) ask the user what to do:
> > > - proceed as if "noresume" has been entered on the kernel command line
> > > - reboot after issuing the kernel version which suspended the machine
> > > -
> > > c)...
> > >
> > > In case you guys think something like that might be of use i can come up with a
> > > patch in the next coupla days...
> >
> > Well, there's a patchset in the current mainline that allows you to use
> > arbitrary (sufficiently new) kernel to load the image and then restore the
> > image kernel. So, you can hibernate 2.6.24-rc3 and use 2.6.24-rc2 to restore
> > it, for example.
> >
> > I'm going to do that for i386 too.
> right, this is d307c4a8e826c44f9633bd3f7e60d0491e7d885a (Hibernation: Arbitrary
> boot kernel support - generic code), i should've seen that. What's the status of
> those bits, from a quick scan it seems they need some rewiring (Kconfig, e.g.
> CONFIG_ARCH_HIBERNATION_HEADER etc..) and arch-specific save and restore
> functions?
No, this code is fully functional. :-)
The arch save and restore functions are in arch/x86/kernel/suspend_64.c .
As I said, i386 is not yet supported.
Greetings,
Rafael
On Fri, Dec 07, 2007 at 09:19:09PM +0100, Rafael J. Wysocki wrote:
...
> > > Well, there's a patchset in the current mainline that allows you to use
> > > arbitrary (sufficiently new) kernel to load the image and then restore the
> > > image kernel. So, you can hibernate 2.6.24-rc3 and use 2.6.24-rc2 to restore
> > > it, for example.
> > >
> > > I'm going to do that for i386 too.
> > right, this is d307c4a8e826c44f9633bd3f7e60d0491e7d885a (Hibernation: Arbitrary
> > boot kernel support - generic code), i should've seen that. What's the status of
> > those bits, from a quick scan it seems they need some rewiring (Kconfig, e.g.
> > CONFIG_ARCH_HIBERNATION_HEADER etc..) and arch-specific save and restore
> > functions?
>
> No, this code is fully functional. :-)
>
> The arch save and restore functions are in arch/x86/kernel/suspend_64.c .
>
> As I said, i386 is not yet supported.
nice, holler if you need a tester when you have some prototypes ready. By the way,
what do you do when the suspend image header mismatches and it is unsafe to continue booting?
Also, there's a freakishly long comment in suspend_64.c, might wanna shorten it:
Signed-off-by: Borislav Petkov <[email protected]>
diff --git a/arch/x86/kernel/suspend_64.c b/arch/x86/kernel/suspend_64.c
index db284ef..0a23e5f 100644
--- a/arch/x86/kernel/suspend_64.c
+++ b/arch/x86/kernel/suspend_64.c
@@ -118,7 +118,12 @@ void fix_processor_context(void)
int cpu = smp_processor_id();
struct tss_struct *t = &per_cpu(init_tss, cpu);
- set_tss_desc(cpu,t); /* This just modifies memory; should not be necessary. But... This is necessary, because 386 hardware has concept of busy TSS or some similar stupidity. */
+ /*
+ * This just modifies memory; should not be necessary. But... This
+ * is necessary, because 386 hardware has concept of busy TSS or some
+ * similar stupidity.
+ */
+ set_tss_desc(cpu,t);
cpu_gdt(cpu)[GDT_ENTRY_TSS].type = 9;
@@ -138,7 +143,6 @@ void fix_processor_context(void)
loaddebug(&current->thread, 6);
loaddebug(&current->thread, 7);
}
-
}
#ifdef CONFIG_HIBERNATION
--
Regards/Gruß,
Boris.
On Saturday, 8 of December 2007, Borislav Petkov wrote:
> On Fri, Dec 07, 2007 at 09:19:09PM +0100, Rafael J. Wysocki wrote:
>
> ...
>
> > > > Well, there's a patchset in the current mainline that allows you to use
> > > > arbitrary (sufficiently new) kernel to load the image and then restore the
> > > > image kernel. So, you can hibernate 2.6.24-rc3 and use 2.6.24-rc2 to restore
> > > > it, for example.
> > > >
> > > > I'm going to do that for i386 too.
> > > right, this is d307c4a8e826c44f9633bd3f7e60d0491e7d885a (Hibernation: Arbitrary
> > > boot kernel support - generic code), i should've seen that. What's the status of
> > > those bits, from a quick scan it seems they need some rewiring (Kconfig, e.g.
> > > CONFIG_ARCH_HIBERNATION_HEADER etc..) and arch-specific save and restore
> > > functions?
> >
> > No, this code is fully functional. :-)
> >
> > The arch save and restore functions are in arch/x86/kernel/suspend_64.c .
> >
> > As I said, i386 is not yet supported.
>
> nice, holler if you need a tester when you have some prototypes ready. By the way,
> what do you do when the suspend image header mismatches and it is unsafe to continue booting?
If the image header doesn't match, we don't load it and return an error code,
which usually results in the boot kernel continuing to boot.
> Also, there's a freakishly long comment in suspend_64.c, might wanna shorten it:
Ah, OK.
I'll take your patch for 2.6.25, thanks.
> Signed-off-by: Borislav Petkov <[email protected]>
>
> diff --git a/arch/x86/kernel/suspend_64.c b/arch/x86/kernel/suspend_64.c
> index db284ef..0a23e5f 100644
> --- a/arch/x86/kernel/suspend_64.c
> +++ b/arch/x86/kernel/suspend_64.c
> @@ -118,7 +118,12 @@ void fix_processor_context(void)
> int cpu = smp_processor_id();
> struct tss_struct *t = &per_cpu(init_tss, cpu);
>
> - set_tss_desc(cpu,t); /* This just modifies memory; should not be necessary. But... This is necessary, because 386 hardware has concept of busy TSS or some similar stupidity. */
> + /*
> + * This just modifies memory; should not be necessary. But... This
> + * is necessary, because 386 hardware has concept of busy TSS or some
> + * similar stupidity.
> + */
> + set_tss_desc(cpu,t);
>
> cpu_gdt(cpu)[GDT_ENTRY_TSS].type = 9;
>
> @@ -138,7 +143,6 @@ void fix_processor_context(void)
> loaddebug(&current->thread, 6);
> loaddebug(&current->thread, 7);
> }
> -
> }
>
> #ifdef CONFIG_HIBERNATION
>
On Sat, Dec 08, 2007 at 11:50:33PM +0100, Rafael J. Wysocki wrote:
> On Saturday, 8 of December 2007, Borislav Petkov wrote:
> > On Fri, Dec 07, 2007 at 09:19:09PM +0100, Rafael J. Wysocki wrote:
> >
> > ...
> >
> > > > > Well, there's a patchset in the current mainline that allows you to use
> > > > > arbitrary (sufficiently new) kernel to load the image and then restore the
> > > > > image kernel. So, you can hibernate 2.6.24-rc3 and use 2.6.24-rc2 to restore
> > > > > it, for example.
> > > > >
> > > > > I'm going to do that for i386 too.
> > > > right, this is d307c4a8e826c44f9633bd3f7e60d0491e7d885a (Hibernation: Arbitrary
> > > > boot kernel support - generic code), i should've seen that. What's the status of
> > > > those bits, from a quick scan it seems they need some rewiring (Kconfig, e.g.
> > > > CONFIG_ARCH_HIBERNATION_HEADER etc..) and arch-specific save and restore
> > > > functions?
> > >
> > > No, this code is fully functional. :-)
> > >
> > > The arch save and restore functions are in arch/x86/kernel/suspend_64.c .
> > >
> > > As I said, i386 is not yet supported.
> >
> > nice, holler if you need a tester when you have some prototypes ready. By the way,
> > what do you do when the suspend image header mismatches and it is unsafe to continue booting?
>
> If the image header doesn't match, we don't load it and return an error code,
> which usually results in the boot kernel continuing to boot.
But if you continue to boot, the filesystems were still mounted and fsck has to
go over them and check for errors. In the case of ext2 this takes relatively
long, depending on the size of the partition. However, this is only the
smaller problem; the possibility of data loss is what worries me.
Instead, I'd rather issue a warning that the swsusp header mismatches, say
which kernel the machine got suspended with, and then start the countdown for reboot.
Thoughts?
--
Regards/Gruß,
Boris.
On Sunday, 9 of December 2007, Borislav Petkov wrote:
> On Sat, Dec 08, 2007 at 11:50:33PM +0100, Rafael J. Wysocki wrote:
> > On Saturday, 8 of December 2007, Borislav Petkov wrote:
> > > On Fri, Dec 07, 2007 at 09:19:09PM +0100, Rafael J. Wysocki wrote:
> > >
> > > ...
> > >
> > > > > > Well, there's a patchset in the current mainline that allows you to use
> > > > > > arbitrary (sufficiently new) kernel to load the image and then restore the
> > > > > > image kernel. So, you can hibernate 2.6.24-rc3 and use 2.6.24-rc2 to restore
> > > > > > it, for example.
> > > > > >
> > > > > > I'm going to do that for i386 too.
> > > > > right, this is d307c4a8e826c44f9633bd3f7e60d0491e7d885a (Hibernation: Arbitrary
> > > > > boot kernel support - generic code), i should've seen that. What's the status of
> > > > > those bits, from a quick scan it seems they need some rewiring (Kconfig, e.g.
> > > > > CONFIG_ARCH_HIBERNATION_HEADER etc..) and arch-specific save and restore
> > > > > functions?
> > > >
> > > > No, this code is fully functional. :-)
> > > >
> > > > The arch save and restore functions are in arch/x86/kernel/suspend_64.c .
> > > >
> > > > As I said, i386 is not yet supported.
> > >
> > > nice, holler if you need a tester when you have some prototypes ready. By the way,
> > > what do you do when the suspend image header mismatches and it is unsafe to continue booting?
> >
> > If the image header doesn't match, we don't load it and return an error code,
> > which usually results in the boot kernel continuing to boot.
>
> But if you continue to boot the filesystems were still mounted and fsck has to
> go over them and check for errors. In the case of ext2 this takes relatively
> long depending on the size of the partition. However, this is only the
> smaller problem, the problem of data loss is what worries me.
The filesystems are synced before the hibernation, so there shouldn't be data
any loss.
> Instead, I'd rather issue a warning that the swsusp header mismatches, say with
> which kernel the machine got suspended with and then start the countdown for reboot.
What exactly would that change? You need to reboot anyway and fsck will run on
the filesystems regardless of which kernel you boot with.
Rafael
On Sunday, 9 of December 2007, Rafael J. Wysocki wrote:
> On Sunday, 9 of December 2007, Borislav Petkov wrote:
> > On Sat, Dec 08, 2007 at 11:50:33PM +0100, Rafael J. Wysocki wrote:
> > > On Saturday, 8 of December 2007, Borislav Petkov wrote:
> > > > On Fri, Dec 07, 2007 at 09:19:09PM +0100, Rafael J. Wysocki wrote:
> > > >
> > > > ...
> > > >
> > > > > > > Well, there's a patchset in the current mainline that allows you to use
> > > > > > > arbitrary (sufficiently new) kernel to load the image and then restore the
> > > > > > > image kernel. So, you can hibernate 2.6.24-rc3 and use 2.6.24-rc2 to restore
> > > > > > > it, for example.
> > > > > > >
> > > > > > > I'm going to do that for i386 too.
> > > > > > right, this is d307c4a8e826c44f9633bd3f7e60d0491e7d885a (Hibernation: Arbitrary
> > > > > > boot kernel support - generic code), i should've seen that. What's the status of
> > > > > > those bits, from a quick scan it seems they need some rewiring (Kconfig, e.g.
> > > > > > CONFIG_ARCH_HIBERNATION_HEADER etc..) and arch-specific save and restore
> > > > > > functions?
> > > > >
> > > > > No, this code is fully functional. :-)
> > > > >
> > > > > The arch save and restore functions are in arch/x86/kernel/suspend_64.c .
> > > > >
> > > > > As I said, i386 is not yet supported.
> > > >
> > > > nice, holler if you need a tester when you have some prototypes ready. By the way,
> > > > what do you do when the suspend image header mismatches and it is unsafe to continue booting?
> > >
> > > If the image header doesn't match, we don't load it and return an error code,
> > > which usually results in the boot kernel continuing to boot.
> >
> > But if you continue to boot the filesystems were still mounted and fsck has to
> > go over them and check for errors. In the case of ext2 this takes relatively
> > long depending on the size of the partition. However, this is only the
> > smaller problem, the problem of data loss is what worries me.
>
> The filesystems are synced before the hibernation, so there shouldn't be data
> any loss.
s/data any loss/any data loss/ (sorry).
On Sun, Dec 09, 2007 at 03:27:57PM +0100, Rafael J. Wysocki wrote:
...
> > Instead, I'd rather issue a warning that the swsusp header mismatches, say with
> > which kernel the machine got suspended with and then start the countdown for reboot.
>
> What exactly would that change? You need to reboot anyway and fsck will run on
> the filesystems regardless of which kernel you boot with.
well, you'll have the chance to reboot with the kernel the machine got suspended
with and then the swsusp image header _will_ match so no fsck-ing. or am i
missing something...
--
Regards/Gruß,
Boris.
On Sunday, 9 of December 2007, Borislav Petkov wrote:
> On Sun, Dec 09, 2007 at 03:27:57PM +0100, Rafael J. Wysocki wrote:
> ...
>
> > > Instead, I'd rather issue a warning that the swsusp header mismatches, say with
> > > which kernel the machine got suspended with and then start the countdown for reboot.
> >
> > What exactly would that change? You need to reboot anyway and fsck will run on
> > the filesystems regardless of which kernel you boot with.
>
> well, you'll have the chance to reboot with the kernel the machine got suspended
> with and then the swsusp image header _will_ match so no fsck-ing. or am i
> missing something...
Yes, you are. :-)
With the new code (which BTW I'm assuming we are talking about) the images are
not matched against the kernel they were created by, but against a hard-coded
magic number (defined in suspend_64.c) playing the role of the "header protocol
version" and against some system parameters, like the amount of RAM etc.
Since all kernels containing the new code use the same magic number, all of
them will match or none of them will match.
Of course, we may want to change the magic at some point, and in that case
you will have some non-matching kernels. If you're worried about that,
use the userland hibernation tools from suspend.sf.net (they handle such
situations quite well).
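Schematically, the matching described above looks something like the userspace sketch below. This is not the code from suspend_64.c: the struct, the demo_* names and the magic value are invented for illustration, and the amount of RAM stands in for "some system parameters".

```c
#include <errno.h>
#include <stdint.h>

/* Made-up value playing the role of the "header protocol version". */
#define DEMO_RESTORE_MAGIC 0x0123456789ABCDEFULL

/* Hypothetical arch header: no kernel version is stored in it at all. */
struct demo_image_header {
	uint64_t magic;      /* must equal DEMO_RESTORE_MAGIC */
	uint64_t ram_pages;  /* amount of RAM seen by the image kernel */
};

/* Returns 0 if the image may be loaded, a negative error code otherwise. */
static int demo_check_image(const struct demo_image_header *img,
			    uint64_t boot_ram_pages)
{
	if (img->magic != DEMO_RESTORE_MAGIC)
		return -EINVAL;
	if (img->ram_pages != boot_ram_pages)
		return -EINVAL;
	return 0;
}

/* Returns 1 if we resume from the image, 0 if the boot kernel just keeps booting. */
static int demo_try_resume(const struct demo_image_header *img,
			   uint64_t boot_ram_pages)
{
	return demo_check_image(img, boot_ram_pages) == 0;
}
```

Since nothing in the check depends on which kernel wrote the image, rebooting into the kernel that did the hibernation cannot turn a mismatch into a match.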
Greetings,
Rafael
On Sun, Dec 09, 2007 at 10:46:35PM +0100, Rafael J. Wysocki wrote:
> On Sunday, 9 of December 2007, Borislav Petkov wrote:
> > On Sun, Dec 09, 2007 at 03:27:57PM +0100, Rafael J. Wysocki wrote:
> > ...
> >
> > > > Instead, I'd rather issue a warning that the swsusp header mismatches, say with
> > > > which kernel the machine got suspended with and then start the countdown for reboot.
> > >
> > > What exactly would that change? You need to reboot anyway and fsck will run on
> > > the filesystems regardless of which kernel you boot with.
> >
> > well, you'll have the chance to reboot with the kernel the machine got suspended
> > with and then the swsusp image header _will_ match so no fsck-ing. or am i
> > missing something...
>
> Yes, you are. :-)
>
> With the new code (which BTW I'm assuming we are talking about) the images are
> not matched against the kernel they were created by, but against a hard-coded
> magic number (defined in suspend_64.c) playing the role of the "header protocol
> version" and against some system parameters, like the amount of RAM etc.
> Since all kernels containing the new code use the same magic number, all of
> them will match or none of them will match.
right, i was kinda wondering when actually a swsusp image won't match after looking
at check_image_kernel() but missed that arch-specific RESTORE_MAGIC bit.
Thanks for clearing that up.
--
Regards/Gruß,
Boris.