Kexec-based hibernation has some potential advantages over uswsusp and
TuxOnIce (suspend2). The most obvious advantages are:
1. The hibernation image size can easily exceed half of the memory size.
2. The hibernation image can be written to and read from almost
anywhere, such as a USB disk or NFS.
3. It is possible to eliminate the freezer from a kexec-based
hibernation implementation.
4. Because it builds on the kexec/kdump implementation, less kernel
code is needed.
The hibernation procedure with the patch set is as follows:
1. Boot a kernel A
2. Work under kernel A
3. Kexec another kernel B (crash dump enabled) in kernel A.
4. Save the memory image of kernel A through crash dump (such as "cp
/proc/vmcore ~"). Save the "jump back entry".
5. Shut down or reboot
The restore process with the patch set is as follows:
1. Boot a kernel C (crash dump enabled). The memory area used by
kernel C must be a subset of the memory area used by kernel B.
2. Restore the memory image of kernel A through /dev/oldmem. Restore
the "jump back entry".
3. Jump from kernel C back to kernel A
4. Continue work under kernel A
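The subset requirement in restore step 1 can be checked with a little
shell arithmetic before booting kernel C. The following is only a
sketch: the helper names to_bytes, memmap_range and range_within are
made up for illustration and are not part of kexec-tools or krestore.
It converts a memmap=<size>@<start> value to a byte range and tests
whether that range lies inside a --mem-min/--mem-max window. The
example values are the ones used in the Usage steps of this message.
The memmap=640K@0K entry from the same example is not checked here.

```shell
#!/bin/sh
# Hypothetical helpers, not part of kexec-tools or krestore.

# Convert a size such as 640K, 15M or 0x100000 to a byte count.
to_bytes() {
    case "$1" in
        *[Kk]) echo $(( ${1%[Kk]} * 1024 )) ;;
        *[Mm]) echo $(( ${1%[Mm]} * 1024 * 1024 )) ;;
        *)     echo $(( $1 )) ;;
    esac
}

# Print "<first byte> <last byte>" for a memmap value of the form size@start.
memmap_range() {
    size=$(to_bytes "${1%@*}")
    start=$(to_bytes "${1#*@}")
    echo "$start $(( start + size - 1 ))"
}

# Succeed if the range [$1, $2] lies entirely inside the range [$3, $4].
range_within() {
    [ $(( $1 )) -ge $(( $3 )) ] && [ $(( $2 )) -le $(( $4 )) ]
}

# Kernel C region (memmap=15M@1M) against the kernel B window
# (--mem-min=0x100000 --mem-max=0xffffff).
range=$(memmap_range 15M@1M)
if range_within "${range% *}" "${range#* }" 0x100000 0xffffff; then
    echo "memmap=15M@1M is inside the kernel B window"
else
    echo "memmap=15M@1M is NOT inside the kernel B window"
fi
```

With the values above, 15M@1M covers 0x100000 through 0xffffff, which is
exactly the window given to kexec for kernel B.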
The following user-space tools are needed to implement hibernation and
restore.
1. kexec-tools needs to be patched to support kexec jump. The patches
and the precompiled kexec can be downloaded from the following URLs:
source: http://khibernation.sourceforge.net/download/release_v5/kexec-tools/kexec-tools-src_v5.tar.gz
patches: http://khibernation.sourceforge.net/download/release_v5/kexec-tools/kexec-tools-patches_v5.tar.gz
binary: http://khibernation.sourceforge.net/download/release_v5/kexec-tools/kexec_v5.tar.gz
2. Memory image saving tool. Currently, the memory image is saved
with: "cp /proc/vmcore <image file>". This saves all memory pages of
the original kernel, including the free pages. The crash dump tool
"makedumpfile" might be usable for this, but it has not been
tested.
3. Memory image restore tool. A very simple memory image restore tool
named "krestore" is implemented. It can be downloaded from the
following URLs:
source: http://khibernation.sourceforge.net/download/release_v5/krestore/krestore-src_v5.tar.gz
binary: http://khibernation.sourceforge.net/download/release_v5/krestore/krestore_v5.tar.gz
Usage:
1. Compile the kernel with the following options selected:
CONFIG_X86_32=y
CONFIG_RELOCATABLE=y # not needed strictly, but it is more convenient with it
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y # only needed by kexeced kernel to save/restore memory image
CONFIG_PM=y
2. Download the kexec-tools-testing git tree, apply the kexec-tools
kjump patches (or download the source tar ball directly) and
compile.
3. Download and compile the krestore tool.
4. Prepare two root partitions, one used by kernel A and one by kernel
B/C, referred to as /dev/hda and /dev/hdb in the following text. This
is not strictly necessary; I use this scheme for testing during
development.
5. Boot the kernel compiled for normal usage (kernel A).
6. Load the kernel compiled for hibernating/restore usage (kernel B)
with kexec. The same kernel as in step 5 can be used if
CONFIG_RELOCATABLE=y and CONFIG_CRASH_DUMP=y are selected.
The --elf64-core-headers option should be specified on the kexec
command line, because the krestore tool supports only 64-bit ELF.
For example, the shell command line can be as follows:
kexec --load-jump-back /boot/bzImage --mem-min=0x100000 --mem-max=0xffffff \
--elf64-core-headers --append="root=/dev/hdb single"
7. Jump to the hibernating kernel (kernel B) with the following shell
command line:
kexec -e
8. In the hibernating kernel (kernel B), the memory image of the
hibernated kernel (kernel A) can be saved as follows:
cp /proc/vmcore .
cat /proc/cmdline | tr ' ' '\n' | grep kexec_jump_back_entry | cut -d '=' -f 2 > kexec_jump_back_entry
9. Shut down or reboot in the hibernating kernel (kernel B).
10. Boot the kernel compiled for hibernating/restore usage (kernel C)
on the root file system /dev/hdb, within the memory range of kernel B.
For example, the following kernel command line parameters can be
used:
root=/dev/hdb single memmap=exactmap memmap=640K@0K memmap=15M@1M
11. In the restore kernel (kernel C), the memory image of kernel A can
be restored as follows:
krestore vmcore
12. Jump back to the hibernated kernel (kernel A):
kexec -b --jump-back-entry=`cat kexec_jump_back_entry`
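Steps 8 and 12 depend on the kexec_jump_back_entry value surviving the
trip from kernel B's command line to the "kexec -b" invocation. As a
sketch only, the pipeline from step 8 can be wrapped in a small function
that matches the parameter name exactly and fails visibly when it is
missing. The function name get_jump_back_entry is hypothetical; only
/proc/cmdline, the kexec_jump_back_entry parameter and the kexec options
already shown above come from the patch set.

```shell
#!/bin/sh
# Hypothetical helper: print the value of kexec_jump_back_entry found in
# the kernel command line passed as $1. Returns non-zero if it is absent.
get_jump_back_entry() {
    # $1 is left unquoted on purpose so that it is split on whitespace.
    for param in $1; do
        case "$param" in
            kexec_jump_back_entry=*)
                echo "${param#kexec_jump_back_entry=}"
                return 0
                ;;
        esac
    done
    echo "kexec_jump_back_entry not found in command line" >&2
    return 1
}

# In kernel B (step 8), assuming /proc/cmdline carries the parameter:
#   get_jump_back_entry "$(cat /proc/cmdline)" > kexec_jump_back_entry
#
# In kernel C (step 12), refuse to jump if the saved file is empty:
#   [ -s kexec_jump_back_entry ] &&
#       kexec -b --jump-back-entry="$(cat kexec_jump_back_entry)"
```

Compared with the grep pipeline, this avoids matching other parameters
that merely contain the string kexec_jump_back_entry, and it does not
write an empty file silently.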
Known issues:
- To comply with the ACPI specification, some ACPI methods must be
invoked before and after hibernation, and the machine should be put
into the S4 state instead of being shut down.
- The setup of hibernation/restore is fairly complex. I will continue
working on simplifying it.
- All memory pages of kernel A, including free pages, are saved. I
think the "makedumpfile" tool can be used to exclude free pages, but I
have not tested it.
Currently, only the i386 architecture is supported. The patch set is
based on Linux kernel 2.6.23-rc8-mm2, and has been tested on an IBM T42
with ACPI on and off.
ChangeLog:
v5:
- A flag (KEXEC_JUMP_BACK) is added to indicate the loaded kernel
image is used for jumping back. The reboot command for jumping back
is removed. This interface is more stable (proposed by Eric
Biederman).
- NX bit handling support for kexec is added.
- Merge machine_kexec and machine_kexec_jump, remove NO_RET attribute
from machine_kexec.
- The jump back entry is passed to the kexeced kernel via the kernel
command line (parsed by a user-space tool via /proc/cmdline instead
of by the kernel). The corresponding original boot parameter and
sysfs code are removed.
v4:
- The two reboot commands are merged back into one because the
underlying implementation is the same.
- Jumping without reserving memory is implemented. As a side effect,
jumping in both directions is implemented.
- A jump back protocol is defined and documented. The original kernel
and kexeced kernel are more independent from each other.
- The CPU state save/restore code is merged into relocate_kernel.S.
v3:
- The reboot command LINUX_REBOOT_CMD_KJUMP is split into two
reboot commands to reflect their different functions.
- Documentation is added for the new kernel parameters.
- /sys/kernel/kexec_jump_buf_pfn is made writable. It is used for
memory image restoring.
- Console restoring after jumping back is implemented.
- Writing support is added for /dev/oldmem, to restore memory contents
of hibernated system.
v2:
- The kexec jump implementation is put into the kexec/kdump framework
instead of software suspend framework. The device and CPU state
save/restore code of software suspend is called when needed.
- The same code path is used for both kexec a new kernel and jump back
to original kernel.
On Thursday, 11 October 2007 04:13, Huang, Ying wrote:
> Kexec base hibernation has some potential advantages over uswsusp and
> TuxOnIce (suspend2). Some most obvious advantages are:
Well, I have some doubts as far as the obviousness is concerned.
> 1. The hibernation image size can exceed half of memory size easily.
This is also possible with TuxOnIce.
> 2. The hibernation image can be written to and read from almost
> anywhere, such as USB disk, NFS.
This is possible with uswsusp, at least in theory, probably with TuxOnIce too.
> 3. It is possible to eliminate freezer from kexec based hibernation
> implementation.
This isn't true as long as we have not changed the handling of devices
(which is in the works, but will take time).
> 4. Based on kexec/kdump implementation, the kernel code needed is
> less.
Well, maybe.
Greetings,
Rafael
On Thu, 2007-10-11 at 12:17 +0200, Rafael J. Wysocki wrote:
> On Thursday, 11 October 2007 04:13, Huang, Ying wrote:
> > Kexec base hibernation has some potential advantages over uswsusp and
> > TuxOnIce (suspend2). Some most obvious advantages are:
>
> Well, I have some doubts as far as the obviousness is concerned.
OK, I will remove the "obvious".
> > 1. The hibernation image size can exceed half of memory size easily.
>
> This is also possible with TuxOnIce.
I will add a detailed description of this. It is possible with TuxOnIce,
and hard with u/swsusp.
>
> > 2. The hibernation image can be written to and read from almost
> > anywhere, such as USB disk, NFS.
>
> This is possible with uswsusp, at least in theory, probably with TuxOnIce too.
I will remove this.
> > 3. It is possible to eliminate freezer from kexec based hibernation
> > implementation.
>
> This isn't true as long as we have not changed the handling of devices
> (which is in the works, but will take time).
I know it has not been implemented yet. I am just saying that it is
possible for khibernation and almost impossible for u/swsusp and TuxOnIce.
> > 4. Based on kexec/kdump implementation, the kernel code needed is
> > less.
>
> Well, maybe.
>
Best Regards,
Huang Ying
On Friday, 12 October 2007 05:19, Huang, Ying wrote:
> On Thu, 2007-10-11 at 12:17 +0200, Rafael J. Wysocki wrote:
> > On Thursday, 11 October 2007 04:13, Huang, Ying wrote:
> > > Kexec base hibernation has some potential advantages over uswsusp and
> > > TuxOnIce (suspend2). Some most obvious advantages are:
> >
> > Well, I have some doubts as far as the obviousness is concerned.
>
> OK, I will remove the "obvious".
>
> > > 1. The hibernation image size can exceed half of memory size easily.
> >
> > This is also possible with TuxOnIce.
>
> I will add detail description about this. It is possible with TuxOnIce,
> and hard with u/swsusp.
Actually, possible with TuxOnIce, impossible with u/swsusp, at present.
> > > 2. The hibernation image can be written to and read from almost
> > > anywhere, such as USB disk, NFS.
> >
> > This is possible with uswsusp, at least in theory, probably with TuxOnIce too.
>
> I will remove this.
OK, thanks.
> > > 3. It is possible to eliminate freezer from kexec based hibernation
> > > implementation.
> >
> > This isn't true as long as we have not changed the handling of devices
> > (which is in the works, but will take time).
>
> I know it has not been implemented yet. I just say that it is possible
> for khibernation and it is almost impossible for u/swsusp and TuxOnIce.
When that's implemented, it may be possible to avoid using the freezer for
u/swsusp and TuxOnIce as well. Time will tell.
Greetings,
Rafael
Huang, Ying wrote:
> The hibernation procedure with the patch set is as follow:
>
> 1. Boot a kernel A
>
> 2. Work under kernel A
>
> 3. Kexec another kernel B (crash dump enabled) in kernel A.
>
> 4. Save the memory image of kernel A through crash dump (such as "cp
> /proc/vmcore ~"). Save the "jump back entry".
Doesn't this also save the memory of kernel B?
> 5. Shutdown or reboot
>
>
> The restore process with the patch set is as follow:
>
> 1. Boot a kernel C (crash dump enabled), the memory area used by
> kernel C must be a subset of memory area used by kernel B.
Why is a third kernel needed? Why can't kernel B be used for this as
well? In fact, if kernel A has been compiled to be relocatable and
crash dump enabled, why wouldn't it suffice for all 3 instances?
On Fri, 19 Oct 2007, Phillip Susi wrote:
> Huang, Ying wrote:
>>
>>
>> The restore process with the patch set is as follow:
>>
>> 1. Boot a kernel C (crash dump enabled), the memory area used by
>> kernel C must be a subset of memory area used by kernel B.
>
> Why is a third kernel needed? Why can't kernel B be used for this as well?
> In fact, if kernel A has been compiled to be relocatable and crash dump
> enabled, why wouldn't it suffice for all 3 instances?
you could use one kernel for all three, or you could use three different
kernels, and three different sets of userspace if it's appropriate.
David Lang
On 10/20/07, Phillip Susi <[email protected]> wrote:
> Huang, Ying wrote:
> > The hibernation procedure with the patch set is as follow:
> >
> > 1. Boot a kernel A
> >
> > 2. Work under kernel A
> >
> > 3. Kexec another kernel B (crash dump enabled) in kernel A.
> >
> > 4. Save the memory image of kernel A through crash dump (such as "cp
> > /proc/vmcore ~"). Save the "jump back entry".
>
> Doesn't this also save the memory of kernel B?
The memory area of kernel B is excluded from the elf header of
/proc/vmcore. This is done in kexec-tools (/sbin/kexec) patches.
> > 5. Shutdown or reboot
> >
> >
> > The restore process with the patch set is as follow:
> >
> > 1. Boot a kernel C (crash dump enabled), the memory area used by
> > kernel C must be a subset of memory area used by kernel B.
>
> Why is a third kernel needed? Why can't kernel B be used for this as
> well? In fact, if kernel A has been compiled to be relocatable and
> crash dump enabled, why wouldn't it suffice for all 3 instances?
One kernel can be used for all three situations. In fact, I use just one
kernel for testing.
Best Regards,
Huang Ying