2002-04-07 23:37:44

by Pavel Machek

Subject: Make swsusp actually work

Hi!

There were two bugs, and the linux/mm.h one took me a *very* long time to
find... Well, the bits used for the zone should have been marked. Plus I
hacked the ide_..._suspend code not to panic, and it now seems to
work. [Sorry, 2pm, have to get some sleep.]

Please apply,
Pavel

--- linux-ac.clean/drivers/ide/ide-disk.c Sun Apr 7 10:55:09 2002
+++ linux-swsusp.24/drivers/ide/ide-disk.c Mon Apr 8 01:22:06 2002
@@ -1567,7 +1567,7 @@
struct hwgroup_s *hwgroup = ide_hwifs[i].hwgroup;

if (!hwgroup) continue;
- hwgroup->handler = hwgroup->handler_save;
+ hwgroup->handler = NULL; /* hwgroup->handler_save; */
hwgroup->handler_save = NULL;
}
driver_blocked = 0;
@@ -1584,6 +1584,7 @@
if (hwgroup->handler != panic_box)
panic("Handler was not set to panic?");
hwgroup->handler_save = NULL;
+ hwgroup->handler = NULL;
}
driver_blocked = 0;
}
--- linux-ac.clean/include/linux/mm.h Sun Apr 7 10:55:12 2002
+++ linux-swsusp.24/include/linux/mm.h Mon Apr 8 01:04:06 2002
@@ -303,7 +303,9 @@
#define PG_arch_1 13
#define PG_reserved 14
#define PG_launder 15 /* written out by VM pressure.. */
-#define PG_nosave 29
+#define PG_nosave 16
+/* Don't you dare to use high bits, they seem to be used for something else! */
+

/* Make it prettier to test the above... */
#define UnlockPage(page) unlock_page(page)


--
(about SSSCA) "I don't say this lightly. However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa
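
A note on the PG_nosave change in the patch above. What follows is only a minimal sketch of why bit 29 could clash with the zone information. It assumes (the mail does not say so) that the 2.4-era kernel keeps the zone index in the top eight bits of page->flags on a 32-bit machine. The helper names bit_overlaps_zone() and zone_index32() are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Flag numbers taken from the patch. */
#define PG_NOSAVE_OLD 29
#define PG_NOSAVE_NEW 16

/* ASSUMPTION: the zone index occupies the top 8 bits of the flags word. */
#define ZONE_FIELD_BITS 8

/* Returns 1 if flag bit `bit` falls inside the assumed zone field. */
static int bit_overlaps_zone(int bit, int bits_per_long)
{
    assert(bit >= 0 && bit < bits_per_long);
    return bit >= bits_per_long - ZONE_FIELD_BITS;
}

/* Reads the zone index back out of a 32-bit flags word. */
static uint32_t zone_index32(uint32_t flags)
{
    return flags >> (32 - ZONE_FIELD_BITS);
}
```

Under that assumption, setting the old bit 29 on a page changes the value read back as its zone, while bit 16 leaves the zone field alone.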


2002-04-08 07:52:04

by Brian Litzinger

Subject: Re: Make swsusp actually work better

On Mon, Apr 08, 2002 at 01:37:26AM +0200, Pavel Machek wrote:
> Hi!
>
> There were two bugs, and linux/mm.h one took me *very* long to
> find... Well, those bits used for zone should have been marked. Plus I
> hack ide_..._suspend code not to panic, and it now seems to
> work. [Sorry, 2pm, have to get some sleep.]

I can suspend without oopses. Yeh!

However, during the boot '2419p5a3 resume=/dev/hda6' it oopses right
after saying a couple of things about not being able to determine
blocksize. I'll photograph the repeatable oops and get it to you
when I have access to my camera again. Probably in the next
24 hours.

Do I need APM compiled into the kernel?


> (about SSSCA) "I don't say this lightly. However, I really think that
> the U.S. no longer is classifiable as a democracy, but rather as a
> plutocracy." --hpa

The US was never a democracy. It was a constitutional republic.
Be that as it may, Thomas Hobbes wrote that there are only three
kinds of governments: Monarchies, Democracies, and Aristocracies.
All the rest of kinds and types are just versions of those already
mentioned. The US is not a Democracy by any definition
so clearly I cannot choose the glass in front of me...

While the country may aspire to be a constitutional republic,
it seems to operate by the golden rule mixed with a bit of
thuggery, so clearly I cannot choose the glass in front of you...
Are there countries that operate differently?

--
Brian Litzinger <[email protected]>

Copyright (c) 2002 By Brian Litzinger, All Rights Reserved

2002-04-08 08:25:58

by Ed Sweetman

Subject: Re: Make swsusp actually work

It tries killing init on resume for me and panics. The actual suspend
worked though (apparently). I attached the ksymoops output for the
panic.

On Sun, 2002-04-07 at 19:37, Pavel Machek wrote:
> Hi!
>
> There were two bugs, and linux/mm.h one took me *very* long to
> find... Well, those bits used for zone should have been marked. Plus I
> hack ide_..._suspend code not to panic, and it now seems to
> work. [Sorry, 2pm, have to get some sleep.]
>


Attachments:
ksymoops.out (2.39 kB)

2002-04-08 10:12:57

by Pavel Machek

Subject: Re: Make swsusp actually work better

Hi!

> > There were two bugs, and linux/mm.h one took me *very* long to
> > find... Well, those bits used for zone should have been marked. Plus I
> > hack ide_..._suspend code not to panic, and it now seems to
> > work. [Sorry, 2pm, have to get some sleep.]
>
> I can suspend without oopses. Yeh!
>
> However, during the boot '2419p5a3 resume=/dev/hda6' it oopses right
> after saying a couple of things about not being able to determine
> blocksize. I'll photograph the repeatable oops and get it to you
> when I have access to my camera again. Probably in the next
> 24 hours.

I mailed two patches to the list in the last two days. The first one
should fix this.

> > (about SSSCA) "I don't say this lightly. However, I really think that
> > the U.S. no longer is classifiable as a democracy, but rather as a
> > plutocracy." --hpa
>
> The US was never a democracy. It was a constitutional republic.

I think you can have a democracy and a constitutional republic at the same
time, no?
Pavel
--
Casualties in World Trade Center: ~3k dead inside the building,
cryptography in U.S.A. and free speech in Czech Republic.

2002-04-08 10:25:23

by Pavel Machek

Subject: Re: Make swsusp actually work

Hi!

> It tries killing init on resume for me and panics. the actual suspend
> worked though (apparently). I attached the ksymoops output to the
> panic.

You need my previous patch too. Here are both, combined.

Pavel

--- linux-ac.clean/Documentation/swsusp.txt Tue Jan 29 12:16:13 2002
+++ linux-swsusp.24/Documentation/swsusp.txt Sun Apr 7 11:42:03 2002
@@ -0,0 +1,168 @@
+From kernel/suspend.c:
+
+ * BIG FAT WARNING *********************************************************
+ *
+ * If you have unsupported (*) devices using DMA...
+ * ...say goodbye to your data.
+ *
+ * If you touch anything on disk between suspend and resume...
+ * ...kiss your data goodbye.
+ *
+ * If your disk driver does not support suspend... (IDE does)
+ * ...you'd better find out how to get along
+ * without your data.
+ *
+ * (*) pm interface support is needed to make it safe.
+
+You need to append resume=/dev/your_swap_partition to kernel command
+line. Then you suspend by echo 4 > /proc/acpi/sleep.
+
+[Notice. Rest docs is pretty outdated (see date!) It should be safe to
+use swsusp on ext3/reiserfs these days.]
+
+
+Article about goals and implementation of Software Suspend for Linux
+Author: Gábor Kuti
+Last revised: Jan 23rd, 2000
+
+Idea and goals to achieve
+
+Nowadays it is common in several laptops that they have a suspend button. It
+saves the state of the machine to a filesystem or to a partition and switches
+to standby mode. Later resuming the machine the saved state is loaded back to
+ram and the machine can continue its work. It has two real benefits. First we
+save ourselves the time machine goes down and later boots up, energy costs
+real high when running from batteries. The other gain is that we don't have to
+interrupt our programs so processes that are calculating something for a long
+time shouldn't need to be written interruptible.
+
+On desk machines the power saving function isn't as important as it is in
+laptops but we really may benefit from the second one. Nowadays the number of
+desk machines supporting suspend function in their APM is going up but there
+are (and there will still be for a long time) machines that don't even support
+APM of any kind. On the other hand it is reported that using APM's suspend
+some irqs (e.g. ATA disk irq) is lost and it is annoying for the user until
+the Linux kernel resets the device.
+
+So I started thinking about implementing Software Suspend which doesn't need
+any APM support and - since it uses pretty near only high-level routines - is
+supposed to be architecture independent code.
+
+Using the code
+
+The code is experimental right now - testers, extra eyes are welcome. To
+compile this support into the kernel, you need CONFIG_EXPERIMENTAL,
+and then CONFIG_SOFTWARE_SUSPEND in menu General Setup to be enabled. It
+cannot be used as a module and I don't think it will ever be needed.
+
+You have two ways to use this code. The first one is if you've compiled in
+sysrq support then you may press Sysrq-D to request suspend. The other way
+is with a patched SysVinit (my patch is against 2.76 and available at my
+home page). You might call 'swsusp' or 'shutdown -z <time>'.
+
+Either way it saves the state of the machine into active swaps and then
+reboots. By the next booting the kernel's resuming function is either triggered
+by swapon -a (which is ought to be in the very early stage of booting) or you
+may explicitly specify the swap partition/file to resume from with ``resume=''
+kernel option. If signature is found it loads and restores saved state. If the
+option ``noresume'' is specified as a boot parameter, it skips the resuming.
+Warning! Look at section ``Things to implement'' to see what isn't yet
+implemented. Also I strongly suggest you to list all active swaps in
+/etc/fstab. Firstly because you don't have to specify anything to resume and
+secondly if you have more than one swap area you can't decide which one has the
+'root' signature. Also, it doesn't yet support ``embedded'' swapfiles, e.g. you
+have a dos partition mounted on /dos and also have a /dos/swapfile named
+swapfile.. I plan to add support to mount all necessary filesystems readonly.
+
+In the meantime while the system is suspended you should not touch any of the
+hardware!
+
+About the code
+Goals reached
+
+The code can be downloaded from
+http://falcon.sch.bme.hu/~seasons/linux/. It mainly works but there are still
+some of XXXs, TODOs, FIXMEs in the code which seem not to be too important. It
+should work all right except for the problems listed in ``Things to
+implement''. Notes about the code are really welcome.
+
+How the code works
+
+When suspending is triggered it immediately wakes up process bdflush. Bdflush
+checks whether we have anything in our run queue tq_bdflush. Since we queued up
+function do_software_suspend, it is called. Here we shrink everything including
+dcache, inodes, buffers and memory (here mainly processes are swapped out). We
+count how many pages we need to duplicate (we have to be atomical!) then we
+create an appropriately sized page directory. It will point to the original and
+the new (copied) address of the page. We get the free pages by
+__get_free_pages() but since it changes state we have to be able to track it
+later so it also flips in a bit in page's flags (a new Nosave flag). We
+duplicate pages and then mark them as used (so atomicity is ensured). After
+this we write out the image to swaps, do another sync and the machine may
+reboot. We also save registers to stack.
+
+By resuming an ``inverse'' method is executed. The image if exists is loaded,
+loading is either triggered by ``swapon -a'' or ``resume='' kernel option. We
+change our task to bdflush (it is needed because if we don't do this init does
+an oops when it is waken up later) and then pages are copied back to their
+original location. We restore registers, free previously allocated memory,
+activate memory context and task information. Here we should restore hardware
+state but even without this the machine is restored and processes are continued
+to work. I think hardware state should be restored by some list (using
+notify_chain) and probably by some userland program (run-parts?) for users'
+pleasure. Check out my patch at the same location for the sysvinit patch.
+
+WARNINGS!
+- Using swsusp on logging fs (e.g.reiserfs) could lead to pretty bad things:
+ mounting of reiserfs (even readonly!) means log is replayed, which
+ means writes to filesystem. That could lead to pretty bad corruption.
+- It does not like pcmcia cards. And this is logical: pcmcia cards need cardmgr to be
+ initialized. they are not initialized during singleuser boot, but "resumed" kernel does
+ expect them to be initialized. That leads to armageddon. You should eject any pcmcia cards
+ before suspending.
+
+Things to implement
+- SMP support. I've done an SMP support but since I don't have access to a kind
+ of this one I cannot test it. Please SMP people test it. .. Tested it,
+ doesn't work. Had no time to figure out why. There is some mess with
+ interrupts AFAIK..
+- We should only make a copy of data related to kernel segment, since any
+ process data won't be changed.
+- Should \#define ADDRESS go into linux/mm.h?
+- By copying pages back to their original position, copy_page caused General
+ Protection Fault. Why?
+- Hardware state restoring. Now there's support for notifying via the notify
+ chain, event handlers are welcome. Some devices may have microcodes loaded
+ into them. We should have event handlers for them as well.
+- We should support other architectures (There are really only some arch
+ related functions..)
+- We should also restore original state of swaps if the ``noresume'' kernel
+ option is specified.. Or do we need such a feature to save state for some
+ other time? Do we need some kind of ``several saved states''? (Linux-HA
+ people?). There's been some discussion about checkpointing on linux-future.
+- Should make more sanity checks. Or are these enough?
+
+Not so important ideas for implementing
+
+- Should SIGSTOP_SUSPEND called SIGFREEZE? Or is it some kind of stealing
+ for Solaris?
+- If a real time process is running then don't suspend the machine.
+- Is there any sense in compressing the outwritten pages?
+- Support for power.conf file as in Solaris, autoshutdown, special
+ devicetypes support, maybe in sysctl.
+- Introduce timeout for SMP locking. But first locking ought to work :O
+- Pre-detect if we don't have enough swap space or free it instead of
+ calling panic.
+- Support for adding/removing hardware while suspended?
+- We should not free pages at the beginning so aggressively, most of them
+ go there anyway..
+- If X is active while suspending then by resuming calling svgatextmode
+ corrupts the virtual console of X.. (Maybe this has been fixed AFAIK).
+
+Any other idea you might have tell me!
+
+Contacting the author
+If you have any question or any patch that solves the above or detected
+problems please contact me at [email protected]. I might delay
+answering, sorry about that.
+
--- linux-ac.clean/drivers/ide/ide-disk.c Sun Apr 7 10:55:09 2002
+++ linux-swsusp.24/drivers/ide/ide-disk.c Mon Apr 8 01:22:06 2002
@@ -1567,7 +1567,7 @@
struct hwgroup_s *hwgroup = ide_hwifs[i].hwgroup;

if (!hwgroup) continue;
- hwgroup->handler = hwgroup->handler_save;
+ hwgroup->handler = NULL; /* hwgroup->handler_save; */
hwgroup->handler_save = NULL;
}
driver_blocked = 0;
@@ -1584,6 +1584,7 @@
if (hwgroup->handler != panic_box)
panic("Handler was not set to panic?");
hwgroup->handler_save = NULL;
+ hwgroup->handler = NULL;
}
driver_blocked = 0;
}
--- linux-ac.clean/include/linux/mm.h Sun Apr 7 10:55:12 2002
+++ linux-swsusp.24/include/linux/mm.h Mon Apr 8 01:04:06 2002
@@ -303,7 +303,9 @@
#define PG_arch_1 13
#define PG_reserved 14
#define PG_launder 15 /* written out by VM pressure.. */
-#define PG_nosave 29
+#define PG_nosave 16
+/* Don't you dare to use high bits, they seem to be used for something else! */
+

/* Make it prettier to test the above... */
#define UnlockPage(page) unlock_page(page)
--- linux-ac.clean/init/main.c Sun Apr 7 10:55:13 2002
+++ linux-swsusp.24/init/main.c Sun Apr 7 11:54:39 2002
@@ -522,15 +522,15 @@
tc_init();
#endif

- /* This has to be before mounting root, because even readonly mount of reiserfs would replay
- log corrupting stuff */
- software_resume();
-
/* Networking initialization needs a process context */
sock_init();

start_context_thread();
do_initcalls();
+
+ /* This has to be before mounting root, because even readonly mount of reiserfs would replay
+ log corrupting stuff */
+ software_resume();

#ifdef CONFIG_IRDA
irda_proto_init();
--- linux-ac.clean/kernel/suspend.c Sun Apr 7 10:55:13 2002
+++ linux-swsusp.24/kernel/suspend.c Mon Apr 8 00:51:02 2002
@@ -32,20 +32,6 @@
* More state savers are welcome. Especially for the scsi layer...
*
* For TODOs,FIXMEs also look in Documentation/swsusp.txt
- *
- * BIG FAT WARNING *********************************************************
- *
- * If you have unsupported (*) devices using DMA...
- * ...say goodbye to your data.
- *
- * If you touch anything on disk between suspend and resume...
- * ...kiss your data goodbye.
- *
- * If your disk driver does not support suspend... (IDE does)
- * ...you'd better find out how to get along
- * without your data.
- *
- * (*) pm interface support is needed to make it safe.
*/

/*



--
(about SSSCA) "I don't say this lightly. However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa

2002-04-08 14:21:28

by Ed Sweetman

Subject: Re: Make swsusp actually work

The documentation suggests that you do not need to specify resume= . Is
this only true if you have the sysvinit patch in use? Is swapon -a
simply a matter of making it the first init script executed by init?

2002-04-08 15:00:06

by Pavel Machek

Subject: Re: Make swsusp actually work

Hi!

> the documentation suggests that you do not need to specify resume= . Is
> this only true if you have the sysvinit patch in use? Is swapon -a

Then the docs are wrong.
Pavel
--
Casualties in World Trade Center: ~3k dead inside the building,
cryptography in U.S.A. and free speech in Czech Republic.

2002-04-08 15:18:18

by Ed Sweetman

Subject: Re: Make swsusp actually work

On Mon, 2002-04-08 at 11:00, Pavel Machek wrote:
> Hi!
>
> > the documentation suggests that you do not need to specify resume= . Is
> > this only true if you have the sysvinit patch in use? Is swapon -a
>
> Then docs is wrong.
> Pavel
Then you need to change this from your previous "all in one" patch.

from the swsusp.txt in "Using the code" 3rd paragraph
<
Either way it saves the state of the machine into active swaps and then
reboots. By the next booting the kernel's resuming function is either
triggered by swapon -a (which is ought to be in the very early stage of
booting) or you may explicitly specify the swap partition/file to resume
from with ``resume='' kernel option.
>
This seems to suggest that you have a choice. Which, last time I checked,
you don't.


From swsusp.txt, "How the code works", 2nd paragraph:
Same thing as before, basically. swapon -a does not trigger a resume.

Under WARNINGS!:
Ext3 fs seems to show no more of a risk than a non-journalling fs.
Perhaps that problem is reiserfs only?


Also, "Using the code" mentions listing the swap files in fstab, but
it does not say how they must be loaded or that they must be actual raw
partitions. Files of course would not work as viable swaps for resume,
because the fs would have to be mounted to load them.

And one more thing. What happens when you have multiple swap files, all
of equal priority (normal swap conditions have a striping effect, like
raid)? Will swsusp choose one? How do we know which one it chose? Is
it just the first one in /proc/swaps all the time? That kind of
behavior would be nice to document in swsusp.txt.

Thanks for the patches. It seems to work on my non-X box (p4) just
fine. I'll have to risk disaster and try it out on a dri X session soon,
since that's where the convenience would come into play.

2002-04-08 17:49:55

by Eric W. Biederman

Subject: Re: Make swsusp actually work better

Pavel Machek <[email protected]> writes:

> Hi!
>
> > > There were two bugs, and linux/mm.h one took me *very* long to
> > > find... Well, those bits used for zone should have been marked. Plus I
> > > hack ide_..._suspend code not to panic, and it now seems to
> > > work. [Sorry, 2pm, have to get some sleep.]
> >
> > I can suspend without oopses. Yeh!
> >
> > However, during the boot '2419p5a3 resume=/dev/hda6' it oopses right
> > after saying a couple of things about not being able to determine
> > blocksize. I'll photograph the repeatable oops and get it to you
> > when I have access to my camera again. Probably in the next
> > 24 hours.
>
> I mailed two patches to the list in last two days. The first one
> should fix this.
>
> > > (about SSSCA) "I don't say this lightly. However, I really think that
> > > the U.S. no longer is classifiable as a democracy, but rather as a
> > > plutocracy." --hpa
> >
> > The US was never a democracy. It was a constitutional republic.
>
> I think you can have democracy and constitutional republic at same
> time, no?

In a technical sense the difference is when a vote is taken to
pass/not pass a law. In a republic your representative votes for
you. In a democracy every citizen in the whole nation votes.

Eric

2002-04-08 20:51:10

by Pavel Machek

Subject: Re: Make swsusp actually work

Hi!

> > > the documentation suggests that you do not need to specify resume= . Is
> > > this only true if you have the sysvinit patch in use? Is swapon -a
> >
> > Then docs is wrong.
> > Pavel
> Then you need to change this from your previous "all in one" patch.
>
> from the swsusp.txt in "Using the code" 3rd paragraph
> <
> Either way it saves the state of the machine into active swaps and then
> reboots. By the next booting the kernel's resuming function is either
> triggered by swapon -a (which is ought to be in the very early stage of
> booting) or you may explicitly specify the swap partition/file to resume
> from with ``resume='' kernel option.
> >
> This seems to suggest that you have a choice. Which last time i checked,
> you dont.

Fixed, thanx.

> from the swsusp.txt in "How the code works" 2nd paragraph
> Same thing as before basically. swapon -a does not trigger a resume
>
> under warnings!
> Ext3 fs seems to show no more of a risk than a non-journalling fs.
> perhaps that problem is reiserfs only?

The warning about journalling filesystems is obsolete. It should now work
okay with journalled filesystems.

> Also, mention of the swap files being described in fstab is made in
> "Using the code" but no mention is made to how they must be loaded and
> must be actual raw partitions. Files of course would not work as viable
> swaps for resume because the fs would have to be mounted to load them.

Killed the relevant part.


> and one more thing. What happens when you have multiple swap files all
> of equal priority (normal swap conditions have a striping effect (like
> raid)) Will swsusp choose one ? How do we know which one it chose? Is
> it just the first one in /proc/swaps all the time? That kind of
> behavior would be nice to document in swsusp.txt.

Just don't use multiple swap partitions for now.

> Thanks for the patches. it seems to work on my non X box (p4) just
> fine. I'll have to risk disaster and try it out on a dri X session soon
> since that's where the convenience would come into play.

Switch to text console and do suspend there. That should work.

Pavel

--
Casualties in World Trade Center: ~3k dead inside the building,
cryptography in U.S.A. and free speech in Czech Republic.

2002-04-08 21:05:14

by Pavel Machek

Subject: Re: Make swsusp actually work

Hi!

> > > the documentation suggests that you do not need to specify resume= . Is
> > > this only true if you have the sysvinit patch in use? Is swapon -a
> >
> > Then docs is wrong.

> Then you need to change this from your previous "all in one" patch.

Okay, here are the updated docs. Any more comments?

[Alan, either copy this over your Documentation/swsusp.txt, or just
ignore the issue and I'll fix it in -ac5.]
Pavel

From kernel/suspend.c:

* BIG FAT WARNING *********************************************************
*
* If you have unsupported (*) devices using DMA...
* ...say goodbye to your data.
*
* If you touch anything on disk between suspend and resume...
* ...kiss your data goodbye.
*
* If your disk driver does not support suspend... (IDE does)
* ...you'd better find out how to get along
* without your data.
*
* (*) pm interface support is needed to make it safe.

You need to append resume=/dev/your_swap_partition to the kernel command
line. Then you suspend with echo 4 > /proc/acpi/sleep.
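
For a program that wants to do the same thing as the echo above, a minimal sketch could look like this. The function name request_sleep_state() is hypothetical, and the path is passed in as a parameter so the helper can be tried against an ordinary file first. Pointing it at /proc/acpi/sleep with state 4 is assumed to behave like the shell command and needs root.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write a sleep state number followed by a newline to `path`,
 * the way `echo 4 > path` would. Returns 0 on success, -1 on error.
 * Hypothetical helper, not part of the swsusp patch.
 */
static int request_sleep_state(const char *path, int state)
{
    FILE *f;
    int ok;

    assert(path != NULL);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    ok = fprintf(f, "%d\n", state) > 0;
    if (fclose(f) != 0)   /* the write may only fail at close time */
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling request_sleep_state("/proc/acpi/sleep", 4) would then be the equivalent of the command above. All the warnings at the top of this file apply.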

[Notice: the rest of the docs is pretty outdated (see date!). It should be safe to
use swsusp on ext3/reiserfs these days.]


Article about goals and implementation of Software Suspend for Linux
Author: Gábor Kuti
Last revised: 2002-04-08

Idea and goals to achieve

Nowadays many laptops have a suspend button. It
saves the state of the machine to a filesystem or to a partition and switches
to standby mode. When the machine is later resumed, the saved state is loaded
back into RAM and the machine can continue its work. This has two real benefits.
First, we save the time it takes the machine to shut down and boot up again, and
energy is costly when running from batteries. Second, we don't have to
interrupt our programs, so processes that calculate something for a long
time don't need to be written to be interruptible.

On desktop machines the power saving function isn't as important as it is on
laptops, but we really may benefit from the second one. The number of
desktop machines supporting a suspend function in their APM is going up, but
there are (and will still be for a long time) machines that don't support
APM of any kind. On the other hand, it is reported that with APM's suspend
some irqs (e.g. the ATA disk irq) are lost, which is annoying for the user until
the Linux kernel resets the device.

So I started thinking about implementing Software Suspend, which doesn't need
any APM support and, since it uses almost only high-level routines, is
supposed to be architecture independent code.

Using the code

The code is experimental right now; testers and extra eyes are welcome. To
compile this support into the kernel, you need to enable CONFIG_EXPERIMENTAL
and then CONFIG_SOFTWARE_SUSPEND in the General Setup menu. It
cannot be built as a module and I don't think that will ever be needed.

You have three ways to use this code. The first one is if you've compiled in
sysrq support then you may press Sysrq-D to request suspend. The second way
is with a patched SysVinit (my patch is against 2.76 and available at my
home page). You might call 'swsusp' or 'shutdown -z <time>'. The third way is to
echo 4 > /proc/acpi/sleep.

Either way it saves the state of the machine into active swaps and then
reboots. You must explicitly specify the swap partition to resume from with the ``resume=''
kernel option. If a signature is found, it loads and restores the saved state. If the
option ``noresume'' is specified as a boot parameter, it skips resuming.
Warning! Look at section ``Things to implement'' to see what isn't yet
implemented. Also I strongly suggest you to list all active swaps in
/etc/fstab. Firstly because you don't have to specify anything to resume and
secondly if you have more than one swap area you can't decide which one has the
'root' signature.
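
The interplay of the two boot options described above can be sketched in plain userspace C. This is not how the kernel parses its command line. The struct resume_opts and both function names are made up for illustration, and the sketch only models the rule that a resume is attempted when ``resume='' names a device and ``noresume'' is absent.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the two options discussed above. */
struct resume_opts {
    int  noresume;     /* 1 if ``noresume'' was given */
    char device[64];   /* value of ``resume='', empty string if absent */
};

/* Scan a space-separated command line for resume= and noresume. */
static void parse_resume_opts(const char *cmdline, struct resume_opts *out)
{
    char buf[256];
    char *tok;

    assert(cmdline != NULL && out != NULL);
    out->noresume = 0;
    out->device[0] = '\0';

    /* Work on a copy, because strtok() modifies its input. */
    snprintf(buf, sizeof(buf), "%s", cmdline);
    for (tok = strtok(buf, " "); tok != NULL; tok = strtok(NULL, " ")) {
        if (strcmp(tok, "noresume") == 0)
            out->noresume = 1;
        else if (strncmp(tok, "resume=", 7) == 0)
            snprintf(out->device, sizeof(out->device), "%s", tok + 7);
    }
}

/* A resume is attempted only with a device and without noresume. */
static int should_resume(const struct resume_opts *o)
{
    return !o->noresume && o->device[0] != '\0';
}
```

With "root=/dev/hda1 resume=/dev/hda6" the sketch decides to try a resume from /dev/hda6. Adding noresume, or leaving out resume=, makes it skip the resume.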

In the meantime while the system is suspended you should not touch any of the
hardware!

About the code
Goals reached

The code can be downloaded from
http://falcon.sch.bme.hu/~seasons/linux/. It mainly works but there are still
some of XXXs, TODOs, FIXMEs in the code which seem not to be too important. It
should work all right except for the problems listed in ``Things to
implement''. Notes about the code are really welcome.

How the code works

When suspending is triggered it immediately wakes up the bdflush process. Bdflush
checks whether we have anything in our run queue tq_bdflush. Since we queued up
the function do_software_suspend, it is called. Here we shrink everything including
dcache, inodes, buffers and memory (here mainly processes are swapped out). We
count how many pages we need to duplicate (we have to be atomic!), then we
create an appropriately sized page directory. It will point to the original and
the new (copied) address of each page. We get the free pages with
__get_free_pages(), but since that changes state we have to be able to track it
later, so it also sets a bit in the page's flags (a new Nosave flag). We
duplicate the pages and then mark them as used (so atomicity is ensured). After
this we write out the image to swap, do another sync and the machine may
reboot. We also save registers to the stack.

On resume an ``inverse'' method is executed. The image, if it exists, is loaded;
loading is triggered by the ``resume='' kernel option. We
change our task to bdflush (this is needed because otherwise init
oopses when it is woken up later) and then pages are copied back to their
original location. We restore registers, free previously allocated memory,
activate memory context and task information. Here we should restore hardware
state, but even without this the machine is restored and processes continue
to work. I think hardware state should be restored by some list (using
notify_chain) and probably by some userland program (run-parts?) for users'
pleasure. Check out my patch at the same location for the sysvinit patch.
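
The page directory idea from the two paragraphs above can be sketched in userspace C. This is only a toy model under stated assumptions. struct sketch_page, its in_use and nosave fields, snapshot() and restore() are all hypothetical names, malloc() stands in for __get_free_pages(), and nothing here is taken from kernel/suspend.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 64   /* tiny "pages" keep the example small */

struct sketch_page {
    unsigned char data[SKETCH_PAGE_SIZE];
    int in_use;   /* page holds state that must be saved */
    int nosave;   /* stand-in for the Nosave flag: never copied */
};

/* One directory entry: where a page lives and where its copy is. */
struct pagedir_entry {
    struct sketch_page *orig;
    struct sketch_page *copy;
};

/* Count pages to duplicate, size the directory, then copy each page. */
static struct pagedir_entry *snapshot(struct sketch_page *mem, size_t n,
                                      size_t *out_count)
{
    size_t i, k = 0, count = 0;
    struct pagedir_entry *dir;

    assert(mem != NULL && out_count != NULL);
    for (i = 0; i < n; i++)
        if (mem[i].in_use && !mem[i].nosave)
            count++;

    dir = calloc(count ? count : 1, sizeof(*dir));
    if (dir == NULL)
        return NULL;

    for (i = 0; i < n; i++) {
        if (!mem[i].in_use || mem[i].nosave)
            continue;
        dir[k].copy = malloc(sizeof(struct sketch_page));
        if (dir[k].copy == NULL) {        /* undo partial work */
            while (k--)
                free(dir[k].copy);
            free(dir);
            return NULL;
        }
        dir[k].orig = &mem[i];
        memcpy(dir[k].copy, &mem[i], sizeof(struct sketch_page));
        dir[k].copy->nosave = 1;          /* the copy itself is not saved */
        k++;
    }
    *out_count = count;
    return dir;
}

/* Inverse step: copy page contents back, then free copies and directory. */
static void restore(struct pagedir_entry *dir, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        memcpy(dir[i].orig->data, dir[i].copy->data, SKETCH_PAGE_SIZE);
        free(dir[i].copy);
    }
    free(dir);
}
```

In this model, whatever happens to the original pages after snapshot() is undone by restore(), and pages marked nosave are never touched.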

WARNINGS!
- It does not like pcmcia cards. And this is logical: pcmcia cards need cardmgr to be
initialized. They are not initialized during singleuser boot, but the "resumed" kernel does
expect them to be initialized. That leads to armageddon. You should eject any pcmcia cards
before suspending.

Things to implement
- SMP support. I've written SMP support, but since I don't have access to such
a machine I cannot test it. Please, SMP people, test it. .. Tested it,
doesn't work. Had no time to figure out why. There is some mess with
interrupts AFAIK..
- We should only make a copy of data related to kernel segment, since any
process data won't be changed.
- By copying pages back to their original position, copy_page caused General
Protection Fault. Why?
- Hardware state restoring. Now there's support for notifying via the notify
chain, event handlers are welcome. Some devices may have microcode loaded
into them. We should have event handlers for them as well.
- We should support other architectures (There are really only some arch
related functions..)
- We should also restore original state of swaps if the ``noresume'' kernel
option is specified.. Or do we need such a feature to save state for some
other time? Do we need some kind of ``several saved states''? (Linux-HA
people?). There's been some discussion about checkpointing on linux-future.
- Should make more sanity checks. Or are these enough?

Not so important ideas for implementing

- If a real time process is running then don't suspend the machine.
- Is there any sense in compressing the outwritten pages?
- Support for power.conf file as in Solaris, autoshutdown, special
devicetypes support, maybe in sysctl.
- Introduce timeout for SMP locking. But first locking ought to work :O
- Pre-detect if we don't have enough swap space or free it instead of
calling panic.
- Support for adding/removing hardware while suspended?
- We should not free pages at the beginning so aggressively, most of them
go there anyway..
- If X is active while suspending then by resuming calling svgatextmode
corrupts the virtual console of X.. (Maybe this has been fixed AFAIK).

If you have any other ideas, tell me!

Contacting the author
If you have any question or any patch that solves the above or detected
problems please contact me at [email protected]. I might delay
answering, sorry about that.


--
(about SSSCA) "I don't say this lightly. However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa

2002-04-08 21:20:22

by Brian Litzinger

Subject: Re: Make swsusp actually work

On Mon, Apr 08, 2002 at 01:37:26AM +0200, Pavel Machek wrote:
> Hi!
>
> There were two bugs, and linux/mm.h one took me *very* long to
> find... Well, those bits used for zone should have been marked. Plus I
> hack ide_..._suspend code not to panic, and it now seems to
> work. [Sorry, 2pm, have to get some sleep.]

I've applied both this patch and the earlier one, and now my
2.4.19-pre5-ac3 system can suspend and it can resume. However,
when it resumed, I was stuck in the kernel SysRq function.

Couldn't get out of it.

And nothing seemed to work, other than it kept displaying the
help each time I touched a key.

On the other hand, the swsusp in 2.4.18-WOLK3.3 works correctly.

--
Brian Litzinger <[email protected]>

Copyright (c) 2002 By Brian Litzinger, All Rights Reserved

2002-04-08 21:27:37

by Ed Sweetman

Subject: Re: Make swsusp actually work

On Mon, 2002-04-08 at 17:15, [email protected] wrote:
> On Mon, Apr 08, 2002 at 01:37:26AM +0200, Pavel Machek wrote:
> > Hi!
> >
> > There were two bugs, and linux/mm.h one took me *very* long to
> > find... Well, those bits used for zone should have been marked. Plus I
> > hack ide_..._suspend code not to panic, and it now seems to
> > work. [Sorry, 2pm, have to get some sleep.]
>
> I've applied both this patch and the earlier one, and now my
> 2.4.19-pre5-ac3 system can suspend and it can resume. However,
> when it resumed, I was stuck in the kernel SysRq function.
>
> Couldn't get out of it.

Press Alt-SysRq again and then Enter.

> And nothing seemed to work, other than it kept displaying the
> help each time I touched a key.
>
> On the other hand, the swsusp in 2.4.18-WOLK3.3 works correctly.
>


2002-04-08 21:32:10

by Pavel Machek

Subject: Re: Make swsusp actually work

Hi!
> >
> > There were two bugs, and linux/mm.h one took me *very* long to
> > find... Well, those bits used for zone should have been marked. Plus I
> > hack ide_..._suspend code not to panic, and it now seems to
> > work. [Sorry, 2pm, have to get some sleep.]
>
> I've applied both this patch and the earlier one, and now my
> 2.4.19-pre5-ac3 system can suspend and it can resume. However,
> when it resumed, I was stuck in the kernel SysRq function.
>
> Couldn't get out of it.

You should be able to simply press sysrq again.

> And nothing seemed to work, other than it kept displaying the
> help each time I touched a key.

> On the other hand, the swsusp in 2.4.18-WOLK3.3 works correctly.

There should be nothing changed in that area. Perhaps it's just timing?

Pavel
--
Casualties in World Trade Center: ~3k dead inside the building,
cryptography in U.S.A. and free speech in Czech Republic.