2001-03-19 19:48:03

by Otto Wyss

Subject: Linux should better cope with power failure

Lately I had a USB failure, leaving me without any access to my system
since I only use a USB keyboard and mouse. All I could do in that
situation was switch the power off and on after a few minutes of
inactivity. From the impression I got during the following startup, I
assume Linux (2.4.2, ext2 filesystem) is not well suited to a power
failure or to being switched off manually, not even when there is no
activity going on.

Shouldn't a good system always try to be on the safe side? Shouldn't
Linux try to be more fail-safe? Much work is currently being done on
high performance during high activity, but it seems no work at all is
being done on keeping the system safe during low or no activity. I
think this is a major drawback and should be addressed as fast as
possible. Bringing a system to a safe state should always have high priority.

How could this be accomplished?
1. Flush any dirty cache pages as soon as possible. There should not be any
dirty cache left after a certain amount of idle time.
2. Keep open files in a state where it doesn't matter if they were
improperly closed (if possible).
3. Swap must not contain anything which can't be discarded. Otherwise
swap has to be treated as ordinary disk space.

These actions are not filesystem dependent. Certain filesystems may
cope better with power failure than others, but it is still much
better not to have errors than to fix them.

Don't we tell children never to go close to an abyss, and don't
alpinists have a saying, "never go to the limits"? So why is this simple rule
always broken with computers?

O. Wyss


2001-03-19 23:11:29

by Ben Ford

Subject: Re: Linux should better cope with power failure

> Actually, I think /etc/mtab is not needed at all. Originally, UNIX
> used to put as much onto the disk (and not in "core") as possible.
> so much state information related only to one boot-cycle was
> taken out of kernel and stored on disk. /var/run/utmp, /etc/mtab,
> , rmtab, and many others. all are invalidated by a reboot, and are yet
> stored
> in non-volatile storage. kernel memory is not swappable, so they manually
> separated out the minimum needed in core.
>
> Linux currently has a lot of this info in core, and maintains the disk files
> for backwards compatibility. in the case of /etc/mtab, I believe
> /proc/mounts
> has the same info. It appears to be in the same format as /etc/mtab,
> so most of the groundwork has already been done.
> i've considered trying just changing /etc/mtab to /proc/mounts in some
> utilities, to remove the need for read-write root. This (and other cases)
> would guarantee consistency (look at /etc/mtab after restart in single
> user more - ugh)

It has been suggested to ln -sf /proc/mounts /etc/mtab. Linus has said this, I
believe.

-b


2001-03-19 20:00:23

by Charles Cazabon

Subject: Re: Linux should better cope with power failure

Otto Wyss wrote:
> Lately I had an USB failure, leaving me without any access to my system
> since I only use an USB-keyboard/-mouse. All I could do in that
> situation was switching power off and on after a few minutes of
> inactivity. From the impression I got during the following startup, I
> assume Linux (2.4.2, EXT2-filesystem) is not very suited to any power
> failiure or manually switching it off. Not even if there wasn't any
> activity going on.

You're not using the filesystem the way you should, if you expect to be
able to kill the power and not lose data.

> How could this be accomplished:
> 1. Flush any dirty cache pages as soon as possible. There may not be any
> dirty cache after a certain amount of idle time.

Mount the filesystem synchronously if you want this.

> 2. Keep open files in a state where it doesn't matter if they where
> improperly closed (if possible).

Mount the filesystem read-only if you want this.

> 3. Swap may not contain anything which can't be discarded. Otherwise
> swap has to be treated as ordinary disk space.

The kernel doesn't care about what's in swap. Fix your applications if they
do.

Charles
--
-----------------------------------------------------------------------
Charles Cazabon <[email protected]>
GPL'ed software available at: http://www.qcc.sk.ca/~charlesc/software/
Any opinions expressed are just that -- my opinions.
-----------------------------------------------------------------------

2001-03-19 20:17:03

by Richard B. Johnson

Subject: Re: Linux should better cope with power failure

On Mon, 19 Mar 2001, Otto Wyss wrote:

> Lately I had an USB failure, leaving me without any access to my system
> since I only use an USB-keyboard/-mouse. All I could do in that
> situation was switching power off and on after a few minutes of
> inactivity. From the impression I got during the following startup, I
> assume Linux (2.4.2, EXT2-filesystem) is not very suited to any power
> failiure or manually switching it off. Not even if there wasn't any
> activity going on.
>
> Shouldn't a good system allways try to be on the save side? Shouldn't
> Linux try to be more fail save? There is currently much work done in
> getting high performance during high activity but it seems there is no
> work done at all in getting a save system during low/no activity. I
> think this is a major drawback and should be addressed as fast as
> possible. Bringing a system to save state should allway have a high priority.
>
> How could this be accomplished:
> 1. Flush any dirty cache pages as soon as possible. There may not be any
> dirty cache after a certain amount of idle time.
> 2. Keep open files in a state where it doesn't matter if they where
> improperly closed (if possible).
> 3. Swap may not contain anything which can't be discarded. Otherwise
> swap has to be treated as ordinary disk space.
>
> These actions are not filesystem dependant. It might be that certain
> filesystem cope better with power failiure than others but still it's
> much better not to have errors instead to fix them.
>
> Don't we tell children never go close to any abyss or doesn't have
> alpinist a saying "never go to the limits"? So why is this simple rule
> always broken with computers?
>

Unix and other such variants have what's called a Virtual File System
(VFS). The idea behind this is to keep as much recently-used file stuff
in memory so that the system can be as fast as if you used a RAM disk
instead of real physical (slow) hard disks. If you can't cope with this,
use DOS. Even Windows tries to emulate Unix as far as VFS is concerned.
However Windows never reports any errors. By design, Windows keeps
trashing along until you must reinstall it because there is nothing left
of the file-system.

If you want file-system trashing errors hidden, use Windows. Unix and
its variants provide enough information in their file-systems to recover
the file-system, although not necessarily a particular file, upon startup,
if you have just switched the system off. It uses `fsck` for this.

If, as you say, "bringing the system to a safe state should always
have a high priority...", then mount the disks `sync`.


Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.


2001-03-19 20:21:53

by William T Wilson

Subject: Re: Linux should better cope with power failure

On Mon, 19 Mar 2001, Otto Wyss wrote:

> inactivity. From the impression I got during the following startup, I
> assume Linux (2.4.2, EXT2-filesystem) is not very suited to any power
> failiure or manually switching it off. Not even if there wasn't any
> activity going on.

What data, if any, did you lose?

While fsck complains loudly when the system comes back up, 9 times in 10
no data is actually lost during a power loss. e2fsck is really good at
recovering damaged filesystems.

> How could this be accomplished:
> 1. Flush any dirty cache pages as soon as possible. There may not be any
> dirty cache after a certain amount of idle time.

Mount the filesystem synchronously to accomplish this. Basically, it will
prevent the kernel from using a write cache. It will ensure that if a
write operation completes, then the data will be physically on the disk
afterward.
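
A minimal sketch of the same guarantee for a single file rather than a whole mount, assuming Python 3 on a Unix-like system. The helper name write_durably is made up for illustration; O_SYNC asks the kernel to make each write() return only after the data has been handed to the device, which is roughly what a synchronous mount does for every file.

```python
import os


def write_durably(path: str, data: bytes) -> None:
    """Write data to path with O_SYNC so each write() is synchronous.

    Assumption: a Unix-like system where os.O_SYNC is available.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o644)
    try:
        view = memoryview(data)
        # os.write may write fewer bytes than requested, so loop until done.
        while view:
            written = os.write(fd, view)
            view = view[written:]
    finally:
        os.close(fd)
```

The cost is the one discussed elsewhere in this thread: every write waits for the disk, so this only makes sense for data the application considers precious.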

> 2. Keep open files in a state where it doesn't matter if they where
> improperly closed (if possible).

The way to do this is to use a highly reliable filesystem, such as ext3fs,
Tux2 or ReiserFS. These filesystems guarantee that metadata is consistent
at all times.

> 3. Swap may not contain anything which can't be discarded. Otherwise
> swap has to be treated as ordinary disk space.

I can't think of a case where the contents of swap matter in any way for
recovering from a power failure.

2001-03-19 20:53:24

by Brian Gerst

Subject: Re: Linux should better cope with power failure

"Richard B. Johnson" wrote:
>
> On Mon, 19 Mar 2001, Otto Wyss wrote:
>
> > Lately I had an USB failure, leaving me without any access to my system
> > since I only use an USB-keyboard/-mouse. All I could do in that
> > situation was switching power off and on after a few minutes of
> > inactivity. From the impression I got during the following startup, I
> > assume Linux (2.4.2, EXT2-filesystem) is not very suited to any power
> > failiure or manually switching it off. Not even if there wasn't any
> > activity going on.
> >
> > Shouldn't a good system allways try to be on the save side? Shouldn't
> > Linux try to be more fail save? There is currently much work done in
> > getting high performance during high activity but it seems there is no
> > work done at all in getting a save system during low/no activity. I
> > think this is a major drawback and should be addressed as fast as
> > possible. Bringing a system to save state should allway have a high priority.
> >
> > How could this be accomplished:
> > 1. Flush any dirty cache pages as soon as possible. There may not be any
> > dirty cache after a certain amount of idle time.
> > 2. Keep open files in a state where it doesn't matter if they where
> > improperly closed (if possible).
> > 3. Swap may not contain anything which can't be discarded. Otherwise
> > swap has to be treated as ordinary disk space.
> >
> > These actions are not filesystem dependant. It might be that certain
> > filesystem cope better with power failiure than others but still it's
> > much better not to have errors instead to fix them.
> >
> > Don't we tell children never go close to any abyss or doesn't have
> > alpinist a saying "never go to the limits"? So why is this simple rule
> > always broken with computers?
> >
>
> Unix and other such variants have what's called a Virtual File System
> (VFS). The idea behind this is to keep as much recently-used file stuff
> in memory so that the system can be as fast as if you used a RAM disk
> instead of real physical (slow) hard disks. If you can't cope with this,
> use DOS.

At the very least the disk should be consistent with memory. If the
dirty pages aren't written back to the disk (but not necessarily removed
from memory) after a reasonable idle period, then there is room for
improvement.

--

Brian Gerst

2001-03-19 21:16:25

by Jeremy Jackson

Subject: Re: Linux should better cope with power failure

Brian Gerst wrote:

> "Richard B. Johnson" wrote:
> >
> > On Mon, 19 Mar 2001, Otto Wyss wrote:
> >
> > > Lately I had an USB failure, leaving me without any access to my system
> > > since I only use an USB-keyboard/-mouse. All I could do in that
> > > situation was switching power off and on after a few minutes of
> > > inactivity. From the impression I got during the following startup, I
> > > assume Linux (2.4.2, EXT2-filesystem) is not very suited to any power
> > > failiure or manually switching it off. Not even if there wasn't any
> > > activity going on.
> > >
> > > Shouldn't a good system allways try to be on the save side? Shouldn't
> > > Linux try to be more fail save? There is currently much work done in
> > > getting high performance during high activity but it seems there is no
> > > work done at all in getting a save system during low/no activity. I
> > > think this is a major drawback and should be addressed as fast as
> > > possible. Bringing a system to save state should allway have a high priority.
> > >
> > > How could this be accomplished:
> > > 1. Flush any dirty cache pages as soon as possible. There may not be any
> > > dirty cache after a certain amount of idle time.
> > > 2. Keep open files in a state where it doesn't matter if they where
> > > improperly closed (if possible).
> > > 3. Swap may not contain anything which can't be discarded. Otherwise
> > > swap has to be treated as ordinary disk space.
> > >
> > > These actions are not filesystem dependant. It might be that certain
> > > filesystem cope better with power failiure than others but still it's
> > > much better not to have errors instead to fix them.
> > >
> > > Don't we tell children never go close to any abyss or doesn't have
> > > alpinist a saying "never go to the limits"? So why is this simple rule
> > > always broken with computers?
> > >
> >
> > Unix and other such variants have what's called a Virtual File System
> > (VFS). The idea behind this is to keep as much recently-used file stuff
> > in memory so that the system can be as fast as if you used a RAM disk
> > instead of real physical (slow) hard disks. If you can't cope with this,
> > use DOS.
>
> At the very least the disk should be consistent with memory. If the
> dirty pages aren't written back to the disk (but not necessarily removed
> from memory) after a reasonable idle period, then there is room for
> improvement.

They are. If you leave your machine on for a minute or so (probably less is OK,
but I don't know), the kernel will flush dirty buffers. fsck will complain, but the
file's *data* blocks will be on the disk. There are way more reasons that this is a
silly and annoying thread. You should read more about things like
asynchronous/synchronous filesystems, lazy-write caching, etc. If you know how to
write software and/or configure your system, you can avoid all of these problems.
Or use a journaling filesystem (ext3, XFS, etc.).
But I tire of this...
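
One way to check the claim that dirty buffers get flushed after an idle period is to watch how much dirty data the kernel reports. This is a rough sketch, assuming Python 3 and a kernel whose /proc/meminfo has a "Dirty:" line (current kernels do; whether 2.4.2 does is not something this thread establishes). The function name dirty_kib is hypothetical.

```python
def dirty_kib(meminfo_text: str) -> int:
    """Return the 'Dirty' value (in kB) from /proc/meminfo-style text.

    Assumption: the text contains a line such as 'Dirty:   48 kB'.
    """
    for line in meminfo_text.splitlines():
        if line.startswith("Dirty:"):
            # Fields look like: ['Dirty:', '48', 'kB']
            return int(line.split()[1])
    raise KeyError("no 'Dirty:' line found")
```

On a live system one would call it as dirty_kib(open("/proc/meminfo").read()) a few times while the machine sits idle and expect the number to fall toward zero.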

2001-03-19 21:18:35

by Torrey Hoffman

Subject: RE: Linux should better cope with power failure


Otto Wyss wrote:
> situation was switching power off and on after a few minutes of
> inactivity. From the impression I got during the following startup, I

You aren't giving a lot of detail here. I assume your startup scripts run
fsck, and you saw a lot of errors. Were any of them uncorrectable?

> assume Linux (2.4.2, EXT2-filesystem) is not very suited to any power
> failiure or manually switching it off. Not even if there wasn't any
> activity going on.

That is correct. Pulling the plug on non-journaled filesystems is a
bad idea.

> Shouldn't a good system allways try to be on the save side?

Yes. Some of this is your responsibility. You have several options:
1. Get a UPS. That would not have helped your particular problem,
but it's a good idea if you care about data integrity.
2. Use a journaling file system. These are much more tolerant of
abuse. Reiserfs seems to work for me on embedded systems I am
building where the user can (and does) remove the power any time.
3. Use RAID. Hard drives are very cheap and software raid is very
easy to set up.

> There is currently much work done in
> getting high performance during high activity but it seems there is no
> work done at all in getting a save system during low/no activity.

Actually, a lot of work _is_ being done on journaling file systems
which help solve this problem. Current journaling file systems are
metadata only, but Tux2 (if I understand it) will journal everything.

> How could this be accomplished:
> 1. Flush any dirty cache pages as soon as possible. There may
> not be any
> dirty cache after a certain amount of idle time.

This can be done from user space. The simple approach would be to set up a
cron job to sync and flush buffers every "n" seconds. A smarter approach
would examine the load average, and not sync if the load was high. This
does not need to be in the kernel.
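
The "smarter approach" could be sketched roughly as follows, assuming Python 3 on a Unix-like system. The names sync_if_idle and run_forever, the 0.5 load threshold and the 30 second interval are all illustrative assumptions, not values from this thread.

```python
import os
import time


def sync_if_idle(max_load: float = 0.5,
                 get_load=os.getloadavg,
                 do_sync=os.sync) -> bool:
    """Flush dirty buffers only when the 1-minute load average is low.

    get_load and do_sync are parameters so the policy can be tested
    without touching the real system. Returns True if a sync was issued.
    """
    one_minute_load = get_load()[0]
    if one_minute_load > max_load:
        return False  # system is busy; leave write-back to the kernel
    do_sync()
    return True


def run_forever(interval_s: float = 30.0) -> None:
    """Poor man's cron job: check every interval_s seconds."""
    while True:
        sync_if_idle()
        time.sleep(interval_s)
```

Skipping the sync under load keeps the performance benefit of the write cache while still bounding how much unwritten data an idle machine can hold.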

> 2. Keep open files in a state where it doesn't matter if they where
> improperly closed (if possible).

This is mostly a user space problem as well. It has been solved for
editors which automatically save files every "n" minutes. I don't know
if it can be solved from kernel space - if applications leave files in
an inconsistent state, how can the kernel possibly do anything about it?
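
For the application side, one common technique (brought in here as an illustration, not something proposed in this thread) is to never modify a file in place: write a temporary file, flush it to disk, then rename it over the old one. A sketch, assuming Python 3 on a POSIX filesystem where rename within one directory is atomic; atomic_write is a hypothetical helper name.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace path with data so readers see the old or the new file,
    never a half-written one, even if power is lost midway."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays
    # on one filesystem. Note mkstemp creates it with mode 0600.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # file contents reach the disk first
        os.replace(tmp_path, path)  # then the name is switched
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Flush the directory entry too, so the rename itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

This does not help with files an application keeps open and rewrites in place, which is exactly the case the kernel cannot fix on the application's behalf.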

> 3. Swap may not contain anything which can't be discarded. Otherwise
> swap has to be treated as ordinary disk space.

I'm not an expert, but I don't think this is relevant?

> Don't we tell children never go close to any abyss or doesn't have
> alpinist a saying "never go to the limits"? So why is this simple rule
> always broken with computers?

So were you breaking this rule? Were you using a journaling file system,
or RAID, or a UPS?

Torrey Hoffman

2001-03-19 21:37:48

by Richard B. Johnson

Subject: Re: Linux should better cope with power failure

On Mon, 19 Mar 2001, Brian Gerst wrote:
[SNIPPED...]

>
> At the very least the disk should be consistent with memory. If the
> dirty pages aren't written back to the disk (but not necessarily removed
> from memory) after a reasonable idle period, then there is room for
> improvement.
>

Hmmm. Now think about it a minute. You have a database operation
with a few hundred files open, most of which will be deleted after
a sort/merge completes. At the same time, you've got a few thousand
directories with their ATIME being updated. Also, you have thousands
of temporary files being created in /tmp during a compile that didn't
use "-pipe".

If you periodically write everything to disk, you don't have many
CPU cycles available to do anything useful.

It is up to the application to decide if anything is 'precious'.
If you've got some database running, it's got to be checkpointed
with some "commitable" file-system so it can be interrupted at any time.

If you make your file-systems up of "slices", you can mount the
executable stuff read/only. Currently, only /var and /tmp need to
be writable for normal use, plus any user "slices", of course.
-- Yes I know you need to modify /etc/stuff occasionally (startup
and shutdown, plus an occasional password change). I proposed
a long time ago that /etc/mtab get moved to /var.

Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.


2001-03-19 22:01:38

by Brian Gerst

Subject: Re: Linux should better cope with power failure

"Richard B. Johnson" wrote:
>
> On Mon, 19 Mar 2001, Brian Gerst wrote:
> [SNIPPED...]
>
> >
> > At the very least the disk should be consistent with memory. If the
> > dirty pages aren't written back to the disk (but not necessarily removed
> > from memory) after a reasonable idle period, then there is room for
> > improvement.
> >
>
> Hmmm. Now think about it a minute. You have a database operation
> with a few hundred files open, most of which will be deleted after
> a sort/merge completes. At the same time, you've got a few thousand
> directories with their ATIME being updated. Also, you have thousands
> of temporary files being created in /tmp during a compile that didn't
> use "-pipe".
>
> If you periodically write everything to disk, you don't have many
> CPU cycles available to do anything useful.

Note the key words "reasonable idle period". It was stated elsewhere
that this is the case already so it is a moot point.

--

Brian Gerst

Subject: RE: Linux should better cope with power failure

Otto,

If you are doing development work (or playing with new kernels) and things
like USB failures lock you from keyboard and mouse...

Have you considered telnet into your box from a second machine? Even a 486
system would do this fine... network cards are cheap. You could try to
recover the system or at least do a shutdown.

Maybe there are reasons you have ruled this out; just making sure you haven't
overlooked a possible prevention solution.

Stephen Gutknecht
Renton, Washington
http://www.RoundSparrow.com/


-----Original Message-----
From: Otto Wyss [mailto:[email protected]]
Sent: Monday, March 19, 2001 11:47 AM
To: [email protected]
Subject: Linux should better cope with power failure


Lately I had an USB failure, leaving me without any access to my system
since I only use an USB-keyboard/-mouse. All I could do in that
situation was switching power off and on after a few minutes of
inactivity. From the impression I got during the following startup, I
assume Linux (2.4.2, EXT2-filesystem) is not very suited to any power
failiure or manually switching it off. Not even if there wasn't any
activity going on.
[snip]

2001-03-19 22:22:50

by Jeremy Jackson

Subject: Re: Linux should better cope with power failure

"Richard B. Johnson" wrote:

> On Mon, 19 Mar 2001, Brian Gerst wrote:
> [SNIPPED...]
>
> >
> > At the very least the disk should be consistent with memory. If the
> > dirty pages aren't written back to the disk (but not necessarily removed
> > from memory) after a reasonable idle period, then there is room for
> > improvement.
> >
>
> Hmmm. Now think about it a minute. You have a database operation
> with a few hundred files open, most of which will be deleted after
> a sort/merge completes. At the same time, you've got a few thousand
> directories with their ATIME being updated. Also, you have thousands
> of temporary files being created in /tmp during a compile that didn't
> use "-pipe".
>
> If you periodically write everything to disk, you don't have many
> CPU cycles available to do anything useful.
>
> It is up to the application to decide if anything is 'precious'.
> If you've got some database running, it's got to be checkpointed
> with some "commitable" file-system so it can be interrupted at any time.
>
> If you make your file-systems up of "slices", you can mount the
> executable stuff read/only. Currently, only /var and /tmp need to
> be writable for normal use, plus any user "slices", of course.
> -- Yes I know you need to modify /etc/stuff occasionally (startup
> and shutdown, plus an occasional password change). I proposed
> a long time ago that /etc/mtab get moved to /var.

so how does mount update /var/mtab when mounting var? he he.

Actually, I think /etc/mtab is not needed at all. Originally, UNIX
used to put as much onto the disk (and not in "core") as possible,
so much state information related only to one boot cycle was
taken out of the kernel and stored on disk: /var/run/utmp, /etc/mtab,
rmtab, and many others. All are invalidated by a reboot, and yet are
stored in non-volatile storage. Kernel memory is not swappable, so they
manually separated out the minimum needed in core.

Linux currently has a lot of this info in core, and maintains the disk files
for backwards compatibility. In the case of /etc/mtab, I believe /proc/mounts
has the same info. It appears to be in the same format as /etc/mtab,
so most of the groundwork has already been done.
I've considered trying just changing /etc/mtab to /proc/mounts in some
utilities, to remove the need for a read-write root. This (and other cases)
would guarantee consistency (look at /etc/mtab after a restart in single-user
mode - ugh).
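
To illustrate how little a utility needs from either file, here is a rough sketch of a reader for the shared format, assuming Python 3 and the usual whitespace-separated fields (device, mount point, type, comma-separated options, then two numbers). The names Mount, parse_mounts and writable_mount_points are made up for this example, and octal escapes such as \040 for spaces in mount points are deliberately not decoded.

```python
from typing import List, NamedTuple


class Mount(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: List[str]


def parse_mounts(text: str) -> List[Mount]:
    """Parse /proc/mounts or /etc/mtab style text into Mount records."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0].startswith("#"):
            continue  # skip blank, short, or comment lines
        device, mount_point, fs_type, options = fields[:4]
        mounts.append(Mount(device, mount_point, fs_type, options.split(",")))
    return mounts


def writable_mount_points(mounts: List[Mount]) -> List[str]:
    """Mount points currently mounted read-write."""
    return [m.mount_point for m in mounts if "rw" in m.options]
```

A tool written this way can be pointed at /proc/mounts instead of /etc/mtab without any other change, for example parse_mounts(open("/proc/mounts").read()).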

I wonder if embedded folks would like to at least keep the old behaviour
as a compile-time option; they're in almost the same boat as early (64k
core memory) unix folks.

My favorite compromise between journaling and performance is the
Compaq Smart Array controllers, with a battery-backed SRAM
write cache; they can do (fast) lazy writes and still be perfectly reliable.
Plus they keep *everything* reliable, not just metadata.

I find this a fascinating topic... the ultimate would be to use the nvram
(it's mostly empty if using LinuxBIOS) to store a clean shutdown flag,
and/or a system heartbeat timestamp (like syslogd's)... only this timestamp
would let the hdd spin down (not hit every 20 minutes or so with a timestamp
log entry) and not wear out a flash disk based system.

Regards,

Jeremy

2001-03-19 22:29:38

by Stephen Satchell

Subject: RE: Linux should better cope with power failure

At 01:16 PM 3/19/01 -0800, Torrey Hoffman wrote:
>Yes. Some of this is your responsibility. You have several options:
>1. Get a UPS. That would not have helped your particular problem,
> but it's a good idea if you care about data integrity.
>2. Use a journaling file system. These are much more tolerant of
> abuse. Reiserfs seems to work for me on embedded systems I am
> building where the user can (and does) remove the power any time.
>3. Use RAID. Hard drives are very cheap and software raid is very
> easy to set up.

Sorry, but you really should have read the ENTIRE thread before
commenting. This guy's original complaint was that his USB keyboard locks
up, and the only way to get it back is to do a very rude restart. In
combatting this problem, the guy was observing the "shortcomings" of the
file system.

To be more to the point of the guy's problem, he should consider using
software specifically intended for UPS hardware to notify a system when the
power is going to go, and wire up an appropriate switch to signal his
system to enter shutdown when his keyboard goes south. By forcing an
orderly shutdown, he doesn't see the fsck-ing messages, he gets his USB
keyboard back, and all is well with the world.

Of course, the other option is to use a regular keyboard instead of a USB
keyboard, but why point out the really easy solution? "Hey Doc, it hurts
when I do this." "Then don't do it."

Satch

2001-03-19 22:40:18

by Otto Wyss

Subject: Re: Linux should better cope with power failure

"Stephen Gutknecht (linux-kernel)" wrote:
>
> Otto,
>
[...]
> Have you considered telnet into your box from a second machine? Even a 486
> system would do this fine... network cards are cheap. You could try to
> recover the system or at least do a shutdown.
>
It was just a simple test machine where it didn't matter what was lost.
Still that doesn't justify this behaviour.

O. Wyss

2001-03-19 22:37:08

by Otto Wyss

Subject: Re: Linux should better cope with power failure

Jeremy Jackson wrote:
>
> Brian Gerst wrote:
>
> > "Richard B. Johnson" wrote:
> > >
> > > On Mon, 19 Mar 2001, Otto Wyss wrote:
> > >
> > > > Lately I had an USB failure, leaving me without any access to my system
[..]
> > > Unix and other such variants have what's called a Virtual File System
> > > (VFS). The idea behind this is to keep as much recently-used file stuff
> > > in memory so that the system can be as fast as if you used a RAM disk
> > > instead of real physical (slow) hard disks. If you can't cope with this,
> > > use DOS.
> >
> > At the very least the disk should be consistent with memory. If the
> > dirty pages aren't written back to the disk (but not necessarily removed
> > from memory) after a reasonable idle period, then there is room for
> > improvement.
>
> They are. If you leave your machine one for a minute or so (probably less is ok,
> but I don't know) the kernel will flush dirty buffers... fsck will complain, but the
> file's

I waited at least 15 minutes without doing anything (how could I,
without any input device?). I was editing a file with vim, and the usual
bunch of daemons were running, mostly doing nothing. I don't know if all
the complaints from fsck were due to open files or dirty cache pages. I
wouldn't complain if there had been any heavy activity, but there was almost none.

> *data* blocks will be on the disk. There are way more reasons that this is a silly
> and annoying thread. You should read more about things like
> asynchronous/synchronous filesystems,
> lazy-write cacheing, etc, etc,. If you know how to write software and/or configure
> your system,
> you can avoid all of these problems. Or use a journaling filesystem ext3/xfs, etc.
> But I tire of this...

So in real life you would propose to put fences and nets everywhere to
prevent children from possibly falling into abysses?

O. Wyss

2001-03-19 23:07:09

by Andre Hedrick

Subject: RE: Linux should better cope with power failure


Guy,

I wrote APCUPSD beginning back in 95/96 for this reason.
American Power Conversion is now friendly to Linux.

http://www.linux-ide.org/apcupsd.html

Cheers,

On Mon, 19 Mar 2001, Stephen Satchell wrote:

> At 01:16 PM 3/19/01 -0800, Torrey Hoffman wrote:
> >Yes. Some of this is your responsibility. You have several options:
> >1. Get a UPS. That would not have helped your particular problem,
> > but it's a good idea if you care about data integrity.
> >2. Use a journaling file system. These are much more tolerant of
> > abuse. Reiserfs seems to work for me on embedded systems I am
> > building where the user can (and does) remove the power any time.
> >3. Use RAID. Hard drives are very cheap and software raid is very
> > easy to set up.
>
> Sorry, but you really should have read the ENTIRE thread before
> commenting. This guy's original complaint was that his USB keyboard locks
> up, and the only way to get it back is to do a very rude restart. In
> combatting this problem, the guy was observing the "shortcomings" of the
> file system.
>
> To be more to the point of the guy's problem, he should consider using
> software specifically intended for UPS hardware to notify a system when the
> power is going to go, and wire up an appropriate switch to signal his
> system to enter shutdown when his keyboard goes south. By forcing an
> orderly shutdown, he doesn't see the fsck-ing messages, he gets his USB
> keyboard back, and all is well with the world.
>
> Of course, the other option is to use a regular keyboard instead of a USB
> keyboard, but why point out the really easy solution? "Hey Doc, it hurts
> when I do this." "Then don't do it."
>
> Satch
>

Andre Hedrick
Linux ATA Development
ASL Kernel Development
-----------------------------------------------------------------------------
ASL, Inc. Toll free: 1-877-ASL-3535
1757 Houret Court Fax: 1-408-941-2071
Milpitas, CA 95035 Web: http://www.aslab.com

2001-03-19 23:09:19

by Werner Almesberger

Subject: Re: Linux should better cope with power failure

Richard B. Johnson wrote:
> Unix and other such variants have what's called a Virtual File System
> (VFS).

Correct, but hardly relevant here, except possibly that this enables you
to use a different, perhaps more resilient file system.

> The idea behind this is to keep as much recently-used file stuff
> in memory so that the system can be as fast as if you used a RAM disk
> instead of real physical (slow) hard disks.

Correct, but does not require VFS.

Nice try, though.

- Werner

--
_________________________________________________________________________
/ Werner Almesberger, ICA, EPFL, CH [email protected] /
/_IN_N_032__Tel_+41_21_693_6621__Fax_+41_21_693_6610_____________________/

2001-03-19 23:12:29

by John R Lenton

Subject: Re: Linux should better cope with power failure

On Mon, Mar 19, 2001 at 11:35:55PM +0100, Otto Wyss wrote:
> > you can avoid all of these problems. Or use a journaling filesystem ext3/xfs, etc.
>
> So in real life you would propose to put fences and nets everywhere to
> prevent children from possibly falling into abysses?

I think you've got it backwards: from my point of view, _you_ are
proposing the nets, _he_ is proposing the knowledgeable and
trustworthy parent looking after the children.

--
John Lenton ([email protected]) -- Random fortune:
If you are given two contradictory orders, obey them both.
-- Brintnall's Second Law.

2001-03-20 21:39:25

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Linux should better cope with power failure

Followup to: <[email protected]>
By author: Otto Wyss <[email protected]>
In newsgroup: linux.dev.kernel
>
> It was just a simple test machine where it didn't matter what was lost.
> Still that doesn't justify this behaviour.
>

Then use a journalling filesystem. If not, give it a few minutes of
idle time; fsck will complain on boot but it should be able to clean
up the mess.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt

2001-03-23 15:29:10

by David Balazic

[permalink] [raw]
Subject: Re: Linux should better cope with power failure

I had a similar experience:
X crashed, hosing the console, so I could not initiate
a proper shutdown.

Here I must note that the response you got on linux-kernel is
shameful.

What I did was to write a kernel/apmd patch that performed a
proper shutdown when I pressed the power button (which luckily
works as long as the kernel works).

Ask me for details if interested.
The patch was for 2.2.x IIRC, so I would have to rewrite it almost
from scratch.


--
David Balazic
--------------
"Be excellent to each other." - Bill & Ted
- - - - - - - - - - - - - - - - - - - - - -

2001-03-23 18:22:49

by Gerhard Mack

[permalink] [raw]
Subject: Re: Linux should better cope with power failure

This sounds very nice. Can such a thing be done with the reset switch as
well?

Gerhard


On Fri, 23 Mar 2001, David Balazic wrote:

> I had a similar experience:
> X crashed, hosing the console, so I could not initiate
> a proper shutdown.
>
> Here I must note that the response you got on linux-kernel is
> shameful.
>
> What I did was to write a kernel/apmd patch that performed a
> proper shutdown when I pressed the power button (which luckily
> works as long as the kernel works).
>
> Ask me for details if interested.
> The patch was for 2.2.x IIRC, so I would have to rewrite it almost
> from scratch.

--
Gerhard Mack

[email protected]

<>< As a computer I find your faith in technology amusing.

2001-03-23 19:30:11

by Otto Wyss

[permalink] [raw]
Subject: Re: Linux should better cope with power failure

> I had a similar experience:
> X crashed, hosing the console, so I could not initiate
> a proper shutdown.
>
> Here I must note that the response you got on linux-kernel is
> shameful.
>
Thanks, but I expected it a little bit. Everything around Linux is centered
on getting the highest performance out of it, and very little (too little
IMHO) is done to have a safe system. The attitude "It doesn't matter if we
make mistakes, they get fixed anyhow" annoys me most, especially when it
would be easy to prevent them.

> What I did was to write a kernel/apmd patch that performed a
> proper shutdown when I pressed the power button (which luckily
> works as long as the kernel works).
>
Not with an AT power supply, but certainly nice to have. See that it gets
included in the kernel. I didn't lose anything important since it was
just a test machine. I was just shocked at what fsck complained about on a
machine which had done almost nothing at all. If I ran into this
on a production system I'd immediately get a serial keyboard or at
least have a usable network connection. Besides, USB-only is not ready yet.

> > Don't we tell children never to go close to any abyss, and don't
> > alpinists have a saying, "never go to the limits"? So why is this simple rule
> > always broken with computers?
> >
Is there a similar expression which could be hammered into every
developer's mind, e.g. "Don't make errors, others already make them for you"?

O. Wyss

2001-03-23 22:45:37

by David Ford

[permalink] [raw]
Subject: Re: Linux should better cope with power failure

Otto Wyss wrote:

> > I had a similar experience:
> > X crashed, hosing the console, so I could not initiate
> > a proper shutdown.
> >
> > Here I must note that the response you got on linux-kernel is
> > shameful.
> >
> Thanks, but I expected it a little bit. Everything around Linux is centered
> on getting the highest performance out of it, and very little (too little
> IMHO) is done to have a safe system. The attitude "It doesn't matter if we
> make mistakes, they get fixed anyhow" annoys me most, especially when it
> would be easy to prevent them.

No, the correct answer is: if you want reliable recovery, then run your disks
in non-write-buffered mode, i.e. turn on sync in fstab.

It's all about RTFM and knowing the difference between buffered and
unbuffered actions.

Everything you need for a clean and proper crash recovery system
is already within your power; you just need to read the man pages and fix
your fstab instead of blaming linux-kernel for bad attitudes.

Yes, it's very easy to prevent e2fsck runs. Run synchronous or journaled
file systems.
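
To make the buffered/unbuffered distinction concrete, here is a minimal
userspace sketch in Python. It is an illustration only, not something posted
in the thread, and the three helper names are made up. os.fsync() and the
O_SYNC open flag are per-file counterparts of what the sync mount option
applies to a whole filesystem. It assumes a Unix-like system where O_SYNC is
available, and the drive's own write cache may still hold data after these
calls return.

```python
import os


def write_buffered(path, data):
    """Write through the page cache; the kernel decides when to flush."""
    with open(path, "wb") as f:
        f.write(data)


def write_durable(path, data):
    """Write, then ask the kernel to flush this file before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # drain Python's userspace buffer
        os.fsync(f.fileno())   # ask the kernel to write the file's dirty pages


def write_o_sync(path, data):
    """Open with O_SYNC so every write() is synchronous."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o644)
    try:
        view = memoryview(data)
        while view:            # os.write may write fewer bytes than requested
            written = os.write(fd, view)
            view = view[written:]
    finally:
        os.close(fd)
```

All three leave the same bytes in the file; they differ only in when the data
is pushed out of memory, which is where the performance cost of sync comes
from.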

> > > Don't we tell children never to go close to any abyss, and don't
> > > alpinists have a saying, "never go to the limits"? So why is this simple rule
> > > always broken with computers?
> > >
> Is there a similar expression which could be hammered into every
> developer's mind, e.g. "Don't make errors, others already make them for you"?

There is also a very common expression...RTFM.

Please understand what you are doing before you do it, particularly before
you bad-mouth others for having a bad attitude. Don't blame race car makers
for destructive engine failure when you expect a race car to act like a family car.

-d


--
There is a natural aristocracy among men. The grounds of this are virtue and talents. Thomas Jefferson
The good thing about standards is that there are so many to choose from. Andrew S. Tanenbaum



2001-03-24 08:46:06

by Otto Wyss

[permalink] [raw]
Subject: Re: Linux should better cope with power failure

> No, the correct answer is: if you want reliable recovery, then run your disks
> in non-write-buffered mode, i.e. turn on sync in fstab.
>
You probably haven't tried to use sync or you would have noticed the
performance penalty. I think nobody really considers sync an alternative.

O. Wyss

2001-03-24 09:48:22

by David Ford

[permalink] [raw]
Subject: Re: Linux should better cope with power failure

Otto Wyss wrote:

> > No, the correct answer is: if you want reliable recovery, then run your disks
> > in non-write-buffered mode, i.e. turn on sync in fstab.
> >
> You probably haven't tried to use sync or you would have noticed the
> performance penalty. I think nobody really considers sync an alternative.
>
> O. Wyss

You can't have the best of everything. There are tradeoffs. A viable option is a
journaled filesystem. Linux boasts a few, two of which are at your fingertips by
way of config options. Read up on JFS or ReiserFS.

-d

--
There is a natural aristocracy among men. The grounds of this are virtue and talents. Thomas Jefferson
The good thing about standards is that there are so many to choose from. Andrew S. Tanenbaum



2001-03-24 10:29:44

by Otto Wyss

[permalink] [raw]
Subject: Re: Linux should better cope with power failure

> > You probably haven't tried to use sync or you would have noticed the
> > performance penalty. I think nobody really considers sync an alternative.
> >
> > O. Wyss
>
> You can't have the best of everything. There are tradeoffs. A viable option is
> a journaled filesystem. Linux boasts a few, two of which are at your fingertips
> by way of config options. Read up on JFS or ReiserFS.
>
How about the following solution: during high activity _any_ FS is
treated as if it were mounted async, and during low/no activity it is
treated as sync. This simple solution would certainly be acceptable to everyone.
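
The policy proposed above (buffer while busy, flush once things go quiet) can
be sketched in userspace roughly as follows. This is only an illustration of
the idea in Python, not kernel code and not anything posted in the thread; the
class name IdleFlusher, the five-second threshold and the polling approach are
all assumptions. os.sync() asks the kernel to write out all dirty buffers and
is assumed to be available (Unix, Python 3.3 or later). The kernel already
writes dirty buffers back periodically on its own; the sketch only shows the
idle-triggered variant.

```python
import os
import time


class IdleFlusher:
    """Call a flush function once no write has been seen for idle_seconds."""

    def __init__(self, idle_seconds=5.0, flush=os.sync, clock=time.monotonic):
        self.idle_seconds = idle_seconds
        self.flush = flush
        self.clock = clock
        self.last_write = None  # time of the latest write; None if nothing is dirty

    def note_write(self):
        """Record that data was written and may still be sitting in cache."""
        self.last_write = self.clock()

    def poll(self):
        """Flush if something is dirty and the idle threshold has passed.

        Returns True if a flush was performed, False otherwise.
        """
        if self.last_write is None:
            return False
        if self.clock() - self.last_write < self.idle_seconds:
            return False
        self.flush()
        self.last_write = None
        return True
```

A caller would invoke note_write() after each write and poll() from a timer.
While writes keep arriving the threshold is never reached, so nothing is
flushed early; after a quiet period a single flush leaves nothing dirty.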

O. Wyss

2001-03-26 09:35:52

by David Balazic

[permalink] [raw]
Subject: Re: Linux should better cope with power failure

Gerhard Mack wrote:
>
> This sounds very nice. Can such a thing be done with the reset switch as
> well?

Don't think so.
I'm not sure, but I think that the reset button is directly connected
to the reset pin of most chips and cannot be overridden.
Of course this is the first candidate for a "reboot properly" button,
but there is no hardware support. That is why I used the power button,
which is (more or less) under software control.


> Gerhard


--
David Balazic
--------------
"Be excellent to each other." - Bill & Ted
- - - - - - - - - - - - - - - - - - - - - -

2001-03-26 10:20:02

by David Balazic

[permalink] [raw]
Subject: Re: Linux should better cope with power failure

Otto Wyss wrote:
>
> > I had a similar experience:
> > X crashed, hosing the console, so I could not initiate
> > a proper shutdown.
> >
> > Here I must note that the response you got on linux-kernel is
> > shameful.
> >
> Thanks, but I expected it a little bit. Everything around Linux is centered
> on getting the highest performance out of it, and very little (too little
> IMHO) is done to have a safe system. The attitude "It doesn't matter if we
> make mistakes, they get fixed anyhow" annoys me most, especially when it
> would be easy to prevent them.
>
> > What I did was to write a kernel/apmd patch that performed a
> > proper shutdown when I pressed the power button (which luckily
> > works as long as the kernel works).
> >
> Not with an AT power supply, but certainly nice to have. See that it gets
> included in the kernel.

It was just a one- or two-line bugfix, not a real patch. I will dig it up
and send a patch for 2.4
when I have the time :-)


--
David Balazic
--------------
"Be excellent to each other." - Bill & Ted
- - - - - - - - - - - - - - - - - - - - - -

2001-03-26 10:23:22

by David Balazic

[permalink] [raw]
Subject: Re: Linux should better cope with power failure

David (Ford), I think you are misunderstanding a bit here.
The problem here is not that an fsck is needed after an unclean umount,
but that users are forced to corrupt (by an unclean umount due to reset or
poweroff) their perfectly good file system on a "perfectly" working
system when their keyboard goes wacko (which happens more often than you might
think; just remember all that "log in over the net and run 'shutdown -r'" advice).


--
David Balazic
--------------
"Be excellent to each other." - Bill & Ted
- - - - - - - - - - - - - - - - - - - - - -