LinuxLists.cc - ext3 corruption

2006-08-08 23:47:47

by Molle Bestefich

Subject: ext3 corruption

I have a ~1TB filesystem that fails to mount, the message is:

EXT3-fs error (device loop0): ext3_check_descriptors: Block bitmap for
group 2338 not in group (block 1607003381)!
EXT3-fs: group descriptors corrupted !

A day before, it worked flawlessly.

What could have happened, and what's the best course of action?

2006-08-09 01:33:35

by Sergio Monteiro Basto

Subject: Re: ext3 corruption

man -k 2fs

man e2fsck

Unmount the filesystem first (don't forget this step),
then run e2fsck /dev/h (filesystem).
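
The "unmount first" step can be checked mechanically. A minimal sketch in Python, assuming the standard /proc/mounts line layout (source, mount point, type, options, dump, pass); the function name is_mounted and the sample device names are made up for illustration:

```python
def is_mounted(device, mounts_text):
    """Return True if `device` is listed as a mount source in
    /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each line reads: source mountpoint fstype options dump pass
        if len(fields) >= 2 and fields[0] == device:
            return True
    return False


# Hypothetical sample in the format of /proc/mounts.
SAMPLE_MOUNTS = """\
/dev/md0 / ext3 rw 0 0
/dev/loop0 /mnt/big ext3 rw 0 0
proc /proc proc rw 0 0
"""

if __name__ == "__main__":
    for dev in ("/dev/loop0", "/dev/loop1"):
        state = "mounted" if is_mounted(dev, SAMPLE_MOUNTS) else "not mounted"
        print(dev, "is", state)
```

On a live system the text would come from reading /proc/mounts rather than from the sample string.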

On Wed, 2006-08-09 at 01:47 +0200, Molle Bestefich wrote:
> I have a ~1TB filesystem that fails to mount, the message is:
>
> EXT3-fs error (device loop0): ext3_check_descriptors: Block bitmap for
> group 2338 not in group (block 1607003381)!
> EXT3-fs: group descriptors corrupted !
>
> A day before, it worked flawlessly.
>
> What could have happened, and what's the best course of action?

2006-08-09 10:36:15

by Molle Bestefich

Subject: Re: ext3 corruption

Molle Bestefich wrote:
> I have a ~1TB filesystem that fails to mount, the message is:
>
> EXT3-fs error (device loop0): ext3_check_descriptors: Block bitmap for
> group 2338 not in group (block 1607003381)!
> EXT3-fs: group descriptors corrupted !
>
> A day before, it worked flawlessly.
>
> What could have happened, and what's the best course of action?

I should probably mention that I've been bitten by e2fsck before.
I had a filesystem with minor damage, but after running e2fsck it was
completely nuked.
Nothing was recoverable.

So before anyone suggests running e2fsck, I'd really like someone
knowledgeable to tell me what e2fsck is going to do about "group
descriptors corrupted" *BEFORE* I go ahead and blindly run it.

2006-08-09 11:34:01

by linux-os (Dick Johnson)

Subject: Re: ext3 corruption

On Tue, 8 Aug 2006, Molle Bestefich wrote:

> I have a ~1TB filesystem that fails to mount, the message is:
>
> EXT3-fs error (device loop0): ext3_check_descriptors: Block bitmap for
                 ^^^^^^^^^^^_________

It seems as though you have a LOT of RAM if you can make a 1TB
filesystem on the loopback device!

Seriously, what are you doing? Are you attempting to mount a big file-system
through the loop-back device, or is this a copied-down message
you got during boot when initrd tried to mount a RAM disk?

> group 2338 not in group (block 1607003381)!
> EXT3-fs: group descriptors corrupted !
>

Ordinary disk repair involves running fsck on an UNMOUNTED file-system.

> A day before, it worked flawlessly.
>
> What could have happened, and what's the best course of action?

Any bad RAM, any shutdown without a proper unmount, any device hardware
error like DMA not completing properly, can cause file-system corruption.
That's why there are tools to fix it.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.16.24 on an i686 machine (5592.62 BogoMips).
New book: http://www.AbominableFirebug.com/
_

****************************************************************
The information transmitted in this message is confidential and may be privileged. Any review, retransmission, dissemination, or other use of this information by persons or entities other than the intended recipient is prohibited. If you are not the intended recipient, please notify Analogic Corporation immediately - by replying to this message or by sending an email to [email protected] - and destroy all copies of this information, including any attachments, without reading or disclosing them.

Thank you.

2006-08-09 15:22:32

by Molle Bestefich

Subject: Re: ext3 corruption

linux-os wrote:
> Molle Bestefich wrote:
> > I have a ~1TB filesystem that fails to mount, the message is:
> >
> > EXT3-fs error (device loop0): ext3_check_descriptors: Block bitmap for
>                  ^^^^^^^^^^^_________
>
> It seems as though you have a LOT of RAM if you can make a 1TB
> filesystem on the loopback device!

Why is that?
loop0 is backed by an MD device.

> Seriously, what are you doing, attempting to mount a big file-system
> through the loop-back device

Yes, and it has worked for... well... many years now.

> or is this a copied-down message
> you got during boot when initrd tried to mount a RAM disk?

No.

> > group 2338 not in group (block 1607003381)!
> > EXT3-fs: group descriptors corrupted !
>
> Ordinary disk repair involves running fsck on an UNMOUNTED file-system.

It _is_ unmounted.

(I've learned that lesson years ago. Probably after seeing fsck
complaining loudly when I tried to run it on a mounted filesystem, if
I had to guess ;-).)

> > A day before, it worked flawlessly.
> >
> > What could have happened, and what's the best course of action?
>
> Any bad RAM, any shutdown without a proper unmount, any device hardware
> error like DMA not completing properly, can cause file-system corruption.
> That's why there are tools to fix it.

The hardware works flawlessly.
The shutdown was a regular shutdown -h.

Messages on the console indicated that Linux actually tried to
shut down the filesystem before shutting down Samba, which is just
plain Real-F......-Stupid. Is there no intelligent ordering of
shutdown events in Linux at all?

Samba was serving files to remote computers and had no desire to let
go of the filesystem while still running. After 5 seconds or so,
Linux just shut down the MD device with the filesystem still mounted.

That's what happened on a user-visible level, but what could have
happened internally in the filesystem?

2006-08-09 15:38:28

by Michael Loftis

Subject: Re: ext3 corruption

--On August 9, 2006 5:22:28 PM +0200 Molle Bestefich wrote:

> Messages on the console indicated that Linux actually tried to
> shut down the filesystem before shutting down Samba, which is just
> plain Real-F......-Stupid. Is there no intelligent ordering of
> shutdown events in Linux at all?

The kernel doesn't perform those, your distro's init scripts do that. And
various distros have various success at doing the right thing. I've had
the best luck with Debian and Ubuntu doing this in the right order. RH
seems to insist on turning off the network then network services such as
sshd.

> Samba was serving files to remote computers and had no desire to let
> go of the filesystem while still running. After 5 seconds or so,
> Linux just shut down the MD device with the filesystem still mounted.

The kernel probably didn't do this, usually by the time the kernel gets to
this point init has already sent kills to everything. If it hasn't it
points to problems with your init scripts, not the kernel.

>
> That's what happened on a user-visible level, but what could have
> happened internally in the filesystem?

2006-08-09 18:28:46

by Molle Bestefich

Subject: Re: ext3 corruption

Michael Loftis wrote:
> > Is there no intelligent ordering of
> > shutdown events in Linux at all?
>
> The kernel doesn't perform those, your distro's init scripts do that.

Right. It's all just "Linux" to me ;-).

(Maybe the kernel SHOULD coordinate it somehow,
seems like some of the distros are doing a pretty bad job as is.)

> And various distros have various success at doing the right thing. I've had
> the best luck with Debian and Ubuntu doing this in the right order. RH
> seems to insist on turning off the network then network services such as
> sshd.

Seems things are worse than that. Seems like it actually kills the
block device before it has successfully (or forcefully) unmounted the
filesystems. Thus the killing must also be before stopping Samba,
since that's what was (always is) holding the filesystem.

It's indeed a redhat, though - Red Hat Linux release 9 (Shrike).

> > Samba was serving files to remote computers and had no desire to let
> > go of the filesystem while still running. After 5 seconds or so,
> > Linux just shut down the MD device with the filesystem still mounted.
>
> The kernel probably didn't do this, usually by the time the kernel gets to
> this point init has already sent kills to everything. If it hasn't it
> points to problems with your init scripts, not the kernel.

Ok, so LKML is not appropriate for the init script issue.
Never mind that, I'll just try another distro when time comes.

I'd really like to know what the "Block bitmap for group not in group"
message means (block bitmap is pretty self explanatory, but what's a
group?).

And what will e2fsck do to my dear filesystem if I let it have a go at it?

2006-08-09 18:41:37

by Mws

Subject: Re: ext3 corruption

On Wednesday 09 August 2006 20:28, Molle Bestefich wrote:
[snip]
hi,
what i am missing is one piece of information: what type of pc do you own/use?

i personally built a new one in the last few days and also encountered
problems with ext3.

i own an amd64 x2 5000+ with an asus m2n32 ws pro motherboard.

yesterday i changed my root partition from ext3 to xfs and my problems
went away. so imho there might be some issues with the combination of
64-bit systems, dual processors and ext3.

kernel is 2.6.17

the behaviour was like this:
the filesystem became corrupted due to uncommitted transactions, which
meant checking the partition manually with "fsck". i corrected loads of
errors, but a lot of files got corrupted. i didn't check whether the
sata-attached drives would also fail on ext3, because i had already
prepared them for xfs.

regards
marcel

2006-08-09 20:17:07

by Duane Griffin

Subject: Re: ext3 corruption

On 09/08/06, Molle Bestefich wrote:
[snip]
> And what will e2fsck do to my dear filesystem if I let it have a go at it?

To be safe, run it on an image of your filesystem first. You can use
the dd command to take the image, then run e2fsck on it. Afterwards
mount it and make sure everything looks kosher. That is assuming you
have enough spare space, of course. If not then you should at least
run e2fsck with -n first to find out what it wants to do. Personally,
my risk tolerance would be closely correlated with the quality of my
backups.
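
The choice described above (repair a copy if there is room, otherwise only look) can be sketched as follows. This is an illustration in Python, not a recovery tool: plan_fsck, the image path and the sizes are hypothetical, and the function only builds the dd and e2fsck command lines without running them.

```python
def plan_fsck(device, image_path, device_bytes, free_bytes):
    """Return the commands to run, as argv lists, without executing them."""
    if free_bytes >= device_bytes:
        return [
            # Copy the raw device into an image file first.
            ["dd", f"if={device}", f"of={image_path}", "bs=1M"],
            # Let e2fsck repair the copy, never the original.
            ["e2fsck", "-f", image_path],
        ]
    # No room for an image: open read-only and answer "no" to every question.
    return [["e2fsck", "-n", device]]


if __name__ == "__main__":
    TB = 10**12
    # Hypothetical numbers: a 1 TB device and no spare terabyte.
    for argv in plan_fsck("/dev/loop0", "/spare/fs.img", TB, free_bytes=0):
        print(" ".join(argv))
```

Returning the commands instead of running them keeps the decision reviewable before anything touches the disk.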

Cheers,
Duane.

--
"I never could learn to drink that blood and call it wine" - Bob Dylan

2006-08-09 20:47:27

by Molle Bestefich

Subject: Re: ext3 corruption

Duane Griffin wrote:
> > And what will e2fsck do to my dear filesystem if I let it have a go at it?
>
> To be safe, run it on an image of your filesystem first.

Yes, hmm, I don't have another terabyte handy, unfortunately.

> That is assuming you have enough spare space, of course.
> If not then you should at least run e2fsck with -n first to find out
> what it wants to do.

How close to 1-1 does "-n" relate to non-"-n" ?

For example, does e2fsck take into consideration the changes it would
have done itself in regular mode when it proceeds to the next problem
and/or phase of a -n operation?

If it doesn't, then that command is, well, totally useless.

So :-). Does it take that into consideration?

> Personally, my risk tolerance would be closely correlated with
> the quality of my backups.

I hear you loud and clear...
Sigh ;-).

> "I never could learn to drink that blood and call it wine" - Bob Dylan

Hmm. I like it.

2006-08-09 23:09:25

by Molle Bestefich

Subject: Re: ext3 corruption

Duane Griffin wrote:
> > How close to 1-1 does "-n" relate to non-"-n" ?
> >
> > For example, does e2fsck take into consideration the changes it would
> > have done itself in regular mode when it proceeds to the next problem
> > and/or phase of a -n operation?
>
> It corresponds perfectly to you answering "no" to all questions :)
> Sorry, I don't have a much better answer than that.

A good answer, even if it's one that can be found in the manual :-).

> > If it doesn't, then that command is, well, totally useless.
>
> That is too strong.

I don't think so.

If it doesn't take into account its own changes, then the -n command is
unable to produce even a slightly accurate resemblance of what would
happen if I did a real run.

And that's about the only use case I can come up with for -n...

> You should be able to get an idea how severe the damage
> is, at least.

If it's completely inaccurate, I can't trust the result, so that doesn't
help me much, if at all.

> From a quick read of the code it looks like your problem
> is related to dodgy data in the superblock, and e2fsck will attempt to
> recover & continue by reading the backup superblock.

Thanks a lot for checking !

I wonder then, will it write back this alternate superblock?

Is there anything I can do to control the process, like:
Do a test mount with one of the alternate superblocks?
Tell fsck to test a specific superblock; afterwards tell fsck to use a
specific superblock?

That would be useful.

> It does that regardless of whether you use -n,
> so in that respect at least it will operate in the
> same way as "normal" operation.

Ok, that's very good to know, thanks a lot.
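
For the alternate-superblock questions above, a rough sketch of where the backups would normally live and how one could be tried without writing anything. Assumptions, all to be verified against the real filesystem (for instance with dumpe2fs -h, if it can still read the superblock): 4 KiB blocks, the default of 8 * block_size blocks per group, and the sparse_super layout, where backups sit in group 1 and in groups that are powers of 3, 5 and 7. The helper names are invented for illustration. e2fsck's -b and -B options and mount's sb= option are the documented ways to select a superblock; sb= is given in 1 KiB units.

```python
def backup_superblock_groups(group_count):
    """Groups holding a backup superblock under sparse_super:
    group 1 plus every power of 3, 5 and 7 below group_count."""
    groups = {1} if group_count > 1 else set()
    for base in (3, 5, 7):
        g = base
        while g < group_count:
            groups.add(g)
            g *= base
    return sorted(groups)


def backup_superblock_blocks(total_blocks, block_size=4096):
    """Block numbers of the backup superblocks (default mke2fs geometry assumed)."""
    blocks_per_group = 8 * block_size      # one bitmap block, one bit per block
    first_data_block = 1 if block_size == 1024 else 0
    usable = total_blocks - first_data_block
    group_count = -(-usable // blocks_per_group)   # ceiling division
    return [first_data_block + g * blocks_per_group
            for g in backup_superblock_groups(group_count)]


def readonly_check_command(device, sb_block, block_size=4096):
    """argv for a read-only e2fsck run against one specific superblock."""
    return ["e2fsck", "-n", "-b", str(sb_block), "-B", str(block_size), device]


def mount_sb_option(sb_block, block_size=4096):
    """mount -o value for a read-only test mount; sb= counts 1 KiB units."""
    return f"ro,sb={sb_block * block_size // 1024}"


if __name__ == "__main__":
    blocks = backup_superblock_blocks(total_blocks=2**40 // 4096)
    print(blocks[:5])
    print(" ".join(readonly_check_command("/dev/loop0", blocks[0])))
    print(mount_sb_option(blocks[0]))
```

According to the e2fsck man page, when an alternate superblock is given and the filesystem is not opened read-only, the primary superblock is updated at the end of the check; with -n nothing is written.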

2006-08-10 00:08:14

by Duane Griffin

Subject: Re: ext3 corruption

On 10/08/06, Molle Bestefich wrote:
> If it doesn't take into account its own changes, then the -n command is
> unable to produce even a slightly accurate resemblance of what would
> happen if I did a real run.

It takes into account some of them (such as reading data from the
backup superblock if it detects corruption). Others will be irrelevant
for further operations. Many reports will be accurate, especially
fatal ones. I consider that useful, YMMV.

Cheers,
Duane.

--
"I never could learn to drink that blood and call it wine" - Bob Dylan

2006-08-10 03:06:39

by Jim Crilly

Subject: Re: ext3 corruption

On 08/09/06 08:28:34PM +0200, Molle Bestefich wrote:
> Michael Loftis wrote:
> >> Is there no intelligent ordering of
> >> shutdown events in Linux at all?
> >
> >The kernel doesn't perform those, your distro's init scripts do that.
>
> Right. It's all just "Linux" to me ;-).
>

Then I guess it's time to break out the learning cap and figure out what's
what. =)

> (Maybe the kernel SHOULD coordinate it somehow,
> seems like some of the distros are doing a pretty bad job as is.)
>

That's pretty much impossible, the best the kernel can do is send signals
to all of the running processes. If anything requires anything more
complicated (and many do) then even worse things will happen.

> >And various distros have various success at doing the right thing. I've had
> >the best luck with Debian and Ubuntu doing this in the right order. RH
> >seems to insist on turning off the network then network services such as
> >sshd.
>
> Seems things are worse than that. Seems like it actually kills the
> block device before it has successfully (or forcefully) unmounted the
> filesystems. Thus the killing must also be before stopping Samba,
> since that's what was (always is) holding the filesystem.
>
> It's indeed a redhat, though - Red Hat Linux release 9 (Shrike).
>

Why are you using such an old distribution? I know it's only been 3 years,
but a lot has changed and I don't think anyone supports RH9 or earlier
anymore.

> >> Samba was serving files to remote computers and had no desire to let
> >> go of the filesystem while still running. After 5 seconds or so,
> >> Linux just shut down the MD device with the filesystem still mounted.
> >
> >The kernel probably didn't do this, usually by the time the kernel gets to
> >this point init has already sent kills to everything. If it hasn't it
> >points to problems with your init scripts, not the kernel.
>
> Ok, so LKML is not appropriate for the init script issue.
> Never mind that, I'll just try another distro when time comes.
>
> I'd really like to know what the "Block bitmap for group not in group"
> message means (block bitmap is pretty self explanatory, but what's a
> group?).
>

ext2 breaks the filesystem up into block groups. A wild guess about the
error message would be that it couldn't find the block bitmap for a certain
group, or the bitmap that it did find wasn't in the correct group.
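
The message can be read as a simple range check: each group owns a contiguous run of blocks, and the descriptor for a group should point at a bitmap inside that run. A sketch in Python, assuming 4 KiB blocks, 32768 blocks per group and a first data block of 0 (none of these values appear in the report, so they are assumptions), with hypothetical function names:

```python
def group_block_range(group, blocks_per_group=32768, first_data_block=0):
    """First and last block number belonging to a block group."""
    start = first_data_block + group * blocks_per_group
    return start, start + blocks_per_group - 1


def bitmap_in_group(bitmap_block, group, blocks_per_group=32768,
                    first_data_block=0):
    """True if the bitmap block recorded for `group` lies inside that group."""
    start, end = group_block_range(group, blocks_per_group, first_data_block)
    return start <= bitmap_block <= end


if __name__ == "__main__":
    group, reported = 2338, 1607003381      # values from the error message
    total_blocks = 2**40 // 4096            # roughly 1 TiB of 4 KiB blocks
    print("group range:", group_block_range(group))
    print("bitmap in group:", bitmap_in_group(reported, group))
    print("bitmap inside filesystem:", reported < total_blocks)
```

Under these assumptions group 2338 spans blocks 76611584 to 76644351, and a filesystem of roughly 1 TiB has 268435456 blocks in total, so block 1607003381 lies outside not just the group but the whole filesystem. That would point to garbage in the descriptor rather than a misplaced bitmap.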

Jim.

2006-08-10 07:44:58

by Denys Vlasenko

Subject: Re: ext3 corruption

On Wednesday 09 August 2006 17:22, Molle Bestefich wrote:
> The hardware works flawlessly.
> The shutdown was a regular shutdown -h.
>
> Messages on the console indicated that Linux actually tried to
> shut down the filesystem before shutting down Samba, which is just
> plain Real-F......-Stupid. Is there no intelligent ordering of
> shutdown events in Linux at all?

There is no shutdown ordering in the Linux *kernel*, it is
the responsibility of the userspace to arrange for that.
IOW: the distribution packagers should do it,
or you, if you maintain your custom-configured system.

> Samba was serving files to remote computers and had no desire to let
> go of the filesystem while still running. After 5 seconds or so,

Somebody forgot to add a kill -9 to the shutdown scripts.

> Linux just shut down the MD device with the filesystem still mounted.
>
> That's what happened on a user-visible level, but what could have
> happened internally in the filesystem?
--
vda
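
The usual pattern behind the "kill -9" remark is: ask politely with SIGTERM, wait a bounded time, then force with SIGKILL, and only afterwards unmount. A sketch in Python of the first part; the function name stop_process and the grace period are illustrative, not taken from any init script:

```python
import subprocess
import sys


def stop_process(proc, grace_seconds=5.0):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL."""
    proc.terminate()                      # polite request (SIGTERM on POSIX)
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()                       # the "kill -9" step (SIGKILL on POSIX)
        proc.wait()
    return proc.returncode


if __name__ == "__main__":
    # Stand-in for a daemon that refuses to exit on SIGTERM.
    stubborn = (
        "import signal, time\n"
        "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
        "time.sleep(60)\n"
    )
    child = subprocess.Popen([sys.executable, "-c", stubborn])
    stop_process(child, grace_seconds=1.0)
    print("child still running:", child.poll() is None)
```

A forced kill only guarantees that the process is gone, so that the unmount can proceed; it does nothing for data the process had not yet written.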

2006-08-10 08:33:22

by Bernd Petrovitsch

Subject: Re: ext3 corruption

On Wed, 2006-08-09 at 20:28 +0200, Molle Bestefich wrote:
> Michael Loftis wrote:
> > > Is there no intelligent ordering of
> > > shutdown events in Linux at all?
> >
> > The kernel doesn't perform those, your distro's init scripts do that.
>
> Right. It's all just "Linux" to me ;-).

Then you are very probably asking in the wrong place.

> (Maybe the kernel SHOULD coordinate it somehow,
> seems like some of the distros are doing a pretty bad job as is.)

Patch your "Linux" to dump the output of "strace" of the init scripts
(it should be enough to improve the correct line in /etc/inittab) into a
log file and have fun considering the heuristics to be used in the
kernel to detect the dependencies.

Knowing typical init scripts, the expression "extremely hard" is the
understatement of the year for this task.

Bernd
--
Firmix Software GmbH http://www.firmix.at/
mobil: +43 664 4416156 fax: +43 1 7890849-55
Embedded Linux Development and Services

2006-08-10 09:48:11

by Molle Bestefich

Subject: Re: ext3 corruption

Duane Griffin wrote:
> It takes into account some of them (such as reading data from the
> backup superblock if it detects corruption). Others will be
> irrelevant for further operations.

Ok, maybe it is accurate?

> Many reports will be accurate

Ok, perhaps not then :-).

I'm still confused as to the performance of "-n".
It would be _very_ good to fix this deficiency in the man page of e2fsck.

Thanks Duane, you've been most helpful.

Jim Crilly wrote:
> > Right. It's all just "Linux" to me ;-).
>
> Then I guess it's time to break out the learning cap and figure
> out what's what. =)

;-).

You can start by phoning Red Hat. They call their entire product
"Red Hat Linux", so that pretty much means that "Linux" basically
covers everything, not just the kernel.

> > It's indeed a redhat, though - Red Hat Linux release 9 (Shrike).
>
> Why are you using such an old distribution? I know it's only been 3
> years, but a lot has changed and I don't think anyone supports RH9
> or earlier anymore.

As far as I remember, I configured it to automatically update everything.
Apparently that function just broke itself very early on :-).

I guess the problem is that I don't know a single Linux packaging system
that actually works well enough to keep a system up to date at all times,
and I don't have any free time to spend on reinstalling systems all the
time.

I think most of the package managers break because their dependency system
sucks. Some of them don't suck, but they break because there are no
integrity checks, and package maintainers can dump any kind of bizarre
corrupt dependencies they like into them. That's how Gentoo works, for
example. Others have even more bizarre ways of breaking; Gentoo, again as
an example, requires the user to run a "switch to newer GCC" command from
time to time, otherwise random packages just start breaking.

AFAIK, every single Linux package manager on the planet is half-assed, broken
like above or in some other way. If you know of one that's actually well
thought through on all levels and well implemented, and thus works well enough
to keep a system up to date for three years in a row without human
intervention....
Please speak up!!!

> > (Maybe the kernel SHOULD coordinate it somehow,
> > seems like some of the distros are doing a pretty bad job as is.)
>
> That's pretty much impossible, the best the kernel can do is send
> signals to all of the running processes.

Impossible? Few things in the software world are impossible.

Surely it's possible to create a kernel interface where processes
can tell the kernel about which other processes they'd like to
outlive and which ones they'd like to get killed before.

The kernel could then coordinate the killing of processes in a
"shutdown" function, which the various distros' 'reboot' and
'shutdown' scripts could call.

And voila, that difficult task of assessing in which order to do
things is out of the hands of distros like Red Hat, and into the
hands of those people who actually make the binaries.

Which is probably a good thing, because

a) Red Hat's init scripts probably fail for me because there's
something in my setup that Red Hat didn't expect. A greatly
simplified system as outlined above would help to fix things
like this.

b) Less duplicated effort in the form of init script coding for
the distro maintainers.

I realize that details are totally absent in the above, but at least
it doesn't look to me like it's impossible at all.
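
The ordering idea above amounts to a dependency graph: each unit names the units it must outlive, and shutdown proceeds in an order that respects every such edge. A userspace sketch in Python using the standard graphlib module (Python 3.9 or later); the unit names and the function shutdown_order are hypothetical, and this illustrates the proposal only, not an existing kernel interface:

```python
from graphlib import TopologicalSorter


def shutdown_order(must_outlive):
    """Return a stop order in which every unit is stopped only after
    all the units it must outlive have been stopped.

    must_outlive maps a unit to the set of units that must go first.
    """
    return list(TopologicalSorter(must_outlive).static_order())


if __name__ == "__main__":
    # Hypothetical setup resembling the one in this thread.
    deps = {
        "md device": {"filesystem"},   # stop the array only after unmounting
        "filesystem": {"smbd"},        # unmount only after Samba has exited
    }
    print(shutdown_order(deps))
```

For this chain the only valid order is smbd, then the filesystem, then the MD device. Contradictory declarations (a cycle) make static_order raise graphlib.CycleError, which is where the hard part of the proposal would begin.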

> ext2 breaks the filesystem up into block groups,

Thanks for the info!

> a wild guess about the error message would be that it couldn't
> find the block bitmap for a certain group

Hmm, I would have expected it to find something completely
corrupt somewhere instead of finding nothing at all.

> or the bitmap that it did find wasn't in the correct group.

Implying that they're linked both ways?
That would probably be a very good thing wrt. recoverability.
Interesting thought!

2006-08-10 11:41:26

by linux-os (Dick Johnson)

Subject: Re: ext3 corruption

On Thu, 10 Aug 2006, Molle Bestefich wrote:

[snip]

What is it that you are attempting to do? First you show us some
text obtained while attempting to run fsck on the loop device,
claiming that this was obtained from a 1TB file-system that
was destroyed by Linux. Then you spend several days telling us
that linux is no good. Enough is enough.

If you had a 1TB file-system and you knew anything about Unix or
Linux, it would have been fixed by now -- and BTW, samba can't
destroy a file-system, no matter how many files were open.
The worst possible situation is that files, open for write, may
not be completely written and this only for files that were
being created or extended. You still have the original file-data
and all the rest of the files on your file system.

Another point... ext3 is a journaled file-system. Even when
forced off by hitting the reset switch, ext3 will quietly
announce "recovering from journal" and mount just fine.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.16.24 on an i686 machine (5592.62 BogoMips).
New book: http://www.AbominableFirebug.com/

2006-08-10 12:21:11

by Molle Bestefich

Subject: Re: ext3 corruption

linux-os (Dick Johnson) wrote:
> What is it that you are attempting to do?

Fix my filesystem.
Prevent this situation from happening for others, if at all possible.

> First you show us some text obtained while attempting
> to run fsck on the loop device,
> claiming that this was obtained from a 1TB file-system that
> was destroyed by Linux. Then you spend several days telling us
> that linux is no good. Enough is enough.

That was never really my point, apologies if it came across that way.

> If you had a 1TB file-system and you knew anything about Unix or
> Linux, it would have been fixed by now

What? Uh.
Well, whatever.

> -- and BTW, samba can't
> destroy a file-system, no matter how many files were open.

Never claimed that it did.

> The worst possible situation is that files, open for write, may
> not be completely written and this only for files that were
> being created or extended. You still have the original file-data
> and all the rest of the files on your file system.

Well, it doesn't mount, so they're kind of irretrievable right now.

> Another point... ext3 is a journaled file-system. Even when
> forced off by hitting the reset switch, ext3 will quietly
> announce "recovering from journal" and mount just fine.

Obviously that's not true.

2006-08-10 12:22:17

by Helge Hafting

Subject: Re: ext3 corruption

Molle Bestefich wrote:
[...]
>
> As far as I remember, I configured it to automatically update everything.
> Apparently that function just broke itself very early on :-).
>
> I guess the problem is that I don't know a single Linux packaging system
> that actually works well enough to keep a system up to date at all times,
> and I don't have any free time to spend on reinstalling systems all the
> time.
Have you considered debian then? That package system certainly
has been able to keep many a system running for years and years.
You never reinstall, just upgrade.
>
> I think most of the package managers break because their dependency
> system
> sucks. Some of them doesn't suck, but they break because there's no
> integrity checks, and package maintainers can dump any kind of bizarre
> corrupt dependencies they like into them. That's how Gentoo works, for
Sure, you have to trust someone. If you can't trust any distro and its
maintainers, your only choice is to roll your own.
> example. Others have even more bizarre ways of breaking, again Gentoo as
> an example requires the user to run a "switch to newer GCC" command from
> time to time, otherwise random packages just start breaking.
>
> AFAIK, every single Linux package manager on the planet is half-ass,
> broken
> like above or in some other way. If you know of one that's actually well
> thought through on all planes and well implemented and thus works good
> enough
> to keep a system up to date for three years in a row without human
> intervention....
> Please speak up!!!
Well, if you expect to keep a computer running for three years
without human intervention - good luck to you! Not only will there be
new vulnerabilities and attacks via the internet in that time, there is also
a substantial risk of hardware breakdown. Disks in particular seem not
to live much more than three years, and if you have several, the chance
is bigger that one goes. Raid protects the data, but surely you will
intervene
to replace the failed drive? Or do you install six hot spares so the
system really can keep running alone?

It is certainly possible to run debian and spend 5min per week
on running "apt-get dist-upgrade" which installs anything that
was upgraded since the last time. And if you have many identical
servers, then you know that when one upgrade went without
problems the others will too.
>> > (Maybe the kernel SHOULD coordinate it somehow,
>> > seems like some of the distros are doing a pretty bad job as is.)
>>
>> That's pretty much impossible, the best the kernel can do is send
>> signals to all of the running processes.
>
> Impossible? Few things in the software world are impossible.
>
> Surely it's possible to create a kernel interface where processes
> can tell the kernel about which other processes they'd like to
> outlive and which ones they'd like to get killed before.
>
> The kernel could then coordinate the killing of processes in a
> "shutdown" function, which the various distro's 'reboot' and
> 'shutdown' scripts could call.
Such coordination can certainly be done - there is just no
reason at all to do it in the _kernel_. This is the job of the
program called "init". (The kernel is a very important piece of
a linux system, but don't make the mistake of believing
it therefore should be the "general manager" for all things less
important.)

Init is the first program that runs,
it is started up by the kernel. Init will then start all other
software you need, such as samba, any other server software,
and of course the login services. When you shut the pc down,
init is the program responsible for stopping everything in
a sane order.

Init is customizable, by editing and/or renaming the so-called
initscripts. That way, you can alter the order of startup and
shutdown of software, if your distribution didn't get this right.
This isn't all that hard, and a linux system administrator is
supposed to be able to make simple adjustments in this area.
Many linux/unix books document how initscripts work, and there
is usually plenty of online documentation as well.

> And voila, that difficult task of assessing in which order to do
> things is out of the hands of distros like Red Hat, and into the
> hands of those people who actually make the binaries.
Not so easy. You do not want to shut down md devices because
samba is using them. Someone else may run samba on a single
harddisk and also have some md-devices that they take down
and bring up a lot. So having samba generally depend on md doesn't
work. Your setup needs it, others may have different needs.
That's why the initscripts are _scripts_, simple textfiles that
administrators can manipulate without having to know C programming.
>
> Which is probably a good thing, because
>
> a) Red Hat's init scripts probably fails for me because there's
> something in my setup that Red Hat didn't expect. A greatly
> simplified system as outlined above would help to fix things
> like this.
Learn to manipulate the initscripts then. Changing the
shutdown order really is as simple as renaming samba's
script file so it occurs earlier in the shutdown order than
the script responsible for taking down the md devices.
You don't even need to understand shellscript programming to
re-order stuff.
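
As a rough illustration of why renaming is enough: this assumes the usual SysV convention that the K* ("kill") scripts of a runlevel directory are run in lexical order of their file names. K35smb is the Red Hat name that comes up elsewhere in this thread; the other script names and numbers are made up for the example.

```python
# Sketch only: SysV-style rc runs the K* ("kill") scripts of a runlevel
# in lexical order of their file names, so the two-digit number decides
# which service is stopped first. All names except K35smb are hypothetical.

def shutdown_order(script_names):
    """Return the kill scripts in the order rc would run them."""
    return sorted(name for name in script_names if name.startswith("K"))

# Hypothetical broken layout: the md script sorts before samba's.
before = ["K30mdarray", "K35smb", "K90network"]
# After renaming K35smb to K20smb, samba is stopped before md.
after = ["K30mdarray", "K20smb", "K90network"]

print(shutdown_order(before))
print(shutdown_order(after))
```

No shell programming is involved; only the number in the file name changes.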
>
> b) Less duplicated effort in the form of init script coding for
> the distro maintainers.
It is open source, they can copy each other's scripts to save
effort. They often do - and sometimes the initscript for a
particular piece of sw is written by the maker of that sw too.

Helge Hafting

2006-08-10 13:00:07

by Molle Bestefich

[permalink] [raw]

Subject: Re: ext3 corruption

Helge Hafting wrote:
> Have you considered debian then? That package system certainly
> have been able to keep many a system running for years and years.
> You never reinstall, just upgrade.

Sounds cool, I'll try that.
Thanks.

> Well, if you expect to keep a computer running for three years
> without human intervention - good luck to you! Not only will there be
> new vulnerabilities and attacks via the internet in that time,

It's NATted, so it's basically unreachable unless you know how to get to it.

> there is also a substantial risk of hardware breakdown.

Hasn't happened yet.

> Disks in particular seems not to live much more than three years,
> and if you have several, the chance is bigger that one goes.

Fair enough, but I've got mdmonitor to send me an email when that happens.

> It is certainly possible to run debian and spend 5min per week
> on running "apt-get dist-upgrade" which installs anything that
> was upgraded since the last time.

Cool, a cron job should take care of that.

> And if you have many identical servers, then you know that when
> one upgrade went without problems the others will too.

Oh.... Sounds a bit like it's not really as good as you once said it was.

> > And voila, that difficult task of assessing in which order to do
> > things is out of the hands of distros like Red Hat, and into the
> > hands of those people who actually make the binaries.
>
> Not so easy. You do not want to shut down md devices because
> samba is using them.

Definitely not, I want to shut down Samba because it's using a
filesystem, which is using a MD device.

> Someone else may run samba on a single harddisk and also have
> some md-devices that they take down and bring up a lot.

Fair enough.
I guess a dependency system would have to be more complicated, then,
taking into account which particular resources a process depends on,
not just which subsystem.

> So having samba generally depend on md doesn't work.

Right.

> Your setup need it, others may have different needs.
> That's why the initscripts are _scripts_, simple textfiles that
> administrators can manipulate without having to know C programming.

I'm not sure I buy your argumentation. But I do acknowledge that
being able to modify init scripts, without having C knowledge is
definitely a plus, seeing as they're obviously often far from perfect.

> Learn to manipulate the initscripts then.

Yeah, I could sit down and look at how Red Hat did their whole init thing.

OTOH, I don't really have time to dissect 'em and find what problem
they have. I think I'll just switch distro and see if I have better
luck with something else.

> Changing the shutdown order really is as
> simple as renaming samba's script file

I have a feeling that this is not the problem.

After all, this should be something that every distro has gotten
right, it's not really an issue you have to think long about.

I think the problem occurs in the "sending processes the TERM/KILL
signal" phase.
Perhaps because one phase is initiated too early, before various
services such as Samba have shut down.

Anyway, let's all forget about the init scripts forthwith, they're not
really relevant for LKML I think.

Concentrate on the ext3 issue :-).

2006-08-10 14:40:13

[permalink] [raw]

Subject: Re: ext3 corruption

On 8/10/06, Molle Bestefich wrote:
> Helge Hafting wrote:
> > Have you considered debian then? That package system certainly
> > have been able to keep many a system running for years and years.
> > You never reinstall, just upgrade.
>
> Sounds cool, I'll try that.
> Thanks.
>
> > Well, if you expect to keep a computer running for three years
> > without human intervention - good luck to you! Not only will there be
> > new vulnerabilities and attacks via the internet in that time,
>
> It's NATted, so it's basically unreachable unless you know how to get to it.

and the villain you have to fear will know much better than anybody
else how to reach the box, that is for sure.

[snip]

> Fair enough, but I've got mdmonitor to send me an email when that happens.
>
> > It is certainly possible to run debian and spend 5min per week
> > on running "apt-get dist-upgrade" which installs anything that
> > was upgraded since the last time.
>
> Cool, a cron job should take care of that.

http://packages.debian.org/testing/admin/cron-apt

[snip]

2006-08-10 16:10:36

by John Stoffel

[permalink] [raw]

Subject: Re: ext3 corruption

>>>>> "Molle" == Molle Bestefich writes:

Molle> I guess the problem is that I don't know a single Linux
Molle> packaging system that actually works well enough to keep a
Molle> system up to date at all times, and I don't have any free time
Molle> to spend on reinstalling systems all the time.

Debian works the best in my experience. Yum sucks rocks, it's a pain
to configure and when it does pull in stuff, it pulls in *everything*
that you don't want.

Not that apt-get is perfect, but it works well. My main machine at
home is an old Debian Stable install upgraded pretty much daily.
Rarely do things break. And rarely do you want packages updated
without you thinking about it.

If you need to maintain a bunch of systems, all at the same rev, then
you will obviously need a master system to test on and from which to
push updates to the clients. In this case, you'd setup your own
private repository which the clients would pull from.

Molle> I think most of the package managers break because their
Molle> dependency system sucks. Some of them doesn't suck, but they
Molle> break because there's no integrity checks, and package
Molle> maintainers can dump any kind of bizarre corrupt dependencies
Molle> they like into them. That's how Gentoo works, for example.
Molle> Others have even more bizarre ways of breaking, again Gentoo as
Molle> an example requires the user to run a "switch to newer GCC"
Molle> command from time to time, otherwise random packages just start
Molle> breaking.

I tried gentoo a bunch of years ago and didn't like it, and it
certainly didn't give me the speedup it claimed it would have. I've
been happy with Debian. Thinking about Ubuntu more...

Molle> Surely it's possible to create a kernel interface where
Molle> processes can tell the kernel about which other processes
Molle> they'd like to outlive and which ones they'd like to get killed
Molle> before.

This has nothing to do with the kernel, it's all a userspace issue.

Molle> The kernel could then coordinate the killing of processes in a
Molle> "shutdown" function, which the various distro's 'reboot' and
Molle> 'shutdown' scripts could call.

Again, userspace completely. You're asking for policy in the kernel,
where it doesn't belong.

Molle> And voila, that difficult task of assessing in which order to
Molle> do things is out of the hands of distros like Red Hat, and into
Molle> the hands of those people who actually make the binaries.

*bwah hah hah!* And you think they'll get it right? So what happens
when two packages, call them A and B, have a circular dependency on
each other? Who wins then?

It's not as simple an issue as you think.

John

2006-08-10 19:10:32

by Molle Bestefich

[permalink] [raw]

Subject: Re: ext3 corruption

John Stoffel wrote:
> Molle> And voila, that difficult task of assessing in which order to
> Molle> do things is out of the hands of distros like Red Hat, and into
> Molle> the hands of those people who actually make the binaries.
>
> *bwah hah hah!*

No need to ridicule :-).
After all, I'm just saying that there's got to be a simpler, stabler
and more transparent way than to have all this logic sit in shell
scripts.

> So what happens when two packages, call them A and B,
> have a circular dependency on each other? Who wins then?

They both get terminated at *exactly* the same time :-)... Nah, just kidding.

In that case, I imagine either
a) the system will log errors to syslog and pick a random order, or
b) the system will refuse to shutdown, politely returning back a
message to the user space tool that asked for the shutdown, saying
"there's an inconsistency in the ordering rules, please fix that
first". The guy who tapped in "shutdown" would have to kill one of
the processes manually. (And probably also upgrade the affected
software, or file a bug report, or whatever.)

I Googled for a similar software construct, and came upon the SCM in Windows.
Seems you can make Windows drivers and system services depend on each other.
In the case where there exists a circular dependency, the SCM refuses
to even start the affected services.

So there's a third possibility.
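
The "refuse when the rules are inconsistent" option can be sketched in a few lines. This is only an illustration of the idea, not a description of how any init system or the Windows SCM is implemented. It assumes Python 3.9 or later for the standard `graphlib` module, and the service names are placeholders.

```python
from graphlib import TopologicalSorter, CycleError

def stop_order(stopped_before):
    """Compute a shutdown order from ordering rules.

    stopped_before maps each service to the set of services that must be
    stopped before it. Raises ValueError if the rules contain a cycle,
    which corresponds to option (b): refuse and report the inconsistency.
    """
    try:
        return list(TopologicalSorter(stopped_before).static_order())
    except CycleError as err:
        # err.args[1] holds the list of nodes that form the cycle.
        raise ValueError(f"inconsistent ordering rules: {err.args[1]}") from err

# Placeholder names: samba uses a filesystem, which sits on an md device.
rules = {"md": {"fs"}, "fs": {"samba"}}
print(stop_order(rules))

# Packages A and B each insisting on outliving the other.
try:
    stop_order({"A": {"B"}, "B": {"A"}})
except ValueError as err:
    print(err)
```

Picking a random order instead, as in option (a), would mean catching the error and falling back to some arbitrary sequence.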

2006-08-10 21:00:18

by Molle Bestefich

[permalink] [raw]

Subject: Re: ext3 corruption

Duane Griffin wrote:
> > If it doesn't take into account own changes, then the -n command is
> > unable to produce even a slightly accurate resemblence of what would
> > happen if I did a real run.
>
> It takes into account some of them (such as reading data from the
> backup superblock if it detects corruption). Others will be irrelevent
> for further operations. Many reports will be accurate, especially
> fatal ones. I consider that useful, YMMV.

I've attached the output of a -n run, let's get some facts on the table.

I would be very happy if someone knowledgeable would tell me something
useful about it.

I'm especially worried about the "70058 files, 39754 blocks used (0%)"
message at the end of the e2fsck run.

Attachments:

(No filename) (726.00 B)
fs_check_out.bz2 (38.12 kB)

2006-08-11 08:09:36

by Helge Hafting

[permalink] [raw]

Subject: Re: ext3 corruption

Molle Bestefich wrote:
> John Stoffel wrote:
>> Molle> And voila, that difficult task of assessing in which order to
>> Molle> do things is out of the hands of distros like Red Hat, and into
>> Molle> the hands of those people who actually make the binaries.
>>
>> *bwah hah hah!*
>
> No need to ridicule :-).
> After all, I'm just saying that there's got to be a simpler, stabler
> and more transparent way than to have all this logic sit in shell
> scripts.
Shellscripts are both simple and stable. Still, you are not the only
one dissatisfied with the sysv init program.

Check out initng:
http://initng.thinktux.net/index.php/Main_Page
It uses config files instead of scripts. Well, the config files
may contain scripts but they don't have to. Dependencies
are described in a simple way in these files. Another advantage,
services that don't depend on each other start/stop in parallel,
cutting boot time to 1/3 or so. (30s from powerup to X display is easy.)

I use it on one machine. It is kind of "unfinished" in that it
doesn't have config files for every service out there, but it works
and you can make your own files if your service isn't supported yet.

When I looked for a init replacement, I found several other
alternatives too. All trying different approaches, often trying to save
time by starting up things in parallel, but with very different
approaches to dependencies. Not all were good. Some made the mistake
of having the dependent software having to start up all the sw
it depends on. Consider the maintenance nightmare adding and
removing packages from such a system...

Helge Hafting

2006-08-11 13:26:25

by Horst H. von Brand

[permalink] [raw]

Subject: Re: ext3 corruption

John Stoffel wrote:
> >>>>> "Molle" == Molle Bestefich writes:
>
> Molle> I guess the problem is that I don't know a single Linux
> Molle> packaging system that actually works well enough to keep a
> Molle> system up to date at all times, and I don't have any free time
> Molle> to spend on reinstalling systems all the time.
>
> Debian works the best in my experience. Yum sucks rocks, it's a pain
> to configured and when it does pull in stuff, it pulls in *everything*
> that you don't want.

Right. It's called "dependencies", and if you don't get those, chances are
it won't work. No Debian magic around that, I'm afraid. Sure, if all that
ever changes is minor tweaks...

> Not that apt-get is perfect, but it works well. My main machine at
> home is an old Debian Stable install upgraded pretty much daily.
> Rarely do things break. And rarely do you want packages updated
> without you thinking about it.

My machine here runs Fedora rawhide (can't get more bleeding edge). Rarely
things break due to updating. The machines in the Lab here are Fedora,
almost never anything breaks. Servers are CentOS, I can't remember anything
ever breaking due to updates.

[...]

> I tried gentoo a bunch of years ago and didn't like it, and it
> certainly didn't give me the speedup it claimed it would have.

Gentoo folks deceive themselves into "at least twice as fast because I
compiled it myself"...

> I've
> been happy with Debian. Thinking about Ubuntu more...

[...]

> Molle> And voila, that difficult task of assessing in which order to
> Molle> do things is out of the hands of distros like Red Hat, and into
> Molle> the hands of those people who actually make the binaries.
>
> *bwah hah hah!* And you think they'll get it right? So what happens
> when two packages, call them A and B, have a circular dependency on
> each other? Who wins then?

The kernel people are certainly not infallible either. And there are cases
where the right order is A B C, and others in which it is C B A, and still
others where it doesn't matter. No way to get it right always.

> It's not as simple an issue as you think.

Shoving it into the kernel certainly won't simplify it, quite the contrary
by making it less amenable to hand-tweaking.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2006-08-12 08:54:52

by Molle Bestefich

[permalink] [raw]

Subject: Re: ext3 corruption

Horst H. von Brand wrote:
> The kernel people are certainly not infallible either. And there are cases
> where the right order is A B C, and others in which it is C B A, and still
> others where it doesn't matter.

In the quite unlikely situation where that happens, you've obviously
got a piece of software which is broken dependency-wise. Many of the
current schemes will fail to accommodate that too.

For example, no amount of moving the /etc/rc.d/rc6.d/K35smb script
around will fix that situation on Red Hat.

A solution to your example is to fix two of the three broken pieces of
software by splitting B into B1 and B2, and either A or C into their
components likewise:

A1 --> B1 --> C --> B2 --> A2

-or-

C1 --> B1 --> A --> B2 --> C2

> No way to get it right always.

Your example did in no way prove that, so thus far that statement is not true.

> In any case, this is wildly off-topic for a list on /kernel/ development.
> Better locate a Linux User Group near you, look for mailing lists on running
> Linux, trawl Usenet for a group with acceptable signal/noise ratio.

I did mention that:

> > Anyway, let's all forget about the init scripts forthwith, they're
> > not really relevant for LKML I think.

And:

> > Concentrate on the ext3 issue :-).

And my next posting was about ext3 again.

2006-08-12 10:31:35

by Molle Bestefich

[permalink] [raw]

Subject: Re: ext3 corruption

Molle Bestefich wrote:
> A solution to your example is to fix two of the three broken pieces of
> software by splitting B into B1 and B2, and either A or C into their
> components likewise:
>
> A1 --> B1 --> C --> B2 --> A2
>
> -or-
>
> C1 --> B1 --> A --> B2 --> C2

To clarify, by the above I didn't necessarily mean "split one process
into two processes". A kernel API where you can wait for a named
object or a subsystem to complete its startup / shutdown would be
just as well. Or simply waiting on (dis-)appearance of named files in
a dedicated directory named "boot_sequence" in sysfs, would be another
equally fine way to accomplish above scheme.
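
A userspace toy version of "wait for a named object" might look like the following. It is purely a sketch of the idea: no such kernel interface exists in this thread, the `Milestones` class is invented for the example, and threads stand in for separate processes.

```python
import threading

class Milestones:
    """Named one-shot events; a hypothetical stand-in for the proposed API."""

    def __init__(self):
        self._events = {}
        self._lock = threading.Lock()

    def _get(self, name):
        with self._lock:
            return self._events.setdefault(name, threading.Event())

    def reached(self, name):
        """Announce that the named stage has completed."""
        self._get(name).set()

    def wait_for(self, name, timeout=None):
        """Block until the named stage has been announced."""
        return self._get(name).wait(timeout)

def run_demo():
    milestones = Milestones()
    log = []

    def stage(name, after):
        if after is not None:
            milestones.wait_for(after)
        log.append(name)          # do the stage's work
        milestones.reached(name)  # then let the next stage proceed

    # The A1 --> B1 --> C --> B2 --> A2 chain from the previous message.
    chain = [("A1", None), ("B1", "A1"), ("C", "B1"), ("B2", "C"), ("A2", "B2")]
    # Start the threads in reverse to show the order comes from the waits.
    threads = [threading.Thread(target=stage, args=s) for s in reversed(chain)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log

print(run_demo())
```

A and B are each a single program here; they simply announce two milestones each instead of being split into two processes.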

2006-08-12 16:39:00

by Theodore Ts'o

[permalink] [raw]

Subject: Re: ext3 corruption

On Thu, Aug 10, 2006 at 11:00:07PM +0200, Molle Bestefich wrote:
> Duane Griffin wrote:
> >> If it doesn't take into account own changes, then the -n command is
> >> unable to produce even a slightly accurate resemblence of what would
> >> happen if I did a real run.
> >
> >It takes into account some of them (such as reading data from the
> >backup superblock if it detects corruption). Others will be irrelevent
> >for further operations. Many reports will be accurate, especially
> >fatal ones. I consider that useful, YMMV.
>
> I've attached the output of a -n run, let's get some facts on the table.
>
> I would be very happy if someone knowledgeable would tell me something
> useful about it.
>
> I'm especially worried about the "70058 files, 39754 blocks used (0%)"
> message at the end of the e2fsck run.

OK, so it looks like the primary block group descriptors got trashed,
and so e2fsck had to fall back to using the backup block group
descriptors. The summary information in the backup block group
descriptors is not backed up, for speed/performance reasons. This is
not a problem, since that information can always be regenerated
trivially from the pass 5 information. That's what all of "free
inodes/blocks/directories count wrong" messages in your log were all
about.

The 39754 block used (0%) is just because you were using -n and the
summary information is calculated from the filesystem summary data,
not from the pass5 count information (which was thrown away since you
were running -n and thus declined to fix the results).
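
To make the distinction concrete, here is a toy model of the two sources of numbers: free counts stored in the group descriptors versus counts recomputed from the block bitmaps. It is not e2fsprogs code; the data layout, function names, and tiny group size are all invented for illustration.

```python
# Toy model only: each block group has a bitmap (True = block in use) and a
# free-block count stored in its descriptor. The stored counts can be stale;
# the bitmaps allow them to be recomputed.

def recount_free_blocks(bitmaps):
    """Recompute the per-group free counts from the bitmaps."""
    return [sum(1 for used in bitmap if not used) for bitmap in bitmaps]

def summary_used(free_counts, blocks_per_group):
    """Blocks used, as a summary line derived from free counts would say."""
    total = blocks_per_group * len(free_counts)
    return total - sum(free_counts)

bitmaps = [
    [True, True, False, False],    # group 0: 2 blocks in use
    [True, False, False, False],   # group 1: 1 block in use
]
stale_free = [4, 4]                          # stored counts that were never updated
fresh_free = recount_free_blocks(bitmaps)    # what a recount finds

print(summary_used(stale_free, 4))   # summary based on the stale counts
print(summary_used(fresh_free, 4))   # summary based on the recount
```

In a read-only run the recount is computed but never written back, so a summary based on the stored counts stays wrong.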

I can imagine accepting a patch which sets a flag if any discrepancies
found in pass 5 are not fixed, and then if the summary information is
requested, to print a warning message indicating that the summary
information may not be correct. But no, it's not worth it to take
into account changes that -n might make if the user had said "yes".
The complexities that would entail would be huge, and in fact as it is
e2fsck -n does give a fairly accurate report of what is wrong
with the filesystem. Is it 100% accurate? No, but that was never the
goal of e2fsck -n. If you want that, then use a dm-snapshot, and run
e2fsck on the snapshot....

- Ted

2006-08-12 17:24:09

by Molle Bestefich

[permalink] [raw]

Subject: Re: ext3 corruption

Theodore Tso wrote:
> > I'm especially worried about the "70058 files, 39754 blocks used (0%)"
> > message at the end of the e2fsck run.
>
> OK, so it looks like the primary block group descriptors got trashed,
> and so e2fsck had to fall back to using the backup block group
> descriptors.

Good to have backups. It would be very useful to know whether e2fsck
contemplates writing those back as primary BGDs when it's done, but I
couldn't find that in the documentation. Will it?

(Would be good to have the above information in the docs. Perhaps in
a "what does this message mean?" section.)

(Such a section would also help a lot when confronted with the first
message: "Entry blah is a link to directory bluh. Clear? y/n".
Obviously I don't want to "clear" my data. But why is e2fsck
confronting me with that question? Is something wrong with it that it
should be cleared?)

> The summary information in the backup block group
> descriptors is not backed up, for speed/performance reasons. This is
> not a problem, since that information can always be regenerated
> trivially from the pass 5 information.

Thanks for the information!
(Would be very helpful to have a copy/paste of the above in the docs too...)

> That's what all of "free inodes/blocks/directories count wrong"
> messages in your log were all about.

Ah, I'm relieved. The sheer number of messages was an indicator to me
that e2fsck is doing something gruesomely wrong.

> The 39754 block used (0%) is just because you were using -n and the
> summary information is calculated from the filesystem summary data,
> not from the pass5 count information (which was thrown away since you
> were running -n and thus declined to fix the results).

Much relieved again, thanks.

I'm wondering why it even tries to use the corrupt information, instead of just:
* reconstructing it from scratch
* not asking the user?

That leaves me a little less relieved once again ;-).

> I can imagine accepting a patch which sets a flag if any discrepancies
> found in pass 5 are not fixed, and then if the summary information is
> requested,

Huh? The user didn't request anything, it always prints.

> to print a warning message indicating that the summary
> information may not be correct.

Even not printing anything would probably be better than knowingly
printing wrong information...

> But no, it's not worth it to take
> into account changes that -n might make if the user had said "yes".
> The complexities that would entail would be huge, and in fact as it is
> e2fsck -n does give a fairly accurate report of what what is wrong
> with the filesystem. Is it 100% accurate? No, but that was never the
> goal of e2fsck -n. If you want that, then use a dm-snapshot, and run
> e2fsck on the snapshot....

Agreed. Running a r/w e2fsck on some kind of overlay would be the way
to implement a more useful (for me anyway) version of -n.

But I think dm-snapshot is useless in this case because:
* It must be configured before MD is configured and the filesystem is
created, which I haven't done on this box.

And generally because:
* It's rather good at corrupting the filesystems you store on top of
it *itself*...
* Either you have to create snapshots on devices just as big as the
ones being snapshotted, or you'll have to live with the snapshot
failing any time because it's full. There's no good management
framework to help you manage the full/failing situations either.

Thanks a lot for the information!

I take it that it's safe to run e2fsck on the filesystem, then...

2006-08-12 21:47:32

by Theodore Ts'o

[permalink] [raw]

Subject: Re: ext3 corruption

On Sat, Aug 12, 2006 at 07:24:06PM +0200, Molle Bestefich wrote:
>
> Good to have backups. It would be very useful to know whether e2fsck
> contemplates writing those back as primary BGDs when it's done, but I
> couldn't find that in the documentation. Will it?

Yes, it will.

> (Would be good to have the above information in the docs. Perhaps in
> a "what does this message mean?" section.)

Well, if someone would like to volunteer to be a technical writer,
that would be great......

> (Such a section would also help a lot when confronted with the first
> message: "Entry blah is a link to directory bluh. Clear? y/n".
> Obviously I don't want to "clear" my data. But why is e2fsck
> confronting me with that question? Is something wrong with it that it
> should be cleared?)

Basically, there are two modes that e2fsck can run in. What the boot
scripts use is called "preen" mode, which will automatically fix
"safe" things, and stop if there is anything where the system
administrator might need to exercise discretion, or where the
system administrator should know that there might be something that he
or she needs to clean up (like orphaned inodes getting linked into the
lost+found directory, for example).

In the normal mode, e2fsck asks permission before it does anything.
In general, the default answer is "safe", but there are times when a
filesystem expert can do better by declining to fix a problem and then
using debugfs afterwards to try to recover data before running e2fsck
a second time to completely clear out all of the problems.

If you don't like that, you can always run with e2fsck -y, which will
cause e2fsck to never ask permission before going ahead and trying
its best to fix the problem.
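
For reference, the modes discussed in this thread map onto e2fsck options roughly as in the sketch below. It only builds a command line and decodes an exit status; it does not run anything. The options (-p, -n, -y, -v) and the exit-status bits are the ones documented in the e2fsck man page. The helper names and the device path are placeholders for this example.

```python
# Sketch: build an e2fsck command line for the modes discussed above and
# decode its exit status. Helper names and the device path are placeholders.

E2FSCK_MODES = {
    "preen": "-p",   # automatically fix what is safe to fix, stop otherwise
    "report": "-n",  # open read-only, answer "no" to every question
    "yes": "-y",     # answer "yes" to every question
}

# The exit status is the sum of these conditions (see the man page).
EXIT_BITS = {
    1: "file system errors corrected",
    2: "file system errors corrected, system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "e2fsck canceled by user request",
    128: "shared library error",
}

def e2fsck_argv(device, mode=None, verbose=False):
    """Return the argument list; mode=None means the interactive default."""
    argv = ["e2fsck"]
    if mode is not None:
        argv.append(E2FSCK_MODES[mode])
    if verbose:
        argv.append("-v")
    argv.append(device)
    return argv

def describe_exit(status):
    """Translate an e2fsck exit status into human-readable conditions."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in EXIT_BITS.items() if status & bit]

print(e2fsck_argv("/dev/loop0", "report", verbose=True))
print(describe_exit(4))
```

The resulting list could be handed to `subprocess.run`, and the `returncode` of the result passed to `describe_exit`.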

> >The summary information in the backup block group
> >descriptors is not backed up, for speed/performance reasons. This is
> >not a problem, since that information can always be regenerated
> >trivially from the pass 5 information.
>
> Thanks for the information!
> (Would be very helpful to have a copy/paste of the above in the docs too...)

Well, the e2fsck man page isn't intended to be a tutorial. If someone
wants to volunteer to write an extended introduction to how e2fsck
works and what all of the messages mean, I'd certainly be willing to
work with that person... So if you're willing to volunteer or willing
to chip in to pay for a technical writer, let me know....

> I'm wondering why it even tries to use the corrupt information, instead of
> just:
> * reconstructing it from scratch
> * not asking the user?

It did reconstruct it from scratch; that's what pass 5 is all about.
It just didn't store it in the block group descriptors, because of the
-n option.

> >I can imagine accepting a patch which sets a flag if any discrepancies
> >found in pass 5 are not fixed, and then if the summary information is
> >requested,
>
> Huh? The user didn't request anything, it always prints.

The summary information is only printed when the -v option is given,
and that's about all the -v option does. The summary information is
not the primary raison d'etre for e2fsck, so I'm not going to waste a
lot of time trying to keep two copies of the information so that the
information can be correct in the -nv case. That's just soooooo
unimportant, and most users don't use the -v option anyway.

> >with the filesystem. Is it 100% accurate? No, but that was never the
> >goal of e2fsck -n. If you want that, then use a dm-snapshot, and run
> >e2fsck on the snapshot....
>
> Agreed. Running a r/w e2fsck on some kind of overlay would be the way
> to implement a more useful (for me anyway) version of -n.
>
> But I think dm-snapshot is useless in this case because....

Well, I have the following project listed in the TODO file for
e2fsprogs:

4) Create a new I/O manager (i.e., test_io.c, unix_io.c, et.al.) which
layers on top of an existing I/O manager which provides copy-on-write
functionality. This COW I/O manager will take two open I/O
managers, call them "base" and "changed". The "base" I/O manager is
opened read/only, so any changes are written instead to the "changed"
I/O manager, in a compact, non-sparse format containing the intended
modification to the "base" filesystem.

This will allow resize2fs to figure out what changes need to be made to
extend a filesystem, or expand the size of inodes in the inode table,
and the changes can be pushed to the filesystem in one fell swoop. (If
the system crashes, the program which runs the "changed" file can be
re-run, much like a journal replay. My assumption is that the COW
file will contain the filesystem UUID in the COW superblock, and the
COW file will be stored in some place such as /var/state/e2fsprogs,
with an init.d file to automate the replay so we can recover cleanly
from a crash during the resize2fs process.)

Difficulty: Medium Priority: Medium

Patches to implement this would be gratefully accepted....

(This is open source, which means that people who have the bad manners
to kvetch that the volunteers who have done all of this free work for
them haven't done $FOO will be gently reminded that patches to implement
$FOO are always welcome. :-)

- Ted
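
The copy-on-write I/O manager described in the TODO item above can be
illustrated with a toy model. This is only a sketch in Python, not
e2fsprogs code: the class and method names are hypothetical, the "base"
image is an in-memory bytearray, and the "changed" store is a plain
dict rather than the compact on-disk format the TODO item asks for.

```python
class CowBlockStore:
    """Toy copy-on-write layer over a block image (hypothetical names)."""

    def __init__(self, base: bytearray, block_size: int = 4096):
        if len(base) % block_size:
            raise ValueError("base image must be a whole number of blocks")
        self.base = base            # treated as read-only until replay()
        self.block_size = block_size
        self.changed = {}           # block number -> pending block contents

    def _check(self, n: int) -> int:
        # Translate a block number into a byte offset, rejecting bad input.
        if not 0 <= n < len(self.base) // self.block_size:
            raise IndexError("block number out of range")
        return n * self.block_size

    def read_block(self, n: int) -> bytes:
        start = self._check(n)
        if n in self.changed:       # pending changes shadow the base image
            return self.changed[n]
        return bytes(self.base[start:start + self.block_size])

    def write_block(self, n: int, data: bytes) -> None:
        self._check(n)
        if len(data) != self.block_size:
            raise ValueError("writes must be exactly one block")
        self.changed[n] = bytes(data)   # the base image is not touched

    def replay(self) -> None:
        # Push every pending change into the base image. Each step
        # overwrites a whole block, so running the loop again after an
        # interruption produces the same result.
        for n, data in sorted(self.changed.items()):
            start = self._check(n)
            self.base[start:start + self.block_size] = data
        self.changed.clear()
```

A tool working through such a layer sees its own writes immediately
via read_block(), while the base image stays untouched until replay()
is called. Simply discarding the "changed" store instead would give the
effect of a dry run.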

2006-08-13 19:21:26

by Molle Bestefich

Subject: Re: ext3 corruption

Theodore Tso wrote:
> Well, the e2fsck man page isn't intended to be a tutorial. If someone
> wants to volunteer to write an extended introduction to how e2fsck
> works and what all of the messages mean, I'd certainly be willing to
> work with that person... So if you're willing to volunteer

I'd love to help others with the same problem that I have. I know
basically nothing of e2fsck though, and I don't have time to research
and write a whole tutorial. Maybe there's a wiki somewhere where I
can start something out with a structure and some information
regarding the stuff I've seen?

> or willing to chip in to pay for a technical writer, let me know....

What kind of economic scale were you thinking about?

> (This is open source, which means that people who have the bad manners
> to kvetch that the volunteers who have done all of this free work for
> them haven't done $FOO will be gently reminded that patches to implement
> $FOO are always welcome. :-)

OTOH, the open source community rigorously PR Linux as an alternative
to Windows.

While the above attitude is fine by me, you're going to have to expect
to see some sad faces from Windows users when they create a filesystem
on a loop device and don't realize that the loop driver destroys
journaling expectancies and results in all their photos and home
videos going down the drain, all because nobody implemented a simple
"warning!" message in the software.

(Or whatever. Lots of similar examples exist to show you that the "no
warranty: you use our software, you learn to hack it to do what you
want yourself or it's your own fault" argument is fallacious.)

2006-08-14 03:24:04

by Kyle Moffett

Subject: Re: ext3 corruption

On Aug 13, 2006, at 15:21:24, Molle Bestefich wrote:
> Theodore Tso wrote:
>> (This is open source, which means that people who have the bad
>> manners to kvetch that the volunteers who have done all of this free
>> work for them haven't done $FOO will be gently reminded that patches
>> to implement $FOO are always welcome. :-)
>
> OTOH, the open source community rigorously PR Linux as an
> alternative to Windows.

Some people do; some people believe it's still not ready (for the
desktop environment where Windows currently has majority
marketshare). I run a fileserver for my parents and wouldn't use
anything other than Linux/OpenLDAP/Samba/device-mapper/mdadm on fully
open-spec hardware, but I wouldn't expect them to do anything other
than call me when it breaks and maybe follow a few specific
instructions for getting it network-accessible again via server-
management chip. This is all really easy for _me_ to manage with
Linux on good server hardware, but that's not something I'd think a
non-admin could handle on their own. And for 3D graphics, GUI
programs, etc, IMHO Linux is still miles from being where it needs to
be to really compete.

> While the above attitude is fine by me, you're going to have to
> expect to see some sad faces from Windows users when they create a
> filesystem on a loop device and don't realize that the loop driver
> destroys journaling expectancies and results in all their photos
> and home videos going down the drain, all because nobody
> implemented a simple "warning!" message in the software.

This is really what distros are expected to do (at least in the
current environment). The major development groups don't have the
financial and legal backing to be able to certify reliability and
support for *any* user, let alone your average Joe User who's used to
Windows and *clicky*-*clicky*-ing his way around the UI. Eventually
there will be enough vendors selling Linux-based systems that the UI-
polish patches will be developed as rapidly as the fundamental
underlying infrastructure, but we're not there yet. Ubuntu and such
are paving the way for future even-better-than-mac vertical UI
integration but we have a lot of UI infrastructure (especially 3d
support in X) that needs fixing first. IMHO Linux is still very much
for hobbyists, server administrators, and other people who have at
least a modicum of computer problem-solving skills.

> (Or whatever. Lots of similar examples exist to show you that the
> "no warranty: you use our software, you learn to hack it to do what
> you want yourself or it's your own fault" argument is fallacious.)

That kind of warranty is a hobbyist-type warranty. Some companies
invest money to build upon that and provide server-admin-type or end-
user-type warranties, but such support costs money and time which
most upstream developers don't have.

Cheers,
Kyle Moffett

2006-08-14 15:35:18

by Theodore Ts'o

Subject: Re: ext3 corruption

On Sun, Aug 13, 2006 at 09:21:24PM +0200, Molle Bestefich wrote:
> I'd love to help others with the same problem that I have. I know
> basically nothing of e2fsck though, and I don't have time to research
> and write a whole tutorial. Maybe there's a wiki somewhere where I
> can start something out with a structure and some information
> regarding the stuff I've seen?

There isn't yet, but kernel.org is supposed to be setting up a wiki soon.
So hopefully we can get a wiki started there.

> >or willing to chip in to pay for a technical writer, let me know....
>
> What kind of economic scale were you thinking about?

Well, the Technical Advisory Board of the OSDL (James Bottomley is the
chair, other folks include Greg K-H, Randy Dunlap, and others
including myself) is trying to fund a technical writer, mostly for
kernel documentation, as well as other kernel projects. OSDL is
mainly set up to solicit monies from companies, but it might be
possible to get something set up so we can accept donations from
individuals.

> >(This is open source, which means that people who have the bad manners
> >to kvetch that the volunteers who have done all of this free work for
> >them haven't done $FOO will be gently reminded that patches to implement
> >$FOO are always welcome. :-)
>
> OTOH, the open source community rigorously PR Linux as an alternative
> to Windows.

Some people do, but not all. It also depends on what usage model you
are looking at. If it's kiosks or fixed-function Windows facilities
(i.e., used by travel agents, receptionists, cash registers), then
Linux would certainly be ready now, and it's probably easier to use
Linux than Windows for those scenarios. But for the "knowledge
worker" who is a power Windows user who regularly exchanges Microsoft
Office files with others and who needs 100% Office compatibility? Not
hardly!

> While the above attitude is fine by me, you're going to have to expect
> to see some sad faces from Windows users when they create a filesystem
> on a loop device and don't realize that the loop driver destroys
> journaling expectancies and results in all their photos and home
> videos going down the drain, all because nobody implemented a simple
> "warning!" message in the software.

To be fair, there are plenty of other dangerous things that you can do
with Windows that don't have warning messages pop up. And using the
loop driver is of a complexity which is higher than what you would
expect of a typical Windows user. You might as well complain that
Linux doesn't give a warning message when you run some command like
"rm -rf /", or "dd if=/dev/zero of=/dev/hda". I'm sure there are
similar commands (probably involving regedit :-) that are just as
dangerous from the Windows cmd.exe window.....

- Ted

2006-08-14 17:21:20

by Molle Bestefich

Subject: Re: ext3 corruption

Theodore Tso wrote:
> To be fair, there are plenty of other dangerous things that you can do
> with Windows that don't have warning messages pop up. And using the
> loop driver is of a complexity which is higher than what you would
> expect of a typical Windows user. You might as well complain that
> Linux doesn't give a warning message when you run some command like
> "rm -rf /", or "dd if=/dev/zero of=/dev/hda". I'm sure there are
> similar commands (probably involving regedit :-) that are just as
> dangerous from the Windows cmd.exe window.....

Hardly comparable..

"rm", "dd if=/dev/zero", "format c:" are meant to nuke your harddrive.
The loop driver just does it as a nasty side effect of a stinky
implementation.

2006-08-17 01:27:49

by Horst H. von Brand

Subject: Re: ext3 corruption

Molle Bestefich wrote:
> Horst H. von Brand wrote:

[...]

> > The kernel people are certainly not infallible either. And there are cases
> > where the right order is A B C, and others in which it is C B A, and still
> > others where it doesn't matter.

> In the quite unlikely situation where that happens, you've obviously
> got a piece of software which is broken dependency-wise. Many of the
> current schemes will fail to accommodate that too.

It isn't broken /software/, it is /different setups/.

> For example, no amount of moving the /etc/rc.d/rc6.d/K35smb script
> around will fix that situation on Red Hat.

What situation?
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2006-08-17 13:46:58

by Molle Bestefich

Subject: Re: ext3 corruption

Horst H. von Brand wrote:
> > > The kernel people are certainly not infallible either. And there are cases
> > > where the right order is A B C, and others in which it is C B A, and still
> > > others where it doesn't matter.
>
> > In the quite unlikely situation where that happens, you've obviously
> > got a piece of software which is broken dependency-wise. Many of the
> > current schemes will fail to accommodate that too.
>
> It isn't broken /software/, it is /different setups/.

It's broken software.

> > For example, no amount of moving the /etc/rc.d/rc6.d/K35smb script
> > around will fix that situation on Red Hat.
>
> What situation?

The situation you outlined, where A can depend on B, which can depend
on C, but in another usage scenario C can depend on B which can depend
on A.
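
The A/B/C disagreement above can be made concrete with a small sketch.
It assumes Python 3.9 or later for the standard graphlib module; A, B
and C are the placeholder names from the thread, and the helper names
are hypothetical. Each setup on its own has a valid start order, but a
single dependency declaration that tries to cover both setups at once
contains a cycle.

```python
from graphlib import CycleError, TopologicalSorter


def start_order(depends_on):
    """Return a start order in which every dependency comes first."""
    return list(TopologicalSorter(depends_on).static_order())


def merge(*graphs):
    """Combine several dependency declarations into one."""
    merged = {}
    for graph in graphs:
        for node, deps in graph.items():
            merged.setdefault(node, set()).update(deps)
    return merged


# Setup 1: A depends on B, which depends on C.
setup_1 = {"A": {"B"}, "B": {"C"}}
# Setup 2: C depends on B, which depends on A.
setup_2 = {"C": {"B"}, "B": {"A"}}

print(start_order(setup_1))
print(start_order(setup_2))

try:
    start_order(merge(setup_1, setup_2))
except CycleError:
    print("no single order satisfies both setups")
```

Shutdown order is the reverse of the start order. Whether one calls
this broken software or different setups, the sketch only shows that a
fixed dependency list cannot describe both cases; the ordering has to
come from the configuration of the particular machine.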

2006-09-24 08:56:55

by Molle Bestefich

Subject: Re: ext3 corruption

I wrote:
> I have a ~1TB filesystem that failed to mount today, the message is:
>
> EXT3-fs error (device loop0): ext3_check_descriptors: Block bitmap for
> group 2338 not in group (block 1607003381)!
> EXT3-fs: group descriptors corrupted !
>
> Yesterday it worked flawlessly.

Helge Hafting wrote:
> > And voila, that difficult task of assessing in which order to do
> > things is out of the hands of distros like Red Hat, and into the
> > hands of those people who actually make the binaries.
>
> Not so easy. You do not want to shut down md devices because
> samba is using them. Someone else may run samba on a single
> harddisk and also have some md-devices that they take down
> and bring up a lot. So having samba generally depend on md doesn't
> work. Your setup needs it, others may have different needs.

I've looked hard at things and just found that maybe it's not the init
order that's to blame..

It seems that unmounting the filesystem fails with a "device busy" error.
I'm not sure why there are still open files on the device, but perhaps a
remote user is copying a file or some such (likely).

Anyway, the system is shutting down, so it should just forcefully
unmount the device, but it doesn't.
The halt script tries "umount" three times, which all fail with:
"device is busy".
It then actually tries "umount -f" three times, which all fail with
"Device or resource busy"
At which point the halt script turns off the machine and the
filesystem is ruined.

How to fix forceful unmount so it works?

2006-09-25 12:30:51

by Helge Hafting

Subject: Re: ext3 corruption

Molle Bestefich wrote:
> I wrote:
>> I have a ~1TB filesystem that failed to mount today, the message is:
>>
>> EXT3-fs error (device loop0): ext3_check_descriptors: Block bitmap for
>> group 2338 not in group (block 1607003381)!
>> EXT3-fs: group descriptors corrupted !
>>
>> Yesterday it worked flawlessly.
>
> Helge Hafting wrote:
>> > And voila, that difficult task of assessing in which order to do
>> > things is out of the hands of distros like Red Hat, and into the
>> > hands of those people who actually make the binaries.
>>
>> Not so easy. You do not want to shut down md devices because
>> samba is using them. Someone else may run samba on a single
>> harddisk and also have some md-devices that they take down
>> and bring up a lot. So having samba generally depend on md doesn't
>> work. Your setup needs it, others may have different needs.
>
> I've looked hard at things and just found that maybe it's not the init
> order that's to blame..
>
> It seems that unmounting the filesystem fails with a "device busy" error.
> I'm not sure why there are still open files on the device, but perhaps a
> remote user is copying a file or some such (likely).

That is solvable by shutting down remote operations first.
So stop samba (or nfs or whatever) before attempting to umount.

> Anyway, the system is shutting down, so it should just forcefully
> unmount the device, but it doesn't.
> The halt script tries "umount" three times, which all fail with:
> "device is busy".
> It then actually tries "umount -f" three times, which all fail with
> "Device or resource busy"
> At which point the halt script turns off the machine and the
> filesystem is ruined.
>
> How to fix forceful unmount so it works?

I don't know, other than researching which filesystems support
forced umount and using one of those. Complain to the vendor or maintainer
of your particular filesystem.

However, you can usually find out why some file is open. Try
umount yourself; when it doesn't work, use "lsof" to see
what file is open. Then figure out who or what is keeping it open.
To debug a shutdown problem, consider putting "lsof >> logfile"
in your shutdown script.

Not a solution but a workaround: Run "sync" before shutdown.
(Stick it in some script.)
Now, all data in filesystem caches will be written to disk before power
is lost.
This isn't perfect, but filesystem damage is greatly reduced and
often avoided completely. Useful while waiting for a better solution.

The real solution is to set things up so unforced umount works.
This is normally possible to do.

Helge Hafting
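
The two suggestions above (log what is holding files open, then sync)
can be sketched in Python for systems where lsof is not at hand. This
is a rough illustration, not a replacement for lsof: it assumes a
Linux-style /proc, it only looks at open file descriptors (not current
directories or memory mappings), it needs root to see other users'
processes, and the function names are hypothetical.

```python
import os


def open_files_under(mount_point, proc_root="/proc"):
    """Return sorted (pid, path) pairs for open files under mount_point."""
    mount = os.path.abspath(mount_point)
    prefix = os.path.join(mount, "")    # trailing separator avoids
                                        # matching /mnt/database for /mnt/data
    hits = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():         # only numeric entries are processes
            continue
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue                    # process exited, or no permission
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue
            if target == mount or target.startswith(prefix):
                hits.append((int(entry), target))
    return sorted(hits)


def log_open_files_and_sync(mount_point, logfile, proc_root="/proc"):
    """Append the current holders to logfile, then flush caches to disk."""
    with open(logfile, "a") as log:
        for pid, path in open_files_under(mount_point, proc_root):
            log.write(f"{pid} {path}\n")
    os.sync()                           # same effect as running "sync"
```

Calling log_open_files_and_sync() from the shutdown script just before
the umount attempts would leave a record of which processes were still
holding files open, provided the log file lives on a filesystem that is
still writable at that point.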

2006-10-02 02:40:59

by Molle Bestefich

Subject: Re: ext3 corruption

Helge Hafting wrote:
> [snip]

Well, that was unproductive :-).

If anyone knows how to make forced unmounting work, hints would be
greatly appreciated.

To reiterate:
The distro halt script tries "umount -f" three times, which all fail with
"Device or resource busy".

2006-10-02 03:25:19

by Gene Heskett

Subject: Re: ext3 corruption

On Sunday 01 October 2006 22:40, Molle Bestefich wrote:
>Helge Hafting wrote:
>> [snip]
>
>Well, that was unproductive :-).
>
>If anyone knows how to make forced unmounting work, hints would be
>greatly appreciated.
>
>To reiterate:
>The distro halt script tries "umount -f" three times, which all fail with
>"Device or resource busy".

Me too.
I'm getting those messages from the NFS stuff at shutdown time, with NO NFS
shares active. I have had them for years. But the reboot goes on
eventually, and apparently without harm.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2006 by Maurice Eugene Heskett, all rights reserved.

2006-10-02 06:50:15

by Kyle Moffett

Subject: Re: ext3 corruption

On Oct 01, 2006, at 23:24:48, Gene Heskett wrote:
> On Sunday 01 October 2006 22:40, Molle Bestefich wrote:
>> To reiterate:
>> The distro halt script tries "umount -f" three times, which all
>> fail with
>> "Device or resource busy".
>
> Me too.
> I'm getting those messages from the NFS stuff at shutdown time,
> with NO NFS
> shares active. I have had them for years. But the reboot goes on
> eventually, and apparently without harm.

What causes problems on _all_ of my softraid boxes is that without a
whole bunch of pivot_root magic in the shutdown code to switch to a
tmpfs and unmount my lvm-on-md-on-sata stuff, it's impossible to get
the kernel to stop devices cleanly. I get all sorts of messages from
the kernel about trying to stop MD devices and not being able to
_after_ reboot is called, even though at that point it should just
forcibly kill all userspace, unmount all filesystems, and deconstruct
the MD/DM device tree. I see no reason why a successful shutdown or
reboot call should _ever_ leave the disks in an inconsistent state.

Cheers,
Kyle Moffett