2001-03-06 05:23:12

by Frédéric L. W. Meunier

Subject: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened ? (No such file or directory))

Hi. After a reboot I had to manually run fsck (sulogin from
sysinit script) since there were failures.

In my second (and problematic) boot with 2.4.2 I used the
option mount --bind in my sysinit script to mount the old /dev
in /dev-old before devfs was mounted, so I could get rid of all
entries that were still there (I removed most before building a
Kernel with devfs support).

For some reason I couldn't remove /dev-old/hdd2. It reported
can't stat file. Note that I never used /dev/hdd*, since I
only use hda and hdc, but am sure it was OK with 2.4.0 (mc
reported an error when I accessed /dev-old, which never happened
before), the last time I used a Kernel without devfs support.

If you read my old thread, you should notice various
applications couldn't access (or rename ?) files. It happened
after ~8h of idle time. It was OK at 5:58, when I last ran cvs
and killed pppd, but failed at ~14:30, when multilog (from
daemontools) had to do something to a full dnscache log file (I
was online).

I'm not sure 2.4.2 is the culprit. I just hope it's the last
time. There were no errors when I first booted with this Kernel
(I was using 2.4.1), and my first uptime was ~6 days (~23 with
2.4.1). Also there were no errors when I booted 2.4.2 for the
second time.

BTW, /lost+found contains hdd2:

brw-r----- 1 root disk 22, 66 May 8 1995 #518878

The other partitions (/home/ftp/pub and /usr/local/src) have no
problems.

--
0@pervalidus.{net, {dyndns.}org} Tel: 55-21-717-2399 (Niterói-RJ BR)


2001-03-06 06:21:52

by Ben Greear

Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened ? (No such file or directory))

For what it's worth, I was able to completely screw up my root FS
using redhat's Fisher beta kernel (2.2.18 + stuff). I did this by
running a bad hdparm command while running a full GNOME desktop:
(This was not a good idea...and I know, and knew that...but....)

hdparm -X34 -d1 -u1 /dev/hda
(As found here: http://www.oreillynet.com/pub/a/linux/2000/06/29/hdparm.html?page=2 )


HD is a 40GB 7200 RPM Western Digital drive. (ATA-100 I believe)
that I just got from Fry's a few days ago...

fsck was sort of able to recover most of the file system after
booting off of the CD in rescue mode and running it on /dev/hda, but
many files were not what they said they were, i.e. /sbin/ifup was
some other binary... Some files turned into directories, it
seems....

Sorry for the lame bug report, but I'm scared to try it again, and
I didn't realize the complexity of the problem when I simply powered
down my machine with the HD light on solid...

Thanks,
Ben

--
Ben Greear ([email protected]) http://www.candelatech.com
Author of ScryMUD: scry.wanfear.com 4444 (Released under GPL)
http://scry.wanfear.com http://scry.wanfear.com/~greear

2001-03-06 06:39:15

by Frédéric L. W. Meunier

Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened ? (No such file or directory))

Maybe I should give details about my hardware. The system was
installed 5 months ago, and this is the first problem.

I used 2.2.16 stock Kernel from Slackware 7.1
2.2.17
2.2.18
2.4.0
2.4.1

And the only problem was with 2.4.2.

FYI, I'm not using hdparm or changing the BIOS to use UDMA 66.
It'd fail with 2.4.1 and 2.4.2 (CRC errors), so the setting
is AUTO and it's using UDMA 33.

And please note that this machine is fine, but I actually only
open 2 consoles and run GNU screen. No XFree86, and only a few
applications running.

If needed, my /var/log/dmesg is at
http://members.nbci.com/pervalidus/dmesg-2.4.2.txt

--
0@pervalidus.{net, {dyndns.}org} Tel: 55-21-717-2399 (Niterói-RJ BR)

2001-03-06 10:21:31

by SteveC

Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened ? (No such file or directory))

On Mon, 5 Mar 2001, Frédéric L. W. Meunier wrote:
> Hi. After a reboot I had to manually run fsck (sulogin from
> sysinit script) since there were failures.

's what I had, also after something like 8 hours idle.

lost+found looks a bit bigger with 43 files... no problems just using
2.2.18.

have fun,

pub 1024D/A9D75E73 2000-05-30 Stephen Coast (SteveC) <[email protected]>
[expires:2001-05-30] http://www.fractalus.com/steve/ <[email protected]>

2001-03-06 12:05:21

by Alan

Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened ? (No

> running a bad hdparm command while running a full GNOME desktop:
> (This was not a good idea...and I know, and knew that...but....)
>
> hdparm -X34 -d1 -u1 /dev/hda
> (As found here: http://www.oreillynet.com/pub/a/linux/2000/06/29/hdparm.html?page=2
>
> Sorry for the lame bug report, but I'm scared to try it again, and
> I didn't realize the complexity of the problem when I simply powered
> down my machine with the HD light on solid...

It's not a bug. As the system administrator you reconfigured a hard disk on
the fly and shit happened. The hdparm man page warnings do exist for a reason.

2001-03-07 03:26:32

by Ben Greear

Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened ? (No

Alan Cox wrote:
>
> > running a bad hdparm command while running a full GNOME desktop:
> > (This was not a good idea...and I know, and knew that...but....)
> >
> > hdparm -X34 -d1 -u1 /dev/hda
> > (As found here: http://www.oreillynet.com/pub/a/linux/2000/06/29/hdparm.html?page=2
> >
> > Sorry for the lame bug report, but I'm scared to try it again, and
> > I didn't realize the complexity of the problem when I simply powered
> > down my machine with the HD light on solid...
>
> It's not a bug. As the system administrator you reconfigured a hard disk on
> the fly and shit happened. The hdparm man page warnings do exist for a reason.

I'm not arguing it was a smart thing to do, but I would think that the
fs/kernel/driver writers could keep really nasty and unexpected things
from happening. For instance, the driver could disallow any new (non-hdparm)
writes while hdparm is doing its test. Or maybe the driver could realize it
was being told to do something that would break and just not do it?

Considering this disk is my root disk, is there *any* safe way to test
out hdparm on this disk?

Enjoy,
Ben

--
Ben Greear ([email protected]) http://www.candelatech.com
Author of ScryMUD: scry.wanfear.com 4444 (Released under GPL)
http://scry.wanfear.com http://scry.wanfear.com/~greear

2001-03-07 12:55:22

by Alan

Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened ? (No

> I'm not arguing it was a smart thing to do, but I would think that the
> fs/kernel/driver writers could keep really nasty and unexpected things
> from happening. For instance, the driver could disallow any new (non-hdparm)

Like stopping root from using rm -r ? Where is the line drawn

2001-03-07 13:50:33

by Anton Altaparmakov

Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened ? (No

At 03:54 07/03/01, Ben Greear wrote:
>Alan Cox wrote:
> > It's not a bug. As the system administrator you reconfigured a hard disk on
> > the fly and shit happened. The hdparm man page warnings do exist for a
> reason.
>
>I'm not arguing it was a smart thing to do, but I would think that the
>fs/kernel/driver writers could keep really nasty and unexpected things
>from happening. For instance, the driver could disallow any new
>(non-hdparm) writes while hdparm is doing its test. Or maybe the driver
>could realize it was being told to do something that would break and just
>not do it?

No. This would be against the Linux/Unix philosophy of giving you enough rope.
Maybe I want to break my hd? You never know. Or maybe the same commands
work perfectly well on a different hd/controller? In general, if you don't
understand the consequences of something you want to do, then *don't* do
it! Or at least have a backup handy and don't complain afterwards...

>Considering this disk is my root disk, is there *any* safe way to test
>out hdparm on this disk?

Of course. Boot/change into single user mode, sync, and remount any
readwrite mounted fs readonly. Then it should be safe to check things out
with hdparm, at least I have done it this way for ages and never run into a
problem even though in my early stage of hdparm experimentation I would
cause kernel crashes more often than not... Chances are that if readonly
works fine, so will write, so once I find the fastest settings that still
give 100% reliability on reads I switch back to normal network multi user
mode and try read-write. Never failed me so far but YMMV, so keep a backup...
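
A minimal sketch of the sequence described above, written as a dry run
that only prints the commands instead of executing them. The function
name hdparm_test_plan is hypothetical, and several details are
assumptions rather than anything stated in this message: telinit 1 as
the way to reach single user mode, only the root filesystem being
remounted, and hdparm -tT as the read check. /dev/hda is the device
named earlier in the thread.

```shell
#!/bin/sh
# Dry-run sketch: prints the steps in order, executes none of them.
# Assumptions (not from the thread): telinit 1, hdparm -tT, root fs only.

hdparm_test_plan() {
    device="${1:-/dev/hda}"        # default taken from the thread's example
    echo "telinit 1"               # drop to single user mode
    echo "sync"                    # flush dirty buffers to disk
    echo "mount -o remount,ro /"   # remount read-write filesystems read-only
    echo "hdparm -tT $device"      # timing reads with the current settings
}

hdparm_test_plan /dev/hda
```

Any other read-write mounted filesystem would need its own remount line,
and the backup advice still applies before running any of these for real.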

Best regards,

Anton


--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS Maintainer / WWW: http://sourceforge.net/projects/linux-ntfs/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/

2001-03-08 05:12:55

by Ben Greear

Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened ? (No

Alan Cox wrote:
>
> > I'm not arguing it was a smart thing to do, but I would think that the
> > fs/kernel/driver writers could keep really nasty and unexpected things
> > from happening. For instance, the driver could disallow any new (non-hdparm)
>
> Like stopping root from using rm -r ? Where is the line drawn

rm -r does not do un-expected things, and it does not corrupt your file
system, it merely removes it. That is the only thing it does, and it
does it every time.

However, messing with the hdparms options can do random things, at
least from my perspective as a user: It may bring exciting new performance
to your system, and it may subtly, or not so, corrupt your file system.

If the drivers can detect what type of HD/chipset we are using, surely
it can know not to allow the user to do stupid things that are out of
spec w/regards to the hardware?

For the power/insane user, there could be a --really-do-stupid-thing-i-told-you-to
option, and it should be that hard to type!!

Ben

--
Ben Greear ([email protected]) http://www.candelatech.com
Author of ScryMUD: scry.wanfear.com 4444 (Released under GPL)
http://scry.wanfear.com http://scry.wanfear.com/~greear

2001-03-08 05:35:26

by Alexander Viro

Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened ? (No



On Wed, 7 Mar 2001, Ben Greear wrote:

> However, messing with the hdparms options can do random things, at
> least from my perspective as a user: It may bring exciting new performance
> to your system, and it may subtly, or not so, corrupt your file system.

It's root-only. If you run unfamiliar stuff as root without thorough
RTFM or choose to ignore "use with extreme caution" contained in the
manpage - hdparm is the least of your problems. Think of it as evolution
in action...
Cheers,
Al

2001-03-08 06:04:21

by Ben Greear

Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened ?(No

Alexander Viro wrote:
>
> On Wed, 7 Mar 2001, Ben Greear wrote:
>
> > However, messing with the hdparms options can do random things, at
> > least from my perspective as a user: It may bring exciting new performance
> > to your system, and it may subtly, or not so, corrupt your file system.
>
> It's root-only. If you run unfamiliar stuff as root without thorough
> RTFM or choose to ignore "use with extreme caution" contained in the
> manpage - hdparm is the least of your problems. Think of it as evolution
> in action...
> Cheers,
> Al

I see it differently: If it's possible for the driver to protect the
user, and it does not, then it strikes me as irresponsible programming. If
there is a reason other than 'only elite users are cool enough to tune
their system, and they never make mistakes', then that's ok, but I have
not heard that argument yet.

Of course, I'd love it if the HD driver automatically brought it over
4MBps (it's 7200 RPM, for goodness sake!!). (It sounds like, from
reading the hdparm man page, that my HD should do at least 20MBps..)

Either way, I've said my piece, and will go back to wrestling with
why my network/overall performance is sucking so badly all of a sudden...

Enjoy,
Ben

--
Ben Greear ([email protected]) http://www.candelatech.com
Author of ScryMUD: scry.wanfear.com 4444 (Released under GPL)
http://scry.wanfear.com http://scry.wanfear.com/~greear

2001-03-08 06:22:02

by Alexander Viro

Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened ?(No



On Wed, 7 Mar 2001, Ben Greear wrote:

> I see it differently: If it's possible for the driver to protect the
> user, and it does not, then it strikes me as irresponsible programming. If
> there is a reason other than 'only elite users are cool enough to tune
> their system, and they never make mistakes', then that's ok, but I have
> not heard that argument yet.

*users* have no business changing the system configuration. End of story.
Again, if somebody doesn't read manpages before doing stuff under root -
no point trying to protect him. He will find a way to fsck up, no matter
how many "safety" checks you put in. BTW, that's the first time I've seen
"elite" used as a term for "able to understand the meaning of words 'use
with extreme caution'". Oh, well...
Cheers,
Al

2001-03-08 20:12:08

by God

Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened ?(No

On Wed, 7 Mar 2001, Ben Greear wrote:

> Date: Wed, 07 Mar 2001 23:32:11 -0700
> From: Ben Greear <[email protected]>
> To: Alexander Viro <[email protected]>
> Cc: Alan Cox <[email protected]>,
> Linux Kernel <[email protected]>
> Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened
> ?(No
>
> Alexander Viro wrote:
> >
> > On Wed, 7 Mar 2001, Ben Greear wrote:
> >
> > > However, messing with the hdparms options can do random things, at
> > > least from my perspective as a user: It may bring exciting new performance
> > > to your system, and it may subtly, or not so, corrupt your file system.
> >
> > It's root-only. If you run unfamiliar stuff as root without thorough
> > RTFM or choose to ignore "use with extreme caution" contained in the
> > manpage - hdparm is the least of your problems. Think of it as evolution
> > in action...
> > Cheers,
> > Al
>
> I see it differently: If it's possible for the driver to protect the
> user, and it does not,

Agreed.

> then it strikes me as irresponsible programming.

Also agreed.

> If there is a reason other than 'only elite users are cool enough to tune
> their system, and they never make mistakes',

Agreed

> then that's ok,

NOT Agreed.

> but I have not heard that argument yet.
>

What must be understood by the Linux community is that if it continues to
target the user base of other desktop OSes (OK, the only other one... we
all know which), then it MUST be user-friendly.

How friendly? Think about the AOL and newuser jokes we have all heard at
one point or another. The truth is, _assuming_ the user will know, or
know better, is the WRONG way to go.

Look at some of the confirmation requests in Windows; some ask you twice
if you wish to perform an action. Even Red Hat (that I know of, others
may as well) has an alias for "rm" that by
default turns on confirmation. Why? Because not ALL users will know
better. Sure, there are warnings that you can put in a man page somewhere,
but the truth is few users are actually going to READ the page. Is it
their fault? Yes. But should it be so easy to lose their data over
it rather than writing code to detect if said feature will work or
not? ...

If the majority of people on this list think YES, then Linux
truly has a long way to go ......

>
> Either way, I've said my piece, and will go back to wrestling with
> why my network/overall performance is sucking so badly all of a sudden...
>
> Enjoy,
> Ben
>
>



2001-03-08 20:31:28

by God

Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened ?(No

On Thu, 8 Mar 2001, Alexander Viro wrote:

> Date: Thu, 8 Mar 2001 01:21:31 -0500 (EST)
> From: Alexander Viro <[email protected]>
> To: Ben Greear <[email protected]>
> Cc: Alan Cox <[email protected]>,
> Linux Kernel <[email protected]>
> Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened
> ?(No
>
>
>
> On Wed, 7 Mar 2001, Ben Greear wrote:
>
> > I see it differently: If it's possible for the driver to protect the
> > user, and it does not, then it strikes me as irresponsible programming. If
> > there is a reason other than 'only elite users are cool enough to tune
> > their system, and they never make mistakes', then that's ok, but I have
> > not heard that argument yet.
>
> *users* have no business changing the system configuration. End of story.
> Again, if somebody doesn't read manpages before doing stuff under root -
> no point trying to protect him. He will find a way to fsck up, no matter
> how many "safety" checks you put in.

Just curious, but do you administer any kind of network with users? Are
they all perfect? Never changing a setting? Never screwing anything
up? ... If so , then it must get boring sitting in your office all day.

According to you, neither I nor any of the other millions of computer users/game
players out there should ever do anything more than install a game and
run it. Oh wait .. ya know what? .. that involves changing system
settings too ..... darn .. ya know .. I guess I just shouldn't use a
computer at all.... -end of story

> BTW, that's the first time I've seen
> "elite" used as a term for "able to understand the meaning of words 'use
> with extreme caution'". Oh, well...


What? .... that is very, VERY, low and stupid.



2001-03-08 20:38:18

by Andre Hedrick

Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened ?(No



[22 new messages! Most recent from God]

You know how much this bothers me to turn around and see these in my
mailbox? I am not ready to answer for all of the things
past/present/future, so please change your name because you are not "god"!

Andre Hedrick
Linux ATA Development

2001-03-08 20:35:48

by Oliver Xymoron

Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened ? (No

On Wed, 7 Mar 2001, Ben Greear wrote:

> For the power/insane user, there could be a --really-do-stupid-thing-i-told-you-to
> option, and it should be that hard to type!!

There is, though historically it's undocumented. It's called "root
password".

Pause. Reflect.

--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."

2001-03-08 21:09:18

by Roman Zippel

Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened ?(No

Hi,

On Thu, 8 Mar 2001, God wrote:

> Look at some of the confirmation requests in Windows; some ask you twice
> if you wish to perform an action. Even Red Hat (that I know of, others
> may as well) has an alias for "rm" that by
> default turns on confirmation. Why? Because not ALL users will know
> better. Sure, there are warnings that you can put in a man page somewhere,
> but the truth is few users are actually going to READ the page. Is it
> their fault? Yes. But should it be so easy to lose their data over
> it rather than writing code to detect if said feature will work or
> not? ...

This is getting off topic; this has nothing to do with the kernel. You are
free to do whatever you want in userspace, if you have the right
capabilities. You're also free to write your own userspace tools, which
protect the user from any danger, but that belongs in userspace, not in
the kernel. So please go to the KDE/Gnome/... guys and whine there.

bye, Roman


2001-03-08 21:33:18

by Alexander Viro

Subject: Re: 2.4.2 ext2 filesystem corruption ? (was 2.4.2: What happened ?(No



On Thu, 8 Mar 2001, God wrote:

> > *users* have no business changing the system configuration. End of story.
> > Again, if somebody doesn't read manpages before doing stuff under root -
> > no point trying to protect him. He will find a way to fsck up, no matter
> > how many "safety" checks you put in.
>
> Just curious, but do you administer any kind of network with users? Are

Fortunately, not anymore.

> they all perfect? Never changing a setting? Never screwing anything

Thanks $DEITY, never really had to deal with DOS/MacOS/Windows. So that
was not that much of a problem system-wide. Power-lusers screwing their
.profile, .forward, yodda, yodda? You bet.

> up? ... If so , then it must get boring sitting in your office all day.

Not really ;-/

> According to you, neither I nor any of the other millions of computer users/game
> players out there should ever do anything more than install a game and
> run it. Oh wait .. ya know what? .. that involves changing system
> settings too ..... darn .. ya know .. I guess I just shouldn't use a
> computer at all.... -end of story

Not exactly. It's not a matter of should or shouldn't. It's much simpler -
if you do something you'd better get some idea of potential scale of
screwups you can cause doing that. Anyone who uses sharp tools and doesn't
watch his steps is going to get hurt, be it buzz-saw or root. When you
are logged in as root you can cause _any_ damage to system. If you do
that - blame yourself. It sucks to spend a weekend with backups because
of a typo, but if you screw yourself because you didn't RTFM... <shrug>
You don't take random pills without checking the side effects, do you?
Same principle...

> > BTW, that's the first time I've seen
> > "elite" used as a term for "able to understand the meaning of words 'use
> > with extreme caution'". Oh, well...
>
> What? .... that is very, VERY, low and stupid.

Would you mind rereading the posting I replied to? Previous poster apparently
implied that ability to read the manpage is a sign of being a member
of some elite. No arguments - it's _very_ stupid. Your point being?