LinuxLists.cc - non-critical ext3-fs errors?

2002-07-23 20:25:01

Subject: non-critical ext3-fs errors?

I notice these errors every now and then on my 100GB WD drive
(partitioned into bits) and particularly on this partition that I
compile everything on.

EXT3-fs error (device ide0(3,7)): ext3_readdir: bad entry in directory
#17821: directory entry across blocks - offset=24, inode=17822,
rec_len=4076, name_len=3

Now these errors dont cause the system to panic like other ext3 errors
do, so i'm wondering what the significance of these errors are. I'm
running 2.4.19-rc3 and the drive is on VIA vt82c686b (rev 40) IDE
UDMA100 controller with dma enabled.

2002-07-23 20:47:43

by Andreas Dilger

[permalink] [raw]

Subject: Re: non-critical ext3-fs errors?

On Jul 23, 2002 16:28 -0400, Ed Sweetman wrote:
> I notice these errors every now and then on my 100GB WD drive
> (partitioned into bits) and particularly on this partition that I
> compile everything on.
>
> EXT3-fs error (device ide0(3,7)): ext3_readdir: bad entry in directory
> #17821: directory entry across blocks - offset=24, inode=17822,
> rec_len=4076, name_len=3

This is definitely a bad entry - offset + rec_len (4100) != blocksize (4096).
Strange. You need to run e2fsck -f on this partition. It sounds like
you are getting some corruption or bit flips happening.

> Now these errors dont cause the system to panic like other ext3 errors
> do, so i'm wondering what the significance of these errors are.

Well, this is an error, and the next time the computer is restarted it
should force a full fsck on the partition. The other "panic" (oops)
situations are when things are totally shot and it is better to stop
doing anything than try and continue. In this case, it is possible to
continue, but that doesn't mean things are OK.

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/

2002-07-24 00:16:46

by Ed Sweetman

[permalink] [raw]

Subject: Re: non-critical ext3-fs errors and IDE issues with 2.4.19-rc3

On Tue, 2002-07-23 at 16:48, Andreas Dilger wrote:
> On Jul 23, 2002 16:28 -0400, Ed Sweetman wrote:
> > I notice these errors every now and then on my 100GB WD drive
> > (partitioned into bits) and particularly on this partition that I
> > compile everything on.
> >
> > EXT3-fs error (device ide0(3,7)): ext3_readdir: bad entry in directory
> > #17821: directory entry across blocks - offset=24, inode=17822,
> > rec_len=4076, name_len=3
>
> This is definitely a bad entry - offset + rec_len (4100) != blocksize (4096).
> Strange. You need to run e2fsck -f on this partition. It sounds like
> you are getting some corruption or bit flips happening.
>
> > Now these errors dont cause the system to panic like other ext3 errors
> > do, so i'm wondering what the significance of these errors are.
>
> Well, this is an error, and the next time the computer is restarted it
> should force a full fsck on the partition. The other "panic" (oops)
> situations are when things are totally shot and it is better to stop
> doing anything than try and continue. In this case, it is possible to
> continue, but that doesn't mean things are OK.
>
> Cheers, Andreas

I rebooted after setting max mount count to 1 so it would force and
fsck, I got no error messages. So, I dropped to shell in single mode
and attempted to unmount /usr (the partition with the issues) but umount
insisted /usr was busy. I looked at ps, nothing executed from usr. I
looked at my shell, nothing executed on /usr. I looked at lsof, nothing
showing /usr files being accessed. It didn't make any sense at all but
then there are other surprises with 2.4.19-rc3 that come later. I fsck
-f'd the partition again with dma disabled and still, no errors
reported. Either it fixed them behind the scenes or didn't detect
them. I then went and looked at my dmesg to notice this interesting
message

PDC20262: chipset revision 1
ide: Skipping Promise RAID controller.

Now this is funny. I dont remember telling the kernel to skip my other
ide controller. I double checked the config, nope, no ignoring or
skipping of the promise ide controller. This did not occur in previous
kernels, it seems to be brand new in rc3. I confess to having used jam
kernels with rc2 and rc1 though which had other ide patches in it. So
what gives with this ? Why is rc3 skipping the promise ide
controller? And why is fsck not reporting any fixes to errors that
dmesg shows is being detected during use?

2002-07-24 22:59:58

by Andreas Dilger

[permalink] [raw]

Subject: Re: non-critical ext3-fs errors and IDE issues with 2.4.19-rc3

On Jul 23, 2002 20:19 -0400, Ed Sweetman wrote:
> On Tue, 2002-07-23 at 16:48, Andreas Dilger wrote:
> > Well, this is an error, and the next time the computer is restarted it
> > should force a full fsck on the partition. The other "panic" (oops)
> > situations are when things are totally shot and it is better to stop
> > doing anything than try and continue. In this case, it is possible to
> > continue, but that doesn't mean things are OK.
>
> I rebooted after setting max mount count to 1 so it would force an fsck.

That is not the right thing to do, FYI. That will mean that each time
you boot (even after clean shutdown) you will get an fsck. Instead,
you should change the mount count (-C), or the time it was last checked
(-T on more recent tune2fs) to force it to fsck. That will make it a
one-time event.

Yes, you could also touch /forcefsck or whatever, but that would force
it for all filesystems, again probably not what you want.

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/