2001-02-22 07:02:31

by Mohammad A. Haque

Subject: EXT2-fs error

I got the following after compiling/rebooting into 2.4.2 and forcing a
fsck.

EXT2-fs error (device ide0(3,3)): ext2_readdir: bad entry in directory
#508411: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0,
name_len=0
EXT2-fs error (device ide0(3,3)): ext2_readdir: bad entry in directory
#508411: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0,
name_len=0

Possibly the result of the 'silent' bug in 2.4.1?

--

=====================================================================
Mohammad A. Haque http://www.haque.net/
[email protected]

"Alcohol and calculus don't mix. Project Lead
Don't drink and derive." --Unknown http://wm.themes.org/
[email protected]
=====================================================================



2001-02-22 08:11:36

by Andreas Dilger

Subject: Re: EXT2-fs error

Mohammad A. Haque writes:
> I got the following after compiling/rebooting into 2.4.2 and forcing a
> fsck.

Did fsck complain? If not, then it is a 2.4.2 kernel/driver bug, possibly
not reading any data from disk (the below errors are generated from a zero
filled directory block).

> EXT2-fs error (device ide0(3,3)): ext2_readdir: bad entry in directory
> #508411: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0,
> name_len=0
> EXT2-fs error (device ide0(3,3)): ext2_readdir: bad entry in directory
> #508411: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0,
> name_len=0
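
A zero-filled block fails the very first sanity test applied to an ext2
directory entry, which is why every field in the message prints as 0.
The following is a minimal userspace sketch of that test, loosely modeled
on the checks in the kernel's ext2 directory code. The names dirent_hdr,
read_dirent_hdr and check_dir_entry are made up for illustration, and the
kernel's real check does more (block boundary and inode range tests are
omitted here).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * An on-disk ext2 directory entry starts with an 8-byte header:
 * inode (4 bytes), rec_len (2), name_len (1), file_type (1),
 * followed by the name. Record lengths are rounded up to 4 bytes,
 * so the smallest legal record (1-character name) is 12 bytes.
 */
#define DIR_REC_LEN(name_len) (((name_len) + 8 + 3) & ~3)

/* Hypothetical in-memory copy of the header fields we care about. */
struct dirent_hdr {
    uint32_t inode;
    uint16_t rec_len;
    uint8_t  name_len;
};

/* Decode the little-endian on-disk header starting at p. */
static struct dirent_hdr read_dirent_hdr(const unsigned char *p)
{
    struct dirent_hdr de;

    de.inode = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    de.rec_len = (uint16_t)(p[4] | (p[5] << 8));
    de.name_len = p[6];
    return de;
}

/* Return NULL if the entry looks sane, else a short reason string. */
static const char *check_dir_entry(const struct dirent_hdr *de)
{
    if (de->rec_len < DIR_REC_LEN(1))
        return "rec_len is smaller than minimal";
    if (de->rec_len % 4 != 0)
        return "rec_len % 4 != 0";
    if (de->rec_len < DIR_REC_LEN(de->name_len))
        return "rec_len is too small for name_len";
    return NULL;
}
```

Feeding this an all-zero buffer gives inode=0, rec_len=0, name_len=0 and
trips the first test, matching the message quoted above. So the message by
itself only says the block read back as zeros; it does not say whether the
zeros are on disk or came from a failed read.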

Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert

2001-02-22 14:58:58

by Mohammad A. Haque

Subject: Re: EXT2-fs error

Well, here's the whole situation...

Compiled 2.4.2 and rebooted, forcing fsck. No errors.

Tried mounting a CD image via loopback to see if loopback was working
again. Mount hung. The reboot command stalled waiting for filesystems to
unmount, so I forced it with Alt-SysRq-<s,u,b>. Booted w/o any errors.
Restarted forcing fsck, and this time fsck needed to clear some inodes.

Once I had rebooted again, I went to reinstall some rpms for files I saw
fsck complain about, and I got these errors.


On Thu, 22 Feb 2001, Andreas Dilger wrote:

> Did fsck complain? If not, then it is a 2.4.2 kernel/driver bug, possibly
> not reading any data from disk (the below errors are generated from a zero
> filled directory block).
>
> > EXT2-fs error (device ide0(3,3)): ext2_readdir: bad entry in directory
> > #508411: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0,
> > name_len=0
> > EXT2-fs error (device ide0(3,3)): ext2_readdir: bad entry in directory
> > #508411: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0,
> > name_len=0


2001-02-23 19:12:55

by Ian Wehrman

Subject: Re: EXT2-fs error

Mohammad A. Haque <[email protected]> wrote:
> I got the following after compiling/rebooting into 2.4.2 and forcing a
> fsck.
>
> EXT2-fs error (device ide0(3,3)): ext2_readdir: bad entry in directory
> #508411: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0,
> name_len=0
> EXT2-fs error (device ide0(3,3)): ext2_readdir: bad entry in directory
> #508411: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0,
> name_len=0
>
> Possibly the result of the 'silent' bug in 2.4.1?

you are not the only one who found this bug. immediately after booting 2.4.2 i
received dozens of these errors, resulting in _major_ filesystem corruption.
after a half hour of fsck'ing i managed to bring the machine back into a usable
state, but there are still many files and directories around the fs that have
the wrong uid/gid associated with them, as well as incorrect file type,
permissions, etc. i'm not using any unusual hardware, and haven't had any
other recent issues like this. let me know if i can provide further information,
or test patches.

thanks,
ian wehrman

2001-02-23 19:24:17

by Alan

Subject: Re: EXT2-fs error

> > Possibly the result of the 'silent' bug in 2.4.1?
>
> you are not the only one who found this bug. immediately after booting 2.4.2 i
> received dozens of these errors, resulting in _major_ filesystem corruption.
> after a half hour of fsck'ing i managed to bring the machine back into a usable

Had you been running 2.4.1 before that ?

2001-02-23 19:48:21

by John Heil

Subject: Re: EXT2-fs error

On Fri, 23 Feb 2001, Ian Wehrman wrote:

> Date: Fri, 23 Feb 2001 13:12:05 -0600
> From: Ian Wehrman <[email protected]>
> To: [email protected], [email protected], [email protected]
> Subject: Re: EXT2-fs error
>
> Mohammad A. Haque <[email protected]> wrote:
> > I got the following after compiling/rebooting into 2.4.2 and forcing a
> > fsck.
> >
> > EXT2-fs error (device ide0(3,3)): ext2_readdir: bad entry in directory
> > #508411: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0,
> > name_len=0
> > EXT2-fs error (device ide0(3,3)): ext2_readdir: bad entry in directory
> > #508411: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0,
> > name_len=0
> >
> > Possibly the result of the 'silent' bug in 2.4.1?
>
> you are not the only one who found this bug. immediately after booting 2.4.2 i
> received dozens of these errors, resulting in _major_ filesystem corruption.

In contrast perhaps, 2.4.1-ac19 seems clean so far in this regard after
having some different but equivalently bad corruption in 2.4.1-final.


> after a half hour of fsck'ing i managed to bring the machine back into a usable
> state, but there are still many files and directories around the fs that have
> the wrong uid/gid associated with them, as well as incorrect file type,
> permissions, etc. i'm not using any unusual hardware, and haven't had any
> other recent issues like this. let me know if i can provide further information,
> or test patches.
>
> thanks,
> ian wehrman
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

-----------------------------------------------------------------
John Heil
South Coast Software
Custom systems software for UNIX and IBM MVS mainframes
1-714-774-6952
[email protected]
[email protected]
http://www.sc-software.com
-----------------------------------------------------------------

2001-02-23 19:55:12

by John Heil

Subject: Re: EXT2-fs error

On Fri, 23 Feb 2001, Alan Cox wrote:

> Date: Fri, 23 Feb 2001 19:26:17 +0000 (GMT)
> From: Alan Cox <[email protected]>
> To: [email protected]
> Cc: [email protected], [email protected], [email protected]
> Subject: Re: EXT2-fs error
>
> > > Possibly the result of the 'silent' bug in 2.4.1?
> >
> > you are not the only one who found this bug. immediately after booting 2.4.2 i
> > received dozens of these errors, resulting in _major_ filesystem corruption.
> > after a half hour of fsck'ing i managed to bring the machine back into a usable
>
> Had you been running 2.4.1 before that ?

I had, and received a bit different corruption involving inode and block
bitmap errors rather than directories... ac19 seems so far to have solved
it.




2001-02-24 11:01:39

by A E Lawrence

Subject: Re: EXT2-fs error

Alan Cox wrote:
>
> > > Possibly the result of the 'silent' bug in 2.4.1?
> >
> > you are not the only one who found this bug. immediately after booting 2.4.2 i
> > received dozens of these errors, resulting in _major_ filesystem corruption.
> > after a half hour of fsck'ing i managed to bring the machine back into a usable
>
> Had you been running 2.4.1 before that ?

I have seen similar problems on stock 2.4.2 on a machine which has not run
2.4.1.

ael
--
Dr A E Lawrence

2001-02-24 16:58:29

by Alan

Subject: Re: EXT2-fs error

> I have seen similar problems on stock 2.4.2 a machine which has not run
> 2.4.1.

What disk controllers ? We really need that sort of info in order to see the
pattern in the odd reports of corruption we get

2001-02-24 22:30:30

by A E Lawrence

Subject: Re: EXT2-fs error

Alan Cox wrote:
>
> > I have seen similar problems on stock 2.4.2 a machine which has not run
> > 2.4.1.
>
> What disk controllers ? We really need that sort of info in order to see the
> pattern in the odd reports of corruption we get

Sorry:-

00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev
10) (prog-if 8a [Master SecP PriP])
Flags: bus master, medium devsel, latency 32
I/O ports at d000
Capabilities: [c0] Power Management version 2

With dma enabled. If this is a known problem on this chipset, that may
be the explanation.

ael
--
A E Lawrence

2001-02-24 23:08:06

by A E Lawrence

Subject: Re: EXT2-fs error

A E Lawrence wrote:
>
> Alan Cox wrote:
> >
> > > I have seen similar problems on stock 2.4.2 a machine which has not run
> > > 2.4.1.
> >
> > What disk controllers ? We really need that sort of info in order to see the
> > pattern in the odd reports of corruption we get

Problems have just started to show up under 2.2.18, so it is likely that
the hardware has become flakey. Bit of a coincidence, unless it is a
side effect of upgrading one of the packages for 2.4.2 :-( or a damaged
library.

So you had better discount this report. Apologies.

Adrian
--
A E Lawrence

2001-02-25 18:01:31

by Mike Brown

Subject: Re: EXT2-fs error

On Sat, 24 Feb 2001, Alan Cox wrote:

>> I have seen similar problems on stock 2.4.2 a machine which has not run
>> 2.4.1.
>
>What disk controllers ? We really need that sort of info in order to see the
>pattern in the odd reports of corruption we get
>

I also just ran into the ext2 bug:

Feb 25 06:27:05 morpheus kernel: EXT2-fs error (device ide0(3,1)): ext2_readdir: directory #881700 contains a hole at offset 0
Feb 25 06:27:05 morpheus kernel: Remounting filesystem read-only
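
The device tag in these messages, ide0(3,1) here and ide0(3,3) earlier in
the thread, is a major/minor pair. With the Linux IDE driver, major 3 is
the first interface, each drive gets 64 minors, and the low 6 bits select
the partition, so (3,1) is /dev/hda1 and (3,3) is /dev/hda3. A small sketch
of that mapping follows; the helper name ide_dev_name is hypothetical and
only majors 3 (ide0) and 22 (ide1) are handled.

```c
#include <stdio.h>

/*
 * Map an IDE major/minor pair to a /dev/hdXN style name.
 * Returns 0 on success, -1 for a pair this sketch does not cover.
 */
static int ide_dev_name(unsigned major, unsigned minor,
                        char *out, size_t outlen)
{
    unsigned first_drive;
    unsigned drive;
    unsigned part;

    if (major == 3)
        first_drive = 0;        /* ide0: hda, hdb */
    else if (major == 22)
        first_drive = 2;        /* ide1: hdc, hdd */
    else
        return -1;

    if (minor > 127)            /* two drives, 64 minors each */
        return -1;

    drive = first_drive + (minor >> 6);
    part = minor & 63;          /* 0 means the whole disk */

    if (part == 0)
        snprintf(out, outlen, "/dev/hd%c", (int)('a' + drive));
    else
        snprintf(out, outlen, "/dev/hd%c%u", (int)('a' + drive), part);
    return 0;
}
```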

I had been running 2.4.1 before upgrading to 2.4.2. My machine had been
up with 2.4.1 for 13 days, then I added a system fan, then it was up for
another 8 days after that with no problems.

Up with 2.4.2 for about a day and I ran into the above ext2 error. I also
have the VIA Apollo chipset:

00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10) (prog-if 8a [Master SecP PriP])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 set
Region 4: I/O ports at d000 [size=16]
Capabilities: [c0] Power Management version 2
Flags: PMEClk- AuxPwr- DSI- D1- D2- PME-
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

Also, during boot phase I had been manually running:

/sbin/hdparm -d1 /dev/hda

To turn on DMA on my primary disk since the 2.2.x series didn't turn this
on by default. /dev/hda is a Western Digital:

Feb 25 12:35:37 morpheus kernel: hda: WDC WD273BA, ATA DISK drive
Feb 25 12:35:37 morpheus kernel: ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Feb 25 12:35:37 morpheus kernel: hda: 53464320 sectors (27374 MB) w/1961KiB Cache, CHS=3328/255/63


Let me know if there is any other information I can provide. I also have
a notebook which has been running 2.4.2 without trouble for a few days.
This notebook has an Intel IDE controller. Seems to me like this is a VIA
Apollo bug...... Thanks.


-Michael F. Brown, UMass Lowell Computer Science

phone: (978) 934-5354
email: [email protected]

int *x; while (1) { x = (int *) malloc (sizeof (int)); }

2001-02-25 20:52:06

by A E Lawrence

Subject: Re: EXT2-fs error

A E Lawrence wrote:
>
> A E Lawrence wrote:
> >
> > Alan Cox wrote:
> > >
> > > > I have seen similar problems on stock 2.4.2 a machine which has not run
> > > > 2.4.1.
> > >
> > > What disk controllers ? We really need that sort of info in order to see the
> > > pattern in the odd reports of corruption we get
>
> Problems have just started to show up under 2.2.18, so it is likely that
> the hardware has become flakey. Bit of a coincidence, unless it is a
> side effect of upgrading one of the packages for 2.4.2 :-( or a damaged
> library.
>
> So you had better discount this report. Apologies.

Now investigated: the hardware has not changed. Rather the corruption
under 2.2.18 only happens when hdparm -d1 is executed. I guess that is
well reported, but I had forgotten if I ever knew :-(

In contrast 2.4.2 corruptions happen whether dma is explicitly turned on
by hdparm or not.

[IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)]

ael
--
A E Lawrence

2001-02-25 21:57:12

by John Heil

Subject: Re: EXT2-fs error

On Sun, 25 Feb 2001, A E Lawrence wrote:
> A E Lawrence wrote:
> >
> > A E Lawrence wrote:
> > >
> > > Alan Cox wrote:
> > > >
> > > > > I have seen similar problems on stock 2.4.2 a machine which has not run
> > > > > 2.4.1.
> > > >
> > > > What disk controllers ? We really need that sort of info in order to see the
> > > > pattern in the odd reports of corruption we get
> >
> > Problems have just started to show up under 2.2.18, so it is likely that
> > the hardware has become flakey. Bit of a coincidence, unless it is a
> > side effect of upgrading one of the packages for 2.4.2 :-( or a damaged
> > library.
> >
> > So you had better discount this report. Apologies.
>
> Now investigated: the hardware has not changed. Rather the corruption
> under 2.2.18 only happens when hdparm -d1 is executed. I guess that is
> well reported, but I had forgotten if I ever knew :-(
>
> In contrast 2.4.2 corruptions happen whether dma is explicitly turned on
> by hdparm or not.
>
> [IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)]
>

The corruption I got seemed to be non-chipset related.

The 2.4.1 corruption I experienced spanned 2.2.14 (RH 6.2)...

The 2.4.1 IDE support was in an ALI 1535 south bridge in a Crusoe based
embedded system...

00:0f.0 IDE interface: Acer Laboratories Inc. [ALi] M5229 IDE (rev c3)
(prog-if b4)
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
Latency: 64 (500ns min, 1000ns max)
Interrupt: pin A routed to IRQ 0
Region 0: I/O ports at 01f0 [size=16]
Region 1: I/O ports at 03f4
Region 2: I/O ports at 0170 [size=8]
Region 3: I/O ports at 0374
Region 4: I/O ports at 1400 [size=16]
Capabilities: [60] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

Disk was...

/dev/hdb:

Model=HITACHI_DK23AA-60, FwRev=00XEA0A0, SerialNo=F77166
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
RawCHS=12416/15/63, TrkSize=36477, SectSize=579, ECCbytes=4
BuffType=3(DualPortCache), BuffSize=512kB, MaxMultSect=16, MultSect=off
DblWordIO=no, OldPIO=2, DMA=yes, OldDMA=2
CurCHS=12416/15/63, CurSects=142606515, LBA=yes, LBAsects=11733120
tDMA={min:120,rec:120}, DMA modes: sword0 sword1 sword2 mword0 mword1
mword2
IORDY=yes, tPIO={min:400,w/IORDY:120}, PIO modes: mode3 mode4
UDMA modes: mode0 mode1 *mode2 mode3 mode4


When the above system couldn't adequately repair itself,
I fsck'd it on the following....

2.2.14-5.0 RH 6.2 desktop PII w PIIX4 IDE...

00:07.1 IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 01)
(prog-if 80 [Master])
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
Latency: 64 set
Region 4: I/O ports at 1460

Initially, the 2.2.14 RH 6.2 box had e2fsprogs 1.18, but I upgraded it to
e2fsck 1.19, thinking that might be the problem, but it did _not_ help.

Each of the 2 systems, both with e2fsck 1.19, would find repairs to make
when fsck'ing the disk after the other had touched it. It seemed
impossible to keep it error free.

This cross-system disk transferring was error free in 2.4.0-test10-pre3.

2.4.1-final was the first post 2.4.0-final kernel after t10p3, that
saw any significant activity.

I've since upgraded the RH to 2.2.18 and the embedded to -ac19.

So far, no more problems but I still haven't used the ac19's disk
on 2.2.18 very much yet but it will have to happen soon...
I am a bit gun-shy of it still.

I didn't/don't have much time to spend on this and I'm hoping ac19
is bug free in this regard... but I'm interested in how this
turns out.

My guess was that I may have hit the bug Russell King fixed for his ARM
system. IIRC, that fix hit the -ac tree from Linus's 2.4.2-pre2.

