LinuxLists.cc - strange ext3 corruption problem on 2.6.x

2004-03-13 00:47:22

Subject: strange ext3 corruption problem on 2.6.x

I use lvm-over-raid5 and get these messages once a day (requiring a reboot
afterwards):

EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #4804801: directory entry across blocks - offset=0, inode=0, rec_len=50000, name_len=152
Aborting journal on device dm-0.
add_dirent_to_buf: aborting transaction: Journal has aborted in __ext3_journal_get_write_access<2>EXT3-fs error (device dm-0) in add_dirent_to_buf: Journal has aborted
EXT3-fs error (device dm-0) in ext3_writeback_writepage: IO failure
EXT3-fs error (device dm-0) in ext3_writeback_writepage: IO failure
ext3_abort called.
EXT3-fs abort (device dm-0): ext3_journal_start: Detected aborted journal
Remounting filesystem read-only
EXT3-fs error (device dm-0) in start_transaction: Journal has aborted
EXT3-fs error (device dm-0) in ext3_delete_inode: Journal has aborted
EXT3-fs error (device dm-0) in ext3_create: Journal has aborted
EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #4804801: directory entry across blocks - offset=0, inode=0, rec_len=50000, name_len=152
EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #4804801: directory entry across blocks - offset=0, inode=0, rec_len=50000, name_len=152
EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #4804801: directory entry across blocks - offset=0, inode=0, rec_len=50000, name_len=152

e2fsck after rebooting shows no errors and nothing to fix. In fact, in
this very incident my home directory was missing, and after rebooting it
was there again, so this doesn't look like on-disk data corruption so far.

About my configuration:

5 IDE disks were combined into one raid5, with lvm on top. There are
two lvs on the raid, one formatted with ext3 and one with reiserfs. The
array was not degraded and not rebuilding. Data throughput under 2.6 is
much lower than under 2.4, though (and 2.6 takes enormous amounts of cpu
for reading from the raid5 array), but this issue is probably a separate
problem.

Both partitions currently undergo heavy filesystem activity, mainly
untar'ing big tars with lots of medium-sized files (e.g. 10gb of jpeg
files, or cvs directories).

Reiserfs so far never gave a problem, neither did ext3 filesystems on
normal harddisk partitions (although the latter ones were never under
write stress like the partitions on the lv partitions).

There are no other kernel messages between mounting the volume and the
problem.

I can use this machine for many hours under no stress without any
problems.

I had these problems on 2.6.3 and 2.6.4; other 2.6 kernels have not been
tested.

Using 2.4 on the same machine (lvm1) doesn't show any problems (the
machine is a dual P-III 1ghz).

Summary: the ext3 partition regularly gives me these problems (about once
per day), while reiserfs on the same device does not. Neither of them
causes problems under 2.4.

Hope that helps,

--
-----==- |
----==-- _ |
---==---(_)__ __ ____ __ Marc Lehmann +--
--==---/ / _ \/ // /\ \/ / [email protected] |e|
-=====/_/_//_/\_,_/ /_/\_\ XX11-RIPE --+
The choice of a GNU generation |
|

2004-03-13 02:34:21

by Andrew Morton

Subject: Re: strange ext3 corruption problem on 2.6.x

Marc Lehmann wrote:
>
> I use lvm-over-raid5 and get these messages once a day (requiring a reboot
> afterwards):
>
> EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #4804801: directory entry across blocks - offset=0, inode=0, rec_len=50000, name_len=152
> Aborting journal on device dm-0.

(and fsck comes up clean)

There have been earlier reports of this. Too many for it to be some random
glitch. We've had similar reports in 2.4, usually with raid5.

I'm fairly confident in ext3 - it's hard to think of an ext3-level bug
which wouldn't have 10x as many reports from non-md users. But perhaps
some timing unique to the MD layer is triggering some ext3 bug.

Joe, Neil: have you spotted reports like this? Any suggestions as to how
to track it down a bit?

2004-03-13 02:41:01

by Marc Singer

Subject: Re: strange ext3 corruption problem on 2.6.x

On Fri, Mar 12, 2004 at 06:34:23PM -0800, Andrew Morton wrote:
> Marc Lehmann wrote:
> >
> > I use lvm-over-raid5 and get these messages once a day (requiring a reboot
> > afterwards):
> >
> > EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #4804801: directory entry across blocks - offset=0, inode=0, rec_len=50000, name_len=152
> > Aborting journal on device dm-0.
>
> (and fsck comes up clean)
>
> There have been earlier reports of this. Too many for it to be some random
> glitch. We've had similar reports in 2.4, usually with raid5.
>
> I'm fairly confident in ext3 - it's hard to think of an ext3-level bug
> which wouldn't have 10x as many reports from non-md users. But perhaps
> some timing unique to the MD layer is triggering some ext3 bug.
>
> Joe, Neil: have you spotted reports like this? Any suggestions as to how
> to track it down a bit?

I, too, have been experiencing this with ext3 on top of lvm on top of
raid5. I also have a dual-proc machine.

It seems to be some sort of race condition because it is triggered by
multiple disk-io intensive processes using the same volume. Many
mornings, when I first log in to this machine, which runs all the
time, I find that one or more of the volumes are mounted read-only.
Sometimes e2fsck shows errors and sometimes it doesn't.

2004-03-15 03:41:45

by Marc Lehmann

Subject: Re: strange ext3 corruption problem on 2.6.x

On Mon, Mar 15, 2004 at 08:59:29AM +1030, [email protected] wrote:
> 'r/o' by the RAID layer, presumably unbeknownst to VFS; are you
> *sure* that your array is still up and 'good' when you get this
> message?

As I said, there are no other messages, so if there is a problem (cabling,
disk-i/o etc.), then the kernel doesn't know it either (usually the kernel
is quite loud in this condition).

The array also comes up clean and synced. And the reiserfs partition on
the same lv doesn't have any problems (whether this means that reiserfs
doesn't suffer from this bug or whether reiserfs is just unable to detect
it is, of course, a different question).

2004-03-15 23:27:48

by Thorild Selen

Subject: Re: strange ext3 corruption problem on 2.6.x

Marc Lehmann writes:

> On Mon, Mar 15, 2004 at 08:59:29AM +1030, [email protected] wrote:
>> 'r/o' by the RAID layer, presumably unbeknownst to VFS; are you
>> *sure* that your array is still up and 'good' when you get this
>> message?
>
> As I said, there are no other messages, so if there is a problem (cabling,
> disk-i/o etc.), then the kernel doesn't know it either (usually the kernel
> is quite loud in this condition).

I was able to repeat this (although with a somewhat different error
message) on a Xeon machine (HT, so "almost" SMP) running 2.6.3 (with
some IPv6 and NFS related patches, most likely nothing affecting
LVM/md/ext3). It took a few hours running bonnie++ on an ext3 fs on
LVM atop raid5 (four Hitachi SATA disks on a Promise SATA150 TX4
controller) until the machine got problems.

The kernel log says:

Mar 15 06:34:40 Psilocybe kernel: EXT3-fs error (device dm-2): ext3_readdir: bad entry in directory #11: rec_len %% 4 != 0 - offset=0, inode=1061109567, rec_len=16191, name_len=63
Mar 15 06:34:40 Psilocybe kernel: Aborting journal on device dm-2.
Mar 15 06:34:40 Psilocybe kernel: EXT3-fs error (device dm-2) in ext3_ordered_writepage: IO failure
Mar 15 06:34:40 Psilocybe kernel: EXT3-fs error (device dm-2) in ext3_ordered_writepage: IO failure
Mar 15 06:34:41 Psilocybe kernel: ext3_abort called.
Mar 15 06:34:41 Psilocybe kernel: EXT3-fs abort (device dm-2): ext3_journal_start: Detected aborted journal
Mar 15 06:34:41 Psilocybe kernel: Remounting filesystem read-only
Mar 15 06:34:42 Psilocybe kernel: EXT3-fs error (device dm-2) in start_transaction: Journal has aborted
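
A side note on the numbers in the first log line: inode=1061109567 is
0x3F3F3F3F, rec_len=16191 is 0x3F3F and name_len=63 is 0x3F. Packed back
into a directory entry header, the first seven bytes of the block are all
0x3F, the ASCII code for '?'. That looks more like file data than a
directory entry, though it says nothing about where the data came from.
A minimal sketch of the decoding, assuming the usual on-disk layout
(little-endian u32 inode, u16 rec_len, u8 name_len, u8 file_type); the
helper name is hypothetical and file_type is not in the log, so it is set
to 0:

```python
import struct

def dirent_header_bytes(inode, rec_len, name_len, file_type=0):
    """Pack the logged fields into the 8-byte directory entry header.

    Assumed layout: little-endian u32 inode, u16 rec_len, u8 name_len,
    u8 file_type. file_type is unknown here and defaults to 0.
    """
    return struct.pack("<IHBB", inode, rec_len, name_len, file_type)

# Values taken from the log line above.
header = dirent_header_bytes(1061109567, 16191, 63)
print(header[:7])        # b'???????'
print(header[:7].hex())  # 3f3f3f3f3f3f3f
```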

And the last words from bonnie++ were:

Writing intelligently...Can't write block.
Bonnie: drastic I/O error (write(2)): No such file or directory

Then bonnie exited. It seems like something unrelated to this fs was
in an inconsistent state at this stage, as the machine crashed some
hours later.

The last syslog entry before the crash was at 11:05:01, then the
machine crashed quietly. The console was blank; the machine still
responded to pings, but appeared otherwise dead. The arrays were not
clean and were reconstructed at boot, including arrays that were not
involved in running the benchmark.

I can try to repeat the experiment with another fs if that is desired,
but people seem to agree already that the problem is in ext3. Any
suggestions on how to continue?

Thorild Selén
Datorföreningen Update / Update Computer Club, Uppsala, SE

2004-03-23 07:33:59

by John Pearson

Subject: Re: strange ext3 corruption problem on 2.6.x

OK,

I've seen this one now, too; here's my datapoint:

First, under vanilla 2.6.3:

EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory
#917711: rec_len % 4 != 0 - offset=0, inode=1182746341, rec_len=16861,
name_len=185
Aborting journal on device dm-0.
ext3_abort called.
EXT3-fs abort (device dm-0): ext3_journal_start: Detected aborted journal
Remounting filesystem read-only

Then, under 2.6.4+skas3:

EXT3-fs error (device dm-3): ext3_readdir: bad entry in directory
#510327: directory entry across blocks - offset=0, inode=0,
rec_len=5044, name_len=113
Aborting journal on device dm-3.
ext3_abort called.
EXT3-fs abort (device dm-3): ext3_journal_start: Detected aborted journal
Remounting filesystem read-only

I'm running ext3 over raid5; in both cases, fsck spotted the aborted
journal and checked the FS, which came up clean.
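
For readers comparing the two messages: they come from different sanity
checks that ext3 applies to each directory entry while reading a
directory. The sketch below is a rough Python rendering of those checks,
not the kernel code. Only the two messages seen in this thread are taken
from the logs; the other message strings, the check order, the 8-byte
header size and the 4096-byte block size are assumptions, and the function
names are made up for illustration.

```python
def dir_rec_len(name_len):
    """Smallest valid record length: 8-byte header plus the name,
    rounded up to a multiple of 4 (assumed layout)."""
    return (name_len + 8 + 3) & ~3

def check_dir_entry(offset_in_block, inode, rec_len, name_len,
                    block_size=4096, inodes_count=None):
    """Return an error string for a bad entry, or None if it looks sane."""
    if rec_len < dir_rec_len(1):
        return "rec_len is smaller than minimal"
    if rec_len % 4 != 0:
        return "rec_len % 4 != 0"
    if rec_len < dir_rec_len(name_len):
        return "rec_len is too small for name_len"
    if offset_in_block + rec_len > block_size:
        return "directory entry across blocks"
    # The inode range check needs the filesystem's inode count,
    # which is not in the logs, so it is optional here.
    if inodes_count is not None and inode > inodes_count:
        return "inode out of bounds"
    return None

# Values from the two reports above.
print(check_dir_entry(0, 1182746341, 16861, 185))  # rec_len % 4 != 0
print(check_dir_entry(0, 0, 5044, 113))            # directory entry across blocks
```

In both cases the entry at offset 0 fails on its rec_len alone, which fits
a block whose contents are not a directory block at all.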

No other issues in many days of uptime, including kernel compiles, etc.,
so I'm reasonably confident of the RAM and hardware generally.

I wouldn't describe either volume as seeing heavy use - there's rarely
more than one reader, and almost never more than one writer.

dm-3 has had no writes since last boot (it serves images to diskless
clients, including NFS roots mounted ro); dm-0 had seen a few writes
(it's a read-mostly FTP server containing mirrors of debian-security and
a few other things, synced about once a month).

'directory #510327' on dm-3 is a manpage directory, which shows a size
of 20480 and contains 751 files; 'directory #917711' on dm-0 has a size
of 8192 and contains 101 files.

The box is a UMP Athlon XP with 512Mb DDR RAM on a VIA VT8237-based
board, using on-board IDE + a Promise 20268 controller (but as the RAID
layer works, I doubt it's the hardware).

2004-03-23 07:54:14

by Andrew Kirch

Subject: Re: strange ext3 corruption problem on 2.6.x

On Tue, 23 Mar 2004 18:03:48 +1030
John Pearson wrote:

> OK,
>
> I've seen this one now, too; here's my datapoint:
>
> First, under vanilla 2.6.3:
>
> EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory
> #917711: rec_len % 4 != 0 - offset=0, inode=1182746341, rec_len=16861,
> name_len=185
> Aborting journal on device dm-0.
> ext3_abort called.
> EXT3-fs abort (device dm-0): ext3_journal_start: Detected aborted journal
> Remounting filesystem read-only
>
>
>
> Then, under 2.6.4+skas3:
>
>
> EXT3-fs error (device dm-3): ext3_readdir: bad entry in directory
> #510327: directory entry across blocks - offset=0, inode=0,
> rec_len=5044, name_len=113
> Aborting journal on device dm-3.
> ext3_abort called.
> EXT3-fs abort (device dm-3): ext3_journal_start: Detected aborted journal
> Remounting filesystem read-only
>
>
>
> I'm running ext3 over raid5; In both cases, fsck spotted the aborted
> journal and checked the FS, which came up clean.
>
> No other issues in many days of uptime, including kernel compiles,
> etc., so I'm reasonably confident of the RAM and hardware generally.
>
> I wouldn't describe either volume as seeing heavy use - there's rarely
> more than one reader, and almost never more than one writer.
>
> dm-3 has had no writes since last boot (it serves images to diskless
> clients, including NFS roots mounted ro); dm-0 had seen a few writes
> (it's a read-mostly FTP server containing mirrors of debian-security
> and a few other things, synced about once a month).
>
> 'directory #510327' on dm-3 is a manpage directory, which shows a size
> of 20480 and contains 751 files; 'directory #917711' on dm-0 has a
> size of 8192 and contains 101 files.
>
> The box is a UMP Athlon XP with 512Mb DDR RAM on a VIA VT8237-based
> board, using on-board IDE + a Promise 20268 controller (but as the
> RAID layer works, I doubt it's the hardware).

I had a situation similar to this in 2.6.3: while the machine was under
load, the entire filesystem was trashed, with lots of lost inodes, and the
journal was unrecoverable. I'm glad your luck was better than mine.

--
Andrew D Kirch | Abusive Hosts Blocking List | http://www.ahbl.org
Security Admin | Summit Open Source Development Group | http://www.sosdg.org
Key At http://www.2mbit.com/~trelane/trelane.key
Key fingerprint = B4C2 8083 648B 37A2 4CCE 61D3 16D6 995D 026F 20CF

2004-03-25 14:24:59

by Pavel Machek

Subject: Re: strange ext3 corruption problem on 2.6.x

Hi!

> > 'r/o' by the RAID layer, presumably unbeknownst to VFS; are you
> > *sure* that your array is still up and 'good' when you get this
> > message?
>
> As I said, there are no other messages, so if there is a problem (cabling,
> disk-i/o etc.), then the kernel doesn't know it either (usually the kernel
> is quite loud in this condition).

Hmm, is there a way to force raid5 to check parity?
Something like "degraded" mode with all disks online. It could be
useful for cabling problems and debugging raid...
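
To make the idea concrete: in raid5 the parity chunk of a stripe is the
XOR of its data chunks, so a check pass could read every chunk of a stripe
and verify that they XOR to zero. The following is only a sketch of that
principle on in-memory byte strings, not how the md driver does or would
do it; the function names and sample data are made up.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def stripe_is_consistent(chunks):
    """True if the data chunks plus the parity chunk XOR to all zeros."""
    return not any(xor_blocks(chunks))

# Four data chunks plus one parity chunk, as on a 5-disk raid5.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40",
        b"\xaa\xbb\xcc\xdd", b"\x0f\x0e\x0d\x0c"]
parity = xor_blocks(data)

print(stripe_is_consistent(data + [parity]))       # True

# Flip one bit in the first chunk, as a bad cable might.
corrupted = [b"\x01\x02\x03\x05"] + data[1:]
print(stripe_is_consistent(corrupted + [parity]))  # False
```

A mismatch shows that some chunk in the stripe is wrong, but not which
one.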
--
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms