2002-11-06 10:11:42

by Stephen C. Tweedie

Subject: Re: [Ext2-devel] Re: 2.5.46 ext3 errors

Hi,

On Wed, Nov 06, 2002 at 01:43:45AM -0800, Christopher Li wrote:
> Can you put the e2image of that device to some URL I can
> download?

It's unlikely to be useful. A journal abort will cause existing
transactions to be suspended midstream, so any errors afterwards may
be due to updates which were in progress at the time and which didn't
complete. And since a fsck has been done, we've lost those errors
anyway.

Is the problem reproducible? The basic

> EXT3-fs error (device ide1(22,1)): ext3_new_inode: Free inodes count
> corrupted in group 688 Aborting journal on device ide1(22,1).

error is just ext3's normal reaction to a fatal error detected in the
filesystem, so that in itself isn't a worry. The cause of the problem
it spotted is the worry; is this reproducible?

Cheers,
Stephen
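
The kind of check behind the "Free inodes count corrupted" message can be pictured with a toy model. This is a minimal sketch and not the kernel's code: it assumes the error fires when a group descriptor claims free inodes while the group's inode bitmap has no clear bit left. The Python helpers `find_first_zero_bit` and `allocate_inode`, and their parameters, are hypothetical names chosen for illustration.

```python
def find_first_zero_bit(bitmap, nbits):
    """Return the index of the first clear bit, or nbits if all bits are set."""
    for i in range(nbits):
        if not (bitmap[i // 8] >> (i % 8)) & 1:
            return i
    return nbits


def allocate_inode(group, descriptor_free, bitmap, inodes_per_group):
    """Toy allocator (assumed behaviour, not ext3's): claim the first free slot.

    descriptor_free is the free-inode count stored in the group descriptor;
    bitmap is the group's inode bitmap as a mutable bytearray.
    """
    slot = find_first_zero_bit(bitmap, inodes_per_group)
    if slot == inodes_per_group:
        # The bitmap is full. If the descriptor still claims free inodes,
        # the two on-disk structures disagree: treat it as a fatal error.
        if descriptor_free != 0:
            raise RuntimeError(
                f"Free inodes count corrupted in group {group}"
            )
        raise LookupError(f"group {group} has no free inodes")
    bitmap[slot // 8] |= 1 << (slot % 8)  # mark the inode as in use
    return slot
```

In this model a consistent but full group is an ordinary "try another group" condition, while a disagreement between descriptor and bitmap is raised as a separate, fatal error, which mirrors the distinction drawn above between the abort itself and whatever caused the inconsistency.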


2002-11-06 12:58:53

by Jens Axboe

Subject: Re: [Ext2-devel] Re: 2.5.46 ext3 errors

On Wed, Nov 06 2002, Stephen C. Tweedie wrote:
> Hi,
>
> On Wed, Nov 06, 2002 at 01:43:45AM -0800, Christopher Li wrote:
> > Can you put the e2image of that device to some URL I can
> > download?
>
> It's unlikely to be useful. A journal abort will cause existing
> transactions to be suspended midstream, so any errors afterwards may
> be due to updates which were in progress at the time and which didn't
> complete. And since a fsck has been done, we've lost those errors
> anyway.

It's a 151GB partition anyway, so it's not very easy to give access to. And
as Stephen mentions, it has been file system checked and is clean now.

> Is the problem reproducible? The basic
>
> > EXT3-fs error (device ide1(22,1)): ext3_new_inode: Free inodes count
> > corrupted in group 688 Aborting journal on device ide1(22,1).
>
> error is just ext3's normal reaction to a fatal error detected in the
> filesystem, so that in itself isn't a worry. The cause of the problem
> it spotted is the worry; is this reproducible?

I can try. The kernel run had my rbtree deadline patches, however
they've been well tested and are likely not the cause of the problem. It
cannot be 100% ruled out though, I'm testing for this very thing right
now. I will let you know what happens.

--
Jens Axboe

2002-11-06 16:06:19

by Ewan Mac Mahon

Subject: Re: [Ext2-devel] Re: 2.5.46 ext3 errors

On Wed, Nov 06, 2002 at 02:05:21PM +0100, Jens Axboe wrote:
> On Wed, Nov 06 2002, Stephen C. Tweedie wrote:
> >
> > error is just ext3's normal reaction to a fatal error detected in the
> > filesystem, so that in itself isn't a worry. The cause of the problem
> > it spotted is the worry; is this reproducible?
>
> I can try. The kernel run had my rbtree deadline patches, however
> they've been well tested and are likely not the cause of the problem. It
> cannot be 100% ruled out though, I'm testing for this very thing right
> now. I will let you know what happens.

I think I can rule that out, I've got much the same[1] from a vanilla
2.5.46, and the filesystem's recent history has been plain 2.5.XXs as
well.

Ewan


[1]
EXT3-fs error (device ide0(3,5)): ext3_new_inode: Free inodes count corrupted in group 18
Aborting journal on device ide0(3,5).
ext3_abort called.
EXT3-fs abort (device ide0(3,5)): ext3_journal_start: Detected aborted journal
Remounting filesystem read-only
EXT3-fs error (device ide0(3,5)) in start_transaction: Journal has aborted
EXT3-fs error (device ide0(3,5)) in ext3_new_inode: error 28
EXT3-fs error (device ide0(3,5)) in ext3_create: IO failure
EXT3-fs error (device ide0(3,5)) in start_transaction: Journal has aborted
EXT3-fs error (device ide0(3,5)) in start_transaction: Journal has aborted

etc. etc.
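
When comparing reports like this one, it can help to reduce the log to the first error and a count of the follow-on errors that the abort produces. The sketch below is one way to do that, under assumptions: the regular expression is fitted only to the message shapes shown above, wrapped lines are assumed to have been joined first, and `parse_ext3_log`, `describe`, and the small `LINUX_ERRNO` table are illustrative names rather than part of any existing tool. The errno names in the table are the standard Linux values.

```python
import re
from collections import Counter

# Standard Linux errno values relevant to the messages in this thread.
LINUX_ERRNO = {5: "EIO", 28: "ENOSPC", 30: "EROFS"}

# Matches both "(device X): func: msg" and "(device X) in func: msg".
ERROR_RE = re.compile(
    r"EXT3-fs (?P<kind>error|abort) "
    r"\(device (?P<device>[^)]*\([^)]*\))\):? "
    r"(?:in )?(?P<func>\w+): (?P<message>.*)"
)


def parse_ext3_log(lines):
    """Return one dict per recognised EXT3-fs line; other lines are skipped."""
    records = []
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            records.append(match.groupdict())
    return records


def describe(message):
    """Annotate a bare 'error N' message with its errno name, if known."""
    match = re.fullmatch(r"error (\d+)", message)
    if not match:
        return message
    code = int(match.group(1))
    return f"error {code} ({LINUX_ERRNO.get(code, 'unknown')})"


SAMPLE = [
    "EXT3-fs error (device ide0(3,5)): ext3_new_inode: Free inodes count corrupted in group 18",
    "Aborting journal on device ide0(3,5).",
    "ext3_abort called.",
    "EXT3-fs abort (device ide0(3,5)): ext3_journal_start: Detected aborted journal",
    "Remounting filesystem read-only",
    "EXT3-fs error (device ide0(3,5)) in start_transaction: Journal has aborted",
    "EXT3-fs error (device ide0(3,5)) in ext3_new_inode: error 28",
    "EXT3-fs error (device ide0(3,5)) in ext3_create: IO failure",
    "EXT3-fs error (device ide0(3,5)) in start_transaction: Journal has aborted",
    "EXT3-fs error (device ide0(3,5)) in start_transaction: Journal has aborted",
]

records = parse_ext3_log(SAMPLE)
first = records[0]                              # the error that triggered the abort
by_function = Counter(r["func"] for r in records)
```

Here `first` holds the original `ext3_new_inode` complaint, and `by_function` shows how many of the remaining lines are just `start_transaction` refusing work on the already-aborted journal.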

2002-11-06 17:02:12

by Jens Axboe

Subject: Re: [Ext2-devel] Re: 2.5.46 ext3 errors

On Wed, Nov 06 2002, Ewan Mac Mahon wrote:
> On Wed, Nov 06, 2002 at 02:05:21PM +0100, Jens Axboe wrote:
> > On Wed, Nov 06 2002, Stephen C. Tweedie wrote:
> > >
> > > error is just ext3's normal reaction to a fatal error detected in the
> > > filesystem, so that in itself isn't a worry. The cause of the problem
> > > it spotted is the worry; is this reproducible?
> >
> > I can try. The kernel run had my rbtree deadline patches, however
> > they've been well tested and are likely not the cause of the problem. It
> > cannot be 100% ruled out though, I'm testing for this very thing right
> > now. I will let you know what happens.
>
> I think I can rule that out, I've got much the same[1] from a vanilla
> 2.5.46, and the filesystem's recent history has been plain 2.5.XXs as
> well.

Interesting, so it smells like a generic problem. I cannot reproduce it
on my test box (it's been running kernel compiles and dbench all afternoon)
with the same kernel. I've got 2.5.46-BK on the desktop again now, so we'll
see what happens...

For the record, the test box is a P3-800MHz SMP with 512MiB of RAM. The
desktop is an MP1800+ SMP with 1GiB of RAM.

--
Jens Axboe