2005-02-27 07:07:30

by Jean-Marc Valin

Subject: ext3 bug

Hi,

Looks like I ran into an ext3 bug (or at least the log says so). I got a
bunch of messages like:
ext3_free_blocks_sb: aborting transaction: Journal has aborted in
__ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in
ext3_free_blocks_sb: Journal has aborted
EXT3-fs error (device sda2): ext3_free_blocks: Freeing blocks in system
zones -Block = 228, count = 1

It happened while I was doing an "rm -rf" on a directory. The "rm"
segfaulted, and now I can't unmount the filesystem: umount says "device
is busy", even though lsof reports nothing. The filesystem is on a USB
hard disk. The full dump is attached. I'm running Debian unstable
with a custom 2.6.10 kernel on a 1.6 GHz Pentium-M.

Jean-Marc

--
Jean-Marc Valin <[email protected]>
Université de Sherbrooke


Attachments:
ext3_bug (5.29 kB)

2005-02-27 19:06:34

by Parag Warudkar

Subject: Re: ext3 bug

On Sunday 27 February 2005 02:04 am, Jean-Marc Valin wrote:
> Hi,
>
> Looks like I ran into an ext3 bug (or at least the log says so). I got a
> bunch of messages like:
> ext3_free_blocks_sb: aborting transaction: Journal has aborted in
> __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in
> ext3_free_blocks_sb: Journal has aborted
> EXT3-fs error (device sda2): ext3_free_blocks: Freeing blocks in system
> zones -Block = 228, count = 1
>
> It happened while I was doing an "rm -rf" on a directory. The "rm" gave
> a segfault and now I can't unmount the filesystem: umount says "device
> is busy", even though lsof reports nothing. The filesystem is on a USB
> hard disk. The actual dump is in attachment. I'm running Debian unstable
> with a custom 2.6.10 kernel on a 1.6 GHz Pentium-M.
>
> Jean-Marc

Please try a stock kernel; 2.6.11-rc3 onwards should be fine. I saw a similar
problem while running the 2.6.10 kernel from Fedora Core 3. It doesn't happen
with stock kernels.

Parag

2005-02-27 19:28:07

by Dave Jones

Subject: Re: ext3 bug

On Sun, Feb 27, 2005 at 02:06:30PM -0500, Parag Warudkar wrote:
> On Sunday 27 February 2005 02:04 am, Jean-Marc Valin wrote:
> > Hi,
> >
> > Looks like I ran into an ext3 bug (or at least the log says so). I got a
> > bunch of messages like:
> > ext3_free_blocks_sb: aborting transaction: Journal has aborted in
> > __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in
> > ext3_free_blocks_sb: Journal has aborted
> > EXT3-fs error (device sda2): ext3_free_blocks: Freeing blocks in system
> > zones -Block = 228, count = 1
> >
> > It happened while I was doing an "rm -rf" on a directory. The "rm" gave
> > a segfault and now I can't unmount the filesystem: umount says "device
> > is busy", even though lsof reports nothing. The filesystem is on a USB
> > hard disk. The actual dump is in attachment. I'm running Debian unstable
> > with a custom 2.6.10 kernel on a 1.6 GHz Pentium-M.
> >
> > Jean-Marc
>
> Please try stock kernel. 2.6.11-rc3 onwards should be fine. - I saw a similar
> problem while running 2.6.10 kernel from Fedora Core 3. It doesn't happen
> with stock kernels.

Which is very odd considering the only ext3 patches in the Fedora
kernel are in 2.6.11rc.

Dave

2005-02-27 19:40:44

by Parag Warudkar

Subject: Re: ext3 bug

On Sunday 27 February 2005 02:27 pm, Dave Jones wrote:
> Which is very odd considering the only ext3 patches in the Fedora
> kernel are in 2.6.11rc.
This seems to be more of a USB-storage issue than an ext3 one.

2005-02-27 23:03:28

by Jean-Marc Valin

Subject: Re: ext3 bug

> Please try stock kernel. 2.6.11-rc3 onwards should be fine. - I saw a similar
> problem while running 2.6.10 kernel from Fedora Core 3. It doesn't happen
> with stock kernels.

I did use a stock 2.6.10 kernel (I said custom in the sense that it
wasn't a Debian kernel). After a reboot, I was able to run fsck on the
disk (many, many errors) and it went fine after.

Jean-Marc

--
Jean-Marc Valin <[email protected]>
Université de Sherbrooke

2005-02-28 01:10:52

by Parag Warudkar

Subject: Re: ext3 bug

On Sunday 27 February 2005 05:58 pm, Jean-Marc Valin wrote:
> I did use a stock 2.6.10 kernel (I said custom in the sense that it
> wasn't a Debian kernel). After a reboot, I was able to run fsck on the
> disk (many, many errors) and it went fine after.

Hmm, so that error is not FC3-specific; it is present in stock 2.6.10 as
well. Also, this is on a USB disk, right? If so, the error may resurface.
Try upgrading to the latest kernel if possible.

Parag

2005-02-28 02:07:56

by Jean-Marc Valin

Subject: Re: ext3 bug

> Hmm.. So that error is not FC3 specific, it is present in stock 2.6.10 as
> well. Also - This is on a USB disk, right? If so, the error may re-surface.
> Try upgrading to latest kernel if possible.

It's a USB disk (3.5" IDE + IDE to USB). What has been changed in
2.6.11-rcX?

Jean-Marc

--
Jean-Marc Valin <[email protected]>
Université de Sherbrooke

2005-02-28 02:24:46

by Parag Warudkar

Subject: Re: ext3 bug

On Sunday 27 February 2005 09:04 pm, Jean-Marc Valin wrote:
> What has been changed in
> 2.6.11-rcX?

I don't know exactly what changed, but when I hit this issue Greg KH suggested
trying to reproduce it on 2.6.11-rc3, and it didn't happen there.

(I am assuming that the issue you are seeing is the same one I saw, since
the error message looks similar and I was also using a USB disk when the
error happened.)

Parag

2005-02-28 15:27:20

by jmerkey

Subject: Re: ext3 bug


I see this problem infrequently on systems under low-memory conditions
with heavy swapping. I have not seen it on 2.6.9, but I have seen it
on 2.6.10.

Jeff

Jean-Marc Valin wrote:

>>Please try stock kernel. 2.6.11-rc3 onwards should be fine. - I saw a similar
>>problem while running 2.6.10 kernel from Fedora Core 3. It doesn't happen
>>with stock kernels.
>>
>>
>
>I did use a stock 2.6.10 kernel (I said custom in the sense that it
>wasn't a Debian kernel). After a reboot, I was able to run fsck on the
>disk (many, many errors) and it went fine after.
>
> Jean-Marc
>
>
>

2005-02-28 21:54:11

by Jean-Marc Valin

Subject: Re: ext3 bug

On Monday, 28 February 2005 at 08:31 -0700, jmerkey wrote:
> I see this problem infrequently on systems that have low memory
> conditions and
> with heavy swapping. I have not seen it on 2.6.9 but I have seen it
> on 2.6.10.

My machine has 1 GB of RAM and I wasn't using much of it at the time (2 GB
of swap free), so I doubt that's the problem in my case.

Jean-Marc

--
Jean-Marc Valin <[email protected]>
Université de Sherbrooke

2005-03-01 04:00:24

by jmerkey

Subject: Re: ext3 bug

Jean-Marc Valin wrote:

>On Monday, 28 February 2005 at 08:31 -0700, jmerkey wrote:
>
>
>>I see this problem infrequently on systems that have low memory
>>conditions and
>>with heavy swapping. I have not seen it on 2.6.9 but I have seen it
>>on 2.6.10.
>>
>>
>
>My machine has 1 GB RAM and I wasn't using much of it at that time (2GB
>free on the swap), so I doubt that's the problem in my case.
>
> Jean-Marc
>
>
>
Running the ext2 recover program seems to trigger some good bugs in
2.6.10 with ext3 -- try it. I was doing this to test some disk tools,
and I managed to cause these errors by forcing ext2 recovery on an
ext3 fs (which is probably to be expected; the recovery tools need to
get synchronized. I have not tried with mc yet). It doesn't happen
every time, though.

Jeff


2005-03-01 04:02:21

by jmerkey

Subject: Re: ext3 bug

jmerkey wrote:

> Jean-Marc Valin wrote:
>
>> On Monday, 28 February 2005 at 08:31 -0700, jmerkey wrote:
>>
>>
>>> I see this problem infrequently on systems that have low memory
>>> conditions and
>>> with heavy swapping. I have not seen it on 2.6.9 but I have seen
>>> it on 2.6.10.
>>
>>
>> My machine has 1 GB RAM and I wasn't using much of it at that time (2GB
>> free on the swap), so I doubt that's the problem in my case.
>>
>> Jean-Marc
>>
>>
>>
> Running the ext2 recover program seems to trigger some good bugs in
> 2.6.10 with ext3 -- try it. I was doing this
> to test some disk tools and I managed to cause these errors with
> forcing ext2 recovery from an ext3 fs (which is
> probably something to be expected. The recover tools need to get
> synchronized -- have not tried with
> mc yet.) Doesn't happen every time though.
>
> Jeff
>
>
>

lde also causes problems with ext3; I just triggered one on 2.6.10.
Stale or poisoned cache blocks, perhaps?

Jeff

2005-03-01 05:25:38

by Jeffrey Hundstad

Subject: Re: ext3 bug

linux-2.6.10 has some bio problems that are fixed in the current
linux-2.6.11 release candidates. The bio problems wreaked havoc with
XFS and there were people reporting EXT3 problems as well with this
bug. I'd recommend trying the latest release candidate to see if your
problem vanishes.

--
jeffrey hundstad


jmerkey wrote:

> jmerkey wrote:
>
>> Jean-Marc Valin wrote:
>>
>>> On Monday, 28 February 2005 at 08:31 -0700, jmerkey wrote:
>>>
>>>
>>>> I see this problem infrequently on systems that have low memory
>>>> conditions and
>>>> with heavy swapping. I have not seen it on 2.6.9 but I have seen
>>>> it on 2.6.10.
>>>
>>>
>>>
>>> My machine has 1 GB RAM and I wasn't using much of it at that time (2GB
>>> free on the swap), so I doubt that's the problem in my case.
>>>
>>> Jean-Marc
>>>
>>>
>>>
>> Running the ext2 recover program seems to trigger some good bugs in
>> 2.6.10 with ext3 -- try it. I was doing this
>> to test some disk tools and I managed to cause these errors with
>> forcing ext2 recovery from an ext3 fs (which is
>> probably something to be expected. The recover tools need to get
>> synchronized -- have not tried with
>> mc yet.) Doesn't happen every time though.
>>
>> Jeff
>>
>>
>>
>
> lde also causes some problems as well with ext3. Just caused one on
> 2.6.10. stale or poisoned
> cache blocks perhaps?
>
> Jeff


2005-03-01 06:34:41

by jmerkey

Subject: Re: ext3 bug

Jeffrey E. Hundstad wrote:

> linux-2.6.10 has some bio problems that are fixed in the current
> linux-2.6.11 release candidates. The bio problems wreaked havoc with
> XFS and there were people reporting EXT3 problems as well with this
> bug. I'd recommend trying the latest release candidate and see if
> your problem vanishes.
>
OK

Jeff