2002-03-01 01:22:27

by Wayne Whitney

Subject: [OOPS 2.5.5-dj2] ext3 BUG in do_get_write_access()

Hello,

I managed to generate the oops below on 2.5.5-dj2 by doing the following:
cp -ax / /mnt &
<some delay, don't know if it matters>
tune2fs -L root /dev/hdc5

where
/dev/hda7 on / type ext3 (rw,noatime,nodiratime)
/dev/hdc5 on /mnt type ext3 (rw,noatime,nodiratime)

tune2fs -L should be safe on a mounted filesystem, non?

A couple comments on the oops report: I didn't run it through ksymoops,
as I am confident that klogd had the correct System.map file. Also, while
the oops says "Tainted: P", that is solely due to "modprobe: Warning:
loading /lib/modules/2.5.5-dj2/kernel/net/ipv4/netfilter/ipchains.o will
taint the kernel: non-GPL license - BSD without advertisement clause".

I'd be happy to provide further information if needed, for example the
ksymoops output if for some reason the klogd symbol translation is
inadequate.

Cheers,
Wayne


Assertion failure in do_get_write_access() at transaction.c:586: "jh->b_next_transaction == ((void *)0)"
kernel BUG at transaction.c:586!
invalid operand: 0000
CPU: 0
EIP: 0010:[do_get_write_access+1626/1648] Tainted: P
EIP: 0010:[<c017128a>] Tainted: P
EFLAGS: 00010282
eax: 0000006c ebx: d8678cc0 ecx: ddfaf900 edx: df08ff7c
esi: d8678cc0 edi: df968cc0 ebp: df900bd0 esp: ddfe9e5c
ds: 0018 es: 0018 ss: 0018
Process cp (pid: 13610, threadinfo=ddfe8000 task=ddfaf900)
Stack: c0246dc0 c024b5ee c024b55c 0000024a c0248f60 df968d34 00000000 00000000
00000000 df6fa740 df968d34 df968cc0 defda3c0 df900bd0 c01712eb defda3c0
df900bd0 00000000 defda3c0 00000000 00004b80 d89205c0 c0166376 defda3c0
Call Trace: [journal_get_write_access+75/128] [ext3_new_inode+1078/2496] [start_this_handle+126/336] [__jbd_kmalloc+35/112] [ext3_mkdir+245/1072]
Call Trace: [<c01712eb>] [<c0166376>] [<c017051e>] [<c0177893>] [<c016beb5>]
[ext3_lookup+185/304] [vfs_mkdir+120/192] [sys_mkdir+218/256] [syscall_call+7/11]
[<c016b329>] [<c014aa18>] [<c014ab3a>] [<c0108eff>]

Code: 0f 0b 4a 02 5c b5 24 c0 e9 e3 fe ff ff 8b 5d 00 e9 dd f9 ff



2002-03-01 19:00:13

by Andreas Dilger

Subject: Re: [OOPS 2.5.5-dj2] ext3 BUG in do_get_write_access()

On Feb 28, 2002 17:19 -0800, Wayne Whitney wrote:
> I managed to generate the oops below on 2.5.5-dj2 by doing the following:
> cp -ax / /mnt &
> <some delay, don't know if it matters>
> tune2fs -L root /dev/hdc5
>
> tune2fs -L should be safe on a mounted filesystem, non?

Maybe. It used to be that the superblock was not journaled, but I think
that recently changed. This means that if tune2fs modified the superblock
and then a transaction also tried to modify it, the superblock would be
dirty and not part of a transaction, so an assertion would trigger.

It may be that we need to add journaled ioctls to ext3 to change the data
fields in the superblock, or find some other way to do this safely (maybe
Jeff Garzik's ext3meta filesystem to just read/write the "label" file?).

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2002-03-01 19:42:53

by Stephen C. Tweedie

Subject: Re: [OOPS 2.5.5-dj2] ext3 BUG in do_get_write_access()

Hi,

On Thu, Feb 28, 2002 at 05:19:44PM -0800, Wayne Whitney wrote:

> I managed to generate the oops below on 2.5.5-dj2 by doing the following:
> cp -ax / /mnt &
> <some delay, don't know if it matters>
> tune2fs -L root /dev/hdc5

> tune2fs -L should be safe on a mounted filesystem, non?

Hmmm.

There's a fundamental problem here. Journaling filesystems expect to
be in control over when data gets written to the disk. tune2fs is
writing to the superblock in the buffer cache directly, and ext3 is
really, really paranoid about finding unexpected dirty buffers since
they usually imply that we have just violated the filesystem's write
ordering expectations.

Clearly in this case something legal has just happened, but it still
means that the superblock can get flushed to disk before the
filesystem expects it, and this can result in an inconsistency on
disk after recovery if we crash just after that flush.
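The ordering concern above can be pictured with a toy user-space model in C. This is only a sketch under stated assumptions: the types and the function `toy_writeback_is_ordered` are hypothetical and are not taken from the ext3 or jbd sources. The model assumes that a journaled metadata buffer may reach its home location on disk only after the transaction that owns the change has committed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not the kernel's structures. */
enum toy_txn_state { TOY_TXN_RUNNING, TOY_TXN_COMMITTED };

struct toy_txn {
    enum toy_txn_state state;
};

struct toy_buffer {
    bool dirty;                 /* in-memory copy differs from disk */
    const struct toy_txn *txn;  /* transaction owning the change, or NULL */
};

/*
 * Returns true if writing this buffer to its home location right now
 * would respect the assumed rule "journal commit first, in-place
 * write second".
 */
static bool toy_writeback_is_ordered(const struct toy_buffer *b)
{
    if (!b->dirty)
        return true;    /* nothing to write back */
    if (b->txn == NULL)
        return false;   /* dirtied behind the journal's back */
    return b->txn->state == TOY_TXN_COMMITTED;
}
```

In this model, a buffer dirtied directly through the buffer cache (as tune2fs does) has no owning transaction, so any writeback of it counts as unordered.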

So saying "this is legal" means data corruption, and protecting the fs
from such interference (e.g. by moving the on-disk superblock
representation into the page cache) will prevent tune2fs from working
at all: the updated fields will just be overwritten by the
filesystem's copy.


In this particular case, I think I'll just have to relax the assertion
and cause it to printk instead of BUG()ing, because I don't want to
lose the protection of this test entirely.
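As an illustration of the "printk instead of BUG()" idea, here is a minimal user-space sketch in C. It is not the actual ext3 change: `SOFT_ASSERT`, `soft_assert_report`, `soft_assert_failures`, `struct toy_journal_head` and `toy_get_write_access` are hypothetical names, and `fprintf` stands in for `printk`. Only the field name `b_next_transaction` is taken from the assertion text in the oops.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical counter so callers can observe how often the check fired. */
static unsigned long soft_assert_failures;

/* Log the failed condition and keep running; always returns 0. */
static int soft_assert_report(const char *expr, const char *file, int line)
{
    soft_assert_failures++;
    fprintf(stderr, "Assertion failure at %s:%d: \"%s\" (continuing)\n",
            file, line, expr);
    return 0;
}

/* Evaluates to 1 if cond holds, otherwise logs a warning and yields 0. */
#define SOFT_ASSERT(cond) \
    ((cond) ? 1 : soft_assert_report(#cond, __FILE__, __LINE__))

/* Toy stand-in for a journal head; only the checked field is modeled. */
struct toy_journal_head {
    void *b_next_transaction;
};

/* Returns 0 if the check passed, 1 if it failed and was only logged. */
static int toy_get_write_access(struct toy_journal_head *jh)
{
    if (!SOFT_ASSERT(jh->b_next_transaction == NULL))
        return 1;   /* warned, but did not halt */
    return 0;
}
```

The check still runs and still leaves a trace in the log, so the test is not lost; the difference is that a failure no longer stops the process that hit it.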

I'd really like to be able to detect such direct buffered-io
"interference" from user-space, though, so that I could preserve the
BUG() in cases where ext3 is getting this wrong internally. I'll look
at that --- I may be able to achieve it through ext3's existing
metadata flags.
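One way to picture this distinction is a per-buffer marker recording whether the filesystem itself dirtied the buffer. The C sketch below is purely illustrative: the `fs_dirtied` flag, the struct, and the verdict names are assumptions made for the example and do not correspond to any known ext3 flag.

```c
#include <stdbool.h>

/* Hypothetical verdicts for a metadata buffer seen by the journal. */
enum toy_dirty_verdict {
    TOY_OK,             /* clean, or dirty inside a transaction */
    TOY_WARN_EXTERNAL,  /* dirtied from outside the fs: warn only */
    TOY_BUG_INTERNAL    /* the fs dirtied it outside a transaction */
};

struct toy_meta_buffer {
    bool dirty;           /* in-memory copy differs from disk */
    bool in_transaction;  /* change is tracked by the journal */
    bool fs_dirtied;      /* assumed flag: set only by the fs's own paths */
};

static enum toy_dirty_verdict
toy_classify(const struct toy_meta_buffer *b)
{
    if (!b->dirty || b->in_transaction)
        return TOY_OK;
    /* Dirty and untracked: decide who is responsible. */
    return b->fs_dirtied ? TOY_BUG_INTERNAL : TOY_WARN_EXTERNAL;
}
```

Under these assumptions, a buffer modified by tune2fs would come back as `TOY_WARN_EXTERNAL`, while an untracked buffer dirtied by the filesystem's own code would still deserve the hard failure.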

Cheers,
Stephen

2002-03-01 20:16:48

by Chris Mason

Subject: Re: [OOPS 2.5.5-dj2] ext3 BUG in do_get_write_access()



On Friday, March 01, 2002 07:41:55 PM +0000 "Stephen C. Tweedie" wrote:

> In this particular case, I think I'll just have to relax the assertion
> and cause it to printk instead of BUG()ing, because I don't want to
> lose the protection of this test entirely.
>
> I'd really like to be able to detect such direct buffered-io
> "interference" from user-space, though, so that I could preserve the
> BUG() in cases where ext3 is getting this wrong internally. I'll look
> at that --- I may be able to achieve it through ext3's existing
> metadata flags.

Do I misunderstand the assertion? It seems to be saying:

'this buffer has been written out of order. If we were to crash
now, it would result in FS corruption'.
BUG()

If so, a printk alone might be better, since it would give the FS
the chance to put the correct data there anyway.

-chris