2008-11-05 17:34:40

by bugme-daemon

Subject: [Bug 11960] New: Oops in ext4_mb_poll_new_transaction

http://bugzilla.kernel.org/show_bug.cgi?id=11960

Summary: Oops in ext4_mb_poll_new_transaction
Product: File System
Version: 2.5
KernelVersion: 2.6.27
Platform: All
OS/Version: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: ext4
AssignedTo: [email protected]
ReportedBy: [email protected]


Distribution: Debian Etch
Hardware Environment: Supermicro server, quad core intel xeon cpu, 4 gigs of
ram, 8 gigs of swap, two 3ware 9690SA raid cards+BBU.
Software Environment: Debian etch 64-bit os with these drivers/firmware:

3ware 9000 Storage Controller device driver for Linux v2.26.02.011.
3w-9xxx: scsi1: Firmware FH9X 4.06.00.004, BIOS BE9X 4.05.00.015, Ports: 128.

Kernel checked out from ext4 git repository -stable branch.

Problem Description: Kernel produced oops trying to mount an ext4 filesystem
after a hard reset was performed on the machine. fsck.ext4 repaired the
filesystem and it then mounted cleanly.

Steps to reproduce: Have not tried to reproduce. System was under I/O load from
multiple rsync processes reading from the network at around a total of
25-50mbps when it was hard reset. System rebooted but did not mount the
filesystem, instead producing the oops listed.

EXT4-fs: barriers enabled
kjournald2 starting. Commit interval 5 seconds
EXT4 FS on sdb1, internal journal on sdb1:8
ext4_orphan_cleanup: deleting unreferenced inode 394924522
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffff802f91c8>] ext4_mb_poll_new_transaction+0x6e/0xe3
PGD 1291db067 PUD 129fce067 PMD 0
Oops: 0002 [1] SMP
CPU 5
Pid: 4884, comm: mount Not tainted 2.6.27 #1
RIP: 0010:[<ffffffff802f91c8>] [<ffffffff802f91c8>]
ext4_mb_poll_new_transaction+0x6e/0xe3
RSP: 0018:ffff880128a6d888 EFLAGS: 00010207
RAX: ffff8801288ee1e0 RBX: ffff8801288ec000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801288ee1d0
RBP: ffff8801229193d8 R08: 0000000000000001 R09: ffff880128a6d9a0
R10: 000000005e2829ed R11: 0000000000000002 R12: ffff8801291b2000
R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000001
FS: 00007f8f321546d0(0000) GS:ffff88012fb080c0(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000008 CR3: 000000012915c000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process mount (pid: 4884, threadinfo ffff880128a6c000, task ffff88012ec4d410)
Stack: 000000005e2829ed ffff8801291b2000 ffff8801229107e0 ffffffff802fa3ba
ffff88012e60fce0 ffff880128a6d9a0 000000012e615800 ffff8801229107e0
ffff8801229193d8 ffffffff802792ab 0000000000000050 0000000000000292
Call Trace:
[<ffffffff802fa3ba>] ? ext4_mb_free_blocks+0x46/0x5b1
[<ffffffff802792ab>] ? cache_alloc_refill+0xeb/0x1e6
[<ffffffff803073bc>] ? insert_revoke_hash+0x89/0xad
[<ffffffff802ded31>] ? ext4_free_blocks+0x71/0xc5
[<ffffffff802f3c6e>] ? ext4_ext_truncate+0x3fd/0x860
[<ffffffff80303f9a>] ? do_get_write_access+0x38a/0x3d0
[<ffffffff802e4a81>] ? ext4_truncate+0x67/0x4e2
[<ffffffff80304cda>] ? jbd2_journal_dirty_metadata+0xcc/0xe3
[<ffffffff802f4e9a>] ? __ext4_journal_dirty_metadata+0x1e/0x46
[<ffffffff802e228a>] ? ext4_mark_iloc_dirty+0x45e/0x4e3
[<ffffffff802e291f>] ? ext4_mark_inode_dirty+0x159/0x16c
[<ffffffff802e6d8c>] ? ext4_delete_inode+0x103/0x1c2
[<ffffffff802e6c89>] ? ext4_delete_inode+0x0/0x1c2
[<ffffffff8028fa9a>] ? generic_delete_inode+0xb0/0x124
[<ffffffff802eee27>] ? ext4_fill_super+0x1850/0x1b53
[<ffffffff8027ee98>] ? set_bdev_super+0x0/0xf
[<ffffffff802ed5d7>] ? ext4_fill_super+0x0/0x1b53
[<ffffffff8027feff>] ? get_sb_bdev+0xf8/0x145
[<ffffffff8027f92a>] ? vfs_kern_mount+0x93/0x11b
[<ffffffff8027fa05>] ? do_kern_mount+0x43/0xdc
[<ffffffff80294293>] ? do_new_mount+0x5b/0x94
[<ffffffff80294489>] ? do_mount+0x1bd/0x1ea
[<ffffffff80291f82>] ? copy_mount_options+0xcc/0x12b
[<ffffffff80294540>] ? sys_mount+0x8a/0xda
[<ffffffff8020bdcb>] ? system_call_fastpath+0x16/0x1b


Code: 8b b3 d0 21 00 00 48 8d bb d0 21 00 00 48 39 fe 74 2f 48 8b 57 08 48 8b
8b e0 21 00 00 48 8d 83 e0 21 048 89 0a 48 89 51 08 48 89 bb d0 21 00 00 48 89
7f
RIP [<ffffffff802f91c8>] ext4_mb_poll_new_transaction+0x6e/0xe3
RSP <ffff880128a6d888>
CR2: 0000000000000008
---[ end trace 72945378fb356467 ]---
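For readers unfamiliar with the "Oops: 0002" line in the trace above, the following is a minimal sketch of how the low three bits of the x86 page-fault error code are commonly read. The constant and function names are illustrative assumptions, not kernel identifiers, and only bits 0 to 2 are decoded.

```python
# Low bits of the x86 page-fault error code shown after "Oops:".
# The names below are illustrative; only bits 0-2 are handled here.
PF_PROT = 1 << 0   # 0: page not present, 1: protection violation
PF_WRITE = 1 << 1  # 0: read access, 1: write access
PF_USER = 1 << 2   # 0: fault in kernel mode, 1: fault in user mode


def decode_oops_code(code_hex):
    """Return a small dict describing the fault for a hex code such as '0002'."""
    code = int(code_hex, 16)
    return {
        "cause": "protection violation" if code & PF_PROT else "page not present",
        "access": "write" if code & PF_WRITE else "read",
        "mode": "user" if code & PF_USER else "kernel",
    }


if __name__ == "__main__":
    # "0002" is the value reported in this oops.
    print(decode_oops_code("0002"))
```

For this report the code decodes to a kernel-mode write to a page that is not present, which agrees with the "NULL pointer dereference at 0000000000000008" line and the CR2 value in the log.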


--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.


2008-11-05 20:56:26

by bugme-daemon

Subject: [Bug 11960] Oops in ext4_mb_poll_new_transaction

http://bugzilla.kernel.org/show_bug.cgi?id=11960





------- Comment #1 from [email protected] 2008-11-05 12:55 -------
Note, -stable is a stale branch pointer. It reflects commits that Linus has
pulled into mainline, so there's nothing _wrong_ with it, but it accidentally
got published. You probably want either ext4-stable (which is the latest
patches that have been accepted into mainline against the stable 2.6.27 kernel)
or for-stable, which is a set of patches we're going to be sending to the
2.6.27.x kernel when I have a chance.

It's almost certain that the bug won't show up in the ext4-stable branch, since
in the latest mainline kernel we've dropped ext4_mb_poll_new_transaction and
replaced it with something else that is far cleaner. However, the code is
still in the for-stable and 2.6.27.x branches. So if there is a bug in
2.6.27, we do want to track it down and fix it.

Hmm... at a guess, looking at the symptoms, I suspect it happens when there are
so many inodes on the orphaned inode list that it requires more than one
transaction to clear all of the inodes on the orphaned inode list. How big is
the journal on your filesystem? What does "dumpe2fs -h /dev/sdb1 | grep
Journal" report?
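When collecting this information from several machines, the journal size can be pulled out of the dumpe2fs header programmatically. The sketch below is only an illustration: the function name is hypothetical, and it assumes the "Journal size:" line uses a plain number with an optional k, M, or G suffix, as in the output quoted later in this thread.

```python
import re

# Multipliers for the size suffixes this sketch assumes dumpe2fs may print.
UNITS = {"": 1, "k": 1024, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def journal_size_bytes(dumpe2fs_text):
    """Return the journal size in bytes, or None if no 'Journal size' line exists."""
    match = re.search(r"^Journal size:\s*(\d+)([kKMG]?)\s*$",
                      dumpe2fs_text, re.MULTILINE)
    if match is None:
        return None
    return int(match.group(1)) * UNITS[match.group(2)]


if __name__ == "__main__":
    # Sample text in the shape of `dumpe2fs -h <device> | grep Journal` output.
    sample = "Journal inode: 8\nJournal backup: inode blocks\nJournal size: 128M\n"
    print(journal_size_bytes(sample))
```

In practice the text would come from running the dumpe2fs command shown above as root and capturing its standard output.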



2008-11-05 22:24:08

by bugme-daemon

Subject: [Bug 11960] Oops in ext4_mb_poll_new_transaction

http://bugzilla.kernel.org/show_bug.cgi?id=11960





------- Comment #2 from [email protected] 2008-11-05 14:24 -------
backup:~# dumpe2fs -h /dev/sdb1 | grep Journal
dumpe2fs 1.41.3 (12-Oct-2008)
Journal inode: 8
Journal backup: inode blocks
Journal size: 128M

I'll work on building a new kernel with the actual stable stuff soon. Hopefully
we won't see it there!



2008-11-06 00:12:57

by bugme-daemon

Subject: [Bug 11960] Oops in ext4_mb_poll_new_transaction

http://bugzilla.kernel.org/show_bug.cgi?id=11960





------- Comment #3 from [email protected] 2008-11-05 16:12 -------
OK, if you had a 128 MB journal, it must have been a corrupted orphan list
and/or journal that caused the crash. That's consistent with the... I'd really
like to be able to create a reproduction case for this, since otherwise we
won't know if the problem has been fixed in the newer mainline kernel.



2009-01-18 01:47:48

by bugme-daemon

Subject: [Bug 11960] Oops in ext4_mb_poll_new_transaction

http://bugzilla.kernel.org/show_bug.cgi?id=11960





------- Comment #4 from [email protected] 2009-01-17 17:47 -------
Any updates on this bug? If not, given that the function in question is no
longer in the ext4 codebase, I plan to close this bug. Thanks!!



2009-05-19 18:38:18

by bugzilla-daemon

Subject: [Bug 11960] Oops in ext4_mb_poll_new_transaction

http://bugzilla.kernel.org/show_bug.cgi?id=11960


Theodore Tso <[email protected]> changed:

          What|Removed                     |Added
----------------------------------------------------------------------------
        Status|NEW                         |RESOLVED
            CC|                            |[email protected]
    Resolution|                            |INSUFFICIENT_DATA
    Regression|---                         |No




--- Comment #5 from Theodore Tso <[email protected]> 2009-05-19 18:38:19 ---
There haven't been any updates since November 2008, and the function in
question no longer is in the ext4 code base. Please file a new bug if you are
still seeing problems. Thanks!!
