2008-01-31 14:01:40

by Eric Sesterhenn

[permalink] [raw]
Subject: BUG_ON at mballoc.c:3752

hi,

while running a modified version of fsfuzzer i triggered the BUG() in
ext4_mb_release_inode_pa(). Sadly I am not able to reproduce this using
the generated image, but running the fuzzer will usually trigger this in
less than 40 attempts. Increasing the JBD2 Debug level didnt give more
information. The kernel is current git with
ext4-fix-null-pointer-deref-in-journal_wait_on_commit_record.patch
applied.

Greetings, Eric

[ 1570.971980] EXT4-fs error (device loop0) in ext4_reserve_inode_write: Journal has aborted
[ 1570.972077] pa c6512330: logic 16, phys. 2337, len 16
[ 1570.972103] free 2, pa_free 1
[ 1570.972191] ------------[ cut here ]------------
[ 1570.972217] kernel BUG at fs/ext4/mballoc.c:3752!
[ 1570.972241] invalid opcode: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
[ 1570.972386] Modules linked in:
[ 1570.972425]
[ 1570.972509] Pid: 6629, comm: fstest Not tainted (2.6.24-05749-g8af03e7-dirty #19)
[ 1570.972534] EIP: 0060:[<c02266b9>] EFLAGS: 00010202 CPU: 0
[ 1570.972570] EIP is at ext4_mb_release_inode_pa+0x169/0x1a0
[ 1570.972595] EAX: 00000001 EBX: 00000930 ECX: 00000001 EDX: 00000001
[ 1570.972678] ESI: 00000930 EDI: c6512330 EBP: cb638b28 ESP: cb638a84
[ 1570.972703] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[ 1570.972728] Process fstest (pid: 6629, ti=cb638000 task=cb698000 task.ti=cb638000)
[ 1570.972751] Stack: c07c21bb 00000002 00000001 00000921 00000010 cbff34e0 cb638b54 00000002
[ 1570.972899] cb5fa430 c64824a0 cb5fb920 00000022 cbff34e0 cb638ad4 00000246 00000400
[ 1570.972899] cbfa2000 cb5fb920 cbff34e0 00000000 0000092e 00000000 00000002 cbfa2000
[ 1570.972899] Call Trace:
[ 1570.972899] [<c020e3a4>] ? read_block_bitmap+0x54/0x120
[ 1570.972899] [<c022bd24>] ? ext4_mb_discard_inode_preallocations+0x124/0x300
[ 1570.972899] [<c022bda5>] ? ext4_mb_discard_inode_preallocations+0x1a5/0x300
[ 1570.972899] [<c0223577>] ? ext4_ext_get_blocks+0x3a7/0x4b0
[ 1570.972899] [<c0213928>] ? ext4_get_blocks_wrap+0xe8/0x130
[ 1570.972899] [<c0213bce>] ? ext4_get_block+0x7e/0xf0
[ 1570.972899] [<c019d7ba>] ? __block_prepare_write+0x17a/0x3a0
[ 1570.972899] [<c019da68>] ? block_write_begin+0x48/0xe0
[ 1570.972899] [<c0213b50>] ? ext4_get_block+0x0/0xf0
[ 1570.972899] [<c0215127>] ? ext4_write_begin+0xb7/0x190
[ 1570.972899] [<c0213b50>] ? ext4_get_block+0x0/0xf0
[ 1570.972899] [<c01557e9>] ? generic_perform_write+0xa9/0x190
[ 1570.972899] [<c01575bd>] ? generic_file_buffered_write+0x6d/0x130
[ 1570.972899] [<c01578c1>] ? __generic_file_aio_write_nolock+0x241/0x550
[ 1570.972899] [<c0144b44>] ? trace_hardirqs_on+0xc4/0x150
[ 1570.972899] [<c0157c2c>] ? generic_file_aio_write+0x5c/0xd0
[ 1570.972899] [<c015ab0d>] ? free_one_page+0x1ed/0x220
[ 1570.972899] [<c0210410>] ? ext4_file_write+0x50/0x160
[ 1570.972899] [<c017b11d>] ? do_sync_write+0xcd/0x110
[ 1570.972899] [<c01096f9>] ? native_sched_clock+0x69/0xc0
[ 1570.972899] [<c01373c0>] ? autoremove_wake_function+0x0/0x50
[ 1570.972899] [<c01070e5>] ? do_softirq+0x55/0xd0
[ 1570.972899] [<c01050d3>] ? restore_nocheck+0x12/0x15
[ 1570.972899] [<c0144b44>] ? trace_hardirqs_on+0xc4/0x150
[ 1570.972899] [<c017b959>] ? vfs_write+0x99/0x130
[ 1570.972899] [<c017b050>] ? do_sync_write+0x0/0x110
[ 1570.972899] [<c017c048>] ? sys_pwrite64+0x68/0x70
[ 1570.972899] [<c0104fea>] ? sysenter_past_esp+0x5f/0xa5
[ 1570.972899] =======================
[ 1570.972899] Code: ff 0f b7 47 4e 89 44 24 08 8b 85 78 ff ff ff c7 04 24 bb 21 7c c0 89 44 24 04 e8 c3 e0 ef ff 0f b7 47 4e 39 85 78 ff ff ff 74 07 <0f> 0b eb fe 8d 76 00 8b 85 78 ff ff ff 8b 95 7c ff ff ff 01 82
[ 1570.972899] EIP: [<c02266b9>] ext4_mb_release_inode_pa+0x169/0x1a0 SS:ESP 0068:cb638a84
[ 1570.972942] ---[ end trace 51819e80cd9431da ]---
[ 1570.972969] note: fstest[6629] exited with preempt_count 1
[ 1570.973013] BUG: sleeping function called from invalid context at kernel/rwsem.c:21
[ 1570.973039] in_atomic():1, irqs_disabled():0
[ 1570.973077] INFO: lockdep is turned off.
[ 1570.973104] Pid: 6629, comm: fstest Tainted: G D 2.6.24-05749-g8af03e7-dirty #19
[ 1570.973159] [<c011e1a6>] __might_sleep+0xc6/0xf0
[ 1570.973224] [<c06b2c99>] down_read+0x19/0x80
[ 1570.973295] [<c013a7fd>] ? hrtimer_try_to_cancel+0x3d/0x80
[ 1570.973396] [<c0125a27>] exit_mm+0x27/0xd0
[ 1570.973467] [<c01272f3>] do_exit+0x133/0x2e0
[ 1570.973529] [<c010611c>] die+0x13c/0x140
[ 1570.973590] [<c0135197>] ? search_exception_tables+0x27/0x30
[ 1570.973993] [<c01061b1>] do_trap+0x91/0xc0
[ 1570.974054] [<c0106440>] ? do_invalid_op+0x0/0xa0
[ 1570.974133] [<c01064c9>] do_invalid_op+0x89/0xa0
[ 1570.974195] [<c02266b9>] ? ext4_mb_release_inode_pa+0x169/0x1a0
[ 1570.974290] [<c012007b>] ? account_system_time+0x9b/0xd0
[ 1570.974370] [<c01245de>] ? vprintk+0x1ce/0x360
[ 1570.974465] [<c06b4492>] error_code+0x6a/0x70
[ 1570.974531] [<c012007b>] ? account_system_time+0x9b/0xd0
[ 1570.974626] [<c02266b9>] ? ext4_mb_release_inode_pa+0x169/0x1a0
[ 1570.974707] [<c020e3a4>] ? read_block_bitmap+0x54/0x120
[ 1570.974803] [<c022bd24>] ? ext4_mb_discard_inode_preallocations+0x124/0x300
[ 1570.975151] [<c022bda5>] ext4_mb_discard_inode_preallocations+0x1a5/0x300
[ 1570.975221] [<c0223577>] ext4_ext_get_blocks+0x3a7/0x4b0
[ 1570.975289] [<c0213928>] ext4_get_blocks_wrap+0xe8/0x130
[ 1570.975352] [<c0213bce>] ext4_get_block+0x7e/0xf0
[ 1570.975413] [<c019d7ba>] __block_prepare_write+0x17a/0x3a0
[ 1570.975479] [<c019da68>] block_write_begin+0x48/0xe0
[ 1570.975541] [<c0213b50>] ? ext4_get_block+0x0/0xf0
[ 1570.975635] [<c0215127>] ext4_write_begin+0xb7/0x190
[ 1570.975697] [<c0213b50>] ? ext4_get_block+0x0/0xf0
[ 1570.975776] [<c01557e9>] generic_perform_write+0xa9/0x190
[ 1570.975838] [<c01575bd>] generic_file_buffered_write+0x6d/0x130
[ 1570.975944] [<c01578c1>] __generic_file_aio_write_nolock+0x241/0x550
[ 1570.976007] [<c0144b44>] ? trace_hardirqs_on+0xc4/0x150
[ 1570.976105] [<c0157c2c>] generic_file_aio_write+0x5c/0xd0
[ 1570.976166] [<c015ab0d>] ? free_one_page+0x1ed/0x220
[ 1570.976247] [<c0210410>] ext4_file_write+0x50/0x160
[ 1570.976308] [<c017b11d>] do_sync_write+0xcd/0x110
[ 1570.976372] [<c01096f9>] ? native_sched_clock+0x69/0xc0
[ 1570.976470] [<c01373c0>] ? autoremove_wake_function+0x0/0x50
[ 1570.976567] [<c01070e5>] ? do_softirq+0x55/0xd0
[ 1570.976662] [<c01050d3>] ? restore_nocheck+0x12/0x15
[ 1570.976741] [<c0144b44>] ? trace_hardirqs_on+0xc4/0x150
[ 1570.976837] [<c017b959>] vfs_write+0x99/0x130
[ 1570.976933] [<c017b050>] ? do_sync_write+0x0/0x110
[ 1570.977014] [<c017c048>] sys_pwrite64+0x68/0x70
[ 1570.977075] [<c0104fea>] sysenter_past_esp+0x5f/0xa5
[ 1570.977137] =======================
[ 1570.977400] BUG: scheduling while atomic: fstest/6629/0x00000002
[ 1570.977426] INFO: lockdep is turned off.
[ 1570.977468] Pid: 6629, comm: fstest Tainted: G D 2.6.24-05749-g8af03e7-dirty #19
[ 1570.977494] [<c0120125>] __schedule_bug+0x75/0x80
[ 1570.977557] [<c06b1db5>] schedule+0x2d5/0x3a0
[ 1570.977621] [<c06b3325>] rwsem_down_failed_common+0x75/0x170
[ 1570.977684] [<c0142fde>] ? __lock_contended+0xbe/0x110
[ 1570.977780] [<c06b343d>] rwsem_down_write_failed+0x1d/0x30
[ 1570.977842] [<c06b34ca>] call_rwsem_down_write_failed+0x6/0x8
[ 1570.977933] [<c06b2d81>] ? down_write+0x81/0x90
[ 1570.978013] [<c021058c>] ? ext4_release_file+0x6c/0x80
[ 1570.978108] [<c021058c>] ext4_release_file+0x6c/0x80
[ 1570.978168] [<c017c484>] __fput+0xb4/0x1b0
[ 1570.978231] [<c017c788>] fput+0x18/0x20
[ 1570.978291] [<c0179827>] filp_close+0x47/0x70
[ 1570.978352] [<c0125dd0>] close_files+0x70/0x80
[ 1570.978415] [<c0125e0a>] put_files_struct+0x2a/0x60
[ 1570.978477] [<c0125e80>] __exit_files+0x40/0x50
[ 1570.978538] [<c0127301>] do_exit+0x141/0x2e0
[ 1570.978600] [<c010611c>] die+0x13c/0x140
[ 1570.978661] [<c0135197>] ? search_exception_tables+0x27/0x30
[ 1570.978741] [<c01061b1>] do_trap+0x91/0xc0
[ 1570.978803] [<c0106440>] ? do_invalid_op+0x0/0xa0
[ 1570.979039] [<c01064c9>] do_invalid_op+0x89/0xa0
[ 1570.979104] [<c02266b9>] ? ext4_mb_release_inode_pa+0x169/0x1a0
[ 1570.979185] [<c012007b>] ? account_system_time+0x9b/0xd0
[ 1570.979280] [<c01245de>] ? vprintk+0x1ce/0x360
[ 1570.979360] [<c06b4492>] error_code+0x6a/0x70
[ 1570.979423] [<c012007b>] ? account_system_time+0x9b/0xd0
[ 1570.979519] [<c02266b9>] ? ext4_mb_release_inode_pa+0x169/0x1a0
[ 1570.979600] [<c020e3a4>] ? read_block_bitmap+0x54/0x120
[ 1570.979700] [<c022bd24>] ? ext4_mb_discard_inode_preallocations+0x124/0x300
[ 1570.979797] [<c022bda5>] ext4_mb_discard_inode_preallocations+0x1a5/0x300
[ 1570.979861] [<c0223577>] ext4_ext_get_blocks+0x3a7/0x4b0
[ 1570.979964] [<c0213928>] ext4_get_blocks_wrap+0xe8/0x130
[ 1570.980026] [<c0213bce>] ext4_get_block+0x7e/0xf0
[ 1570.980087] [<c019d7ba>] __block_prepare_write+0x17a/0x3a0
[ 1570.980153] [<c019da68>] block_write_begin+0x48/0xe0
[ 1570.980215] [<c0213b50>] ? ext4_get_block+0x0/0xf0
[ 1570.980309] [<c0215127>] ext4_write_begin+0xb7/0x190
[ 1570.980371] [<c0213b50>] ? ext4_get_block+0x0/0xf0
[ 1570.980450] [<c01557e9>] generic_perform_write+0xa9/0x190
[ 1570.980513] [<c01575bd>] generic_file_buffered_write+0x6d/0x130
[ 1570.980576] [<c01578c1>] __generic_file_aio_write_nolock+0x241/0x550
[ 1570.980639] [<c0144b44>] ? trace_hardirqs_on+0xc4/0x150
[ 1570.980735] [<c0157c2c>] generic_file_aio_write+0x5c/0xd0
[ 1570.980796] [<c015ab0d>] ? free_one_page+0x1ed/0x220
[ 1570.980932] [<c0210410>] ext4_file_write+0x50/0x160
[ 1570.980995] [<c017b11d>] do_sync_write+0xcd/0x110
[ 1570.981058] [<c01096f9>] ? native_sched_clock+0x69/0xc0
[ 1570.981142] [<c01373c0>] ? autoremove_wake_function+0x0/0x50
[ 1570.981238] [<c01070e5>] ? do_softirq+0x55/0xd0
[ 1570.981317] [<c01050d3>] ? restore_nocheck+0x12/0x15
[ 1570.981410] [<c0144b44>] ? trace_hardirqs_on+0xc4/0x150
[ 1570.981491] [<c017b959>] vfs_write+0x99/0x130
[ 1570.981552] [<c017b050>] ? do_sync_write+0x0/0x110
[ 1570.981646] [<c017c048>] sys_pwrite64+0x68/0x70
[ 1570.981708] [<c0104fea>] sysenter_past_esp+0x5f/0xa5
[ 1570.981769] =======================


2008-01-31 15:42:09

by Eric Sesterhenn

[permalink] [raw]
Subject: Re: BUG_ON at mballoc.c:3752

* Eric Sesterhenn ([email protected]) wrote:
> hi,
>
> while running a modified version of fsfuzzer i triggered the BUG() in
> ext4_mb_release_inode_pa(). Sadly I am not able to reproduce this using
> the generated image, but running the fuzzer will usually trigger this in
> less than 40 attempts. Increasing the JBD2 Debug level didnt give more
> information. The kernel is current git with
> ext4-fix-null-pointer-deref-in-journal_wait_on_commit_record.patch
> applied.

I am now able to reproduce this using this image:
http://www.cccmz.de/~snakebyte/ext4.24.img.bz2

the following commands will trigger the oops for me

mount cfs/ext4.24.img /media/test -t ext4dev -o extents -o loop
mkdir /media/test/stress
chown snakebyte:snakebyte /media/test/stress && sudo -u snakebyte fstest -n 10 -l 10 -f 5 -s 40000 -p /media/test/stress/


The && between the chown and fstest seems necessary

Greetings, Eric

2008-02-01 19:34:28

by Mingming Cao

[permalink] [raw]
Subject: Re: BUG_ON at mballoc.c:3752

On Thu, 2008-01-31 at 15:01 +0100, Eric Sesterhenn wrote:
> hi,
>
> while running a modified version of fsfuzzer i triggered the BUG() in
> ext4_mb_release_inode_pa(). Sadly I am not able to reproduce this using
> the generated image, but running the fuzzer will usually trigger this in
> less than 40 attempts. Increasing the JBD2 Debug level didnt give more
> information. The kernel is current git with
> ext4-fix-null-pointer-deref-in-journal_wait_on_commit_record.patch
> applied.
>
> Greetings, Eric
>
Thanks for reporting this.

> [ 1570.971980] EXT4-fs error (device loop0) in ext4_reserve_inode_write: Journal has aborted

Is there any more info about why jbd has aborted?

> [ 1570.972077] pa c6512330: logic 16, phys. 2337, len 16
> [ 1570.972103] free 2, pa_free 1

looks like free!=pa_free.

Aneesh, could you take a look? Thanks!

Mingming
> [ 1570.972191] ------------[ cut here ]------------
> [ 1570.972217] kernel BUG at fs/ext4/mballoc.c:3752!
> [ 1570.972241] invalid opcode: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
> [ 1570.972386] Modules linked in:
> [ 1570.972425]
> [ 1570.972509] Pid: 6629, comm: fstest Not tainted (2.6.24-05749-g8af03e7-dirty #19)
> [ 1570.972534] EIP: 0060:[<c02266b9>] EFLAGS: 00010202 CPU: 0
> [ 1570.972570] EIP is at ext4_mb_release_inode_pa+0x169/0x1a0
> [ 1570.972595] EAX: 00000001 EBX: 00000930 ECX: 00000001 EDX: 00000001
> [ 1570.972678] ESI: 00000930 EDI: c6512330 EBP: cb638b28 ESP: cb638a84
> [ 1570.972703] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> [ 1570.972728] Process fstest (pid: 6629, ti=cb638000 task=cb698000 task.ti=cb638000)
> [ 1570.972751] Stack: c07c21bb 00000002 00000001 00000921 00000010 cbff34e0 cb638b54 00000002
> [ 1570.972899] cb5fa430 c64824a0 cb5fb920 00000022 cbff34e0 cb638ad4 00000246 00000400
> [ 1570.972899] cbfa2000 cb5fb920 cbff34e0 00000000 0000092e 00000000 00000002 cbfa2000
> [ 1570.972899] Call Trace:
> [ 1570.972899] [<c020e3a4>] ? read_block_bitmap+0x54/0x120
> [ 1570.972899] [<c022bd24>] ? ext4_mb_discard_inode_preallocations+0x124/0x300
> [ 1570.972899] [<c022bda5>] ? ext4_mb_discard_inode_preallocations+0x1a5/0x300
> [ 1570.972899] [<c0223577>] ? ext4_ext_get_blocks+0x3a7/0x4b0
> [ 1570.972899] [<c0213928>] ? ext4_get_blocks_wrap+0xe8/0x130
> [ 1570.972899] [<c0213bce>] ? ext4_get_block+0x7e/0xf0
> [ 1570.972899] [<c019d7ba>] ? __block_prepare_write+0x17a/0x3a0
> [ 1570.972899] [<c019da68>] ? block_write_begin+0x48/0xe0
> [ 1570.972899] [<c0213b50>] ? ext4_get_block+0x0/0xf0
> [ 1570.972899] [<c0215127>] ? ext4_write_begin+0xb7/0x190
> [ 1570.972899] [<c0213b50>] ? ext4_get_block+0x0/0xf0
> [ 1570.972899] [<c01557e9>] ? generic_perform_write+0xa9/0x190
> [ 1570.972899] [<c01575bd>] ? generic_file_buffered_write+0x6d/0x130
> [ 1570.972899] [<c01578c1>] ? __generic_file_aio_write_nolock+0x241/0x550
> [ 1570.972899] [<c0144b44>] ? trace_hardirqs_on+0xc4/0x150
> [ 1570.972899] [<c0157c2c>] ? generic_file_aio_write+0x5c/0xd0
> [ 1570.972899] [<c015ab0d>] ? free_one_page+0x1ed/0x220
> [ 1570.972899] [<c0210410>] ? ext4_file_write+0x50/0x160
> [ 1570.972899] [<c017b11d>] ? do_sync_write+0xcd/0x110
> [ 1570.972899] [<c01096f9>] ? native_sched_clock+0x69/0xc0
> [ 1570.972899] [<c01373c0>] ? autoremove_wake_function+0x0/0x50
> [ 1570.972899] [<c01070e5>] ? do_softirq+0x55/0xd0
> [ 1570.972899] [<c01050d3>] ? restore_nocheck+0x12/0x15
> [ 1570.972899] [<c0144b44>] ? trace_hardirqs_on+0xc4/0x150
> [ 1570.972899] [<c017b959>] ? vfs_write+0x99/0x130
> [ 1570.972899] [<c017b050>] ? do_sync_write+0x0/0x110
> [ 1570.972899] [<c017c048>] ? sys_pwrite64+0x68/0x70
> [ 1570.972899] [<c0104fea>] ? sysenter_past_esp+0x5f/0xa5
> [ 1570.972899] =======================
> [ 1570.972899] Code: ff 0f b7 47 4e 89 44 24 08 8b 85 78 ff ff ff c7 04 24 bb 21 7c c0 89 44 24 04 e8 c3 e0 ef ff 0f b7 47 4e 39 85 78 ff ff ff 74 07 <0f> 0b eb fe 8d 76 00 8b 85 78 ff ff ff 8b 95 7c ff ff ff 01 82
> [ 1570.972899] EIP: [<c02266b9>] ext4_mb_release_inode_pa+0x169/0x1a0 SS:ESP 0068:cb638a84
> [ 1570.972942] ---[ end trace 51819e80cd9431da ]---
> [ 1570.972969] note: fstest[6629] exited with preempt_count 1
> [ 1570.973013] BUG: sleeping function called from invalid context at kernel/rwsem.c:21
> [ 1570.973039] in_atomic():1, irqs_disabled():0
> [ 1570.973077] INFO: lockdep is turned off.
> [ 1570.973104] Pid: 6629, comm: fstest Tainted: G D 2.6.24-05749-g8af03e7-dirty #19
> [ 1570.973159] [<c011e1a6>] __might_sleep+0xc6/0xf0
> [ 1570.973224] [<c06b2c99>] down_read+0x19/0x80
> [ 1570.973295] [<c013a7fd>] ? hrtimer_try_to_cancel+0x3d/0x80
> [ 1570.973396] [<c0125a27>] exit_mm+0x27/0xd0
> [ 1570.973467] [<c01272f3>] do_exit+0x133/0x2e0
> [ 1570.973529] [<c010611c>] die+0x13c/0x140
> [ 1570.973590] [<c0135197>] ? search_exception_tables+0x27/0x30
> [ 1570.973993] [<c01061b1>] do_trap+0x91/0xc0
> [ 1570.974054] [<c0106440>] ? do_invalid_op+0x0/0xa0
> [ 1570.974133] [<c01064c9>] do_invalid_op+0x89/0xa0
> [ 1570.974195] [<c02266b9>] ? ext4_mb_release_inode_pa+0x169/0x1a0
> [ 1570.974290] [<c012007b>] ? account_system_time+0x9b/0xd0
> [ 1570.974370] [<c01245de>] ? vprintk+0x1ce/0x360
> [ 1570.974465] [<c06b4492>] error_code+0x6a/0x70
> [ 1570.974531] [<c012007b>] ? account_system_time+0x9b/0xd0
> [ 1570.974626] [<c02266b9>] ? ext4_mb_release_inode_pa+0x169/0x1a0
> [ 1570.974707] [<c020e3a4>] ? read_block_bitmap+0x54/0x120
> [ 1570.974803] [<c022bd24>] ? ext4_mb_discard_inode_preallocations+0x124/0x300
> [ 1570.975151] [<c022bda5>] ext4_mb_discard_inode_preallocations+0x1a5/0x300
> [ 1570.975221] [<c0223577>] ext4_ext_get_blocks+0x3a7/0x4b0
> [ 1570.975289] [<c0213928>] ext4_get_blocks_wrap+0xe8/0x130
> [ 1570.975352] [<c0213bce>] ext4_get_block+0x7e/0xf0
> [ 1570.975413] [<c019d7ba>] __block_prepare_write+0x17a/0x3a0
> [ 1570.975479] [<c019da68>] block_write_begin+0x48/0xe0
> [ 1570.975541] [<c0213b50>] ? ext4_get_block+0x0/0xf0
> [ 1570.975635] [<c0215127>] ext4_write_begin+0xb7/0x190
> [ 1570.975697] [<c0213b50>] ? ext4_get_block+0x0/0xf0
> [ 1570.975776] [<c01557e9>] generic_perform_write+0xa9/0x190
> [ 1570.975838] [<c01575bd>] generic_file_buffered_write+0x6d/0x130
> [ 1570.975944] [<c01578c1>] __generic_file_aio_write_nolock+0x241/0x550
> [ 1570.976007] [<c0144b44>] ? trace_hardirqs_on+0xc4/0x150
> [ 1570.976105] [<c0157c2c>] generic_file_aio_write+0x5c/0xd0
> [ 1570.976166] [<c015ab0d>] ? free_one_page+0x1ed/0x220
> [ 1570.976247] [<c0210410>] ext4_file_write+0x50/0x160
> [ 1570.976308] [<c017b11d>] do_sync_write+0xcd/0x110
> [ 1570.976372] [<c01096f9>] ? native_sched_clock+0x69/0xc0
> [ 1570.976470] [<c01373c0>] ? autoremove_wake_function+0x0/0x50
> [ 1570.976567] [<c01070e5>] ? do_softirq+0x55/0xd0
> [ 1570.976662] [<c01050d3>] ? restore_nocheck+0x12/0x15
> [ 1570.976741] [<c0144b44>] ? trace_hardirqs_on+0xc4/0x150
> [ 1570.976837] [<c017b959>] vfs_write+0x99/0x130
> [ 1570.976933] [<c017b050>] ? do_sync_write+0x0/0x110
> [ 1570.977014] [<c017c048>] sys_pwrite64+0x68/0x70
> [ 1570.977075] [<c0104fea>] sysenter_past_esp+0x5f/0xa5
> [ 1570.977137] =======================
> [ 1570.977400] BUG: scheduling while atomic: fstest/6629/0x00000002
> [ 1570.977426] INFO: lockdep is turned off.
> [ 1570.977468] Pid: 6629, comm: fstest Tainted: G D 2.6.24-05749-g8af03e7-dirty #19
> [ 1570.977494] [<c0120125>] __schedule_bug+0x75/0x80
> [ 1570.977557] [<c06b1db5>] schedule+0x2d5/0x3a0
> [ 1570.977621] [<c06b3325>] rwsem_down_failed_common+0x75/0x170
> [ 1570.977684] [<c0142fde>] ? __lock_contended+0xbe/0x110
> [ 1570.977780] [<c06b343d>] rwsem_down_write_failed+0x1d/0x30
> [ 1570.977842] [<c06b34ca>] call_rwsem_down_write_failed+0x6/0x8
> [ 1570.977933] [<c06b2d81>] ? down_write+0x81/0x90
> [ 1570.978013] [<c021058c>] ? ext4_release_file+0x6c/0x80
> [ 1570.978108] [<c021058c>] ext4_release_file+0x6c/0x80
> [ 1570.978168] [<c017c484>] __fput+0xb4/0x1b0
> [ 1570.978231] [<c017c788>] fput+0x18/0x20
> [ 1570.978291] [<c0179827>] filp_close+0x47/0x70
> [ 1570.978352] [<c0125dd0>] close_files+0x70/0x80
> [ 1570.978415] [<c0125e0a>] put_files_struct+0x2a/0x60
> [ 1570.978477] [<c0125e80>] __exit_files+0x40/0x50
> [ 1570.978538] [<c0127301>] do_exit+0x141/0x2e0
> [ 1570.978600] [<c010611c>] die+0x13c/0x140
> [ 1570.978661] [<c0135197>] ? search_exception_tables+0x27/0x30
> [ 1570.978741] [<c01061b1>] do_trap+0x91/0xc0
> [ 1570.978803] [<c0106440>] ? do_invalid_op+0x0/0xa0
> [ 1570.979039] [<c01064c9>] do_invalid_op+0x89/0xa0
> [ 1570.979104] [<c02266b9>] ? ext4_mb_release_inode_pa+0x169/0x1a0
> [ 1570.979185] [<c012007b>] ? account_system_time+0x9b/0xd0
> [ 1570.979280] [<c01245de>] ? vprintk+0x1ce/0x360
> [ 1570.979360] [<c06b4492>] error_code+0x6a/0x70
> [ 1570.979423] [<c012007b>] ? account_system_time+0x9b/0xd0
> [ 1570.979519] [<c02266b9>] ? ext4_mb_release_inode_pa+0x169/0x1a0
> [ 1570.979600] [<c020e3a4>] ? read_block_bitmap+0x54/0x120
> [ 1570.979700] [<c022bd24>] ? ext4_mb_discard_inode_preallocations+0x124/0x300
> [ 1570.979797] [<c022bda5>] ext4_mb_discard_inode_preallocations+0x1a5/0x300
> [ 1570.979861] [<c0223577>] ext4_ext_get_blocks+0x3a7/0x4b0
> [ 1570.979964] [<c0213928>] ext4_get_blocks_wrap+0xe8/0x130
> [ 1570.980026] [<c0213bce>] ext4_get_block+0x7e/0xf0
> [ 1570.980087] [<c019d7ba>] __block_prepare_write+0x17a/0x3a0
> [ 1570.980153] [<c019da68>] block_write_begin+0x48/0xe0
> [ 1570.980215] [<c0213b50>] ? ext4_get_block+0x0/0xf0
> [ 1570.980309] [<c0215127>] ext4_write_begin+0xb7/0x190
> [ 1570.980371] [<c0213b50>] ? ext4_get_block+0x0/0xf0
> [ 1570.980450] [<c01557e9>] generic_perform_write+0xa9/0x190
> [ 1570.980513] [<c01575bd>] generic_file_buffered_write+0x6d/0x130
> [ 1570.980576] [<c01578c1>] __generic_file_aio_write_nolock+0x241/0x550
> [ 1570.980639] [<c0144b44>] ? trace_hardirqs_on+0xc4/0x150
> [ 1570.980735] [<c0157c2c>] generic_file_aio_write+0x5c/0xd0
> [ 1570.980796] [<c015ab0d>] ? free_one_page+0x1ed/0x220
> [ 1570.980932] [<c0210410>] ext4_file_write+0x50/0x160
> [ 1570.980995] [<c017b11d>] do_sync_write+0xcd/0x110
> [ 1570.981058] [<c01096f9>] ? native_sched_clock+0x69/0xc0
> [ 1570.981142] [<c01373c0>] ? autoremove_wake_function+0x0/0x50
> [ 1570.981238] [<c01070e5>] ? do_softirq+0x55/0xd0
> [ 1570.981317] [<c01050d3>] ? restore_nocheck+0x12/0x15
> [ 1570.981410] [<c0144b44>] ? trace_hardirqs_on+0xc4/0x150
> [ 1570.981491] [<c017b959>] vfs_write+0x99/0x130
> [ 1570.981552] [<c017b050>] ? do_sync_write+0x0/0x110
> [ 1570.981646] [<c017c048>] sys_pwrite64+0x68/0x70
> [ 1570.981708] [<c0104fea>] sysenter_past_esp+0x5f/0xa5
> [ 1570.981769] =======================
> -
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2008-02-04 06:01:01

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: BUG_ON at mballoc.c:3752

On Thu, Jan 31, 2008 at 04:42:07PM +0100, Eric Sesterhenn wrote:
> * Eric Sesterhenn ([email protected]) wrote:
> > hi,
> >
> > while running a modified version of fsfuzzer i triggered the BUG() in
> > ext4_mb_release_inode_pa(). Sadly I am not able to reproduce this using
> > the generated image, but running the fuzzer will usually trigger this in
> > less than 40 attempts. Increasing the JBD2 Debug level didnt give more
> > information. The kernel is current git with
> > ext4-fix-null-pointer-deref-in-journal_wait_on_commit_record.patch
> > applied.
>
> I am now able to reproduce this using this image:
> http://www.cccmz.de/~snakebyte/ext4.24.img.bz2
>
> the following commands will trigger the oops for me
>
> mount cfs/ext4.24.img /media/test -t ext4dev -o extents -o loop
> mkdir /media/test/stress
> chown snakebyte:snakebyte /media/test/stress && sudo -u snakebyte fstest -n 10 -l 10 -f 5 -s 40000 -p /media/test/stress/
>

The file system is corrupted. The BUG_ON indicate that the free spcae
marked in the prealloc space and found by looking at the bitmap are not
same. Do you have a set of steps that i can follow to reproduce this ?
on a clean file system ? Where do i find the fsfuzzer that you are using ?


[email protected]:/tmp$/home/opensource/patches/e2fsprogs-1.40.4.cfs1/e2fsck/e2fsck.static -f ./ext4.24.img
e2fsck 1.40.4.cfs1 (31-Dec-2007)
Superblock has an invalid ext3 journal (inode 8).
Clear<y>? yes

*** ext3 journal has been deleted - filesystem is now ext2 only ***

Pass 1: Checking inodes, blocks, and sizes
Inode 7 has illegal block(s). Clear<y>? yes

Illegal block #552 (9568256) in inode 7. CLEARED.
Illegal block #647 (4063232) in inode 7. CLEARED.
Illegal block #659 (12517376) in inode 7. CLEARED.
Illegal block #766 (2600468480) in inode 7. CLEARED.
Illegal block #944 (51200) in inode 7. CLEARED.
Illegal block #1135 (2583691264) in inode 7. CLEARED.
Illegal block #1214 (15925248) in inode 7. CLEARED.
Illegal block #1345 (771751936) in inode 7. CLEARED.
Illegal block #1384 (10092544) in inode 7. CLEARED.
Illegal block #1416 (14811136) in inode 7. CLEARED.
Illegal block #1470 (10420224) in inode 7. CLEARED.
Illegal block #1709 (10158080) in inode 7. CLEARED.
Too many illegal blocks in inode 7.
Clear inode<y>? yes

Journal inode is not in use, but contains data. Clear<y>? yes

Deleted inode 160 has zero dtime. Fix<y>? yes

Deleted inode 257 has zero dtime. Fix<y>? yes

Deleted inode 638 has zero dtime. Fix<y>? yes

Deleted inode 1407 has zero dtime. Fix<y>? yes

Deleted inode 1650 has zero dtime. Fix<y>? yes

Deleted inode 1656 has zero dtime. Fix<y>? yes

Deleted inode 1677 has zero dtime. Fix<y>? yes

Deleted inode 1720 has zero dtime. Fix<y>? yes

Inode 11 has illegal block(s). Clear<y>? yes

Illegal block #249 (15990784) in inode 11. CLEARED.
Inode 11, i_size is 12288, should be 33792. Fix<y>? yes

Inode 11, i_blocks is 24, should be 28. Fix<y>? yes

Recreate journal to make the filesystem ext3 again?
Fix<y>? yes

-aneesh

2008-02-04 08:29:39

by Eric Sesterhenn

[permalink] [raw]
Subject: Re: BUG_ON at mballoc.c:3752

* Aneesh Kumar K.V ([email protected]) wrote:
> On Thu, Jan 31, 2008 at 04:42:07PM +0100, Eric Sesterhenn wrote:
> > * Eric Sesterhenn ([email protected]) wrote:
> > > hi,
> > >
> > > while running a modified version of fsfuzzer i triggered the BUG() in
> > > ext4_mb_release_inode_pa(). Sadly I am not able to reproduce this using
> > > the generated image, but running the fuzzer will usually trigger this in
> > > less than 40 attempts. Increasing the JBD2 Debug level didnt give more
> > > information. The kernel is current git with
> > > ext4-fix-null-pointer-deref-in-journal_wait_on_commit_record.patch
> > > applied.
> >
> > I am now able to reproduce this using this image:
> > http://www.cccmz.de/~snakebyte/ext4.24.img.bz2
> >
> > the following commands will trigger the oops for me
> >
> > mount cfs/ext4.24.img /media/test -t ext4dev -o extents -o loop
> > mkdir /media/test/stress
> > chown snakebyte:snakebyte /media/test/stress && sudo -u snakebyte fstest -n 10 -l 10 -f 5 -s 40000 -p /media/test/stress/
> >
>
> The file system is corrupted. The BUG_ON indicate that the free spcae
> marked in the prealloc space and found by looking at the bitmap are not
> same. Do you have a set of steps that i can follow to reproduce this ?

just compile the mangle.c and run the modified fuzzer for a while (link
below) or use the steps above

> on a clean file system ?

had no luck on a clean fs with this

> Where do i find the fsfuzzer that you are using ?

http://www.cccmz.de/~snakebyte/fsfuzzer-0.6-lmh-eric.tar.bz2
http://www.cccmz.de/~snakebyte/fsfuzz.diff

needs user/group nobody:nobody and stuff like fstest, fsx, iozone,
fsstress

The changes i made are basically changing the fuzzing ratio, adding udf,
hfsplus and ext4, reducing the number of runs to 100 (not unlimited),
saving a backup of the image before mounting and running the tests, and
mounting ext3 with -o debug

Greetings, Eric

2008-02-06 21:59:50

by Dave Kleikamp

[permalink] [raw]
Subject: Re: BUG_ON at mballoc.c:3752

On Mon, 2008-02-04 at 11:30 +0530, Aneesh Kumar K.V wrote:
> On Thu, Jan 31, 2008 at 04:42:07PM +0100, Eric Sesterhenn wrote:
> > * Eric Sesterhenn ([email protected]) wrote:
> > > hi,
> > >
> > > while running a modified version of fsfuzzer i triggered the BUG() in
> > > ext4_mb_release_inode_pa(). Sadly I am not able to reproduce this using
> > > the generated image, but running the fuzzer will usually trigger this in
> > > less than 40 attempts. Increasing the JBD2 Debug level didnt give more
> > > information. The kernel is current git with
> > > ext4-fix-null-pointer-deref-in-journal_wait_on_commit_record.patch
> > > applied.
> >
> > I am now able to reproduce this using this image:
> > http://www.cccmz.de/~snakebyte/ext4.24.img.bz2
> >
> > the following commands will trigger the oops for me
> >
> > mount cfs/ext4.24.img /media/test -t ext4dev -o extents -o loop
> > mkdir /media/test/stress
> > chown snakebyte:snakebyte /media/test/stress && sudo -u snakebyte fstest -n 10 -l 10 -f 5 -s 40000 -p /media/test/stress/
> >
>
> The file system is corrupted.

That's what fsfuzzer does.

> The BUG_ON indicate that the free spcae
> marked in the prealloc space and found by looking at the bitmap are not
> same.

File systems should not call BUG() due to a corrupt file system.
Instead the code should fail the operation, possibly marking the file
system read-only (or panicking) depending on the errors= mount option.

--
David Kleikamp
IBM Linux Technology Center

2008-02-07 12:55:56

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: BUG_ON at mballoc.c:3752

On Wed, Feb 06, 2008 at 03:59:48PM -0600, Dave Kleikamp wrote:
>
> File systems should not call BUG() due to a corrupt file system.
> Instead the code should fail the operation, possibly marking the file
> system read-only (or panicking) depending on the errors= mount option.
>

Eric Sandeen explained me the same on IRC. I was busy with the migrate
locking bug. That's why i didn't update here. Today i tried to reproduce
the problem using the image provided. But in my case it is not hitting
the BUG_ON (mostly due to single cpu). I did look at the code and am not
still not clear how we can hit that BUG_ON. prealloc free space pa_free is
generated out of bitmap. So only if something corrupted bitmap after we
initialized prealloc space we will hit this case. In mballoc we error out
if the block allocated or fall in system zone. One thing i noticed is,
the journal is corrupt. So the only possibility that i have is journal write
resulted in bitmap corruption.

I also looked at the mballoc to make sure we don't panic in case of a
corrupt bitmap. Below is the patch that i have now. This one is yet to
go through the ABAT test but it would be nice to see whether the below
change cause any other issues.

Eric ,
can you run the test with below patch and see if this makes any
difference ?. I know we are not fixing any bugs in the below patch.


ext4: Don't panic in case of corrupt bitmap

From: Aneesh Kumar K.V <[email protected]>

Multiblock allocator was calling BUG_ON in many case if the free and used
blocks count obtained looking at the bitmap is different from what
the allocator internally accounted for. Use ext4_error in such case
and don't panic the system.

Signed-off-by: Aneesh Kumar K.V <[email protected]>
---

fs/ext4/mballoc.c | 35 +++++++++++++++++++++--------------
1 files changed, 21 insertions(+), 14 deletions(-)


diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 06d1f52..656729b 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -680,7 +680,6 @@ static void *mb_find_buddy(struct ext4_buddy *e4b, int order, int *max)
{
char *bb;

- /* FIXME!! is this needed */
BUG_ON(EXT4_MB_BITMAP(e4b) == EXT4_MB_BUDDY(e4b));
BUG_ON(max == NULL);

@@ -964,7 +963,7 @@ static void ext4_mb_generate_buddy(struct super_block *sb,
grp->bb_fragments = fragments;

if (free != grp->bb_free) {
- printk(KERN_DEBUG
+ ext4_error(sb, __FUNCTION__,
"EXT4-fs: group %lu: %u blocks in bitmap, %u in gd\n",
group, free, grp->bb_free);
grp->bb_free = free;
@@ -1821,13 +1820,24 @@ static void ext4_mb_complex_scan_group(struct ext4_allocation_context *ac,
i = ext4_find_next_zero_bit(bitmap,
EXT4_BLOCKS_PER_GROUP(sb), i);
if (i >= EXT4_BLOCKS_PER_GROUP(sb)) {
- BUG_ON(free != 0);
+ /*
+ * IF we corrupt the bitmap we won't find any
+ * free blocks even though group info says we
+ * we have free blocks
+ */
+ ext4_error(sb, __FUNCTION__, "%d free blocks as per "
+ "group info. But bitmap says 0\n",
+ free);
break;
}

mb_find_extent(e4b, 0, i, ac->ac_g_ex.fe_len, &ex);
BUG_ON(ex.fe_len <= 0);
- BUG_ON(free < ex.fe_len);
+ if (free < ex.fe_len) {
+ ext4_error(sb, __FUNCTION__, "%d free blocks as per "
+ "group info. But got %d blocks\n",
+ free, ex.fe_len);
+ }

ext4_mb_measure_extent(ac, &ex, e4b);

@@ -3354,13 +3364,10 @@ static void ext4_mb_use_group_pa(struct ext4_allocation_context *ac,
ac->ac_pa = pa;

/* we don't correct pa_pstart or pa_plen here to avoid
- * possible race when tte group is being loaded concurrently
+ * possible race when the group is being loaded concurrently
* instead we correct pa later, after blocks are marked
- * in on-disk bitmap -- see ext4_mb_release_context() */
- /*
- * FIXME!! but the other CPUs can look at this particular
- * pa and think that it have enought free blocks if we
- * don't update pa_free here right ?
+ * in on-disk bitmap -- see ext4_mb_release_context()
+ * Other CPUs are prevented from allocating from this pa by lg_mutex
*/
mb_debug("use %u/%u from group pa %p\n", pa->pa_lstart-len, len, pa);
}
@@ -3743,13 +3750,13 @@ static int ext4_mb_release_inode_pa(struct ext4_buddy *e4b,
bit = next + 1;
}
if (free != pa->pa_free) {
- printk(KERN_ERR "pa %p: logic %lu, phys. %lu, len %lu\n",
+ printk(KERN_CRIT "pa %p: logic %lu, phys. %lu, len %lu\n",
pa, (unsigned long) pa->pa_lstart,
(unsigned long) pa->pa_pstart,
(unsigned long) pa->pa_len);
- printk(KERN_ERR "free %u, pa_free %u\n", free, pa->pa_free);
+ ext4_error(sb, __FUNCTION__, "free %u, pa_free %u\n",
+ free, pa->pa_free);
}
- BUG_ON(free != pa->pa_free);
atomic_add(free, &sbi->s_mb_discarded);

return err;
@@ -4405,7 +4412,7 @@ void ext4_mb_free_blocks(handle_t *handle, struct inode *inode,
unsigned long block, unsigned long count,
int metadata, unsigned long *freed)
{
- struct buffer_head *bitmap_bh = 0;
+ struct buffer_head *bitmap_bh = NULL;
struct super_block *sb = inode->i_sb;
struct ext4_allocation_context ac;
struct ext4_group_desc *gdp;

2008-02-08 01:30:49

by Mingming Cao

[permalink] [raw]
Subject: Re: BUG_ON at mballoc.c:3752

On Thu, 2008-02-07 at 18:25 +0530, Aneesh Kumar K.V wrote:

> ext4: Don't panic in case of corrupt bitmap
>
> From: Aneesh Kumar K.V <[email protected]>
>
> Multiblock allocator was calling BUG_ON in many case if the free and used
> blocks count obtained looking at the bitmap is different from what
> the allocator internally accounted for. Use ext4_error in such case
> and don't panic the system.
>

There seems a lot of BUG_ON() and BUG() in mballoc code, other than this
case. Should it always panic the whole system in those cases? Perhaps
replacing with ext4_error() or some cases just WARN_ON is enough.

Mingming
> Signed-off-by: Aneesh Kumar K.V <[email protected]>
> ---
>
> fs/ext4/mballoc.c | 35 +++++++++++++++++++++--------------
> 1 files changed, 21 insertions(+), 14 deletions(-)
>
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 06d1f52..656729b 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -680,7 +680,6 @@ static void *mb_find_buddy(struct ext4_buddy *e4b, int order, int *max)
> {
> char *bb;
>
> - /* FIXME!! is this needed */
> BUG_ON(EXT4_MB_BITMAP(e4b) == EXT4_MB_BUDDY(e4b));
> BUG_ON(max == NULL);
>



> @@ -964,7 +963,7 @@ static void ext4_mb_generate_buddy(struct super_block *sb,
> grp->bb_fragments = fragments;
>
> if (free != grp->bb_free) {
> - printk(KERN_DEBUG
> + ext4_error(sb, __FUNCTION__,
> "EXT4-fs: group %lu: %u blocks in bitmap, %u in gd\n",
> group, free, grp->bb_free);
> grp->bb_free = free;
> @@ -1821,13 +1820,24 @@ static void ext4_mb_complex_scan_group(struct ext4_allocation_context *ac,
> i = ext4_find_next_zero_bit(bitmap,
> EXT4_BLOCKS_PER_GROUP(sb), i);
> if (i >= EXT4_BLOCKS_PER_GROUP(sb)) {
> - BUG_ON(free != 0);
> + /*
> + * IF we corrupt the bitmap we won't find any
> + * free blocks even though group info says we
> + * we have free blocks
> + */
> + ext4_error(sb, __FUNCTION__, "%d free blocks as per "
> + "group info. But bitmap says 0\n",
> + free);
> break;
> }
>
> mb_find_extent(e4b, 0, i, ac->ac_g_ex.fe_len, &ex);
> BUG_ON(ex.fe_len <= 0);
> - BUG_ON(free < ex.fe_len);
> + if (free < ex.fe_len) {
> + ext4_error(sb, __FUNCTION__, "%d free blocks as per "
> + "group info. But got %d blocks\n",
> + free, ex.fe_len);
> + }
>
> ext4_mb_measure_extent(ac, &ex, e4b);
>
> @@ -3354,13 +3364,10 @@ static void ext4_mb_use_group_pa(struct ext4_allocation_context *ac,
> ac->ac_pa = pa;
>
> /* we don't correct pa_pstart or pa_plen here to avoid
> - * possible race when tte group is being loaded concurrently
> + * possible race when the group is being loaded concurrently
> * instead we correct pa later, after blocks are marked
> - * in on-disk bitmap -- see ext4_mb_release_context() */
> - /*
> - * FIXME!! but the other CPUs can look at this particular
> - * pa and think that it have enought free blocks if we
> - * don't update pa_free here right ?
> + * in on-disk bitmap -- see ext4_mb_release_context()
> + * Other CPUs are prevented from allocating from this pa by lg_mutex
> */
> mb_debug("use %u/%u from group pa %p\n", pa->pa_lstart-len, len, pa);
> }
> @@ -3743,13 +3750,13 @@ static int ext4_mb_release_inode_pa(struct ext4_buddy *e4b,
> bit = next + 1;
> }
> if (free != pa->pa_free) {
> - printk(KERN_ERR "pa %p: logic %lu, phys. %lu, len %lu\n",
> + printk(KERN_CRIT "pa %p: logic %lu, phys. %lu, len %lu\n",
> pa, (unsigned long) pa->pa_lstart,
> (unsigned long) pa->pa_pstart,
> (unsigned long) pa->pa_len);
> - printk(KERN_ERR "free %u, pa_free %u\n", free, pa->pa_free);
> + ext4_error(sb, __FUNCTION__, "free %u, pa_free %u\n",
> + free, pa->pa_free);
> }
> - BUG_ON(free != pa->pa_free);
> atomic_add(free, &sbi->s_mb_discarded);
>
> return err;
> @@ -4405,7 +4412,7 @@ void ext4_mb_free_blocks(handle_t *handle, struct inode *inode,
> unsigned long block, unsigned long count,
> int metadata, unsigned long *freed)
> {
> - struct buffer_head *bitmap_bh = 0;
> + struct buffer_head *bitmap_bh = NULL;
> struct super_block *sb = inode->i_sb;
> struct ext4_allocation_context ac;
> struct ext4_group_desc *gdp;
> -
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2008-02-08 03:14:14

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: BUG_ON at mballoc.c:3752

On Thu, Feb 07, 2008 at 05:30:48PM -0800, Mingming Cao wrote:
> On Thu, 2008-02-07 at 18:25 +0530, Aneesh Kumar K.V wrote:
>
> > ext4: Don't panic in case of corrupt bitmap
> >
> > From: Aneesh Kumar K.V <[email protected]>
> >
> > Multiblock allocator was calling BUG_ON in many case if the free and used
> > blocks count obtained looking at the bitmap is different from what
> > the allocator internally accounted for. Use ext4_error in such case
> > and don't panic the system.
> >
>
> There seems a lot of BUG_ON() and BUG() in mballoc code, other than this
> case. Should it always panic the whole system in those cases? Perhaps
> replacing with ext4_error() or some cases just WARN_ON is enough.
>

I had looked at the BUG_ON in mballoc code and found them very useful
while stabilizing the mballoc code. It helped to catch wrong usage of
functions. Most of the BUG_ON are there to make sure we call the API
with the lock held or the API should not return value greater than 'x'
Should not call the function with a particular argument as NULL ...etc
kind of thing. So i would suggest to keep them as such.


-aneesh

2008-02-08 03:30:58

by Eric Sandeen

[permalink] [raw]
Subject: Re: BUG_ON at mballoc.c:3752

Aneesh Kumar K.V wrote:
> On Thu, Feb 07, 2008 at 05:30:48PM -0800, Mingming Cao wrote:
>> On Thu, 2008-02-07 at 18:25 +0530, Aneesh Kumar K.V wrote:
>>
>>> ext4: Don't panic in case of corrupt bitmap
>>>
>>> From: Aneesh Kumar K.V <[email protected]>
>>>
>>> Multiblock allocator was calling BUG_ON in many case if the free and used
>>> blocks count obtained looking at the bitmap is different from what
>>> the allocator internally accounted for. Use ext4_error in such case
>>> and don't panic the system.
>>>
>> There seems a lot of BUG_ON() and BUG() in mballoc code, other than this
>> case. Should it always panic the whole system in those cases? Perhaps
>> replacing with ext4_error() or some cases just WARN_ON is enough.
>>
>
> I had looked at the BUG_ON in mballoc code and found them very useful
> while stabilizing the mballoc code. It helped to catch wrong usage of
> functions. Most of the BUG_ON are there to make sure we call the API
> with the lock held or the API should not return value greater than 'x'
> Should not call the function with a particular argument as NULL ...etc
> kind of thing. So i would suggest to keep them as such.

Yep - as long as the BUG_ONs are for programming errors, and not things
like memory allocation failures or corrupted disks, I think it's ok.
Not that I've audited them all. :)

I asked about this early on, and got that answer... I'm reasonably
satisfied with it.

-Eric

2008-02-08 13:47:22

by Eric Sesterhenn

[permalink] [raw]
Subject: Re: BUG_ON at mballoc.c:3752

* Aneesh Kumar K.V ([email protected]) wrote:
> On Wed, Feb 06, 2008 at 03:59:48PM -0600, Dave Kleikamp wrote:
> >
> > File systems should not call BUG() due to a corrupt file system.
> > Instead the code should fail the operation, possibly marking the file
> > system read-only (or panicking) depending on the errors= mount option.
> >
>
> Eric Sandeen explained me the same on IRC. I was busy with the migrate
> locking bug. That's why i didn't update here. Today i tried to reproduce
> the problem using the image provided. But in my case it is not hitting
> the BUG_ON (mostly due to single cpu). I did look at the code and am not
> still not clear how we can hit that BUG_ON. prealloc free space pa_free is
> generated out of bitmap. So only if something corrupted bitmap after we
> initialized prealloc space we will hit this case. In mballoc we error out
> if the block allocated or fall in system zone. One thing i noticed is,
> the journal is corrupt. So the only possibility that i have is journal write
> resulted in bitmap corruption.
>
> I also looked at the mballoc to make sure we don't panic in case of a
> corrupt bitmap. Below is the patch that i have now. This one is yet to
> go through the ABAT test but it would be nice to see whether the below
> change cause any other issues.
>
> Eric ,
> can you run the test with below patch and see if this makes any
> difference ?. I know we are not fixing any bugs in the below patch.

hi,

so far i am not able to reproduce this on 2.6.24-08039-g488b5ec neither
with the
ext4-fix-null-pointer-deref-in-journal_wait_on_commit_record.patch nor
without it. I will try 2.6.24-05749-g8af03e7 with the patch and your
change later today.

Greetings, Eric

2008-02-08 15:07:06

by Eric Sesterhenn

[permalink] [raw]
Subject: Re: BUG_ON at mballoc.c:3752

> Eric ,
> can you run the test with below patch and see if this makes any
> difference ?. I know we are not fixing any bugs in the below patch.

ok, i checked out the old version again and applied both patches,
the BUG is gone (no surprise)
In the case where it would have hit the BUG i now get the following message:

[ 740.400288] Aborting journal on device loop0.
[ 740.405032] ext4_abort called.
[ 740.405097] EXT4-fs error (device loop0): ext4_journal_start_sb: Detected aborted journal
[ 740.405178] Remounting filesystem read-only
[ 740.410974] EXT4-fs error (device loop0) in ext4_ordered_write_end: IO failure
[ 740.414300] EXT4-fs error (device loop0) in ext4_reserve_inode_write: Journal has aborted
[ 740.414422] pa cba56990: logic 16, phys. 1953, len 16
[ 740.414447] EXT4-fs error (device loop0): ext4_mb_release_inode_pa: free 4, pa_free 3
[ 740.414548] EXT4-fs error (device loop0) in ext4_mb_free_blocks: Journal has aborted


Greetings, Eric

2008-02-08 15:28:17

by Eric Sandeen

[permalink] [raw]
Subject: Re: BUG_ON at mballoc.c:3752

Eric Sesterhenn wrote:
>> Eric ,
>> can you run the test with below patch and see if this makes any
>> difference ?. I know we are not fixing any bugs in the below patch.
>
> ok, i checked out the old version again and applied both patches,
> the BUG is gone (no surprise)
> In the case where it would have hit the BUG i now get the following message:
>
> [ 740.400288] Aborting journal on device loop0.

No message before that about *why* it aborted?

> [ 740.405032] ext4_abort called.
> [ 740.405097] EXT4-fs error (device loop0): ext4_journal_start_sb: Detected aborted journal
> [ 740.405178] Remounting filesystem read-only
> [ 740.410974] EXT4-fs error (device loop0) in ext4_ordered_write_end: IO failure
> [ 740.414300] EXT4-fs error (device loop0) in ext4_reserve_inode_write: Journal has aborted
> [ 740.414422] pa cba56990: logic 16, phys. 1953, len 16
> [ 740.414447] EXT4-fs error (device loop0): ext4_mb_release_inode_pa: free 4, pa_free 3
> [ 740.414548] EXT4-fs error (device loop0) in ext4_mb_free_blocks: Journal has aborted
>
>
> Greetings, Eric

2008-02-08 15:31:38

by Eric Sesterhenn

[permalink] [raw]
Subject: Re: BUG_ON at mballoc.c:3752

* Eric Sandeen ([email protected]) wrote:
> Eric Sesterhenn wrote:
> >> Eric ,
> >> can you run the test with below patch and see if this makes any
> >> difference ?. I know we are not fixing any bugs in the below patch.
> >
> > ok, i checked out the old version again and applied both patches,
> > the BUG is gone (no surprise)
> > In the case where it would have hit the BUG i now get the following message:
> >
> > [ 740.400288] Aborting journal on device loop0.
>
> No message before that about *why* it aborted?

assumed the stuff was from the previous runs


[ 113.318700] Aborting journal on device loop0.
[ 113.324237] ext4_abort called.
[ 113.324312] EXT4-fs error (device loop0): ext4_journal_start_sb:
Detected aborted journal
[ 113.324394] Remounting filesystem read-only
[ 113.330324] EXT4-fs error (device loop0) in ext4_dirty_inode: IO
failure
[ 113.330438] EXT4-fs error (device loop0) in ext4_ordered_write_end:
IO failure
[ 114.884485] EXT4-fs error (device loop0): ext4_mb_generate_buddy:
EXT4-fs: group 0: 6590 blocks in bitmap, 6600 in gd
[ 116.193424] Aborting journal on device loop0.
[ 116.196910] ext4_abort called.
[ 116.197064] EXT4-fs error (device loop0): ext4_journal_start_sb:
Detected aborted journal
[ 116.197147] Remounting filesystem read-only
[ 118.031066] EXT4-fs error (device loop0): ext4_mb_generate_buddy:
EXT4-fs: group 0: 6590 blocks in bitmap, 6600 in gd
[ 119.692812] Buffer I/O error on device loop0, logical block 20729
[ 121.968397] Buffer I/O error on device loop0, logical block 20729
[ 123.124070] Aborting journal on device loop0.
[ 123.125550] ext4_abort called.
[ 123.125654] EXT4-fs error (device loop0): ext4_journal_start_sb:
Detected aborted journal
[ 123.125736] Remounting filesystem read-only
[ 123.145792] pa cba56770: logic 32, phys. 3649, len 32
[ 123.145861] EXT4-fs error (device loop0): ext4_mb_release_inode_pa:
free 30, pa_free 29
[ 123.145943] EXT4-fs error (device loop0) in ext4_mb_free_blocks:
Journal has aborted
[ 123.153174] EXT4-fs error (device loop0) in ext4_dirty_inode: IO
failure
[ 123.153287] EXT4-fs error (device loop0) in ext4_reserve_inode_write:
Journal has aborted
[ 123.153363] pa cba56198: logic 16, phys. 2113, len 16
[ 123.153387] EXT4-fs error (device loop0): ext4_mb_release_inode_pa:
free 15, pa_free 14
[ 123.153488] EXT4-fs error (device loop0) in ext4_mb_free_blocks:
Journal has aborted
[ 125.193083] EXT4-fs error (device loop0): ext4_mb_generate_buddy:
EXT4-fs: group 0: 6590 blocks in bitmap, 6600 in gd
[ 127.683441] Aborting journal on device loop0.
[ 127.685090] EXT4-fs error (device loop0) in ext4_reserve_inode_write:
Journal has aborted
[ 127.687548] pa cba56330: logic 16, phys. 1969, len 16
[ 127.687585] EXT4-fs error (device loop0): ext4_mb_release_inode_pa:
free 16, pa_free 15
[ 127.687687] BUG: sleeping function called from invalid context at
include/linux/buffer_head.h:325
[ 127.688494] BUG: scheduling while atomic: fstest/4447/0x00000002
[ 127.712616] EXT4-fs error (device loop0) in ext4_reserve_inode_write:
Journal has aborted
[ 127.713055] BUG: scheduling while atomic: fstest/4447/0x00000002
[ 127.715557] EXT4-fs error (device loop0) in ext4_orphan_add: Journal
has aborted
[ 127.735782] EXT4-fs error (device loop0) in ext4_reserve_inode_write:
Journal has aborted
[ 127.735883] BUG: scheduling while atomic: fstest/4447/0x00000002
[ 127.736851] EXT4-fs error (device loop0) in ext4_setattr: Journal has
aborted
[ 127.758924] ext4_abort called.
[ 127.759044] EXT4-fs error (device loop0): ext4_journal_start_sb:
Detected aborted journal
[ 127.759125] Remounting filesystem read-only
[ 127.762504] EXT4-fs error (device loop0) in ext4_mb_free_blocks:
Journal has aborted
[ 130.003586] EXT4-fs error (device loop0): ext4_mb_generate_buddy:
EXT4-fs: group 0: 6590 blocks in bitmap, 6600 in gd
[ 133.104600] Aborting journal on device loop0.
[ 133.109459] ext4_abort called.
[ 133.109528] EXT4-fs error (device loop0): ext4_journal_start_sb:
Detected aborted journal
[ 133.109683] Remounting filesystem read-only
[ 133.128537] EXT4-fs error (device loop0) in ext4_dirty_inode: IO
failure
[ 133.128766] EXT4-fs error (device loop0) in ext4_reserve_inode_write:
Journal has aborted
[ 133.128803] EXT4-fs error (device loop0) in ext4_mb_free_blocks:
Journal has aborted
[ 135.037160] EXT4-fs error (device loop0): ext4_mb_generate_buddy:
EXT4-fs: group 0: 6590 blocks in bitmap, 6600 in gd
[ 135.995986] Aborting journal on device loop0.
[ 135.999977] ext4_abort called.
[ 136.000049] EXT4-fs error (device loop0): ext4_journal_start_sb:
Detected aborted journal
[ 136.000129] Remounting filesystem read-only
[ 136.005258] pa cba56440: logic 32, phys. 2081, len 32
[ 136.005312] EXT4-fs error (device loop0): ext4_mb_release_inode_pa:
free 30, pa_free 29
[ 136.005478] EXT4-fs error (device loop0) in ext4_mb_free_blocks:
Journal has aborted
[ 136.011164] EXT4-fs error (device loop0) in ext4_reserve_inode_write:
Journal has aborted
[ 136.011280] pa cba56088: logic 16, phys. 1857, len 16
[ 136.011304] EXT4-fs error (device loop0): ext4_mb_release_inode_pa:
free 7, pa_free 6
[ 136.011403] EXT4-fs error (device loop0) in ext4_mb_free_blocks:
Journal has aborted
[ 137.489878] EXT4-fs error (device loop0): ext4_mb_generate_buddy:
EXT4-fs: group 0: 6590 blocks in bitmap, 6600 in gd
[ 139.140550] Aborting journal on device loop0.
[ 139.144889] ext4_abort called.
[ 139.145015] EXT4-fs error (device loop0): ext4_journal_start_sb:
Detected aborted journal
[ 139.145098] Remounting filesystem read-only
[ 141.115285] EXT4-fs error (device loop0): ext4_mb_generate_buddy:
EXT4-fs: group 0: 6590 blocks in bitmap, 6600 in gd
[ 142.790793] Buffer I/O error on device loop0, logical block 13239314
[ 673.819560] EXT4-fs error (device loop0): ext4_mb_generate_buddy:
EXT4-fs: group 0: 6590 blocks in bitmap, 6600 in gd
[ 675.794735] Buffer I/O error on device loop0, logical block
4060087480
[ 676.048909] Aborting journal on device loop0.
[ 676.050604] ext4_abort called.
[ 676.050670] EXT4-fs error (device loop0): ext4_journal_start_sb:
Detected aborted journal
[ 676.050752] Remounting filesystem read-only
[ 676.065725] EXT4-fs error (device loop0) in ext4_ordered_write_end:
IO failure
[ 697.154481] EXT4-fs error (device loop0): ext4_mb_generate_buddy:
EXT4-fs: group 0: 6590 blocks in bitmap, 6600 in gd
[ 701.149256] Aborting journal on device loop0.
[ 701.150812] ext4_abort called.
[ 701.150856] EXT4-fs error (device loop0): ext4_journal_start_sb:
Detected aborted journal
[ 701.151010] Remounting filesystem read-only
[ 701.171010] EXT4-fs error (device loop0) in ext4_ordered_write_end:
IO failure
[ 728.261874] EXT4-fs error (device loop0): ext4_mb_generate_buddy:
EXT4-fs: group 0: 6590 blocks in bitmap, 6600 in gd
[ 729.913716] Aborting journal on device loop0.
[ 729.917148] ext4_abort called.
[ 729.917216] EXT4-fs error (device loop0): ext4_journal_start_sb:
Detected aborted journal
[ 729.917298] Remounting filesystem read-only
[ 729.933777] EXT4-fs error (device loop0) in ext4_reserve_inode_write:
Journal has aborted
[ 729.933858] EXT4-fs error (device loop0) in ext4_mb_free_blocks:
Journal has aborted
[ 729.941235] EXT4-fs error (device loop0) in ext4_mb_free_blocks:
Journal has aborted
[ 729.941314] EXT4-fs error (device loop0) in ext4_reserve_inode_write:
Journal has aborted
[ 729.941343] EXT4-fs error (device loop0) in ext4_reserve_inode_write:
Journal has aborted
[ 729.941398] EXT4-fs error (device loop0) in ext4_ext_remove_space:
Journal has aborted
[ 729.941431] EXT4-fs error (device loop0) in ext4_reserve_inode_write:
Journal has aborted
[ 729.941481] EXT4-fs error (device loop0) in ext4_orphan_del: Journal
has aborted
[ 735.424574] EXT4-fs error (device loop0): ext4_mb_generate_buddy:
EXT4-fs: group 0: 6590 blocks in bitmap, 6600 in gd
[ 740.400288] Aborting journal on device loop0.
[ 740.405032] ext4_abort called.
[ 740.405097] EXT4-fs error (device loop0): ext4_journal_start_sb:
Detected aborted journal
[ 740.405178] Remounting filesystem read-only
[ 740.410974] EXT4-fs error (device loop0) in ext4_ordered_write_end:
IO failure
[ 740.414300] EXT4-fs error (device loop0) in ext4_reserve_inode_write:
Journal has aborted
[ 740.414422] pa cba56990: logic 16, phys. 1953, len 16
[ 740.414447] EXT4-fs error (device loop0): ext4_mb_release_inode_pa:
free 4, pa_free 3
[ 740.414548] EXT4-fs error (device loop0) in ext4_mb_free_blocks:
Journal has aborted


Greetings, Eric

2008-02-08 15:37:06

by Eric Sandeen

[permalink] [raw]
Subject: Re: BUG_ON at mballoc.c:3752

Eric Sesterhenn wrote:
> * Eric Sandeen ([email protected]) wrote:
>> Eric Sesterhenn wrote:
>>>> Eric ,
>>>> can you run the test with below patch and see if this makes any
>>>> difference ?. I know we are not fixing any bugs in the below patch.
>>> ok, i checked out the old version again and applied both patches,
>>> the BUG is gone (no surprise)
>>> In the case where it would have hit the BUG i now get the following message:
>>>
>>> [ 740.400288] Aborting journal on device loop0.
>> No message before that about *why* it aborted?
>
> assumed the stuff was from the previous runs
>

> [ 735.424574] EXT4-fs error (device loop0): ext4_mb_generate_buddy:
> EXT4-fs: group 0: 6590 blocks in bitmap, 6600 in gd
> [ 740.400288] Aborting journal on device loop0.

There it is, thanks. Just making sure :)

-Eric