Christoph Hellwig, on 07/29/2010 12:31 PM wrote:
> My reading of the ext3/jbd code we explicitly wait on I/O completion
> of dependent writes, and only require those to actually be stable
> by issueing a flush. If that wasn't the case the default ext3
> barriers off behaviour would not only be dangerous on devices with
> volatile write caches, but also on devices that do not have them,
> which in addition to the reading of the code is not what we've seen
> in actual power fail testing, where ext3 does well as long as there
> is no volatile write cache.
Basically, it is so, but, unfortunately, not absolutely. I've just tried 2 tests on ext4 with iSCSI:
# uname -a
Linux ini 2.6.32-22-386 #36-Ubuntu SMP Fri Jun 4 00:27:09 UTC 2010 i686 GNU/Linux
# e2fsck -f -y /dev/sdb
e2fsck 1.41.11 (14-Mar-2010)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sdb: 49/640000 files (0.0% non-contiguous), 56496/1280000 blocks
root@ini:~# mount -t ext4 -o barrier=1 /dev/sdb /mnt
root@ini:~# cd /mnt/dbench-mod/
root@ini:/mnt/dbench-mod# ./dbench 50
50 clients started
...
<-- Pull cable
<-- After sometime a lot of warnings like:
(22002) open CLIENTS/CLIENT44/~DMTMP/COREL/CDRBARS.CFG failed for handle 4235 (Read-only file system)
(22004) open CLIENTS/CLIENT44/~DMTMP/COREL/ARTISTIC.ACL failed for handle 4236 (Read-only file system)
(22010) open CLIENTS/CLIENT44/~DMTMP/COREL/@@@CDRW.TMP failed for handle 4237 (Read-only file system)
(22011) nb_close: handle 4237 was not open
(22014) unlink CLIENTS/CLIENT44/~DMTMP/COREL/@@@CDRW.TMP failed (Read-only file system)
(22018) open CLIENTS/CLIENT44/~DMTMP/COREL/CORELDRW.CDT failed for handle 4238 (Read-only file system)
(22021) nb_close: handle 4218 was not open
(22032) open CLIENTS/CLIENT44/~DMTMP/COREL/GRAPHIC1.CDR failed for handle 4239 (Read-only file system)
(22050) open CLIENTS/CLIENT44/~DMTMP/COREL/@@@CDRW.TMP failed for handle 4240 (Read-only file system)
(22051) nb_close: handle 4240 was not open
(22054) unlink CLIENTS/CLIENT44/~DMTMP/COREL/@@@CDRW.TMP failed (Read-only file system)
(22057) nb_close: handle 4228 was not open
(22061) nb_close: handle 4182 was not open
(22065) nb_close: handle 4234 was not open
(22078) open CLIENTS/CLIENT44/~DMTMP/COREL/GRAPH1.CDR failed for handle 4242 (Read-only file system)^C^C^C^C^C^C
root@ini:/mnt/dbench-mod# ^C
root@ini:/mnt/dbench-mod# ^C
root@ini:~# umount /mnt
Segmentation fault
Kernel log:
Jul 29 19:55:35 ini kernel: [ 3044.722313] c2c28e40: 00023740 00023741 00023742 00023743 @7..A7..B7..C7..
Jul 29 19:55:35 ini kernel: [ 3044.722320] c2c28e50: 00023744 00023745 00023746 00023747 D7..E7..F7..G7..
Jul 29 19:55:35 ini kernel: [ 3044.722327] c2c28e60: 00023748 00023749 0002374a 0002374b H7..I7..J7..K7..
Jul 29 19:55:35 ini kernel: [ 3044.722334] c2c28e70: 0002372c 00000000 00000000 00000000 ,7..............
Jul 29 19:55:35 ini kernel: [ 3044.722341] c2c28e80: 00000000 00000000 00000000 00000002 ................
Jul 29 19:55:35 ini kernel: [ 3044.722346] c2c28e90: 00000000 00000000 00000000 00000000 ................
Jul 29 19:55:35 ini kernel: [ 3044.722354] c2c28ea0: c2c28ea0 c2c28ea0 c307f138 c307f138 ........8...8...
Jul 29 19:55:35 ini kernel: [ 3044.722360] c2c28eb0: 0003f800 00000000 00000000 00000000 ................
Jul 29 19:55:35 ini kernel: [ 3044.722366] c2c28ec0: c2c28ec0 c2c28ec0 00000000 00000000 ................
Jul 29 19:55:35 ini kernel: [ 3044.722373] c2c28ed0: 00100100 00200200 c2c28ed8 c2c28ed8 ...... .........
Jul 29 19:55:35 ini kernel: [ 3044.722379] c2c28ee0: c2c28ee0 c2c28ee0 0000800b 00000000 ................
Jul 29 19:55:35 ini kernel: [ 3044.722384] c2c28ef0: 00000001 00000000 00000000 00000000 ................
Jul 29 19:55:35 ini kernel: [ 3044.722391] c2c28f00: 00000001 00000000 0003f800 00000000 ................
Jul 29 19:55:35 ini kernel: [ 3044.722398] c2c28f10: 00000002 4c51a3cc 00000000 4c51a3cc ......QL......QL
Jul 29 19:55:35 ini kernel: [ 3044.722404] c2c28f20: 00000000 4c51a3cc 00000000 00000208 ......QL........
Jul 29 19:55:35 ini kernel: [ 3044.722410] c2c28f30: 00000000 0000000c 81800000 00000101 ................
Jul 29 19:55:35 ini kernel: [ 3044.722416] c2c28f40: 00000001 00000000 c2c28f48 c2c28f48 ........H...H...
Jul 29 19:55:35 ini kernel: [ 3044.722422] c2c28f50: 00000000 00000000 00000000 c2c28f5c ............\...
Jul 29 19:55:35 ini kernel: [ 3044.722428] c2c28f60: c2c28f5c c0593440 c05933c0 ca228a00 \[email protected]...".
Jul 29 19:55:35 ini kernel: [ 3044.722434] c2c28f70: 00000000 c2c28f78 c2c28ec8 00000000 ....x...........
Jul 29 19:55:35 ini kernel: [ 3044.722440] c2c28f80: 00000020 00000000 00000505 00000000 ...............
Jul 29 19:55:35 ini kernel: [ 3044.722446] c2c28f90: 00000000 00010001 c2c28f98 c2c28f98 ................
Jul 29 19:55:35 ini kernel: [ 3044.722451] c2c28fa0: 00000000 00000000 00000000 00000000 ................
Jul 29 19:55:35 ini kernel: [ 3044.722457] c2c28fb0: c0593680 000200da cdcc104c 00000202 .6Y.....L.......
Jul 29 19:55:35 ini kernel: [ 3044.722463] c2c28fc0: c2c28fc0 c2c28fc0 00000000 00000000 ................
Jul 29 19:55:35 ini kernel: [ 3044.722469] c2c28fd0: 00000000 c2c28fd4 c2c28fd4 00000000 ................
Jul 29 19:55:35 ini kernel: [ 3044.722475] c2c28fe0: 0623225b 00000000 00000000 c2c28fec ["#.............
Jul 29 19:55:35 ini kernel: [ 3044.722481] c2c28ff0: c2c28fec 00000001 00000000 c2c28ffc ................
Jul 29 19:55:35 ini kernel: [ 3044.722487] c2c29000: c2c28ffc 00000000 00000040 00000000 ........@.......
Jul 29 19:55:35 ini kernel: [ 3044.722493] c2c29010: 00000000 00000000 00000000 ffffffff ................
Jul 29 19:55:35 ini kernel: [ 3044.722499] c2c29020: ffffffff 00000000 00000000 00000000 ................
Jul 29 19:55:35 ini kernel: [ 3044.722505] c2c29030: c2c29030 c2c29030 c2c28ec8 00000000 0...0...........
Jul 29 19:55:35 ini kernel: [ 3044.722510] c2c29040: 00000000 00000000 00000000 00000000 ................
Jul 29 19:55:35 ini kernel: [ 3044.722516] c2c29050: 00000000 4c51a3d8 00000000 c2c2905c ......QL....\...
Jul 29 19:55:35 ini kernel: [ 3044.722522] c2c29060: c2c2905c 00000101 ffffffff 00000000 \...............
Jul 29 19:55:35 ini kernel: [ 3044.722528] c2c29070: 00000000 00000000 00000000 00000101 ................
Jul 29 19:55:35 ini kernel: [ 3044.722534] c2c29080: 00000000 00000000 c2c29088 c2c29088 ................
Jul 29 19:55:35 ini kernel: [ 3044.722540] c2c29090: 00000000 00005be2 00005be2 .....[...[..
Jul 29 19:55:35 ini kernel: [ 3044.722546] Pid: 1299, comm: umount Not tainted 2.6.32-22-386 #36-Ubuntu
Jul 29 19:55:35 ini kernel: [ 3044.722550] Call Trace:
Jul 29 19:55:35 ini kernel: [ 3044.722567] [<c0291731>] ext4_destroy_inode+0x91/0xa0
Jul 29 19:55:35 ini kernel: [ 3044.722577] [<c020ecb4>] destroy_inode+0x24/0x40
Jul 29 19:55:35 ini kernel: [ 3044.722583] [<c020f11e>] dispose_list+0x8e/0x100
Jul 29 19:55:35 ini kernel: [ 3044.722588] [<c020f534>] invalidate_inodes+0xf4/0x120
Jul 29 19:55:35 ini kernel: [ 3044.722598] [<c023b310>] ? vfs_quota_off+0x0/0x20
Jul 29 19:55:35 ini kernel: [ 3044.722606] [<c01fc602>] generic_shutdown_super+0x42/0xe0
Jul 29 19:55:35 ini kernel: [ 3044.722612] [<c01fc6ca>] kill_block_super+0x2a/0x50
Jul 29 19:55:35 ini kernel: [ 3044.722618] [<c01fd4e4>] deactivate_super+0x64/0x90
Jul 29 19:55:35 ini kernel: [ 3044.722625] [<c021282f>] mntput_no_expire+0x8f/0xe0
Jul 29 19:55:35 ini kernel: [ 3044.722631] [<c0212e47>] sys_umount+0x47/0xa0
Jul 29 19:55:35 ini kernel: [ 3044.722636] [<c0212ebe>] sys_oldumount+0x1e/0x20
Jul 29 19:55:35 ini kernel: [ 3044.722643] [<c01033ec>] syscall_call+0x7/0xb
Jul 29 19:55:35 ini kernel: [ 3044.731043] sd 6:0:0:0: [sdb] Unhandled error code
Jul 29 19:55:35 ini kernel: [ 3044.731049] sd 6:0:0:0: [sdb] Result: hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK
Jul 29 19:55:35 ini kernel: [ 3044.731056] sd 6:0:0:0: [sdb] CDB: Write(10): 2a 00 00 00 00 00 00 00 01 00
Jul 29 19:55:35 ini kernel: [ 3044.743469] __ratelimit: 37 callbacks suppressed
Jul 29 19:55:35 ini kernel: [ 3044.755695] lost page write due to I/O error on sdb
Jul 29 19:55:36 ini kernel: [ 3044.823044] Modules linked in: crc32c ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi w83627hf hwmon_vid fbcon tileblit font bitblit softcursor ppdev adm1021 i2c_i801 vga16fb vgastate e7xxx_edac psmouse serio_raw parport_pc shpchp edac_core lp parport qla2xxx ohci1394 scsi_transport_fc r8169 sata_via ieee1394 mii scsi_tgt e1000 floppy
Jul 29 19:55:36 ini kernel: [ 3044.823044]
Jul 29 19:55:36 ini kernel: [ 3044.823044] Pid: 1299, comm: umount Not tainted (2.6.32-22-386 #36-Ubuntu) X5DPA
Jul 29 19:55:36 ini kernel: [ 3044.823044] EIP: 0060:[<c0293c2a>] EFLAGS: 00010206 CPU: 0
Jul 29 19:55:36 ini kernel: [ 3044.823044] EIP is at ext4_put_super+0x2ea/0x350
Jul 29 19:55:36 ini kernel: [ 3044.823044] EAX: c2c28ea8 EBX: c307f000 ECX: ffffff52 EDX: c307f138
Jul 29 19:55:36 ini kernel: [ 3044.823044] ESI: ca228a00 EDI: c307f0fc EBP: cec6ff30 ESP: cec6fefc
Jul 29 19:55:36 ini kernel: [ 3044.823044] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Jul 29 19:55:36 ini kernel: [ 3044.823044] c06bb054 ca228b64 0000800b c2c28ec8 00008180 00000001 00000000 c307f138
Jul 29 19:55:36 ini kernel: [ 3044.823044] <0> c307f138 c307f138 ca228a00 c0593c80 c023b310 cec6ff48 c01fc60d ca228ac0
Jul 29 19:55:36 ini kernel: [ 3044.823044] <0> cec6ff44 cf328400 00000003 cec6ff58 c01fc6ca ca228a00 c0759d80 cec6ff6c
Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c023b310>] ? vfs_quota_off+0x0/0x20
Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c01fc60d>] ? generic_shutdown_super+0x4d/0xe0
Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c01fc6ca>] ? kill_block_super+0x2a/0x50
Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c01fd4e4>] ? deactivate_super+0x64/0x90
Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c021282f>] ? mntput_no_expire+0x8f/0xe0
Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c0212e47>] ? sys_umount+0x47/0xa0
Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c0212ebe>] ? sys_oldumount+0x1e/0x20
Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c01033ec>] ? syscall_call+0x7/0xb
Jul 29 19:55:36 ini kernel: [ 3045.299442] ---[ end trace 426db011a0289db3 ]---
Jul 29 19:55:36 ini kernel: [ 3045.310429] ------------[ cut here ]------------
Jul 29 19:55:36 ini kernel: [ 3045.321086] WARNING: at /build/buildd/linux-2.6.32/kernel/exit.c:895 do_exit+0x2f9/0x300()
Jul 29 19:55:36 ini kernel: [ 3045.342153] Hardware name: X5DPA
Jul 29 19:55:36 ini kernel: [ 3045.352697] Modules linked in: crc32c ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi w83627hf hwmon_vid fbcon tileblit font bitblit softcursor ppdev adm1021 i2c_i801 vga16fb vgastate e7xxx_edac psmouse serio_raw parport_pc shpchp edac_core lp parport qla2xxx ohci1394 scsi_transport_fc r8169 sata_via ieee1394 mii scsi_tgt e1000 floppy
Jul 29 19:55:36 ini kernel: [ 3045.422317] Pid: 1299, comm: umount Tainted: G D 2.6.32-22-386 #36-Ubuntu
Jul 29 19:55:36 ini kernel: [ 3045.444158] Call Trace:
Jul 29 19:55:36 ini kernel: [ 3045.454755] [<c01487a2>] warn_slowpath_common+0x72/0xa0
Jul 29 19:55:36 ini kernel: [ 3045.465152] [<c014ca49>] ? do_exit+0x2f9/0x300
Jul 29 19:55:36 ini kernel: [ 3045.475281] [<c014ca49>] ? do_exit+0x2f9/0x300
Jul 29 19:55:36 ini kernel: [ 3045.485296] [<c01487ea>] warn_slowpath_null+0x1a/0x20
Jul 29 19:55:36 ini kernel: [ 3045.495432] [<c014ca49>] do_exit+0x2f9/0x300
Jul 29 19:55:36 ini kernel: [ 3045.505640] [<c014856f>] ? print_oops_end_marker+0x2f/0x40
Jul 29 19:55:36 ini kernel: [ 3045.516012] [<c0579fc5>] oops_end+0x95/0xd0
Jul 29 19:55:36 ini kernel: [ 3045.526394] [<c01068a4>] die+0x54/0x80
Jul 29 19:55:36 ini kernel: [ 3045.536808] [<c0579716>] do_trap+0x96/0xc0
Jul 29 19:55:36 ini kernel: [ 3045.547268] [<c0104980>] ? do_invalid_op+0x0/0xa0
Jul 29 19:55:36 ini kernel: [ 3045.557756] [<c0104a0b>] do_invalid_op+0x8b/0xa0
Jul 29 19:55:36 ini kernel: [ 3045.568296] [<c0293c2a>] ? ext4_put_super+0x2ea/0x350
Jul 29 19:55:36 ini kernel: [ 3045.578561] [<c0149291>] ? vprintk+0x191/0x3f0
Jul 29 19:55:36 ini kernel: [ 3045.588708] [<c0579493>] error_code+0x73/0x80
Jul 29 19:55:36 ini kernel: [ 3045.598076] [<c0293c2a>] ? ext4_put_super+0x2ea/0x350
Jul 29 19:55:36 ini kernel: [ 3045.607381] [<c023b310>] ? vfs_quota_off+0x0/0x20
Jul 29 19:55:36 ini kernel: [ 3045.616499] [<c01fc60d>] generic_shutdown_super+0x4d/0xe0
Jul 29 19:55:36 ini kernel: [ 3045.625688] [<c01fc6ca>] kill_block_super+0x2a/0x50
Jul 29 19:55:36 ini kernel: [ 3045.634777] [<c01fd4e4>] deactivate_super+0x64/0x90
Jul 29 19:55:36 ini kernel: [ 3045.643744] [<c021282f>] mntput_no_expire+0x8f/0xe0
Jul 29 19:55:36 ini kernel: [ 3045.652782] [<c0212e47>] sys_umount+0x47/0xa0
Jul 29 19:55:36 ini kernel: [ 3045.661514] [<c0212ebe>] sys_oldumount+0x1e/0x20
Jul 29 19:55:36 ini kernel: [ 3045.670139] [<c01033ec>] syscall_call+0x7/0xb
Jul 29 19:55:36 ini kernel: [ 3045.678566] ---[ end trace 426db011a0289db4 ]---
Another test. Everything is as before, only I did not pull the cable, but deleted the corresponding LUN on the target, so all the command starting from this moment failed. Then on umount system rebooted. Kernel log:
Jul 29 20:20:42 ini kernel: [ 1320.251393] umount D 00478e55 0 1234 924 0x00000000
Jul 29 20:20:42 ini kernel: [ 1320.251403] ce579e10 00000086 00000082 00478e55 00000000 c082b330 cef574dc c082b330
Jul 29 20:20:42 ini kernel: [ 1320.251415] c082b330 c082b330 cef574dc 0355459b 0000010d c082b330 c082b330 ce652000
Jul 29 20:20:42 ini kernel: [ 1320.251425] 0000010d cef57230 c1407330 cef57230 ce579e5c ce579e20 c057745a ce579e54
Jul 29 20:20:42 ini kernel: [ 1320.251436] Call Trace:
Jul 29 20:20:42 ini kernel: [ 1320.251452] [<c057745a>] io_schedule+0x3a/0x60
Jul 29 20:20:42 ini kernel: [ 1320.251463] [<c01bd95d>] sync_page+0x3d/0x50
Jul 29 20:20:42 ini kernel: [ 1320.251470] [<c0577aa7>] __wait_on_bit_lock+0x47/0x90
Jul 29 20:20:42 ini kernel: [ 1320.251476] [<c01bd920>] ? sync_page+0x0/0x50
Jul 29 20:20:42 ini kernel: [ 1320.251483] [<c01bd8ee>] __lock_page+0x7e/0x90
Jul 29 20:20:42 ini kernel: [ 1320.251491] [<c01624d0>] ? wake_bit_function+0x0/0x50
Jul 29 20:20:42 ini kernel: [ 1320.251499] [<c01c7219>] truncate_inode_pages_range+0x2a9/0x2c0
Jul 29 20:20:42 ini kernel: [ 1320.251510] [<c02916c6>] ? ext4_destroy_inode+0x26/0xa0
Jul 29 20:20:42 ini kernel: [ 1320.251517] [<c01c724f>] truncate_inode_pages+0x1f/0x30
Jul 29 20:20:42 ini kernel: [ 1320.251523] [<c020f15c>] dispose_list+0xcc/0x100
Jul 29 20:20:42 ini kernel: [ 1320.251529] [<c020f534>] invalidate_inodes+0xf4/0x120
Jul 29 20:20:42 ini kernel: [ 1320.251538] [<c023b310>] ? vfs_quota_off+0x0/0x20
Jul 29 20:20:42 ini kernel: [ 1320.251546] [<c01fc602>] generic_shutdown_super+0x42/0xe0
Jul 29 20:20:42 ini kernel: [ 1320.251553] [<c01fc6ca>] kill_block_super+0x2a/0x50
Jul 29 20:20:42 ini kernel: [ 1320.251559] [<c01fd4e4>] deactivate_super+0x64/0x90
Jul 29 20:20:42 ini kernel: [ 1320.251566] [<c021282f>] mntput_no_expire+0x8f/0xe0
Jul 29 20:20:42 ini kernel: [ 1320.251573] [<c0212e47>] sys_umount+0x47/0xa0
Jul 29 20:20:42 ini kernel: [ 1320.251579] [<c0212ebe>] sys_oldumount+0x1e/0x20
Jul 29 20:20:42 ini kernel: [ 1320.251586] [<c01033ec>] syscall_call+0x7/0xb
Jul 29 20:22:42 ini kernel: [ 1440.285910] umount D 00478e55 0 1234 924 0x00000004
Jul 29 20:22:42 ini kernel: [ 1440.285919] ce579e10 00000086 00000082 00478e55 00000000 c082b330 cef574dc c082b330
Jul 29 20:22:42 ini kernel: [ 1440.285931] c082b330 c082b330 cef574dc 0355459b 0000010d c082b330 c082b330 ce652000
Jul 29 20:22:42 ini kernel: [ 1440.285942] 0000010d cef57230 c1407330 cef57230 ce579e5c ce579e20 c057745a ce579e54
Jul 29 20:22:42 ini kernel: [ 1440.285953] Call Trace:
Jul 29 20:22:42 ini kernel: [ 1440.285969] [<c057745a>] io_schedule+0x3a/0x60
Jul 29 20:22:42 ini kernel: [ 1440.285980] [<c01bd95d>] sync_page+0x3d/0x50
Jul 29 20:22:42 ini kernel: [ 1440.285987] [<c0577aa7>] __wait_on_bit_lock+0x47/0x90
Jul 29 20:22:42 ini kernel: [ 1440.285994] [<c01bd920>] ? sync_page+0x0/0x50
Jul 29 20:22:42 ini kernel: [ 1440.286001] [<c01bd8ee>] __lock_page+0x7e/0x90
Jul 29 20:22:42 ini kernel: [ 1440.286010] [<c01624d0>] ? wake_bit_function+0x0/0x50
Jul 29 20:22:42 ini kernel: [ 1440.286018] [<c01c7219>] truncate_inode_pages_range+0x2a9/0x2c0
Jul 29 20:22:42 ini kernel: [ 1440.286028] [<c02916c6>] ? ext4_destroy_inode+0x26/0xa0
Jul 29 20:22:42 ini kernel: [ 1440.286035] [<c01c724f>] truncate_inode_pages+0x1f/0x30
Jul 29 20:22:42 ini kernel: [ 1440.286041] [<c020f15c>] dispose_list+0xcc/0x100
Jul 29 20:22:42 ini kernel: [ 1440.286047] [<c020f534>] invalidate_inodes+0xf4/0x120
Jul 29 20:22:42 ini kernel: [ 1440.286056] [<c023b310>] ? vfs_quota_off+0x0/0x20
Jul 29 20:22:42 ini kernel: [ 1440.286064] [<c01fc602>] generic_shutdown_super+0x42/0xe0
Jul 29 20:22:42 ini kernel: [ 1440.286071] [<c01fc6ca>] kill_block_super+0x2a/0x50
Jul 29 20:22:42 ini kernel: [ 1440.286077] [<c01fd4e4>] deactivate_super+0x64/0x90
Jul 29 20:22:42 ini kernel: [ 1440.286084] [<c021282f>] mntput_no_expire+0x8f/0xe0
Jul 29 20:22:42 ini kernel: [ 1440.286091] [<c0212e47>] sys_umount+0x47/0xa0
Jul 29 20:22:42 ini kernel: [ 1440.286097] [<c0212ebe>] sys_oldumount+0x1e/0x20
Jul 29 20:22:42 ini kernel: [ 1440.286104] [<c01033ec>] syscall_call+0x7/0xb
Jul 29 20:24:42 ini kernel: [ 1560.321709] umount D 00478e55 0 1234 924 0x00000004
Jul 29 20:24:42 ini kernel: [ 1560.321718] ce579e10 00000086 00000082 00478e55 00000000 c082b330 cef574dc c082b330
Jul 29 20:24:42 ini kernel: [ 1560.321730] c082b330 c082b330 cef574dc 0355459b 0000010d c082b330 c082b330 ce652000
Jul 29 20:24:42 ini kernel: [ 1560.321741] 0000010d cef57230 c1407330 cef57230 ce579e5c ce579e20 c057745a ce579e54
Jul 29 20:24:42 ini kernel: [ 1560.321751] Call Trace:
Jul 29 20:24:42 ini kernel: [ 1560.321767] [<c057745a>] io_schedule+0x3a/0x60
Jul 29 20:24:42 ini kernel: [ 1560.321777] [<c01bd95d>] sync_page+0x3d/0x50
Jul 29 20:24:42 ini kernel: [ 1560.321784] [<c0577aa7>] __wait_on_bit_lock+0x47/0x90
Jul 29 20:24:42 ini kernel: [ 1560.321791] [<c01bd920>] ? sync_page+0x0/0x50
Jul 29 20:24:42 ini kernel: [ 1560.321797] [<c01bd8ee>] __lock_page+0x7e/0x90
Jul 29 20:24:42 ini kernel: [ 1560.321805] [<c01624d0>] ? wake_bit_function+0x0/0x50
Jul 29 20:24:42 ini kernel: [ 1560.321814] [<c01c7219>] truncate_inode_pages_range+0x2a9/0x2c0
Jul 29 20:24:42 ini kernel: [ 1560.321824] [<c02916c6>] ? ext4_destroy_inode+0x26/0xa0
Jul 29 20:24:42 ini kernel: [ 1560.321831] [<c01c724f>] truncate_inode_pages+0x1f/0x30
Jul 29 20:24:42 ini kernel: [ 1560.321837] [<c020f15c>] dispose_list+0xcc/0x100
Jul 29 20:24:42 ini kernel: [ 1560.321845] [<c020f534>] invalidate_inodes+0xf4/0x120
Jul 29 20:24:42 ini kernel: [ 1560.321855] [<c023b310>] ? vfs_quota_off+0x0/0x20
Jul 29 20:24:42 ini kernel: [ 1560.321864] [<c01fc602>] generic_shutdown_super+0x42/0xe0
Jul 29 20:24:42 ini kernel: [ 1560.321870] [<c01fc6ca>] kill_block_super+0x2a/0x50
Jul 29 20:24:42 ini kernel: [ 1560.321877] [<c01fd4e4>] deactivate_super+0x64/0x90
Jul 29 20:24:42 ini kernel: [ 1560.321885] [<c021282f>] mntput_no_expire+0x8f/0xe0
Jul 29 20:24:42 ini kernel: [ 1560.321892] [<c0212e47>] sys_umount+0x47/0xa0
Jul 29 20:24:42 ini kernel: [ 1560.321898] [<c0212ebe>] sys_oldumount+0x1e/0x20
Jul 29 20:24:42 ini kernel: [ 1560.321905] [<c01033ec>] syscall_call+0x7/0xb
Jul 29 20:24:42 ini kernel: [ 1560.358795] sync D 0004beb0 0 1265 1255 0x00000004
Jul 29 20:24:42 ini kernel: [ 1560.358803] cea6ff2c 00000086 00000001 0004beb0 00000000 c082b330 cde3db7c c082b330
Jul 29 20:24:42 ini kernel: [ 1560.358815] c082b330 c082b330 cde3db7c 15c42f3f 00000140 c082b330 c082b330 ce653200
Jul 29 20:24:42 ini kernel: [ 1560.358826] 00000140 cde3d8d0 cea6ff60 cde3d8d0 cefef23c cea6ff58 c0578de5 fffeffff
Jul 29 20:24:42 ini kernel: [ 1560.358837] Call Trace:
Jul 29 20:24:42 ini kernel: [ 1560.358845] [<c0578de5>] rwsem_down_failed_common+0x75/0x1a0
Jul 29 20:24:42 ini kernel: [ 1560.358852] [<c0578f5d>] rwsem_down_read_failed+0x1d/0x30
Jul 29 20:24:42 ini kernel: [ 1560.358858] [<c0578fb7>] call_rwsem_down_read_failed+0x7/0x10
Jul 29 20:24:42 ini kernel: [ 1560.358863] [<c057850c>] ? down_read+0x1c/0x20
Jul 29 20:24:42 ini kernel: [ 1560.358870] [<c021cb6d>] sync_filesystems+0xbd/0x110
Jul 29 20:24:42 ini kernel: [ 1560.358876] [<c021cc16>] sys_sync+0x16/0x40
Jul 29 20:24:42 ini kernel: [ 1560.358881] [<c01033ec>] syscall_call+0x7/0xb
Jul 29 20:26:42 ini kernel: [ 1680.392190] umount D 00478e55 0 1234 924 0x00000004
Jul 29 20:26:42 ini kernel: [ 1680.392200] ce579e10 00000086 00000082 00478e55 00000000 c082b330 cef574dc c082b330
Jul 29 20:26:42 ini kernel: [ 1680.392212] c082b330 c082b330 cef574dc 0355459b 0000010d c082b330 c082b330 ce652000
Jul 29 20:26:42 ini kernel: [ 1680.392223] 0000010d cef57230 c1407330 cef57230 ce579e5c ce579e20 c057745a ce579e54
Jul 29 20:26:42 ini kernel: [ 1680.392233] Call Trace:
Jul 29 20:26:42 ini kernel: [ 1680.392250] [<c057745a>] io_schedule+0x3a/0x60
Jul 29 20:26:42 ini kernel: [ 1680.392260] [<c01bd95d>] sync_page+0x3d/0x50
Jul 29 20:26:42 ini kernel: [ 1680.392267] [<c0577aa7>] __wait_on_bit_lock+0x47/0x90
Jul 29 20:26:42 ini kernel: [ 1680.392274] [<c01bd920>] ? sync_page+0x0/0x50
Jul 29 20:26:42 ini kernel: [ 1680.392280] [<c01bd8ee>] __lock_page+0x7e/0x90
Jul 29 20:26:42 ini kernel: [ 1680.392289] [<c01624d0>] ? wake_bit_function+0x0/0x50
Jul 29 20:26:42 ini kernel: [ 1680.392298] [<c01c7219>] truncate_inode_pages_range+0x2a9/0x2c0
Jul 29 20:26:42 ini kernel: [ 1680.392308] [<c02916c6>] ? ext4_destroy_inode+0x26/0xa0
Jul 29 20:26:42 ini kernel: [ 1680.392314] [<c01c724f>] truncate_inode_pages+0x1f/0x30
Jul 29 20:26:42 ini kernel: [ 1680.392321] [<c020f15c>] dispose_list+0xcc/0x100
Jul 29 20:26:42 ini kernel: [ 1680.392327] [<c020f534>] invalidate_inodes+0xf4/0x120
Jul 29 20:26:42 ini kernel: [ 1680.392336] [<c023b310>] ? vfs_quota_off+0x0/0x20
Jul 29 20:26:42 ini kernel: [ 1680.392344] [<c01fc602>] generic_shutdown_super+0x42/0xe0
Jul 29 20:26:42 ini kernel: [ 1680.392351] [<c01fc6ca>] kill_block_super+0x2a/0x50
Jul 29 20:26:42 ini kernel: [ 1680.392357] [<c01fd4e4>] deactivate_super+0x64/0x90
Jul 29 20:26:42 ini kernel: [ 1680.392364] [<c021282f>] mntput_no_expire+0x8f/0xe0
Jul 29 20:26:42 ini kernel: [ 1680.392371] [<c0212e47>] sys_umount+0x47/0xa0
Jul 29 20:26:42 ini kernel: [ 1680.392378] [<c0212ebe>] sys_oldumount+0x1e/0x20
Jul 29 20:26:42 ini kernel: [ 1680.392384] [<c01033ec>] syscall_call+0x7/0xb
Jul 29 20:26:42 ini kernel: [ 1680.427874] sync D 0004beb0 0 1265 1255 0x00000004
Jul 29 20:26:42 ini kernel: [ 1680.427883] cea6ff2c 00000086 00000001 0004beb0 00000000 c082b330 cde3db7c c082b330
Jul 29 20:26:42 ini kernel: [ 1680.427894] c082b330 c082b330 cde3db7c 15c42f3f 00000140 c082b330 c082b330 ce653200
Jul 29 20:26:42 ini kernel: [ 1680.427904] 00000140 cde3d8d0 cea6ff60 cde3d8d0 cefef23c cea6ff58 c0578de5 fffeffff
Jul 29 20:26:42 ini kernel: [ 1680.427915] Call Trace:
Jul 29 20:26:42 ini kernel: [ 1680.427922] [<c0578de5>] rwsem_down_failed_common+0x75/0x1a0
Jul 29 20:26:42 ini kernel: [ 1680.427929] [<c0578f5d>] rwsem_down_read_failed+0x1d/0x30
Jul 29 20:26:42 ini kernel: [ 1680.427935] [<c0578fb7>] call_rwsem_down_read_failed+0x7/0x10
Jul 29 20:26:42 ini kernel: [ 1680.427940] [<c057850c>] ? down_read+0x1c/0x20
Jul 29 20:26:42 ini kernel: [ 1680.427947] [<c021cb6d>] sync_filesystems+0xbd/0x110
Jul 29 20:26:42 ini kernel: [ 1680.427953] [<c021cc16>] sys_sync+0x16/0x40
Jul 29 20:26:42 ini kernel: [ 1680.427958] [<c01033ec>] syscall_call+0x7/0xb
Jul 29 20:28:42 ini kernel: [ 1800.458856] umount D 00478e55 0 1234 924 0x00000004
Jul 29 20:28:42 ini kernel: [ 1800.458866] ce579e10 00000086 00000082 00478e55 00000000 c082b330 cef574dc c082b330
Jul 29 20:28:42 ini kernel: [ 1800.458877] c082b330 c082b330 cef574dc 0355459b 0000010d c082b330 c082b330 ce652000
Jul 29 20:28:42 ini kernel: [ 1800.458888] 0000010d cef57230 c1407330 cef57230 ce579e5c ce579e20 c057745a ce579e54
Jul 29 20:28:42 ini kernel: [ 1800.458899] Call Trace:
Jul 29 20:28:42 ini kernel: [ 1800.458915] [<c057745a>] io_schedule+0x3a/0x60
Jul 29 20:28:42 ini kernel: [ 1800.458925] [<c01bd95d>] sync_page+0x3d/0x50
Jul 29 20:28:42 ini kernel: [ 1800.458932] [<c0577aa7>] __wait_on_bit_lock+0x47/0x90
Jul 29 20:28:42 ini kernel: [ 1800.458938] [<c01bd920>] ? sync_page+0x0/0x50
Jul 29 20:28:42 ini kernel: [ 1800.458945] [<c01bd8ee>] __lock_page+0x7e/0x90
Jul 29 20:28:42 ini kernel: [ 1800.458953] [<c01624d0>] ? wake_bit_function+0x0/0x50
Jul 29 20:28:42 ini kernel: [ 1800.458961] [<c01c7219>] truncate_inode_pages_range+0x2a9/0x2c0
Jul 29 20:28:42 ini kernel: [ 1800.458971] [<c02916c6>] ? ext4_destroy_inode+0x26/0xa0
Jul 29 20:28:42 ini kernel: [ 1800.458978] [<c01c724f>] truncate_inode_pages+0x1f/0x30
Jul 29 20:28:42 ini kernel: [ 1800.458984] [<c020f15c>] dispose_list+0xcc/0x100
Jul 29 20:28:42 ini kernel: [ 1800.458991] [<c020f534>] invalidate_inodes+0xf4/0x120
Jul 29 20:28:42 ini kernel: [ 1800.458999] [<c023b310>] ? vfs_quota_off+0x0/0x20
Jul 29 20:28:42 ini kernel: [ 1800.459007] [<c01fc602>] generic_shutdown_super+0x42/0xe0
Jul 29 20:28:42 ini kernel: [ 1800.459013] [<c01fc6ca>] kill_block_super+0x2a/0x50
Jul 29 20:28:42 ini kernel: [ 1800.459020] [<c01fd4e4>] deactivate_super+0x64/0x90
Jul 29 20:28:42 ini kernel: [ 1800.459027] [<c021282f>] mntput_no_expire+0x8f/0xe0
Jul 29 20:28:42 ini kernel: [ 1800.459033] [<c0212e47>] sys_umount+0x47/0xa0
Jul 29 20:28:42 ini kernel: [ 1800.459039] [<c0212ebe>] sys_oldumount+0x1e/0x20
Jul 29 20:28:42 ini kernel: [ 1800.459046] [<c01033ec>] syscall_call+0x7/0xb
Jul 29 20:28:42 ini kernel: [ 1800.493768] sync D 0004beb0 0 1265 1255 0x00000004
Jul 29 20:28:42 ini kernel: [ 1800.493777] cea6ff2c 00000086 00000001 0004beb0 00000000 c082b330 cde3db7c c082b330
Jul 29 20:28:42 ini kernel: [ 1800.493788] c082b330 c082b330 cde3db7c 15c42f3f 00000140 c082b330 c082b330 ce653200
Jul 29 20:28:42 ini kernel: [ 1800.493798] 00000140 cde3d8d0 cea6ff60 cde3d8d0 cefef23c cea6ff58 c0578de5 fffeffff
Jul 29 20:28:42 ini kernel: [ 1800.493809] Call Trace:
Jul 29 20:28:42 ini kernel: [ 1800.493816] [<c0578de5>] rwsem_down_failed_common+0x75/0x1a0
Jul 29 20:28:42 ini kernel: [ 1800.493823] [<c0578f5d>] rwsem_down_read_failed+0x1d/0x30
Jul 29 20:28:42 ini kernel: [ 1800.493828] [<c0578fb7>] call_rwsem_down_read_failed+0x7/0x10
Jul 29 20:28:42 ini kernel: [ 1800.493834] [<c057850c>] ? down_read+0x1c/0x20
Jul 29 20:28:42 ini kernel: [ 1800.493841] [<c021cb6d>] sync_filesystems+0xbd/0x110
Jul 29 20:28:42 ini kernel: [ 1800.493847] [<c021cc16>] sys_sync+0x16/0x40
Jul 29 20:28:42 ini kernel: [ 1800.493853] [<c01033ec>] syscall_call+0x7/0xb
Jul 29 20:30:42 ini kernel: [ 1920.526729] umount D 00478e55 0 1234 924 0x00000004
Jul 29 20:30:42 ini kernel: [ 1920.526739] ce579e10 00000086 00000082 00478e55 00000000 c082b330 cef574dc c082b330
Jul 29 20:30:42 ini kernel: [ 1920.526750] c082b330 c082b330 cef574dc 0355459b 0000010d c082b330 c082b330 ce652000
Jul 29 20:30:42 ini kernel: [ 1920.526761] 0000010d cef57230 c1407330 cef57230 ce579e5c ce579e20 c057745a ce579e54
Jul 29 20:30:42 ini kernel: [ 1920.526772] Call Trace:
Jul 29 20:30:42 ini kernel: [ 1920.526788] [<c057745a>] io_schedule+0x3a/0x60
Jul 29 20:30:42 ini kernel: [ 1920.526798] [<c01bd95d>] sync_page+0x3d/0x50
Jul 29 20:30:42 ini kernel: [ 1920.526805] [<c0577aa7>] __wait_on_bit_lock+0x47/0x90
Jul 29 20:30:42 ini kernel: [ 1920.526813] [<c01bd920>] ? sync_page+0x0/0x50
Jul 29 20:30:42 ini kernel: [ 1920.526819] [<c01bd8ee>] __lock_page+0x7e/0x90
Jul 29 20:30:42 ini kernel: [ 1920.526827] [<c01624d0>] ? wake_bit_function+0x0/0x50
Jul 29 20:30:42 ini kernel: [ 1920.526836] [<c01c7219>] truncate_inode_pages_range+0x2a9/0x2c0
Jul 29 20:30:42 ini kernel: [ 1920.526845] [<c02916c6>] ? ext4_destroy_inode+0x26/0xa0
Jul 29 20:30:42 ini kernel: [ 1920.526853] [<c01c724f>] truncate_inode_pages+0x1f/0x30
Jul 29 20:30:42 ini kernel: [ 1920.526859] [<c020f15c>] dispose_list+0xcc/0x100
Jul 29 20:30:42 ini kernel: [ 1920.526866] [<c020f534>] invalidate_inodes+0xf4/0x120
Jul 29 20:30:42 ini kernel: [ 1920.526874] [<c023b310>] ? vfs_quota_off+0x0/0x20
Jul 29 20:30:42 ini kernel: [ 1920.526882] [<c01fc602>] generic_shutdown_super+0x42/0xe0
Jul 29 20:30:42 ini kernel: [ 1920.526889] [<c01fc6ca>] kill_block_super+0x2a/0x50
Jul 29 20:30:42 ini kernel: [ 1920.526895] [<c01fd4e4>] deactivate_super+0x64/0x90
Jul 29 20:30:42 ini kernel: [ 1920.526902] [<c021282f>] mntput_no_expire+0x8f/0xe0
Jul 29 20:30:42 ini kernel: [ 1920.526908] [<c0212e47>] sys_umount+0x47/0xa0
Jul 29 20:30:42 ini kernel: [ 1920.526915] [<c0212ebe>] sys_oldumount+0x1e/0x20
Jul 29 20:30:42 ini kernel: [ 1920.526922] [<c01033ec>] syscall_call+0x7/0xb
Jul 29 20:30:42 ini kernel: [ 1920.563739] sync D 0004beb0 0 1265 1255 0x00000004
Jul 29 20:30:42 ini kernel: [ 1920.563747] cea6ff2c 00000086 00000001 0004beb0 00000000 c082b330 cde3db7c c082b330
Jul 29 20:30:42 ini kernel: [ 1920.563758] c082b330 c082b330 cde3db7c 15c42f3f 00000140 c082b330 c082b330 ce653200
Jul 29 20:30:42 ini kernel: [ 1920.563768] 00000140 cde3d8d0 cea6ff60 cde3d8d0 cefef23c cea6ff58 c0578de5 fffeffff
Jul 29 20:30:42 ini kernel: [ 1920.563779] Call Trace:
Jul 29 20:30:42 ini kernel: [ 1920.563787] [<c0578de5>] rwsem_down_failed_common+0x75/0x1a0
Jul 29 20:30:42 ini kernel: [ 1920.563793] [<c0578f5d>] rwsem_down_read_failed+0x1d/0x30
Jul 29 20:30:42 ini kernel: [ 1920.563799] [<c0578fb7>] call_rwsem_down_read_failed+0x7/0x10
Jul 29 20:30:42 ini kernel: [ 1920.563804] [<c057850c>] ? down_read+0x1c/0x20
Jul 29 20:30:42 ini kernel: [ 1920.563812] [<c021cb6d>] sync_filesystems+0xbd/0x110
Jul 29 20:30:42 ini kernel: [ 1920.563817] [<c021cc16>] sys_sync+0x16/0x40
Jul 29 20:30:42 ini kernel: [ 1920.563823] [<c01033ec>] syscall_call+0x7/0xb
Although in both cases the FS remained consistent:
root@ini:~# mount -t ext4 /dev/sdb /mnt
root@ini:~# umount /mnt
root@ini:~# e2fsck -f -y /dev/sdb
e2fsck 1.41.11 (14-Mar-2010)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sdb: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sdb: 4194/640000 files (74.2% non-contiguous), 334774/1280000 blocks
You can find full kernel logs starting from iSCSI load in the attachments.
I already reported such issues some time ago, but my reports were not too much welcomed, so I gave up. Anyway, anybody can easily do my tests at any time. They don't need any special hardware, just 2 Linux boxes: one for iSCSI target and one for iSCSI initiator (the test box itself). But they are generic for other transports as well. You can see there's nothing iSCSI specific in the traces.
Vlad
On Thu, Jul 29, 2010 at 05:00:10PM +0400, Vladislav Bolkhovitin wrote:
> You can find full kernel logs starting from iSCSI load in the attachments.
>
> I already reported such issues some time ago, but my reports were not too much welcomed, so I gave up. Anyway, anybody can easily do my tests at any time. They don't need any special hardware, just 2 Linux boxes: one for iSCSI target and one for iSCSI initiator (the test box itself). But they are generic for other transports as well. You can see there's nothing iSCSI specific in the traces.
I was only talking about ext3.
Christoph Hellwig, on 07/29/2010 05:08 PM wrote:
> On Thu, Jul 29, 2010 at 05:00:10PM +0400, Vladislav Bolkhovitin wrote:
>> You can find full kernel logs starting from iSCSI load in the attachments.
>>
>> I already reported such issues some time ago, but my reports were not too much welcomed, so I gave up. Anyway, anybody can easily do my tests at any time. They don't need any special hardware, just 2 Linux boxes: one for iSCSI target and one for iSCSI initiator (the test box itself). But they are generic for other transports as well. You can see there's nothing iSCSI specific in the traces.
>
> I was only talking about ext3.
Yes, now ext3 is a lot more reliable. The only how I was able to confuse it was:
...
(2197) nb_write: handle 4272 was not open size=65475 ofs=0
(2199) nb_write: handle 4272 was not open size=65475 ofs=65534
(2201) nb_write: handle 4272 was not open size=65475 ofs=131068
(2203) nb_write: handle 4272 was not open size=65475 ofs=196602
(2205) nb_write: handle 4272 was not open size=65475 ofs=262136^C
^C
root@ini:/mnt/dbench-mod# ^C
root@ini:/mnt/dbench-mod# ^C
root@ini:/mnt/dbench-mod# cd
root@ini:~# umount /mnt
<- recover device
root@ini:~# mount -t ext3 -o barrier=1 /dev/sdb /mnt
mount: wrong fs type, bad option, bad superblock on /dev/sdb,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so
Kernel log: "Jul 29 22:05:32 ini kernel: [ 2905.423092] JBD: recovery failed"
root@ini:~# mount -t ext3 -o barrier=1 /dev/sdb /mnt
root@ini:~#
Kernel log:
Jul 29 22:05:54 ini kernel: [ 2927.832893] kjournald starting. Commit interval 5 seconds
Jul 29 22:05:54 ini kernel: [ 2927.833430] EXT3 FS on sdb, internal journal
Jul 29 22:05:54 ini kernel: [ 2927.833499] EXT3-fs: sdb: 1 orphan inode deleted
Jul 29 22:05:54 ini kernel: [ 2927.833503] EXT3-fs: recovery complete.
Jul 29 22:05:54 ini kernel: [ 2927.838122] EXT3-fs: mounted filesystem with ordered data mode.
But it still remained consistent:
root@ini:~# umount /mnt
root@ini:~# e2fsck -f -y /dev/sdb
e2fsck 1.41.11 (14-Mar-2010)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sdb: 3504/320000 files (21.1% non-contiguous), 307034/1280000 blocks
Good progress since my original reports for kernels around 2.6.27!
Vlad
On Thu 29-07-10 17:00:10, Vladislav Bolkhovitin wrote:
> Christoph Hellwig, on 07/29/2010 12:31 PM wrote:
> > My reading of the ext3/jbd code we explicitly wait on I/O completion
> > of dependent writes, and only require those to actually be stable
> > by issueing a flush. If that wasn't the case the default ext3
> > barriers off behaviour would not only be dangerous on devices with
> > volatile write caches, but also on devices that do not have them,
> > which in addition to the reading of the code is not what we've seen
> > in actual power fail testing, where ext3 does well as long as there
> > is no volatile write cache.
>
> Basically, it is so, but, unfortunately, not absolutely. I've just tried
> 2 tests on ext4 with iSCSI:
>
> # uname -a
> Linux ini 2.6.32-22-386 #36-Ubuntu SMP Fri Jun 4 00:27:09 UTC 2010 i686 GNU/Linux
>
> # e2fsck -f -y /dev/sdb
> e2fsck 1.41.11 (14-Mar-2010)
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> /dev/sdb: 49/640000 files (0.0% non-contiguous), 56496/1280000 blocks
> root@ini:~# mount -t ext4 -o barrier=1 /dev/sdb /mnt
> root@ini:~# cd /mnt/dbench-mod/
> root@ini:/mnt/dbench-mod# ./dbench 50
> 50 clients started
> ...
> <-- Pull cable
> <-- After sometime a lot of warnings like:
> (22002) open CLIENTS/CLIENT44/~DMTMP/COREL/CDRBARS.CFG failed for handle 4235 (Read-only file system)
> (22004) open CLIENTS/CLIENT44/~DMTMP/COREL/ARTISTIC.ACL failed for handle 4236 (Read-only file system)
...
These are OK. You pulled a cable and now you start getting EIO from the
kernel.
> root@ini:/mnt/dbench-mod# ^C
> root@ini:/mnt/dbench-mod# ^C
> root@ini:~# umount /mnt
> Segmentation fault
This isn't OK of course ;)
> Kernel log:
>
> Jul 29 19:55:35 ini kernel: [ 3044.722313] c2c28e40: 00023740 00023741 00023742 00023743 @7..A7..B7..C7..
> Jul 29 19:55:35 ini kernel: [ 3044.722320] c2c28e50: 00023744 00023745 00023746 00023747 D7..E7..F7..G7..
> Jul 29 19:55:35 ini kernel: [ 3044.722327] c2c28e60: 00023748 00023749 0002374a 0002374b H7..I7..J7..K7..
> Jul 29 19:55:35 ini kernel: [ 3044.722334] c2c28e70: 0002372c 00000000 00000000 00000000 ,7..............
> Jul 29 19:55:35 ini kernel: [ 3044.722341] c2c28e80: 00000000 00000000 00000000 00000002 ................
...
Sadly these messages above seem to have overwritten beginning of the
message below. Hmm, but maybe it's just a warning message about inode still
being on orphan list because the next oops still shows untainted kernel.
> Jul 29 19:55:35 ini kernel: [ 3044.722546] Pid: 1299, comm: umount Not tainted 2.6.32-22-386 #36-Ubuntu
> Jul 29 19:55:35 ini kernel: [ 3044.722550] Call Trace:
> Jul 29 19:55:35 ini kernel: [ 3044.722567] [<c0291731>] ext4_destroy_inode+0x91/0xa0
> Jul 29 19:55:35 ini kernel: [ 3044.722577] [<c020ecb4>] destroy_inode+0x24/0x40
> Jul 29 19:55:35 ini kernel: [ 3044.722583] [<c020f11e>] dispose_list+0x8e/0x100
> Jul 29 19:55:35 ini kernel: [ 3044.722588] [<c020f534>] invalidate_inodes+0xf4/0x120
> Jul 29 19:55:35 ini kernel: [ 3044.722598] [<c023b310>] ? vfs_quota_off+0x0/0x20
> Jul 29 19:55:35 ini kernel: [ 3044.722606] [<c01fc602>] generic_shutdown_super+0x42/0xe0
> Jul 29 19:55:35 ini kernel: [ 3044.722612] [<c01fc6ca>] kill_block_super+0x2a/0x50
> Jul 29 19:55:35 ini kernel: [ 3044.722618] [<c01fd4e4>] deactivate_super+0x64/0x90
> Jul 29 19:55:35 ini kernel: [ 3044.722625] [<c021282f>] mntput_no_expire+0x8f/0xe0
> Jul 29 19:55:35 ini kernel: [ 3044.722631] [<c0212e47>] sys_umount+0x47/0xa0
> Jul 29 19:55:35 ini kernel: [ 3044.722636] [<c0212ebe>] sys_oldumount+0x1e/0x20
> Jul 29 19:55:35 ini kernel: [ 3044.722643] [<c01033ec>] syscall_call+0x7/0xb
> Jul 29 19:55:35 ini kernel: [ 3044.731043] sd 6:0:0:0: [sdb] Unhandled error code
> Jul 29 19:55:35 ini kernel: [ 3044.731049] sd 6:0:0:0: [sdb] Result: hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK
> Jul 29 19:55:35 ini kernel: [ 3044.731056] sd 6:0:0:0: [sdb] CDB: Write(10): 2a 00 00 00 00 00 00 00 01 00
> Jul 29 19:55:35 ini kernel: [ 3044.743469] __ratelimit: 37 callbacks suppressed
> Jul 29 19:55:35 ini kernel: [ 3044.755695] lost page write due to I/O error on sdb
> Jul 29 19:55:36 ini kernel: [ 3044.823044] Modules linked in: crc32c ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi w83627hf hwmon_vid fbcon tileblit font bitblit softcursor ppdev adm1021 i2c_i801 vga16fb vgastate e7xxx_edac psmouse serio_raw parport_pc shpchp edac_core lp parport qla2xxx ohci1394 scsi_transport_fc r8169 sata_via ieee1394 mii scsi_tgt e1000 floppy
So here probably starts the real oops. But sadly we are missing the
beginning as well. Can you send me disassembly of your ext4_put_super?
> Jul 29 19:55:36 ini kernel: [ 3044.823044]
> Jul 29 19:55:36 ini kernel: [ 3044.823044] Pid: 1299, comm: umount Not tainted (2.6.32-22-386 #36-Ubuntu) X5DPA
> Jul 29 19:55:36 ini kernel: [ 3044.823044] EIP: 0060:[<c0293c2a>] EFLAGS: 00010206 CPU: 0
> Jul 29 19:55:36 ini kernel: [ 3044.823044] EIP is at ext4_put_super+0x2ea/0x350
> Jul 29 19:55:36 ini kernel: [ 3044.823044] EAX: c2c28ea8 EBX: c307f000 ECX: ffffff52 EDX: c307f138
> Jul 29 19:55:36 ini kernel: [ 3044.823044] ESI: ca228a00 EDI: c307f0fc EBP: cec6ff30 ESP: cec6fefc
> Jul 29 19:55:36 ini kernel: [ 3044.823044] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> Jul 29 19:55:36 ini kernel: [ 3044.823044] c06bb054 ca228b64 0000800b c2c28ec8 00008180 00000001 00000000 c307f138
> Jul 29 19:55:36 ini kernel: [ 3044.823044] <0> c307f138 c307f138 ca228a00 c0593c80 c023b310 cec6ff48 c01fc60d ca228ac0
> Jul 29 19:55:36 ini kernel: [ 3044.823044] <0> cec6ff44 cf328400 00000003 cec6ff58 c01fc6ca ca228a00 c0759d80 cec6ff6c
> Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c023b310>] ? vfs_quota_off+0x0/0x20
> Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c01fc60d>] ? generic_shutdown_super+0x4d/0xe0
> Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c01fc6ca>] ? kill_block_super+0x2a/0x50
> Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c01fd4e4>] ? deactivate_super+0x64/0x90
> Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c021282f>] ? mntput_no_expire+0x8f/0xe0
> Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c0212e47>] ? sys_umount+0x47/0xa0
> Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c0212ebe>] ? sys_oldumount+0x1e/0x20
> Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c01033ec>] ? syscall_call+0x7/0xb
> Jul 29 19:55:36 ini kernel: [ 3045.299442] ---[ end trace 426db011a0289db3 ]---
...
> Another test. Everything is as before, only I did not pull the cable, but
> deleted the corresponding LUN on the target, so all the command starting
> from this moment failed. Then on umount system rebooted. Kernel log:
Nasty. But the log actually contains only traces of processes in D state
(generally waiting for a page to be unlocked). Do you have any sort of
watchdog which might have rebooted the machine?
> Jul 29 20:20:42 ini kernel: [ 1320.251393] umount D 00478e55 0 1234 924 0x00000000
> Jul 29 20:20:42 ini kernel: [ 1320.251403] ce579e10 00000086 00000082 00478e55 00000000 c082b330 cef574dc c082b330
> Jul 29 20:20:42 ini kernel: [ 1320.251415] c082b330 c082b330 cef574dc 0355459b 0000010d c082b330 c082b330 ce652000
> Jul 29 20:20:42 ini kernel: [ 1320.251425] 0000010d cef57230 c1407330 cef57230 ce579e5c ce579e20 c057745a ce579e54
> Jul 29 20:20:42 ini kernel: [ 1320.251436] Call Trace:
> Jul 29 20:20:42 ini kernel: [ 1320.251452] [<c057745a>] io_schedule+0x3a/0x60
> Jul 29 20:20:42 ini kernel: [ 1320.251463] [<c01bd95d>] sync_page+0x3d/0x50
> Jul 29 20:20:42 ini kernel: [ 1320.251470] [<c0577aa7>] __wait_on_bit_lock+0x47/0x90
> Jul 29 20:20:42 ini kernel: [ 1320.251476] [<c01bd920>] ? sync_page+0x0/0x50
> Jul 29 20:20:42 ini kernel: [ 1320.251483] [<c01bd8ee>] __lock_page+0x7e/0x90
> Jul 29 20:20:42 ini kernel: [ 1320.251491] [<c01624d0>] ? wake_bit_function+0x0/0x50
> Jul 29 20:20:42 ini kernel: [ 1320.251499] [<c01c7219>] truncate_inode_pages_range+0x2a9/0x2c0
> Jul 29 20:20:42 ini kernel: [ 1320.251510] [<c02916c6>] ? ext4_destroy_inode+0x26/0xa0
> Jul 29 20:20:42 ini kernel: [ 1320.251517] [<c01c724f>] truncate_inode_pages+0x1f/0x30
> Jul 29 20:20:42 ini kernel: [ 1320.251523] [<c020f15c>] dispose_list+0xcc/0x100
> Jul 29 20:20:42 ini kernel: [ 1320.251529] [<c020f534>] invalidate_inodes+0xf4/0x120
> Jul 29 20:20:42 ini kernel: [ 1320.251538] [<c023b310>] ? vfs_quota_off+0x0/0x20
> Jul 29 20:20:42 ini kernel: [ 1320.251546] [<c01fc602>] generic_shutdown_super+0x42/0xe0
> Jul 29 20:20:42 ini kernel: [ 1320.251553] [<c01fc6ca>] kill_block_super+0x2a/0x50
> Jul 29 20:20:42 ini kernel: [ 1320.251559] [<c01fd4e4>] deactivate_super+0x64/0x90
> Jul 29 20:20:42 ini kernel: [ 1320.251566] [<c021282f>] mntput_no_expire+0x8f/0xe0
> Jul 29 20:20:42 ini kernel: [ 1320.251573] [<c0212e47>] sys_umount+0x47/0xa0
> Jul 29 20:20:42 ini kernel: [ 1320.251579] [<c0212ebe>] sys_oldumount+0x1e/0x20
> Jul 29 20:20:42 ini kernel: [ 1320.251586] [<c01033ec>] syscall_call+0x7/0xb
> Jul 29 20:22:42 ini kernel: [ 1440.285910] umount D 00478e55 0 1234 924 0x00000004
> Jul 29 20:22:42 ini kernel: [ 1440.285919] ce579e10 00000086 00000082 00478e55 00000000 c082b330 cef574dc c082b330
> Jul 29 20:22:42 ini kernel: [ 1440.285931] c082b330 c082b330 cef574dc 0355459b 0000010d c082b330 c082b330 ce652000
> Jul 29 20:22:42 ini kernel: [ 1440.285942] 0000010d cef57230 c1407330 cef57230 ce579e5c ce579e20 c057745a ce579e54
> Jul 29 20:22:42 ini kernel: [ 1440.285953] Call Trace:
> Jul 29 20:22:42 ini kernel: [ 1440.285969] [<c057745a>] io_schedule+0x3a/0x60
> Jul 29 20:22:42 ini kernel: [ 1440.285980] [<c01bd95d>] sync_page+0x3d/0x50
> Jul 29 20:22:42 ini kernel: [ 1440.285987] [<c0577aa7>] __wait_on_bit_lock+0x47/0x90
> Jul 29 20:22:42 ini kernel: [ 1440.285994] [<c01bd920>] ? sync_page+0x0/0x50
> Jul 29 20:22:42 ini kernel: [ 1440.286001] [<c01bd8ee>] __lock_page+0x7e/0x90
> Jul 29 20:22:42 ini kernel: [ 1440.286010] [<c01624d0>] ? wake_bit_function+0x0/0x50
> Jul 29 20:22:42 ini kernel: [ 1440.286018] [<c01c7219>] truncate_inode_pages_range+0x2a9/0x2c0
> Jul 29 20:22:42 ini kernel: [ 1440.286028] [<c02916c6>] ? ext4_destroy_inode+0x26/0xa0
> Jul 29 20:22:42 ini kernel: [ 1440.286035] [<c01c724f>] truncate_inode_pages+0x1f/0x30
> Jul 29 20:22:42 ini kernel: [ 1440.286041] [<c020f15c>] dispose_list+0xcc/0x100
> Jul 29 20:22:42 ini kernel: [ 1440.286047] [<c020f534>] invalidate_inodes+0xf4/0x120
> Jul 29 20:22:42 ini kernel: [ 1440.286056] [<c023b310>] ? vfs_quota_off+0x0/0x20
> Jul 29 20:22:42 ini kernel: [ 1440.286064] [<c01fc602>] generic_shutdown_super+0x42/0xe0
> Jul 29 20:22:42 ini kernel: [ 1440.286071] [<c01fc6ca>] kill_block_super+0x2a/0x50
> Jul 29 20:22:42 ini kernel: [ 1440.286077] [<c01fd4e4>] deactivate_super+0x64/0x90
> Jul 29 20:22:42 ini kernel: [ 1440.286084] [<c021282f>] mntput_no_expire+0x8f/0xe0
> Jul 29 20:22:42 ini kernel: [ 1440.286091] [<c0212e47>] sys_umount+0x47/0xa0
> Jul 29 20:22:42 ini kernel: [ 1440.286097] [<c0212ebe>] sys_oldumount+0x1e/0x20
> Jul 29 20:22:42 ini kernel: [ 1440.286104] [<c01033ec>] syscall_call+0x7/0xb
> Jul 29 20:24:42 ini kernel: [ 1560.321709] umount D 00478e55 0 1234 924 0x00000004
> Jul 29 20:24:42 ini kernel: [ 1560.321718] ce579e10 00000086 00000082 00478e55 00000000 c082b330 cef574dc c082b330
> Jul 29 20:24:42 ini kernel: [ 1560.321730] c082b330 c082b330 cef574dc 0355459b 0000010d c082b330 c082b330 ce652000
> Jul 29 20:24:42 ini kernel: [ 1560.321741] 0000010d cef57230 c1407330 cef57230 ce579e5c ce579e20 c057745a ce579e54
> Jul 29 20:24:42 ini kernel: [ 1560.321751] Call Trace:
> Jul 29 20:24:42 ini kernel: [ 1560.321767] [<c057745a>] io_schedule+0x3a/0x60
> Jul 29 20:24:42 ini kernel: [ 1560.321777] [<c01bd95d>] sync_page+0x3d/0x50
> Jul 29 20:24:42 ini kernel: [ 1560.321784] [<c0577aa7>] __wait_on_bit_lock+0x47/0x90
> Jul 29 20:24:42 ini kernel: [ 1560.321791] [<c01bd920>] ? sync_page+0x0/0x50
> Jul 29 20:24:42 ini kernel: [ 1560.321797] [<c01bd8ee>] __lock_page+0x7e/0x90
> Jul 29 20:24:42 ini kernel: [ 1560.321805] [<c01624d0>] ? wake_bit_function+0x0/0x50
> Jul 29 20:24:42 ini kernel: [ 1560.321814] [<c01c7219>] truncate_inode_pages_range+0x2a9/0x2c0
> Jul 29 20:24:42 ini kernel: [ 1560.321824] [<c02916c6>] ? ext4_destroy_inode+0x26/0xa0
> Jul 29 20:24:42 ini kernel: [ 1560.321831] [<c01c724f>] truncate_inode_pages+0x1f/0x30
> Jul 29 20:24:42 ini kernel: [ 1560.321837] [<c020f15c>] dispose_list+0xcc/0x100
> Jul 29 20:24:42 ini kernel: [ 1560.321845] [<c020f534>] invalidate_inodes+0xf4/0x120
> Jul 29 20:24:42 ini kernel: [ 1560.321855] [<c023b310>] ? vfs_quota_off+0x0/0x20
> Jul 29 20:24:42 ini kernel: [ 1560.321864] [<c01fc602>] generic_shutdown_super+0x42/0xe0
> Jul 29 20:24:42 ini kernel: [ 1560.321870] [<c01fc6ca>] kill_block_super+0x2a/0x50
> Jul 29 20:24:42 ini kernel: [ 1560.321877] [<c01fd4e4>] deactivate_super+0x64/0x90
> Jul 29 20:24:42 ini kernel: [ 1560.321885] [<c021282f>] mntput_no_expire+0x8f/0xe0
> Jul 29 20:24:42 ini kernel: [ 1560.321892] [<c0212e47>] sys_umount+0x47/0xa0
> Jul 29 20:24:42 ini kernel: [ 1560.321898] [<c0212ebe>] sys_oldumount+0x1e/0x20
> Jul 29 20:24:42 ini kernel: [ 1560.321905] [<c01033ec>] syscall_call+0x7/0xb
> Jul 29 20:24:42 ini kernel: [ 1560.358795] sync D 0004beb0 0 1265 1255 0x00000004
> Jul 29 20:24:42 ini kernel: [ 1560.358803] cea6ff2c 00000086 00000001 0004beb0 00000000 c082b330 cde3db7c c082b330
> Jul 29 20:24:42 ini kernel: [ 1560.358815] c082b330 c082b330 cde3db7c 15c42f3f 00000140 c082b330 c082b330 ce653200
> Jul 29 20:24:42 ini kernel: [ 1560.358826] 00000140 cde3d8d0 cea6ff60 cde3d8d0 cefef23c cea6ff58 c0578de5 fffeffff
> Jul 29 20:24:42 ini kernel: [ 1560.358837] Call Trace:
> Jul 29 20:24:42 ini kernel: [ 1560.358845] [<c0578de5>] rwsem_down_failed_common+0x75/0x1a0
> Jul 29 20:24:42 ini kernel: [ 1560.358852] [<c0578f5d>] rwsem_down_read_failed+0x1d/0x30
> Jul 29 20:24:42 ini kernel: [ 1560.358858] [<c0578fb7>] call_rwsem_down_read_failed+0x7/0x10
> Jul 29 20:24:42 ini kernel: [ 1560.358863] [<c057850c>] ? down_read+0x1c/0x20
> Jul 29 20:24:42 ini kernel: [ 1560.358870] [<c021cb6d>] sync_filesystems+0xbd/0x110
> Jul 29 20:24:42 ini kernel: [ 1560.358876] [<c021cc16>] sys_sync+0x16/0x40
> Jul 29 20:24:42 ini kernel: [ 1560.358881] [<c01033ec>] syscall_call+0x7/0xb
> Jul 29 20:26:42 ini kernel: [ 1680.392190] umount D 00478e55 0 1234 924 0x00000004
> Jul 29 20:26:42 ini kernel: [ 1680.392200] ce579e10 00000086 00000082 00478e55 00000000 c082b330 cef574dc c082b330
> Jul 29 20:26:42 ini kernel: [ 1680.392212] c082b330 c082b330 cef574dc 0355459b 0000010d c082b330 c082b330 ce652000
> Jul 29 20:26:42 ini kernel: [ 1680.392223] 0000010d cef57230 c1407330 cef57230 ce579e5c ce579e20 c057745a ce579e54
> Jul 29 20:26:42 ini kernel: [ 1680.392233] Call Trace:
> Jul 29 20:26:42 ini kernel: [ 1680.392250] [<c057745a>] io_schedule+0x3a/0x60
> Jul 29 20:26:42 ini kernel: [ 1680.392260] [<c01bd95d>] sync_page+0x3d/0x50
> Jul 29 20:26:42 ini kernel: [ 1680.392267] [<c0577aa7>] __wait_on_bit_lock+0x47/0x90
> Jul 29 20:26:42 ini kernel: [ 1680.392274] [<c01bd920>] ? sync_page+0x0/0x50
> Jul 29 20:26:42 ini kernel: [ 1680.392280] [<c01bd8ee>] __lock_page+0x7e/0x90
> Jul 29 20:26:42 ini kernel: [ 1680.392289] [<c01624d0>] ? wake_bit_function+0x0/0x50
> Jul 29 20:26:42 ini kernel: [ 1680.392298] [<c01c7219>] truncate_inode_pages_range+0x2a9/0x2c0
> Jul 29 20:26:42 ini kernel: [ 1680.392308] [<c02916c6>] ? ext4_destroy_inode+0x26/0xa0
> Jul 29 20:26:42 ini kernel: [ 1680.392314] [<c01c724f>] truncate_inode_pages+0x1f/0x30
> Jul 29 20:26:42 ini kernel: [ 1680.392321] [<c020f15c>] dispose_list+0xcc/0x100
> Jul 29 20:26:42 ini kernel: [ 1680.392327] [<c020f534>] invalidate_inodes+0xf4/0x120
> Jul 29 20:26:42 ini kernel: [ 1680.392336] [<c023b310>] ? vfs_quota_off+0x0/0x20
> Jul 29 20:26:42 ini kernel: [ 1680.392344] [<c01fc602>] generic_shutdown_super+0x42/0xe0
> Jul 29 20:26:42 ini kernel: [ 1680.392351] [<c01fc6ca>] kill_block_super+0x2a/0x50
> Jul 29 20:26:42 ini kernel: [ 1680.392357] [<c01fd4e4>] deactivate_super+0x64/0x90
> Jul 29 20:26:42 ini kernel: [ 1680.392364] [<c021282f>] mntput_no_expire+0x8f/0xe0
> Jul 29 20:26:42 ini kernel: [ 1680.392371] [<c0212e47>] sys_umount+0x47/0xa0
> Jul 29 20:26:42 ini kernel: [ 1680.392378] [<c0212ebe>] sys_oldumount+0x1e/0x20
> Jul 29 20:26:42 ini kernel: [ 1680.392384] [<c01033ec>] syscall_call+0x7/0xb
> Jul 29 20:26:42 ini kernel: [ 1680.427874] sync D 0004beb0 0 1265 1255 0x00000004
> Jul 29 20:26:42 ini kernel: [ 1680.427883] cea6ff2c 00000086 00000001 0004beb0 00000000 c082b330 cde3db7c c082b330
> Jul 29 20:26:42 ini kernel: [ 1680.427894] c082b330 c082b330 cde3db7c 15c42f3f 00000140 c082b330 c082b330 ce653200
> Jul 29 20:26:42 ini kernel: [ 1680.427904] 00000140 cde3d8d0 cea6ff60 cde3d8d0 cefef23c cea6ff58 c0578de5 fffeffff
> Jul 29 20:26:42 ini kernel: [ 1680.427915] Call Trace:
> Jul 29 20:26:42 ini kernel: [ 1680.427922] [<c0578de5>] rwsem_down_failed_common+0x75/0x1a0
> Jul 29 20:26:42 ini kernel: [ 1680.427929] [<c0578f5d>] rwsem_down_read_failed+0x1d/0x30
> Jul 29 20:26:42 ini kernel: [ 1680.427935] [<c0578fb7>] call_rwsem_down_read_failed+0x7/0x10
> Jul 29 20:26:42 ini kernel: [ 1680.427940] [<c057850c>] ? down_read+0x1c/0x20
> Jul 29 20:26:42 ini kernel: [ 1680.427947] [<c021cb6d>] sync_filesystems+0xbd/0x110
> Jul 29 20:26:42 ini kernel: [ 1680.427953] [<c021cc16>] sys_sync+0x16/0x40
> Jul 29 20:26:42 ini kernel: [ 1680.427958] [<c01033ec>] syscall_call+0x7/0xb
> Jul 29 20:28:42 ini kernel: [ 1800.458856] umount D 00478e55 0 1234 924 0x00000004
> Jul 29 20:28:42 ini kernel: [ 1800.458866] ce579e10 00000086 00000082 00478e55 00000000 c082b330 cef574dc c082b330
> Jul 29 20:28:42 ini kernel: [ 1800.458877] c082b330 c082b330 cef574dc 0355459b 0000010d c082b330 c082b330 ce652000
> Jul 29 20:28:42 ini kernel: [ 1800.458888] 0000010d cef57230 c1407330 cef57230 ce579e5c ce579e20 c057745a ce579e54
> Jul 29 20:28:42 ini kernel: [ 1800.458899] Call Trace:
> Jul 29 20:28:42 ini kernel: [ 1800.458915] [<c057745a>] io_schedule+0x3a/0x60
> Jul 29 20:28:42 ini kernel: [ 1800.458925] [<c01bd95d>] sync_page+0x3d/0x50
> Jul 29 20:28:42 ini kernel: [ 1800.458932] [<c0577aa7>] __wait_on_bit_lock+0x47/0x90
> Jul 29 20:28:42 ini kernel: [ 1800.458938] [<c01bd920>] ? sync_page+0x0/0x50
> Jul 29 20:28:42 ini kernel: [ 1800.458945] [<c01bd8ee>] __lock_page+0x7e/0x90
> Jul 29 20:28:42 ini kernel: [ 1800.458953] [<c01624d0>] ? wake_bit_function+0x0/0x50
> Jul 29 20:28:42 ini kernel: [ 1800.458961] [<c01c7219>] truncate_inode_pages_range+0x2a9/0x2c0
> Jul 29 20:28:42 ini kernel: [ 1800.458971] [<c02916c6>] ? ext4_destroy_inode+0x26/0xa0
> Jul 29 20:28:42 ini kernel: [ 1800.458978] [<c01c724f>] truncate_inode_pages+0x1f/0x30
> Jul 29 20:28:42 ini kernel: [ 1800.458984] [<c020f15c>] dispose_list+0xcc/0x100
> Jul 29 20:28:42 ini kernel: [ 1800.458991] [<c020f534>] invalidate_inodes+0xf4/0x120
> Jul 29 20:28:42 ini kernel: [ 1800.458999] [<c023b310>] ? vfs_quota_off+0x0/0x20
> Jul 29 20:28:42 ini kernel: [ 1800.459007] [<c01fc602>] generic_shutdown_super+0x42/0xe0
> Jul 29 20:28:42 ini kernel: [ 1800.459013] [<c01fc6ca>] kill_block_super+0x2a/0x50
> Jul 29 20:28:42 ini kernel: [ 1800.459020] [<c01fd4e4>] deactivate_super+0x64/0x90
> Jul 29 20:28:42 ini kernel: [ 1800.459027] [<c021282f>] mntput_no_expire+0x8f/0xe0
> Jul 29 20:28:42 ini kernel: [ 1800.459033] [<c0212e47>] sys_umount+0x47/0xa0
> Jul 29 20:28:42 ini kernel: [ 1800.459039] [<c0212ebe>] sys_oldumount+0x1e/0x20
> Jul 29 20:28:42 ini kernel: [ 1800.459046] [<c01033ec>] syscall_call+0x7/0xb
> Jul 29 20:28:42 ini kernel: [ 1800.493768] sync D 0004beb0 0 1265 1255 0x00000004
> Jul 29 20:28:42 ini kernel: [ 1800.493777] cea6ff2c 00000086 00000001 0004beb0 00000000 c082b330 cde3db7c c082b330
> Jul 29 20:28:42 ini kernel: [ 1800.493788] c082b330 c082b330 cde3db7c 15c42f3f 00000140 c082b330 c082b330 ce653200
> Jul 29 20:28:42 ini kernel: [ 1800.493798] 00000140 cde3d8d0 cea6ff60 cde3d8d0 cefef23c cea6ff58 c0578de5 fffeffff
> Jul 29 20:28:42 ini kernel: [ 1800.493809] Call Trace:
> Jul 29 20:28:42 ini kernel: [ 1800.493816] [<c0578de5>] rwsem_down_failed_common+0x75/0x1a0
> Jul 29 20:28:42 ini kernel: [ 1800.493823] [<c0578f5d>] rwsem_down_read_failed+0x1d/0x30
> Jul 29 20:28:42 ini kernel: [ 1800.493828] [<c0578fb7>] call_rwsem_down_read_failed+0x7/0x10
> Jul 29 20:28:42 ini kernel: [ 1800.493834] [<c057850c>] ? down_read+0x1c/0x20
> Jul 29 20:28:42 ini kernel: [ 1800.493841] [<c021cb6d>] sync_filesystems+0xbd/0x110
> Jul 29 20:28:42 ini kernel: [ 1800.493847] [<c021cc16>] sys_sync+0x16/0x40
> Jul 29 20:28:42 ini kernel: [ 1800.493853] [<c01033ec>] syscall_call+0x7/0xb
> Jul 29 20:30:42 ini kernel: [ 1920.526729] umount D 00478e55 0 1234 924 0x00000004
> Jul 29 20:30:42 ini kernel: [ 1920.526739] ce579e10 00000086 00000082 00478e55 00000000 c082b330 cef574dc c082b330
> Jul 29 20:30:42 ini kernel: [ 1920.526750] c082b330 c082b330 cef574dc 0355459b 0000010d c082b330 c082b330 ce652000
> Jul 29 20:30:42 ini kernel: [ 1920.526761] 0000010d cef57230 c1407330 cef57230 ce579e5c ce579e20 c057745a ce579e54
> Jul 29 20:30:42 ini kernel: [ 1920.526772] Call Trace:
> Jul 29 20:30:42 ini kernel: [ 1920.526788] [<c057745a>] io_schedule+0x3a/0x60
> Jul 29 20:30:42 ini kernel: [ 1920.526798] [<c01bd95d>] sync_page+0x3d/0x50
> Jul 29 20:30:42 ini kernel: [ 1920.526805] [<c0577aa7>] __wait_on_bit_lock+0x47/0x90
> Jul 29 20:30:42 ini kernel: [ 1920.526813] [<c01bd920>] ? sync_page+0x0/0x50
> Jul 29 20:30:42 ini kernel: [ 1920.526819] [<c01bd8ee>] __lock_page+0x7e/0x90
> Jul 29 20:30:42 ini kernel: [ 1920.526827] [<c01624d0>] ? wake_bit_function+0x0/0x50
> Jul 29 20:30:42 ini kernel: [ 1920.526836] [<c01c7219>] truncate_inode_pages_range+0x2a9/0x2c0
> Jul 29 20:30:42 ini kernel: [ 1920.526845] [<c02916c6>] ? ext4_destroy_inode+0x26/0xa0
> Jul 29 20:30:42 ini kernel: [ 1920.526853] [<c01c724f>] truncate_inode_pages+0x1f/0x30
> Jul 29 20:30:42 ini kernel: [ 1920.526859] [<c020f15c>] dispose_list+0xcc/0x100
> Jul 29 20:30:42 ini kernel: [ 1920.526866] [<c020f534>] invalidate_inodes+0xf4/0x120
> Jul 29 20:30:42 ini kernel: [ 1920.526874] [<c023b310>] ? vfs_quota_off+0x0/0x20
> Jul 29 20:30:42 ini kernel: [ 1920.526882] [<c01fc602>] generic_shutdown_super+0x42/0xe0
> Jul 29 20:30:42 ini kernel: [ 1920.526889] [<c01fc6ca>] kill_block_super+0x2a/0x50
> Jul 29 20:30:42 ini kernel: [ 1920.526895] [<c01fd4e4>] deactivate_super+0x64/0x90
> Jul 29 20:30:42 ini kernel: [ 1920.526902] [<c021282f>] mntput_no_expire+0x8f/0xe0
> Jul 29 20:30:42 ini kernel: [ 1920.526908] [<c0212e47>] sys_umount+0x47/0xa0
> Jul 29 20:30:42 ini kernel: [ 1920.526915] [<c0212ebe>] sys_oldumount+0x1e/0x20
> Jul 29 20:30:42 ini kernel: [ 1920.526922] [<c01033ec>] syscall_call+0x7/0xb
> Jul 29 20:30:42 ini kernel: [ 1920.563739] sync D 0004beb0 0 1265 1255 0x00000004
> Jul 29 20:30:42 ini kernel: [ 1920.563747] cea6ff2c 00000086 00000001 0004beb0 00000000 c082b330 cde3db7c c082b330
> Jul 29 20:30:42 ini kernel: [ 1920.563758] c082b330 c082b330 cde3db7c 15c42f3f 00000140 c082b330 c082b330 ce653200
> Jul 29 20:30:42 ini kernel: [ 1920.563768] 00000140 cde3d8d0 cea6ff60 cde3d8d0 cefef23c cea6ff58 c0578de5 fffeffff
> Jul 29 20:30:42 ini kernel: [ 1920.563779] Call Trace:
> Jul 29 20:30:42 ini kernel: [ 1920.563787] [<c0578de5>] rwsem_down_failed_common+0x75/0x1a0
> Jul 29 20:30:42 ini kernel: [ 1920.563793] [<c0578f5d>] rwsem_down_read_failed+0x1d/0x30
> Jul 29 20:30:42 ini kernel: [ 1920.563799] [<c0578fb7>] call_rwsem_down_read_failed+0x7/0x10
> Jul 29 20:30:42 ini kernel: [ 1920.563804] [<c057850c>] ? down_read+0x1c/0x20
> Jul 29 20:30:42 ini kernel: [ 1920.563812] [<c021cb6d>] sync_filesystems+0xbd/0x110
> Jul 29 20:30:42 ini kernel: [ 1920.563817] [<c021cc16>] sys_sync+0x16/0x40
> Jul 29 20:30:42 ini kernel: [ 1920.563823] [<c01033ec>] syscall_call+0x7/0xb
>
> Although in both cases the FS remained consistent:
Yes, at least something positive in the end ;).
> root@ini:~# mount -t ext4 /dev/sdb /mnt
> root@ini:~# umount /mnt
> root@ini:~# e2fsck -f -y /dev/sdb
> e2fsck 1.41.11 (14-Mar-2010)
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
>
> /dev/sdb: ***** FILE SYSTEM WAS MODIFIED *****
> /dev/sdb: 4194/640000 files (74.2% non-contiguous), 334774/1280000 blocks
>
> You can find full kernel logs starting from iSCSI load in the attachments.
>
> I already reported such issues some time ago, but my reports were not too much welcomed, so I gave up. Anyway, anybody can easily do my tests at any time. They don't need any special hardware, just 2 Linux boxes: one for iSCSI target and one for iSCSI initiator (the test box itself). But they are generic for other transports as well. You can see there's nothing iSCSI specific in the traces.
Thanks for running the test.
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Thu 29-07-10 18:12:29, Vladislav Bolkhovitin wrote:
>
> Christoph Hellwig, on 07/29/2010 05:08 PM wrote:
> > On Thu, Jul 29, 2010 at 05:00:10PM +0400, Vladislav Bolkhovitin wrote:
> >> You can find full kernel logs starting from iSCSI load in the attachments.
> >>
> >> I already reported such issues some time ago, but my reports were not too much welcomed, so I gave up. Anyway, anybody can easily do my tests at any time. They don't need any special hardware, just 2 Linux boxes: one for iSCSI target and one for iSCSI initiator (the test box itself). But they are generic for other transports as well. You can see there's nothing iSCSI specific in the traces.
> >
> > I was only talking about ext3.
>
> Yes, now ext3 is a lot more reliable. The only how I was able to confuse it was:
>
> ...
> (2197) nb_write: handle 4272 was not open size=65475 ofs=0
> (2199) nb_write: handle 4272 was not open size=65475 ofs=65534
> (2201) nb_write: handle 4272 was not open size=65475 ofs=131068
> (2203) nb_write: handle 4272 was not open size=65475 ofs=196602
> (2205) nb_write: handle 4272 was not open size=65475 ofs=262136^C
> ^C
> root@ini:/mnt/dbench-mod# ^C
> root@ini:/mnt/dbench-mod# ^C
> root@ini:/mnt/dbench-mod# cd
> root@ini:~# umount /mnt
>
> <- recover device
>
> root@ini:~# mount -t ext3 -o barrier=1 /dev/sdb /mnt
> mount: wrong fs type, bad option, bad superblock on /dev/sdb,
> missing codepage or helper program, or other error
> In some cases useful info is found in syslog - try
> dmesg | tail or so
>
> Kernel log: "Jul 29 22:05:32 ini kernel: [ 2905.423092] JBD: recovery failed"
Hmm, this is strange. Are there more messages around this one?
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
Jan Kara, on 07/29/2010 06:26 PM wrote:
>> root@ini:/mnt/dbench-mod# ^C
>> root@ini:/mnt/dbench-mod# ^C
>> root@ini:~# umount /mnt
>> Segmentation fault
> This isn't OK of course ;)
>
>> Kernel log:
>>
>> Jul 29 19:55:35 ini kernel: [ 3044.722313] c2c28e40: 00023740 00023741 00023742 00023743 @7..A7..B7..C7..
>> Jul 29 19:55:35 ini kernel: [ 3044.722320] c2c28e50: 00023744 00023745 00023746 00023747 D7..E7..F7..G7..
>> Jul 29 19:55:35 ini kernel: [ 3044.722327] c2c28e60: 00023748 00023749 0002374a 0002374b H7..I7..J7..K7..
>> Jul 29 19:55:35 ini kernel: [ 3044.722334] c2c28e70: 0002372c 00000000 00000000 00000000 ,7..............
>> Jul 29 19:55:35 ini kernel: [ 3044.722341] c2c28e80: 00000000 00000000 00000000 00000002 ................
> ...
> Sadly these messages above seem to have overwritten beginning of the
> message below. Hmm, but maybe it's just a warning message about inode still
> being on orphan list because the next oops still shows untainted kernel.
You can find previous messages in the attachments I attached to the report. They are big (500K and 1M), so I compressed and attached them.
>> Jul 29 19:55:35 ini kernel: [ 3044.722546] Pid: 1299, comm: umount Not tainted 2.6.32-22-386 #36-Ubuntu
>> Jul 29 19:55:35 ini kernel: [ 3044.722550] Call Trace:
>> Jul 29 19:55:35 ini kernel: [ 3044.722567] [<c0291731>] ext4_destroy_inode+0x91/0xa0
>> Jul 29 19:55:35 ini kernel: [ 3044.722577] [<c020ecb4>] destroy_inode+0x24/0x40
>> Jul 29 19:55:35 ini kernel: [ 3044.722583] [<c020f11e>] dispose_list+0x8e/0x100
>> Jul 29 19:55:35 ini kernel: [ 3044.722588] [<c020f534>] invalidate_inodes+0xf4/0x120
>> Jul 29 19:55:35 ini kernel: [ 3044.722598] [<c023b310>] ? vfs_quota_off+0x0/0x20
>> Jul 29 19:55:35 ini kernel: [ 3044.722606] [<c01fc602>] generic_shutdown_super+0x42/0xe0
>> Jul 29 19:55:35 ini kernel: [ 3044.722612] [<c01fc6ca>] kill_block_super+0x2a/0x50
>> Jul 29 19:55:35 ini kernel: [ 3044.722618] [<c01fd4e4>] deactivate_super+0x64/0x90
>> Jul 29 19:55:35 ini kernel: [ 3044.722625] [<c021282f>] mntput_no_expire+0x8f/0xe0
>> Jul 29 19:55:35 ini kernel: [ 3044.722631] [<c0212e47>] sys_umount+0x47/0xa0
>> Jul 29 19:55:35 ini kernel: [ 3044.722636] [<c0212ebe>] sys_oldumount+0x1e/0x20
>> Jul 29 19:55:35 ini kernel: [ 3044.722643] [<c01033ec>] syscall_call+0x7/0xb
>> Jul 29 19:55:35 ini kernel: [ 3044.731043] sd 6:0:0:0: [sdb] Unhandled error code
>> Jul 29 19:55:35 ini kernel: [ 3044.731049] sd 6:0:0:0: [sdb] Result: hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK
>> Jul 29 19:55:35 ini kernel: [ 3044.731056] sd 6:0:0:0: [sdb] CDB: Write(10): 2a 00 00 00 00 00 00 00 01 00
>> Jul 29 19:55:35 ini kernel: [ 3044.743469] __ratelimit: 37 callbacks suppressed
>> Jul 29 19:55:35 ini kernel: [ 3044.755695] lost page write due to I/O error on sdb
>> Jul 29 19:55:36 ini kernel: [ 3044.823044] Modules linked in: crc32c ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi w83627hf hwmon_vid fbcon tileblit font bitblit softcursor ppdev adm1021 i2c_i801 vga16fb vgastate e7xxx_edac psmouse serio_raw parport_pc shpchp edac_core lp parport qla2xxx ohci1394 scsi_transport_fc r8169 sata_via ieee1394 mii scsi_tgt e1000 floppy
> So here probably starts the real oops.
It isn't yet an oops, it's dump_stack() from ext4_destroy_inode() together with hex dump:
static void ext4_destroy_inode(struct inode *inode)
{
if (!list_empty(&(EXT4_I(inode)->i_orphan))) {
ext4_msg(inode->i_sb, KERN_ERR,
"Inode %lu (%p): orphan list check failed!",
inode->i_ino, EXT4_I(inode));
print_hex_dump(KERN_INFO, "", DUMP_PREFIX_ADDRESS, 16, 4,
EXT4_I(inode), sizeof(struct ext4_inode_info),
true);
dump_stack();
}
kmem_cache_free(ext4_inode_cachep, EXT4_I(inode));
}
> But sadly we are missing the
> beginning as well.
It was also in the attached file.
> Can you send me disassembly of your ext4_put_super?
In System.map-2.6.32-22-386:
c0293940 t ext4_put_super
c0293c90 t ext4_quota_write
$ objdump -d --start-address=0xc0293940 vmlinux >ext4_put_super
^C
$ cat ext4_put_super
vmlinux: file format elf32-i386
Disassembly of section .text:
c0293940 <.text+0x193940>:
c0293940: 55 push %ebp
c0293941: 89 e5 mov %esp,%ebp
c0293943: 57 push %edi
c0293944: 56 push %esi
c0293945: 53 push %ebx
c0293946: 83 ec 28 sub $0x28,%esp
c0293949: e8 02 07 e7 ff call 0xc0104050
c029394e: 8b 98 84 01 00 00 mov 0x184(%eax),%ebx
c0293954: 89 c6 mov %eax,%esi
c0293956: 8b 83 2c 02 00 00 mov 0x22c(%ebx),%eax
c029395c: 8b 7b 38 mov 0x38(%ebx),%edi
c029395f: e8 ac b5 ec ff call 0xc015ef10
c0293964: 8b 83 2c 02 00 00 mov 0x22c(%ebx),%eax
c029396a: e8 41 af ec ff call 0xc015e8b0
c029396f: 89 f0 mov %esi,%eax
c0293971: e8 5a 86 f6 ff call 0xc01fbfd0
c0293976: e8 05 5a 2e 00 call 0xc0579380
c029397b: 80 7e 11 00 cmpb $0x0,0x11(%esi)
c029397f: 0f 85 0b 02 00 00 jne 0xc0293b90
c0293985: 8b 83 34 01 00 00 mov 0x134(%ebx),%eax
c029398b: 85 c0 test %eax,%eax
c029398d: 74 17 je 0xc02939a6
c029398f: e8 ec 64 02 00 call 0xc02b9e80
c0293994: c7 83 34 01 00 00 00 movl $0x0,0x134(%ebx)
c029399b: 00 00 00
c029399e: 85 c0 test %eax,%eax
c02939a0: 0f 88 ff 01 00 00 js 0xc0293ba5
c02939a6: 89 f0 mov %esi,%eax
c02939a8: e8 c3 22 01 00 call 0xc02a5c70
c02939ad: 89 f0 mov %esi,%eax
c02939af: e8 0c dc 00 00 call 0xc02a15c0
c02939b4: 89 f0 mov %esi,%eax
c02939b6: e8 c5 51 00 00 call 0xc0298b80
c02939bb: 89 f0 mov %esi,%eax
c02939bd: e8 de 45 01 00 call 0xc02a7fa0
c02939c2: f6 46 30 01 testb $0x1,0x30(%esi)
c02939c6: 0f 84 9c 01 00 00 je 0xc0293b68
c02939cc: 8b 93 f8 00 00 00 mov 0xf8(%ebx),%edx
c02939d2: 85 d2 test %edx,%edx
c02939d4: 74 11 je 0xc02939e7
c02939d6: 8b 15 c8 8a 8a c0 mov 0xc08a8ac8,%edx
c02939dc: 8d 86 64 01 00 00 lea 0x164(%esi),%eax
c02939e2: e8 59 fd fa ff call 0xc0243740
c02939e7: 8d bb fc 00 00 00 lea 0xfc(%ebx),%edi
c02939ed: 89 f8 mov %edi,%eax
c02939ef: e8 7c 5f 0a 00 call 0xc0339970
c02939f4: 8b 43 14 mov 0x14(%ebx),%eax
c02939f7: 85 c0 test %eax,%eax
c02939f9: 0f 84 c3 01 00 00 je 0xc0293bc2
c02939ff: 31 d2 xor %edx,%edx
c0293a01: 8b 4b 3c mov 0x3c(%ebx),%ecx
c0293a04: 31 c0 xor %eax,%eax
c0293a06: 89 75 f0 mov %esi,-0x10(%ebp)
c0293a09: 89 de mov %ebx,%esi
c0293a0b: 89 d3 mov %edx,%ebx
c0293a0d: 8d 76 00 lea 0x0(%esi),%esi
c0293a10: 8b 04 81 mov (%ecx,%eax,4),%eax
c0293a13: 85 c0 test %eax,%eax
c0293a15: 74 08 je 0xc0293a1f
c0293a17: e8 54 ab f8 ff call 0xc021e570
c0293a1c: 8b 4e 3c mov 0x3c(%esi),%ecx
c0293a1f: 83 c3 01 add $0x1,%ebx
c0293a22: 39 5e 14 cmp %ebx,0x14(%esi)
c0293a25: 89 d8 mov %ebx,%eax
c0293a27: 77 e7 ja 0xc0293a10
c0293a29: 89 f3 mov %esi,%ebx
c0293a2b: 8b 75 f0 mov -0x10(%ebp),%esi
c0293a2e: 89 c8 mov %ecx,%eax
c0293a30: e8 fb bb f5 ff call 0xc01ef630
c0293a35: 8b 15 2c 53 8a c0 mov 0xc08a532c,%edx
c0293a3b: 8b 83 28 02 00 00 mov 0x228(%ebx),%eax
c0293a41: 81 c2 00 00 80 00 add $0x800000,%edx
c0293a47: 39 d0 cmp %edx,%eax
c0293a49: 72 20 jb 0xc0293a6b
c0293a4b: 8b 15 c0 17 75 c0 mov 0xc07517c0,%edx
c0293a51: 81 ea 00 20 60 00 sub $0x602000,%edx
c0293a57: 81 e2 00 00 c0 ff and $0xffc00000,%edx
c0293a5d: 81 ea 00 20 00 00 sub $0x2000,%edx
c0293a63: 39 d0 cmp %edx,%eax
c0293a65: 0f 82 ed 00 00 00 jb 0xc0293b58
c0293a6b: e8 c0 bb f5 ff call 0xc01ef630
c0293a70: 8d 83 94 00 00 00 lea 0x94(%ebx),%eax
c0293a76: e8 c5 42 0b 00 call 0xc0347d40
c0293a7b: 8d 83 ac 00 00 00 lea 0xac(%ebx),%eax
c0293a81: e8 ba 42 0b 00 call 0xc0347d40
c0293a86: 8d 83 c4 00 00 00 lea 0xc4(%ebx),%eax
c0293a8c: e8 af 42 0b 00 call 0xc0347d40
c0293a91: 8d 83 dc 00 00 00 lea 0xdc(%ebx),%eax
c0293a97: e8 a4 42 0b 00 call 0xc0347d40
c0293a9c: 8b 43 34 mov 0x34(%ebx),%eax
c0293a9f: 85 c0 test %eax,%eax
c0293aa1: 74 05 je 0xc0293aa8
c0293aa3: e8 c8 aa f8 ff call 0xc021e570
c0293aa8: 8b 83 78 01 00 00 mov 0x178(%ebx),%eax
c0293aae: e8 7d bb f5 ff call 0xc01ef630
c0293ab3: 8b 83 7c 01 00 00 mov 0x17c(%ebx),%eax
c0293ab9: e8 72 bb f5 ff call 0xc01ef630
c0293abe: 8d 93 38 01 00 00 lea 0x138(%ebx),%edx
c0293ac4: 3b 93 38 01 00 00 cmp 0x138(%ebx),%edx
c0293aca: 0f 85 fa 00 00 00 jne 0xc0293bca
c0293ad0: 8b 86 94 00 00 00 mov 0x94(%esi),%eax
c0293ad6: e8 65 b4 f8 ff call 0xc021ef40
c0293adb: 8b 83 74 01 00 00 mov 0x174(%ebx),%eax
c0293ae1: 85 c0 test %eax,%eax
c0293ae3: 74 31 je 0xc0293b16
c0293ae5: 3b 86 94 00 00 00 cmp 0x94(%esi),%eax
c0293aeb: 74 29 je 0xc0293b16
c0293aed: e8 4e 0f f9 ff call 0xc0224a40
c0293af2: 8b 83 74 01 00 00 mov 0x174(%ebx),%eax
c0293af8: e8 43 b4 f8 ff call 0xc021ef40
c0293afd: 8b 83 74 01 00 00 mov 0x174(%ebx),%eax
c0293b03: 85 c0 test %eax,%eax
c0293b05: 74 0f je 0xc0293b16
c0293b07: e8 64 d7 ff ff call 0xc0291270
c0293b0c: c7 83 74 01 00 00 00 movl $0x0,0x174(%ebx)
c0293b13: 00 00 00
c0293b16: c7 86 84 01 00 00 00 movl $0x0,0x184(%esi)
c0293b1d: 00 00 00
c0293b20: e8 2b 58 2e 00 call 0xc0579350
c0293b25: 89 f0 mov %esi,%eax
c0293b27: e8 c4 84 f6 ff call 0xc01fbff0
c0293b2c: 89 f8 mov %edi,%eax
c0293b2e: e8 7d 5d 0a 00 call 0xc03398b0
c0293b33: 8d 83 20 01 00 00 lea 0x120(%ebx),%eax
c0293b39: e8 d2 3b 2e 00 call 0xc0577710
c0293b3e: 8b 83 f4 00 00 00 mov 0xf4(%ebx),%eax
c0293b44: e8 e7 ba f5 ff call 0xc01ef630
c0293b49: 89 d8 mov %ebx,%eax
c0293b4b: e8 e0 ba f5 ff call 0xc01ef630
c0293b50: 83 c4 28 add $0x28,%esp
c0293b53: 5b pop %ebx
c0293b54: 5e pop %esi
c0293b55: 5f pop %edi
c0293b56: 5d pop %ebp
c0293b57: c3 ret
c0293b58: e8 e3 db f4 ff call 0xc01e1740
c0293b5d: 8d 76 00 lea 0x0(%esi),%esi
c0293b60: e9 0b ff ff ff jmp 0xc0293a70
c0293b65: 8d 76 00 lea 0x0(%esi),%esi
c0293b68: 8b 86 84 01 00 00 mov 0x184(%esi),%eax
c0293b6e: ba 01 00 00 00 mov $0x1,%edx
c0293b73: 8b 40 38 mov 0x38(%eax),%eax
c0293b76: 83 60 60 fb andl $0xfffffffb,0x60(%eax)
c0293b7a: 0f b7 43 58 movzwl 0x58(%ebx),%eax
c0293b7e: 66 89 47 3a mov %ax,0x3a(%edi)
c0293b82: 89 f0 mov %esi,%eax
c0293b84: e8 67 e5 ff ff call 0xc02920f0
c0293b89: e9 3e fe ff ff jmp 0xc02939cc
c0293b8e: 66 90 xchg %ax,%ax
c0293b90: ba 01 00 00 00 mov $0x1,%edx
c0293b95: 89 f0 mov %esi,%eax
c0293b97: e8 54 e5 ff ff call 0xc02920f0
c0293b9c: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi
c0293ba0: e9 e0 fd ff ff jmp 0xc0293985
c0293ba5: 89 34 24 mov %esi,(%esp)
c0293ba8: c7 44 24 08 bf 9c 6c movl $0xc06c9cbf,0x8(%esp)
c0293baf: c0
c0293bb0: c7 44 24 04 64 3e 59 movl $0xc0593e64,0x4(%esp)
c0293bb7: c0
c0293bb8: e8 d3 f1 ff ff call 0xc0292d90
c0293bbd: e9 e4 fd ff ff jmp 0xc02939a6
c0293bc2: 8b 4b 3c mov 0x3c(%ebx),%ecx
c0293bc5: e9 64 fe ff ff jmp 0xc0293a2e
c0293bca: 8b 43 38 mov 0x38(%ebx),%eax
c0293bcd: 8b 80 e8 00 00 00 mov 0xe8(%eax),%eax
c0293bd3: 89 55 e8 mov %edx,-0x18(%ebp)
c0293bd6: c7 44 24 08 be a9 6c movl $0xc06ca9be,0x8(%esp)
c0293bdd: c0
c0293bde: c7 44 24 04 b9 0c 6a movl $0xc06a0cb9,0x4(%esp)
c0293be5: c0
c0293be6: 89 44 24 0c mov %eax,0xc(%esp)
c0293bea: 89 34 24 mov %esi,(%esp)
c0293bed: e8 be d8 ff ff call 0xc02914b0
c0293bf2: c7 04 24 f6 9c 6c c0 movl $0xc06c9cf6,(%esp)
c0293bf9: e8 4d 2e 2e 00 call 0xc0576a4b
c0293bfe: 8b 83 38 01 00 00 mov 0x138(%ebx),%eax
c0293c04: 8b 55 e8 mov -0x18(%ebp),%edx
c0293c07: 89 45 f0 mov %eax,-0x10(%ebp)
c0293c0a: 89 55 ec mov %edx,-0x14(%ebp)
c0293c0d: 8b 55 f0 mov -0x10(%ebp),%edx
c0293c10: 8b 02 mov (%edx),%eax
c0293c12: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi
c0293c16: 39 55 ec cmp %edx,-0x14(%ebp)
c0293c19: 75 13 jne 0xc0293c2e
c0293c1b: 8b 55 ec mov -0x14(%ebp),%edx
c0293c1e: 3b 93 38 01 00 00 cmp 0x138(%ebx),%edx
c0293c24: 0f 84 a6 fe ff ff je 0xc0293ad0
c0293c2a: 0f 0b ud2a
c0293c2c: eb fe jmp 0xc0293c2c
c0293c2e: 8b 55 f0 mov -0x10(%ebp),%edx
c0293c31: 8b 45 f0 mov -0x10(%ebp),%eax
c0293c34: 83 c2 20 add $0x20,%edx
c0293c37: 8b 4a c0 mov -0x40(%edx),%ecx
c0293c3a: 83 e8 68 sub $0x68,%eax
c0293c3d: 89 4c 24 18 mov %ecx,0x18(%esp)
c0293c41: 8b 88 b0 00 00 00 mov 0xb0(%eax),%ecx
c0293c47: 89 4c 24 14 mov %ecx,0x14(%esp)
c0293c4b: 0f b7 88 fa 00 00 00 movzwl 0xfa(%eax),%ecx
c0293c52: 89 54 24 0c mov %edx,0xc(%esp)
c0293c56: 89 4c 24 10 mov %ecx,0x10(%esp)
c0293c5a: 8b 90 a8 00 00 00 mov 0xa8(%eax),%edx
c0293c60: 89 54 24 08 mov %edx,0x8(%esp)
c0293c64: 8b 80 2c 01 00 00 mov 0x12c(%eax),%eax
c0293c6a: c7 04 24 54 b0 6b c0 movl $0xc06bb054,(%esp)
c0293c71: 05 64 01 00 00 add $0x164,%eax
c0293c76: 89 44 24 04 mov %eax,0x4(%esp)
c0293c7a: e8 cc 2d 2e 00 call 0xc0576a4b
c0293c7f: 8b 55 f0 mov -0x10(%ebp),%edx
c0293c82: 8b 12 mov (%edx),%edx
c0293c84: 89 55 f0 mov %edx,-0x10(%ebp)
c0293c87: eb 84 jmp 0xc0293c0d
c0293c89: 8d b4 26 00 00 00 00 lea 0x0(%esi,%eiz,1),%esi
The rest snipped.
>> Jul 29 19:55:36 ini kernel: [ 3044.823044]
>> Jul 29 19:55:36 ini kernel: [ 3044.823044] Pid: 1299, comm: umount Not tainted (2.6.32-22-386 #36-Ubuntu) X5DPA
>> Jul 29 19:55:36 ini kernel: [ 3044.823044] EIP: 0060:[<c0293c2a>] EFLAGS: 00010206 CPU: 0
>> Jul 29 19:55:36 ini kernel: [ 3044.823044] EIP is at ext4_put_super+0x2ea/0x350
>> Jul 29 19:55:36 ini kernel: [ 3044.823044] EAX: c2c28ea8 EBX: c307f000 ECX: ffffff52 EDX: c307f138
>> Jul 29 19:55:36 ini kernel: [ 3044.823044] ESI: ca228a00 EDI: c307f0fc EBP: cec6ff30 ESP: cec6fefc
>> Jul 29 19:55:36 ini kernel: [ 3044.823044] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
>> Jul 29 19:55:36 ini kernel: [ 3044.823044] c06bb054 ca228b64 0000800b c2c28ec8 00008180 00000001 00000000 c307f138
>> Jul 29 19:55:36 ini kernel: [ 3044.823044]<0> c307f138 c307f138 ca228a00 c0593c80 c023b310 cec6ff48 c01fc60d ca228ac0
>> Jul 29 19:55:36 ini kernel: [ 3044.823044]<0> cec6ff44 cf328400 00000003 cec6ff58 c01fc6ca ca228a00 c0759d80 cec6ff6c
>> Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c023b310>] ? vfs_quota_off+0x0/0x20
>> Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c01fc60d>] ? generic_shutdown_super+0x4d/0xe0
>> Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c01fc6ca>] ? kill_block_super+0x2a/0x50
>> Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c01fd4e4>] ? deactivate_super+0x64/0x90
>> Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c021282f>] ? mntput_no_expire+0x8f/0xe0
>> Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c0212e47>] ? sys_umount+0x47/0xa0
>> Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c0212ebe>] ? sys_oldumount+0x1e/0x20
>> Jul 29 19:55:36 ini kernel: [ 3044.823044] [<c01033ec>] ? syscall_call+0x7/0xb
>> Jul 29 19:55:36 ini kernel: [ 3045.299442] ---[ end trace 426db011a0289db3 ]---
> ...
>> Another test. Everything is as before, only I did not pull the cable, but
>> deleted the corresponding LUN on the target, so all the command starting
>> from this moment failed. Then on umount system rebooted. Kernel log:
>
> Nasty. But the log actually contains only traces of processes in D state
> (generally waiting for a page to be unlocked). Do you have any sort of
> watchdog which might have rebooted the machine?
I didn't configured it ;). This is unmodified Ubuntu server 10.04, only with non-PAE kernel.
The reboot wasn't immediate. I even tried to check something in another ssh.
Again, you can find more logs attached to the original message.
> Thanks for running the test.
Thanks for looking at the results!
Vlad
Jan Kara, on 07/29/2010 06:34 PM wrote:
> On Thu 29-07-10 18:12:29, Vladislav Bolkhovitin wrote:
>>
>> Christoph Hellwig, on 07/29/2010 05:08 PM wrote:
>>> On Thu, Jul 29, 2010 at 05:00:10PM +0400, Vladislav Bolkhovitin wrote:
>>>> You can find full kernel logs starting from iSCSI load in the attachments.
>>>>
>>>> I already reported such issues some time ago, but my reports were not too much welcomed, so I gave up. Anyway, anybody can easily do my tests at any time. They don't need any special hardware, just 2 Linux boxes: one for iSCSI target and one for iSCSI initiator (the test box itself). But they are generic for other transports as well. You can see there's nothing iSCSI specific in the traces.
>>>
>>> I was only talking about ext3.
>>
>> Yes, now ext3 is a lot more reliable. The only how I was able to confuse it was:
>>
>> ...
>> (2197) nb_write: handle 4272 was not open size=65475 ofs=0
>> (2199) nb_write: handle 4272 was not open size=65475 ofs=65534
>> (2201) nb_write: handle 4272 was not open size=65475 ofs=131068
>> (2203) nb_write: handle 4272 was not open size=65475 ofs=196602
>> (2205) nb_write: handle 4272 was not open size=65475 ofs=262136^C
>> ^C
>> root@ini:/mnt/dbench-mod# ^C
>> root@ini:/mnt/dbench-mod# ^C
>> root@ini:/mnt/dbench-mod# cd
>> root@ini:~# umount /mnt
>>
>> <- recover device
>>
>> root@ini:~# mount -t ext3 -o barrier=1 /dev/sdb /mnt
>> mount: wrong fs type, bad option, bad superblock on /dev/sdb,
>> missing codepage or helper program, or other error
>> In some cases useful info is found in syslog - try
>> dmesg | tail or so
>>
>> Kernel log: "Jul 29 22:05:32 ini kernel: [ 2905.423092] JBD: recovery failed"
> Hmm, this is strange. Are there more messages around this one?
Rather none:
Jul 29 22:02:05 ini kernel: [ 2698.488446] sd 7:0:0:0: [sdb] CDB: Read(10): 28 00 00 07 88 69 00 00 01 00
Jul 29 22:02:05 ini kernel: [ 2698.505470] sd 7:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jul 29 22:02:05 ini kernel: [ 2698.505480] sd 7:0:0:0: [sdb] Sense Key : Illegal Request [current]
Jul 29 22:02:05 ini kernel: [ 2698.505488] sd 7:0:0:0: [sdb] Add. Sense: Logical unit not supported
Jul 29 22:02:05 ini kernel: [ 2698.505497] sd 7:0:0:0: [sdb] CDB: Read(10): 28 00 00 07 88 69 00 00 01 00
Jul 29 22:02:05 ini kernel: [ 2698.555147] sd 7:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jul 29 22:02:05 ini kernel: [ 2698.555157] sd 7:0:0:0: [sdb] Sense Key : Illegal Request [current]
Jul 29 22:02:05 ini kernel: [ 2698.555165] sd 7:0:0:0: [sdb] Add. Sense: Logical unit not supported
Jul 29 22:02:05 ini kernel: [ 2698.555175] sd 7:0:0:0: [sdb] CDB: Read(10): 28 00 00 07 88 69 00 00 01 00
Jul 29 22:02:05 ini kernel: [ 2698.582241] sd 7:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jul 29 22:02:05 ini kernel: [ 2698.582251] sd 7:0:0:0: [sdb] Sense Key : Illegal Request [current]
Jul 29 22:02:05 ini kernel: [ 2698.582259] sd 7:0:0:0: [sdb] Add. Sense: Logical unit not supported
Jul 29 22:02:05 ini kernel: [ 2698.582268] sd 7:0:0:0: [sdb] CDB: Read(10): 28 00 00 07 88 69 00 00 01 00
Jul 29 22:02:05 ini kernel: [ 2698.614789] sd 7:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jul 29 22:02:05 ini kernel: [ 2698.614799] sd 7:0:0:0: [sdb] Sense Key : Illegal Request [current]
Jul 29 22:02:05 ini kernel: [ 2698.614807] sd 7:0:0:0: [sdb] Add. Sense: Logical unit not supported
Jul 29 22:02:05 ini kernel: [ 2698.614817] sd 7:0:0:0: [sdb] CDB: Read(10): 28 00 00 07 88 69 00 00 01 00
Jul 29 22:02:45 ini kernel: [ 2738.474386] __journal_remove_journal_head: freeing b_committed_data
Jul 29 22:02:45 ini kernel: [ 2738.474529] __journal_remove_journal_head: freeing b_committed_data
Jul 29 22:02:45 ini kernel: [ 2738.474536] __journal_remove_journal_head: freeing b_committed_data
Jul 29 22:02:45 ini kernel: [ 2738.474570] __journal_remove_journal_head: freeing b_committed_data
Jul 29 22:02:45 ini kernel: [ 2738.474583] __journal_remove_journal_head: freeing b_committed_data
Jul 29 22:02:45 ini kernel: [ 2738.474603] __journal_remove_journal_head: freeing b_committed_data
Jul 29 22:02:45 ini kernel: [ 2738.474615] __journal_remove_journal_head: freeing b_committed_data
Jul 29 22:02:45 ini kernel: [ 2738.474621] __journal_remove_journal_head: freeing b_committed_data
Jul 29 22:02:45 ini kernel: [ 2738.474633] __journal_remove_journal_head: freeing b_committed_data
Jul 29 22:02:45 ini kernel: [ 2738.474659] __journal_remove_journal_head: freeing b_committed_data
Jul 29 22:05:32 ini kernel: [ 2905.423092] JBD: recovery failed
Jul 29 22:05:54 ini kernel: [ 2927.832893] kjournald starting. Commit interval 5 seconds
Jul 29 22:05:54 ini kernel: [ 2927.833430] EXT3 FS on sdb, internal journal
Jul 29 22:05:54 ini kernel: [ 2927.833499] EXT3-fs: sdb: 1 orphan inode deleted
Jul 29 22:05:54 ini kernel: [ 2927.833503] EXT3-fs: recovery complete.
Jul 29 22:05:54 ini kernel: [ 2927.838122] EXT3-fs: mounted filesystem with ordered data mode.
Jan Kara, on 07/29/2010 06:34 PM wrote:
> On Thu 29-07-10 18:12:29, Vladislav Bolkhovitin wrote:
>>
>> Christoph Hellwig, on 07/29/2010 05:08 PM wrote:
>>> On Thu, Jul 29, 2010 at 05:00:10PM +0400, Vladislav Bolkhovitin wrote:
>>>> You can find full kernel logs starting from iSCSI load in the attachments.
>>>>
>>>> I already reported such issues some time ago, but my reports were not too much welcomed, so I gave up. Anyway, anybody can easily do my tests at any time. They don't need any special hardware, just 2 Linux boxes: one for iSCSI target and one for iSCSI initiator (the test box itself). But they are generic for other transports as well. You can see there's nothing iSCSI specific in the traces.
>>>
>>> I was only talking about ext3.
>>
>> Yes, now ext3 is a lot more reliable. The only how I was able to confuse it was:
>>
>> ...
>> (2197) nb_write: handle 4272 was not open size=65475 ofs=0
>> (2199) nb_write: handle 4272 was not open size=65475 ofs=65534
>> (2201) nb_write: handle 4272 was not open size=65475 ofs=131068
>> (2203) nb_write: handle 4272 was not open size=65475 ofs=196602
>> (2205) nb_write: handle 4272 was not open size=65475 ofs=262136^C
>> ^C
>> root@ini:/mnt/dbench-mod# ^C
>> root@ini:/mnt/dbench-mod# ^C
>> root@ini:/mnt/dbench-mod# cd
>> root@ini:~# umount /mnt
>>
>> <- recover device
>>
>> root@ini:~# mount -t ext3 -o barrier=1 /dev/sdb /mnt
>> mount: wrong fs type, bad option, bad superblock on /dev/sdb,
>> missing codepage or helper program, or other error
>> In some cases useful info is found in syslog - try
>> dmesg | tail or so
>>
>> Kernel log: "Jul 29 22:05:32 ini kernel: [ 2905.423092] JBD: recovery failed"
> Hmm, this is strange. Are there more messages around this one?
I'd encourage you to reproduce similar setup and perform various failure
injection testings. I promise you, you'll find a lot of strange and
interesting ;). Software devices give unique opportunities for that.
Vlad
On Thu, Jul 29, 2010 at 05:00:10PM +0400, Vladislav Bolkhovitin wrote:
> Christoph Hellwig, on 07/29/2010 12:31 PM wrote:
> > My reading of the ext3/jbd code we explicitly wait on I/O completion
> > of dependent writes, and only require those to actually be stable
> > by issueing a flush. If that wasn't the case the default ext3
> > barriers off behaviour would not only be dangerous on devices with
> > volatile write caches, but also on devices that do not have them,
> > which in addition to the reading of the code is not what we've seen
> > in actual power fail testing, where ext3 does well as long as there
> > is no volatile write cache.
>
> Basically, it is so, but, unfortunately, not absolutely. I've just tried 2 tests on ext4 with iSCSI:
Well, this thread was talking about something else (which is how
various file systems handle barriers), and not bugs about what happen
when a disk disappears from a system due to attachment failure --- but
that's fine, we can deal with that here.
> Segmentation fault
OK, I've looked at your kernel messages, and it looks like the problem
comes from this:
/* Debugging code just in case the in-memory inode orphan list
* isn't empty. The on-disk one can be non-empty if we've
* detected an error and taken the fs readonly, but the
* in-memory list had better be clean by this point. */
if (!list_empty(&sbi->s_orphan))
dump_orphan_list(sb, sbi);
J_ASSERT(list_empty(&sbi->s_orphan)); <====
This is a "should never happen situation", and we crash so we can
figure out how we got there. For production kernels, arguably it
would probably be better to print a message and a WARN_ON(1), and then
not force a crash from a BUG_ON (which is what J_ASSERT is defined to
use).
Looking at your messages and the ext4_delete_inode() warning, I think
I know what caused it. Can you try this patch (attached below) and
see if it fixes things for you?
> I already reported such issues some time ago, but my reports were
> not too much welcomed, so I gave up. Anyway, anybody can easily do
> my tests at any time.
My apologies. I've gone through the linux-ext4 mailing list logs, and
I can't find any mention of this problem from any username @vlnb.net.
I'm not sure where you reported it, and I'm sorry we dropped your bug
report. All I can say is that we do the best that we can, and our
team is relatively small and short-handed.
- Ted
>From a190d0386e601d58db6d2a6cbf00dc1c17d02136 Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <[email protected]>
Date: Thu, 29 Jul 2010 14:54:48 -0400
Subject: [PATCH] patch explicitly-drop-inode-from-orphan-list-on-ext4_delete_inode-failure
---
fs/ext4/inode.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index a52d5af..533b607 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -221,6 +221,7 @@ void ext4_delete_inode(struct inode *inode)
"couldn't extend journal (err %d)", err);
stop_handle:
ext4_journal_stop(handle);
+ ext4_orphan_del(NULL, inode);
goto no_delete;
}
}
--
1.7.0.4