From: "Darrick J. Wong" Subject: Re: 4.5-rc1 panic when running generic/300 with dioread_nolock Date: Thu, 11 Feb 2016 09:16:42 -0800 Message-ID: <20160211171642.GC6338@birch.djwong.org> References: <20160211002344.GA15231@localhost.localdomain> <20160211092533.GA21760@quack.suse.cz> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Eric Whitney , linux-ext4@vger.kernel.org, tytso@mit.edu To: Jan Kara Return-path: Received: from userp1040.oracle.com ([156.151.31.81]:43868 "EHLO userp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751633AbcBKRQv (ORCPT ); Thu, 11 Feb 2016 12:16:51 -0500 Content-Disposition: inline In-Reply-To: <20160211092533.GA21760@quack.suse.cz> Sender: linux-ext4-owner@vger.kernel.org List-ID: On Thu, Feb 11, 2016 at 10:25:33AM +0100, Jan Kara wrote: > On Wed 10-02-16 19:23:44, Eric Whitney wrote: > > I've been seeing a panic and other signs of locking trouble when running > > xfstest generic/300 with xfstests-bld's x86-64 test appliance in the > > dioread_nolock test configuration. This began with the 4.5-rc1 kernel and has > > continued through 4.5-rc3. It happens in more than 50% of my test runs and > > is accompanied by an inconsistent lock state warning. It does not occur in > > any other xfstest-bld test configuration. > > > > The panic bisects to "ext4: get rid of EXT4_GET_BLOCKS_NO_LOCK flag" > > (2dcba4781f). Reverting this commit (requires minor merging) in a 4.5-rc1 > > kernel eliminates the panic. > > > > Typical dmesg output follows. > > Uh, it seems like we fail to set buffer_defer_completion() when creating > unwritten extent for direct IO so ext4_end_io_dio() gets called directly > from IRQ instead of from the workqueue. Not sure why that happens because > _ext4_get_block() seems to set buffer_defer_completion whenever we return But... set_buffer_defer_completion() only happens if there's an ioend attached to the inode (inode.c around line 727): if (io_end && io_end->flag & EXT4_IO_END_UNWRITTEN) set_buffer_defer_completion(bh); AFAIK, there isn't be an ioend unless this is an AIO DIO, because the only place we call ext4_inode_aio_set() with a non-NULL ioend is around line 3285 in inode.c, which only executes if (!is_sync_kiocb()), which is only true in the AIO case. So since this isn't an AIO DIO to an unwritten extent, we don't mark the bh for deferred completion, so the dio completion can happen in irq context. I guess we could make the ioend happen for any unwritten dio... ...or revert 2dcba4781f. (No idea if this is what's /supposed/ to happen for ext4...) --D > unwritten extent. I don't have time to debug this right now so it may be > best to try bisecting the breakage if you have time to do that... > > Honza > > > > > Thanks, > > Eric > > > > generic/300 7s ... [16:06:59][ 3.320917] run fstests generic/300 at 2016-02-05 16:06:59 > > [ 4.078053] > > [ 4.078330] ================================= > > [ 4.079036] [ INFO: inconsistent lock state ] > > [ 4.079407] 4.4.0-rc4+ #7 Not tainted > > [ 4.079724] --------------------------------- > > [ 4.080007] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-R} usage. > > [ 4.080007] fio/904 [HC1[1]:SC0[0]:HE0:SE1] takes: > > [ 4.080007] (&journal->j_state_lock){+?++..}, at: [] start_this_handle+0xae/0x450 > > [ 4.080007] {HARDIRQ-ON-W} state was registered at: > > [ 4.080007] [] __lock_acquire+0x896/0x1fb0 > > [ 4.080007] [] lock_acquire+0xb6/0x130 > > [ 4.080007] [] _raw_write_lock+0x31/0x40 > > [ 4.080007] [] ext4_init_journal_params+0x4d/0xc0 > > [ 4.080007] [] ext4_fill_super+0x236e/0x31c0 > > [ 4.080007] [] mount_bdev+0x185/0x1b0 > > [ 4.080007] [] ext4_mount+0x15/0x20 > > [ 4.080007] [] mount_fs+0x38/0x170 > > [ 4.080007] [] vfs_kern_mount+0x6b/0x160 > > [ 4.080007] [] do_mount+0x218/0xe70 > > [ 4.080007] [] SyS_mount+0x8b/0xd0 > > [ 4.080007] [] mount_block_root+0xf6/0x284 > > [ 4.080007] [] mount_root+0x67/0x6a > > [ 4.080007] [] prepare_namespace+0x167/0x19f > > [ 4.080007] [] kernel_init_freeable+0x268/0x278 > > [ 4.080007] [] kernel_init+0xe/0xe0 > > [ 4.080007] [] ret_from_fork+0x3f/0x70 > > [ 4.080007] irq event stamp: 38034 > > [ 4.080007] hardirqs last enabled at (38033): [] context_tracking_exit.part.6+0x37/0x60 > > [ 4.080007] hardirqs last disabled at (38034): [] common_interrupt+0x8c/0x91 > > [ 4.080007] softirqs last enabled at (36084): [] __do_softirq+0x2a9/0x340 > > [ 4.080007] softirqs last disabled at (36063): [] irq_exit+0xbe/0xc0 > > [ 4.080007] > > [ 4.080007] other info that might help us debug this: > > [ 4.080007] Possible unsafe locking scenario: > > [ 4.080007] > > [ 4.080007] CPU0 > > [ 4.080007] ---- > > [ 4.080007] lock(&journal->j_state_lock); > > [ 4.080007] > > [ 4.080007] lock(&journal->j_state_lock); > > [ 4.080007] > > [ 4.080007] *** DEADLOCK *** > > [ 4.080007] > > [ 4.080007] 1 lock held by fio/904: > > [ 4.080007] #0: (&(&vblk->vqs[i].lock)->rlock){-.-...}, at: [] virtblk_done+0x43/0xe0 > > [ 4.080007] > > [ 4.080007] stack backtrace: > > [ 4.080007] CPU: 0 PID: 904 Comm: fio Not tainted 4.4.0-rc4+ #7 > > [ 4.080007] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 > > [ 4.080007] ffffffff826152e0 ffff88007fc039b0 ffffffff8138fb82 ffff88007a18c880 > > [ 4.080007] ffff88007fc03a00 ffffffff811aed8c 0000000000000000 0000000000000000 > > [ 4.080007] 0000000000000001 0000000000000001 ffff88007a18d0d8 ffff88007a18c880 > > [ 4.080007] Call Trace: > > [ 4.080007] [] dump_stack+0x44/0x62 > > [ 4.080007] [] print_usage_bug+0x1f3/0x208 > > [ 4.080007] [] mark_lock+0x649/0x6c0 > > [ 4.080007] [] ? print_shortest_lock_dependencies+0x1d0/0x1d0 > > [ 4.080007] [] __lock_acquire+0xa1e/0x1fb0 > > [ 4.080007] [] ? cpuacct_charge+0x5/0x190 > > [ 4.080007] [] ? __lock_is_held+0x4d/0x70 > > [ 4.080007] [] ? __lock_is_held+0x4d/0x70 > > [ 4.080007] [] lock_acquire+0xb6/0x130 > > [ 4.080007] [] ? start_this_handle+0xae/0x450 > > [ 4.080007] [] _raw_read_lock+0x34/0x50 > > [ 4.080007] [] ? start_this_handle+0xae/0x450 > > [ 4.080007] [] start_this_handle+0xae/0x450 > > [ 4.080007] [] ? new_handle+0x20/0x60 > > [ 4.080007] [] jbd2__journal_start+0xbf/0x190 > > [ 4.080007] [] ? ext4_convert_unwritten_extents+0xb1/0x210 > > [ 4.080007] [] __ext4_journal_start_sb+0x71/0xd0 > > [ 4.080007] [] ext4_convert_unwritten_extents+0xb1/0x210 > > [ 4.080007] [] ext4_put_io_end+0x4e/0x70 > > [ 4.080007] [] ext4_end_io_dio+0x2a/0x30 > > [ 4.080007] [] dio_complete+0xcb/0x1b0 > > [ 4.080007] [] dio_bio_end_aio+0x7f/0x130 > > [ 4.080007] [] bio_endio+0x3f/0x60 > > [ 4.080007] [] blk_update_request+0x97/0x320 > > [ 4.080007] [] blk_mq_end_request+0x1a/0x70 > > [ 4.080007] [] virtblk_request_done+0x3f/0x70 > > [ 4.080007] [] __blk_mq_complete_request+0x7b/0xf0 > > [ 4.080007] [] blk_mq_complete_request+0x1c/0x20 > > [ 4.080007] [] virtblk_done+0x64/0xe0 > > [ 4.080007] [] vring_interrupt+0x31/0x50 > > [ 4.080007] [] handle_irq_event_percpu+0x81/0x1b0 > > [ 4.080007] [] handle_irq_event+0x39/0x60 > > [ 4.080007] [] handle_edge_irq+0x6f/0x150 > > [ 4.080007] [] handle_irq+0x1d/0x30 > > [ 4.080007] [] do_IRQ+0x61/0x120 > > [ 4.080007] [] common_interrupt+0x91/0x91 > > [ 4.080007] [] ? context_tracking_exit.part.6+0x3c/0x60 > > [ 4.080007] [] context_tracking_exit+0x1d/0x20 > > [ 4.080007] [] enter_from_user_mode+0x1f/0x50 > > [ 4.080007] [] syscall_trace_enter_phase1+0xc6/0x130 > > [ 4.080007] [] ? trace_hardirqs_on_thunk+0x17/0x19 > > [ 4.080007] [] tracesys+0xd/0x44 > > [ 4.080007] ------------[ cut here ]------------ > > [ 4.080007] kernel BUG at fs/buffer.c:1281! > > [ 4.080007] invalid opcode: 0000 [#1] SMP > > [ 4.080007] Modules linked in: > > [ 4.080007] CPU: 0 PID: 904 Comm: fio Not tainted 4.4.0-rc4+ #7 > > [ 4.080007] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 > > [ 4.080007] task: ffff88007a18c880 ti: ffff880079144000 task.ti: ffff880079144000 > > [ 4.080007] RIP: 0010:[] [] __find_get_block+0x121/0x130 > > [ 4.080007] RSP: 0018:ffff88007fc03a78 EFLAGS: 00010046 > > [ 4.080007] RAX: 0000000000000086 RBX: 0000000000018980 RCX: 0000000002400048 > > [ 4.080007] RDX: 0000000000001000 RSI: 000000000000dde0 RDI: ffff88007c06ca80 > > [ 4.080007] RBP: ffff88007fc03aa8 R08: 0000000000000001 R09: 0000000000000000 > > [ 4.080007] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007c06ca80 > > [ 4.080007] R13: 000000000000dde0 R14: 0000000000001000 R15: 0000000002400048 > > [ 4.080007] FS: 00007fcfc0cd4740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000 > > [ 4.080007] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ 4.080007] CR2: 000000000258c018 CR3: 000000007a031000 CR4: 00000000000006f0 > > [ 4.080007] Stack: > > [ 4.080007] ffffffff81253223 0000000000018980 ffff880079c86af0 000000000000dde0 > > [ 4.080007] ffff88007a088900 0000000000000001 ffff88007fc03b00 ffffffff812d68d6 > > [ 4.080007] 0000000002408040 000003867fc03b00 ffffffff81841db0 ffffffff00000000 > > [ 4.080007] Call Trace: > > [ 4.080007] > > [ 4.080007] [] ? __getblk_gfp+0x23/0x60 > > [ 4.080007] [] __read_extent_tree_block+0x46/0x200 > > [ 4.080007] [] ext4_find_extent+0x139/0x320 > > [ 4.080007] [] ext4_ext_map_blocks+0x87/0xdd0 > > [ 4.080007] [] ? ext4_map_blocks+0xea/0x4b0 > > [ 4.080007] [] ext4_map_blocks+0x10e/0x4b0 > > [ 4.080007] [] ? ext4_convert_unwritten_extents+0xb1/0x210 > > [ 4.080007] [] ext4_convert_unwritten_extents+0xd4/0x210 > > [ 4.080007] [] ext4_put_io_end+0x4e/0x70 > > [ 4.080007] [] ext4_end_io_dio+0x2a/0x30 > > [ 4.080007] [] dio_complete+0xcb/0x1b0 > > [ 4.080007] [] dio_bio_end_aio+0x7f/0x130 > > [ 4.080007] [] bio_endio+0x3f/0x60 > > [ 4.080007] [] blk_update_request+0x97/0x320 > > [ 4.080007] [] blk_mq_end_request+0x1a/0x70 > > [ 4.080007] [] virtblk_request_done+0x3f/0x70 > > [ 4.080007] [] __blk_mq_complete_request+0x7b/0xf0 > > [ 4.080007] [] blk_mq_complete_request+0x1c/0x20 > > [ 4.080007] [] virtblk_done+0x64/0xe0 > > [ 4.080007] [] vring_interrupt+0x31/0x50 > > [ 4.080007] [] handle_irq_event_percpu+0x81/0x1b0 > > [ 4.080007] [] handle_irq_event+0x39/0x60 > > [ 4.080007] [] handle_edge_irq+0x6f/0x150 > > [ 4.080007] [] handle_irq+0x1d/0x30 > > [ 4.080007] [] do_IRQ+0x61/0x120 > > [ 4.080007] [] common_interrupt+0x91/0x91 > > [ 4.080007] > > [ 4.080007] [] ? context_tracking_exit.part.6+0x3c/0x60 > > [ 4.080007] [] context_tracking_exit+0x1d/0x20 > > [ 4.080007] [] enter_from_user_mode+0x1f/0x50 > > [ 4.080007] [] syscall_trace_enter_phase1+0xc6/0x130 > > [ 4.080007] [] ? trace_hardirqs_on_thunk+0x17/0x19 > > [ 4.080007] [] tracesys+0xd/0x44 > > [ 4.080007] Code: de ff d0 49 8b 04 24 48 85 c0 75 e9 eb b3 4c 89 e6 4c 89 ef e8 41 ed ff ff 48 85 c0 48 89 c3 74 b5 48 89 c7 e8 e1 fc ff ff eb ab <0f> 0b 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 > > [ 4.080007] RIP [] __find_get_block+0x121/0x130 > > [ 4.080007] RSP > > [ 4.080007] ---[ end trace 9bbf559562132712 ]--- > > [ 4.080007] Kernel panic - not syncing: Fatal exception in interrupt > > [ 4.080007] Kernel Offset: disabled > > [ 4.080007] ---[ end Kernel panic - not syncing: Fatal exception in interrupt > > > > > > > > > > > -- > Jan Kara > SUSE Labs, CR > -- > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html