Date: Wed, 26 Oct 2016 14:42:01 -0400
From: Dave Jones
To: Linus Torvalds
Cc: Chris Mason, Andy Lutomirski, Jens Axboe, Al Viro, Josef Bacik, David Sterba, linux-btrfs, Linux Kernel, Dave Chinner
Subject: Re: bio linked list corruption.
Message-ID: <20161026184201.6ofblkd3j5uxystq@codemonkey.org.uk>
References: <20161021200245.kahjzgqzdfyoe3uz@codemonkey.org.uk> <20161022152033.gkmm3l75kqjzsije@codemonkey.org.uk> <20161024044051.onmh4h6sc2bjxzzc@codemonkey.org.uk> <77d9983d-a00a-1dc1-a9a1-631de1d0c146@fb.com> <20161026002752.qvrm6yxqb54fiqnd@codemonkey.org.uk> <20161026163018.wx57yy554576s6e2@codemonkey.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 26, 2016 at 09:48:39AM -0700, Linus Torvalds wrote:
> I know you already had this in some email, but I lost it. I think you
> narrowed it down to a specific set of system calls that seems to
> trigger this best. fallocate and xattrs or something?

So I was about to give that a shot again. That this run had been going
for 24hrs was bugging me. I ctrl-c'd the current run, and trinity just sat
there, because all of its child processes are stuck in D state.
The stacks show nearly all of them are stuck in sync_inodes_sb.
iotop & vmstat show there is _zero_ io actually happening.
So it's spent most of the night doing next to nothing useful afaict.

Chris ?

Here are the /proc/pid/stack's of those children..

[] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] btrfs_wait_ordered_roots+0x3f/0x200 [btrfs] [] btrfs_sync_fs+0x31/0xc0 [btrfs] [] sync_filesystem+0x6e/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] call_rwsem_down_write_failed+0x17/0x30 [] utimes_common+0xd4/0x190 [] do_utimes+0x107/0x120 [] SyS_futimesat+0xa1/0xd0 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff 
[] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 
[] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] call_rwsem_down_write_failed+0x17/0x30 [] btrfs_fallocate+0xb2/0xfd0 [btrfs] [] vfs_fallocate+0x13e/0x220 [] SyS_fallocate+0x43/0x80 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] btrfs_wait_ordered_roots+0x3f/0x200 [btrfs] [] btrfs_sync_fs+0x31/0xc0 [btrfs] [] sync_fs_one_sb+0x1b/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x50/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] call_rwsem_down_write_failed+0x17/0x30 [] btrfs_file_llseek+0x34/0x290 [btrfs] [] SyS_lseek+0x85/0xa0 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 
0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] call_rwsem_down_write_failed+0x17/0x30 [] btrfs_sync_file+0x7e/0x360 [btrfs] [] vfs_fsync_range+0x46/0xa0 [] btrfs_file_write_iter+0x3f0/0x550 [btrfs] [] do_iter_readv_writev+0xa8/0x100 [] do_readv_writev+0x172/0x210 [] vfs_writev+0x3a/0x50 [] do_pwritev+0xb0/0xd0 [] SyS_pwritev2+0x12/0x20 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] __fdget_pos+0x44/0x50 [] SyS_getdents+0x6c/0x110 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] call_rwsem_down_write_failed+0x17/0x30 [] do_truncate+0x4a/0x90 [] do_sys_ftruncate.constprop.19+0x10c/0x170 [] SyS_ftruncate+0x9/0x10 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] btrfs_wait_ordered_extents+0x1e3/0x2e0 [btrfs] [] btrfs_wait_ordered_roots+0x137/0x200 [btrfs] [] btrfs_sync_fs+0x31/0xc0 [btrfs] [] sync_fs_one_sb+0x1b/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x50/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] wait_on_page_bit+0xaf/0xc0 [] __filemap_fdatawait_range+0x151/0x170 [] filemap_fdatawait_keep_errors+0x1c/0x20 [] sync_inodes_sb+0x273/0x300 [] sync_filesystem+0x57/0xa0 [] 
SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] call_rwsem_down_write_failed+0x17/0x30 [] btrfs_file_write_iter+0x60/0x550 [btrfs] [] do_iter_readv_writev+0xa8/0x100 [] do_readv_writev+0x172/0x210 [] vfs_writev+0x3a/0x50 [] do_pwritev+0xb0/0xd0 [] SyS_pwritev+0xc/0x10 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] __fdget_pos+0x44/0x50 [] SyS_lseek+0x18/0xa0 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] 
sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] call_rwsem_down_write_failed+0x17/0x30 [] btrfs_file_write_iter+0x60/0x550 [btrfs] [] __vfs_write+0xc4/0x120 [] vfs_write+0xb3/0x1a0 [] SyS_pwrite64+0x74/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_filesystem+0x57/0xa0 [] SyS_syncfs+0x3c/0x70 [] 
do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] btrfs_start_ordered_extent+0x5b/0xb0 [btrfs] [] lock_and_cleanup_extent_if_need+0x22d/0x290 [btrfs] [] __btrfs_buffered_write+0x1b8/0x6e0 [btrfs] [] btrfs_file_write_iter+0x170/0x550 [btrfs] [] do_iter_readv_writev+0xa8/0x100 [] do_readv_writev+0x172/0x210 [] vfs_writev+0x3a/0x50 [] do_pwritev+0xb0/0xd0 [] SyS_pwritev+0xc/0x10 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] call_rwsem_down_write_failed+0x17/0x30 [] btrfs_file_write_iter+0x60/0x550 [btrfs] [] __vfs_write+0xc4/0x120 [] __kernel_write+0x4d/0xf0 [] write_pipe_buf+0x6d/0x80 [] __splice_from_pipe+0x12f/0x1b0 [] splice_from_pipe+0x4c/0x70 [] default_file_splice_write+0x14/0x20 [] direct_splice_actor+0x31/0x40 [] splice_direct_to_actor+0xcc/0x1e0 [] do_splice_direct+0x90/0xb0 [] do_sendfile+0x1b0/0x390 [] SyS_sendfile64+0x5f/0xd0 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] sync_inodes_sb+0xc6/0x300 [] sync_inodes_one_sb+0x10/0x20 [] iterate_supers+0xaf/0x100 [] sys_sync+0x30/0x90 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff [] do_sigtimedwait+0x16e/0x260 [] SYSC_rt_sigtimedwait+0xa4/0x100 [] SyS_rt_sigtimedwait+0x9/0x10 [] do_syscall_64+0x5c/0x170 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff
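
For anyone who wants to check the "nearly all of them are stuck in sync_inodes_sb" reading without eyeballing the dump, a rough sketch along these lines should do. It tallies the innermost frame of each stack. The name top_frames and the trick of splitting on the 0xffffffffffffffff sentinel are my own assumptions for illustration, not anything trinity or the kernel provides, and the regex only covers the frame format seen in this dump.

```python
import re
from collections import Counter

# Matches one frame such as "[] sync_inodes_sb+0xc6/0x300" and captures the
# symbol name. The bracketed address may be empty (as in the dump above) or
# hold a hex address.
FRAME_RE = re.compile(r"\[[^\]]*\]\s+(\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+")

# Assumption: every per-process stack ends with this sentinel frame.
SENTINEL = "0xffffffffffffffff"


def top_frames(dump: str) -> Counter:
    """Count how many stacks in the dump have each symbol as innermost frame."""
    counts = Counter()
    for chunk in dump.split(SENTINEL):
        # The first frame in each chunk is where that task is blocked.
        match = FRAME_RE.search(chunk)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the concatenated /proc/pid/stack output and printing top_frames(text).most_common() gives one line per blocking function, which makes the rwsem waiters and the btrfs ordered-extent waiters easier to pick out from the sync crowd.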