Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754943AbcJPAmz (ORCPT );
        Sat, 15 Oct 2016 20:42:55 -0400
Received: from arcturus.aphlor.org ([188.246.204.175]:57356 "EHLO
        arcturus.aphlor.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751694AbcJPAmu (ORCPT );
        Sat, 15 Oct 2016 20:42:50 -0400
Date: Sat, 15 Oct 2016 20:42:40 -0400
From: Dave Jones
To: Chris Mason
Cc: Al Viro, Josef Bacik, David Sterba, linux-btrfs@vger.kernel.org,
        Linux Kernel
Subject: Re: btrfs bio linked list corruption.
Message-ID: <20161016004240.hpstul32lb2f3u4g@codemonkey.org.uk>
Mail-Followup-To: Dave Jones, Chris Mason, Al Viro, Josef Bacik,
        David Sterba, linux-btrfs@vger.kernel.org, Linux Kernel
References: <20161011144507.okg6baqvodn2m2lh@codemonkey.org.uk>
        <20161012134717.n74tww5eywc7dqp7@codemonkey.org.uk>
        <20161012144012.7vvfehceoykswmun@codemonkey.org.uk>
        <20161013181622.qpi5puv6ivxvslnf@codemonkey.org.uk>
        <7b476728-75f8-e3d3-1261-b9b0d598ed10@fb.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <7b476728-75f8-e3d3-1261-b9b0d598ed10@fb.com>
User-Agent: NeoMutt/20161014 (1.7.1)
X-Spam-Flag: skipped (authorised relay user)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3102
Lines: 79

On Thu, Oct 13, 2016 at 05:18:46PM -0400, Chris Mason wrote:

> > > > .. and of course the first thing that happens is a completely different
> > > > btrfs trace..
> > > >
> > > >
> > > > WARNING: CPU: 1 PID: 21706 at fs/btrfs/transaction.c:489 start_transaction+0x40a/0x440 [btrfs]
> > > > CPU: 1 PID: 21706 Comm: trinity-c16 Not tainted 4.8.0-think+ #14
> > > > ffffc900019076a8 ffffffffb731ff3c 0000000000000000 0000000000000000
> > > > ffffc900019076e8 ffffffffb707a6c1 000001e9f5806ce0 ffff8804f74c4d98
> > > > 0000000000000801 ffff880501cfa2a8 000000000000008a 000000000000008a
> > >
> > > This isn't even IO. Uuughhhh. We're going to need a fast enough test
> > > that we can bisect.
> >
> > Progress...
> > I've found that this combination of syscalls..
> >
> > ./trinity -C64 -q -l off -a64 --enable-fds=testfile -c fsync -c fsetxattr -c lremovexattr -c pwritev2
> >
> > hits one of these two bugs in a few minutes runtime.
> >
> > Just the xattr syscalls + fsync isn't enough, neither is just pwrite + fsync.
> > Mix them together though, and something goes awry.
> >
> Hasn't triggered here yet. I'll leave it running though.

The hits keep coming..

BUG: Bad page state in process kworker/u8:12 pfn:4988fa
page:ffffea0012623e80 count:0 mapcount:0 mapping:ffff8804450456e0 index:0x9
flags: 0x400000000000000c(referenced|uptodate)
page dumped because: non-NULL mapping
CPU: 2 PID: 1388 Comm: kworker/u8:12 Not tainted 4.8.0-think+ #18
Workqueue: writeback wb_workfn (flush-btrfs-1)
 ffffc90000aef7e8 ffffffff81320e7c ffffea0012623e80 ffffffff819fe6ec
 ffffc90000aef810 ffffffff81159b3f 0000000000000000 ffffea0012623e80
 400000000000000c ffffc90000aef820 ffffffff81159bfa ffffc90000aef868
Call Trace:
 [] dump_stack+0x4f/0x73
 [] bad_page+0xbf/0x120
 [] free_pages_check_bad+0x5a/0x70
 [] free_hot_cold_page+0x20b/0x270
 [] free_hot_cold_page_list+0x2b/0x50
 [] release_pages+0x2d2/0x380
 [] __pagevec_release+0x22/0x30
 [] extent_write_cache_pages.isra.48.constprop.63+0x350/0x430 [btrfs]
 [] ? debug_smp_processor_id+0x17/0x20
 [] ? get_lock_stats+0x19/0x50
 [] extent_writepages+0x58/0x80 [btrfs]
 [] ? btrfs_releasepage+0x40/0x40 [btrfs]
 [] btrfs_writepages+0x23/0x30 [btrfs]
 [] do_writepages+0x1c/0x30
 [] __writeback_single_inode+0x33/0x180
 [] writeback_sb_inodes+0x2cb/0x5d0
 [] __writeback_inodes_wb+0x8d/0xc0
 [] wb_writeback+0x203/0x210
 [] wb_workfn+0xe7/0x2a0
 [] ? __lock_acquire.isra.32+0x1cf/0x8c0
 [] process_one_work+0x1da/0x4b0
 [] ? process_one_work+0x17a/0x4b0
 [] worker_thread+0x49/0x490
 [] ? process_one_work+0x4b0/0x4b0
 [] ? process_one_work+0x4b0/0x4b0