Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754427AbcDASTD (ORCPT ); Fri, 1 Apr 2016 14:19:03 -0400
Received: from arcturus.aphlor.org ([188.246.204.175]:60602 "EHLO
	arcturus.aphlor.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751022AbcDASTA (ORCPT ); Fri, 1 Apr 2016 14:19:00 -0400
Date: Fri, 1 Apr 2016 14:18:54 -0400
From: Dave Jones
To: Linux Kernel, Chris Mason, Josef Bacik, David Sterba,
	linux-btrfs@vger.kernel.org
Subject: Re: btrfs_destroy_inode WARN_ON.
Message-ID: <20160401181854.GA32269@codemonkey.org.uk>
Mail-Followup-To: Dave Jones, Linux Kernel, Chris Mason, Josef Bacik,
	David Sterba, linux-btrfs@vger.kernel.org
References: <20160324225411.GA1612@codemonkey.org.uk>
	<20160328011400.GA19000@codemonkey.org.uk>
	<20160401181227.GA31426@codemonkey.org.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160401181227.GA31426@codemonkey.org.uk>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Spam-Score: -2.9 (--)
X-Spam-Report: Spam detection software, running on the system "arcturus.aphlor.org",
	has NOT identified this incoming email as spam. The original
	message has been attached to this so you can view it or label
	similar future email. If you have any questions, see
	the administrator of that system for details.
	Content analysis details: (-2.9 points, 5.0 required)
	 pts rule name              description
	---- ---------------------- --------------------------------------------------
	-1.0 ALL_TRUSTED            Passed through trusted hosts only via SMTP
	-1.9 BAYES_00               BODY: Bayes spam probability is 0 to 1%
	                            [score: 0.0000]
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3404
Lines: 73

On Fri, Apr 01, 2016 at 02:12:27PM -0400, Dave Jones wrote:

 > BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 30s!
 > Showing busy workqueues and worker pools:
 > workqueue events: flags=0x0
 >   pwq 6: cpus=3 node=0 flags=0x0 nice=0 active=1/256
 >     pending: vmstat_shepherd
 >   pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256
 >     pending: check_corruption
 >   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=3/256
 >     pending: usb_serial_port_work, lru_add_drain_per_cpu BAR(17230), e1000_watchdog_task
 > workqueue events_power_efficient: flags=0x82
 >   pwq 8: cpus=0-3 flags=0x4 nice=0 active=3/256
 >     pending: fb_flashcursor, neigh_periodic_work, neigh_periodic_work
 > workqueue events_freezable_power_: flags=0x86
 >   pwq 8: cpus=0-3 flags=0x4 nice=0 active=1/256
 >     pending: disk_events_workfn
 > workqueue netns: flags=0x6000a
 >   pwq 8: cpus=0-3 flags=0x4 nice=0 active=1/1
 >     in-flight: 10038:cleanup_net
 > workqueue writeback: flags=0x4e
 >   pwq 8: cpus=0-3 flags=0x4 nice=0 active=2/256
 >     pending: wb_workfn, wb_workfn
 > workqueue kblockd: flags=0x18
 >   pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=2/256
 >     pending: blk_mq_timeout_work, blk_mq_timeout_work
 > workqueue vmstat: flags=0xc
 >   pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256
 >     pending: vmstat_update
 >   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
 >     pending: vmstat_update
 >   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
 >     pending: vmstat_update
 > pool 8: cpus=0-3 flags=0x4 nice=0 hung=0s workers=11 idle: 11638 10276 609 17937 606 9237 605 891 15998 14100
 > note: trinity-c13[18815] exited with preempt_count 1

This has wedged userspace too:

23082 pts/2    SN+    0:00  |   \_ /bin/bash scripts/test-multi.sh
14140 pts/2    SNL+   0:15  |       \_ ../trinity -q -l off -N 1000000 -a64 -x fsync -x fdatasync
16900 ?        DNs    0:04  |           \_ ../trinity -q -l off -N 1000000 -a64 -x fsync -x fdata
18894 ?        DNs    0:02  |           \_ ../trinity -q -l off -N 1000000 -a64 -x fsync -x fdata

(14:16:02:davej@think:trinity[master])$ stack 16900
[] wait_on_page_bit_killable+0x156/0x1b0
[] __lock_page_or_retry+0x112/0x1b0
[] filemap_fault+0x367/0xb30
[] __do_fault+0x167/0x3d0
[] handle_mm_fault+0x1837/0x2520
[] __do_page_fault+0x248/0x770
[] do_page_fault+0x39/0xa0
[] page_fault+0x1f/0x30
[] mm_release+0x1ec/0x230
[] do_exit+0x5d0/0x18c0
[] do_group_exit+0xac/0x190
[] get_signal+0x48f/0xeb0
[] do_signal+0xa0/0xb50
[] exit_to_usermode_loop+0xd9/0x100
[] do_syscall_64+0x238/0x2b0
[] return_from_SYSCALL_64+0x0/0x7a
[] 0xffffffffffffffff

(14:16:09:davej@think:trinity[master])$ stack 18894
[] btrfs_file_write_iter+0xe8/0x9a0 [btrfs]
[] __vfs_write+0x279/0x2e0
[] vfs_write+0x11e/0x2b0
[] SyS_write+0xd2/0x1a0
[] do_syscall_64+0x103/0x2b0
[] return_from_SYSCALL_64+0x0/0x7a
[] 0xffffffffffffffff

I tried to ftrace the latter process, and the box completely hung.

	Dave