Date: Wed, 19 Oct 2016 10:07:40 +1100
From: Dave Chinner
To: linux-kernel@vger.kernel.org
Cc: Jens Axboe, linux-block@vger.kernel.org
Subject: [regression, 4.9-rc1] blk-mq: list corruption in request queue
Message-ID: <20161018230740.GE14023@dastard>

Hi Jens,

One of my test VMs (4p, 4GB RAM) tripped over this last night running
xfs/297 over a pair of 20GB iscsi luns:

[ 8341.363558] ------------[ cut here ]------------
[ 8341.364360] WARNING: CPU: 0 PID: 10929 at lib/list_debug.c:33 __list_add+0x89/0xb0
[ 8341.365439] list_add corruption. prev->next should be next (ffffe8ffffc02808), but was ffffc90005f6bda8. (prev=ffff88013363bb80).
[ 8341.366900] Modules linked in:
[ 8341.367305] CPU: 0 PID: 10929 Comm: fsstress Tainted: G W 4.9.0-rc1-dgc+ #1001
[ 8341.368323] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[ 8341.369431] ffffc90009d1b860 ffffffff81821c60 ffffc90009d1b8b0 0000000000000000
[ 8341.370423] ffffc90009d1b8a0 ffffffff810b69fb 0000002181808107 ffff880133713840
[ 8341.371415] ffff88013363bb80 ffffe8ffffc02808 ffffe8ffffc02800 0000000000000008
[ 8341.372442] Call Trace:
[ 8341.372759] [] dump_stack+0x63/0x83
[ 8341.373411] [] __warn+0xcb/0xf0
[ 8341.374017] [] warn_slowpath_fmt+0x4f/0x60
[ 8341.374741] [] ? part_round_stats+0x4f/0x60
[ 8341.375466] [] __list_add+0x89/0xb0
[ 8341.376125] [] blk_sq_make_request+0x3ec/0x520
[ 8341.376881] [] generic_make_request+0xd0/0x1c0
[ 8341.377637] [] submit_bio+0x58/0x100
[ 8341.378315] [] xfs_submit_ioend+0x82/0xd0
[ 8341.379039] [] ? xfs_start_page_writeback+0x99/0xa0
[ 8341.379845] [] xfs_do_writepage+0x59a/0x730
[ 8341.380601] [] write_cache_pages+0x1f6/0x550
[ 8341.381357] [] ? xfs_aops_discard_page+0x140/0x140
[ 8341.382158] [] xfs_vm_writepages+0xa0/0xd0
[ 8341.382887] [] do_writepages+0x1e/0x30
[ 8341.383603] [] __filemap_fdatawrite_range+0x71/0x90
[ 8341.384423] [] filemap_write_and_wait_range+0x41/0x90
[ 8341.385255] [] xfs_free_file_space+0xb4/0x460
[ 8341.386021] [] ? avc_has_perm+0xad/0x1b0
[ 8341.386715] [] ? __might_sleep+0x4a/0x80
[ 8341.387422] [] xfs_zero_file_space+0x39/0xd0
[ 8341.388164] [] xfs_file_fallocate+0x2fc/0x340
[ 8341.388917] [] ? selinux_file_permission+0xd7/0x110
[ 8341.389738] [] ? __might_sleep+0x4a/0x80
[ 8341.390439] [] vfs_fallocate+0x157/0x220
[ 8341.391156] [] SyS_fallocate+0x48/0x80
[ 8341.391834] [] do_syscall_64+0x67/0x180
[ 8341.392517] [] entry_SYSCALL64_slow_path+0x25/0x25
[ 8341.393343] ---[ end trace 477b0f6e35ebd064 ]---
[ 8341.502708] ------------[ cut here ]------------
[ 8341.503479] WARNING: CPU: 1 PID: 27731 at lib/list_debug.c:29 __list_add+0x62/0xb0
[ 8341.505131] list_add corruption. next->prev should be prev (ffffe8ffffc02808), but was ffff880133795dc0. (next=ffffe8ffffc02808).
[ 8341.506657] Modules linked in:
[ 8341.507092] CPU: 1 PID: 27731 Comm: kworker/1:0H Tainted: G W 4.9.0-rc1-dgc+ #1001
[ 8341.508137] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[ 8341.509228] Workqueue: kblockd blk_mq_requeue_work
[ 8341.509819] ffffc900038efcb8 ffffffff81821c60 ffffc900038efd08 0000000000000000
[ 8341.510760] ffffc900038efcf8 ffffffff810b69fb 0000001d3371cf98 ffff88013363e100
[ 8341.511729] ffffe8ffffc02808 ffffe8ffffc02808 ffffc900038efde0 ffff88013363e100
[ 8341.512708] Call Trace:
[ 8341.513026] [] dump_stack+0x63/0x83
[ 8341.513669] [] __warn+0xcb/0xf0
[ 8341.514283] [] warn_slowpath_fmt+0x4f/0x60
[ 8341.515003] [] ? set_next_entity+0xb6/0x970
[ 8341.515733] [] ? account_entity_dequeue+0x70/0x90
[ 8341.516521] [] __list_add+0x62/0xb0
[ 8341.517162] [] blk_mq_insert_request+0x11e/0x130
[ 8341.517951] [] blk_mq_requeue_work+0xbc/0x130
[ 8341.518701] [] process_one_work+0x180/0x440
[ 8341.519430] [] worker_thread+0x4e/0x490
[ 8341.520119] [] ? process_one_work+0x440/0x440
[ 8341.520865] [] ? process_one_work+0x440/0x440
[ 8341.521612] [] kthread+0xd5/0xf0
[ 8341.522226] [] ? kthread_park+0x60/0x60
[ 8341.522912] [] ret_from_fork+0x25/0x30
[ 8341.523620] ---[ end trace 477b0f6e35ebd065 ]---

I haven't seen it before, hence it's probably a regression. I
haven't tried to reproduce it yet, so I don't know if it's easy to
trip over.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com