Subject: Re: bio linked list corruption.
From: Chris Mason
Date: Thu, 27 Oct 2016 09:33:00 -0400
To: Jens Axboe, Dave Jones, Linus Torvalds, Andy Lutomirski, Al Viro, Josef Bacik, David Sterba, linux-btrfs, Linux Kernel, Dave Chinner
Message-ID: <6b7b958d-7017-a0f6-efe7-43aedba08a17@fb.com>
References: <488f9edc-6a1c-2c68-0d33-d3aa32ece9a4@fb.com> <20161026224025.mou27kki4bslftli@codemonkey.org.uk> <2bdc068d-afd5-7a78-f334-26970c91aaca@fb.com> <203e0319-bc9b-245c-e162-709267540d22@fb.com> <20161026233808.GC15247@clm-mbp.thefacebook.com> <20161026234751.e66xyzjiwifvbuha@codemonkey.org.uk>

On 10/26/2016 08:00 PM, Jens Axboe wrote:
> On 10/26/2016 05:47 PM, Dave Jones wrote:
>> On Wed, Oct 26, 2016 at 07:38:08PM -0400, Chris Mason wrote:
>>
>>  > >-    hctx->queued++;
>>  > >-    data->hctx = hctx;
>>  > >-    data->ctx = ctx;
>>  > >+    data->hctx = alloc_data.hctx;
>>  > >+    data->ctx = alloc_data.ctx;
>>  > >+    data->hctx->queued++;
>>  > >     return rq;
>>  > > }
>>  >
>>  > This made it through an entire dbench 2048 run on btrfs.  My script has
>>  > it running in a loop, but this is farther than I've gotten before.
>>  > Looking great so far.
>>
>> Fixed the splat during boot for me too.
>> Now the fun part, let's see if it fixed the 'weird shit' that Trinity
>> was stumbling on.
>
> Let's let the testing simmer overnight, then I'll turn this into a real
> patch tomorrow and get it submitted.
>

I ran all night on both btrfs and xfs.  XFS came out clean, but btrfs hit
the WARN_ON below.  I hit it a few times with Jens' patch, always the
same warning.  It's pretty obviously a btrfs bug, we're not cleaning up
this list properly during fsync.

I tried a v1 of a btrfs fix overnight, but I see where it was incomplete
now and will re-run.

For the blk-mq bug, I think we got it!

Tested-by: always-blaming-jens-from-now-on

WARNING: CPU: 5 PID: 16163 at lib/list_debug.c:62 __list_del_entry+0x86/0xd0
list_del corruption. next->prev should be ffff8801196d3be0, but was ffff88010fc63308
Modules linked in: crc32c_intel aesni_intel aes_x86_64 glue_helper i2c_piix4 lrw i2c_core gf128mul ablk_helper virtio_net serio_raw button pcspkr floppy cryptd sch_fq_codel autofs4 virtio_blk
CPU: 5 PID: 16163 Comm: dbench Not tainted 4.9.0-rc2-00041-g811d54d-dirty #322
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.0-1.fc24 04/01/2014
 ffff8801196d3a68 ffffffff814fde3f ffffffff8151c356 ffff8801196d3ac8
 ffff8801196d3ac8 0000000000000000 ffff8801196d3ab8 ffffffff810648cf
 dead000000000100 0000003e813bfc4a ffff8801196d3b98 ffff880122b5c800
Call Trace:
 [] dump_stack+0x53/0x74
 [] ? __list_del_entry+0x86/0xd0
 [] __warn+0xff/0x120
 [] warn_slowpath_fmt+0x49/0x50
 [] __list_del_entry+0x86/0xd0
 [] btrfs_sync_log+0x75d/0xbd0
 [] ? btrfs_log_inode_parent+0x547/0xbb0
 [] ? _raw_spin_lock+0x1b/0x40
 [] ? __might_sleep+0x53/0xa0
 [] ? dput+0x65/0x280
 [] ? btrfs_log_dentry_safe+0x77/0x90
 [] btrfs_sync_file+0x424/0x490
 [] ? SYSC_kill+0xba/0x1d0
 [] ? __sb_end_write+0x58/0x80
 [] vfs_fsync_range+0x4c/0xb0
 [] ? syscall_trace_enter+0x201/0x2e0
 [] vfs_fsync+0x1c/0x20
 [] do_fsync+0x3d/0x70
 [] ? syscall_slow_exit_work+0xfb/0x100
 [] SyS_fsync+0x10/0x20
 [] do_syscall_64+0x55/0xd0
 [] ? prepare_exit_to_usermode+0x37/0x40
 [] entry_SYSCALL64_slow_path+0x25/0x25
---[ end trace c93288442a6424aa ]---
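
For anyone reading the warning above without the list code in their head: the message says the neighbor after the entry being deleted no longer points back at it. Below is a rough user-space sketch of that kind of consistency check. It is not the code in lib/list_debug.c; the helper names list_del_entry_valid() and checked_list_del() are made up for illustration, the struct only imitates the kernel's struct list_head, and the poison-pointer checks the kernel also does are left out.

```c
#include <stdbool.h>
#include <stdio.h>

/* Minimal doubly linked list node, modeled on the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Make head an empty circular list that points at itself. */
static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/*
 * Hypothetical helper: return true only if both neighbors of entry
 * still point back at it. A mismatch means some other path modified
 * the list without unlinking entry properly.
 */
static bool list_del_entry_valid(const struct list_head *entry)
{
	if (entry->prev->next != entry) {
		fprintf(stderr,
			"list_del corruption. prev->next should be %p, but was %p\n",
			(const void *)entry, (const void *)entry->prev->next);
		return false;
	}
	if (entry->next->prev != entry) {
		fprintf(stderr,
			"list_del corruption. next->prev should be %p, but was %p\n",
			(const void *)entry, (const void *)entry->next->prev);
		return false;
	}
	return true;
}

/*
 * Hypothetical helper: unlink entry only if the check passes.
 * On failure the list is left untouched and false is returned.
 */
static bool checked_list_del(struct list_head *entry)
{
	if (!list_del_entry_valid(entry))
		return false;
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
	list_init(entry);
	return true;
}
```

In this sketch, calling checked_list_del() on an entry whose successor's prev pointer has been redirected elsewhere prints the same shape of message as the trace and refuses to unlink, which is roughly the situation the fsync path appears to be running into.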