Subject: Re: bio linked list corruption.
To: Dave Jones, Al Viro, Chris Mason, Josef Bacik, David Sterba, Linux Kernel, Linus Torvalds
References: <20161011144507.okg6baqvodn2m2lh@codemonkey.org.uk> <20161018224205.bjgloslaxcej2td2@codemonkey.org.uk>
From: Jens Axboe
Date: Tue, 18 Oct 2016 17:12:41 -0600
In-Reply-To: <20161018224205.bjgloslaxcej2td2@codemonkey.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On 10/18/2016 04:42 PM, Dave Jones wrote:
> On Tue, Oct 11, 2016 at 10:45:07AM -0400, Dave Jones wrote:
>
> > WARNING: CPU: 1 PID: 3673 at lib/list_debug.c:33 __list_add+0x89/0xb0
> > list_add corruption. prev->next should be next (ffffe8ffff806648), but was ffffc9000067fcd8. (prev=ffff880503878b80).
> > CPU: 1 PID: 3673 Comm: trinity-c0 Not tainted 4.8.0-think+ #13
> > ffffc90000d87458 ffffffff8d32007c ffffc90000d874a8 0000000000000000
> > ffffc90000d87498 ffffffff8d07a6c1 0000002100000246 ffff88050388e880
> > ffff880503878b80 ffffe8ffff806648 ffffe8ffffc06600 ffff880502808008
> > Call Trace:
> > [] dump_stack+0x4f/0x73
> > [] __warn+0xc1/0xe0
> > [] warn_slowpath_fmt+0x5a/0x80
> > [] __list_add+0x89/0xb0
> > [] blk_sq_make_request+0x2f8/0x350
> > [] ? generic_make_request+0xec/0x240
> > [] generic_make_request+0xf9/0x240
> > [] submit_bio+0x78/0x150
> > [] ? __percpu_counter_add+0x85/0xb0
> > [] btrfs_map_bio+0x19e/0x330 [btrfs]
> > [] btree_submit_bio_hook+0xfa/0x110 [btrfs]
> > [] submit_one_bio+0x65/0xa0 [btrfs]
> > [] read_extent_buffer_pages+0x2f0/0x3d0 [btrfs]
> > [] ? free_root_pointers+0x60/0x60 [btrfs]
> > [] btree_read_extent_buffer_pages.constprop.55+0xa8/0x110 [btrfs]
> > [] read_tree_block+0x2d/0x50 [btrfs]
> > [] read_block_for_search.isra.33+0x134/0x330 [btrfs]
> > [] ? _raw_write_unlock+0x2c/0x50
> > [] ? unlock_up+0x16c/0x1a0 [btrfs]
> > [] btrfs_search_slot+0x450/0xa40 [btrfs]
> > [] btrfs_del_csums+0xe3/0x2e0 [btrfs]
> > [] __btrfs_free_extent.isra.82+0x32d/0xc90 [btrfs]
> > [] __btrfs_run_delayed_refs+0x4d3/0x1010 [btrfs]
> > [] ? debug_smp_processor_id+0x17/0x20
> > [] ? get_lock_stats+0x19/0x50
> > [] btrfs_run_delayed_refs+0x9c/0x2d0 [btrfs]
> > [] btrfs_truncate_inode_items+0x888/0xda0 [btrfs]
> > [] btrfs_truncate+0xe5/0x2b0 [btrfs]
> > [] btrfs_setattr+0x249/0x360 [btrfs]
> > [] notify_change+0x252/0x440
> > [] do_truncate+0x6e/0xc0
> > [] do_sys_ftruncate.constprop.19+0x10c/0x170
> > [] ? __this_cpu_preempt_check+0x13/0x20
> > [] SyS_ftruncate+0x9/0x10
> > [] do_syscall_64+0x5c/0x170
> > [] entry_SYSCALL64_slow_path+0x25/0x25
>
> So Chris had me do a run on ext4 just for giggles. It took a while, but
> eventually this fell out...
>
>
> WARNING: CPU: 3 PID: 21324 at lib/list_debug.c:33 __list_add+0x89/0xb0
> list_add corruption. prev->next should be next (ffffe8ffffc05648), but was ffffc9000028bcd8. (prev=ffff880503a145c0).
> CPU: 3 PID: 21324 Comm: modprobe Not tainted 4.9.0-rc1-think+ #1
> ffffc90000a6b7b8 ffffffff81320e3c ffffc90000a6b808 0000000000000000
> ffffc90000a6b7f8 ffffffff8107a711 0000002100000246 ffff8805039f1740
> ffff880503a145c0 ffffe8ffffc05648 ffffe8ffffa05600 ffff880502c39548
> Call Trace:
> [] dump_stack+0x4f/0x73
> [] __warn+0xc1/0xe0
> [] warn_slowpath_fmt+0x5a/0x80
> [] __list_add+0x89/0xb0
> [] blk_sq_make_request+0x2f8/0x350
> [] ? generic_make_request+0xec/0x240
> [] generic_make_request+0xf9/0x240
> [] submit_bio+0x78/0x150
> [] ? __find_get_block+0x126/0x130
> [] submit_bh_wbc+0x16f/0x1e0
> [] ? __end_buffer_read_notouch+0x20/0x20
> [] ll_rw_block+0xa8/0xb0
> [] __breadahead+0x3f/0x70
> [] __ext4_get_inode_loc+0x37c/0x3d0
> [] ext4_iget+0x8d/0xb90
> [] ? d_alloc_parallel+0x329/0x700
> [] ext4_iget_normal+0x2a/0x30
> [] ext4_lookup+0x136/0x250
> [] lookup_slow+0x12d/0x220
> [] walk_component+0x1e7/0x310
> [] ? path_init+0x4d8/0x520
> [] path_lookupat+0x62/0x120
> [] ? getname_flags+0x32/0x180
> [] filename_lookup+0xa8/0x130
> [] ? strncpy_from_user+0x46/0x170
> [] ? getname_flags+0x4e/0x180
> [] user_path_at_empty+0x31/0x40
> [] vfs_fstatat+0x61/0xc0
> [] ? __lock_acquire.isra.32+0x1cf/0x8c0
> [] SYSC_newstat+0x2e/0x60
> [] ? __this_cpu_preempt_check+0x13/0x20
> [] SyS_newstat+0x9/0x10
> [] do_syscall_64+0x5c/0x170
> [] entry_SYSCALL64_slow_path+0x25/0x25
>
> So this one isn't a btrfs specific problem as I first thought.
>
> This sometimes reproduces within minutes, sometimes hours, which makes
> it a pain to bisect. It only started showing up this merge window though.

Chinner reported the same thing on XFS, I'll look into it asap.

--
Jens Axboe
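
For readers unfamiliar with the "list_add corruption" warning quoted above: it comes from a sanity check on a doubly linked list, which verifies that the two neighbours of the insertion point still point at each other before a new entry is linked in. The following is a minimal userspace sketch of that kind of check, not the kernel source. The names list_init and checked_list_add are hypothetical, and refusing the insert on failure is a choice made for this sketch rather than a statement about what the kernel does after warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Minimal stand-in for a circular doubly linked list node (hypothetical). */
struct list_head {
    struct list_head *next, *prev;
};

/* Make 'head' an empty list: both links point back at itself. */
static void list_init(struct list_head *head)
{
    assert(head != NULL);
    head->next = head;
    head->prev = head;
}

/*
 * Link 'entry' between 'prev' and 'next', but only if the two neighbours
 * still agree with each other. On a mismatch, print a message modelled on
 * the warning text in the report and return false without touching the list.
 */
static bool checked_list_add(struct list_head *entry,
                             struct list_head *prev,
                             struct list_head *next)
{
    if (next->prev != prev) {
        fprintf(stderr,
                "list_add corruption. next->prev should be prev (%p), but was %p. (next=%p).\n",
                (void *)prev, (void *)next->prev, (void *)next);
        return false;
    }
    if (prev->next != next) {
        fprintf(stderr,
                "list_add corruption. prev->next should be next (%p), but was %p. (prev=%p).\n",
                (void *)next, (void *)prev->next, (void *)prev);
        return false;
    }

    /* Neighbours are consistent, so splice the new entry in. */
    next->prev = entry;
    entry->next = next;
    entry->prev = prev;
    prev->next = entry;
    return true;
}
```

In both traces the second condition is the one that fired: prev->next held an address other than the expected next, meaning something else had modified (or reused) the previous entry between the time the list was last consistent and the insert.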