Subject: Re: bio linked list corruption.
From: Chris Mason
To: Linus Torvalds, Dave Jones, Andy Lutomirski, Jens Axboe, Al Viro, Josef Bacik, David Sterba, linux-btrfs, Linux Kernel, Dave Chinner
Date: Wed, 26 Oct 2016 16:00:23 -0400
Message-ID: <488f9edc-6a1c-2c68-0d33-d3aa32ece9a4@fb.com>

On 10/26/2016 03:06 PM, Linus Torvalds wrote:
> On Wed, Oct 26, 2016 at 11:42 AM, Dave Jones wrote:
>>
>> The stacks show nearly all of them are stuck in sync_inodes_sb
>
> That's just wb_wait_for_completion(), and it means that some IO isn't
> completing.
>
> There's also a lot of processes waiting for inode_lock(), and a few
> waiting for mnt_want_write()
>
> Ignoring those, we have
>
>> [] btrfs_wait_ordered_roots+0x3f/0x200 [btrfs]
>> [] btrfs_sync_fs+0x31/0xc0 [btrfs]
>> [] sync_filesystem+0x6e/0xa0
>> [] SyS_syncfs+0x3c/0x70
>> [] do_syscall_64+0x5c/0x170
>> [] entry_SYSCALL64_slow_path+0x25/0x25
>> [] 0xffffffffffffffff
>
> Don't know this one. There's a couple of them. Could there be some
> ABBA deadlock on the ordered roots waiting?

It's always possible, but we haven't changed anything here.

I've tried a long list of things to reproduce this on my test boxes,
including days of trinity runs and a kernel module to exercise vmalloc
and thread creation.

Today I turned off every CONFIG_DEBUG_* except for list debugging, and
ran dbench 2048:

[ 2759.118711] WARNING: CPU: 2 PID: 31039 at lib/list_debug.c:33 __list_add+0xbe/0xd0
[ 2759.119652] list_add corruption. prev->next should be next (ffffe8ffffc80308), but was ffffc90000ccfb88. (prev=ffff880128522380).
[ 2759.121039] Modules linked in: crc32c_intel i2c_piix4 aesni_intel aes_x86_64 virtio_net glue_helper i2c_core lrw floppy gf128mul serio_raw pcspkr button ablk_helper cryptd sch_fq_codel autofs4 virtio_blk
[ 2759.124369] CPU: 2 PID: 31039 Comm: dbench Not tainted 4.9.0-rc1-15246-g4ce9206-dirty #317
[ 2759.125077] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.0-1.fc24 04/01/2014
[ 2759.125077]  ffffc9000f6fb868 ffffffff814fe4ff ffffffff8151cb5e ffffc9000f6fb8c8
[ 2759.125077]  ffffc9000f6fb8c8 0000000000000000 ffffc9000f6fb8b8 ffffffff81064bbf
[ 2759.127444]  ffff880128523680 0000002139968000 ffff880138b7a4a0 ffff880128523540
[ 2759.127444] Call Trace:
[ 2759.127444]  [] dump_stack+0x53/0x74
[ 2759.127444]  [] ? __list_add+0xbe/0xd0
[ 2759.127444]  [] __warn+0xff/0x120
[ 2759.127444]  [] warn_slowpath_fmt+0x49/0x50
[ 2759.127444]  [] __list_add+0xbe/0xd0
[ 2759.127444]  [] blk_sq_make_request+0x388/0x580
[ 2759.127444]  [] generic_make_request+0x104/0x200
[ 2759.127444]  [] submit_bio+0x65/0x130
[ 2759.127444]  [] ? __percpu_counter_add+0x96/0xd0
[ 2759.127444]  [] btrfs_map_bio+0x23c/0x310
[ 2759.127444]  [] btrfs_submit_bio_hook+0xd3/0x190
[ 2759.127444]  [] submit_one_bio+0x6d/0xa0
[ 2759.127444]  [] flush_epd_write_bio+0x4e/0x70
[ 2759.127444]  [] extent_writepages+0x5d/0x70
[ 2759.127444]  [] ? btrfs_releasepage+0x50/0x50
[ 2759.127444]  [] ? wbc_attach_and_unlock_inode+0x6e/0x170
[ 2759.127444]  [] btrfs_writepages+0x27/0x30
[ 2759.127444]  [] do_writepages+0x20/0x30
[ 2759.127444]  [] __filemap_fdatawrite_range+0xb5/0x100
[ 2759.127444]  [] filemap_fdatawrite_range+0x13/0x20
[ 2759.127444]  [] btrfs_fdatawrite_range+0x2b/0x70
[ 2759.127444]  [] btrfs_sync_file+0x88/0x490
[ 2759.127444]  [] ? group_send_sig_info+0x42/0x80
[ 2759.127444]  [] ? kill_pid_info+0x5d/0x90
[ 2759.127444]  [] ? SYSC_kill+0xba/0x1d0
[ 2759.127444]  [] ? __sb_end_write+0x58/0x80
[ 2759.127444]  [] vfs_fsync_range+0x4c/0xb0
[ 2759.127444]  [] ? syscall_trace_enter+0x201/0x2e0
[ 2759.127444]  [] vfs_fsync+0x1c/0x20
[ 2759.127444]  [] do_fsync+0x3d/0x70
[ 2759.127444]  [] ? syscall_slow_exit_work+0xfb/0x100
[ 2759.127444]  [] SyS_fsync+0x10/0x20
[ 2759.127444]  [] do_syscall_64+0x55/0xd0
[ 2759.127444]  [] ? prepare_exit_to_usermode+0x37/0x40
[ 2759.127444]  [] entry_SYSCALL64_slow_path+0x25/0x25
[ 2759.150635] ---[ end trace 3b5b7e2ef61c3d02 ]---

I put a variant of your suggested patch in place, but my printk never
triggered.

Now that I've made it happen once, I'll make sure I can do it over and
over again. This doesn't have the patches that Andy asked Davej to try
out yet, but I'll try them once I have a reliable reproducer.

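
For readers unfamiliar with the check behind the list_add warning in the trace, here is a minimal userspace sketch of the idea: before linking a new entry between two neighbours, verify that the neighbours still point at each other. The names `list_node`, `list_init` and `checked_list_add` are hypothetical and chosen for illustration; the kernel's own check lives in lib/list_debug.c and differs in detail. In particular, refusing the insertion on a mismatch is a choice made for this sketch, not a statement about kernel behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Minimal doubly linked list node, modelled loosely on struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* An empty circular list is a head that points at itself. */
static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

/*
 * Insert 'entry' between 'prev' and 'next', but only if the two
 * neighbours still point at each other. Returns false and leaves the
 * list untouched when the links are inconsistent (sketch-only policy).
 */
static bool checked_list_add(struct list_node *entry,
			     struct list_node *prev,
			     struct list_node *next)
{
	assert(entry && prev && next);

	if (next->prev != prev) {
		fprintf(stderr,
			"list_add corruption. next->prev should be prev (%p), but was %p.\n",
			(void *)prev, (void *)next->prev);
		return false;
	}
	if (prev->next != next) {
		fprintf(stderr,
			"list_add corruption. prev->next should be next (%p), but was %p.\n",
			(void *)next, (void *)prev->next);
		return false;
	}

	/* Links are consistent: splice the new entry in. */
	next->prev = entry;
	entry->next = next;
	entry->prev = prev;
	prev->next = entry;
	return true;
}
```

The second branch corresponds to the message in the trace: something overwrote `prev->next` after `prev` was linked, so the pointer read back no longer matches the expected neighbour.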
diff --git a/kernel/fork.c b/kernel/fork.c
index 623259f..de95e19 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -165,7 +165,7 @@ void __weak arch_release_thread_stack(unsigned long *stack)
  * vmalloc() is a bit slow, and calling vfree() enough times will force a TLB
  * flush. Try to minimize the number of calls by caching stacks.
  */
-#define NR_CACHED_STACKS 2
+#define NR_CACHED_STACKS 256
 static DEFINE_PER_CPU(struct vm_struct *, cached_stacks[NR_CACHED_STACKS]);
 #endif
 
@@ -173,7 +173,9 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
 {
 #ifdef CONFIG_VMAP_STACK
 	void *stack;
+	char *p;
 	int i;
+	int j;
 
 	local_irq_disable();
 	for (i = 0; i < NR_CACHED_STACKS; i++) {
@@ -183,7 +185,15 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
 			continue;
 		this_cpu_write(cached_stacks[i], NULL);
 
+		p = s->addr;
+		for (j = 0; j < THREAD_SIZE; j++) {
+			if (p[j] != 'c') {
+				printk_ratelimited(KERN_CRIT "bad poison %c byte %d\n", p[j], j);
+				break;
+			}
+		}
 		tsk->stack_vm_area = s;
+
 		local_irq_enable();
 		return s->addr;
 	}
@@ -219,6 +229,7 @@ static inline void free_thread_stack(struct task_struct *tsk)
 		int i;
 
 		local_irq_save(flags);
+		memset(tsk->stack_vm_area->addr, 'c', THREAD_SIZE);
 		for (i = 0; i < NR_CACHED_STACKS; i++) {
 			if (this_cpu_read(cached_stacks[i]))
 				continue;
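
The debugging technique in the patch above (fill a stack with a known byte when it goes into the cache, then verify every byte when it comes back out) can be tried outside the kernel. The following is a userspace sketch under stated assumptions: `BUF_SIZE`, `NR_CACHED`, `cache_put`, `cache_get` and `first_bad_byte` are hypothetical stand-ins for `THREAD_SIZE`, `NR_CACHED_STACKS` and the kernel functions, `malloc()` replaces the vmalloc-backed stack, and there is no per-CPU state or interrupt masking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE  4096	/* hypothetical stand-in for THREAD_SIZE */
#define NR_CACHED 4	/* hypothetical stand-in for NR_CACHED_STACKS */
#define POISON    'c'	/* same poison byte the patch uses */

static unsigned char *cached[NR_CACHED];

/* Offset of the first byte that is not POISON, or -1 if all are intact. */
static long first_bad_byte(const unsigned char *buf)
{
	for (size_t i = 0; i < BUF_SIZE; i++)
		if (buf[i] != POISON)
			return (long)i;
	return -1;
}

/* Poison the buffer and park it in a free cache slot; free it if full. */
static void cache_put(unsigned char *buf)
{
	assert(buf != NULL);
	memset(buf, POISON, BUF_SIZE);
	for (int i = 0; i < NR_CACHED; i++) {
		if (!cached[i]) {
			cached[i] = buf;
			return;
		}
	}
	free(buf);
}

/*
 * Take a buffer from the cache, or allocate a fresh one if the cache is
 * empty. *bad_offset receives the offset of the first byte that changed
 * while the buffer sat in the cache, or -1 if nothing changed (or the
 * buffer is freshly allocated).
 */
static unsigned char *cache_get(long *bad_offset)
{
	*bad_offset = -1;
	for (int i = 0; i < NR_CACHED; i++) {
		unsigned char *buf = cached[i];

		if (!buf)
			continue;
		cached[i] = NULL;
		*bad_offset = first_bad_byte(buf);
		return buf;
	}
	return malloc(BUF_SIZE);
}
```

A write through a stale pointer between `cache_put()` and the next `cache_get()` shows up as a non-negative `bad_offset`. As in the patch, the check only catches writes that land while the buffer is cached, and a stray write that happens to store the poison byte itself goes unnoticed.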