From: Vegard Nossum
Date: Tue, 6 Dec 2016 09:42:05 +0100
Subject: Re: bio linked list corruption.
To: Linus Torvalds
Cc: Ingo Molnar, Peter Zijlstra, Dave Jones, Chris Mason, Jens Axboe,
    Andy Lutomirski, Al Viro, Josef Bacik, David Sterba, linux-btrfs,
    Linux Kernel, Dave Chinner

On 5 December 2016 at 22:33, Vegard Nossum wrote:
> On 5 December 2016 at 21:35, Linus Torvalds wrote:
>> Note for Ingo and Peter: this patch has not been tested at all. But
>> Vegard did test an earlier patch of mine that just verified that yes,
>> the issue really was that wait queue entries remained on the wait
>> queue head just as we were about to return and free it.
>
> The second patch has been running for 1h+ without any problems of any
> kind. I should typically have seen 2 crashes by now. I'll let it run
> overnight to be sure.

Alright, so nearly 12 hours later I don't see either the new warning or
the original crash at all, so feel free to add:

Tested-by: Vegard Nossum.

That said, my 8 VMs had all panicked in some way due to OOMs (which is
new since v4.8). Some got page allocation stalls for >20s and died
because "khugepaged blocked for more than 120 seconds"; others got "Out
of memory and no killable processes".


Vegard