From: Linus Torvalds
Date: Mon, 24 Oct 2016 14:17:55 -0700
Subject: Re: bio linked list corruption.
To: Andy Lutomirski
Cc: Dave Jones, Chris Mason, Andy Lutomirski, Jens Axboe, Al Viro, Josef Bacik, David Sterba, linux-btrfs, Linux Kernel
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Oct 24, 2016 at 1:46 PM, Linus Torvalds wrote:
>
> So this is all some really subtle code, but I'm not seeing that it
> would be wrong.

Ahh... Except maybe..

The vmalloc/vfree code itself is a bit scary. In particular, we have a
rather insane model of TLB flushing. We leave the virtual area on a
lazy purge-list, and we delay flushing the TLB and actually freeing
the virtual memory for it so that we can batch things up.

But we've free'd the physical pages that are *mapped* by that area
when we do the vfree(). So there can be stale TLB entries that point
to pages that have gotten re-used. They shouldn't matter, because
nothing should be writing to those pages, but it strikes me that this
may also be hurting the DEBUG_PAGEALLOC thing.
Maybe we're not getting the page faults that we *should* be getting,
and there are hidden reads and writes to those pages that already got
free'd.

There was some nasty reason why we did that insane thing. I think it
was just that there are a few high-frequency vmalloc/vfree users and
the TLB flushing was killing some performance.

But it does strike me that we are playing very fast and loose with the
TLB on the vmalloc area.

So maybe all the new VMAP code is fine, and it's really vmalloc/vfree
that has been subtly broken but nobody has ever cared before?

                Linus
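
To make the hazard described above concrete, here is a minimal userspace sketch in C. It is a toy model, not kernel code: every name in it (toy_vmap, toy_vfree_lazy, toy_purge, toy_access, the array-based "TLB") is hypothetical and chosen only for illustration. It assumes a TLB that is consulted before the page table, and a vfree that frees the physical page immediately while deferring the TLB flush to a batched purge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NPAGES  4      /* toy virtual pages in the "vmalloc area" */
#define NO_PAGE (-1)   /* no translation: an access would fault */

/* Hypothetical toy state; none of this mirrors real kernel structures. */
int  page_table[NPAGES];  /* virtual page -> physical page, or NO_PAGE */
int  tlb[NPAGES];         /* cached translations, possibly stale */
bool lazy_purge[NPAGES];  /* virtual pages waiting for a batched flush */
int  phys_owner[NPAGES];  /* physical page -> owner id, 0 means free */

void toy_init(void)
{
    for (int i = 0; i < NPAGES; i++) {
        page_table[i] = NO_PAGE;
        tlb[i] = NO_PAGE;
        lazy_purge[i] = false;
        phys_owner[i] = 0;
    }
}

/* Map a virtual page onto a physical page owned by `owner`. */
void toy_vmap(int vpage, int ppage, int owner)
{
    assert(vpage >= 0 && vpage < NPAGES && ppage >= 0 && ppage < NPAGES);
    page_table[vpage] = ppage;
    phys_owner[ppage] = owner;
}

/* Returns the physical page reached, or NO_PAGE if the access faults. */
int toy_access(int vpage)
{
    if (tlb[vpage] != NO_PAGE)
        return tlb[vpage];           /* TLB hit: page table is not consulted */
    if (page_table[vpage] == NO_PAGE)
        return NO_PAGE;              /* no mapping: page fault */
    tlb[vpage] = page_table[vpage];  /* TLB fill */
    return tlb[vpage];
}

/* Free the physical page now, but only queue the TLB flush. */
void toy_vfree_lazy(int vpage)
{
    phys_owner[page_table[vpage]] = 0;
    page_table[vpage] = NO_PAGE;
    lazy_purge[vpage] = true;
}

/* The batched purge: drop cached translations for all queued pages. */
void toy_purge(void)
{
    for (int i = 0; i < NPAGES; i++) {
        if (lazy_purge[i]) {
            tlb[i] = NO_PAGE;
            lazy_purge[i] = false;
        }
    }
}

/* Walk through the scenario and report what each access reaches. */
void toy_demo(void)
{
    toy_init();
    toy_vmap(0, 2, 1);       /* owner 1 maps vpage 0 onto ppage 2 */
    toy_access(0);           /* warm the TLB */
    toy_vfree_lazy(0);       /* ppage 2 is free, TLB entry remains */
    phys_owner[2] = 2;       /* ppage 2 is re-used by owner 2 */

    int p = toy_access(0);   /* stale entry: no fault is raised */
    if (p != NO_PAGE)
        printf("before purge: vpage 0 reaches ppage %d, owner %d\n",
               p, phys_owner[p]);

    toy_purge();
    printf("after purge: %s\n",
           toy_access(0) == NO_PAGE ? "fault" : "no fault");
}
```

Calling toy_demo() from a main() walks through the window in question: between toy_vfree_lazy() and toy_purge(), an access through the old virtual address still lands on a physical page that now belongs to someone else, and nothing faults. In this model a debugging aid that relies on faults to catch use-after-free would stay silent during that window, which is the worry raised about DEBUG_PAGEALLOC.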