Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1758157AbcJYBJr (ORCPT );
        Mon, 24 Oct 2016 21:09:47 -0400
Received: from mail-ua0-f182.google.com ([209.85.217.182]:36075 "EHLO
        mail-ua0-f182.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1756439AbcJYBJq (ORCPT );
        Mon, 24 Oct 2016 21:09:46 -0400
MIME-Version: 1.0
In-Reply-To:
References: <20161018234248.GB93792@clm-mbp.masoncoding.com>
        <332c8e94-a969-093f-1fb4-30d89be8993e@kernel.org>
        <20161020225028.czodw54tjbiwwv3o@codemonkey.org.uk>
        <20161020230341.jsxpia2sy53xn5l5@codemonkey.org.uk>
        <20161021200245.kahjzgqzdfyoe3uz@codemonkey.org.uk>
        <20161022152033.gkmm3l75kqjzsije@codemonkey.org.uk>
        <20161024044051.onmh4h6sc2bjxzzc@codemonkey.org.uk>
From: Andy Lutomirski
Date: Mon, 24 Oct 2016 18:09:24 -0700
Message-ID:
Subject: Re: bio linked list corruption.
To: Linus Torvalds
Cc: David Sterba, Al Viro, Dave Jones, Linux Kernel, Jens Axboe,
        Josef Bacik, Chris Mason, linux-btrfs
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1255
Lines: 33

On Oct 24, 2016 5:00 PM, "Linus Torvalds" wrote:
>
> On Mon, Oct 24, 2016 at 3:42 PM, Andy Lutomirski wrote:
> >
> > Now the fallocate thread catches up and *exits*. Dave's test makes a
> > new thread that reuses the stack (the vmap area or the backing store).
> >
> > Now the shmem_fault thread continues on its merry way and takes
> > q->lock. But oh crap, q->lock is pointing at some random spot on some
> > other thread's stack. Kaboom!
>
> Note that q->lock should be entirely immaterial, since inode->i_lock
> nests outside of it in all uses.
>
> Now, if there is some code that runs *without* the inode->i_lock, then
> that would be a big bug.
>
> But I'm not seeing it.
>
> I do agree that some race on some stack data structure could easily be
> the cause of these issues. And yes, the vmap code obviously starts
> reusing the stack much earlier, and would trigger problems that would
> essentially be hidden by the fact that the kernel stack used to stay
> around not just until exit(), but until the process was reaped.
>
> I just think that in this case i_lock really looks like it should
> serialize things correctly.
>
> Or are you seeing something I'm not?

No, I missed that.

--Andy