Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752280AbcDVSzv (ORCPT ); Fri, 22 Apr 2016 14:55:51 -0400 Received: from mail-wm0-f51.google.com ([74.125.82.51]:38476 "EHLO mail-wm0-f51.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752074AbcDVSzs (ORCPT ); Fri, 22 Apr 2016 14:55:48 -0400 MIME-Version: 1.0 In-Reply-To: <20160421170635.GO7822@mtj.duckdns.org> References: <20160420211456.GE4775@htj.duckdns.org> <20160421170635.GO7822@mtj.duckdns.org> From: Dmitry Vyukov Date: Fri, 22 Apr 2016 20:55:27 +0200 Message-ID: Subject: Re: fs: GPF in locked_inode_to_wb_and_lock_list To: Tejun Heo Cc: Andrey Ryabinin , Alexander Viro , "linux-fsdevel@vger.kernel.org" , LKML , syzkaller , Kostya Serebryany , Alexander Potapenko , Ilya Dryomov , Jens Axboe , Jan Kara Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4157 Lines: 108 On Thu, Apr 21, 2016 at 7:06 PM, Tejun Heo wrote: > Hello, > > (cc'ing Ilya, Jan and Jens) > > On Thu, Apr 21, 2016 at 12:00:38PM +0200, Dmitry Vyukov wrote: >> On Thu, Apr 21, 2016 at 11:45 AM, Andrey Ryabinin >> wrote: >> > 2016-04-21 11:35 GMT+03:00 Dmitry Vyukov : >> >> >> >> ffffffff818884dd: 48 8b 03 mov (%rbx),%rax >> >> >> >> So whatever load "&wb->bdi->wb" produces is a NULL deref. (is it wb >> >> that is NULL?) >> > >> > Yes it's NULL wb, because there is only one load: >> > mov (%rbx),%rax => rax = wb->bdi >> > add $0x50,%rax => rax = &bdi->wb >> >> >> I bet that wb becomes NULL on the second iteration of the loop. The >> loop loops in case of a race with another thread, so it would also >> explain why it is difficult to reproduce. >> >> Tejun, does it make any sense to you? > > Yeah, that makes sense. I think the culprit is 43d1c0eb7e11 ("block: > detach bdev inode from its wb in __blkdev_put()") which allows inode > to wb association to be broken while other operations including > writeback are in progress. I thought it should be okay as the inode > must be clean at that point but that obviously doesn't mean that there > can be no writeback operations in flight. > > I hope we could eventually move away from the current model where we > try to swap out an underlying data structure while upper layers may > still be referring to it in the future but for now we can make sure > the writeback operation is finished before detaching wb. > > Dmitry, I understand that the bug is difficult to reproduce but can > you please give the following patch a try? I've merged it into my tree and will restart the fuzzer and leave it running for the weekend. Though, yeah, it is difficult to reproduce... > Thanks. > > diff --git a/fs/block_dev.c b/fs/block_dev.c > index 20a2c02..209ea33 100644 > --- a/fs/block_dev.c > +++ b/fs/block_dev.c > @@ -1530,12 +1530,7 @@ static void __blkdev_put(struct block_device *bdev, fmode_t mode, int for_part) > kill_bdev(bdev); > > bdev_write_inode(bdev); > - /* > - * Detaching bdev inode from its wb in __destroy_inode() > - * is too late: the queue which embeds its bdi (along with > - * root wb) can be gone as soon as we put_disk() below. > - */ > - inode_detach_wb(bdev->bd_inode); > + inode_detach_blkdev_wb(bdev); > } > if (bdev->bd_contains == bdev) { > if (disk->fops->release) > diff --git a/include/linux/writeback.h b/include/linux/writeback.h > index d0b5ca5..ec1f530 100644 > --- a/include/linux/writeback.h > +++ b/include/linux/writeback.h > @@ -230,6 +230,25 @@ static inline void inode_detach_wb(struct inode *inode) > } > > /** > + * inode_detach_blkdev_wb - disassociate a bd_inode from its wb > + * @bdev: block_device of interest > + * > + * @bdev is being put for the last time. Detaching bdev inode in > + * __destroy_inode() is too late: the queue which embeds its bdi (along > + * with root wb) can be gone as soon as the containing disk is put. > + * > + * This function dissociates @bdev->bd_inode from its wb. The inode must > + * be clean and no further operations should be started on it. > + */ > +static inline void inode_detach_blkdev_wb(struct block_device *bdev) > +{ > + if (bdev->bd_inode->i_wb) { > + flush_delayed_work(&bdev->bd_inode->i_wb->dwork); > + inode_detach_wb(bdev->bd_inode); > + } > +} > + > +/** > * wbc_attach_fdatawrite_inode - associate wbc and inode for fdatawrite > * @wbc: writeback_control of interest > * @inode: target inode > @@ -277,6 +296,10 @@ static inline void inode_detach_wb(struct inode *inode) > { > } > > +static inline void inode_detach_blkdev_wb(struct block_device *bdev) > +{ > +} > + > static inline void wbc_attach_and_unlock_inode(struct writeback_control *wbc, > struct inode *inode) > __releases(&inode->i_lock)