Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754145AbbHXSKn (ORCPT );
	Mon, 24 Aug 2015 14:10:43 -0400
Received: from mail-qk0-f174.google.com ([209.85.220.174]:33145 "EHLO
	mail-qk0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753905AbbHXSKl (ORCPT );
	Mon, 24 Aug 2015 14:10:41 -0400
Date: Mon, 24 Aug 2015 14:10:38 -0400
From: Tejun Heo
To: Dave Chinner
Cc: Eryu Guan , Jens Axboe , Jan Kara , linux-kernel@vger.kernel.org,
	xfs@oss.sgi.com, axboe@fb.com, Jan Kara ,
	linux-fsdevel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH block/for-linus] writeback: fix syncing of I_DIRTY_TIME inodes
Message-ID: <20150824181038.GA28944@mtj.duckdns.org>
References: <20150817200254.GG21075@mtj.duckdns.org>
	<20150818091603.GA12317@quack.suse.cz>
	<20150818174718.GA15739@mtj.duckdns.org>
	<20150818195439.GB15739@mtj.duckdns.org>
	<20150818215611.GD3902@dastard>
	<20150820061224.GG17933@dhcp-13-216.nay.redhat.com>
	<20150820143626.GI17933@dhcp-13-216.nay.redhat.com>
	<20150820143735.GJ17933@dhcp-13-216.nay.redhat.com>
	<20150820165537.GA2044@mtj.duckdns.org>
	<20150820230451.GT714@dastard>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150820230451.GT714@dastard>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1895
Lines: 48

Hello, Dave.

On Fri, Aug 21, 2015 at 09:04:51AM +1000, Dave Chinner wrote:
> > Maybe I'm misunderstanding the code but all xfs_writepage() calls are
> > from unbound workqueues - the writeback workers - while
> > xfs_setfilesize() are from bound workqueues, so I wondered why that
> > was and looked at the code and the setsize functions are run off of a
> > separate work item which is queued from the end_bio callback and I
> > can't tell who would be waiting for them.  Dave, what am I missing?
>
> xfs_setfilesize runs transactions, so it can't be run from IO
> completion context as it needs to block (i.e. on log space or inode
> locks). It also can't block log IO completion, nor metadata Io
> completion, as only log IO completion can free log space, and the
> inode lock might be waiting on metadata buffer IO completion (e.g.
> during delayed allocation). Hence we have multiple IO completion
> workqueues to keep these things separated and deadlock free. i.e.
> they all get punted to a workqueue where they are then processed in
> a context that can block safely.

I'm still a bit confused.  What prevents the following from
happening?

1. io completion of last dirty page of an inode and work item for
   xfs_setfilesize() is queued.

2. inode removed from dirty list.

3. __sync_filesystem() invokes sync_inodes_sb().  There are no dirty
   pages, so it finishes.

4. xfs_fs_sync_fs() is called which calls _xfs_log_force() but the
   work item from #1 hasn't run yet, so the size update isn't written
   out.

5. Crash.

Is it that _xfs_log_force() waits for the setfilesize transaction
created during writepage?

Thanks.

-- 
tejun
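
For reference, the ordering in steps 1-5 can be written down as a tiny
user-space model.  This is only a sketch of the question, not of what
XFS actually does: every name below (model_inode, model_end_io,
model_sync, ...) is hypothetical and none of it is kernel API.  The
flush_pending_work flag stands in for the open point, i.e. whether
something in the sync path waits for the queued size update.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one inode; not a kernel structure. */
struct model_inode {
	int  dirty_pages;      /* pages still awaiting writeback */
	long in_core_size;     /* file size known in memory */
	long on_disk_size;     /* file size that would survive a crash */
	bool setsize_pending;  /* size update queued but not yet run */
};

/* Steps 1-2: the last page completes; the size update is only queued. */
static void model_end_io(struct model_inode *ip)
{
	ip->dirty_pages = 0;
	ip->setsize_pending = true;
}

/* The deferred work item: makes the new size durable. */
static void model_run_setsize_work(struct model_inode *ip)
{
	if (ip->setsize_pending) {
		ip->on_disk_size = ip->in_core_size;
		ip->setsize_pending = false;
	}
}

/*
 * Steps 3-4: a sync that sees no dirty pages.  flush_pending_work is an
 * assumption of this model; it selects whether the sync also waits for
 * the queued work item before returning.
 */
static void model_sync(struct model_inode *ip, bool flush_pending_work)
{
	if (flush_pending_work)
		model_run_setsize_work(ip);
	/* no dirty pages left, so there is nothing else to write */
}

/* Step 5: crash right after sync; only on_disk_size survives. */
static long model_crash_after_sync(bool flush_pending_work)
{
	struct model_inode ip = {
		.dirty_pages = 1,
		.in_core_size = 8192,
		.on_disk_size = 4096,
		.setsize_pending = false,
	};

	model_end_io(&ip);
	model_sync(&ip, flush_pending_work);
	return ip.on_disk_size;
}
```

In this model, model_crash_after_sync(false) leaves the old size (4096)
on disk while model_crash_after_sync(true) leaves the new one (8192).
Which of the two cases matches the real code is exactly what the
question above is asking.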