Date: Mon, 24 Aug 2015 14:10:38 -0400
From: Tejun Heo <tj@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Eryu Guan <eguan@redhat.com>, Jens Axboe <axboe@kernel.dk>,
        Jan Kara <jack@suse.cz>, linux-kernel@vger.kernel.org, xfs@oss.sgi.com,
        axboe@fb.com, Jan Kara <jack@suse.com>, linux-fsdevel@vger.kernel.org,
        kernel-team@fb.com
Subject: Re: [PATCH block/for-linus] writeback: fix syncing of I_DIRTY_TIME
 inodes
Message-ID: <20150824181038.GA28944@mtj.duckdns.org>
References: <20150817200254.GG21075@mtj.duckdns.org>
 <20150818091603.GA12317@quack.suse.cz>
 <20150818174718.GA15739@mtj.duckdns.org>
 <20150818195439.GB15739@mtj.duckdns.org>
 <20150818215611.GD3902@dastard>
 <20150820061224.GG17933@dhcp-13-216.nay.redhat.com>
 <20150820143626.GI17933@dhcp-13-216.nay.redhat.com>
 <20150820143735.GJ17933@dhcp-13-216.nay.redhat.com>
 <20150820165537.GA2044@mtj.duckdns.org>
 <20150820230451.GT714@dastard>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150820230451.GT714@dastard>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 1895
Lines: 48

Hello, Dave.

On Fri, Aug 21, 2015 at 09:04:51AM +1000, Dave Chinner wrote:
> > Maybe I'm misunderstanding the code but all xfs_writepage() calls are
> > from unbound workqueues - the writeback workers - while
> > xfs_setfilesize() are from bound workqueues, so I wondered why that
> > was and looked at the code and the setsize functions are run off of a
> > separate work item which is queued from the end_bio callback and I
> > can't tell who would be waiting for them.  Dave, what am I missing?
> 
> xfs_setfilesize runs transactions, so it can't be run from IO
> completion context as it needs to block (i.e. on log space or inode
> locks). It also can't block log IO completion, nor metadata IO
> completion, as only log IO completion can free log space, and the
> inode lock might be waiting on metadata buffer IO completion (e.g.
> during delayed allocation). Hence we have multiple IO completion
> workqueues to keep these things separated and deadlock free. i.e.
> they all get punted to a workqueue where they are then processed in
> a context that can block safely.

I'm still a bit confused.  What prevents the following from happening?

1. IO completion of the last dirty page of an inode runs and the work
   item for xfs_setfilesize() is queued.

2. inode removed from dirty list.

3. __sync_filesystem() invokes sync_inodes_sb().  There are no dirty
   pages, so it finishes.

4. xfs_fs_sync_fs() is called which calls _xfs_log_force() but the
   work item from #1 hasn't run yet, so the size update isn't written
   out.

5. Crash.
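
To make the ordering concrete, here is a toy user-space model of the
five steps.  It is not XFS code: every name in it (toy_inode,
toy_end_io, toy_run_work, toy_sync, toy_scenario) is made up for
illustration, and it assumes that the log force only writes out
transactions which were already committed when it ran, which is exactly
the part I'm unsure about.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; all names here are hypothetical, none are XFS APIs. */
struct toy_inode {
	long disk_size;    /* size that would survive a crash */
	long logged_size;  /* size committed to the in-memory log */
	long io_size;      /* size the queued work item will record */
	bool dirty;        /* still on the dirty list? */
	bool work_queued;  /* size-update work item queued, not yet run */
	bool log_pending;  /* committed transaction not yet forced out */
};

/* Steps 1 and 2: the last page completes, the size update is deferred
 * to a work item and the inode leaves the dirty list. */
void toy_end_io(struct toy_inode *ip, long new_size)
{
	ip->io_size = new_size;
	ip->work_queued = true;
	ip->dirty = false;
}

/* The deferred work item: commits the size update to the in-memory log. */
void toy_run_work(struct toy_inode *ip)
{
	if (!ip->work_queued)
		return;
	ip->logged_size = ip->io_size;
	ip->log_pending = true;
	ip->work_queued = false;
}

/* Steps 3 and 4: nothing is dirty, so there is nothing to wait for.
 * ASSUMPTION: the log force only flushes what is already committed. */
void toy_sync(struct toy_inode *ip)
{
	assert(!ip->dirty);
	if (ip->log_pending) {
		ip->disk_size = ip->logged_size;
		ip->log_pending = false;
	}
}

/* Runs the whole sequence and returns the size seen after step 5. */
long toy_scenario(bool work_runs_before_sync)
{
	struct toy_inode ino = { .disk_size = 0, .dirty = true };

	toy_end_io(&ino, 4096);
	if (work_runs_before_sync)
		toy_run_work(&ino);
	toy_sync(&ino);
	return ino.disk_size;  /* crash: only disk_size survives */
}
```

Under those assumptions toy_scenario(false) comes back with a size of 0,
i.e. the update is lost, while toy_scenario(true) returns 4096.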

Is it that _xfs_log_force() waits for the setfilesize transaction
created during writepage?

Thanks.

-- 
tejun