From: Dave Chinner
Subject: Re: lazytime implementation questions
Date: Wed, 6 Jan 2016 09:59:07 +1100
Message-ID: <20160105225907.GE21461@dastard>
References: <20160104062219.GB19802@dastard> <20160105173604.GE18604@quack.suse.cz>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: xfs-VZNHf3L845pBDgjK7y7TUQ@public.gmane.org, linux-ext4-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, tytso-3s7WtUTddSA@public.gmane.org
To: Jan Kara
Return-path:
Content-Disposition: inline
In-Reply-To: <20160105173604.GE18604-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
Sender: linux-api-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
List-Id: linux-ext4.vger.kernel.org

On Tue, Jan 05, 2016 at 06:36:04PM +0100, Jan Kara wrote:
> Hi,
>
> On Mon 04-01-16 17:22:19, Dave Chinner wrote:
> > I've been looking at implementing the lazytime mount option for XFS,
> > and I'm struggling to work out what it is supposed to mean.
> >
> > AFAICT, on ext4, lazytime means that pure timestamp updates are not
> > journalled and they are only ever written back when the inode is
> > otherwise dirtied and written, or they are timestamp dirty for 24
> > hours which triggers writeback.
> >
> > This poses a couple of problems for XFS:
> >
> >	1. we log every timestamp change, so there is no mechanism
> >	for delayed/deferred update.
> >
> >	2. we track dirty metadata in the journal, not via the VFS
> >	dirty inode lists, so all the infrastructure written for
> >	ext4 to do periodic flushing is useless to us.
> >
> > These are solvable problems, but what I'm not sure about is exactly
> > what the intended semantics of lazytime durability are. That is,
> > exactly what guaranteed are we giving userspace about timestamp
> > updates when lazytime is used?
> > The guarantees we have to give will
> > greatly influence the XFS implementation, so I really need to nail
> > down what we are expected to provide userspace. Can we:
> >
> >	a) just ignore all durability concerns?
> >	b) if not, do we only need to care about the 24 hour
> >	   writeback and unmount?
> >	c) if not, are fsync/sync/syncfs/freeze/unmount supposed
> >	   to provide durability of all metadata changes?
> >	d) do we have to care about ordering - if we fsync one inode
> >	   with 1 hour old timestamps, do we also need to guarantee
> >	   that all the inodes with older dirty timestamps also get
> >	   made durable?
>
> So the intended semantics is:
> 1) fsync / sync / freeze / unmount will write the timestamp updates even
> with lazytime. So unless crash happens, timestamps are guaranteed to be
> consistent. Also sync / fsync guarantees all changes to get to disk.
> 2) We periodically write back timestamps (once per 24 hours) to avoid too
> big timestamp inconsistencies in case of crash.

Ok, so it's supposed to be a delayed timestamp update mechanism
without any specific ordering guarantees, not an opportunistic
timestamp update mechanism. I can work with that.

Cheers,

Dave.
-- 
Dave Chinner
david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org
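
[Illustrative note, not part of the original message.] The semantics agreed
in this thread can be sketched as a toy userspace model. This is not kernel
code and does not reflect the ext4 or XFS implementation; every name here
(LazytimeFS, Inode, touch, periodic_writeback, crash) is hypothetical. It
only assumes what the thread states: fsync and sync write back dirty
timestamps, timestamps that have been dirty for 24 hours are written back
periodically, and fsync of one inode gives no ordering guarantee for others.

```python
from dataclasses import dataclass
from typing import Dict, Optional

DAY = 24 * 60 * 60  # periodic writeback interval from the thread, in seconds


@dataclass
class Inode:
    mem_mtime: float = 0.0               # timestamp held in memory
    disk_mtime: float = 0.0              # timestamp that would survive a crash
    dirty_since: Optional[float] = None  # when the timestamp first became dirty


class LazytimeFS:
    """Toy model of lazytime durability; all names are hypothetical."""

    def __init__(self) -> None:
        self.inodes: Dict[int, Inode] = {}

    def touch(self, ino: int, now: float) -> None:
        # Pure timestamp update: change memory only, remember when it got dirty.
        inode = self.inodes.setdefault(ino, Inode())
        inode.mem_mtime = now
        if inode.dirty_since is None:
            inode.dirty_since = now

    def _flush(self, inode: Inode) -> None:
        inode.disk_mtime = inode.mem_mtime
        inode.dirty_since = None

    def fsync(self, ino: int) -> None:
        # Makes this inode's timestamp durable; other inodes are untouched,
        # i.e. no ordering guarantee across inodes.
        self._flush(self.inodes[ino])

    def sync(self) -> None:
        # sync / freeze / unmount: every dirty timestamp is written.
        for inode in self.inodes.values():
            self._flush(inode)

    def periodic_writeback(self, now: float) -> None:
        # Write back timestamps that have been dirty for at least 24 hours.
        for inode in self.inodes.values():
            if inode.dirty_since is not None and now - inode.dirty_since >= DAY:
                self._flush(inode)

    def crash(self) -> None:
        # After a crash only the on-disk timestamps remain.
        for inode in self.inodes.values():
            inode.mem_mtime = inode.disk_mtime
            inode.dirty_since = None
```

In this model a crash can lose at most the timestamp updates made since the
last fsync, sync, or 24 hour writeback of that inode, which is the bound
described in point 2) above.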