2016-01-05 22:59:07

by Dave Chinner

Subject: Re: lazytime implementation questions

On Tue, Jan 05, 2016 at 06:36:04PM +0100, Jan Kara wrote:
> Hi,
>
> On Mon 04-01-16 17:22:19, Dave Chinner wrote:
> > I've been looking at implementing the lazytime mount option for XFS,
> > and I'm struggling to work out what it is supposed to mean.
> >
> > AFAICT, on ext4, lazytime means that pure timestamp updates are not
> > journalled and they are only ever written back when the inode is
> > otherwise dirtied and written, or they are timestamp dirty for 24
> > hours which triggers writeback.
> >
> > This poses a couple of problems for XFS:
> >
> > 1. we log every timestamp change, so there is no mechanism
> > for delayed/deferred update.
> >
> > 2. we track dirty metadata in the journal, not via the VFS
> > dirty inode lists, so all the infrastructure written for
> > ext4 to do periodic flushing is useless to us.
> >
> > These are solvable problems, but what I'm not sure about is exactly
> > what the intended semantics of lazytime durability are. That is,
> > exactly what guarantees are we giving userspace about timestamp
> > updates when lazytime is used? The guarantees we have to give will
> > greatly influence the XFS implementation, so I really need to nail
> > down what we are expected to provide userspace. Can we:
> >
> > a) just ignore all durability concerns?
> > b) if not, do we only need to care about the 24 hour
> > writeback and unmount?
> > c) if not, are fsync/sync/syncfs/freeze/unmount supposed
> > to provide durability of all metadata changes?
> > d) do we have to care about ordering - if we fsync one inode
> > with 1 hour old timestamps, do we also need to guarantee
> > that all the inodes with older dirty timestamps also get
> > made durable?
>
> So the intended semantics is:
> 1) fsync / sync / freeze / unmount will write the timestamp updates even
> with lazytime. So unless a crash happens, timestamps are guaranteed to be
> consistent. Also, sync / fsync guarantee that all changes get to disk.
> 2) We periodically write back timestamps (once per 24 hours) to avoid too
> big timestamp inconsistencies in case of crash.

Ok, so it's supposed to be a delayed timestamp update mechanism
without any specific ordering guarantees, not an opportunistic
timestamp update mechanism.

I can work with that.
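
To make that reading concrete, here is a rough toy model of the semantics
as summarized above. It is only a sketch, not ext4 or XFS code: the class
and method names (Inode, LazytimeModel, touch, periodic_writeback, crash)
are made up for illustration, and the only number taken from the thread is
the 24 hour writeback interval.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Interval mentioned in the thread: dirty timestamps are written back
# roughly once per 24 hours.
LAZYTIME_EXPIRE_SECS = 24 * 60 * 60


@dataclass
class Inode:
    """Hypothetical inode holding one timestamp in memory and on disk."""
    ino: int
    mem_mtime: float = 0.0                 # value visible to userspace
    disk_mtime: float = 0.0                # value that survives a crash
    dirty_since: Optional[float] = None    # when the timestamp became dirty


class LazytimeModel:
    """Toy model: delayed timestamp writeback, no cross-inode ordering."""

    def __init__(self) -> None:
        self.inodes: Dict[int, Inode] = {}

    def touch(self, ino: int, now: float) -> None:
        # Pure timestamp update: only memory changes, nothing is written.
        inode = self.inodes.setdefault(ino, Inode(ino))
        inode.mem_mtime = now
        if inode.dirty_since is None:
            inode.dirty_since = now

    def _write(self, inode: Inode) -> None:
        inode.disk_mtime = inode.mem_mtime
        inode.dirty_since = None

    def fsync(self, ino: int) -> None:
        # Makes this inode's timestamp durable; other inodes are untouched,
        # even if their timestamps have been dirty for longer.
        self._write(self.inodes[ino])

    def sync(self) -> None:
        # Stands in for sync / freeze / unmount: flush every dirty timestamp.
        for inode in self.inodes.values():
            if inode.dirty_since is not None:
                self._write(inode)

    def periodic_writeback(self, now: float) -> None:
        # Flush timestamps that have been dirty for at least the interval.
        for inode in self.inodes.values():
            if (inode.dirty_since is not None
                    and now - inode.dirty_since >= LAZYTIME_EXPIRE_SECS):
                self._write(inode)

    def crash(self) -> None:
        # Anything not yet written is lost; only disk values remain.
        for inode in self.inodes.values():
            inode.mem_mtime = inode.disk_mtime
            inode.dirty_since = None
```

In this model, touching inodes 1 and 2 and then calling fsync(2) leaves
inode 1's on-disk timestamp stale until sync() or the 24 hour writeback
runs, which is the "no specific ordering guarantees" part.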

Cheers,

Dave.
--
Dave Chinner
[email protected]


2016-01-07 01:05:06

by Theodore Ts'o

Subject: Re: lazytime implementation questions

On Wed, Jan 06, 2016 at 09:59:07AM +1100, Dave Chinner wrote:
> > So the intended semantics is:
> > 1) fsync / sync / freeze / unmount will write the timestamp updates even
> > with lazytime. So unless a crash happens, timestamps are guaranteed to be
> > consistent. Also, sync / fsync guarantee that all changes get to disk.
> > 2) We periodically write back timestamps (once per 24 hours) to avoid too
> > big timestamp inconsistencies in case of crash.
>
> Ok, so it's supposed to be a delayed timestamp update mechanism
> without any specific ordering guarantees, not an opportunistic
> timestamp update mechanism.

There is an optimization which ext4 has which will update related
timestamps when we write an inode table block, which is
"opportunistic", but there is no guarantee that this will happen.

This is purely optional; other file systems don't have to do this, but
it can be a win in that if related inodes are in the same 4k block,
and we need to update, say, the index file's inode because we are changing
i_size, but we were also doing non-allocating writes to the data file,
then we might as well write out the timestamps for the data file at
the same time, since this is "free".
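
As a rough illustration of that idea (not the actual ext4 code), the
piggybacking could be sketched like this. The names block_of and
write_inode_block are hypothetical, and INODES_PER_BLOCK = 16 assumes
256-byte inodes in a 4k block; the mapping from inode number to block is
deliberately simplified.

```python
from typing import Set

# Assumption for illustration: 4096-byte block / 256-byte inodes.
INODES_PER_BLOCK = 16


def block_of(ino: int) -> int:
    """Simplified mapping from inode number to inode table block."""
    return ino // INODES_PER_BLOCK


def write_inode_block(dirty_time: Set[int], target_ino: int) -> Set[int]:
    """Write the inode table block containing target_ino.

    dirty_time is the set of inode numbers whose only pending change is a
    lazily held timestamp. Any of them living in the same block get their
    timestamps written along with the target, since the block is going to
    disk anyway. Returns the inodes that were flushed this way and removes
    them from dirty_time.
    """
    blk = block_of(target_ino)
    piggyback = {ino for ino in dirty_time if block_of(ino) == blk}
    dirty_time.difference_update(piggyback)
    return piggyback
```

Inodes in other blocks stay in the dirty set and wait for fsync, sync or
the periodic writeback, which is why this is an optimization and not a
guarantee.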

- Ted