2016-01-05 17:36:04

by Jan Kara

Subject: Re: lazytime implementation questions

Hi,

On Mon 04-01-16 17:22:19, Dave Chinner wrote:
> I've been looking at implementing the lazytime mount option for XFS,
> and I'm struggling to work out what it is supposed to mean.
>
> AFAICT, on ext4, lazytime means that pure timestamp updates are not
> journalled and they are only ever written back when the inode is
> otherwise dirtied and written, or they are timestamp dirty for 24
> hours which triggers writeback.
>
> This poses a couple of problems for XFS:
>
> 1. we log every timestamp change, so there is no mechanism
> for delayed/deferred update.
>
> 2. we track dirty metadata in the journal, not via the VFS
> dirty inode lists, so all the infrastructure written for
> ext4 to do periodic flushing is useless to us.
>
> These are solvable problems, but what I'm not sure about is exactly
> what the intended semantics of lazytime durability are. That is,
> exactly what guarantees are we giving userspace about timestamp
> updates when lazytime is used? The guarantees we have to give will
> greatly influence the XFS implementation, so I really need to nail
> down what we are expected to provide userspace. Can we:
>
> a) just ignore all durability concerns?
> b) if not, do we only need to care about the 24 hour
> writeback and unmount?
> c) if not, are fsync/sync/syncfs/freeze/unmount supposed
> to provide durability of all metadata changes?
> d) do we have to care about ordering - if we fsync one inode
> with 1 hour old timestamps, do we also need to guarantee
> that all the inodes with older dirty timestamps also get
> made durable?

So the intended semantics are:
1) fsync / sync / freeze / unmount will write the timestamp updates even
with lazytime. So unless a crash happens, timestamps are guaranteed to be
consistent. Also, sync / fsync guarantee that all changes get to disk.
2) We periodically write back timestamps (once per 24 hours) to avoid
overly large timestamp inconsistencies in case of a crash.
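
The two rules above can be read as a small decision function. What follows is only a toy model to make them concrete, not kernel code: the names LazyInode, touch, must_write and EXPIRE_SECONDS are hypothetical, the 24-hour figure is simply taken from point 2, and measuring the age from the first unwritten timestamp change is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional

# 24 hours, as stated in point 2 (a value taken from the text, not from any kernel source).
EXPIRE_SECONDS = 24 * 60 * 60


@dataclass
class LazyInode:
    """Hypothetical stand-in for an inode whose only dirty state is its timestamps."""

    # When the in-memory timestamps first diverged from disk; None means clean.
    dirtied_at: Optional[float] = None

    def touch(self, now: float) -> None:
        """Record a pure timestamp update without writing anything."""
        if self.dirtied_at is None:
            self.dirtied_at = now

    def must_write(self, now: float, sync_requested: bool) -> bool:
        """Return True when the timestamps have to be written back."""
        if self.dirtied_at is None:
            return False  # nothing to write
        if sync_requested:
            return True  # rule 1: fsync / sync / freeze / unmount
        # rule 2: periodic writeback once the update is old enough
        return now - self.dirtied_at >= EXPIRE_SECONDS

    def write(self) -> None:
        """Model the writeback: the inode is clean again."""
        self.dirtied_at = None
```

In this model an inode touched at time 0 is not written at the one-hour mark unless a sync-type operation asks for it, and is written unconditionally once 24 hours have passed.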

Otherwise there are no guarantees wrt durability. I've CCed Ted who
designed this just in case I missed something.
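
From the application side, this means a program that needs a timestamp change to survive a crash has to request it explicitly. A minimal userspace sketch, assuming a Linux system and a file on a filesystem mounted with lazytime; the helper name append_and_persist is hypothetical, and the code only shows the call sequence, it cannot itself prove what reached the disk:

```python
import os


def append_and_persist(path: str, data: bytes) -> float:
    """Append data to path, fsync the file, and return its mtime.

    Per point 1 above, the fsync is what makes the timestamp update
    durable even when the filesystem is mounted with lazytime.
    """
    fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    try:
        os.write(fd, data)  # changes the file contents and its mtime
        os.fsync(fd)        # flush data and metadata, timestamps included
        return os.fstat(fd).st_mtime
    finally:
        os.close(fd)
```

Without the fsync call, the only remaining guarantees are the periodic writeback and a later sync, freeze or unmount.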

Honza
--
Jan Kara <jack-IBi9RG/[email protected]>
SUSE Labs, CR