2023-10-26 05:43:23

by Amir Goldstein

Subject: Re: [PATCH RFC 2/9] timekeeping: new interfaces for multigrain timestamp handing

On Thu, Oct 26, 2023 at 5:21 AM Dave Chinner <[email protected]> wrote:
>
> On Wed, Oct 25, 2023 at 08:25:35AM -0400, Jeff Layton wrote:
> > On Wed, 2023-10-25 at 19:05 +1100, Dave Chinner wrote:
> > > On Tue, Oct 24, 2023 at 02:40:06PM -0400, Jeff Layton wrote:
> > > > On Tue, 2023-10-24 at 10:08 +0300, Amir Goldstein wrote:
> > > > > On Tue, Oct 24, 2023 at 6:40 AM Dave Chinner <[email protected]> wrote:
> > > > > >
> > > > > > On Mon, Oct 23, 2023 at 02:18:12PM -1000, Linus Torvalds wrote:
> > > > > > > On Mon, 23 Oct 2023 at 13:26, Dave Chinner <[email protected]> wrote:
> > > > > Does xfs_repair guarantee that changes of atime, or any inode changes
> > > > > for that matter, update i_version? No, it does not.
> > > > > So IMO, "atime does not update i_version" is not an "on-disk format change",
> > > > > it is a runtime behavior change, just like lazytime is.
> > > >
> > > > This would certainly be my preference. I don't want to break any
> > > > existing users though.
> > >
> > > That's why I'm trying to get some kind of consensus on what
> > > rules and/or atime configurations people are happy for me to break
> > > to make it look to users like there's a viable working change
> > > attribute being supplied by XFS without needing to change the on
> > > disk format.
> > >
> >
> > I agree that the only bone of contention is whether to count atime
> > updates against the change attribute. I think we have consensus that all
> > in-kernel users do _not_ want atime updates counted against the change
> > attribute. The only real question is these "legacy" users of
> > di_changecount.
>
> Please stop referring to "legacy users" of di_changecount. Whether
> there are users or not is irrelevant - it is defined by the current
> on-disk format specification, and as such there may be applications
> we do not know about making use of the current behaviour.
>
> It's like a linux syscall - we can't remove them because there may
> be some user we don't know about still using that old syscall. We
> simply don't make changes that can potentially break user
> applications like that.
>
> The on disk format is the same - there is software out that we don't
> know about that expects a certain behaviour based on the
> specification. We don't break the on disk format by making silent
> behavioural changes - we require a feature flag to indicate
> behaviour has changed so that applications can take appropriate
> actions with stuff they don't understand.
>
> The example for this is the BIGTIME timestamp format change. The on
> disk inode structure is physically unchanged, but the contents of
> the timestamp fields are encoded very differently. Sure, the older
> kernels can read the timestamp data without any sort of problem
> occurring, except for the fact the timestamps now appear to be
> completely corrupted.
>
> Changing the meaning of the contents of di_changecount is no
> different. It might look OK and nothing crashes, but nothing can be
> inferred from the value in the field because we don't know how it
> has been modified.
>

I don't agree that this change is the same as the BIGTIME change,
but it is a good cue to ask:
BIGTIME has an on-disk feature bit in the superblock that can be set on an
existing filesystem (and not cleared?).
BIGTIME also has an on-disk inode flag to specify the format in which a
specific inode's timestamps are stored.

If we were to change the xfs on-disk format to change the *meaning* (not the
format in which the counter is stored) of di_changecount, would the feature
flag need to be RO_COMPAT?
Would this require a per-inode on-disk flag that declares the meaning
of di_changecount on that specific inode?

Neither of those changes is going to be very hard to do, btw: they would
follow in the footsteps of the BIGTIME conversion, but without the
need for an actual format converter.
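
To make the question concrete, here is a rough userspace sketch of the
two-level scheme (a superblock feature bit plus a per-inode flag). Every
name and bit value below is made up for illustration; none of it is the
actual XFS code or on-disk format, and whether the superblock bit would
have to be RO_COMPAT is exactly the open question:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only (not real XFS definitions). */
#define SKETCH_SB_FEAT_CHANGEATTR  (1u << 0) /* sb: new counter meaning allowed */
#define SKETCH_DIFLAG_CHANGEATTR   (1u << 0) /* inode: counter uses new meaning */

struct sketch_sb {
	uint32_t features;      /* stand-in for a superblock feature mask */
};

struct sketch_inode {
	uint32_t flags;         /* stand-in for per-inode on-disk flags */
	uint64_t changecount;   /* stand-in for di_changecount */
};

/* True only if both the filesystem and this inode opted in to the new meaning. */
static bool sketch_inode_skips_atime(const struct sketch_sb *sb,
				     const struct sketch_inode *ip)
{
	return (sb->features & SKETCH_SB_FEAT_CHANGEATTR) &&
	       (ip->flags & SKETCH_DIFLAG_CHANGEATTR);
}

/*
 * Bump the counter on an inode change. With the old meaning every change
 * counts; with the new meaning, atime-only updates are not counted.
 */
static void sketch_log_inode_change(const struct sketch_sb *sb,
				    struct sketch_inode *ip, bool atime_only)
{
	if (atime_only && sketch_inode_skips_atime(sb, ip))
		return;
	ip->changecount++;
}
```

In this sketch an inode without the per-inode flag keeps the current
behaviour even on a filesystem that has the feature bit set, which is
why no conversion pass over existing inodes would be needed.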

Thanks,
Amir.