2022-09-07 14:05:00

by Trond Myklebust

Subject: Re: [man-pages RFC PATCH v4] statx, inode: document the new STATX_INO_VERSION field

On Wed, 2022-09-07 at 09:12 -0400, Jeff Layton wrote:
> On Wed, 2022-09-07 at 08:52 -0400, J. Bruce Fields wrote:
> > On Wed, Sep 07, 2022 at 08:47:20AM -0400, Jeff Layton wrote:
> > > On Wed, 2022-09-07 at 21:37 +1000, NeilBrown wrote:
> > > > On Wed, 07 Sep 2022, Jeff Layton wrote:
> > > > > +The change to \fIstatx.stx_ino_version\fP is not atomic with respect to the
> > > > > +other changes in the inode. On a write, for instance, the i_version is usually
> > > > > +incremented before the data is copied into the pagecache. Therefore it is
> > > > > +possible to see a new i_version value while a read still shows the old data.
> > > >
> > > > Doesn't that make the value useless?
> > > >
> > >
> > > No, I don't think so. It's only really useful for comparing to an
> > > older sample anyway. If you do "statx; read; statx" and the value
> > > hasn't changed, then you know that things are stable.
> >
> > I don't see how that helps.  It's still possible to get:
> >
> >                 reader          writer
> >                 ------          ------
> >                                 i_version++
> >                 statx
> >                 read
> >                 statx
> >                                 update page cache
> >
> > right?
> >
>
> Yeah, I suppose so -- the statx wouldn't necessitate any locking. In
> that case, maybe this is useless then other than for testing purposes
> and userland NFS servers.
>
> Would it be better to not consume a statx field with this if so? What
> could we use as an alternate interface? ioctl? Some sort of global
> virtual xattr? It does need to be something per-inode.

I don't see how a non-atomic change attribute is remotely useful even
for NFS.
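
As a minimal sketch of the interleaving quoted above, the following toy
model replays it deterministically. ToyInode and run_interleaving are
hypothetical names used only for illustration; nothing here calls statx
or touches a real filesystem, and the only assumption is the one stated
in the proposed man page text: the counter is bumped before the data
reaches the page cache.

```python
from dataclasses import dataclass


@dataclass
class ToyInode:
    """Hypothetical stand-in for an inode: a change counter plus cached data."""
    i_version: int = 1
    pagecache: bytes = b"old"


def run_interleaving():
    inode = ToyInode()

    # Writer, step 1: the counter is bumped before the data is copied in.
    inode.i_version += 1

    # Reader: statx; read; statx, all inside the writer's window.
    before = inode.i_version
    data = inode.pagecache
    after = inode.i_version

    # Writer, step 2: the new data finally reaches the page cache.
    inode.pagecache = b"new"

    return before, data, after, inode


before, data, after, inode = run_interleaving()
looks_stable = before == after
is_stale = data != inode.pagecache and after == inode.i_version
print(f"looks stable: {looks_stable}, cached read is stale: {is_stale}")
```

In this model the reader's two samples match, yet the data it holds no
longer matches the file, and no later counter change will tell it so.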

The main problem is not so much the above (although NFS clients are
vulnerable to that too) but the behaviour w.r.t. directory changes.

If the server can't guarantee that file/directory/... creation and
unlink are atomically recorded with change attribute updates, then the
client has to always assume that the server is lying, and that it has
to revalidate all its caches anyway. Cue endless readdir/lookup/getattr
requests after each and every directory modification in order to check
that some other client didn't also sneak in a change of their own.
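
The directory case can be sketched the same way. This is a toy model
under stated assumptions, not the Linux NFS client or server: ToyServer
and ToyClient are hypothetical, the server is assumed to bump the change
attribute before the new entry becomes visible, and the client is
assumed to trust its cached listing whenever the attribute is unchanged.

```python
class ToyServer:
    """Hypothetical server exporting one directory."""

    def __init__(self):
        self.change_attr = 1
        self.entries = {"a"}

    def getattr(self):
        return self.change_attr

    def readdir(self):
        # Returns the attribute and a snapshot of the entries together.
        return self.change_attr, frozenset(self.entries)


class ToyClient:
    """Hypothetical client that caches the listing keyed by change attribute."""

    def __init__(self, server):
        self.server = server
        self.cached_attr = None
        self.cached_entries = None

    def listdir(self):
        # Trust the cache whenever the change attribute still matches.
        if self.cached_attr != self.server.getattr():
            self.cached_attr, self.cached_entries = self.server.readdir()
        return self.cached_entries


server = ToyServer()
client = ToyClient(server)

server.change_attr += 1   # create "b", step 1: attribute bumped first
mid = client.listdir()    # client fills its cache inside the window
server.entries.add("b")   # create "b", step 2: entry becomes visible
late = client.listdir()   # attribute unchanged, so the cache is trusted

print(sorted(mid), sorted(late), sorted(server.entries))
```

The client ends up holding a listing without "b" under the current
change attribute, so comparing attributes alone can never expose the
stale cache. A client that knows this can happen has no choice but to
revalidate after every modification.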

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace



2022-09-08 00:32:49

by NeilBrown

Subject: Re: [man-pages RFC PATCH v4] statx, inode: document the new STATX_INO_VERSION field

On Wed, 07 Sep 2022, Trond Myklebust wrote:
> On Wed, 2022-09-07 at 09:12 -0400, Jeff Layton wrote:
> > On Wed, 2022-09-07 at 08:52 -0400, J. Bruce Fields wrote:
> > > On Wed, Sep 07, 2022 at 08:47:20AM -0400, Jeff Layton wrote:
> > > > On Wed, 2022-09-07 at 21:37 +1000, NeilBrown wrote:
> > > > > On Wed, 07 Sep 2022, Jeff Layton wrote:
> > > > > > +The change to \fIstatx.stx_ino_version\fP is not atomic with respect to the
> > > > > > +other changes in the inode. On a write, for instance, the i_version is usually
> > > > > > +incremented before the data is copied into the pagecache. Therefore it is
> > > > > > +possible to see a new i_version value while a read still shows the old data.
> > > > >
> > > > > Doesn't that make the value useless?
> > > > >
> > > >
> > > > No, I don't think so. It's only really useful for comparing to an
> > > > older sample anyway. If you do "statx; read; statx" and the value
> > > > hasn't changed, then you know that things are stable.
> > >
> > > I don't see how that helps.  It's still possible to get:
> > >
> > >                 reader          writer
> > >                 ------          ------
> > >                                 i_version++
> > >                 statx
> > >                 read
> > >                 statx
> > >                                 update page cache
> > >
> > > right?
> > >
> >
> > Yeah, I suppose so -- the statx wouldn't necessitate any locking. In
> > that case, maybe this is useless then other than for testing purposes
> > and userland NFS servers.
> >
> > Would it be better to not consume a statx field with this if so? What
> > could we use as an alternate interface? ioctl? Some sort of global
> > virtual xattr? It does need to be something per-inode.
>
> I don't see how a non-atomic change attribute is remotely useful even
> for NFS.
>
> The main problem is not so much the above (although NFS clients are
> vulnerable to that too) but the behaviour w.r.t. directory changes.
>
> If the server can't guarantee that file/directory/... creation and
> unlink are atomically recorded with change attribute updates, then the
> client has to always assume that the server is lying, and that it has
> to revalidate all its caches anyway. Cue endless readdir/lookup/getattr
> requests after each and every directory modification in order to check
> that some other client didn't also sneak in a change of their own.

NFS re-export doesn't support atomic change attributes on directories.
Do we see the endless revalidate requests after directory modification
in that situation? Just curious.

Thanks,
NeilBrown