LinuxLists.cc - cto changes for v4 atomic open

2021-07-30 13:26:52

Subject: cto changes for v4 atomic open

I have some folks unhappy about behavior changes after: 479219218fbe NFS:
Optimise away the close-to-open GETATTR when we have NFSv4 OPEN

Before this change, a client holding a RO open would invalidate the
pagecache when doing a second RW open.

Now the client doesn't invalidate the pagecache, though technically it could
because we see a changeattr update on the RW OPEN response.

I feel this is a grey area in CTO if we're already holding an open. Do we
know how the client ought to behave in this case? Should the client's open
upgrade to RW invalidate the pagecache?

Ben

2021-07-30 14:49:25

by Trond Myklebust

Subject: Re: cto changes for v4 atomic open

On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington wrote:
> I have some folks unhappy about behavior changes after: 479219218fbe
> NFS:
> Optimise away the close-to-open GETATTR when we have NFSv4 OPEN
>
> Before this change, a client holding a RO open would invalidate the
> pagecache when doing a second RW open.
>
> Now the client doesn't invalidate the pagecache, though technically
> it could
> because we see a changeattr update on the RW OPEN response.
>
> I feel this is a grey area in CTO if we're already holding an open.
> Do we
> know how the client ought to behave in this case? Should the
> client's open
> upgrade to RW invalidate the pagecache?
>

It's not a "grey area in close-to-open" at all. It is very cut and
dried.

If you need to invalidate your page cache while the file is open, then
by definition you are in a situation where there is a write by another
client going on while you are reading. You're clearly not doing close-
to-open.

The people who are doing this should be using uncached I/O.

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]

2021-07-30 15:17:03

by Benjamin Coddington

Subject: Re: cto changes for v4 atomic open

On 30 Jul 2021, at 10:48, Trond Myklebust wrote:

> On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington wrote:
>> I have some folks unhappy about behavior changes after: 479219218fbe
>> NFS:
>> Optimise away the close-to-open GETATTR when we have NFSv4 OPEN
>>
>> Before this change, a client holding a RO open would invalidate the
>> pagecache when doing a second RW open.
>>
>> Now the client doesn't invalidate the pagecache, though technically
>> it could
>> because we see a changeattr update on the RW OPEN response.
>>
>> I feel this is a grey area in CTO if we're already holding an open.
>> Do we
>> know how the client ought to behave in this case? Should the
>> client's open
>> upgrade to RW invalidate the pagecache?
>>
>
> It's not a "grey area in close-to-open" at all. It is very cut and
> dried.
>
> If you need to invalidate your page cache while the file is open, then
> by definition you are in a situation where there is a write by another
> client going on while you are reading. You're clearly not doing close-
> to-open.
>
> The people who are doing this should be using uncached I/O.

Thanks Trond, that corrects my ambiguity and yes - there's a much better
way.

Ben

2021-08-03 21:40:25

by J. Bruce Fields

Subject: Re: cto changes for v4 atomic open

On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust wrote:
> On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington wrote:
> > I have some folks unhappy about behavior changes after: 479219218fbe
> > NFS:
> > Optimise away the close-to-open GETATTR when we have NFSv4 OPEN
> >
> > Before this change, a client holding a RO open would invalidate the
> > pagecache when doing a second RW open.
> >
> > Now the client doesn't invalidate the pagecache, though technically
> > it could
> > because we see a changeattr update on the RW OPEN response.
> >
> > I feel this is a grey area in CTO if we're already holding an open.
> > Do we
> > know how the client ought to behave in this case? Should the
> > client's open
> > upgrade to RW invalidate the pagecache?
> >
>
> It's not a "grey area in close-to-open" at all. It is very cut and
> dried.
>
> If you need to invalidate your page cache while the file is open, then
> by definition you are in a situation where there is a write by another
> client going on while you are reading. You're clearly not doing close-
> to-open.

Documentation is really unclear about this case. Every definition of
close-to-open that I've seen says that it requires a cache consistency
check on every application open. I've never seen one that says "on
every open that doesn't overlap with an already-existing open on that
client".

They *usually* also preface that by saying that this is motivated by the
use case where opens don't overlap. But it's never made clear that
that's part of the definition.

--b.

2021-08-03 21:44:04

by Trond Myklebust

Subject: Re: cto changes for v4 atomic open

On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote:
> On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust wrote:
> > On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington wrote:
> > > I have some folks unhappy about behavior changes after:
> > > 479219218fbe
> > > NFS:
> > > Optimise away the close-to-open GETATTR when we have NFSv4 OPEN
> > >
> > > Before this change, a client holding a RO open would invalidate
> > > the
> > > pagecache when doing a second RW open.
> > >
> > > Now the client doesn't invalidate the pagecache, though
> > > technically
> > > it could
> > > because we see a changeattr update on the RW OPEN response.
> > >
> > > I feel this is a grey area in CTO if we're already holding an
> > > open.
> > > Do we
> > > know how the client ought to behave in this case? Should the
> > > client's open
> > > upgrade to RW invalidate the pagecache?
> > >
> >
> > It's not a "grey area in close-to-open" at all. It is very cut and
> > dried.
> >
> > If you need to invalidate your page cache while the file is open,
> > then
> > by definition you are in a situation where there is a write by
> > another
> > client going on while you are reading. You're clearly not doing
> > close-
> > to-open.
>
> Documentation is really unclear about this case. Every definition of
> close-to-open that I've seen says that it requires a cache
> consistency
> check on every application open. I've never seen one that says "on
> every open that doesn't overlap with an already-existing open on that
> client".
>
> They *usually* also preface that by saying that this is motivated by
> the
> use case where opens don't overlap. But it's never made clear that
> that's part of the definition.
>

I'm not following your logic.

The close-to-open model assumes that the file is only being modified by
one client at a time and it assumes that file contents may be cached
while an application is holding it open.
The point checks exist in order to detect if the file is being changed
when the file is not open.

Linux does not have a per-application cache. It has a page cache that
is shared among all applications. It is impossible for two applications
to open the same file using buffered I/O, and yet see different
contents. So why do we need a second point check of the validity of the
page cache contents when one application has already verified that the
cache was valid when it opened it?

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]

2021-08-03 21:47:48

by J. Bruce Fields

Subject: Re: cto changes for v4 atomic open

On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote:
> On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote:
> > On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust wrote:
> > > On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington wrote:
> > > > I have some folks unhappy about behavior changes after:
> > > > 479219218fbe
> > > > NFS:
> > > > Optimise away the close-to-open GETATTR when we have NFSv4 OPEN
> > > >
> > > > Before this change, a client holding a RO open would invalidate
> > > > the
> > > > pagecache when doing a second RW open.
> > > >
> > > > Now the client doesn't invalidate the pagecache, though
> > > > technically
> > > > it could
> > > > because we see a changeattr update on the RW OPEN response.
> > > >
> > > > I feel this is a grey area in CTO if we're already holding an
> > > > open.
> > > > Do we
> > > > know how the client ought to behave in this case? Should the
> > > > client's open
> > > > upgrade to RW invalidate the pagecache?
> > > >
> > >
> > > It's not a "grey area in close-to-open" at all. It is very cut and
> > > dried.
> > >
> > > If you need to invalidate your page cache while the file is open,
> > > then
> > > by definition you are in a situation where there is a write by
> > > another
> > > client going on while you are reading. You're clearly not doing
> > > close-
> > > to-open.
> >
> > Documentation is really unclear about this case. Every definition of
> > close-to-open that I've seen says that it requires a cache
> > consistency
> > check on every application open. I've never seen one that says "on
> > every open that doesn't overlap with an already-existing open on that
> > client".
> >
> > They *usually* also preface that by saying that this is motivated by
> > the
> > use case where opens don't overlap. But it's never made clear that
> > that's part of the definition.
> >
>
> I'm not following your logic.

It's just a question of what every source I can find says close-to-open
means. E.g., NFS Illustrated, p. 248, "Close-to-open consistency
provides a guarantee of cache consistency at the level of file opens and
closes. When a file is closed by an application, the client flushes any
cached changes to the server. When a file is opened, the client ignores
any cache time remaining (if the file data are cached) and makes an
explicit GETATTR call to the server to check the file modification
time."
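
The rule in that passage can be sketched as a small in-memory model. This is only an illustration, not the Linux NFS client: `Server`, `Client`, and their methods are hypothetical names, and a single integer `change` counter stands in for the modification time that GETATTR would return.

```python
class Server:
    """Hypothetical stand-in for an NFS server holding one file."""

    def __init__(self, data=b""):
        self.data = data
        self.change = 0  # stands in for the mtime / change attribute

    def getattr(self):
        return self.change

    def read(self):
        return self.data

    def write(self, data):
        self.data = data
        self.change += 1


class Client:
    """Hypothetical client applying the textbook close-to-open rules."""

    def __init__(self, server):
        self.server = server
        self.cached_data = None    # cached file contents, if any
        self.cached_change = None  # attribute value the cache was valid for
        self.dirty = None          # locally written data not yet flushed

    def open(self):
        # On open: ignore any cache lifetime and ask the server directly.
        if self.server.getattr() != self.cached_change:
            self.cached_data = None

    def read(self):
        if self.cached_data is None:
            self.cached_change = self.server.getattr()
            self.cached_data = self.server.read()
        return self.cached_data

    def write(self, data):
        # Writes stay in the local cache until close.
        self.dirty = data
        self.cached_data = data

    def close(self):
        # On close: flush cached changes to the server.
        if self.dirty is not None:
            self.server.write(self.dirty)
            self.cached_change = self.server.getattr()
            self.dirty = None


server = Server(b"v1")
a = Client(server)
b = Client(server)

a.open()
first = a.read()    # a caches the original contents
a.close()

b.open()
b.write(b"v2")
b.close()           # b's change reaches the server here

a.open()            # a revalidates and drops its stale cache
second = a.read()
a.close()
```

In this model the opens on the two clients never overlap, which is the case the definition is usually motivated by; the sketch says nothing about what should happen when they do.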

> The close-to-open model assumes that the file is only being modified by
> one client at a time and it assumes that file contents may be cached
> while an application is holding it open.
> The point checks exist in order to detect if the file is being changed
> when the file is not open.
>
> Linux does not have a per-application cache. It has a page cache that
> is shared among all applications. It is impossible for two applications
> to open the same file using buffered I/O, and yet see different
> contents.

Right, so based on the descriptions like the one above, I would have
expected both applications to see new data at that point.

Maybe that's not practical to implement. It'd be nice at least if that
was explicit in the documentation.

--b.

2021-08-03 21:49:20

by Trond Myklebust

Subject: Re: cto changes for v4 atomic open

On Tue, 2021-08-03 at 17:36 -0400, J. Bruce Fields wrote:
> On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote:
> > On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote:
> > > On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust wrote:
> > > > On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington wrote:
> > > > > I have some folks unhappy about behavior changes after:
> > > > > 479219218fbe
> > > > > NFS:
> > > > > Optimise away the close-to-open GETATTR when we have NFSv4
> > > > > OPEN
> > > > >
> > > > > Before this change, a client holding a RO open would
> > > > > invalidate
> > > > > the
> > > > > pagecache when doing a second RW open.
> > > > >
> > > > > Now the client doesn't invalidate the pagecache, though
> > > > > technically
> > > > > it could
> > > > > because we see a changeattr update on the RW OPEN response.
> > > > >
> > > > > I feel this is a grey area in CTO if we're already holding an
> > > > > open.
> > > > > Do we
> > > > > know how the client ought to behave in this case? Should the
> > > > > client's open
> > > > > upgrade to RW invalidate the pagecache?
> > > > >
> > > >
> > > > It's not a "grey area in close-to-open" at all. It is very cut
> > > > and
> > > > dried.
> > > >
> > > > If you need to invalidate your page cache while the file is
> > > > open,
> > > > then
> > > > by definition you are in a situation where there is a write by
> > > > another
> > > > client going on while you are reading. You're clearly not doing
> > > > close-
> > > > to-open.
> > >
> > > Documentation is really unclear about this case. Every
> > > definition of
> > > close-to-open that I've seen says that it requires a cache
> > > consistency
> > > check on every application open. I've never seen one that says
> > > "on
> > > every open that doesn't overlap with an already-existing open on
> > > that
> > > client".
> > >
> > > They *usually* also preface that by saying that this is motivated
> > > by
> > > the
> > > use case where opens don't overlap. But it's never made clear
> > > that
> > > that's part of the definition.
> > >
> >
> > I'm not following your logic.
>
> It's just a question of what every source I can find says close-to-
> open
> means. E.g., NFS Illustrated, p. 248, "Close-to-open consistency
> provides a guarantee of cache consistency at the level of file opens
> and
> closes. When a file is closed by an application, the client flushes
> any
> cached changes to the server. When a file is opened, the client
> ignores
> any cache time remaining (if the file data are cached) and makes an
> explicit GETATTR call to the server to check the file modification
> time."
>
> > The close-to-open model assumes that the file is only being
> > modified by
> > one client at a time and it assumes that file contents may be
> > cached
> > while an application is holding it open.
> > The point checks exist in order to detect if the file is being
> > changed
> > when the file is not open.
> >
> > Linux does not have a per-application cache. It has a page cache
> > that
> > is shared among all applications. It is impossible for two
> > applications
> > to open the same file using buffered I/O, and yet see different
> > contents.
>
> Right, so based on the descriptions like the one above, I would have
> expected both applications to see new data at that point.

Why? That would be a clear violation of the close-to-open rule that
nobody else can write to the file while it is open.

>
> Maybe that's not practical to implement. It'd be nice at least if
> that
> was explicit in the documentation.
>
> --b.

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]

2021-08-04 00:11:20

by NeilBrown

Subject: Re: cto changes for v4 atomic open

On Wed, 04 Aug 2021, Trond Myklebust wrote:
> On Tue, 2021-08-03 at 17:36 -0400, J. Bruce Fields wrote:
> > On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote:
> > > On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote:
> > > > On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust wrote:
> > > > > On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington wrote:
> > > > > > I have some folks unhappy about behavior changes after:
> > > > > > 479219218fbe
> > > > > > NFS:
> > > > > > Optimise away the close-to-open GETATTR when we have NFSv4
> > > > > > OPEN
> > > > > >
> > > > > > Before this change, a client holding a RO open would
> > > > > > invalidate
> > > > > > the
> > > > > > pagecache when doing a second RW open.
> > > > > >
> > > > > > Now the client doesn't invalidate the pagecache, though
> > > > > > technically
> > > > > > it could
> > > > > > because we see a changeattr update on the RW OPEN response.
> > > > > >
> > > > > > I feel this is a grey area in CTO if we're already holding an
> > > > > > open.
> > > > > > Do we
> > > > > > know how the client ought to behave in this case? Should the
> > > > > > client's open
> > > > > > upgrade to RW invalidate the pagecache?
> > > > > >
> > > > >
> > > > > It's not a "grey area in close-to-open" at all. It is very cut
> > > > > and
> > > > > dried.
> > > > >
> > > > > If you need to invalidate your page cache while the file is
> > > > > open,
> > > > > then
> > > > > by definition you are in a situation where there is a write by
> > > > > another
> > > > > client going on while you are reading. You're clearly not doing
> > > > > close-
> > > > > to-open.
> > > >
> > > > Documentation is really unclear about this case. Every
> > > > definition of
> > > > close-to-open that I've seen says that it requires a cache
> > > > consistency
> > > > check on every application open. I've never seen one that says
> > > > "on
> > > > every open that doesn't overlap with an already-existing open on
> > > > that
> > > > client".
> > > >
> > > > They *usually* also preface that by saying that this is motivated
> > > > by
> > > > the
> > > > use case where opens don't overlap. But it's never made clear
> > > > that
> > > > that's part of the definition.
> > > >
> > >
> > > I'm not following your logic.
> >
> > It's just a question of what every source I can find says close-to-
> > open
> > means. E.g., NFS Illustrated, p. 248, "Close-to-open consistency
> > provides a guarantee of cache consistency at the level of file opens
> > and
> > closes. When a file is closed by an application, the client flushes
> > any
> > cached changes to the server. When a file is opened, the client
> > ignores
> > any cache time remaining (if the file data are cached) and makes an
> > explicit GETATTR call to the server to check the file modification
> > time."
> >
> > > The close-to-open model assumes that the file is only being
> > > modified by
> > > one client at a time and it assumes that file contents may be
> > > cached
> > > while an application is holding it open.
> > > The point checks exist in order to detect if the file is being
> > > changed
> > > when the file is not open.
> > >
> > > Linux does not have a per-application cache. It has a page cache
> > > that
> > > is shared among all applications. It is impossible for two
> > > applications
> > > to open the same file using buffered I/O, and yet see different
> > > contents.
> >
> > Right, so based on the descriptions like the one above, I would have
> > expected both applications to see new data at that point.
>
> Why? That would be a clear violation of the close-to-open rule that
> nobody else can write to the file while it is open.
>

Is the rule
A - "it is not permitted for any other application/client to write to
the file while another has it open"
or
B - "it is not expected for any other application/client to write to
the file while another has it open"

I think B, because A is clearly not enforced. That suggests that there
is no *need* to check for changes, but equally there is no barrier to
checking for changes. So the fact that one application has the file
open should not prevent a check when another application opens the file.
Equally it should not prevent a flush when some other application closes
the file.

It is somewhat weird that if an application on one client misbehaves by
keeping a file open, that will prevent other applications on the same
client from seeing non-local changes, but will not prevent applications
on other clients from seeing any changes.

NeilBrown

2021-08-04 00:17:22

by Trond Myklebust

Subject: Re: cto changes for v4 atomic open

On Wed, 2021-08-04 at 09:47 +1000, NeilBrown wrote:
> On Wed, 04 Aug 2021, Trond Myklebust wrote:
> > On Tue, 2021-08-03 at 17:36 -0400, J. Bruce Fields wrote:
> > > On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote:
> > > > On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote:
> > > > > On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust
> > > > > wrote:
> > > > > > On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington
> > > > > > wrote:
> > > > > > > I have some folks unhappy about behavior changes after:
> > > > > > > 479219218fbe
> > > > > > > NFS:
> > > > > > > Optimise away the close-to-open GETATTR when we have
> > > > > > > NFSv4
> > > > > > > OPEN
> > > > > > >
> > > > > > > Before this change, a client holding a RO open would
> > > > > > > invalidate
> > > > > > > the
> > > > > > > pagecache when doing a second RW open.
> > > > > > >
> > > > > > > Now the client doesn't invalidate the pagecache, though
> > > > > > > technically
> > > > > > > it could
> > > > > > > because we see a changeattr update on the RW OPEN
> > > > > > > response.
> > > > > > >
> > > > > > > I feel this is a grey area in CTO if we're already
> > > > > > > holding an
> > > > > > > open.
> > > > > > > Do we
> > > > > > > know how the client ought to behave in this case? Should
> > > > > > > the
> > > > > > > client's open
> > > > > > > upgrade to RW invalidate the pagecache?
> > > > > > >
> > > > > >
> > > > > > It's not a "grey area in close-to-open" at all. It is very
> > > > > > cut
> > > > > > and
> > > > > > dried.
> > > > > >
> > > > > > If you need to invalidate your page cache while the file is
> > > > > > open,
> > > > > > then
> > > > > > by definition you are in a situation where there is a write
> > > > > > by
> > > > > > another
> > > > > > client going on while you are reading. You're clearly not
> > > > > > doing
> > > > > > close-
> > > > > > to-open.
> > > > >
> > > > > Documentation is really unclear about this case. Every
> > > > > definition of
> > > > > close-to-open that I've seen says that it requires a cache
> > > > > consistency
> > > > > check on every application open. I've never seen one that
> > > > > says
> > > > > "on
> > > > > every open that doesn't overlap with an already-existing open
> > > > > on
> > > > > that
> > > > > client".
> > > > >
> > > > > They *usually* also preface that by saying that this is
> > > > > motivated
> > > > > by
> > > > > the
> > > > > use case where opens don't overlap. But it's never made
> > > > > clear
> > > > > that
> > > > > that's part of the definition.
> > > > >
> > > >
> > > > I'm not following your logic.
> > >
> > > It's just a question of what every source I can find says close-
> > > to-
> > > open
> > > means. E.g., NFS Illustrated, p. 248, "Close-to-open consistency
> > > provides a guarantee of cache consistency at the level of file
> > > opens
> > > and
> > > closes. When a file is closed by an application, the client
> > > flushes
> > > any
> > > cached changes to the server. When a file is opened, the client
> > > ignores
> > > any cache time remaining (if the file data are cached) and makes
> > > an
> > > explicit GETATTR call to the server to check the file
> > > modification
> > > time."
> > >
> > > > The close-to-open model assumes that the file is only being
> > > > modified by
> > > > one client at a time and it assumes that file contents may be
> > > > cached
> > > > while an application is holding it open.
> > > > The point checks exist in order to detect if the file is being
> > > > changed
> > > > when the file is not open.
> > > >
> > > > Linux does not have a per-application cache. It has a page
> > > > cache
> > > > that
> > > > is shared among all applications. It is impossible for two
> > > > applications
> > > > to open the same file using buffered I/O, and yet see different
> > > > contents.
> > >
> > > Right, so based on the descriptions like the one above, I would
> > > have
> > > expected both applications to see new data at that point.
> >
> > Why? That would be a clear violation of the close-to-open rule that
> > nobody else can write to the file while it is open.
> >
>
> Is the rule
> A - "it is not permitted for any other application/client to write
> to
> the file while another has it open"
> or
> B - "it is not expected for any other application/client to write to
> the file while another has it open"
>
> I think B, because A is clearly not enforced. That suggests that
> there
> is no *need* to check for changes, but equally there is no barrier to
> checking for changes. So the fact that one application has the file
> open should not prevent a check when another application opens the
> file.
> Equally it should not prevent a flush when some other application
> closes
> the file.
>
> It is somewhat weird that if an application on one client misbehaves
> by
> keeping a file open, that will prevent other applications on the same
> client from seeing non-local changes, but will not prevent
> applications
> on other clients from seeing any changes.
>
> NeilBrown

No. What you propose is to optimise for a fringe case, which we cannot
guarantee will work anyway. I'd much rather optimise for the common
case, which is the only case with predictable semantics.

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]

2021-08-04 00:18:04

by Trond Myklebust

Subject: Re: cto changes for v4 atomic open

On Wed, 2021-08-04 at 00:00 +0000, Trond Myklebust wrote:
> On Wed, 2021-08-04 at 09:47 +1000, NeilBrown wrote:
> > On Wed, 04 Aug 2021, Trond Myklebust wrote:
> > > On Tue, 2021-08-03 at 17:36 -0400, J. Bruce Fields wrote:
> > > > On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust
> > > > wrote:
> > > > > On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote:
> > > > > > On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust
> > > > > > wrote:
> > > > > > > On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington
> > > > > > > wrote:
> > > > > > > > I have some folks unhappy about behavior changes after:
> > > > > > > > 479219218fbe
> > > > > > > > NFS:
> > > > > > > > Optimise away the close-to-open GETATTR when we have
> > > > > > > > NFSv4
> > > > > > > > OPEN
> > > > > > > >
> > > > > > > > Before this change, a client holding a RO open would
> > > > > > > > invalidate
> > > > > > > > the
> > > > > > > > pagecache when doing a second RW open.
> > > > > > > >
> > > > > > > > Now the client doesn't invalidate the pagecache, though
> > > > > > > > technically
> > > > > > > > it could
> > > > > > > > because we see a changeattr update on the RW OPEN
> > > > > > > > response.
> > > > > > > >
> > > > > > > > I feel this is a grey area in CTO if we're already
> > > > > > > > holding an
> > > > > > > > open.
> > > > > > > > Do we
> > > > > > > > know how the client ought to behave in this case?
> > > > > > > > Should
> > > > > > > > the
> > > > > > > > client's open
> > > > > > > > upgrade to RW invalidate the pagecache?
> > > > > > > >
> > > > > > >
> > > > > > > It's not a "grey area in close-to-open" at all. It is
> > > > > > > very
> > > > > > > cut
> > > > > > > and
> > > > > > > dried.
> > > > > > >
> > > > > > > If you need to invalidate your page cache while the file
> > > > > > > is
> > > > > > > open,
> > > > > > > then
> > > > > > > by definition you are in a situation where there is a
> > > > > > > write
> > > > > > > by
> > > > > > > another
> > > > > > > client going on while you are reading. You're clearly not
> > > > > > > doing
> > > > > > > close-
> > > > > > > to-open.
> > > > > >
> > > > > > Documentation is really unclear about this case. Every
> > > > > > definition of
> > > > > > close-to-open that I've seen says that it requires a cache
> > > > > > consistency
> > > > > > check on every application open. I've never seen one that
> > > > > > says
> > > > > > "on
> > > > > > every open that doesn't overlap with an already-existing
> > > > > > open
> > > > > > on
> > > > > > that
> > > > > > client".
> > > > > >
> > > > > > They *usually* also preface that by saying that this is
> > > > > > motivated
> > > > > > by
> > > > > > the
> > > > > > use case where opens don't overlap. But it's never made
> > > > > > clear
> > > > > > that
> > > > > > that's part of the definition.
> > > > > >
> > > > >
> > > > > I'm not following your logic.
> > > >
> > > > It's just a question of what every source I can find says
> > > > close-
> > > > to-
> > > > open
> > > > means. E.g., NFS Illustrated, p. 248, "Close-to-open
> > > > consistency
> > > > provides a guarantee of cache consistency at the level of file
> > > > opens
> > > > and
> > > > closes. When a file is closed by an application, the client
> > > > flushes
> > > > any
> > > > cached changes to the server. When a file is opened, the client
> > > > ignores
> > > > any cache time remaining (if the file data are cached) and
> > > > makes
> > > > an
> > > > explicit GETATTR call to the server to check the file
> > > > modification
> > > > time."
> > > >
> > > > > The close-to-open model assumes that the file is only being
> > > > > modified by
> > > > > one client at a time and it assumes that file contents may be
> > > > > cached
> > > > > while an application is holding it open.
> > > > > The point checks exist in order to detect if the file is
> > > > > being
> > > > > changed
> > > > > when the file is not open.
> > > > >
> > > > > Linux does not have a per-application cache. It has a page
> > > > > cache
> > > > > that
> > > > > is shared among all applications. It is impossible for two
> > > > > applications
> > > > > to open the same file using buffered I/O, and yet see
> > > > > different
> > > > > contents.
> > > >
> > > > Right, so based on the descriptions like the one above, I would
> > > > have
> > > > expected both applications to see new data at that point.
> > >
> > > Why? That would be a clear violation of the close-to-open rule
> > > that
> > > nobody else can write to the file while it is open.
> > >
> >
> > Is the rule
> > A - "it is not permitted for any other application/client to write
> > to
> > the file while another has it open"
> > or
> > B - "it is not expected for any other application/client to write
> > to
> > the file while another has it open"
> >
> > I think B, because A is clearly not enforced. That suggests that
> > there
> > is no *need* to check for changes, but equally there is no barrier
> > to
> > checking for changes. So the fact that one application has the
> > file
> > open should not prevent a check when another application opens the
> > file.
> > Equally it should not prevent a flush when some other application
> > closes
> > the file.
> >
> > It is somewhat weird that if an application on one client
> > misbehaves
> > by
> > keeping a file open, that will prevent other applications on the
> > same
> > client from seeing non-local changes, but will not prevent
> > applications
> > on other clients from seeing any changes.
> >
> > NeilBrown
>
> No. What you propose is to optimise for a fringe case, which we
> cannot
> guarantee will work anyway. I'd much rather optimise for the common
> case, which is the only case with predictable semantics.
>

The point is that we do support uncached I/O (a.k.a. O_DIRECT)
precisely for the cases where users care about the difference in the
above two scenarios. Why should we break cached I/O just because of FUD?

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]

2021-08-04 01:45:49

by NeilBrown

Subject: Re: cto changes for v4 atomic open

On Wed, 04 Aug 2021, Trond Myklebust wrote:
>
> No. What you propose is to optimise for a fringe case, which we cannot
> guarantee will work anyway. I'd much rather optimise for the common
> case, which is the only case with predictable semantics.
>

"predictable"??

As I understand it (I haven't examined the code) the current semantics
includes:
If a file is open for read, some other client changed the file, and the
file is then opened, then the second open might see new data, or might
see old data, depending on whether the requested data is still in
cache or not.

I find this to be less predictable than the easy-to-understand semantics
that Bruce has quoted:
- revalidate on every open, flush on every close
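
The difference between the two behaviours can be sketched with a toy model. It is an illustration under stated assumptions rather than the actual client code (which, as noted, was not examined): `Server`, `Client`, the `revalidate_every_open` flag, and the two-page file are all hypothetical, and attribute-cache timeouts are ignored.

```python
class Server:
    """Hypothetical server holding one file as a list of pages."""

    def __init__(self, pages):
        self.pages = list(pages)
        self.change = 0  # stands in for the change attribute

    def write_page(self, index, data):
        self.pages[index] = data
        self.change += 1


class Client:
    """One client; all of its applications share a single page cache."""

    def __init__(self, server, revalidate_every_open):
        self.server = server
        self.revalidate_every_open = revalidate_every_open
        self.page_cache = {}       # page index -> data, shared by all opens
        self.cached_change = None  # change value the cache was checked against
        self.open_count = 0

    def open(self):
        # Either check on every open, or only when no application on this
        # client already holds the file open.
        if self.revalidate_every_open or self.open_count == 0:
            if self.server.change != self.cached_change:
                self.page_cache.clear()
                self.cached_change = self.server.change
        self.open_count += 1

    def close(self):
        self.open_count -= 1

    def read_page(self, index):
        # Cached pages are served locally; missing pages come from the server.
        if index not in self.page_cache:
            self.page_cache[index] = self.server.pages[index]
        return self.page_cache[index]


def second_open_view(revalidate_every_open):
    server = Server(["old0", "old1"])
    client = Client(server, revalidate_every_open)
    client.open()                 # first application opens for read
    client.read_page(0)           # only page 0 ends up in the cache
    server.write_page(0, "new0")  # some other client changes the file
    server.write_page(1, "new1")
    client.open()                 # second application opens the file
    return [client.read_page(0), client.read_page(1)]


skip_check = second_open_view(revalidate_every_open=False)
always_check = second_open_view(revalidate_every_open=True)
```

In this model, skipping the check leaves the second open with a mix of old and new pages that depends on what happened to be cached, while checking on every open gives it the data that was current at the moment of that open.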

I'm not suggesting we optimize for fringe cases; I'm suggesting we provide
semantics that are simple, documented, and predictable.

Thanks,
NeilBrown

2021-08-04 01:45:58

by Trond Myklebust

Subject: Re: cto changes for v4 atomic open

On Wed, 2021-08-04 at 10:57 +1000, NeilBrown wrote:
> On Wed, 04 Aug 2021, Trond Myklebust wrote:
> >
> > No. What you propose is to optimise for a fringe case, which we
> > cannot
> > guarantee will work anyway. I'd much rather optimise for the common
> > case, which is the only case with predictable semantics.
> >
>
> "predictable"??
>
> As I understand it (I haven't examined the code) the current
> semantics
> includes:
> If a file is open for read, some other client changed the file, and
> the
> file is then opened, then the second open might see new data, or
> might
> see old data, depending on whether the requested data is still in
> cache or not.
>
> I find this to be less predictable than the easy-to-understand
> semantics
> that Bruce has quoted:
> - revalidate on every open, flush on every close
>
> I'm not suggesting we optimize for fringe cases, I'm suggesting we
> provide
> semantics that are simple, documented, and predictable.
>

"Predictable" how?

This is cached I/O. By definition, it is allowed to do things like
readahead, writeback caching, metadata caching. What you're proposing
is to optimise for a case that breaks all of the above. What's the
point? We might just as well throw in the towel and just make uncached
I/O and 'noac' mounts the default.

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]

2021-08-04 01:52:44

by J. Bruce Fields

[permalink] [raw]

Subject: Re: cto changes for v4 atomic open

On Wed, Aug 04, 2021 at 01:03:58AM +0000, Trond Myklebust wrote:
> On Wed, 2021-08-04 at 10:57 +1000, NeilBrown wrote:
> > On Wed, 04 Aug 2021, Trond Myklebust wrote:
> > >
> > > No. What you propose is to optimise for a fringe case, which we
> > > cannot
> > > guarantee will work anyway. I'd much rather optimise for the common
> > > case, which is the only case with predictable semantics.
> > >
> >
> > "predictable"??
> >
> > As I understand it (I haven't examined the code) the current
> > semantics
> > includes:
> > If a file is open for read, some other client changed the file, and
> > the
> > file is then opened, then the second open might see new data, or
> > might
> > see old data, depending on whether the requested data is still in
> > cache or not.
> >
> > I find this to be less predictable than the easy-to-understand
> > semantics
> > that Bruce has quoted:
> > - revalidate on every open, flush on every close
> >
> > I'm not suggesting we optimize for fringe cases, I'm suggesting we
> > provide
> > semantics that are simple, documented, and predictable.
> >
>
> "Predictable" how?
>
> This is cached I/O. By definition, it is allowed to do things like
> readahead, writeback caching, metadata caching. What you're proposing
> is to optimise for a case that breaks all of the above. What's the
> point? We might just as well throw in the towel and just make uncached
> I/O and 'noac' mounts the default.

It's possible to revalidate on every open and also still do readahead,
writeback caching, and metadata caching.

--b.

2021-08-04 01:53:28

by Trond Myklebust

[permalink] [raw]

Subject: Re: cto changes for v4 atomic open

On Tue, 2021-08-03 at 21:16 -0400, [email protected] wrote:
> On Wed, Aug 04, 2021 at 01:03:58AM +0000, Trond Myklebust wrote:
> > On Wed, 2021-08-04 at 10:57 +1000, NeilBrown wrote:
> > > On Wed, 04 Aug 2021, Trond Myklebust wrote:
> > > >
> > > > No. What you propose is to optimise for a fringe case, which we
> > > > cannot
> > > > guarantee will work anyway. I'd much rather optimise for the
> > > > common
> > > > case, which is the only case with predictable semantics.
> > > >
> > >
> > > "predictable"??
> > >
> > > As I understand it (I haven't examined the code) the current
> > > semantics
> > > includes:
> > > If a file is open for read, some other client changed the file,
> > > and
> > > the
> > > file is then opened, then the second open might see new data,
> > > or
> > > might
> > > see old data, depending on whether the requested data is still
> > > in
> > > cache or not.
> > >
> > > I find this to be less predictable than the easy-to-understand
> > > semantics
> > > that Bruce has quoted:
> > > - revalidate on every open, flush on every close
> > >
> > > I'm not suggesting we optimize for fringe cases, I'm suggesting we
> > > provide
> > > semantics that are simple, documented, and predictable.
> > >
> >
> > "Predictable" how?
> >
> > This is cached I/O. By definition, it is allowed to do things like
> > readahead, writeback caching, metadata caching. What you're
> > proposing
> > is to optimise for a case that breaks all of the above. What's the
> > point? We might just as well throw in the towel and just make
> > uncached
> > I/O and 'noac' mounts the default.
>
> It's possible to revalidate on every open and also still do
> readahead,
> writeback caching, and metadata caching.
>

Sure. It is also possible to revalidate on every read, every write and
every metadata operation. That's not the point.

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]

2021-08-04 01:54:07

by NeilBrown

[permalink] [raw]

Subject: Re: cto changes for v4 atomic open

On Wed, 04 Aug 2021, Trond Myklebust wrote:
> On Wed, 2021-08-04 at 10:57 +1000, NeilBrown wrote:
> > On Wed, 04 Aug 2021, Trond Myklebust wrote:
> > >
> > > No. What you propose is to optimise for a fringe case, which we
> > > cannot
> > > guarantee will work anyway. I'd much rather optimise for the common
> > > case, which is the only case with predictable semantics.
> > >
> >
> > "predictable"??
> >
> > As I understand it (I haven't examined the code) the current
> > semantics
> > includes:
> > If a file is open for read, some other client changed the file, and
> > the
> > file is then opened, then the second open might see new data, or
> > might
> > see old data, depending on whether the requested data is still in
> > cache or not.
> >
> > I find this to be less predictable than the easy-to-understand
> > semantics
> > that Bruce has quoted:
> > - revalidate on every open, flush on every close
> >
> > I'm not suggesting we optimize for fringe cases, I'm suggesting we
> > provide
> > semantics that are simple, documented, and predictable.
> >
>
> "Predictable" how?
>
> This is cached I/O. By definition, it is allowed to do things like
> readahead, writeback caching, metadata caching. What you're proposing
> is to optimise for a case that breaks all of the above. What's the
> point? We might just as well throw in the towel and just make uncached
> I/O and 'noac' mounts the default.

How are readahead and other caching broken? Indeed, how are they even
predictable? Caching is almost by definition best-effort. Read
requests may, or may not, be served from read-ahead data. Writes may be
written back sooner or later. Various system-load factors can affect
this. You can never predict that a cache *will* be used.

"revalidate on every open, flush on every close" (in the absence of
delegations of course) provides access to the only element of cache
behaviour that *can* be predictable: the times when it *won't* be used.

NeilBrown

2021-08-04 01:57:22

by Trond Myklebust

[permalink] [raw]

Subject: Re: cto changes for v4 atomic open

On Wed, 2021-08-04 at 11:30 +1000, NeilBrown wrote:
> On Wed, 04 Aug 2021, Trond Myklebust wrote:
> > On Wed, 2021-08-04 at 10:57 +1000, NeilBrown wrote:
> > > On Wed, 04 Aug 2021, Trond Myklebust wrote:
> > > >
> > > > No. What you propose is to optimise for a fringe case, which we
> > > > cannot
> > > > guarantee will work anyway. I'd much rather optimise for the
> > > > common
> > > > case, which is the only case with predictable semantics.
> > > >
> > >
> > > "predictable"??
> > >
> > > As I understand it (I haven't examined the code) the current
> > > semantics
> > > includes:
> > > If a file is open for read, some other client changed the file,
> > > and
> > > the
> > > file is then opened, then the second open might see new data,
> > > or
> > > might
> > > see old data, depending on whether the requested data is still
> > > in
> > > cache or not.
> > >
> > > I find this to be less predictable than the easy-to-understand
> > > semantics
> > > that Bruce has quoted:
> > > - revalidate on every open, flush on every close
> > >
> > > I'm not suggesting we optimize for fringe cases, I'm suggesting we
> > > provide
> > > semantics that are simple, documented, and predictable.
> > >
> >
> > "Predictable" how?
> >
> > This is cached I/O. By definition, it is allowed to do things like
> > readahead, writeback caching, metadata caching. What you're
> > proposing
> > is to optimise for a case that breaks all of the above. What's the
> > point? We might just as well throw in the towel and just make
> > uncached
> > I/O and 'noac' mounts the default.
>
> How are readahead and other caching broken? Indeed, how are they
> even
> predictable? Caching is almost by definition best-effort. Read
> requests may, or may not, be served from read-ahead data. Writes
> may be
> written back sooner or later. Various system-load factors can affect
> this. You can never predict that a cache *will* be used.
>

Caching is not a "best effort" attempt. The client is expected to provide
a perfect reproduction of the data stored on the server in the case
where there is no close-to-open violation.
In the case where there are close-to-open violations then there are two
cases:

1. The user cares, and is using uncached I/O together with a
synchronisation protocol in order to mitigate any data+metadata
discrepancies between the client and server.
2. The user doesn't care, and we're in the standard buffered I/O
case.

Why are you and Bruce insisting that case (2) needs to be treated as
special?

> "revalidate on every open, flush on every close" (in the absence of
> delegations of course) provides access to the only element of cache
> behaviour that *can* be predictable: the times when it *won't* be
> used.
>

No. ...and the very fact you had to qualify the above with "in the
absence of delegations" proves my point.

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]

2021-08-04 01:58:54

by Matt Benjamin

[permalink] [raw]

Subject: Re: cto changes for v4 atomic open

I think it is how close-to-open has been traditionally understood. I
do not believe that close-to-open in any way implies a single writer,
rather it sets the consistency expectation for all readers.

Matt

On Tue, Aug 3, 2021 at 5:36 PM [email protected]
<[email protected]> wrote:
>
> On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote:
> > On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote:
> > > On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust wrote:
> > > > On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington wrote:
> > > > > I have some folks unhappy about behavior changes after:
> > > > > 479219218fbe
> > > > > NFS:
> > > > > Optimise away the close-to-open GETATTR when we have NFSv4 OPEN
> > > > >
> > > > > Before this change, a client holding a RO open would invalidate
> > > > > the
> > > > > pagecache when doing a second RW open.
> > > > >
> > > > > Now the client doesn't invalidate the pagecache, though
> > > > > technically
> > > > > it could
> > > > > because we see a changeattr update on the RW OPEN response.
> > > > >
> > > > > I feel this is a grey area in CTO if we're already holding an
> > > > > open.
> > > > > Do we
> > > > > know how the client ought to behave in this case? Should the
> > > > > client's open
> > > > > upgrade to RW invalidate the pagecache?
> > > > >
> > > >
> > > > It's not a "grey area in close-to-open" at all. It is very cut and
> > > > dried.
> > > >
> > > > If you need to invalidate your page cache while the file is open,
> > > > then
> > > > by definition you are in a situation where there is a write by
> > > > another
> > > > client going on while you are reading. You're clearly not doing
> > > > close-
> > > > to-open.
> > >
> > > Documentation is really unclear about this case. Every definition of
> > > close-to-open that I've seen says that it requires a cache
> > > consistency
> > > check on every application open. I've never seen one that says "on
> > > every open that doesn't overlap with an already-existing open on that
> > > client".
> > >
> > > They *usually* also preface that by saying that this is motivated by
> > > the
> > > use case where opens don't overlap. But it's never made clear that
> > > that's part of the definition.
> > >
> >
> > I'm not following your logic.
>
> It's just a question of what every source I can find says close-to-open
> means. E.g., NFS Illustrated, p. 248, "Close-to-open consistency
> provides a guarantee of cache consistency at the level of file opens and
> closes. When a file is closed by an application, the client flushes any
> cached changes to the server. When a file is opened, the client ignores
> any cache time remaining (if the file data are cached) and makes an
> explicit GETATTR call to the server to check the file modification
> time."
>
> > The close-to-open model assumes that the file is only being modified by
> > one client at a time and it assumes that file contents may be cached
> > while an application is holding it open.
> > The point checks exist in order to detect if the file is being changed
> > when the file is not open.
> >
> > Linux does not have a per-application cache. It has a page cache that
> > is shared among all applications. It is impossible for two applications
> > to open the same file using buffered I/O, and yet see different
> > contents.
>
> Right, so based on the descriptions like the one above, I would have
> expected both applications to see new data at that point.
>
> Maybe that's not practical to implement. It'd be nice at least if that
> was explicit in the documentation.
>
> --b.
>

--

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel. 734-821-5101
fax. 734-769-8938
cel. 734-216-5309

2021-08-04 02:01:49

by Matt Benjamin

[permalink] [raw]

Subject: Re: cto changes for v4 atomic open

(who have performed an open)

On Tue, Aug 3, 2021 at 9:43 PM Matt Benjamin <[email protected]> wrote:
>
> I think it is how close-to-open has been traditionally understood. I
> do not believe that close-to-open in any way implies a single writer,
> rather it sets the consistency expectation for all readers.
>
> Matt
>
> On Tue, Aug 3, 2021 at 5:36 PM [email protected]
> <[email protected]> wrote:
> >
> > On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote:
> > > On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote:
> > > > On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust wrote:
> > > > > On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington wrote:
> > > > > > I have some folks unhappy about behavior changes after:
> > > > > > 479219218fbe
> > > > > > NFS:
> > > > > > Optimise away the close-to-open GETATTR when we have NFSv4 OPEN
> > > > > >
> > > > > > Before this change, a client holding a RO open would invalidate
> > > > > > the
> > > > > > pagecache when doing a second RW open.
> > > > > >
> > > > > > Now the client doesn't invalidate the pagecache, though
> > > > > > technically
> > > > > > it could
> > > > > > because we see a changeattr update on the RW OPEN response.
> > > > > >
> > > > > > I feel this is a grey area in CTO if we're already holding an
> > > > > > open.
> > > > > > Do we
> > > > > > know how the client ought to behave in this case? Should the
> > > > > > client's open
> > > > > > upgrade to RW invalidate the pagecache?
> > > > > >
> > > > >
> > > > > It's not a "grey area in close-to-open" at all. It is very cut and
> > > > > dried.
> > > > >
> > > > > If you need to invalidate your page cache while the file is open,
> > > > > then
> > > > > by definition you are in a situation where there is a write by
> > > > > another
> > > > > client going on while you are reading. You're clearly not doing
> > > > > close-
> > > > > to-open.
> > > >
> > > > Documentation is really unclear about this case. Every definition of
> > > > close-to-open that I've seen says that it requires a cache
> > > > consistency
> > > > check on every application open. I've never seen one that says "on
> > > > every open that doesn't overlap with an already-existing open on that
> > > > client".
> > > >
> > > > They *usually* also preface that by saying that this is motivated by
> > > > the
> > > > use case where opens don't overlap. But it's never made clear that
> > > > that's part of the definition.
> > > >
> > >
> > > I'm not following your logic.
> >
> > It's just a question of what every source I can find says close-to-open
> > means. E.g., NFS Illustrated, p. 248, "Close-to-open consistency
> > provides a guarantee of cache consistency at the level of file opens and
> > closes. When a file is closed by an application, the client flushes any
> > cached changes to the server. When a file is opened, the client ignores
> > any cache time remaining (if the file data are cached) and makes an
> > explicit GETATTR call to the server to check the file modification
> > time."
> >
> > > The close-to-open model assumes that the file is only being modified by
> > > one client at a time and it assumes that file contents may be cached
> > > while an application is holding it open.
> > > The point checks exist in order to detect if the file is being changed
> > > when the file is not open.
> > >
> > > Linux does not have a per-application cache. It has a page cache that
> > > is shared among all applications. It is impossible for two applications
> > > to open the same file using buffered I/O, and yet see different
> > > contents.
> >
> > Right, so based on the descriptions like the one above, I would have
> > expected both applications to see new data at that point.
> >
> > Maybe that's not practical to implement. It'd be nice at least if that
> > was explicit in the documentation.
> >
> > --b.
> >
>
>
> --
>
> Matt Benjamin
> Red Hat, Inc.
> 315 West Huron Street, Suite 140A
> Ann Arbor, Michigan 48103
>
> http://www.redhat.com/en/technologies/storage
>
> tel. 734-821-5101
> fax. 734-769-8938
> cel. 734-216-5309

--

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel. 734-821-5101
fax. 734-769-8938
cel. 734-216-5309

2021-08-04 03:20:23

by Trond Myklebust

[permalink] [raw]

Subject: Re: cto changes for v4 atomic open

On Tue, 2021-08-03 at 21:51 -0400, Matt Benjamin wrote:
> (who have performed an open)
>
> On Tue, Aug 3, 2021 at 9:43 PM Matt Benjamin <[email protected]>
> wrote:
> >
> > I think it is how close-to-open has been traditionally understood.
> > I
> > do not believe that close-to-open in any way implies a single
> > writer,
> > rather it sets the consistency expectation for all readers.
> >

OK. I'll bite, despite the obvious troll-bait...

close-to-open implies a single writer because it is impossible to
guarantee ordering semantics in RPC. You could, in theory, do so by
serialising on the client, but none of us do that because we care about
performance.

If you don't serialise between clients, then it is trivial (and I'm
seriously tired of people who whine about this) to reproduce reads to
file areas that have not been fully synced to the server, despite
having data on the client that is writing. i.e. the reader sees holes
that never existed on the client that wrote the data.
The reason is that the writes got re-ordered en route to the server,
and so reads to the areas that have not yet been filled are showing up
as holes.

So, no, the close-to-open semantics definitely apply to both readers
and writers.
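
A toy model of the reordering effect described above, under stated
assumptions: the ServerFile class, the replay helper, and the 4-byte
block size are all hypothetical, and each list element stands in for one
WRITE arriving at the server. It only shows why a reader on another
client can observe a zero-filled hole that the writing client never had.

```python
# Toy model: names and block size are illustrative assumptions.

BLOCK = 4


class ServerFile:
    def __init__(self):
        self.data = bytearray()

    def write(self, offset, payload):
        end = offset + len(payload)
        if end > len(self.data):
            # Extending the file zero-fills any gap, like a sparse file.
            self.data.extend(b"\0" * (end - len(self.data)))
        self.data[offset:end] = payload

    def read(self):
        return bytes(self.data)


def replay(arrival_order):
    """Apply the writer's three WRITEs in the given arrival order and
    record what a reader on another client would see after each one."""
    writes = [(i * BLOCK, bytes([ord("A") + i]) * BLOCK) for i in range(3)]
    server = ServerFile()
    seen = []
    for idx in arrival_order:
        server.write(*writes[idx])
        seen.append(server.read())
    return seen
```

With in-order arrival, replay([0, 1, 2]) never exposes a zero byte. With
replay([2, 0, 1]) the first snapshot is eight zero bytes followed by the
third block, even though the final contents are identical in both cases.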

> > Matt
> >
> > On Tue, Aug 3, 2021 at 5:36 PM [email protected]
> > <[email protected]> wrote:
> > >
> > > On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote:
> > > > On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote:
> > > > > On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust
> > > > > wrote:
> > > > > > On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington
> > > > > > wrote:
> > > > > > > I have some folks unhappy about behavior changes after:
> > > > > > > 479219218fbe
> > > > > > > NFS:
> > > > > > > Optimise away the close-to-open GETATTR when we have
> > > > > > > NFSv4 OPEN
> > > > > > >
> > > > > > > Before this change, a client holding a RO open would
> > > > > > > invalidate
> > > > > > > the
> > > > > > > pagecache when doing a second RW open.
> > > > > > >
> > > > > > > Now the client doesn't invalidate the pagecache, though
> > > > > > > technically
> > > > > > > it could
> > > > > > > because we see a changeattr update on the RW OPEN
> > > > > > > response.
> > > > > > >
> > > > > > > I feel this is a grey area in CTO if we're already
> > > > > > > holding an
> > > > > > > open.
> > > > > > > Do we
> > > > > > > know how the client ought to behave in this case? Should
> > > > > > > the
> > > > > > > client's open
> > > > > > > upgrade to RW invalidate the pagecache?
> > > > > > >
> > > > > >
> > > > > > It's not a "grey area in close-to-open" at all. It is very
> > > > > > cut and
> > > > > > dried.
> > > > > >
> > > > > > If you need to invalidate your page cache while the file is
> > > > > > open,
> > > > > > then
> > > > > > by definition you are in a situation where there is a write
> > > > > > by
> > > > > > another
> > > > > > client going on while you are reading. You're clearly not
> > > > > > doing
> > > > > > close-
> > > > > > to-open.
> > > > >
> > > > > Documentation is really unclear about this case. Every
> > > > > definition of
> > > > > close-to-open that I've seen says that it requires a cache
> > > > > consistency
> > > > > check on every application open. I've never seen one that
> > > > > says "on
> > > > > every open that doesn't overlap with an already-existing open
> > > > > on that
> > > > > client".
> > > > >
> > > > > They *usually* also preface that by saying that this is
> > > > > motivated by
> > > > > the
> > > > > use case where opens don't overlap. But it's never made
> > > > > clear that
> > > > > that's part of the definition.
> > > > >
> > > >
> > > > I'm not following your logic.
> > >
> > > It's just a question of what every source I can find says close-
> > > to-open
> > > means. E.g., NFS Illustrated, p. 248, "Close-to-open consistency
> > > provides a guarantee of cache consistency at the level of file
> > > opens and
> > > closes. When a file is closed by an application, the client
> > > flushes any
> > > cached changes to the server. When a file is opened, the client
> > > ignores
> > > any cache time remaining (if the file data are cached) and makes
> > > an
> > > explicit GETATTR call to the server to check the file
> > > modification
> > > time."
> > >
> > > > The close-to-open model assumes that the file is only being
> > > > modified by
> > > > one client at a time and it assumes that file contents may be
> > > > cached
> > > > while an application is holding it open.
> > > > The point checks exist in order to detect if the file is being
> > > > changed
> > > > when the file is not open.
> > > >
> > > > Linux does not have a per-application cache. It has a page
> > > > cache that
> > > > is shared among all applications. It is impossible for two
> > > > applications
> > > > to open the same file using buffered I/O, and yet see different
> > > > contents.
> > >
> > > Right, so based on the descriptions like the one above, I would
> > > have
> > > expected both applications to see new data at that point.
> > >
> > > Maybe that's not practical to implement. It'd be nice at least
> > > if that
> > > was explicit in the documentation.
> > >
> > > --b.
> > >
> >
> >
> > --
> >
> > Matt Benjamin
> > Red Hat, Inc.
> > 315 West Huron Street, Suite 140A
> > Ann Arbor, Michigan 48103
> >
> > http://www.redhat.com/en/technologies/storage
> >
> > tel. 734-821-5101
> > fax. 734-769-8938
> > cel. 734-216-5309
>
>
>

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]

2021-08-04 15:08:53

by Patrick Goetz

[permalink] [raw]

Subject: Re: cto changes for v4 atomic open

On 8/3/21 9:10 PM, Trond Myklebust wrote:
>
>
> On Tue, 2021-08-03 at 21:51 -0400, Matt Benjamin wrote:
>> (who have performed an open)
>>
>> On Tue, Aug 3, 2021 at 9:43 PM Matt Benjamin <[email protected]>
>> wrote:
>>>
>>> I think it is how close-to-open has been traditionally understood.
>>> I
>>> do not believe that close-to-open in any way implies a single
>>> writer,
>>> rather it sets the consistency expectation for all readers.
>>>
>
> OK. I'll bite, despite the obvious troll-bait...
>
>
> close-to-open implies a single writer because it is impossible to
> guarantee ordering semantics in RPC. You could, in theory, do so by
> serialising on the client, but none of us do that because we care about
> performance.
>
> If you don't serialise between clients, then it is trivial (and I'm
> seriously tired of people who whine about this) to reproduce reads to
> file areas that have not been fully synced to the server, despite
> having data on the client that is writing. i.e. the reader sees holes
> that never existed on the client that wrote the data.
> The reason is that the writes got re-ordered en route to the server,
> and so reads to the areas that have not yet been filled are showing up
> as holes.
>
> So, no, the close-to-open semantics definitely apply to both readers
> and writers.
>

So, I have a naive question. When a client is writing to cache, why
wouldn't it be possible to send an alert to the server indicating that
the file is being changed? The server would keep track of such files
(client cached, updated) and act accordingly, i.e. send a request to
the client to flush the cache for that file if another client is asking
to open the file. The process could be bookended by the client alerting
the server when the cached version has been fully synchronized with the
copy on the server so that the server wouldn't serve that file until the
synchronization is complete. The only problem I can see with this is the
client crashing or disconnecting before the file is fully written to the
server, but then some timeout condition could be set.
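
The idea above could be sketched roughly as follows. This is a purely
hypothetical protocol, not an existing NFS mechanism: TrackingServer,
CachingClient, notify_dirty and notify_clean are invented names, and the
crash or timeout handling mentioned above is deliberately left out.

```python
# Hypothetical sketch of the proposal; not an existing NFS mechanism.

class TrackingServer:
    def __init__(self):
        self.data = b""
        self.dirty_clients = set()   # clients that announced cached writes

    def notify_dirty(self, client):
        self.dirty_clients.add(client)

    def notify_clean(self, client, data):
        # The client has pushed its cached data; stop tracking it.
        self.data = data
        self.dirty_clients.discard(client)

    def open(self, opener):
        # Before serving the open, ask every other dirty client to flush.
        for client in list(self.dirty_clients):
            if client is not opener:
                client.flush()
        return self.data


class CachingClient:
    def __init__(self, server):
        self.server = server
        self.dirty = None            # locally cached, unflushed data

    def write(self, data):
        if self.dirty is None:
            self.server.notify_dirty(self)   # the "alert" to the server
        self.dirty = data

    def flush(self):
        if self.dirty is not None:
            self.server.notify_clean(self, self.dirty)
            self.dirty = None
```

In this sketch a second client calling server.open() forces the writer
to flush first, so the opener gets the writer's data. A real design
would also need the timeout for unreachable writers.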

>>> Matt
>>>
>>> On Tue, Aug 3, 2021 at 5:36 PM [email protected]
>>> <[email protected]> wrote:
>>>>
>>>> On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote:
>>>>> On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote:
>>>>>> On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust
>>>>>> wrote:
>>>>>>> On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington
>>>>>>> wrote:
>>>>>>>> I have some folks unhappy about behavior changes after:
>>>>>>>> 479219218fbe
>>>>>>>> NFS:
>>>>>>>> Optimise away the close-to-open GETATTR when we have
>>>>>>>> NFSv4 OPEN
>>>>>>>>
>>>>>>>> Before this change, a client holding a RO open would
>>>>>>>> invalidate
>>>>>>>> the
>>>>>>>> pagecache when doing a second RW open.
>>>>>>>>
>>>>>>>> Now the client doesn't invalidate the pagecache, though
>>>>>>>> technically
>>>>>>>> it could
>>>>>>>> because we see a changeattr update on the RW OPEN
>>>>>>>> response.
>>>>>>>>
>>>>>>>> I feel this is a grey area in CTO if we're already
>>>>>>>> holding an
>>>>>>>> open.
>>>>>>>> Do we
>>>>>>>> know how the client ought to behave in this case? Should
>>>>>>>> the
>>>>>>>> client's open
>>>>>>>> upgrade to RW invalidate the pagecache?
>>>>>>>>
>>>>>>>
>>>>>>> It's not a "grey area in close-to-open" at all. It is very
>>>>>>> cut and
>>>>>>> dried.
>>>>>>>
>>>>>>> If you need to invalidate your page cache while the file is
>>>>>>> open,
>>>>>>> then
>>>>>>> by definition you are in a situation where there is a write
>>>>>>> by
>>>>>>> another
>>>>>>> client going on while you are reading. You're clearly not
>>>>>>> doing
>>>>>>> close-
>>>>>>> to-open.
>>>>>>
>>>>>> Documentation is really unclear about this case. Every
>>>>>> definition of
>>>>>> close-to-open that I've seen says that it requires a cache
>>>>>> consistency
>>>>>> check on every application open. I've never seen one that
>>>>>> says "on
>>>>>> every open that doesn't overlap with an already-existing open
>>>>>> on that
>>>>>> client".
>>>>>>
>>>>>> They *usually* also preface that by saying that this is
>>>>>> motivated by
>>>>>> the
>>>>>> use case where opens don't overlap. But it's never made
>>>>>> clear that
>>>>>> that's part of the definition.
>>>>>>
>>>>>
>>>>> I'm not following your logic.
>>>>
>>>> It's just a question of what every source I can find says close-
>>>> to-open
>>>> means. E.g., NFS Illustrated, p. 248, "Close-to-open consistency
>>>> provides a guarantee of cache consistency at the level of file
>>>> opens and
>>>> closes. When a file is closed by an application, the client
>>>> flushes any
>>>> cached changes to the server. When a file is opened, the client
>>>> ignores
>>>> any cache time remaining (if the file data are cached) and makes
>>>> an
>>>> explicit GETATTR call to the server to check the file
>>>> modification
>>>> time."
>>>>
>>>>> The close-to-open model assumes that the file is only being
>>>>> modified by
>>>>> one client at a time and it assumes that file contents may be
>>>>> cached
>>>>> while an application is holding it open.
>>>>> The point checks exist in order to detect if the file is being
>>>>> changed
>>>>> when the file is not open.
>>>>>
>>>>> Linux does not have a per-application cache. It has a page
>>>>> cache that
>>>>> is shared among all applications. It is impossible for two
>>>>> applications
>>>>> to open the same file using buffered I/O, and yet see different
>>>>> contents.
>>>>
>>>> Right, so based on the descriptions like the one above, I would
>>>> have
>>>> expected both applications to see new data at that point.
>>>>
>>>> Maybe that's not practical to implement. It'd be nice at least
>>>> if that
>>>> was explicit in the documentation.
>>>>
>>>> --b.
>>>>
>>>
>>>
>>> --
>>>
>>> Matt Benjamin
>>> Red Hat, Inc.
>>> 315 West Huron Street, Suite 140A
>>> Ann Arbor, Michigan 48103
>>>
>>> http://www.redhat.com/en/technologies/storage
>>>
>>> tel. 734-821-5101
>>> fax. 734-769-8938
>>> cel. 734-216-5309
>>
>>
>>
>

2021-08-04 15:45:24

by Rick Macklem

[permalink] [raw]

Subject: Re: cto changes for v4 atomic open

Patrick Goetz wrote:
[stuff snipped]
>So, I have a naive question. When a client is writing to cache, why
>wouldn't it be possible to send an alert to the server indicating that
>the file is being changed? The server would keep track of such files
>(client cached, updated) and act accordingly, i.e. send a request to
>the client to flush the cache for that file if another client is asking
>to open the file. The process could be bookended by the client alerting
>the server when the cached version has been fully synchronized with the
>copy on the server so that the server wouldn't serve that file until the
>synchronization is complete. The only problem I can see with this is the
>client crashing or disconnecting before the file is fully written to the
>server, but then some timeout condition could be set.

Well, I wouldn't call this a naive question.

There is no notification mechanism defined for any version of NFS.

However, although it isn't exactly a notification per se, in NFSv4
a client can exclusively lock a byte range (all bytes if desired).
The limitation is that all clients have to "play the game" and
acquire byte range locks before doing I/O on the file.

I've always thought close-to-open consistency was sketchy
at best, and clients should use byte range locks if they care
about getting up-to-date file data for cases where other clients
might be writing the file.
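
A minimal sketch of the byte-range locking approach, assuming every
cooperating application wraps its I/O this way. The helper names
locked_update and locked_read are hypothetical; fcntl.lockf is the
standard Python wrapper for POSIX byte-range locks. How a particular NFS
client treats its cache when a lock is acquired or released is not
shown here and should be checked for that client.

```python
import fcntl
import os


def locked_update(path, offset, payload):
    """Write payload at offset while holding an exclusive whole-file lock."""
    fd = os.open(path, os.O_RDWR)
    try:
        # len=0 means "through end of file"; with start=0 this covers
        # every byte of the file.
        fcntl.lockf(fd, fcntl.LOCK_EX, 0, 0, os.SEEK_SET)
        os.pwrite(fd, payload, offset)
        os.fsync(fd)                  # push data out before unlocking
        fcntl.lockf(fd, fcntl.LOCK_UN, 0, 0, os.SEEK_SET)
    finally:
        os.close(fd)


def locked_read(path, offset, length):
    """Read length bytes at offset while holding a shared whole-file lock."""
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.lockf(fd, fcntl.LOCK_SH, 0, 0, os.SEEK_SET)
        data = os.pread(fd, length, offset)
        fcntl.lockf(fd, fcntl.LOCK_UN, 0, 0, os.SEEK_SET)
        return data
    finally:
        os.close(fd)
```

The locks are advisory, which is the limitation noted above: a client
that reads or writes without taking the lock is not held back by it.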

The FreeBSD client only implements close-to-open consistency
approximately. It uses cached attributes (which may not be up to
date) to re-validate cached data upon open syscalls and doesn't
worry about mtime clock resolution for NFSv3.
--> As such, the client will see data written by another client within
a bounded time, but not necessarily immediately after the writer
closes the file on another client.
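
The approximate revalidation described above can be modelled roughly as
below. This is a toy illustration, not FreeBSD code: AttrCachingClient,
attr_ttl, and the dict standing in for the server are all assumptions,
and time is passed in explicitly so the behaviour is deterministic.

```python
# Toy model: not FreeBSD code; all names are illustrative assumptions.

class AttrCachingClient:
    def __init__(self, server, attr_ttl):
        self.server = server          # dict with "mtime" and "data" keys
        self.attr_ttl = attr_ttl      # how long cached attributes are trusted
        self.attr_mtime = None
        self.attr_fetched_at = None
        self.data = None
        self.data_mtime = None

    def _attrs(self, now):
        # Reuse cached attributes until they are older than attr_ttl.
        if (self.attr_fetched_at is None
                or now - self.attr_fetched_at >= self.attr_ttl):
            self.attr_mtime = self.server["mtime"]
            self.attr_fetched_at = now
        return self.attr_mtime

    def open_and_read(self, now):
        mtime = self._attrs(now)      # may be stale: no forced GETATTR
        if self.data is None or self.data_mtime != mtime:
            self.data = self.server["data"]
            self.data_mtime = mtime
        return self.data
```

An open that happens while the cached attributes are still trusted
returns the old data; once attr_ttl has elapsed the next open notices
the new mtime, which is the "bounded time" behaviour.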
When I work on the FreeBSD NFS client, it always seems to come
down to "correctness vs good performance via caching" or
"how incorrect can I get away with" if you prefer.

rick, who chooses to not have an opinion w.r.t. how the Linux
NFS client should handle close-to-open consistency
ps: I just told Bruce I wasn't going to post, but...


2021-08-04 20:51:06

by Anna Schumaker

[permalink] [raw]

Subject: Re: cto changes for v4 atomic open

Hi Patrick,

On Wed, Aug 4, 2021 at 2:17 PM Patrick Goetz <[email protected]> wrote:
>
>
>
> On 8/3/21 9:10 PM, Trond Myklebust wrote:
> >
> >
> > On Tue, 2021-08-03 at 21:51 -0400, Matt Benjamin wrote:
> >> (who have performed an open)
> >>
> >> On Tue, Aug 3, 2021 at 9:43 PM Matt Benjamin <[email protected]>
> >> wrote:
> >>>
> >>> I think it is how close-to-open has been traditionally understood.
> >>> I
> >>> do not believe that close-to-open in any way implies a single
> >>> writer,
> >>> rather it sets the consistency expectation for all readers.
> >>>
> >
> > OK. I'll bite, despite the obvious troll-bait...
> >
> >
> > close-to-open implies a single writer because it is impossible to
> > guarantee ordering semantics in RPC. You could, in theory, do so by
> > serialising on the client, but none of us do that because we care about
> > performance.
> >
> > If you don't serialise between clients, then it is trivial (and I'm
> > seriously tired of people who whine about this) to reproduce reads to
> > file areas that have not been fully synced to the server, despite
> > having data on the client that is writing. i.e. the reader sees holes
> > that never existed on the client that wrote the data.
> > The reason is that the writes got re-ordered en route to the server,
> > and so reads to the areas that have not yet been filled are showing up
> > as holes.
> >
> > So, no, the close-to-open semantics definitely apply to both readers
> > and writers.
> >
>
> So, I have a naive question. When a client is writing to cache, why
> wouldn't it be possible to send an alert to the server indicating that
> the file is being changed. The server would keep track of such files
> (client cached, updated) and act accordingly; i.e. sending a request to
> the client to flush the cache for that file if another client is asking
> to open the file? The process could be bookended by the client alerting
> the server when the cached version has been fully synchronized with the
> copy on the server so that the server wouldn't serve that file until the
> synchronization is complete. The only problem I can see with this is the
> client crashing or disconnecting before the file is fully written to the
> server, but then some timeout condition could be set.

We already have this! What you're describing is almost exactly how
delegations work :)

Anna

2021-08-04 20:52:38

by Matt Benjamin

[permalink] [raw]

Subject: Re: cto changes for v4 atomic open

That was not intended as a troll, and I don't see why you would assume that.

Of course, what you're saying is correct, multiple writers are not
effectively synchronized by close-to-open, and I wasn't implying they
should be. Another 3rd (...) writer operating on the file is still
relevant to the consumers, regardless of whether they can achieve a
uniform view of the data.
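
For anyone following along, the reordering effect described in the quoted text
below can be shown with a toy model. This is an illustrative sketch only: the
function name and the representation of a WRITE as an (offset, data) pair are
assumptions, and unwritten regions are modelled as zero bytes.

```python
def server_views(writes, arrival_order):
    """Return the file contents a reader could see after each WRITE arrives.

    writes        -- list of (offset, data) in the order the client issued them
    arrival_order -- indices into writes, in the order the server applied them
    """
    buf = bytearray()
    views = []
    for idx in arrival_order:
        offset, data = writes[idx]
        end = offset + len(data)
        if end > len(buf):
            buf.extend(b"\0" * (end - len(buf)))  # a hole reads back as zeros
        buf[offset:end] = data
        views.append(bytes(buf))
    return views


writes = [(0, b"AAAA"), (4, b"BBBB")]
in_order = server_views(writes, [0, 1])
reordered = server_views(writes, [1, 0])
```

Here reordered[0] is four zero bytes followed by BBBB, a state that never
existed in the writing client's cache, while in_order never exposes a hole.
Both orders end with the same final contents.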

Matt

On Tue, Aug 3, 2021 at 10:11 PM Trond Myklebust <[email protected]> wrote:
>
>
>
> On Tue, 2021-08-03 at 21:51 -0400, Matt Benjamin wrote:
> > (who have performed an open)
> >
> > On Tue, Aug 3, 2021 at 9:43 PM Matt Benjamin <[email protected]>
> > wrote:
> > >
> > > I think it is how close-to-open has been traditionally understood.
> > > I
> > > do not believe that close-to-open in any way implies a single
> > > writer,
> > > rather it sets the consistency expectation for all readers.
> > >
>
> OK. I'll bite, despite the obvious troll-bait...
>
>
> close-to-open implies a single writer because it is impossible to
> guarantee ordering semantics in RPC. You could, in theory, do so by
> serialising on the client, but none of us do that because we care about
> performance.
>
> If you don't serialise between clients, then it is trivial (and I'm
> seriously tired of people who whine about this) to reproduce reads to
> file areas that have not been fully synced to the server, despite
> having data on the client that is writing. i.e. the reader sees holes
> that never existed on the client that wrote the data.
> The reason is that the writes got re-ordered en route to the server,
> and so reads to the areas that have not yet been filled are showing up
> as holes.
>
> So, no, the close-to-open semantics definitely apply to both readers
> and writers.
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> [email protected]
>
>

--

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel. 734-821-5101
fax. 734-769-8938
cel. 734-216-5309

2021-08-07 00:04:55

by Patrick Goetz

[permalink] [raw]

Subject: Re: cto changes for v4 atomic open

Hi -

I'm having trouble reconciling this comment:

On 8/4/21 1:24 PM, Anna Schumaker wrote:
>>
>> So, I have a naive question. When a client is writing to cache, why
>> wouldn't it be possible to send an alert to the server indicating that
>> the file is being changed. The server would keep track of such files
>> (client cached, updated) and act accordingly; i.e. sending a request to
>> the client to flush the cache for that file if another client is asking
>> to open the file? The process could be bookended by the client alerting
>> the server when the cached version has been fully synchronized with the
>> copy on the server so that the server wouldn't serve that file until the
>> synchronization is complete. The only problem I can see with this is the
>> client crashing or disconnecting before the file is fully written to the
>> server, but then some timeout condition could be set.
>
> We already have this! What you're describing is almost exactly how
> delegations work :)
>

with this one:

On 8/4/21 10:42 AM, Rick Macklem wrote:
>
> There is no notification mechanism defined for any version of NFS.

How can you do delegations if there's no notification system?

2021-08-07 01:04:09

by Rick Macklem

[permalink] [raw]

Subject: Re: cto changes for v4 atomic open

Patrick Goetz wrote:
>Hi -
>
>I'm having trouble reconciling this comment:
>
>On 8/4/21 1:24 PM, Anna Schumaker wrote:
>>>
>>> So, I have a naive question. When a client is writing to cache, why
>>> wouldn't it be possible to send an alert to the server indicating that
>>> the file is being changed. The server would keep track of such files
>>> (client cached, updated) and act accordingly; i.e. sending a request to
>>> the client to flush the cache for that file if another client is asking
>>> to open the file? The process could be bookended by the client alerting
>>> the server when the cached version has been fully synchronized with the
>>> copy on the server so that the server wouldn't serve that file until the
>>> synchronization is complete. The only problem I can see with this is the
>>> client crashing or disconnecting before the file is fully written to the
>>> server, but then some timeout condition could be set.
>>
>> We already have this! What you're describing is almost exactly how
>> delegations work :)
>>
>
>
>with this one:
>
>On 8/4/21 10:42 AM, Rick Macklem wrote:
> >
> > There is no notification mechanism defined for any version of NFS.
>
>
>How can you do delegations if there's no notification system?
When you asked the question, there was no mention of delegations
and only a discussion of caching. Delegations deal with Opens and,
yes, can be used to maintain consistent data caches when they happen
to be issued to client(s).

For write delegations, it works like this:
- When a client does an Open for writing, the server might choose to
issue a write delegation to the client. (It is not required to do so and
there is nothing a client can do to ensure that the server chooses to
do so. The only rule is "no callback path-->no delegation can be issued".)
- If the client happens to get a write delegation, then it can assume no
other client is reading or writing the file (unless the client fails to maintain
its lease, due to network partitioning or ???).
--> Therefore it can safely cache the file's data, unless the server allows
I/O to be done using special stateids. More on this later.
- If the server receives an Open request from another client for the file,
then it does a CB_RECALL callback to tell the client that it must return
the delegation.
--> The client can no longer safely cache file data once the delegation
is returned, since the server will then allow the other client to Open
the file.
--> If the client fails to return the delegation for a lease duration, then the
server can throw the delegation away.
--> If the client does not maintain its lease and maintain its callback
path, the client cannot safely cache data based on the delegation,
since it might have been discarded by the server.
In general, a delegation allows the client to do additional Opens on a file
without doing an Open on the server (called level 2 OpLocks in Windows world,
I think?).
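
The write delegation steps above can be sketched as a toy model. Everything
here is hypothetical (class names, method names, the "dirty" dictionary), the
recall is done synchronously for brevity, and lease expiry is left out. A real
server is also free to not issue a delegation even when this toy would.

```python
class ToyClient:
    def __init__(self, name, has_callback_path=True):
        self.name = name
        self.has_callback_path = has_callback_path
        self.delegations = set()
        self.dirty = {}  # filename -> data cached locally, not yet on server

    def cb_recall(self, server, filename):
        """Server callback: flush cached writes, then return the delegation."""
        if filename in self.dirty:
            server.files[filename] = self.dirty.pop(filename)
        self.delegations.discard(filename)
        server.delegreturn(self, filename)


class ToyServer:
    def __init__(self):
        self.files = {}
        self.write_deleg = {}  # filename -> client holding the delegation
        self.openers = {}      # filename -> set of client names

    def open(self, client, filename):
        """Handle an Open; return True if a write delegation was issued."""
        holder = self.write_deleg.get(filename)
        if holder is not None and holder is not client:
            holder.cb_recall(self, filename)  # conflicting Open -> recall
        names = self.openers.setdefault(filename, set())
        names.add(client.name)
        # "no callback path --> no delegation can be issued"
        if (client.has_callback_path and names == {client.name}
                and filename not in self.write_deleg):
            self.write_deleg[filename] = client
            client.delegations.add(filename)
            return True
        return False

    def delegreturn(self, client, filename):
        if self.write_deleg.get(filename) is client:
            del self.write_deleg[filename]
```

In this model client A gets a delegation on its first Open and may cache
writes locally. An Open from client B triggers the recall, A's cached data
reaches the server, and neither client holds a delegation afterwards.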

The effect of consistent data caches depends upon two things, which a server
might or might not do:
1 - Issue delegations.
2 - Not allow I/O using special stateids. If any client can do I/O using special
stateids, then the I/O can be done without having an Open or delegation for
the file on the server.
In general, a client cannot easily tell if these are the case. I suppose it could try
an I/O with a special stateid, but that really only confirms that this particular client
cannot do I/O with special stateids, not that no client can do so.
A client can see that it acquired a delegation, but can do nothing if it did not get one.
--> Is a client going to not cache data for every file where the server chooses not to
issue a delegation?

Back to your question. You can consider the CB_RECALL callback a notification, but it
is in a sense a notification to the client that the delegation must be returned, not that file data
has changed on the server. In other words, a CB_RECALL is done when another client
requests a conflicting Open, not when data on the server has been changed.
--> This has a similar effect to a notification that the data will change or has changed, but only if
the server requires that all I/O present an Open/Lock/Delegation stateid.
--> No special stateids allowed and no NFSv3 I/O allowed by the server.
(The term notification is used in the NFSv4 RFCs for other things, but not for CB_RECALL callbacks.)

rick

2021-08-09 04:22:41

by NeilBrown

[permalink] [raw]

Subject: Re: cto changes for v4 atomic open

On Wed, 04 Aug 2021, Trond Myklebust wrote:
> On Wed, 2021-08-04 at 11:30 +1000, NeilBrown wrote:
>
> Caching is not a "best effort" attempt. The client is expected to provide
> a perfect reproduction of the data stored on the server in the case
> where there is no close-to-open violation.
> In the case where there are close-to-open violations then there are two
> cases:
>
> 1. The user cares, and is using uncached I/O together with a
> synchronisation protocol in order to mitigate any data+metadata
> discrepancies between the client and server.
> 2. The user doesn't care, and we're in the standard buffered I/O
> case.
>
>
> Why are you and Bruce insisting that case (2) needs to be treated as
> special?

I don't see these as the relevant cases. They seem to assume that "the
user" is a single entity with a coherent opinion. I don't think that is
necessarily the case.

I think it best to focus on the behaviours of, and intentions behind,
individual applications. You said previously that NFS doesn't provide
caches for applications, only for whole clients. This is obviously true
but I think it misses an important point. While the cache belongs to
the whole client, the "open" and "close" are performed by individual
applications. close-to-open addresses what happens between a CLOSE and
an OPEN.

While it may be reasonable to accept that any application must depend on
correctness of any other application with write access to the file, it
doesn't necessarily follow that any application can only be correct when
all applications with read access are well behaved.

If an application arranges, through some external means, to only open a
file after all possible writing applications have closed it, then the NFS
caching should not get in the way of the application being able to read
anything that the other application(s) wrote. This, to me, is the core
of close-to-open consistency.

Another application writing concurrently may, of course, affect the read
results in an unpredictable way. However another application READING
concurrently should not affect an application which is carefully
serialised with any writers.
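
To make the scenario concrete, here is a toy model. It is not the Linux
client's code; the class names and the revalidate_when_already_open flag are
hypothetical. One client has a single cache shared by all of its applications.
Application R holds the file open, a writer on another client then writes and
closes, and only after that does application N open the file on the first
client.

```python
class ToyServer:
    def __init__(self, data):
        self.data = data
        self.change = 1  # stand-in for the change attribute

    def write(self, data):
        self.data = data
        self.change += 1


class ToyClient:
    def __init__(self, server, revalidate_when_already_open):
        self.server = server
        self.revalidate_when_already_open = revalidate_when_already_open
        self.open_count = 0
        self.cached_change = None
        self.page_cache = None  # shared by every application on this client

    def open(self):
        already_open = self.open_count > 0
        self.open_count += 1
        if already_open and not self.revalidate_when_already_open:
            return  # skip the consistency check, keep the cache as it is
        if self.cached_change != self.server.change:
            self.cached_change = self.server.change
            self.page_cache = None  # invalidate

    def read(self):
        if self.page_cache is None:
            self.page_cache = self.server.data
        return self.page_cache


def scenario(revalidate_when_already_open):
    server = ToyServer(b"old")
    client = ToyClient(server, revalidate_when_already_open)
    client.open()            # application R opens and keeps the file open
    client.read()            # R populates the shared cache
    server.write(b"new")     # writer elsewhere writes, then closes
    client.open()            # application N opens after the writer's close
    return client.read()     # what N sees
```

In this model scenario(True) gives N the new data, while scenario(False) gives
N the old data purely because R, a reader, still had the file open.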

Thanks,
NeilBrown

2021-08-09 14:40:00

by Trond Myklebust

[permalink] [raw]

Subject: Re: cto changes for v4 atomic open

On Mon, 2021-08-09 at 14:20 +1000, NeilBrown wrote:
> On Wed, 04 Aug 2021, Trond Myklebust wrote:
> > On Wed, 2021-08-04 at 11:30 +1000, NeilBrown wrote:
> >
> > Caching is not a "best effort" attempt. The client is expected to
> > provide
> > a perfect reproduction of the data stored on the server in the case
> > where there is no close-to-open violation.
> > In the case where there are close-to-open violations then there are
> > two
> > cases:
> >
> >    1. The user cares, and is using uncached I/O together with a
> >       synchronisation protocol in order to mitigate any
> > data+metadata
> >       discrepancies between the client and server.
> >    2. The user doesn't care, and we're in the standard buffered I/O
> >       case.
> >
> >
> > Why are you and Bruce insisting that case (2) needs to be treated
> > as
> > special?
>
> I don't see these as the relevant cases. They seem to assume that
> "the
> user" is a single entity with a coherent opinion. I don't think that
> is
> necessarily the case.
>
> I think it best to focus on the behaviours, and intentions behind,
> individual applications. You said previously that NFS doesn't
> provide
> caches for applications, only for whole clients. This is obviously
> true
> but I think it misses an important point. While the cache belongs to
> the whole client, the "open" and "close" are performed by individual
> applications. close-to-open addresses what happens between a CLOSE
> and
> an OPEN.
>
> While it may be reasonable to accept that any application must depend
> on
> correctness of any other application with write access to the file,
> it
> doesn't necessarily follow that any application can only be correct
> when
> all applications with read access are well behaved.
>
> If an application arranges, through some external means, to only open
> a
> file after all possible writing applications have closed it, then the
> NFS
> caching should not get in the way of the application being able to
> read
> anything that the other application(s) wrote. This, to me, is the
> core
> of close-to-open consistency.
>
> Another application writing concurrently may, of course, affect the
> read
> results in an unpredictable way. However another application READING
> concurrently should not affect an application which is carefully
> serialised with any writers.
>

That's a discussion we can have after Bruce and Chuck implement read
and write delegations that are always handed out when possible. Until
that's the case, there will be no changes made to the close-to-open
behaviour on the Linux NFSv4 client.

As for NFSv3, I don't see the above suggestion ever being implemented
in the Linux client because at this point, people deliberately choosing
NFSv3 are doing so almost exclusively for performance reasons.

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]

2021-08-09 14:46:03

by Chuck Lever III

[permalink] [raw]

Subject: Re: cto changes for v4 atomic open

> On Aug 9, 2021, at 10:22 AM, Trond Myklebust <[email protected]> wrote:
>
> That's a discussion we can have after Bruce and Chuck implement read
> and write delegations that are always handed out when possible.

I opened an enhancement request:

https://bugzilla.linux-nfs.org/show_bug.cgi?id=364

Feel free to add details or correct any naive assumptions.

--
Chuck Lever