Hey linux-nfs, and especially maintainers,
I'm still interested in working on a problem raised a couple weeks ago, but
confusion muddled that discussion and it died:
If the client holds a read delegation, it will skip revalidation of a dentry
in lookup. If the file was moved on the server, the client can end up with
two positive dentries in cache for the same inode, and the dentry that
doesn't exist on the server will never time out of the cache.
The client can detect this happening because the directory of the dentry
that should be revalidated updates it's change attribute. Skipping
revalidation is an optimization in the case we hold a delegation, but this
optimization should only be used when the delegation was obtained via a
lookup of the dentry we are currently revalidating.
Keeping the optimization might be done by tying the delegation to the
dentry. Lacking some (easy?) way to do that currently, it seems simpler to
remove the optimization altogether, and I will send a patch to remove it.
Any thoughts on this? Any response, even asserting that this is not something
we will fix are welcome.
Thanks,
Ben
On Tue, 2019-06-04 at 08:41 -0400, Benjamin Coddington wrote:
> Hey linux-nfs, and especially maintainers,
>
> I'm still interested in working on a problem raised a couple weeks
> ago, but
> confusion muddled that discussion and it died:
>
> If the client holds a read delegation, it will skip revalidation of a
> dentry
> in lookup. If the file was moved on the server, the client can end
> up with
> two positive dentries in cache for the same inode, and the dentry
> that
> doesn't exist on the server will never time out of the cache.
>
> The client can detect this happening because the directory of the
> dentry
> that should be revalidated updates it's change attribute. Skipping
> revalidation is an optimization in the case we hold a delegation, but
> this
> optimization should only be used when the delegation was obtained via
> a
> lookup of the dentry we are currently revalidating.
>
> Keeping the optimization might be done by tying the delegation to the
> dentry. Lacking some (easy?) way to do that currently, it seems
> simpler to
> remove the optimization altogether, and I will send a patch to remove
> it.
A delegation normally applies to the entire inode. It covers _all_
dentries that point to that inode too because create, rename and unlink
are always atomically accompanied by an inode change attribute.
IOW: The proposed restriction is both unnecessary and incorrect.
> Any thoughts on this? Any response, even asserting that this is not
> something
> we will fix are welcome.
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]
On 4 Jun 2019, at 8:56, Trond Myklebust wrote:
> On Tue, 2019-06-04 at 08:41 -0400, Benjamin Coddington wrote:
>> Hey linux-nfs, and especially maintainers,
>>
>> I'm still interested in working on a problem raised a couple weeks
>> ago, but
>> confusion muddled that discussion and it died:
>>
>> If the client holds a read delegation, it will skip revalidation of a
>> dentry
>> in lookup. If the file was moved on the server, the client can end
>> up with
>> two positive dentries in cache for the same inode, and the dentry
>> that
>> doesn't exist on the server will never time out of the cache.
>>
>> The client can detect this happening because the directory of the
>> dentry
>> that should be revalidated updates it's change attribute. Skipping
>> revalidation is an optimization in the case we hold a delegation, but
>> this
>> optimization should only be used when the delegation was obtained via
>> a
>> lookup of the dentry we are currently revalidating.
>>
>> Keeping the optimization might be done by tying the delegation to the
>> dentry. Lacking some (easy?) way to do that currently, it seems
>> simpler to
>> remove the optimization altogether, and I will send a patch to remove
>> it.
>
> A delegation normally applies to the entire inode. It covers _all_
> dentries that point to that inode too because create, rename and unlink
> are always atomically accompanied by an inode change attribute.
It should cover all dentries that point to that inode at the time the
delegation was handed out. Shouldn't dentries cached _before_ the
delegation be invalidated? The client doesn't currently care about the
order of dentries cached with respect to delegations.
> IOW: The proposed restriction is both unnecessary and incorrect.
But then I think: need to store that change attribute on the dentry instead
of what we currently use - a client-only monotonic counter. Then, we could
compare the delegation's change attr to the dentry's.
But that assumes they are both globally related -- that a directory's
change_attr on lookup relates to an inode's change attribute. I don't see
that anywhere (I'm looking in 7530)..
>> Any thoughts on this? Any response, even asserting that this is not
>> something
>> we will fix are welcome.
I think, what I am lacking (and I admit to have a tendency to become
fixated) is proper guidance on whether or not work on this front is
acceptable.
I am happy to work on this, but not if my time is wasted. Help!
Ben
On Tue, 2019-06-04 at 10:10 -0400, Benjamin Coddington wrote:
> On 4 Jun 2019, at 8:56, Trond Myklebust wrote:
>
> > On Tue, 2019-06-04 at 08:41 -0400, Benjamin Coddington wrote:
> > > Hey linux-nfs, and especially maintainers,
> > >
> > > I'm still interested in working on a problem raised a couple
> > > weeks
> > > ago, but
> > > confusion muddled that discussion and it died:
> > >
> > > If the client holds a read delegation, it will skip revalidation
> > > of a
> > > dentry
> > > in lookup. If the file was moved on the server, the client can
> > > end
> > > up with
> > > two positive dentries in cache for the same inode, and the dentry
> > > that
> > > doesn't exist on the server will never time out of the cache.
> > >
> > > The client can detect this happening because the directory of the
> > > dentry
> > > that should be revalidated updates it's change
> > > attribute. Skipping
> > > revalidation is an optimization in the case we hold a delegation,
> > > but
> > > this
> > > optimization should only be used when the delegation was obtained
> > > via
> > > a
> > > lookup of the dentry we are currently revalidating.
> > >
> > > Keeping the optimization might be done by tying the delegation to
> > > the
> > > dentry. Lacking some (easy?) way to do that currently, it seems
> > > simpler to
> > > remove the optimization altogether, and I will send a patch to
> > > remove
> > > it.
> >
> > A delegation normally applies to the entire inode. It covers _all_
> > dentries that point to that inode too because create, rename and
> > unlink
> > are always atomically accompanied by an inode change attribute.
>
> It should cover all dentries that point to that inode at the time the
> delegation was handed out. Shouldn't dentries cached _before_ the
> delegation be invalidated? The client doesn't currently care about
> the
> order of dentries cached with respect to delegations.
>
> > IOW: The proposed restriction is both unnecessary and incorrect.
>
> But then I think: need to store that change attribute on the dentry
> instead
> of what we currently use - a client-only monotonic counter. Then, we
> could
> compare the delegation's change attr to the dentry's.
>
> But that assumes they are both globally related -- that a directory's
> change_attr on lookup relates to an inode's change attribute. I
> don't see
> that anywhere (I'm looking in 7530)..
>
OK. Now I think I see what you are saying. This would the case that is
of interest:
* A directory contains a file "foo", which has a hardlink "bar". Our
client has both link names cached due to a previous set of lookups.
* Some other client changes the name of "bar" to "barbar" on the
server.
* Our client then opens "foo" and gets a delegation.
* Our client is then asked to open "bar", and succeeds, failing to
recognise that it has been renamed to "barbar".
Is that what you mean? That looks like it might happen with the current
code, and would indeed be a bug.
Actually, in the NFSv4.1 open-by-filehandle case, we might even see a
bug when "foo" is renamed on the server too.
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]
On 4 Jun 2019, at 10:53, Trond Myklebust wrote:
> On Tue, 2019-06-04 at 10:10 -0400, Benjamin Coddington wrote:
>> On 4 Jun 2019, at 8:56, Trond Myklebust wrote:
>>
>>> On Tue, 2019-06-04 at 08:41 -0400, Benjamin Coddington wrote:
>>>> Hey linux-nfs, and especially maintainers,
>>>>
>>>> I'm still interested in working on a problem raised a couple
>>>> weeks
>>>> ago, but
>>>> confusion muddled that discussion and it died:
>>>>
>>>> If the client holds a read delegation, it will skip revalidation
>>>> of a
>>>> dentry
>>>> in lookup. If the file was moved on the server, the client can
>>>> end
>>>> up with
>>>> two positive dentries in cache for the same inode, and the dentry
>>>> that
>>>> doesn't exist on the server will never time out of the cache.
>>>>
>>>> The client can detect this happening because the directory of the
>>>> dentry
>>>> that should be revalidated updates it's change
>>>> attribute. Skipping
>>>> revalidation is an optimization in the case we hold a delegation,
>>>> but
>>>> this
>>>> optimization should only be used when the delegation was obtained
>>>> via
>>>> a
>>>> lookup of the dentry we are currently revalidating.
>>>>
>>>> Keeping the optimization might be done by tying the delegation to
>>>> the
>>>> dentry. Lacking some (easy?) way to do that currently, it seems
>>>> simpler to
>>>> remove the optimization altogether, and I will send a patch to
>>>> remove
>>>> it.
>>>
>>> A delegation normally applies to the entire inode. It covers _all_
>>> dentries that point to that inode too because create, rename and
>>> unlink
>>> are always atomically accompanied by an inode change attribute.
>>
>> It should cover all dentries that point to that inode at the time the
>> delegation was handed out. Shouldn't dentries cached _before_ the
>> delegation be invalidated? The client doesn't currently care about
>> the
>> order of dentries cached with respect to delegations.
>>
>>> IOW: The proposed restriction is both unnecessary and incorrect.
>>
>> But then I think: need to store that change attribute on the dentry
>> instead
>> of what we currently use - a client-only monotonic counter. Then, we
>> could
>> compare the delegation's change attr to the dentry's.
>>
>> But that assumes they are both globally related -- that a directory's
>> change_attr on lookup relates to an inode's change attribute. I
>> don't see
>> that anywhere (I'm looking in 7530)..
>>
>
> OK. Now I think I see what you are saying. This would the case that is
> of interest:
>
> * A directory contains a file "foo", which has a hardlink "bar". Our
> client has both link names cached due to a previous set of lookups.
> * Some other client changes the name of "bar" to "barbar" on the
> server.
> * Our client then opens "foo" and gets a delegation.
> * Our client is then asked to open "bar", and succeeds, failing to
> recognise that it has been renamed to "barbar".
>
> Is that what you mean? That looks like it might happen with the current
> code, and would indeed be a bug.
Yes, that's the problem. The practical case that was reported to be hitting
it is when `mv` stats source and destination and finds they are the same
file.
> Actually, in the NFSv4.1 open-by-filehandle case, we might even see a
> bug when "foo" is renamed on the server too.
Ok, some relief that you agree this is a bug.
Some ideas for fixing it:
- change d_time to hold the directory's change_attr
from the server, stash that in the (unused?) struct delegation.change_attr
- git rid of the optimization.
- investigate (maybe heuristically discover) whether a directory's
change_attr is a global counter related to the inode's change_attr.
</hand waving>
At least now I can spend some time on it and not feel aimless, thanks for
the closer look.
Ben
On 4 Jun 2019, at 15:00, Benjamin Coddington wrote:
> At least now I can spend some time on it and not feel aimless, thanks for
> the closer look.
I am not finding a reliable way to fix this and retain the optimization. I
will send a patch to remove it.
Ben
On Mon, 2019-06-10 at 10:14 -0400, Benjamin Coddington wrote:
> On 4 Jun 2019, at 15:00, Benjamin Coddington wrote:
>
> > At least now I can spend some time on it and not feel aimless,
> > thanks for
> > the closer look.
>
> I am not finding a reliable way to fix this and retain the
> optimization. I
> will send a patch to remove it.
How about just moving the optimisation into the if
(nfs_check_verifier()) { } case? There should be no need to do the
nfs_lookup_verify_inode() dance if we hold a delegation. should there?
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]
On Tue, Jun 4, 2019 at 8:57 AM Trond Myklebust <[email protected]> wrote:
>
> On Tue, 2019-06-04 at 08:41 -0400, Benjamin Coddington wrote:
> > Hey linux-nfs, and especially maintainers,
> >
> > I'm still interested in working on a problem raised a couple weeks
> > ago, but
> > confusion muddled that discussion and it died:
> >
> > If the client holds a read delegation, it will skip revalidation of a
> > dentry
> > in lookup. If the file was moved on the server, the client can end
> > up with
> > two positive dentries in cache for the same inode, and the dentry
> > that
> > doesn't exist on the server will never time out of the cache.
> >
> > The client can detect this happening because the directory of the
> > dentry
> > that should be revalidated updates it's change attribute. Skipping
> > revalidation is an optimization in the case we hold a delegation, but
> > this
> > optimization should only be used when the delegation was obtained via
> > a
> > lookup of the dentry we are currently revalidating.
> >
> > Keeping the optimization might be done by tying the delegation to the
> > dentry. Lacking some (easy?) way to do that currently, it seems
> > simpler to
> > remove the optimization altogether, and I will send a patch to remove
> > it.
>
> A delegation normally applies to the entire inode. It covers _all_
> dentries that point to that inode too because create, rename and unlink
> are always atomically accompanied by an inode change attribute.
>
> IOW: The proposed restriction is both unnecessary and incorrect.
If the delegation also applies to the dentry, then in the described
case where the directory was being modified, isn't it then the
server's responsibility to recall that delegation. Would that have
prevented this problem?
Also reading thru the rfc7530, I don't see where it says that dentry
attributes are also delegated on a read/write delegation. Also, while
the open, create, rename, unlike carry the before/after change
attribute for the directory, I think there are no guarantees of that
attribute after the open (even with the delegation).
RFC 7530 1.4.6 snippet : "When the server grants a delegation for a
file to a client, the client is
guaranteed certain semantics with respect to the sharing of that file
with other clients. " All this is about sharing and IO access to
the file. Prior to that it talks about time-based attribute cache for
file and directory attributes same as in previous NFS versions. I
don't see any mention or implications that directory attributes are
cached for the file delegations.
> > Any thoughts on this? Any response, even asserting that this is not
> > something
> > we will fix are welcome.
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> [email protected]
>
>
On 10 Jun 2019, at 12:43, Trond Myklebust wrote:
> On Mon, 2019-06-10 at 10:14 -0400, Benjamin Coddington wrote:
>> On 4 Jun 2019, at 15:00, Benjamin Coddington wrote:
>>
>>> At least now I can spend some time on it and not feel aimless,
>>> thanks for
>>> the closer look.
>>
>> I am not finding a reliable way to fix this and retain the
>> optimization. I
>> will send a patch to remove it.
>
>
> How about just moving the optimisation into the if
> (nfs_check_verifier()) { } case? There should be no need to do the
> nfs_lookup_verify_inode() dance if we hold a delegation. should there?
Ok, right -- we can optimize that away.. I'll see what sticks to the wall.
Ben