2022-09-30 19:17:12

by Jeff Layton

[permalink] [raw]
Subject: [PATCH 0/3] nfsd: filecache fixes

I've been trying to hunt down the bug reported here:

https://bugzilla.linux-nfs.org/show_bug.cgi?id=394

I've not been able to reliably reproduce that, but patches #2 and #3 may
help. The issue I think may be in the management of the sentinel
reference. Responsibility for putting that reference belongs to the
task that clears the HASHED flag, so we need to take care not to put
that ref until we've successfully cleared it.

Hopefully, this will make it a bit more consistent (and close some other
potential races as well).

Jeff Layton (3):
nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting
refs
nfsd: fix potential race in nfsd_file_close
nfsd: fix nfsd_file_unhash_and_dispose

fs/nfsd/filecache.c | 53 ++++++++++++++++++---------------------------
1 file changed, 21 insertions(+), 32 deletions(-)

--
2.37.3


2022-09-30 19:17:52

by Jeff Layton

[permalink] [raw]
Subject: [PATCH 1/3] nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting refs

nfsd_file is RCU-freed, so it's possible that one could be found that's
in the process of being freed and the memory recycled. Ensure we hold
the rcu_read_lock while attempting to get a reference on the object.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/filecache.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index d5c57360b418..6237715bd23e 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -1077,10 +1077,12 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,

retry:
/* Avoid allocation if the item is already in cache */
+ rcu_read_lock();
nf = rhashtable_lookup_fast(&nfsd_file_rhash_tbl, &key,
nfsd_file_rhash_params);
if (nf)
nf = nfsd_file_get(nf);
+ rcu_read_unlock();
if (nf)
goto wait_for_construction;

@@ -1090,16 +1092,21 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
goto out_status;
}

+ rcu_read_lock();
nf = rhashtable_lookup_get_insert_key(&nfsd_file_rhash_tbl,
&key, &new->nf_rhash,
nfsd_file_rhash_params);
if (!nf) {
+ rcu_read_unlock();
nf = new;
goto open_file;
}
- if (IS_ERR(nf))
+ if (IS_ERR(nf)) {
+ rcu_read_unlock();
goto insert_err;
+ }
nf = nfsd_file_get(nf);
+ rcu_read_unlock();
if (nf == NULL) {
nf = new;
goto open_file;
--
2.37.3

2022-09-30 19:29:24

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH 1/3] nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting refs



> On Sep 30, 2022, at 3:15 PM, Jeff Layton <[email protected]> wrote:
>
> nfsd_file is RCU-freed, so it's possible that one could be found that's
> in the process of being freed and the memory recycled. Ensure we hold
> the rcu_read_lock while attempting to get a reference on the object.
>
> Signed-off-by: Jeff Layton <[email protected]>

IIUC, the rcu_read_lock() is held when nfsd_file_obj_cmpfn() is
invoked. So, couldn't we just call nfsd_file_get() on @nf in
there if it returns a match?


> ---
> fs/nfsd/filecache.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index d5c57360b418..6237715bd23e 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -1077,10 +1077,12 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>
> retry:
> /* Avoid allocation if the item is already in cache */
> + rcu_read_lock();
> nf = rhashtable_lookup_fast(&nfsd_file_rhash_tbl, &key,
> nfsd_file_rhash_params);
> if (nf)
> nf = nfsd_file_get(nf);
> + rcu_read_unlock();
> if (nf)
> goto wait_for_construction;
>
> @@ -1090,16 +1092,21 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
> goto out_status;
> }
>
> + rcu_read_lock();
> nf = rhashtable_lookup_get_insert_key(&nfsd_file_rhash_tbl,
> &key, &new->nf_rhash,
> nfsd_file_rhash_params);
> if (!nf) {
> + rcu_read_unlock();
> nf = new;
> goto open_file;
> }
> - if (IS_ERR(nf))
> + if (IS_ERR(nf)) {
> + rcu_read_unlock();
> goto insert_err;
> + }
> nf = nfsd_file_get(nf);
> + rcu_read_unlock();
> if (nf == NULL) {
> nf = new;
> goto open_file;
> --
> 2.37.3
>

--
Chuck Lever



2022-09-30 19:43:12

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH 1/3] nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting refs

On Fri, 2022-09-30 at 19:20 +0000, Chuck Lever III wrote:
>
> > On Sep 30, 2022, at 3:15 PM, Jeff Layton <[email protected]> wrote:
> >
> > nfsd_file is RCU-freed, so it's possible that one could be found that's
> > in the process of being freed and the memory recycled. Ensure we hold
> > the rcu_read_lock while attempting to get a reference on the object.
> >
> > Signed-off-by: Jeff Layton <[email protected]>
>
> IIUC, the rcu_read_lock() is held when nfsd_file_obj_cmpfn() is
> invoked. So, couldn't we just call nfsd_file_get() on @nf in
> there if it returns a match?
>
>

Adding side effects to the comparison function seems kind of gross. Do
you know for sure that it will only be called once when you do a search?

Also, there are other places where we search the hashtable but don't
take references. We could just make those places put the reference, but
that gets messy.

I think this is cleaner.

> > ---
> > fs/nfsd/filecache.c | 9 ++++++++-
> > 1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > index d5c57360b418..6237715bd23e 100644
> > --- a/fs/nfsd/filecache.c
> > +++ b/fs/nfsd/filecache.c
> > @@ -1077,10 +1077,12 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
> >
> > retry:
> > /* Avoid allocation if the item is already in cache */
> > + rcu_read_lock();
> > nf = rhashtable_lookup_fast(&nfsd_file_rhash_tbl, &key,
> > nfsd_file_rhash_params);
> > if (nf)
> > nf = nfsd_file_get(nf);
> > + rcu_read_unlock();
> > if (nf)
> > goto wait_for_construction;
> >
> > @@ -1090,16 +1092,21 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > goto out_status;
> > }
> >
> > + rcu_read_lock();
> > nf = rhashtable_lookup_get_insert_key(&nfsd_file_rhash_tbl,
> > &key, &new->nf_rhash,
> > nfsd_file_rhash_params);
> > if (!nf) {
> > + rcu_read_unlock();
> > nf = new;
> > goto open_file;
> > }
> > - if (IS_ERR(nf))
> > + if (IS_ERR(nf)) {
> > + rcu_read_unlock();
> > goto insert_err;
> > + }
> > nf = nfsd_file_get(nf);
> > + rcu_read_unlock();
> > if (nf == NULL) {
> > nf = new;
> > goto open_file;
> > --
> > 2.37.3
> >
>
> --
> Chuck Lever
>
>
>

--
Jeff Layton <[email protected]>

2022-09-30 20:11:01

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH 1/3] nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting refs



> On Sep 30, 2022, at 3:33 PM, Jeff Layton <[email protected]> wrote:
>
> On Fri, 2022-09-30 at 19:20 +0000, Chuck Lever III wrote:
>>
>>> On Sep 30, 2022, at 3:15 PM, Jeff Layton <[email protected]> wrote:
>>>
>>> nfsd_file is RCU-freed, so it's possible that one could be found that's
>>> in the process of being freed and the memory recycled. Ensure we hold
>>> the rcu_read_lock while attempting to get a reference on the object.
>>>
>>> Signed-off-by: Jeff Layton <[email protected]>
>>
>> IIUC, the rcu_read_lock() is held when nfsd_file_obj_cmpfn() is
>> invoked. So, couldn't we just call nfsd_file_get() on @nf in
>> there if it returns a match?
>>
>>
>
> Adding side effects to the comparison function seems kind of gross.
> Do you know for sure that it will only be called once when you do a search?

Yes, when a match occurs, the chain walk stops.


> Also, there are other places where we search the hashtable but don't
> take references. We could just make those places put the reference, but
> that gets messy.

nfsd_file_do_acquire() is the only place that calls the comparator
with NFSD_FILE_KEY_FULL.

165 if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags) == 0)
166 return 1;
+ nfsd_file_get(nf);
167 break;


> I think this is cleaner.

Unfortunately, the comparator's second parameter is a const
void *, so we can't modify the matching nf. Ho-hum.


>>> ---
>>> fs/nfsd/filecache.c | 9 ++++++++-
>>> 1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
>>> index d5c57360b418..6237715bd23e 100644
>>> --- a/fs/nfsd/filecache.c
>>> +++ b/fs/nfsd/filecache.c
>>> @@ -1077,10 +1077,12 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>>
>>> retry:
>>> /* Avoid allocation if the item is already in cache */
>>> + rcu_read_lock();
>>> nf = rhashtable_lookup_fast(&nfsd_file_rhash_tbl, &key,
>>> nfsd_file_rhash_params);

This probably should be changed to rhashtable_lookup() so that
we're not double-wrapping the call with rcu_read_lock/unlock.


>>> if (nf)
>>> nf = nfsd_file_get(nf);
>>> + rcu_read_unlock();
>>> if (nf)
>>> goto wait_for_construction;
>>>
>>> @@ -1090,16 +1092,21 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>> goto out_status;
>>> }
>>>
>>> + rcu_read_lock();
>>> nf = rhashtable_lookup_get_insert_key(&nfsd_file_rhash_tbl,
>>> &key, &new->nf_rhash,
>>> nfsd_file_rhash_params);
>>> if (!nf) {
>>> + rcu_read_unlock();
>>> nf = new;
>>> goto open_file;
>>> }
>>> - if (IS_ERR(nf))
>>> + if (IS_ERR(nf)) {
>>> + rcu_read_unlock();
>>> goto insert_err;
>>> + }
>>> nf = nfsd_file_get(nf);
>>> + rcu_read_unlock();

What might be nicer is if this used rhashtable_lookup_insert_key()
instead, and if it returns EEXIST, it simply frees @new and retries
the first lookup.


>>> if (nf == NULL) {
>>> nf = new;
>>> goto open_file;
>>> --
>>> 2.37.3
>>>
>>
>> --
>> Chuck Lever
>>
>>
>>
>
> --
> Jeff Layton <[email protected]>

--
Chuck Lever



2022-10-01 04:58:39

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH 1/3] nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting refs

On Sat, 01 Oct 2022, Jeff Layton wrote:
> nfsd_file is RCU-freed, so it's possible that one could be found that's
> in the process of being freed and the memory recycled. Ensure we hold
> the rcu_read_lock while attempting to get a reference on the object.
>
> Signed-off-by: Jeff Layton <[email protected]>
> ---
> fs/nfsd/filecache.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index d5c57360b418..6237715bd23e 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -1077,10 +1077,12 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>
> retry:
> /* Avoid allocation if the item is already in cache */
> + rcu_read_lock();
> nf = rhashtable_lookup_fast(&nfsd_file_rhash_tbl, &key,
> nfsd_file_rhash_params);
> if (nf)
> nf = nfsd_file_get(nf);
> + rcu_read_unlock();

Looks good.

> if (nf)
> goto wait_for_construction;
>
> @@ -1090,16 +1092,21 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
> goto out_status;
> }
>
> + rcu_read_lock();
> nf = rhashtable_lookup_get_insert_key(&nfsd_file_rhash_tbl,
> &key, &new->nf_rhash,
> nfsd_file_rhash_params);
> if (!nf) {
> + rcu_read_unlock();
> nf = new;
> goto open_file;
> }
> - if (IS_ERR(nf))
> + if (IS_ERR(nf)) {
> + rcu_read_unlock();
> goto insert_err;
> + }
> nf = nfsd_file_get(nf);
> + rcu_read_unlock();

Ugh.
Could we make this:
rcu_read_lock()
nf = r_l_g_i_k()
if (!IS_ERR_OR_NULL(nf))
nf = nfsd_file_get(nf);
rcu_read_unlock()
...
??

NeilBrown

> if (nf == NULL) {
> nf = new;
> goto open_file;
> --
> 2.37.3
>
>

2022-10-01 09:51:16

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH 1/3] nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting refs

On Sat, 2022-10-01 at 14:44 +1000, NeilBrown wrote:
> On Sat, 01 Oct 2022, Jeff Layton wrote:
> > nfsd_file is RCU-freed, so it's possible that one could be found that's
> > in the process of being freed and the memory recycled. Ensure we hold
> > the rcu_read_lock while attempting to get a reference on the object.
> >
> > Signed-off-by: Jeff Layton <[email protected]>
> > ---
> > fs/nfsd/filecache.c | 9 ++++++++-
> > 1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > index d5c57360b418..6237715bd23e 100644
> > --- a/fs/nfsd/filecache.c
> > +++ b/fs/nfsd/filecache.c
> > @@ -1077,10 +1077,12 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
> >
> > retry:
> > /* Avoid allocation if the item is already in cache */
> > + rcu_read_lock();
> > nf = rhashtable_lookup_fast(&nfsd_file_rhash_tbl, &key,
> > nfsd_file_rhash_params);
> > if (nf)
> > nf = nfsd_file_get(nf);
> > + rcu_read_unlock();
>
> Looks good.
>
> > if (nf)
> > goto wait_for_construction;
> >
> > @@ -1090,16 +1092,21 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > goto out_status;
> > }
> >
> > + rcu_read_lock();
> > nf = rhashtable_lookup_get_insert_key(&nfsd_file_rhash_tbl,
> > &key, &new->nf_rhash,
> > nfsd_file_rhash_params);
> > if (!nf) {
> > + rcu_read_unlock();
> > nf = new;
> > goto open_file;
> > }
> > - if (IS_ERR(nf))
> > + if (IS_ERR(nf)) {
> > + rcu_read_unlock();
> > goto insert_err;
> > + }
> > nf = nfsd_file_get(nf);
> > + rcu_read_unlock();
>
> Ugh.
> Could we make this:
> rcu_read_lock()
> nf = r_l_g_i_k()
> if (!IS_ERR_OR_NULL(nf))
> nf = nfsd_file_get(nf);
> rcu_read_unlock()
> ...
> ??
>
> NeilBrown
>

Good point. I'll resend a v2 here.

> > if (nf == NULL) {
> > nf = new;
> > goto open_file;
> > --
> > 2.37.3
> >
> >

--
Jeff Layton <[email protected]>