2022-07-14 20:07:47

by Jeff Layton

Subject: [PATCH 0/2] nfsd: close potential race between open and delegation

This is a respin of the patchset that I sent earlier today. I hit a
deadlock with that one because of the ambiguous locking.

This series is based on top of Neil's set entitled:

[PATCH 0/8] NFSD: clean up locking.

His patchset makes the locking in the nfsd4_open codepath much more
consistent, and this becomes a lot simpler to fix. Without that set
however, the state of the parent's i_rwsem is unclear after nfsd_lookup
is called, and I don't see a way to determine it reliably.

Jeff Layton (2):
nfsd: drop fh argument from alloc_init_deleg
nfsd: vet the opened dentry after setting a delegation

fs/nfsd/nfs4state.c | 54 +++++++++++++++++++++++++++++++++++++--------
1 file changed, 45 insertions(+), 9 deletions(-)

--
2.36.1


2022-07-14 20:07:47

by Jeff Layton

Subject: [PATCH 2/2] nfsd: vet the opened dentry after setting a delegation

Between opening a file and setting a delegation on it, someone could
rename or unlink the dentry. If this happens, we do not want to grant a
delegation on the open.

On a CLAIM_NULL open, we're opening by filename, and we'll hold the
i_rwsem when attempting to set a delegation. After getting a
lease, redo the lookup of the file being opened and validate that the
resulting dentry matches the one in the open file description.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 48 ++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 43 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index d2d21fdf5c41..8a8d8c738950 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5267,11 +5267,38 @@ static int nfsd4_check_conflicting_opens(struct nfs4_client *clp,
 	return 0;
 }
 
+/*
+ * It's possible that between opening the dentry and setting the delegation,
+ * that it has been renamed or unlinked. Redo the lookup to validate that this
+ * hasn't happened.
+ */
+static int
+nfsd4_vet_deleg_dentry(struct nfsd4_open *open, struct nfs4_file *fp,
+		       struct dentry *parent)
+{
+	struct dentry *child;
+
+	lockdep_assert_held(&d_inode(parent)->i_rwsem);
+
+	child = lookup_one_len(open->op_fname, parent, open->op_fnamelen);
+	if (IS_ERR(child))
+		return PTR_ERR(child);
+	dput(child);
+
+	if (child != file_dentry(fp->fi_deleg_file->nf_file))
+		return -EAGAIN;
+
+	return 0;
+}
+
 static struct nfs4_delegation *
-nfs4_set_delegation(struct nfs4_client *clp,
-		    struct nfs4_file *fp, struct nfs4_clnt_odstate *odstate)
+nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
+		    struct dentry *parent)
 {
 	int status = 0;
+	struct nfs4_client *clp = stp->st_stid.sc_client;
+	struct nfs4_file *fp = stp->st_stid.sc_file;
+	struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate;
 	struct nfs4_delegation *dp;
 	struct nfsd_file *nf;
 	struct file_lock *fl;
@@ -5326,6 +5353,13 @@ nfs4_set_delegation(struct nfs4_client *clp,
 		locks_free_lock(fl);
 	if (status)
 		goto out_clnt_odstate;
+
+	if (parent) {
+		status = nfsd4_vet_deleg_dentry(open, fp, parent);
+		if (status)
+			goto out_unlock;
+	}
+
 	status = nfsd4_check_conflicting_opens(clp, fp);
 	if (status)
 		goto out_unlock;
@@ -5381,11 +5415,13 @@ static void nfsd4_open_deleg_none_ext(struct nfsd4_open *open, int status)
  * proper support for them.
  */
 static void
-nfs4_open_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp)
+nfs4_open_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
+		     struct dentry *cdentry)
 {
 	struct nfs4_delegation *dp;
 	struct nfs4_openowner *oo = openowner(stp->st_stateowner);
 	struct nfs4_client *clp = stp->st_stid.sc_client;
+	struct dentry *parent = NULL;
 	int cb_up;
 	int status = 0;
 
@@ -5399,6 +5435,8 @@ nfs4_open_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp)
 				goto out_no_deleg;
 			break;
 		case NFS4_OPEN_CLAIM_NULL:
+			parent = cdentry;
+			fallthrough;
 		case NFS4_OPEN_CLAIM_FH:
 			/*
 			 * Let's not give out any delegations till everyone's
@@ -5413,7 +5451,7 @@ nfs4_open_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp)
 		default:
 			goto out_no_deleg;
 	}
-	dp = nfs4_set_delegation(clp, stp->st_stid.sc_file, stp->st_clnt_odstate);
+	dp = nfs4_set_delegation(open, stp, parent);
 	if (IS_ERR(dp))
 		goto out_no_deleg;
 
@@ -5545,7 +5583,7 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
 	 * Attempt to hand out a delegation. No error return, because the
 	 * OPEN succeeds even if we fail.
 	 */
-	nfs4_open_delegation(open, stp);
+	nfs4_open_delegation(open, stp, resp->cstate.current_fh.fh_dentry);
 nodeleg:
 	status = nfs_ok;
 	trace_nfsd_open(&stp->st_stid.sc_stateid);
--
2.36.1
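
For readers who want to try the idea outside the kernel, here is a rough
userspace analogy of the re-lookup check in this patch. It is only a sketch
and not nfsd code: it compares (st_dev, st_ino) pairs from fstat() and stat()
where the patch compares dentry pointers, and the helper names
path_still_names_fd() and demo_rename_is_detected() are made up for this
illustration. Unlike the kernel path, nothing here holds a lease or the
directory lock, so the userspace check is itself still racy.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Userspace analogy only: return 0 if `path` still names the file that
 * `fd` has open, -1 otherwise.
 */
static int path_still_names_fd(int fd, const char *path)
{
	struct stat by_fd, by_path;

	if (fstat(fd, &by_fd) != 0)
		return -1;
	if (stat(path, &by_path) != 0)
		return -1;	/* name is gone: unlinked or renamed away */
	if (by_fd.st_dev != by_path.st_dev || by_fd.st_ino != by_path.st_ino)
		return -1;	/* name now refers to a different file */
	return 0;
}

/*
 * Hypothetical demo helper: open a temp file, rename it, and report
 * whether the re-lookup passed before the rename and failed after it.
 * Returns 1 on the expected outcome, 0 otherwise, -1 on setup failure.
 */
static int demo_rename_is_detected(void)
{
	char path[] = "/tmp/vet_demo_XXXXXX";
	char moved[sizeof(path) + 8];
	int fd, before, after;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	snprintf(moved, sizeof(moved), "%s.moved", path);

	before = path_still_names_fd(fd, path);
	if (rename(path, moved) != 0) {
		close(fd);
		unlink(path);
		return -1;
	}
	after = path_still_names_fd(fd, path);

	close(fd);
	unlink(moved);
	return before == 0 && after == -1;
}
```

Calling demo_rename_is_detected() from a small main() should return 1 on a
system where /tmp is writable: the name matches the open file before the
rename and no longer resolves afterwards.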

2022-07-15 00:01:55

by NeilBrown

Subject: Re: [PATCH 0/2] nfsd: close potential race between open and delegation

On Fri, 15 Jul 2022, Jeff Layton wrote:
> This is a respin of the patchset that I sent earlier today. I hit a
> deadlock with that one because of the ambiguous locking.
>
> This series is based on top of Neil's set entitled:
>
> [PATCH 0/8] NFSD: clean up locking.
>
> His patchset makes the locking in the nfsd4_open codepath much more
> consistent, and this becomes a lot simpler to fix. Without that set
> however, the state of the parent's i_rwsem is unclear after nfsd_lookup
> is called, and I don't see a way to determine it reliably.

I haven't examined these patches very closely, but a few initial thoughts
are:

1/ Before my series, you can unambiguously tell if i_rwsem is held by
checking fhp->fh_locked. In fact, just call "fh_lock()", and you can
then be sure the fh is locked, whether or not it was locked before
however...
2/ Do we really need to lock the parent? If a rename or unlink happens
after the lease was taken, the lease will be broken. So
take lease.
repeat lookup (locklessly)
Check if lease has been broken
Should provide all you need.

You don't *need* to lock the directory to open an existing file and
with my pending parallel-updates patch set, you only need a shared
lock on the directory to create a file. So I'd rather not be locking
the directory at all to get a delegation

3/ When you vet the name you only do a lookup_one_len(), while
nfsd_lookup_dentry() also calls nfsd_cross_mnt() as it is possible
for a file to be mounted on.
That means that if I did bind mount one file over another and export
over NFSD, the file will never offer a delegation.
This is a minor point, but I think it would be best to be as correct
and consistent as possible.
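
The three steps in point 2/ (take lease, repeat lookup, check whether the
lease has been broken) can be modelled with a single-threaded toy. This is an
illustrative sketch, not kernel code: every toy_* name is invented here, the
"lease" is just a pair of flags, and a concurrent rename is simulated by a
callback fired at a chosen point in the sequence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy file: only tracks whether a lease is held and whether it was broken. */
struct toy_file {
	bool lease_held;
	bool lease_broken;
};

/* Toy directory with a single entry. */
struct toy_dir {
	const char *name;		/* NULL once the entry is renamed away */
	struct toy_file *target;
};

/* Renaming the entry away breaks any lease held on the file it named. */
static void toy_rename_away(struct toy_dir *d)
{
	if (d->target && d->target->lease_held)
		d->target->lease_broken = true;
	d->name = NULL;
	d->target = NULL;
}

static struct toy_file *toy_lookup(const struct toy_dir *d, const char *name)
{
	if (d->name && strcmp(d->name, name) == 0)
		return d->target;
	return NULL;
}

typedef void (*toy_racer)(struct toy_dir *);

/*
 * Returns true if a delegation may be granted on `opened`.
 * `before_lease` and `after_lookup` simulate a rename landing in
 * either race window; pass NULL for "no race".
 */
static bool toy_try_delegate(struct toy_dir *d, const char *name,
			     struct toy_file *opened,
			     toy_racer before_lease, toy_racer after_lookup)
{
	struct toy_file *found;

	if (before_lease)
		before_lease(d);

	/* 1. take lease */
	opened->lease_held = true;
	opened->lease_broken = false;

	/* 2. repeat lookup, without any directory lock */
	found = toy_lookup(d, name);

	if (after_lookup)
		after_lookup(d);

	/* 3. reject on a mismatched lookup or a broken lease */
	if (found != opened || opened->lease_broken) {
		opened->lease_held = false;	/* drop the lease again */
		return false;
	}
	return true;
}

/* window: 0 = no race, 1 = rename before lease, 2 = rename after lookup */
static bool toy_run(int window)
{
	struct toy_file f = { false, false };
	struct toy_dir d = { "file", &f };

	return toy_try_delegate(&d, "file", &f,
				window == 1 ? toy_rename_away : NULL,
				window == 2 ? toy_rename_away : NULL);
}
```

In this model a rename that lands before the lease is caught by the repeated
lookup, and one that lands after the lookup is caught by the broken-lease
check, so neither window needs the directory lock.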

Thanks for working on this!

NeilBrown

>
> Jeff Layton (2):
> nfsd: drop fh argument from alloc_init_deleg
> nfsd: vet the opened dentry after setting a delegation
>
> fs/nfsd/nfs4state.c | 54 +++++++++++++++++++++++++++++++++++++--------
> 1 file changed, 45 insertions(+), 9 deletions(-)
>
> --
> 2.36.1
>
>

2022-07-15 11:34:20

by Jeff Layton

Subject: Re: [PATCH 0/2] nfsd: close potential race between open and delegation

On Fri, 2022-07-15 at 09:59 +1000, NeilBrown wrote:
> On Fri, 15 Jul 2022, Jeff Layton wrote:
> > This is a respin of the patchset that I sent earlier today. I hit a
> > deadlock with that one because of the ambiguous locking.
> >
> > This series is based on top of Neil's set entitled:
> >
> > [PATCH 0/8] NFSD: clean up locking.
> >
> > His patchset makes the locking in the nfsd4_open codepath much more
> > consistent, and this becomes a lot simpler to fix. Without that set
> > however, the state of the parent's i_rwsem is unclear after nfsd_lookup
> > is called, and I don't see a way to determine it reliably.
>
> I haven't examined these patches very closely, but a few initial thoughts
> are:
>
> 1/ Before my series, you can unambiguously tell if i_rwsem is held by
> checking fhp->fh_locked. In fact, just call "fh_lock()", and you can
> then be sure the fh is locked, whether or not it was locked before

Thanks, good to know. I wasn't sure how reliable that bool is. I guess
though that once you have a svc_fh, then you can more or less assume
that you have exclusive access to it for the life of the RPC being
processed.
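
As a rough illustration of why a plain bool can be enough here, the sketch
below models a handle that remembers its own lock state, so that calling the
lock helper twice is harmless. It is a toy under stated assumptions: struct
toy_fh and the toy_fh_* helpers are invented for this example, a counter
stands in for actually taking i_rwsem, and the model relies on a single
thread owning the handle for the whole request, as described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a file handle that tracks its own lock state. */
struct toy_fh {
	bool locked;		/* plays the role of fh_locked */
	int  lock_calls;	/* times the underlying lock was really taken */
};

static void toy_fh_lock(struct toy_fh *fh)
{
	if (fh->locked)
		return;		/* already held by this request: nothing to do */
	fh->lock_calls++;	/* stand-in for taking the directory's i_rwsem */
	fh->locked = true;
}

static void toy_fh_unlock(struct toy_fh *fh)
{
	if (!fh->locked)
		return;
	fh->locked = false;
}

/* Lock twice, then report how often the underlying lock was taken. */
static int toy_fh_demo(void)
{
	struct toy_fh fh = { false, 0 };

	toy_fh_lock(&fh);
	toy_fh_lock(&fh);
	return fh.locked ? fh.lock_calls : -1;
}
```

After two toy_fh_lock() calls the handle is locked and the underlying lock
was taken only once; a later toy_fh_unlock() releases it.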

> however...
> 2/ Do we really need to lock the parent? If a rename or unlink happens
> after the lease was taken, the lease will be broken. So
> take lease.
> repeat lookup (locklessly)
> Check if lease has been broken
> Should provide all you need.
>
> You don't *need* to lock the directory to open an existing file and
> with my pending parallel-updates patch set, you only need a shared
> lock on the directory to create a file. So I'd rather not be locking
> the directory at all to get a delegation
>

Yeah, we probably don't need to lock the dir. That said, after your
patch series we already hold the i_rwsem on the parent at this point so
lookup_one_len is fine in this instance.

> 3/ When you vet the name you only do a lookup_one_len(), while
> nfsd_lookup_dentry() also calls nfsd_cross_mnt() as it is possible
> for a file to be mounted on.
> That means that if I did bind mount one file over another and export
> over NFSD, the file will never offer a delegation.
> This is a minor point, but I think it would be best to be as correct
> and consistent as possible.
>

Agreed, but that will take a bit more work. nfsd_lookup_dentry takes
several parameters that we don't currently have access to in
nfs4_set_delegation (e.g. the rqstp). Those will need to be plumbed
through several functions.

> Thanks for working on this!

...and thank you for the locking cleanup! Getting rid of fh_lock/_unlock
is a really nice cleanup that makes it a lot more clear how this should
work.
--
Jeff Layton <[email protected]>

2022-07-18 03:23:27

by NeilBrown

Subject: Re: [PATCH 0/2] nfsd: close potential race between open and delegation

On Fri, 15 Jul 2022, Jeff Layton wrote:
> On Fri, 2022-07-15 at 09:59 +1000, NeilBrown wrote:
>
> > however...
> > 2/ Do we really need to lock the parent? If a rename or unlink happens
> > after the lease was taken, the lease will be broken. So
> > take lease.
> > repeat lookup (locklessly)
> > Check if lease has been broken
> > Should provide all you need.
> >
> > You don't *need* to lock the directory to open an existing file and
> > with my pending parallel-updates patch set, you only need a shared
> > lock on the directory to create a file. So I'd rather not be locking
> > the directory at all to get a delegation
> >
>
> Yeah, we probably don't need to lock the dir. That said, after your
> patch series we already hold the i_rwsem on the parent at this point so
> lookup_one_len is fine in this instance.

But the only reason we hold i_rwsem at this point is to prevent renames
in the "opened existing file" case. The "created new file" case holds
it as well just to be consistent with the first case.

If we "vet" the dentry, then we don't need the lock any more. We can
then simplify nfsd_lookup_dentry() to always assume the dir is not
locked - so the "locked" arg can go, and nfsd_lookup() can lose the
"lock" arg and always return with the directory unlocked.

I'm tempted to add your patch to the front of my series. The
inconsistency in locking can be fixed by unlocking the directory before we
get even close to handing out a delegation - so the delegation never
sees a locked directory.
But right now I have a cold and don't trust myself to think clearly
enough to create code worth posting. Hopefully I'll be thinking more
clearly later in the week.

While I'm here ... is "vet" a good word? The meaning is appropriate,
but I wonder if it would cause our friends for whom English isn't their
first language to stumble. There are about 5 uses in the kernel at
present.

Would validate or verify be better? Even though they are longer..

Thanks,
NeilBrown

2022-07-18 11:20:15

by Jeff Layton

Subject: Re: [PATCH 0/2] nfsd: close potential race between open and delegation

On Mon, 2022-07-18 at 13:21 +1000, NeilBrown wrote:
> On Fri, 15 Jul 2022, Jeff Layton wrote:
> > On Fri, 2022-07-15 at 09:59 +1000, NeilBrown wrote:
> >
> > > however...
> > > 2/ Do we really need to lock the parent? If a rename or unlink happens
> > > after the lease was taken, the lease will be broken. So
> > > take lease.
> > > repeat lookup (locklessly)
> > > Check if lease has been broken
> > > Should provide all you need.
> > >
> > > You don't *need* to lock the directory to open an existing file and
> > > with my pending parallel-updates patch set, you only need a shared
> > > lock on the directory to create a file. So I'd rather not be locking
> > > the directory at all to get a delegation
> > >
> >
> > Yeah, we probably don't need to lock the dir. That said, after your
> > patch series we already hold the i_rwsem on the parent at this point so
> > lookup_one_len is fine in this instance.
>
> But the only reason we hold i_rwsem at this point is to prevent renames
> in the "opened existing file" case. The "created new file" case holds
> it as well just to be consistent with the first case.
>
> If we "vet" the dentry, then we don't need the lock any more. We can
> then simplify nfsd_lookup_dentry() to always assume the dir is not
> locked - so the "locked" arg can go, and nfsd_lookup() can lose the
> "lock" arg and always return with the directory unlocked.
>
> I'm tempted to add your patch to the front of my series. The
> inconsistency in locking can be fixed by unlocking the directory before we
> get even close to handing out a delegation - so the delegation never
> sees a locked directory.

Hmm, ok. I suppose we don't necessarily have to care whether the thing
is locked before calling into nfsd_lookup_dentry. I'll take another stab
at fixing this in the kernel w/o your series. That'll make Chuck happy
too.

> But right now I have a cold and don't trust myself to think clearly
> enough to create code worth posting. Hopefully I'll be thinking more
> clearly later in the week.
>
> While I'm here ... is "vet" a good word? The meaning is appropriate,
> but I wonder if it would cause our friends for whom English isn't their
> first language to stumble. There are about 5 uses in the kernel at
> present.
>
> Would validate or verify be better? Even though they are longer..


Good point. I'm all for helping out non-native English speakers. I'll
plan to change it to something less esoteric.
--
Jeff Layton <[email protected]>

2022-07-18 14:35:59

by Chuck Lever III

Subject: Re: [PATCH 0/2] nfsd: close potential race between open and delegation



> On Jul 17, 2022, at 11:21 PM, NeilBrown <[email protected]> wrote:
>
> But right now I have a cold and don't trust myself to think clearly
> enough to create code worth posting. Hopefully I'll be thinking more
> clearly later in the week.

Thanks for the update!

Here are my plans: I'd like to finalize content of the first 5.20
NFSD pull request this week. If these patches are not ready by
then, I can prepare a second PR later in the 5.20 merge window
with your work, which should give you another two weeks. If they
are still not ready, I'll get them in first thing for the next
merge window.


--
Chuck Lever



2022-07-25 05:20:09

by NeilBrown

Subject: Re: [PATCH 0/2] nfsd: close potential race between open and delegation

On Tue, 19 Jul 2022, Chuck Lever III wrote:
>
> > On Jul 17, 2022, at 11:21 PM, NeilBrown <[email protected]> wrote:
> >
> > But right now I have a cold and don't trust myself to think clearly
> > enough to create code worth posting. Hopefully I'll be thinking more
> > clearly later in the week.
>
> Thanks for the update!
>
> Here are my plans: I'd like to finalize content of the first 5.20
> NFSD pull request this week. If these patches are not ready by
> then, I can prepare a second PR later in the 5.20 merge window
> with your work, which should give you another two weeks. If they
> are still not ready, I'll get them in first thing for the next
> merge window.

FYI I've addressed the outstanding issues (I think), but have not yet done
any testing or given the patches a final inspection. I hope to do that
tomorrow, and will post the patches once it is done.
If you want a preview you can find them on github.com/neilbrown/linux
in the "nfsd" branch.

Jeff's patches to validate delegations after getting the lease, so we
don't have to hold the lock so long, come first.

NeilBrown