2014-05-30 13:09:44

by Jeff Layton

[permalink] [raw]
Subject: [PATCH 0/9] nfsd: bugfixes and preliminary patches for client_mutex removal

Hi Bruce!

This patchset contains some bugfixes and preliminary patches for
the client_mutex removal work. I'm sending these out first as they
either fix bugs or add only locking that is nested inside the
client_mutex anyway.

These are based on your nfsd-next tree. Patch #3 fixes a (possibly
minor?) bug in the fi_delegations list handling, and might be suitable
for stable based on that. I haven't cc'ed stable on it as I can't quite
convince myself it's a real problem, but it might be reasonable to do
so anyway.

Benny Halevy (3):
nfsd4: use recall_lock for delegation hashing
nfsd4: rename recall_lock to state_lock
nfsd4: hash deleg stateid only on successful nfs4_set_delegation

Jeff Layton (4):
nfsd: make nfsd4_encode_fattr static
nfsd: fix laundromat next-run-time calculation
nfsd: fix setting of NFS4_OO_CONFIRMED in nfsd4_open
nfsd: remove unneeded zeroing of fields in nfsd4_proc_compound

Trond Myklebust (2):
NFSd: Protect addition to the file_hashtbl
NFSd: protect delegation setup with the i_lock

fs/nfsd/nfs4callback.c | 18 +++++-
fs/nfsd/nfs4proc.c | 6 +-
fs/nfsd/nfs4state.c | 168 ++++++++++++++++++++++++++++++-------------------
fs/nfsd/nfs4xdr.c | 2 +-
fs/nfsd/state.h | 1 +
5 files changed, 124 insertions(+), 71 deletions(-)

--
1.9.3



2014-05-30 13:09:46

by Jeff Layton

Subject: [PATCH 1/9] nfsd: make nfsd4_encode_fattr static

sparse says:

CHECK fs/nfsd/nfs4xdr.c
fs/nfsd/nfs4xdr.c:2043:1: warning: symbol 'nfsd4_encode_fattr' was not declared. Should it be static?

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4xdr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index d3a576dbd99b..6bcdff11c32c 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -2039,7 +2039,7 @@ static int get_parent_attributes(struct svc_export *exp, struct kstat *stat)
* Note: @fhp can be NULL; in this case, we might have to compose the filehandle
* ourselves.
*/
-__be32
+static __be32
nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp,
struct svc_export *exp,
struct dentry *dentry, u32 *bmval,
--
1.9.3


2014-05-30 13:13:56

by Jeff Layton

Subject: [PATCH 10/9] nfsd: remove initial assignment of "p" in nfsd4_encode_security_label

"pp" is no longer passed to this function, and "p" is clobbered just
afterward anyway.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4xdr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 1bac00040706..d3a576dbd99b 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1981,7 +1981,7 @@ static inline __be32
nfsd4_encode_security_label(struct xdr_stream *xdr, struct svc_rqst *rqstp,
void *context, int len)
{
- __be32 *p = *pp;
+ __be32 *p;

p = xdr_reserve_space(xdr, len + 4 + 4 + 4);
if (!p)
--
1.9.3


2014-05-30 13:09:58

by Jeff Layton

Subject: [PATCH 9/9] NFSd: protect delegation setup with the i_lock

From: Trond Myklebust <[email protected]>

state_lock is a heavily contended global lock. We don't want to grab
that while simultaneously holding the inode->i_lock. Avoid doing that in
the delegation break callback by ensuring that we add/remove the
dl_perfile under the inode->i_lock.

This, however, can cause an ABBA deadlock between the state_lock and the
i_lock. Work around that by doing the list manipulation in the delegation
recall from the workqueue context prior to starting the rpc call.

Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4callback.c | 18 ++++++++++++++++--
fs/nfsd/nfs4state.c | 41 +++++++++++++++++++++++++++--------------
fs/nfsd/state.h | 1 +
3 files changed, 44 insertions(+), 16 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 2c73cae9899d..3d67bbf26cee 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1011,9 +1011,8 @@ static void nfsd4_process_cb_update(struct nfsd4_callback *cb)
run_nfsd4_cb(cb);
}

-static void nfsd4_do_callback_rpc(struct work_struct *w)
+static void nfsd4_run_callback_rpc(struct nfsd4_callback *cb)
{
- struct nfsd4_callback *cb = container_of(w, struct nfsd4_callback, cb_work);
struct nfs4_client *clp = cb->cb_clp;
struct rpc_clnt *clnt;

@@ -1031,11 +1030,25 @@ static void nfsd4_do_callback_rpc(struct work_struct *w)
cb->cb_ops, cb);
}

+static void nfsd4_do_callback_rpc(struct work_struct *w)
+{
+ struct nfsd4_callback *cb = container_of(w, struct nfsd4_callback, cb_work);
+ nfsd4_run_callback_rpc(cb);
+}
+
void nfsd4_init_callback(struct nfsd4_callback *cb)
{
INIT_WORK(&cb->cb_work, nfsd4_do_callback_rpc);
}

+static void nfsd4_do_cb_recall(struct work_struct *w)
+{
+ struct nfsd4_callback *cb = container_of(w, struct nfsd4_callback, cb_work);
+
+ nfsd4_prepare_cb_recall(cb->cb_op);
+ nfsd4_run_callback_rpc(cb);
+}
+
void nfsd4_cb_recall(struct nfs4_delegation *dp)
{
struct nfsd4_callback *cb = &dp->dl_recall;
@@ -1052,6 +1065,7 @@ void nfsd4_cb_recall(struct nfs4_delegation *dp)

INIT_LIST_HEAD(&cb->cb_per_client);
cb->cb_done = true;
+ INIT_WORK(&cb->cb_work, nfsd4_do_cb_recall);

run_nfsd4_cb(&dp->dl_recall);
}
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 553c2d6d48dc..c249c5261640 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -438,7 +438,9 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
lockdep_assert_held(&state_lock);

dp->dl_stid.sc_type = NFS4_DELEG_STID;
+ spin_lock(&fp->fi_inode->i_lock);
list_add(&dp->dl_perfile, &fp->fi_delegations);
+ spin_unlock(&fp->fi_inode->i_lock);
list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
}

@@ -446,14 +448,20 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
static void
unhash_delegation(struct nfs4_delegation *dp)
{
+ struct nfs4_file *fp = dp->dl_file;
+
spin_lock(&state_lock);
list_del_init(&dp->dl_perclnt);
- list_del_init(&dp->dl_perfile);
list_del_init(&dp->dl_recall_lru);
+ if (!list_empty(&dp->dl_perfile)) {
+ spin_lock(&fp->fi_inode->i_lock);
+ list_del_init(&dp->dl_perfile);
+ spin_unlock(&fp->fi_inode->i_lock);
+ }
spin_unlock(&state_lock);
- if (dp->dl_file) {
- nfs4_put_deleg_lease(dp->dl_file);
- put_nfs4_file(dp->dl_file);
+ if (fp) {
+ nfs4_put_deleg_lease(fp);
+ put_nfs4_file(fp);
dp->dl_file = NULL;
}
}
@@ -2766,24 +2774,31 @@ out:
return ret;
}

-static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
+void nfsd4_prepare_cb_recall(struct nfs4_delegation *dp)
{
struct nfs4_client *clp = dp->dl_stid.sc_client;
struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);

- lockdep_assert_held(&state_lock);
+ /*
+ * We can't do this in nfsd_break_deleg_cb because it is
+ * already holding inode->i_lock
+ */
+ spin_lock(&state_lock);
+ if (list_empty(&dp->dl_recall_lru)) {
+ dp->dl_time = get_seconds();
+ list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
+ }
+ spin_unlock(&state_lock);
+}
+
+static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
+{
/* We're assuming the state code never drops its reference
* without first removing the lease. Since we're in this lease
* callback (and since the lease code is serialized by the kernel
* lock) we know the server hasn't removed the lease yet, we know
* it's safe to take a reference: */
atomic_inc(&dp->dl_count);
-
- list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
-
- /* Only place dl_time is set; protected by i_lock: */
- dp->dl_time = get_seconds();
-
nfsd4_cb_recall(dp);
}

@@ -2808,11 +2823,9 @@ static void nfsd_break_deleg_cb(struct file_lock *fl)
*/
fl->fl_break_time = 0;

- spin_lock(&state_lock);
fp->fi_had_conflict = true;
list_for_each_entry(dp, &fp->fi_delegations, dl_perfile)
nfsd_break_one_deleg(dp);
- spin_unlock(&state_lock);
}

static
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 374c66283ac5..aacccb2a476c 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -472,6 +472,7 @@ extern void nfsd4_cb_recall(struct nfs4_delegation *dp);
extern int nfsd4_create_callback_queue(void);
extern void nfsd4_destroy_callback_queue(void);
extern void nfsd4_shutdown_callback(struct nfs4_client *);
+extern void nfsd4_prepare_cb_recall(struct nfs4_delegation *dp);
extern void nfs4_put_delegation(struct nfs4_delegation *dp);
extern struct nfs4_client_reclaim *nfs4_client_to_reclaim(const char *name,
struct nfsd_net *nn);
--
1.9.3


2014-05-30 13:09:52

by Jeff Layton

Subject: [PATCH 5/9] nfsd: remove unneeded zeroing of fields in nfsd4_proc_compound

The memset of resp in svc_process_common should ensure that these are
already zeroed by the time they get here.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4proc.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index a1e16103a0bf..9e7846e19ea6 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1292,11 +1292,8 @@ nfsd4_proc_compound(struct svc_rqst *rqstp,
xdr_reserve_space(&resp->xdr, 8 + args->taglen);
resp->taglen = args->taglen;
resp->tag = args->tag;
- resp->opcnt = 0;
resp->rqstp = rqstp;
cstate->minorversion = args->minorversion;
- cstate->replay_owner = NULL;
- cstate->session = NULL;
fh_init(current_fh, NFS4_FHSIZE);
fh_init(save_fh, NFS4_FHSIZE);
/*
--
1.9.3


2014-05-30 13:09:56

by Jeff Layton

Subject: [PATCH 8/9] NFSd: Protect addition to the file_hashtbl

From: Trond Myklebust <[email protected]>

Ensure that we can only have a single struct nfs4_file per inode
in the file_hashtbl, and make addition atomic with respect to lookup.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/nfs4state.c | 46 +++++++++++++++++++++++++++++++++++-----------
1 file changed, 35 insertions(+), 11 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index a500033a2f87..553c2d6d48dc 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2519,6 +2519,8 @@ static void nfsd4_init_file(struct nfs4_file *fp, struct inode *ino)
{
unsigned int hashval = file_hashval(ino);

+ lockdep_assert_held(&state_lock);
+
atomic_set(&fp->fi_ref, 1);
INIT_LIST_HEAD(&fp->fi_stateids);
INIT_LIST_HEAD(&fp->fi_delegations);
@@ -2527,9 +2529,7 @@ static void nfsd4_init_file(struct nfs4_file *fp, struct inode *ino)
fp->fi_lease = NULL;
memset(fp->fi_fds, 0, sizeof(fp->fi_fds));
memset(fp->fi_access, 0, sizeof(fp->fi_access));
- spin_lock(&state_lock);
hlist_add_head(&fp->fi_hash, &file_hashtbl[hashval]);
- spin_unlock(&state_lock);
}

void
@@ -2695,23 +2695,49 @@ find_openstateowner_str(unsigned int hashval, struct nfsd4_open *open,

/* search file_hashtbl[] for file */
static struct nfs4_file *
-find_file(struct inode *ino)
+find_file_locked(struct inode *ino)
{
unsigned int hashval = file_hashval(ino);
struct nfs4_file *fp;

- spin_lock(&state_lock);
+ lockdep_assert_held(&state_lock);
+
hlist_for_each_entry(fp, &file_hashtbl[hashval], fi_hash) {
if (fp->fi_inode == ino) {
get_nfs4_file(fp);
- spin_unlock(&state_lock);
return fp;
}
}
- spin_unlock(&state_lock);
return NULL;
}

+static struct nfs4_file *
+find_file(struct inode *ino)
+{
+ struct nfs4_file *fp;
+
+ spin_lock(&state_lock);
+ fp = find_file_locked(ino);
+ spin_unlock(&state_lock);
+ return fp;
+}
+
+static struct nfs4_file *
+find_or_add_file(struct inode *ino, struct nfs4_file *new)
+{
+ struct nfs4_file *fp;
+
+ spin_lock(&state_lock);
+ fp = find_file_locked(ino);
+ if (fp == NULL) {
+ nfsd4_init_file(new, ino);
+ fp = new;
+ }
+ spin_unlock(&state_lock);
+
+ return fp;
+}
+
/*
* Called to check deny when READ with all zero stateid or
* WRITE with all zero or all one stateid
@@ -3230,21 +3256,19 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
* and check for delegations in the process of being recalled.
* If not found, create the nfs4_file struct
*/
- fp = find_file(ino);
- if (fp) {
+ fp = find_or_add_file(ino, open->op_file);
+ if (fp != open->op_file) {
if ((status = nfs4_check_open(fp, open, &stp)))
goto out;
status = nfs4_check_deleg(cl, open, &dp);
if (status)
goto out;
} else {
+ open->op_file = NULL;
status = nfserr_bad_stateid;
if (nfsd4_is_deleg_cur(open))
goto out;
status = nfserr_jukebox;
- fp = open->op_file;
- open->op_file = NULL;
- nfsd4_init_file(fp, ino);
}

/*
--
1.9.3


2014-05-30 13:09:55

by Jeff Layton

Subject: [PATCH 7/9] nfsd4: hash deleg stateid only on successful nfs4_set_delegation

From: Benny Halevy <[email protected]>

We don't want the stateid to be found in the hash table before the delegation
is granted.

Currently this is protected by the client_mutex, but we want to break that
up and this is a necessary step toward that goal.

Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4state.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 040000de3732..a500033a2f87 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -375,7 +375,6 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_ol_stateid *stp, struct sv
dp = delegstateid(nfs4_alloc_stid(clp, deleg_slab));
if (dp == NULL)
return dp;
- dp->dl_stid.sc_type = NFS4_DELEG_STID;
/*
* delegation seqid's are never incremented. The 4.1 special
* meaning of seqid 0 isn't meaningful, really, but let's avoid
@@ -438,6 +437,7 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
{
lockdep_assert_held(&state_lock);

+ dp->dl_stid.sc_type = NFS4_DELEG_STID;
list_add(&dp->dl_perfile, &fp->fi_delegations);
list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
}
--
1.9.3


2014-05-30 13:09:50

by Jeff Layton

Subject: [PATCH 4/9] nfsd: fix setting of NFS4_OO_CONFIRMED in nfsd4_open

In the NFS4_OPEN_CLAIM_PREVIOUS case, we should only mark the openowner
confirmed if the nfs4_check_open_reclaim check succeeds.

In the NFS4_OPEN_CLAIM_DELEG_PREV_FH and NFS4_OPEN_CLAIM_DELEGATE_PREV
cases, I see no point in declaring the openowner confirmed when the
operation is going to fail anyway, and doing so might allow the client
to game things such that it wouldn't need to confirm a subsequent open
with the same owner.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4proc.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 2517e0bf5751..a1e16103a0bf 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -430,12 +430,12 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
goto out;
break;
case NFS4_OPEN_CLAIM_PREVIOUS:
- open->op_openowner->oo_flags |= NFS4_OO_CONFIRMED;
status = nfs4_check_open_reclaim(&open->op_clientid,
cstate->minorversion,
nn);
if (status)
goto out;
+ open->op_openowner->oo_flags |= NFS4_OO_CONFIRMED;
case NFS4_OPEN_CLAIM_FH:
case NFS4_OPEN_CLAIM_DELEG_CUR_FH:
status = do_open_fhandle(rqstp, cstate, open);
@@ -445,7 +445,6 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
break;
case NFS4_OPEN_CLAIM_DELEG_PREV_FH:
case NFS4_OPEN_CLAIM_DELEGATE_PREV:
- open->op_openowner->oo_flags |= NFS4_OO_CONFIRMED;
dprintk("NFSD: unsupported OPEN claim type %d\n",
open->op_claim_type);
status = nfserr_notsupp;
--
1.9.3


2014-05-30 13:09:47

by Jeff Layton

Subject: [PATCH 2/9] nfsd: fix laundromat next-run-time calculation

The laundromat uses two variables to calculate when it should next run,
but one is completely ignored at the end of the run. Merge the two and
rename the variable to be more descriptive of what it does.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 22 ++++++++--------------
1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index e85b8811aad3..be6199cc8711 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3385,8 +3385,7 @@ nfs4_laundromat(struct nfsd_net *nn)
struct nfs4_delegation *dp;
struct list_head *pos, *next, reaplist;
time_t cutoff = get_seconds() - nn->nfsd4_lease;
- time_t t, clientid_val = nn->nfsd4_lease;
- time_t u, test_val = nn->nfsd4_lease;
+ time_t t, new_timeo = nn->nfsd4_lease;

nfs4_lock_state();

@@ -3398,8 +3397,7 @@ nfs4_laundromat(struct nfsd_net *nn)
clp = list_entry(pos, struct nfs4_client, cl_lru);
if (time_after((unsigned long)clp->cl_time, (unsigned long)cutoff)) {
t = clp->cl_time - cutoff;
- if (clientid_val > t)
- clientid_val = t;
+ new_timeo = min(new_timeo, t);
break;
}
if (mark_client_expired_locked(clp)) {
@@ -3422,9 +3420,8 @@ nfs4_laundromat(struct nfsd_net *nn)
if (net_generic(dp->dl_stid.sc_client->net, nfsd_net_id) != nn)
continue;
if (time_after((unsigned long)dp->dl_time, (unsigned long)cutoff)) {
- u = dp->dl_time - cutoff;
- if (test_val > u)
- test_val = u;
+ t = dp->dl_time - cutoff;
+ new_timeo = min(new_timeo, t);
break;
}
list_move(&dp->dl_recall_lru, &reaplist);
@@ -3434,21 +3431,18 @@ nfs4_laundromat(struct nfsd_net *nn)
dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
revoke_delegation(dp);
}
- test_val = nn->nfsd4_lease;
list_for_each_safe(pos, next, &nn->close_lru) {
oo = container_of(pos, struct nfs4_openowner, oo_close_lru);
if (time_after((unsigned long)oo->oo_time, (unsigned long)cutoff)) {
- u = oo->oo_time - cutoff;
- if (test_val > u)
- test_val = u;
+ t = oo->oo_time - cutoff;
+ new_timeo = min(new_timeo, t);
break;
}
release_openowner(oo);
}
- if (clientid_val < NFSD_LAUNDROMAT_MINTIMEOUT)
- clientid_val = NFSD_LAUNDROMAT_MINTIMEOUT;
+ new_timeo = max_t(time_t, new_timeo, NFSD_LAUNDROMAT_MINTIMEOUT);
nfs4_unlock_state();
- return clientid_val;
+ return new_timeo;
}

static struct workqueue_struct *laundry_wq;
--
1.9.3


2014-05-30 13:09:53

by Jeff Layton

Subject: [PATCH 6/9] nfsd4: rename recall_lock to state_lock

From: Benny Halevy <[email protected]>

...as the name is a bit more descriptive and we've started using it for
other purposes.

Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4state.c | 63 +++++++++++++++++++++++++++--------------------------
1 file changed, 32 insertions(+), 31 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 96bdd90b96b1..040000de3732 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -81,7 +81,7 @@ static DEFINE_MUTEX(client_mutex);
* effort to decrease the scope of the client_mutex, this spinlock may
* eventually cover more:
*/
-static DEFINE_SPINLOCK(recall_lock);
+static DEFINE_SPINLOCK(state_lock);

static struct kmem_cache *openowner_slab;
static struct kmem_cache *lockowner_slab;
@@ -235,9 +235,9 @@ static void nfsd4_free_file(struct nfs4_file *f)
static inline void
put_nfs4_file(struct nfs4_file *fi)
{
- if (atomic_dec_and_lock(&fi->fi_ref, &recall_lock)) {
+ if (atomic_dec_and_lock(&fi->fi_ref, &state_lock)) {
hlist_del(&fi->fi_hash);
- spin_unlock(&recall_lock);
+ spin_unlock(&state_lock);
iput(fi->fi_inode);
nfsd4_free_file(fi);
}
@@ -436,7 +436,7 @@ static void unhash_stid(struct nfs4_stid *s)
static void
hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
{
- lockdep_assert_held(&recall_lock);
+ lockdep_assert_held(&state_lock);

list_add(&dp->dl_perfile, &fp->fi_delegations);
list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
@@ -446,11 +446,11 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
static void
unhash_delegation(struct nfs4_delegation *dp)
{
- spin_lock(&recall_lock);
+ spin_lock(&state_lock);
list_del_init(&dp->dl_perclnt);
list_del_init(&dp->dl_perfile);
list_del_init(&dp->dl_recall_lru);
- spin_unlock(&recall_lock);
+ spin_unlock(&state_lock);
if (dp->dl_file) {
nfs4_put_deleg_lease(dp->dl_file);
put_nfs4_file(dp->dl_file);
@@ -1161,13 +1161,13 @@ destroy_client(struct nfs4_client *clp)
struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);

INIT_LIST_HEAD(&reaplist);
- spin_lock(&recall_lock);
+ spin_lock(&state_lock);
while (!list_empty(&clp->cl_delegations)) {
dp = list_entry(clp->cl_delegations.next, struct nfs4_delegation, dl_perclnt);
list_del_init(&dp->dl_perclnt);
list_move(&dp->dl_recall_lru, &reaplist);
}
- spin_unlock(&recall_lock);
+ spin_unlock(&state_lock);
while (!list_empty(&reaplist)) {
dp = list_entry(reaplist.next, struct nfs4_delegation, dl_recall_lru);
destroy_delegation(dp);
@@ -2527,9 +2527,9 @@ static void nfsd4_init_file(struct nfs4_file *fp, struct inode *ino)
fp->fi_lease = NULL;
memset(fp->fi_fds, 0, sizeof(fp->fi_fds));
memset(fp->fi_access, 0, sizeof(fp->fi_access));
- spin_lock(&recall_lock);
+ spin_lock(&state_lock);
hlist_add_head(&fp->fi_hash, &file_hashtbl[hashval]);
- spin_unlock(&recall_lock);
+ spin_unlock(&state_lock);
}

void
@@ -2700,15 +2700,15 @@ find_file(struct inode *ino)
unsigned int hashval = file_hashval(ino);
struct nfs4_file *fp;

- spin_lock(&recall_lock);
+ spin_lock(&state_lock);
hlist_for_each_entry(fp, &file_hashtbl[hashval], fi_hash) {
if (fp->fi_inode == ino) {
get_nfs4_file(fp);
- spin_unlock(&recall_lock);
+ spin_unlock(&state_lock);
return fp;
}
}
- spin_unlock(&recall_lock);
+ spin_unlock(&state_lock);
return NULL;
}

@@ -2745,6 +2745,7 @@ static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
struct nfs4_client *clp = dp->dl_stid.sc_client;
struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);

+ lockdep_assert_held(&state_lock);
/* We're assuming the state code never drops its reference
* without first removing the lease. Since we're in this lease
* callback (and since the lease code is serialized by the kernel
@@ -2781,11 +2782,11 @@ static void nfsd_break_deleg_cb(struct file_lock *fl)
*/
fl->fl_break_time = 0;

- spin_lock(&recall_lock);
+ spin_lock(&state_lock);
fp->fi_had_conflict = true;
list_for_each_entry(dp, &fp->fi_delegations, dl_perfile)
nfsd_break_one_deleg(dp);
- spin_unlock(&recall_lock);
+ spin_unlock(&state_lock);
}

static
@@ -3065,9 +3066,9 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
fp->fi_lease = fl;
fp->fi_deleg_file = get_file(fl->fl_file);
atomic_set(&fp->fi_delegees, 1);
- spin_lock(&recall_lock);
+ spin_lock(&state_lock);
hash_delegation_locked(dp, fp);
- spin_unlock(&recall_lock);
+ spin_unlock(&state_lock);
return 0;
out_free:
locks_free_lock(fl);
@@ -3082,14 +3083,14 @@ static int nfs4_set_delegation(struct nfs4_delegation *dp, struct nfs4_file *fp)
dp->dl_file = fp;
if (!fp->fi_lease)
return nfs4_setlease(dp);
- spin_lock(&recall_lock);
+ spin_lock(&state_lock);
atomic_inc(&fp->fi_delegees);
if (fp->fi_had_conflict) {
- spin_unlock(&recall_lock);
+ spin_unlock(&state_lock);
return -EAGAIN;
}
hash_delegation_locked(dp, fp);
- spin_unlock(&recall_lock);
+ spin_unlock(&state_lock);
return 0;
}

@@ -3423,7 +3424,7 @@ nfs4_laundromat(struct nfsd_net *nn)
clp->cl_clientid.cl_id);
expire_client(clp);
}
- spin_lock(&recall_lock);
+ spin_lock(&state_lock);
list_for_each_safe(pos, next, &nn->del_recall_lru) {
dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
if (net_generic(dp->dl_stid.sc_client->net, nfsd_net_id) != nn)
@@ -3435,7 +3436,7 @@ nfs4_laundromat(struct nfsd_net *nn)
}
list_move(&dp->dl_recall_lru, &reaplist);
}
- spin_unlock(&recall_lock);
+ spin_unlock(&state_lock);
list_for_each_safe(pos, next, &reaplist) {
dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
revoke_delegation(dp);
@@ -4900,7 +4901,7 @@ static u64 nfsd_find_all_delegations(struct nfs4_client *clp, u64 max,
struct nfs4_delegation *dp, *next;
u64 count = 0;

- lockdep_assert_held(&recall_lock);
+ lockdep_assert_held(&state_lock);
list_for_each_entry_safe(dp, next, &clp->cl_delegations, dl_perclnt) {
if (victims)
list_move(&dp->dl_recall_lru, victims);
@@ -4916,9 +4917,9 @@ u64 nfsd_forget_client_delegations(struct nfs4_client *clp, u64 max)
LIST_HEAD(victims);
u64 count;

- spin_lock(&recall_lock);
+ spin_lock(&state_lock);
count = nfsd_find_all_delegations(clp, max, &victims);
- spin_unlock(&recall_lock);
+ spin_unlock(&state_lock);

list_for_each_entry_safe(dp, next, &victims, dl_recall_lru)
revoke_delegation(dp);
@@ -4932,11 +4933,11 @@ u64 nfsd_recall_client_delegations(struct nfs4_client *clp, u64 max)
LIST_HEAD(victims);
u64 count;

- spin_lock(&recall_lock);
+ spin_lock(&state_lock);
count = nfsd_find_all_delegations(clp, max, &victims);
list_for_each_entry_safe(dp, next, &victims, dl_recall_lru)
nfsd_break_one_deleg(dp);
- spin_unlock(&recall_lock);
+ spin_unlock(&state_lock);

return count;
}
@@ -4945,9 +4946,9 @@ u64 nfsd_print_client_delegations(struct nfs4_client *clp, u64 max)
{
u64 count = 0;

- spin_lock(&recall_lock);
+ spin_lock(&state_lock);
count = nfsd_find_all_delegations(clp, max, NULL);
- spin_unlock(&recall_lock);
+ spin_unlock(&state_lock);

nfsd_print_count(clp, count, "delegations");
return count;
@@ -5158,12 +5159,12 @@ nfs4_state_shutdown_net(struct net *net)

nfs4_lock_state();
INIT_LIST_HEAD(&reaplist);
- spin_lock(&recall_lock);
+ spin_lock(&state_lock);
list_for_each_safe(pos, next, &nn->del_recall_lru) {
dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
list_move(&dp->dl_recall_lru, &reaplist);
}
- spin_unlock(&recall_lock);
+ spin_unlock(&state_lock);
list_for_each_safe(pos, next, &reaplist) {
dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
destroy_delegation(dp);
--
1.9.3


2014-05-30 13:09:48

by Jeff Layton

Subject: [PATCH 3/9] nfsd4: use recall_lock for delegation hashing

From: Benny Halevy <[email protected]>

This fixes a bug in the handling of the fi_delegations list.

nfs4_setlease does not hold the recall_lock when adding to it. The
client_mutex is held, which protects against concurrent list changes,
but nfsd_break_deleg_cb does not hold the client_mutex while walking
the list. New delegations could theoretically creep onto the list
while we're walking it there.

Signed-off-by: Benny Halevy <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 20 +++++++++++++++-----
1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index be6199cc8711..96bdd90b96b1 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -433,12 +433,21 @@ static void unhash_stid(struct nfs4_stid *s)
s->sc_type = 0;
}

+static void
+hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
+{
+ lockdep_assert_held(&recall_lock);
+
+ list_add(&dp->dl_perfile, &fp->fi_delegations);
+ list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
+}
+
/* Called under the state lock. */
static void
unhash_delegation(struct nfs4_delegation *dp)
{
- list_del_init(&dp->dl_perclnt);
spin_lock(&recall_lock);
+ list_del_init(&dp->dl_perclnt);
list_del_init(&dp->dl_perfile);
list_del_init(&dp->dl_recall_lru);
spin_unlock(&recall_lock);
@@ -3053,11 +3062,12 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
status = vfs_setlease(fl->fl_file, fl->fl_type, &fl);
if (status)
goto out_free;
- list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
fp->fi_lease = fl;
fp->fi_deleg_file = get_file(fl->fl_file);
atomic_set(&fp->fi_delegees, 1);
- list_add(&dp->dl_perfile, &fp->fi_delegations);
+ spin_lock(&recall_lock);
+ hash_delegation_locked(dp, fp);
+ spin_unlock(&recall_lock);
return 0;
out_free:
locks_free_lock(fl);
@@ -3078,9 +3088,8 @@ static int nfs4_set_delegation(struct nfs4_delegation *dp, struct nfs4_file *fp)
spin_unlock(&recall_lock);
return -EAGAIN;
}
- list_add(&dp->dl_perfile, &fp->fi_delegations);
+ hash_delegation_locked(dp, fp);
spin_unlock(&recall_lock);
- list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
return 0;
}

@@ -4891,6 +4900,7 @@ static u64 nfsd_find_all_delegations(struct nfs4_client *clp, u64 max,
struct nfs4_delegation *dp, *next;
u64 count = 0;

+ lockdep_assert_held(&recall_lock);
list_for_each_entry_safe(dp, next, &clp->cl_delegations, dl_perclnt) {
if (victims)
list_move(&dp->dl_recall_lru, victims);
--
1.9.3


2014-06-05 16:27:23

by Jeff Layton

Subject: Re: [PATCH 8/9] NFSd: Protect addition to the file_hashtbl

On Thu, 5 Jun 2014 12:18:09 -0400
Trond Myklebust <[email protected]> wrote:

> On Thu, Jun 5, 2014 at 12:12 PM, J. Bruce Fields <[email protected]> wrote:
> > On Fri, May 30, 2014 at 09:09:32AM -0400, Jeff Layton wrote:
> >> From: Trond Myklebust <[email protected]>
> >>
> >> Ensure that we can only have a single struct nfs4_file per inode
> >> in the file_hashtbl, and make addition atomic with respect to lookup.
> >>
> >> Signed-off-by: Trond Myklebust <[email protected]>
> >> ---
> >> fs/nfsd/nfs4state.c | 46 +++++++++++++++++++++++++++++++++++-----------
> >> 1 file changed, 35 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> >> index a500033a2f87..553c2d6d48dc 100644
> >> --- a/fs/nfsd/nfs4state.c
> >> +++ b/fs/nfsd/nfs4state.c
> >> @@ -2519,6 +2519,8 @@ static void nfsd4_init_file(struct nfs4_file *fp, struct inode *ino)
> >> {
> >> unsigned int hashval = file_hashval(ino);
> >>
> >> + lockdep_assert_held(&state_lock);
> >> +
> >
> > Oops, lockdep points out we overlooked a deadlock here: this function
> > also calls igrab(), which takes the i_lock, the reverse ordering from
> > what we take in the delegation-break case.
> >
> > Dropping this patch for now.
> >
>
> This was the reason for the delegation recall locking changes which
> are also part of the series.
>
> That said, why do we need igrab here as opposed to just ihold()?
>

Yeah, that would be a lot simpler. We certainly already have a
reference by virtue of the dentry in the filehandle, so ihold() is
reasonable. I'll change it to use that.

--
Jeff Layton <[email protected]>

2014-06-02 18:20:27

by Christoph Hellwig

Subject: Re: [PATCH 9/9] NFSd: protect delegation setup with the i_lock

On Mon, Jun 02, 2014 at 10:17:33AM -0400, Jeff Layton wrote:
> Ok, fair enough. A later patch in the series adds a per nfs4_file lock
> (the fi_lock) that we can use for this purpose in lieu of the i_lock.
> Nesting that within the i_lock shouldn't be too painful.

i_lock is part of some very complex locking hierarchies in the inode
and dentry caches. Long term I'd prefer to get the file locking code
untangled from that, but let's keep it simple and nest the nfs4_file
lock inside for now.

> We could do that within the rpc_call_prepare operation, but the main
> problem there is that the delegation could leak if the rpc task
> allocation fails. Would you be amenable to adding a "cb_prepare"
> operation or something that we could use to run things from the
> workqueue before the actual callback runs?

I guess we'll have to do it then. I'll see if it can be done nicely.

I suspect the root problem of all this is that the file locking code
calls the lock manager ops with i_lock held, which isn't very nice.


2014-06-02 18:48:30

by Jeff Layton

Subject: Re: [PATCH 9/9] NFSd: protect delegation setup with the i_lock

On Mon, 2 Jun 2014 11:20:26 -0700
Christoph Hellwig <[email protected]> wrote:

> On Mon, Jun 02, 2014 at 10:17:33AM -0400, Jeff Layton wrote:
> > Ok, fair enough. A later patch in the series adds a per nfs4_file lock
> > (the fi_lock) that we can use for this purpose in lieu of the i_lock.
> > Nesting that within the i_lock shouldn't be too painful.
>
> i_lock is part of some very complex locking hierarchies in the inode
> and dentry caches. Long term I'd prefer to get the file locking code
> untangled from that, but let's keep it simple and nest the nfs4_file
> lock inside for now.
>
> > We could do that within the rpc_call_prepare operation, but the main
> > problem there is that the delegation could leak if the rpc task
> > allocation fails. Would you be amenable to adding a "cb_prepare"
> > operation or something that we could use to run things from the
> > workqueue before the actual callback runs?
>
> I guess we'll have to do it then. I'll see if it can be done nicely.
>

There is another option, but it's sort of ugly...

We could try to queue the thing in the rpc_call_prepare op, and if that
never runs, do it in the rpc_release callback instead. The
tricky part is figuring out whether we still need to queue it in
rpc_release.

Perhaps we can try to use the dl_time field to indicate that as well so
we won't need to add a separate flag for it.
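A minimal userspace sketch of that idea (hypothetical names, not the actual nfsd code; a plain "queued" flag stands in for reusing dl_time as the marker):

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the scheme above: the delegation gets queued to the recall
 * list either from the prepare path (the normal case) or, if the RPC
 * task was never started, from the release path.  Both check the same
 * marker, so the delegation is queued exactly once and never leaked. */
struct deleg {
	bool queued;	/* stands in for a dl_time-based marker */
	int  on_list;	/* 1 once hashed onto the recall list */
};

/* Normal path: the RPC task runs and prepare queues the delegation. */
static void cb_prepare(struct deleg *dp)
{
	if (!dp->queued) {
		dp->queued = true;
		dp->on_list = 1;
	}
}

/* Teardown path: always runs; queues the delegation only if prepare
 * never got the chance to (e.g. task allocation failed). */
static void cb_release(struct deleg *dp)
{
	if (!dp->queued) {
		dp->queued = true;
		dp->on_list = 1;
	}
}
```

Either way the delegation ends up on the list exactly once, which is the property the dl_time trick is after.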

> I suspect the root problem of all this is that the file locking code
> calls the lock manager ops with i_lock held, which isn't very nice.
>

Yeah, it would be nice if it didn't, but it's better than it was. It
used to use a global spinlock for that instead of the i_lock, but that
was also a clear scalability problem...

--
Jeff Layton <[email protected]>

2014-06-05 16:18:10

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH 8/9] NFSd: Protect addition to the file_hashtbl

On Thu, Jun 5, 2014 at 12:12 PM, J. Bruce Fields <[email protected]> wrote:
> On Fri, May 30, 2014 at 09:09:32AM -0400, Jeff Layton wrote:
>> From: Trond Myklebust <[email protected]>
>>
>> Ensure that we can only have a single struct nfs4_file per inode
>> in the file_hashtbl and make addition atomic with respect to lookup.
>>
>> Signed-off-by: Trond Myklebust <[email protected]>
>> ---
>> fs/nfsd/nfs4state.c | 46 +++++++++++++++++++++++++++++++++++-----------
>> 1 file changed, 35 insertions(+), 11 deletions(-)
>>
>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>> index a500033a2f87..553c2d6d48dc 100644
>> --- a/fs/nfsd/nfs4state.c
>> +++ b/fs/nfsd/nfs4state.c
>> @@ -2519,6 +2519,8 @@ static void nfsd4_init_file(struct nfs4_file *fp, struct inode *ino)
>> {
>> unsigned int hashval = file_hashval(ino);
>>
>> + lockdep_assert_held(&state_lock);
>> +
>
> Oops, lockdep points out we overlooked a deadlock here: this function
> also calls igrab(), which takes the i_lock, the reverse ordering from
> what we take in the delegation-break case.
>
> Dropping this patch for now.
>

This was the reason for the delegation recall locking changes which
are also part of the series.

That said, why do we need igrab here as opposed to just ihold()?

--
Trond Myklebust

Linux NFS client maintainer, PrimaryData

[email protected]

2014-06-05 16:12:23

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 8/9] NFSd: Protect addition to the file_hashtbl

On Fri, May 30, 2014 at 09:09:32AM -0400, Jeff Layton wrote:
> From: Trond Myklebust <[email protected]>
>
> Ensure that we can only have a single struct nfs4_file per inode
> in the file_hashtbl and make addition atomic with respect to lookup.
>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfsd/nfs4state.c | 46 +++++++++++++++++++++++++++++++++++-----------
> 1 file changed, 35 insertions(+), 11 deletions(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index a500033a2f87..553c2d6d48dc 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -2519,6 +2519,8 @@ static void nfsd4_init_file(struct nfs4_file *fp, struct inode *ino)
> {
> unsigned int hashval = file_hashval(ino);
>
> + lockdep_assert_held(&state_lock);
> +

Oops, lockdep points out we overlooked a deadlock here: this function
also calls igrab(), which takes the i_lock, the reverse ordering from
what we take in the delegation-break case.

Dropping this patch for now.

--b.

> atomic_set(&fp->fi_ref, 1);
> INIT_LIST_HEAD(&fp->fi_stateids);
> INIT_LIST_HEAD(&fp->fi_delegations);
> @@ -2527,9 +2529,7 @@ static void nfsd4_init_file(struct nfs4_file *fp, struct inode *ino)
> fp->fi_lease = NULL;
> memset(fp->fi_fds, 0, sizeof(fp->fi_fds));
> memset(fp->fi_access, 0, sizeof(fp->fi_access));
> - spin_lock(&state_lock);
> hlist_add_head(&fp->fi_hash, &file_hashtbl[hashval]);
> - spin_unlock(&state_lock);
> }
>
> void
> @@ -2695,23 +2695,49 @@ find_openstateowner_str(unsigned int hashval, struct nfsd4_open *open,
>
> /* search file_hashtbl[] for file */
> static struct nfs4_file *
> -find_file(struct inode *ino)
> +find_file_locked(struct inode *ino)
> {
> unsigned int hashval = file_hashval(ino);
> struct nfs4_file *fp;
>
> - spin_lock(&state_lock);
> + lockdep_assert_held(&state_lock);
> +
> hlist_for_each_entry(fp, &file_hashtbl[hashval], fi_hash) {
> if (fp->fi_inode == ino) {
> get_nfs4_file(fp);
> - spin_unlock(&state_lock);
> return fp;
> }
> }
> - spin_unlock(&state_lock);
> return NULL;
> }
>
> +static struct nfs4_file *
> +find_file(struct inode *ino)
> +{
> + struct nfs4_file *fp;
> +
> + spin_lock(&state_lock);
> + fp = find_file_locked(ino);
> + spin_unlock(&state_lock);
> + return fp;
> +}
> +
> +static struct nfs4_file *
> +find_or_add_file(struct inode *ino, struct nfs4_file *new)
> +{
> + struct nfs4_file *fp;
> +
> + spin_lock(&state_lock);
> + fp = find_file_locked(ino);
> + if (fp == NULL) {
> + nfsd4_init_file(new, ino);
> + fp = new;
> + }
> + spin_unlock(&state_lock);
> +
> + return fp;
> +}
> +
> /*
> * Called to check deny when READ with all zero stateid or
> * WRITE with all zero or all one stateid
> @@ -3230,21 +3256,19 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
> * and check for delegations in the process of being recalled.
> * If not found, create the nfs4_file struct
> */
> - fp = find_file(ino);
> - if (fp) {
> + fp = find_or_add_file(ino, open->op_file);
> + if (fp != open->op_file) {
> if ((status = nfs4_check_open(fp, open, &stp)))
> goto out;
> status = nfs4_check_deleg(cl, open, &dp);
> if (status)
> goto out;
> } else {
> + open->op_file = NULL;
> status = nfserr_bad_stateid;
> if (nfsd4_is_deleg_cur(open))
> goto out;
> status = nfserr_jukebox;
> - fp = open->op_file;
> - open->op_file = NULL;
> - nfsd4_init_file(fp, ino);
> }
>
> /*
> --
> 1.9.3
>
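The find_or_add_file() helper in the patch above boils down to a lookup-or-insert done under a single lock, so at most one object can ever be hashed per key. A minimal userspace model (illustrative names and a pthread mutex standing in for the state_lock spinlock; not kernel API):

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct file_obj {
	unsigned long	key;	/* stands in for the inode pointer */
	int		refs;
	struct file_obj	*next;
};

static struct file_obj *hash_head;
static pthread_mutex_t hash_lock = PTHREAD_MUTEX_INITIALIZER;

/* Must be called with hash_lock held, like find_file_locked(). */
static struct file_obj *find_locked(unsigned long key)
{
	struct file_obj *fp;

	for (fp = hash_head; fp; fp = fp->next) {
		if (fp->key == key) {
			fp->refs++;	/* like get_nfs4_file() */
			return fp;
		}
	}
	return NULL;
}

/* Return an existing object for @key, or hash @newfp and return it.
 * The caller learns which happened by comparing the result to @newfp,
 * exactly as nfsd4_process_open2() compares against open->op_file. */
static struct file_obj *find_or_add(unsigned long key,
				    struct file_obj *newfp)
{
	struct file_obj *fp;

	pthread_mutex_lock(&hash_lock);
	fp = find_locked(key);
	if (fp == NULL) {
		newfp->key = key;
		newfp->refs = 1;
		newfp->next = hash_head;
		hash_head = newfp;
		fp = newfp;
	}
	pthread_mutex_unlock(&hash_lock);
	return fp;
}
```

Because the lookup and the insertion happen under one critical section, two racing callers for the same key can never both insert; the loser simply finds the winner's object with an elevated refcount.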

2014-06-02 14:17:37

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH 9/9] NFSd: protect delegation setup with the i_lock

On Mon, 2 Jun 2014 01:46:16 -0700
Christoph Hellwig <[email protected]> wrote:

> On Fri, May 30, 2014 at 09:09:33AM -0400, Jeff Layton wrote:
> > From: Trond Myklebust <[email protected]>
> >
> > state_lock is a heavily contended global lock. We don't want to grab
> > that while simultaneously holding the inode->i_lock. Avoid doing that in
> > the delegation break callback by ensuring that we add/remove the
> > dl_perfile under the inode->i_lock.
>
> The current code never takes the state lock (or recall lock) under
> i_lock. Non-trivial use of i_lock in nfsd is only added in this patch.
>
> >
> > This however, can cause an ABBA deadlock between the state_lock and the
> > i_lock. Work around that by doing the list manipulation in the delegation
> > recall from the workqueue context prior to starting the rpc call.
>
> I very much dislike the patch for multiple reasons. For one thing
> nfsd currently stays away from i_lock, which is a VFS (and sometimes
> abused by the fs) lock, except for a single case in the file locking
> code which really should be factored into common code. I'd rather
> have locks in nfsd than start using i_lock in nfsd and make auditing
> its use and its lock hierarchies more complex by introducing usage
> outside of the VFS.
>

Ok, fair enough. A later patch in the series adds a per nfs4_file lock
(the fi_lock) that we can use for this purpose in lieu of the i_lock.
Nesting that within the i_lock shouldn't be too painful.

> Second I really don't like pushing the delegation setup into the
> callback workqueue. I have a patchset refactoring the callback code
> to allow adding more callbacks without lots of copy & paste, and except
> for this new addition the code in the workqueue already is almost
> perfectly generic for arbitrary callbacks.
>
> Please add a lock in struct nfs4_file to protect its own members
> instead.
>

This is a problem however. The del_recall_lru list is a per-namespace
list, so we need a lock that is either global or per-namespace to add
things onto it. The point of this patch is to get that extra locking
out of the codepath where the i_lock is already held.

We could do that within the rpc_call_prepare operation, but the main
problem there is that the delegation could leak if the rpc task
allocation fails. Would you be amenable to adding a "cb_prepare"
operation or something that we could use to run things from the
workqueue before the actual callback runs?

--
Jeff Layton <[email protected]>

2014-06-04 19:56:44

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 0/9] nfsd: bugfixes and preliminary patches for client_mutex removal

On Fri, May 30, 2014 at 09:09:24AM -0400, Jeff Layton wrote:
> Hi Bruce!
>
> This patchset contains some bugfixes and preliminary patches for
> the client_mutex removal work. I'm sending these out first as they
> either fix bugs or don't add extra locking that's only nested inside
> the client_mutex anyway.
>
> These are based on your nfsd-next tree. Patch #3 fixes a (possibly
> minor?) bug in the fi_delegations list handling, and might be suitable
> for stable based on that. I haven't cc'ed stable on it as I can't quite
> convince myself it's a real problem, but it might be reasonable to do
> so anyway.

It looks innocent enough, so I'm adding the stable cc:.

I'm dropping patch 9 ("NFSd: protect delegation setup with the i_lock")
and applying the others.

--b.

>
> Benny Halevy (3):
> nfsd4: use recall_lock for delegation hashing
> nfsd4: rename recall_lock to state_lock
> nfsd4: hash deleg stateid only on successful nfs4_set_delegation
>
> Jeff Layton (4):
> nfsd: make nfsd4_encode_fattr static
> nfsd: fix laundromat next-run-time calculation
> nfsd: fix setting of NFS4_OO_CONFIRMED in nfsd4_open
> nfsd: remove unneeded zeroing of fields in nfsd4_proc_compound
>
> Trond Myklebust (2):
> NFSd: Protect addition to the file_hashtbl
> NFSd: protect delegation setup with the i_lock
>
> fs/nfsd/nfs4callback.c | 18 +++++-
> fs/nfsd/nfs4proc.c | 6 +-
> fs/nfsd/nfs4state.c | 168 ++++++++++++++++++++++++++++++-------------------
> fs/nfsd/nfs4xdr.c | 2 +-
> fs/nfsd/state.h | 1 +
> 5 files changed, 124 insertions(+), 71 deletions(-)
>
> --
> 1.9.3
>

2014-06-02 08:46:18

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 9/9] NFSd: protect delegation setup with the i_lock

On Fri, May 30, 2014 at 09:09:33AM -0400, Jeff Layton wrote:
> From: Trond Myklebust <[email protected]>
>
> state_lock is a heavily contended global lock. We don't want to grab
> that while simultaneously holding the inode->i_lock. Avoid doing that in
> the delegation break callback by ensuring that we add/remove the
> dl_perfile under the inode->i_lock.

The current code never takes the state lock (or recall lock) under
i_lock. Non-trivial use of i_lock in nfsd is only added in this patch.

>
> This however, can cause an ABBA deadlock between the state_lock and the
> i_lock. Work around that by doing the list manipulation in the delegation
> recall from the workqueue context prior to starting the rpc call.

I very much dislike the patch for multiple reasons. For one thing
nfsd currently stays away from i_lock, which is a VFS (and sometimes
abused by the fs) lock, except for a single case in the file locking
code which really should be factored into common code. I'd rather
have locks in nfsd than start using i_lock in nfsd and make auditing
its use and its lock hierarchies more complex by introducing usage
outside of the VFS.

Second I really don't like pushing the delegation setup into the
callback workqueue. I have a patchset refactoring the callback code
to allow adding more callbacks without lots of copy & paste, and except
for this new addition the code in the workqueue already is almost
perfectly generic for arbitrary callbacks.

Please add a lock in struct nfs4_file to protect its own members
instead.