2014-07-30 01:34:49

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 00/38] nfsd: stateid and stateowner refcounting overhaul

v3:
- fix intermediate regression in nfsd4_close_open_stateid

v2:
- rename functions from *_generic_stateid to *_ol_stateid
- eliminate some nfs4_put_* wrappers around nfs4_put_stid
- clean up lock stateid allocation and initialization
- reorder patches to reduce churn
- handle open/lock stateids individually in nfsd4_free_stateid
- add ops struct for stateowners
- combine v4.0 replay cache patches

Here is the next swath of patches for the nfsd client_mutex removal
series. The main focus of this series is to add refcounting to open and
lock stateids and to use those to drive their management and eventual
destruction. Additionally, we add refcounting to the open and lock
stateowners and use that refcount to manage their lifecycle as well.

Most of the new locking here is superfluous until we remove the
client_mutex. This series should apply cleanly to Bruce's for-3.17
branch now that the delegation patches have been merged.

Jeff Layton (20):
nfsd: Cleanup the freeing of stateids
nfsd4: use cl_lock to synchronize all stateid idr calls
nfsd: do filp_close in sc_free callback for lock stateids
nfsd: Add locking to protect the state owner lists
nfsd: clean up races in lock stateid searching and creation
nfsd: ensure atomicity in nfsd4_free_stateid and
nfsd4_validate_stateid
nfsd: Add reference counting to state owners
nfsd: Add a mutex to protect the NFSv4.0 open owner replay cache
nfsd: clean up lockowner refcounting when finding them
nfsd: add an operation for unhashing a stateowner
nfsd: clean up refcounting for lockowners
nfsd: make openstateids hold references to their openowners
nfsd: don't allow CLOSE to proceed until refcount on stateid drops
nfsd: clean up and reorganize release_lockowner
nfsd: add locking to stateowner release
nfsd: optimize destroy_lockowner cl_lock thrashing
nfsd: close potential race in nfsd4_free_stateid
nfsd: reduce cl_lock thrashing in release_openowner
nfsd: don't thrash the cl_lock while freeing an open stateid
nfsd: rename unhash_generic_stateid to unhash_ol_stateid

Trond Myklebust (18):
nfsd: Add reference counting to the lock and open stateids
nfsd: Add a struct nfs4_file field to struct nfs4_stid
nfsd: Replace nfs4_ol_stateid->st_file with the st_stid.sc_file
nfsd: Add reference counting to lock stateids
nfsd: nfsd4_locku() must reference the lock stateid
nfsd: Ensure that nfs4_open_delegation() references the delegation
stateid
nfsd: nfsd4_process_open2() must reference the delegation stateid
nfsd: nfsd4_process_open2() must reference the open stateid
nfsd: Prepare nfsd4_close() for open stateid referencing
nfsd: nfsd4_open_confirm() must reference the open stateid
nfsd: Add reference counting to nfs4_preprocess_confirmed_seqid_op
nfsd: Migrate the stateid reference into nfs4_preprocess_seqid_op
nfsd: Migrate the stateid reference into nfs4_lookup_stateid()
nfsd: Migrate the stateid reference into nfs4_find_stateid_by_type()
nfsd: Make lock stateid take a reference to the lockowner
nfsd: Protect adding/removing open state owners using client_lock
nfsd: Protect adding/removing lock owners using client_lock
nfsd: Move the open owner hash table into struct nfs4_client

fs/nfsd/netns.h | 1 -
fs/nfsd/nfs4callback.c | 4 +-
fs/nfsd/nfs4proc.c | 12 +-
fs/nfsd/nfs4state.c | 1014 +++++++++++++++++++++++++++++++-----------------
fs/nfsd/nfs4xdr.c | 2 -
fs/nfsd/state.h | 32 +-
fs/nfsd/xdr4.h | 5 +-
7 files changed, 695 insertions(+), 375 deletions(-)

--
1.9.3



2014-07-30 01:35:40

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 33/38] nfsd: add locking to stateowner release

Once we remove the client_mutex, we'll need to properly protect
the stateowner reference counts using the cl_lock.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index cd7d7df03afa..9b342e164407 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -937,9 +937,14 @@ release_all_access(struct nfs4_ol_stateid *stp)

static void nfs4_put_stateowner(struct nfs4_stateowner *sop)
{
- if (!atomic_dec_and_test(&sop->so_count))
+ struct nfs4_client *clp = sop->so_client;
+
+ might_lock(&clp->cl_lock);
+
+ if (!atomic_dec_and_lock(&sop->so_count, &clp->cl_lock))
return;
sop->so_ops->so_unhash(sop);
+ spin_unlock(&clp->cl_lock);
kfree(sop->so_owner.data);
sop->so_ops->so_free(sop);
}
@@ -3078,11 +3083,7 @@ static void hash_openowner(struct nfs4_openowner *oo, struct nfs4_client *clp, u

static void nfs4_unhash_openowner(struct nfs4_stateowner *so)
{
- struct nfs4_client *clp = so->so_client;
-
- spin_lock(&clp->cl_lock);
unhash_openowner_locked(openowner(so));
- spin_unlock(&clp->cl_lock);
}

static void nfs4_free_openowner(struct nfs4_stateowner *so)
@@ -4842,11 +4843,7 @@ find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,

static void nfs4_unhash_lockowner(struct nfs4_stateowner *sop)
{
- struct nfs4_client *clp = sop->so_client;
-
- spin_lock(&clp->cl_lock);
unhash_lockowner_locked(lockowner(sop));
- spin_unlock(&clp->cl_lock);
}

static void nfs4_free_lockowner(struct nfs4_stateowner *sop)
--
1.9.3


2014-07-30 01:35:31

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 27/38] nfsd: make openstateids hold references to their openowners

Change it so that only openstateids hold persistent references to
openowners. References can still be held by compounds in progress.

With this, we can get rid of NFS4_OO_NEW. It's possible that we
will create a new openowner in the process of doing the open, but
something later fails. In the meantime, another task could find
that openowner and start using it on a successful open. If that
occurs we don't necessarily want to tear it down, just put the
reference that the failing compound holds.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 73 +++++++++++++++++++++++------------------------------
fs/nfsd/state.h | 1 -
2 files changed, 32 insertions(+), 42 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index c86fe66254b0..b61319401826 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -916,6 +916,8 @@ static void nfs4_free_ol_stateid(struct nfs4_stid *stid)
struct nfs4_ol_stateid *stp = openlockstateid(stid);

release_all_access(stp);
+ if (stp->st_stateowner)
+ nfs4_put_stateowner(stp->st_stateowner);
kmem_cache_free(stateid_slab, stid);
}

@@ -928,8 +930,6 @@ static void nfs4_free_lock_stateid(struct nfs4_stid *stid)
file = find_any_file(stp->st_stid.sc_file);
if (file)
filp_close(file, (fl_owner_t)lo);
- if (stp->st_stateowner)
- nfs4_put_stateowner(stp->st_stateowner);
nfs4_free_ol_stateid(stid);
}

@@ -1008,8 +1008,9 @@ static void release_last_closed_stateid(struct nfs4_openowner *oo)
struct nfs4_ol_stateid *s = oo->oo_last_closed_stid;

if (s) {
- nfs4_put_stid(&s->st_stid);
+ list_del_init(&oo->oo_close_lru);
oo->oo_last_closed_stid = NULL;
+ nfs4_put_stid(&s->st_stid);
}
}

@@ -1028,7 +1029,6 @@ static void release_openowner(struct nfs4_openowner *oo)
{
unhash_openowner(oo);
release_openowner_stateids(oo);
- list_del(&oo->oo_close_lru);
release_last_closed_stateid(oo);
nfs4_put_stateowner(&oo->oo_owner);
}
@@ -1497,6 +1497,7 @@ destroy_client(struct nfs4_client *clp)
}
while (!list_empty(&clp->cl_openowners)) {
oo = list_entry(clp->cl_openowners.next, struct nfs4_openowner, oo_perclient);
+ atomic_inc(&oo->oo_owner.so_count);
release_openowner(oo);
}
nfsd4_shutdown_callback(clp);
@@ -3024,7 +3025,7 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
oo->oo_owner.so_ops = &openowner_ops;
oo->oo_owner.so_is_open_owner = 1;
oo->oo_owner.so_seqid = open->op_seqid;
- oo->oo_flags = NFS4_OO_NEW;
+ oo->oo_flags = 0;
if (nfsd4_has_session(cstate))
oo->oo_flags |= NFS4_OO_CONFIRMED;
oo->oo_time = 0;
@@ -3041,6 +3042,7 @@ static void init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
stp->st_stid.sc_type = NFS4_OPEN_STID;
INIT_LIST_HEAD(&stp->st_locks);
stp->st_stateowner = &oo->oo_owner;
+ atomic_inc(&stp->st_stateowner->so_count);
get_nfs4_file(fp);
stp->st_stid.sc_file = fp;
stp->st_access_bmap = 0;
@@ -3054,13 +3056,27 @@ static void init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
spin_unlock(&oo->oo_owner.so_client->cl_lock);
}

+/*
+ * In the 4.0 case we need to keep the owners around a little while to handle
+ * CLOSE replay. We still do need to release any file access that is held by
+ * them before returning however.
+ */
static void
-move_to_close_lru(struct nfs4_openowner *oo, struct net *net)
+move_to_close_lru(struct nfs4_ol_stateid *s, struct net *net)
{
- struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+ struct nfs4_openowner *oo = openowner(s->st_stateowner);
+ struct nfsd_net *nn = net_generic(s->st_stid.sc_client->net,
+ nfsd_net_id);

dprintk("NFSD: move_to_close_lru nfs4_openowner %p\n", oo);

+ release_all_access(s);
+ if (s->st_stid.sc_file) {
+ put_nfs4_file(s->st_stid.sc_file);
+ s->st_stid.sc_file = NULL;
+ }
+ release_last_closed_stateid(oo);
+ oo->oo_last_closed_stid = s;
list_move_tail(&oo->oo_close_lru, &nn->close_lru);
oo->oo_time = get_seconds();
}
@@ -3091,6 +3107,7 @@ find_openstateowner_str(unsigned int hashval, struct nfsd4_open *open,
if ((bool)clp->cl_minorversion != sessions)
return NULL;
renew_client(oo->oo_owner.so_client);
+ atomic_inc(&oo->oo_owner.so_count);
return oo;
}
}
@@ -3887,19 +3904,10 @@ void nfsd4_cleanup_open_state(struct nfsd4_compound_state *cstate,
struct nfsd4_open *open, __be32 status)
{
if (open->op_openowner) {
- struct nfs4_openowner *oo = open->op_openowner;
-
- if (!list_empty(&oo->oo_owner.so_stateids))
- list_del_init(&oo->oo_close_lru);
- if (oo->oo_flags & NFS4_OO_NEW) {
- if (status) {
- release_openowner(oo);
- open->op_openowner = NULL;
- } else
- oo->oo_flags &= ~NFS4_OO_NEW;
- }
- if (open->op_openowner)
- nfsd4_cstate_assign_replay(cstate, &oo->oo_owner);
+ struct nfs4_stateowner *so = &open->op_openowner->oo_owner;
+
+ nfsd4_cstate_assign_replay(cstate, so);
+ nfs4_put_stateowner(so);
}
if (open->op_file)
nfsd4_free_file(open->op_file);
@@ -4015,7 +4023,7 @@ nfs4_laundromat(struct nfsd_net *nn)
new_timeo = min(new_timeo, t);
break;
}
- release_openowner(oo);
+ release_last_closed_stateid(oo);
}
new_timeo = max_t(time_t, new_timeo, NFSD_LAUNDROMAT_MINTIMEOUT);
nfs4_unlock_state();
@@ -4580,31 +4588,14 @@ out:
static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
{
struct nfs4_client *clp = s->st_stid.sc_client;
- struct nfs4_openowner *oo = openowner(s->st_stateowner);

s->st_stid.sc_type = NFS4_CLOSED_STID;
unhash_open_stateid(s);

- if (clp->cl_minorversion) {
- if (list_empty(&oo->oo_owner.so_stateids))
- release_openowner(oo);
+ if (clp->cl_minorversion)
nfs4_put_stid(&s->st_stid);
- } else {
- /*
- * In the 4.0 case we need to keep the owners around a
- * little while to handle CLOSE replay. We still do need
- * to release any file access that is held by them
- * before returning however.
- */
- release_all_access(s);
- if (s->st_stid.sc_file) {
- put_nfs4_file(s->st_stid.sc_file);
- s->st_stid.sc_file = NULL;
- }
- oo->oo_last_closed_stid = s;
- if (list_empty(&oo->oo_owner.so_stateids))
- move_to_close_lru(oo, clp->net);
- }
+ else
+ move_to_close_lru(s, clp->net);
}

/*
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 232246039db0..e073c86f389c 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -367,7 +367,6 @@ struct nfs4_openowner {
struct nfs4_ol_stateid *oo_last_closed_stid;
time_t oo_time; /* time of placement on so_close_lru */
#define NFS4_OO_CONFIRMED 1
-#define NFS4_OO_NEW 4
unsigned char oo_flags;
};

--
1.9.3


2014-07-30 01:35:44

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 36/38] nfsd: reduce cl_lock thrashing in release_openowner

Releasing an openowner is a bit inefficient as it can potentially thrash
the cl_lock if you have a lot of stateids attached to it. Once we remove
the client_mutex, it'll also potentially be dangerous to do this.

Add some functions to make it easier to defer the part of putting a
generic stateid reference that needs to be done outside the cl_lock while
doing the parts that must be done while holding it under a single lock.

First we unhash each open stateid. Then we call
put_generic_stateid_locked which will put the reference to an
nfs4_ol_stateid. If it turns out to be the last reference, it'll go
ahead and remove the stid from the IDR tree and put it onto the reaplist
using the st_locks list_head.

Then, after dropping the lock we'll call free_ol_stateid_reaplist to
walk the list of stateids that are fully unhashed and ready to be freed,
and free each of them. This function can sleep, so it must be done
outside any spinlocks.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 96 +++++++++++++++++++++++++++++++++++------------------
1 file changed, 63 insertions(+), 33 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 9c7dcbb68094..879342bc16b2 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -983,6 +983,30 @@ static void nfs4_free_lock_stateid(struct nfs4_stid *stid)
nfs4_free_ol_stateid(stid);
}

+/*
+ * Put the persistent reference to an already unhashed generic stateid, while
+ * holding the cl_lock. If it's the last reference, then put it onto the
+ * reaplist for later destruction.
+ */
+static void put_ol_stateid_locked(struct nfs4_ol_stateid *stp,
+ struct list_head *reaplist)
+{
+ struct nfs4_stid *s = &stp->st_stid;
+ struct nfs4_client *clp = s->sc_client;
+
+ lockdep_assert_held(&clp->cl_lock);
+
+ WARN_ON_ONCE(!list_empty(&stp->st_locks));
+
+ if (!atomic_dec_and_test(&s->sc_count)) {
+ wake_up_all(&close_wq);
+ return;
+ }
+
+ idr_remove(&clp->cl_stateids, s->sc_stateid.si_opaque.so_id);
+ list_add(&stp->st_locks, reaplist);
+}
+
static void unhash_lock_stateid(struct nfs4_ol_stateid *stp)
{
struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);
@@ -1013,6 +1037,25 @@ static void unhash_lockowner_locked(struct nfs4_lockowner *lo)
list_del_init(&lo->lo_owner.so_strhash);
}

+/*
+ * Free a list of generic stateids that were collected earlier after being
+ * fully unhashed.
+ */
+static void
+free_ol_stateid_reaplist(struct list_head *reaplist)
+{
+ struct nfs4_ol_stateid *stp;
+
+ might_sleep();
+
+ while (!list_empty(reaplist)) {
+ stp = list_first_entry(reaplist, struct nfs4_ol_stateid,
+ st_locks);
+ list_del(&stp->st_locks);
+ stp->st_stid.sc_free(&stp->st_stid);
+ }
+}
+
static void release_lockowner(struct nfs4_lockowner *lo)
{
struct nfs4_client *clp = lo->lo_owner.so_client;
@@ -1027,24 +1070,10 @@ static void release_lockowner(struct nfs4_lockowner *lo)
stp = list_first_entry(&lo->lo_owner.so_stateids,
struct nfs4_ol_stateid, st_perstateowner);
unhash_lock_stateid(stp);
- /*
- * We now know that no new references can be added to the
- * stateid. If ours is the last one, finish the unhashing
- * and put it on the list to be reaped.
- */
- if (atomic_dec_and_test(&stp->st_stid.sc_count)) {
- idr_remove(&clp->cl_stateids,
- stp->st_stid.sc_stateid.si_opaque.so_id);
- list_add(&stp->st_locks, &reaplist);
- }
+ put_ol_stateid_locked(stp, &reaplist);
}
spin_unlock(&clp->cl_lock);
- while (!list_empty(&reaplist)) {
- stp = list_first_entry(&reaplist, struct nfs4_ol_stateid,
- st_locks);
- list_del(&stp->st_locks);
- stp->st_stid.sc_free(&stp->st_stid);
- }
+ free_ol_stateid_reaplist(&reaplist);
nfs4_put_stateowner(&lo->lo_owner);
}

@@ -1065,16 +1094,21 @@ static void release_open_stateid_locks(struct nfs4_ol_stateid *open_stp)

static void unhash_open_stateid(struct nfs4_ol_stateid *stp)
{
- spin_lock(&stp->st_stateowner->so_client->cl_lock);
+ lockdep_assert_held(&stp->st_stid.sc_client->cl_lock);
+
unhash_generic_stateid(stp);
release_open_stateid_locks(stp);
- spin_unlock(&stp->st_stateowner->so_client->cl_lock);
}

static void release_open_stateid(struct nfs4_ol_stateid *stp)
{
+ LIST_HEAD(reaplist);
+
+ spin_lock(&stp->st_stid.sc_client->cl_lock);
unhash_open_stateid(stp);
- nfs4_put_stid(&stp->st_stid);
+ put_ol_stateid_locked(stp, &reaplist);
+ spin_unlock(&stp->st_stid.sc_client->cl_lock);
+ free_ol_stateid_reaplist(&reaplist);
}

static void unhash_openowner_locked(struct nfs4_openowner *oo)
@@ -1098,30 +1132,24 @@ static void release_last_closed_stateid(struct nfs4_openowner *oo)
}
}

-static void release_openowner_stateids(struct nfs4_openowner *oo)
+static void release_openowner(struct nfs4_openowner *oo)
{
struct nfs4_ol_stateid *stp;
struct nfs4_client *clp = oo->oo_owner.so_client;
+ struct list_head reaplist;

- lockdep_assert_held(&clp->cl_lock);
+ INIT_LIST_HEAD(&reaplist);

+ spin_lock(&clp->cl_lock);
+ unhash_openowner_locked(oo);
while (!list_empty(&oo->oo_owner.so_stateids)) {
stp = list_first_entry(&oo->oo_owner.so_stateids,
struct nfs4_ol_stateid, st_perstateowner);
- spin_unlock(&clp->cl_lock);
- release_open_stateid(stp);
- spin_lock(&clp->cl_lock);
+ unhash_open_stateid(stp);
+ put_ol_stateid_locked(stp, &reaplist);
}
-}
-
-static void release_openowner(struct nfs4_openowner *oo)
-{
- struct nfs4_client *clp = oo->oo_owner.so_client;
-
- spin_lock(&clp->cl_lock);
- unhash_openowner_locked(oo);
- release_openowner_stateids(oo);
spin_unlock(&clp->cl_lock);
+ free_ol_stateid_reaplist(&reaplist);
release_last_closed_stateid(oo);
nfs4_put_stateowner(&oo->oo_owner);
}
@@ -4675,7 +4703,9 @@ static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
struct nfs4_client *clp = s->st_stid.sc_client;

s->st_stid.sc_type = NFS4_CLOSED_STID;
+ spin_lock(&clp->cl_lock);
unhash_open_stateid(s);
+ spin_unlock(&clp->cl_lock);

if (clp->cl_minorversion)
nfs4_put_stid(&s->st_stid);
--
1.9.3


2014-07-30 01:34:59

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 06/38] nfsd: do filp_close in sc_free callback for lock stateids

Releasing locks when we unhash the stateid instead of doing so only when
the stateid is actually released will be problematic in later patches
when we need to protect the unhashing with spinlocks. Move it into the
sc_free operation instead.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 26 ++++++++++++++++++--------
1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index bb37cc4dd573..8ce5894133aa 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -502,7 +502,7 @@ out_free:
return NULL;
}

-static struct nfs4_ol_stateid * nfs4_alloc_stateid(struct nfs4_client *clp)
+static struct nfs4_ol_stateid * nfs4_alloc_open_stateid(struct nfs4_client *clp)
{
struct nfs4_stid *stid;
struct nfs4_ol_stateid *stp;
@@ -907,16 +907,23 @@ static void nfs4_free_ol_stateid(struct nfs4_stid *stid)
kmem_cache_free(stateid_slab, stid);
}

-static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
+static void nfs4_free_lock_stateid(struct nfs4_stid *stid)
{
+ struct nfs4_ol_stateid *stp = openlockstateid(stid);
+ struct nfs4_lockowner *lo = lockowner(stp->st_stateowner);
struct file *file;

+ file = find_any_file(stp->st_stid.sc_file);
+ if (file)
+ filp_close(file, (fl_owner_t)lo);
+ nfs4_free_ol_stateid(stid);
+}
+
+static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
+{
list_del(&stp->st_locks);
unhash_generic_stateid(stp);
unhash_stid(&stp->st_stid);
- file = find_any_file(stp->st_stid.sc_file);
- if (file)
- filp_close(file, (fl_owner_t)lockowner(stp->st_stateowner));
nfs4_put_stid(&stp->st_stid);
}

@@ -3287,7 +3294,7 @@ new_owner:
return nfserr_jukebox;
open->op_openowner = oo;
alloc_stateid:
- open->op_stp = nfs4_alloc_stateid(clp);
+ open->op_stp = nfs4_alloc_open_stateid(clp);
if (!open->op_stp)
return nfserr_jukebox;
return nfs_ok;
@@ -4703,17 +4710,20 @@ alloc_init_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp,
struct inode *inode,
struct nfs4_ol_stateid *open_stp)
{
+ struct nfs4_stid *s;
struct nfs4_ol_stateid *stp;
struct nfs4_client *clp = lo->lo_owner.so_client;

- stp = nfs4_alloc_stateid(clp);
- if (stp == NULL)
+ s = nfs4_alloc_stid(clp, stateid_slab);
+ if (s == NULL)
return NULL;
+ stp = openlockstateid(s);
stp->st_stid.sc_type = NFS4_LOCK_STID;
list_add(&stp->st_perstateowner, &lo->lo_owner.so_stateids);
stp->st_stateowner = &lo->lo_owner;
get_nfs4_file(fp);
stp->st_stid.sc_file = fp;
+ stp->st_stid.sc_free = nfs4_free_lock_stateid;
stp->st_access_bmap = 0;
stp->st_deny_bmap = open_stp->st_deny_bmap;
stp->st_openstp = open_stp;
--
1.9.3


2014-07-31 20:04:28

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v3 20/38] nfsd: Migrate the stateid reference into nfs4_find_stateid_by_type()

These look OK to me at least through this patch.

--b.

On Tue, Jul 29, 2014 at 09:34:25PM -0400, Jeff Layton wrote:
> From: Trond Myklebust <[email protected]>
>
> Allow nfs4_find_stateid_by_type to take the stateid reference, while
> still holding the &cl->cl_lock. Necessary step toward client_mutex
> removal.
>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfsd/nfs4state.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index a4a49a3b464c..653de6b14665 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1696,8 +1696,12 @@ find_stateid_by_type(struct nfs4_client *cl, stateid_t *t, char typemask)
>
> spin_lock(&cl->cl_lock);
> s = find_stateid_locked(cl, t);
> - if (s != NULL && !(typemask & s->sc_type))
> - s = NULL;
> + if (s != NULL) {
> + if (typemask & s->sc_type)
> + atomic_inc(&s->sc_count);
> + else
> + s = NULL;
> + }
> spin_unlock(&cl->cl_lock);
> return s;
> }
> @@ -3326,8 +3330,6 @@ static struct nfs4_delegation *find_deleg_stateid(struct nfs4_client *cl, statei
> ret = find_stateid_by_type(cl, s, NFS4_DELEG_STID);
> if (!ret)
> return NULL;
> - /* FIXME: move into find_stateid_by_type */
> - atomic_inc(&ret->sc_count);
> return delegstateid(ret);
> }
>
> @@ -4170,8 +4172,6 @@ nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
> *s = find_stateid_by_type(cstate->clp, stateid, typemask);
> if (!*s)
> return nfserr_bad_stateid;
> - /* FIXME: move into find_stateid_by_type */
> - atomic_inc(&(*s)->sc_count);
> return nfs_ok;
> }
>
> --
> 1.9.3
>

2014-07-30 01:35:10

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 13/38] nfsd: nfsd4_process_open2() must reference the delegation stateid

From: Trond Myklebust <[email protected]>

Ensure that nfsd4_process_open2() keeps a reference to the delegation
stateid until it is done working with it. Necessary step toward
client_mutex removal.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/nfs4state.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 2df6af93120e..5cb6305036cd 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3325,6 +3325,8 @@ static struct nfs4_delegation *find_deleg_stateid(struct nfs4_client *cl, statei
ret = find_stateid_by_type(cl, s, NFS4_DELEG_STID);
if (!ret)
return NULL;
+ /* FIXME: move into find_stateid_by_type */
+ atomic_inc(&ret->sc_count);
return delegstateid(ret);
}

@@ -3340,14 +3342,18 @@ nfs4_check_deleg(struct nfs4_client *cl, struct nfsd4_open *open,
{
int flags;
__be32 status = nfserr_bad_stateid;
+ struct nfs4_delegation *deleg;

- *dp = find_deleg_stateid(cl, &open->op_delegate_stateid);
- if (*dp == NULL)
+ deleg = find_deleg_stateid(cl, &open->op_delegate_stateid);
+ if (deleg == NULL)
goto out;
flags = share_access_to_flags(open->op_share_access);
- status = nfs4_check_delegmode(*dp, flags);
- if (status)
- *dp = NULL;
+ status = nfs4_check_delegmode(deleg, flags);
+ if (status) {
+ nfs4_put_stid(&deleg->dl_stid);
+ goto out;
+ }
+ *dp = deleg;
out:
if (!nfsd4_is_deleg_cur(open))
return nfs_ok;
@@ -3828,6 +3834,8 @@ out:
if (!(open->op_openowner->oo_flags & NFS4_OO_CONFIRMED) &&
!nfsd4_has_session(&resp->cstate))
open->op_rflags |= NFS4_OPEN_RESULT_CONFIRM;
+ if (dp)
+ nfs4_put_stid(&dp->dl_stid);

return status;
}
--
1.9.3


2014-07-30 01:35:17

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 18/38] nfsd: Migrate the stateid reference into nfs4_preprocess_seqid_op

From: Trond Myklebust <[email protected]>

Allow nfs4_preprocess_seqid_op to take the stateid reference, instead
of having all the callers do so.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/nfs4state.c | 26 +++++++++++---------------
1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index a777666044f1..b0c0f4cdf503 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4390,8 +4390,11 @@ nfs4_preprocess_seqid_op(struct nfsd4_compound_state *cstate, u32 seqid,
cstate->replay_owner = stp->st_stateowner;

status = nfs4_seqid_op_checks(cstate, stateid, seqid, stp);
- if (!status)
+ if (!status) {
+ /* FIXME: move into find_stateid_by_type */
+ atomic_inc(&stp->st_stid.sc_count);
*stpp = stp;
+ }
return status;
}

@@ -4400,16 +4403,18 @@ static __be32 nfs4_preprocess_confirmed_seqid_op(struct nfsd4_compound_state *cs
{
__be32 status;
struct nfs4_openowner *oo;
+ struct nfs4_ol_stateid *stp;

status = nfs4_preprocess_seqid_op(cstate, seqid, stateid,
- NFS4_OPEN_STID, stpp, nn);
+ NFS4_OPEN_STID, &stp, nn);
if (status)
return status;
- /* FIXME: move into nfs4_preprocess_seqid_op */
- atomic_inc(&(*stpp)->st_stid.sc_count);
- oo = openowner((*stpp)->st_stateowner);
- if (!(oo->oo_flags & NFS4_OO_CONFIRMED))
+ oo = openowner(stp->st_stateowner);
+ if (!(oo->oo_flags & NFS4_OO_CONFIRMED)) {
+ nfs4_put_stid(&stp->st_stid);
return nfserr_bad_stateid;
+ }
+ *stpp = stp;
return nfs_ok;
}

@@ -4436,8 +4441,6 @@ nfsd4_open_confirm(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
NFS4_OPEN_STID, &stp, nn);
if (status)
goto out;
- /* FIXME: move into nfs4_preprocess_seqid_op */
- atomic_inc(&stp->st_stid.sc_count);
oo = openowner(stp->st_stateowner);
status = nfserr_bad_stateid;
if (oo->oo_flags & NFS4_OO_CONFIRMED)
@@ -4587,8 +4590,6 @@ nfsd4_close(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
nfsd4_bump_seqid(cstate, status);
if (status)
goto out;
- /* FIXME: move into nfs4_preprocess_seqid_op */
- atomic_inc(&stp->st_stid.sc_count);
update_stateid(&stp->st_stid.sc_stateid);
memcpy(&close->cl_stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));

@@ -4944,9 +4945,6 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
lock->lk_old_lock_seqid,
&lock->lk_old_lock_stateid,
NFS4_LOCK_STID, &lock_stp, nn);
- /* FIXME: move into nfs4_preprocess_seqid_op */
- if (!status)
- atomic_inc(&lock_stp->st_stid.sc_count);
}
if (status)
goto out;
@@ -5175,8 +5173,6 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
&stp, nn);
if (status)
goto out;
- /* FIXME: move into nfs4_preprocess_seqid_op */
- atomic_inc(&stp->st_stid.sc_count);
filp = find_any_file(stp->st_stid.sc_file);
if (!filp) {
status = nfserr_lock_range;
--
1.9.3


2014-07-30 01:35:29

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 26/38] nfsd: clean up refcounting for lockowners

Ensure that lockowner references are only held by lockstateids and
operations that are in-progress. With this, we can get rid of
release_lockowner_if_empty, which will be racy once we remove
client_mutex protection.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 29 +++++++----------------------
1 file changed, 7 insertions(+), 22 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 88225f0bbc12..c86fe66254b0 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -933,7 +933,7 @@ static void nfs4_free_lock_stateid(struct nfs4_stid *stid)
nfs4_free_ol_stateid(stid);
}

-static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
+static void release_lock_stateid(struct nfs4_ol_stateid *stp)
{
struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);

@@ -957,7 +957,7 @@ static void release_lockowner_stateids(struct nfs4_lockowner *lo)
while (!list_empty(&lo->lo_owner.so_stateids)) {
stp = list_first_entry(&lo->lo_owner.so_stateids,
struct nfs4_ol_stateid, st_perstateowner);
- __release_lock_stateid(stp);
+ release_lock_stateid(stp);
}
}

@@ -968,21 +968,6 @@ static void release_lockowner(struct nfs4_lockowner *lo)
nfs4_put_stateowner(&lo->lo_owner);
}

-static void release_lockowner_if_empty(struct nfs4_lockowner *lo)
-{
- if (list_empty(&lo->lo_owner.so_stateids))
- release_lockowner(lo);
-}
-
-static void release_lock_stateid(struct nfs4_ol_stateid *stp)
-{
- struct nfs4_lockowner *lo;
-
- lo = lockowner(stp->st_stateowner);
- __release_lock_stateid(stp);
- release_lockowner_if_empty(lo);
-}
-
static void release_open_stateid_locks(struct nfs4_ol_stateid *open_stp)
__releases(&open_stp->st_stateowner->so_client->cl_lock)
__acquires(&open_stp->st_stateowner->so_client->cl_lock)
@@ -4323,7 +4308,7 @@ nfsd4_free_lock_stateid(struct nfs4_ol_stateid *stp)

if (check_for_locks(stp->st_stid.sc_file, lo))
return nfserr_locks_held;
- release_lockowner_if_empty(lo);
+ release_lock_stateid(stp);
return nfs_ok;
}

@@ -4938,8 +4923,6 @@ lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
lo = alloc_init_lock_stateowner(strhashval, cl, ost, lock);
if (lo == NULL)
return nfserr_jukebox;
- /* FIXME: extra reference for new lockowners for the client */
- atomic_inc(&lo->lo_owner.so_count);
} else {
/* with an existing lockowner, seqids must be the same */
status = nfserr_bad_seqid;
@@ -4950,7 +4933,6 @@ lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,

*lst = find_or_create_lock_stateid(lo, fi, inode, ost, new);
if (*lst == NULL) {
- release_lockowner_if_empty(lo);
status = nfserr_jukebox;
goto out;
}
@@ -5379,6 +5361,7 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
continue;
if (same_owner_str(tmp, owner, clid)) {
sop = tmp;
+ atomic_inc(&sop->so_count);
break;
}
}
@@ -5392,8 +5375,10 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
lo = lockowner(sop);
/* see if there are still any locks associated with it */
list_for_each_entry(stp, &sop->so_stateids, st_perstateowner) {
- if (check_for_locks(stp->st_stid.sc_file, lo))
+ if (check_for_locks(stp->st_stid.sc_file, lo)) {
+ nfs4_put_stateowner(sop);
goto out;
+ }
}

status = nfs_ok;
--
1.9.3


2014-07-30 01:35:37

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client

From: Trond Myklebust <[email protected]>

Preparation for removing the client_mutex.

Convert the open owner hash table into a per-client table and protect it
using the nfs4_client->cl_lock spin lock.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/netns.h | 1 -
fs/nfsd/nfs4state.c | 187 ++++++++++++++++++++++++----------------------------
fs/nfsd/state.h | 1 +
3 files changed, 86 insertions(+), 103 deletions(-)

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index a71d14413d39..e1f479c162b5 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -63,7 +63,6 @@ struct nfsd_net {
struct rb_root conf_name_tree;
struct list_head *unconf_id_hashtbl;
struct rb_root unconf_name_tree;
- struct list_head *ownerstr_hashtbl;
struct list_head *sessionid_hashtbl;
/*
* client_lru holds client queue ordered by nfs4_client.cl_time
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 7c15918d20f0..4af4e5eff491 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -240,35 +240,27 @@ static void nfsd4_put_session(struct nfsd4_session *ses)
}

static int
-same_owner_str(struct nfs4_stateowner *sop, struct xdr_netobj *owner,
- clientid_t *clid)
+same_owner_str(struct nfs4_stateowner *sop, struct xdr_netobj *owner)
{
return (sop->so_owner.len == owner->len) &&
- 0 == memcmp(sop->so_owner.data, owner->data, owner->len) &&
- (sop->so_client->cl_clientid.cl_id == clid->cl_id);
+ 0 == memcmp(sop->so_owner.data, owner->data, owner->len);
}

static struct nfs4_openowner *
find_openstateowner_str_locked(unsigned int hashval, struct nfsd4_open *open,
- bool sessions, struct nfsd_net *nn)
+ struct nfs4_client *clp)
{
struct nfs4_stateowner *so;
- struct nfs4_openowner *oo;
- struct nfs4_client *clp;

- lockdep_assert_held(&nn->client_lock);
+ lockdep_assert_held(&clp->cl_lock);

- list_for_each_entry(so, &nn->ownerstr_hashtbl[hashval], so_strhash) {
+ list_for_each_entry(so, &clp->cl_ownerstr_hashtbl[hashval],
+ so_strhash) {
if (!so->so_is_open_owner)
continue;
- if (same_owner_str(so, &open->op_owner, &open->op_clientid)) {
- oo = openowner(so);
- clp = oo->oo_owner.so_client;
- if ((bool)clp->cl_minorversion != sessions)
- break;
- renew_client_locked(clp);
+ if (same_owner_str(so, &open->op_owner)) {
atomic_inc(&so->so_count);
- return oo;
+ return openowner(so);
}
}
return NULL;
@@ -276,17 +268,16 @@ find_openstateowner_str_locked(unsigned int hashval, struct nfsd4_open *open,

static struct nfs4_openowner *
find_openstateowner_str(unsigned int hashval, struct nfsd4_open *open,
- bool sessions, struct nfsd_net *nn)
+ struct nfs4_client *clp)
{
struct nfs4_openowner *oo;

- spin_lock(&nn->client_lock);
- oo = find_openstateowner_str_locked(hashval, open, sessions, nn);
- spin_unlock(&nn->client_lock);
+ spin_lock(&clp->cl_lock);
+ oo = find_openstateowner_str_locked(hashval, open, clp);
+ spin_unlock(&clp->cl_lock);
return oo;
}

-
static inline u32
opaque_hashval(const void *ptr, int nbytes)
{
@@ -408,12 +399,11 @@ unsigned long max_delegations;
#define OWNER_HASH_SIZE (1 << OWNER_HASH_BITS)
#define OWNER_HASH_MASK (OWNER_HASH_SIZE - 1)

-static unsigned int ownerstr_hashval(u32 clientid, struct xdr_netobj *ownername)
+static unsigned int ownerstr_hashval(struct xdr_netobj *ownername)
{
unsigned int ret;

ret = opaque_hashval(ownername->data, ownername->len);
- ret += clientid;
return ret & OWNER_HASH_MASK;
}

@@ -1002,40 +992,37 @@ static void release_lock_stateid(struct nfs4_ol_stateid *stp)

static void unhash_lockowner_locked(struct nfs4_lockowner *lo)
{
- struct nfsd_net *nn = net_generic(lo->lo_owner.so_client->net,
- nfsd_net_id);
+ struct nfs4_client *clp = lo->lo_owner.so_client;

- lockdep_assert_held(&nn->client_lock);
+ lockdep_assert_held(&clp->cl_lock);

list_del_init(&lo->lo_owner.so_strhash);
}

static void release_lockowner_stateids(struct nfs4_lockowner *lo)
{
- struct nfsd_net *nn = net_generic(lo->lo_owner.so_client->net,
- nfsd_net_id);
+ struct nfs4_client *clp = lo->lo_owner.so_client;
struct nfs4_ol_stateid *stp;

- lockdep_assert_held(&nn->client_lock);
+ lockdep_assert_held(&clp->cl_lock);

while (!list_empty(&lo->lo_owner.so_stateids)) {
stp = list_first_entry(&lo->lo_owner.so_stateids,
struct nfs4_ol_stateid, st_perstateowner);
- spin_unlock(&nn->client_lock);
+ spin_unlock(&clp->cl_lock);
release_lock_stateid(stp);
- spin_lock(&nn->client_lock);
+ spin_lock(&clp->cl_lock);
}
}

static void release_lockowner(struct nfs4_lockowner *lo)
{
- struct nfsd_net *nn = net_generic(lo->lo_owner.so_client->net,
- nfsd_net_id);
+ struct nfs4_client *clp = lo->lo_owner.so_client;

- spin_lock(&nn->client_lock);
+ spin_lock(&clp->cl_lock);
unhash_lockowner_locked(lo);
release_lockowner_stateids(lo);
- spin_unlock(&nn->client_lock);
+ spin_unlock(&clp->cl_lock);
nfs4_put_stateowner(&lo->lo_owner);
}

@@ -1070,10 +1057,9 @@ static void release_open_stateid(struct nfs4_ol_stateid *stp)

static void unhash_openowner_locked(struct nfs4_openowner *oo)
{
- struct nfsd_net *nn = net_generic(oo->oo_owner.so_client->net,
- nfsd_net_id);
+ struct nfs4_client *clp = oo->oo_owner.so_client;

- lockdep_assert_held(&nn->client_lock);
+ lockdep_assert_held(&clp->cl_lock);

list_del_init(&oo->oo_owner.so_strhash);
list_del_init(&oo->oo_perclient);
@@ -1093,29 +1079,27 @@ static void release_last_closed_stateid(struct nfs4_openowner *oo)
static void release_openowner_stateids(struct nfs4_openowner *oo)
{
struct nfs4_ol_stateid *stp;
- struct nfsd_net *nn = net_generic(oo->oo_owner.so_client->net,
- nfsd_net_id);
+ struct nfs4_client *clp = oo->oo_owner.so_client;

- lockdep_assert_held(&nn->client_lock);
+ lockdep_assert_held(&clp->cl_lock);

while (!list_empty(&oo->oo_owner.so_stateids)) {
stp = list_first_entry(&oo->oo_owner.so_stateids,
struct nfs4_ol_stateid, st_perstateowner);
- spin_unlock(&nn->client_lock);
+ spin_unlock(&clp->cl_lock);
release_open_stateid(stp);
- spin_lock(&nn->client_lock);
+ spin_lock(&clp->cl_lock);
}
}

static void release_openowner(struct nfs4_openowner *oo)
{
- struct nfsd_net *nn = net_generic(oo->oo_owner.so_client->net,
- nfsd_net_id);
+ struct nfs4_client *clp = oo->oo_owner.so_client;

- spin_lock(&nn->client_lock);
+ spin_lock(&clp->cl_lock);
unhash_openowner_locked(oo);
release_openowner_stateids(oo);
- spin_unlock(&nn->client_lock);
+ spin_unlock(&clp->cl_lock);
release_last_closed_stateid(oo);
nfs4_put_stateowner(&oo->oo_owner);
}
@@ -1497,15 +1481,20 @@ STALE_CLIENTID(clientid_t *clid, struct nfsd_net *nn)
static struct nfs4_client *alloc_client(struct xdr_netobj name)
{
struct nfs4_client *clp;
+ int i;

clp = kzalloc(sizeof(struct nfs4_client), GFP_KERNEL);
if (clp == NULL)
return NULL;
clp->cl_name.data = kmemdup(name.data, name.len, GFP_KERNEL);
- if (clp->cl_name.data == NULL) {
- kfree(clp);
- return NULL;
- }
+ if (clp->cl_name.data == NULL)
+ goto err_no_name;
+ clp->cl_ownerstr_hashtbl = kmalloc(sizeof(struct list_head) *
+ OWNER_HASH_SIZE, GFP_KERNEL);
+ if (!clp->cl_ownerstr_hashtbl)
+ goto err_no_hashtbl;
+ for (i = 0; i < OWNER_HASH_SIZE; i++)
+ INIT_LIST_HEAD(&clp->cl_ownerstr_hashtbl[i]);
clp->cl_name.len = name.len;
INIT_LIST_HEAD(&clp->cl_sessions);
idr_init(&clp->cl_stateids);
@@ -1520,6 +1509,11 @@ static struct nfs4_client *alloc_client(struct xdr_netobj name)
spin_lock_init(&clp->cl_lock);
rpc_init_wait_queue(&clp->cl_cb_waitq, "Backchannel slot table");
return clp;
+err_no_hashtbl:
+ kfree(clp->cl_name.data);
+err_no_name:
+ kfree(clp);
+ return NULL;
}

static void
@@ -1538,6 +1532,7 @@ free_client(struct nfs4_client *clp)
}
rpc_destroy_wait_queue(&clp->cl_cb_waitq);
free_svc_cred(&clp->cl_cred);
+ kfree(clp->cl_ownerstr_hashtbl);
kfree(clp->cl_name.data);
idr_destroy(&clp->cl_stateids);
kfree(clp);
@@ -3074,20 +3069,20 @@ static inline void *alloc_stateowner(struct kmem_cache *slab, struct xdr_netobj

static void hash_openowner(struct nfs4_openowner *oo, struct nfs4_client *clp, unsigned int strhashval)
{
- struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+ lockdep_assert_held(&clp->cl_lock);

- list_add(&oo->oo_owner.so_strhash, &nn->ownerstr_hashtbl[strhashval]);
+ list_add(&oo->oo_owner.so_strhash,
+ &clp->cl_ownerstr_hashtbl[strhashval]);
list_add(&oo->oo_perclient, &clp->cl_openowners);
}

static void nfs4_unhash_openowner(struct nfs4_stateowner *so)
{
- struct nfs4_openowner *oo = openowner(so);
- struct nfsd_net *nn = net_generic(so->so_client->net, nfsd_net_id);
+ struct nfs4_client *clp = so->so_client;

- spin_lock(&nn->client_lock);
- unhash_openowner_locked(oo);
- spin_unlock(&nn->client_lock);
+ spin_lock(&clp->cl_lock);
+ unhash_openowner_locked(openowner(so));
+ spin_unlock(&clp->cl_lock);
}

static void nfs4_free_openowner(struct nfs4_stateowner *so)
@@ -3107,7 +3102,6 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
struct nfsd4_compound_state *cstate)
{
struct nfs4_client *clp = cstate->clp;
- struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
struct nfs4_openowner *oo, *ret;

oo = alloc_stateowner(openowner_slab, &open->op_owner, clp);
@@ -3122,15 +3116,14 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
oo->oo_time = 0;
oo->oo_last_closed_stid = NULL;
INIT_LIST_HEAD(&oo->oo_close_lru);
- spin_lock(&nn->client_lock);
- ret = find_openstateowner_str_locked(strhashval,
- open, clp->cl_minorversion, nn);
+ spin_lock(&clp->cl_lock);
+ ret = find_openstateowner_str_locked(strhashval, open, clp);
if (ret == NULL) {
hash_openowner(oo, clp, strhashval);
ret = oo;
} else
nfs4_free_openowner(&oo->oo_owner);
- spin_unlock(&nn->client_lock);
+ spin_unlock(&clp->cl_lock);
return oo;
}

@@ -3412,8 +3405,8 @@ nfsd4_process_open1(struct nfsd4_compound_state *cstate,
return status;
clp = cstate->clp;

- strhashval = ownerstr_hashval(clientid->cl_id, &open->op_owner);
- oo = find_openstateowner_str(strhashval, open, cstate->minorversion, nn);
+ strhashval = ownerstr_hashval(&open->op_owner);
+ oo = find_openstateowner_str(strhashval, open, clp);
open->op_openowner = oo;
if (!oo) {
goto new_owner;
@@ -4818,15 +4811,16 @@ nevermind:

static struct nfs4_lockowner *
find_lockowner_str_locked(clientid_t *clid, struct xdr_netobj *owner,
- struct nfsd_net *nn)
+ struct nfs4_client *clp)
{
- unsigned int strhashval = ownerstr_hashval(clid->cl_id, owner);
+ unsigned int strhashval = ownerstr_hashval(owner);
struct nfs4_stateowner *so;

- list_for_each_entry(so, &nn->ownerstr_hashtbl[strhashval], so_strhash) {
+ list_for_each_entry(so, &clp->cl_ownerstr_hashtbl[strhashval],
+ so_strhash) {
if (so->so_is_open_owner)
continue;
- if (!same_owner_str(so, owner, clid))
+ if (!same_owner_str(so, owner))
continue;
atomic_inc(&so->so_count);
return lockowner(so);
@@ -4836,23 +4830,23 @@ find_lockowner_str_locked(clientid_t *clid, struct xdr_netobj *owner,

static struct nfs4_lockowner *
find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
- struct nfsd_net *nn)
+ struct nfs4_client *clp)
{
struct nfs4_lockowner *lo;

- spin_lock(&nn->client_lock);
- lo = find_lockowner_str_locked(clid, owner, nn);
- spin_unlock(&nn->client_lock);
+ spin_lock(&clp->cl_lock);
+ lo = find_lockowner_str_locked(clid, owner, clp);
+ spin_unlock(&clp->cl_lock);
return lo;
}

static void nfs4_unhash_lockowner(struct nfs4_stateowner *sop)
{
- struct nfsd_net *nn = net_generic(sop->so_client->net, nfsd_net_id);
+ struct nfs4_client *clp = sop->so_client;

- spin_lock(&nn->client_lock);
+ spin_lock(&clp->cl_lock);
unhash_lockowner_locked(lockowner(sop));
- spin_unlock(&nn->client_lock);
+ spin_unlock(&clp->cl_lock);
}

static void nfs4_free_lockowner(struct nfs4_stateowner *sop)
@@ -4879,7 +4873,6 @@ alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp,
struct nfs4_ol_stateid *open_stp,
struct nfsd4_lock *lock)
{
- struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
struct nfs4_lockowner *lo, *ret;

lo = alloc_stateowner(lockowner_slab, &lock->lk_new_owner, clp);
@@ -4889,16 +4882,16 @@ alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp,
lo->lo_owner.so_is_open_owner = 0;
lo->lo_owner.so_seqid = lock->lk_new_lock_seqid;
lo->lo_owner.so_ops = &lockowner_ops;
- spin_lock(&nn->client_lock);
+ spin_lock(&clp->cl_lock);
ret = find_lockowner_str_locked(&clp->cl_clientid,
- &lock->lk_new_owner, nn);
+ &lock->lk_new_owner, clp);
if (ret == NULL) {
list_add(&lo->lo_owner.so_strhash,
- &nn->ownerstr_hashtbl[strhashval]);
+ &clp->cl_ownerstr_hashtbl[strhashval]);
ret = lo;
} else
nfs4_free_lockowner(&lo->lo_owner);
- spin_unlock(&nn->client_lock);
+ spin_unlock(&clp->cl_lock);
return lo;
}

@@ -5010,12 +5003,10 @@ lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
struct inode *inode = cstate->current_fh.fh_dentry->d_inode;
struct nfs4_lockowner *lo;
unsigned int strhashval;
- struct nfsd_net *nn = net_generic(cl->net, nfsd_net_id);

- lo = find_lockowner_str(&cl->cl_clientid, &lock->v.new.owner, nn);
+ lo = find_lockowner_str(&cl->cl_clientid, &lock->v.new.owner, cl);
if (!lo) {
- strhashval = ownerstr_hashval(cl->cl_clientid.cl_id,
- &lock->v.new.owner);
+ strhashval = ownerstr_hashval(&lock->v.new.owner);
lo = alloc_init_lock_stateowner(strhashval, cl, ost, lock);
if (lo == NULL)
return nfserr_jukebox;
@@ -5293,7 +5284,8 @@ nfsd4_lockt(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
goto out;
}

- lo = find_lockowner_str(&lockt->lt_clientid, &lockt->lt_owner, nn);
+ lo = find_lockowner_str(&lockt->lt_clientid, &lockt->lt_owner,
+ cstate->clp);
if (lo)
file_lock->fl_owner = (fl_owner_t)lo;
file_lock->fl_pid = current->tgid;
@@ -5436,7 +5428,7 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
struct nfs4_lockowner *lo;
struct nfs4_ol_stateid *stp;
struct xdr_netobj *owner = &rlockowner->rl_owner;
- unsigned int hashval = ownerstr_hashval(clid->cl_id, owner);
+ unsigned int hashval = ownerstr_hashval(owner);
__be32 status;
struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
struct nfs4_client *clp;
@@ -5452,29 +5444,29 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,

status = nfserr_locks_held;

+ clp = cstate->clp;
/* Find the matching lock stateowner */
- spin_lock(&nn->client_lock);
- list_for_each_entry(tmp, &nn->ownerstr_hashtbl[hashval], so_strhash) {
+ spin_lock(&clp->cl_lock);
+ list_for_each_entry(tmp, &clp->cl_ownerstr_hashtbl[hashval],
+ so_strhash) {
if (tmp->so_is_open_owner)
continue;
- if (same_owner_str(tmp, owner, clid)) {
+ if (same_owner_str(tmp, owner)) {
sop = tmp;
atomic_inc(&sop->so_count);
break;
}
}
- spin_unlock(&nn->client_lock);

/* No matching owner found, maybe a replay? Just declare victory... */
if (!sop) {
+ spin_unlock(&clp->cl_lock);
status = nfs_ok;
goto out;
}

lo = lockowner(sop);
/* see if there are still any locks associated with it */
- clp = cstate->clp;
- spin_lock(&clp->cl_lock);
list_for_each_entry(stp, &sop->so_stateids, st_perstateowner) {
if (check_for_locks(stp->st_stid.sc_file, lo)) {
spin_unlock(&clp->cl_lock);
@@ -5829,10 +5821,6 @@ static int nfs4_state_create_net(struct net *net)
CLIENT_HASH_SIZE, GFP_KERNEL);
if (!nn->unconf_id_hashtbl)
goto err_unconf_id;
- nn->ownerstr_hashtbl = kmalloc(sizeof(struct list_head) *
- OWNER_HASH_SIZE, GFP_KERNEL);
- if (!nn->ownerstr_hashtbl)
- goto err_ownerstr;
nn->sessionid_hashtbl = kmalloc(sizeof(struct list_head) *
SESSION_HASH_SIZE, GFP_KERNEL);
if (!nn->sessionid_hashtbl)
@@ -5842,8 +5830,6 @@ static int nfs4_state_create_net(struct net *net)
INIT_LIST_HEAD(&nn->conf_id_hashtbl[i]);
INIT_LIST_HEAD(&nn->unconf_id_hashtbl[i]);
}
- for (i = 0; i < OWNER_HASH_SIZE; i++)
- INIT_LIST_HEAD(&nn->ownerstr_hashtbl[i]);
for (i = 0; i < SESSION_HASH_SIZE; i++)
INIT_LIST_HEAD(&nn->sessionid_hashtbl[i]);
nn->conf_name_tree = RB_ROOT;
@@ -5859,8 +5845,6 @@ static int nfs4_state_create_net(struct net *net)
return 0;

err_sessionid:
- kfree(nn->ownerstr_hashtbl);
-err_ownerstr:
kfree(nn->unconf_id_hashtbl);
err_unconf_id:
kfree(nn->conf_id_hashtbl);
@@ -5890,7 +5874,6 @@ nfs4_state_destroy_net(struct net *net)
}

kfree(nn->sessionid_hashtbl);
- kfree(nn->ownerstr_hashtbl);
kfree(nn->unconf_id_hashtbl);
kfree(nn->conf_id_hashtbl);
put_net(net);
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index e073c86f389c..73a209dc352b 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -235,6 +235,7 @@ struct nfsd4_sessionid {
struct nfs4_client {
struct list_head cl_idhash; /* hash by cl_clientid.id */
struct rb_node cl_namenode; /* link into by-name trees */
+ struct list_head *cl_ownerstr_hashtbl;
struct list_head cl_openowners;
struct idr cl_stateids; /* stateid lookup */
struct list_head cl_delegations;
--
1.9.3


2014-07-30 01:35:47

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 38/38] nfsd: rename unhash_generic_stateid to unhash_ol_stateid

...to better match other functions that deal with open/lock stateids.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 1f67a96c4941..52ec47de1185 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -949,7 +949,7 @@ static void nfs4_put_stateowner(struct nfs4_stateowner *sop)
sop->so_ops->so_free(sop);
}

-static void unhash_generic_stateid(struct nfs4_ol_stateid *stp)
+static void unhash_ol_stateid(struct nfs4_ol_stateid *stp)
{
struct nfs4_file *fp = stp->st_stid.sc_file;

@@ -1014,7 +1014,7 @@ static void unhash_lock_stateid(struct nfs4_ol_stateid *stp)
lockdep_assert_held(&oo->oo_owner.so_client->cl_lock);

list_del_init(&stp->st_locks);
- unhash_generic_stateid(stp);
+ unhash_ol_stateid(stp);
unhash_stid(&stp->st_stid);
}

@@ -1095,7 +1095,7 @@ static void unhash_open_stateid(struct nfs4_ol_stateid *stp,
{
lockdep_assert_held(&stp->st_stid.sc_client->cl_lock);

- unhash_generic_stateid(stp);
+ unhash_ol_stateid(stp);
release_open_stateid_locks(stp, reaplist);
}

--
1.9.3


2014-07-30 01:35:06

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 11/38] nfsd: nfsd4_locku() must reference the lock stateid

From: Trond Myklebust <[email protected]>

Ensure that nfsd4_locku() keeps a reference to the lock stateid
until it is done working with it. Necessary step toward client_mutex
removal.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/nfs4state.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 1ddf4da0023d..4f191456d737 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5147,10 +5147,12 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
&stp, nn);
if (status)
goto out;
+ /* FIXME: move into nfs4_preprocess_seqid_op */
+ atomic_inc(&stp->st_stid.sc_count);
filp = find_any_file(stp->st_stid.sc_file);
if (!filp) {
status = nfserr_lock_range;
- goto out;
+ goto put_stateid;
}
file_lock = locks_alloc_lock();
if (!file_lock) {
@@ -5180,6 +5182,8 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
memcpy(&locku->lu_stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));
fput:
fput(filp);
+put_stateid:
+ nfs4_put_stid(&stp->st_stid);
out:
nfsd4_bump_seqid(cstate, status);
if (!cstate->replay_owner)
--
1.9.3


2014-07-30 01:34:54

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 03/38] nfsd: Add a struct nfs4_file field to struct nfs4_stid

From: Trond Myklebust <[email protected]>

All stateids are associated with a nfs4_file. Let's consolidate.
Start by replacing delegation->dl_file with the dl_stid.sc_file

Signed-off-by: Trond Myklebust <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
fs/nfsd/nfs4callback.c | 2 +-
fs/nfsd/nfs4state.c | 21 ++++++++++-----------
fs/nfsd/state.h | 2 +-
3 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 8574c708cf8c..e0be57b0f79b 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -337,7 +337,7 @@ static void encode_cb_recall4args(struct xdr_stream *xdr,
p = xdr_reserve_space(xdr, 4);
*p++ = xdr_zero; /* truncate */

- encode_nfs_fh4(xdr, &dp->dl_file->fi_fhandle);
+ encode_nfs_fh4(xdr, &dp->dl_stid.sc_file->fi_fhandle);

hdr->nops++;
}
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 60ab22b6e099..d6b0ef4e90fe 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -515,10 +515,6 @@ static struct nfs4_ol_stateid * nfs4_alloc_stateid(struct nfs4_client *clp)

static void nfs4_free_deleg(struct nfs4_stid *stid)
{
- struct nfs4_delegation *dp = delegstateid(stid);
-
- if (dp->dl_file)
- put_nfs4_file(dp->dl_file);
kmem_cache_free(deleg_slab, stid);
atomic_long_dec(&num_delegations);
}
@@ -636,12 +632,15 @@ out_dec:
void
nfs4_put_stid(struct nfs4_stid *s)
{
+ struct nfs4_file *fp = s->sc_file;
struct nfs4_client *clp = s->sc_client;

if (!atomic_dec_and_test(&s->sc_count))
return;
idr_remove(&clp->cl_stateids, s->sc_stateid.si_opaque.so_id);
s->sc_free(s);
+ if (fp)
+ put_nfs4_file(fp);
}

static void nfs4_put_deleg_lease(struct nfs4_file *fp)
@@ -677,7 +676,7 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
static void
unhash_delegation_locked(struct nfs4_delegation *dp)
{
- struct nfs4_file *fp = dp->dl_file;
+ struct nfs4_file *fp = dp->dl_stid.sc_file;

lockdep_assert_held(&state_lock);

@@ -3097,10 +3096,10 @@ nfs4_share_conflict(struct svc_fh *current_fh, unsigned int deny_type)

void nfsd4_prepare_cb_recall(struct nfs4_delegation *dp)
{
- struct nfs4_client *clp = dp->dl_stid.sc_client;
- struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+ struct nfsd_net *nn = net_generic(dp->dl_stid.sc_client->net,
+ nfsd_net_id);

- block_delegations(&dp->dl_file->fi_fhandle);
+ block_delegations(&dp->dl_stid.sc_file->fi_fhandle);

/*
* We can't do this in nfsd_break_deleg_cb because it is
@@ -3508,7 +3507,7 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_file *fp, int flag)

static int nfs4_setlease(struct nfs4_delegation *dp)
{
- struct nfs4_file *fp = dp->dl_file;
+ struct nfs4_file *fp = dp->dl_stid.sc_file;
struct file_lock *fl;
struct file *filp;
int status = 0;
@@ -3573,7 +3572,7 @@ nfs4_set_delegation(struct nfs4_client *clp, struct svc_fh *fh,
get_nfs4_file(fp);
spin_lock(&state_lock);
spin_lock(&fp->fi_lock);
- dp->dl_file = fp;
+ dp->dl_stid.sc_file = fp;
if (!fp->fi_lease) {
spin_unlock(&fp->fi_lock);
spin_unlock(&state_lock);
@@ -4167,7 +4166,7 @@ nfs4_preprocess_stateid_op(struct net *net, struct nfsd4_compound_state *cstate,
if (status)
goto out;
if (filpp) {
- file = dp->dl_file->fi_deleg_file;
+ file = dp->dl_stid.sc_file->fi_deleg_file;
if (!file) {
WARN_ON_ONCE(1);
status = nfserr_serverfault;
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 32c466265ac1..c856601c15f6 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -85,6 +85,7 @@ struct nfs4_stid {
unsigned char sc_type;
stateid_t sc_stateid;
struct nfs4_client *sc_client;
+ struct nfs4_file *sc_file;
void (*sc_free)(struct nfs4_stid *);
};

@@ -93,7 +94,6 @@ struct nfs4_delegation {
struct list_head dl_perfile;
struct list_head dl_perclnt;
struct list_head dl_recall_lru; /* delegation recalled */
- struct nfs4_file *dl_file;
u32 dl_type;
time_t dl_time;
/* For recall: */
--
1.9.3


2014-07-31 16:53:02

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v3 04/38] nfsd: Replace nfs4_ol_stateid->st_file with the st_stid.sc_file

On Wed, Jul 30, 2014 at 01:59:02PM -0700, Christoph Hellwig wrote:
> FYI, I'd merge this into the previous patch.

OK, done.

--b.

2014-07-30 01:34:55

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 04/38] nfsd: Replace nfs4_ol_stateid->st_file with the st_stid.sc_file

From: Trond Myklebust <[email protected]>

Signed-off-by: Trond Myklebust <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
fs/nfsd/nfs4state.c | 48 ++++++++++++++++++++++++------------------------
fs/nfsd/state.h | 1 -
2 files changed, 24 insertions(+), 25 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index d6b0ef4e90fe..344cd1ac3f67 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -863,7 +863,7 @@ reset_union_bmap_deny(u32 deny, struct nfs4_ol_stateid *stp)

/* Recalculate per-file deny mode if there was a change */
if (change)
- recalculate_deny_mode(stp->st_file);
+ recalculate_deny_mode(stp->st_stid.sc_file);
}

/* release all access and file references for a given stateid */
@@ -871,21 +871,21 @@ static void
release_all_access(struct nfs4_ol_stateid *stp)
{
int i;
- struct nfs4_file *fp = stp->st_file;
+ struct nfs4_file *fp = stp->st_stid.sc_file;

if (fp && stp->st_deny_bmap != 0)
recalculate_deny_mode(fp);

for (i = 1; i < 4; i++) {
if (test_access(i, stp))
- nfs4_file_put_access(stp->st_file, i);
+ nfs4_file_put_access(stp->st_stid.sc_file, i);
clear_access(i, stp);
}
}

static void unhash_generic_stateid(struct nfs4_ol_stateid *stp)
{
- struct nfs4_file *fp = stp->st_file;
+ struct nfs4_file *fp = stp->st_stid.sc_file;

spin_lock(&fp->fi_lock);
list_del(&stp->st_perfile);
@@ -898,8 +898,6 @@ static void nfs4_free_ol_stateid(struct nfs4_stid *stid)
struct nfs4_ol_stateid *stp = openlockstateid(stid);

release_all_access(stp);
- if (stp->st_file)
- put_nfs4_file(stp->st_file);
kmem_cache_free(stateid_slab, stid);
}

@@ -910,7 +908,7 @@ static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
list_del(&stp->st_locks);
unhash_generic_stateid(stp);
unhash_stid(&stp->st_stid);
- file = find_any_file(stp->st_file);
+ file = find_any_file(stp->st_stid.sc_file);
if (file)
filp_close(file, (fl_owner_t)lockowner(stp->st_stateowner));
nfs4_put_stid(&stp->st_stid);
@@ -2975,7 +2973,7 @@ static void init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
list_add(&stp->st_perstateowner, &oo->oo_owner.so_stateids);
stp->st_stateowner = &oo->oo_owner;
get_nfs4_file(fp);
- stp->st_file = fp;
+ stp->st_stid.sc_file = fp;
stp->st_access_bmap = 0;
stp->st_deny_bmap = 0;
stp->st_openstp = NULL;
@@ -3668,7 +3666,7 @@ nfs4_open_delegation(struct svc_fh *fh, struct nfsd4_open *open,
default:
goto out_no_deleg;
}
- dp = nfs4_set_delegation(clp, fh, stp->st_file);
+ dp = nfs4_set_delegation(clp, fh, stp->st_stid.sc_file);
if (IS_ERR(dp))
goto out_no_deleg;

@@ -3958,7 +3956,7 @@ laundromat_main(struct work_struct *laundry)

static inline __be32 nfs4_check_fh(struct svc_fh *fhp, struct nfs4_ol_stateid *stp)
{
- if (!nfsd_fh_match(&fhp->fh_handle, &stp->st_file->fi_fhandle))
+ if (!nfsd_fh_match(&fhp->fh_handle, &stp->st_stid.sc_file->fi_fhandle))
return nfserr_bad_stateid;
return nfs_ok;
}
@@ -4188,10 +4186,12 @@ nfs4_preprocess_stateid_op(struct net *net, struct nfsd4_compound_state *cstate,
if (status)
goto out;
if (filpp) {
+ struct nfs4_file *fp = stp->st_stid.sc_file;
+
if (flags & RD_STATE)
- file = find_readable_file(stp->st_file);
+ file = find_readable_file(fp);
else
- file = find_writeable_file(stp->st_file);
+ file = find_writeable_file(fp);
}
break;
default:
@@ -4211,7 +4211,7 @@ nfsd4_free_lock_stateid(struct nfs4_ol_stateid *stp)
{
struct nfs4_lockowner *lo = lockowner(stp->st_stateowner);

- if (check_for_locks(stp->st_file, lo))
+ if (check_for_locks(stp->st_stid.sc_file, lo))
return nfserr_locks_held;
release_lockowner_if_empty(lo);
return nfs_ok;
@@ -4402,7 +4402,7 @@ static inline void nfs4_stateid_downgrade_bit(struct nfs4_ol_stateid *stp, u32 a
{
if (!test_access(access, stp))
return;
- nfs4_file_put_access(stp->st_file, access);
+ nfs4_file_put_access(stp->st_stid.sc_file, access);
clear_access(access, stp);
}

@@ -4491,9 +4491,9 @@ static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
* before returning however.
*/
release_all_access(s);
- if (s->st_file) {
- put_nfs4_file(s->st_file);
- s->st_file = NULL;
+ if (s->st_stid.sc_file) {
+ put_nfs4_file(s->st_stid.sc_file);
+ s->st_stid.sc_file = NULL;
}
oo->oo_last_closed_stid = s;
if (list_empty(&oo->oo_owner.so_stateids))
@@ -4694,7 +4694,7 @@ alloc_init_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp,
list_add(&stp->st_perstateowner, &lo->lo_owner.so_stateids);
stp->st_stateowner = &lo->lo_owner;
get_nfs4_file(fp);
- stp->st_file = fp;
+ stp->st_stid.sc_file = fp;
stp->st_access_bmap = 0;
stp->st_deny_bmap = open_stp->st_deny_bmap;
stp->st_openstp = open_stp;
@@ -4711,7 +4711,7 @@ find_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp)
struct nfs4_ol_stateid *lst;

list_for_each_entry(lst, &lo->lo_owner.so_stateids, st_perstateowner) {
- if (lst->st_file == fp)
+ if (lst->st_stid.sc_file == fp)
return lst;
}
return NULL;
@@ -4727,7 +4727,7 @@ check_lock_length(u64 offset, u64 length)

static void get_lock_access(struct nfs4_ol_stateid *lock_stp, u32 access)
{
- struct nfs4_file *fp = lock_stp->st_file;
+ struct nfs4_file *fp = lock_stp->st_stid.sc_file;

lockdep_assert_held(&fp->fi_lock);

@@ -4739,7 +4739,7 @@ static void get_lock_access(struct nfs4_ol_stateid *lock_stp, u32 access)

static __be32 lookup_or_create_lock_state(struct nfsd4_compound_state *cstate, struct nfs4_ol_stateid *ost, struct nfsd4_lock *lock, struct nfs4_ol_stateid **lst, bool *new)
{
- struct nfs4_file *fi = ost->st_file;
+ struct nfs4_file *fi = ost->st_stid.sc_file;
struct nfs4_openowner *oo = openowner(ost->st_stateowner);
struct nfs4_client *cl = oo->oo_owner.so_client;
struct inode *inode = cstate->current_fh.fh_dentry->d_inode;
@@ -4864,7 +4864,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
goto out;
}

- fp = lock_stp->st_file;
+ fp = lock_stp->st_stid.sc_file;
locks_init_lock(file_lock);
switch (lock->lk_type) {
case NFS4_READ_LT:
@@ -5064,7 +5064,7 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
&stp, nn);
if (status)
goto out;
- filp = find_any_file(stp->st_file);
+ filp = find_any_file(stp->st_stid.sc_file);
if (!filp) {
status = nfserr_lock_range;
goto out;
@@ -5187,7 +5187,7 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
lo = lockowner(sop);
/* see if there are still any locks associated with it */
list_for_each_entry(stp, &sop->so_stateids, st_perstateowner) {
- if (check_for_locks(stp->st_file, lo))
+ if (check_for_locks(stp->st_stid.sc_file, lo))
goto out;
}

diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index c856601c15f6..af1d9c42e939 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -407,7 +407,6 @@ struct nfs4_ol_stateid {
struct list_head st_perstateowner;
struct list_head st_locks;
struct nfs4_stateowner * st_stateowner;
- struct nfs4_file * st_file;
unsigned char st_access_bmap;
unsigned char st_deny_bmap;
struct nfs4_ol_stateid * st_openstp;
--
1.9.3


2014-07-30 01:35:05

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 09/38] nfsd: ensure atomicity in nfsd4_free_stateid and nfsd4_validate_stateid

Hold the cl_lock over the bulk of these functions. In addition to
ensuring that they aren't freed prematurely, this will also help prevent
a potential race that could be introduced later. Once we remove the
client_mutex, it'll be possible for FREE_STATEID and CLOSE to race and
for both to try to put the "persistent" reference to the stateid.

Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 71 +++++++++++++++++++++++++++--------------------------
1 file changed, 36 insertions(+), 35 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 59d44873b68b..f4c7bf96c9fd 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1689,17 +1689,6 @@ find_stateid_locked(struct nfs4_client *cl, stateid_t *t)
}

static struct nfs4_stid *
-find_stateid(struct nfs4_client *cl, stateid_t *t)
-{
- struct nfs4_stid *ret;
-
- spin_lock(&cl->cl_lock);
- ret = find_stateid_locked(cl, t);
- spin_unlock(&cl->cl_lock);
- return ret;
-}
-
-static struct nfs4_stid *
find_stateid_by_type(struct nfs4_client *cl, stateid_t *t, char typemask)
{
struct nfs4_stid *s;
@@ -4098,10 +4087,10 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
{
struct nfs4_stid *s;
struct nfs4_ol_stateid *ols;
- __be32 status;
+ __be32 status = nfserr_bad_stateid;

if (ZERO_STATEID(stateid) || ONE_STATEID(stateid))
- return nfserr_bad_stateid;
+ return status;
/* Client debugging aid. */
if (!same_clid(&stateid->si_opaque.so_clid, &cl->cl_clientid)) {
char addr_str[INET6_ADDRSTRLEN];
@@ -4109,34 +4098,42 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
sizeof(addr_str));
pr_warn_ratelimited("NFSD: client %s testing state ID "
"with incorrect client ID\n", addr_str);
- return nfserr_bad_stateid;
+ return status;
}
- s = find_stateid(cl, stateid);
+ spin_lock(&cl->cl_lock);
+ s = find_stateid_locked(cl, stateid);
if (!s)
- return nfserr_bad_stateid;
+ goto out_unlock;
status = check_stateid_generation(stateid, &s->sc_stateid, 1);
if (status)
- return status;
+ goto out_unlock;
switch (s->sc_type) {
case NFS4_DELEG_STID:
- return nfs_ok;
+ status = nfs_ok;
+ break;
case NFS4_REVOKED_DELEG_STID:
- return nfserr_deleg_revoked;
+ status = nfserr_deleg_revoked;
+ break;
case NFS4_OPEN_STID:
case NFS4_LOCK_STID:
ols = openlockstateid(s);
if (ols->st_stateowner->so_is_open_owner
&& !(openowner(ols->st_stateowner)->oo_flags
& NFS4_OO_CONFIRMED))
- return nfserr_bad_stateid;
- return nfs_ok;
+ status = nfserr_bad_stateid;
+ else
+ status = nfs_ok;
+ break;
default:
printk("unknown stateid type %x\n", s->sc_type);
/* Fallthrough */
case NFS4_CLOSED_STID:
case NFS4_CLOSED_DELEG_STID:
- return nfserr_bad_stateid;
+ status = nfserr_bad_stateid;
}
+out_unlock:
+ spin_unlock(&cl->cl_lock);
+ return status;
}

static __be32
@@ -4287,34 +4284,38 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
__be32 ret = nfserr_bad_stateid;

nfs4_lock_state();
- s = find_stateid(cl, stateid);
+ spin_lock(&cl->cl_lock);
+ s = find_stateid_locked(cl, stateid);
if (!s)
- goto out;
+ goto out_unlock;
switch (s->sc_type) {
case NFS4_DELEG_STID:
ret = nfserr_locks_held;
- goto out;
+ break;
case NFS4_OPEN_STID:
- case NFS4_LOCK_STID:
ret = check_stateid_generation(stateid, &s->sc_stateid, 1);
if (ret)
- goto out;
- if (s->sc_type == NFS4_LOCK_STID)
- ret = nfsd4_free_lock_stateid(openlockstateid(s));
- else
- ret = nfserr_locks_held;
+ break;
+ ret = nfserr_locks_held;
break;
+ case NFS4_LOCK_STID:
+ ret = check_stateid_generation(stateid, &s->sc_stateid, 1);
+ if (ret)
+ break;
+ spin_unlock(&cl->cl_lock);
+ ret = nfsd4_free_lock_stateid(openlockstateid(s));
+ goto out;
case NFS4_REVOKED_DELEG_STID:
dp = delegstateid(s);
- spin_lock(&cl->cl_lock);
list_del_init(&dp->dl_recall_lru);
spin_unlock(&cl->cl_lock);
nfs4_put_stid(s);
ret = nfs_ok;
- break;
- default:
- ret = nfserr_bad_stateid;
+ goto out;
+ /* Default falls through and returns nfserr_bad_stateid */
}
+out_unlock:
+ spin_unlock(&cl->cl_lock);
out:
nfs4_unlock_state();
return ret;
--
1.9.3


2014-07-30 01:34:57

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 05/38] nfsd4: use cl_lock to synchronize all stateid idr calls

Currently, this is serialized by the client_mutex, which is slated for
removal. Add finer-grained locking here. Also, do some cleanup around
find_stateid to prepare for taking references.

Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 41 ++++++++++++++++++++++++++++++-----------
1 file changed, 30 insertions(+), 11 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 344cd1ac3f67..bb37cc4dd573 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -467,7 +467,6 @@ static void nfs4_file_put_access(struct nfs4_file *fp, u32 access)
static struct nfs4_stid *nfs4_alloc_stid(struct nfs4_client *cl,
struct kmem_cache *slab)
{
- struct idr *stateids = &cl->cl_stateids;
struct nfs4_stid *stid;
int new_id;

@@ -475,7 +474,11 @@ static struct nfs4_stid *nfs4_alloc_stid(struct nfs4_client *cl,
if (!stid)
return NULL;

- new_id = idr_alloc_cyclic(stateids, stid, 0, 0, GFP_KERNEL);
+ idr_preload(GFP_KERNEL);
+ spin_lock(&cl->cl_lock);
+ new_id = idr_alloc_cyclic(&cl->cl_stateids, stid, 0, 0, GFP_NOWAIT);
+ spin_unlock(&cl->cl_lock);
+ idr_preload_end();
if (new_id < 0)
goto out_free;
stid->sc_client = cl;
@@ -635,9 +638,12 @@ nfs4_put_stid(struct nfs4_stid *s)
struct nfs4_file *fp = s->sc_file;
struct nfs4_client *clp = s->sc_client;

- if (!atomic_dec_and_test(&s->sc_count))
+ might_lock(&clp->cl_lock);
+
+ if (!atomic_dec_and_lock(&s->sc_count, &clp->cl_lock))
return;
idr_remove(&clp->cl_stateids, s->sc_stateid.si_opaque.so_id);
+ spin_unlock(&clp->cl_lock);
s->sc_free(s);
if (fp)
put_nfs4_file(fp);
@@ -1652,7 +1658,8 @@ static void gen_confirm(struct nfs4_client *clp)
memcpy(clp->cl_confirm.data, verf, sizeof(clp->cl_confirm.data));
}

-static struct nfs4_stid *find_stateid(struct nfs4_client *cl, stateid_t *t)
+static struct nfs4_stid *
+find_stateid_locked(struct nfs4_client *cl, stateid_t *t)
{
struct nfs4_stid *ret;

@@ -1662,16 +1669,28 @@ static struct nfs4_stid *find_stateid(struct nfs4_client *cl, stateid_t *t)
return ret;
}

-static struct nfs4_stid *find_stateid_by_type(struct nfs4_client *cl, stateid_t *t, char typemask)
+static struct nfs4_stid *
+find_stateid(struct nfs4_client *cl, stateid_t *t)
+{
+ struct nfs4_stid *ret;
+
+ spin_lock(&cl->cl_lock);
+ ret = find_stateid_locked(cl, t);
+ spin_unlock(&cl->cl_lock);
+ return ret;
+}
+
+static struct nfs4_stid *
+find_stateid_by_type(struct nfs4_client *cl, stateid_t *t, char typemask)
{
struct nfs4_stid *s;

- s = find_stateid(cl, t);
- if (!s)
- return NULL;
- if (typemask & s->sc_type)
- return s;
- return NULL;
+ spin_lock(&cl->cl_lock);
+ s = find_stateid_locked(cl, t);
+ if (s != NULL && !(typemask & s->sc_type))
+ s = NULL;
+ spin_unlock(&cl->cl_lock);
+ return s;
}

static struct nfs4_client *create_client(struct xdr_netobj name,
--
1.9.3


2014-07-31 16:51:07

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v3 02/38] nfsd: Cleanup the freeing of stateids

On Wed, Jul 30, 2014 at 01:57:58PM -0700, Christoph Hellwig wrote:
> On Tue, Jul 29, 2014 at 09:34:07PM -0400, Jeff Layton wrote:
> > Add a ->free() callback to the struct nfs4_stid, so that we can
> > release a reference to the stid without caring about the contents.
>
> Looks good, but I'd suggest merging it into the previous patch.
>
>
> Signed-off-by: Christoph Hellwig <[email protected]>

OK, committing to for-3.17 with those two squashed together.

--b.

2014-07-30 01:35:05

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 08/38] nfsd: clean up races in lock stateid searching and creation

Preparation for removal of the client_mutex.

Currently, no lock aside from the client_mutex is held when calling
find_lock_state. Ensure that the cl_lock is held by adding a lockdep
assertion.

Once we remove the client_mutex, it'll be possible for another thread to
race in and insert a lock state for the same file after we search but
before we insert a new one. Ensure that doesn't happen by redoing the
search after allocating a new stid that we plan to insert. If one is
found just put the one that was allocated.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 71 ++++++++++++++++++++++++++++++++++++-----------------
1 file changed, 49 insertions(+), 22 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 3ac6e2fdabe5..59d44873b68b 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4719,20 +4719,15 @@ alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp, str
return lo;
}

-static struct nfs4_ol_stateid *
-alloc_init_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp,
- struct inode *inode,
- struct nfs4_ol_stateid *open_stp)
+static void
+init_lock_stateid(struct nfs4_ol_stateid *stp, struct nfs4_lockowner *lo,
+ struct nfs4_file *fp, struct inode *inode,
+ struct nfs4_ol_stateid *open_stp)
{
- struct nfs4_stid *s;
- struct nfs4_openowner *oo = openowner(open_stp->st_stateowner);
- struct nfs4_ol_stateid *stp;
struct nfs4_client *clp = lo->lo_owner.so_client;

- s = nfs4_alloc_stid(clp, stateid_slab);
- if (s == NULL)
- return NULL;
- stp = openlockstateid(s);
+ lockdep_assert_held(&clp->cl_lock);
+
stp->st_stid.sc_type = NFS4_LOCK_STID;
stp->st_stateowner = &lo->lo_owner;
get_nfs4_file(fp);
@@ -4741,20 +4736,20 @@ alloc_init_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp,
stp->st_access_bmap = 0;
stp->st_deny_bmap = open_stp->st_deny_bmap;
stp->st_openstp = open_stp;
- spin_lock(&oo->oo_owner.so_client->cl_lock);
list_add(&stp->st_locks, &open_stp->st_locks);
list_add(&stp->st_perstateowner, &lo->lo_owner.so_stateids);
spin_lock(&fp->fi_lock);
list_add(&stp->st_perfile, &fp->fi_stateids);
spin_unlock(&fp->fi_lock);
- spin_unlock(&oo->oo_owner.so_client->cl_lock);
- return stp;
}

static struct nfs4_ol_stateid *
find_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp)
{
struct nfs4_ol_stateid *lst;
+ struct nfs4_client *clp = lo->lo_owner.so_client;
+
+ lockdep_assert_held(&clp->cl_lock);

list_for_each_entry(lst, &lo->lo_owner.so_stateids, st_perstateowner) {
if (lst->st_stid.sc_file == fp)
@@ -4763,6 +4758,38 @@ find_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp)
return NULL;
}

+static struct nfs4_ol_stateid *
+find_or_create_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fi,
+ struct inode *inode, struct nfs4_ol_stateid *ost,
+ bool *new)
+{
+ struct nfs4_stid *ns = NULL;
+ struct nfs4_ol_stateid *lst;
+ struct nfs4_openowner *oo = openowner(ost->st_stateowner);
+ struct nfs4_client *clp = oo->oo_owner.so_client;
+
+ spin_lock(&clp->cl_lock);
+ lst = find_lock_stateid(lo, fi);
+ if (lst == NULL) {
+ spin_unlock(&clp->cl_lock);
+ ns = nfs4_alloc_stid(clp, stateid_slab);
+ if (ns == NULL)
+ return NULL;
+
+ spin_lock(&clp->cl_lock);
+ lst = find_lock_stateid(lo, fi);
+ if (likely(!lst)) {
+ lst = openlockstateid(ns);
+ init_lock_stateid(lst, lo, fi, inode, ost);
+ ns = NULL;
+ *new = true;
+ }
+ }
+ spin_unlock(&clp->cl_lock);
+ if (ns)
+ nfs4_put_stid(ns);
+ return lst;
+}

static int
check_lock_length(u64 offset, u64 length)
@@ -4783,7 +4810,11 @@ static void get_lock_access(struct nfs4_ol_stateid *lock_stp, u32 access)
set_access(access, lock_stp);
}

-static __be32 lookup_or_create_lock_state(struct nfsd4_compound_state *cstate, struct nfs4_ol_stateid *ost, struct nfsd4_lock *lock, struct nfs4_ol_stateid **lst, bool *new)
+static __be32
+lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
+ struct nfs4_ol_stateid *ost,
+ struct nfsd4_lock *lock,
+ struct nfs4_ol_stateid **lst, bool *new)
{
struct nfs4_file *fi = ost->st_stid.sc_file;
struct nfs4_openowner *oo = openowner(ost->st_stateowner);
@@ -4807,14 +4838,10 @@ static __be32 lookup_or_create_lock_state(struct nfsd4_compound_state *cstate, s
return nfserr_bad_seqid;
}

- *lst = find_lock_stateid(lo, fi);
+ *lst = find_or_create_lock_stateid(lo, fi, inode, ost, new);
if (*lst == NULL) {
- *lst = alloc_init_lock_stateid(lo, fi, inode, ost);
- if (*lst == NULL) {
- release_lockowner_if_empty(lo);
- return nfserr_jukebox;
- }
- *new = true;
+ release_lockowner_if_empty(lo);
+ return nfserr_jukebox;
}
return nfs_ok;
}
--
1.9.3


2014-07-30 01:35:11

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 14/38] nfsd: nfsd4_process_open2() must reference the open stateid

From: Trond Myklebust <[email protected]>

Ensure that nfsd4_process_open2() keeps a reference to the open
stateid until it is done working with it. Necessary step toward
client_mutex removal.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/nfs4state.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 5cb6305036cd..f3018cb26769 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2996,6 +2996,7 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
static void init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp, struct nfsd4_open *open) {
struct nfs4_openowner *oo = open->op_openowner;

+ atomic_inc(&stp->st_stid.sc_count);
stp->st_stid.sc_type = NFS4_OPEN_STID;
INIT_LIST_HEAD(&stp->st_locks);
stp->st_stateowner = &oo->oo_owner;
@@ -3376,6 +3377,7 @@ nfsd4_find_existing_open(struct nfs4_file *fp, struct nfsd4_open *open)
continue;
if (local->st_stateowner == &oo->oo_owner) {
ret = local;
+ atomic_inc(&ret->st_stid.sc_count);
break;
}
}
@@ -3836,6 +3838,8 @@ out:
open->op_rflags |= NFS4_OPEN_RESULT_CONFIRM;
if (dp)
nfs4_put_stid(&dp->dl_stid);
+ if (stp)
+ nfs4_put_stid(&stp->st_stid);

return status;
}
--
1.9.3


2014-07-30 01:35:08

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 12/38] nfsd: Ensure that nfs4_open_delegation() references the delegation stateid

From: Trond Myklebust <[email protected]>

Ensure that nfs4_open_delegation() keeps a reference to the delegation
stateid until it is done working with it. Necessary step toward
client_mutex removal.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/nfs4state.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 4f191456d737..2df6af93120e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -674,6 +674,7 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
lockdep_assert_held(&state_lock);
lockdep_assert_held(&fp->fi_lock);

+ atomic_inc(&dp->dl_stid.sc_count);
dp->dl_stid.sc_type = NFS4_DELEG_STID;
list_add(&dp->dl_perfile, &fp->fi_delegations);
list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
@@ -3704,6 +3705,7 @@ nfs4_open_delegation(struct svc_fh *fh, struct nfsd4_open *open,
dprintk("NFSD: delegation stateid=" STATEID_FMT "\n",
STATEID_VAL(&dp->dl_stid.sc_stateid));
open->op_delegate_type = NFS4_OPEN_DELEGATE_READ;
+ nfs4_put_stid(&dp->dl_stid);
return;
out_no_deleg:
open->op_delegate_type = NFS4_OPEN_DELEGATE_NONE;
--
1.9.3


2014-07-30 01:35:28

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 25/38] nfsd: Make lock stateid take a reference to the lockowner

From: Trond Myklebust <[email protected]>

A necessary step toward client_mutex removal.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/nfs4state.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 906c8604de30..88225f0bbc12 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -928,6 +928,8 @@ static void nfs4_free_lock_stateid(struct nfs4_stid *stid)
file = find_any_file(stp->st_stid.sc_file);
if (file)
filp_close(file, (fl_owner_t)lo);
+ if (stp->st_stateowner)
+ nfs4_put_stateowner(stp->st_stateowner);
nfs4_free_ol_stateid(stid);
}

@@ -4831,6 +4833,7 @@ init_lock_stateid(struct nfs4_ol_stateid *stp, struct nfs4_lockowner *lo,
atomic_inc(&stp->st_stid.sc_count);
stp->st_stid.sc_type = NFS4_LOCK_STID;
stp->st_stateowner = &lo->lo_owner;
+ atomic_inc(&lo->lo_owner.so_count);
get_nfs4_file(fp);
stp->st_stid.sc_file = fp;
stp->st_stid.sc_free = nfs4_free_lock_stateid;
--
1.9.3


2014-07-30 01:35:16

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 17/38] nfsd: Add reference counting to nfs4_preprocess_confirmed_seqid_op

From: Trond Myklebust <[email protected]>

Ensure that all the callers put the open stateid after use.
Necessary step toward client_mutex removal.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/nfs4state.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 8e18ca49555f..a777666044f1 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4405,6 +4405,8 @@ static __be32 nfs4_preprocess_confirmed_seqid_op(struct nfsd4_compound_state *cs
NFS4_OPEN_STID, stpp, nn);
if (status)
return status;
+ /* FIXME: move into nfs4_preprocess_seqid_op */
+ atomic_inc(&(*stpp)->st_stid.sc_count);
oo = openowner((*stpp)->st_stateowner);
if (!(oo->oo_flags & NFS4_OO_CONFIRMED))
return nfserr_bad_stateid;
@@ -4509,12 +4511,12 @@ nfsd4_open_downgrade(struct svc_rqst *rqstp,
if (!test_access(od->od_share_access, stp)) {
dprintk("NFSD: access not a subset of current bitmap: 0x%hhx, input access=%08x\n",
stp->st_access_bmap, od->od_share_access);
- goto out;
+ goto put_stateid;
}
if (!test_deny(od->od_share_deny, stp)) {
dprintk("NFSD: deny not a subset of current bitmap: 0x%hhx, input deny=%08x\n",
stp->st_deny_bmap, od->od_share_deny);
- goto out;
+ goto put_stateid;
}
nfs4_stateid_downgrade(stp, od->od_share_access);

@@ -4523,6 +4525,8 @@ nfsd4_open_downgrade(struct svc_rqst *rqstp,
update_stateid(&stp->st_stid.sc_stateid);
memcpy(&od->od_stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));
status = nfs_ok;
+put_stateid:
+ nfs4_put_stid(&stp->st_stid);
out:
nfsd4_bump_seqid(cstate, status);
if (!cstate->replay_owner)
@@ -4883,6 +4887,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
struct nfs4_openowner *open_sop = NULL;
struct nfs4_lockowner *lock_sop = NULL;
struct nfs4_ol_stateid *lock_stp = NULL;
+ struct nfs4_ol_stateid *open_stp = NULL;
struct nfs4_file *fp;
struct file *filp = NULL;
struct file_lock *file_lock = NULL;
@@ -4910,8 +4915,6 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
nfs4_lock_state();

if (lock->lk_is_new) {
- struct nfs4_ol_stateid *open_stp = NULL;
-
if (nfsd4_has_session(cstate))
/* See rfc 5661 18.10.3: given clientid is ignored: */
memcpy(&lock->v.new.clientid,
@@ -5039,6 +5042,8 @@ out:
fput(filp);
if (lock_stp)
nfs4_put_stid(&lock_stp->st_stid);
+ if (open_stp)
+ nfs4_put_stid(&open_stp->st_stid);
if (status && new_state)
release_lock_stateid(lock_stp);
nfsd4_bump_seqid(cstate, status);
--
1.9.3


2014-07-30 01:35:26

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 24/38] nfsd: add an operation for unhashing a stateowner

Allow stateowners to be unhashed and destroyed when the last reference
is put. The unhashing must be idempotent. In a future patch, we'll add
some locking around it, but for now it's only protected by the
client_mutex.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 49 +++++++++++++++++++++++++++++++++++++------------
fs/nfsd/state.h | 1 +
2 files changed, 38 insertions(+), 12 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index eaa5f9ebf444..906c8604de30 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -894,6 +894,7 @@ static void nfs4_put_stateowner(struct nfs4_stateowner *sop)
{
if (!atomic_dec_and_test(&sop->so_count))
return;
+ sop->so_ops->so_unhash(sop);
kfree(sop->so_owner.data);
sop->so_ops->so_free(sop);
}
@@ -944,9 +945,13 @@ static void __release_lock_stateid(struct nfs4_ol_stateid *stp)

static void unhash_lockowner(struct nfs4_lockowner *lo)
{
+ list_del_init(&lo->lo_owner.so_strhash);
+}
+
+static void release_lockowner_stateids(struct nfs4_lockowner *lo)
+{
struct nfs4_ol_stateid *stp;

- list_del(&lo->lo_owner.so_strhash);
while (!list_empty(&lo->lo_owner.so_stateids)) {
stp = list_first_entry(&lo->lo_owner.so_stateids,
struct nfs4_ol_stateid, st_perstateowner);
@@ -957,6 +962,7 @@ static void unhash_lockowner(struct nfs4_lockowner *lo)
static void release_lockowner(struct nfs4_lockowner *lo)
{
unhash_lockowner(lo);
+ release_lockowner_stateids(lo);
nfs4_put_stateowner(&lo->lo_owner);
}

@@ -1006,15 +1012,8 @@ static void release_open_stateid(struct nfs4_ol_stateid *stp)

static void unhash_openowner(struct nfs4_openowner *oo)
{
- struct nfs4_ol_stateid *stp;
-
- list_del(&oo->oo_owner.so_strhash);
- list_del(&oo->oo_perclient);
- while (!list_empty(&oo->oo_owner.so_stateids)) {
- stp = list_first_entry(&oo->oo_owner.so_stateids,
- struct nfs4_ol_stateid, st_perstateowner);
- release_open_stateid(stp);
- }
+ list_del_init(&oo->oo_owner.so_strhash);
+ list_del_init(&oo->oo_perclient);
}

static void release_last_closed_stateid(struct nfs4_openowner *oo)
@@ -1027,9 +1026,21 @@ static void release_last_closed_stateid(struct nfs4_openowner *oo)
}
}

+static void release_openowner_stateids(struct nfs4_openowner *oo)
+{
+ struct nfs4_ol_stateid *stp;
+
+ while (!list_empty(&oo->oo_owner.so_stateids)) {
+ stp = list_first_entry(&oo->oo_owner.so_stateids,
+ struct nfs4_ol_stateid, st_perstateowner);
+ release_open_stateid(stp);
+ }
+}
+
static void release_openowner(struct nfs4_openowner *oo)
{
unhash_openowner(oo);
+ release_openowner_stateids(oo);
list_del(&oo->oo_close_lru);
release_last_closed_stateid(oo);
nfs4_put_stateowner(&oo->oo_owner);
@@ -2994,6 +3005,13 @@ static void hash_openowner(struct nfs4_openowner *oo, struct nfs4_client *clp, u
list_add(&oo->oo_perclient, &clp->cl_openowners);
}

+static void nfs4_unhash_openowner(struct nfs4_stateowner *so)
+{
+ struct nfs4_openowner *oo = openowner(so);
+
+ unhash_openowner(oo);
+}
+
static void nfs4_free_openowner(struct nfs4_stateowner *so)
{
struct nfs4_openowner *oo = openowner(so);
@@ -3002,7 +3020,8 @@ static void nfs4_free_openowner(struct nfs4_stateowner *so)
}

static const struct nfs4_stateowner_operations openowner_ops = {
- .so_free = nfs4_free_openowner,
+ .so_unhash = nfs4_unhash_openowner,
+ .so_free = nfs4_free_openowner,
};

static struct nfs4_openowner *
@@ -4760,6 +4779,11 @@ find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
return NULL;
}

+static void nfs4_unhash_lockowner(struct nfs4_stateowner *sop)
+{
+ unhash_lockowner(lockowner(sop));
+}
+
static void nfs4_free_lockowner(struct nfs4_stateowner *sop)
{
struct nfs4_lockowner *lo = lockowner(sop);
@@ -4768,7 +4792,8 @@ static void nfs4_free_lockowner(struct nfs4_stateowner *sop)
}

static const struct nfs4_stateowner_operations lockowner_ops = {
- .so_free = nfs4_free_lockowner,
+ .so_unhash = nfs4_unhash_lockowner,
+ .so_free = nfs4_free_lockowner,
};

/*
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 9cba295812f6..232246039db0 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -335,6 +335,7 @@ struct nfs4_replay {
struct nfs4_stateowner;

struct nfs4_stateowner_operations {
+ void (*so_unhash)(struct nfs4_stateowner *);
void (*so_free)(struct nfs4_stateowner *);
};

--
1.9.3


2014-07-30 01:35:19

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 19/38] nfsd: Migrate the stateid reference into nfs4_lookup_stateid()

From: Trond Myklebust <[email protected]>

Allow nfs4_lookup_stateid to take the stateid reference, instead
of having all the callers do so.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/nfs4state.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index b0c0f4cdf503..a4a49a3b464c 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4170,6 +4170,8 @@ nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
*s = find_stateid_by_type(cstate->clp, stateid, typemask);
if (!*s)
return nfserr_bad_stateid;
+ /* FIXME: move into find_stateid_by_type */
+ atomic_inc(&(*s)->sc_count);
return nfs_ok;
}

@@ -4204,7 +4206,7 @@ nfs4_preprocess_stateid_op(struct net *net, struct nfsd4_compound_state *cstate,
NFS4_DELEG_STID|NFS4_OPEN_STID|NFS4_LOCK_STID,
&s, nn);
if (status)
- goto out;
+ goto unlock_state;
status = check_stateid_generation(stateid, &s->sc_stateid, nfsd4_has_session(cstate));
if (status)
goto out;
@@ -4253,6 +4255,8 @@ nfs4_preprocess_stateid_op(struct net *net, struct nfsd4_compound_state *cstate,
if (file)
*filpp = file;
out:
+ nfs4_put_stid(s);
+unlock_state:
nfs4_unlock_state();
return status;
}
@@ -4390,11 +4394,10 @@ nfs4_preprocess_seqid_op(struct nfsd4_compound_state *cstate, u32 seqid,
cstate->replay_owner = stp->st_stateowner;

status = nfs4_seqid_op_checks(cstate, stateid, seqid, stp);
- if (!status) {
- /* FIXME: move into find_stateid_by_type */
- atomic_inc(&stp->st_stid.sc_count);
+ if (!status)
*stpp = stp;
- }
+ else
+ nfs4_put_stid(&stp->st_stid);
return status;
}

@@ -4623,9 +4626,11 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
dp = delegstateid(s);
status = check_stateid_generation(stateid, &dp->dl_stid.sc_stateid, nfsd4_has_session(cstate));
if (status)
- goto out;
+ goto put_stateid;

destroy_delegation(dp);
+put_stateid:
+ nfs4_put_stid(&dp->dl_stid);
out:
nfs4_unlock_state();

--
1.9.3


2014-07-30 20:59:27

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 05/38] nfsd4: use cl_lock to synchronize all stateid idr calls

On Tue, Jul 29, 2014 at 09:34:10PM -0400, Jeff Layton wrote:
> Currently, this is serialized by the client_mutex, which is slated for
> removal. Add finer-grained locking here. Also, do some cleanup around
> find_stateid to prepare for taking references.
>
> Signed-off-by: Trond Myklebust <[email protected]>
> Signed-off-by: Benny Halevy <[email protected]>
> Signed-off-by: Jeff Layton <[email protected]>

Looks good,

Reviewed-by: Christoph Hellwig <[email protected]>

2014-07-30 01:35:43

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 35/38] nfsd: close potential race in nfsd4_free_stateid

Once we remove the client_mutex, it'll be possible for the sc_type of a
lock stateid to change after it's found and checked, but before we can
go to destroy it. If that happens, we can end up putting the persistent
reference to the stateid more than once, and unhash it more than once.

Fix this by unhashing the lock stateid prior to dropping the cl_lock but
after finding it.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 9358cbe2283d..9c7dcbb68094 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4397,17 +4397,6 @@ unlock_state:
return status;
}

-static __be32
-nfsd4_free_lock_stateid(struct nfs4_ol_stateid *stp)
-{
- struct nfs4_lockowner *lo = lockowner(stp->st_stateowner);
-
- if (check_for_locks(stp->st_stid.sc_file, lo))
- return nfserr_locks_held;
- release_lock_stateid(stp);
- return nfs_ok;
-}
-
/*
* Test if the stateid is valid
*/
@@ -4434,6 +4423,7 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
stateid_t *stateid = &free_stateid->fr_stateid;
struct nfs4_stid *s;
struct nfs4_delegation *dp;
+ struct nfs4_ol_stateid *stp;
struct nfs4_client *cl = cstate->session->se_client;
__be32 ret = nfserr_bad_stateid;

@@ -4456,8 +4446,15 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
ret = check_stateid_generation(stateid, &s->sc_stateid, 1);
if (ret)
break;
+ stp = openlockstateid(s);
+ ret = nfserr_locks_held;
+ if (check_for_locks(stp->st_stid.sc_file,
+ lockowner(stp->st_stateowner)))
+ break;
+ unhash_lock_stateid(stp);
spin_unlock(&cl->cl_lock);
- ret = nfsd4_free_lock_stateid(openlockstateid(s));
+ nfs4_put_stid(s);
+ ret = nfs_ok;
goto out;
case NFS4_REVOKED_DELEG_STID:
dp = delegstateid(s);
--
1.9.3


2014-07-30 01:35:41

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 34/38] nfsd: optimize destroy_lockowner cl_lock thrashing

Reduce the cl_lock trashing in destroy_lockowner. Unhash all of the
lockstateids on the lockowner's list. Put the reference under the lock
and see if it was the last one. If so, then add it to a private list
to be destroyed after we drop the lock.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 51 ++++++++++++++++++++++++++++++++++-----------------
1 file changed, 34 insertions(+), 17 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 9b342e164407..9358cbe2283d 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -983,14 +983,23 @@ static void nfs4_free_lock_stateid(struct nfs4_stid *stid)
nfs4_free_ol_stateid(stid);
}

-static void release_lock_stateid(struct nfs4_ol_stateid *stp)
+static void unhash_lock_stateid(struct nfs4_ol_stateid *stp)
{
struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);

- spin_lock(&oo->oo_owner.so_client->cl_lock);
- list_del(&stp->st_locks);
+ lockdep_assert_held(&oo->oo_owner.so_client->cl_lock);
+
+ list_del_init(&stp->st_locks);
unhash_generic_stateid(stp);
unhash_stid(&stp->st_stid);
+}
+
+static void release_lock_stateid(struct nfs4_ol_stateid *stp)
+{
+ struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);
+
+ spin_lock(&oo->oo_owner.so_client->cl_lock);
+ unhash_lock_stateid(stp);
spin_unlock(&oo->oo_owner.so_client->cl_lock);
nfs4_put_stid(&stp->st_stid);
}
@@ -1004,30 +1013,38 @@ static void unhash_lockowner_locked(struct nfs4_lockowner *lo)
list_del_init(&lo->lo_owner.so_strhash);
}

-static void release_lockowner_stateids(struct nfs4_lockowner *lo)
+static void release_lockowner(struct nfs4_lockowner *lo)
{
struct nfs4_client *clp = lo->lo_owner.so_client;
struct nfs4_ol_stateid *stp;
+ struct list_head reaplist;

- lockdep_assert_held(&clp->cl_lock);
+ INIT_LIST_HEAD(&reaplist);

+ spin_lock(&clp->cl_lock);
+ unhash_lockowner_locked(lo);
while (!list_empty(&lo->lo_owner.so_stateids)) {
stp = list_first_entry(&lo->lo_owner.so_stateids,
struct nfs4_ol_stateid, st_perstateowner);
- spin_unlock(&clp->cl_lock);
- release_lock_stateid(stp);
- spin_lock(&clp->cl_lock);
+ unhash_lock_stateid(stp);
+ /*
+ * We now know that no new references can be added to the
+ * stateid. If ours is the last one, finish the unhashing
+ * and put it on the list to be reaped.
+ */
+ if (atomic_dec_and_test(&stp->st_stid.sc_count)) {
+ idr_remove(&clp->cl_stateids,
+ stp->st_stid.sc_stateid.si_opaque.so_id);
+ list_add(&stp->st_locks, &reaplist);
+ }
}
-}
-
-static void release_lockowner(struct nfs4_lockowner *lo)
-{
- struct nfs4_client *clp = lo->lo_owner.so_client;
-
- spin_lock(&clp->cl_lock);
- unhash_lockowner_locked(lo);
- release_lockowner_stateids(lo);
spin_unlock(&clp->cl_lock);
+ while (!list_empty(&reaplist)) {
+ stp = list_first_entry(&reaplist, struct nfs4_ol_stateid,
+ st_locks);
+ list_del(&stp->st_locks);
+ stp->st_stid.sc_free(&stp->st_stid);
+ }
nfs4_put_stateowner(&lo->lo_owner);
}

--
1.9.3


2014-07-30 01:35:34

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 29/38] nfsd: Protect adding/removing open state owners using client_lock

From: Trond Myklebust <[email protected]>

Once we remove client mutex protection, we'll need to ensure that
stateowner lookup and creation are atomic between concurrent compounds.
Ensure that alloc_init_open_stateowner checks the hashtable under the
client_lock before adding a new element.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/nfs4state.c | 118 +++++++++++++++++++++++++++++++++++-----------------
1 file changed, 80 insertions(+), 38 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 6d26d26751f5..c4bb7f2b29d9 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -239,6 +239,53 @@ static void nfsd4_put_session(struct nfsd4_session *ses)
spin_unlock(&nn->client_lock);
}

+static int
+same_owner_str(struct nfs4_stateowner *sop, struct xdr_netobj *owner,
+ clientid_t *clid)
+{
+ return (sop->so_owner.len == owner->len) &&
+ 0 == memcmp(sop->so_owner.data, owner->data, owner->len) &&
+ (sop->so_client->cl_clientid.cl_id == clid->cl_id);
+}
+
+static struct nfs4_openowner *
+find_openstateowner_str_locked(unsigned int hashval, struct nfsd4_open *open,
+ bool sessions, struct nfsd_net *nn)
+{
+ struct nfs4_stateowner *so;
+ struct nfs4_openowner *oo;
+ struct nfs4_client *clp;
+
+ lockdep_assert_held(&nn->client_lock);
+
+ list_for_each_entry(so, &nn->ownerstr_hashtbl[hashval], so_strhash) {
+ if (!so->so_is_open_owner)
+ continue;
+ if (same_owner_str(so, &open->op_owner, &open->op_clientid)) {
+ oo = openowner(so);
+ clp = oo->oo_owner.so_client;
+ if ((bool)clp->cl_minorversion != sessions)
+ break;
+ renew_client_locked(clp);
+ atomic_inc(&so->so_count);
+ return oo;
+ }
+ }
+ return NULL;
+}
+
+static struct nfs4_openowner *
+find_openstateowner_str(unsigned int hashval, struct nfsd4_open *open,
+ bool sessions, struct nfsd_net *nn)
+{
+ struct nfs4_openowner *oo;
+
+ spin_lock(&nn->client_lock);
+ oo = find_openstateowner_str_locked(hashval, open, sessions, nn);
+ spin_unlock(&nn->client_lock);
+ return oo;
+}
+

static inline u32
opaque_hashval(const void *ptr, int nbytes)
@@ -1005,8 +1052,13 @@ static void release_open_stateid(struct nfs4_ol_stateid *stp)
nfs4_put_stid(&stp->st_stid);
}

-static void unhash_openowner(struct nfs4_openowner *oo)
+static void unhash_openowner_locked(struct nfs4_openowner *oo)
{
+ struct nfsd_net *nn = net_generic(oo->oo_owner.so_client->net,
+ nfsd_net_id);
+
+ lockdep_assert_held(&nn->client_lock);
+
list_del_init(&oo->oo_owner.so_strhash);
list_del_init(&oo->oo_perclient);
}
@@ -1025,18 +1077,29 @@ static void release_last_closed_stateid(struct nfs4_openowner *oo)
static void release_openowner_stateids(struct nfs4_openowner *oo)
{
struct nfs4_ol_stateid *stp;
+ struct nfsd_net *nn = net_generic(oo->oo_owner.so_client->net,
+ nfsd_net_id);
+
+ lockdep_assert_held(&nn->client_lock);

while (!list_empty(&oo->oo_owner.so_stateids)) {
stp = list_first_entry(&oo->oo_owner.so_stateids,
struct nfs4_ol_stateid, st_perstateowner);
+ spin_unlock(&nn->client_lock);
release_open_stateid(stp);
+ spin_lock(&nn->client_lock);
}
}

static void release_openowner(struct nfs4_openowner *oo)
{
- unhash_openowner(oo);
+ struct nfsd_net *nn = net_generic(oo->oo_owner.so_client->net,
+ nfsd_net_id);
+
+ spin_lock(&nn->client_lock);
+ unhash_openowner_locked(oo);
release_openowner_stateids(oo);
+ spin_unlock(&nn->client_lock);
release_last_closed_stateid(oo);
nfs4_put_stateowner(&oo->oo_owner);
}
@@ -3004,8 +3067,11 @@ static void hash_openowner(struct nfs4_openowner *oo, struct nfs4_client *clp, u
static void nfs4_unhash_openowner(struct nfs4_stateowner *so)
{
struct nfs4_openowner *oo = openowner(so);
+ struct nfsd_net *nn = net_generic(so->so_client->net, nfsd_net_id);

- unhash_openowner(oo);
+ spin_lock(&nn->client_lock);
+ unhash_openowner_locked(oo);
+ spin_unlock(&nn->client_lock);
}

static void nfs4_free_openowner(struct nfs4_stateowner *so)
@@ -3025,7 +3091,8 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
struct nfsd4_compound_state *cstate)
{
struct nfs4_client *clp = cstate->clp;
- struct nfs4_openowner *oo;
+ struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+ struct nfs4_openowner *oo, *ret;

oo = alloc_stateowner(openowner_slab, &open->op_owner, clp);
if (!oo)
@@ -3039,7 +3106,15 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
oo->oo_time = 0;
oo->oo_last_closed_stid = NULL;
INIT_LIST_HEAD(&oo->oo_close_lru);
- hash_openowner(oo, clp, strhashval);
+ spin_lock(&nn->client_lock);
+ ret = find_openstateowner_str_locked(strhashval,
+ open, clp->cl_minorversion, nn);
+ if (ret == NULL) {
+ hash_openowner(oo, clp, strhashval);
+ ret = oo;
+ } else
+ nfs4_free_openowner(&oo->oo_owner);
+ spin_unlock(&nn->client_lock);
return oo;
}

@@ -3100,39 +3175,6 @@ move_to_close_lru(struct nfs4_ol_stateid *s, struct net *net)
oo->oo_time = get_seconds();
}

-static int
-same_owner_str(struct nfs4_stateowner *sop, struct xdr_netobj *owner,
- clientid_t *clid)
-{
- return (sop->so_owner.len == owner->len) &&
- 0 == memcmp(sop->so_owner.data, owner->data, owner->len) &&
- (sop->so_client->cl_clientid.cl_id == clid->cl_id);
-}
-
-static struct nfs4_openowner *
-find_openstateowner_str(unsigned int hashval, struct nfsd4_open *open,
- bool sessions, struct nfsd_net *nn)
-{
- struct nfs4_stateowner *so;
- struct nfs4_openowner *oo;
- struct nfs4_client *clp;
-
- list_for_each_entry(so, &nn->ownerstr_hashtbl[hashval], so_strhash) {
- if (!so->so_is_open_owner)
- continue;
- if (same_owner_str(so, &open->op_owner, &open->op_clientid)) {
- oo = openowner(so);
- clp = oo->oo_owner.so_client;
- if ((bool)clp->cl_minorversion != sessions)
- return NULL;
- renew_client(oo->oo_owner.so_client);
- atomic_inc(&oo->oo_owner.so_count);
- return oo;
- }
- }
- return NULL;
-}
-
/* search file_hashtbl[] for file */
static struct nfs4_file *
find_file_locked(struct knfsd_fh *fh)
--
1.9.3


2014-07-30 01:35:39

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 32/38] nfsd: clean up and reorganize release_lockowner

Do more within the main loop, and simplify the function a bit. Also,
there's no need to take a stateowner reference unless we're going to call
release_lockowner.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 49 ++++++++++++++++++-------------------------------
1 file changed, 18 insertions(+), 31 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 4af4e5eff491..cd7d7df03afa 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5424,8 +5424,8 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
struct nfsd4_release_lockowner *rlockowner)
{
clientid_t *clid = &rlockowner->rl_clientid;
- struct nfs4_stateowner *sop = NULL, *tmp;
- struct nfs4_lockowner *lo;
+ struct nfs4_stateowner *sop;
+ struct nfs4_lockowner *lo = NULL;
struct nfs4_ol_stateid *stp;
struct xdr_netobj *owner = &rlockowner->rl_owner;
unsigned int hashval = ownerstr_hashval(owner);
@@ -5442,45 +5442,32 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
if (status)
goto out;

- status = nfserr_locks_held;
-
clp = cstate->clp;
/* Find the matching lock stateowner */
spin_lock(&clp->cl_lock);
- list_for_each_entry(tmp, &clp->cl_ownerstr_hashtbl[hashval],
+ list_for_each_entry(sop, &clp->cl_ownerstr_hashtbl[hashval],
so_strhash) {
- if (tmp->so_is_open_owner)
- continue;
- if (same_owner_str(tmp, owner)) {
- sop = tmp;
- atomic_inc(&sop->so_count);
- break;
- }
- }

- /* No matching owner found, maybe a replay? Just declare victory... */
- if (!sop) {
- spin_unlock(&clp->cl_lock);
- status = nfs_ok;
- goto out;
- }
+ if (sop->so_is_open_owner || !same_owner_str(sop, owner))
+ continue;

- lo = lockowner(sop);
- /* see if there are still any locks associated with it */
- list_for_each_entry(stp, &sop->so_stateids, st_perstateowner) {
- if (check_for_locks(stp->st_stid.sc_file, lo)) {
- spin_unlock(&clp->cl_lock);
- goto out;
+ /* see if there are still any locks associated with it */
+ lo = lockowner(sop);
+ list_for_each_entry(stp, &sop->so_stateids, st_perstateowner) {
+ if (check_for_locks(stp->st_stid.sc_file, lo)) {
+ status = nfserr_locks_held;
+ spin_unlock(&clp->cl_lock);
+ goto out;
+ }
}
+
+ atomic_inc(&sop->so_count);
+ break;
}
spin_unlock(&clp->cl_lock);
-
- status = nfs_ok;
- sop = NULL;
- release_lockowner(lo);
+ if (lo)
+ release_lockowner(lo);
out:
- if (sop)
- nfs4_put_stateowner(sop);
nfs4_unlock_state();
return status;
}
--
1.9.3


2014-07-30 01:35:46

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 37/38] nfsd: don't thrash the cl_lock while freeing an open stateid

When we remove the client_mutex, we'll have a potential race between
FREE_STATEID and CLOSE.

The root of the problem is that we are walking the st_locks list,
dropping the spinlock and then trying to release the persistent
reference to the lockstateid. In between, a FREE_STATEID call can come
along and take the lock, find the stateid and then try to put the
reference. That leads to a double put.

Fix this by not releasing the cl_lock in order to release each lock
stateid. Use put_generic_stateid_locked to unhash them and gather them
onto a list, and free_ol_stateid_reaplist to free any that end up on the
list.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 34 +++++++++++++++++++---------------
1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 879342bc16b2..1f67a96c4941 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1077,27 +1077,26 @@ static void release_lockowner(struct nfs4_lockowner *lo)
nfs4_put_stateowner(&lo->lo_owner);
}

-static void release_open_stateid_locks(struct nfs4_ol_stateid *open_stp)
- __releases(&open_stp->st_stateowner->so_client->cl_lock)
- __acquires(&open_stp->st_stateowner->so_client->cl_lock)
+static void release_open_stateid_locks(struct nfs4_ol_stateid *open_stp,
+ struct list_head *reaplist)
{
struct nfs4_ol_stateid *stp;

while (!list_empty(&open_stp->st_locks)) {
stp = list_entry(open_stp->st_locks.next,
struct nfs4_ol_stateid, st_locks);
- spin_unlock(&open_stp->st_stateowner->so_client->cl_lock);
- release_lock_stateid(stp);
- spin_lock(&open_stp->st_stateowner->so_client->cl_lock);
+ unhash_lock_stateid(stp);
+ put_ol_stateid_locked(stp, reaplist);
}
}

-static void unhash_open_stateid(struct nfs4_ol_stateid *stp)
+static void unhash_open_stateid(struct nfs4_ol_stateid *stp,
+ struct list_head *reaplist)
{
lockdep_assert_held(&stp->st_stid.sc_client->cl_lock);

unhash_generic_stateid(stp);
- release_open_stateid_locks(stp);
+ release_open_stateid_locks(stp, reaplist);
}

static void release_open_stateid(struct nfs4_ol_stateid *stp)
@@ -1105,7 +1104,7 @@ static void release_open_stateid(struct nfs4_ol_stateid *stp)
LIST_HEAD(reaplist);

spin_lock(&stp->st_stid.sc_client->cl_lock);
- unhash_open_stateid(stp);
+ unhash_open_stateid(stp, &reaplist);
put_ol_stateid_locked(stp, &reaplist);
spin_unlock(&stp->st_stid.sc_client->cl_lock);
free_ol_stateid_reaplist(&reaplist);
@@ -1145,7 +1144,7 @@ static void release_openowner(struct nfs4_openowner *oo)
while (!list_empty(&oo->oo_owner.so_stateids)) {
stp = list_first_entry(&oo->oo_owner.so_stateids,
struct nfs4_ol_stateid, st_perstateowner);
- unhash_open_stateid(stp);
+ unhash_open_stateid(stp, &reaplist);
put_ol_stateid_locked(stp, &reaplist);
}
spin_unlock(&clp->cl_lock);
@@ -4701,16 +4700,21 @@ out:
static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
{
struct nfs4_client *clp = s->st_stid.sc_client;
+ LIST_HEAD(reaplist);

s->st_stid.sc_type = NFS4_CLOSED_STID;
spin_lock(&clp->cl_lock);
- unhash_open_stateid(s);
- spin_unlock(&clp->cl_lock);
+ unhash_open_stateid(s, &reaplist);

- if (clp->cl_minorversion)
- nfs4_put_stid(&s->st_stid);
- else
+ if (clp->cl_minorversion) {
+ put_ol_stateid_locked(s, &reaplist);
+ spin_unlock(&clp->cl_lock);
+ free_ol_stateid_reaplist(&reaplist);
+ } else {
+ spin_unlock(&clp->cl_lock);
+ free_ol_stateid_reaplist(&reaplist);
move_to_close_lru(s, clp->net);
+ }
}

/*
--
1.9.3


2014-07-30 01:35:13

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 15/38] nfsd: Prepare nfsd4_close() for open stateid referencing

From: Trond Myklebust <[email protected]>

Prepare nfsd4_close for a future where nfs4_preprocess_seqid_op()
hands it a fully referenced open stateid. Necessary step toward
client_mutex removal.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/nfs4state.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index f3018cb26769..4e50f14f5bc1 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4579,10 +4579,15 @@ nfsd4_close(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
nfsd4_bump_seqid(cstate, status);
if (status)
goto out;
+ /* FIXME: move into nfs4_preprocess_seqid_op */
+ atomic_inc(&stp->st_stid.sc_count);
update_stateid(&stp->st_stid.sc_stateid);
memcpy(&close->cl_stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));

nfsd4_close_open_stateid(stp);
+
+ /* put reference from nfs4_preprocess_seqid_op */
+ nfs4_put_stid(&stp->st_stid);
out:
if (!cstate->replay_owner)
nfs4_unlock_state();
--
1.9.3


2014-07-30 01:35:24

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 22/38] nfsd: Add a mutex to protect the NFSv4.0 open owner replay cache

We don't want to rely on the client_mutex for protection in the case of
NFSv4 open owners. Instead, we add a mutex that will only be taken for
NFSv4.0 state mutating operations, and that will be released once the
entire compound is done.

Also, ensure that nfsd4_cstate_assign_replay/nfsd4_cstate_clear_replay
take a reference to the stateowner when they are using it for NFSv4.0
open and lock replay caching.

Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4proc.c | 12 +++---------
fs/nfsd/nfs4state.c | 47 +++++++++++++++++++++++++++++++++--------------
fs/nfsd/nfs4xdr.c | 2 --
fs/nfsd/state.h | 1 +
fs/nfsd/xdr4.h | 5 ++++-
5 files changed, 41 insertions(+), 26 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 8611585f739d..29cf395b694e 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -469,12 +469,9 @@ out:
fh_put(resfh);
kfree(resfh);
}
- nfsd4_cleanup_open_state(open, status);
- if (open->op_openowner && !nfsd4_has_session(cstate))
- cstate->replay_owner = &open->op_openowner->oo_owner;
+ nfsd4_cleanup_open_state(cstate, open, status);
nfsd4_bump_seqid(cstate, status);
- if (!cstate->replay_owner)
- nfs4_unlock_state();
+ nfs4_unlock_state();
return status;
}

@@ -1395,10 +1392,7 @@ encode_op:
args->ops, args->opcnt, resp->opcnt, op->opnum,
be32_to_cpu(status));

- if (cstate->replay_owner) {
- nfs4_unlock_state();
- cstate->replay_owner = NULL;
- }
+ nfsd4_cstate_clear_replay(cstate);
/* XXX Ugh, we need to get rid of this kind of special case: */
if (op->opnum == OP_READ && op->u.read.rd_filp)
fput(op->u.read.rd_filp);
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 5a93e5fafd4a..749608b914b4 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1069,7 +1069,7 @@ void nfsd4_bump_seqid(struct nfsd4_compound_state *cstate, __be32 nfserr)
return;

if (!seqid_mutating_err(ntohl(nfserr))) {
- cstate->replay_owner = NULL;
+ nfsd4_cstate_clear_replay(cstate);
return;
}
if (!so)
@@ -2940,6 +2940,28 @@ static void init_nfs4_replay(struct nfs4_replay *rp)
rp->rp_status = nfserr_serverfault;
rp->rp_buflen = 0;
rp->rp_buf = rp->rp_ibuf;
+ mutex_init(&rp->rp_mutex);
+}
+
+static void nfsd4_cstate_assign_replay(struct nfsd4_compound_state *cstate,
+ struct nfs4_stateowner *so)
+{
+ if (!nfsd4_has_session(cstate)) {
+ mutex_lock(&so->so_replay.rp_mutex);
+ cstate->replay_owner = so;
+ atomic_inc(&so->so_count);
+ }
+}
+
+void nfsd4_cstate_clear_replay(struct nfsd4_compound_state *cstate)
+{
+ struct nfs4_stateowner *so = cstate->replay_owner;
+
+ if (so != NULL) {
+ cstate->replay_owner = NULL;
+ mutex_unlock(&so->so_replay.rp_mutex);
+ nfs4_put_stateowner(so);
+ }
}

static inline void *alloc_stateowner(struct kmem_cache *slab, struct xdr_netobj *owner, struct nfs4_client *clp)
@@ -3855,7 +3877,8 @@ out:
return status;
}

-void nfsd4_cleanup_open_state(struct nfsd4_open *open, __be32 status)
+void nfsd4_cleanup_open_state(struct nfsd4_compound_state *cstate,
+ struct nfsd4_open *open, __be32 status)
{
if (open->op_openowner) {
struct nfs4_openowner *oo = open->op_openowner;
@@ -3869,6 +3892,8 @@ void nfsd4_cleanup_open_state(struct nfsd4_open *open, __be32 status)
} else
oo->oo_flags &= ~NFS4_OO_NEW;
}
+ if (open->op_openowner)
+ nfsd4_cstate_assign_replay(cstate, &oo->oo_owner);
}
if (open->op_file)
nfsd4_free_file(open->op_file);
@@ -4399,8 +4424,7 @@ nfs4_preprocess_seqid_op(struct nfsd4_compound_state *cstate, u32 seqid,
if (status)
return status;
stp = openlockstateid(s);
- if (!nfsd4_has_session(cstate))
- cstate->replay_owner = stp->st_stateowner;
+ nfsd4_cstate_assign_replay(cstate, stp->st_stateowner);

status = nfs4_seqid_op_checks(cstate, stateid, seqid, stp);
if (!status)
@@ -4469,8 +4493,7 @@ put_stateid:
nfs4_put_stid(&stp->st_stid);
out:
nfsd4_bump_seqid(cstate, status);
- if (!cstate->replay_owner)
- nfs4_unlock_state();
+ nfs4_unlock_state();
return status;
}

@@ -4544,8 +4567,7 @@ put_stateid:
nfs4_put_stid(&stp->st_stid);
out:
nfsd4_bump_seqid(cstate, status);
- if (!cstate->replay_owner)
- nfs4_unlock_state();
+ nfs4_unlock_state();
return status;
}

@@ -4610,8 +4632,7 @@ nfsd4_close(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
/* put reference from nfs4_preprocess_seqid_op */
nfs4_put_stid(&stp->st_stid);
out:
- if (!cstate->replay_owner)
- nfs4_unlock_state();
+ nfs4_unlock_state();
return status;
}

@@ -5071,8 +5092,7 @@ out:
if (status && new_state)
release_lock_stateid(lock_stp);
nfsd4_bump_seqid(cstate, status);
- if (!cstate->replay_owner)
- nfs4_unlock_state();
+ nfs4_unlock_state();
if (file_lock)
locks_free_lock(file_lock);
if (conflock)
@@ -5236,8 +5256,7 @@ put_stateid:
nfs4_put_stid(&stp->st_stid);
out:
nfsd4_bump_seqid(cstate, status);
- if (!cstate->replay_owner)
- nfs4_unlock_state();
+ nfs4_unlock_state();
if (file_lock)
locks_free_lock(file_lock);
return status;
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 628b430e743e..72a2a82e11a4 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3925,8 +3925,6 @@ status:
*
* XDR note: do not encode rp->rp_buflen: the buffer contains the
* previously sent already encoded operation.
- *
- * called with nfs4_lock_state() held
*/
void
nfsd4_encode_replay(struct xdr_stream *xdr, struct nfsd4_op *op)
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index dc725deb4aa8..9cba295812f6 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -328,6 +328,7 @@ struct nfs4_replay {
unsigned int rp_buflen;
char *rp_buf;
struct knfsd_fh rp_openfh;
+ struct mutex rp_mutex;
char rp_ibuf[NFSD4_REPLAY_ISIZE];
};

diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 5abf6c942ddf..465e7799742a 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -599,7 +599,9 @@ extern __be32 nfsd4_process_open1(struct nfsd4_compound_state *,
struct nfsd4_open *open, struct nfsd_net *nn);
extern __be32 nfsd4_process_open2(struct svc_rqst *rqstp,
struct svc_fh *current_fh, struct nfsd4_open *open);
-extern void nfsd4_cleanup_open_state(struct nfsd4_open *open, __be32 status);
+extern void nfsd4_cstate_clear_replay(struct nfsd4_compound_state *cstate);
+extern void nfsd4_cleanup_open_state(struct nfsd4_compound_state *cstate,
+ struct nfsd4_open *open, __be32 status);
extern __be32 nfsd4_open_confirm(struct svc_rqst *rqstp,
struct nfsd4_compound_state *, struct nfsd4_open_confirm *oc);
extern __be32 nfsd4_close(struct svc_rqst *rqstp,
@@ -630,6 +632,7 @@ extern __be32 nfsd4_test_stateid(struct svc_rqst *rqstp,
extern __be32 nfsd4_free_stateid(struct svc_rqst *rqstp,
struct nfsd4_compound_state *, struct nfsd4_free_stateid *free_stateid);
extern void nfsd4_bump_seqid(struct nfsd4_compound_state *, __be32 nfserr);
+
#endif

/*
--
1.9.3


2014-07-30 01:35:22

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 21/38] nfsd: Add reference counting to state owners

The way stateowners are managed today is somewhat awkward. They need to
be explicitly destroyed, even though the stateids reference them. This
will be particularly problematic when we remove the client_mutex.

We may create a new stateowner and attempt to open a file or set a lock,
and have that fail. In the meantime, another RPC may come in that uses
that same stateowner and succeed. We can't have the first task tearing
down the stateowner in that situation.

To fix this, we need to change how stateowners are tracked altogether.
Refcount them and only destroy them once all stateids that reference
them have been destroyed. This patch starts by adding the refcounting
necessary to do that.

Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 49 +++++++++++++++++++++++++++++++++++--------------
fs/nfsd/state.h | 22 +++++++++++++++-------
2 files changed, 50 insertions(+), 21 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 653de6b14665..5a93e5fafd4a 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -890,6 +890,14 @@ release_all_access(struct nfs4_ol_stateid *stp)
}
}

+static void nfs4_put_stateowner(struct nfs4_stateowner *sop)
+{
+ if (!atomic_dec_and_test(&sop->so_count))
+ return;
+ kfree(sop->so_owner.data);
+ sop->so_ops->so_free(sop);
+}
+
static void unhash_generic_stateid(struct nfs4_ol_stateid *stp)
{
struct nfs4_file *fp = stp->st_stid.sc_file;
@@ -946,16 +954,10 @@ static void unhash_lockowner(struct nfs4_lockowner *lo)
}
}

-static void nfs4_free_lockowner(struct nfs4_lockowner *lo)
-{
- kfree(lo->lo_owner.so_owner.data);
- kmem_cache_free(lockowner_slab, lo);
-}
-
static void release_lockowner(struct nfs4_lockowner *lo)
{
unhash_lockowner(lo);
- nfs4_free_lockowner(lo);
+ nfs4_put_stateowner(&lo->lo_owner);
}

static void release_lockowner_if_empty(struct nfs4_lockowner *lo)
@@ -1025,18 +1027,12 @@ static void release_last_closed_stateid(struct nfs4_openowner *oo)
}
}

-static void nfs4_free_openowner(struct nfs4_openowner *oo)
-{
- kfree(oo->oo_owner.so_owner.data);
- kmem_cache_free(openowner_slab, oo);
-}
-
static void release_openowner(struct nfs4_openowner *oo)
{
unhash_openowner(oo);
list_del(&oo->oo_close_lru);
release_last_closed_stateid(oo);
- nfs4_free_openowner(oo);
+ nfs4_put_stateowner(&oo->oo_owner);
}

static inline int
@@ -2964,6 +2960,7 @@ static inline void *alloc_stateowner(struct kmem_cache *slab, struct xdr_netobj
INIT_LIST_HEAD(&sop->so_stateids);
sop->so_client = clp;
init_nfs4_replay(&sop->so_replay);
+ atomic_set(&sop->so_count, 1);
return sop;
}

@@ -2975,6 +2972,17 @@ static void hash_openowner(struct nfs4_openowner *oo, struct nfs4_client *clp, u
list_add(&oo->oo_perclient, &clp->cl_openowners);
}

+static void nfs4_free_openowner(struct nfs4_stateowner *so)
+{
+ struct nfs4_openowner *oo = openowner(so);
+
+ kmem_cache_free(openowner_slab, oo);
+}
+
+static const struct nfs4_stateowner_operations openowner_ops = {
+ .so_free = nfs4_free_openowner,
+};
+
static struct nfs4_openowner *
alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
struct nfsd4_compound_state *cstate)
@@ -2985,6 +2993,7 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
oo = alloc_stateowner(openowner_slab, &open->op_owner, clp);
if (!oo)
return NULL;
+ oo->oo_owner.so_ops = &openowner_ops;
oo->oo_owner.so_is_open_owner = 1;
oo->oo_owner.so_seqid = open->op_seqid;
oo->oo_flags = NFS4_OO_NEW;
@@ -4729,6 +4738,17 @@ find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
return NULL;
}

+static void nfs4_free_lockowner(struct nfs4_stateowner *sop)
+{
+ struct nfs4_lockowner *lo = lockowner(sop);
+
+ kmem_cache_free(lockowner_slab, lo);
+}
+
+static const struct nfs4_stateowner_operations lockowner_ops = {
+ .so_free = nfs4_free_lockowner,
+};
+
/*
* Alloc a lock owner structure.
* Called in nfsd4_lock - therefore, OPEN and OPEN_CONFIRM (if needed) has
@@ -4749,6 +4769,7 @@ alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp, str
/* It is the openowner seqid that will be incremented in encode in the
* case of new lockowners; so increment the lock seqid manually: */
lo->lo_owner.so_seqid = lock->lk_new_lock_seqid + 1;
+ lo->lo_owner.so_ops = &lockowner_ops;
list_add(&lo->lo_owner.so_strhash, &nn->ownerstr_hashtbl[strhashval]);
return lo;
}
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index af1d9c42e939..dc725deb4aa8 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -331,16 +331,24 @@ struct nfs4_replay {
char rp_ibuf[NFSD4_REPLAY_ISIZE];
};

+struct nfs4_stateowner;
+
+struct nfs4_stateowner_operations {
+ void (*so_free)(struct nfs4_stateowner *);
+};
+
struct nfs4_stateowner {
- struct list_head so_strhash; /* hash by op_name */
- struct list_head so_stateids;
- struct nfs4_client * so_client;
+ struct list_head so_strhash;
+ struct list_head so_stateids;
+ struct nfs4_client *so_client;
+ const struct nfs4_stateowner_operations *so_ops;
/* after increment in ENCODE_SEQID_OP_TAIL, represents the next
* sequence id expected from the client: */
- u32 so_seqid;
- struct xdr_netobj so_owner; /* open owner name */
- struct nfs4_replay so_replay;
- bool so_is_open_owner;
+ atomic_t so_count;
+ u32 so_seqid;
+ struct xdr_netobj so_owner; /* open owner name */
+ struct nfs4_replay so_replay;
+ bool so_is_open_owner;
};

struct nfs4_openowner {
--
1.9.3


2014-07-30 01:35:00

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 07/38] nfsd: Add locking to protect the state owner lists

Change to using the clp->cl_lock for this. For now, there's a lot of
cl_lock thrashing, but in later patches we'll eliminate that and close
the potential races that can occur when releasing the cl_lock while
walking the lists. For now, the client_mutex prevents those races.

Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 21 +++++++++++++++++++--
1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 8ce5894133aa..3ac6e2fdabe5 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -893,6 +893,8 @@ static void unhash_generic_stateid(struct nfs4_ol_stateid *stp)
{
struct nfs4_file *fp = stp->st_stid.sc_file;

+ lockdep_assert_held(&stp->st_stateowner->so_client->cl_lock);
+
spin_lock(&fp->fi_lock);
list_del(&stp->st_perfile);
spin_unlock(&fp->fi_lock);
@@ -921,9 +923,13 @@ static void nfs4_free_lock_stateid(struct nfs4_stid *stid)

static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
{
+ struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);
+
+ spin_lock(&oo->oo_owner.so_client->cl_lock);
list_del(&stp->st_locks);
unhash_generic_stateid(stp);
unhash_stid(&stp->st_stid);
+ spin_unlock(&oo->oo_owner.so_client->cl_lock);
nfs4_put_stid(&stp->st_stid);
}

@@ -967,20 +973,26 @@ static void release_lock_stateid(struct nfs4_ol_stateid *stp)
}

static void release_open_stateid_locks(struct nfs4_ol_stateid *open_stp)
+ __releases(&open_stp->st_stateowner->so_client->cl_lock)
+ __acquires(&open_stp->st_stateowner->so_client->cl_lock)
{
struct nfs4_ol_stateid *stp;

while (!list_empty(&open_stp->st_locks)) {
stp = list_entry(open_stp->st_locks.next,
struct nfs4_ol_stateid, st_locks);
+ spin_unlock(&open_stp->st_stateowner->so_client->cl_lock);
release_lock_stateid(stp);
+ spin_lock(&open_stp->st_stateowner->so_client->cl_lock);
}
}

static void unhash_open_stateid(struct nfs4_ol_stateid *stp)
{
+ spin_lock(&stp->st_stateowner->so_client->cl_lock);
unhash_generic_stateid(stp);
release_open_stateid_locks(stp);
+ spin_unlock(&stp->st_stateowner->so_client->cl_lock);
}

static void release_open_stateid(struct nfs4_ol_stateid *stp)
@@ -2996,16 +3008,18 @@ static void init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,

stp->st_stid.sc_type = NFS4_OPEN_STID;
INIT_LIST_HEAD(&stp->st_locks);
- list_add(&stp->st_perstateowner, &oo->oo_owner.so_stateids);
stp->st_stateowner = &oo->oo_owner;
get_nfs4_file(fp);
stp->st_stid.sc_file = fp;
stp->st_access_bmap = 0;
stp->st_deny_bmap = 0;
stp->st_openstp = NULL;
+ spin_lock(&oo->oo_owner.so_client->cl_lock);
+ list_add(&stp->st_perstateowner, &oo->oo_owner.so_stateids);
spin_lock(&fp->fi_lock);
list_add(&stp->st_perfile, &fp->fi_stateids);
spin_unlock(&fp->fi_lock);
+ spin_unlock(&oo->oo_owner.so_client->cl_lock);
}

static void
@@ -4711,6 +4725,7 @@ alloc_init_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp,
struct nfs4_ol_stateid *open_stp)
{
struct nfs4_stid *s;
+ struct nfs4_openowner *oo = openowner(open_stp->st_stateowner);
struct nfs4_ol_stateid *stp;
struct nfs4_client *clp = lo->lo_owner.so_client;

@@ -4719,7 +4734,6 @@ alloc_init_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp,
return NULL;
stp = openlockstateid(s);
stp->st_stid.sc_type = NFS4_LOCK_STID;
- list_add(&stp->st_perstateowner, &lo->lo_owner.so_stateids);
stp->st_stateowner = &lo->lo_owner;
get_nfs4_file(fp);
stp->st_stid.sc_file = fp;
@@ -4727,10 +4741,13 @@ alloc_init_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp,
stp->st_access_bmap = 0;
stp->st_deny_bmap = open_stp->st_deny_bmap;
stp->st_openstp = open_stp;
+ spin_lock(&oo->oo_owner.so_client->cl_lock);
list_add(&stp->st_locks, &open_stp->st_locks);
+ list_add(&stp->st_perstateowner, &lo->lo_owner.so_stateids);
spin_lock(&fp->fi_lock);
list_add(&stp->st_perfile, &fp->fi_stateids);
spin_unlock(&fp->fi_lock);
+ spin_unlock(&oo->oo_owner.so_client->cl_lock);
return stp;
}

--
1.9.3


2014-07-30 01:34:52

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 02/38] nfsd: Cleanup the freeing of stateids

Add a ->free() callback to the struct nfs4_stid, so that we can
release a reference to the stid without caring about the contents.

Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4callback.c | 2 +-
fs/nfsd/nfs4state.c | 105 ++++++++++++++++++++++++++-----------------------
fs/nfsd/state.h | 3 +-
3 files changed, 58 insertions(+), 52 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index e9813389687b..8574c708cf8c 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -905,7 +905,7 @@ static void nfsd4_cb_recall_release(void *calldata)
spin_lock(&clp->cl_lock);
list_del(&cb->cb_per_client);
spin_unlock(&clp->cl_lock);
- nfs4_put_delegation(dp);
+ nfs4_put_stid(&dp->dl_stid);
}
}

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 3b0836db8553..60ab22b6e099 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -71,6 +71,7 @@ static u64 current_sessionid = 1;

/* forward declarations */
static bool check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner);
+static void nfs4_free_ol_stateid(struct nfs4_stid *stid);

/* Locking: */

@@ -463,8 +464,8 @@ static void nfs4_file_put_access(struct nfs4_file *fp, u32 access)
__nfs4_file_put_access(fp, O_RDONLY);
}

-static struct nfs4_stid *nfs4_alloc_stid(struct nfs4_client *cl, struct
-kmem_cache *slab)
+static struct nfs4_stid *nfs4_alloc_stid(struct nfs4_client *cl,
+ struct kmem_cache *slab)
{
struct idr *stateids = &cl->cl_stateids;
struct nfs4_stid *stid;
@@ -500,7 +501,26 @@ out_free:

static struct nfs4_ol_stateid * nfs4_alloc_stateid(struct nfs4_client *clp)
{
- return openlockstateid(nfs4_alloc_stid(clp, stateid_slab));
+ struct nfs4_stid *stid;
+ struct nfs4_ol_stateid *stp;
+
+ stid = nfs4_alloc_stid(clp, stateid_slab);
+ if (!stid)
+ return NULL;
+
+ stp = openlockstateid(stid);
+ stp->st_stid.sc_free = nfs4_free_ol_stateid;
+ return stp;
+}
+
+static void nfs4_free_deleg(struct nfs4_stid *stid)
+{
+ struct nfs4_delegation *dp = delegstateid(stid);
+
+ if (dp->dl_file)
+ put_nfs4_file(dp->dl_file);
+ kmem_cache_free(deleg_slab, stid);
+ atomic_long_dec(&num_delegations);
}

/*
@@ -594,6 +614,8 @@ alloc_init_deleg(struct nfs4_client *clp, struct svc_fh *current_fh)
dp = delegstateid(nfs4_alloc_stid(clp, deleg_slab));
if (dp == NULL)
goto out_dec;
+
+ dp->dl_stid.sc_free = nfs4_free_deleg;
/*
* delegation seqid's are never incremented. The 4.1 special
* meaning of seqid 0 isn't meaningful, really, but let's avoid
@@ -611,28 +633,15 @@ out_dec:
return NULL;
}

-static void remove_stid(struct nfs4_stid *s)
-{
- struct idr *stateids = &s->sc_client->cl_stateids;
-
- idr_remove(stateids, s->sc_stateid.si_opaque.so_id);
-}
-
-static void nfs4_free_stid(struct kmem_cache *slab, struct nfs4_stid *s)
-{
- kmem_cache_free(slab, s);
-}
-
void
-nfs4_put_delegation(struct nfs4_delegation *dp)
+nfs4_put_stid(struct nfs4_stid *s)
{
- if (atomic_dec_and_test(&dp->dl_stid.sc_count)) {
- if (dp->dl_file)
- put_nfs4_file(dp->dl_file);
- remove_stid(&dp->dl_stid);
- nfs4_free_stid(deleg_slab, &dp->dl_stid);
- atomic_long_dec(&num_delegations);
- }
+ struct nfs4_client *clp = s->sc_client;
+
+ if (!atomic_dec_and_test(&s->sc_count))
+ return;
+ idr_remove(&clp->cl_stateids, s->sc_stateid.si_opaque.so_id);
+ s->sc_free(s);
}

static void nfs4_put_deleg_lease(struct nfs4_file *fp)
@@ -689,7 +698,7 @@ static void destroy_delegation(struct nfs4_delegation *dp)
spin_lock(&state_lock);
unhash_delegation_locked(dp);
spin_unlock(&state_lock);
- nfs4_put_delegation(dp);
+ nfs4_put_stid(&dp->dl_stid);
}

static void revoke_delegation(struct nfs4_delegation *dp)
@@ -699,7 +708,7 @@ static void revoke_delegation(struct nfs4_delegation *dp)
WARN_ON(!list_empty(&dp->dl_recall_lru));

if (clp->cl_minorversion == 0)
- nfs4_put_delegation(dp);
+ nfs4_put_stid(&dp->dl_stid);
else {
dp->dl_stid.sc_type = NFS4_REVOKED_DELEG_STID;
spin_lock(&clp->cl_lock);
@@ -885,19 +894,14 @@ static void unhash_generic_stateid(struct nfs4_ol_stateid *stp)
list_del(&stp->st_perstateowner);
}

-static void close_generic_stateid(struct nfs4_ol_stateid *stp)
+static void nfs4_free_ol_stateid(struct nfs4_stid *stid)
{
- release_all_access(stp);
-}
+ struct nfs4_ol_stateid *stp = openlockstateid(stid);

-static void put_ol_stateid(struct nfs4_ol_stateid *stp)
-{
- if (!atomic_dec_and_test(&stp->st_stid.sc_count))
- return;
+ release_all_access(stp);
if (stp->st_file)
put_nfs4_file(stp->st_file);
- remove_stid(&stp->st_stid);
- nfs4_free_stid(stateid_slab, &stp->st_stid);
+ kmem_cache_free(stateid_slab, stid);
}

static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
@@ -910,8 +914,7 @@ static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
file = find_any_file(stp->st_file);
if (file)
filp_close(file, (fl_owner_t)lockowner(stp->st_stateowner));
- close_generic_stateid(stp);
- put_ol_stateid(stp);
+ nfs4_put_stid(&stp->st_stid);
}

static void unhash_lockowner(struct nfs4_lockowner *lo)
@@ -968,13 +971,12 @@ static void unhash_open_stateid(struct nfs4_ol_stateid *stp)
{
unhash_generic_stateid(stp);
release_open_stateid_locks(stp);
- close_generic_stateid(stp);
}

static void release_open_stateid(struct nfs4_ol_stateid *stp)
{
unhash_open_stateid(stp);
- put_ol_stateid(stp);
+ nfs4_put_stid(&stp->st_stid);
}

static void unhash_openowner(struct nfs4_openowner *oo)
@@ -995,7 +997,7 @@ static void release_last_closed_stateid(struct nfs4_openowner *oo)
struct nfs4_ol_stateid *s = oo->oo_last_closed_stid;

if (s) {
- put_ol_stateid(s);
+ nfs4_put_stid(&s->st_stid);
oo->oo_last_closed_stid = NULL;
}
}
@@ -1469,12 +1471,12 @@ destroy_client(struct nfs4_client *clp)
while (!list_empty(&reaplist)) {
dp = list_entry(reaplist.next, struct nfs4_delegation, dl_recall_lru);
list_del_init(&dp->dl_recall_lru);
- nfs4_put_delegation(dp);
+ nfs4_put_stid(&dp->dl_stid);
}
while (!list_empty(&clp->cl_revoked)) {
dp = list_entry(reaplist.next, struct nfs4_delegation, dl_recall_lru);
list_del_init(&dp->dl_recall_lru);
- nfs4_put_delegation(dp);
+ nfs4_put_stid(&dp->dl_stid);
}
while (!list_empty(&clp->cl_openowners)) {
oo = list_entry(clp->cl_openowners.next, struct nfs4_openowner, oo_perclient);
@@ -3590,7 +3592,7 @@ out_unlock:
spin_unlock(&state_lock);
out:
if (status) {
- nfs4_put_delegation(dp);
+ nfs4_put_stid(&dp->dl_stid);
return ERR_PTR(status);
}
return dp;
@@ -3820,7 +3822,7 @@ void nfsd4_cleanup_open_state(struct nfsd4_open *open, __be32 status)
if (open->op_file)
nfsd4_free_file(open->op_file);
if (open->op_stp)
- put_ol_stateid(open->op_stp);
+ nfs4_put_stid(&open->op_stp->st_stid);
}

__be32
@@ -4268,7 +4270,7 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
spin_lock(&cl->cl_lock);
list_del_init(&dp->dl_recall_lru);
spin_unlock(&cl->cl_lock);
- nfs4_put_delegation(dp);
+ nfs4_put_stid(s);
ret = nfs_ok;
break;
default:
@@ -4481,17 +4483,20 @@ static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
if (clp->cl_minorversion) {
if (list_empty(&oo->oo_owner.so_stateids))
release_openowner(oo);
- put_ol_stateid(s);
+ nfs4_put_stid(&s->st_stid);
} else {
+ /*
+ * In the 4.0 case we need to keep the owners around a
+ * little while to handle CLOSE replay. We still do need
+ * to release any file access that is held by them
+ * before returning however.
+ */
+ release_all_access(s);
if (s->st_file) {
put_nfs4_file(s->st_file);
s->st_file = NULL;
}
oo->oo_last_closed_stid = s;
- /*
- * In the 4.0 case we need to keep the owners around a
- * little while to handle CLOSE replay.
- */
if (list_empty(&oo->oo_owner.so_stateids))
move_to_close_lru(oo, clp->net);
}
@@ -5667,7 +5672,7 @@ nfs4_state_shutdown_net(struct net *net)
list_for_each_safe(pos, next, &reaplist) {
dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
list_del_init(&dp->dl_recall_lru);
- nfs4_put_delegation(dp);
+ nfs4_put_stid(&dp->dl_stid);
}

nfsd4_client_tracking_exit(net);
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 39747736e83b..32c466265ac1 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -85,6 +85,7 @@ struct nfs4_stid {
unsigned char sc_type;
stateid_t sc_stateid;
struct nfs4_client *sc_client;
+ void (*sc_free)(struct nfs4_stid *);
};

struct nfs4_delegation {
@@ -429,6 +430,7 @@ extern __be32 nfs4_preprocess_stateid_op(struct net *net,
stateid_t *stateid, int flags, struct file **filp);
extern void nfs4_lock_state(void);
extern void nfs4_unlock_state(void);
+void nfs4_put_stid(struct nfs4_stid *s);
void nfs4_remove_reclaim_record(struct nfs4_client_reclaim *, struct nfsd_net *);
extern void nfs4_release_reclaim(struct nfsd_net *);
extern struct nfs4_client_reclaim *nfsd4_find_reclaim_client(const char *recdir,
@@ -446,7 +448,6 @@ extern int nfsd4_create_callback_queue(void);
extern void nfsd4_destroy_callback_queue(void);
extern void nfsd4_shutdown_callback(struct nfs4_client *);
extern void nfsd4_prepare_cb_recall(struct nfs4_delegation *dp);
-extern void nfs4_put_delegation(struct nfs4_delegation *dp);
extern struct nfs4_client_reclaim *nfs4_client_to_reclaim(const char *name,
struct nfsd_net *nn);
extern bool nfs4_has_reclaimed_state(const char *name, struct nfsd_net *nn);
--
1.9.3


2014-07-30 01:35:05

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 10/38] nfsd: Add reference counting to lock stateids

From: Trond Myklebust <[email protected]>

Ensure that nfsd4_lock() references the lock stateid while it is
manipulating it. Not currently necessary, but will be once the
client_mutex is removed.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/nfs4state.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index f4c7bf96c9fd..1ddf4da0023d 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4729,6 +4729,7 @@ init_lock_stateid(struct nfs4_ol_stateid *stp, struct nfs4_lockowner *lo,

lockdep_assert_held(&clp->cl_lock);

+ atomic_inc(&stp->st_stid.sc_count);
stp->st_stid.sc_type = NFS4_LOCK_STID;
stp->st_stateowner = &lo->lo_owner;
get_nfs4_file(fp);
@@ -4753,8 +4754,10 @@ find_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp)
lockdep_assert_held(&clp->cl_lock);

list_for_each_entry(lst, &lo->lo_owner.so_stateids, st_perstateowner) {
- if (lst->st_stid.sc_file == fp)
+ if (lst->st_stid.sc_file == fp) {
+ atomic_inc(&lst->st_stid.sc_count);
return lst;
+ }
}
return NULL;
}
@@ -4856,7 +4859,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
{
struct nfs4_openowner *open_sop = NULL;
struct nfs4_lockowner *lock_sop = NULL;
- struct nfs4_ol_stateid *lock_stp;
+ struct nfs4_ol_stateid *lock_stp = NULL;
struct nfs4_file *fp;
struct file *filp = NULL;
struct file_lock *file_lock = NULL;
@@ -4910,11 +4913,15 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
goto out;
status = lookup_or_create_lock_state(cstate, open_stp, lock,
&lock_stp, &new_state);
- } else
+ } else {
status = nfs4_preprocess_seqid_op(cstate,
lock->lk_old_lock_seqid,
&lock->lk_old_lock_stateid,
NFS4_LOCK_STID, &lock_stp, nn);
+ /* FIXME: move into nfs4_preprocess_seqid_op */
+ if (!status)
+ atomic_inc(&lock_stp->st_stid.sc_count);
+ }
if (status)
goto out;
lock_sop = lockowner(lock_stp->st_stateowner);
@@ -5007,6 +5014,8 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
out:
if (filp)
fput(filp);
+ if (lock_stp)
+ nfs4_put_stid(&lock_stp->st_stid);
if (status && new_state)
release_lock_stateid(lock_stp);
nfsd4_bump_seqid(cstate, status);
--
1.9.3


2014-07-30 01:35:36

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 30/38] nfsd: Protect adding/removing lock owners using client_lock

From: Trond Myklebust <[email protected]>

Once we remove client mutex protection, we'll need to ensure that
stateowner lookup and creation are atomic between concurrent compounds.
Ensure that alloc_init_lock_stateowner checks the hashtable under the
client_lock before adding a new element.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/nfs4state.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 61 insertions(+), 8 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index c4bb7f2b29d9..7c15918d20f0 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1000,26 +1000,42 @@ static void release_lock_stateid(struct nfs4_ol_stateid *stp)
nfs4_put_stid(&stp->st_stid);
}

-static void unhash_lockowner(struct nfs4_lockowner *lo)
+static void unhash_lockowner_locked(struct nfs4_lockowner *lo)
{
+ struct nfsd_net *nn = net_generic(lo->lo_owner.so_client->net,
+ nfsd_net_id);
+
+ lockdep_assert_held(&nn->client_lock);
+
list_del_init(&lo->lo_owner.so_strhash);
}

static void release_lockowner_stateids(struct nfs4_lockowner *lo)
{
+ struct nfsd_net *nn = net_generic(lo->lo_owner.so_client->net,
+ nfsd_net_id);
struct nfs4_ol_stateid *stp;

+ lockdep_assert_held(&nn->client_lock);
+
while (!list_empty(&lo->lo_owner.so_stateids)) {
stp = list_first_entry(&lo->lo_owner.so_stateids,
struct nfs4_ol_stateid, st_perstateowner);
+ spin_unlock(&nn->client_lock);
release_lock_stateid(stp);
+ spin_lock(&nn->client_lock);
}
}

static void release_lockowner(struct nfs4_lockowner *lo)
{
- unhash_lockowner(lo);
+ struct nfsd_net *nn = net_generic(lo->lo_owner.so_client->net,
+ nfsd_net_id);
+
+ spin_lock(&nn->client_lock);
+ unhash_lockowner_locked(lo);
release_lockowner_stateids(lo);
+ spin_unlock(&nn->client_lock);
nfs4_put_stateowner(&lo->lo_owner);
}

@@ -4801,7 +4817,7 @@ nevermind:
}

static struct nfs4_lockowner *
-find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
+find_lockowner_str_locked(clientid_t *clid, struct xdr_netobj *owner,
struct nfsd_net *nn)
{
unsigned int strhashval = ownerstr_hashval(clid->cl_id, owner);
@@ -4818,9 +4834,25 @@ find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
return NULL;
}

+static struct nfs4_lockowner *
+find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
+ struct nfsd_net *nn)
+{
+ struct nfs4_lockowner *lo;
+
+ spin_lock(&nn->client_lock);
+ lo = find_lockowner_str_locked(clid, owner, nn);
+ spin_unlock(&nn->client_lock);
+ return lo;
+}
+
static void nfs4_unhash_lockowner(struct nfs4_stateowner *sop)
{
- unhash_lockowner(lockowner(sop));
+ struct nfsd_net *nn = net_generic(sop->so_client->net, nfsd_net_id);
+
+ spin_lock(&nn->client_lock);
+ unhash_lockowner_locked(lockowner(sop));
+ spin_unlock(&nn->client_lock);
}

static void nfs4_free_lockowner(struct nfs4_stateowner *sop)
@@ -4843,9 +4875,12 @@ static const struct nfs4_stateowner_operations lockowner_ops = {
* strhashval = ownerstr_hashval
*/
static struct nfs4_lockowner *
-alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp, struct nfs4_ol_stateid *open_stp, struct nfsd4_lock *lock) {
- struct nfs4_lockowner *lo;
+alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp,
+ struct nfs4_ol_stateid *open_stp,
+ struct nfsd4_lock *lock)
+{
struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+ struct nfs4_lockowner *lo, *ret;

lo = alloc_stateowner(lockowner_slab, &lock->lk_new_owner, clp);
if (!lo)
@@ -4854,7 +4889,16 @@ alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp, str
lo->lo_owner.so_is_open_owner = 0;
lo->lo_owner.so_seqid = lock->lk_new_lock_seqid;
lo->lo_owner.so_ops = &lockowner_ops;
- list_add(&lo->lo_owner.so_strhash, &nn->ownerstr_hashtbl[strhashval]);
+ spin_lock(&nn->client_lock);
+ ret = find_lockowner_str_locked(&clp->cl_clientid,
+ &lock->lk_new_owner, nn);
+ if (ret == NULL) {
+ list_add(&lo->lo_owner.so_strhash,
+ &nn->ownerstr_hashtbl[strhashval]);
+ ret = lo;
+ } else
+ nfs4_free_lockowner(&lo->lo_owner);
+ spin_unlock(&nn->client_lock);
return lo;
}

@@ -5395,6 +5439,7 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
unsigned int hashval = ownerstr_hashval(clid->cl_id, owner);
__be32 status;
struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
+ struct nfs4_client *clp;

dprintk("nfsd4_release_lockowner clientid: (%08x/%08x):\n",
clid->cl_boot, clid->cl_id);
@@ -5408,6 +5453,7 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
status = nfserr_locks_held;

/* Find the matching lock stateowner */
+ spin_lock(&nn->client_lock);
list_for_each_entry(tmp, &nn->ownerstr_hashtbl[hashval], so_strhash) {
if (tmp->so_is_open_owner)
continue;
@@ -5417,6 +5463,7 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
break;
}
}
+ spin_unlock(&nn->client_lock);

/* No matching owner found, maybe a replay? Just declare victory... */
if (!sop) {
@@ -5426,16 +5473,22 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,

lo = lockowner(sop);
/* see if there are still any locks associated with it */
+ clp = cstate->clp;
+ spin_lock(&clp->cl_lock);
list_for_each_entry(stp, &sop->so_stateids, st_perstateowner) {
if (check_for_locks(stp->st_stid.sc_file, lo)) {
- nfs4_put_stateowner(sop);
+ spin_unlock(&clp->cl_lock);
goto out;
}
}
+ spin_unlock(&clp->cl_lock);

status = nfs_ok;
+ sop = NULL;
release_lockowner(lo);
out:
+ if (sop)
+ nfs4_put_stateowner(sop);
nfs4_unlock_state();
return status;
}
--
1.9.3


2014-07-30 20:57:59

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 02/38] nfsd: Cleanup the freeing of stateids

On Tue, Jul 29, 2014 at 09:34:07PM -0400, Jeff Layton wrote:
> Add a ->free() callback to the struct nfs4_stid, so that we can
> release a reference to the stid without caring about the contents.

Looks good, but I'd suggest merging it into the previous patch.


Signed-off-by: Christoph Hellwig <[email protected]>

2014-07-30 20:59:03

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 04/38] nfsd: Replace nfs4_ol_stateid->st_file with the st_stid.sc_file

FYI, I'd merge this into the previous patch.


2014-07-30 01:35:33

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 28/38] nfsd: don't allow CLOSE to proceed until refcount on stateid drops

Once we remove client_mutex protection, it'll be possible to have an
in-flight operation using an openstateid when a CLOSE call comes in.
If that happens, we can't just put the sc_file reference and clear its
pointer without risking an oops.

Fix this by ensuring that v4.0 CLOSE operations wait for the refcount
to drop before proceeding to do so.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index b61319401826..6d26d26751f5 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -85,6 +85,12 @@ static DEFINE_MUTEX(client_mutex);
*/
static DEFINE_SPINLOCK(state_lock);

+/*
+ * A waitqueue for all in-progress 4.0 CLOSE operations that are waiting for
+ * the refcount on the open stateid to drop.
+ */
+static DECLARE_WAIT_QUEUE_HEAD(close_wq);
+
static struct kmem_cache *openowner_slab;
static struct kmem_cache *lockowner_slab;
static struct kmem_cache *file_slab;
@@ -640,8 +646,10 @@ nfs4_put_stid(struct nfs4_stid *s)

might_lock(&clp->cl_lock);

- if (!atomic_dec_and_lock(&s->sc_count, &clp->cl_lock))
+ if (!atomic_dec_and_lock(&s->sc_count, &clp->cl_lock)) {
+ wake_up_all(&close_wq);
return;
+ }
idr_remove(&clp->cl_stateids, s->sc_stateid.si_opaque.so_id);
spin_unlock(&clp->cl_lock);
s->sc_free(s);
@@ -3070,6 +3078,17 @@ move_to_close_lru(struct nfs4_ol_stateid *s, struct net *net)

dprintk("NFSD: move_to_close_lru nfs4_openowner %p\n", oo);

+ /*
+ * We know that we hold one reference via nfsd4_close, and another
+ * "persistent" reference for the client. If the refcount is higher
+ * than 2, then there are still calls in progress that are using this
+ * stateid. We can't put the sc_file reference until they are finished.
+ * Wait for the refcount to drop to 2. Since it has been unhashed,
+ * there should be no danger of the refcount going back up again at
+ * this point.
+ */
+ wait_event(close_wq, atomic_read(&s->st_stid.sc_count) == 2);
+
release_all_access(s);
if (s->st_stid.sc_file) {
put_nfs4_file(s->st_stid.sc_file);
--
1.9.3


2014-07-30 01:35:25

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 23/38] nfsd: clean up lockowner refcounting when finding them

Ensure that when finding or creating a lockowner, that we get a
reference to it. For now, we also take an extra reference when a
lockowner is created that can be put when release_lockowner is called,
but we'll remove that in a later patch once we change how references are
held.

Since we no longer destroy lockowners in the event of an error in
nfsd4_lock, we must change how the seqid gets bumped in the lk_is_new
case. Instead of doing so on creation, do it manually in nfsd4_lock.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 45 +++++++++++++++++++++++++++++++++------------
1 file changed, 33 insertions(+), 12 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 749608b914b4..eaa5f9ebf444 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4754,6 +4754,7 @@ find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
continue;
if (!same_owner_str(so, owner, clid))
continue;
+ atomic_inc(&so->so_count);
return lockowner(so);
}
return NULL;
@@ -4787,9 +4788,7 @@ alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp, str
return NULL;
INIT_LIST_HEAD(&lo->lo_owner.so_stateids);
lo->lo_owner.so_is_open_owner = 0;
- /* It is the openowner seqid that will be incremented in encode in the
- * case of new lockowners; so increment the lock seqid manually: */
- lo->lo_owner.so_seqid = lock->lk_new_lock_seqid + 1;
+ lo->lo_owner.so_seqid = lock->lk_new_lock_seqid;
lo->lo_owner.so_ops = &lockowner_ops;
list_add(&lo->lo_owner.so_strhash, &nn->ownerstr_hashtbl[strhashval]);
return lo;
@@ -4895,6 +4894,7 @@ lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
struct nfsd4_lock *lock,
struct nfs4_ol_stateid **lst, bool *new)
{
+ __be32 status;
struct nfs4_file *fi = ost->st_stid.sc_file;
struct nfs4_openowner *oo = openowner(ost->st_stateowner);
struct nfs4_client *cl = oo->oo_owner.so_client;
@@ -4910,19 +4910,26 @@ lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
lo = alloc_init_lock_stateowner(strhashval, cl, ost, lock);
if (lo == NULL)
return nfserr_jukebox;
+ /* FIXME: extra reference for new lockowners for the client */
+ atomic_inc(&lo->lo_owner.so_count);
} else {
/* with an existing lockowner, seqids must be the same */
+ status = nfserr_bad_seqid;
if (!cstate->minorversion &&
lock->lk_new_lock_seqid != lo->lo_owner.so_seqid)
- return nfserr_bad_seqid;
+ goto out;
}

*lst = find_or_create_lock_stateid(lo, fi, inode, ost, new);
if (*lst == NULL) {
release_lockowner_if_empty(lo);
- return nfserr_jukebox;
+ status = nfserr_jukebox;
+ goto out;
}
- return nfs_ok;
+ status = nfs_ok;
+out:
+ nfs4_put_stateowner(&lo->lo_owner);
+ return status;
}

/*
@@ -4941,9 +4948,9 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
struct file_lock *file_lock = NULL;
struct file_lock *conflock = NULL;
__be32 status = 0;
- bool new_state = false;
int lkflg;
int err;
+ bool new = false;
struct net *net = SVC_NET(rqstp);
struct nfsd_net *nn = net_generic(net, nfsd_net_id);

@@ -4986,7 +4993,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
&lock->v.new.clientid))
goto out;
status = lookup_or_create_lock_state(cstate, open_stp, lock,
- &lock_stp, &new_state);
+ &lock_stp, &new);
} else {
status = nfs4_preprocess_seqid_op(cstate,
lock->lk_old_lock_seqid,
@@ -5085,12 +5092,24 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
out:
if (filp)
fput(filp);
- if (lock_stp)
+ if (lock_stp) {
+ /* Bump seqid manually if the 4.0 replay owner is openowner */
+ if (cstate->replay_owner &&
+ cstate->replay_owner != &lock_sop->lo_owner &&
+ seqid_mutating_err(ntohl(status)))
+ lock_sop->lo_owner.so_seqid++;
+
+ /*
+ * If this is a new, never-before-used stateid, and we are
+ * returning an error, then just go ahead and release it.
+ */
+ if (status && new)
+ release_lock_stateid(lock_stp);
+
nfs4_put_stid(&lock_stp->st_stid);
+ }
if (open_stp)
nfs4_put_stid(&open_stp->st_stid);
- if (status && new_state)
- release_lock_stateid(lock_stp);
nfsd4_bump_seqid(cstate, status);
nfs4_unlock_state();
if (file_lock)
@@ -5125,7 +5144,7 @@ nfsd4_lockt(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
struct nfsd4_lockt *lockt)
{
struct file_lock *file_lock = NULL;
- struct nfs4_lockowner *lo;
+ struct nfs4_lockowner *lo = NULL;
__be32 status;
struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);

@@ -5188,6 +5207,8 @@ nfsd4_lockt(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
nfs4_set_lock_denied(file_lock, &lockt->lt_denied);
}
out:
+ if (lo)
+ nfs4_put_stateowner(&lo->lo_owner);
nfs4_unlock_state();
if (file_lock)
locks_free_lock(file_lock);
--
1.9.3


2014-07-30 01:35:15

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 16/38] nfsd: nfsd4_open_confirm() must reference the open stateid

From: Trond Myklebust <[email protected]>

Ensure that nfsd4_open_confirm() keeps a reference to the open
stateid until it is done working with it.

Necessary step toward client_mutex removal.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/nfs4state.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 4e50f14f5bc1..8e18ca49555f 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4434,10 +4434,12 @@ nfsd4_open_confirm(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
NFS4_OPEN_STID, &stp, nn);
if (status)
goto out;
+ /* FIXME: move into nfs4_preprocess_seqid_op */
+ atomic_inc(&stp->st_stid.sc_count);
oo = openowner(stp->st_stateowner);
status = nfserr_bad_stateid;
if (oo->oo_flags & NFS4_OO_CONFIRMED)
- goto out;
+ goto put_stateid;
oo->oo_flags |= NFS4_OO_CONFIRMED;
update_stateid(&stp->st_stid.sc_stateid);
memcpy(&oc->oc_resp_stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));
@@ -4446,6 +4448,8 @@ nfsd4_open_confirm(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,

nfsd4_client_record_create(oo->oo_owner.so_client);
status = nfs_ok;
+put_stateid:
+ nfs4_put_stid(&stp->st_stid);
out:
nfsd4_bump_seqid(cstate, status);
if (!cstate->replay_owner)
--
1.9.3


2014-07-30 01:34:51

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 01/38] nfsd: Add reference counting to the lock and open stateids

From: Trond Myklebust <[email protected]>

When we remove the client_mutex, we'll need to be able to ensure that
these objects aren't destroyed while we're not holding locks.

Signed-off-by: Trond Myklebust <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
fs/nfsd/nfs4state.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index b0f83beeca75..3b0836db8553 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -890,8 +890,10 @@ static void close_generic_stateid(struct nfs4_ol_stateid *stp)
release_all_access(stp);
}

-static void free_generic_stateid(struct nfs4_ol_stateid *stp)
+static void put_ol_stateid(struct nfs4_ol_stateid *stp)
{
+ if (!atomic_dec_and_test(&stp->st_stid.sc_count))
+ return;
if (stp->st_file)
put_nfs4_file(stp->st_file);
remove_stid(&stp->st_stid);
@@ -909,7 +911,7 @@ static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
if (file)
filp_close(file, (fl_owner_t)lockowner(stp->st_stateowner));
close_generic_stateid(stp);
- free_generic_stateid(stp);
+ put_ol_stateid(stp);
}

static void unhash_lockowner(struct nfs4_lockowner *lo)
@@ -972,7 +974,7 @@ static void unhash_open_stateid(struct nfs4_ol_stateid *stp)
static void release_open_stateid(struct nfs4_ol_stateid *stp)
{
unhash_open_stateid(stp);
- free_generic_stateid(stp);
+ put_ol_stateid(stp);
}

static void unhash_openowner(struct nfs4_openowner *oo)
@@ -993,7 +995,7 @@ static void release_last_closed_stateid(struct nfs4_openowner *oo)
struct nfs4_ol_stateid *s = oo->oo_last_closed_stid;

if (s) {
- free_generic_stateid(s);
+ put_ol_stateid(s);
oo->oo_last_closed_stid = NULL;
}
}
@@ -3818,7 +3820,7 @@ void nfsd4_cleanup_open_state(struct nfsd4_open *open, __be32 status)
if (open->op_file)
nfsd4_free_file(open->op_file);
if (open->op_stp)
- free_generic_stateid(open->op_stp);
+ put_ol_stateid(open->op_stp);
}

__be32
@@ -4477,9 +4479,9 @@ static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
unhash_open_stateid(s);

if (clp->cl_minorversion) {
- free_generic_stateid(s);
if (list_empty(&oo->oo_owner.so_stateids))
release_openowner(oo);
+ put_ol_stateid(s);
} else {
if (s->st_file) {
put_nfs4_file(s->st_file);
--
1.9.3


2014-07-30 01:35:20

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v3 20/38] nfsd: Migrate the stateid reference into nfs4_find_stateid_by_type()

From: Trond Myklebust <[email protected]>

Allow nfs4_find_stateid_by_type to take the stateid reference, while
still holding the &cl->cl_lock. Necessary step toward client_mutex
removal.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/nfs4state.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index a4a49a3b464c..653de6b14665 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1696,8 +1696,12 @@ find_stateid_by_type(struct nfs4_client *cl, stateid_t *t, char typemask)

spin_lock(&cl->cl_lock);
s = find_stateid_locked(cl, t);
- if (s != NULL && !(typemask & s->sc_type))
- s = NULL;
+ if (s != NULL) {
+ if (typemask & s->sc_type)
+ atomic_inc(&s->sc_count);
+ else
+ s = NULL;
+ }
spin_unlock(&cl->cl_lock);
return s;
}
@@ -3326,8 +3330,6 @@ static struct nfs4_delegation *find_deleg_stateid(struct nfs4_client *cl, statei
ret = find_stateid_by_type(cl, s, NFS4_DELEG_STID);
if (!ret)
return NULL;
- /* FIXME: move into find_stateid_by_type */
- atomic_inc(&ret->sc_count);
return delegstateid(ret);
}

@@ -4170,8 +4172,6 @@ nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
*s = find_stateid_by_type(cstate->clp, stateid, typemask);
if (!*s)
return nfserr_bad_stateid;
- /* FIXME: move into find_stateid_by_type */
- atomic_inc(&(*s)->sc_count);
return nfs_ok;
}

--
1.9.3


2014-07-30 21:08:04

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 06/38] nfsd: do filp_close in sc_free callback for lock stateids

On Tue, Jul 29, 2014 at 09:34:11PM -0400, Jeff Layton wrote:
> Releasing locks when we unhash the stateid instead of doing so only when
> the stateid is actually released will be problematic in later patches
> when we need to protect the unhashing with spinlocks. Move it into the
> sc_free operation instead.
>
> Signed-off-by: Jeff Layton <[email protected]>

Looks good,

Reviewed-by: Christoph Hellwig <[email protected]>

2014-08-02 14:21:17

by Kinglong Mee

[permalink] [raw]
Subject: Re: [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client

On 8/2/2014 22:05, Trond Myklebust wrote:
> On Sat, Aug 2, 2014 at 9:43 AM, Kinglong Mee <[email protected]> wrote:
>> On 8/2/2014 21:11, Trond Myklebust wrote:
>>> On Sat, Aug 2, 2014 at 6:39 AM, Kinglong Mee <[email protected]> wrote:
>>>> On 7/30/2014 09:34, Jeff Layton wrote:
>>>>> From: Trond Myklebust <[email protected]>
>>>>>
>>>>> Preparation for removing the client_mutex.
>>>>>
>>>>> Convert the open owner hash table into a per-client table and protect it
>>>>> using the nfs4_client->cl_lock spin lock.
>>>>>
>>>>> Signed-off-by: Trond Myklebust <[email protected]>
>>>>> ---
>>>>> fs/nfsd/netns.h | 1 -
>>>>> fs/nfsd/nfs4state.c | 187 ++++++++++++++++++++++++----------------------------
>>>>> fs/nfsd/state.h | 1 +
>>>>> 3 files changed, 86 insertions(+), 103 deletions(-)
>>>>>
>>>>> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
>>>>> index a71d14413d39..e1f479c162b5 100644
>>>>> --- a/fs/nfsd/netns.h
>>>>> +++ b/fs/nfsd/netns.h
>>>>> @@ -63,7 +63,6 @@ struct nfsd_net {
>>>>> struct rb_root conf_name_tree;
>>>>> struct list_head *unconf_id_hashtbl;
>>>>> struct rb_root unconf_name_tree;
>>>>> - struct list_head *ownerstr_hashtbl;
>>>>
>>>> I send a patch "NFSD: Rervert "knfsd: locks: flag NFSv4-owned locks"" before,
>>>> http://comments.gmane.org/gmane.linux.nfs/64382
>>>>
>>>> nfsd needs the hashtbl to find the lockowner for locking by owner from
>>>> fl->fl_owner stored in struct file_lock, but without nfs_client.
>>>
>>> Why? We're not currently doing that.
>>
>> Although not doing that right now, but there is a bug for getting the right ld_owner
>> in nfs4_set_lock_denied.
>>
>> If denying locks, vfs don't copy fl->fl_lmops to the returned file_lock, so that,
>> fl->fl_lmops always be NULL, nfsd never return the owner who holds the conflock.
>>
>> If we want fix this problem, needs finding the lockowner from struct file_lock.
>
> Do we really care enough about fixing nfs4_set_lock_denied enough to
> do so at the cost of reducing overall scalability of locking state?

I just report this problem, don't think enough about the scalability.

> We will always be faking up the clientid etc for local locks. Are
> there any clients out there that actually inspect the clientid on a
> result of NFS4ERR_DENIED and that will break if we give them a fake
> for non-local locks?

Jeff has point the same problem of a non-nfs4_lockowner.
Maybe we should copy fl_lmops to conflock as before, nfsd can distinguish
the lockowner stored in struct file_lock by checking fl_lmops.

thanks,
Kinglong Mee


2014-08-01 16:48:47

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v3 21/38] nfsd: Add reference counting to state owners

On Tue, Jul 29, 2014 at 09:34:26PM -0400, Jeff Layton wrote:
> struct nfs4_stateowner {
> - struct list_head so_strhash; /* hash by op_name */
> - struct list_head so_stateids;
> - struct nfs4_client * so_client;
> + struct list_head so_strhash;
> + struct list_head so_stateids;
> + struct nfs4_client *so_client;
> + const struct nfs4_stateowner_operations *so_ops;
> /* after increment in ENCODE_SEQID_OP_TAIL, represents the next
> * sequence id expected from the client: */

Uh, that could use updating--applying below.

Anyway, the stateowner reference counting makes sense to me.

--b.

commit 84a46922ea90
Author: J. Bruce Fields <[email protected]>
Date: Thu Jul 31 16:10:08 2014 -0400

nfsd4: fix out of date comment

Signed-off-by: J. Bruce Fields <[email protected]>

diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 73a209dc352b..0b234500f104 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -345,7 +345,7 @@ struct nfs4_stateowner {
struct list_head so_stateids;
struct nfs4_client *so_client;
const struct nfs4_stateowner_operations *so_ops;
- /* after increment in ENCODE_SEQID_OP_TAIL, represents the next
+ /* after increment in nfsd4_bump_seqid, represents the next
* sequence id expected from the client: */
atomic_t so_count;
u32 so_seqid;

2014-08-02 14:06:00

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client

On Sat, Aug 2, 2014 at 9:43 AM, Kinglong Mee <[email protected]> wrote:
> On 8/2/2014 21:11, Trond Myklebust wrote:
>> On Sat, Aug 2, 2014 at 6:39 AM, Kinglong Mee <[email protected]> wrote:
>>> On 7/30/2014 09:34, Jeff Layton wrote:
>>>> From: Trond Myklebust <[email protected]>
>>>>
>>>> Preparation for removing the client_mutex.
>>>>
>>>> Convert the open owner hash table into a per-client table and protect it
>>>> using the nfs4_client->cl_lock spin lock.
>>>>
>>>> Signed-off-by: Trond Myklebust <[email protected]>
>>>> ---
>>>> fs/nfsd/netns.h | 1 -
>>>> fs/nfsd/nfs4state.c | 187 ++++++++++++++++++++++++----------------------------
>>>> fs/nfsd/state.h | 1 +
>>>> 3 files changed, 86 insertions(+), 103 deletions(-)
>>>>
>>>> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
>>>> index a71d14413d39..e1f479c162b5 100644
>>>> --- a/fs/nfsd/netns.h
>>>> +++ b/fs/nfsd/netns.h
>>>> @@ -63,7 +63,6 @@ struct nfsd_net {
>>>> struct rb_root conf_name_tree;
>>>> struct list_head *unconf_id_hashtbl;
>>>> struct rb_root unconf_name_tree;
>>>> - struct list_head *ownerstr_hashtbl;
>>>
>>> I send a patch "NFSD: Rervert "knfsd: locks: flag NFSv4-owned locks"" before,
>>> http://comments.gmane.org/gmane.linux.nfs/64382
>>>
>>> nfsd needs the hashtbl to find the lockowner for locking by owner from
>>> fl->fl_owner stored in struct file_lock, but without nfs_client.
>>
>> Why? We're not currently doing that.
>
> Although not doing that right now, but there is a bug for getting the right ld_owner
> in nfs4_set_lock_denied.
>
> If denying locks, vfs don't copy fl->fl_lmops to the returned file_lock, so that,
> fl->fl_lmops always be NULL, nfsd never return the owner who holds the conflock.
>
> If we want fix this problem, needs finding the lockowner from struct file_lock.

Do we really care enough about fixing nfs4_set_lock_denied enough to
do so at the cost of reducing overall scalability of locking state?
We will always be faking up the clientid etc for local locks. Are
there any clients out there that actually inspect the clientid on a
result of NFS4ERR_DENIED and that will break if we give them a fake
for non-local locks?

--
Trond Myklebust

Linux NFS client maintainer, PrimaryData

[email protected]

2014-08-03 14:15:37

by Kinglong Mee

[permalink] [raw]
Subject: Re: [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client

On 8/3/2014 19:35, Jeffrey Layton wrote:
> On Sat, Aug 2, 2014 at 9:59 PM, Trond Myklebust <[email protected] <mailto:[email protected]>> wrote:
>
> On Sat, Aug 2, 2014 at 6:59 PM, Jeff Layton <[email protected] <mailto:[email protected]>> wrote:
> > On Sat, 2 Aug 2014 10:47:27 -0400
> > Trond Myklebust <[email protected] <mailto:[email protected]>> wrote:
> >
> >> On Sat, Aug 2, 2014 at 10:20 AM, Kinglong Mee <[email protected] <mailto:[email protected]>> wrote:
> >> > On 8/2/2014 22:05, Trond Myklebust wrote:
> >> >> On Sat, Aug 2, 2014 at 9:43 AM, Kinglong Mee <[email protected] <mailto:[email protected]>> wrote:
> >> >>> On 8/2/2014 21:11, Trond Myklebust wrote:
> >> >>>> On Sat, Aug 2, 2014 at 6:39 AM, Kinglong Mee <[email protected] <mailto:[email protected]>> wrote:
> >> >>>>> On 7/30/2014 09:34, Jeff Layton wrote:
> >> >>>>>> From: Trond Myklebust <[email protected] <mailto:[email protected]>>
> >> >>>>>>
> >> >>>>>> Preparation for removing the client_mutex.
> >> >>>>>>
> >> >>>>>> Convert the open owner hash table into a per-client table and protect it
> >> >>>>>> using the nfs4_client->cl_lock spin lock.
> >> >>>>>>
> >> >>>>>> Signed-off-by: Trond Myklebust <[email protected] <mailto:[email protected]>>
> >> >>>>>> ---
> >> >>>>>> fs/nfsd/netns.h | 1 -
> >> >>>>>> fs/nfsd/nfs4state.c | 187 ++++++++++++++++++++++++----------------------------
> >> >>>>>> fs/nfsd/state.h | 1 +
> >> >>>>>> 3 files changed, 86 insertions(+), 103 deletions(-)
> >> >>>>>>
> >> >>>>>> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> >> >>>>>> index a71d14413d39..e1f479c162b5 100644
> >> >>>>>> --- a/fs/nfsd/netns.h
> >> >>>>>> +++ b/fs/nfsd/netns.h
> >> >>>>>> @@ -63,7 +63,6 @@ struct nfsd_net {
> >> >>>>>> struct rb_root conf_name_tree;
> >> >>>>>> struct list_head *unconf_id_hashtbl;
> >> >>>>>> struct rb_root unconf_name_tree;
> >> >>>>>> - struct list_head *ownerstr_hashtbl;
> >> >>>>>
> >> >>>>> I send a patch "NFSD: Rervert "knfsd: locks: flag NFSv4-owned locks"" before,
> >> >>>>> http://comments.gmane.org/gmane.linux.nfs/64382
> >> >>>>>
> >> >>>>> nfsd needs the hashtbl to find the lockowner for locking by owner from
> >> >>>>> fl->fl_owner stored in struct file_lock, but without nfs_client.
> >> >>>>
> >> >>>> Why? We're not currently doing that.
> >> >>>
> >> >>> Although not doing that right now, but there is a bug for getting the right ld_owner
> >> >>> in nfs4_set_lock_denied.
> >> >>>
> >> >>> If denying locks, vfs don't copy fl->fl_lmops to the returned file_lock, so that,
> >> >>> fl->fl_lmops always be NULL, nfsd never return the owner who holds the conflock.
> >> >>>
> >> >>> If we want fix this problem, needs finding the lockowner from struct file_lock.
> >> >>
> >> >> Do we really care enough about fixing nfs4_set_lock_denied enough to
> >> >> do so at the cost of reducing overall scalability of locking state?
> >> >
> >> > I just report this problem, don't think enough about the scalability.
> >> >
> >> >> We will always be faking up the clientid etc for local locks. Are
> >> >> there any clients out there that actually inspect the clientid on a
> >> >> result of NFS4ERR_DENIED and that will break if we give them a fake
> >> >> for non-local locks?
> >> >
> >> > Jeff has point the same problem of a non-nfs4_lockowner.
> >> > Maybe we should copy fl_lmops to conflock as before, nfsd can distinguish
> >> > the lockowner stored in struct file_lock by checking fl_lmops.
> >> >
> >>
> >> Alternatively, set a flag in fl_flags. Back in the days, we used to
> >> have a FL_NFSD, perhaps it is time to resurrect that?
> >>
> >
> > Would we need a similar flag for lockd too?
> >
> > I'm not sure a flag is the correct approach for this. A "fl_lmtype" field
> > or something similar might make sense, but I'm not sure that really
> > adds much over just ensuring that fl_lmops is set properly for these locks.
>
> It avoids adding an extra copy of fl_lmops that would only be useful
> to knfsd as a flag.
>
> > That said, we definitely will need to ensure that there are no harmful
> > effects from setting fl_lmops on a conflock container. I don't see any
> > right offhand, but it's probably reasonable to put something like that
> > in -next for a bit of soak time...
>
> The real problem here is that you are copying the lock, and then
> trying to dereference the copied pointer without ensuring that the
> thing the pointer is referencing still exists. Given that we're
> removing the global state mutex, then it is now perfectly possible for
> a competing knfsd thread to free the conflicting lock and the
> lockowner pointed to by fl->fl_owner before your thread hits
> nfs4_set_lock_denied.
> The check for fl_lmtype does nothing to fix that race (nor does FL_NFSD).
>
> Cheers
> Trond
>
>
> Agreed. That's a significant bit of nastiness, and I think you're correct that the
> client_mutex removal will exacerbate the problem. The only way I can see to
> properly fix that would be to make it so that the conflock holds a reference to the
> lockowner, and then put it when the conflock is freed.
>

I will check the reference to lockowner in nfsd, and try to simulate client-id expire.

> That will require some new fl_ops or fl_lmops. It would also be good to clean up
> conflock generation a bit, and turn it into a more explicit process.
> __locks_copy_lock doesn't really fully describe what's going on there, IMO...

I will have a try.

thanks,
Kinglong Mee

2014-08-05 18:46:55

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client

On Sun, Aug 03, 2014 at 07:35:37AM -0400, Jeffrey Layton wrote:
> Agreed. That's a significant bit of nastiness, and I think you're
> correct that the client_mutex removal will exacerbate the problem. The
> only way I can see to properly fix that would be to make it so that
> the conflock holds a reference to the lockowner, and then put it when
> the conflock is freed.
>
> That will require some new fl_ops or fl_lmops. It would also be good
> to clean up conflock generation a bit, and turn it into a more
> explicit process. __locks_copy_lock doesn't really fully describe
> what's going on there, IMO...

Or do the work of nfs4_set_lock_denied under the i_lock in some sort of
callback from the locking code?

--b.

2014-08-01 16:52:36

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 21/38] nfsd: Add reference counting to state owners

On Fri, 1 Aug 2014 12:48:43 -0400
"J. Bruce Fields" <[email protected]> wrote:

> On Tue, Jul 29, 2014 at 09:34:26PM -0400, Jeff Layton wrote:
> > struct nfs4_stateowner {
> > - struct list_head so_strhash; /* hash by op_name */
> > - struct list_head so_stateids;
> > - struct nfs4_client * so_client;
> > + struct list_head so_strhash;
> > + struct list_head so_stateids;
> > + struct nfs4_client *so_client;
> > + const struct nfs4_stateowner_operations *so_ops;
> > /* after increment in ENCODE_SEQID_OP_TAIL, represents the next
> > * sequence id expected from the client: */
>
> Uh, that could use updating--applying below.
>
> Anyway, the stateowner reference counting makes sense to me.
>
> --b.
>
> commit 84a46922ea90
> Author: J. Bruce Fields <[email protected]>
> Date: Thu Jul 31 16:10:08 2014 -0400
>
> nfsd4: fix out of date comment
>
> Signed-off-by: J. Bruce Fields <[email protected]>
>
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index 73a209dc352b..0b234500f104 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -345,7 +345,7 @@ struct nfs4_stateowner {
> struct list_head so_stateids;
> struct nfs4_client *so_client;
> const struct nfs4_stateowner_operations *so_ops;
> - /* after increment in ENCODE_SEQID_OP_TAIL, represents the next
> + /* after increment in nfsd4_bump_seqid, represents the next
> * sequence id expected from the client: */
> atomic_t so_count;
> u32 so_seqid;

Ahh yeah, I meant to update that comment and never got around to it.
Thanks!

Acked-by: Jeff Layton <[email protected]>

2014-08-02 22:59:36

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client

On Sat, 2 Aug 2014 10:47:27 -0400
Trond Myklebust <[email protected]> wrote:

> On Sat, Aug 2, 2014 at 10:20 AM, Kinglong Mee <[email protected]> wrote:
> > On 8/2/2014 22:05, Trond Myklebust wrote:
> >> On Sat, Aug 2, 2014 at 9:43 AM, Kinglong Mee <[email protected]> wrote:
> >>> On 8/2/2014 21:11, Trond Myklebust wrote:
> >>>> On Sat, Aug 2, 2014 at 6:39 AM, Kinglong Mee <[email protected]> wrote:
> >>>>> On 7/30/2014 09:34, Jeff Layton wrote:
> >>>>>> From: Trond Myklebust <[email protected]>
> >>>>>>
> >>>>>> Preparation for removing the client_mutex.
> >>>>>>
> >>>>>> Convert the open owner hash table into a per-client table and protect it
> >>>>>> using the nfs4_client->cl_lock spin lock.
> >>>>>>
> >>>>>> Signed-off-by: Trond Myklebust <[email protected]>
> >>>>>> ---
> >>>>>> fs/nfsd/netns.h | 1 -
> >>>>>> fs/nfsd/nfs4state.c | 187 ++++++++++++++++++++++++----------------------------
> >>>>>> fs/nfsd/state.h | 1 +
> >>>>>> 3 files changed, 86 insertions(+), 103 deletions(-)
> >>>>>>
> >>>>>> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> >>>>>> index a71d14413d39..e1f479c162b5 100644
> >>>>>> --- a/fs/nfsd/netns.h
> >>>>>> +++ b/fs/nfsd/netns.h
> >>>>>> @@ -63,7 +63,6 @@ struct nfsd_net {
> >>>>>> struct rb_root conf_name_tree;
> >>>>>> struct list_head *unconf_id_hashtbl;
> >>>>>> struct rb_root unconf_name_tree;
> >>>>>> - struct list_head *ownerstr_hashtbl;
> >>>>>
> >>>>> I send a patch "NFSD: Rervert "knfsd: locks: flag NFSv4-owned locks"" before,
> >>>>> http://comments.gmane.org/gmane.linux.nfs/64382
> >>>>>
> >>>>> nfsd needs the hashtbl to find the lockowner for locking by owner from
> >>>>> fl->fl_owner stored in struct file_lock, but without nfs_client.
> >>>>
> >>>> Why? We're not currently doing that.
> >>>
> >>> Although not doing that right now, but there is a bug for getting the right ld_owner
> >>> in nfs4_set_lock_denied.
> >>>
> >>> If denying locks, vfs don't copy fl->fl_lmops to the returned file_lock, so that,
> >>> fl->fl_lmops always be NULL, nfsd never return the owner who holds the conflock.
> >>>
> >>> If we want fix this problem, needs finding the lockowner from struct file_lock.
> >>
> >> Do we really care enough about fixing nfs4_set_lock_denied enough to
> >> do so at the cost of reducing overall scalability of locking state?
> >
> > I just report this problem, don't think enough about the scalability.
> >
> >> We will always be faking up the clientid etc for local locks. Are
> >> there any clients out there that actually inspect the clientid on a
> >> result of NFS4ERR_DENIED and that will break if we give them a fake
> >> for non-local locks?
> >
> > Jeff has point the same problem of a non-nfs4_lockowner.
> > Maybe we should copy fl_lmops to conflock as before, nfsd can distinguish
> > the lockowner stored in struct file_lock by checking fl_lmops.
> >
>
> Alternatively, set a flag in fl_flags. Back in the days, we used to
> have a FL_NFSD, perhaps it is time to resurrect that?
>

Would we need a similar flag for lockd too?

I'm not sure a flag is the correct approach for this. A "fl_lmtype" field
or something similar might make sense, but I'm not sure that really
adds much over just ensuring that fl_lmops is set properly for these locks.

That said, we definitely will need to ensure that there are no harmful
effects from setting fl_lmops on a conflock container. I don't see any
right offhand, but it's probably reasonable to put something like that
in -next for a bit of soak time...

--
Jeff Layton <[email protected]>

2014-08-02 13:11:26

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client

On Sat, Aug 2, 2014 at 6:39 AM, Kinglong Mee <[email protected]> wrote:
> On 7/30/2014 09:34, Jeff Layton wrote:
>> From: Trond Myklebust <[email protected]>
>>
>> Preparation for removing the client_mutex.
>>
>> Convert the open owner hash table into a per-client table and protect it
>> using the nfs4_client->cl_lock spin lock.
>>
>> Signed-off-by: Trond Myklebust <[email protected]>
>> ---
>> fs/nfsd/netns.h | 1 -
>> fs/nfsd/nfs4state.c | 187 ++++++++++++++++++++++++----------------------------
>> fs/nfsd/state.h | 1 +
>> 3 files changed, 86 insertions(+), 103 deletions(-)
>>
>> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
>> index a71d14413d39..e1f479c162b5 100644
>> --- a/fs/nfsd/netns.h
>> +++ b/fs/nfsd/netns.h
>> @@ -63,7 +63,6 @@ struct nfsd_net {
>> struct rb_root conf_name_tree;
>> struct list_head *unconf_id_hashtbl;
>> struct rb_root unconf_name_tree;
>> - struct list_head *ownerstr_hashtbl;
>
> I send a patch "NFSD: Rervert "knfsd: locks: flag NFSv4-owned locks"" before,
> http://comments.gmane.org/gmane.linux.nfs/64382
>
> nfsd needs the hashtbl to find the lockowner for locking by owner from
> fl->fl_owner stored in struct file_lock, but without nfs_client.

Why? We're not currently doing that.

> If moving the hashtbl to nfs_client, it's hard to finding the lockowner for locking.

--
Trond Myklebust

Linux NFS client maintainer, PrimaryData

[email protected]

2014-08-02 14:47:28

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client

On Sat, Aug 2, 2014 at 10:20 AM, Kinglong Mee <[email protected]> wrote:
> On 8/2/2014 22:05, Trond Myklebust wrote:
>> On Sat, Aug 2, 2014 at 9:43 AM, Kinglong Mee <[email protected]> wrote:
>>> On 8/2/2014 21:11, Trond Myklebust wrote:
>>>> On Sat, Aug 2, 2014 at 6:39 AM, Kinglong Mee <[email protected]> wrote:
>>>>> On 7/30/2014 09:34, Jeff Layton wrote:
>>>>>> From: Trond Myklebust <[email protected]>
>>>>>>
>>>>>> Preparation for removing the client_mutex.
>>>>>>
>>>>>> Convert the open owner hash table into a per-client table and protect it
>>>>>> using the nfs4_client->cl_lock spin lock.
>>>>>>
>>>>>> Signed-off-by: Trond Myklebust <[email protected]>
>>>>>> ---
>>>>>> fs/nfsd/netns.h | 1 -
>>>>>> fs/nfsd/nfs4state.c | 187 ++++++++++++++++++++++++----------------------------
>>>>>> fs/nfsd/state.h | 1 +
>>>>>> 3 files changed, 86 insertions(+), 103 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
>>>>>> index a71d14413d39..e1f479c162b5 100644
>>>>>> --- a/fs/nfsd/netns.h
>>>>>> +++ b/fs/nfsd/netns.h
>>>>>> @@ -63,7 +63,6 @@ struct nfsd_net {
>>>>>> struct rb_root conf_name_tree;
>>>>>> struct list_head *unconf_id_hashtbl;
>>>>>> struct rb_root unconf_name_tree;
>>>>>> - struct list_head *ownerstr_hashtbl;
>>>>>
>>>>> I send a patch "NFSD: Rervert "knfsd: locks: flag NFSv4-owned locks"" before,
>>>>> http://comments.gmane.org/gmane.linux.nfs/64382
>>>>>
>>>>> nfsd needs the hashtbl to find the lockowner for locking by owner from
>>>>> fl->fl_owner stored in struct file_lock, but without nfs_client.
>>>>
>>>> Why? We're not currently doing that.
>>>
>>> Although not doing that right now, but there is a bug for getting the right ld_owner
>>> in nfs4_set_lock_denied.
>>>
>>> If denying locks, vfs don't copy fl->fl_lmops to the returned file_lock, so that,
>>> fl->fl_lmops always be NULL, nfsd never return the owner who holds the conflock.
>>>
>>> If we want fix this problem, needs finding the lockowner from struct file_lock.
>>
>> Do we really care enough about fixing nfs4_set_lock_denied enough to
>> do so at the cost of reducing overall scalability of locking state?
>
> I just report this problem, don't think enough about the scalability.
>
>> We will always be faking up the clientid etc for local locks. Are
>> there any clients out there that actually inspect the clientid on a
>> result of NFS4ERR_DENIED and that will break if we give them a fake
>> for non-local locks?
>
> Jeff has point the same problem of a non-nfs4_lockowner.
> Maybe we should copy fl_lmops to conflock as before, nfsd can distinguish
> the lockowner stored in struct file_lock by checking fl_lmops.
>

Alternatively, set a flag in fl_flags. Back in the days, we used to
have a FL_NFSD, perhaps it is time to resurrect that?

Cheers,
Trond



--
Trond Myklebust

Linux NFS client maintainer, PrimaryData

[email protected]

2014-08-01 19:44:18

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v3 30/38] nfsd: Protect adding/removing lock owners using client_lock

On Tue, Jul 29, 2014 at 09:34:35PM -0400, Jeff Layton wrote:
> From: Trond Myklebust <[email protected]>
>
> Once we remove client mutex protection, we'll need to ensure that
> stateowner lookup and creation are atomic between concurrent compounds.
> Ensure that alloc_init_lock_stateowner checks the hashtable under the
> client_lock before adding a new element.

Not worth respinning any patches for this, but "alloc_init_..." was
never my favorite naming scheme, and isn't so accurate at this point.
"find_or_create_..." is a little cumbersome too, but maybe it'd do.

--b.

>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfsd/nfs4state.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++-------
> 1 file changed, 61 insertions(+), 8 deletions(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index c4bb7f2b29d9..7c15918d20f0 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1000,26 +1000,42 @@ static void release_lock_stateid(struct nfs4_ol_stateid *stp)
> nfs4_put_stid(&stp->st_stid);
> }
>
> -static void unhash_lockowner(struct nfs4_lockowner *lo)
> +static void unhash_lockowner_locked(struct nfs4_lockowner *lo)
> {
> + struct nfsd_net *nn = net_generic(lo->lo_owner.so_client->net,
> + nfsd_net_id);
> +
> + lockdep_assert_held(&nn->client_lock);
> +
> list_del_init(&lo->lo_owner.so_strhash);
> }
>
> static void release_lockowner_stateids(struct nfs4_lockowner *lo)
> {
> + struct nfsd_net *nn = net_generic(lo->lo_owner.so_client->net,
> + nfsd_net_id);
> struct nfs4_ol_stateid *stp;
>
> + lockdep_assert_held(&nn->client_lock);
> +
> while (!list_empty(&lo->lo_owner.so_stateids)) {
> stp = list_first_entry(&lo->lo_owner.so_stateids,
> struct nfs4_ol_stateid, st_perstateowner);
> + spin_unlock(&nn->client_lock);
> release_lock_stateid(stp);
> + spin_lock(&nn->client_lock);
> }
> }
>
> static void release_lockowner(struct nfs4_lockowner *lo)
> {
> - unhash_lockowner(lo);
> + struct nfsd_net *nn = net_generic(lo->lo_owner.so_client->net,
> + nfsd_net_id);
> +
> + spin_lock(&nn->client_lock);
> + unhash_lockowner_locked(lo);
> release_lockowner_stateids(lo);
> + spin_unlock(&nn->client_lock);
> nfs4_put_stateowner(&lo->lo_owner);
> }
>
> @@ -4801,7 +4817,7 @@ nevermind:
> }
>
> static struct nfs4_lockowner *
> -find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
> +find_lockowner_str_locked(clientid_t *clid, struct xdr_netobj *owner,
> struct nfsd_net *nn)
> {
> unsigned int strhashval = ownerstr_hashval(clid->cl_id, owner);
> @@ -4818,9 +4834,25 @@ find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
> return NULL;
> }
>
> +static struct nfs4_lockowner *
> +find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
> + struct nfsd_net *nn)
> +{
> + struct nfs4_lockowner *lo;
> +
> + spin_lock(&nn->client_lock);
> + lo = find_lockowner_str_locked(clid, owner, nn);
> + spin_unlock(&nn->client_lock);
> + return lo;
> +}
> +
> static void nfs4_unhash_lockowner(struct nfs4_stateowner *sop)
> {
> - unhash_lockowner(lockowner(sop));
> + struct nfsd_net *nn = net_generic(sop->so_client->net, nfsd_net_id);
> +
> + spin_lock(&nn->client_lock);
> + unhash_lockowner_locked(lockowner(sop));
> + spin_unlock(&nn->client_lock);
> }
>
> static void nfs4_free_lockowner(struct nfs4_stateowner *sop)
> @@ -4843,9 +4875,12 @@ static const struct nfs4_stateowner_operations lockowner_ops = {
> * strhashval = ownerstr_hashval
> */
> static struct nfs4_lockowner *
> -alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp, struct nfs4_ol_stateid *open_stp, struct nfsd4_lock *lock) {
> - struct nfs4_lockowner *lo;
> +alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp,
> + struct nfs4_ol_stateid *open_stp,
> + struct nfsd4_lock *lock)
> +{
> struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
> + struct nfs4_lockowner *lo, *ret;
>
> lo = alloc_stateowner(lockowner_slab, &lock->lk_new_owner, clp);
> if (!lo)
> @@ -4854,7 +4889,16 @@ alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp, str
> lo->lo_owner.so_is_open_owner = 0;
> lo->lo_owner.so_seqid = lock->lk_new_lock_seqid;
> lo->lo_owner.so_ops = &lockowner_ops;
> - list_add(&lo->lo_owner.so_strhash, &nn->ownerstr_hashtbl[strhashval]);
> + spin_lock(&nn->client_lock);
> + ret = find_lockowner_str_locked(&clp->cl_clientid,
> + &lock->lk_new_owner, nn);
> + if (ret == NULL) {
> + list_add(&lo->lo_owner.so_strhash,
> + &nn->ownerstr_hashtbl[strhashval]);
> + ret = lo;
> + } else
> + nfs4_free_lockowner(&lo->lo_owner);
> + spin_unlock(&nn->client_lock);
> return lo;
> }
>
> @@ -5395,6 +5439,7 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
> unsigned int hashval = ownerstr_hashval(clid->cl_id, owner);
> __be32 status;
> struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
> + struct nfs4_client *clp;
>
> dprintk("nfsd4_release_lockowner clientid: (%08x/%08x):\n",
> clid->cl_boot, clid->cl_id);
> @@ -5408,6 +5453,7 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
> status = nfserr_locks_held;
>
> /* Find the matching lock stateowner */
> + spin_lock(&nn->client_lock);
> list_for_each_entry(tmp, &nn->ownerstr_hashtbl[hashval], so_strhash) {
> if (tmp->so_is_open_owner)
> continue;
> @@ -5417,6 +5463,7 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
> break;
> }
> }
> + spin_unlock(&nn->client_lock);
>
> /* No matching owner found, maybe a replay? Just declare victory... */
> if (!sop) {
> @@ -5426,16 +5473,22 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
>
> lo = lockowner(sop);
> /* see if there are still any locks associated with it */
> + clp = cstate->clp;
> + spin_lock(&clp->cl_lock);
> list_for_each_entry(stp, &sop->so_stateids, st_perstateowner) {
> if (check_for_locks(stp->st_stid.sc_file, lo)) {
> - nfs4_put_stateowner(sop);
> + spin_unlock(&clp->cl_lock);
> goto out;
> }
> }
> + spin_unlock(&clp->cl_lock);
>
> status = nfs_ok;
> + sop = NULL;
> release_lockowner(lo);
> out:
> + if (sop)
> + nfs4_put_stateowner(sop);
> nfs4_unlock_state();
> return status;
> }
> --
> 1.9.3
>

2014-08-01 19:29:26

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v3 27/38] nfsd: make openstateids hold references to their openowners

On Tue, Jul 29, 2014 at 09:34:32PM -0400, Jeff Layton wrote:
> Change it so that only openstateids hold persistent references to
> openowners. References can still be held by compounds in progress.
>
> With this, we can get rid of NFS4_OO_NEW.

Yay--that stuff was horrible, my apologies for it. This looks better.

--b.

> It's possible that we
> will create a new openowner in the process of doing the open, but
> something later fails. In the meantime, another task could find
> that openowner and start using it on a successful open. If that
> occurs we don't necessarily want to tear it down, just put the
> reference that the failing compound holds.
>
> Signed-off-by: Jeff Layton <[email protected]>
> ---
> fs/nfsd/nfs4state.c | 73 +++++++++++++++++++++++------------------------------
> fs/nfsd/state.h | 1 -
> 2 files changed, 32 insertions(+), 42 deletions(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index c86fe66254b0..b61319401826 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -916,6 +916,8 @@ static void nfs4_free_ol_stateid(struct nfs4_stid *stid)
> struct nfs4_ol_stateid *stp = openlockstateid(stid);
>
> release_all_access(stp);
> + if (stp->st_stateowner)
> + nfs4_put_stateowner(stp->st_stateowner);
> kmem_cache_free(stateid_slab, stid);
> }
>
> @@ -928,8 +930,6 @@ static void nfs4_free_lock_stateid(struct nfs4_stid *stid)
> file = find_any_file(stp->st_stid.sc_file);
> if (file)
> filp_close(file, (fl_owner_t)lo);
> - if (stp->st_stateowner)
> - nfs4_put_stateowner(stp->st_stateowner);
> nfs4_free_ol_stateid(stid);
> }
>
> @@ -1008,8 +1008,9 @@ static void release_last_closed_stateid(struct nfs4_openowner *oo)
> struct nfs4_ol_stateid *s = oo->oo_last_closed_stid;
>
> if (s) {
> - nfs4_put_stid(&s->st_stid);
> + list_del_init(&oo->oo_close_lru);
> oo->oo_last_closed_stid = NULL;
> + nfs4_put_stid(&s->st_stid);
> }
> }
>
> @@ -1028,7 +1029,6 @@ static void release_openowner(struct nfs4_openowner *oo)
> {
> unhash_openowner(oo);
> release_openowner_stateids(oo);
> - list_del(&oo->oo_close_lru);
> release_last_closed_stateid(oo);
> nfs4_put_stateowner(&oo->oo_owner);
> }
> @@ -1497,6 +1497,7 @@ destroy_client(struct nfs4_client *clp)
> }
> while (!list_empty(&clp->cl_openowners)) {
> oo = list_entry(clp->cl_openowners.next, struct nfs4_openowner, oo_perclient);
> + atomic_inc(&oo->oo_owner.so_count);
> release_openowner(oo);
> }
> nfsd4_shutdown_callback(clp);
> @@ -3024,7 +3025,7 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
> oo->oo_owner.so_ops = &openowner_ops;
> oo->oo_owner.so_is_open_owner = 1;
> oo->oo_owner.so_seqid = open->op_seqid;
> - oo->oo_flags = NFS4_OO_NEW;
> + oo->oo_flags = 0;
> if (nfsd4_has_session(cstate))
> oo->oo_flags |= NFS4_OO_CONFIRMED;
> oo->oo_time = 0;
> @@ -3041,6 +3042,7 @@ static void init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
> stp->st_stid.sc_type = NFS4_OPEN_STID;
> INIT_LIST_HEAD(&stp->st_locks);
> stp->st_stateowner = &oo->oo_owner;
> + atomic_inc(&stp->st_stateowner->so_count);
> get_nfs4_file(fp);
> stp->st_stid.sc_file = fp;
> stp->st_access_bmap = 0;
> @@ -3054,13 +3056,27 @@ static void init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
> spin_unlock(&oo->oo_owner.so_client->cl_lock);
> }
>
> +/*
> + * In the 4.0 case we need to keep the owners around a little while to handle
> + * CLOSE replay. We still do need to release any file access that is held by
> + * them before returning however.
> + */
> static void
> -move_to_close_lru(struct nfs4_openowner *oo, struct net *net)
> +move_to_close_lru(struct nfs4_ol_stateid *s, struct net *net)
> {
> - struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> + struct nfs4_openowner *oo = openowner(s->st_stateowner);
> + struct nfsd_net *nn = net_generic(s->st_stid.sc_client->net,
> + nfsd_net_id);
>
> dprintk("NFSD: move_to_close_lru nfs4_openowner %p\n", oo);
>
> + release_all_access(s);
> + if (s->st_stid.sc_file) {
> + put_nfs4_file(s->st_stid.sc_file);
> + s->st_stid.sc_file = NULL;
> + }
> + release_last_closed_stateid(oo);
> + oo->oo_last_closed_stid = s;
> list_move_tail(&oo->oo_close_lru, &nn->close_lru);
> oo->oo_time = get_seconds();
> }
> @@ -3091,6 +3107,7 @@ find_openstateowner_str(unsigned int hashval, struct nfsd4_open *open,
> if ((bool)clp->cl_minorversion != sessions)
> return NULL;
> renew_client(oo->oo_owner.so_client);
> + atomic_inc(&oo->oo_owner.so_count);
> return oo;
> }
> }
> @@ -3887,19 +3904,10 @@ void nfsd4_cleanup_open_state(struct nfsd4_compound_state *cstate,
> struct nfsd4_open *open, __be32 status)
> {
> if (open->op_openowner) {
> - struct nfs4_openowner *oo = open->op_openowner;
> -
> - if (!list_empty(&oo->oo_owner.so_stateids))
> - list_del_init(&oo->oo_close_lru);
> - if (oo->oo_flags & NFS4_OO_NEW) {
> - if (status) {
> - release_openowner(oo);
> - open->op_openowner = NULL;
> - } else
> - oo->oo_flags &= ~NFS4_OO_NEW;
> - }
> - if (open->op_openowner)
> - nfsd4_cstate_assign_replay(cstate, &oo->oo_owner);
> + struct nfs4_stateowner *so = &open->op_openowner->oo_owner;
> +
> + nfsd4_cstate_assign_replay(cstate, so);
> + nfs4_put_stateowner(so);
> }
> if (open->op_file)
> nfsd4_free_file(open->op_file);
> @@ -4015,7 +4023,7 @@ nfs4_laundromat(struct nfsd_net *nn)
> new_timeo = min(new_timeo, t);
> break;
> }
> - release_openowner(oo);
> + release_last_closed_stateid(oo);
> }
> new_timeo = max_t(time_t, new_timeo, NFSD_LAUNDROMAT_MINTIMEOUT);
> nfs4_unlock_state();
> @@ -4580,31 +4588,14 @@ out:
> static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
> {
> struct nfs4_client *clp = s->st_stid.sc_client;
> - struct nfs4_openowner *oo = openowner(s->st_stateowner);
>
> s->st_stid.sc_type = NFS4_CLOSED_STID;
> unhash_open_stateid(s);
>
> - if (clp->cl_minorversion) {
> - if (list_empty(&oo->oo_owner.so_stateids))
> - release_openowner(oo);
> + if (clp->cl_minorversion)
> nfs4_put_stid(&s->st_stid);
> - } else {
> - /*
> - * In the 4.0 case we need to keep the owners around a
> - * little while to handle CLOSE replay. We still do need
> - * to release any file access that is held by them
> - * before returning however.
> - */
> - release_all_access(s);
> - if (s->st_stid.sc_file) {
> - put_nfs4_file(s->st_stid.sc_file);
> - s->st_stid.sc_file = NULL;
> - }
> - oo->oo_last_closed_stid = s;
> - if (list_empty(&oo->oo_owner.so_stateids))
> - move_to_close_lru(oo, clp->net);
> - }
> + else
> + move_to_close_lru(s, clp->net);
> }
>
> /*
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index 232246039db0..e073c86f389c 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -367,7 +367,6 @@ struct nfs4_openowner {
> struct nfs4_ol_stateid *oo_last_closed_stid;
> time_t oo_time; /* time of placement on so_close_lru */
> #define NFS4_OO_CONFIRMED 1
> -#define NFS4_OO_NEW 4
> unsigned char oo_flags;
> };
>
> --
> 1.9.3
>

2014-08-03 14:49:19

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client

On Sun, 03 Aug 2014 22:15:05 +0800
Kinglong Mee <[email protected]> wrote:

> On 8/3/2014 19:35, Jeffrey Layton wrote:
> > On Sat, Aug 2, 2014 at 9:59 PM, Trond Myklebust <[email protected] <mailto:[email protected]>> wrote:
> >
> > On Sat, Aug 2, 2014 at 6:59 PM, Jeff Layton <[email protected] <mailto:[email protected]>> wrote:
> > > On Sat, 2 Aug 2014 10:47:27 -0400
> > > Trond Myklebust <[email protected] <mailto:[email protected]>> wrote:
> > >
> > >> On Sat, Aug 2, 2014 at 10:20 AM, Kinglong Mee <[email protected] <mailto:[email protected]>> wrote:
> > >> > On 8/2/2014 22:05, Trond Myklebust wrote:
> > >> >> On Sat, Aug 2, 2014 at 9:43 AM, Kinglong Mee <[email protected] <mailto:[email protected]>> wrote:
> > >> >>> On 8/2/2014 21:11, Trond Myklebust wrote:
> > >> >>>> On Sat, Aug 2, 2014 at 6:39 AM, Kinglong Mee <[email protected] <mailto:[email protected]>> wrote:
> > >> >>>>> On 7/30/2014 09:34, Jeff Layton wrote:
> > >> >>>>>> From: Trond Myklebust <[email protected] <mailto:[email protected]>>
> > >> >>>>>>
> > >> >>>>>> Preparation for removing the client_mutex.
> > >> >>>>>>
> > >> >>>>>> Convert the open owner hash table into a per-client table and protect it
> > >> >>>>>> using the nfs4_client->cl_lock spin lock.
> > >> >>>>>>
> > >> >>>>>> Signed-off-by: Trond Myklebust <[email protected] <mailto:[email protected]>>
> > >> >>>>>> ---
> > >> >>>>>> fs/nfsd/netns.h | 1 -
> > >> >>>>>> fs/nfsd/nfs4state.c | 187 ++++++++++++++++++++++++----------------------------
> > >> >>>>>> fs/nfsd/state.h | 1 +
> > >> >>>>>> 3 files changed, 86 insertions(+), 103 deletions(-)
> > >> >>>>>>
> > >> >>>>>> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> > >> >>>>>> index a71d14413d39..e1f479c162b5 100644
> > >> >>>>>> --- a/fs/nfsd/netns.h
> > >> >>>>>> +++ b/fs/nfsd/netns.h
> > >> >>>>>> @@ -63,7 +63,6 @@ struct nfsd_net {
> > >> >>>>>> struct rb_root conf_name_tree;
> > >> >>>>>> struct list_head *unconf_id_hashtbl;
> > >> >>>>>> struct rb_root unconf_name_tree;
> > >> >>>>>> - struct list_head *ownerstr_hashtbl;
> > >> >>>>>
> > >> >>>>> I send a patch "NFSD: Rervert "knfsd: locks: flag NFSv4-owned locks"" before,
> > >> >>>>> http://comments.gmane.org/gmane.linux.nfs/64382
> > >> >>>>>
> > >> >>>>> nfsd needs the hashtbl to find the lockowner for locking by owner from
> > >> >>>>> fl->fl_owner stored in struct file_lock, but without nfs_client.
> > >> >>>>
> > >> >>>> Why? We're not currently doing that.
> > >> >>>
> > >> >>> Although not doing that right now, but there is a bug for getting the right ld_owner
> > >> >>> in nfs4_set_lock_denied.
> > >> >>>
> > >> >>> If denying locks, vfs don't copy fl->fl_lmops to the returned file_lock, so that,
> > >> >>> fl->fl_lmops always be NULL, nfsd never return the owner who holds the conflock.
> > >> >>>
> > >> >>> If we want fix this problem, needs finding the lockowner from struct file_lock.
> > >> >>
> > >> >> Do we really care enough about fixing nfs4_set_lock_denied enough to
> > >> >> do so at the cost of reducing overall scalability of locking state?
> > >> >
> > >> > I just report this problem, don't think enough about the scalability.
> > >> >
> > >> >> We will always be faking up the clientid etc for local locks. Are
> > >> >> there any clients out there that actually inspect the clientid on a
> > >> >> result of NFS4ERR_DENIED and that will break if we give them a fake
> > >> >> for non-local locks?
> > >> >
> > >> > Jeff has point the same problem of a non-nfs4_lockowner.
> > >> > Maybe we should copy fl_lmops to conflock as before, nfsd can distinguish
> > >> > the lockowner stored in struct file_lock by checking fl_lmops.
> > >> >
> > >>
> > >> Alternatively, set a flag in fl_flags. Back in the days, we used to
> > >> have a FL_NFSD, perhaps it is time to resurrect that?
> > >>
> > >
> > > Would we need a similar flag for lockd too?
> > >
> > > I'm not sure a flag is the correct approach for this. A "fl_lmtype" field
> > > or something similar might make sense, but I'm not sure that really
> > > adds much over just ensuring that fl_lmops is set properly for these locks.
> >
> > It avoids adding an extra copy of fl_lmops that would only be useful
> > to knfsd as a flag.
> >
> > > That said, we definitely will need to ensure that there are no harmful
> > > effects from setting fl_lmops on a conflock container. I don't see any
> > > right offhand, but it's probably reasonable to put something like that
> > > in -next for a bit of soak time...
> >
> > The real problem here is that you are copying the lock, and then
> > trying to dereference the copied pointer without ensuring that the
> > thing the pointer is referencing still exists. Given that we're
> > removing the global state mutex, then it is now perfectly possible for
> > a competing knfsd thread to free the conflicting lock and the
> > lockowner pointed to by fl->fl_owner before your thread hits
> > nfs4_set_lock_denied.
> > The check for fl_lmtype does nothing to fix that race (nor does FL_NFSD).
> >
> > Cheers
> > Trond
> >
> >
> > Agreed. That's a significant bit of nastiness, and I think you're correct that the
> > client_mutex removal will exacerbate the problem. The only way I can see to
> > properly fix that would be to make it so that the conflock holds a reference to the
> > lockowner, and then put it when the conflock is freed.
> >
>
> I will check the reference to lockowner in nfsd, and try to simulate client-id expire.
>

You shouldn't need to do anything special there. It should be
sufficient to simply take a reference to the lockowner when you copy
the data to the conflock, and then put that reference when you go to
free it.

> > That will require some new fl_ops or fl_lmops. It would also be good to clean up
> > conflock generation a bit, and turn it into a more explicit process.
> > __locks_copy_lock doesn't really fully describe what's going on there, IMO...
>
> I will have a try.
>
> thanks,
> Kinglong Mee

Thanks. For now, I'm going to drop the patch that you proposed
yesterday since Trond's quite correct that there's more work needed in
this area before that's really safe.

--
Jeff Layton <[email protected]>

2014-08-02 13:43:35

by Kinglong Mee

[permalink] [raw]
Subject: Re: [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client

On 8/2/2014 21:11, Trond Myklebust wrote:
> On Sat, Aug 2, 2014 at 6:39 AM, Kinglong Mee <[email protected]> wrote:
>> On 7/30/2014 09:34, Jeff Layton wrote:
>>> From: Trond Myklebust <[email protected]>
>>>
>>> Preparation for removing the client_mutex.
>>>
>>> Convert the open owner hash table into a per-client table and protect it
>>> using the nfs4_client->cl_lock spin lock.
>>>
>>> Signed-off-by: Trond Myklebust <[email protected]>
>>> ---
>>> fs/nfsd/netns.h | 1 -
>>> fs/nfsd/nfs4state.c | 187 ++++++++++++++++++++++++----------------------------
>>> fs/nfsd/state.h | 1 +
>>> 3 files changed, 86 insertions(+), 103 deletions(-)
>>>
>>> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
>>> index a71d14413d39..e1f479c162b5 100644
>>> --- a/fs/nfsd/netns.h
>>> +++ b/fs/nfsd/netns.h
>>> @@ -63,7 +63,6 @@ struct nfsd_net {
>>> struct rb_root conf_name_tree;
>>> struct list_head *unconf_id_hashtbl;
>>> struct rb_root unconf_name_tree;
>>> - struct list_head *ownerstr_hashtbl;
>>
>> I send a patch "NFSD: Rervert "knfsd: locks: flag NFSv4-owned locks"" before,
>> http://comments.gmane.org/gmane.linux.nfs/64382
>>
>> nfsd needs the hashtbl to find the lockowner for locking by owner from
>> fl->fl_owner stored in struct file_lock, but without nfs_client.
>
> Why? We're not currently doing that.

Although not doing that right now, but there is a bug for getting the right ld_owner
in nfs4_set_lock_denied.

If denying locks, vfs don't copy fl->fl_lmops to the returned file_lock, so that,
fl->fl_lmops always be NULL, nfsd never return the owner who holds the conflock.

If we want fix this problem, needs finding the lockowner from struct file_lock.

thanks,
Kinglong Mee

>
>> If moving the hashtbl to nfs_client, it's hard to finding the lockowner for locking.
>

2014-08-02 10:40:07

by Kinglong Mee

[permalink] [raw]
Subject: Re: [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client

On 7/30/2014 09:34, Jeff Layton wrote:
> From: Trond Myklebust <[email protected]>
>
> Preparation for removing the client_mutex.
>
> Convert the open owner hash table into a per-client table and protect it
> using the nfs4_client->cl_lock spin lock.
>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfsd/netns.h | 1 -
> fs/nfsd/nfs4state.c | 187 ++++++++++++++++++++++++----------------------------
> fs/nfsd/state.h | 1 +
> 3 files changed, 86 insertions(+), 103 deletions(-)
>
> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> index a71d14413d39..e1f479c162b5 100644
> --- a/fs/nfsd/netns.h
> +++ b/fs/nfsd/netns.h
> @@ -63,7 +63,6 @@ struct nfsd_net {
> struct rb_root conf_name_tree;
> struct list_head *unconf_id_hashtbl;
> struct rb_root unconf_name_tree;
> - struct list_head *ownerstr_hashtbl;

I send a patch "NFSD: Rervert "knfsd: locks: flag NFSv4-owned locks"" before,
http://comments.gmane.org/gmane.linux.nfs/64382

nfsd needs the hashtbl to find the lockowner for locking by owner from
fl->fl_owner stored in struct file_lock, but without nfs_client.

If moving the hashtbl to nfs_client, it's hard to finding the lockowner for locking.

thanks,
Kinglong Mee

> struct list_head *sessionid_hashtbl;
> /*
> * client_lru holds client queue ordered by nfs4_client.cl_time
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 7c15918d20f0..4af4e5eff491 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -240,35 +240,27 @@ static void nfsd4_put_session(struct nfsd4_session *ses)
> }
>
> static int
> -same_owner_str(struct nfs4_stateowner *sop, struct xdr_netobj *owner,
> - clientid_t *clid)
> +same_owner_str(struct nfs4_stateowner *sop, struct xdr_netobj *owner)
> {
> return (sop->so_owner.len == owner->len) &&
> - 0 == memcmp(sop->so_owner.data, owner->data, owner->len) &&
> - (sop->so_client->cl_clientid.cl_id == clid->cl_id);
> + 0 == memcmp(sop->so_owner.data, owner->data, owner->len);
> }
>
> static struct nfs4_openowner *
> find_openstateowner_str_locked(unsigned int hashval, struct nfsd4_open *open,
> - bool sessions, struct nfsd_net *nn)
> + struct nfs4_client *clp)
> {
> struct nfs4_stateowner *so;
> - struct nfs4_openowner *oo;
> - struct nfs4_client *clp;
>
> - lockdep_assert_held(&nn->client_lock);
> + lockdep_assert_held(&clp->cl_lock);
>
> - list_for_each_entry(so, &nn->ownerstr_hashtbl[hashval], so_strhash) {
> + list_for_each_entry(so, &clp->cl_ownerstr_hashtbl[hashval],
> + so_strhash) {
> if (!so->so_is_open_owner)
> continue;
> - if (same_owner_str(so, &open->op_owner, &open->op_clientid)) {
> - oo = openowner(so);
> - clp = oo->oo_owner.so_client;
> - if ((bool)clp->cl_minorversion != sessions)
> - break;
> - renew_client_locked(clp);
> + if (same_owner_str(so, &open->op_owner)) {
> atomic_inc(&so->so_count);
> - return oo;
> + return openowner(so);
> }
> }
> return NULL;
> @@ -276,17 +268,16 @@ find_openstateowner_str_locked(unsigned int hashval, struct nfsd4_open *open,
>
> static struct nfs4_openowner *
> find_openstateowner_str(unsigned int hashval, struct nfsd4_open *open,
> - bool sessions, struct nfsd_net *nn)
> + struct nfs4_client *clp)
> {
> struct nfs4_openowner *oo;
>
> - spin_lock(&nn->client_lock);
> - oo = find_openstateowner_str_locked(hashval, open, sessions, nn);
> - spin_unlock(&nn->client_lock);
> + spin_lock(&clp->cl_lock);
> + oo = find_openstateowner_str_locked(hashval, open, clp);
> + spin_unlock(&clp->cl_lock);
> return oo;
> }
>
> -
> static inline u32
> opaque_hashval(const void *ptr, int nbytes)
> {
> @@ -408,12 +399,11 @@ unsigned long max_delegations;
> #define OWNER_HASH_SIZE (1 << OWNER_HASH_BITS)
> #define OWNER_HASH_MASK (OWNER_HASH_SIZE - 1)
>
> -static unsigned int ownerstr_hashval(u32 clientid, struct xdr_netobj *ownername)
> +static unsigned int ownerstr_hashval(struct xdr_netobj *ownername)
> {
> unsigned int ret;
>
> ret = opaque_hashval(ownername->data, ownername->len);
> - ret += clientid;
> return ret & OWNER_HASH_MASK;
> }
>
> @@ -1002,40 +992,37 @@ static void release_lock_stateid(struct nfs4_ol_stateid *stp)
>
> static void unhash_lockowner_locked(struct nfs4_lockowner *lo)
> {
> - struct nfsd_net *nn = net_generic(lo->lo_owner.so_client->net,
> - nfsd_net_id);
> + struct nfs4_client *clp = lo->lo_owner.so_client;
>
> - lockdep_assert_held(&nn->client_lock);
> + lockdep_assert_held(&clp->cl_lock);
>
> list_del_init(&lo->lo_owner.so_strhash);
> }
>
> static void release_lockowner_stateids(struct nfs4_lockowner *lo)
> {
> - struct nfsd_net *nn = net_generic(lo->lo_owner.so_client->net,
> - nfsd_net_id);
> + struct nfs4_client *clp = lo->lo_owner.so_client;
> struct nfs4_ol_stateid *stp;
>
> - lockdep_assert_held(&nn->client_lock);
> + lockdep_assert_held(&clp->cl_lock);
>
> while (!list_empty(&lo->lo_owner.so_stateids)) {
> stp = list_first_entry(&lo->lo_owner.so_stateids,
> struct nfs4_ol_stateid, st_perstateowner);
> - spin_unlock(&nn->client_lock);
> + spin_unlock(&clp->cl_lock);
> release_lock_stateid(stp);
> - spin_lock(&nn->client_lock);
> + spin_lock(&clp->cl_lock);
> }
> }
>
> static void release_lockowner(struct nfs4_lockowner *lo)
> {
> - struct nfsd_net *nn = net_generic(lo->lo_owner.so_client->net,
> - nfsd_net_id);
> + struct nfs4_client *clp = lo->lo_owner.so_client;
>
> - spin_lock(&nn->client_lock);
> + spin_lock(&clp->cl_lock);
> unhash_lockowner_locked(lo);
> release_lockowner_stateids(lo);
> - spin_unlock(&nn->client_lock);
> + spin_unlock(&clp->cl_lock);
> nfs4_put_stateowner(&lo->lo_owner);
> }
>
> @@ -1070,10 +1057,9 @@ static void release_open_stateid(struct nfs4_ol_stateid *stp)
>
> static void unhash_openowner_locked(struct nfs4_openowner *oo)
> {
> - struct nfsd_net *nn = net_generic(oo->oo_owner.so_client->net,
> - nfsd_net_id);
> + struct nfs4_client *clp = oo->oo_owner.so_client;
>
> - lockdep_assert_held(&nn->client_lock);
> + lockdep_assert_held(&clp->cl_lock);
>
> list_del_init(&oo->oo_owner.so_strhash);
> list_del_init(&oo->oo_perclient);
> @@ -1093,29 +1079,27 @@ static void release_last_closed_stateid(struct nfs4_openowner *oo)
> static void release_openowner_stateids(struct nfs4_openowner *oo)
> {
> struct nfs4_ol_stateid *stp;
> - struct nfsd_net *nn = net_generic(oo->oo_owner.so_client->net,
> - nfsd_net_id);
> + struct nfs4_client *clp = oo->oo_owner.so_client;
>
> - lockdep_assert_held(&nn->client_lock);
> + lockdep_assert_held(&clp->cl_lock);
>
> while (!list_empty(&oo->oo_owner.so_stateids)) {
> stp = list_first_entry(&oo->oo_owner.so_stateids,
> struct nfs4_ol_stateid, st_perstateowner);
> - spin_unlock(&nn->client_lock);
> + spin_unlock(&clp->cl_lock);
> release_open_stateid(stp);
> - spin_lock(&nn->client_lock);
> + spin_lock(&clp->cl_lock);
> }
> }
>
> static void release_openowner(struct nfs4_openowner *oo)
> {
> - struct nfsd_net *nn = net_generic(oo->oo_owner.so_client->net,
> - nfsd_net_id);
> + struct nfs4_client *clp = oo->oo_owner.so_client;
>
> - spin_lock(&nn->client_lock);
> + spin_lock(&clp->cl_lock);
> unhash_openowner_locked(oo);
> release_openowner_stateids(oo);
> - spin_unlock(&nn->client_lock);
> + spin_unlock(&clp->cl_lock);
> release_last_closed_stateid(oo);
> nfs4_put_stateowner(&oo->oo_owner);
> }
> @@ -1497,15 +1481,20 @@ STALE_CLIENTID(clientid_t *clid, struct nfsd_net *nn)
> static struct nfs4_client *alloc_client(struct xdr_netobj name)
> {
> struct nfs4_client *clp;
> + int i;
>
> clp = kzalloc(sizeof(struct nfs4_client), GFP_KERNEL);
> if (clp == NULL)
> return NULL;
> clp->cl_name.data = kmemdup(name.data, name.len, GFP_KERNEL);
> - if (clp->cl_name.data == NULL) {
> - kfree(clp);
> - return NULL;
> - }
> + if (clp->cl_name.data == NULL)
> + goto err_no_name;
> + clp->cl_ownerstr_hashtbl = kmalloc(sizeof(struct list_head) *
> + OWNER_HASH_SIZE, GFP_KERNEL);
> + if (!clp->cl_ownerstr_hashtbl)
> + goto err_no_hashtbl;
> + for (i = 0; i < OWNER_HASH_SIZE; i++)
> + INIT_LIST_HEAD(&clp->cl_ownerstr_hashtbl[i]);
> clp->cl_name.len = name.len;
> INIT_LIST_HEAD(&clp->cl_sessions);
> idr_init(&clp->cl_stateids);
> @@ -1520,6 +1509,11 @@ static struct nfs4_client *alloc_client(struct xdr_netobj name)
> spin_lock_init(&clp->cl_lock);
> rpc_init_wait_queue(&clp->cl_cb_waitq, "Backchannel slot table");
> return clp;
> +err_no_hashtbl:
> + kfree(clp->cl_name.data);
> +err_no_name:
> + kfree(clp);
> + return NULL;
> }
>
> static void
> @@ -1538,6 +1532,7 @@ free_client(struct nfs4_client *clp)
> }
> rpc_destroy_wait_queue(&clp->cl_cb_waitq);
> free_svc_cred(&clp->cl_cred);
> + kfree(clp->cl_ownerstr_hashtbl);
> kfree(clp->cl_name.data);
> idr_destroy(&clp->cl_stateids);
> kfree(clp);
> @@ -3074,20 +3069,20 @@ static inline void *alloc_stateowner(struct kmem_cache *slab, struct xdr_netobj
>
> static void hash_openowner(struct nfs4_openowner *oo, struct nfs4_client *clp, unsigned int strhashval)
> {
> - struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
> + lockdep_assert_held(&clp->cl_lock);
>
> - list_add(&oo->oo_owner.so_strhash, &nn->ownerstr_hashtbl[strhashval]);
> + list_add(&oo->oo_owner.so_strhash,
> + &clp->cl_ownerstr_hashtbl[strhashval]);
> list_add(&oo->oo_perclient, &clp->cl_openowners);
> }
>
> static void nfs4_unhash_openowner(struct nfs4_stateowner *so)
> {
> - struct nfs4_openowner *oo = openowner(so);
> - struct nfsd_net *nn = net_generic(so->so_client->net, nfsd_net_id);
> + struct nfs4_client *clp = so->so_client;
>
> - spin_lock(&nn->client_lock);
> - unhash_openowner_locked(oo);
> - spin_unlock(&nn->client_lock);
> + spin_lock(&clp->cl_lock);
> + unhash_openowner_locked(openowner(so));
> + spin_unlock(&clp->cl_lock);
> }
>
> static void nfs4_free_openowner(struct nfs4_stateowner *so)
> @@ -3107,7 +3102,6 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
> struct nfsd4_compound_state *cstate)
> {
> struct nfs4_client *clp = cstate->clp;
> - struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
> struct nfs4_openowner *oo, *ret;
>
> oo = alloc_stateowner(openowner_slab, &open->op_owner, clp);
> @@ -3122,15 +3116,14 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
> oo->oo_time = 0;
> oo->oo_last_closed_stid = NULL;
> INIT_LIST_HEAD(&oo->oo_close_lru);
> - spin_lock(&nn->client_lock);
> - ret = find_openstateowner_str_locked(strhashval,
> - open, clp->cl_minorversion, nn);
> + spin_lock(&clp->cl_lock);
> + ret = find_openstateowner_str_locked(strhashval, open, clp);
> if (ret == NULL) {
> hash_openowner(oo, clp, strhashval);
> ret = oo;
> } else
> nfs4_free_openowner(&oo->oo_owner);
> - spin_unlock(&nn->client_lock);
> + spin_unlock(&clp->cl_lock);
> return oo;
> }
>
> @@ -3412,8 +3405,8 @@ nfsd4_process_open1(struct nfsd4_compound_state *cstate,
> return status;
> clp = cstate->clp;
>
> - strhashval = ownerstr_hashval(clientid->cl_id, &open->op_owner);
> - oo = find_openstateowner_str(strhashval, open, cstate->minorversion, nn);
> + strhashval = ownerstr_hashval(&open->op_owner);
> + oo = find_openstateowner_str(strhashval, open, clp);
> open->op_openowner = oo;
> if (!oo) {
> goto new_owner;
> @@ -4818,15 +4811,16 @@ nevermind:
>
> static struct nfs4_lockowner *
> find_lockowner_str_locked(clientid_t *clid, struct xdr_netobj *owner,
> - struct nfsd_net *nn)
> + struct nfs4_client *clp)
> {
> - unsigned int strhashval = ownerstr_hashval(clid->cl_id, owner);
> + unsigned int strhashval = ownerstr_hashval(owner);
> struct nfs4_stateowner *so;
>
> - list_for_each_entry(so, &nn->ownerstr_hashtbl[strhashval], so_strhash) {
> + list_for_each_entry(so, &clp->cl_ownerstr_hashtbl[strhashval],
> + so_strhash) {
> if (so->so_is_open_owner)
> continue;
> - if (!same_owner_str(so, owner, clid))
> + if (!same_owner_str(so, owner))
> continue;
> atomic_inc(&so->so_count);
> return lockowner(so);
> @@ -4836,23 +4830,23 @@ find_lockowner_str_locked(clientid_t *clid, struct xdr_netobj *owner,
>
> static struct nfs4_lockowner *
> find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
> - struct nfsd_net *nn)
> + struct nfs4_client *clp)
> {
> struct nfs4_lockowner *lo;
>
> - spin_lock(&nn->client_lock);
> - lo = find_lockowner_str_locked(clid, owner, nn);
> - spin_unlock(&nn->client_lock);
> + spin_lock(&clp->cl_lock);
> + lo = find_lockowner_str_locked(clid, owner, clp);
> + spin_unlock(&clp->cl_lock);
> return lo;
> }
>
> static void nfs4_unhash_lockowner(struct nfs4_stateowner *sop)
> {
> - struct nfsd_net *nn = net_generic(sop->so_client->net, nfsd_net_id);
> + struct nfs4_client *clp = sop->so_client;
>
> - spin_lock(&nn->client_lock);
> + spin_lock(&clp->cl_lock);
> unhash_lockowner_locked(lockowner(sop));
> - spin_unlock(&nn->client_lock);
> + spin_unlock(&clp->cl_lock);
> }
>
> static void nfs4_free_lockowner(struct nfs4_stateowner *sop)
> @@ -4879,7 +4873,6 @@ alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp,
> struct nfs4_ol_stateid *open_stp,
> struct nfsd4_lock *lock)
> {
> - struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
> struct nfs4_lockowner *lo, *ret;
>
> lo = alloc_stateowner(lockowner_slab, &lock->lk_new_owner, clp);
> @@ -4889,16 +4882,16 @@ alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp,
> lo->lo_owner.so_is_open_owner = 0;
> lo->lo_owner.so_seqid = lock->lk_new_lock_seqid;
> lo->lo_owner.so_ops = &lockowner_ops;
> - spin_lock(&nn->client_lock);
> + spin_lock(&clp->cl_lock);
> ret = find_lockowner_str_locked(&clp->cl_clientid,
> - &lock->lk_new_owner, nn);
> + &lock->lk_new_owner, clp);
> if (ret == NULL) {
> list_add(&lo->lo_owner.so_strhash,
> - &nn->ownerstr_hashtbl[strhashval]);
> + &clp->cl_ownerstr_hashtbl[strhashval]);
> ret = lo;
> } else
> nfs4_free_lockowner(&lo->lo_owner);
> - spin_unlock(&nn->client_lock);
> + spin_unlock(&clp->cl_lock);
> return lo;
> }
>
> @@ -5010,12 +5003,10 @@ lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
> struct inode *inode = cstate->current_fh.fh_dentry->d_inode;
> struct nfs4_lockowner *lo;
> unsigned int strhashval;
> - struct nfsd_net *nn = net_generic(cl->net, nfsd_net_id);
>
> - lo = find_lockowner_str(&cl->cl_clientid, &lock->v.new.owner, nn);
> + lo = find_lockowner_str(&cl->cl_clientid, &lock->v.new.owner, cl);
> if (!lo) {
> - strhashval = ownerstr_hashval(cl->cl_clientid.cl_id,
> - &lock->v.new.owner);
> + strhashval = ownerstr_hashval(&lock->v.new.owner);
> lo = alloc_init_lock_stateowner(strhashval, cl, ost, lock);
> if (lo == NULL)
> return nfserr_jukebox;
> @@ -5293,7 +5284,8 @@ nfsd4_lockt(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> goto out;
> }
>
> - lo = find_lockowner_str(&lockt->lt_clientid, &lockt->lt_owner, nn);
> + lo = find_lockowner_str(&lockt->lt_clientid, &lockt->lt_owner,
> + cstate->clp);
> if (lo)
> file_lock->fl_owner = (fl_owner_t)lo;
> file_lock->fl_pid = current->tgid;
> @@ -5436,7 +5428,7 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
> struct nfs4_lockowner *lo;
> struct nfs4_ol_stateid *stp;
> struct xdr_netobj *owner = &rlockowner->rl_owner;
> - unsigned int hashval = ownerstr_hashval(clid->cl_id, owner);
> + unsigned int hashval = ownerstr_hashval(owner);
> __be32 status;
> struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
> struct nfs4_client *clp;
> @@ -5452,29 +5444,29 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
>
> status = nfserr_locks_held;
>
> + clp = cstate->clp;
> /* Find the matching lock stateowner */
> - spin_lock(&nn->client_lock);
> - list_for_each_entry(tmp, &nn->ownerstr_hashtbl[hashval], so_strhash) {
> + spin_lock(&clp->cl_lock);
> + list_for_each_entry(tmp, &clp->cl_ownerstr_hashtbl[hashval],
> + so_strhash) {
> if (tmp->so_is_open_owner)
> continue;
> - if (same_owner_str(tmp, owner, clid)) {
> + if (same_owner_str(tmp, owner)) {
> sop = tmp;
> atomic_inc(&sop->so_count);
> break;
> }
> }
> - spin_unlock(&nn->client_lock);
>
> /* No matching owner found, maybe a replay? Just declare victory... */
> if (!sop) {
> + spin_unlock(&clp->cl_lock);
> status = nfs_ok;
> goto out;
> }
>
> lo = lockowner(sop);
> /* see if there are still any locks associated with it */
> - clp = cstate->clp;
> - spin_lock(&clp->cl_lock);
> list_for_each_entry(stp, &sop->so_stateids, st_perstateowner) {
> if (check_for_locks(stp->st_stid.sc_file, lo)) {
> spin_unlock(&clp->cl_lock);
> @@ -5829,10 +5821,6 @@ static int nfs4_state_create_net(struct net *net)
> CLIENT_HASH_SIZE, GFP_KERNEL);
> if (!nn->unconf_id_hashtbl)
> goto err_unconf_id;
> - nn->ownerstr_hashtbl = kmalloc(sizeof(struct list_head) *
> - OWNER_HASH_SIZE, GFP_KERNEL);
> - if (!nn->ownerstr_hashtbl)
> - goto err_ownerstr;
> nn->sessionid_hashtbl = kmalloc(sizeof(struct list_head) *
> SESSION_HASH_SIZE, GFP_KERNEL);
> if (!nn->sessionid_hashtbl)
> @@ -5842,8 +5830,6 @@ static int nfs4_state_create_net(struct net *net)
> INIT_LIST_HEAD(&nn->conf_id_hashtbl[i]);
> INIT_LIST_HEAD(&nn->unconf_id_hashtbl[i]);
> }
> - for (i = 0; i < OWNER_HASH_SIZE; i++)
> - INIT_LIST_HEAD(&nn->ownerstr_hashtbl[i]);
> for (i = 0; i < SESSION_HASH_SIZE; i++)
> INIT_LIST_HEAD(&nn->sessionid_hashtbl[i]);
> nn->conf_name_tree = RB_ROOT;
> @@ -5859,8 +5845,6 @@ static int nfs4_state_create_net(struct net *net)
> return 0;
>
> err_sessionid:
> - kfree(nn->ownerstr_hashtbl);
> -err_ownerstr:
> kfree(nn->unconf_id_hashtbl);
> err_unconf_id:
> kfree(nn->conf_id_hashtbl);
> @@ -5890,7 +5874,6 @@ nfs4_state_destroy_net(struct net *net)
> }
>
> kfree(nn->sessionid_hashtbl);
> - kfree(nn->ownerstr_hashtbl);
> kfree(nn->unconf_id_hashtbl);
> kfree(nn->conf_id_hashtbl);
> put_net(net);
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index e073c86f389c..73a209dc352b 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -235,6 +235,7 @@ struct nfsd4_sessionid {
> struct nfs4_client {
> struct list_head cl_idhash; /* hash by cl_clientid.id */
> struct rb_node cl_namenode; /* link into by-name trees */
> + struct list_head *cl_ownerstr_hashtbl;
> struct list_head cl_openowners;
> struct idr cl_stateids; /* stateid lookup */
> struct list_head cl_delegations;
>

2014-08-03 01:59:37

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client

On Sat, Aug 2, 2014 at 6:59 PM, Jeff Layton <[email protected]> wrote:
> On Sat, 2 Aug 2014 10:47:27 -0400
> Trond Myklebust <[email protected]> wrote:
>
>> On Sat, Aug 2, 2014 at 10:20 AM, Kinglong Mee <[email protected]> wrote:
>> > On 8/2/2014 22:05, Trond Myklebust wrote:
>> >> On Sat, Aug 2, 2014 at 9:43 AM, Kinglong Mee <[email protected]> wrote:
>> >>> On 8/2/2014 21:11, Trond Myklebust wrote:
>> >>>> On Sat, Aug 2, 2014 at 6:39 AM, Kinglong Mee <[email protected]> wrote:
>> >>>>> On 7/30/2014 09:34, Jeff Layton wrote:
>> >>>>>> From: Trond Myklebust <[email protected]>
>> >>>>>>
>> >>>>>> Preparation for removing the client_mutex.
>> >>>>>>
>> >>>>>> Convert the open owner hash table into a per-client table and protect it
>> >>>>>> using the nfs4_client->cl_lock spin lock.
>> >>>>>>
>> >>>>>> Signed-off-by: Trond Myklebust <[email protected]>
>> >>>>>> ---
>> >>>>>> fs/nfsd/netns.h | 1 -
>> >>>>>> fs/nfsd/nfs4state.c | 187 ++++++++++++++++++++++++----------------------------
>> >>>>>> fs/nfsd/state.h | 1 +
>> >>>>>> 3 files changed, 86 insertions(+), 103 deletions(-)
>> >>>>>>
>> >>>>>> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
>> >>>>>> index a71d14413d39..e1f479c162b5 100644
>> >>>>>> --- a/fs/nfsd/netns.h
>> >>>>>> +++ b/fs/nfsd/netns.h
>> >>>>>> @@ -63,7 +63,6 @@ struct nfsd_net {
>> >>>>>> struct rb_root conf_name_tree;
>> >>>>>> struct list_head *unconf_id_hashtbl;
>> >>>>>> struct rb_root unconf_name_tree;
>> >>>>>> - struct list_head *ownerstr_hashtbl;
>> >>>>>
>> >>>>> I send a patch "NFSD: Rervert "knfsd: locks: flag NFSv4-owned locks"" before,
>> >>>>> http://comments.gmane.org/gmane.linux.nfs/64382
>> >>>>>
>> >>>>> nfsd needs the hashtbl to find the lockowner for locking by owner from
>> >>>>> fl->fl_owner stored in struct file_lock, but without nfs_client.
>> >>>>
>> >>>> Why? We're not currently doing that.
>> >>>
>> >>> Although not doing that right now, but there is a bug for getting the right ld_owner
>> >>> in nfs4_set_lock_denied.
>> >>>
>> >>> If denying locks, vfs don't copy fl->fl_lmops to the returned file_lock, so that,
>> >>> fl->fl_lmops always be NULL, nfsd never return the owner who holds the conflock.
>> >>>
>> >>> If we want fix this problem, needs finding the lockowner from struct file_lock.
>> >>
>> >> Do we really care enough about fixing nfs4_set_lock_denied enough to
>> >> do so at the cost of reducing overall scalability of locking state?
>> >
>> > I just report this problem, don't think enough about the scalability.
>> >
>> >> We will always be faking up the clientid etc for local locks. Are
>> >> there any clients out there that actually inspect the clientid on a
>> >> result of NFS4ERR_DENIED and that will break if we give them a fake
>> >> for non-local locks?
>> >
>> > Jeff has point the same problem of a non-nfs4_lockowner.
>> > Maybe we should copy fl_lmops to conflock as before, nfsd can distinguish
>> > the lockowner stored in struct file_lock by checking fl_lmops.
>> >
>>
>> Alternatively, set a flag in fl_flags. Back in the days, we used to
>> have a FL_NFSD, perhaps it is time to resurrect that?
>>
>
> Would we need a similar flag for lockd too?
>
> I'm not sure a flag is the correct approach for this. A "fl_lmtype" field
> or something similar might make sense, but I'm not sure that really
> adds much over just ensuring that fl_lmops is set properly for these locks.

It avoids adding an extra copy of fl_lmops that would only be useful
to knfsd as a flag.

> That said, we definitely will need to ensure that there are no harmful
> effects from setting fl_lmops on a conflock container. I don't see any
> right offhand, but it's probably reasonable to put something like that
> in -next for a bit of soak time...

The real problem here is that you are copying the lock, and then
trying to dereference the copied pointer without ensuring that the
thing the pointer is referencing still exists. Given that we're
removing the global state mutex, then it is now perfectly possible for
a competing knfsd thread to free the conflicting lock and the
lockowner pointed to by fl->fl_owner before your thread hits
nfs4_set_lock_denied.
The check for fl_lmtype does nothing to fix that race (nor does FL_NFSD).

Cheers
Trond
--
Trond Myklebust

Linux NFS client maintainer, PrimaryData

[email protected]

2014-08-02 13:52:07

by Kinglong Mee

[permalink] [raw]
Subject: Re: [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client

On 8/2/2014 21:23, Jeff Layton wrote:
> On Sat, 2 Aug 2014 09:11:25 -0400
> Trond Myklebust <[email protected]> wrote:
>
>> On Sat, Aug 2, 2014 at 6:39 AM, Kinglong Mee <[email protected]> wrote:
>>> On 7/30/2014 09:34, Jeff Layton wrote:
>>>> From: Trond Myklebust <[email protected]>
>>>>
>>>> Preparation for removing the client_mutex.
>>>>
>>>> Convert the open owner hash table into a per-client table and protect it
>>>> using the nfs4_client->cl_lock spin lock.
>>>>
>>>> Signed-off-by: Trond Myklebust <[email protected]>
>>>> ---
>>>> fs/nfsd/netns.h | 1 -
>>>> fs/nfsd/nfs4state.c | 187 ++++++++++++++++++++++++----------------------------
>>>> fs/nfsd/state.h | 1 +
>>>> 3 files changed, 86 insertions(+), 103 deletions(-)
>>>>
>>>> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
>>>> index a71d14413d39..e1f479c162b5 100644
>>>> --- a/fs/nfsd/netns.h
>>>> +++ b/fs/nfsd/netns.h
>>>> @@ -63,7 +63,6 @@ struct nfsd_net {
>>>> struct rb_root conf_name_tree;
>>>> struct list_head *unconf_id_hashtbl;
>>>> struct rb_root unconf_name_tree;
>>>> - struct list_head *ownerstr_hashtbl;
>>>
>>> I send a patch "NFSD: Rervert "knfsd: locks: flag NFSv4-owned locks"" before,
>>> http://comments.gmane.org/gmane.linux.nfs/64382
>>>
>>> nfsd needs the hashtbl to find the lockowner for locking by owner from
>>> fl->fl_owner stored in struct file_lock, but without nfs_client.
>>
>> Why? We're not currently doing that.
>>
>>> If moving the hashtbl to nfs_client, it's hard to finding the lockowner for locking.
>>
>
> I think there's a fundamental flaw in your original patch.
>
> You're casting fl->fl_owner to a struct nfs4_lockowner pointer, and
> then dereferencing that pointer to get to the lo->lo_hashval field. The
> problem there is that conflock might refer to a lock that is not a
> nfsv4 lock, and in that case there is zero guarantee that the
> fl_owner_t is a pointer at all, so that method could end up causing
> oopses.

Yes, you are right, the old patch has the problem.

>
> The fl_owner is really intended to be an opaque token, and you can't
> turn it back into a pointer without knowing for a fact what sort of
> lock it is. In the v4 case, we use the fl_lmops field to try and
> designate that.

Got it.

>
> Perhaps you need to change it so that the conflocks get fl_lmops set
> properly instead of changing things like you are in that patch?

OK, I will sends a new patch coping fl_lmops to conflocks for this problem.

thanks,
Kinglong Mee

>

2014-08-02 13:24:00

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client

On Sat, 2 Aug 2014 09:11:25 -0400
Trond Myklebust <[email protected]> wrote:

> On Sat, Aug 2, 2014 at 6:39 AM, Kinglong Mee <[email protected]> wrote:
> > On 7/30/2014 09:34, Jeff Layton wrote:
> >> From: Trond Myklebust <[email protected]>
> >>
> >> Preparation for removing the client_mutex.
> >>
> >> Convert the open owner hash table into a per-client table and protect it
> >> using the nfs4_client->cl_lock spin lock.
> >>
> >> Signed-off-by: Trond Myklebust <[email protected]>
> >> ---
> >> fs/nfsd/netns.h | 1 -
> >> fs/nfsd/nfs4state.c | 187 ++++++++++++++++++++++++----------------------------
> >> fs/nfsd/state.h | 1 +
> >> 3 files changed, 86 insertions(+), 103 deletions(-)
> >>
> >> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> >> index a71d14413d39..e1f479c162b5 100644
> >> --- a/fs/nfsd/netns.h
> >> +++ b/fs/nfsd/netns.h
> >> @@ -63,7 +63,6 @@ struct nfsd_net {
> >> struct rb_root conf_name_tree;
> >> struct list_head *unconf_id_hashtbl;
> >> struct rb_root unconf_name_tree;
> >> - struct list_head *ownerstr_hashtbl;
> >
> > I send a patch "NFSD: Rervert "knfsd: locks: flag NFSv4-owned locks"" before,
> > http://comments.gmane.org/gmane.linux.nfs/64382
> >
> > nfsd needs the hashtbl to find the lockowner for locking by owner from
> > fl->fl_owner stored in struct file_lock, but without nfs_client.
>
> Why? We're not currently doing that.
>
> > If moving the hashtbl to nfs_client, it's hard to finding the lockowner for locking.
>

I think there's a fundamental flaw in your original patch.

You're casting fl->fl_owner to a struct nfs4_lockowner pointer, and
then dereferencing that pointer to get to the lo->lo_hashval field. The
problem there is that conflock might refer to a lock that is not a
nfsv4 lock, and in that case there is zero guarantee that the
fl_owner_t is a pointer at all, so that method could end up causing
oopses.

The fl_owner is really intended to be an opaque token, and you can't
turn it back into a pointer without knowing for a fact what sort of
lock it is. In the v4 case, we use the fl_lmops field to try and
designate that.

Perhaps you need to change it so that the conflocks get fl_lmops set
properly instead of changing things like you are in that patch?

--
Jeff Layton <[email protected]>

2014-08-01 20:25:34

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v3 00/38] nfsd: stateid and stateowner refcounting overhaul

On Tue, Jul 29, 2014 at 09:34:05PM -0400, Jeff Layton wrote:
> v3:
> - fix intermediate regression in nfsd4_close_open_stateid

These all look OK to me; applying for 3.17.--b.

>
> v2:
> - rename functions from *_generic_stateid to *_ol_stateid
> - eliminate some nfs4_put_* wrappers around nfs4_put_stid
> - clean up lock stateid allocation and initialization
> - reorder patches to reduce churn
> - handle open/lock stateids individually in nfsd4_free_stateid
> - add ops struct for stateowners
> - combine v4.0 replay cache patches
>
> Here is the next swath of patches for the nfsd client_mutex removal
> series. The main focus of this series is to add refcounting to open and
> lock stateids and to use those to drive their management and eventual
> destruction. Additionally, we add refcounting to the open and lock
> stateowners and use that refcount to manage their lifecycle as well.
>
> Most of the new locking here is superfluous until we remove the
> client_mutex. This series should apply cleanly to Bruce's for-3.17
> branch now that the delegation patches have been merged.
>
> Jeff Layton (20):
> nfsd: Cleanup the freeing of stateids
> nfsd4: use cl_lock to synchronize all stateid idr calls
> nfsd: do filp_close in sc_free callback for lock stateids
> nfsd: Add locking to protect the state owner lists
> nfsd: clean up races in lock stateid searching and creation
> nfsd: ensure atomicity in nfsd4_free_stateid and
> nfsd4_validate_stateid
> nfsd: Add reference counting to state owners
> nfsd: Add a mutex to protect the NFSv4.0 open owner replay cache
> nfsd: clean up lockowner refcounting when finding them
> nfsd: add an operation for unhashing a stateowner
> nfsd: clean up refcounting for lockowners
> nfsd: make openstateids hold references to their openowners
> nfsd: don't allow CLOSE to proceed until refcount on stateid drops
> nfsd: clean up and reorganize release_lockowner
> nfsd: add locking to stateowner release
> nfsd: optimize destroy_lockowner cl_lock thrashing
> nfsd: close potential race in nfsd4_free_stateid
> nfsd: reduce cl_lock thrashing in release_openowner
> nfsd: don't thrash the cl_lock while freeing an open stateid
> nfsd: rename unhash_generic_stateid to unhash_ol_stateid
>
> Trond Myklebust (18):
> nfsd: Add reference counting to the lock and open stateids
> nfsd: Add a struct nfs4_file field to struct nfs4_stid
> nfsd: Replace nfs4_ol_stateid->st_file with the st_stid.sc_file
> nfsd: Add reference counting to lock stateids
> nfsd: nfsd4_locku() must reference the lock stateid
> nfsd: Ensure that nfs4_open_delegation() references the delegation
> stateid
> nfsd: nfsd4_process_open2() must reference the delegation stateid
> nfsd: nfsd4_process_open2() must reference the open stateid
> nfsd: Prepare nfsd4_close() for open stateid referencing
> nfsd: nfsd4_open_confirm() must reference the open stateid
> nfsd: Add reference counting to nfs4_preprocess_confirmed_seqid_op
> nfsd: Migrate the stateid reference into nfs4_preprocess_seqid_op
> nfsd: Migrate the stateid reference into nfs4_lookup_stateid()
> nfsd: Migrate the stateid reference into nfs4_find_stateid_by_type()
> nfsd: Make lock stateid take a reference to the lockowner
> nfsd: Protect adding/removing open state owners using client_lock
> nfsd: Protect adding/removing lock owners using client_lock
> nfsd: Move the open owner hash table into struct nfs4_client
>
> fs/nfsd/netns.h | 1 -
> fs/nfsd/nfs4callback.c | 4 +-
> fs/nfsd/nfs4proc.c | 12 +-
> fs/nfsd/nfs4state.c | 1014 +++++++++++++++++++++++++++++++-----------------
> fs/nfsd/nfs4xdr.c | 2 -
> fs/nfsd/state.h | 32 +-
> fs/nfsd/xdr4.h | 5 +-
> 7 files changed, 695 insertions(+), 375 deletions(-)
>
> --
> 1.9.3
>