2009-03-11 00:27:05

by J. Bruce Fields

[permalink] [raw]
Subject: miscellaneous nfsd4 state changes

I'm trying to move the callback code to asynchronous rpc calls, fix some
miscellaneous problems with the callback code, and break up the nfsv4
state lock.

But the callback changes aren't done yet, and are still just in my local
tree; and the locking fixup is still just on paper.

In the meantime, these are some smaller cleanups (and one small locking
change) done in preparation; these are queued up for 2.6.30, barring
any objections.

Beyond this, the (vague) longer-term plan:

- Make the callback rpc's asynchronous, and remove the separate
kthreads.
- Add a list of outstanding callbacks to the nfs4 client, for
two purposes:
- to allow restarting those calls after the rpc callback
client is destroyed and replaced with a new one (as
may happen in the setclientid case that changes the
callback channel); and
- to keep track of whether any callbacks are currently
taking too long (which I'm provisionally defining as
longer than lease_time/10 (or a second, if that's
larger)). When callbacks are taking too long, we
start returning cb_channel_down errors to the client's
renews, and we stop handing out delegations. Having
the full list of pending callbacks available may make
it easier to decide when to re-enable delegations
(which we should probably wait to do till *all*
pending too-slow callbacks have been replied to,
rather than just one of them).

For locking:

- Create a semaphore for each stateid, to be taken on the
seqid-mutating ops.
- Try to rely on spinlocks otherwise; I haven't figured out
quite what's required yet.

--b.


2009-03-18 20:29:11

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 12/14] nfsd4: remove use of mutex for file_hashtable

On Fri, Mar 13, 2009 at 01:18:31PM -0400, bfields wrote:
> On Thu, Mar 12, 2009 at 12:49:25PM +0200, Benny Halevy wrote:
> > On Mar. 12, 2009, 2:05 +0200, "J. Bruce Fields" <[email protected]> wrote:
> > > On Wed, Mar 11, 2009 at 08:52:14PM +0200, Benny Halevy wrote:
> > >>> @@ -1177,12 +1179,15 @@ find_file(struct inode *ino)
> > >>> unsigned int hashval = file_hashval(ino);
> > >>> struct nfs4_file *fp;
> > >>>
> > >>> + spin_lock(&recall_lock);
> > >>> list_for_each_entry(fp, &file_hashtbl[hashval], fi_hash) {
> > >>> if (fp->fi_inode == ino) {
> > >>> + spin_unlock(&recall_lock);
> > >> Hmm, I'm afraid there's a race here since potentially
> > >> you could take the reference on the file after it has been freed.
> > >> find_file should call get_nfs4_file if and only if it's still hashed,
> > >> i.e. atomically with looking it up on the list.
> > >>
> > >> On the same lines, the lock should be taken in put_nfs4_file
> > >> rather than in free_nfs4_file.
> > >>
> > >> That being said, I'm not sure we're unhashing the file in the right
> > >> place... it's too late for me right now but my hunch is that open
> > >> should alloc_init the nfs4_file as done today and close should unhash
> > >> it (under the recall spinlock), and then put it.
> > >> put_nfs4_file can stay the same and free_nfs4_file should just do the
> > >> iput and kmem_cache_free.
> > >>
> > >> The main difference is that find_file will fail finding the nfs4_file
> > >> struct after close. (get_nfs4_file in find_file should still happen
> > >> under the lock though)
> > >
> > > It's probably better for the nfs4_file to be visible longer, but either
> > > is correct.
> >
> > I see. What matters is the stateids associated with the file.
>
> Right. So for now I'm taking your first suggestion. Thanks again for
> the review.

By the way, what I've got now is the following.--b.

commit 466c8f95fa8a70335002c2002f7a6c27b65e6e93
Author: J. Bruce Fields <[email protected]>
Date: Sun Feb 22 14:51:34 2009 -0800

nfsd4: remove use of mutex for file_hashtable

As part of reducing the scope of the client_mutex, and in order to
remove the need for mutexes from the callback code (so that callbacks
can be done as asynchronous rpc calls), move manipulations of the
file_hashtable under the recall_lock.

Update the relevant comments while we're here.

Signed-off-by: J. Bruce Fields <[email protected]>
Cc: Alexandros Batsakis <[email protected]>
Reviewed-by: Benny Halevy <[email protected]>

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 3fd7136..5dcd38e 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -491,8 +491,6 @@ out_put_cred:
* or deleg_return.
*/
put_nfs4_client(clp);
- nfs4_lock_state();
nfs4_put_delegation(dp);
- nfs4_unlock_state();
return;
}
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index fdcd7cf..a4bcfe0 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -78,14 +78,18 @@ static struct nfs4_delegation * find_delegation_stateid(struct inode *ino, state
static char user_recovery_dirname[PATH_MAX] = "/var/lib/nfs/v4recovery";
static void nfs4_set_recdir(char *recdir);

-/* Locking:
- *
- * client_mutex:
- * protects clientid_hashtbl[], clientstr_hashtbl[],
- * unconfstr_hashtbl[], uncofid_hashtbl[].
- */
+/* Locking: */
+
+/* Currently used for almost all code touching nfsv4 state: */
static DEFINE_MUTEX(client_mutex);

+/*
+ * Currently used for the del_recall_lru and file hash table. In an
+ * effort to decrease the scope of the client_mutex, this spinlock may
+ * eventually cover more:
+ */
+static DEFINE_SPINLOCK(recall_lock);
+
static struct kmem_cache *stateowner_slab = NULL;
static struct kmem_cache *file_slab = NULL;
static struct kmem_cache *stateid_slab = NULL;
@@ -116,33 +120,23 @@ opaque_hashval(const void *ptr, int nbytes)
return x;
}

-/*
- * Delegation state
- */
-
-/* recall_lock protects the del_recall_lru */
-static DEFINE_SPINLOCK(recall_lock);
static struct list_head del_recall_lru;

-static void
-free_nfs4_file(struct kref *kref)
-{
- struct nfs4_file *fp = container_of(kref, struct nfs4_file, fi_ref);
- list_del(&fp->fi_hash);
- iput(fp->fi_inode);
- kmem_cache_free(file_slab, fp);
-}
-
static inline void
put_nfs4_file(struct nfs4_file *fi)
{
- kref_put(&fi->fi_ref, free_nfs4_file);
+ if (atomic_dec_and_lock(&fi->fi_ref, &recall_lock)) {
+ list_del(&fi->fi_hash);
+ spin_unlock(&recall_lock);
+ iput(fi->fi_inode);
+ kmem_cache_free(file_slab, fi);
+ }
}

static inline void
get_nfs4_file(struct nfs4_file *fi)
{
- kref_get(&fi->fi_ref);
+ atomic_inc(&fi->fi_ref);
}

static int num_delegations;
@@ -1000,11 +994,13 @@ alloc_init_file(struct inode *ino)

fp = kmem_cache_alloc(file_slab, GFP_KERNEL);
if (fp) {
- kref_init(&fp->fi_ref);
+ atomic_set(&fp->fi_ref, 1);
INIT_LIST_HEAD(&fp->fi_hash);
INIT_LIST_HEAD(&fp->fi_stateids);
INIT_LIST_HEAD(&fp->fi_delegations);
+ spin_lock(&recall_lock);
list_add(&fp->fi_hash, &file_hashtbl[hashval]);
+ spin_unlock(&recall_lock);
fp->fi_inode = igrab(ino);
fp->fi_id = current_fileid++;
fp->fi_had_conflict = false;
@@ -1177,12 +1173,15 @@ find_file(struct inode *ino)
unsigned int hashval = file_hashval(ino);
struct nfs4_file *fp;

+ spin_lock(&recall_lock);
list_for_each_entry(fp, &file_hashtbl[hashval], fi_hash) {
if (fp->fi_inode == ino) {
get_nfs4_file(fp);
+ spin_unlock(&recall_lock);
return fp;
}
}
+ spin_unlock(&recall_lock);
return NULL;
}

diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h
index 503b6bb..a6e4a00 100644
--- a/include/linux/nfsd/state.h
+++ b/include/linux/nfsd/state.h
@@ -214,7 +214,7 @@ struct nfs4_stateowner {
* share_acces, share_deny on the file.
*/
struct nfs4_file {
- struct kref fi_ref;
+ atomic_t fi_ref;
struct list_head fi_hash; /* hash by "struct inode *" */
struct list_head fi_stateids;
struct list_head fi_delegations;

2009-03-18 20:40:46

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 12/14] nfsd4: remove use of mutex for file_hashtable

On Mar. 18, 2009, 22:28 +0200, "J. Bruce Fields" <[email protected]> wrote:
> On Fri, Mar 13, 2009 at 01:18:31PM -0400, bfields wrote:
>> On Thu, Mar 12, 2009 at 12:49:25PM +0200, Benny Halevy wrote:
>>> On Mar. 12, 2009, 2:05 +0200, "J. Bruce Fields" <[email protected]> wrote:
>>>> On Wed, Mar 11, 2009 at 08:52:14PM +0200, Benny Halevy wrote:
>>>>>> @@ -1177,12 +1179,15 @@ find_file(struct inode *ino)
>>>>>> unsigned int hashval = file_hashval(ino);
>>>>>> struct nfs4_file *fp;
>>>>>>
>>>>>> + spin_lock(&recall_lock);
>>>>>> list_for_each_entry(fp, &file_hashtbl[hashval], fi_hash) {
>>>>>> if (fp->fi_inode == ino) {
>>>>>> + spin_unlock(&recall_lock);
>>>>> Hmm, I'm afraid there's a race here since potentially
>>>>> you could take the reference on the file after it has been freed.
>>>>> find_file should call get_nfs4_file if and only if it's still hashed,
>>>>> i.e. atomically with looking it up on the list.
>>>>>
>>>>> On the same lines, the lock should be taken in put_nfs4_file
>>>>> rather than in free_nfs4_file.
>>>>>
>>>>> That being said, I'm not sure we're unhashing the file in the right
>>>>> place... it's too late for me right now but my hunch is that open
>>>>> should alloc_init the nfs4_file as done today and close should unhash
>>>>> it (under the recall spinlock), and then put it.
>>>>> put_nfs4_file can stay the same and free_nfs4_file should just do the
>>>>> iput and kmem_cache_free.
>>>>>
>>>>> The main difference is that find_file will fail finding the nfs4_file
>>>>> struct after close. (get_nfs4_file in find_file should still happen
>>>>> under the lock though)
>>>> It's probably better for the nfs4_file to be visible longer, but either
>>>> is correct.
>>> I see. What matters is the stateids associated with the file.
>> Right. So for now I'm taking your first suggestion. Thanks again for
>> the review.
>
> By the way, what I've got now is the following.--b.
>
> commit 466c8f95fa8a70335002c2002f7a6c27b65e6e93
> Author: J. Bruce Fields <[email protected]>
> Date: Sun Feb 22 14:51:34 2009 -0800
>
> nfsd4: remove use of mutex for file_hashtable
>
> As part of reducing the scope of the client_mutex, and in order to
> remove the need for mutexes from the callback code (so that callbacks
> can be done as asynchronous rpc calls), move manipulations of the
> file_hashtable under the recall_lock.
>
> Update the relevant comments while we're here.
>
> Signed-off-by: J. Bruce Fields <[email protected]>
> Cc: Alexandros Batsakis <[email protected]>
> Reviewed-by: Benny Halevy <[email protected]>
>
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index 3fd7136..5dcd38e 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -491,8 +491,6 @@ out_put_cred:
> * or deleg_return.
> */
> put_nfs4_client(clp);
> - nfs4_lock_state();
> nfs4_put_delegation(dp);
> - nfs4_unlock_state();
> return;
> }
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index fdcd7cf..a4bcfe0 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -78,14 +78,18 @@ static struct nfs4_delegation * find_delegation_stateid(struct inode *ino, state
> static char user_recovery_dirname[PATH_MAX] = "/var/lib/nfs/v4recovery";
> static void nfs4_set_recdir(char *recdir);
>
> -/* Locking:
> - *
> - * client_mutex:
> - * protects clientid_hashtbl[], clientstr_hashtbl[],
> - * unconfstr_hashtbl[], uncofid_hashtbl[].
> - */
> +/* Locking: */
> +
> +/* Currently used for almost all code touching nfsv4 state: */
> static DEFINE_MUTEX(client_mutex);
>
> +/*
> + * Currently used for the del_recall_lru and file hash table. In an
> + * effort to decrease the scope of the client_mutex, this spinlock may
> + * eventually cover more:
> + */
> +static DEFINE_SPINLOCK(recall_lock);
> +
> static struct kmem_cache *stateowner_slab = NULL;
> static struct kmem_cache *file_slab = NULL;
> static struct kmem_cache *stateid_slab = NULL;
> @@ -116,33 +120,23 @@ opaque_hashval(const void *ptr, int nbytes)
> return x;
> }
>
> -/*
> - * Delegation state
> - */
> -
> -/* recall_lock protects the del_recall_lru */
> -static DEFINE_SPINLOCK(recall_lock);
> static struct list_head del_recall_lru;
>
> -static void
> -free_nfs4_file(struct kref *kref)
> -{
> - struct nfs4_file *fp = container_of(kref, struct nfs4_file, fi_ref);
> - list_del(&fp->fi_hash);
> - iput(fp->fi_inode);
> - kmem_cache_free(file_slab, fp);
> -}
> -
> static inline void
> put_nfs4_file(struct nfs4_file *fi)
> {
> - kref_put(&fi->fi_ref, free_nfs4_file);

missing spin_lock(&recall_lock);?


> + if (atomic_dec_and_lock(&fi->fi_ref, &recall_lock)) {
> + list_del(&fi->fi_hash);
> + spin_unlock(&recall_lock);
> + iput(fi->fi_inode);
> + kmem_cache_free(file_slab, fi);
> + }

} else {
spin_unlock(&recall_lock);
}

or am I missing something?

Benny

> }
>
> static inline void
> get_nfs4_file(struct nfs4_file *fi)
> {
> - kref_get(&fi->fi_ref);
> + atomic_inc(&fi->fi_ref);
> }
>
> static int num_delegations;
> @@ -1000,11 +994,13 @@ alloc_init_file(struct inode *ino)
>
> fp = kmem_cache_alloc(file_slab, GFP_KERNEL);
> if (fp) {
> - kref_init(&fp->fi_ref);
> + atomic_set(&fp->fi_ref, 1);
> INIT_LIST_HEAD(&fp->fi_hash);
> INIT_LIST_HEAD(&fp->fi_stateids);
> INIT_LIST_HEAD(&fp->fi_delegations);
> + spin_lock(&recall_lock);
> list_add(&fp->fi_hash, &file_hashtbl[hashval]);
> + spin_unlock(&recall_lock);
> fp->fi_inode = igrab(ino);
> fp->fi_id = current_fileid++;
> fp->fi_had_conflict = false;
> @@ -1177,12 +1173,15 @@ find_file(struct inode *ino)
> unsigned int hashval = file_hashval(ino);
> struct nfs4_file *fp;
>
> + spin_lock(&recall_lock);
> list_for_each_entry(fp, &file_hashtbl[hashval], fi_hash) {
> if (fp->fi_inode == ino) {
> get_nfs4_file(fp);
> + spin_unlock(&recall_lock);
> return fp;
> }
> }
> + spin_unlock(&recall_lock);
> return NULL;
> }
>
> diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h
> index 503b6bb..a6e4a00 100644
> --- a/include/linux/nfsd/state.h
> +++ b/include/linux/nfsd/state.h
> @@ -214,7 +214,7 @@ struct nfs4_stateowner {
> * share_acces, share_deny on the file.
> */
> struct nfs4_file {
> - struct kref fi_ref;
> + atomic_t fi_ref;
> struct list_head fi_hash; /* hash by "struct inode *" */
> struct list_head fi_stateids;
> struct list_head fi_delegations;

2009-03-18 21:03:31

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 12/14] nfsd4: remove use of mutex for file_hashtable

On Wed, Mar 18, 2009 at 10:40:41PM +0200, Benny Halevy wrote:
> On Mar. 18, 2009, 22:28 +0200, "J. Bruce Fields" <[email protected]> wrote:
> > static inline void
> > put_nfs4_file(struct nfs4_file *fi)
> > {
> > - kref_put(&fi->fi_ref, free_nfs4_file);
>
> missing spin_lock(&recall_lock);?
>
>
> > + if (atomic_dec_and_lock(&fi->fi_ref, &recall_lock)) {
> > + list_del(&fi->fi_hash);
> > + spin_unlock(&recall_lock);
> > + iput(fi->fi_inode);
> > + kmem_cache_free(file_slab, fi);
> > + }
>
> } else {
> spin_unlock(&recall_lock);
> }
>
> or am I missing something?

>From include/linux/spinlock.h:

/**
* atomic_dec_and_lock - lock on reaching reference count zero
* @atomic: the atomic counter
* @lock: the spinlock in question
*
* Decrements @atomic by 1. If the result is 0, returns true and locks
* @lock. Returns false for all other cases.
*/

So it's useful for cases such as this, when a reference-counted object
is reachable from some data structure visible to others, and you want to
remove it from that structure on dropping the last reference (as opposed
to keeping one reference for the structure itself, and requiring the
object to be explicitly removed).

(Unless I'm missing something else.)

--b.

2009-03-18 21:07:58

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 12/14] nfsd4: remove use of mutex for file_hashtable

On Mar. 18, 2009, 23:03 +0200, "J. Bruce Fields" <[email protected]> wrote:
> On Wed, Mar 18, 2009 at 10:40:41PM +0200, Benny Halevy wrote:
>> On Mar. 18, 2009, 22:28 +0200, "J. Bruce Fields" <[email protected]> wrote:
>>> static inline void
>>> put_nfs4_file(struct nfs4_file *fi)
>>> {
>>> - kref_put(&fi->fi_ref, free_nfs4_file);
>> missing spin_lock(&recall_lock);?
>>
>>
>>> + if (atomic_dec_and_lock(&fi->fi_ref, &recall_lock)) {
>>> + list_del(&fi->fi_hash);
>>> + spin_unlock(&recall_lock);
>>> + iput(fi->fi_inode);
>>> + kmem_cache_free(file_slab, fi);
>>> + }
>> } else {
>> spin_unlock(&recall_lock);
>> }
>>
>> or am I missing something?
>
> From include/linux/spinlock.h:
>
> /**
> * atomic_dec_and_lock - lock on reaching reference count zero
> * @atomic: the atomic counter
> * @lock: the spinlock in question
> *
> * Decrements @atomic by 1. If the result is 0, returns true and locks
> * @lock. Returns false for all other cases.
> */
>
> So it's useful for cases such as this, when a reference-counted object
> is reachable from some data structure visible to others, and you want to
> remove it from that structure on dropping the last reference (as opposed
> to keeping one reference for the structure itself, and requiring the
> object to be explicitly removed).
>
> (Unless I'm missing something else.)

My bad. I did miss that.
Patch looks good to me now.

Benny

>
> --b.

2009-03-12 10:49:30

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 12/14] nfsd4: remove use of mutex for file_hashtable

On Mar. 12, 2009, 2:05 +0200, "J. Bruce Fields" <[email protected]> wrote:
> On Wed, Mar 11, 2009 at 08:52:14PM +0200, Benny Halevy wrote:
>> On Mar. 11, 2009, 2:27 +0200, "J. Bruce Fields" <[email protected]> wrote:
>>> From: J. Bruce Fields <[email protected]>
>>>
>>> As part of reducing the scope of the client_mutex, and in order to
>>> remove the need for mutexes from the callback code (so that callbacks
>>> can be done as asynchronous rpc calls), move manipulations of the
>>> file_hashtable under the recall_lock.
>>>
>>> Update the relevant comments while we're here.
>>>
>>> Signed-off-by: J. Bruce Fields <[email protected]>
>>> Cc: Alexandros Batsakis <[email protected]>
>>> ---
>>> fs/nfsd/nfs4callback.c | 2 --
>>> fs/nfsd/nfs4state.c | 29 +++++++++++++++++------------
>>> 2 files changed, 17 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
>>> index 3fd7136..5dcd38e 100644
>>> --- a/fs/nfsd/nfs4callback.c
>>> +++ b/fs/nfsd/nfs4callback.c
>>> @@ -491,8 +491,6 @@ out_put_cred:
>>> * or deleg_return.
>>> */
>>> put_nfs4_client(clp);
>>> - nfs4_lock_state();
>>> nfs4_put_delegation(dp);
>>> - nfs4_unlock_state();
>>> return;
>>> }
>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>> index 56e81f9..2f4cb9a 100644
>>> --- a/fs/nfsd/nfs4state.c
>>> +++ b/fs/nfsd/nfs4state.c
>>> @@ -78,14 +78,18 @@ static struct nfs4_delegation * find_delegation_stateid(struct inode *ino, state
>>> static char user_recovery_dirname[PATH_MAX] = "/var/lib/nfs/v4recovery";
>>> static void nfs4_set_recdir(char *recdir);
>>>
>>> -/* Locking:
>>> - *
>>> - * client_mutex:
>>> - * protects clientid_hashtbl[], clientstr_hashtbl[],
>>> - * unconfstr_hashtbl[], uncofid_hashtbl[].
>>> - */
>>> +/* Locking: */
>>> +
>>> +/* Currently used for almost all code touching nfsv4 state: */
>>> static DEFINE_MUTEX(client_mutex);
>>>
>>> +/*
>>> + * Currently used for the del_recall_lru and file hash table. In an
>>> + * effort to decrease the scope of the client_mutex, this spinlock may
>>> + * eventually cover more:
>>> + */
>>> +static DEFINE_SPINLOCK(recall_lock);
>>> +
>>> static struct kmem_cache *stateowner_slab = NULL;
>>> static struct kmem_cache *file_slab = NULL;
>>> static struct kmem_cache *stateid_slab = NULL;
>>> @@ -116,19 +120,15 @@ opaque_hashval(const void *ptr, int nbytes)
>>> return x;
>>> }
>>>
>>> -/*
>>> - * Delegation state
>>> - */
>>> -
>>> -/* recall_lock protects the del_recall_lru */
>>> -static DEFINE_SPINLOCK(recall_lock);
>>> static struct list_head del_recall_lru;
>>>
>>> static void
>>> free_nfs4_file(struct kref *kref)
>>> {
>>> struct nfs4_file *fp = container_of(kref, struct nfs4_file, fi_ref);
>>> + spin_lock(&recall_lock);
>>> list_del(&fp->fi_hash);
>>> + spin_unlock(&recall_lock);
>>> iput(fp->fi_inode);
>>> kmem_cache_free(file_slab, fp);
>>> }
>>> @@ -1004,7 +1004,9 @@ alloc_init_file(struct inode *ino)
>>> INIT_LIST_HEAD(&fp->fi_hash);
>>> INIT_LIST_HEAD(&fp->fi_stateids);
>>> INIT_LIST_HEAD(&fp->fi_delegations);
>>> + spin_lock(&recall_lock);
>>> list_add(&fp->fi_hash, &file_hashtbl[hashval]);
>>> + spin_unlock(&recall_lock);
>>> fp->fi_inode = igrab(ino);
>>> fp->fi_id = current_fileid++;
>>> fp->fi_had_conflict = false;
>>> @@ -1177,12 +1179,15 @@ find_file(struct inode *ino)
>>> unsigned int hashval = file_hashval(ino);
>>> struct nfs4_file *fp;
>>>
>>> + spin_lock(&recall_lock);
>>> list_for_each_entry(fp, &file_hashtbl[hashval], fi_hash) {
>>> if (fp->fi_inode == ino) {
>>> + spin_unlock(&recall_lock);
>> Hmm, I'm afraid there's a race here since potentially
>> you could take the reference on the file after it has been freed.
>> find_file should call get_nfs4_file if and only if it's still hashed,
>> i.e. atomically with looking it up on the list.
>>
>> On the same lines, the lock should be taken in put_nfs4_file
>> rather than in free_nfs4_file.
>>
>> That being said, I'm not sure we're unhashing the file in the right
>> place... it's too late for me right now but my hunch is that open
>> should alloc_init the nfs4_file as done today and close should unhash
>> it (under the recall spinlock), and then put it.
>> put_nfs4_file can stay the same and free_nfs4_file should just do the
>> iput and kmem_cache_free.
>>
>> The main difference is that find_file will fail finding the nfs4_file
>> struct after close. (get_nfs4_file in find_file should still happen
>> under the lock though)
>
> It's probably better for the nfs4_file to be visible longer, but either
> is correct.

I see. What matters is the stateids associated with the file.

Benny

>
> The only reason the delegation has a reference on this is to keep track
> of which files have seen conflicts. Maybe there's some better way.
>
> Thanks for catching this....
>
> --b.

2009-03-12 10:55:47

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 13/14] nfsd4: fix do_probe_callback errors

On Mar. 11, 2009, 2:27 +0200, "J. Bruce Fields" <[email protected]> wrote:
> From: J. Bruce Fields <[email protected]>
>
> The errors returned aren't used. Just return 0 and make them available

Wouldn't it be better to change the function to return void
so justice would be seen to be done?

A couple more comments below...

> to a dprintk(). Also, consistently use -ERRNO errors instead of nfs
> errors.
>
> Signed-off-by: J. Bruce Fields <[email protected]>
> ---
> fs/nfsd/nfs4callback.c | 15 +++++++--------
> 1 files changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index 5dcd38e..c979af7 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -389,12 +389,10 @@ static int do_probe_callback(void *data)
> .rpc_argp = clp,
> };
> struct rpc_clnt *client;
> - int status;
> + int status = -EINVAL;
>
> - if (!clp->cl_principal && (clp->cl_flavor >= RPC_AUTH_GSS_KRB5)) {
> - status = nfserr_cb_path_down;
> + if (!clp->cl_principal && (clp->cl_flavor >= RPC_AUTH_GSS_KRB5))
> goto out_err;
> - }
>
> /* Initialize address */
> memset(&addr, 0, sizeof(addr));
> @@ -405,7 +403,8 @@ static int do_probe_callback(void *data)
> /* Create RPC client */
> client = rpc_create(&args);
> if (IS_ERR(client)) {
> - dprintk("NFSD: couldn't create callback client\n");
> + dprintk("NFSD: couldn't create callback client: %d\n",
> + PTR_ERR(client));
> status = PTR_ERR(client);

How about reversing the order and dprintk status instead of PTR_ERR(client)?

> goto out_err;
> }
> @@ -422,10 +421,10 @@ static int do_probe_callback(void *data)
> out_release_client:
> rpc_shutdown_client(client);
> out_err:
> - dprintk("NFSD: warning: no callback path to client %.*s\n",
> - (int)clp->cl_name.len, clp->cl_name.data);
> + dprintk("NFSD: warning: no callback path to client %.*: error %ds\n",

seems like a typo. ": error %d" was pasted one char too soon...
should be:
+ dprintk("NFSD: warning: no callback path to client %.*s: error %d\n",

Benny

> + (int)clp->cl_name.len, clp->cl_name.data, status);
> put_nfs4_client(clp);
> - return status;
> + return 0;
> }
>
> /*

2009-03-12 10:59:09

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 14/14] nfsd4: move rpc_client setup to a separate function

On Mar. 11, 2009, 2:27 +0200, "J. Bruce Fields" <[email protected]> wrote:
> From: J. Bruce Fields <[email protected]>
>
> Signed-off-by: J. Bruce Fields <[email protected]>
> ---
> fs/nfsd/nfs4callback.c | 33 ++++++++++++++++++++++-----------
> 1 files changed, 22 insertions(+), 11 deletions(-)
>
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index c979af7..1d58ac7 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -361,9 +361,8 @@ static struct rpc_program cb_program = {
> /* Reference counting, callback cleanup, etc., all look racy as heck.
> * And why is cb_set an atomic? */
>
> -static int do_probe_callback(void *data)
> +static struct rpc_clnt *setup_callback_client(struct nfs4_client *clp)
> {
> - struct nfs4_client *clp = data;
> struct sockaddr_in addr;
> struct nfs4_callback *cb = &clp->cl_callback;
> struct rpc_timeout timeparms = {
> @@ -384,15 +383,10 @@ static int do_probe_callback(void *data)
> .flags = (RPC_CLNT_CREATE_NOPING | RPC_CLNT_CREATE_QUIET),
> .client_name = clp->cl_principal,
> };
> - struct rpc_message msg = {
> - .rpc_proc = &nfs4_cb_procedures[NFSPROC4_CLNT_CB_NULL],
> - .rpc_argp = clp,
> - };
> struct rpc_clnt *client;
> - int status = -EINVAL;
>
> if (!clp->cl_principal && (clp->cl_flavor >= RPC_AUTH_GSS_KRB5))
> - goto out_err;
> + return ERR_PTR(-EINVAL);
>
> /* Initialize address */
> memset(&addr, 0, sizeof(addr));
> @@ -402,9 +396,26 @@ static int do_probe_callback(void *data)
>
> /* Create RPC client */
> client = rpc_create(&args);
> - if (IS_ERR(client)) {
> - dprintk("NFSD: couldn't create callback client: %d\n",
> + if (IS_ERR(client))
> + dprintk("NFSD: couldn't create callback client: %ld\n",
> PTR_ERR(client));
> + return client;
> +
> +}
> +
> +static int do_probe_callback(void *data)
> +{
> + struct nfs4_client *clp = data;
> + struct nfs4_callback *cb = &clp->cl_callback;
> + struct rpc_message msg = {
> + .rpc_proc = &nfs4_cb_procedures[NFSPROC4_CLNT_CB_NULL],
> + .rpc_argp = clp,
> + };
> + struct rpc_clnt *client;
> + int status;
> +
> + client = setup_callback_client(clp);
> + if (IS_ERR(client)) {
> status = PTR_ERR(client);
> goto out_err;
> }
> @@ -421,7 +432,7 @@ static int do_probe_callback(void *data)
> out_release_client:
> rpc_shutdown_client(client);
> out_err:
> - dprintk("NFSD: warning: no callback path to client %.*: error %ds\n",
> + dprintk("NFSD: warning: no callback path to client %.*s: error %ds\n"

Cool, that just leaves the trailing 's' to be cleaned up.

Benny

> (int)clp->cl_name.len, clp->cl_name.data, status);
> put_nfs4_client(clp);
> return 0;

2009-03-12 21:41:44

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 13/14] nfsd4: fix do_probe_callback errors

On Thu, Mar 12, 2009 at 12:55:42PM +0200, Benny Halevy wrote:
> On Mar. 11, 2009, 2:27 +0200, "J. Bruce Fields" <[email protected]> wrote:
> > From: J. Bruce Fields <[email protected]>
> >
> > The errors returned aren't used. Just return 0 and make them available
>
> Wouldn't it be better to change the function to return void
> so justice would be seen to be done?

Yeah, but for now since this is passed to kthread_start() this is
probably needed to keep the compiler happy.

> > @@ -405,7 +403,8 @@ static int do_probe_callback(void *data)
> > /* Create RPC client */
> > client = rpc_create(&args);
> > if (IS_ERR(client)) {
> > - dprintk("NFSD: couldn't create callback client\n");
> > + dprintk("NFSD: couldn't create callback client: %d\n",
> > + PTR_ERR(client));
> > status = PTR_ERR(client);
>
> How about reversing the order and dprintk status instead of PTR_ERR(client)?

OK.

>
> > goto out_err;
> > }
> > @@ -422,10 +421,10 @@ static int do_probe_callback(void *data)
> > out_release_client:
> > rpc_shutdown_client(client);
> > out_err:
> > - dprintk("NFSD: warning: no callback path to client %.*s\n",
> > - (int)clp->cl_name.len, clp->cl_name.data);
> > + dprintk("NFSD: warning: no callback path to client %.*: error %ds\n",
>
> seems like a typo. ": error %d" was pasted one char too soon...
> should be:
> + dprintk("NFSD: warning: no callback path to client %.*s: error %d\n",

Whoops, thanks. How did that ever compile? Oh, I see, I must have only
tested the comiple after the next patch, as it looks like I incorrectly
merged a half-fix there....

--b.

2009-03-13 17:18:38

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 12/14] nfsd4: remove use of mutex for file_hashtable

On Thu, Mar 12, 2009 at 12:49:25PM +0200, Benny Halevy wrote:
> On Mar. 12, 2009, 2:05 +0200, "J. Bruce Fields" <[email protected]> wrote:
> > On Wed, Mar 11, 2009 at 08:52:14PM +0200, Benny Halevy wrote:
> >> On Mar. 11, 2009, 2:27 +0200, "J. Bruce Fields" <[email protected]> wrote:
> >>> From: J. Bruce Fields <[email protected]>
> >>>
> >>> As part of reducing the scope of the client_mutex, and in order to
> >>> remove the need for mutexes from the callback code (so that callbacks
> >>> can be done as asynchronous rpc calls), move manipulations of the
> >>> file_hashtable under the recall_lock.
> >>>
> >>> Update the relevant comments while we're here.
> >>>
> >>> Signed-off-by: J. Bruce Fields <[email protected]>
> >>> Cc: Alexandros Batsakis <[email protected]>
> >>> ---
> >>> fs/nfsd/nfs4callback.c | 2 --
> >>> fs/nfsd/nfs4state.c | 29 +++++++++++++++++------------
> >>> 2 files changed, 17 insertions(+), 14 deletions(-)
> >>>
> >>> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> >>> index 3fd7136..5dcd38e 100644
> >>> --- a/fs/nfsd/nfs4callback.c
> >>> +++ b/fs/nfsd/nfs4callback.c
> >>> @@ -491,8 +491,6 @@ out_put_cred:
> >>> * or deleg_return.
> >>> */
> >>> put_nfs4_client(clp);
> >>> - nfs4_lock_state();
> >>> nfs4_put_delegation(dp);
> >>> - nfs4_unlock_state();
> >>> return;
> >>> }
> >>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> >>> index 56e81f9..2f4cb9a 100644
> >>> --- a/fs/nfsd/nfs4state.c
> >>> +++ b/fs/nfsd/nfs4state.c
> >>> @@ -78,14 +78,18 @@ static struct nfs4_delegation * find_delegation_stateid(struct inode *ino, state
> >>> static char user_recovery_dirname[PATH_MAX] = "/var/lib/nfs/v4recovery";
> >>> static void nfs4_set_recdir(char *recdir);
> >>>
> >>> -/* Locking:
> >>> - *
> >>> - * client_mutex:
> >>> - * protects clientid_hashtbl[], clientstr_hashtbl[],
> >>> - * unconfstr_hashtbl[], uncofid_hashtbl[].
> >>> - */
> >>> +/* Locking: */
> >>> +
> >>> +/* Currently used for almost all code touching nfsv4 state: */
> >>> static DEFINE_MUTEX(client_mutex);
> >>>
> >>> +/*
> >>> + * Currently used for the del_recall_lru and file hash table. In an
> >>> + * effort to decrease the scope of the client_mutex, this spinlock may
> >>> + * eventually cover more:
> >>> + */
> >>> +static DEFINE_SPINLOCK(recall_lock);
> >>> +
> >>> static struct kmem_cache *stateowner_slab = NULL;
> >>> static struct kmem_cache *file_slab = NULL;
> >>> static struct kmem_cache *stateid_slab = NULL;
> >>> @@ -116,19 +120,15 @@ opaque_hashval(const void *ptr, int nbytes)
> >>> return x;
> >>> }
> >>>
> >>> -/*
> >>> - * Delegation state
> >>> - */
> >>> -
> >>> -/* recall_lock protects the del_recall_lru */
> >>> -static DEFINE_SPINLOCK(recall_lock);
> >>> static struct list_head del_recall_lru;
> >>>
> >>> static void
> >>> free_nfs4_file(struct kref *kref)
> >>> {
> >>> struct nfs4_file *fp = container_of(kref, struct nfs4_file, fi_ref);
> >>> + spin_lock(&recall_lock);
> >>> list_del(&fp->fi_hash);
> >>> + spin_unlock(&recall_lock);
> >>> iput(fp->fi_inode);
> >>> kmem_cache_free(file_slab, fp);
> >>> }
> >>> @@ -1004,7 +1004,9 @@ alloc_init_file(struct inode *ino)
> >>> INIT_LIST_HEAD(&fp->fi_hash);
> >>> INIT_LIST_HEAD(&fp->fi_stateids);
> >>> INIT_LIST_HEAD(&fp->fi_delegations);
> >>> + spin_lock(&recall_lock);
> >>> list_add(&fp->fi_hash, &file_hashtbl[hashval]);
> >>> + spin_unlock(&recall_lock);
> >>> fp->fi_inode = igrab(ino);
> >>> fp->fi_id = current_fileid++;
> >>> fp->fi_had_conflict = false;
> >>> @@ -1177,12 +1179,15 @@ find_file(struct inode *ino)
> >>> unsigned int hashval = file_hashval(ino);
> >>> struct nfs4_file *fp;
> >>>
> >>> + spin_lock(&recall_lock);
> >>> list_for_each_entry(fp, &file_hashtbl[hashval], fi_hash) {
> >>> if (fp->fi_inode == ino) {
> >>> + spin_unlock(&recall_lock);
> >> Hmm, I'm afraid there's a race here since potentially
> >> you could take the reference on the file after it has been freed.
> >> find_file should call get_nfs4_file if and only if it's still hashed,
> >> i.e. atomically with looking it up on the list.
> >>
> >> On the same lines, the lock should be taken in put_nfs4_file
> >> rather than in free_nfs4_file.
> >>
> >> That being said, I'm not sure we're unhashing the file in the right
> >> place... it's too late for me right now but my hunch is that open
> >> should alloc_init the nfs4_file as done today and close should unhash
> >> it (under the recall spinlock), and then put it.
> >> put_nfs4_file can stay the same and free_nfs4_file should just do the
> >> iput and kmem_cache_free.
> >>
> >> The main difference is that find_file will fail finding the nfs4_file
> >> struct after close. (get_nfs4_file in find_file should still happen
> >> under the lock though)
> >
> > It's probably better for the nfs4_file to be visible longer, but either
> > is correct.
>
> I see. What matters is the stateids associated with the file.

Right. So for now I'm taking your first suggestion. Thanks again for
the review.

--b.

2009-03-11 00:27:05

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 14/14] nfsd4: move rpc_client setup to a separate function

From: J. Bruce Fields <[email protected]>

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4callback.c | 33 ++++++++++++++++++++++-----------
1 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index c979af7..1d58ac7 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -361,9 +361,8 @@ static struct rpc_program cb_program = {
/* Reference counting, callback cleanup, etc., all look racy as heck.
* And why is cb_set an atomic? */

-static int do_probe_callback(void *data)
+static struct rpc_clnt *setup_callback_client(struct nfs4_client *clp)
{
- struct nfs4_client *clp = data;
struct sockaddr_in addr;
struct nfs4_callback *cb = &clp->cl_callback;
struct rpc_timeout timeparms = {
@@ -384,15 +383,10 @@ static int do_probe_callback(void *data)
.flags = (RPC_CLNT_CREATE_NOPING | RPC_CLNT_CREATE_QUIET),
.client_name = clp->cl_principal,
};
- struct rpc_message msg = {
- .rpc_proc = &nfs4_cb_procedures[NFSPROC4_CLNT_CB_NULL],
- .rpc_argp = clp,
- };
struct rpc_clnt *client;
- int status = -EINVAL;

if (!clp->cl_principal && (clp->cl_flavor >= RPC_AUTH_GSS_KRB5))
- goto out_err;
+ return ERR_PTR(-EINVAL);

/* Initialize address */
memset(&addr, 0, sizeof(addr));
@@ -402,9 +396,26 @@ static int do_probe_callback(void *data)

/* Create RPC client */
client = rpc_create(&args);
- if (IS_ERR(client)) {
- dprintk("NFSD: couldn't create callback client: %d\n",
+ if (IS_ERR(client))
+ dprintk("NFSD: couldn't create callback client: %ld\n",
PTR_ERR(client));
+ return client;
+
+}
+
+static int do_probe_callback(void *data)
+{
+ struct nfs4_client *clp = data;
+ struct nfs4_callback *cb = &clp->cl_callback;
+ struct rpc_message msg = {
+ .rpc_proc = &nfs4_cb_procedures[NFSPROC4_CLNT_CB_NULL],
+ .rpc_argp = clp,
+ };
+ struct rpc_clnt *client;
+ int status;
+
+ client = setup_callback_client(clp);
+ if (IS_ERR(client)) {
status = PTR_ERR(client);
goto out_err;
}
@@ -421,7 +432,7 @@ static int do_probe_callback(void *data)
out_release_client:
rpc_shutdown_client(client);
out_err:
- dprintk("NFSD: warning: no callback path to client %.*: error %ds\n",
+ dprintk("NFSD: warning: no callback path to client %.*s: error %ds\n",
(int)clp->cl_name.len, clp->cl_name.data, status);
put_nfs4_client(clp);
return 0;
--
1.6.0.4


2009-03-11 00:27:05

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 08/14] nfsd4: fail when delegreturn gets a non-delegation stateid

From: J. Bruce Fields <[email protected]>

Previous cleanup reveals an obvious (though harmless) bug: when
delegreturn gets a stateid that isn't for a delegation, it should return
an error rather than doing nothing.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index de1d68a..4a92d1e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2418,7 +2418,7 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
status = nfserr_stale_stateid;
if (STALE_STATEID(stateid))
goto out;
- status = nfs_ok;
+ status = nfserr_bad_stateid;
if (is_delegation_stateid(stateid))
goto out;
status = nfserr_bad_stateid;
--
1.6.0.4


2009-03-11 00:27:05

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 10/14] nfsd4: rename io_during_grace_disallowed

From: J. Bruce Fields <[email protected]>

Use a slightly clearer, more concise name. Also removed unused
argument.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index caca569..56e81f9 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2019,7 +2019,7 @@ check_special_stateids(svc_fh *current_fh, stateid_t *stateid, int flags)
* that are not able to provide mandatory locking.
*/
static inline int
-io_during_grace_disallowed(struct inode *inode, int flags)
+grace_disallows_io(struct inode *inode)
{
return locks_in_grace() && mandatory_lock(inode);
}
@@ -2063,7 +2063,7 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
if (filpp)
*filpp = NULL;

- if (io_during_grace_disallowed(ino, flags))
+ if (grace_disallows_io(ino))
return nfserr_grace;

if (ZERO_STATEID(stateid) || ONE_STATEID(stateid))
--
1.6.0.4


2009-03-11 00:27:05

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 03/14] nfsd4: remove redundant "if" in nfs4_preprocess_stateid_op

From: J. Bruce Fields <[email protected]>

Note that we exit this first big "if" with stp == NULL iff we took the
first branch; therefore, the second "if" is redundant, and we can just
combine the two, simplifying the logic.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 19 ++++++++-----------
1 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index d6ca2be..16fcb65 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2087,6 +2087,14 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
status = check_stateid_generation(stateid, stidp);
if (status)
goto out;
+ status = nfs4_check_delegmode(dp, flags);
+ if (status)
+ goto out;
+ renew_client(dp->dl_client);
+ if (flags & DELEG_RET)
+ unhash_delegation(dp);
+ if (filpp)
+ *filpp = dp->dl_vfs_file;
} else { /* open or lock stateid */
stp = find_stateid(stateid, flags);
if (!stp) {
@@ -2101,23 +2109,12 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
status = check_stateid_generation(stateid, stidp);
if (status)
goto out;
- }
- if (stp) {
status = nfs4_check_openmode(stp, flags);
if (status)
goto out;
renew_client(stp->st_stateowner->so_client);
if (filpp)
*filpp = stp->st_vfs_file;
- } else {
- status = nfs4_check_delegmode(dp, flags);
- if (status)
- goto out;
- renew_client(dp->dl_client);
- if (flags & DELEG_RET)
- unhash_delegation(dp);
- if (filpp)
- *filpp = dp->dl_vfs_file;
}
status = nfs_ok;
out:
--
1.6.0.4


2009-03-11 00:27:05

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 07/14] nfsd4: separate delegreturn case from preprocess_stateid_op

From: J. Bruce Fields <[email protected]>

Delegreturn is enough a special case for preprocess_stateid_op to
warrant just open-coding it in delegreturn.

There should be no change in behavior here; we're just reshuffling code.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 40 ++++++++++++++++++++++++++++------------
include/linux/nfsd/state.h | 1 -
2 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index d555585..de1d68a 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2000,10 +2000,7 @@ out:
static inline __be32
check_special_stateids(svc_fh *current_fh, stateid_t *stateid, int flags)
{
- /* Trying to call delegreturn with a special stateid? Yuch: */
- if (!(flags & (RD_STATE | WR_STATE)))
- return nfserr_bad_stateid;
- else if (ONE_STATEID(stateid) && (flags & RD_STATE))
+ if (ONE_STATEID(stateid) && (flags & RD_STATE))
return nfs_ok;
else if (locks_in_grace()) {
/* Answer in remaining cases depends on existance of
@@ -2024,8 +2021,7 @@ check_special_stateids(svc_fh *current_fh, stateid_t *stateid, int flags)
static inline int
io_during_grace_disallowed(struct inode *inode, int flags)
{
- return locks_in_grace() && (flags & (RD_STATE | WR_STATE))
- && mandatory_lock(inode);
+ return locks_in_grace() && mandatory_lock(inode);
}

static int check_stateid_generation(stateid_t *in, stateid_t *ref)
@@ -2089,8 +2085,6 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
if (status)
goto out;
renew_client(dp->dl_client);
- if (flags & DELEG_RET)
- unhash_delegation(dp);
if (filpp)
*filpp = dp->dl_vfs_file;
} else { /* open or lock stateid */
@@ -2408,16 +2402,38 @@ __be32
nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
struct nfsd4_delegreturn *dr)
{
+ struct nfs4_delegation *dp;
+ stateid_t *stateid = &dr->dr_stateid;
+ struct inode *inode;
__be32 status;

if ((status = fh_verify(rqstp, &cstate->current_fh, S_IFREG, 0)))
- goto out;
+ return status;
+ inode = cstate->current_fh.fh_dentry->d_inode;

nfs4_lock_state();
- status = nfs4_preprocess_stateid_op(&cstate->current_fh,
- &dr->dr_stateid, DELEG_RET, NULL);
- nfs4_unlock_state();
+ status = nfserr_bad_stateid;
+ if (ZERO_STATEID(stateid) || ONE_STATEID(stateid))
+ goto out;
+ status = nfserr_stale_stateid;
+ if (STALE_STATEID(stateid))
+ goto out;
+ status = nfs_ok;
+ if (is_delegation_stateid(stateid))
+ goto out;
+ status = nfserr_bad_stateid;
+ dp = find_delegation_stateid(inode, stateid);
+ if (!dp)
+ goto out;
+ status = check_stateid_generation(stateid, &dp->dl_stateid);
+ if (status)
+ goto out;
+ renew_client(dp->dl_client);
+
+ unhash_delegation(dp);
out:
+ nfs4_unlock_state();
+
return status;
}

diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h
index 1130d53..c9311a1 100644
--- a/include/linux/nfsd/state.h
+++ b/include/linux/nfsd/state.h
@@ -263,7 +263,6 @@ struct nfs4_stateid {
#define RD_STATE 0x00000010
#define WR_STATE 0x00000020
#define CLOSE_STATE 0x00000040
-#define DELEG_RET 0x00000080

#define seqid_mutating_err(err) \
(((err) != nfserr_stale_clientid) && \
--
1.6.0.4


2009-03-11 00:27:05

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 12/14] nfsd4: remove use of mutex for file_hashtable

From: J. Bruce Fields <[email protected]>

As part of reducing the scope of the client_mutex, and in order to
remove the need for mutexes from the callback code (so that callbacks
can be done as asynchronous rpc calls), move manipulations of the
file_hashtable under the recall_lock.

Update the relevant comments while we're here.

Signed-off-by: J. Bruce Fields <[email protected]>
Cc: Alexandros Batsakis <[email protected]>
---
fs/nfsd/nfs4callback.c | 2 --
fs/nfsd/nfs4state.c | 29 +++++++++++++++++------------
2 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 3fd7136..5dcd38e 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -491,8 +491,6 @@ out_put_cred:
* or deleg_return.
*/
put_nfs4_client(clp);
- nfs4_lock_state();
nfs4_put_delegation(dp);
- nfs4_unlock_state();
return;
}
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 56e81f9..2f4cb9a 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -78,14 +78,18 @@ static struct nfs4_delegation * find_delegation_stateid(struct inode *ino, state
static char user_recovery_dirname[PATH_MAX] = "/var/lib/nfs/v4recovery";
static void nfs4_set_recdir(char *recdir);

-/* Locking:
- *
- * client_mutex:
- * protects clientid_hashtbl[], clientstr_hashtbl[],
- * unconfstr_hashtbl[], uncofid_hashtbl[].
- */
+/* Locking: */
+
+/* Currently used for almost all code touching nfsv4 state: */
static DEFINE_MUTEX(client_mutex);

+/*
+ * Currently used for the del_recall_lru and file hash table. In an
+ * effort to decrease the scope of the client_mutex, this spinlock may
+ * eventually cover more:
+ */
+static DEFINE_SPINLOCK(recall_lock);
+
static struct kmem_cache *stateowner_slab = NULL;
static struct kmem_cache *file_slab = NULL;
static struct kmem_cache *stateid_slab = NULL;
@@ -116,19 +120,15 @@ opaque_hashval(const void *ptr, int nbytes)
return x;
}

-/*
- * Delegation state
- */
-
-/* recall_lock protects the del_recall_lru */
-static DEFINE_SPINLOCK(recall_lock);
static struct list_head del_recall_lru;

static void
free_nfs4_file(struct kref *kref)
{
struct nfs4_file *fp = container_of(kref, struct nfs4_file, fi_ref);
+ spin_lock(&recall_lock);
list_del(&fp->fi_hash);
+ spin_unlock(&recall_lock);
iput(fp->fi_inode);
kmem_cache_free(file_slab, fp);
}
@@ -1004,7 +1004,9 @@ alloc_init_file(struct inode *ino)
INIT_LIST_HEAD(&fp->fi_hash);
INIT_LIST_HEAD(&fp->fi_stateids);
INIT_LIST_HEAD(&fp->fi_delegations);
+ spin_lock(&recall_lock);
list_add(&fp->fi_hash, &file_hashtbl[hashval]);
+ spin_unlock(&recall_lock);
fp->fi_inode = igrab(ino);
fp->fi_id = current_fileid++;
fp->fi_had_conflict = false;
@@ -1177,12 +1179,15 @@ find_file(struct inode *ino)
unsigned int hashval = file_hashval(ino);
struct nfs4_file *fp;

+ spin_lock(&recall_lock);
list_for_each_entry(fp, &file_hashtbl[hashval], fi_hash) {
if (fp->fi_inode == ino) {
+ spin_unlock(&recall_lock);
get_nfs4_file(fp);
return fp;
}
}
+ spin_unlock(&recall_lock);
return NULL;
}

--
1.6.0.4


2009-03-11 00:27:05

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 01/14] nfsd4: trivial preprocess_stateid_op cleanup

From: J. Bruce Fields <[email protected]>

Remove a couple redundant comments, adjust style; no change in behavior.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 14 ++++++++------
1 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 7f616e9..b7e2f25 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2072,21 +2072,21 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
if (ZERO_STATEID(stateid) || ONE_STATEID(stateid))
return check_special_stateids(current_fh, stateid, flags);

- /* STALE STATEID */
status = nfserr_stale_stateid;
if (STALE_STATEID(stateid))
goto out;

- /* BAD STATEID */
status = nfserr_bad_stateid;
if (!stateid->si_fileid) { /* delegation stateid */
- if(!(dp = find_delegation_stateid(ino, stateid))) {
+ dp = find_delegation_stateid(ino, stateid);
+ if (!dp) {
dprintk("NFSD: delegation stateid not found\n");
goto out;
}
stidp = &dp->dl_stateid;
} else { /* open or lock stateid */
- if (!(stp = find_stateid(stateid, flags))) {
+ stp = find_stateid(stateid, flags);
+ if (!stp) {
dprintk("NFSD: open or lock stateid not found\n");
goto out;
}
@@ -2100,13 +2100,15 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
if (status)
goto out;
if (stp) {
- if ((status = nfs4_check_openmode(stp,flags)))
+ status = nfs4_check_openmode(stp, flags);
+ if (status)
goto out;
renew_client(stp->st_stateowner->so_client);
if (filpp)
*filpp = stp->st_vfs_file;
} else {
- if ((status = nfs4_check_delegmode(dp, flags)))
+ status = nfs4_check_delegmode(dp, flags);
+ if (status)
goto out;
renew_client(dp->dl_client);
if (flags & DELEG_RET)
--
1.6.0.4


2009-03-11 00:27:05

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 02/14] nfsd4: move check_stateid_generation check

From: J. Bruce Fields <[email protected]>

No change in behavior.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 9 ++++++---
1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index b7e2f25..d6ca2be 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2084,6 +2084,9 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
goto out;
}
stidp = &dp->dl_stateid;
+ status = check_stateid_generation(stateid, stidp);
+ if (status)
+ goto out;
} else { /* open or lock stateid */
stp = find_stateid(stateid, flags);
if (!stp) {
@@ -2095,10 +2098,10 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
if (!stp->st_stateowner->so_confirmed)
goto out;
stidp = &stp->st_stateid;
+ status = check_stateid_generation(stateid, stidp);
+ if (status)
+ goto out;
}
- status = check_stateid_generation(stateid, stidp);
- if (status)
- goto out;
if (stp) {
status = nfs4_check_openmode(stp, flags);
if (status)
--
1.6.0.4


2009-03-11 00:27:05

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 11/14] nfsd4: put_nfs4_client does not require state lock

From: J. Bruce Fields <[email protected]>

Since free_client() is guaranteed to only be called once, and to only
touch the client structure itself (not any common data structures), it
has no need for the state lock.

Signed-off-by: J. Bruce Fields <[email protected]>
Cc: Alexandros Batsakis <[email protected]>
---
fs/nfsd/nfs4callback.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 3ddc9fb..3fd7136 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -490,8 +490,8 @@ out_put_cred:
* Success or failure, now we're either waiting for lease expiration
* or deleg_return.
*/
- nfs4_lock_state();
put_nfs4_client(clp);
+ nfs4_lock_state();
nfs4_put_delegation(dp);
nfs4_unlock_state();
return;
--
1.6.0.4


2009-03-11 00:27:05

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 09/14] nfsd4: remove unused CHECK_FH flag

From: J. Bruce Fields <[email protected]>

All users now pass this, so it's meaningless.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4proc.c | 6 +++---
fs/nfsd/nfs4state.c | 2 +-
include/linux/nfsd/state.h | 1 -
3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index af66073..77f584f 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -519,7 +519,7 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
/* check stateid */
if ((status = nfs4_preprocess_stateid_op(&cstate->current_fh,
&read->rd_stateid,
- CHECK_FH | RD_STATE, &read->rd_filp))) {
+ RD_STATE, &read->rd_filp))) {
dprintk("NFSD: nfsd4_read: couldn't process stateid!\n");
goto out;
}
@@ -651,7 +651,7 @@ nfsd4_setattr(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
if (setattr->sa_iattr.ia_valid & ATTR_SIZE) {
nfs4_lock_state();
status = nfs4_preprocess_stateid_op(&cstate->current_fh,
- &setattr->sa_stateid, CHECK_FH | WR_STATE, NULL);
+ &setattr->sa_stateid, WR_STATE, NULL);
nfs4_unlock_state();
if (status) {
dprintk("NFSD: nfsd4_setattr: couldn't process stateid!\n");
@@ -690,7 +690,7 @@ nfsd4_write(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,

nfs4_lock_state();
status = nfs4_preprocess_stateid_op(&cstate->current_fh, stateid,
- CHECK_FH | WR_STATE, &filp);
+ WR_STATE, &filp);
if (filp)
get_file(filp);
nfs4_unlock_state();
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 4a92d1e..caca569 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2091,7 +2091,7 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
stp = find_stateid(stateid, flags);
if (!stp)
goto out;
- if ((flags & CHECK_FH) && nfs4_check_fh(current_fh, stp))
+ if (nfs4_check_fh(current_fh, stp))
goto out;
if (!stp->st_stateowner->so_confirmed)
goto out;
diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h
index c9311a1..503b6bb 100644
--- a/include/linux/nfsd/state.h
+++ b/include/linux/nfsd/state.h
@@ -256,7 +256,6 @@ struct nfs4_stateid {
};

/* flags for preprocess_seqid_op() */
-#define CHECK_FH 0x00000001
#define CONFIRM 0x00000002
#define OPEN_STATE 0x00000004
#define LOCK_STATE 0x00000008
--
1.6.0.4


2009-03-11 00:27:05

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 06/14] nfsd4: add a helper function to decide if stateid is delegation

From: J. Bruce Fields <[email protected]>

Make this check self-documenting.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 7 ++++++-
1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 909a7a4..d555585 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2048,6 +2048,11 @@ static int check_stateid_generation(stateid_t *in, stateid_t *ref)
return nfs_ok;
}

+static int is_delegation_stateid(stateid_t *stateid)
+{
+ return stateid->si_fileid == 0;
+}
+
/*
* Checks for stateid operations
*/
@@ -2073,7 +2078,7 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
goto out;

status = nfserr_bad_stateid;
- if (!stateid->si_fileid) { /* delegation stateid */
+ if (is_delegation_stateid(stateid)) {
dp = find_delegation_stateid(ino, stateid);
if (!dp)
goto out;
--
1.6.0.4


2009-03-11 00:27:05

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 13/14] nfsd4: fix do_probe_callback errors

From: J. Bruce Fields <[email protected]>

The errors returned aren't used. Just return 0 and make them available
to a dprintk(). Also, consistently use -ERRNO errors instead of nfs
errors.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4callback.c | 15 +++++++--------
1 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 5dcd38e..c979af7 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -389,12 +389,10 @@ static int do_probe_callback(void *data)
.rpc_argp = clp,
};
struct rpc_clnt *client;
- int status;
+ int status = -EINVAL;

- if (!clp->cl_principal && (clp->cl_flavor >= RPC_AUTH_GSS_KRB5)) {
- status = nfserr_cb_path_down;
+ if (!clp->cl_principal && (clp->cl_flavor >= RPC_AUTH_GSS_KRB5))
goto out_err;
- }

/* Initialize address */
memset(&addr, 0, sizeof(addr));
@@ -405,7 +403,8 @@ static int do_probe_callback(void *data)
/* Create RPC client */
client = rpc_create(&args);
if (IS_ERR(client)) {
- dprintk("NFSD: couldn't create callback client\n");
+ dprintk("NFSD: couldn't create callback client: %d\n",
+ PTR_ERR(client));
status = PTR_ERR(client);
goto out_err;
}
@@ -422,10 +421,10 @@ static int do_probe_callback(void *data)
out_release_client:
rpc_shutdown_client(client);
out_err:
- dprintk("NFSD: warning: no callback path to client %.*s\n",
- (int)clp->cl_name.len, clp->cl_name.data);
+ dprintk("NFSD: warning: no callback path to client %.*: error %ds\n",
+ (int)clp->cl_name.len, clp->cl_name.data, status);
put_nfs4_client(clp);
- return status;
+ return 0;
}

/*
--
1.6.0.4


2009-03-11 00:27:05

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 05/14] nfsd4: remove some dprintk's

From: J. Bruce Fields <[email protected]>

I can't recall ever seeing these printk's used to debug a problem. I'll
happily put them back if we see a case where they'd be useful. (Though
if we do that the find_XXX() errors would probably be better
reported in find_XXX() functions themselves.)

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 11 ++---------
1 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 099938e..909a7a4 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2059,9 +2059,6 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
struct inode *ino = current_fh->fh_dentry->d_inode;
__be32 status;

- dprintk("NFSD: preprocess_stateid_op: stateid = (%08x/%08x/%08x/%08x)\n",
- stateid->si_boot, stateid->si_stateownerid,
- stateid->si_fileid, stateid->si_generation);
if (filpp)
*filpp = NULL;

@@ -2078,10 +2075,8 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
status = nfserr_bad_stateid;
if (!stateid->si_fileid) { /* delegation stateid */
dp = find_delegation_stateid(ino, stateid);
- if (!dp) {
- dprintk("NFSD: delegation stateid not found\n");
+ if (!dp)
goto out;
- }
status = check_stateid_generation(stateid, &dp->dl_stateid);
if (status)
goto out;
@@ -2095,10 +2090,8 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
*filpp = dp->dl_vfs_file;
} else { /* open or lock stateid */
stp = find_stateid(stateid, flags);
- if (!stp) {
- dprintk("NFSD: open or lock stateid not found\n");
+ if (!stp)
goto out;
- }
if ((flags & CHECK_FH) && nfs4_check_fh(current_fh, stp))
goto out;
if (!stp->st_stateowner->so_confirmed)
--
1.6.0.4


2009-03-11 00:27:05

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 04/14] nfsd4: remove unneeded local variable

From: J. Bruce Fields <[email protected]>

We no longer need stidp.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 7 ++-----
1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 16fcb65..099938e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2056,7 +2056,6 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
{
struct nfs4_stateid *stp = NULL;
struct nfs4_delegation *dp = NULL;
- stateid_t *stidp;
struct inode *ino = current_fh->fh_dentry->d_inode;
__be32 status;

@@ -2083,8 +2082,7 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
dprintk("NFSD: delegation stateid not found\n");
goto out;
}
- stidp = &dp->dl_stateid;
- status = check_stateid_generation(stateid, stidp);
+ status = check_stateid_generation(stateid, &dp->dl_stateid);
if (status)
goto out;
status = nfs4_check_delegmode(dp, flags);
@@ -2105,8 +2103,7 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
goto out;
if (!stp->st_stateowner->so_confirmed)
goto out;
- stidp = &stp->st_stateid;
- status = check_stateid_generation(stateid, stidp);
+ status = check_stateid_generation(stateid, &stp->st_stateid);
if (status)
goto out;
status = nfs4_check_openmode(stp, flags);
--
1.6.0.4


2009-03-11 01:13:56

by Yang Hongyang

[permalink] [raw]
Subject: Re: [PATCH 03/14] nfsd4: remove redundant "if" in nfs4_preprocess_stateid_op

J. Bruce Fields wrote:
> From: J. Bruce Fields <[email protected]>
>
> Note that we exit this first big "if" with stp == NULL iff we took the
^^ typo
> first branch; therefore, the second "if" is redundant, and we can just
> combine the two, simplifying the logic.
>
> Signed-off-by: J. Bruce Fields <[email protected]>
> ---
> fs/nfsd/nfs4state.c | 19 ++++++++-----------
> 1 files changed, 8 insertions(+), 11 deletions(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index d6ca2be..16fcb65 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -2087,6 +2087,14 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
> status = check_stateid_generation(stateid, stidp);
> if (status)
> goto out;
> + status = nfs4_check_delegmode(dp, flags);
> + if (status)
> + goto out;
> + renew_client(dp->dl_client);
> + if (flags & DELEG_RET)
> + unhash_delegation(dp);
> + if (filpp)
> + *filpp = dp->dl_vfs_file;
> } else { /* open or lock stateid */
> stp = find_stateid(stateid, flags);
> if (!stp) {
> @@ -2101,23 +2109,12 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
> status = check_stateid_generation(stateid, stidp);
> if (status)
> goto out;
> - }
> - if (stp) {
> status = nfs4_check_openmode(stp, flags);
> if (status)
> goto out;
> renew_client(stp->st_stateowner->so_client);
> if (filpp)
> *filpp = stp->st_vfs_file;
> - } else {
> - status = nfs4_check_delegmode(dp, flags);
> - if (status)
> - goto out;
> - renew_client(dp->dl_client);
> - if (flags & DELEG_RET)
> - unhash_delegation(dp);
> - if (filpp)
> - *filpp = dp->dl_vfs_file;
> }
> status = nfs_ok;
> out:


--
Regards
Yang Hongyang

2009-03-11 01:23:47

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 03/14] nfsd4: remove redundant "if" in nfs4_preprocess_stateid_op

On Wed, Mar 11, 2009 at 09:12:46AM +0800, Yang Hongyang wrote:
> J. Bruce Fields wrote:
> > From: J. Bruce Fields <[email protected]>
> >
> > Note that we exit this first big "if" with stp == NULL iff we took the
> ^^ typo

The "iff"? That's short for "if and only if". I can try to get out of
the "iff" habit if it's too obscure.

--b.

> > first branch; therefore, the second "if" is redundant, and we can just
> > combine the two, simplifying the logic.
> >
> > Signed-off-by: J. Bruce Fields <[email protected]>
> > ---
> > fs/nfsd/nfs4state.c | 19 ++++++++-----------
> > 1 files changed, 8 insertions(+), 11 deletions(-)
> >
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index d6ca2be..16fcb65 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -2087,6 +2087,14 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
> > status = check_stateid_generation(stateid, stidp);
> > if (status)
> > goto out;
> > + status = nfs4_check_delegmode(dp, flags);
> > + if (status)
> > + goto out;
> > + renew_client(dp->dl_client);
> > + if (flags & DELEG_RET)
> > + unhash_delegation(dp);
> > + if (filpp)
> > + *filpp = dp->dl_vfs_file;
> > } else { /* open or lock stateid */
> > stp = find_stateid(stateid, flags);
> > if (!stp) {
> > @@ -2101,23 +2109,12 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
> > status = check_stateid_generation(stateid, stidp);
> > if (status)
> > goto out;
> > - }
> > - if (stp) {
> > status = nfs4_check_openmode(stp, flags);
> > if (status)
> > goto out;
> > renew_client(stp->st_stateowner->so_client);
> > if (filpp)
> > *filpp = stp->st_vfs_file;
> > - } else {
> > - status = nfs4_check_delegmode(dp, flags);
> > - if (status)
> > - goto out;
> > - renew_client(dp->dl_client);
> > - if (flags & DELEG_RET)
> > - unhash_delegation(dp);
> > - if (filpp)
> > - *filpp = dp->dl_vfs_file;
> > }
> > status = nfs_ok;
> > out:
>
>
> --
> Regards
> Yang Hongyang

2009-03-11 01:27:33

by Yang Hongyang

[permalink] [raw]
Subject: Re: [PATCH 03/14] nfsd4: remove redundant "if" in nfs4_preprocess_stateid_op

J. Bruce Fields wrote:
> On Wed, Mar 11, 2009 at 09:12:46AM +0800, Yang Hongyang wrote:
>> J. Bruce Fields wrote:
>>> From: J. Bruce Fields <[email protected]>
>>>
>>> Note that we exit this first big "if" with stp == NULL iff we took the
>> ^^ typo
>
> The "iff"? That's short for "if and only if". I can try to get out of
> the "iff" habit if it's too obscure.

Sorry for mis-understanding,My pool englishT_T Maybe "if and only if"
are more understandable.

>
> --b.
>
>>> first branch; therefore, the second "if" is redundant, and we can just
>>> combine the two, simplifying the logic.
>>>
>>> Signed-off-by: J. Bruce Fields <[email protected]>
>>> ---
>>> fs/nfsd/nfs4state.c | 19 ++++++++-----------
>>> 1 files changed, 8 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>> index d6ca2be..16fcb65 100644
>>> --- a/fs/nfsd/nfs4state.c
>>> +++ b/fs/nfsd/nfs4state.c
>>> @@ -2087,6 +2087,14 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
>>> status = check_stateid_generation(stateid, stidp);
>>> if (status)
>>> goto out;
>>> + status = nfs4_check_delegmode(dp, flags);
>>> + if (status)
>>> + goto out;
>>> + renew_client(dp->dl_client);
>>> + if (flags & DELEG_RET)
>>> + unhash_delegation(dp);
>>> + if (filpp)
>>> + *filpp = dp->dl_vfs_file;
>>> } else { /* open or lock stateid */
>>> stp = find_stateid(stateid, flags);
>>> if (!stp) {
>>> @@ -2101,23 +2109,12 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
>>> status = check_stateid_generation(stateid, stidp);
>>> if (status)
>>> goto out;
>>> - }
>>> - if (stp) {
>>> status = nfs4_check_openmode(stp, flags);
>>> if (status)
>>> goto out;
>>> renew_client(stp->st_stateowner->so_client);
>>> if (filpp)
>>> *filpp = stp->st_vfs_file;
>>> - } else {
>>> - status = nfs4_check_delegmode(dp, flags);
>>> - if (status)
>>> - goto out;
>>> - renew_client(dp->dl_client);
>>> - if (flags & DELEG_RET)
>>> - unhash_delegation(dp);
>>> - if (filpp)
>>> - *filpp = dp->dl_vfs_file;
>>> }
>>> status = nfs_ok;
>>> out:
>>
>> --
>> Regards
>> Yang Hongyang
>
>


--
Regards
Yang Hongyang

2009-03-11 01:48:35

by Yang Hongyang

[permalink] [raw]
Subject: Re: [PATCH 07/14] nfsd4: separate delegreturn case from preprocess_stateid_op

J. Bruce Fields wrote:
> From: J. Bruce Fields <[email protected]>
>
> Delegreturn is enough a special case for preprocess_stateid_op to
> warrant just open-coding it in delegreturn.
>
> There should be no change in behavior here; we're just reshuffling code.
>
> Signed-off-by: J. Bruce Fields <[email protected]>
> ---
> fs/nfsd/nfs4state.c | 40 ++++++++++++++++++++++++++++------------
> include/linux/nfsd/state.h | 1 -
> 2 files changed, 28 insertions(+), 13 deletions(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index d555585..de1d68a 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -2000,10 +2000,7 @@ out:
> static inline __be32
> check_special_stateids(svc_fh *current_fh, stateid_t *stateid, int flags)
> {
> - /* Trying to call delegreturn with a special stateid? Yuch: */
> - if (!(flags & (RD_STATE | WR_STATE)))
> - return nfserr_bad_stateid;
> - else if (ONE_STATEID(stateid) && (flags & RD_STATE))
> + if (ONE_STATEID(stateid) && (flags & RD_STATE))
> return nfs_ok;
> else if (locks_in_grace()) {
> /* Answer in remaining cases depends on existance of
> @@ -2024,8 +2021,7 @@ check_special_stateids(svc_fh *current_fh, stateid_t *stateid, int flags)
> static inline int
> io_during_grace_disallowed(struct inode *inode, int flags)
> {
> - return locks_in_grace() && (flags & (RD_STATE | WR_STATE))
> - && mandatory_lock(inode);
> + return locks_in_grace() && mandatory_lock(inode);
> }
>
> static int check_stateid_generation(stateid_t *in, stateid_t *ref)
> @@ -2089,8 +2085,6 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
> if (status)
> goto out;
> renew_client(dp->dl_client);
> - if (flags & DELEG_RET)
> - unhash_delegation(dp);
> if (filpp)
> *filpp = dp->dl_vfs_file;
> } else { /* open or lock stateid */
> @@ -2408,16 +2402,38 @@ __be32
> nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> struct nfsd4_delegreturn *dr)
> {
> + struct nfs4_delegation *dp;
> + stateid_t *stateid = &dr->dr_stateid;
> + struct inode *inode;
> __be32 status;
>
> if ((status = fh_verify(rqstp, &cstate->current_fh, S_IFREG, 0)))

adjust style:
status = fh_verify(rqstp, &cstate->current_fh, S_IFREG, 0)
if (status)

> - goto out;
> + return status;
> + inode = cstate->current_fh.fh_dentry->d_inode;
>
> nfs4_lock_state();
> - status = nfs4_preprocess_stateid_op(&cstate->current_fh,
> - &dr->dr_stateid, DELEG_RET, NULL);
> - nfs4_unlock_state();
> + status = nfserr_bad_stateid;
> + if (ZERO_STATEID(stateid) || ONE_STATEID(stateid))
> + goto out;
> + status = nfserr_stale_stateid;
> + if (STALE_STATEID(stateid))
> + goto out;
> + status = nfs_ok;
> + if (is_delegation_stateid(stateid))
> + goto out;
> + status = nfserr_bad_stateid;
> + dp = find_delegation_stateid(inode, stateid);
> + if (!dp)
> + goto out;
> + status = check_stateid_generation(stateid, &dp->dl_stateid);
> + if (status)
> + goto out;
> + renew_client(dp->dl_client);
> +
> + unhash_delegation(dp);
> out:
> + nfs4_unlock_state();
> +
> return status;
> }
>
> diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h
> index 1130d53..c9311a1 100644
> --- a/include/linux/nfsd/state.h
> +++ b/include/linux/nfsd/state.h
> @@ -263,7 +263,6 @@ struct nfs4_stateid {
> #define RD_STATE 0x00000010
> #define WR_STATE 0x00000020
> #define CLOSE_STATE 0x00000040
> -#define DELEG_RET 0x00000080
>
> #define seqid_mutating_err(err) \
> (((err) != nfserr_stale_clientid) && \


--
Regards
Yang Hongyang

2009-03-11 01:53:29

by Yang Hongyang

[permalink] [raw]
Subject: Re: [PATCH 08/14] nfsd4: fail when delegreturn gets a non-delegation stateid

J. Bruce Fields wrote:
> From: J. Bruce Fields <[email protected]>
>
> Previous cleanup reveals an obvious (though harmless) bug: when
> delegreturn gets a stateid that isn't for a delegation, it should return
> an error rather than doing nothing.
>
> Signed-off-by: J. Bruce Fields <[email protected]>
> ---
> fs/nfsd/nfs4state.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index de1d68a..4a92d1e 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -2418,7 +2418,7 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> status = nfserr_stale_stateid;
> if (STALE_STATEID(stateid))
> goto out;
> - status = nfs_ok;
> + status = nfserr_bad_stateid;

this should be done in patch 7?

> if (is_delegation_stateid(stateid))
> goto out;

confused!
if (is_delegation_stateid(stateid)) means a stateid is for a delegation right?
Are we prefered if (!(is_delegation_stateid(stateid)))?

> status = nfserr_bad_stateid;


--
Regards
Yang Hongyang

2009-03-11 02:10:56

by Yang Hongyang

[permalink] [raw]
Subject: Re: [PATCH 01/14] nfsd4: trivial preprocess_stateid_op cleanup

Reviewed patch 1-8.

J. Bruce Fields wrote:
> From: J. Bruce Fields <[email protected]>
>
> Remove a couple redundant comments, adjust style; no change in behavior.
>
> Signed-off-by: J. Bruce Fields <[email protected]>
> ---
> fs/nfsd/nfs4state.c | 14 ++++++++------
> 1 files changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 7f616e9..b7e2f25 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -2072,21 +2072,21 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
> if (ZERO_STATEID(stateid) || ONE_STATEID(stateid))
> return check_special_stateids(current_fh, stateid, flags);
>
> - /* STALE STATEID */
> status = nfserr_stale_stateid;
> if (STALE_STATEID(stateid))
> goto out;
>
> - /* BAD STATEID */
> status = nfserr_bad_stateid;
> if (!stateid->si_fileid) { /* delegation stateid */
> - if(!(dp = find_delegation_stateid(ino, stateid))) {
> + dp = find_delegation_stateid(ino, stateid);
> + if (!dp) {
> dprintk("NFSD: delegation stateid not found\n");
> goto out;
> }
> stidp = &dp->dl_stateid;
> } else { /* open or lock stateid */
> - if (!(stp = find_stateid(stateid, flags))) {
> + stp = find_stateid(stateid, flags);
> + if (!stp) {
> dprintk("NFSD: open or lock stateid not found\n");
> goto out;
> }
> @@ -2100,13 +2100,15 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
> if (status)
> goto out;
> if (stp) {
> - if ((status = nfs4_check_openmode(stp,flags)))
> + status = nfs4_check_openmode(stp, flags);
> + if (status)
> goto out;
> renew_client(stp->st_stateowner->so_client);
> if (filpp)
> *filpp = stp->st_vfs_file;
> } else {
> - if ((status = nfs4_check_delegmode(dp, flags)))
> + status = nfs4_check_delegmode(dp, flags);
> + if (status)
> goto out;
> renew_client(dp->dl_client);
> if (flags & DELEG_RET)


--
Regards
Yang Hongyang

2009-03-11 18:05:36

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 03/14] nfsd4: remove redundant "if" in nfs4_preprocess_stateid_op

On Mar. 11, 2009, 3:26 +0200, Yang Hongyang <[email protected]> wrote:
> J. Bruce Fields wrote:
>> On Wed, Mar 11, 2009 at 09:12:46AM +0800, Yang Hongyang wrote:
>>> J. Bruce Fields wrote:
>>>> From: J. Bruce Fields <[email protected]>
>>>>
>>>> Note that we exit this first big "if" with stp == NULL iff we took the
>>> ^^ typo
>> The "iff"? That's short for "if and only if". I can try to get out of
>> the "iff" habit if it's too obscure.
>
> Sorry for mis-understanding,My pool englishT_T Maybe "if and only if"
> are more understandable.

Although using this iffy math/cs school lingo does not seem too obscure
to me (one can find it easily on the net, e.g.
http://www.tfd.com/?Word=iff), as a non-native English speaker I'd be
inclined to simplify commentary language as much as possible and
refrain from using acronyms if they are not obvious to the linux
developers community.

Benny

>
>> --b.
>>
>>>> first branch; therefore, the second "if" is redundant, and we can just
>>>> combine the two, simplifying the logic.
>>>>
>>>> Signed-off-by: J. Bruce Fields <[email protected]>
>>>> ---
>>>> fs/nfsd/nfs4state.c | 19 ++++++++-----------
>>>> 1 files changed, 8 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>>> index d6ca2be..16fcb65 100644
>>>> --- a/fs/nfsd/nfs4state.c
>>>> +++ b/fs/nfsd/nfs4state.c
>>>> @@ -2087,6 +2087,14 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
>>>> status = check_stateid_generation(stateid, stidp);
>>>> if (status)
>>>> goto out;
>>>> + status = nfs4_check_delegmode(dp, flags);
>>>> + if (status)
>>>> + goto out;
>>>> + renew_client(dp->dl_client);
>>>> + if (flags & DELEG_RET)
>>>> + unhash_delegation(dp);
>>>> + if (filpp)
>>>> + *filpp = dp->dl_vfs_file;
>>>> } else { /* open or lock stateid */
>>>> stp = find_stateid(stateid, flags);
>>>> if (!stp) {
>>>> @@ -2101,23 +2109,12 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
>>>> status = check_stateid_generation(stateid, stidp);
>>>> if (status)
>>>> goto out;
>>>> - }
>>>> - if (stp) {
>>>> status = nfs4_check_openmode(stp, flags);
>>>> if (status)
>>>> goto out;
>>>> renew_client(stp->st_stateowner->so_client);
>>>> if (filpp)
>>>> *filpp = stp->st_vfs_file;
>>>> - } else {
>>>> - status = nfs4_check_delegmode(dp, flags);
>>>> - if (status)
>>>> - goto out;
>>>> - renew_client(dp->dl_client);
>>>> - if (flags & DELEG_RET)
>>>> - unhash_delegation(dp);
>>>> - if (filpp)
>>>> - *filpp = dp->dl_vfs_file;
>>>> }
>>>> status = nfs_ok;
>>>> out:
>>> --
>>> Regards
>>> Yang Hongyang
>>
>
>

2009-03-11 18:29:56

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 08/14] nfsd4: fail when delegreturn gets a non-delegation stateid

On Mar. 11, 2009, 2:26 +0200, "J. Bruce Fields" <[email protected]> wrote:
> From: J. Bruce Fields <[email protected]>
>
> Previous cleanup reveals an obvious (though harmless) bug: when
> delegreturn gets a stateid that isn't for a delegation, it should return
> an error rather than doing nothing.
>
> Signed-off-by: J. Bruce Fields <[email protected]>
> ---
> fs/nfsd/nfs4state.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index de1d68a..4a92d1e 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -2418,7 +2418,7 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> status = nfserr_stale_stateid;
> if (STALE_STATEID(stateid))
> goto out;
> - status = nfs_ok;
> + status = nfserr_bad_stateid;
> if (is_delegation_stateid(stateid))
> goto out;
> status = nfserr_bad_stateid;

This assignment is redundant now, isn't it?

Benny

2009-03-11 18:52:19

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 12/14] nfsd4: remove use of mutex for file_hashtable

On Mar. 11, 2009, 2:27 +0200, "J. Bruce Fields" <[email protected]> wrote:
> From: J. Bruce Fields <[email protected]>
>
> As part of reducing the scope of the client_mutex, and in order to
> remove the need for mutexes from the callback code (so that callbacks
> can be done as asynchronous rpc calls), move manipulations of the
> file_hashtable under the recall_lock.
>
> Update the relevant comments while we're here.
>
> Signed-off-by: J. Bruce Fields <[email protected]>
> Cc: Alexandros Batsakis <[email protected]>
> ---
> fs/nfsd/nfs4callback.c | 2 --
> fs/nfsd/nfs4state.c | 29 +++++++++++++++++------------
> 2 files changed, 17 insertions(+), 14 deletions(-)
>
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index 3fd7136..5dcd38e 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -491,8 +491,6 @@ out_put_cred:
> * or deleg_return.
> */
> put_nfs4_client(clp);
> - nfs4_lock_state();
> nfs4_put_delegation(dp);
> - nfs4_unlock_state();
> return;
> }
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 56e81f9..2f4cb9a 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -78,14 +78,18 @@ static struct nfs4_delegation * find_delegation_stateid(struct inode *ino, state
> static char user_recovery_dirname[PATH_MAX] = "/var/lib/nfs/v4recovery";
> static void nfs4_set_recdir(char *recdir);
>
> -/* Locking:
> - *
> - * client_mutex:
> - * protects clientid_hashtbl[], clientstr_hashtbl[],
> - * unconfstr_hashtbl[], uncofid_hashtbl[].
> - */
> +/* Locking: */
> +
> +/* Currently used for almost all code touching nfsv4 state: */
> static DEFINE_MUTEX(client_mutex);
>
> +/*
> + * Currently used for the del_recall_lru and file hash table. In an
> + * effort to decrease the scope of the client_mutex, this spinlock may
> + * eventually cover more:
> + */
> +static DEFINE_SPINLOCK(recall_lock);
> +
> static struct kmem_cache *stateowner_slab = NULL;
> static struct kmem_cache *file_slab = NULL;
> static struct kmem_cache *stateid_slab = NULL;
> @@ -116,19 +120,15 @@ opaque_hashval(const void *ptr, int nbytes)
> return x;
> }
>
> -/*
> - * Delegation state
> - */
> -
> -/* recall_lock protects the del_recall_lru */
> -static DEFINE_SPINLOCK(recall_lock);
> static struct list_head del_recall_lru;
>
> static void
> free_nfs4_file(struct kref *kref)
> {
> struct nfs4_file *fp = container_of(kref, struct nfs4_file, fi_ref);
> + spin_lock(&recall_lock);
> list_del(&fp->fi_hash);
> + spin_unlock(&recall_lock);
> iput(fp->fi_inode);
> kmem_cache_free(file_slab, fp);
> }
> @@ -1004,7 +1004,9 @@ alloc_init_file(struct inode *ino)
> INIT_LIST_HEAD(&fp->fi_hash);
> INIT_LIST_HEAD(&fp->fi_stateids);
> INIT_LIST_HEAD(&fp->fi_delegations);
> + spin_lock(&recall_lock);
> list_add(&fp->fi_hash, &file_hashtbl[hashval]);
> + spin_unlock(&recall_lock);
> fp->fi_inode = igrab(ino);
> fp->fi_id = current_fileid++;
> fp->fi_had_conflict = false;
> @@ -1177,12 +1179,15 @@ find_file(struct inode *ino)
> unsigned int hashval = file_hashval(ino);
> struct nfs4_file *fp;
>
> + spin_lock(&recall_lock);
> list_for_each_entry(fp, &file_hashtbl[hashval], fi_hash) {
> if (fp->fi_inode == ino) {
> + spin_unlock(&recall_lock);

Hmm, I'm afraid there's a race here since potentially
you could take the reference on the file after it has been freed.
find_file should call get_nfs4_file if and only if it's still hashed,
i.e. atomically with looking it up on the list.

On the same lines, the lock should be taken in put_nfs4_file
rather than in free_nfs4_file.

That being said, I'm not sure we're unhashing the file in the right
place... it's too late for me right now but my hunch is that open
should alloc_init the nfs4_file as done today and close should unhash
it (under the recall spinlock), and then put it.
put_nfs4_file can stay the same and free_nfs4_file should just do the
iput and kmem_cache_free.

The main difference is that find_file will fail finding the nfs4_file
struct after close. (get_nfs4_file in find_file should still happen
under the lock though)

Benny

> get_nfs4_file(fp);
> return fp;
> }
> }
> + spin_unlock(&recall_lock);
> return NULL;
> }
>

2009-03-11 19:49:52

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 03/14] nfsd4: remove redundant "if" in nfs4_preprocess_stateid_op

On Wed, Mar 11, 2009 at 08:05:31PM +0200, Benny Halevy wrote:
> On Mar. 11, 2009, 3:26 +0200, Yang Hongyang <[email protected]> wrote:
> > J. Bruce Fields wrote:
> >> On Wed, Mar 11, 2009 at 09:12:46AM +0800, Yang Hongyang wrote:
> >>> J. Bruce Fields wrote:
> >>>> From: J. Bruce Fields <[email protected]>
> >>>>
> >>>> Note that we exit this first big "if" with stp == NULL iff we took the
> >>> ^^ typo
> >> The "iff"? That's short for "if and only if". I can try to get out of
> >> the "iff" habit if it's too obscure.
> >
> > Sorry for mis-understanding,My pool englishT_T Maybe "if and only if"
> > are more understandable.
>
> Although using this iffy math/cs school lingo does not seem too obscure
> to me (one can find it easily on the net, e.g.
> http://www.tfd.com/?Word=iff), as a non-native English speaker I'd be
> inclined to simplify commentary language as much as possible and
> refrain from using acronyms if they are not obvious to the linux
> developers community.

OK.--b.

2009-03-11 19:51:30

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 07/14] nfsd4: separate delegreturn case from preprocess_stateid_op

On Wed, Mar 11, 2009 at 09:47:26AM +0800, Yang Hongyang wrote:
> J. Bruce Fields wrote:
> > From: J. Bruce Fields <[email protected]>
> >
> > Delegreturn is enough a special case for preprocess_stateid_op to
> > warrant just open-coding it in delegreturn.
> >
> > There should be no change in behavior here; we're just reshuffling code.
> >
> > Signed-off-by: J. Bruce Fields <[email protected]>
> > ---
> > fs/nfsd/nfs4state.c | 40 ++++++++++++++++++++++++++++------------
> > include/linux/nfsd/state.h | 1 -
> > 2 files changed, 28 insertions(+), 13 deletions(-)
> >
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index d555585..de1d68a 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -2000,10 +2000,7 @@ out:
> > static inline __be32
> > check_special_stateids(svc_fh *current_fh, stateid_t *stateid, int flags)
> > {
> > - /* Trying to call delegreturn with a special stateid? Yuch: */
> > - if (!(flags & (RD_STATE | WR_STATE)))
> > - return nfserr_bad_stateid;
> > - else if (ONE_STATEID(stateid) && (flags & RD_STATE))
> > + if (ONE_STATEID(stateid) && (flags & RD_STATE))
> > return nfs_ok;
> > else if (locks_in_grace()) {
> > /* Answer in remaining cases depends on existance of
> > @@ -2024,8 +2021,7 @@ check_special_stateids(svc_fh *current_fh, stateid_t *stateid, int flags)
> > static inline int
> > io_during_grace_disallowed(struct inode *inode, int flags)
> > {
> > - return locks_in_grace() && (flags & (RD_STATE | WR_STATE))
> > - && mandatory_lock(inode);
> > + return locks_in_grace() && mandatory_lock(inode);
> > }
> >
> > static int check_stateid_generation(stateid_t *in, stateid_t *ref)
> > @@ -2089,8 +2085,6 @@ nfs4_preprocess_stateid_op(struct svc_fh *current_fh, stateid_t *stateid, int fl
> > if (status)
> > goto out;
> > renew_client(dp->dl_client);
> > - if (flags & DELEG_RET)
> > - unhash_delegation(dp);
> > if (filpp)
> > *filpp = dp->dl_vfs_file;
> > } else { /* open or lock stateid */
> > @@ -2408,16 +2402,38 @@ __be32
> > nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > struct nfsd4_delegreturn *dr)
> > {
> > + struct nfs4_delegation *dp;
> > + stateid_t *stateid = &dr->dr_stateid;
> > + struct inode *inode;
> > __be32 status;
> >
> > if ((status = fh_verify(rqstp, &cstate->current_fh, S_IFREG, 0)))
>
> adjust style:
> status = fh_verify(rqstp, &cstate->current_fh, S_IFREG, 0)
> if (status)

Agreed, I just didn't want to touch that for now.

At some point a separate patch that addressed all cases of that problem
in nfs4state.c might not be a bad idea.

--b.

>
> > - goto out;
> > + return status;
> > + inode = cstate->current_fh.fh_dentry->d_inode;
> >
> > nfs4_lock_state();
> > - status = nfs4_preprocess_stateid_op(&cstate->current_fh,
> > - &dr->dr_stateid, DELEG_RET, NULL);
> > - nfs4_unlock_state();
> > + status = nfserr_bad_stateid;
> > + if (ZERO_STATEID(stateid) || ONE_STATEID(stateid))
> > + goto out;
> > + status = nfserr_stale_stateid;
> > + if (STALE_STATEID(stateid))
> > + goto out;
> > + status = nfs_ok;
> > + if (is_delegation_stateid(stateid))
> > + goto out;
> > + status = nfserr_bad_stateid;
> > + dp = find_delegation_stateid(inode, stateid);
> > + if (!dp)
> > + goto out;
> > + status = check_stateid_generation(stateid, &dp->dl_stateid);
> > + if (status)
> > + goto out;
> > + renew_client(dp->dl_client);
> > +
> > + unhash_delegation(dp);
> > out:
> > + nfs4_unlock_state();
> > +
> > return status;
> > }
> >
> > diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h
> > index 1130d53..c9311a1 100644
> > --- a/include/linux/nfsd/state.h
> > +++ b/include/linux/nfsd/state.h
> > @@ -263,7 +263,6 @@ struct nfs4_stateid {
> > #define RD_STATE 0x00000010
> > #define WR_STATE 0x00000020
> > #define CLOSE_STATE 0x00000040
> > -#define DELEG_RET 0x00000080
> >
> > #define seqid_mutating_err(err) \
> > (((err) != nfserr_stale_clientid) && \
>
>
> --
> Regards
> Yang Hongyang

2009-03-11 20:07:25

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 08/14] nfsd4: fail when delegreturn gets a non-delegation stateid

On Wed, Mar 11, 2009 at 09:52:22AM +0800, Yang Hongyang wrote:
> J. Bruce Fields wrote:
> > From: J. Bruce Fields <[email protected]>
> >
> > Previous cleanup reveals an obvious (though harmless) bug: when
> > delegreturn gets a stateid that isn't for a delegation, it should return
> > an error rather than doing nothing.
> >
> > Signed-off-by: J. Bruce Fields <[email protected]>
> > ---
> > fs/nfsd/nfs4state.c | 2 +-
> > 1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index de1d68a..4a92d1e 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -2418,7 +2418,7 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > status = nfserr_stale_stateid;
> > if (STALE_STATEID(stateid))
> > goto out;
> > - status = nfs_ok;
> > + status = nfserr_bad_stateid;
>
> this should be done in patch 7?
>
> > if (is_delegation_stateid(stateid))
> > goto out;
>
> confused!
> if (is_delegation_stateid(stateid)) means a stateid is for a delegation right?
> Are we prefered if (!(is_delegation_stateid(stateid)))?

Ugh, yes, thanks for spotting that. I need to take another look. (And
also figure out why pynfs didn't catch this--ok, turns out I need to
explicitly ask for "delegations" on the ./testserver commandline--this
isn't included in the standard tests. Oops.)

--b.

2009-03-12 00:03:17

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 08/14] nfsd4: fail when delegreturn gets a non-delegation stateid

On Wed, Mar 11, 2009 at 08:29:52PM +0200, Benny Halevy wrote:
> On Mar. 11, 2009, 2:26 +0200, "J. Bruce Fields" <[email protected]> wrote:
> > From: J. Bruce Fields <[email protected]>
> >
> > Previous cleanup reveals an obvious (though harmless) bug: when
> > delegreturn gets a stateid that isn't for a delegation, it should return
> > an error rather than doing nothing.
> >
> > Signed-off-by: J. Bruce Fields <[email protected]>
> > ---
> > fs/nfsd/nfs4state.c | 2 +-
> > 1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index de1d68a..4a92d1e 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -2418,7 +2418,7 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > status = nfserr_stale_stateid;
> > if (STALE_STATEID(stateid))
> > goto out;
> > - status = nfs_ok;
> > + status = nfserr_bad_stateid;
> > if (is_delegation_stateid(stateid))
> > goto out;
> > status = nfserr_bad_stateid;
>
> This assignment is redundant now, isn't it?

Yep, thanks.--b.

2009-03-12 00:05:42

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 12/14] nfsd4: remove use of mutex for file_hashtable

On Wed, Mar 11, 2009 at 08:52:14PM +0200, Benny Halevy wrote:
> On Mar. 11, 2009, 2:27 +0200, "J. Bruce Fields" <[email protected]> wrote:
> > From: J. Bruce Fields <[email protected]>
> >
> > As part of reducing the scope of the client_mutex, and in order to
> > remove the need for mutexes from the callback code (so that callbacks
> > can be done as asynchronous rpc calls), move manipulations of the
> > file_hashtable under the recall_lock.
> >
> > Update the relevant comments while we're here.
> >
> > Signed-off-by: J. Bruce Fields <[email protected]>
> > Cc: Alexandros Batsakis <[email protected]>
> > ---
> > fs/nfsd/nfs4callback.c | 2 --
> > fs/nfsd/nfs4state.c | 29 +++++++++++++++++------------
> > 2 files changed, 17 insertions(+), 14 deletions(-)
> >
> > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > index 3fd7136..5dcd38e 100644
> > --- a/fs/nfsd/nfs4callback.c
> > +++ b/fs/nfsd/nfs4callback.c
> > @@ -491,8 +491,6 @@ out_put_cred:
> > * or deleg_return.
> > */
> > put_nfs4_client(clp);
> > - nfs4_lock_state();
> > nfs4_put_delegation(dp);
> > - nfs4_unlock_state();
> > return;
> > }
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 56e81f9..2f4cb9a 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -78,14 +78,18 @@ static struct nfs4_delegation * find_delegation_stateid(struct inode *ino, state
> > static char user_recovery_dirname[PATH_MAX] = "/var/lib/nfs/v4recovery";
> > static void nfs4_set_recdir(char *recdir);
> >
> > -/* Locking:
> > - *
> > - * client_mutex:
> > - * protects clientid_hashtbl[], clientstr_hashtbl[],
> > - * unconfstr_hashtbl[], uncofid_hashtbl[].
> > - */
> > +/* Locking: */
> > +
> > +/* Currently used for almost all code touching nfsv4 state: */
> > static DEFINE_MUTEX(client_mutex);
> >
> > +/*
> > + * Currently used for the del_recall_lru and file hash table. In an
> > + * effort to decrease the scope of the client_mutex, this spinlock may
> > + * eventually cover more:
> > + */
> > +static DEFINE_SPINLOCK(recall_lock);
> > +
> > static struct kmem_cache *stateowner_slab = NULL;
> > static struct kmem_cache *file_slab = NULL;
> > static struct kmem_cache *stateid_slab = NULL;
> > @@ -116,19 +120,15 @@ opaque_hashval(const void *ptr, int nbytes)
> > return x;
> > }
> >
> > -/*
> > - * Delegation state
> > - */
> > -
> > -/* recall_lock protects the del_recall_lru */
> > -static DEFINE_SPINLOCK(recall_lock);
> > static struct list_head del_recall_lru;
> >
> > static void
> > free_nfs4_file(struct kref *kref)
> > {
> > struct nfs4_file *fp = container_of(kref, struct nfs4_file, fi_ref);
> > + spin_lock(&recall_lock);
> > list_del(&fp->fi_hash);
> > + spin_unlock(&recall_lock);
> > iput(fp->fi_inode);
> > kmem_cache_free(file_slab, fp);
> > }
> > @@ -1004,7 +1004,9 @@ alloc_init_file(struct inode *ino)
> > INIT_LIST_HEAD(&fp->fi_hash);
> > INIT_LIST_HEAD(&fp->fi_stateids);
> > INIT_LIST_HEAD(&fp->fi_delegations);
> > + spin_lock(&recall_lock);
> > list_add(&fp->fi_hash, &file_hashtbl[hashval]);
> > + spin_unlock(&recall_lock);
> > fp->fi_inode = igrab(ino);
> > fp->fi_id = current_fileid++;
> > fp->fi_had_conflict = false;
> > @@ -1177,12 +1179,15 @@ find_file(struct inode *ino)
> > unsigned int hashval = file_hashval(ino);
> > struct nfs4_file *fp;
> >
> > + spin_lock(&recall_lock);
> > list_for_each_entry(fp, &file_hashtbl[hashval], fi_hash) {
> > if (fp->fi_inode == ino) {
> > + spin_unlock(&recall_lock);
>
> Hmm, I'm afraid there's a race here since potentially
> you could take the reference on the file after it has been freed.
> find_file should call get_nfs4_file if and only if it's still hashed,
> i.e. atomically with looking it up on the list.
>
> On the same lines, the lock should be taken in put_nfs4_file
> rather than in free_nfs4_file.
>
> That being said, I'm not sure we're unhashing the file in the right
> place... it's too late for me right now but my hunch is that open
> should alloc_init the nfs4_file as done today and close should unhash
> it (under the recall spinlock), and then put it.
> put_nfs4_file can stay the same and free_nfs4_file should just do the
> iput and kmem_cache_free.
>
> The main difference is that find_file will fail finding the nfs4_file
> struct after close. (get_nfs4_file in find_file should still happen
> under the lock though)

It's probably better for the nfs4_file to be visible longer, but either
is correct.

The only reason the delegation has a reference on this is to keep track
of which files have seen conflicts. Maybe there's some better way.

Thanks for catching this....

--b.