2016-09-22 17:39:26

by Trond Myklebust

Subject: [PATCH v7 00/31] Fix delegation behaviour when server revokes some state

According to RFC5661, if any of the SEQUENCE status bits
SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED,
SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED, SEQ4_STATUS_ADMIN_STATE_REVOKED,
or SEQ4_STATUS_RECALLABLE_STATE_REVOKED are set, then we need to use
TEST_STATEID to figure out which stateids have been revoked, so we
can acknowledge the loss of state using FREE_STATEID.

While we already do this for open and lock state, we have not been doing
so for all the delegations.
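
In rough outline, the recovery flow this series adds for delegations looks
like the sketch below. The two wrapper functions are invented for
illustration only; nfs_mark_test_expired_all_delegations(),
nfs_reap_expired_delegations() and the NFS4CLNT_DELEGATION_EXPIRED bit are
added in patch 09, while nfs4_schedule_state_manager() is the existing
entry point into the state manager:

/* Illustrative sketch only; not the literal kernel code. */
static void example_handle_state_revoked(struct nfs_client *clp)
{
        /* Mark every delegation as needing a TEST_STATEID check and
         * kick the state manager. */
        nfs_mark_test_expired_all_delegations(clp);
        nfs4_schedule_state_manager(clp);
}

static void example_state_manager_step(struct nfs_client *clp)
{
        /* The state manager later walks the marked delegations, sending
         * TEST_STATEID for each one and acknowledging any revoked ones
         * with FREE_STATEID. */
        if (test_and_clear_bit(NFS4CLNT_DELEGATION_EXPIRED, &clp->cl_state))
                nfs_reap_expired_delegations(clp);
}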

v2: nfs_v4_2_minor_ops needs to set .test_and_free_expired too
v3: Now with added lock revoke fixes and close/delegreturn/locku fixes
v4: Close a bunch of corner cases
v5: Report revoked delegations as invalid in nfs_have_delegation()
    Fix an infinite loop in nfs_reap_expired_delegations.
    Fixes for other looping behaviour
v6: Fix nfs4_do_handle_exception to handle all stateids, not just delegations
    Stable fix for nfs4_copy_delegation_stateid
    Marked fix "NFSv4: Don't report revoked delegations as valid in
    nfs_have_delegation" for stable.
    Stable fix for the inode mode/fileid corruption
v7: Fix handling of NFS4ERR_OPENMODE
    Handle matching stateids that have set seqid==0


Many thanks to Oleg Drokin for all his time helping to test and debug these
issues!


Trond Myklebust (31):
NFSv4.1: Don't deadlock the state manager on the SEQUENCE status flags
NFS: Fix inode corruption in nfs_prime_dcache()
NFSv4: Don't report revoked delegations as valid in
  nfs_have_delegation()
NFSv4: nfs4_copy_delegation_stateid() must fail if the delegation is
  invalid
NFSv4.1: Don't check delegations that are already marked as revoked
NFSv4.1: Allow test_stateid to handle session errors without waiting
NFSv4.1: Add a helper function to deal with expired stateids
NFSv4.x: Allow callers of nfs_remove_bad_delegation() to specify a
  stateid
NFSv4.1: Test delegation stateids when server declares "some state
  revoked"
NFSv4.1: Deal with server reboots during delegation expiration
  recovery
NFSv4.1: Don't recheck delegations that have already been checked
NFSv4.1: Allow revoked stateids to skip the call to TEST_STATEID
NFSv4.1: Ensure we always run TEST/FREE_STATEID on locks
NFSv4.1: FREE_STATEID can be asynchronous
NFSv4.1: Ensure we call FREE_STATEID if needed on
  close/delegreturn/locku
NFSv4: Ensure we don't re-test revoked and freed stateids
NFSv4: nfs_inode_find_state_and_recover() should check all stateids
NFSv4: nfs4_handle_delegation_recall_error() handle expiration as
  revoke case
NFSv4: nfs4_handle_setlk_error() handle expiration as revoke case
NFSv4.1: nfs4_layoutget_handle_exception handle revoked state
NFSv4: Pass the stateid to the exception handler in
  nfs4_read/write_done_cb
NFSv4: Fix a race in nfs_inode_reclaim_delegation()
NFSv4: Fix a race when updating an open_stateid
NFS: Always call nfs_inode_find_state_and_recover() when revoking a
  delegation
NFSv4: nfs4_do_handle_exception() handle revoke/expiry of a single
  stateid
NFSv4: Don't test open_stateid unless it is set
NFSv4: Mark the lock and open stateids as invalid after freeing them
NFSv4: Open state recovery must account for file permission changes
NFSv4: Fix retry issues with nfs41_test/free_stateid
NFSv4: If recovery failed for a specific open stateid, then don't
  retry
NFSv4.1: Even if the stateid is OK, we may need to recover the open
  modes

fs/nfs/delegation.c | 213 +++++++++++++++--
fs/nfs/delegation.h | 8 +-
fs/nfs/dir.c | 16 +-
fs/nfs/flexfilelayout/flexfilelayout.c | 2 +-
fs/nfs/nfs4_fs.h | 5 +-
fs/nfs/nfs4proc.c | 402 +++++++++++++++++++++++----------
fs/nfs/nfs4session.h | 1 +
fs/nfs/nfs4state.c | 84 +++++--
include/linux/nfs4.h | 1 +
9 files changed, 565 insertions(+), 167 deletions(-)

--
2.7.4



2016-09-22 17:39:29

by Trond Myklebust

Subject: [PATCH v7 03/31] NFSv4: Don't report revoked delegations as valid in nfs_have_delegation()

If the delegation is revoked, then it can't be used for caching.

Fixes: 869f9dfa4d6d ("NFSv4: Fix races between nfs_remove_bad_delegation()...")
Signed-off-by: Trond Myklebust <[email protected]>
Cc: [email protected] # v3.19+
---
fs/nfs/delegation.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 322c2585bc34..86d2c748140b 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -51,6 +51,7 @@ nfs4_do_check_delegation(struct inode *inode, fmode_t flags, bool mark)
rcu_read_lock();
delegation = rcu_dereference(NFS_I(inode)->delegation);
if (delegation != NULL && (delegation->type & flags) == flags &&
+ !test_bit(NFS_DELEGATION_REVOKED, &delegation->flags) &&
!test_bit(NFS_DELEGATION_RETURNING, &delegation->flags)) {
if (mark)
nfs_mark_delegation_referenced(delegation);
--
2.7.4


2016-09-22 17:39:30

by Trond Myklebust

Subject: [PATCH v7 04/31] NFSv4: nfs4_copy_delegation_stateid() must fail if the delegation is invalid

We must not allow the use of delegations that have been revoked or are
being returned.

Fixes: 869f9dfa4d6d ("NFSv4: Fix races between nfs_remove_bad_delegation()...")
Signed-off-by: Trond Myklebust <[email protected]>
Cc: [email protected] # v3.19+
---
fs/nfs/delegation.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 86d2c748140b..b9c65421ed81 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -41,6 +41,17 @@ void nfs_mark_delegation_referenced(struct nfs_delegation *delegation)
set_bit(NFS_DELEGATION_REFERENCED, &delegation->flags);
}

+static bool
+nfs4_is_valid_delegation(const struct nfs_delegation *delegation,
+ fmode_t flags)
+{
+ if (delegation != NULL && (delegation->type & flags) == flags &&
+ !test_bit(NFS_DELEGATION_REVOKED, &delegation->flags) &&
+ !test_bit(NFS_DELEGATION_RETURNING, &delegation->flags))
+ return true;
+ return false;
+}
+
static int
nfs4_do_check_delegation(struct inode *inode, fmode_t flags, bool mark)
{
@@ -50,9 +61,7 @@ nfs4_do_check_delegation(struct inode *inode, fmode_t flags, bool mark)
flags &= FMODE_READ|FMODE_WRITE;
rcu_read_lock();
delegation = rcu_dereference(NFS_I(inode)->delegation);
- if (delegation != NULL && (delegation->type & flags) == flags &&
- !test_bit(NFS_DELEGATION_REVOKED, &delegation->flags) &&
- !test_bit(NFS_DELEGATION_RETURNING, &delegation->flags)) {
+ if (nfs4_is_valid_delegation(delegation, flags)) {
if (mark)
nfs_mark_delegation_referenced(delegation);
ret = 1;
@@ -894,7 +903,7 @@ bool nfs4_copy_delegation_stateid(struct inode *inode, fmode_t flags,
flags &= FMODE_READ|FMODE_WRITE;
rcu_read_lock();
delegation = rcu_dereference(nfsi->delegation);
- ret = (delegation != NULL && (delegation->type & flags) == flags);
+ ret = nfs4_is_valid_delegation(delegation, flags);
if (ret) {
nfs4_stateid_copy(dst, &delegation->stateid);
nfs_mark_delegation_referenced(delegation);
--
2.7.4


2016-09-22 17:39:27

by Trond Myklebust

Subject: [PATCH v7 01/31] NFSv4.1: Don't deadlock the state manager on the SEQUENCE status flags

As described in RFC5661, section 18.46, some of the status flags exist
in order to tell the client when it needs to acknowledge the existence of
revoked state on the server and/or to recover state.
Those flags will then remain set until the recovery procedure is done.

In order to avoid looping, the client therefore needs to ignore
those particular flags while recovering.
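
Roughly speaking, the patch splits the SEQUENCE status flags into two
groups, and when the reply arrived on a privileged (state manager) slot
only the second group is still acted on. A sketch of that split, using
invented EXAMPLE_* names for the masks (the real code simply branches to
the new out_recovery: label):

/* Flags that request state recovery: per RFC5661, section 18.46.3 they
 * remain set until recovery completes, so they must be ignored while the
 * state manager itself is the caller. */
#define EXAMPLE_SEQ4_RECOVERY_FLAGS     (SEQ4_STATUS_RESTART_RECLAIM_NEEDED | \
                                         SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED | \
                                         SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED | \
                                         SEQ4_STATUS_ADMIN_STATE_REVOKED | \
                                         SEQ4_STATUS_RECALLABLE_STATE_REVOKED | \
                                         SEQ4_STATUS_LEASE_MOVED)

/* Flags that report backchannel problems: always handled, even while
 * recovering. */
#define EXAMPLE_SEQ4_BACKCHANNEL_FLAGS  (SEQ4_STATUS_BACKCHANNEL_FAULT | \
                                         SEQ4_STATUS_CB_PATH_DOWN | \
                                         SEQ4_STATUS_CB_PATH_DOWN_SESSION)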

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4_fs.h | 2 +-
fs/nfs/nfs4proc.c | 5 ++++-
fs/nfs/nfs4session.h | 1 +
fs/nfs/nfs4state.c | 12 +++++++++++-
4 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index f230aa62ca59..4390d73a92e5 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -439,7 +439,7 @@ extern void nfs4_schedule_path_down_recovery(struct nfs_client *clp);
extern int nfs4_schedule_stateid_recovery(const struct nfs_server *, struct nfs4_state *);
extern int nfs4_schedule_migration_recovery(const struct nfs_server *);
extern void nfs4_schedule_lease_moved_recovery(struct nfs_client *);
-extern void nfs41_handle_sequence_flag_errors(struct nfs_client *clp, u32 flags);
+extern void nfs41_handle_sequence_flag_errors(struct nfs_client *clp, u32 flags, bool);
extern void nfs41_handle_server_scope(struct nfs_client *,
struct nfs41_server_scope **);
extern void nfs4_put_lock_state(struct nfs4_lock_state *lsp);
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 251e48e7ba16..6b700c59eede 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -616,6 +616,7 @@ int nfs40_setup_sequence(struct nfs4_slot_table *tbl,
}
spin_unlock(&tbl->slot_tbl_lock);

+ slot->privileged = args->sa_privileged ? 1 : 0;
args->sa_slot = slot;
res->sr_slot = slot;

@@ -728,7 +729,8 @@ static int nfs41_sequence_process(struct rpc_task *task,
clp = session->clp;
do_renew_lease(clp, res->sr_timestamp);
/* Check sequence flags */
- nfs41_handle_sequence_flag_errors(clp, res->sr_status_flags);
+ nfs41_handle_sequence_flag_errors(clp, res->sr_status_flags,
+ !!slot->privileged);
nfs41_update_target_slotid(slot->table, slot, res);
break;
case 1:
@@ -875,6 +877,7 @@ int nfs41_setup_sequence(struct nfs4_session *session,
}
spin_unlock(&tbl->slot_tbl_lock);

+ slot->privileged = args->sa_privileged ? 1 : 0;
args->sa_slot = slot;

dprintk("<-- %s slotid=%u seqid=%u\n", __func__,
diff --git a/fs/nfs/nfs4session.h b/fs/nfs/nfs4session.h
index 3bb6af70973c..dae385500005 100644
--- a/fs/nfs/nfs4session.h
+++ b/fs/nfs/nfs4session.h
@@ -23,6 +23,7 @@ struct nfs4_slot {
u32 slot_nr;
u32 seq_nr;
unsigned int interrupted : 1,
+ privileged : 1,
seq_done : 1;
};

diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index cada00aa5096..9801b5bb5fac 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -2227,13 +2227,22 @@ static void nfs41_handle_cb_path_down(struct nfs_client *clp)
nfs4_schedule_state_manager(clp);
}

-void nfs41_handle_sequence_flag_errors(struct nfs_client *clp, u32 flags)
+void nfs41_handle_sequence_flag_errors(struct nfs_client *clp, u32 flags,
+ bool recovery)
{
if (!flags)
return;

dprintk("%s: \"%s\" (client ID %llx) flags=0x%08x\n",
__func__, clp->cl_hostname, clp->cl_clientid, flags);
+ /*
+ * If we're called from the state manager thread, then assume we're
+ * already handling the RECLAIM_NEEDED and/or STATE_REVOKED.
+ * Those flags are expected to remain set until we're done
+ * recovering (see RFC5661, section 18.46.3).
+ */
+ if (recovery)
+ goto out_recovery;

if (flags & SEQ4_STATUS_RESTART_RECLAIM_NEEDED)
nfs41_handle_server_reboot(clp);
@@ -2246,6 +2255,7 @@ void nfs41_handle_sequence_flag_errors(struct nfs_client *clp, u32 flags)
nfs4_schedule_lease_moved_recovery(clp);
if (flags & SEQ4_STATUS_RECALLABLE_STATE_REVOKED)
nfs41_handle_recallable_state_revoked(clp);
+out_recovery:
if (flags & SEQ4_STATUS_BACKCHANNEL_FAULT)
nfs41_handle_backchannel_fault(clp);
else if (flags & (SEQ4_STATUS_CB_PATH_DOWN |
--
2.7.4


2016-09-22 17:39:33

by Trond Myklebust

Subject: [PATCH v7 02/31] NFS: Fix inode corruption in nfs_prime_dcache()

Due to inode number reuse in filesystems, we can end up corrupting the
inode on our client if we apply the file attributes without ensuring that
the filehandle matches.
Typical symptoms include spurious "mode changed" reports in the syslog.

We still want to ensure that we don't invalidate the dentry if the inode
number matches but the entry does not include a filehandle.

Fixes: fa9233699cc1 ("NFS: Don't require a filehandle to refresh...")
Signed-off-by: Trond Myklebust <[email protected]>
Cc: [email protected] # v4.0+
---
fs/nfs/dir.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 177fefb26c18..6bc5a68e39f1 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -435,11 +435,11 @@ int nfs_same_file(struct dentry *dentry, struct nfs_entry *entry)
return 0;

nfsi = NFS_I(inode);
- if (entry->fattr->fileid == nfsi->fileid)
- return 1;
- if (nfs_compare_fh(entry->fh, &nfsi->fh) == 0)
- return 1;
- return 0;
+ if (entry->fattr->fileid != nfsi->fileid)
+ return 0;
+ if (entry->fh->size && nfs_compare_fh(entry->fh, &nfsi->fh) != 0)
+ return 0;
+ return 1;
}

static
@@ -517,6 +517,8 @@ again:
&entry->fattr->fsid))
goto out;
if (nfs_same_file(dentry, entry)) {
+ if (!entry->fh->size)
+ goto out;
nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
status = nfs_refresh_inode(d_inode(dentry), entry->fattr);
if (!status)
@@ -529,6 +531,10 @@ again:
goto again;
}
}
+ if (!entry->fh->size) {
+ d_lookup_done(dentry);
+ goto out;
+ }

inode = nfs_fhget(dentry->d_sb, entry->fh, entry->fattr, entry->label);
alias = d_splice_alias(inode, dentry);
--
2.7.4


2016-09-22 17:39:39

by Trond Myklebust

Subject: [PATCH v7 08/31] NFSv4.x: Allow callers of nfs_remove_bad_delegation() to specify a stateid

Allow the callers of nfs_remove_bad_delegation() to specify the stateid
that needs to be marked as bad.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/delegation.c | 30 +++++++++++++++++++++++-------
fs/nfs/delegation.h | 2 +-
fs/nfs/flexfilelayout/flexfilelayout.c | 2 +-
fs/nfs/nfs4proc.c | 14 ++++++++------
4 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index b9c65421ed81..e5212e5c73d2 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -652,23 +652,39 @@ static void nfs_client_mark_return_unused_delegation_types(struct nfs_client *cl
rcu_read_unlock();
}

-static void nfs_revoke_delegation(struct inode *inode)
+static void nfs_mark_delegation_revoked(struct nfs_server *server,
+ struct nfs_delegation *delegation)
+{
+ set_bit(NFS_DELEGATION_REVOKED, &delegation->flags);
+ nfs_mark_return_delegation(server, delegation);
+}
+
+static bool nfs_revoke_delegation(struct inode *inode,
+ const nfs4_stateid *stateid)
{
struct nfs_delegation *delegation;
+ bool ret = false;
+
rcu_read_lock();
delegation = rcu_dereference(NFS_I(inode)->delegation);
- if (delegation != NULL) {
- set_bit(NFS_DELEGATION_REVOKED, &delegation->flags);
- nfs_mark_return_delegation(NFS_SERVER(inode), delegation);
- }
+ if (delegation == NULL)
+ goto out;
+ if (stateid && !nfs4_stateid_match(stateid, &delegation->stateid))
+ goto out;
+ nfs_mark_delegation_revoked(NFS_SERVER(inode), delegation);
+ ret = true;
+out:
rcu_read_unlock();
+ return ret;
}

-void nfs_remove_bad_delegation(struct inode *inode)
+void nfs_remove_bad_delegation(struct inode *inode,
+ const nfs4_stateid *stateid)
{
struct nfs_delegation *delegation;

- nfs_revoke_delegation(inode);
+ if (!nfs_revoke_delegation(inode, stateid))
+ return;
delegation = nfs_inode_detach_delegation(inode);
if (delegation) {
nfs_inode_find_state_and_recover(inode, &delegation->stateid);
diff --git a/fs/nfs/delegation.h b/fs/nfs/delegation.h
index 64724d252a79..d40827af5913 100644
--- a/fs/nfs/delegation.h
+++ b/fs/nfs/delegation.h
@@ -47,7 +47,7 @@ void nfs_expire_unused_delegation_types(struct nfs_client *clp, fmode_t flags);
void nfs_expire_unreferenced_delegations(struct nfs_client *clp);
int nfs_client_return_marked_delegations(struct nfs_client *clp);
int nfs_delegations_present(struct nfs_client *clp);
-void nfs_remove_bad_delegation(struct inode *inode);
+void nfs_remove_bad_delegation(struct inode *inode, const nfs4_stateid *stateid);

void nfs_delegation_mark_reclaim(struct nfs_client *clp);
void nfs_delegation_reap_unclaimed(struct nfs_client *clp);
diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
index 51b51369704c..98ace127bf86 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.c
+++ b/fs/nfs/flexfilelayout/flexfilelayout.c
@@ -1080,7 +1080,7 @@ static int ff_layout_async_handle_error_v4(struct rpc_task *task,
case -NFS4ERR_BAD_STATEID:
if (state == NULL)
break;
- nfs_remove_bad_delegation(state->inode);
+ nfs_remove_bad_delegation(state->inode, NULL);
case -NFS4ERR_OPENMODE:
if (state == NULL)
break;
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 0ade81441ac2..9c203c221876 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2385,9 +2385,10 @@ static int nfs4_open_expired(struct nfs4_state_owner *sp, struct nfs4_state *sta
return ret;
}

-static void nfs_finish_clear_delegation_stateid(struct nfs4_state *state)
+static void nfs_finish_clear_delegation_stateid(struct nfs4_state *state,
+ const nfs4_stateid *stateid)
{
- nfs_remove_bad_delegation(state->inode);
+ nfs_remove_bad_delegation(state->inode, stateid);
write_seqlock(&state->seqlock);
nfs4_stateid_copy(&state->stateid, &state->open_stateid);
write_sequnlock(&state->seqlock);
@@ -2397,7 +2398,7 @@ static void nfs_finish_clear_delegation_stateid(struct nfs4_state *state)
static void nfs40_clear_delegation_stateid(struct nfs4_state *state)
{
if (rcu_access_pointer(NFS_I(state->inode)->delegation) != NULL)
- nfs_finish_clear_delegation_stateid(state);
+ nfs_finish_clear_delegation_stateid(state, NULL);
}

static int nfs40_open_expired(struct nfs4_state_owner *sp, struct nfs4_state *state)
@@ -2443,19 +2444,20 @@ static void nfs41_check_delegation_stateid(struct nfs4_state *state)
rcu_read_unlock();
return;
}
+
+ nfs4_stateid_copy(&stateid, &delegation->stateid);
if (test_bit(NFS_DELEGATION_REVOKED, &delegation->flags)) {
rcu_read_unlock();
- nfs_finish_clear_delegation_stateid(state);
+ nfs_finish_clear_delegation_stateid(state, &stateid);
return;
}

- nfs4_stateid_copy(&stateid, &delegation->stateid);
cred = get_rpccred(delegation->cred);
rcu_read_unlock();
status = nfs41_test_and_free_expired_stateid(server, &stateid, cred);
trace_nfs4_test_delegation_stateid(state, NULL, status);
if (status != NFS_OK)
- nfs_finish_clear_delegation_stateid(state);
+ nfs_finish_clear_delegation_stateid(state, &stateid);

put_rpccred(cred);
}
--
2.7.4


2016-09-22 17:39:34

by Trond Myklebust

Subject: [PATCH v7 06/31] NFSv4.1: Allow test_stateid to handle session errors without waiting

If the server crashes while we're testing stateids for validity, then
we want to initiate session recovery. Usually, we will be calling from
a state manager thread, though, so we don't really want to wait.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index a026156434ae..2b569d5fb3e2 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -8598,6 +8598,23 @@ static int _nfs41_test_stateid(struct nfs_server *server,
return -res.status;
}

+static void nfs4_handle_delay_or_session_error(struct nfs_server *server,
+ int err, struct nfs4_exception *exception)
+{
+ exception->retry = 0;
+ switch(err) {
+ case -NFS4ERR_DELAY:
+ nfs4_handle_exception(server, err, exception);
+ break;
+ case -NFS4ERR_BADSESSION:
+ case -NFS4ERR_BADSLOT:
+ case -NFS4ERR_BAD_HIGH_SLOT:
+ case -NFS4ERR_CONN_NOT_BOUND_TO_SESSION:
+ case -NFS4ERR_DEADSESSION:
+ nfs4_do_handle_exception(server, err, exception);
+ }
+}
+
/**
* nfs41_test_stateid - perform a TEST_STATEID operation
*
@@ -8617,9 +8634,7 @@ static int nfs41_test_stateid(struct nfs_server *server,
int err;
do {
err = _nfs41_test_stateid(server, stateid, cred);
- if (err != -NFS4ERR_DELAY)
- break;
- nfs4_handle_exception(server, err, &exception);
+ nfs4_handle_delay_or_session_error(server, err, &exception);
} while (exception.retry);
return err;
}
--
2.7.4


2016-09-22 17:39:37

by Trond Myklebust

Subject: [PATCH v7 05/31] NFSv4.1: Don't check delegations that are already marked as revoked

If the delegation has been marked as revoked, we don't have to test
it, because we should already have called FREE_STATEID on it.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 6b700c59eede..a026156434ae 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2423,6 +2423,11 @@ static void nfs41_check_delegation_stateid(struct nfs4_state *state)
rcu_read_unlock();
return;
}
+ if (test_bit(NFS_DELEGATION_REVOKED, &delegation->flags)) {
+ rcu_read_unlock();
+ nfs_finish_clear_delegation_stateid(state);
+ return;
+ }

nfs4_stateid_copy(&stateid, &delegation->stateid);
cred = get_rpccred(delegation->cred);
--
2.7.4


2016-09-22 17:39:42

by Trond Myklebust

Subject: [PATCH v7 09/31] NFSv4.1: Test delegation stateids when server declares "some state revoked"

According to RFC5661, if any of the SEQUENCE status bits
SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED,
SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED, SEQ4_STATUS_ADMIN_STATE_REVOKED,
or SEQ4_STATUS_RECALLABLE_STATE_REVOKED are set, then we need to use
TEST_STATEID to figure out which stateids have been revoked, so we
can acknowledge the loss of state using FREE_STATEID.

While we already do this for open and lock state, we have not been doing
so for all the delegations.
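
One consequence, visible in the first hunk below, is that a delegation
that may have been revoked is excluded from the ordinary reboot reclaim
until it has been tested. In sketch form (the helper name here is
invented):

/* Sketch: only delegations that are not awaiting a TEST_STATEID check
 * may be reclaimed after a server reboot. */
static bool example_can_reclaim_on_reboot(struct nfs_delegation *delegation)
{
        return !test_bit(NFS_DELEGATION_TEST_EXPIRED, &delegation->flags);
}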

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/delegation.c | 97 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
fs/nfs/delegation.h | 4 +++
fs/nfs/nfs4_fs.h | 3 ++
fs/nfs/nfs4proc.c | 10 ++++++
fs/nfs/nfs4state.c | 17 +++++-----
5 files changed, 122 insertions(+), 9 deletions(-)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index e5212e5c73d2..dfb300901f7e 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -812,8 +812,15 @@ static void nfs_delegation_mark_reclaim_server(struct nfs_server *server)
{
struct nfs_delegation *delegation;

- list_for_each_entry_rcu(delegation, &server->delegations, super_list)
+ list_for_each_entry_rcu(delegation, &server->delegations, super_list) {
+ /*
+ * If the delegation may have been admin revoked, then we
+ * cannot reclaim it.
+ */
+ if (test_bit(NFS_DELEGATION_TEST_EXPIRED, &delegation->flags))
+ continue;
set_bit(NFS_DELEGATION_NEED_RECLAIM, &delegation->flags);
+ }
}

/**
@@ -877,6 +884,94 @@ restart:
rcu_read_unlock();
}

+static void nfs_mark_test_expired_delegation(struct nfs_server *server,
+ struct nfs_delegation *delegation)
+{
+ clear_bit(NFS_DELEGATION_NEED_RECLAIM, &delegation->flags);
+ set_bit(NFS_DELEGATION_TEST_EXPIRED, &delegation->flags);
+ set_bit(NFS4CLNT_DELEGATION_EXPIRED, &server->nfs_client->cl_state);
+}
+
+static void nfs_delegation_mark_test_expired_server(struct nfs_server *server)
+{
+ struct nfs_delegation *delegation;
+
+ list_for_each_entry_rcu(delegation, &server->delegations, super_list)
+ nfs_mark_test_expired_delegation(server, delegation);
+}
+
+/**
+ * nfs_mark_test_expired_all_delegations - mark all delegations for testing
+ * @clp: nfs_client to process
+ *
+ * Iterates through all the delegations associated with this server and
+ * marks them as needing to be checked for validity.
+ */
+void nfs_mark_test_expired_all_delegations(struct nfs_client *clp)
+{
+ struct nfs_server *server;
+
+ rcu_read_lock();
+ list_for_each_entry_rcu(server, &clp->cl_superblocks, client_link)
+ nfs_delegation_mark_test_expired_server(server);
+ rcu_read_unlock();
+}
+
+/**
+ * nfs_reap_expired_delegations - reap expired delegations
+ * @clp: nfs_client to process
+ *
+ * Iterates through all the delegations associated with this server and
+ * checks if they may have been revoked. This function is usually
+ * expected to be called in cases where the server may have lost its
+ * lease.
+ */
+void nfs_reap_expired_delegations(struct nfs_client *clp)
+{
+ const struct nfs4_minor_version_ops *ops = clp->cl_mvops;
+ struct nfs_delegation *delegation;
+ struct nfs_server *server;
+ struct inode *inode;
+ struct rpc_cred *cred;
+ nfs4_stateid stateid;
+
+restart:
+ rcu_read_lock();
+ list_for_each_entry_rcu(server, &clp->cl_superblocks, client_link) {
+ list_for_each_entry_rcu(delegation, &server->delegations,
+ super_list) {
+ if (test_bit(NFS_DELEGATION_RETURNING,
+ &delegation->flags))
+ continue;
+ if (test_bit(NFS_DELEGATION_TEST_EXPIRED,
+ &delegation->flags) == 0)
+ continue;
+ if (!nfs_sb_active(server->super))
+ continue;
+ inode = nfs_delegation_grab_inode(delegation);
+ if (inode == NULL) {
+ rcu_read_unlock();
+ nfs_sb_deactive(server->super);
+ goto restart;
+ }
+ cred = get_rpccred_rcu(delegation->cred);
+ nfs4_stateid_copy(&stateid, &delegation->stateid);
+ clear_bit(NFS_DELEGATION_TEST_EXPIRED, &delegation->flags);
+ rcu_read_unlock();
+ if (cred != NULL &&
+ ops->test_and_free_expired(server, &stateid, cred) < 0) {
+ nfs_revoke_delegation(inode, &stateid);
+ nfs_inode_find_state_and_recover(inode, &stateid);
+ }
+ put_rpccred(cred);
+ iput(inode);
+ nfs_sb_deactive(server->super);
+ goto restart;
+ }
+ }
+ rcu_read_unlock();
+}
+
/**
* nfs_delegations_present - check for existence of delegations
* @clp: client state handle
diff --git a/fs/nfs/delegation.h b/fs/nfs/delegation.h
index d40827af5913..1442e3b1521d 100644
--- a/fs/nfs/delegation.h
+++ b/fs/nfs/delegation.h
@@ -32,6 +32,7 @@ enum {
NFS_DELEGATION_REFERENCED,
NFS_DELEGATION_RETURNING,
NFS_DELEGATION_REVOKED,
+ NFS_DELEGATION_TEST_EXPIRED,
};

int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct nfs_openres *res);
@@ -52,6 +53,9 @@ void nfs_remove_bad_delegation(struct inode *inode, const nfs4_stateid *stateid)
void nfs_delegation_mark_reclaim(struct nfs_client *clp);
void nfs_delegation_reap_unclaimed(struct nfs_client *clp);

+void nfs_mark_test_expired_all_delegations(struct nfs_client *clp);
+void nfs_reap_expired_delegations(struct nfs_client *clp);
+
/* NFSv4 delegation-related procedures */
int nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, const nfs4_stateid *stateid, int issync);
int nfs4_open_delegation_recall(struct nfs_open_context *ctx, struct nfs4_state *state, const nfs4_stateid *stateid, fmode_t type);
diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index 4390d73a92e5..b9083a6cefd6 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -39,6 +39,7 @@ enum nfs4_client_state {
NFS4CLNT_BIND_CONN_TO_SESSION,
NFS4CLNT_MOVED,
NFS4CLNT_LEASE_MOVED,
+ NFS4CLNT_DELEGATION_EXPIRED,
};

#define NFS4_RENEW_TIMEOUT 0x01
@@ -57,6 +58,8 @@ struct nfs4_minor_version_ops {
struct nfs_fsinfo *);
void (*free_lock_state)(struct nfs_server *,
struct nfs4_lock_state *);
+ int (*test_and_free_expired)(struct nfs_server *,
+ nfs4_stateid *, struct rpc_cred *);
struct nfs_seqid *
(*alloc_seqid)(struct nfs_seqid_counter *, gfp_t);
const struct rpc_call_ops *call_sync_ops;
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 9c203c221876..3552f3a4ceb0 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2408,6 +2408,13 @@ static int nfs40_open_expired(struct nfs4_state_owner *sp, struct nfs4_state *st
return nfs4_open_expired(sp, state);
}

+static int nfs40_test_and_free_expired_stateid(struct nfs_server *server,
+ nfs4_stateid *stateid,
+ struct rpc_cred *cred)
+{
+ return -NFS4ERR_BAD_STATEID;
+}
+
#if defined(CONFIG_NFS_V4_1)
static int nfs41_test_and_free_expired_stateid(struct nfs_server *server,
nfs4_stateid *stateid,
@@ -8862,6 +8869,7 @@ static const struct nfs4_minor_version_ops nfs_v4_0_minor_ops = {
.match_stateid = nfs4_match_stateid,
.find_root_sec = nfs4_find_root_sec,
.free_lock_state = nfs4_release_lockowner,
+ .test_and_free_expired = nfs40_test_and_free_expired_stateid,
.alloc_seqid = nfs_alloc_seqid,
.call_sync_ops = &nfs40_call_sync_ops,
.reboot_recovery_ops = &nfs40_reboot_recovery_ops,
@@ -8889,6 +8897,7 @@ static const struct nfs4_minor_version_ops nfs_v4_1_minor_ops = {
.match_stateid = nfs41_match_stateid,
.find_root_sec = nfs41_find_root_sec,
.free_lock_state = nfs41_free_lock_state,
+ .test_and_free_expired = nfs41_test_and_free_expired_stateid,
.alloc_seqid = nfs_alloc_no_seqid,
.call_sync_ops = &nfs41_call_sync_ops,
.reboot_recovery_ops = &nfs41_reboot_recovery_ops,
@@ -8918,6 +8927,7 @@ static const struct nfs4_minor_version_ops nfs_v4_2_minor_ops = {
.find_root_sec = nfs41_find_root_sec,
.free_lock_state = nfs41_free_lock_state,
.call_sync_ops = &nfs41_call_sync_ops,
+ .test_and_free_expired = nfs41_test_and_free_expired_stateid,
.alloc_seqid = nfs_alloc_no_seqid,
.reboot_recovery_ops = &nfs41_reboot_recovery_ops,
.nograce_recovery_ops = &nfs41_nograce_recovery_ops,
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 9801b5bb5fac..63da0411e2af 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1656,15 +1656,9 @@ static void nfs4_state_end_reclaim_reboot(struct nfs_client *clp)
put_rpccred(cred);
}

-static void nfs_delegation_clear_all(struct nfs_client *clp)
-{
- nfs_delegation_mark_reclaim(clp);
- nfs_delegation_reap_unclaimed(clp);
-}
-
static void nfs4_state_start_reclaim_nograce(struct nfs_client *clp)
{
- nfs_delegation_clear_all(clp);
+ nfs_mark_test_expired_all_delegations(clp);
nfs4_state_mark_reclaim_helper(clp, nfs4_state_mark_reclaim_nograce);
}

@@ -2195,7 +2189,7 @@ static void nfs41_handle_all_state_revoked(struct nfs_client *clp)

static void nfs41_handle_some_state_revoked(struct nfs_client *clp)
{
- nfs4_state_mark_reclaim_helper(clp, nfs4_state_mark_reclaim_nograce);
+ nfs4_state_start_reclaim_nograce(clp);
nfs4_schedule_state_manager(clp);

dprintk("%s: state revoked on server %s\n", __func__, clp->cl_hostname);
@@ -2420,6 +2414,13 @@ static void nfs4_state_manager(struct nfs_client *clp)
nfs4_state_end_reclaim_reboot(clp);
}

+ /* Detect expired delegations... */
+ if (test_and_clear_bit(NFS4CLNT_DELEGATION_EXPIRED, &clp->cl_state)) {
+ section = "detect expired delegations";
+ nfs_reap_expired_delegations(clp);
+ continue;
+ }
+
/* Now recover expired state... */
if (test_and_clear_bit(NFS4CLNT_RECLAIM_NOGRACE, &clp->cl_state)) {
section = "reclaim nograce";
--
2.7.4


2016-09-22 17:39:43

by Trond Myklebust

Subject: [PATCH v7 10/31] NFSv4.1: Deal with server reboots during delegation expiration recovery

Ensure that if the server reboots while we're testing and recovering
from revoked delegations, we exit to allow the state manager to
handle matters.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/delegation.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index dfb300901f7e..0ffead281555 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -884,6 +884,13 @@ restart:
rcu_read_unlock();
}

+static inline bool nfs4_server_rebooted(const struct nfs_client *clp)
+{
+ return (clp->cl_state & (BIT(NFS4CLNT_CHECK_LEASE) |
+ BIT(NFS4CLNT_LEASE_EXPIRED) |
+ BIT(NFS4CLNT_SESSION_RESET))) != 0;
+}
+
static void nfs_mark_test_expired_delegation(struct nfs_server *server,
struct nfs_delegation *delegation)
{
@@ -892,6 +899,19 @@ static void nfs_mark_test_expired_delegation(struct nfs_server *server,
set_bit(NFS4CLNT_DELEGATION_EXPIRED, &server->nfs_client->cl_state);
}

+static void nfs_inode_mark_test_expired_delegation(struct nfs_server *server,
+ struct inode *inode)
+{
+ struct nfs_delegation *delegation;
+
+ rcu_read_lock();
+ delegation = rcu_dereference(NFS_I(inode)->delegation);
+ if (delegation)
+ nfs_mark_test_expired_delegation(server, delegation);
+ rcu_read_unlock();
+
+}
+
static void nfs_delegation_mark_test_expired_server(struct nfs_server *server)
{
struct nfs_delegation *delegation;
@@ -964,6 +984,12 @@ restart:
nfs_inode_find_state_and_recover(inode, &stateid);
}
put_rpccred(cred);
+ if (nfs4_server_rebooted(clp)) {
+ nfs_inode_mark_test_expired_delegation(server, inode);
+ iput(inode);
+ nfs_sb_deactive(server->super);
+ return;
+ }
iput(inode);
nfs_sb_deactive(server->super);
goto restart;
--
2.7.4


2016-09-22 17:39:44

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH v7 11/31] NFSv4.1: Don't recheck delegations that have already been checked

Ensure we don't spam the server with test_stateid() calls for
delegations that have already been checked.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 3552f3a4ceb0..97a614370604 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2459,6 +2459,11 @@ static void nfs41_check_delegation_stateid(struct nfs4_state *state)
return;
}

+ if (!test_and_clear_bit(NFS_DELEGATION_TEST_EXPIRED, &delegation->flags)) {
+ rcu_read_unlock();
+ return;
+ }
+
cred = get_rpccred(delegation->cred);
rcu_read_unlock();
status = nfs41_test_and_free_expired_stateid(server, &stateid, cred);
--
2.7.4


2016-09-22 17:39:48

by Trond Myklebust

Subject: [PATCH v7 14/31] NFSv4.1: FREE_STATEID can be asynchronous

Nothing should need to be serialised with FREE_STATEID on the client,
so let's make the RPC call always asynchronous. Also constify the
stateid argument.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 31 ++++++++++++-------------------
1 file changed, 12 insertions(+), 19 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 33ca6d768bd2..a53caeb73f75 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -99,8 +99,8 @@ static int nfs4_do_setattr(struct inode *inode, struct rpc_cred *cred,
#ifdef CONFIG_NFS_V4_1
static int nfs41_test_stateid(struct nfs_server *, nfs4_stateid *,
struct rpc_cred *);
-static int nfs41_free_stateid(struct nfs_server *, nfs4_stateid *,
- struct rpc_cred *);
+static int nfs41_free_stateid(struct nfs_server *, const nfs4_stateid *,
+ struct rpc_cred *, bool);
#endif

#ifdef CONFIG_NFS_V4_SECURITY_LABEL
@@ -2443,7 +2443,7 @@ static int nfs41_test_and_free_expired_stateid(struct nfs_server *server,
}
out_free:
/* Ack the revoked state to the server */
- nfs41_free_stateid(server, stateid, cred);
+ nfs41_free_stateid(server, stateid, cred, true);
return -NFS4ERR_EXPIRED;
}

@@ -8722,7 +8722,7 @@ static const struct rpc_call_ops nfs41_free_stateid_ops = {
};

static struct rpc_task *_nfs41_free_stateid(struct nfs_server *server,
- nfs4_stateid *stateid,
+ const nfs4_stateid *stateid,
struct rpc_cred *cred,
bool privileged)
{
@@ -8765,38 +8765,31 @@ static struct rpc_task *_nfs41_free_stateid(struct nfs_server *server,
* @server: server / transport on which to perform the operation
* @stateid: state ID to release
* @cred: credential
+ * @is_recovery: set to true if this call needs to be privileged
*
- * Returns NFS_OK if the server freed "stateid". Otherwise a
- * negative NFS4ERR value is returned.
+ * Note: this function is always asynchronous.
*/
static int nfs41_free_stateid(struct nfs_server *server,
- nfs4_stateid *stateid,
- struct rpc_cred *cred)
+ const nfs4_stateid *stateid,
+ struct rpc_cred *cred,
+ bool is_recovery)
{
struct rpc_task *task;
- int ret;

- task = _nfs41_free_stateid(server, stateid, cred, true);
+ task = _nfs41_free_stateid(server, stateid, cred, is_recovery);
if (IS_ERR(task))
return PTR_ERR(task);
- ret = rpc_wait_for_completion_task(task);
- if (!ret)
- ret = task->tk_status;
rpc_put_task(task);
- return ret;
+ return 0;
}

static void
nfs41_free_lock_state(struct nfs_server *server, struct nfs4_lock_state *lsp)
{
- struct rpc_task *task;
struct rpc_cred *cred = lsp->ls_state->owner->so_cred;

- task = _nfs41_free_stateid(server, &lsp->ls_stateid, cred, false);
+ nfs41_free_stateid(server, &lsp->ls_stateid, cred, false);
nfs4_free_lock_state(server, lsp);
- if (IS_ERR(task))
- return;
- rpc_put_task(task);
}

static bool nfs41_match_stateid(const nfs4_stateid *s1,
--
2.7.4


2016-09-22 17:39:47

by Trond Myklebust

Subject: [PATCH v7 13/31] NFSv4.1: Ensure we always run TEST/FREE_STATEID on locks

Right now, we're only running TEST/FREE_STATEID on the locks if
the open stateid recovery succeeds. The protocol requires us to
always do so.
The fix is to move the TEST/FREE_STATEID call so that it runs before
we attempt open recovery.
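
After this change the NFSv4.1 "open expired" path checks state in the
following order; this is a simplified sketch of the resulting
nfs41_open_expired(), with the real code in the diff below:

static int example_open_expired(struct nfs4_state_owner *sp,
                                struct nfs4_state *state)
{
        int status;

        /* 1. Test (and, if revoked, free) the delegation stateid. */
        nfs41_check_delegation_stateid(state);

        /* 2. Test the lock stateids unconditionally, before any open
         *    recovery is attempted. */
        status = nfs41_check_expired_locks(state);
        if (status != NFS_OK)
                return status;

        /* 3. Only if the open stateid is gone do we fall back to a full
         *    open reclaim. */
        status = nfs41_check_open_stateid(state);
        if (status != NFS_OK)
                status = nfs4_open_expired(sp, state);
        return status;
}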

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 92 +++++++++++++++++++++++++++++++------------------------
1 file changed, 52 insertions(+), 40 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 3c1b8cb7dd95..33ca6d768bd2 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2486,6 +2486,45 @@ static void nfs41_check_delegation_stateid(struct nfs4_state *state)
}

/**
+ * nfs41_check_expired_locks - possibly free a lock stateid
+ *
+ * @state: NFSv4 state for an inode
+ *
+ * Returns NFS_OK if recovery for this stateid is now finished.
+ * Otherwise a negative NFS4ERR value is returned.
+ */
+static int nfs41_check_expired_locks(struct nfs4_state *state)
+{
+ int status, ret = NFS_OK;
+ struct nfs4_lock_state *lsp;
+ struct nfs_server *server = NFS_SERVER(state->inode);
+
+ if (!test_bit(LK_STATE_IN_USE, &state->flags))
+ goto out;
+ list_for_each_entry(lsp, &state->lock_states, ls_locks) {
+ if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
+ struct rpc_cred *cred = lsp->ls_state->owner->so_cred;
+
+ status = nfs41_test_and_free_expired_stateid(server,
+ &lsp->ls_stateid,
+ cred);
+ trace_nfs4_test_lock_stateid(state, lsp, status);
+ if (status == -NFS4ERR_EXPIRED ||
+ status == -NFS4ERR_BAD_STATEID) {
+ clear_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags);
+ if (!recover_lost_locks)
+ set_bit(NFS_LOCK_LOST, &lsp->ls_flags);
+ } else if (status != NFS_OK) {
+ ret = status;
+ break;
+ }
+ }
+ };
+out:
+ return ret;
+}
+
+/**
* nfs41_check_open_stateid - possibly free an open stateid
*
* @state: NFSv4 state for an inode
@@ -2522,6 +2561,9 @@ static int nfs41_open_expired(struct nfs4_state_owner *sp, struct nfs4_state *st
int status;

nfs41_check_delegation_stateid(state);
+ status = nfs41_check_expired_locks(state);
+ if (status != NFS_OK)
+ return status;
status = nfs41_check_open_stateid(state);
if (status != NFS_OK)
status = nfs4_open_expired(sp, state);
@@ -6125,49 +6167,19 @@ out:
}

#if defined(CONFIG_NFS_V4_1)
-/**
- * nfs41_check_expired_locks - possibly free a lock stateid
- *
- * @state: NFSv4 state for an inode
- *
- * Returns NFS_OK if recovery for this stateid is now finished.
- * Otherwise a negative NFS4ERR value is returned.
- */
-static int nfs41_check_expired_locks(struct nfs4_state *state)
-{
- int status, ret = NFS_OK;
- struct nfs4_lock_state *lsp;
- struct nfs_server *server = NFS_SERVER(state->inode);
-
- list_for_each_entry(lsp, &state->lock_states, ls_locks) {
- if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
- struct rpc_cred *cred = lsp->ls_state->owner->so_cred;
-
- status = nfs41_test_and_free_expired_stateid(server,
- &lsp->ls_stateid,
- cred);
- trace_nfs4_test_lock_stateid(state, lsp, status);
- if (status == -NFS4ERR_EXPIRED ||
- status == -NFS4ERR_BAD_STATEID)
- clear_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags);
- else if (status != NFS_OK) {
- ret = status;
- break;
- }
- }
- };
-
- return ret;
-}
-
static int nfs41_lock_expired(struct nfs4_state *state, struct file_lock *request)
{
- int status = NFS_OK;
+ struct nfs4_lock_state *lsp;
+ int status;

- if (test_bit(LK_STATE_IN_USE, &state->flags))
- status = nfs41_check_expired_locks(state);
- if (status != NFS_OK)
- status = nfs4_lock_expired(state, request);
+ status = nfs4_set_lock_state(state, request);
+ if (status != 0)
+ return status;
+ lsp = request->fl_u.nfs4_fl.owner;
+ if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags) ||
+ test_bit(NFS_LOCK_LOST, &lsp->ls_flags))
+ return 0;
+ status = nfs4_lock_expired(state, request);
return status;
}
#endif
--
2.7.4


2016-09-22 17:39:42

by Trond Myklebust

Subject: [PATCH v7 07/31] NFSv4.1: Add a helper function to deal with expired stateids

In NFSv4.1 and newer, if the server decides to revoke some or all of
the protocol state, the client is required to iterate through all the
stateids that it holds and call TEST_STATEID to determine which stateids
still correspond to valid state, and then call FREE_STATEID on the
others.
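
The helper's contract, as the converted callers below rely on it, can be
summarised with this sketch (the wrapper function is hypothetical; the
calls inside it are taken from the diff):

static void example_check_delegation(struct nfs_server *server,
                                     struct nfs4_state *state,
                                     nfs4_stateid *stateid,
                                     struct rpc_cred *cred)
{
        /* NFS_OK: the stateid is still valid, nothing to do.
         * -NFS4ERR_EXPIRED/-NFS4ERR_ADMIN_REVOKED/-NFS4ERR_DELEG_REVOKED:
         *   the state was revoked and FREE_STATEID has already been sent.
         * -NFS4ERR_BAD_STATEID: the server does not recognise the stateid,
         *   so no FREE_STATEID was needed.
         * In every non-NFS_OK case only the local clean-up remains. */
        int status = nfs41_test_and_free_expired_stateid(server, stateid, cred);

        if (status != NFS_OK)
                nfs_finish_clear_delegation_stateid(state);
}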

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 45 ++++++++++++++++++++++++---------------------
1 file changed, 24 insertions(+), 21 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 2b569d5fb3e2..0ade81441ac2 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2408,6 +2408,26 @@ static int nfs40_open_expired(struct nfs4_state_owner *sp, struct nfs4_state *st
}

#if defined(CONFIG_NFS_V4_1)
+static int nfs41_test_and_free_expired_stateid(struct nfs_server *server,
+ nfs4_stateid *stateid,
+ struct rpc_cred *cred)
+{
+ int status;
+
+ status = nfs41_test_stateid(server, stateid, cred);
+
+ switch (status) {
+ case -NFS4ERR_EXPIRED:
+ case -NFS4ERR_ADMIN_REVOKED:
+ case -NFS4ERR_DELEG_REVOKED:
+ /* Ack the revoked state to the server */
+ nfs41_free_stateid(server, stateid, cred);
+ case -NFS4ERR_BAD_STATEID:
+ return status;
+ }
+ return NFS_OK;
+}
+
static void nfs41_check_delegation_stateid(struct nfs4_state *state)
{
struct nfs_server *server = NFS_SERVER(state->inode);
@@ -2432,16 +2452,10 @@ static void nfs41_check_delegation_stateid(struct nfs4_state *state)
nfs4_stateid_copy(&stateid, &delegation->stateid);
cred = get_rpccred(delegation->cred);
rcu_read_unlock();
- status = nfs41_test_stateid(server, &stateid, cred);
+ status = nfs41_test_and_free_expired_stateid(server, &stateid, cred);
trace_nfs4_test_delegation_stateid(state, NULL, status);
-
- if (status != NFS_OK) {
- /* Free the stateid unless the server explicitly
- * informs us the stateid is unrecognized. */
- if (status != -NFS4ERR_BAD_STATEID)
- nfs41_free_stateid(server, &stateid, cred);
+ if (status != NFS_OK)
nfs_finish_clear_delegation_stateid(state);
- }

put_rpccred(cred);
}
@@ -2467,14 +2481,9 @@ static int nfs41_check_open_stateid(struct nfs4_state *state)
(test_bit(NFS_O_RDWR_STATE, &state->flags) == 0))
return -NFS4ERR_BAD_STATEID;

- status = nfs41_test_stateid(server, stateid, cred);
+ status = nfs41_test_and_free_expired_stateid(server, stateid, cred);
trace_nfs4_test_open_stateid(state, NULL, status);
if (status != NFS_OK) {
- /* Free the stateid unless the server explicitly
- * informs us the stateid is unrecognized. */
- if (status != -NFS4ERR_BAD_STATEID)
- nfs41_free_stateid(server, stateid, cred);
-
clear_bit(NFS_O_RDONLY_STATE, &state->flags);
clear_bit(NFS_O_WRONLY_STATE, &state->flags);
clear_bit(NFS_O_RDWR_STATE, &state->flags);
@@ -6109,17 +6118,11 @@ static int nfs41_check_expired_locks(struct nfs4_state *state)
if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
struct rpc_cred *cred = lsp->ls_state->owner->so_cred;

- status = nfs41_test_stateid(server,
+ status = nfs41_test_and_free_expired_stateid(server,
&lsp->ls_stateid,
cred);
trace_nfs4_test_lock_stateid(state, lsp, status);
if (status != NFS_OK) {
- /* Free the stateid unless the server
- * informs us the stateid is unrecognized. */
- if (status != -NFS4ERR_BAD_STATEID)
- nfs41_free_stateid(server,
- &lsp->ls_stateid,
- cred);
clear_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags);
ret = status;
}
--
2.7.4


2016-09-22 17:39:51

by Trond Myklebust

Subject: [PATCH v7 16/31] NFSv4: Ensure we don't re-test revoked and freed stateids

This fixes a potential infinite loop in nfs_reap_expired_delegations.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/delegation.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 0ffead281555..484f14700108 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -656,6 +656,7 @@ static void nfs_mark_delegation_revoked(struct nfs_server *server,
struct nfs_delegation *delegation)
{
set_bit(NFS_DELEGATION_REVOKED, &delegation->flags);
+ delegation->stateid.type = NFS4_INVALID_STATEID_TYPE;
nfs_mark_return_delegation(server, delegation);
}

@@ -894,6 +895,8 @@ static inline bool nfs4_server_rebooted(const struct nfs_client *clp)
static void nfs_mark_test_expired_delegation(struct nfs_server *server,
struct nfs_delegation *delegation)
{
+ if (delegation->stateid.type == NFS4_INVALID_STATEID_TYPE)
+ return;
clear_bit(NFS_DELEGATION_NEED_RECLAIM, &delegation->flags);
set_bit(NFS_DELEGATION_TEST_EXPIRED, &delegation->flags);
set_bit(NFS4CLNT_DELEGATION_EXPIRED, &server->nfs_client->cl_state);
--
2.7.4


2016-09-22 17:39:45

by Trond Myklebust

Subject: [PATCH v7 12/31] NFSv4.1: Allow revoked stateids to skip the call to TEST_STATEID

In some cases (e.g. when the SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED sequence
flag is set) we may already know that the stateid was revoked and that the
only valid operation we can call is FREE_STATEID. In those cases, allow
the stateid to carry the information in the type field, so that we skip
the redundant call to TEST_STATEID.
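
A caller that already knows a stateid is dead (for example because the
SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED flag was seen) can therefore tag it
before handing it to the test-and-free helper. A minimal sketch, with an
invented caller name:

static void example_mark_known_revoked(struct nfs_server *server,
                                       nfs4_stateid *stateid,
                                       struct rpc_cred *cred)
{
        /* Tagging the stateid makes the helper skip TEST_STATEID and go
         * straight to FREE_STATEID. */
        stateid->type = NFS4_REVOKED_STATEID_TYPE;
        nfs41_test_and_free_expired_stateid(server, stateid, cred);
}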

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 32 +++++++++++++++++++++++---------
include/linux/nfs4.h | 1 +
2 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 97a614370604..3c1b8cb7dd95 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2422,18 +2422,29 @@ static int nfs41_test_and_free_expired_stateid(struct nfs_server *server,
{
int status;

- status = nfs41_test_stateid(server, stateid, cred);
+ switch (stateid->type) {
+ default:
+ break;
+ case NFS4_INVALID_STATEID_TYPE:
+ case NFS4_SPECIAL_STATEID_TYPE:
+ return -NFS4ERR_BAD_STATEID;
+ case NFS4_REVOKED_STATEID_TYPE:
+ goto out_free;
+ }

+ status = nfs41_test_stateid(server, stateid, cred);
switch (status) {
case -NFS4ERR_EXPIRED:
case -NFS4ERR_ADMIN_REVOKED:
case -NFS4ERR_DELEG_REVOKED:
- /* Ack the revoked state to the server */
- nfs41_free_stateid(server, stateid, cred);
- case -NFS4ERR_BAD_STATEID:
+ break;
+ default:
return status;
}
- return NFS_OK;
+out_free:
+ /* Ack the revoked state to the server */
+ nfs41_free_stateid(server, stateid, cred);
+ return -NFS4ERR_EXPIRED;
}

static void nfs41_check_delegation_stateid(struct nfs4_state *state)
@@ -2468,7 +2479,7 @@ static void nfs41_check_delegation_stateid(struct nfs4_state *state)
rcu_read_unlock();
status = nfs41_test_and_free_expired_stateid(server, &stateid, cred);
trace_nfs4_test_delegation_stateid(state, NULL, status);
- if (status != NFS_OK)
+ if (status == -NFS4ERR_EXPIRED || status == -NFS4ERR_BAD_STATEID)
nfs_finish_clear_delegation_stateid(state, &stateid);

put_rpccred(cred);
@@ -2497,7 +2508,7 @@ static int nfs41_check_open_stateid(struct nfs4_state *state)

status = nfs41_test_and_free_expired_stateid(server, stateid, cred);
trace_nfs4_test_open_stateid(state, NULL, status);
- if (status != NFS_OK) {
+ if (status == -NFS4ERR_EXPIRED || status == -NFS4ERR_BAD_STATEID) {
clear_bit(NFS_O_RDONLY_STATE, &state->flags);
clear_bit(NFS_O_WRONLY_STATE, &state->flags);
clear_bit(NFS_O_RDWR_STATE, &state->flags);
@@ -6124,7 +6135,7 @@ out:
*/
static int nfs41_check_expired_locks(struct nfs4_state *state)
{
- int status, ret = -NFS4ERR_BAD_STATEID;
+ int status, ret = NFS_OK;
struct nfs4_lock_state *lsp;
struct nfs_server *server = NFS_SERVER(state->inode);

@@ -6136,9 +6147,12 @@ static int nfs41_check_expired_locks(struct nfs4_state *state)
&lsp->ls_stateid,
cred);
trace_nfs4_test_lock_stateid(state, lsp, status);
- if (status != NFS_OK) {
+ if (status == -NFS4ERR_EXPIRED ||
+ status == -NFS4ERR_BAD_STATEID)
clear_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags);
+ else if (status != NFS_OK) {
ret = status;
+ break;
}
}
};
diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index c6564ada9beb..9094faf0699d 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -67,6 +67,7 @@ struct nfs4_stateid_struct {
NFS4_DELEGATION_STATEID_TYPE,
NFS4_LAYOUT_STATEID_TYPE,
NFS4_PNFS_DS_STATEID_TYPE,
+ NFS4_REVOKED_STATEID_TYPE,
} type;
};

--
2.7.4


2016-09-22 17:39:55

by Trond Myklebust

Subject: [PATCH v7 19/31] NFSv4: nfs4_handle_setlk_error() handle expiration as revoke case

If the server tells us our stateid has expired, then handle that as if
it was revoked.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 75dc389c9719..f0c5442401d7 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -6086,6 +6086,7 @@ static void nfs4_handle_setlk_error(struct nfs_server *server, struct nfs4_lock_
{
switch (error) {
case -NFS4ERR_ADMIN_REVOKED:
+ case -NFS4ERR_EXPIRED:
case -NFS4ERR_BAD_STATEID:
lsp->ls_seqid.flags &= ~NFS_SEQID_CONFIRMED;
if (new_lock_owner != 0 ||
@@ -6094,7 +6095,6 @@ static void nfs4_handle_setlk_error(struct nfs_server *server, struct nfs4_lock_
break;
case -NFS4ERR_STALE_STATEID:
lsp->ls_seqid.flags &= ~NFS_SEQID_CONFIRMED;
- case -NFS4ERR_EXPIRED:
nfs4_schedule_lease_recovery(server->nfs_client);
};
}
--
2.7.4


2016-09-22 17:40:00

by Trond Myklebust

Subject: [PATCH v7 23/31] NFSv4: Fix a race when updating an open_stateid

If we're replacing an old stateid which has a different 'other' field,
then we probably need to free the old stateid.
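
The pattern used here is to capture the stateid being displaced while
still holding the seqlock, and only TEST/FREE it once all locks have been
dropped. A heavily simplified sketch of that pattern (the real update
also takes the owner's so_lock and updates the state flags):

static void example_replace_open_stateid(struct nfs_server *server,
                                         struct nfs4_state *state,
                                         const nfs4_stateid *new_stateid)
{
        nfs4_stateid freeme = { 0 };

        write_seqlock(&state->seqlock);
        if (!nfs4_stateid_match_other(new_stateid, &state->open_stateid))
                nfs4_stateid_copy(&freeme, &state->open_stateid);
        nfs4_stateid_copy(&state->open_stateid, new_stateid);
        write_sequnlock(&state->seqlock);

        /* freeme.type != 0 means an older stateid was displaced. */
        if (freeme.type != 0)
                nfs4_test_and_free_stateid(server, &freeme,
                                           state->owner->so_cred);
}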

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 35 ++++++++++++++++++++++++++---------
1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 1ddfb300addf..338684195e95 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1399,11 +1399,12 @@ static void nfs_test_and_clear_all_open_stateid(struct nfs4_state *state)
}

static bool nfs_need_update_open_stateid(struct nfs4_state *state,
- nfs4_stateid *stateid)
+ const nfs4_stateid *stateid, nfs4_stateid *freeme)
{
if (test_and_set_bit(NFS_OPEN_STATE, &state->flags) == 0)
return true;
if (!nfs4_stateid_match_other(stateid, &state->open_stateid)) {
+ nfs4_stateid_copy(freeme, &state->open_stateid);
nfs_test_and_clear_all_open_stateid(state);
return true;
}
@@ -1467,7 +1468,9 @@ static void nfs_clear_open_stateid(struct nfs4_state *state,
nfs4_schedule_state_manager(state->owner->so_server->nfs_client);
}

-static void nfs_set_open_stateid_locked(struct nfs4_state *state, nfs4_stateid *stateid, fmode_t fmode)
+static void nfs_set_open_stateid_locked(struct nfs4_state *state,
+ const nfs4_stateid *stateid, fmode_t fmode,
+ nfs4_stateid *freeme)
{
switch (fmode) {
case FMODE_READ:
@@ -1479,14 +1482,18 @@ static void nfs_set_open_stateid_locked(struct nfs4_state *state, nfs4_stateid *
case FMODE_READ|FMODE_WRITE:
set_bit(NFS_O_RDWR_STATE, &state->flags);
}
- if (!nfs_need_update_open_stateid(state, stateid))
+ if (!nfs_need_update_open_stateid(state, stateid, freeme))
return;
if (test_bit(NFS_DELEGATED_STATE, &state->flags) == 0)
nfs4_stateid_copy(&state->stateid, stateid);
nfs4_stateid_copy(&state->open_stateid, stateid);
}

-static void __update_open_stateid(struct nfs4_state *state, nfs4_stateid *open_stateid, const nfs4_stateid *deleg_stateid, fmode_t fmode)
+static void __update_open_stateid(struct nfs4_state *state,
+ const nfs4_stateid *open_stateid,
+ const nfs4_stateid *deleg_stateid,
+ fmode_t fmode,
+ nfs4_stateid *freeme)
{
/*
* Protect the call to nfs4_state_set_mode_locked and
@@ -1499,16 +1506,22 @@ static void __update_open_stateid(struct nfs4_state *state, nfs4_stateid *open_s
set_bit(NFS_DELEGATED_STATE, &state->flags);
}
if (open_stateid != NULL)
- nfs_set_open_stateid_locked(state, open_stateid, fmode);
+ nfs_set_open_stateid_locked(state, open_stateid, fmode, freeme);
write_sequnlock(&state->seqlock);
update_open_stateflags(state, fmode);
spin_unlock(&state->owner->so_lock);
}

-static int update_open_stateid(struct nfs4_state *state, nfs4_stateid *open_stateid, nfs4_stateid *delegation, fmode_t fmode)
+static int update_open_stateid(struct nfs4_state *state,
+ const nfs4_stateid *open_stateid,
+ const nfs4_stateid *delegation,
+ fmode_t fmode)
{
+ struct nfs_server *server = NFS_SERVER(state->inode);
+ struct nfs_client *clp = server->nfs_client;
struct nfs_inode *nfsi = NFS_I(state->inode);
struct nfs_delegation *deleg_cur;
+ nfs4_stateid freeme = {0};
int ret = 0;

fmode &= (FMODE_READ|FMODE_WRITE);
@@ -1530,7 +1543,8 @@ static int update_open_stateid(struct nfs4_state *state, nfs4_stateid *open_stat
goto no_delegation_unlock;

nfs_mark_delegation_referenced(deleg_cur);
- __update_open_stateid(state, open_stateid, &deleg_cur->stateid, fmode);
+ __update_open_stateid(state, open_stateid, &deleg_cur->stateid,
+ fmode, &freeme);
ret = 1;
no_delegation_unlock:
spin_unlock(&deleg_cur->lock);
@@ -1538,11 +1552,14 @@ no_delegation:
rcu_read_unlock();

if (!ret && open_stateid != NULL) {
- __update_open_stateid(state, open_stateid, NULL, fmode);
+ __update_open_stateid(state, open_stateid, NULL, fmode, &freeme);
ret = 1;
}
if (test_bit(NFS_STATE_RECLAIM_NOGRACE, &state->flags))
- nfs4_schedule_state_manager(state->owner->so_server->nfs_client);
+ nfs4_schedule_state_manager(clp);
+ if (freeme.type != 0)
+ nfs4_test_and_free_stateid(server, &freeme,
+ state->owner->so_cred);

return ret;
}
--
2.7.4


2016-09-22 17:39:57

by Trond Myklebust

Subject: [PATCH v7 21/31] NFSv4: Pass the stateid to the exception handler in nfs4_read/write_done_cb

The actual stateid used in the READ or WRITE can represent a delegation,
a lock or an open stateid, so it is useful to pass it as an argument to
the exception handler when an expired/revoked response is received from
the server. Doing so also ensures that we don't re-label the state as
needing recovery if that has already occurred.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 35 +++++++++++++++++++++++++----------
1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index e5b35e57a8ff..1ddfb300addf 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -4523,11 +4523,18 @@ static int nfs4_read_done_cb(struct rpc_task *task, struct nfs_pgio_header *hdr)
struct nfs_server *server = NFS_SERVER(hdr->inode);

trace_nfs4_read(hdr, task->tk_status);
- if (nfs4_async_handle_error(task, server,
- hdr->args.context->state,
- NULL) == -EAGAIN) {
- rpc_restart_call_prepare(task);
- return -EAGAIN;
+ if (task->tk_status < 0) {
+ struct nfs4_exception exception = {
+ .inode = hdr->inode,
+ .state = hdr->args.context->state,
+ .stateid = &hdr->args.stateid,
+ };
+ task->tk_status = nfs4_async_handle_exception(task,
+ server, task->tk_status, &exception);
+ if (exception.retry) {
+ rpc_restart_call_prepare(task);
+ return -EAGAIN;
+ }
}

__nfs4_read_done_cb(hdr);
@@ -4596,11 +4603,19 @@ static int nfs4_write_done_cb(struct rpc_task *task,
struct inode *inode = hdr->inode;

trace_nfs4_write(hdr, task->tk_status);
- if (nfs4_async_handle_error(task, NFS_SERVER(inode),
- hdr->args.context->state,
- NULL) == -EAGAIN) {
- rpc_restart_call_prepare(task);
- return -EAGAIN;
+ if (task->tk_status < 0) {
+ struct nfs4_exception exception = {
+ .inode = hdr->inode,
+ .state = hdr->args.context->state,
+ .stateid = &hdr->args.stateid,
+ };
+ task->tk_status = nfs4_async_handle_exception(task,
+ NFS_SERVER(inode), task->tk_status,
+ &exception);
+ if (exception.retry) {
+ rpc_restart_call_prepare(task);
+ return -EAGAIN;
+ }
}
if (task->tk_status >= 0) {
renew_lease(NFS_SERVER(inode), hdr->timestamp);
--
2.7.4


2016-09-22 17:39:56

by Trond Myklebust

Subject: [PATCH v7 20/31] NFSv4.1: nfs4_layoutget_handle_exception handle revoked state

Handle revoked open/lock/delegation stateids when LAYOUTGET tells us
the state was revoked.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index f0c5442401d7..e5b35e57a8ff 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -8069,6 +8069,8 @@ nfs4_layoutget_handle_exception(struct rpc_task *task,
case -NFS4ERR_RECALLCONFLICT:
status = -ERECALLCONFLICT;
break;
+ case -NFS4ERR_DELEG_REVOKED:
+ case -NFS4ERR_ADMIN_REVOKED:
case -NFS4ERR_EXPIRED:
case -NFS4ERR_BAD_STATEID:
exception->timeout = 0;
@@ -8080,6 +8082,7 @@ nfs4_layoutget_handle_exception(struct rpc_task *task,
&lgp->args.ctx->state->stateid)) {
spin_unlock(&inode->i_lock);
exception->state = lgp->args.ctx->state;
+ exception->stateid = &lgp->args.stateid;
break;
}

--
2.7.4


2016-09-22 17:40:01

by Trond Myklebust

Subject: [PATCH v7 24/31] NFS: Always call nfs_inode_find_state_and_recover() when revoking a delegation

Don't rely on nfs_inode_detach_delegation() succeeding. That can race...

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/delegation.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 094e0efe6a82..dff600ae0d74 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -662,18 +662,24 @@ static bool nfs_revoke_delegation(struct inode *inode,
const nfs4_stateid *stateid)
{
struct nfs_delegation *delegation;
+ nfs4_stateid tmp;
bool ret = false;

rcu_read_lock();
delegation = rcu_dereference(NFS_I(inode)->delegation);
if (delegation == NULL)
goto out;
- if (stateid && !nfs4_stateid_match(stateid, &delegation->stateid))
+ if (stateid == NULL) {
+ nfs4_stateid_copy(&tmp, &delegation->stateid);
+ stateid = &tmp;
+ } else if (!nfs4_stateid_match(stateid, &delegation->stateid))
goto out;
nfs_mark_delegation_revoked(NFS_SERVER(inode), delegation);
ret = true;
out:
rcu_read_unlock();
+ if (ret)
+ nfs_inode_find_state_and_recover(inode, stateid);
return ret;
}

@@ -685,10 +691,8 @@ void nfs_remove_bad_delegation(struct inode *inode,
if (!nfs_revoke_delegation(inode, stateid))
return;
delegation = nfs_inode_detach_delegation(inode);
- if (delegation) {
- nfs_inode_find_state_and_recover(inode, &delegation->stateid);
+ if (delegation)
nfs_free_delegation(delegation);
- }
}
EXPORT_SYMBOL_GPL(nfs_remove_bad_delegation);

--
2.7.4
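
For illustration, here is a minimal user-space sketch of the ordering problem this patch addresses: if recovery is only scheduled when the detach succeeds, a racing detach on another path means no recovery is scheduled at all. The structure and helper names below are hypothetical models, not the kernel's.

#include <stdbool.h>
#include <stdio.h>

struct delegation_model { bool attached; };

/* Models nfs_inode_detach_delegation(): only the caller that actually
 * detaches the delegation gets a non-NULL/true result. */
static bool detach(struct delegation_model *d)
{
	if (!d->attached)
		return false;
	d->attached = false;
	return true;
}

static void recover(const char *who)
{
	printf("%s: state recovery scheduled\n", who);
}

int main(void)
{
	struct delegation_model d = { .attached = true };

	/* Another path (e.g. a delegreturn) detaches first. */
	detach(&d);

	/* Old logic: recovery keyed off a successful detach, so it is skipped. */
	if (detach(&d))
		recover("old");

	/* New logic: recovery keyed off the revoke itself, using a copy of
	 * the stateid taken before any detach. */
	recover("new");
	return 0;
}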


2016-09-22 17:39:59

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH v7 22/31] NFSv4: Fix a race in nfs_inode_reclaim_delegation()

If we race with a delegreturn before taking the spin lock, we
currently end up dropping the delegation stateid.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/delegation.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 5de4cfb2ab07..094e0efe6a82 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -195,15 +195,13 @@ void nfs_inode_reclaim_delegation(struct inode *inode, struct rpc_cred *cred,
rcu_read_unlock();
put_rpccred(oldcred);
trace_nfs4_reclaim_delegation(inode, res->delegation_type);
- } else {
- /* We appear to have raced with a delegation return. */
- spin_unlock(&delegation->lock);
- rcu_read_unlock();
- nfs_inode_set_delegation(inode, cred, res);
+ return;
}
- } else {
- rcu_read_unlock();
+ /* We appear to have raced with a delegation return. */
+ spin_unlock(&delegation->lock);
}
+ rcu_read_unlock();
+ nfs_inode_set_delegation(inode, cred, res);
}

static int nfs_do_return_delegation(struct inode *inode, struct nfs_delegation *delegation, int issync)
--
2.7.4


2016-09-22 17:40:03

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH v7 26/31] NFSv4: Don't test open_stateid unless it is set

We need to test the NFS_OPEN_STATE flag for whether or not the
open_stateid is valid.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 552bea2af811..995b646c9aea 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2587,6 +2587,11 @@ static int nfs41_check_open_stateid(struct nfs4_state *state)
struct rpc_cred *cred = state->owner->so_cred;
int status;

+ if (test_bit(NFS_OPEN_STATE, &state->flags) == 0) {
+ if (test_bit(NFS_DELEGATED_STATE, &state->flags) == 0)
+ return NFS_OK;
+ return -NFS4ERR_BAD_STATEID;
+ }
/* If a state reset has been done, test_stateid is unneeded */
if ((test_bit(NFS_O_RDONLY_STATE, &state->flags) == 0) &&
(test_bit(NFS_O_WRONLY_STATE, &state->flags) == 0) &&
--
2.7.4


2016-09-22 17:40:04

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH v7 27/31] NFSv4: Mark the lock and open stateids as invalid after freeing them

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 995b646c9aea..2e1f9c30b805 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2560,6 +2560,7 @@ static int nfs41_check_expired_locks(struct nfs4_state *state)
if (status == -NFS4ERR_EXPIRED ||
status == -NFS4ERR_BAD_STATEID) {
clear_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags);
+ lsp->ls_stateid.type = NFS4_INVALID_STATEID_TYPE;
if (!recover_lost_locks)
set_bit(NFS_LOCK_LOST, &lsp->ls_flags);
} else if (status != NFS_OK) {
@@ -2605,6 +2606,7 @@ static int nfs41_check_open_stateid(struct nfs4_state *state)
clear_bit(NFS_O_WRONLY_STATE, &state->flags);
clear_bit(NFS_O_RDWR_STATE, &state->flags);
clear_bit(NFS_OPEN_STATE, &state->flags);
+ stateid->type = NFS4_INVALID_STATEID_TYPE;
}
return status;
}
--
2.7.4


2016-09-22 17:40:05

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH v7 28/31] NFSv4: Open state recovery must account for file permission changes

If the file permissions change on the server, then we may not be able to
recover open state. If so, we need to ensure that we mark the file
descriptor appropriately.

Cc: [email protected]
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4state.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 4b538bb194f8..0a25f70a3b0b 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1532,6 +1532,9 @@ restart:
__func__, status);
case -ENOENT:
case -ENOMEM:
+ case -EACCES:
+ case -EROFS:
+ case -EIO:
case -ESTALE:
/* Open state on this file cannot be recovered */
nfs4_state_mark_recovery_failed(state, status);
--
2.7.4
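
As a rough model of the switch being extended here: errors after which there is no point retrying the open are grouped by fall-through and all end in "recovery failed", while everything else stays on the normal retry paths. The function below is a hypothetical stand-in, not kernel code.

#include <errno.h>
#include <stdbool.h>

/* Returns true if open-state recovery should be abandoned for this error. */
static bool open_recovery_is_fatal(int err)
{
	switch (err) {
	case -ENOENT:
	case -ENOMEM:
	case -EACCES:	/* file permissions changed on the server */
	case -EROFS:
	case -EIO:
	case -ESTALE:
		return true;	/* open state on this file cannot be recovered */
	default:
		return false;	/* leave it to the other recovery/retry paths */
	}
}

int main(void)
{
	return open_recovery_is_fatal(-EACCES) ? 0 : 1;
}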


2016-09-22 17:40:07

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH v7 25/31] NFSv4: nfs4_do_handle_exception() handle revoke/expiry of a single stateid

If we're not yet sure that all state has expired or been revoked, we
should try to do a minimal recovery on just the one stateid.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 338684195e95..552bea2af811 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -397,13 +397,23 @@ static int nfs4_do_handle_exception(struct nfs_server *server,
exception->delay = 0;
exception->recovering = 0;
exception->retry = 0;
+
+ if (stateid == NULL && state != NULL)
+ stateid = &state->stateid;
+
switch(errorcode) {
case 0:
return 0;
- case -NFS4ERR_OPENMODE:
case -NFS4ERR_DELEG_REVOKED:
case -NFS4ERR_ADMIN_REVOKED:
+ case -NFS4ERR_EXPIRED:
case -NFS4ERR_BAD_STATEID:
+ if (inode != NULL && stateid != NULL) {
+ nfs_inode_find_state_and_recover(inode,
+ stateid);
+ goto wait_on_recovery;
+ }
+ case -NFS4ERR_OPENMODE:
if (inode) {
int err;

@@ -422,12 +432,6 @@ static int nfs4_do_handle_exception(struct nfs_server *server,
if (ret < 0)
break;
goto wait_on_recovery;
- case -NFS4ERR_EXPIRED:
- if (state != NULL) {
- ret = nfs4_schedule_stateid_recovery(server, state);
- if (ret < 0)
- break;
- }
case -NFS4ERR_STALE_STATEID:
case -NFS4ERR_STALE_CLIENTID:
nfs4_schedule_lease_recovery(clp);
--
2.7.4
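
A deliberately simplified model of the dispatch order this introduces: if we know which inode and which stateid triggered the error, recover just that stateid; only fall back to heavier recovery when there is no specific target. The helpers below are illustrative stand-ins, not the kernel's functions.

#include <stdio.h>
#include <stddef.h>

struct stateid { unsigned char data[16]; };
struct inode { int dummy; };
struct open_state { struct stateid stateid; };

static void recover_one_stateid(struct inode *inode, const struct stateid *sid)
{
	(void)inode; (void)sid;
	printf("minimal recovery: re-test/re-open just this stateid\n");
}

static void recover_whole_lease(void)
{
	printf("full recovery: recover the lease and all of its state\n");
}

static void handle_revoked(struct inode *inode, struct open_state *state,
			   const struct stateid *stateid)
{
	/* Fall back to the open stateid if the caller didn't name one. */
	if (stateid == NULL && state != NULL)
		stateid = &state->stateid;

	if (inode != NULL && stateid != NULL)
		recover_one_stateid(inode, stateid);
	else
		recover_whole_lease();
}

int main(void)
{
	struct inode ino;
	struct open_state st;

	handle_revoked(&ino, &st, NULL);	/* known target: minimal path */
	handle_revoked(NULL, NULL, NULL);	/* no target: heavier path    */
	return 0;
}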


2016-09-22 17:39:50

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH v7 15/31] NFSv4.1: Ensure we call FREE_STATEID if needed on close/delegreturn/locku

If a server returns NFS4ERR_ADMIN_REVOKED, NFS4ERR_DELEG_REVOKED
or NFS4ERR_EXPIRED on a call to close, open_downgrade, delegreturn, or
locku, we should call FREE_STATEID before attempting to recover.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 43 ++++++++++++++++++++++++++++++++++++++++---
1 file changed, 40 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index a53caeb73f75..f79d4f55d83f 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -328,6 +328,33 @@ static void nfs4_setup_readdir(u64 cookie, __be32 *verifier, struct dentry *dent
kunmap_atomic(start);
}

+static void nfs4_test_and_free_stateid(struct nfs_server *server,
+ nfs4_stateid *stateid,
+ struct rpc_cred *cred)
+{
+ const struct nfs4_minor_version_ops *ops = server->nfs_client->cl_mvops;
+
+ ops->test_and_free_expired(server, stateid, cred);
+}
+
+static void __nfs4_free_revoked_stateid(struct nfs_server *server,
+ nfs4_stateid *stateid,
+ struct rpc_cred *cred)
+{
+ stateid->type = NFS4_REVOKED_STATEID_TYPE;
+ nfs4_test_and_free_stateid(server, stateid, cred);
+}
+
+static void nfs4_free_revoked_stateid(struct nfs_server *server,
+ const nfs4_stateid *stateid,
+ struct rpc_cred *cred)
+{
+ nfs4_stateid tmp;
+
+ nfs4_stateid_copy(&tmp, stateid);
+ __nfs4_free_revoked_stateid(server, &tmp, cred);
+}
+
static long nfs4_update_delay(long *timeout)
{
long ret;
@@ -2983,9 +3010,12 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
break;
case -NFS4ERR_ADMIN_REVOKED:
case -NFS4ERR_STALE_STATEID:
+ case -NFS4ERR_EXPIRED:
+ nfs4_free_revoked_stateid(server,
+ &calldata->arg.stateid,
+ task->tk_msg.rpc_cred);
case -NFS4ERR_OLD_STATEID:
case -NFS4ERR_BAD_STATEID:
- case -NFS4ERR_EXPIRED:
if (!nfs4_stateid_match(&calldata->arg.stateid,
&state->open_stateid)) {
rpc_restart_call_prepare(task);
@@ -5477,10 +5507,13 @@ static void nfs4_delegreturn_done(struct rpc_task *task, void *calldata)
renew_lease(data->res.server, data->timestamp);
case -NFS4ERR_ADMIN_REVOKED:
case -NFS4ERR_DELEG_REVOKED:
+ case -NFS4ERR_EXPIRED:
+ nfs4_free_revoked_stateid(data->res.server,
+ data->args.stateid,
+ task->tk_msg.rpc_cred);
case -NFS4ERR_BAD_STATEID:
case -NFS4ERR_OLD_STATEID:
case -NFS4ERR_STALE_STATEID:
- case -NFS4ERR_EXPIRED:
task->tk_status = 0;
if (data->roc)
pnfs_roc_set_barrier(data->inode, data->roc_barrier);
@@ -5745,10 +5778,14 @@ static void nfs4_locku_done(struct rpc_task *task, void *data)
if (nfs4_update_lock_stateid(calldata->lsp,
&calldata->res.stateid))
break;
+ case -NFS4ERR_ADMIN_REVOKED:
+ case -NFS4ERR_EXPIRED:
+ nfs4_free_revoked_stateid(calldata->server,
+ &calldata->arg.stateid,
+ task->tk_msg.rpc_cred);
case -NFS4ERR_BAD_STATEID:
case -NFS4ERR_OLD_STATEID:
case -NFS4ERR_STALE_STATEID:
- case -NFS4ERR_EXPIRED:
if (!nfs4_stateid_match(&calldata->arg.stateid,
&calldata->lsp->ls_stateid))
rpc_restart_call_prepare(task);
--
2.7.4
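
The helpers added at the top of this diff follow a small pattern worth spelling out: the caller may only hold a const (possibly shared) stateid, so the code copies it to a local, marks the copy as revoked, and hands the copy to TEST/FREE_STATEID. A minimal user-space model, with illustrative types and names:

#include <string.h>
#include <stdio.h>

enum stateid_type { STATEID_OPEN, STATEID_REVOKED };

struct stateid {
	unsigned char data[16];
	enum stateid_type type;
};

static void test_and_free(struct stateid *sid)
{
	/* Would issue TEST_STATEID and, if revoked, FREE_STATEID. */
	printf("freeing stateid of type %d\n", sid->type);
}

/* Mirrors the shape of nfs4_free_revoked_stateid(): never writes through
 * the caller's stateid. */
static void free_revoked_stateid(const struct stateid *sid)
{
	struct stateid tmp;

	memcpy(&tmp, sid, sizeof(tmp));
	tmp.type = STATEID_REVOKED;
	test_and_free(&tmp);
}

int main(void)
{
	struct stateid sid = { .type = STATEID_OPEN };

	free_revoked_stateid(&sid);
	return 0;
}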


2016-09-22 17:39:53

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH v7 18/31] NFSv4: nfs4_handle_delegation_recall_error() handle expiration as revoke case

If the server tells us our stateid has expired, then handle that as if
it was revoked.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index f79d4f55d83f..75dc389c9719 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1919,7 +1919,6 @@ static int nfs4_handle_delegation_recall_error(struct nfs_server *server, struct
case -NFS4ERR_STALE_CLIENTID:
case -NFS4ERR_STALE_STATEID:
set_bit(NFS_DELEGATED_STATE, &state->flags);
- case -NFS4ERR_EXPIRED:
/* Don't recall a delegation if it was lost */
nfs4_schedule_lease_recovery(server->nfs_client);
return -EAGAIN;
@@ -1931,6 +1930,7 @@ static int nfs4_handle_delegation_recall_error(struct nfs_server *server, struct
return -EAGAIN;
case -NFS4ERR_DELEG_REVOKED:
case -NFS4ERR_ADMIN_REVOKED:
+ case -NFS4ERR_EXPIRED:
case -NFS4ERR_BAD_STATEID:
case -NFS4ERR_OPENMODE:
nfs_inode_find_state_and_recover(state->inode,
--
2.7.4


2016-09-22 17:40:07

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH v7 30/31] NFSv4: If recovery failed for a specific open stateid, then don't retry

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4state.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 0a25f70a3b0b..5f4281ec5f72 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -991,6 +991,8 @@ int nfs4_select_rw_stateid(struct nfs4_state *state,
{
int ret;

+ if (!nfs4_valid_open_stateid(state))
+ return -EIO;
if (cred != NULL)
*cred = NULL;
ret = nfs4_copy_lock_stateid(dst, state, lockowner);
@@ -1303,6 +1305,8 @@ void nfs4_schedule_path_down_recovery(struct nfs_client *clp)
static int nfs4_state_mark_reclaim_reboot(struct nfs_client *clp, struct nfs4_state *state)
{

+ if (!nfs4_valid_open_stateid(state))
+ return 0;
set_bit(NFS_STATE_RECLAIM_REBOOT, &state->flags);
/* Don't recover state that expired before the reboot */
if (test_bit(NFS_STATE_RECLAIM_NOGRACE, &state->flags)) {
@@ -1316,6 +1320,8 @@ static int nfs4_state_mark_reclaim_reboot(struct nfs_client *clp, struct nfs4_st

int nfs4_state_mark_reclaim_nograce(struct nfs_client *clp, struct nfs4_state *state)
{
+ if (!nfs4_valid_open_stateid(state))
+ return 0;
set_bit(NFS_STATE_RECLAIM_NOGRACE, &state->flags);
clear_bit(NFS_STATE_RECLAIM_REBOOT, &state->flags);
set_bit(NFS_OWNER_RECLAIM_NOGRACE, &state->owner->so_flags);
@@ -1327,9 +1333,8 @@ int nfs4_schedule_stateid_recovery(const struct nfs_server *server, struct nfs4_
{
struct nfs_client *clp = server->nfs_client;

- if (!nfs4_valid_open_stateid(state))
+ if (!nfs4_state_mark_reclaim_nograce(clp, state))
return -EBADF;
- nfs4_state_mark_reclaim_nograce(clp, state);
dprintk("%s: scheduling stateid recovery for server %s\n", __func__,
clp->cl_hostname);
nfs4_schedule_state_manager(clp);
@@ -1380,15 +1385,14 @@ void nfs_inode_find_state_and_recover(struct inode *inode,
state = ctx->state;
if (state == NULL)
continue;
- if (nfs4_stateid_match_other(&state->stateid, stateid)) {
- nfs4_state_mark_reclaim_nograce(clp, state);
+ if (nfs4_stateid_match_other(&state->stateid, stateid) &&
+ nfs4_state_mark_reclaim_nograce(clp, state)) {
found = true;
continue;
}
- if (nfs_state_lock_state_matches_stateid(state, stateid)) {
- nfs4_state_mark_reclaim_nograce(clp, state);
+ if (nfs_state_lock_state_matches_stateid(state, stateid) &&
+ nfs4_state_mark_reclaim_nograce(clp, state))
found = true;
- }
}
spin_unlock(&inode->i_lock);

--
2.7.4
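
The shape of this change: nfs4_state_mark_reclaim_nograce() now reports whether the state was actually worth marking (it bails out for an already-invalidated open stateid), and the callers key "schedule the state manager / keep retrying" off that return value. A hedged boolean-gate sketch, not the kernel code:

#include <stdbool.h>
#include <stdio.h>

struct open_state {
	bool valid;	/* cleared once recovery has already failed */
	bool reclaim;	/* "needs nograce reclaim" marker            */
};

/* Returns true only if recovery is still worth attempting. */
static bool mark_reclaim_nograce(struct open_state *st)
{
	if (!st->valid)
		return false;
	st->reclaim = true;
	return true;
}

int main(void)
{
	struct open_state st = { .valid = false };

	/* The old behaviour marked and rescheduled regardless, so a stateid
	 * whose recovery had already failed could be retried forever. */
	if (mark_reclaim_nograce(&st))
		printf("schedule state manager\n");
	else
		printf("give up: recovery already failed for this stateid\n");
	return 0;
}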


2016-09-22 17:40:11

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH v7 29/31] NFSv4: Fix retry issues with nfs41_test/free_stateid

_nfs41_free_stateid() needs to be cached by the session, but
nfs41_test_stateid() may return NFS4ERR_RETRY_UNCACHED_REP (in which
case we should just retry).

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 2e1f9c30b805..72858f900abf 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -8730,6 +8730,7 @@ static void nfs4_handle_delay_or_session_error(struct nfs_server *server,
exception->retry = 0;
switch(err) {
case -NFS4ERR_DELAY:
+ case -NFS4ERR_RETRY_UNCACHED_REP:
nfs4_handle_exception(server, err, exception);
break;
case -NFS4ERR_BADSESSION:
@@ -8835,7 +8836,7 @@ static struct rpc_task *_nfs41_free_stateid(struct nfs_server *server,

msg.rpc_argp = &data->args;
msg.rpc_resp = &data->res;
- nfs4_init_sequence(&data->args.seq_args, &data->res.seq_res, 0);
+ nfs4_init_sequence(&data->args.seq_args, &data->res.seq_res, 1);
if (privileged)
nfs4_set_sequence_privileged(&data->args.seq_args);

--
2.7.4
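
The two halves of this fix: FREE_STATEID mutates server state, so the nfs4_init_sequence(..., 1) change asks for its reply to be cached in the session slot, while TEST_STATEID is not cached, so an NFS4ERR_RETRY_UNCACHED_REP answer just means the request should be re-sent. A toy retry loop for the TEST_STATEID side, with a stubbed call rather than the kernel's API (error values per RFC 5661):

#include <stdio.h>

#define NFS4ERR_DELAY              10008
#define NFS4ERR_RETRY_UNCACHED_REP 10068

/* Stub: pretend the first attempt hits an uncached-replay slot. */
static int send_test_stateid(int attempt)
{
	return attempt == 0 ? -NFS4ERR_RETRY_UNCACHED_REP : 0;
}

int main(void)
{
	int attempt = 0, err;

	do {
		err = send_test_stateid(attempt++);
		/* A real client would also back off on NFS4ERR_DELAY. */
	} while (err == -NFS4ERR_RETRY_UNCACHED_REP);

	printf("TEST_STATEID result after %d attempt(s): %d\n", attempt, err);
	return 0;
}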


2016-09-22 17:39:52

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH v7 17/31] NFSv4: nfs_inode_find_state_and_recover() should check all stateids

Modify the helper nfs_inode_find_state_and_recover() so that it
can check all open/lock/delegation state trackers on that inode for
whether or not they are affected by a revoked stateid error.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/delegation.c | 19 +++++++++++++++++++
fs/nfs/delegation.h | 2 ++
fs/nfs/nfs4state.c | 44 +++++++++++++++++++++++++++++++++++++++-----
3 files changed, 60 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 484f14700108..5de4cfb2ab07 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -1001,6 +1001,25 @@ restart:
rcu_read_unlock();
}

+void nfs_inode_find_delegation_state_and_recover(struct inode *inode,
+ const nfs4_stateid *stateid)
+{
+ struct nfs_client *clp = NFS_SERVER(inode)->nfs_client;
+ struct nfs_delegation *delegation;
+ bool found = false;
+
+ rcu_read_lock();
+ delegation = rcu_dereference(NFS_I(inode)->delegation);
+ if (delegation &&
+ nfs4_stateid_match_other(&delegation->stateid, stateid)) {
+ nfs_mark_test_expired_delegation(NFS_SERVER(inode), delegation);
+ found = true;
+ }
+ rcu_read_unlock();
+ if (found)
+ nfs4_schedule_state_manager(clp);
+}
+
/**
* nfs_delegations_present - check for existence of delegations
* @clp: client state handle
diff --git a/fs/nfs/delegation.h b/fs/nfs/delegation.h
index 1442e3b1521d..e9d555796873 100644
--- a/fs/nfs/delegation.h
+++ b/fs/nfs/delegation.h
@@ -66,6 +66,8 @@ void nfs_mark_delegation_referenced(struct nfs_delegation *delegation);
int nfs4_have_delegation(struct inode *inode, fmode_t flags);
int nfs4_check_delegation(struct inode *inode, fmode_t flags);
bool nfs4_delegation_flush_on_close(const struct inode *inode);
+void nfs_inode_find_delegation_state_and_recover(struct inode *inode,
+ const nfs4_stateid *stateid);

#endif

diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 63da0411e2af..4b538bb194f8 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1337,6 +1337,35 @@ int nfs4_schedule_stateid_recovery(const struct nfs_server *server, struct nfs4_
}
EXPORT_SYMBOL_GPL(nfs4_schedule_stateid_recovery);

+static struct nfs4_lock_state *
+nfs_state_find_lock_state_by_stateid(struct nfs4_state *state,
+ const nfs4_stateid *stateid)
+{
+ struct nfs4_lock_state *pos;
+
+ list_for_each_entry(pos, &state->lock_states, ls_locks) {
+ if (!test_bit(NFS_LOCK_INITIALIZED, &pos->ls_flags))
+ continue;
+ if (nfs4_stateid_match_other(&pos->ls_stateid, stateid))
+ return pos;
+ }
+ return NULL;
+}
+
+static bool nfs_state_lock_state_matches_stateid(struct nfs4_state *state,
+ const nfs4_stateid *stateid)
+{
+ bool found = false;
+
+ if (test_bit(LK_STATE_IN_USE, &state->flags)) {
+ spin_lock(&state->state_lock);
+ if (nfs_state_find_lock_state_by_stateid(state, stateid))
+ found = true;
+ spin_unlock(&state->state_lock);
+ }
+ return found;
+}
+
void nfs_inode_find_state_and_recover(struct inode *inode,
const nfs4_stateid *stateid)
{
@@ -1351,14 +1380,19 @@ void nfs_inode_find_state_and_recover(struct inode *inode,
state = ctx->state;
if (state == NULL)
continue;
- if (!test_bit(NFS_DELEGATED_STATE, &state->flags))
- continue;
- if (!nfs4_stateid_match(&state->stateid, stateid))
+ if (nfs4_stateid_match_other(&state->stateid, stateid)) {
+ nfs4_state_mark_reclaim_nograce(clp, state);
+ found = true;
continue;
- nfs4_state_mark_reclaim_nograce(clp, state);
- found = true;
+ }
+ if (nfs_state_lock_state_matches_stateid(state, stateid)) {
+ nfs4_state_mark_reclaim_nograce(clp, state);
+ found = true;
+ }
}
spin_unlock(&inode->i_lock);
+
+ nfs_inode_find_delegation_state_and_recover(inode, stateid);
if (found)
nfs4_schedule_state_manager(clp);
}
--
2.7.4
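
A detail that matters throughout this series: an NFSv4 stateid is a 4-byte seqid plus a 12-byte "other" field (stateid4 in RFC 5661), and the revoked-state checks here compare only "other" (nfs4_stateid_match_other), since the seqid portion changes as the state is modified. A small self-contained illustration of the two comparisons, using stand-in structs rather than the kernel's:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct stateid {
	uint32_t seqid;
	unsigned char other[12];
};

static bool stateid_match(const struct stateid *a, const struct stateid *b)
{
	return a->seqid == b->seqid &&
	       memcmp(a->other, b->other, sizeof(a->other)) == 0;
}

/* Ignore seqid: the server bumps it on state-changing operations, but the
 * 'other' field identifies the state for its whole lifetime. */
static bool stateid_match_other(const struct stateid *a, const struct stateid *b)
{
	return memcmp(a->other, b->other, sizeof(a->other)) == 0;
}

int main(void)
{
	struct stateid s1 = { .seqid = 1, .other = "abcdefghijk" };
	struct stateid s2 = s1;

	s2.seqid = 5;	/* e.g. after a later OPEN on the same file */
	printf("match=%d match_other=%d\n",
	       stateid_match(&s1, &s2), stateid_match_other(&s1, &s2));
	return 0;
}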


2016-09-24 20:38:59

by Oleg Drokin

[permalink] [raw]
Subject: Re: [PATCH v7 00/31] Fix delegation behaviour when server revokes some state


On Sep 22, 2016, at 1:38 PM, Trond Myklebust wrote:

> According to RFC5661, if any of the SEQUENCE status bits
> SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED,
> SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED, SEQ4_STATUS_ADMIN_STATE_REVOKED,
> or SEQ4_STATUS_RECALLABLE_STATE_REVOKED are set, then we need to use
> TEST_STATEID to figure out which stateids have been revoked, so we
> can acknowledge the loss of state using FREE_STATEID.
>
> While we already do this for open and lock state, we have not been doing
> so for all the delegations.
>
> v2: nfs_v4_2_minor_ops needs to set .test_and_free_expired too
> v3: Now with added lock revoke fixes and close/delegreturn/locku fixes
> v4: Close a bunch of corner cases
> v5: Report revoked delegations as invalid in nfs_have_delegation()
> Fix an infinite loop in nfs_reap_expired_delegations.
> Fixes for other looping behaviour
> v6: Fix nfs4_do_handle_exception to handle all stateids, not just delegations
> Stable fix for nfs4_copy_delegation_stateid
> Marked fix "NFSv4: Don't report revoked delegations as valid in
> nfs_have_delegation" for stable.
> Stable fix for the inode mode/fileid corruption
> v7: Fix handling of NFS4ERR_OPENMODE
> Handle matching stateids that have set seqid==0

I just gave this another whirl and it holds up great, so I guess

Tested-by: Oleg Drokin <[email protected]>

>
>
> Many thanks to Oleg Drokin for all his time helping to test and debug these
> issues!
>
>
> Trond Myklebust (31):
> NFSv4.1: Don't deadlock the state manager on the SEQUENCE status flags
> NFS: Fix inode corruption in nfs_prime_dcache()
> NFSv4: Don't report revoked delegations as valid in
> nfs_have_delegation()
> NFSv4: nfs4_copy_delegation_stateid() must fail if the delegation is
> invalid
> NFSv4.1: Don't check delegations that are already marked as revoked
> NFSv4.1: Allow test_stateid to handle session errors without waiting
> NFSv4.1: Add a helper function to deal with expired stateids
> NFSv4.x: Allow callers of nfs_remove_bad_delegation() to specify a
> stateid
> NFSv4.1: Test delegation stateids when server declares "some state
> revoked"
> NFSv4.1: Deal with server reboots during delegation expiration
> recovery
> NFSv4.1: Don't recheck delegations that have already been checked
> NFSv4.1: Allow revoked stateids to skip the call to TEST_STATEID
> NFSv4.1: Ensure we always run TEST/FREE_STATEID on locks
> NFSv4.1: FREE_STATEID can be asynchronous
> NFSv4.1: Ensure we call FREE_STATEID if needed on
> close/delegreturn/locku
> NFSv4: Ensure we don't re-test revoked and freed stateids
> NFSv4: nfs_inode_find_state_and_recover() should check all stateids
> NFSv4: nfs4_handle_delegation_recall_error() handle expiration as
> revoke case
> NFSv4: nfs4_handle_setlk_error() handle expiration as revoke case
> NFSv4.1: nfs4_layoutget_handle_exception handle revoked state
> NFSv4: Pass the stateid to the exception handler in
> nfs4_read/write_done_cb
> NFSv4: Fix a race in nfs_inode_reclaim_delegation()
> NFSv4: Fix a race when updating an open_stateid
> NFS: Always call nfs_inode_find_state_and_recover() when revoking a
> delegation
> NFSv4: nfs4_do_handle_exception() handle revoke/expiry of a single
> stateid
> NFSv4: Don't test open_stateid unless it is set
> NFSv4: Mark the lock and open stateids as invalid after freeing them
> NFSv4: Open state recovery must account for file permission changes
> NFSv4: Fix retry issues with nfs41_test/free_stateid
> NFSv4: If recovery failed for a specific open stateid, then don't
> retry
> NFSv4.1: Even if the stateid is OK, we may need to recover the open
> modes
>
> fs/nfs/delegation.c | 213 +++++++++++++++--
> fs/nfs/delegation.h | 8 +-
> fs/nfs/dir.c | 16 +-
> fs/nfs/flexfilelayout/flexfilelayout.c | 2 +-
> fs/nfs/nfs4_fs.h | 5 +-
> fs/nfs/nfs4proc.c | 402 +++++++++++++++++++++++----------
> fs/nfs/nfs4session.h | 1 +
> fs/nfs/nfs4state.c | 84 +++++--
> include/linux/nfs4.h | 1 +
> 9 files changed, 565 insertions(+), 167 deletions(-)
>
> --
> 2.7.4


2016-09-26 20:23:32

by Oleg Drokin

[permalink] [raw]
Subject: Re: [PATCH v7 00/31] Fix delegation behaviour when server revokes some state


On Sep 22, 2016, at 1:38 PM, Trond Myklebust wrote:

> According to RFC5661, if any of the SEQUENCE status bits
> SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED,
> SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED, SEQ4_STATUS_ADMIN_STATE_REVOKED,
> or SEQ4_STATUS_RECALLABLE_STATE_REVOKED are set, then we need to use
> TEST_STATEID to figure out which stateids have been revoked, so we
> can acknowledge the loss of state using FREE_STATEID.
>
> While we already do this for open and lock state, we have not been doing
> so for all the delegations.
>
> v2: nfs_v4_2_minor_ops needs to set .test_and_free_expired too
> v3: Now with added lock revoke fixes and close/delegreturn/locku fixes
> v4: Close a bunch of corner cases
> v5: Report revoked delegations as invalid in nfs_have_delegation()
> Fix an infinite loop in nfs_reap_expired_delegations.
> Fixes for other looping behaviour
> v6: Fix nfs4_do_handle_exception to handle all stateids, not just delegations
> Stable fix for nfs4_copy_delegation_stateid
> Marked fix "NFSv4: Don't report revoked delegations as valid in
> nfs_have_delegation" for stable.
> Stable fix for the inode mode/fileid corruption
> v7: Fix handling of NFS4ERR_OPENMODE
> Handle matching stateids that have set seqid==0



Well, it took 2.5 days this time around, but it seems some narrow
races remain.
I got the system stuck in that state again:
[257275.514588] NFS call test_stateid ffff880005304f24
[257275.514591] --> nfs41_call_sync_prepare data->seq_server ffff88005cb2e000
[257275.514591] --> nfs41_setup_sequence
[257275.514592] --> nfs4_alloc_slot used_slots=0003 highest_used=1 max_slots=31
[257275.514593] <-- nfs4_alloc_slot used_slots=0007 highest_used=2 slotid=2
[257275.514594] <-- nfs41_setup_sequence slotid=2 seqid=65520554
[257275.514612] encode_sequence: sessionid=1474859578:109:110:0 seqid=65520554 slotid=2 max_slotid=2 cache_this=0
[257275.514707] --> nfs4_alloc_slot used_slots=0007 highest_used=2 max_slots=31
[257275.514708] <-- nfs4_alloc_slot used_slots=000f highest_used=3 slotid=3
[257275.514709] nfs4_free_slot: slotid 3 highest_used_slotid 2
[257275.514710] nfs41_sequence_process: Error 0 free the slot
[257275.514711] nfs4_free_slot: slotid 2 highest_used_slotid 1
[257275.514733] NFS reply test_stateid: succeeded, 0
[257275.514738] --> nfs_put_client({2})
[257275.514751] --> nfs4_setup_sequence clp ffff88005b2e4800 session ffff880029aa8800 sr_slot 4294967295
[257275.514751] --> nfs41_setup_sequence
[257275.514752] --> nfs4_alloc_slot used_slots=0003 highest_used=1 max_slots=31
[257275.514753] <-- nfs4_alloc_slot used_slots=0007 highest_used=2 slotid=2
[257275.514754] <-- nfs41_setup_sequence slotid=2 seqid=65520555
[257275.514754] <-- nfs4_setup_sequence status=0
[257275.514761] encode_sequence: sessionid=1474859578:109:110:0 seqid=65520555 slotid=2 max_slotid=2 cache_this=1
[257275.515115] NFS: nfs_pgio_result: 17130, (status -10038)
[257275.515116] --> nfs4_alloc_slot used_slots=0007 highest_used=2 max_slots=31
[257275.515117] <-- nfs4_alloc_slot used_slots=000f highest_used=3 slotid=3
[257275.515117] nfs4_free_slot: slotid 3 highest_used_slotid 2
[257275.515118] nfs41_sequence_process: Error 0 free the slot
[257275.515118] nfs4_free_slot: slotid 2 highest_used_slotid 1
[257275.515119] nfs4_schedule_stateid_recovery: scheduling stateid recovery for server localhost
[257275.515193] NFS call test_stateid ffff880005304f24
[257275.515196] --> nfs41_call_sync_prepare data->seq_server ffff88005cb2e000
[257275.515196] --> nfs41_setup_sequence
[257275.515198] --> nfs4_alloc_slot used_slots=0003 highest_used=1 max_slots=31
[257275.515199] <-- nfs4_alloc_slot used_slots=0007 highest_used=2 slotid=2
[257275.515199] <-- nfs41_setup_sequence slotid=2 seqid=65520556
[257275.515206] encode_sequence: sessionid=1474859578:109:110:0 seqid=65520556 slotid=2 max_slotid=2 cache_this=0
[257275.515297] --> nfs4_alloc_slot used_slots=0007 highest_used=2 max_slots=31
[257275.515298] <-- nfs4_alloc_slot used_slots=000f highest_used=3 slotid=3
[257275.515299] nfs4_free_slot: slotid 3 highest_used_slotid 2
[257275.515300] nfs41_sequence_process: Error 0 free the slot
[257275.515301] nfs4_free_slot: slotid 2 highest_used_slotid 1
[257275.515319] NFS reply test_stateid: succeeded, 0
[257275.515323] --> nfs_put_client({2})
[257275.515326] --> nfs4_setup_sequence clp ffff88005b2e4800 session ffff880029aa8800 sr_slot 4294967295
[257275.515326] --> nfs41_setup_sequence
[257275.515327] --> nfs4_alloc_slot used_slots=0003 highest_used=1 max_slots=31
[257275.515328] <-- nfs4_alloc_slot used_slots=0007 highest_used=2 slotid=2
[257275.515329] <-- nfs41_setup_sequence slotid=2 seqid=65520557
[257275.515329] <-- nfs4_setup_sequence status=0
[257275.515336] encode_sequence: sessionid=1474859578:109:110:0 seqid=65520557 slotid=2 max_slotid=2 cache_this=1
[257275.515687] NFS: nfs_pgio_result: 17130, (status -10038)
[257275.515689] --> nfs4_alloc_slot used_slots=0007 highest_used=2 max_slots=31
[257275.515690] <-- nfs4_alloc_slot used_slots=000f highest_used=3 slotid=3
[257275.515690] nfs4_free_slot: slotid 3 highest_used_slotid 2
[257275.515691] nfs41_sequence_process: Error 0 free the slot
[257275.515692] nfs4_free_slot: slotid 2 highest_used_slotid 1
[257275.515703] nfs4_schedule_stateid_recovery: scheduling stateid recovery for server localhost
[257275.515793] NFS call test_stateid ffff880005304f24
[257275.515796] --> nfs41_call_sync_prepare data->seq_server ffff88005cb2e000
[257275.515797] --> nfs41_setup_sequence
[257275.515798] --> nfs4_alloc_slot used_slots=0003 highest_used=1 max_slots=31
[257275.515799] <-- nfs4_alloc_slot used_slots=0007 highest_used=2 slotid=2
[257275.515799] <-- nfs41_setup_sequence slotid=2 seqid=65520558
[257275.515806] encode_sequence: sessionid=1474859578:109:110:0 seqid=65520558 slotid=2 max_slotid=2 cache_this=0
[257275.515889] --> nfs4_alloc_slot used_slots=0007 highest_used=2 max_slots=31
[257275.515890] <-- nfs4_alloc_slot used_slots=000f highest_used=3 slotid=3
[257275.515891] nfs4_free_slot: slotid 3 highest_used_slotid 2
[257275.515891] nfs41_sequence_process: Error 0 free the slot
[257275.515892] nfs4_free_slot: slotid 2 highest_used_slotid 1
[257275.515910] NFS reply test_stateid: succeeded, 0
[257275.515914] --> nfs_put_client({2})
[257275.515917] --> nfs4_setup_sequence clp ffff88005b2e4800 session ffff880029aa8800 sr_slot 4294967295
[257275.515918] --> nfs41_setup_sequence
[257275.515919] --> nfs4_alloc_slot used_slots=0003 highest_used=1 max_slots=31
[257275.515920] <-- nfs4_alloc_slot used_slots=0007 highest_used=2 slotid=2
[257275.515920] <-- nfs41_setup_sequence slotid=2 seqid=65520559
[257275.515921] <-- nfs4_setup_sequence status=0
[257275.515928] encode_sequence: sessionid=1474859578:109:110:0 seqid=65520559 slotid=2 max_slotid=2 cache_this=1
[257275.516312] NFS: nfs_pgio_result: 17130, (status -10038)
[257275.516313] --> nfs4_alloc_slot used_slots=0007 highest_used=2 max_slots=31
[257275.516314] <-- nfs4_alloc_slot used_slots=000f highest_used=3 slotid=3
[257275.516315] nfs4_free_slot: slotid 3 highest_used_slotid 2
[257275.516315] nfs41_sequence_process: Error 0 free the slot
[257275.516316] nfs4_free_slot: slotid 2 highest_used_slotid 1
[257275.516317] nfs4_schedule_stateid_recovery: scheduling stateid recovery for server localhost

Please let me know if you need anything else extracted from a system in this state.


2016-09-26 21:04:14

by Oleg Drokin

[permalink] [raw]
Subject: Re: [PATCH v7 00/31] Fix delegation behaviour when server revokes some state


On Sep 26, 2016, at 4:46 PM, Trond Myklebust wrote:

>
>> On Sep 26, 2016, at 13:23, Oleg Drokin <[email protected]> wrote:
>>
>>
>> On Sep 22, 2016, at 1:38 PM, Trond Myklebust wrote:
>>
>>> According to RFC5661, if any of the SEQUENCE status bits
>>> SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED,
>>> SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED, SEQ4_STATUS_ADMIN_STATE_REVOKED,
>>> or SEQ4_STATUS_RECALLABLE_STATE_REVOKED are set, then we need to use
>>> TEST_STATEID to figure out which stateids have been revoked, so we
>>> can acknowledge the loss of state using FREE_STATEID.
>>>
>>> While we already do this for open and lock state, we have not been doing
>>> so for all the delegations.
>>>
>>> v2: nfs_v4_2_minor_ops needs to set .test_and_free_expired too
>>> v3: Now with added lock revoke fixes and close/delegreturn/locku fixes
>>> v4: Close a bunch of corner cases
>>> v5: Report revoked delegations as invalid in nfs_have_delegation()
>>> Fix an infinite loop in nfs_reap_expired_delegations.
>>> Fixes for other looping behaviour
>>> v6: Fix nfs4_do_handle_exception to handle all stateids, not just delegations
>>> Stable fix for nfs4_copy_delegation_stateid
>>> Marked fix "NFSv4: Don't report revoked delegations as valid in
>>> nfs_have_delegation" for stable.
>>> Stable fix for the inode mode/fileid corruption
>>> v7: Fix handling of NFS4ERR_OPENMODE
>>> Handle matching stateids that have set seqid==0
>>
>>
>>
>> Well, it took 2.5 days this time around, but it seems some narrow
>> races remain.
>> I got the system stuck in that state again:
>> [257275.514588] NFS call test_stateid ffff880005304f24
>> [257275.514591] --> nfs41_call_sync_prepare data->seq_server ffff88005cb2e000
>> [257275.514591] --> nfs41_setup_sequence
>> [257275.514592] --> nfs4_alloc_slot used_slots=0003 highest_used=1 max_slots=31
>> [257275.514593] <-- nfs4_alloc_slot used_slots=0007 highest_used=2 slotid=2
>> [257275.514594] <-- nfs41_setup_sequence slotid=2 seqid=65520554
>> [257275.514612] encode_sequence: sessionid=1474859578:109:110:0 seqid=65520554 slotid=2 max_slotid=2 cache_this=0
>> [257275.514707] --> nfs4_alloc_slot used_slots=0007 highest_used=2 max_slots=31
>> [257275.514708] <-- nfs4_alloc_slot used_slots=000f highest_used=3 slotid=3
>> [257275.514709] nfs4_free_slot: slotid 3 highest_used_slotid 2
>> [257275.514710] nfs41_sequence_process: Error 0 free the slot
>> [257275.514711] nfs4_free_slot: slotid 2 highest_used_slotid 1
>> [257275.514733] NFS reply test_stateid: succeeded, 0
>> [257275.514738] --> nfs_put_client({2})
>> [257275.514751] --> nfs4_setup_sequence clp ffff88005b2e4800 session ffff880029aa8800 sr_slot 4294967295
>> [257275.514751] --> nfs41_setup_sequence
>> [257275.514752] --> nfs4_alloc_slot used_slots=0003 highest_used=1 max_slots=31
>> [257275.514753] <-- nfs4_alloc_slot used_slots=0007 highest_used=2 slotid=2
>> [257275.514754] <-- nfs41_setup_sequence slotid=2 seqid=65520555
>> [257275.514754] <-- nfs4_setup_sequence status=0
>> [257275.514761] encode_sequence: sessionid=1474859578:109:110:0 seqid=65520555 slotid=2 max_slotid=2 cache_this=1
>> [257275.515115] NFS: nfs_pgio_result: 17130, (status -10038)
>> [257275.515116] --> nfs4_alloc_slot used_slots=0007 highest_used=2 max_slots=31
>> [257275.515117] <-- nfs4_alloc_slot used_slots=000f highest_used=3 slotid=3
>> [257275.515117] nfs4_free_slot: slotid 3 highest_used_slotid 2
>> [257275.515118] nfs41_sequence_process: Error 0 free the slot
>> [257275.515118] nfs4_free_slot: slotid 2 highest_used_slotid 1
>> [257275.515119] nfs4_schedule_stateid_recovery: scheduling stateid recovery for server localhost
>> [257275.515193] NFS call test_stateid ffff880005304f24
>> [257275.515196] --> nfs41_call_sync_prepare data->seq_server ffff88005cb2e000
>> [257275.515196] --> nfs41_setup_sequence
>> [257275.515198] --> nfs4_alloc_slot used_slots=0003 highest_used=1 max_slots=31
>> [257275.515199] <-- nfs4_alloc_slot used_slots=0007 highest_used=2 slotid=2
>> [257275.515199] <-- nfs41_setup_sequence slotid=2 seqid=65520556
>> [257275.515206] encode_sequence: sessionid=1474859578:109:110:0 seqid=65520556 slotid=2 max_slotid=2 cache_this=0
>> [257275.515297] --> nfs4_alloc_slot used_slots=0007 highest_used=2 max_slots=31
>> [257275.515298] <-- nfs4_alloc_slot used_slots=000f highest_used=3 slotid=3
>> [257275.515299] nfs4_free_slot: slotid 3 highest_used_slotid 2
>> [257275.515300] nfs41_sequence_process: Error 0 free the slot
>> [257275.515301] nfs4_free_slot: slotid 2 highest_used_slotid 1
>> [257275.515319] NFS reply test_stateid: succeeded, 0
>> [257275.515323] --> nfs_put_client({2})
>> [257275.515326] --> nfs4_setup_sequence clp ffff88005b2e4800 session ffff880029aa8800 sr_slot 4294967295
>> [257275.515326] --> nfs41_setup_sequence
>> [257275.515327] --> nfs4_alloc_slot used_slots=0003 highest_used=1 max_slots=31
>> [257275.515328] <-- nfs4_alloc_slot used_slots=0007 highest_used=2 slotid=2
>> [257275.515329] <-- nfs41_setup_sequence slotid=2 seqid=65520557
>> [257275.515329] <-- nfs4_setup_sequence status=0
>> [257275.515336] encode_sequence: sessionid=1474859578:109:110:0 seqid=65520557 slotid=2 max_slotid=2 cache_this=1
>> [257275.515687] NFS: nfs_pgio_result: 17130, (status -10038)
>> [257275.515689] --> nfs4_alloc_slot used_slots=0007 highest_used=2 max_slots=31
>> [257275.515690] <-- nfs4_alloc_slot used_slots=000f highest_used=3 slotid=3
>> [257275.515690] nfs4_free_slot: slotid 3 highest_used_slotid 2
>> [257275.515691] nfs41_sequence_process: Error 0 free the slot
>> [257275.515692] nfs4_free_slot: slotid 2 highest_used_slotid 1
>> [257275.515703] nfs4_schedule_stateid_recovery: scheduling stateid recovery for server localhost
>> [257275.515793] NFS call test_stateid ffff880005304f24
>> [257275.515796] --> nfs41_call_sync_prepare data->seq_server ffff88005cb2e000
>> [257275.515797] --> nfs41_setup_sequence
>> [257275.515798] --> nfs4_alloc_slot used_slots=0003 highest_used=1 max_slots=31
>> [257275.515799] <-- nfs4_alloc_slot used_slots=0007 highest_used=2 slotid=2
>> [257275.515799] <-- nfs41_setup_sequence slotid=2 seqid=65520558
>> [257275.515806] encode_sequence: sessionid=1474859578:109:110:0 seqid=65520558 slotid=2 max_slotid=2 cache_this=0
>> [257275.515889] --> nfs4_alloc_slot used_slots=0007 highest_used=2 max_slots=31
>> [257275.515890] <-- nfs4_alloc_slot used_slots=000f highest_used=3 slotid=3
>> [257275.515891] nfs4_free_slot: slotid 3 highest_used_slotid 2
>> [257275.515891] nfs41_sequence_process: Error 0 free the slot
>> [257275.515892] nfs4_free_slot: slotid 2 highest_used_slotid 1
>> [257275.515910] NFS reply test_stateid: succeeded, 0
>> [257275.515914] --> nfs_put_client({2})
>> [257275.515917] --> nfs4_setup_sequence clp ffff88005b2e4800 session ffff880029aa8800 sr_slot 4294967295
>> [257275.515918] --> nfs41_setup_sequence
>> [257275.515919] --> nfs4_alloc_slot used_slots=0003 highest_used=1 max_slots=31
>> [257275.515920] <-- nfs4_alloc_slot used_slots=0007 highest_used=2 slotid=2
>> [257275.515920] <-- nfs41_setup_sequence slotid=2 seqid=65520559
>> [257275.515921] <-- nfs4_setup_sequence status=0
>> [257275.515928] encode_sequence: sessionid=1474859578:109:110:0 seqid=65520559 slotid=2 max_slotid=2 cache_this=1
>> [257275.516312] NFS: nfs_pgio_result: 17130, (status -10038)
>
> Hmm… It looks as if one of the pgio calls is stuck in an NFS4ERR_OPENMODE loop.
>
>> [257275.516313] --> nfs4_alloc_slot used_slots=0007 highest_used=2 max_slots=31
>> [257275.516314] <-- nfs4_alloc_slot used_slots=000f highest_used=3 slotid=3
>> [257275.516315] nfs4_free_slot: slotid 3 highest_used_slotid 2
>> [257275.516315] nfs41_sequence_process: Error 0 free the slot
>> [257275.516316] nfs4_free_slot: slotid 2 highest_used_slotid 1
>> [257275.516317] nfs4_schedule_stateid_recovery: scheduling stateid recovery for server localhost
>
> …but it looks as if it is successfully scheduling stateid recovery.
>
>>
>> Please let me know if you need anything else extracted from a system in this state.
>
> Do you have the ability to extract the ‘struct state’ info? I suspect the problem may be a bug tracking the open mode. Also, can you tell if this is a read or a write that is failing?

I think it's a write because:
#3 rpcauth_wrap_req (task=0xffff88005bbb8cd8,
encode=0xffffffff813bf250 <nfs4_xdr_enc_write>, rqstp=0xffff880006d9de00,
data=0xffff880065be6060, obj=<optimized out>)


There's no "struct state" anywhere in fs/nfs/*.c; can you please elaborate
on what you meant by that?

In nfs41_test_stateid:
$1 = {{data = "\000\000\000\b:\222ХWm\000\000\000©\226\001", {
seqid = 134217728, other = ":\222ХWm\000\000\000©\226\001"}},
type = NFS4_OPEN_STATEID_TYPE}

In that same callpath if I go up to nfs4_reclaim_open_state:

(gdb) p *state
$3 = {open_states = {next = 0xffff88000c6d6c98, prev = 0xffff88005ed3ee00},
inode_states = {next = 0xffff88008447c1d8, prev = 0xffff88008447c1d8},
lock_states = {next = 0xffff880005304e20, prev = 0xffff880005304e20},
owner = 0xffff88000c6d6c00, inode = 0xffff88008447c2a8, flags = 284,
state_lock = {{rlock = {raw_lock = {val = {counter = 0}},
magic = 3735899821, owner_cpu = 4294967295,
owner = 0xffffffffffffffff, dep_map = {
key = 0xffffffff833ff7f0 <__key.66122>, class_cache = {
0x0 <irq_stack_union>, 0x0 <irq_stack_union>},
name = 0xffffffff81c89ce6 "&(&state->state_lock)->rlock", cpu = 3,
ip = 0}}, {
__padding = "\000\000\000\000╜N╜чЪЪЪЪ\000\000\000\000ЪЪЪЪЪЪЪЪ",
dep_map = {key = 0xffffffff833ff7f0 <__key.66122>, class_cache = {
0x0 <irq_stack_union>, 0x0 <irq_stack_union>},
name = 0xffffffff81c89ce6 "&(&state->state_lock)->rlock", cpu = 3,
ip = 0}}}}, seqlock = {seqcount = {sequence = 16, dep_map = {
key = 0xffffffff833ff7e8 <__key.66123>, class_cache = {
0x0 <irq_stack_union>, 0x0 <irq_stack_union>},
name = 0xffffffff81c89d03 "&(&state->seqlock)->seqcount", cpu = 3,
ip = 0}}, lock = {{rlock = {raw_lock = {val = {counter = 0}},
magic = 3735899821, owner_cpu = 4294967295,
owner = 0xffffffffffffffff, dep_map = {
key = 0xffffffff833ff7e0 <__key.66124>, class_cache = {
0x0 <irq_stack_union>, 0x0 <irq_stack_union>},
name = 0xffffffff81c7c978 "&(&(&state->seqlock)->lock)->rlock",
cpu = 3, ip = 0}}, {
__padding = "\000\000\000\000╜N╜чЪЪЪЪ\000\000\000\000ЪЪЪЪЪЪЪЪ",
dep_map = {key = 0xffffffff833ff7e0 <__key.66124>, class_cache = {
0x0 <irq_stack_union>, 0x0 <irq_stack_union>},
name = 0xffffffff81c7c978 "&(&(&state->seqlock)->lock)->rlock",
cpu = 3, ip = 0}}}}}, stateid = {{
data = "\000\000\000\b:\222ХWm\000\000\000©\226\001", {
seqid = 134217728, other = ":\222ХWm\000\000\000©\226\001"}},
type = NFS4_OPEN_STATEID_TYPE}, open_stateid = {{
data = "\000\000\000\b:\222ХWm\000\000\000©\226\001", {
seqid = 134217728, other = ":\222ХWm\000\000\000©\226\001"}},
type = NFS4_OPEN_STATEID_TYPE}, n_rdonly = 2, n_wronly = 2, n_rdwr = 0,
state = 3, count = {counter = 5}}
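
One way to read the stateid in the dump above: the raw bytes start \000\000\000\b, i.e. a wire (big-endian) seqid of 8, which the little-endian host prints as the u32 134217728 (0x08000000). A quick standalone check:

#include <arpa/inet.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	const unsigned char raw[4] = { 0x00, 0x00, 0x00, 0x08 };
	uint32_t host;

	memcpy(&host, raw, sizeof(host));
	printf("as stored on a little-endian host: %u\n", host);	/* 134217728 */
	printf("wire-order seqid: %u\n", ntohl(host));			/* 8 */
	return 0;
}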




2016-09-22 17:40:09

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH v7 31/31] NFSv4.1: Even if the stateid is OK, we may need to recover the open modes

TEST_STATEID only tells you that you have a valid open stateid. It doesn't
tell the client anything about whether or not it holds the required share
locks.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 30 +++++++++++++++++++++---------
1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 72858f900abf..7b8054db5e0f 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1387,6 +1387,17 @@ static void update_open_stateflags(struct nfs4_state *state, fmode_t fmode)
nfs4_state_set_mode_locked(state, state->state | fmode);
}

+static bool nfs_open_stateid_recover_openmode(struct nfs4_state *state)
+{
+ if (state->n_rdonly && !test_bit(NFS_O_RDONLY_STATE, &state->flags))
+ return true;
+ if (state->n_wronly && !test_bit(NFS_O_WRONLY_STATE, &state->flags))
+ return true;
+ if (state->n_rdwr && !test_bit(NFS_O_RDWR_STATE, &state->flags))
+ return true;
+ return false;
+}
+
static void nfs_test_and_clear_all_open_stateid(struct nfs4_state *state)
{
struct nfs_client *clp = state->owner->so_server->nfs_client;
@@ -2589,16 +2600,13 @@ static int nfs41_check_open_stateid(struct nfs4_state *state)
int status;

if (test_bit(NFS_OPEN_STATE, &state->flags) == 0) {
- if (test_bit(NFS_DELEGATED_STATE, &state->flags) == 0)
- return NFS_OK;
+ if (test_bit(NFS_DELEGATED_STATE, &state->flags) == 0) {
+ if (nfs4_have_delegation(state->inode, state->state))
+ return NFS_OK;
+ return -NFS4ERR_OPENMODE;
+ }
return -NFS4ERR_BAD_STATEID;
}
- /* If a state reset has been done, test_stateid is unneeded */
- if ((test_bit(NFS_O_RDONLY_STATE, &state->flags) == 0) &&
- (test_bit(NFS_O_WRONLY_STATE, &state->flags) == 0) &&
- (test_bit(NFS_O_RDWR_STATE, &state->flags) == 0))
- return -NFS4ERR_BAD_STATEID;
-
status = nfs41_test_and_free_expired_stateid(server, stateid, cred);
trace_nfs4_test_open_stateid(state, NULL, status);
if (status == -NFS4ERR_EXPIRED || status == -NFS4ERR_BAD_STATEID) {
@@ -2608,7 +2616,11 @@ static int nfs41_check_open_stateid(struct nfs4_state *state)
clear_bit(NFS_OPEN_STATE, &state->flags);
stateid->type = NFS4_INVALID_STATEID_TYPE;
}
- return status;
+ if (status != NFS_OK)
+ return status;
+ if (nfs_open_stateid_recover_openmode(state))
+ return -NFS4ERR_OPENMODE;
+ return NFS_OK;
}

static int nfs41_open_expired(struct nfs4_state_owner *sp, struct nfs4_state *state)
--
2.7.4
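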


2016-10-14 12:51:04

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v7 23/31] NFSv4: Fix a race when updating an open_stateid

On Thu, Sep 22, 2016 at 01:39:13PM -0400, Trond Myklebust wrote:
> If we're replacing an old stateid which has a different 'other' field,
> then we probably need to free the old stateid.

This gives me a new compiler warning:

fs/nfs/nfs4proc.c: In function ‘update_open_stateid’:
fs/nfs/nfs4proc.c:1548:2: warning: missing braces around initializer [-Wmissing-braces]
nfs4_stateid freeme = {0};
^
fs/nfs/nfs4proc.c:1548:2: warning: (near initialization for ‘freeme.<anonymous>’) [-Wmissing-braces]
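
The warning reported here is the classic gcc -Wmissing-braces complaint: nfs4_stateid begins with an anonymous aggregate, so "= {0}" only braces the outer struct. The snippet below reproduces the warning shape and shows two common ways to silence it; it is a generic illustration, not the fix that was applied to the kernel.

#include <string.h>

struct inner { unsigned char data[16]; };
struct outer { struct inner u; int type; };

int main(void)
{
	/* struct outer a = { 0 };            triggers -Wmissing-braces on
	 *                                    older gcc (braces for 'u' missing) */
	struct outer b = { { { 0 } }, 0 };  /* fully braced: no warning          */
	struct outer c;

	memset(&c, 0, sizeof(c));           /* or zero it explicitly             */
	(void)b; (void)c;
	return 0;
}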

2016-11-04 16:02:58

by Benjamin Coddington

[permalink] [raw]
Subject: Re: [PATCH v7 13/31] NFSv4.1: Ensure we always run TEST/FREE_STATEID on locks

Hi Trond,

On 22 Sep 2016, at 13:39, Trond Myklebust wrote:

> Right now, we're only running TEST/FREE_STATEID on the locks if
> the open stateid recovery succeeds. The protocol requires us to
> always do so.
> The fix would be to move the call to TEST/FREE_STATEID and do it
> before we attempt open recovery.
>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfs/nfs4proc.c | 92
> +++++++++++++++++++++++++++++++------------------------
> 1 file changed, 52 insertions(+), 40 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 3c1b8cb7dd95..33ca6d768bd2 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -2486,6 +2486,45 @@ static void
> nfs41_check_delegation_stateid(struct nfs4_state *state)
> }
>
> /**
> + * nfs41_check_expired_locks - possibly free a lock stateid
> + *
> + * @state: NFSv4 state for an inode
> + *
> + * Returns NFS_OK if recovery for this stateid is now finished.
> + * Otherwise a negative NFS4ERR value is returned.
> + */
> +static int nfs41_check_expired_locks(struct nfs4_state *state)
> +{
> + int status, ret = NFS_OK;
> + struct nfs4_lock_state *lsp;
> + struct nfs_server *server = NFS_SERVER(state->inode);
> +
> + if (!test_bit(LK_STATE_IN_USE, &state->flags))
> + goto out;
> + list_for_each_entry(lsp, &state->lock_states, ls_locks) {
> + if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {

I bisected a crash to this patch (commit
c5896fc8622d57b31e1e98545d67d7089019e478). I thought the problem was that
this patch moved this path out from under the nfsi->rwsem in
nfs4_reclaim_locks() so it ends up with a freed nfs4_lock_state here. So I
tried the following fixup, but it doesn't help:

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index b5290fd..2f51200 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2559,9 +2559,13 @@ static int nfs41_check_open_stateid(struct nfs4_state *state)
static int nfs41_open_expired(struct nfs4_state_owner *sp, struct nfs4_state *state)
{
int status;
+ struct inode *inode = state->inode;
+ struct nfs_inode *nfsi = NFS_I(inode);

nfs41_check_delegation_stateid(state);
+ down_write(&nfsi->rwsem);
status = nfs41_check_expired_locks(state);
+ up_write(&nfsi->rwsem);
if (status != NFS_OK)
return status;
status = nfs41_check_open_stateid(state);

I can reproduce this with generic/089. Any ideas?

[ 1113.492603] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[ 1113.493553] IP: [<ffffffffa02d1987>] nfs41_open_expired+0xb7/0x380 [nfsv4]
[ 1113.494355] PGD 132363067 PUD 1387b7067 PMD 0
[ 1113.494908] Oops: 0000 [#1] SMP
[ 1113.495274] Modules linked in: nfsv4 dns_resolver nfs ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_broute bridge stp llc ebtable_nat ip6table_security ip6table_mangle ip6table_raw ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 iptable_security iptable_mangle iptable_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables nfsd auth_rpcgss nfs_acl lockd grace sunrpc virtio_balloon virtio_net virtio_console virtio_blk ppdev crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel parport_pc parport serio_raw acpi_cpufreq tpm_tis virtio_pci tpm_tis_core ata_generic virtio_ring pata_acpi virtio i2c_piix4 tpm
[ 1113.503438] CPU: 3 PID: 3008 Comm: ::1-manager Not tainted 4.8.0-rc7-00073-gf746650 #31
[ 1113.504329] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
[ 1113.505476] task: ffff88013a469d40 task.stack: ffff8801386cc000
[ 1113.506140] RIP: 0010:[<ffffffffa02d1987>] [<ffffffffa02d1987>] nfs41_open_expired+0xb7/0x380 [nfsv4]
[ 1113.507202] RSP: 0018:ffff8801386cfd98 EFLAGS: 00010203
[ 1113.507794] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 1113.508586] RDX: 00000000003ce306 RSI: ffff88013fd9c940 RDI: ffff880138e5ca00
[ 1113.509385] RBP: ffff8801386cfde8 R08: 000000000001c940 R09: ffff88013537e300
[ 1113.510182] R10: ffff88013537e338 R11: ffff88013fd99d88 R12: ffff8801384820c0
[ 1113.510977] R13: 0000000000000000 R14: ffff8801384820e0 R15: ffff880138496650
[ 1113.511770] FS: 0000000000000000(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000
[ 1113.512672] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1113.513318] CR2: 0000000000000018 CR3: 0000000132600000 CR4: 00000000000406e0
[ 1113.514116] Stack:
[ 1113.514352] ffff880131839000 ffff880139fab000 ffff8801386cfda8 ffff8801386cfda8
[ 1113.515244] 00000000530c0386 ffff8801384820c0 ffffffffa03012a0 ffff880138496650
[ 1113.516139] ffffffffa03012a0 ffff880138496650 ffff8801386cfe80 ffffffffa02e615f
[ 1113.517030] Call Trace:
[ 1113.517330] [<ffffffffa02e615f>] nfs4_do_reclaim+0x1af/0x610 [nfsv4]
[ 1113.518067] [<ffffffffa02e6ad0>] nfs4_run_state_manager+0x510/0x7d0 [nfsv4]
[ 1113.518854] [<ffffffffa02e65c0>] ? nfs4_do_reclaim+0x610/0x610 [nfsv4]
[ 1113.519606] [<ffffffffa02e65c0>] ? nfs4_do_reclaim+0x610/0x610 [nfsv4]
[ 1113.520366] [<ffffffff810c62c8>] kthread+0xd8/0xf0
[ 1113.520930] [<ffffffff8180653f>] ret_from_fork+0x1f/0x40
[ 1113.521542] [<ffffffff810c61f0>] ? kthread_worker_fn+0x170/0x170
[ 1113.522227] Code: 44 24 40 a8 01 0f 84 ee 00 00 00 49 8b 5c 24 20 4d 8d 74 24 20 49 39 de 75 11 e9 da 00 00 00 48 8b 1b 4c 39 f3 0f 84 ba 00 00 00 <48> 8b 43 18 a8 01 74 ec 48 8b 43 10 48 8b 3c 24 48 8d b3 08 01
[ 1113.525343] RIP [<ffffffffa02d1987>] nfs41_open_expired+0xb7/0x380 [nfsv4]
[ 1113.526149] RSP <ffff8801386cfd98>
[ 1113.526545] CR2: 0000000000000018
[ 1113.527301] ---[ end trace 10e07174c7bf56ff ]---

Ben


> + struct rpc_cred *cred = lsp->ls_state->owner->so_cred;
> +
> + status = nfs41_test_and_free_expired_stateid(server,
> + &lsp->ls_stateid,
> + cred);
> + trace_nfs4_test_lock_stateid(state, lsp, status);
> + if (status == -NFS4ERR_EXPIRED ||
> + status == -NFS4ERR_BAD_STATEID) {
> + clear_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags);
> + if (!recover_lost_locks)
> + set_bit(NFS_LOCK_LOST, &lsp->ls_flags);
> + } else if (status != NFS_OK) {
> + ret = status;
> + break;
> + }
> + }
> + };
> +out:
> + return ret;
> +}
> +
> +/**
> * nfs41_check_open_stateid - possibly free an open stateid
> *
> * @state: NFSv4 state for an inode
> @@ -2522,6 +2561,9 @@ static int nfs41_open_expired(struct
> nfs4_state_owner *sp, struct nfs4_state *st
> int status;
>
> nfs41_check_delegation_stateid(state);
> + status = nfs41_check_expired_locks(state);
> + if (status != NFS_OK)
> + return status;
> status = nfs41_check_open_stateid(state);
> if (status != NFS_OK)
> status = nfs4_open_expired(sp, state);
> @@ -6125,49 +6167,19 @@ out:
> }
>
> #if defined(CONFIG_NFS_V4_1)
> -/**
> - * nfs41_check_expired_locks - possibly free a lock stateid
> - *
> - * @state: NFSv4 state for an inode
> - *
> - * Returns NFS_OK if recovery for this stateid is now finished.
> - * Otherwise a negative NFS4ERR value is returned.
> - */
> -static int nfs41_check_expired_locks(struct nfs4_state *state)
> -{
> - int status, ret = NFS_OK;
> - struct nfs4_lock_state *lsp;
> - struct nfs_server *server = NFS_SERVER(state->inode);
> -
> - list_for_each_entry(lsp, &state->lock_states, ls_locks) {
> - if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
> - struct rpc_cred *cred = lsp->ls_state->owner->so_cred;
> -
> - status = nfs41_test_and_free_expired_stateid(server,
> - &lsp->ls_stateid,
> - cred);
> - trace_nfs4_test_lock_stateid(state, lsp, status);
> - if (status == -NFS4ERR_EXPIRED ||
> - status == -NFS4ERR_BAD_STATEID)
> - clear_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags);
> - else if (status != NFS_OK) {
> - ret = status;
> - break;
> - }
> - }
> - };
> -
> - return ret;
> -}
> -
> static int nfs41_lock_expired(struct nfs4_state *state, struct
> file_lock *request)
> {
> - int status = NFS_OK;
> + struct nfs4_lock_state *lsp;
> + int status;
>
> - if (test_bit(LK_STATE_IN_USE, &state->flags))
> - status = nfs41_check_expired_locks(state);
> - if (status != NFS_OK)
> - status = nfs4_lock_expired(state, request);
> + status = nfs4_set_lock_state(state, request);
> + if (status != 0)
> + return status;
> + lsp = request->fl_u.nfs4_fl.owner;
> + if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags) ||
> + test_bit(NFS_LOCK_LOST, &lsp->ls_flags))
> + return 0;
> + status = nfs4_lock_expired(state, request);
> return status;
> }
> #endif
> --
> 2.7.4
>

2016-11-07 13:09:47

by Benjamin Coddington

[permalink] [raw]
Subject: Re: [PATCH v7 13/31] NFSv4.1: Ensure we always run TEST/FREE_STATEID on locks

On 4 Nov 2016, at 12:02, Benjamin Coddington wrote:

> Hi Trond,
>
> On 22 Sep 2016, at 13:39, Trond Myklebust wrote:
>
>> Right now, we're only running TEST/FREE_STATEID on the locks if
>> the open stateid recovery succeeds. The protocol requires us to
>> always do so.
>> The fix would be to move the call to TEST/FREE_STATEID and do it
>> before we attempt open recovery.
>>
>> Signed-off-by: Trond Myklebust <[email protected]>
>> ---
>> fs/nfs/nfs4proc.c | 92
>> +++++++++++++++++++++++++++++++------------------------
>> 1 file changed, 52 insertions(+), 40 deletions(-)
>>
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index 3c1b8cb7dd95..33ca6d768bd2 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -2486,6 +2486,45 @@ static void
>> nfs41_check_delegation_stateid(struct nfs4_state *state)
>> }
>>
>> /**
>> + * nfs41_check_expired_locks - possibly free a lock stateid
>> + *
>> + * @state: NFSv4 state for an inode
>> + *
>> + * Returns NFS_OK if recovery for this stateid is now finished.
>> + * Otherwise a negative NFS4ERR value is returned.
>> + */
>> +static int nfs41_check_expired_locks(struct nfs4_state *state)
>> +{
>> + int status, ret = NFS_OK;
>> + struct nfs4_lock_state *lsp;
>> + struct nfs_server *server = NFS_SERVER(state->inode);
>> +
>> + if (!test_bit(LK_STATE_IN_USE, &state->flags))
>> + goto out;
>> + list_for_each_entry(lsp, &state->lock_states, ls_locks) {
>> + if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
>
> I bisected a crash to this patch (commit
> c5896fc8622d57b31e1e98545d67d7089019e478).
> I thought the problem was that this patch moved this path out from
> under the
> nfsi->rwsem in nfs4_reclaim_locks() so it ends up with a freed
> nfs4_lock_state here.
>
> I can reproduce this with generic/089. Any ideas?

Hit this on v4.9-rc4 this morning. This probably needs to take the
state_lock before traversing the lock_states list. I guess we've never hit
this before because the old path would serialize things somehow - maybe via
taking flc_lock in nfs4_reclaim_locks().. I'll test that fix.

Ben
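
For readers following the thread, the pattern being described, taking the lock that protects the list for the whole walk so entries cannot be freed mid-traversal, looks roughly like this in a generic user-space form (not the kernel code, and as the next message shows, not sufficient on its own here):

#include <pthread.h>
#include <stdio.h>

struct lock_state {
	int id;
	struct lock_state *next;
};

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static struct lock_state *lock_states;	/* list head, protected by state_lock */

static void check_expired_locks(void)
{
	struct lock_state *lsp;

	pthread_mutex_lock(&state_lock);
	for (lsp = lock_states; lsp != NULL; lsp = lsp->next) {
		/* Safe to dereference lsp: nothing can unlink or free it
		 * while state_lock is held. */
		printf("checking lock state %d\n", lsp->id);
	}
	pthread_mutex_unlock(&state_lock);
}

int main(void)
{
	struct lock_state a = { .id = 1, .next = NULL };

	lock_states = &a;
	check_expired_locks();
	return 0;
}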

2016-11-07 13:46:02

by Benjamin Coddington

[permalink] [raw]
Subject: Re: [PATCH v7 13/31] NFSv4.1: Ensure we always run TEST/FREE_STATEID on locks


On 7 Nov 2016, at 8:09, Benjamin Coddington wrote:

> On 4 Nov 2016, at 12:02, Benjamin Coddington wrote:
>
>> Hi Trond,
>>
>> On 22 Sep 2016, at 13:39, Trond Myklebust wrote:
>>
>>> Right now, we're only running TEST/FREE_STATEID on the locks if
>>> the open stateid recovery succeeds. The protocol requires us to
>>> always do so.
>>> The fix would be to move the call to TEST/FREE_STATEID and do it
>>> before we attempt open recovery.
>>>
>>> Signed-off-by: Trond Myklebust <[email protected]>
>>> ---
>>> fs/nfs/nfs4proc.c | 92
>>> +++++++++++++++++++++++++++++++------------------------
>>> 1 file changed, 52 insertions(+), 40 deletions(-)
>>>
>>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>>> index 3c1b8cb7dd95..33ca6d768bd2 100644
>>> --- a/fs/nfs/nfs4proc.c
>>> +++ b/fs/nfs/nfs4proc.c
>>> @@ -2486,6 +2486,45 @@ static void
>>> nfs41_check_delegation_stateid(struct nfs4_state *state)
>>> }
>>>
>>> /**
>>> + * nfs41_check_expired_locks - possibly free a lock stateid
>>> + *
>>> + * @state: NFSv4 state for an inode
>>> + *
>>> + * Returns NFS_OK if recovery for this stateid is now finished.
>>> + * Otherwise a negative NFS4ERR value is returned.
>>> + */
>>> +static int nfs41_check_expired_locks(struct nfs4_state *state)
>>> +{
>>> + int status, ret = NFS_OK;
>>> + struct nfs4_lock_state *lsp;
>>> + struct nfs_server *server = NFS_SERVER(state->inode);
>>> +
>>> + if (!test_bit(LK_STATE_IN_USE, &state->flags))
>>> + goto out;
>>> + list_for_each_entry(lsp, &state->lock_states, ls_locks) {
>>> + if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
>>
>> I bisected a crash to this patch (commit
>> c5896fc8622d57b31e1e98545d67d7089019e478).
>> I thought the problem was that this patch moved this path out from
>> under the
>> nfsi->rwsem in nfs4_reclaim_locks() so it ends up with a freed
>> nfs4_lock_state here.
>>
>> I can reproduce this with generic/089. Any ideas?
>
> Hit this on v4.9-rc4 this morning. This probably needs to take the
> state_lock before traversing the lock_states list. I guess we've
> never hit
> this before because the old path would serialize things somehow -
> maybe via
> taking flc_lock in nfs4_reclaim_locks().. I'll test that fix.

Well, that's no good either as it gets stuck in a NFS4ERR_OLD_STATEID loop
in recovery since we'd want to retry in that case, but taking the state_lock
means we won't use the new stateid. So maybe we need both the state_lock to
protect the list and the rwsem to stop new locks from being sent. I'll try
that now.

Ben

2016-11-07 14:50:28

by Benjamin Coddington

[permalink] [raw]
Subject: Re: [PATCH v7 13/31] NFSv4.1: Ensure we always run TEST/FREE_STATEID on locks



On 7 Nov 2016, at 8:45, Benjamin Coddington wrote:

>
> On 7 Nov 2016, at 8:09, Benjamin Coddington wrote:
>
>> On 4 Nov 2016, at 12:02, Benjamin Coddington wrote:
>>
>>> Hi Trond,
>>>
>>> On 22 Sep 2016, at 13:39, Trond Myklebust wrote:
>>>
>>>> Right now, we're only running TEST/FREE_STATEID on the locks if
>>>> the open stateid recovery succeeds. The protocol requires us to
>>>> always do so.
>>>> The fix would be to move the call to TEST/FREE_STATEID and do it
>>>> before we attempt open recovery.
>>>>
>>>> Signed-off-by: Trond Myklebust <[email protected]>
>>>> ---
>>>> fs/nfs/nfs4proc.c | 92
>>>> +++++++++++++++++++++++++++++++------------------------
>>>> 1 file changed, 52 insertions(+), 40 deletions(-)
>>>>
>>>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>>>> index 3c1b8cb7dd95..33ca6d768bd2 100644
>>>> --- a/fs/nfs/nfs4proc.c
>>>> +++ b/fs/nfs/nfs4proc.c
>>>> @@ -2486,6 +2486,45 @@ static void
>>>> nfs41_check_delegation_stateid(struct nfs4_state *state)
>>>> }
>>>>
>>>> /**
>>>> + * nfs41_check_expired_locks - possibly free a lock stateid
>>>> + *
>>>> + * @state: NFSv4 state for an inode
>>>> + *
>>>> + * Returns NFS_OK if recovery for this stateid is now finished.
>>>> + * Otherwise a negative NFS4ERR value is returned.
>>>> + */
>>>> +static int nfs41_check_expired_locks(struct nfs4_state *state)
>>>> +{
>>>> + int status, ret = NFS_OK;
>>>> + struct nfs4_lock_state *lsp;
>>>> + struct nfs_server *server = NFS_SERVER(state->inode);
>>>> +
>>>> + if (!test_bit(LK_STATE_IN_USE, &state->flags))
>>>> + goto out;
>>>> + list_for_each_entry(lsp, &state->lock_states, ls_locks) {
>>>> + if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
>>>
>>> I bisected a crash to this patch (commit
>>> c5896fc8622d57b31e1e98545d67d7089019e478).
>>> I thought the problem was that this patch moved this path out from
>>> under the
>>> nfsi->rwsem in nfs4_reclaim_locks() so it ends up with a freed
>>> nfs4_lock_state here.
>>>
>>> I can reproduce this with generic/089. Any ideas?
>>
>> Hit this on v4.9-rc4 this morning. This probably needs to take the
>> state_lock before traversing the lock_states list. I guess we've
>> never hit
>> this before because the old path would serialize things somehow -
>> maybe via
>> taking flc_lock in nfs4_reclaim_locks().. I'll test that fix.
>
> Well, that's no good either as it gets stuck in a NFS4ERR_OLD_STATEID
> loop
> in recovery since we'd want to retry in that case, but taking the
> state_lock
> means we won't use the new stateid. So maybe we need both the
> state_lock to
> protect the list and the rwsem to stop new locks from being sent.
> I'll try
> that now.

That one got much further, but eventually soft-locked up on the state_lock
when what looks like the state manager needed to have a TEST_STATEID wait on
another lock to complete.

The other question here is why are we doing recovery so much? It seems like
we're sending FREE_STATEID unnecessarily on successful DELEGRETURN and
LOCKU, but that shouldn't be triggering state recovery.


2016-11-07 14:59:39

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH v7 13/31] NFSv4.1: Ensure we always run TEST/FREE_STATEID on locks


On Nov 7, 2016, at 09:50, Benjamin Coddington <[email protected]> wrote:



On 7 Nov 2016, at 8:45, Benjamin Coddington wrote:


On 7 Nov 2016, at 8:09, Benjamin Coddington wrote:

On 4 Nov 2016, at 12:02, Benjamin Coddington wrote:

Hi Trond,

On 22 Sep 2016, at 13:39, Trond Myklebust wrote:

Right now, we're only running TEST/FREE_STATEID on the locks if
the open stateid recovery succeeds. The protocol requires us to
always do so.
The fix would be to move the call to TEST/FREE_STATEID and do it
before we attempt open recovery.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 92 +++++++++++++++++++++++++++++++------------------------
1 file changed, 52 insertions(+), 40 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 3c1b8cb7dd95..33ca6d768bd2 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2486,6 +2486,45 @@ static void nfs41_check_delegation_stateid(struct nfs4_state *state)
}

/**
+ * nfs41_check_expired_locks - possibly free a lock stateid
+ *
+ * @state: NFSv4 state for an inode
+ *
+ * Returns NFS_OK if recovery for this stateid is now finished.
+ * Otherwise a negative NFS4ERR value is returned.
+ */
+static int nfs41_check_expired_locks(struct nfs4_state *state)
+{
+ int status, ret = NFS_OK;
+ struct nfs4_lock_state *lsp;
+ struct nfs_server *server = NFS_SERVER(state->inode);
+
+ if (!test_bit(LK_STATE_IN_USE, &state->flags))
+ goto out;
+ list_for_each_entry(lsp, &state->lock_states, ls_locks) {
+ if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {

I bisected a crash to this patch (commit c5896fc8622d57b31e1e98545d67d7089019e478).
I thought the problem was that this patch moved this path out from under the
nfsi->rwsem in nfs4_reclaim_locks() so it ends up with a freed
nfs4_lock_state here.

I can reproduce this with generic/089. Any ideas?

Hit this on v4.9-rc4 this morning. This probably needs to take the
state_lock before traversing the lock_states list. I guess we've never hit
this before because the old path would serialize things somehow - maybe via
taking flc_lock in nfs4_reclaim_locks().. I'll test that fix.

Well, that's no good either as it gets stuck in a NFS4ERR_OLD_STATEID loop
in recovery since we'd want to retry in that case, but taking the state_lock
means we won't use the new stateid. So maybe we need both the state_lock to
protect the list and the rwsem to stop new locks from being sent. I'll try
that now.

That one got much further, but eventually soft-locked up on the state_lock
when what looks like the state manager needed to have a TEST_STATEID wait on
another lock to complete.

The other question here is why are we doing recovery so much? It seems like
we're sending FREE_STATEID unnecessarily on successful DELEGRETURN and
LOCKU, but that shouldn't be triggering state recovery..


FREE_STATEID is required by the protocol after LOCKU, so that’s intentional. It isn’t needed after DELEGRETURN, so I’m not sure why that is happening.

2016-11-08 15:10:13

by Benjamin Coddington

[permalink] [raw]
Subject: Re: [PATCH v7 13/31] NFSv4.1: Ensure we always run TEST/FREE_STATEID on locks



On 7 Nov 2016, at 9:59, Trond Myklebust wrote:

> On Nov 7, 2016, at 09:50, Benjamin Coddington <[email protected]> wrote:
>
> On 7 Nov 2016, at 8:45, Benjamin Coddington wrote:
>
> On 7 Nov 2016, at 8:09, Benjamin Coddington wrote:
>
> On 4 Nov 2016, at 12:02, Benjamin Coddington wrote:
>
> Hi Trond,
>
> On 22 Sep 2016, at 13:39, Trond Myklebust wrote:
>
> Right now, we're only running TEST/FREE_STATEID on the locks if
> the open stateid recovery succeeds. The protocol requires us to
> always do so.
> The fix would be to move the call to TEST/FREE_STATEID and do it
> before we attempt open recovery.
>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfs/nfs4proc.c | 92
> +++++++++++++++++++++++++++++++------------------------
> 1 file changed, 52 insertions(+), 40 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 3c1b8cb7dd95..33ca6d768bd2 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -2486,6 +2486,45 @@ static void
> nfs41_check_delegation_stateid(struct nfs4_state *state)
> }
>
> /**
> + * nfs41_check_expired_locks - possibly free a lock stateid
> + *
> + * @state: NFSv4 state for an inode
> + *
> + * Returns NFS_OK if recovery for this stateid is now finished.
> + * Otherwise a negative NFS4ERR value is returned.
> + */
> +static int nfs41_check_expired_locks(struct nfs4_state *state)
> +{
> + int status, ret = NFS_OK;
> + struct nfs4_lock_state *lsp;
> + struct nfs_server *server = NFS_SERVER(state->inode);
> +
> + if (!test_bit(LK_STATE_IN_USE, &state->flags))
> + goto out;
> + list_for_each_entry(lsp, &state->lock_states, ls_locks) {
> + if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
>
> I bisected a crash to this patch (commit
> c5896fc8622d57b31e1e98545d67d7089019e478).
> I thought the problem was that this patch moved this path out from
> under the
> nfsi->rwsem in nfs4_reclaim_locks() so it ends up with a freed
> nfs4_lock_state here.
>
> I can reproduce this with generic/089. Any ideas?
>
> Hit this on v4.9-rc4 this morning. This probably needs to take the
> state_lock before traversing the lock_states list. I guess we've
> never hit
> this before because the old path would serialize things somehow -
> maybe via
> taking flc_lock in nfs4_reclaim_locks().. I'll test that fix.
>
> Well, that's no good either as it gets stuck in a NFS4ERR_OLD_STATEID
> loop
> in recovery since we'd want to retry in that case, but taking the
> state_lock
> means we won't use the new stateid. So maybe we need both the
> state_lock to
> protect the list and the rwsem to stop new locks from being sent.
> I'll try
> that now.
>
> That one got much further, but eventually soft-locked up on the
> state_lock
> when what looks like the state manager needed to have a TEST_STATEID
> wait on
> another lock to complete.
>
> The other question here is why are we doing recovery so much? It
> seems like
> we're sending FREE_STATEID unnecessarily on successful DELEGRETURN and
> LOCKU, but that shouldn't be triggering state recovery..
>
> FREE_STATEID is required by the protocol after LOCKU, so that’s
> intentional.

I thought it wasn't required if we do CLOSE, but I checked again and that
wasn't what I was seeing. I am seeing LOCKU, FREE_STATEID, CLOSE, so it is
correct.

> It isn’t needed after DELEGRETURN, so I’m not sure why that is
> happening.

I think it's just falling through the case statement in
nfs4_delegreturn_done(). I'll send a patch for that.

I think the fix here is to manually increment ls_count for the current lock
state, and expect that the lock_states list can be modified while we walk
it. I'll send a patch for that too if it runs through testing OK.
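
A rough sketch of the reference-counted walk being described, assuming the
ls_count pin is enough to keep the current entry alive while state_lock is
dropped for the RPCs; the diff posted further down the thread is the
authoritative version:

static int nfs41_check_expired_locks_refcounted_walk_sketch(struct nfs4_state *state)
{
        struct nfs4_lock_state *lsp, *tmp;
        int ret = NFS_OK;

        if (!test_bit(LK_STATE_IN_USE, &state->flags))
                return ret;

        spin_lock(&state->state_lock);
        list_for_each_entry_safe(lsp, tmp, &state->lock_states, ls_locks) {
                /* Pin the entry so it cannot be freed once the lock is dropped. */
                atomic_inc(&lsp->ls_count);
                spin_unlock(&state->state_lock);

                if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
                        /*
                         * Issue TEST_STATEID and, if the stateid was revoked,
                         * FREE_STATEID here; these RPCs may sleep, hence the
                         * unlocked section.
                         */
                }

                nfs4_put_lock_state(lsp);
                spin_lock(&state->state_lock);
        }
        spin_unlock(&state->state_lock);
        return ret;
}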

Ben

2016-11-08 15:20:49

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH v7 13/31] NFSv4.1: Ensure we always run TEST/FREE_STATEID on locks


On Nov 8, 2016, at 10:10, Benjamin Coddington <[email protected]> wrote:



On 7 Nov 2016, at 9:59, Trond Myklebust wrote:

On Nov 7, 2016, at 09:50, Benjamin Coddington <[email protected]> wrote:

On 7 Nov 2016, at 8:45, Benjamin Coddington wrote:

On 7 Nov 2016, at 8:09, Benjamin Coddington wrote:

On 4 Nov 2016, at 12:02, Benjamin Coddington wrote:

Hi Trond,

On 22 Sep 2016, at 13:39, Trond Myklebust wrote:

Right now, we're only running TEST/FREE_STATEID on the locks if
the open stateid recovery succeeds. The protocol requires us to
always do so.
The fix would be to move the call to TEST/FREE_STATEID and do it
before we attempt open recovery.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 92 +++++++++++++++++++++++++++++++------------------------
1 file changed, 52 insertions(+), 40 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 3c1b8cb7dd95..33ca6d768bd2 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2486,6 +2486,45 @@ static void nfs41_check_delegation_stateid(struct nfs4_state *state)
}

/**
+ * nfs41_check_expired_locks - possibly free a lock stateid
+ *
+ * @state: NFSv4 state for an inode
+ *
+ * Returns NFS_OK if recovery for this stateid is now finished.
+ * Otherwise a negative NFS4ERR value is returned.
+ */
+static int nfs41_check_expired_locks(struct nfs4_state *state)
+{
+ int status, ret = NFS_OK;
+ struct nfs4_lock_state *lsp;
+ struct nfs_server *server = NFS_SERVER(state->inode);
+
+ if (!test_bit(LK_STATE_IN_USE, &state->flags))
+ goto out;
+ list_for_each_entry(lsp, &state->lock_states, ls_locks) {
+ if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {

I bisected a crash to this patch (commit c5896fc8622d57b31e1e98545d67d7089019e478).
I thought the problem was that this patch moved this path out from under the
nfsi->rwsem in nfs4_reclaim_locks() so it ends up with a freed
nfs4_lock_state here.

I can reproduce this with generic/089. Any ideas?

Hit this on v4.9-rc4 this morning. This probably needs to take the
state_lock before traversing the lock_states list. I guess we've never hit
this before because the old path would serialize things somehow - maybe via
taking flc_lock in nfs4_reclaim_locks().. I'll test that fix.

Well, that's no good either as it gets stuck in a NFS4ERR_OLD_STATEID loop
in recovery since we'd want to retry in that case, but taking the state_lock
means we won't use the new stateid. So maybe we need both the state_lock to
protect the list and the rwsem to stop new locks from being sent. I'll try
that now.

That one got much further, but eventually soft-locked up on the state_lock
when what looks like the state manager needed to have a TEST_STATEID wait on
another lock to complete.

The other question here is why are we doing recovery so much? It seems like
we're sending FREE_STATEID unnecessarily on successful DELEGRETURN and
LOCKU, but that shouldn't be triggering state recovery..

FREE_STATEID is required by the protocol after LOCKU, so that’s intentional.

I thought it wasn't required if we do CLOSE, but I checked again and that
wasn't what I was seeing. I am seeing LOCKU, FREE_STATEID, CLOSE, so it is
correct.

Note that at some point we probably should put the FREE_STATEID in the same COMPOUND as the LOCKU. I can’t think of any reason why we would not want to do that given our current lockowner model.


It isn’t needed after DELEGRETURN, so I’m not sure why that is happening.

I think it's just falling through the case statement in
nfs4_delegreturn_done(). I'll send a patch for that.

I think the fix here is to manually increment ls_count for the current lock
state, and expect that the lock_states list can be modified while we walk
it. I'll send a patch for that too if it runs though testing OK.

Thanks!

2016-11-10 15:01:52

by Anna Schumaker

[permalink] [raw]
Subject: Re: [PATCH v7 13/31] NFSv4.1: Ensure we always run TEST/FREE_STATEID on locks

Hi Ben,

On 11/08/2016 10:10 AM, Benjamin Coddington wrote:
>
>
> On 7 Nov 2016, at 9:59, Trond Myklebust wrote:
>
>> On Nov 7, 2016, at 09:50, Benjamin Coddington <[email protected]> wrote:
>>
>> On 7 Nov 2016, at 8:45, Benjamin Coddington wrote:
>>
>> On 7 Nov 2016, at 8:09, Benjamin Coddington wrote:
>>
>> On 4 Nov 2016, at 12:02, Benjamin Coddington wrote:
>>
>> Hi Trond,
>>
>> On 22 Sep 2016, at 13:39, Trond Myklebust wrote:
>>
>> Right now, we're only running TEST/FREE_STATEID on the locks if
>> the open stateid recovery succeeds. The protocol requires us to
>> always do so.
>> The fix would be to move the call to TEST/FREE_STATEID and do it
>> before we attempt open recovery.
>>
>> Signed-off-by: Trond Myklebust <[email protected]>
>> ---
>> fs/nfs/nfs4proc.c | 92 +++++++++++++++++++++++++++++++------------------------
>> 1 file changed, 52 insertions(+), 40 deletions(-)
>>
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index 3c1b8cb7dd95..33ca6d768bd2 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -2486,6 +2486,45 @@ static void nfs41_check_delegation_stateid(struct nfs4_state *state)
>> }
>>
>> /**
>> + * nfs41_check_expired_locks - possibly free a lock stateid
>> + *
>> + * @state: NFSv4 state for an inode
>> + *
>> + * Returns NFS_OK if recovery for this stateid is now finished.
>> + * Otherwise a negative NFS4ERR value is returned.
>> + */
>> +static int nfs41_check_expired_locks(struct nfs4_state *state)
>> +{
>> + int status, ret = NFS_OK;
>> + struct nfs4_lock_state *lsp;
>> + struct nfs_server *server = NFS_SERVER(state->inode);
>> +
>> + if (!test_bit(LK_STATE_IN_USE, &state->flags))
>> + goto out;
>> + list_for_each_entry(lsp, &state->lock_states, ls_locks) {
>> + if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
>>
>> I bisected a crash to this patch (commit c5896fc8622d57b31e1e98545d67d7089019e478).
>> I thought the problem was that this patch moved this path out from under the
>> nfsi->rwsem in nfs4_reclaim_locks() so it ends up with a freed
>> nfs4_lock_state here.
>>
>> I can reproduce this with generic/089. Any ideas?
>>
>> Hit this on v4.9-rc4 this morning. This probably needs to take the
>> state_lock before traversing the lock_states list. I guess we've never hit
>> this before because the old path would serialize things somehow - maybe via
>> taking flc_lock in nfs4_reclaim_locks().. I'll test that fix.
>>
>> Well, that's no good either as it gets stuck in a NFS4ERR_OLD_STATEID loop
>> in recovery since we'd want to retry in that case, but taking the state_lock
>> means we won't use the new stateid. So maybe we need both the state_lock to
>> protect the list and the rwsem to stop new locks from being sent. I'll try
>> that now.
>>
>> That one got much further, but eventually soft-locked up on the state_lock
>> when what looks like the state manager needed to have a TEST_STATEID wait on
>> another lock to complete.
>>
>> The other question here is why are we doing recovery so much? It seems like
>> we're sending FREE_STATEID unnecessarily on successful DELEGRETURN and
>> LOCKU, but that shouldn't be triggering state recovery..
>>
>> FREE_STATEID is required by the protocol after LOCKU, so that’s intentional.
>
> I thought it wasn't required if we do CLOSE, but I checked again and that
> wasn't what I was seeing. I am seeing LOCKU, FREE_STATEID, CLOSE, so it is
> correct.
>
>> It isn’t needed after DELEGRETURN, so I’m not sure why that is happening.
>
> I think it's just falling through the case statement in
> nfs4_delegreturn_done(). I'll send a patch for that.
>
> I think the fix here is to manually increment ls_count for the current lock
> state, and expect that the lock_states list can be modified while we walk
> it. I'll send a patch for that too if it runs though testing OK.

Do you have an estimate for when this patch will be ready? I want to include it in my next bugfix pull request for 4.9.

Thanks,
Anna

>
> Ben

2016-11-10 15:58:19

by Benjamin Coddington

[permalink] [raw]
Subject: Re: [PATCH v7 13/31] NFSv4.1: Ensure we always run TEST/FREE_STATEID on locks

Hi Anna,

On 10 Nov 2016, at 10:01, Anna Schumaker wrote:
> Do you have an estimate for when this patch will be ready? I want to
> include it in my next bugfix pull request for 4.9.

I haven't posted because I am still trying to get to the bottom of another
problem where the client gets stuck in a loop sending the same stateid over
and over on NFS4ERR_OLD_STATEID. I want to make sure this problem isn't
caused by this fix -- which I don't think it is, but I'd rather make sure.
If I don't make any progress on this problem by the end of today, I'll post
what I have.

Read on if interested in this new problem:

It looks like racing opens with the same openowner can be returned out of
order by the server, so the client sees stateid seqid of 2 before 1. Then a
LOCK sent with seqid 1 is endlessly retried if sent while doing recovery.

It's hard to tell if I was able to capture all the moving parts to describe
this problem, though, as it takes a very long time for me to reproduce and
the packet captures were dropping frames. I'm working on manually
reproducing it now.

Ben

2016-11-10 16:51:52

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH v7 13/31] NFSv4.1: Ensure we always run TEST/FREE_STATEID on locks

On Thu, 2016-11-10 at 10:58 -0500, Benjamin Coddington wrote:
> Hi Anna,
>
> On 10 Nov 2016, at 10:01, Anna Schumaker wrote:
> > Do you have an estimate for when this patch will be ready? I want to
> > include it in my next bugfix pull request for 4.9.
>
> I haven't posted because I am still trying to get to the bottom of another
> problem where the client gets stuck in a loop sending the same stateid over
> and over on NFS4ERR_OLD_STATEID. I want to make sure this problem isn't
> caused by this fix -- which I don't think it is, but I'd rather make sure.
> If I don't make any progress on this problem by the end of today, I'll post
> what I have.
>
> Read on if interested in this new problem:
>
> It looks like racing opens with the same openowner can be returned out of
> order by the server, so the client sees stateid seqid of 2 before 1. Then a
> LOCK sent with seqid 1 is endlessly retried if sent while doing recovery.

Why is it doing that? nfs4_lock_prepare() should be copying the stateid
from the nfs4_state structure on each iteration.

Cheers
  Trond

2016-11-10 20:18:25

by Benjamin Coddington

[permalink] [raw]
Subject: Re: [PATCH v7 13/31] NFSv4.1: Ensure we always run TEST/FREE_STATEID on locks


On 10 Nov 2016, at 10:58, Benjamin Coddington wrote:

> Hi Anna,
>
> On 10 Nov 2016, at 10:01, Anna Schumaker wrote:
>> Do you have an estimate for when this patch will be ready? I want to
>> include it in my next bugfix pull request for 4.9.
>
> I haven't posted because I am still trying to get to the bottom of
> another
> problem where the client gets stuck in a loop sending the same stateid
> over
> and over on NFS4ERR_OLD_STATEID. I want to make sure this problem
> isn't
> caused by this fix -- which I don't think it is, but I'd rather make
> sure.
> If I don't make any progress on this problem by the end of today, I'll
> post
> what I have.
>
> Read on if interested in this new problem:
>
> It looks like racing opens with the same openowner can be returned out
> of
> order by the server, so the client sees stateid seqid of 2 before 1.
> Then a
> LOCK sent with seqid 1 is endlessly retried if sent while doing
> recovery.
>
> It's hard to tell if I was able to capture all the moving parts to
> describe
> this problem, though. As it takes a very long time for me to
> reproduce, and
> the packet captures were dropping frames. I'm working on manually
> reproducing it now.

Anna,

I haven't gotten to the bottom of it, and so I'm not confident it isn't a
problem created by the fix I've been testing, which is:

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index e809498..2aa9d86 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2564,12 +2564,15 @@ static void nfs41_check_delegation_stateid(struct nfs4_state *state)
 static int nfs41_check_expired_locks(struct nfs4_state *state)
 {
         int status, ret = NFS_OK;
-        struct nfs4_lock_state *lsp;
+        struct nfs4_lock_state *lsp, *tmp;
         struct nfs_server *server = NFS_SERVER(state->inode);
 
         if (!test_bit(LK_STATE_IN_USE, &state->flags))
                 goto out;
-        list_for_each_entry(lsp, &state->lock_states, ls_locks) {
+        spin_lock(&state->state_lock);
+        list_for_each_entry_safe(lsp, tmp, &state->lock_states, ls_locks) {
+                atomic_inc(&lsp->ls_count);
+                spin_unlock(&state->state_lock);
                 if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
                         struct rpc_cred *cred = lsp->ls_state->owner->so_cred;
 
@@ -2588,7 +2591,10 @@ static int nfs41_check_expired_locks(struct nfs4_state *state)
                                 break;
                         }
                 }
-        };
+                nfs4_put_lock_state(lsp);
+                spin_lock(&state->state_lock);
+        }
+        spin_unlock(&state->state_lock);
 out:
         return ret;
 }

http://people.redhat.com/bcodding/old_stateid_loop is tshark output of my
only good wire capture of the problem. Without this patch, generic/089
crashes long before this problem is reproduced, so I am stuck figuring it
out, I'm afraid. Don't wait on my account.

I plan on trying a bit more to reproduce tomorrow, and if I cannot, I'll
write about it under separate cover.

Ben

2016-11-10 20:54:46

by Anna Schumaker

[permalink] [raw]
Subject: Re: [PATCH v7 13/31] NFSv4.1: Ensure we always run TEST/FREE_STATEID on locks

On 11/10/2016 03:18 PM, Benjamin Coddington wrote:
>
> On 10 Nov 2016, at 10:58, Benjamin Coddington wrote:
>
>> Hi Anna,
>>
>> On 10 Nov 2016, at 10:01, Anna Schumaker wrote:
>>> Do you have an estimate for when this patch will be ready? I want to include it in my next bugfix pull request for 4.9.
>>
>> I haven't posted because I am still trying to get to the bottom of another
>> problem where the client gets stuck in a loop sending the same stateid over
>> and over on NFS4ERR_OLD_STATEID. I want to make sure this problem isn't
>> caused by this fix -- which I don't think it is, but I'd rather make sure.
>> If I don't make any progress on this problem by the end of today, I'll post
>> what I have.
>>
>> Read on if interested in this new problem:
>>
>> It looks like racing opens with the same openowner can be returned out of
>> order by the server, so the client sees stateid seqid of 2 before 1. Then a
>> LOCK sent with seqid 1 is endlessly retried if sent while doing recovery.
>>
>> It's hard to tell if I was able to capture all the moving parts to describe
>> this problem, though. As it takes a very long time for me to reproduce, and
>> the packet captures were dropping frames. I'm working on manually
>> reproducing it now.
>
> Anna,
>
> I haven't gotten to the bottom of it, and so I'm not confident it isn't a
> problem created by the fix I've been testing, which is:
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index e809498..2aa9d86 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -2564,12 +2564,15 @@ static void nfs41_check_delegation_stateid(struct
> nfs4_state *state)
> static int nfs41_check_expired_locks(struct nfs4_state *state)
> {
> int status, ret = NFS_OK;
> - struct nfs4_lock_state *lsp;
> + struct nfs4_lock_state *lsp, *tmp;
> struct nfs_server *server = NFS_SERVER(state->inode);
>
> if (!test_bit(LK_STATE_IN_USE, &state->flags))
> goto out;
> - list_for_each_entry(lsp, &state->lock_states, ls_locks) {
> + spin_lock(&state->state_lock);
> + list_for_each_entry_safe(lsp, tmp, &state->lock_states, ls_locks) {
> + atomic_inc(&lsp->ls_count);
> + spin_unlock(&state->state_lock);
> if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
> struct rpc_cred *cred =
> lsp->ls_state->owner->so_cred;
>
> @@ -2588,7 +2591,10 @@ static int nfs41_check_expired_locks(struct
> nfs4_state *state)
> break;
> }
> }
> - };
> + nfs4_put_lock_state(lsp);
> + spin_lock(&state->state_lock);
> + }
> + spin_unlock(&state->state_lock);
> out:
> return ret;
> }
>
> http://people.redhat.com/bcodding/old_stateid_loop is tshark output of my
> only good wirecapture of the problem. Without this patch, generic/089
> crashes long before this problem is reproduced, so I am stuck figuring it
> out, I'm afraid. Don't wait on my account.
>
> I plan on trying a bit more to reproduce tomorrow, and if I cannot, I'll
> write about it under separate cover.

Sounds good. Thanks for the update!

Anna

>
> Ben