From: Andy Adamson <[email protected]>
This is an expanded version of the "NFSv4 always compare stateids in nfs4_stateid_is_current" patch I sent on Feb 27.
Found at Connectathon 2014 and NetApp internal testing.
nfs4_stateid_is_current is used on the NFSv4 I/O path to determine if a
stateid has changed. The idea is that if there is a stateid expire error
such as NFS4ERR_BAD_STATEID and the stateid used that induced the error has
changed, then we can just resend the RPC from the call prepare state with
the changed stateid instead of kicking off recovery as the changed stateid
indicates it already had been recovered.
This patch set fixes a false positive that resulted in an NFS4ERR_BAD_STATEID
infinite loop. Trond pointed out that the nfs4_select_rw_stateid -EIO error
is special in that it indicates a lost lock.
As I examined the use of nfs4_select_rw_stateid, I addressed some other errors
in the use of nfs4_set_rw_stateid and friends in setattr and filelayout
prepare functions.
Just tested with connectathon tests. Will test more once I'm back in town...
-->Andy
Andy Adamson (4):
NFSv4 propagate nfs4_stateid_is_current errors
NFSv4 Check errors on stateid changed functions in I/O path
NFSv4 _nfs4_do_setattr use zero stateid on lost lock
NFSv4.1 Fail data server I/O if stateid represents a lost lock
fs/nfs/nfs4filelayout.c | 10 +++--
fs/nfs/nfs4proc.c | 116 ++++++++++++++++++++++++++++++++++++------------
fs/nfs/nfs4state.c | 6 +++
3 files changed, 100 insertions(+), 32 deletions(-)
--
1.8.3.1
Hi Trond
I did not get any patches with this email.....
-->Andy
On Tue, Mar 4, 2014 at 2:05 PM, Trond Myklebust
<[email protected]> wrote:
> Hi Andy,
>
> Would this be a sufficient fix?
>
> Cheers
> Trond
>
> Andy Adamson (1):
> NFSv4.1 Fail data server I/O if stateid represents a lost lock
>
> Trond Myklebust (2):
> NFSv4: Fix the return value of nfs4_select_rw_stateid
> NFSv4: Fail the truncate() if the lock/open stateid is invalid
>
> fs/nfs/nfs4filelayout.c | 10 ++++++----
> fs/nfs/nfs4proc.c | 9 ++++++---
> fs/nfs/nfs4state.c | 6 ------
> 3 files changed, 12 insertions(+), 13 deletions(-)
>
> --
> 1.8.5.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mar 4, 2014, at 12:31, <[email protected]> <[email protected]> wrote:
> From: Andy Adamson <[email protected]>
>
> Signed-off-by: Andy Adamson <[email protected]>
> ---
> fs/nfs/nfs4filelayout.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
> index 12c8132..66da1e4 100644
> --- a/fs/nfs/nfs4filelayout.c
> +++ b/fs/nfs/nfs4filelayout.c
> @@ -324,8 +324,9 @@ static void filelayout_read_prepare(struct rpc_task *task, void *data)
> &rdata->res.seq_res,
> task))
> return;
> - nfs4_set_rw_stateid(&rdata->args.stateid, rdata->args.context,
> - rdata->args.lock_context, FMODE_READ);
> + if (nfs4_set_rw_stateid(&rdata->args.stateid, rdata->args.context,
> + rdata->args.lock_context, FMODE_READ) == -EIO)
> + rpc_exit(task, -EIO); /* lost lock, terminate I/O */
> }
>
> static void filelayout_read_call_done(struct rpc_task *task, void *data)
> @@ -435,8 +436,9 @@ static void filelayout_write_prepare(struct rpc_task *task, void *data)
> &wdata->res.seq_res,
> task))
> return;
> - nfs4_set_rw_stateid(&wdata->args.stateid, wdata->args.context,
> - wdata->args.lock_context, FMODE_WRITE);
> + if (nfs4_set_rw_stateid(&wdata->args.stateid, wdata->args.context,
> + wdata->args.lock_context, FMODE_WRITE))
Shouldn?t this have an '== -EIO? test, just like the read case?
> + rpc_exit(task, -EIO); /* lost lock, terminate I/O */
> }
>
> static void filelayout_write_call_done(struct rpc_task *task, void *data)
> --
> 1.8.3.1
>
_________________________________
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
Final tweaks:
- Remove the redundant return value in nfs4_copy_lock_stateid.
- Add comments
Andy Adamson (1):
NFSv4.1 Fail data server I/O if stateid represents a lost lock
Trond Myklebust (3):
NFSv4: nfs4_stateid_is_current should return 'true' for an invalid
stateid
NFSv4: Fix the return value of nfs4_select_rw_stateid
NFSv4: Fail the truncate() if the lock/open stateid is invalid
fs/nfs/nfs4filelayout.c | 10 ++++++----
fs/nfs/nfs4proc.c | 14 +++++++++-----
fs/nfs/nfs4state.c | 14 +++-----------
3 files changed, 18 insertions(+), 20 deletions(-)
--
1.8.5.3
Hi Andy,
Would this be a sufficient fix?
Cheers
Trond
Andy Adamson (1):
NFSv4.1 Fail data server I/O if stateid represents a lost lock
Trond Myklebust (2):
NFSv4: Fix the return value of nfs4_select_rw_stateid
NFSv4: Fail the truncate() if the lock/open stateid is invalid
fs/nfs/nfs4filelayout.c | 10 ++++++----
fs/nfs/nfs4proc.c | 9 ++++++---
fs/nfs/nfs4state.c | 6 ------
3 files changed, 12 insertions(+), 13 deletions(-)
--
1.8.5.3
When nfs4_set_rw_stateid() can fails by returning EIO to indicate that
the stateid is completely invalid, then it makes no sense to have it
trigger a retry of the READ or WRITE operation. Instead, we should just
have it fall through and attempt a recovery.
Link: http://lkml.kernel.org/r/[email protected]
Cc: [email protected] # 3.10+
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 44e088dc357c..4778c55d0336 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -4011,8 +4011,8 @@ static bool nfs4_stateid_is_current(nfs4_stateid *stateid,
{
nfs4_stateid current_stateid;
- if (nfs4_set_rw_stateid(¤t_stateid, ctx, l_ctx, fmode))
- return false;
+ if (nfs4_set_rw_stateid(¤t_stateid, ctx, l_ctx, fmode) == -EIO)
+ return true;
return nfs4_stateid_match(stateid, ¤t_stateid);
}
--
1.8.5.3
From: Andy Adamson <[email protected]>
Check nfs4_set_rw_stateid return and try to match the input stateid
on success.
Return error so that caller can react accordingly
Signed-off-by: Andy Adamson <[email protected]>
---
fs/nfs/nfs4proc.c | 21 +++++++++++++++++----
fs/nfs/nfs4state.c | 6 ++++++
2 files changed, 23 insertions(+), 4 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 02f65cc..2f1997d 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -4046,16 +4046,29 @@ int nfs4_set_rw_stateid(nfs4_stateid *stateid,
}
EXPORT_SYMBOL_GPL(nfs4_set_rw_stateid);
-static bool nfs4_stateid_is_current(nfs4_stateid *stateid,
+/**
+ * @stateid: stateid to test
+ * @ctx: open context with current stateid (unless lock context)
+ * @l_ctx: lock context with current stateid
+ * @fmode: open mode
+ *
+ * returns
+ * 0: @stateid does not match current stateid
+ * 1: @stateid matches current stateid
+ * negative: error
+ */
+static int nfs4_stateid_is_current(nfs4_stateid *stateid,
const struct nfs_open_context *ctx,
const struct nfs_lock_context *l_ctx,
fmode_t fmode)
{
nfs4_stateid current_stateid;
+ int ret;
- if (nfs4_set_rw_stateid(¤t_stateid, ctx, l_ctx, fmode))
- return false;
- return nfs4_stateid_match(stateid, ¤t_stateid);
+ ret = nfs4_set_rw_stateid(¤t_stateid, ctx, l_ctx, fmode);
+ if (ret == 0) /* current stateid set, see if it matches input stateid */
+ return nfs4_stateid_match(stateid, ¤t_stateid) ? 1 : 0;
+ return ret; /* error */
}
static bool nfs4_error_stateid_expired(int err)
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 1cfde97..4cabe2c 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1112,6 +1112,12 @@ static int nfs4_copy_open_stateid(nfs4_stateid *dst, struct nfs4_state *state)
/*
* Byte-range lock aware utility to initialize the stateid of read/write
* requests.
+ *
+ * returns:
+ * 0: success, stateid selected and copied
+ * -EWOULDBLOCK: from nfs4_copy_open. stateid change in progress
+ * with stateid copied
+ * -EIO: lost lock, fail I/O
*/
int nfs4_select_rw_stateid(nfs4_stateid *dst, struct nfs4_state *state,
fmode_t fmode, const struct nfs_lockowner *lockowner)
--
1.8.3.1
If the open stateid could not be recovered, or the file locks were lost,
then we should fail the truncate() operation altogether.
Reported-by: Andy Adamson <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 4778c55d0336..bc60a0f81019 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2398,13 +2398,16 @@ static int _nfs4_do_setattr(struct inode *inode, struct rpc_cred *cred,
if (nfs4_copy_delegation_stateid(&arg.stateid, inode, fmode)) {
/* Use that stateid */
- } else if (truncate && state != NULL && nfs4_valid_open_stateid(state)) {
+ } else if (truncate && state != NULL) {
struct nfs_lockowner lockowner = {
.l_owner = current->files,
.l_pid = current->tgid,
};
- nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
- &lockowner);
+ if (!nfs4_valid_open_stateid(state))
+ return -EBADF;
+ if (nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
+ &lockowner) < 0)
+ return -EBADF;
} else
nfs4_stateid_copy(&arg.stateid, &zero_stateid);
--
1.8.5.3
On Mar 5, 2014, at 5:58, Andy Adamson <[email protected]> wrote:
> On Tue, Mar 4, 2014 at 1:09 PM, Trond Myklebust
> <[email protected]> wrote:
>>
>> On Mar 4, 2014, at 12:31, [email protected] wrote:
>>
>>> From: Andy Adamson <[email protected]>
>>>
>>> This is an expanded version of the "NFSv4 always compare stateids in nfs4_stateid_is_current" patch I sent on Feb 27.
>>>
>>> Found at Connectathon 2014 and NetApp internal testing.
>>>
>>> nfs4_stateid_is_current is used on the NFSv4 I/O path to determine if a
>>> stateid has changed. The idea is that if there is a stateid expire error
>>> such as NFS4ERR_BAD_STATEID and the stateid used that induced the error has
>>> changed, then we can just resend the RPC from the call prepare state with
>>> the changed stateid instead of kicking off recovery as the changed stateid
>>> indicates it already had been recovered.
>>>
>>> This patch set fixes a false positive that resulted in an NFS4ERR_BAD_STATEID
>>> infinite loop. Trond pointed out that the nfs4_select_rw_stateid -EIO error
>>> is special in that it indicates a lost lock.
>>>
>>> As I examined the use of nfs4_select_rw_stateid, I addressed some other errors
>>> in the use of nfs4_set_rw_stateid and friends in setattr and filelayout
>>> prepare functions.
>>>
>>> Just tested with connectathon tests. Will test more once I'm back in town...
>>
>> One question:
>>
>> Do we need the EWOULDBLOCK case at all in nfs4_copy_lock_stateid/nfs4_copy_open_stateid? Looking at the code, it seems particularly redundant for the case of NFSv4.1, where we set the seqid to '0'. Given that we do the nfs4_stateid_is_current() test, it seems unnecessary in the case of NFSv4.0 too...
>
> Yes - Even though you mentioned it's importance at Connectathon, I
> could not find a use of the EWOULDBLOCK return.
>
>>
>> So, how about just throwing out the EWOULDBLOCK returns, then keeping nfs4_stateid_is_current() as it is,
>
> So, you mean keeping nfs4_stateid_is_current as a boolean, and then
> assume that a false return == -EIO?
>
> I don't like that because that is how this bug was created - mixing
> bool and int functions.
Why is that such a problem? The rpc_call_prepare functions copy the stateid and should test the stateid validity flags again.
>> fixing up _nfs4_do_setattr to only deal with the -EIO case,
>
> Yes.
>
> and then fixing up the file layout uses of nfs4_select_rw_stateid()
> to check the return values?
>
> As I did in the last patch...
Yes, but that patch failed to respect the locking requirements.
_________________________________
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
If the open stateid could not be recovered, or the file locks were lost,
then we should fail the truncate() operation altogether.
Reported-by: Andy Adamson <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 44e088dc357c..daf41182ecfb 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2398,13 +2398,16 @@ static int _nfs4_do_setattr(struct inode *inode, struct rpc_cred *cred,
if (nfs4_copy_delegation_stateid(&arg.stateid, inode, fmode)) {
/* Use that stateid */
- } else if (truncate && state != NULL && nfs4_valid_open_stateid(state)) {
+ } else if (truncate && state != NULL) {
struct nfs_lockowner lockowner = {
.l_owner = current->files,
.l_pid = current->tgid,
};
- nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
- &lockowner);
+ if (!nfs4_valid_open_stateid(state))
+ return -EBADF;
+ if (nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
+ &lockowner) < 0)
+ return -EBADF;
} else
nfs4_stateid_copy(&arg.stateid, &zero_stateid);
--
1.8.5.3
On Mar 5, 2014, at 3:51, Andy Adamson <[email protected]> wrote:
> On Tue, Mar 4, 2014 at 1:42 PM, Trond Myklebust
> <[email protected]> wrote:
>> On Tue, 2014-03-04 at 12:31 -0500, [email protected] wrote:
>>> From: Andy Adamson <[email protected]>
>>>
>>> Use the zero stateid if nfs4_select_rw_stateid indicates a lost lock (-EIO)
>>>
>>> Signed-off-by: Andy Adamson <[email protected]>
>>> ---
>>> fs/nfs/nfs4proc.c | 18 +++++++++++-------
>>> 1 file changed, 11 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>>> index de8e7f5..efe940c 100644
>>> --- a/fs/nfs/nfs4proc.c
>>> +++ b/fs/nfs/nfs4proc.c
>>> @@ -2458,18 +2458,22 @@ static int _nfs4_do_setattr(struct inode *inode, struct rpc_cred *cred,
>>> truncate = (sattr->ia_valid & ATTR_SIZE) ? true : false;
>>> fmode = truncate ? FMODE_WRITE : FMODE_READ;
>>>
>>> - if (nfs4_copy_delegation_stateid(&arg.stateid, inode, fmode)) {
>>> - /* Use that stateid */
>>> - } else if (truncate && state != NULL && nfs4_valid_open_stateid(state)) {
>>> + if (nfs4_copy_delegation_stateid(&arg.stateid, inode, fmode))
>>> + /* Use delegation stateid */
>>> + goto do_call;
>>> + if (truncate && state != NULL && nfs4_valid_open_stateid(state)) {
>>> struct nfs_lockowner lockowner = {
>>> .l_owner = current->files,
>>> .l_pid = current->tgid,
>>> };
>>> - nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
>>> - &lockowner);
>>> - } else
>>> - nfs4_stateid_copy(&arg.stateid, &zero_stateid);
>>> + /* Use zero stateid if lock is lost (-EIO fall through) */
>>> + if (nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
>>> + &lockowner) != -EIO)
>>
>> Shouldn't we fail with EBADF instead?
>
> Hmm. -EIO means lost lock, but not lost file, which is why I say we
> send it with a zero stateid.
If the application has lost the lock that was protecting the data, then why should we think it is safe to allow it to proceed to modify the file by truncating it?
_________________________________
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
In commit 5521abfdcf4d6 (NFSv4: Resend the READ/WRITE RPC call
if a stateid change causes an error), we overloaded the return value of
nfs4_select_rw_stateid() to cause it to return -EWOULDBLOCK if an RPC
call is outstanding that would cause the NFSv4 lock or open stateid
to change.
That is all redundant when we actually copy the stateid used in the
read/write RPC call that failed, and check that against the current
stateid. It is doubly so, when we consider that in the NFSv4.1 case,
we also set the stateid's seqid to the special value '0', which means
'match the current valid stateid'.
Reported-by: Andy Adamson <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4state.c | 6 ------
1 file changed, 6 deletions(-)
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index e1a47217c05e..cd0cf93ac927 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -974,9 +974,6 @@ static int nfs4_copy_lock_stateid(nfs4_stateid *dst,
else if (lsp != NULL && test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags) != 0) {
nfs4_stateid_copy(dst, &lsp->ls_stateid);
ret = 0;
- smp_rmb();
- if (!list_empty(&lsp->ls_seqid.list))
- ret = -EWOULDBLOCK;
}
spin_unlock(&state->state_lock);
nfs4_put_lock_state(lsp);
@@ -997,9 +994,6 @@ static int nfs4_copy_open_stateid(nfs4_stateid *dst, struct nfs4_state *state)
src = &state->open_stateid;
nfs4_stateid_copy(dst, src);
ret = 0;
- smp_rmb();
- if (!list_empty(&state->owner->so_seqid.list))
- ret = -EWOULDBLOCK;
} while (read_seqretry(&state->seqlock, seq));
return ret;
}
--
1.8.5.3
Added a patch to fall through and recover when the stateid is invalid.
Andy Adamson (1):
NFSv4.1 Fail data server I/O if stateid represents a lost lock
Trond Myklebust (3):
NFSv4: nfs4_stateid_is_current should return 'true' for an invalid
stateid
NFSv4: Fix the return value of nfs4_select_rw_stateid
NFSv4: Fail the truncate() if the lock/open stateid is invalid
fs/nfs/nfs4filelayout.c | 10 ++++++----
fs/nfs/nfs4proc.c | 13 ++++++++-----
fs/nfs/nfs4state.c | 6 ------
3 files changed, 14 insertions(+), 15 deletions(-)
--
1.8.5.3
On Tue, Mar 4, 2014 at 2:05 PM, Trond Myklebust
<[email protected]> wrote:
> If the open stateid could not be recovered, or the file locks were lost,
> then we should fail the truncate() operation altogether.
>
> Reported-by: Andy Adamson <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfs/nfs4proc.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 44e088dc357c..daf41182ecfb 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -2398,13 +2398,16 @@ static int _nfs4_do_setattr(struct inode *inode, struct rpc_cred *cred,
>
> if (nfs4_copy_delegation_stateid(&arg.stateid, inode, fmode)) {
> /* Use that stateid */
> - } else if (truncate && state != NULL && nfs4_valid_open_stateid(state)) {
> + } else if (truncate && state != NULL) {
> struct nfs_lockowner lockowner = {
> .l_owner = current->files,
> .l_pid = current->tgid,
> };
> - nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
> - &lockowner);
> + if (!nfs4_valid_open_stateid(state))
> + return -EBADF;
> + if (nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
> + &lockowner) < 0)
AFICS this means -EIO = lost lock. Why fail the setattr? the file
handle is not bad. Why not send a ZERO_STATEID? At any rate, -EBADF
looks wrong to me.
-->Andy
> + return -EBADF;
> } else
> nfs4_stateid_copy(&arg.stateid, &zero_stateid);
>
> --
> 1.8.5.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
In commit 5521abfdcf4d6 (NFSv4: Resend the READ/WRITE RPC call
if a stateid change causes an error), we overloaded the return value of
nfs4_select_rw_stateid() to cause it to return -EWOULDBLOCK if an RPC
call is outstanding that would cause the NFSv4 lock or open stateid
to change.
That is all redundant when we actually copy the stateid used in the
read/write RPC call that failed, and check that against the current
stateid. It is doubly so, when we consider that in the NFSv4.1 case,
we also set the stateid's seqid to the special value '0', which means
'match the current valid stateid'.
Reported-by: Andy Adamson <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4state.c | 14 +++-----------
1 file changed, 3 insertions(+), 11 deletions(-)
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index e1a47217c05e..0deb32105ccf 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -974,9 +974,6 @@ static int nfs4_copy_lock_stateid(nfs4_stateid *dst,
else if (lsp != NULL && test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags) != 0) {
nfs4_stateid_copy(dst, &lsp->ls_stateid);
ret = 0;
- smp_rmb();
- if (!list_empty(&lsp->ls_seqid.list))
- ret = -EWOULDBLOCK;
}
spin_unlock(&state->state_lock);
nfs4_put_lock_state(lsp);
@@ -984,10 +981,9 @@ out:
return ret;
}
-static int nfs4_copy_open_stateid(nfs4_stateid *dst, struct nfs4_state *state)
+static void nfs4_copy_open_stateid(nfs4_stateid *dst, struct nfs4_state *state)
{
const nfs4_stateid *src;
- int ret;
int seq;
do {
@@ -996,12 +992,7 @@ static int nfs4_copy_open_stateid(nfs4_stateid *dst, struct nfs4_state *state)
if (test_bit(NFS_OPEN_STATE, &state->flags))
src = &state->open_stateid;
nfs4_stateid_copy(dst, src);
- ret = 0;
- smp_rmb();
- if (!list_empty(&state->owner->so_seqid.list))
- ret = -EWOULDBLOCK;
} while (read_seqretry(&state->seqlock, seq));
- return ret;
}
/*
@@ -1026,7 +1017,8 @@ int nfs4_select_rw_stateid(nfs4_stateid *dst, struct nfs4_state *state,
* choose to use.
*/
goto out;
- ret = nfs4_copy_open_stateid(dst, state);
+ nfs4_copy_open_stateid(dst, state);
+ ret = 0;
out:
if (nfs_server_capable(state->inode, NFS_CAP_STATEID_NFSV41))
dst->seqid = 0;
--
1.8.5.3
If the open stateid could not be recovered, or the file locks were lost,
then we should fail the truncate() operation altogether.
Reported-by: Andy Adamson <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 4ae8141452c9..450bfedbe2f4 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2398,13 +2398,16 @@ static int _nfs4_do_setattr(struct inode *inode, struct rpc_cred *cred,
if (nfs4_copy_delegation_stateid(&arg.stateid, inode, fmode)) {
/* Use that stateid */
- } else if (truncate && state != NULL && nfs4_valid_open_stateid(state)) {
+ } else if (truncate && state != NULL) {
struct nfs_lockowner lockowner = {
.l_owner = current->files,
.l_pid = current->tgid,
};
- nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
- &lockowner);
+ if (!nfs4_valid_open_stateid(state))
+ return -EBADF;
+ if (nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
+ &lockowner) == -EIO)
+ return -EBADF;
} else
nfs4_stateid_copy(&arg.stateid, &zero_stateid);
--
1.8.5.3
From: Andy Adamson <[email protected]>
Use the zero stateid if nfs4_select_rw_stateid indicates a lost lock (-EIO)
Signed-off-by: Andy Adamson <[email protected]>
---
fs/nfs/nfs4proc.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index de8e7f5..efe940c 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2458,18 +2458,22 @@ static int _nfs4_do_setattr(struct inode *inode, struct rpc_cred *cred,
truncate = (sattr->ia_valid & ATTR_SIZE) ? true : false;
fmode = truncate ? FMODE_WRITE : FMODE_READ;
- if (nfs4_copy_delegation_stateid(&arg.stateid, inode, fmode)) {
- /* Use that stateid */
- } else if (truncate && state != NULL && nfs4_valid_open_stateid(state)) {
+ if (nfs4_copy_delegation_stateid(&arg.stateid, inode, fmode))
+ /* Use delegation stateid */
+ goto do_call;
+ if (truncate && state != NULL && nfs4_valid_open_stateid(state)) {
struct nfs_lockowner lockowner = {
.l_owner = current->files,
.l_pid = current->tgid,
};
- nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
- &lockowner);
- } else
- nfs4_stateid_copy(&arg.stateid, &zero_stateid);
+ /* Use zero stateid if lock is lost (-EIO fall through) */
+ if (nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
+ &lockowner) != -EIO)
+ goto do_call;
+ }
+ nfs4_stateid_copy(&arg.stateid, &zero_stateid);
+do_call:
status = nfs4_call_sync(server->client, server, &msg, &arg.seq_args, &res.seq_res, 1);
if (status == 0 && state != NULL)
renew_lease(server, timestamp);
--
1.8.3.1
On Mar 4, 2014, at 12:31, [email protected] wrote:
> From: Andy Adamson <[email protected]>
>
> This is an expanded version of the "NFSv4 always compare stateids in nfs4_stateid_is_current" patch I sent on Feb 27.
>
> Found at Connectathon 2014 and NetApp internal testing.
>
> nfs4_stateid_is_current is used on the NFSv4 I/O path to determine if a
> stateid has changed. The idea is that if there is a stateid expire error
> such as NFS4ERR_BAD_STATEID and the stateid used that induced the error has
> changed, then we can just resend the RPC from the call prepare state with
> the changed stateid instead of kicking off recovery as the changed stateid
> indicates it already had been recovered.
>
> This patch set fixes a false positive that resulted in an NFS4ERR_BAD_STATEID
> infinite loop. Trond pointed out that the nfs4_select_rw_stateid -EIO error
> is special in that it indicates a lost lock.
>
> As I examined the use of nfs4_select_rw_stateid, I addressed some other errors
> in the use of nfs4_set_rw_stateid and friends in setattr and filelayout
> prepare functions.
>
> Just tested with connectathon tests. Will test more once I'm back in town...
One question:
Do we need the EWOULDBLOCK case at all in nfs4_copy_lock_stateid/nfs4_copy_open_stateid? Looking at the code, it seems particularly redundant for the case of NFSv4.1, where we set the seqid to ?0?. Given that we do the nfs4_stateid_is_current() test, it seems unnecessary in the case of NFSv4.0 too...
So, how about just throwing out the EWOULDBLOCK returns, then keeping nfs4_stateid_is_current() as it is, fixing up _nfs4_do_setattr to only deal with the -EIO case, and then fixing up the file layout uses of nfs4_select_rw_stateid() to check the return values?
_________________________________
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
From: Andy Adamson <[email protected]>
Signed-off-by: Andy Adamson <[email protected]>
---
fs/nfs/nfs4filelayout.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
index 12c8132..66da1e4 100644
--- a/fs/nfs/nfs4filelayout.c
+++ b/fs/nfs/nfs4filelayout.c
@@ -324,8 +324,9 @@ static void filelayout_read_prepare(struct rpc_task *task, void *data)
&rdata->res.seq_res,
task))
return;
- nfs4_set_rw_stateid(&rdata->args.stateid, rdata->args.context,
- rdata->args.lock_context, FMODE_READ);
+ if (nfs4_set_rw_stateid(&rdata->args.stateid, rdata->args.context,
+ rdata->args.lock_context, FMODE_READ) == -EIO)
+ rpc_exit(task, -EIO); /* lost lock, terminate I/O */
}
static void filelayout_read_call_done(struct rpc_task *task, void *data)
@@ -435,8 +436,9 @@ static void filelayout_write_prepare(struct rpc_task *task, void *data)
&wdata->res.seq_res,
task))
return;
- nfs4_set_rw_stateid(&wdata->args.stateid, wdata->args.context,
- wdata->args.lock_context, FMODE_WRITE);
+ if (nfs4_set_rw_stateid(&wdata->args.stateid, wdata->args.context,
+ wdata->args.lock_context, FMODE_WRITE))
+ rpc_exit(task, -EIO); /* lost lock, terminate I/O */
}
static void filelayout_write_call_done(struct rpc_task *task, void *data)
--
1.8.3.1
On Tue, Mar 4, 2014 at 1:09 PM, Trond Myklebust
<[email protected]> wrote:
>
> On Mar 4, 2014, at 12:31, [email protected] wrote:
>
>> From: Andy Adamson <[email protected]>
>>
>> This is an expanded version of the "NFSv4 always compare stateids in nfs4_stateid_is_current" patch I sent on Feb 27.
>>
>> Found at Connectathon 2014 and NetApp internal testing.
>>
>> nfs4_stateid_is_current is used on the NFSv4 I/O path to determine if a
>> stateid has changed. The idea is that if there is a stateid expire error
>> such as NFS4ERR_BAD_STATEID and the stateid used that induced the error has
>> changed, then we can just resend the RPC from the call prepare state with
>> the changed stateid instead of kicking off recovery as the changed stateid
>> indicates it already had been recovered.
>>
>> This patch set fixes a false positive that resulted in an NFS4ERR_BAD_STATEID
>> infinite loop. Trond pointed out that the nfs4_select_rw_stateid -EIO error
>> is special in that it indicates a lost lock.
>>
>> As I examined the use of nfs4_select_rw_stateid, I addressed some other errors
>> in the use of nfs4_set_rw_stateid and friends in setattr and filelayout
>> prepare functions.
>>
>> Just tested with connectathon tests. Will test more once I'm back in town...
>
> One question:
>
> Do we need the EWOULDBLOCK case at all in nfs4_copy_lock_stateid/nfs4_copy_open_stateid? Looking at the code, it seems particularly redundant for the case of NFSv4.1, where we set the seqid to '0'. Given that we do the nfs4_stateid_is_current() test, it seems unnecessary in the case of NFSv4.0 too...
Yes - Even though you mentioned it's importance at Connectathon, I
could not find a use of the EWOULDBLOCK return.
>
> So, how about just throwing out the EWOULDBLOCK returns, then keeping nfs4_stateid_is_current() as it is,
So, you mean keeping nfs4_stateid_is_current as a boolean, and then
assume that a false return == -EIO?
I don't like that because that is how this bug was created - mixing
bool and int functions.
> fixing up _nfs4_do_setattr to only deal with the -EIO case,
Yes.
and then fixing up the file layout uses of nfs4_select_rw_stateid()
to check the return values?
As I did in the last patch...
>
> _________________________________
> Trond Myklebust
> Linux NFS client maintainer, PrimaryData
> [email protected]
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Mar 4, 2014 at 1:08 PM, Anna Schumaker <[email protected]> wrote:
> On 03/04/2014 12:31 PM, [email protected] wrote:
>> From: Andy Adamson <[email protected]>
>>
>> Retry I/O from the call prepare state (with the new stateid) if the stateid
>> has changed on an stateid expired error
>>
>> Fail the I/O if the statied represents a lost lock.
>>
>> Otherwise, let the error handler decide what to do.
>>
>> Without this change, I/O with a stateid expired error such as
>> NFS4ERR_BAD_STATEID can be retried without recovery and loop forever.
>>
>> Signed-off-by: Andy Adamson <[email protected]>
>> ---
>> fs/nfs/nfs4proc.c | 77 +++++++++++++++++++++++++++++++++++++++++++------------
>> 1 file changed, 60 insertions(+), 17 deletions(-)
>>
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index 2f1997d..de8e7f5 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -4107,29 +4107,49 @@ static int nfs4_read_done_cb(struct rpc_task *task, struct nfs_read_data *data)
>> return 0;
>> }
>>
>> -static bool nfs4_read_stateid_changed(struct rpc_task *task,
>> +static int nfs4_read_stateid_changed(struct rpc_task *task,
>> struct nfs_readargs *args)
>> {
>> + int ret;
>>
>> - if (!nfs4_error_stateid_expired(task->tk_status) ||
>> - nfs4_stateid_is_current(&args->stateid,
>> + ret = nfs4_stateid_is_current(&args->stateid,
>> args->context,
>> args->lock_context,
>> - FMODE_READ))
>> - return false;
>> - rpc_restart_call_prepare(task);
>> - return true;
>> + FMODE_READ);
>> + switch (ret) {
>> + case 0: /* stateid changed */
>> + if (nfs4_error_stateid_expired(task->tk_status)) {
>> + /* stateid expired error, retry with changed stateid */
>> + rpc_restart_call_prepare(task);
>> + return -EAGAIN;
>> + }
>> + return 0; /* let error handler proceed */
>> +
>> + case -EIO: /* -EIO means lock stateid was lost: fail the I/O */
>> + task->tk_status = -EIO;
>> + return -EIO;
>> +
>> + case -EWOULDBLOCK: /* stateid change in progress */
>> + default:/* stateid has not changed, let error handler proceed.*/
>> + return 0;
>> +
>> + }
>> }
>
> This bit of code exists down below, too. Is there any reason it can't be an error handling function?
OK
>
> Anna
>
>
>>
>> static int nfs4_read_done(struct rpc_task *task, struct nfs_read_data *data)
>> {
>> + int ret;
>>
>> dprintk("--> %s\n", __func__);
>>
>> if (!nfs4_sequence_done(task, &data->res.seq_res))
>> return -EAGAIN;
>> - if (nfs4_read_stateid_changed(task, &data->args))
>> - return -EAGAIN;
>> + ret = nfs4_read_stateid_changed(task, &data->args);
>> + switch (ret) {
>> + case -EAGAIN:
>> + case -EIO:
>> + return ret;
>> + }
>> return data->read_done_cb ? data->read_done_cb(task, data) :
>> nfs4_read_done_cb(task, data);
>> }
>> @@ -4176,23 +4196,46 @@ static int nfs4_write_done_cb(struct rpc_task *task, struct nfs_write_data *data
>> static bool nfs4_write_stateid_changed(struct rpc_task *task,
>> struct nfs_writeargs *args)
>> {
>> + int ret;
>>
>> - if (!nfs4_error_stateid_expired(task->tk_status) ||
>> - nfs4_stateid_is_current(&args->stateid,
>> + ret = nfs4_stateid_is_current(&args->stateid,
>> args->context,
>> args->lock_context,
>> - FMODE_WRITE))
>> - return false;
>> - rpc_restart_call_prepare(task);
>> - return true;
>> + FMODE_WRITE);
>> +
>> + switch (ret) {
>> + case 0: /* stateid changed */
>> + if (nfs4_error_stateid_expired(task->tk_status)) {
>> + /* stateid expired error, retry with changed stateid */
>> + rpc_restart_call_prepare(task);
>> + return -EAGAIN;
>> + }
>> + return 0; /* let error handler proceed */
>> +
>> + case -EIO: /* -EIO means lock stateid was lost: fail the I/O */
>> + task->tk_status = -EIO;
>> + return -EIO;
>> +
>> + case -EWOULDBLOCK: /* stateid change in progress */
>> + default:/* stateid has not changed, let error handler proceed.*/
>> + return 0;
>> +
>> + }
>> }
>>
>> static int nfs4_write_done(struct rpc_task *task, struct nfs_write_data *data)
>> {
>> + int ret;
>> +
>> if (!nfs4_sequence_done(task, &data->res.seq_res))
>> return -EAGAIN;
>> - if (nfs4_write_stateid_changed(task, &data->args))
>> - return -EAGAIN;
>> +
>> + ret = nfs4_write_stateid_changed(task, &data->args);
>> + switch (ret) {
>> + case -EAGAIN:
>> + case -EIO:
>> + return ret;
>> + }
>> return data->write_done_cb ? data->write_done_cb(task, data) :
>> nfs4_write_done_cb(task, data);
>> }
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Andy Adamson <[email protected]>
Signed-off-by: Andy Adamson <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4filelayout.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
index 12c8132ad408..b9a35c05b60f 100644
--- a/fs/nfs/nfs4filelayout.c
+++ b/fs/nfs/nfs4filelayout.c
@@ -324,8 +324,9 @@ static void filelayout_read_prepare(struct rpc_task *task, void *data)
&rdata->res.seq_res,
task))
return;
- nfs4_set_rw_stateid(&rdata->args.stateid, rdata->args.context,
- rdata->args.lock_context, FMODE_READ);
+ if (nfs4_set_rw_stateid(&rdata->args.stateid, rdata->args.context,
+ rdata->args.lock_context, FMODE_READ) == -EIO)
+ rpc_exit(task, -EIO); /* lost lock, terminate I/O */
}
static void filelayout_read_call_done(struct rpc_task *task, void *data)
@@ -435,8 +436,9 @@ static void filelayout_write_prepare(struct rpc_task *task, void *data)
&wdata->res.seq_res,
task))
return;
- nfs4_set_rw_stateid(&wdata->args.stateid, wdata->args.context,
- wdata->args.lock_context, FMODE_WRITE);
+ if (nfs4_set_rw_stateid(&wdata->args.stateid, wdata->args.context,
+ wdata->args.lock_context, FMODE_WRITE) == -EIO)
+ rpc_exit(task, -EIO); /* lost lock, terminate I/O */
}
static void filelayout_write_call_done(struct rpc_task *task, void *data)
--
1.8.5.3
On Mar 5, 2014, at 6:40, Andy Adamson <[email protected]> wrote:
> On Tue, Mar 4, 2014 at 2:05 PM, Trond Myklebust
> <[email protected]> wrote:
>> If the open stateid could not be recovered, or the file locks were lost,
>> then we should fail the truncate() operation altogether.
>>
>> Reported-by: Andy Adamson <[email protected]>
>> Link: http://lkml.kernel.org/r/[email protected]
>> Signed-off-by: Trond Myklebust <[email protected]>
>> ---
>> fs/nfs/nfs4proc.c | 9 ++++++---
>> 1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index 44e088dc357c..daf41182ecfb 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -2398,13 +2398,16 @@ static int _nfs4_do_setattr(struct inode *inode, struct rpc_cred *cred,
>>
>> if (nfs4_copy_delegation_stateid(&arg.stateid, inode, fmode)) {
>> /* Use that stateid */
>> - } else if (truncate && state != NULL && nfs4_valid_open_stateid(state)) {
>> + } else if (truncate && state != NULL) {
>> struct nfs_lockowner lockowner = {
>> .l_owner = current->files,
>> .l_pid = current->tgid,
>> };
>> - nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
>> - &lockowner);
>> + if (!nfs4_valid_open_stateid(state))
>> + return -EBADF;
>> + if (nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
>> + &lockowner) < 0)
>
> AFICS this means -EIO = lost lock. Why fail the setattr? the file
> handle is not bad. Why not send a ZERO_STATEID? At any rate, -EBADF
> looks wrong to me.
What should we use instead of EBADF? EIO doesn?t really help you figure out what is wrong or how to fix it.
_________________________________
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
On 03/04/2014 12:31 PM, [email protected] wrote:
> From: Andy Adamson <[email protected]>
>
> Retry I/O from the call prepare state (with the new stateid) if the stateid
> has changed on an stateid expired error
>
> Fail the I/O if the statied represents a lost lock.
>
> Otherwise, let the error handler decide what to do.
>
> Without this change, I/O with a stateid expired error such as
> NFS4ERR_BAD_STATEID can be retried without recovery and loop forever.
>
> Signed-off-by: Andy Adamson <[email protected]>
> ---
> fs/nfs/nfs4proc.c | 77 +++++++++++++++++++++++++++++++++++++++++++------------
> 1 file changed, 60 insertions(+), 17 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 2f1997d..de8e7f5 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -4107,29 +4107,49 @@ static int nfs4_read_done_cb(struct rpc_task *task, struct nfs_read_data *data)
> return 0;
> }
>
> -static bool nfs4_read_stateid_changed(struct rpc_task *task,
> +static int nfs4_read_stateid_changed(struct rpc_task *task,
> struct nfs_readargs *args)
> {
> + int ret;
>
> - if (!nfs4_error_stateid_expired(task->tk_status) ||
> - nfs4_stateid_is_current(&args->stateid,
> + ret = nfs4_stateid_is_current(&args->stateid,
> args->context,
> args->lock_context,
> - FMODE_READ))
> - return false;
> - rpc_restart_call_prepare(task);
> - return true;
> + FMODE_READ);
> + switch (ret) {
> + case 0: /* stateid changed */
> + if (nfs4_error_stateid_expired(task->tk_status)) {
> + /* stateid expired error, retry with changed stateid */
> + rpc_restart_call_prepare(task);
> + return -EAGAIN;
> + }
> + return 0; /* let error handler proceed */
> +
> + case -EIO: /* -EIO means lock stateid was lost: fail the I/O */
> + task->tk_status = -EIO;
> + return -EIO;
> +
> + case -EWOULDBLOCK: /* stateid change in progress */
> + default:/* stateid has not changed, let error handler proceed.*/
> + return 0;
> +
> + }
> }
This bit of code exists down below, too. Is there any reason it can't be an error handling function?
Anna
>
> static int nfs4_read_done(struct rpc_task *task, struct nfs_read_data *data)
> {
> + int ret;
>
> dprintk("--> %s\n", __func__);
>
> if (!nfs4_sequence_done(task, &data->res.seq_res))
> return -EAGAIN;
> - if (nfs4_read_stateid_changed(task, &data->args))
> - return -EAGAIN;
> + ret = nfs4_read_stateid_changed(task, &data->args);
> + switch (ret) {
> + case -EAGAIN:
> + case -EIO:
> + return ret;
> + }
> return data->read_done_cb ? data->read_done_cb(task, data) :
> nfs4_read_done_cb(task, data);
> }
> @@ -4176,23 +4196,46 @@ static int nfs4_write_done_cb(struct rpc_task *task, struct nfs_write_data *data
> static bool nfs4_write_stateid_changed(struct rpc_task *task,
> struct nfs_writeargs *args)
> {
> + int ret;
>
> - if (!nfs4_error_stateid_expired(task->tk_status) ||
> - nfs4_stateid_is_current(&args->stateid,
> + ret = nfs4_stateid_is_current(&args->stateid,
> args->context,
> args->lock_context,
> - FMODE_WRITE))
> - return false;
> - rpc_restart_call_prepare(task);
> - return true;
> + FMODE_WRITE);
> +
> + switch (ret) {
> + case 0: /* stateid changed */
> + if (nfs4_error_stateid_expired(task->tk_status)) {
> + /* stateid expired error, retry with changed stateid */
> + rpc_restart_call_prepare(task);
> + return -EAGAIN;
> + }
> + return 0; /* let error handler proceed */
> +
> + case -EIO: /* -EIO means lock stateid was lost: fail the I/O */
> + task->tk_status = -EIO;
> + return -EIO;
> +
> + case -EWOULDBLOCK: /* stateid change in progress */
> + default:/* stateid has not changed, let error handler proceed.*/
> + return 0;
> +
> + }
> }
>
> static int nfs4_write_done(struct rpc_task *task, struct nfs_write_data *data)
> {
> + int ret;
> +
> if (!nfs4_sequence_done(task, &data->res.seq_res))
> return -EAGAIN;
> - if (nfs4_write_stateid_changed(task, &data->args))
> - return -EAGAIN;
> +
> + ret = nfs4_write_stateid_changed(task, &data->args);
> + switch (ret) {
> + case -EAGAIN:
> + case -EIO:
> + return ret;
> + }
> return data->write_done_cb ? data->write_done_cb(task, data) :
> nfs4_write_done_cb(task, data);
> }
From: Andy Adamson <[email protected]>
Signed-off-by: Andy Adamson <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4filelayout.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
index 12c8132ad408..b9a35c05b60f 100644
--- a/fs/nfs/nfs4filelayout.c
+++ b/fs/nfs/nfs4filelayout.c
@@ -324,8 +324,9 @@ static void filelayout_read_prepare(struct rpc_task *task, void *data)
&rdata->res.seq_res,
task))
return;
- nfs4_set_rw_stateid(&rdata->args.stateid, rdata->args.context,
- rdata->args.lock_context, FMODE_READ);
+ if (nfs4_set_rw_stateid(&rdata->args.stateid, rdata->args.context,
+ rdata->args.lock_context, FMODE_READ) == -EIO)
+ rpc_exit(task, -EIO); /* lost lock, terminate I/O */
}
static void filelayout_read_call_done(struct rpc_task *task, void *data)
@@ -435,8 +436,9 @@ static void filelayout_write_prepare(struct rpc_task *task, void *data)
&wdata->res.seq_res,
task))
return;
- nfs4_set_rw_stateid(&wdata->args.stateid, wdata->args.context,
- wdata->args.lock_context, FMODE_WRITE);
+ if (nfs4_set_rw_stateid(&wdata->args.stateid, wdata->args.context,
+ wdata->args.lock_context, FMODE_WRITE) == -EIO)
+ rpc_exit(task, -EIO); /* lost lock, terminate I/O */
}
static void filelayout_write_call_done(struct rpc_task *task, void *data)
--
1.8.5.3
From: Andy Adamson <[email protected]>
Signed-off-by: Andy Adamson <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4filelayout.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
index 12c8132ad408..b9a35c05b60f 100644
--- a/fs/nfs/nfs4filelayout.c
+++ b/fs/nfs/nfs4filelayout.c
@@ -324,8 +324,9 @@ static void filelayout_read_prepare(struct rpc_task *task, void *data)
&rdata->res.seq_res,
task))
return;
- nfs4_set_rw_stateid(&rdata->args.stateid, rdata->args.context,
- rdata->args.lock_context, FMODE_READ);
+ if (nfs4_set_rw_stateid(&rdata->args.stateid, rdata->args.context,
+ rdata->args.lock_context, FMODE_READ) == -EIO)
+ rpc_exit(task, -EIO); /* lost lock, terminate I/O */
}
static void filelayout_read_call_done(struct rpc_task *task, void *data)
@@ -435,8 +436,9 @@ static void filelayout_write_prepare(struct rpc_task *task, void *data)
&wdata->res.seq_res,
task))
return;
- nfs4_set_rw_stateid(&wdata->args.stateid, wdata->args.context,
- wdata->args.lock_context, FMODE_WRITE);
+ if (nfs4_set_rw_stateid(&wdata->args.stateid, wdata->args.context,
+ wdata->args.lock_context, FMODE_WRITE) == -EIO)
+ rpc_exit(task, -EIO); /* lost lock, terminate I/O */
}
static void filelayout_write_call_done(struct rpc_task *task, void *data)
--
1.8.5.3
On Tue, Mar 4, 2014 at 1:42 PM, Trond Myklebust
<[email protected]> wrote:
> On Tue, 2014-03-04 at 12:31 -0500, [email protected] wrote:
>> From: Andy Adamson <[email protected]>
>>
>> Use the zero stateid if nfs4_select_rw_stateid indicates a lost lock (-EIO)
>>
>> Signed-off-by: Andy Adamson <[email protected]>
>> ---
>> fs/nfs/nfs4proc.c | 18 +++++++++++-------
>> 1 file changed, 11 insertions(+), 7 deletions(-)
>>
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index de8e7f5..efe940c 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -2458,18 +2458,22 @@ static int _nfs4_do_setattr(struct inode *inode, struct rpc_cred *cred,
>> truncate = (sattr->ia_valid & ATTR_SIZE) ? true : false;
>> fmode = truncate ? FMODE_WRITE : FMODE_READ;
>>
>> - if (nfs4_copy_delegation_stateid(&arg.stateid, inode, fmode)) {
>> - /* Use that stateid */
>> - } else if (truncate && state != NULL && nfs4_valid_open_stateid(state)) {
>> + if (nfs4_copy_delegation_stateid(&arg.stateid, inode, fmode))
>> + /* Use delegation stateid */
>> + goto do_call;
>> + if (truncate && state != NULL && nfs4_valid_open_stateid(state)) {
>> struct nfs_lockowner lockowner = {
>> .l_owner = current->files,
>> .l_pid = current->tgid,
>> };
>> - nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
>> - &lockowner);
>> - } else
>> - nfs4_stateid_copy(&arg.stateid, &zero_stateid);
>> + /* Use zero stateid if lock is lost (-EIO fall through) */
>> + if (nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
>> + &lockowner) != -EIO)
>
> Shouldn't we fail with EBADF instead?
Hmm. -EIO means lost lock, but not lost file, which is why I say we
send it with a zero stateid.
-->Andy
>
>> + goto do_call;
>> + }
>> + nfs4_stateid_copy(&arg.stateid, &zero_stateid);
>>
>> +do_call:
>> status = nfs4_call_sync(server->client, server, &msg, &arg.seq_args, &res.seq_res, 1);
>> if (status == 0 && state != NULL)
>> renew_lease(server, timestamp);
>
> --
> Trond Myklebust
> Linux NFS client maintainer, PrimaryData
> [email protected]
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2014-03-04 at 12:31 -0500, [email protected] wrote:
> From: Andy Adamson <[email protected]>
>
> Use the zero stateid if nfs4_select_rw_stateid indicates a lost lock (-EIO)
>
> Signed-off-by: Andy Adamson <[email protected]>
> ---
> fs/nfs/nfs4proc.c | 18 +++++++++++-------
> 1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index de8e7f5..efe940c 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -2458,18 +2458,22 @@ static int _nfs4_do_setattr(struct inode *inode, struct rpc_cred *cred,
> truncate = (sattr->ia_valid & ATTR_SIZE) ? true : false;
> fmode = truncate ? FMODE_WRITE : FMODE_READ;
>
> - if (nfs4_copy_delegation_stateid(&arg.stateid, inode, fmode)) {
> - /* Use that stateid */
> - } else if (truncate && state != NULL && nfs4_valid_open_stateid(state)) {
> + if (nfs4_copy_delegation_stateid(&arg.stateid, inode, fmode))
> + /* Use delegation stateid */
> + goto do_call;
> + if (truncate && state != NULL && nfs4_valid_open_stateid(state)) {
> struct nfs_lockowner lockowner = {
> .l_owner = current->files,
> .l_pid = current->tgid,
> };
> - nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
> - &lockowner);
> - } else
> - nfs4_stateid_copy(&arg.stateid, &zero_stateid);
> + /* Use zero stateid if lock is lost (-EIO fall through) */
> + if (nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
> + &lockowner) != -EIO)
Shouldn't we fail with EBADF instead?
> + goto do_call;
> + }
> + nfs4_stateid_copy(&arg.stateid, &zero_stateid);
>
> +do_call:
> status = nfs4_call_sync(server->client, server, &msg, &arg.seq_args, &res.seq_res, 1);
> if (status == 0 && state != NULL)
> renew_lease(server, timestamp);
--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
When nfs4_set_rw_stateid() can fails by returning EIO to indicate that
the stateid is completely invalid, then it makes no sense to have it
trigger a retry of the READ or WRITE operation. Instead, we should just
have it fall through and attempt a recovery.
This fixes an infinite loop in which the client keeps replaying the same
bad stateid back to the server.
Reported-by: Andy Adamson <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Cc: [email protected] # 3.10+
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 44e088dc357c..4ae8141452c9 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -4011,8 +4011,9 @@ static bool nfs4_stateid_is_current(nfs4_stateid *stateid,
{
nfs4_stateid current_stateid;
- if (nfs4_set_rw_stateid(¤t_stateid, ctx, l_ctx, fmode))
- return false;
+ /* If the current stateid represents a lost lock, then exit */
+ if (nfs4_set_rw_stateid(¤t_stateid, ctx, l_ctx, fmode) == -EIO)
+ return true;
return nfs4_stateid_match(stateid, ¤t_stateid);
}
--
1.8.5.3
From: Andy Adamson <[email protected]>
Retry I/O from the call prepare state (with the new stateid) if the stateid
has changed on an stateid expired error
Fail the I/O if the statied represents a lost lock.
Otherwise, let the error handler decide what to do.
Without this change, I/O with a stateid expired error such as
NFS4ERR_BAD_STATEID can be retried without recovery and loop forever.
Signed-off-by: Andy Adamson <[email protected]>
---
fs/nfs/nfs4proc.c | 77 +++++++++++++++++++++++++++++++++++++++++++------------
1 file changed, 60 insertions(+), 17 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 2f1997d..de8e7f5 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -4107,29 +4107,49 @@ static int nfs4_read_done_cb(struct rpc_task *task, struct nfs_read_data *data)
return 0;
}
-static bool nfs4_read_stateid_changed(struct rpc_task *task,
+static int nfs4_read_stateid_changed(struct rpc_task *task,
struct nfs_readargs *args)
{
+ int ret;
- if (!nfs4_error_stateid_expired(task->tk_status) ||
- nfs4_stateid_is_current(&args->stateid,
+ ret = nfs4_stateid_is_current(&args->stateid,
args->context,
args->lock_context,
- FMODE_READ))
- return false;
- rpc_restart_call_prepare(task);
- return true;
+ FMODE_READ);
+ switch (ret) {
+ case 0: /* stateid changed */
+ if (nfs4_error_stateid_expired(task->tk_status)) {
+ /* stateid expired error, retry with changed stateid */
+ rpc_restart_call_prepare(task);
+ return -EAGAIN;
+ }
+ return 0; /* let error handler proceed */
+
+ case -EIO: /* -EIO means lock stateid was lost: fail the I/O */
+ task->tk_status = -EIO;
+ return -EIO;
+
+ case -EWOULDBLOCK: /* stateid change in progress */
+ default:/* stateid has not changed, let error handler proceed.*/
+ return 0;
+
+ }
}
static int nfs4_read_done(struct rpc_task *task, struct nfs_read_data *data)
{
+ int ret;
dprintk("--> %s\n", __func__);
if (!nfs4_sequence_done(task, &data->res.seq_res))
return -EAGAIN;
- if (nfs4_read_stateid_changed(task, &data->args))
- return -EAGAIN;
+ ret = nfs4_read_stateid_changed(task, &data->args);
+ switch (ret) {
+ case -EAGAIN:
+ case -EIO:
+ return ret;
+ }
return data->read_done_cb ? data->read_done_cb(task, data) :
nfs4_read_done_cb(task, data);
}
@@ -4176,23 +4196,46 @@ static int nfs4_write_done_cb(struct rpc_task *task, struct nfs_write_data *data
static bool nfs4_write_stateid_changed(struct rpc_task *task,
struct nfs_writeargs *args)
{
+ int ret;
- if (!nfs4_error_stateid_expired(task->tk_status) ||
- nfs4_stateid_is_current(&args->stateid,
+ ret = nfs4_stateid_is_current(&args->stateid,
args->context,
args->lock_context,
- FMODE_WRITE))
- return false;
- rpc_restart_call_prepare(task);
- return true;
+ FMODE_WRITE);
+
+ switch (ret) {
+ case 0: /* stateid changed */
+ if (nfs4_error_stateid_expired(task->tk_status)) {
+ /* stateid expired error, retry with changed stateid */
+ rpc_restart_call_prepare(task);
+ return -EAGAIN;
+ }
+ return 0; /* let error handler proceed */
+
+ case -EIO: /* -EIO means lock stateid was lost: fail the I/O */
+ task->tk_status = -EIO;
+ return -EIO;
+
+ case -EWOULDBLOCK: /* stateid change in progress */
+ default:/* stateid has not changed, let error handler proceed.*/
+ return 0;
+
+ }
}
static int nfs4_write_done(struct rpc_task *task, struct nfs_write_data *data)
{
+ int ret;
+
if (!nfs4_sequence_done(task, &data->res.seq_res))
return -EAGAIN;
- if (nfs4_write_stateid_changed(task, &data->args))
- return -EAGAIN;
+
+ ret = nfs4_write_stateid_changed(task, &data->args);
+ switch (ret) {
+ case -EAGAIN:
+ case -EIO:
+ return ret;
+ }
return data->write_done_cb ? data->write_done_cb(task, data) :
nfs4_write_done_cb(task, data);
}
--
1.8.3.1
In commit 5521abfdcf4d6 (NFSv4: Resend the READ/WRITE RPC call
if a stateid change causes an error), we overloaded the return value of
nfs4_select_rw_stateid() to cause it to return -EWOULDBLOCK if an RPC
call is outstanding that would cause the NFSv4 lock or open stateid
to change.
That is all redundant when we actually copy the stateid used in the
read/write RPC call that failed, and check that against the current
stateid. It is doubly so, when we consider that in the NFSv4.1 case,
we also set the stateid's seqid to the special value '0', which means
'match the current valid stateid'.
Reported-by: Andy Adamson <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4state.c | 6 ------
1 file changed, 6 deletions(-)
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index e1a47217c05e..cd0cf93ac927 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -974,9 +974,6 @@ static int nfs4_copy_lock_stateid(nfs4_stateid *dst,
else if (lsp != NULL && test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags) != 0) {
nfs4_stateid_copy(dst, &lsp->ls_stateid);
ret = 0;
- smp_rmb();
- if (!list_empty(&lsp->ls_seqid.list))
- ret = -EWOULDBLOCK;
}
spin_unlock(&state->state_lock);
nfs4_put_lock_state(lsp);
@@ -997,9 +994,6 @@ static int nfs4_copy_open_stateid(nfs4_stateid *dst, struct nfs4_state *state)
src = &state->open_stateid;
nfs4_stateid_copy(dst, src);
ret = 0;
- smp_rmb();
- if (!list_empty(&state->owner->so_seqid.list))
- ret = -EWOULDBLOCK;
} while (read_seqretry(&state->seqlock, seq));
return ret;
}
--
1.8.5.3