2013-06-26 19:21:30

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 0/7] miscellaneous nfsd bugfixes

From: "J. Bruce Fields" <[email protected]>

These are fixes for some miscellaneous bugs found at the bakeathon, to
be queued for 3.11.

--b.

J. Bruce Fields (7):
nfsd4: fix decoding across page boundaries
nfsd4: minor read_buf cleanup
svcrpc: fix handling of too-short rpc's
svcrpc: don't error out on small tcp fragment
nfsd4: delegation-based open reclaims should bypass permissions
nfsd4: do not throw away 4.1 lock state on last unlock
nfsd4: return delegation immediately if lease fails

fs/nfsd/nfs4proc.c | 3 ++-
fs/nfsd/nfs4state.c | 13 ++++---------
fs/nfsd/nfs4xdr.c | 24 ++++++++++++++----------
net/sunrpc/svcsock.c | 9 +++++++--
4 files changed, 27 insertions(+), 22 deletions(-)

--
1.8.1.4



2013-06-26 19:21:31

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 4/7] svcrpc: don't error out on small tcp fragment

From: "J. Bruce Fields" <[email protected]>

Though clients we care about mostly don't do this, it is possible for
rpc requests to be sent in multiple fragments. Here we have a sanity
check to ensure that the final received rpc isn't too small--except that
the number we're actually checking is the length of just the final
fragment, not of the whole rpc. So a perfectly legal rpc that's
unluckily fragmented could cause the server to close the connection
here.

Cc: [email protected]
Signed-off-by: J. Bruce Fields <[email protected]>
---
net/sunrpc/svcsock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index df74919..305374d 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -1095,7 +1095,7 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
goto err_noclose;
}

- if (svc_sock_reclen(svsk) < 8) {
+ if (svsk->sk_datalen < 8) {
svsk->sk_datalen = 0;
goto err_delete; /* client is nuts. */
}
--
1.8.1.4


2013-06-26 19:21:32

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 5/7] nfsd4: delegation-based open reclaims should bypass permissions

From: "J. Bruce Fields" <[email protected]>

We saw a v4.0 client's create fail as follows:

- open create succeeds and gets a read delegation
- client attempts to set mode on new file, gets DELAY while
server recalls delegation.
- client attempts a CLAIM_DELEGATE_CUR open using the
delegation, gets error because of new file mode.

This probably can't happen on a recent kernel since we're no longer
giving out delegations on create opens. Nevertheless, it's a
bug--reclaim opens should bypass permission checks.

Reported-by: Steve Dickson <[email protected]>
Reported-by: Trond Myklebust <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4proc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 1a1ff24..a7cee86 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -296,7 +296,8 @@ do_open_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, stru

nfsd4_set_open_owner_reply_cache(cstate, open, resfh);
accmode = NFSD_MAY_NOP;
- if (open->op_created)
+ if (open->op_created ||
+ open->op_claim_type == NFS4_OPEN_CLAIM_DELEGATE_CUR)
accmode |= NFSD_MAY_OWNER_OVERRIDE;
status = do_open_permission(rqstp, resfh, open, accmode);
set_change_info(&open->op_cinfo, current_fh);
--
1.8.1.4


2013-06-26 19:21:30

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 1/7] nfsd4: fix decoding across page boundaries

From: "J. Bruce Fields" <[email protected]>

nfsd4: fix decoding of compounds across page boundaries

A freebsd NFSv4.0 client was getting rare IO errors expanding a tarball.
A network trace showed the server returning BAD_XDR on the final getattr
of a getattr+write+getattr compound. The final getattr started on a
page boundary.

I believe the Linux client ignores errors on the post-write getattr, and
that that's why we haven't seen this before.

Cc: [email protected]
Reported-by: Rick Macklem <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 171fe5e4..e34f5eb 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -167,8 +167,8 @@ static __be32 *read_buf(struct nfsd4_compoundargs *argp, u32 nbytes)
*/
memcpy(p, argp->p, avail);
/* step to next page */
- argp->p = page_address(argp->pagelist[0]);
argp->pagelist++;
+ argp->p = page_address(argp->pagelist[0]);
if (argp->pagelen < PAGE_SIZE) {
argp->end = argp->p + (argp->pagelen>>2);
argp->pagelen = 0;
--
1.8.1.4


2013-06-26 19:21:31

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 3/7] svcrpc: fix handling of too-short rpc's

From: "J. Bruce Fields" <[email protected]>

If we detect that an rpc is too short, we abort and close the
connection. Except, there's a bug here: we're leaving sk_datalen
nonzero without leaving any pages in the sk_pages array. The most
likely result of the inconsistency is a subsequent crash in
svc_tcp_clear_pages.

Also demote the BUG_ON in svc_tcp_clear_pages to a WARN.

Cc: [email protected]
Signed-off-by: J. Bruce Fields <[email protected]>
---
net/sunrpc/svcsock.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 0f679df..df74919 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -917,7 +917,10 @@ static void svc_tcp_clear_pages(struct svc_sock *svsk)
len = svsk->sk_datalen;
npages = (len + PAGE_SIZE - 1) >> PAGE_SHIFT;
for (i = 0; i < npages; i++) {
- BUG_ON(svsk->sk_pages[i] == NULL);
+ if (svsk->sk_pages[i] == NULL) {
+ WARN_ON_ONCE(1);
+ continue;
+ }
put_page(svsk->sk_pages[i]);
svsk->sk_pages[i] = NULL;
}
@@ -1092,8 +1095,10 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
goto err_noclose;
}

- if (svc_sock_reclen(svsk) < 8)
+ if (svc_sock_reclen(svsk) < 8) {
+ svsk->sk_datalen = 0;
goto err_delete; /* client is nuts. */
+ }

rqstp->rq_arg.len = svsk->sk_datalen;
rqstp->rq_arg.page_base = 0;
--
1.8.1.4


2013-06-26 19:27:10

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 0/7] miscellaneous nfsd bugfixes

On Wed, Jun 26, 2013 at 03:21:20PM -0400, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <[email protected]>
>
> These are fixes for some miscellaneous bugs found at the bakeathon, to
> be queued for 3.11.

Also to help track down the decoding errors I added a dumb pynfs test,
WRITE15, which does writes of every possible size from 0 bytes to 8k:

git://linux-nfs.org/~bfields/pynfs.git

--b.

2013-06-26 19:21:32

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 6/7] nfsd4: do not throw away 4.1 lock state on last unlock

From: "J. Bruce Fields" <[email protected]>

This reverts commit eb2099f31b0f090684a64ef8df44a30ff7c45fc2 "nfsd4:
release lockowners on last unlock in 4.1 case". Trond identified
language in rfc 5661 section 8.2.4 which forbids this behavior:

Stateids associated with byte-range locks are an exception.
They remain valid even if a LOCKU frees all remaining locks, so
long as the open file with which they are associated remains
open, unless the client frees the stateids via the FREE_STATEID
operation.

And bakeathon 2013 testing found a 4.1 freebsd client was getting an
incorrect BAD_STATEID return from a FREE_STATEID in the above situation
and then failing.

The spec language honestly was probably a mistake but at this point with
implementations already following it we're probably stuck with that.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 9 +--------
1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index fcc0610..a7d8a11 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4514,7 +4514,6 @@ __be32
nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
struct nfsd4_locku *locku)
{
- struct nfs4_lockowner *lo;
struct nfs4_ol_stateid *stp;
struct file *filp = NULL;
struct file_lock *file_lock = NULL;
@@ -4547,10 +4546,9 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
status = nfserr_jukebox;
goto out;
}
- lo = lockowner(stp->st_stateowner);
locks_init_lock(file_lock);
file_lock->fl_type = F_UNLCK;
- file_lock->fl_owner = (fl_owner_t)lo;
+ file_lock->fl_owner = (fl_owner_t)lockowner(stp->st_stateowner);
file_lock->fl_pid = current->tgid;
file_lock->fl_file = filp;
file_lock->fl_flags = FL_POSIX;
@@ -4569,11 +4567,6 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
update_stateid(&stp->st_stid.sc_stateid);
memcpy(&locku->lu_stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));

- if (nfsd4_has_session(cstate) && !check_for_locks(stp->st_file, lo)) {
- WARN_ON_ONCE(cstate->replay_owner);
- release_lockowner(lo);
- }
-
out:
nfsd4_bump_seqid(cstate, status);
if (!cstate->replay_owner)
--
1.8.1.4


2013-06-26 19:21:31

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 2/7] nfsd4: minor read_buf cleanup

From: "J. Bruce Fields" <[email protected]>

The code to step to the next page seems reasonably self-contained.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index e34f5eb..c102d25 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -139,6 +139,19 @@ xdr_error: \
} \
} while (0)

+static void next_decode_page(struct nfsd4_compoundargs *argp)
+{
+ argp->pagelist++;
+ argp->p = page_address(argp->pagelist[0]);
+ if (argp->pagelen < PAGE_SIZE) {
+ argp->end = argp->p + (argp->pagelen>>2);
+ argp->pagelen = 0;
+ } else {
+ argp->end = argp->p + (PAGE_SIZE>>2);
+ argp->pagelen -= PAGE_SIZE;
+ }
+}
+
static __be32 *read_buf(struct nfsd4_compoundargs *argp, u32 nbytes)
{
/* We want more bytes than seem to be available.
@@ -166,16 +179,7 @@ static __be32 *read_buf(struct nfsd4_compoundargs *argp, u32 nbytes)
* guarantee p points to at least nbytes bytes.
*/
memcpy(p, argp->p, avail);
- /* step to next page */
- argp->pagelist++;
- argp->p = page_address(argp->pagelist[0]);
- if (argp->pagelen < PAGE_SIZE) {
- argp->end = argp->p + (argp->pagelen>>2);
- argp->pagelen = 0;
- } else {
- argp->end = argp->p + (PAGE_SIZE>>2);
- argp->pagelen -= PAGE_SIZE;
- }
+ next_decode_page(argp);
memcpy(((char*)p)+avail, argp->p, (nbytes - avail));
argp->p += XDR_QUADLEN(nbytes - avail);
return p;
--
1.8.1.4


2013-06-26 19:21:32

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 7/7] nfsd4: return delegation immediately if lease fails

From: "J. Bruce Fields" <[email protected]>

This case shouldn't happen--the administrator shouldn't really allow
other applications access to the export until clients have had the
chance to reclaim their state--but if it does then we should set the
"return this lease immediately" bit on the reply. That still leaves
some small races, but it's the best the protocol allows us to do in the
case a lease is ripped out from under us....

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index a7d8a11..d44a4bf 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3143,8 +3143,10 @@ out_free:
out_no_deleg:
open->op_delegate_type = NFS4_OPEN_DELEGATE_NONE;
if (open->op_claim_type == NFS4_OPEN_CLAIM_PREVIOUS &&
- open->op_delegate_type != NFS4_OPEN_DELEGATE_NONE)
+ open->op_delegate_type != NFS4_OPEN_DELEGATE_NONE) {
dprintk("NFSD: WARNING: refusing delegation reclaim\n");
+ open->op_recall = 1;
+ }

/* 4.1 client asking for a delegation? */
if (open->op_deleg_want)
--
1.8.1.4