2014-03-23 01:12:28

by J. Bruce Fields

[permalink] [raw]
Subject: nfsd4 xdr encoding fixes

This is a collection of fixes for the NFS server's encoding of NFSv4
compounds, along with a few tangentially related cleanups and bugfixes I
noticed along the way.

The basic problem is that we've always assumed an rpc reply is either

- "small" (much less than a page), or
- looks like a read (a bunch of data with a little bit at the
beginning and the end).

That assumption has allowed us to cover the most important cases without
having to deal with some annoying details like how to encode arbitrary
data across a page boundary, but:

- The inability to encode attributes of more than a page annoys
some people who would like to get and set extraordinarily long
ACLs.

- The inability to encode attributes that cross page boundaries
also means we can't return more than a page of readdir data at
a time, limiting readdir performance on large directories.

- The NFSv4 protocol doesn't really allow us to place these
sorts of arbitrary limits on the types of compounds we handle.
(Well, 4.0 is a bit fuzzy on this point, but 4.1 I think
definitely considers it a bug if a server won't handle, e.g.,
multiple read ops in a compound.) This hasn't been an issue
because most of these exotic compounds aren't really useful to
clients. But maybe future clients will find a use for some of
them--in which case we'd prefer not to make the work around a
server that doesn't meet the spec.

So, the main goal is to fix those limitations. We also get to share a
little more code with the client.

I was hoping to merge this for 3.15, but I'm cutting this a little close
now, so we'll see. I'll merge whatever's ready.

Further work may include:

- writing more pynfs tests for exotic compounds and odd corner
cases,
- auditing the annoying nfsd4_*_rsize() functions, which we now
depend on for more things,
- improving our (currently very sloppy) estimate of how much
space we need for krb5i/krb5p to checksum/encrypt the result.
- sharing some of this with the v2/v3 code (especially in the
read and readdir cases),
- allow rpc's whose call and reply are both very large (our one
remaining dubious limit on compounds, though again something
clients seem unlikely to notice for now),
- on the decode side, eliminating the existing macros and
sharing more helpers with the client.

--b.



2014-03-23 14:43:56

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 23/50] nfsd4: "backfill" using write_bytes_to_xdr_buf

On Sat, Mar 22, 2014 at 11:51:54PM -0700, Christoph Hellwig wrote:
> Looks good, but shouldn't there be an abstract function to skip writing
> into an XDR buf instead of the pointer increment as well?

So you're talking about the earlier spot in the code where we've got a

p++; /* to be filled in later? */

I'm not convinced it'd be worth it, but maybe.

> I wonder if the XDR buf internal should even be entirely hidden to the
> consummers and it should only be accessed through procedural interfaces?

By the end of the series, most xdr encoding looks like

p = xdr_reserve_space(xdr, <nbytes>);

*p++ = cpu_to_be32(<some value>);
p = xdr_encode_something_more_complicated(p, <some struct>);
...

It doesn't require the encoders to know much about xdr_buf's.

The main exception is read, especially the splice case. We could
probably do better there, I just haven't thought much about it yet.

Hm, but maybe what you're asking for here is just a
write_bytes_to_xdr_buf wrapper that takes the xdr_stream instead of an
xdr_buf, and a corresponding function to read xdr->buf->len when we want
an offset? OK, maybe so.

--b.

2014-03-23 01:12:31

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 33/50] nfsd4: more precise nfsd4_max_reply

From: "J. Bruce Fields" <[email protected]>

---
fs/nfsd/nfs4xdr.c | 27 ++++++++-------------------
1 file changed, 8 insertions(+), 19 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 3620ac5d..1ac730d 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1615,22 +1615,12 @@ nfsd4_opnum_in_range(struct nfsd4_compoundargs *argp, struct nfsd4_op *op)
* use pages beyond the first one, so the maximum possible length is the
* maximum over these values, not the sum.
*/
-static int nfsd4_max_reply(u32 opnum)
+static int nfsd4_max_reply(struct svc_rqst *rqstp, struct nfsd4_op *op)
{
- switch (opnum) {
- case OP_READLINK:
- case OP_READDIR:
- /*
- * Both of these ops take a single page for data and put
- * the head and tail in another page:
- */
- return 2 * PAGE_SIZE;
- case OP_GETATTR:
- case OP_READ:
- return INT_MAX;
- default:
- return PAGE_SIZE;
- }
+ struct nfsd4_operation *opdesc = OPDESC(op);
+ nfsd4op_rsize estimator = opdesc->op_rsize_bop;
+
+ return estimator ? estimator(rqstp, op) : PAGE_SIZE;
}

static __be32
@@ -1639,7 +1629,7 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp)
DECODE_HEAD;
struct nfsd4_op *op;
bool cachethis = false;
- int max_reply = PAGE_SIZE;
+ int max_reply = 2 * RPC_MAX_AUTH_SIZE; /* uh, kind of a guess */
int i;

READ_BUF(4);
@@ -1690,13 +1680,12 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp)
*/
cachethis |= nfsd4_cache_this_op(op);

- max_reply = max(max_reply, nfsd4_max_reply(op->opnum));
+ max_reply += nfsd4_max_reply(argp->rqstp, op);
}
/* Sessions make the DRC unnecessary: */
if (argp->minorversion)
cachethis = false;
- if (max_reply != INT_MAX)
- svc_reserve(argp->rqstp, max_reply);
+ svc_reserve(argp->rqstp, max_reply);
argp->rqstp->rq_cachetype = cachethis ? RC_REPLBUFF : RC_NOCACHE;

DECODE_TAIL;
--
1.8.5.3


2014-03-23 01:12:25

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 04/50] nfsd4: minor nfsd4_replay_cache_entry cleanup

From: "J. Bruce Fields" <[email protected]>

Maybe this is comment true, who cares? Handle this like any other
error.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 613f7cd..cb4cc4a 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1609,9 +1609,8 @@ nfsd4_replay_cache_entry(struct nfsd4_compoundres *resp,

dprintk("--> %s slot %p\n", __func__, slot);

- /* Either returns 0 or nfserr_retry_uncached */
status = nfsd4_enc_sequence_replay(resp->rqstp->rq_argp, resp);
- if (status == nfserr_retry_uncached_rep)
+ if (status)
return status;

/* The sequence operation has been encoded, cstate->datap set. */
--
1.8.5.3


2014-03-23 01:12:29

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 22/50] nfsd4: use xdr_truncate_encode

From: "J. Bruce Fields" <[email protected]>

Now that lengths are reliable, we can use xdr_truncate instead of
open-coding it everywhere.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 28 +++++++++++-----------------
1 file changed, 11 insertions(+), 17 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index a3dce3c..e5cc354 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -2043,7 +2043,7 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
struct svc_fh *tempfh = NULL;
struct kstatfs statfs;
__be32 *p;
- __be32 *start = xdr->p;
+ int starting_len = xdr->buf->len;
__be32 *attrlenp;
u32 dummy;
u64 dummy64;
@@ -2534,13 +2534,8 @@ out:
kfree(acl);
if (tempfh)
fh_put(tempfh);
- if (status) {
- int nbytes = (char *)xdr->p - (char *)start;
- /* open code what *should* be xdr_truncate(xdr, len); */
- xdr->iov->iov_len -= nbytes;
- xdr->buf->len -= nbytes;
- xdr->p = start;
- }
+ if (status)
+ xdr_truncate_encode(xdr, starting_len);
return status;
out_nfserr:
status = nfserrno(err);
@@ -3005,6 +3000,7 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
struct page *page;
unsigned long maxcount;
struct xdr_stream *xdr = &resp->xdr;
+ int starting_len = xdr->buf->len;
long len;
__be32 *p;

@@ -3041,8 +3037,8 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
&maxcount);

if (nfserr) {
- xdr->p -= 2;
- xdr->iov->iov_len -= 8;
+ xdr->buf->page_len = 0;
+ xdr_truncate_encode(xdr, starting_len);
return nfserr;
}
eof = (read->rd_offset + maxcount >=
@@ -3077,6 +3073,7 @@ nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd
int maxcount;
struct xdr_stream *xdr = &resp->xdr;
char *page;
+ int length_offset = xdr->buf->len;
__be32 *p;

if (nfserr)
@@ -3101,8 +3098,7 @@ nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd
if (nfserr == nfserr_isdir)
nfserr = nfserr_inval;
if (nfserr) {
- xdr->p--;
- xdr->iov->iov_len -= 4;
+ xdr_truncate_encode(xdr, length_offset);
return nfserr;
}

@@ -3133,7 +3129,8 @@ nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
int maxcount;
loff_t offset;
struct xdr_stream *xdr = &resp->xdr;
- __be32 *page, *savep, *tailbase;
+ int starting_len = xdr->buf->len;
+ __be32 *page, *tailbase;
__be32 *p;

if (nfserr)
@@ -3144,7 +3141,6 @@ nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
return nfserr_resource;

RESERVE_SPACE(NFS4_VERIFIER_SIZE);
- savep = p;

/* XXX: Following NFSv3, we ignore the READDIR verifier for now. */
WRITE32(0);
@@ -3206,9 +3202,7 @@ nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4

return 0;
err_no_verf:
- xdr->p = savep;
- xdr->iov->iov_len = ((char *)resp->xdr.p)
- - (char *)resp->xdr.buf->head[0].iov_base;
+ xdr_truncate_encode(xdr, starting_len);
return nfserr;
}

--
1.8.5.3


2014-03-23 01:12:33

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 50/50] nfsd4: kill write32, write64

From: "J. Bruce Fields" <[email protected]>

And switch a couple other functions from the encode(&p,...) convention
to the p = encode(p,...) convention mostly used elsewhere.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 51 +++++++++++++++++++++------------------------------
1 file changed, 21 insertions(+), 30 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 24e9fde..1595e84 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1700,39 +1700,30 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp)
DECODE_TAIL;
}

-static void write32(__be32 **p, u32 n)
-{
- *(*p)++ = htonl(n);
-}
-
-static void write64(__be32 **p, u64 n)
-{
- write32(p, (n >> 32));
- write32(p, (u32)n);
-}
-
-static void write_change(__be32 **p, struct kstat *stat, struct inode *inode)
+static __be32 *write_change(__be32 *p, struct kstat *stat, struct inode *inode)
{
if (IS_I_VERSION(inode)) {
- write64(p, inode->i_version);
+ p = xdr_encode_hyper(p, inode->i_version);
} else {
- write32(p, stat->ctime.tv_sec);
- write32(p, stat->ctime.tv_nsec);
+ *p++ = cpu_to_be32(stat->ctime.tv_sec);
+ *p++ = cpu_to_be32(stat->ctime.tv_nsec);
}
+ return p;
}

-static void write_cinfo(__be32 **p, struct nfsd4_change_info *c)
+static __be32 *write_cinfo(__be32 *p, struct nfsd4_change_info *c)
{
- write32(p, c->atomic);
+ *p++ = cpu_to_be32(c->atomic);
if (c->change_supported) {
- write64(p, c->before_change);
- write64(p, c->after_change);
+ p = xdr_encode_hyper(p, c->before_change);
+ p = xdr_encode_hyper(p, c->after_change);
} else {
- write32(p, c->before_ctime_sec);
- write32(p, c->before_ctime_nsec);
- write32(p, c->after_ctime_sec);
- write32(p, c->after_ctime_nsec);
+ *p++ = cpu_to_be32(c->before_ctime_sec);
+ *p++ = cpu_to_be32(c->before_ctime_nsec);
+ *p++ = cpu_to_be32(c->after_ctime_sec);
+ *p++ = cpu_to_be32(c->after_ctime_nsec);
}
+ return p;
}

/* Encode as an array of strings the string given with components
@@ -2190,7 +2181,7 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
p = xdr_reserve_space(xdr, 8);
if (!p)
goto out_resource;
- write_change(&p, &stat, dentry->d_inode);
+ p = write_change(p, &stat, dentry->d_inode);
}
if (bmval0 & FATTR4_WORD0_SIZE) {
p = xdr_reserve_space(xdr, 8);
@@ -2814,7 +2805,7 @@ nfsd4_encode_create(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_
p = xdr_reserve_space(xdr, 32);
if (!p)
return nfserr_resource;
- write_cinfo(&p, &create->cr_cinfo);
+ p = write_cinfo(p, &create->cr_cinfo);
*p++ = cpu_to_be32(2);
*p++ = cpu_to_be32(create->cr_bmval[0]);
*p++ = cpu_to_be32(create->cr_bmval[1]);
@@ -2937,7 +2928,7 @@ nfsd4_encode_link(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_li
p = xdr_reserve_space(xdr, 20);
if (!p)
return nfserr_resource;
- write_cinfo(&p, &link->li_cinfo);
+ p = write_cinfo(p, &link->li_cinfo);
}
return nfserr;
}
@@ -2958,7 +2949,7 @@ nfsd4_encode_open(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_op
p = xdr_reserve_space(xdr, 40);
if (!p)
return nfserr_resource;
- write_cinfo(&p, &open->op_cinfo);
+ p = write_cinfo(p, &open->op_cinfo);
*p++ = cpu_to_be32(open->op_rflags);
*p++ = cpu_to_be32(2);
*p++ = cpu_to_be32(open->op_bmval[0]);
@@ -3372,7 +3363,7 @@ nfsd4_encode_remove(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_
p = xdr_reserve_space(xdr, 20);
if (!p)
return nfserr_resource;
- write_cinfo(&p, &remove->rm_cinfo);
+ p = write_cinfo(p, &remove->rm_cinfo);
}
return nfserr;
}
@@ -3387,8 +3378,8 @@ nfsd4_encode_rename(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_
p = xdr_reserve_space(xdr, 40);
if (!p)
return nfserr_resource;
- write_cinfo(&p, &rename->rn_sinfo);
- write_cinfo(&p, &rename->rn_tinfo);
+ p = write_cinfo(p, &rename->rn_sinfo);
+ p = write_cinfo(p, &rename->rn_tinfo);
}
return nfserr;
}
--
1.8.5.3


2014-03-23 01:12:33

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 47/50] nfsd4: kill WRITE32

From: "J. Bruce Fields" <[email protected]>

These macros just obsucre what's going on. Adopt the convention of the
client-side code.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 303 +++++++++++++++++++++++++++---------------------------
1 file changed, 151 insertions(+), 152 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index d92dd1b..7a827c1 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1700,7 +1700,6 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp)
DECODE_TAIL;
}

-#define WRITE32(n) *p++ = htonl(n)
#define WRITE64(n) do { \
*p++ = htonl((u32)((n) >> 32)); \
*p++ = htonl((u32)(n)); \
@@ -1789,7 +1788,7 @@ static __be32 nfsd4_encode_components_esc(struct xdr_stream *xdr, char sep, char
p = xdr_reserve_space(xdr, strlen + 4);
if (!p)
return nfserr_resource;
- WRITE32(strlen);
+ *p++ = cpu_to_be32(strlen);
WRITEMEM(str, strlen);
count++;
}
@@ -1868,7 +1867,7 @@ static __be32 nfsd4_encode_path(struct xdr_stream *xdr, const struct path *root,
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_free;
- WRITE32(ncomponents);
+ *p++ = cpu_to_be32(ncomponents);

while (ncomponents) {
struct dentry *dentry = components[ncomponents - 1];
@@ -1881,7 +1880,7 @@ static __be32 nfsd4_encode_path(struct xdr_stream *xdr, const struct path *root,
spin_unlock(&dentry->d_lock);
goto out_free;
}
- WRITE32(len);
+ *p++ = cpu_to_be32(len);
WRITEMEM(dentry->d_name.name, len);
dprintk("/%s", dentry->d_name.name);
spin_unlock(&dentry->d_lock);
@@ -1928,7 +1927,7 @@ static __be32 nfsd4_encode_fs_locations(struct xdr_stream *xdr, struct svc_rqst
p = xdr_reserve_space(xdr, 4);
if (!p)
return nfserr_resource;
- WRITE32(fslocs->locations_count);
+ *p++ = cpu_to_be32(fslocs->locations_count);
for (i=0; i<fslocs->locations_count; i++) {
status = nfsd4_encode_fs_location4(xdr, &fslocs->locations[i]);
if (status)
@@ -1980,8 +1979,8 @@ nfsd4_encode_security_label(struct xdr_stream *xdr, struct svc_rqst *rqstp, void
* For now we use a 0 here to indicate the null translation; in
* the future we may place a call to translation code here.
*/
- WRITE32(0); /* lfs */
- WRITE32(0); /* pi */
+ *p++ = cpu_to_be32(0); /* lfs */
+ *p++ = cpu_to_be32(0); /* pi */
p = xdr_encode_opaque(p, context, len);
return 0;
}
@@ -2128,23 +2127,23 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
p = xdr_reserve_space(xdr, 16);
if (!p)
goto out_resource;
- WRITE32(3);
- WRITE32(bmval0);
- WRITE32(bmval1);
- WRITE32(bmval2);
+ *p++ = cpu_to_be32(3);
+ *p++ = cpu_to_be32(bmval0);
+ *p++ = cpu_to_be32(bmval1);
+ *p++ = cpu_to_be32(bmval2);
} else if (bmval1) {
p = xdr_reserve_space(xdr, 12);
if (!p)
goto out_resource;
- WRITE32(2);
- WRITE32(bmval0);
- WRITE32(bmval1);
+ *p++ = cpu_to_be32(2);
+ *p++ = cpu_to_be32(bmval0);
+ *p++ = cpu_to_be32(bmval1);
} else {
p = xdr_reserve_space(xdr, 8);
if (!p)
goto out_resource;
- WRITE32(1);
- WRITE32(bmval0);
+ *p++ = cpu_to_be32(1);
+ *p++ = cpu_to_be32(bmval0);
}

attrlen_offset = xdr->buf->len;
@@ -2166,17 +2165,17 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
p = xdr_reserve_space(xdr, 12);
if (!p)
goto out_resource;
- WRITE32(2);
- WRITE32(word0);
- WRITE32(word1);
+ *p++ = cpu_to_be32(2);
+ *p++ = cpu_to_be32(word0);
+ *p++ = cpu_to_be32(word1);
} else {
p = xdr_reserve_space(xdr, 16);
if (!p)
goto out_resource;
- WRITE32(3);
- WRITE32(word0);
- WRITE32(word1);
- WRITE32(word2);
+ *p++ = cpu_to_be32(3);
+ *p++ = cpu_to_be32(word0);
+ *p++ = cpu_to_be32(word1);
+ *p++ = cpu_to_be32(word2);
}
}
if (bmval0 & FATTR4_WORD0_TYPE) {
@@ -2188,16 +2187,16 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
status = nfserr_serverfault;
goto out;
}
- WRITE32(dummy);
+ *p++ = cpu_to_be32(dummy);
}
if (bmval0 & FATTR4_WORD0_FH_EXPIRE_TYPE) {
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
if (exp->ex_flags & NFSEXP_NOSUBTREECHECK)
- WRITE32(NFS4_FH_PERSISTENT);
+ *p++ = cpu_to_be32(NFS4_FH_PERSISTENT);
else
- WRITE32(NFS4_FH_PERSISTENT|NFS4_FH_VOL_RENAME);
+ *p++ = cpu_to_be32(NFS4_FH_PERSISTENT|NFS4_FH_VOL_RENAME);
}
if (bmval0 & FATTR4_WORD0_CHANGE) {
p = xdr_reserve_space(xdr, 8);
@@ -2215,19 +2214,19 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- WRITE32(1);
+ *p++ = cpu_to_be32(1);
}
if (bmval0 & FATTR4_WORD0_SYMLINK_SUPPORT) {
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- WRITE32(1);
+ *p++ = cpu_to_be32(1);
}
if (bmval0 & FATTR4_WORD0_NAMED_ATTR) {
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- WRITE32(0);
+ *p++ = cpu_to_be32(0);
}
if (bmval0 & FATTR4_WORD0_FSID) {
p = xdr_reserve_space(xdr, 16);
@@ -2242,10 +2241,10 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
WRITE64((u64)0);
break;
case FSIDSOURCE_DEV:
- WRITE32(0);
- WRITE32(MAJOR(stat.dev));
- WRITE32(0);
- WRITE32(MINOR(stat.dev));
+ *p++ = cpu_to_be32(0);
+ *p++ = cpu_to_be32(MAJOR(stat.dev));
+ *p++ = cpu_to_be32(0);
+ *p++ = cpu_to_be32(MINOR(stat.dev));
break;
case FSIDSOURCE_UUID:
WRITEMEM(exp->ex_uuid, 16);
@@ -2256,19 +2255,19 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- WRITE32(0);
+ *p++ = cpu_to_be32(0);
}
if (bmval0 & FATTR4_WORD0_LEASE_TIME) {
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- WRITE32(nn->nfsd4_lease);
+ *p++ = cpu_to_be32(nn->nfsd4_lease);
}
if (bmval0 & FATTR4_WORD0_RDATTR_ERROR) {
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- WRITE32(rdattr_err);
+ *p++ = cpu_to_be32(rdattr_err);
}
if (bmval0 & FATTR4_WORD0_ACL) {
struct nfs4_ace *ace;
@@ -2278,21 +2277,21 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
if (!p)
goto out_resource;

- WRITE32(0);
+ *p++ = cpu_to_be32(0);
goto out_acl;
}
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- WRITE32(acl->naces);
+ *p++ = cpu_to_be32(acl->naces);

for (ace = acl->aces; ace < acl->aces + acl->naces; ace++) {
p = xdr_reserve_space(xdr, 4*3);
if (!p)
goto out_resource;
- WRITE32(ace->type);
- WRITE32(ace->flag);
- WRITE32(ace->access_mask & NFS4_ACE_MASK_ALL);
+ *p++ = cpu_to_be32(ace->type);
+ *p++ = cpu_to_be32(ace->flag);
+ *p++ = cpu_to_be32(ace->access_mask & NFS4_ACE_MASK_ALL);
status = nfsd4_encode_aclname(xdr, rqstp, ace);
if (status)
goto out;
@@ -2303,38 +2302,38 @@ out_acl:
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- WRITE32(aclsupport ?
+ *p++ = cpu_to_be32(aclsupport ?
ACL4_SUPPORT_ALLOW_ACL|ACL4_SUPPORT_DENY_ACL : 0);
}
if (bmval0 & FATTR4_WORD0_CANSETTIME) {
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- WRITE32(1);
+ *p++ = cpu_to_be32(1);
}
if (bmval0 & FATTR4_WORD0_CASE_INSENSITIVE) {
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- WRITE32(0);
+ *p++ = cpu_to_be32(0);
}
if (bmval0 & FATTR4_WORD0_CASE_PRESERVING) {
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- WRITE32(1);
+ *p++ = cpu_to_be32(1);
}
if (bmval0 & FATTR4_WORD0_CHOWN_RESTRICTED) {
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- WRITE32(1);
+ *p++ = cpu_to_be32(1);
}
if (bmval0 & FATTR4_WORD0_FILEHANDLE) {
p = xdr_reserve_space(xdr, fhp->fh_handle.fh_size + 4);
if (!p)
goto out_resource;
- WRITE32(fhp->fh_handle.fh_size);
+ *p++ = cpu_to_be32(fhp->fh_handle.fh_size);
WRITEMEM(&fhp->fh_handle.fh_base, fhp->fh_handle.fh_size);
}
if (bmval0 & FATTR4_WORD0_FILEID) {
@@ -2370,7 +2369,7 @@ out_acl:
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- WRITE32(1);
+ *p++ = cpu_to_be32(1);
}
if (bmval0 & FATTR4_WORD0_MAXFILESIZE) {
p = xdr_reserve_space(xdr, 8);
@@ -2382,13 +2381,13 @@ out_acl:
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- WRITE32(255);
+ *p++ = cpu_to_be32(255);
}
if (bmval0 & FATTR4_WORD0_MAXNAME) {
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- WRITE32(statfs.f_namelen);
+ *p++ = cpu_to_be32(statfs.f_namelen);
}
if (bmval0 & FATTR4_WORD0_MAXREAD) {
p = xdr_reserve_space(xdr, 8);
@@ -2406,19 +2405,19 @@ out_acl:
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- WRITE32(stat.mode & S_IALLUGO);
+ *p++ = cpu_to_be32(stat.mode & S_IALLUGO);
}
if (bmval1 & FATTR4_WORD1_NO_TRUNC) {
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- WRITE32(1);
+ *p++ = cpu_to_be32(1);
}
if (bmval1 & FATTR4_WORD1_NUMLINKS) {
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- WRITE32(stat.nlink);
+ *p++ = cpu_to_be32(stat.nlink);
}
if (bmval1 & FATTR4_WORD1_OWNER) {
status = nfsd4_encode_user(xdr, rqstp, stat.uid);
@@ -2434,8 +2433,8 @@ out_acl:
p = xdr_reserve_space(xdr, 8);
if (!p)
goto out_resource;
- WRITE32((u32) MAJOR(stat.rdev));
- WRITE32((u32) MINOR(stat.rdev));
+ *p++ = cpu_to_be32((u32) MAJOR(stat.rdev));
+ *p++ = cpu_to_be32((u32) MINOR(stat.rdev));
}
if (bmval1 & FATTR4_WORD1_SPACE_AVAIL) {
p = xdr_reserve_space(xdr, 8);
@@ -2470,29 +2469,29 @@ out_acl:
if (!p)
goto out_resource;
WRITE64((s64)stat.atime.tv_sec);
- WRITE32(stat.atime.tv_nsec);
+ *p++ = cpu_to_be32(stat.atime.tv_nsec);
}
if (bmval1 & FATTR4_WORD1_TIME_DELTA) {
p = xdr_reserve_space(xdr, 12);
if (!p)
goto out_resource;
- WRITE32(0);
- WRITE32(1);
- WRITE32(0);
+ *p++ = cpu_to_be32(0);
+ *p++ = cpu_to_be32(1);
+ *p++ = cpu_to_be32(0);
}
if (bmval1 & FATTR4_WORD1_TIME_METADATA) {
p = xdr_reserve_space(xdr, 12);
if (!p)
goto out_resource;
WRITE64((s64)stat.ctime.tv_sec);
- WRITE32(stat.ctime.tv_nsec);
+ *p++ = cpu_to_be32(stat.ctime.tv_nsec);
}
if (bmval1 & FATTR4_WORD1_TIME_MODIFY) {
p = xdr_reserve_space(xdr, 12);
if (!p)
goto out_resource;
WRITE64((s64)stat.mtime.tv_sec);
- WRITE32(stat.mtime.tv_nsec);
+ *p++ = cpu_to_be32(stat.mtime.tv_nsec);
}
if (bmval1 & FATTR4_WORD1_MOUNTED_ON_FILEID) {
p = xdr_reserve_space(xdr, 8);
@@ -2516,10 +2515,10 @@ out_acl:
p = xdr_reserve_space(xdr, 16);
if (!p)
goto out_resource;
- WRITE32(3);
- WRITE32(NFSD_SUPPATTR_EXCLCREAT_WORD0);
- WRITE32(NFSD_SUPPATTR_EXCLCREAT_WORD1);
- WRITE32(NFSD_SUPPATTR_EXCLCREAT_WORD2);
+ *p++ = cpu_to_be32(3);
+ *p++ = cpu_to_be32(NFSD_SUPPATTR_EXCLCREAT_WORD0);
+ *p++ = cpu_to_be32(NFSD_SUPPATTR_EXCLCREAT_WORD1);
+ *p++ = cpu_to_be32(NFSD_SUPPATTR_EXCLCREAT_WORD2);
}

attrlen = htonl(xdr->buf->len - attrlen_offset - 4);
@@ -2749,7 +2748,7 @@ nfsd4_encode_stateid(struct xdr_stream *xdr, stateid_t *sid)
p = xdr_reserve_space(xdr, sizeof(stateid_t));
if (!p)
return nfserr_resource;
- WRITE32(sid->si_generation);
+ *p++ = cpu_to_be32(sid->si_generation);
WRITEMEM(&sid->si_opaque, sizeof(stateid_opaque_t));
return 0;
}
@@ -2764,8 +2763,8 @@ nfsd4_encode_access(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_
p = xdr_reserve_space(xdr, 8);
if (!p)
return nfserr_resource;
- WRITE32(access->ac_supported);
- WRITE32(access->ac_resp_access);
+ *p++ = cpu_to_be32(access->ac_supported);
+ *p++ = cpu_to_be32(access->ac_resp_access);
}
return nfserr;
}
@@ -2780,9 +2779,9 @@ static __be32 nfsd4_encode_bind_conn_to_session(struct nfsd4_compoundres *resp,
if (!p)
return nfserr_resource;
WRITEMEM(bcts->sessionid.data, NFS4_MAX_SESSIONID_LEN);
- WRITE32(bcts->dir);
+ *p++ = cpu_to_be32(bcts->dir);
/* Sorry, we do not yet support RDMA over 4.1: */
- WRITE32(0);
+ *p++ = cpu_to_be32(0);
}
return nfserr;
}
@@ -2825,9 +2824,9 @@ nfsd4_encode_create(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_
if (!p)
return nfserr_resource;
write_cinfo(&p, &create->cr_cinfo);
- WRITE32(2);
- WRITE32(create->cr_bmval[0]);
- WRITE32(create->cr_bmval[1]);
+ *p++ = cpu_to_be32(2);
+ *p++ = cpu_to_be32(create->cr_bmval[0]);
+ *p++ = cpu_to_be32(create->cr_bmval[1]);
}
return nfserr;
}
@@ -2860,7 +2859,7 @@ nfsd4_encode_getfh(struct nfsd4_compoundres *resp, __be32 nfserr, struct svc_fh
p = xdr_reserve_space(xdr, len + 4);
if (!p)
return nfserr_resource;
- WRITE32(len);
+ *p++ = cpu_to_be32(len);
WRITEMEM(&fhp->fh_handle.fh_base, len);
}
return nfserr;
@@ -2892,14 +2891,14 @@ again:
}
WRITE64(ld->ld_start);
WRITE64(ld->ld_length);
- WRITE32(ld->ld_type);
+ *p++ = cpu_to_be32(ld->ld_type);
if (conf->len) {
WRITEMEM(&ld->ld_clientid, 8);
- WRITE32(conf->len);
+ *p++ = cpu_to_be32(conf->len);
WRITEMEM(conf->data, conf->len);
} else { /* non - nfsv4 lock in conflict, no clientid nor owner */
WRITE64((u64)0); /* clientid */
- WRITE32(0); /* length of owner name */
+ *p++ = cpu_to_be32(0); /* length of owner name */
}
return nfserr_denied;
}
@@ -2971,11 +2970,11 @@ nfsd4_encode_open(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_op
if (!p)
return nfserr_resource;
write_cinfo(&p, &open->op_cinfo);
- WRITE32(open->op_rflags);
- WRITE32(2);
- WRITE32(open->op_bmval[0]);
- WRITE32(open->op_bmval[1]);
- WRITE32(open->op_delegate_type);
+ *p++ = cpu_to_be32(open->op_rflags);
+ *p++ = cpu_to_be32(2);
+ *p++ = cpu_to_be32(open->op_bmval[0]);
+ *p++ = cpu_to_be32(open->op_bmval[1]);
+ *p++ = cpu_to_be32(open->op_delegate_type);

switch (open->op_delegate_type) {
case NFS4_OPEN_DELEGATE_NONE:
@@ -2987,15 +2986,15 @@ nfsd4_encode_open(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_op
p = xdr_reserve_space(xdr, 20);
if (!p)
return nfserr_resource;
- WRITE32(open->op_recall);
+ *p++ = cpu_to_be32(open->op_recall);

/*
* TODO: ACE's in delegations
*/
- WRITE32(NFS4_ACE_ACCESS_ALLOWED_ACE_TYPE);
- WRITE32(0);
- WRITE32(0);
- WRITE32(0); /* XXX: is NULL principal ok? */
+ *p++ = cpu_to_be32(NFS4_ACE_ACCESS_ALLOWED_ACE_TYPE);
+ *p++ = cpu_to_be32(0);
+ *p++ = cpu_to_be32(0);
+ *p++ = cpu_to_be32(0); /* XXX: is NULL principal ok? */
break;
case NFS4_OPEN_DELEGATE_WRITE:
nfserr = nfsd4_encode_stateid(xdr, &open->op_delegate_stateid);
@@ -3004,22 +3003,22 @@ nfsd4_encode_open(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_op
p = xdr_reserve_space(xdr, 32);
if (!p)
return nfserr_resource;
- WRITE32(0);
+ *p++ = cpu_to_be32(0);

/*
* TODO: space_limit's in delegations
*/
- WRITE32(NFS4_LIMIT_SIZE);
- WRITE32(~(u32)0);
- WRITE32(~(u32)0);
+ *p++ = cpu_to_be32(NFS4_LIMIT_SIZE);
+ *p++ = cpu_to_be32(~(u32)0);
+ *p++ = cpu_to_be32(~(u32)0);

/*
* TODO: ACE's in delegations
*/
- WRITE32(NFS4_ACE_ACCESS_ALLOWED_ACE_TYPE);
- WRITE32(0);
- WRITE32(0);
- WRITE32(0); /* XXX: is NULL principal ok? */
+ *p++ = cpu_to_be32(NFS4_ACE_ACCESS_ALLOWED_ACE_TYPE);
+ *p++ = cpu_to_be32(0);
+ *p++ = cpu_to_be32(0);
+ *p++ = cpu_to_be32(0); /* XXX: is NULL principal ok? */
break;
case NFS4_OPEN_DELEGATE_NONE_EXT: /* 4.1 */
switch (open->op_why_no_deleg) {
@@ -3028,14 +3027,14 @@ nfsd4_encode_open(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_op
p = xdr_reserve_space(xdr, 8);
if (!p)
return nfserr_resource;
- WRITE32(open->op_why_no_deleg);
- WRITE32(0); /* deleg signaling not supported yet */
+ *p++ = cpu_to_be32(open->op_why_no_deleg);
+ *p++ = cpu_to_be32(0); /* deleg signaling not supported yet */
break;
default:
p = xdr_reserve_space(xdr, 4);
if (!p)
return nfserr_resource;
- WRITE32(open->op_why_no_deleg);
+ *p++ = cpu_to_be32(open->op_why_no_deleg);
}
break;
default:
@@ -3304,8 +3303,8 @@ nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
return nfserr_resource;

/* XXX: Following NFSv3, we ignore the READDIR verifier for now. */
- WRITE32(0);
- WRITE32(0);
+ *p++ = cpu_to_be32(0);
+ *p++ = cpu_to_be32(0);
resp->xdr.buf->head[0].iov_len = ((char*)resp->xdr.p)
- (char*)resp->xdr.buf->head[0].iov_base;

@@ -3453,17 +3452,17 @@ nfsd4_do_encode_secinfo(struct xdr_stream *xdr,
p = xdr_reserve_space(xdr, 4 + 4 + XDR_LEN(info.oid.len) + 4 + 4);
if (!p)
goto out;
- WRITE32(RPC_AUTH_GSS);
- WRITE32(info.oid.len);
+ *p++ = cpu_to_be32(RPC_AUTH_GSS);
+ *p++ = cpu_to_be32(info.oid.len);
WRITEMEM(info.oid.data, info.oid.len);
- WRITE32(info.qop);
- WRITE32(info.service);
+ *p++ = cpu_to_be32(info.qop);
+ *p++ = cpu_to_be32(info.service);
} else if (pf < RPC_AUTH_MAXFLAVOR) {
supported++;
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out;
- WRITE32(pf);
+ *p++ = cpu_to_be32(pf);
} else {
if (report)
pr_warn("NFS: SECINFO: security flavor %u "
@@ -3513,16 +3512,16 @@ nfsd4_encode_setattr(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
if (!p)
return nfserr_resource;
if (nfserr) {
- WRITE32(3);
- WRITE32(0);
- WRITE32(0);
- WRITE32(0);
+ *p++ = cpu_to_be32(3);
+ *p++ = cpu_to_be32(0);
+ *p++ = cpu_to_be32(0);
+ *p++ = cpu_to_be32(0);
}
else {
- WRITE32(3);
- WRITE32(setattr->sa_bmval[0]);
- WRITE32(setattr->sa_bmval[1]);
- WRITE32(setattr->sa_bmval[2]);
+ *p++ = cpu_to_be32(3);
+ *p++ = cpu_to_be32(setattr->sa_bmval[0]);
+ *p++ = cpu_to_be32(setattr->sa_bmval[1]);
+ *p++ = cpu_to_be32(setattr->sa_bmval[2]);
}
return nfserr;
}
@@ -3544,8 +3543,8 @@ nfsd4_encode_setclientid(struct nfsd4_compoundres *resp, __be32 nfserr, struct n
p = xdr_reserve_space(xdr, 8);
if (!p)
return nfserr_resource;
- WRITE32(0);
- WRITE32(0);
+ *p++ = cpu_to_be32(0);
+ *p++ = cpu_to_be32(0);
}
return nfserr;
}
@@ -3560,8 +3559,8 @@ nfsd4_encode_write(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_w
p = xdr_reserve_space(xdr, 16);
if (!p)
return nfserr_resource;
- WRITE32(write->wr_bytes_written);
- WRITE32(write->wr_how_written);
+ *p++ = cpu_to_be32(write->wr_bytes_written);
+ *p++ = cpu_to_be32(write->wr_how_written);
WRITEMEM(write->wr_verifier.data, NFS4_VERIFIER_SIZE);
}
return nfserr;
@@ -3604,10 +3603,10 @@ nfsd4_encode_exchange_id(struct nfsd4_compoundres *resp, __be32 nfserr,
return nfserr_resource;

WRITEMEM(&exid->clientid, 8);
- WRITE32(exid->seqid);
- WRITE32(exid->flags);
+ *p++ = cpu_to_be32(exid->seqid);
+ *p++ = cpu_to_be32(exid->flags);

- WRITE32(exid->spa_how);
+ *p++ = cpu_to_be32(exid->spa_how);

switch (exid->spa_how) {
case SP4_NONE:
@@ -3619,11 +3618,11 @@ nfsd4_encode_exchange_id(struct nfsd4_compoundres *resp, __be32 nfserr,
return nfserr_resource;

/* spo_must_enforce bitmap: */
- WRITE32(2);
- WRITE32(nfs4_minimal_spo_must_enforce[0]);
- WRITE32(nfs4_minimal_spo_must_enforce[1]);
+ *p++ = cpu_to_be32(2);
+ *p++ = cpu_to_be32(nfs4_minimal_spo_must_enforce[0]);
+ *p++ = cpu_to_be32(nfs4_minimal_spo_must_enforce[1]);
/* empty spo_must_allow bitmap: */
- WRITE32(0);
+ *p++ = cpu_to_be32(0);

break;
default:
@@ -3643,15 +3642,15 @@ nfsd4_encode_exchange_id(struct nfsd4_compoundres *resp, __be32 nfserr,
/* The server_owner struct */
WRITE64(minor_id); /* Minor id */
/* major id */
- WRITE32(major_id_sz);
+ *p++ = cpu_to_be32(major_id_sz);
WRITEMEM(major_id, major_id_sz);

/* Server scope */
- WRITE32(server_scope_sz);
+ *p++ = cpu_to_be32(server_scope_sz);
WRITEMEM(server_scope, server_scope_sz);

/* Implementation id */
- WRITE32(0); /* zero length nfs_impl_id4 array */
+ *p++ = cpu_to_be32(0); /* zero length nfs_impl_id4 array */
return 0;
}

@@ -3669,43 +3668,43 @@ nfsd4_encode_create_session(struct nfsd4_compoundres *resp, __be32 nfserr,
if (!p)
return nfserr_resource;
WRITEMEM(sess->sessionid.data, NFS4_MAX_SESSIONID_LEN);
- WRITE32(sess->seqid);
- WRITE32(sess->flags);
+ *p++ = cpu_to_be32(sess->seqid);
+ *p++ = cpu_to_be32(sess->flags);

p = xdr_reserve_space(xdr, 28);
if (!p)
return nfserr_resource;
- WRITE32(0); /* headerpadsz */
- WRITE32(sess->fore_channel.maxreq_sz);
- WRITE32(sess->fore_channel.maxresp_sz);
- WRITE32(sess->fore_channel.maxresp_cached);
- WRITE32(sess->fore_channel.maxops);
- WRITE32(sess->fore_channel.maxreqs);
- WRITE32(sess->fore_channel.nr_rdma_attrs);
+ *p++ = cpu_to_be32(0); /* headerpadsz */
+ *p++ = cpu_to_be32(sess->fore_channel.maxreq_sz);
+ *p++ = cpu_to_be32(sess->fore_channel.maxresp_sz);
+ *p++ = cpu_to_be32(sess->fore_channel.maxresp_cached);
+ *p++ = cpu_to_be32(sess->fore_channel.maxops);
+ *p++ = cpu_to_be32(sess->fore_channel.maxreqs);
+ *p++ = cpu_to_be32(sess->fore_channel.nr_rdma_attrs);

if (sess->fore_channel.nr_rdma_attrs) {
p = xdr_reserve_space(xdr, 4);
if (!p)
return nfserr_resource;
- WRITE32(sess->fore_channel.rdma_attrs);
+ *p++ = cpu_to_be32(sess->fore_channel.rdma_attrs);
}

p = xdr_reserve_space(xdr, 28);
if (!p)
return nfserr_resource;
- WRITE32(0); /* headerpadsz */
- WRITE32(sess->back_channel.maxreq_sz);
- WRITE32(sess->back_channel.maxresp_sz);
- WRITE32(sess->back_channel.maxresp_cached);
- WRITE32(sess->back_channel.maxops);
- WRITE32(sess->back_channel.maxreqs);
- WRITE32(sess->back_channel.nr_rdma_attrs);
+ *p++ = cpu_to_be32(0); /* headerpadsz */
+ *p++ = cpu_to_be32(sess->back_channel.maxreq_sz);
+ *p++ = cpu_to_be32(sess->back_channel.maxresp_sz);
+ *p++ = cpu_to_be32(sess->back_channel.maxresp_cached);
+ *p++ = cpu_to_be32(sess->back_channel.maxops);
+ *p++ = cpu_to_be32(sess->back_channel.maxreqs);
+ *p++ = cpu_to_be32(sess->back_channel.nr_rdma_attrs);

if (sess->back_channel.nr_rdma_attrs) {
p = xdr_reserve_space(xdr, 4);
if (!p)
return nfserr_resource;
- WRITE32(sess->back_channel.rdma_attrs);
+ *p++ = cpu_to_be32(sess->back_channel.rdma_attrs);
}
return 0;
}
@@ -3724,12 +3723,12 @@ nfsd4_encode_sequence(struct nfsd4_compoundres *resp, __be32 nfserr,
if (!p)
return nfserr_resource;
WRITEMEM(seq->sessionid.data, NFS4_MAX_SESSIONID_LEN);
- WRITE32(seq->seqid);
- WRITE32(seq->slotid);
+ *p++ = cpu_to_be32(seq->seqid);
+ *p++ = cpu_to_be32(seq->slotid);
/* Note slotid's are numbered from zero: */
- WRITE32(seq->maxslots - 1); /* sr_highest_slotid */
- WRITE32(seq->maxslots - 1); /* sr_target_highest_slotid */
- WRITE32(seq->status_flags);
+ *p++ = cpu_to_be32(seq->maxslots - 1); /* sr_highest_slotid */
+ *p++ = cpu_to_be32(seq->maxslots - 1); /* sr_target_highest_slotid */
+ *p++ = cpu_to_be32(seq->status_flags);

resp->cstate.data_offset = xdr->buf->len; /* DRC cache data pointer */
return 0;
@@ -3876,7 +3875,7 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
WARN_ON_ONCE(1);
return;
}
- WRITE32(op->opnum);
+ *p++ = cpu_to_be32(op->opnum);
post_err_offset = xdr->buf->len;

if (op->opnum == OP_ILLEGAL)
@@ -3954,7 +3953,7 @@ nfsd4_encode_replay(struct xdr_stream *xdr, struct nfsd4_op *op)
WARN_ON_ONCE(1);
return;
}
- WRITE32(op->opnum);
+ *p++ = cpu_to_be32(op->opnum);
*p++ = rp->rp_status; /* already xdr'ed */

WRITEMEM(rp->rp_buf, rp->rp_buflen);
--
1.8.5.3


2014-03-23 01:12:28

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 18/50] nfsd4: use xdr_stream throughout compound encoding

From: "J. Bruce Fields" <[email protected]>

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4proc.c | 2 +-
fs/nfsd/nfs4xdr.c | 23 +++++++++++++++--------
2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 9e7cb05..e5cc711 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1242,7 +1242,7 @@ nfsd4_proc_compound(struct svc_rqst *rqstp,
svcxdr_init_encode(rqstp, resp);
resp->tagp = resp->xdr.p;
/* reserve space for: taglen, tag, and opcnt */
- resp->xdr.p += 2 + XDR_QUADLEN(args->taglen);
+ xdr_reserve_space(&resp->xdr, 8 + args->taglen);
resp->taglen = args->taglen;
resp->tag = args->tag;
resp->opcnt = 0;
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 6025713..d665c60 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1748,10 +1748,10 @@ static void write_cinfo(__be32 **p, struct nfsd4_change_info *c)
}

#define RESERVE_SPACE(nbytes) do { \
- p = resp->xdr.p; \
- BUG_ON(p + XDR_QUADLEN(nbytes) > resp->xdr.end); \
+ p = xdr_reserve_space(&resp->xdr, nbytes); \
+ BUG_ON(!p); \
} while (0)
-#define ADJUST_ARGS() resp->xdr.p = p
+#define ADJUST_ARGS() WARN_ON_ONCE(p != resp->xdr.p) \

/* Encode as an array of strings the string given with components
* separated @sep, escaped with esc_enter and esc_exit.
@@ -3040,8 +3040,11 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
read->rd_offset, resp->rqstp->rq_vec, read->rd_vlen,
&maxcount);

- if (nfserr)
+ if (nfserr) {
+ xdr->p -= 2;
+ xdr->iov->iov_len -= 8;
return nfserr;
+ }
eof = (read->rd_offset + maxcount >=
read->rd_fhp->fh_dentry->d_inode->i_size);

@@ -3094,9 +3097,12 @@ nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd
*/
nfserr = nfsd_readlink(readlink->rl_rqstp, readlink->rl_fhp, page, &maxcount);
if (nfserr == nfserr_isdir)
- return nfserr_inval;
- if (nfserr)
+ nfserr = nfserr_inval;
+ if (nfserr) {
+ xdr->p--;
+ xdr->iov->iov_len -= 4;
return nfserr;
+ }

WRITE32(maxcount);
ADJUST_ARGS();
@@ -3196,8 +3202,9 @@ nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4

return 0;
err_no_verf:
- p = savep;
- ADJUST_ARGS();
+ xdr->p = savep;
+ xdr->iov->iov_len = ((char *)resp->xdr.p)
+ - (char *)resp->xdr.buf->head[0].iov_base;
return nfserr;
}

--
1.8.5.3


2014-03-23 01:12:24

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 01/50] rpc: Allow xdr_buf_subsegment to operate in-place

From: "J. Bruce Fields" <[email protected]>

Allow

xdr_buf_subsegment(&buf, &buf, base, len)

to modify an xdr_buf in-place.

Also, none of the callers need the iov_base of head or tail to be zeroed
out.

Also add documentation.

(As it turns out, I"m not really using this, but it seems a simple way
to make this function a bit more robust.)

Signed-off-by: J. Bruce Fields <[email protected]>
---
net/sunrpc/xdr.c | 22 ++++++++++++++++------
1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index 1504bb1..dd97ba3 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -833,8 +833,20 @@ xdr_buf_from_iov(struct kvec *iov, struct xdr_buf *buf)
}
EXPORT_SYMBOL_GPL(xdr_buf_from_iov);

-/* Sets subbuf to the portion of buf of length len beginning base bytes
- * from the start of buf. Returns -1 if base of length are out of bounds. */
+/**
+ * xdr_buf_subsegment - set subbuf to a portion of buf
+ * @buf: an xdr buffer
+ * @subbuf: the result buffer
+ * @base: beginning of range in bytes
+ * @len: length of range in bytes
+ *
+ * sets @subbuf to an xdr buffer representing the portion of @buf of
+ * length @len starting at offset @base.
+ *
+ * @buf and @subbuf may be pointers to the same struct xdr_buf.
+ *
+ * Returns -1 if base of length are out of bounds.
+ */
int
xdr_buf_subsegment(struct xdr_buf *buf, struct xdr_buf *subbuf,
unsigned int base, unsigned int len)
@@ -847,9 +859,8 @@ xdr_buf_subsegment(struct xdr_buf *buf, struct xdr_buf *subbuf,
len -= subbuf->head[0].iov_len;
base = 0;
} else {
- subbuf->head[0].iov_base = NULL;
- subbuf->head[0].iov_len = 0;
base -= buf->head[0].iov_len;
+ subbuf->head[0].iov_len = 0;
}

if (base < buf->page_len) {
@@ -871,9 +882,8 @@ xdr_buf_subsegment(struct xdr_buf *buf, struct xdr_buf *subbuf,
len -= subbuf->tail[0].iov_len;
base = 0;
} else {
- subbuf->tail[0].iov_base = NULL;
- subbuf->tail[0].iov_len = 0;
base -= buf->tail[0].iov_len;
+ subbuf->tail[0].iov_len = 0;
}

if (base || len)
--
1.8.5.3


2014-03-23 01:12:27

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 13/50] nfsd4: move nfsd4_operation to xdr4.h

From: "J. Bruce Fields" <[email protected]>

We want to share some of these definitions.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4proc.c | 58 ++++--------------------------------------------------
fs/nfsd/xdr4.h | 53 +++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 57 insertions(+), 54 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 849b0eb..afa7ff7 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1134,58 +1134,8 @@ static inline void nfsd4_increment_op_stats(u32 opnum)
nfsdstats.nfs4_opcount[opnum]++;
}

-typedef __be32(*nfsd4op_func)(struct svc_rqst *, struct nfsd4_compound_state *,
- void *);
-typedef u32(*nfsd4op_rsize)(struct svc_rqst *, struct nfsd4_op *op);
-typedef void(*stateid_setter)(struct nfsd4_compound_state *, void *);
-typedef void(*stateid_getter)(struct nfsd4_compound_state *, void *);
-
-enum nfsd4_op_flags {
- ALLOWED_WITHOUT_FH = 1 << 0, /* No current filehandle required */
- ALLOWED_ON_ABSENT_FS = 1 << 1, /* ops processed on absent fs */
- ALLOWED_AS_FIRST_OP = 1 << 2, /* ops reqired first in compound */
- /* For rfc 5661 section 2.6.3.1.1: */
- OP_HANDLES_WRONGSEC = 1 << 3,
- OP_IS_PUTFH_LIKE = 1 << 4,
- /*
- * These are the ops whose result size we estimate before
- * encoding, to avoid performing an op then not being able to
- * respond or cache a response. This includes writes and setattrs
- * as well as the operations usually called "nonidempotent":
- */
- OP_MODIFIES_SOMETHING = 1 << 5,
- /*
- * Cache compounds containing these ops in the xid-based drc:
- * We use the DRC for compounds containing non-idempotent
- * operations, *except* those that are 4.1-specific (since
- * sessions provide their own EOS), and except for stateful
- * operations other than setclientid and setclientid_confirm
- * (since sequence numbers provide EOS for open, lock, etc in
- * the v4.0 case).
- */
- OP_CACHEME = 1 << 6,
- /*
- * These are ops which clear current state id.
- */
- OP_CLEAR_STATEID = 1 << 7,
-};
-
-struct nfsd4_operation {
- nfsd4op_func op_func;
- u32 op_flags;
- char *op_name;
- /* Try to get response size before operation */
- nfsd4op_rsize op_rsize_bop;
- stateid_getter op_get_currentstateid;
- stateid_setter op_set_currentstateid;
-};
-
static struct nfsd4_operation nfsd4_ops[];

-#ifdef NFSD_DEBUG
-static const char *nfsd4_op_name(unsigned opnum);
-#endif
-
/*
* Enforce NFSv4.1 COMPOUND ordering rules:
*
@@ -1219,7 +1169,7 @@ static __be32 nfs41_check_op_ordering(struct nfsd4_compoundargs *args)
return nfs_ok;
}

-static inline struct nfsd4_operation *OPDESC(struct nfsd4_op *op)
+struct nfsd4_operation *OPDESC(struct nfsd4_op *op)
{
return &nfsd4_ops[op->opnum];
}
@@ -1868,14 +1818,14 @@ static struct nfsd4_operation nfsd4_ops[] = {
},
};

-#ifdef NFSD_DEBUG
-static const char *nfsd4_op_name(unsigned opnum)
+const char *nfsd4_op_name(unsigned opnum)
{
+#ifdef NFSD_DEBUG
if (opnum < ARRAY_SIZE(nfsd4_ops))
return nfsd4_ops[opnum].op_name;
+#endif
return "unknown_operation";
}
-#endif

#define nfsd4_voidres nfsd4_voidargs
struct nfsd4_voidargs { int dummy; };
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index f62a055..fa3a589 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -536,6 +536,59 @@ static inline bool nfsd4_last_compound_op(struct svc_rqst *rqstp)
return argp->opcnt == resp->opcnt;
}

+
+
+typedef __be32(*nfsd4op_func)(struct svc_rqst *, struct nfsd4_compound_state *,
+ void *);
+typedef u32(*nfsd4op_rsize)(struct svc_rqst *, struct nfsd4_op *op);
+typedef void(*stateid_setter)(struct nfsd4_compound_state *, void *);
+typedef void(*stateid_getter)(struct nfsd4_compound_state *, void *);
+
+enum nfsd4_op_flags {
+ ALLOWED_WITHOUT_FH = 1 << 0, /* No current filehandle required */
+ ALLOWED_ON_ABSENT_FS = 1 << 1, /* ops processed on absent fs */
+ ALLOWED_AS_FIRST_OP = 1 << 2, /* ops reqired first in compound */
+ /* For rfc 5661 section 2.6.3.1.1: */
+ OP_HANDLES_WRONGSEC = 1 << 3,
+ OP_IS_PUTFH_LIKE = 1 << 4,
+ /*
+ * These are the ops whose result size we estimate before
+ * encoding, to avoid performing an op then not being able to
+ * respond or cache a response. This includes writes and setattrs
+ * as well as the operations usually called "nonidempotent":
+ */
+ OP_MODIFIES_SOMETHING = 1 << 5,
+ /*
+ * Cache compounds containing these ops in the xid-based drc:
+ * We use the DRC for compounds containing non-idempotent
+ * operations, *except* those that are 4.1-specific (since
+ * sessions provide their own EOS), and except for stateful
+ * operations other than setclientid and setclientid_confirm
+ * (since sequence numbers provide EOS for open, lock, etc in
+ * the v4.0 case).
+ */
+ OP_CACHEME = 1 << 6,
+ /*
+ * These are ops which clear current state id.
+ */
+ OP_CLEAR_STATEID = 1 << 7,
+};
+
+struct nfsd4_operation {
+ nfsd4op_func op_func;
+ u32 op_flags;
+ char *op_name;
+ /* Try to get response size before
+ * operation */
+ nfsd4op_rsize op_rsize_bop;
+ stateid_getter op_get_currentstateid;
+ stateid_setter op_set_currentstateid;
+};
+
+struct nfsd4_operation *OPDESC(struct nfsd4_op *op);
+
+const char *nfsd4_op_name(unsigned opnum);
+
#define NFS4_SVC_XDRSIZE sizeof(struct nfsd4_compoundargs)

static inline void
--
1.8.5.3


2014-03-23 01:12:25

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 09/50] nfsd4: embed xdr_stream in nfsd4_compoundres

From: "J. Bruce Fields" <[email protected]>

This is a mechanical transformation with no change in behavior.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4proc.c | 12 +++++-----
fs/nfsd/nfs4state.c | 8 +++----
fs/nfsd/nfs4xdr.c | 65 +++++++++++++++++++++++++++--------------------------
fs/nfsd/xdr4.h | 4 +---
4 files changed, 44 insertions(+), 45 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 9a914e8..27b8268 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1277,13 +1277,13 @@ nfsd4_proc_compound(struct svc_rqst *rqstp,
u32 plen = 0;
__be32 status;

- resp->xbuf = &rqstp->rq_res;
- resp->p = rqstp->rq_res.head[0].iov_base +
+ resp->xdr.buf = &rqstp->rq_res;
+ resp->xdr.p = rqstp->rq_res.head[0].iov_base +
rqstp->rq_res.head[0].iov_len;
- resp->tagp = resp->p;
+ resp->tagp = resp->xdr.p;
/* reserve space for: taglen, tag, and opcnt */
- resp->p += 2 + XDR_QUADLEN(args->taglen);
- resp->end = rqstp->rq_res.head[0].iov_base + PAGE_SIZE;
+ resp->xdr.p += 2 + XDR_QUADLEN(args->taglen);
+ resp->xdr.end = rqstp->rq_res.head[0].iov_base + PAGE_SIZE;
resp->taglen = args->taglen;
resp->tag = args->tag;
resp->opcnt = 0;
@@ -1335,7 +1335,7 @@ nfsd4_proc_compound(struct svc_rqst *rqstp,
* failed response to the next operation. If we don't
* have enough room, fail with ERR_RESOURCE.
*/
- slack_bytes = (char *)resp->end - (char *)resp->p;
+ slack_bytes = (char *)resp->xdr.end - (char *)resp->xdr.p;
if (slack_bytes < COMPOUND_SLACK_SPACE
+ COMPOUND_ERR_SLACK_SPACE) {
BUG_ON(slack_bytes < COMPOUND_ERR_SLACK_SPACE);
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index cb4cc4a..d98e282 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1560,10 +1560,10 @@ nfsd4_store_cache_entry(struct nfsd4_compoundres *resp)
slot->sl_datalen = 0;
return;
}
- slot->sl_datalen = (char *)resp->p - (char *)resp->cstate.datap;
+ slot->sl_datalen = (char *)resp->xdr.p - (char *)resp->cstate.datap;
base = (char *)resp->cstate.datap -
- (char *)resp->xbuf->head[0].iov_base;
- if (read_bytes_from_xdr_buf(resp->xbuf, base, slot->sl_data,
+ (char *)resp->xdr.buf->head[0].iov_base;
+ if (read_bytes_from_xdr_buf(resp->xdr.buf, base, slot->sl_data,
slot->sl_datalen))
WARN("%s: sessions DRC could not cache compound\n", __func__);
return;
@@ -1617,7 +1617,7 @@ nfsd4_replay_cache_entry(struct nfsd4_compoundres *resp,
memcpy(resp->cstate.datap, slot->sl_data, slot->sl_datalen);

resp->opcnt = slot->sl_opcnt;
- resp->p = resp->cstate.datap + XDR_QUADLEN(slot->sl_datalen);
+ resp->xdr.p = resp->cstate.datap + XDR_QUADLEN(slot->sl_datalen);
status = slot->sl_status;

return status;
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index ea32674..484a8aa 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1748,10 +1748,10 @@ static void write_cinfo(__be32 **p, struct nfsd4_change_info *c)
}

#define RESERVE_SPACE(nbytes) do { \
- p = resp->p; \
- BUG_ON(p + XDR_QUADLEN(nbytes) > resp->end); \
+ p = resp->xdr.p; \
+ BUG_ON(p + XDR_QUADLEN(nbytes) > resp->xdr.end); \
} while (0)
-#define ADJUST_ARGS() resp->p = p
+#define ADJUST_ARGS() resp->xdr.p = p

/* Encode as an array of strings the string given with components
* separated @sep, escaped with esc_enter and esc_exit.
@@ -2750,9 +2750,9 @@ nfsd4_encode_getattr(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
if (nfserr)
return nfserr;

- buflen = resp->end - resp->p - (COMPOUND_ERR_SLACK_SPACE >> 2);
+ buflen = resp->xdr.end - resp->xdr.p - (COMPOUND_ERR_SLACK_SPACE >> 2);
nfserr = nfsd4_encode_fattr(fhp, fhp->fh_export, fhp->fh_dentry,
- &resp->p, buflen, getattr->ga_bmval,
+ &resp->xdr.p, buflen, getattr->ga_bmval,
resp->rqstp, 0);
return nfserr;
}
@@ -2952,7 +2952,7 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,

if (nfserr)
return nfserr;
- if (resp->xbuf->page_len)
+ if (resp->xdr.buf->page_len)
return nfserr_resource;

RESERVE_SPACE(8); /* eof flag and byte count */
@@ -2990,18 +2990,18 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
WRITE32(eof);
WRITE32(maxcount);
ADJUST_ARGS();
- resp->xbuf->head[0].iov_len = (char*)p
- - (char*)resp->xbuf->head[0].iov_base;
- resp->xbuf->page_len = maxcount;
+ resp->xdr.buf->head[0].iov_len = (char*)p
+ - (char*)resp->xdr.buf->head[0].iov_base;
+ resp->xdr.buf->page_len = maxcount;

/* Use rest of head for padding and remaining ops: */
- resp->xbuf->tail[0].iov_base = p;
- resp->xbuf->tail[0].iov_len = 0;
+ resp->xdr.buf->tail[0].iov_base = p;
+ resp->xdr.buf->tail[0].iov_len = 0;
if (maxcount&3) {
RESERVE_SPACE(4);
WRITE32(0);
- resp->xbuf->tail[0].iov_base += maxcount&3;
- resp->xbuf->tail[0].iov_len = 4 - (maxcount&3);
+ resp->xdr.buf->tail[0].iov_base += maxcount&3;
+ resp->xdr.buf->tail[0].iov_len = 4 - (maxcount&3);
ADJUST_ARGS();
}
return 0;
@@ -3016,7 +3016,7 @@ nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd

if (nfserr)
return nfserr;
- if (resp->xbuf->page_len)
+ if (resp->xdr.buf->page_len)
return nfserr_resource;
if (!*resp->rqstp->rq_next_page)
return nfserr_resource;
@@ -3040,18 +3040,18 @@ nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd

WRITE32(maxcount);
ADJUST_ARGS();
- resp->xbuf->head[0].iov_len = (char*)p
- - (char*)resp->xbuf->head[0].iov_base;
- resp->xbuf->page_len = maxcount;
+ resp->xdr.buf->head[0].iov_len = (char*)p
+ - (char*)resp->xdr.buf->head[0].iov_base;
+ resp->xdr.buf->page_len = maxcount;

/* Use rest of head for padding and remaining ops: */
- resp->xbuf->tail[0].iov_base = p;
- resp->xbuf->tail[0].iov_len = 0;
+ resp->xdr.buf->tail[0].iov_base = p;
+ resp->xdr.buf->tail[0].iov_len = 0;
if (maxcount&3) {
RESERVE_SPACE(4);
WRITE32(0);
- resp->xbuf->tail[0].iov_base += maxcount&3;
- resp->xbuf->tail[0].iov_len = 4 - (maxcount&3);
+ resp->xdr.buf->tail[0].iov_base += maxcount&3;
+ resp->xdr.buf->tail[0].iov_len = 4 - (maxcount&3);
ADJUST_ARGS();
}
return 0;
@@ -3067,7 +3067,7 @@ nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4

if (nfserr)
return nfserr;
- if (resp->xbuf->page_len)
+ if (resp->xdr.buf->page_len)
return nfserr_resource;
if (!*resp->rqstp->rq_next_page)
return nfserr_resource;
@@ -3079,7 +3079,8 @@ nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
WRITE32(0);
WRITE32(0);
ADJUST_ARGS();
- resp->xbuf->head[0].iov_len = ((char*)resp->p) - (char*)resp->xbuf->head[0].iov_base;
+ resp->xdr.buf->head[0].iov_len = ((char*)resp->xdr.p)
+ - (char*)resp->xdr.buf->head[0].iov_base;
tailbase = p;

maxcount = PAGE_SIZE;
@@ -3120,14 +3121,14 @@ nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
p = readdir->buffer;
*p++ = 0; /* no more entries */
*p++ = htonl(readdir->common.err == nfserr_eof);
- resp->xbuf->page_len = ((char*)p) -
+ resp->xdr.buf->page_len = ((char*)p) -
(char*)page_address(*(resp->rqstp->rq_next_page-1));

/* Use rest of head for padding and remaining ops: */
- resp->xbuf->tail[0].iov_base = tailbase;
- resp->xbuf->tail[0].iov_len = 0;
- resp->p = resp->xbuf->tail[0].iov_base;
- resp->end = resp->p + (PAGE_SIZE - resp->xbuf->head[0].iov_len)/4;
+ resp->xdr.buf->tail[0].iov_base = tailbase;
+ resp->xdr.buf->tail[0].iov_len = 0;
+ resp->xdr.p = resp->xdr.buf->tail[0].iov_base;
+ resp->xdr.end = resp->xdr.p + (PAGE_SIZE - resp->xdr.buf->head[0].iov_len)/4;

return 0;
err_no_verf:
@@ -3586,10 +3587,10 @@ __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 pad)
session = resp->cstate.session;

if (xb->page_len == 0) {
- length = (char *)resp->p - (char *)xb->head[0].iov_base + pad;
+ length = (char *)resp->xdr.p - (char *)xb->head[0].iov_base + pad;
} else {
if (xb->tail[0].iov_base && xb->tail[0].iov_len > 0)
- tlen = (char *)resp->p - (char *)xb->tail[0].iov_base;
+ tlen = (char *)resp->xdr.p - (char *)xb->tail[0].iov_base;

length = xb->head[0].iov_len + xb->page_len + tlen + pad;
}
@@ -3636,7 +3637,7 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
}
if (so) {
so->so_replay.rp_status = op->status;
- so->so_replay.rp_buflen = (char *)resp->p - (char *)(statp+1);
+ so->so_replay.rp_buflen = (char *)resp->xdr.p - (char *)(statp+1);
memcpy(so->so_replay.rp_buf, statp+1, so->so_replay.rp_buflen);
}
status:
@@ -3732,7 +3733,7 @@ nfs4svc_encode_compoundres(struct svc_rqst *rqstp, __be32 *p, struct nfsd4_compo
iov = &rqstp->rq_res.tail[0];
else
iov = &rqstp->rq_res.head[0];
- iov->iov_len = ((char*)resp->p) - (char*)iov->iov_base;
+ iov->iov_len = ((char*)resp->xdr.p) - (char*)iov->iov_base;
BUG_ON(iov->iov_len > PAGE_SIZE);
if (nfsd4_has_session(cs)) {
struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 5ea7df3..6884d70 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -506,9 +506,7 @@ struct nfsd4_compoundargs {

struct nfsd4_compoundres {
/* scratch variables for XDR encode */
- __be32 * p;
- __be32 * end;
- struct xdr_buf * xbuf;
+ struct xdr_stream xdr;
struct svc_rqst * rqstp;

u32 taglen;
--
1.8.5.3


2014-03-23 01:12:31

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 38/50] nfsd4: adjust buflen to session channel limit

From: "J. Bruce Fields" <[email protected]>

We can simplify session limit enforcement by restricting the xdr buflen
to the session size.

Also fix a preexisting bug: we should really have been taking into
account the auth-required space when comparing against session limits,
which are limits on the size of the entire rpc reply, including any krb5
overhead.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 11 +++++++++++
fs/nfsd/nfs4xdr.c | 24 ++++++++----------------
2 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 1146cf2..79f3b71 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2195,11 +2195,13 @@ nfsd4_sequence(struct svc_rqst *rqstp,
struct nfsd4_sequence *seq)
{
struct nfsd4_compoundres *resp = rqstp->rq_resp;
+ struct xdr_stream *xdr = &resp->xdr;
struct nfsd4_session *session;
struct nfs4_client *clp;
struct nfsd4_slot *slot;
struct nfsd4_conn *conn;
__be32 status;
+ int buflen;
struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);

if (resp->opcnt != 1)
@@ -2268,6 +2270,15 @@ nfsd4_sequence(struct svc_rqst *rqstp,
if (status)
goto out_put_session;

+ buflen = (seq->cachethis) ?
+ session->se_fchannel.maxresp_cached :
+ session->se_fchannel.maxresp_sz;
+ status = (seq->cachethis) ? nfserr_rep_too_big_to_cache :
+ nfserr_rep_too_big;
+ if (xdr_restrict_buflen(xdr, buflen - 2 * RPC_MAX_AUTH_SIZE))
+ goto out_put_session;
+
+ status = nfs_ok;
/* Success! bump slot seqid */
slot->sl_seqid = seq->seqid;
slot->sl_flags |= NFSD4_SLOT_INUSE;
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 6d9f0a53..e0e486d 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3753,25 +3753,17 @@ static nfsd4_enc nfsd4_enc_ops[] = {
__be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 respsize)
{
struct xdr_buf *buf = &resp->rqstp->rq_res;
- struct nfsd4_session *session = resp->cstate.session;
+ struct nfsd4_slot *slot = resp->cstate.slot;

- if (nfsd4_has_session(&resp->cstate)) {
- struct nfsd4_slot *slot = resp->cstate.slot;
-
- if (buf->len + respsize > session->se_fchannel.maxresp_sz)
- return nfserr_rep_too_big;
-
- if ((slot->sl_flags & NFSD4_SLOT_CACHETHIS) &&
- buf->len + respsize > session->se_fchannel.maxresp_cached)
- return nfserr_rep_too_big_to_cache;
- }
-
- if (buf->len + respsize > buf->buflen) {
- WARN_ON_ONCE(nfsd4_has_session(&resp->cstate));
+ if (buf->len + respsize <= buf->buflen)
+ return nfs_ok;
+ if (!nfsd4_has_session(&resp->cstate))
return nfserr_resource;
+ if (slot->sl_flags & NFSD4_SLOT_CACHETHIS) {
+ WARN_ON_ONCE(1);
+ return nfserr_rep_too_big_to_cache;
}
-
- return 0;
+ return nfserr_rep_too_big;
}

void
--
1.8.5.3


2014-03-23 01:12:31

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 39/50] nfsd4: use session limits to release send buffer reservation

From: "J. Bruce Fields" <[email protected]>

Once we know the limits the session places on the size of the rpc, we
can also use that information to release any unnecessary reserved reply
buffer space.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 79f3b71..343324e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2277,6 +2277,7 @@ nfsd4_sequence(struct svc_rqst *rqstp,
nfserr_rep_too_big;
if (xdr_restrict_buflen(xdr, buflen - 2 * RPC_MAX_AUTH_SIZE))
goto out_put_session;
+ svc_reserve(rqstp, buflen);

status = nfs_ok;
/* Success! bump slot seqid */
--
1.8.5.3


2014-03-23 01:12:29

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 27/50] nfsd4: nfsd4_check_resp_size needn't recalculate length

From: "J. Bruce Fields" <[email protected]>

We're keeping the length updated as we go now, so there's no need for
the extra calculation here.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 18 +++---------------
1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 704167d..0d8bbe4 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3729,32 +3729,20 @@ static nfsd4_enc nfsd4_enc_ops[] = {
*/
__be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 pad)
{
- struct xdr_buf *xb = &resp->rqstp->rq_res;
+ struct xdr_buf *buf = &resp->rqstp->rq_res;
struct nfsd4_session *session = NULL;
struct nfsd4_slot *slot = resp->cstate.slot;
- u32 length, tlen = 0;

if (!nfsd4_has_session(&resp->cstate))
return 0;

session = resp->cstate.session;

- if (xb->page_len == 0) {
- length = (char *)resp->xdr.p - (char *)xb->head[0].iov_base + pad;
- } else {
- if (xb->tail[0].iov_base && xb->tail[0].iov_len > 0)
- tlen = (char *)resp->xdr.p - (char *)xb->tail[0].iov_base;
-
- length = xb->head[0].iov_len + xb->page_len + tlen + pad;
- }
- dprintk("%s length %u, xb->page_len %u tlen %u pad %u\n", __func__,
- length, xb->page_len, tlen, pad);
-
- if (length > session->se_fchannel.maxresp_sz)
+ if (buf->len + pad > session->se_fchannel.maxresp_sz)
return nfserr_rep_too_big;

if ((slot->sl_flags & NFSD4_SLOT_CACHETHIS) &&
- length > session->se_fchannel.maxresp_cached)
+ buf->len + pad > session->se_fchannel.maxresp_cached)
return nfserr_rep_too_big_to_cache;

return 0;
--
1.8.5.3


2014-03-23 01:12:26

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 11/50] nfsd4: move proc_compound xdr encode init to helper

From: "J. Bruce Fields" <[email protected]>

Mechanical transformation with no change of behavior.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4proc.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 7248a78..80e5921 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1262,6 +1262,17 @@ static bool need_wrongsec_check(struct svc_rqst *rqstp)
return !(nextd->op_flags & OP_HANDLES_WRONGSEC);
}

+static void svcxdr_init_encode(struct svc_rqst *rqstp, struct nfsd4_compoundres *resp)
+{
+ struct xdr_stream *xdr = &resp->xdr;
+ struct xdr_buf *buf = &rqstp->rq_res;
+ struct kvec *head = buf->head;
+
+ xdr->buf = buf;
+ xdr->p = head->iov_base + head->iov_len;
+ xdr->end = head->iov_base + PAGE_SIZE;
+}
+
/*
* COMPOUND call.
*/
@@ -1277,13 +1288,10 @@ nfsd4_proc_compound(struct svc_rqst *rqstp,
u32 plen = 0;
__be32 status;

- resp->xdr.buf = &rqstp->rq_res;
- resp->xdr.p = rqstp->rq_res.head[0].iov_base +
- rqstp->rq_res.head[0].iov_len;
+ svcxdr_init_encode(rqstp, resp);
resp->tagp = resp->xdr.p;
/* reserve space for: taglen, tag, and opcnt */
resp->xdr.p += 2 + XDR_QUADLEN(args->taglen);
- resp->xdr.end = rqstp->rq_res.head[0].iov_base + PAGE_SIZE;
resp->taglen = args->taglen;
resp->tag = args->tag;
resp->opcnt = 0;
--
1.8.5.3


2014-03-23 01:12:30

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 30/50] nfsd4: allow encoding across page boundaries

From: "J. Bruce Fields" <[email protected]>

After this we can handle for example getattr of very large ACLs.

Read, readdir, readlink are still special cases with their own limits.

Also we can't handle a new operation starting close to the end of a
page.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4proc.c | 4 +++
fs/nfsd/nfs4xdr.c | 59 +++++++++++++++++++++++++----------
include/linux/sunrpc/svc.h | 1 +
include/linux/sunrpc/xdr.h | 1 +
net/sunrpc/svc_xprt.c | 1 +
net/sunrpc/xdr.c | 78 ++++++++++++++++++++++++++++++++++++++++++++--
6 files changed, 125 insertions(+), 19 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 54081d27..bc221e4 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1224,6 +1224,10 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp, struct nfsd4_compoundres
xdr->end = head->iov_base + PAGE_SIZE - 2 * RPC_MAX_AUTH_SIZE;
/* Tail and page_len should be zero at this point: */
buf->len = buf->head[0].iov_len;
+ xdr->scratch.iov_len = 0;
+ xdr->page_ptr = buf->pages;
+ buf->buflen = PAGE_SIZE * (1 + rqstp->rq_page_end - buf->pages)
+ - 2 * RPC_MAX_AUTH_SIZE;
}

/*
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 062559c..8dc65f5 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1625,6 +1625,7 @@ static int nfsd4_max_reply(u32 opnum)
* the head and tail in another page:
*/
return 2 * PAGE_SIZE;
+ case OP_GETATTR:
case OP_READ:
return INT_MAX;
default:
@@ -2546,21 +2547,30 @@ out_resource:
goto out;
}

+static void svcxdr_init_encode_from_buffer(struct xdr_stream *xdr, struct xdr_buf *buf, __be32 *p, int bytes)
+{
+ xdr->scratch.iov_len = 0;
+ memset(buf, 0, sizeof(struct xdr_buf));
+ buf->head[0].iov_base = p;
+ buf->head[0].iov_len = 0;
+ buf->len = 0;
+ xdr->buf = buf;
+ xdr->iov = buf->head;
+ xdr->p = p;
+ xdr->end = (void *)p + bytes;
+ buf->buflen = bytes;
+}
+
__be32 nfsd4_encode_fattr_to_buf(__be32 **p, int words,
struct svc_fh *fhp, struct svc_export *exp,
struct dentry *dentry, u32 *bmval,
struct svc_rqst *rqstp, int ignore_crossmnt)
{
- struct xdr_buf dummy = {
- .head[0] = {
- .iov_base = *p,
- },
- .buflen = words << 2,
- };
+ struct xdr_buf dummy;
struct xdr_stream xdr;
__be32 ret;

- xdr_init_encode(&xdr, &dummy, NULL);
+ svcxdr_init_encode_from_buffer(&xdr, &dummy, *p, words << 2);
ret = nfsd4_encode_fattr(&xdr, fhp, exp, dentry, bmval, rqstp, ignore_crossmnt);
*p = xdr.p;
return ret;
@@ -3048,8 +3058,6 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,

if (nfserr)
return nfserr;
- if (resp->xdr.buf->page_len)
- return nfserr_resource;

p = xdr_reserve_space(xdr, 8); /* eof flag and byte count */
if (!p)
@@ -3059,6 +3067,9 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
if (xdr->end - xdr->p < 1)
return nfserr_resource;

+ if (resp->xdr.buf->page_len)
+ return nfserr_resource;
+
maxcount = svc_max_payload(resp->rqstp);
if (maxcount > read->rd_length)
maxcount = read->rd_length;
@@ -3098,6 +3109,8 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
- (char*)resp->xdr.buf->head[0].iov_base);
resp->xdr.buf->page_len = maxcount;
xdr->buf->len += maxcount;
+ xdr->page_ptr += v;
+ xdr->buf->buflen = maxcount + PAGE_SIZE - 2 * RPC_MAX_AUTH_SIZE;
xdr->iov = xdr->buf->tail;

/* Use rest of head for padding and remaining ops: */
@@ -3124,6 +3137,11 @@ nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd

if (nfserr)
return nfserr;
+
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
+ return nfserr_resource;
+
if (resp->xdr.buf->page_len)
return nfserr_resource;
if (!*resp->rqstp->rq_next_page)
@@ -3133,10 +3151,6 @@ nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd

maxcount = PAGE_SIZE;

- p = xdr_reserve_space(xdr, 4);
- if (!p)
- return nfserr_resource;
-
if (xdr->end - xdr->p < 1)
return nfserr_resource;

@@ -3159,6 +3173,8 @@ nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd
- (char*)resp->xdr.buf->head[0].iov_base;
resp->xdr.buf->page_len = maxcount;
xdr->buf->len += maxcount;
+ xdr->page_ptr += 1;
+ xdr->buf->buflen -= PAGE_SIZE;
xdr->iov = xdr->buf->tail;

/* Use rest of head for padding and remaining ops: */
@@ -3185,15 +3201,16 @@ nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4

if (nfserr)
return nfserr;
- if (resp->xdr.buf->page_len)
- return nfserr_resource;
- if (!*resp->rqstp->rq_next_page)
- return nfserr_resource;

p = xdr_reserve_space(xdr, NFS4_VERIFIER_SIZE);
if (!p)
return nfserr_resource;

+ if (resp->xdr.buf->page_len)
+ return nfserr_resource;
+ if (!*resp->rqstp->rq_next_page)
+ return nfserr_resource;
+
/* XXX: Following NFSv3, we ignore the READDIR verifier for now. */
WRITE32(0);
WRITE32(0);
@@ -3245,6 +3262,10 @@ nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4

xdr->iov = xdr->buf->tail;

+ xdr->page_ptr++;
+ xdr->buf->buflen -= PAGE_SIZE;
+ xdr->iov = xdr->buf->tail;
+
/* Use rest of head for padding and remaining ops: */
resp->xdr.buf->tail[0].iov_base = tailbase;
resp->xdr.buf->tail[0].iov_len = 0;
@@ -3777,6 +3798,8 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
!nfsd4_enc_ops[op->opnum]);
encoder = nfsd4_enc_ops[op->opnum];
op->status = encoder(resp, op->status, &op->u);
+ xdr_commit_encode(xdr);
+
/* nfsd4_check_resp_size guarantees enough room for error status */
if (!op->status) {
int space_needed = 0;
@@ -3903,6 +3926,8 @@ nfs4svc_encode_compoundres(struct svc_rqst *rqstp, __be32 *p, struct nfsd4_compo
WARN_ON_ONCE(buf->len != buf->head[0].iov_len + buf->page_len +
buf->tail[0].iov_len);

+ rqstp->rq_next_page = resp->xdr.page_ptr + 1;
+
p = resp->tagp;
*p++ = htonl(resp->taglen);
memcpy(p, resp->tag, resp->taglen);
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 04e7632..39c50e1 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -244,6 +244,7 @@ struct svc_rqst {
struct page * rq_pages[RPCSVC_MAXPAGES];
struct page * *rq_respages; /* points into rq_pages */
struct page * *rq_next_page; /* next reply page to use */
+ struct page * *rq_page_end; /* one past the last page */

struct kvec rq_vec[RPCSVC_MAXPAGES]; /* generally useful.. */

diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index e7bb2e3..b23d69f 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -215,6 +215,7 @@ typedef int (*kxdrdproc_t)(void *rqstp, struct xdr_stream *xdr, void *obj);

extern void xdr_init_encode(struct xdr_stream *xdr, struct xdr_buf *buf, __be32 *p);
extern __be32 *xdr_reserve_space(struct xdr_stream *xdr, size_t nbytes);
+extern void xdr_commit_encode(struct xdr_stream *xdr);
extern void xdr_truncate_encode(struct xdr_stream *xdr, size_t len);
extern void xdr_write_pages(struct xdr_stream *xdr, struct page **pages,
unsigned int base, unsigned int len);
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 80a6640..e455590 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -597,6 +597,7 @@ int svc_alloc_arg(struct svc_rqst *rqstp)
}
rqstp->rq_pages[i] = p;
}
+ rqstp->rq_page_end = &rqstp->rq_pages[i];
rqstp->rq_pages[i++] = NULL; /* this might be seen in nfs_read_actor */

/* Make arg->head point to first page and arg->pages point to rest */
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index 8ae8ee7..e65d6b6 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -462,6 +462,7 @@ void xdr_init_encode(struct xdr_stream *xdr, struct xdr_buf *buf, __be32 *p)
struct kvec *iov = buf->head;
int scratch_len = buf->buflen - buf->page_len - buf->tail[0].iov_len;

+ xdr_set_scratch_buffer(xdr, NULL, 0);
BUG_ON(scratch_len < 0);
xdr->buf = buf;
xdr->iov = iov;
@@ -482,6 +483,74 @@ void xdr_init_encode(struct xdr_stream *xdr, struct xdr_buf *buf, __be32 *p)
EXPORT_SYMBOL_GPL(xdr_init_encode);

/**
+ * xdr_commit_encode - Ensure all data is written to buffer
+ * @xdr: pointer to xdr_stream
+ *
+ * We handle encoding across page boundaries by giving the caller a
+ * temporary location to write to, then later copying the data into
+ * place; xdr_commit_encode does that copying.
+ *
+ * Normally the caller doesn't need to call this directly, as the
+ * following xdr_reserve_space will do it. But an explicit call may be
+ * required at the end of encoding, or any other time when the xdr_buf
+ * data might be read.
+ */
+void xdr_commit_encode(struct xdr_stream *xdr)
+{
+ int shift = xdr->scratch.iov_len;
+ void *page;
+
+ if (shift == 0)
+ return;
+ page = page_address(*xdr->page_ptr);
+ memcpy(xdr->scratch.iov_base, page, shift);
+ memmove(page, page + shift, (void *)xdr->p - page);
+ xdr->scratch.iov_len = 0;
+}
+EXPORT_SYMBOL_GPL(xdr_commit_encode);
+
+__be32 * xdr_get_next_encode_buffer(struct xdr_stream *xdr, size_t nbytes)
+{
+ static __be32 *p;
+ int space_left;
+ int frag1bytes, frag2bytes;
+
+ if (nbytes > PAGE_SIZE)
+ return NULL; /* Bigger buffers require special handling */
+ if (xdr->buf->len + nbytes > xdr->buf->buflen)
+ return NULL; /* Sorry, we're totally out of space */
+ frag1bytes = (xdr->end - xdr->p) << 2;
+ frag2bytes = nbytes - frag1bytes;
+ if (xdr->iov)
+ xdr->iov->iov_len += frag1bytes;
+ else {
+ xdr->buf->page_len += frag1bytes;
+ xdr->page_ptr++;
+ }
+ xdr->iov = NULL;
+ /*
+ * If the last encode didn't end exactly on a page boundary, the
+ * next one will straddle boundaries. Encode into the next
+ * page, then copy it back later in xdr_commit_encode. We use
+ * the "scratch" iov to track any temporarily unused fragment of
+ * space at the end of the previous buffer:
+ */
+ xdr->scratch.iov_base = xdr->p;
+ xdr->scratch.iov_len = frag1bytes;
+ p = page_address(*xdr->page_ptr);
+ /*
+ * Note this is where the next encode will start after we've
+ * shifted this one back:
+ */
+ xdr->p = (void *)p + frag2bytes;
+ space_left = xdr->buf->buflen - xdr->buf->len;
+ xdr->end = (void *)p + min_t(int, space_left, PAGE_SIZE);
+ xdr->buf->page_len += frag2bytes;
+ xdr->buf->len += nbytes;
+ return p;
+}
+
+/**
* xdr_reserve_space - Reserve buffer space for sending
* @xdr: pointer to xdr_stream
* @nbytes: number of bytes to reserve
@@ -495,14 +564,18 @@ __be32 * xdr_reserve_space(struct xdr_stream *xdr, size_t nbytes)
__be32 *p = xdr->p;
__be32 *q;

+ xdr_commit_encode(xdr);
/* align nbytes on the next 32-bit boundary */
nbytes += 3;
nbytes &= ~3;
q = p + (nbytes >> 2);
if (unlikely(q > xdr->end || q < p))
- return NULL;
+ return xdr_get_next_encode_buffer(xdr, nbytes);
xdr->p = q;
- xdr->iov->iov_len += nbytes;
+ if (xdr->iov)
+ xdr->iov->iov_len += nbytes;
+ else
+ xdr->buf->page_len += nbytes;
xdr->buf->len += nbytes;
return p;
}
@@ -539,6 +612,7 @@ void xdr_truncate_encode(struct xdr_stream *xdr, size_t len)
WARN_ON_ONCE(1);
return;
}
+ xdr_commit_encode(xdr);

fraglen = min_t(int, buf->len - len, tail->iov_len);
tail->iov_len -= fraglen;
--
1.8.5.3


2014-03-23 01:12:30

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 32/50] nfsd4: don't try to encode conflicting owner if low on space

From: "J. Bruce Fields" <[email protected]>

I ran into this corner case in testing: in theory clients can provide
state owners up to 1024 bytes long. In the sessions case there might be
a risk of this pushing us over the DRC slot size.

The conflicting owner isn't really that important, so let's humor a
client that provides a small maxresponsize_cached by allowing ourselves
to return without the conflicting owner instead of outright failing the
operation.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4proc.c | 3 ++-
fs/nfsd/nfs4xdr.c | 16 +++++++++++++---
2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index bc221e4..9876de2 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1387,7 +1387,8 @@ out:
#define op_encode_change_info_maxsz (5)
#define nfs4_fattr_bitmap_maxsz (4)

-#define op_encode_lockowner_maxsz (1 + XDR_QUADLEN(IDMAP_NAMESZ))
+/* We'll fall back on returning no lockowner if run out of space: */
+#define op_encode_lockowner_maxsz (0)
#define op_encode_lock_denied_maxsz (8 + op_encode_lockowner_maxsz)

#define nfs4_owner_maxsz (1 + XDR_QUADLEN(IDMAP_NAMESZ))
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 8703cca..3620ac5d 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -2861,9 +2861,20 @@ nfsd4_encode_lock_denied(struct xdr_stream *xdr, struct nfsd4_lock_denied *ld)
struct xdr_netobj *conf = &ld->ld_owner;
__be32 *p;

+again:
p = xdr_reserve_space(xdr, 32 + XDR_LEN(conf->len));
- if (!p)
+ if (!p) {
+ /*
+ * Don't fail to return the result just because we can't
+ * return the conflicting open:
+ */
+ if (conf->len) {
+ conf->len = 0;
+ conf->data = NULL;
+ goto again;
+ }
return nfserr_resource;
+ }
WRITE64(ld->ld_start);
WRITE64(ld->ld_length);
WRITE32(ld->ld_type);
@@ -2871,7 +2882,6 @@ nfsd4_encode_lock_denied(struct xdr_stream *xdr, struct nfsd4_lock_denied *ld)
WRITEMEM(&ld->ld_clientid, 8);
WRITE32(conf->len);
WRITEMEM(conf->data, conf->len);
- kfree(conf->data);
} else { /* non - nfsv4 lock in conflict, no clientid nor owner */
WRITE64((u64)0); /* clientid */
WRITE32(0); /* length of owner name */
@@ -2888,7 +2898,7 @@ nfsd4_encode_lock(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_lo
nfserr = nfsd4_encode_stateid(xdr, &lock->lk_resp_stateid);
else if (nfserr == nfserr_denied)
nfserr = nfsd4_encode_lock_denied(xdr, &lock->lk_denied);
-
+ kfree(lock->lk_denied.ld_owner.data);
return nfserr;
}

--
1.8.5.3


2014-03-23 15:07:25

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 22/50] nfsd4: use xdr_truncate_encode

On Sat, Mar 22, 2014 at 11:50:05PM -0700, Christoph Hellwig wrote:
> Ah, here we got the helper for most of the error handling I asked for
> earlier. Might be worth mentioning in that patch that this gets further
> improvements soons.

OK, will do.

> > if (nfserr) {
> > - xdr->p -= 2;
> > - xdr->iov->iov_len -= 8;
> > + xdr->buf->page_len = 0;
> > + xdr_truncate_encode(xdr, starting_len);
>
> Why do we still need to manually set the page_len?

In the splice case read increments page_len as it goes. This leaves the
xdr_buf in somewhat of an inconsistent state and may confuse a later
xdr_truncate_encode.

It's a little ugly, I'm not sure what to do about it. Currently I'm
thinking of just telling nfsd's splice callback to leave page_len alone.
That'd mean taking another look at the splice callback and at the v2/v3
read code.

--b.

2014-03-23 01:12:27

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 14/50] nfsd4: fix encoding of out-of-space replies

From: "J. Bruce Fields" <[email protected]>

If nfsd4_check_resp_size() returns an error then we should really be
truncating the reply here, otherwise we may leave extra garbage at the
end of the rpc reply.

Also add a warning to catch any cases where our reply-size estimates may
be wrong in the case of a non-idempotent operation.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 2ad423d..c150121 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3632,6 +3632,7 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
{
struct nfs4_stateowner *so = resp->cstate.replay_owner;
__be32 *statp;
+ nfsd4_enc encoder;
__be32 *p;

RESERVE_SPACE(8);
@@ -3643,7 +3644,8 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
goto status;
BUG_ON(op->opnum < 0 || op->opnum >= ARRAY_SIZE(nfsd4_enc_ops) ||
!nfsd4_enc_ops[op->opnum]);
- op->status = nfsd4_enc_ops[op->opnum](resp, op->status, &op->u);
+ encoder = nfsd4_enc_ops[op->opnum];
+ op->status = encoder(resp, op->status, &op->u);
/* nfsd4_check_resp_size guarantees enough room for error status */
if (!op->status)
op->status = nfsd4_check_resp_size(resp, 0);
@@ -3655,6 +3657,24 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
else
op->status = nfserr_rep_too_big;
}
+ if (op->status == nfserr_resource ||
+ op->status == nfserr_rep_too_big ||
+ op->status == nfserr_rep_too_big_to_cache) {
+ /*
+ * The operation may have already been encoded or
+ * partially encoded. No op returns anything additional
+ * in the case of one of these three errors, so we can
+ * just truncate back to after the status. But it's a
+ * bug if we had to do this on a non-idempotent op:
+ */
+ if (OPDESC(op)->op_flags & OP_MODIFIES_SOMETHING) {
+ printk("unable to encode reply to nonidempotent op"
+ " %d (%s)\n", op->opnum,
+ nfsd4_op_name(op->opnum));
+ WARN_ON_ONCE(1);
+ }
+ resp->xdr.p = statp + 1;
+ }
if (so) {
so->so_replay.rp_status = op->status;
so->so_replay.rp_buflen = (char *)resp->xdr.p - (char *)(statp+1);
--
1.8.5.3


2014-03-25 15:36:35

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 22/50] nfsd4: use xdr_truncate_encode

On Sun, Mar 23, 2014 at 11:07:22AM -0400, J. Bruce Fields wrote:
> In the splice case read increments page_len as it goes. This leaves the
> xdr_buf in somewhat of an inconsistent state and may confuse a later
> xdr_truncate_encode.
>
> It's a little ugly, I'm not sure what to do about it. Currently I'm
> thinking of just telling nfsd's splice callback to leave page_len alone.
> That'd mean taking another look at the splice callback and at the v2/v3
> read code.

How a about just adding a comment explaing this for now?


2014-03-23 01:12:25

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 06/50] nfsd4: remove redundant check from nfsd4_check_resp_size

From: "J. Bruce Fields" <[email protected]>

cstate->slot and ->session are each set together in nfsd4_sequence. If
one is non-NULL, so is the other.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index ebefc42..143780d 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3584,8 +3584,6 @@ __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 pad)
return 0;

session = resp->cstate.session;
- if (session == NULL)
- return 0;

if (xb->page_len == 0) {
length = (char *)resp->p - (char *)xb->head[0].iov_base + pad;
--
1.8.5.3


2014-03-23 01:12:25

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 03/50] nfsd4: nfsd4_replay_cache_entry should be static

From: "J. Bruce Fields" <[email protected]>

This isn't actually used anywhere else.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 2 +-
fs/nfsd/xdr4.h | 2 --
2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index f082961..613f7cd 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1600,7 +1600,7 @@ nfsd4_enc_sequence_replay(struct nfsd4_compoundargs *args,
* The sequence operation is not cached because we can use the slot and
* session values.
*/
-__be32
+static __be32
nfsd4_replay_cache_entry(struct nfsd4_compoundres *resp,
struct nfsd4_sequence *seq)
{
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index d278a0d..5ea7df3 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -574,8 +574,6 @@ extern __be32 nfsd4_setclientid_confirm(struct svc_rqst *rqstp,
struct nfsd4_compound_state *,
struct nfsd4_setclientid_confirm *setclientid_confirm);
extern void nfsd4_store_cache_entry(struct nfsd4_compoundres *resp);
-extern __be32 nfsd4_replay_cache_entry(struct nfsd4_compoundres *resp,
- struct nfsd4_sequence *seq);
extern __be32 nfsd4_exchange_id(struct svc_rqst *rqstp,
struct nfsd4_compound_state *, struct nfsd4_exchange_id *);
extern __be32 nfsd4_backchannel_ctl(struct svc_rqst *, struct nfsd4_compound_state *, struct nfsd4_backchannel_ctl *);
--
1.8.5.3


2014-03-23 15:11:44

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 18/50] nfsd4: use xdr_stream throughout compound encoding

On Sat, Mar 22, 2014 at 11:43:36PM -0700, Christoph Hellwig wrote:
> > #define RESERVE_SPACE(nbytes) do { \
> > - p = resp->xdr.p; \
> > - BUG_ON(p + XDR_QUADLEN(nbytes) > resp->xdr.end); \
> > + p = xdr_reserve_space(&resp->xdr, nbytes); \
> > + BUG_ON(!p); \
> > } while (0)
> > -#define ADJUST_ARGS() resp->xdr.p = p
> > +#define ADJUST_ARGS() WARN_ON_ONCE(p != resp->xdr.p) \
>
> I think the code would become more clear if you'd start expanding these
> macros. IF the RESERVE_SPACE BUG_ON is important enough it should move
> into xdr_reserve_space or a variant thereof, or otherweise just removed.

Yeah, those will go later.

Ideas for how to sequence these changes better welcomed.

> > + if (nfserr) {
> > + xdr->p -= 2;
> > + xdr->iov->iov_len -= 8;
>
>
> > + if (nfserr) {
> > + xdr->p--;
> > + xdr->iov->iov_len -= 4;
> > return nfserr;
> > + }
>
>
> > + xdr->p = savep;
> > + xdr->iov->iov_len = ((char *)resp->xdr.p)
> > + - (char *)resp->xdr.buf->head[0].iov_base;
>
> Where is this coming from? Looks like deep magic to be and would
> benefit from comments, symbolic names and/or a helper.

OK, may be worth a comment even if it's only a temporary thing.

--b.

2014-03-23 01:12:31

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 37/50] rpc: define xdr_restrict_buflen

From: "J. Bruce Fields" <[email protected]>

With this xdr_reserve_space can help us enforce various limits.

Signed-off-by: J. Bruce Fields <[email protected]>
---
include/linux/sunrpc/xdr.h | 1 +
net/sunrpc/xdr.c | 29 +++++++++++++++++++++++++++++
2 files changed, 30 insertions(+)

diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index b23d69f..70c6b92 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -217,6 +217,7 @@ extern void xdr_init_encode(struct xdr_stream *xdr, struct xdr_buf *buf, __be32
extern __be32 *xdr_reserve_space(struct xdr_stream *xdr, size_t nbytes);
extern void xdr_commit_encode(struct xdr_stream *xdr);
extern void xdr_truncate_encode(struct xdr_stream *xdr, size_t len);
+extern int xdr_restrict_buflen(struct xdr_stream *xdr, int newbuflen);
extern void xdr_write_pages(struct xdr_stream *xdr, struct page **pages,
unsigned int base, unsigned int len);
extern unsigned int xdr_stream_pos(const struct xdr_stream *xdr);
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index e65d6b6..f97e3df 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -650,6 +650,35 @@ void xdr_truncate_encode(struct xdr_stream *xdr, size_t len)
EXPORT_SYMBOL(xdr_truncate_encode);

/**
+ * xdr_restrict_buflen - decrease available buffer space
+ * @xdr: pointer to xdr_stream
+ * @newbuflen: new maximum number of bytes available
+ *
+ * Adjust our idea of how much space is available in the buffer.
+ * If we've already used too much space in the buffer, returns -1.
+ * If the available space is already smaller than newbuflen, returns 0
+ * and does nothing. Otherwise, adjusts xdr->buf->buflen to newbuflen
+ * and ensures xdr->end is set at most offset newbuflen from the start
+ * of the buffer.
+ */
+int xdr_restrict_buflen(struct xdr_stream *xdr, int newbuflen)
+{
+ struct xdr_buf *buf = xdr->buf;
+ int left_in_this_buf = (void *)xdr->end - (void *)xdr->p;
+ int end_offset = buf->len + left_in_this_buf;
+
+ if (newbuflen < 0 || newbuflen < buf->len)
+ return -1;
+ if (newbuflen > buf->buflen)
+ return 0;
+ if (newbuflen < end_offset)
+ xdr->end = (void *)xdr->end + newbuflen - end_offset;
+ buf->buflen = newbuflen;
+ return 0;
+}
+EXPORT_SYMBOL(xdr_restrict_buflen);
+
+/**
* xdr_write_pages - Insert a list of pages into an XDR buffer for sending
* @xdr: pointer to xdr_stream
* @pages: list of pages
--
1.8.5.3


2014-03-25 15:38:23

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 18/50] nfsd4: use xdr_stream throughout compound encoding

On Sun, Mar 23, 2014 at 11:11:42AM -0400, J. Bruce Fields wrote:
> > > } while (0)
> > > -#define ADJUST_ARGS() resp->xdr.p = p
> > > +#define ADJUST_ARGS() WARN_ON_ONCE(p != resp->xdr.p) \
> >
> > I think the code would become more clear if you'd start expanding these
> > macros. IF the RESERVE_SPACE BUG_ON is important enough it should move
> > into xdr_reserve_space or a variant thereof, or otherweise just removed.
>
> Yeah, those will go later.
>
> Ideas for how to sequence these changes better welcomed.

In this case I'd suggest just killing ADJUST_ARGS in this patch.

In general if there's something looking ugly that gets killed or cleaned
up later I simply mention it in the patch description.

> > > + xdr->p = savep;
> > > + xdr->iov->iov_len = ((char *)resp->xdr.p)
> > > + - (char *)resp->xdr.buf->head[0].iov_base;
> >
> > Where is this coming from? Looks like deep magic to be and would
> > benefit from comments, symbolic names and/or a helper.
>
> OK, may be worth a comment even if it's only a temporary thing.

I didn't know it was going away when I read the patch. Keeping the code
as-is is fine with me, but maybe a little comment in the changelog might
help reviwers the look at the next iteration.


2014-03-23 14:52:08

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 23/50] nfsd4: "backfill" using write_bytes_to_xdr_buf

On Sun, Mar 23, 2014 at 10:43:44AM -0400, J. Bruce Fields wrote:
> By the end of the series, most xdr encoding looks like
>
> p = xdr_reserve_space(xdr, <nbytes>);
>
> *p++ = cpu_to_be32(<some value>);
> p = xdr_encode_something_more_complicated(p, <some struct>);
> ...
>
> It doesn't require the encoders to know much about xdr_buf's.
>
> The main exception is read, especially the splice case. We could
> probably do better there, I just haven't thought much about it yet.
>
> Hm, but maybe what you're asking for here is just a
> write_bytes_to_xdr_buf wrapper that takes the xdr_stream instead of an
> xdr_buf, and a corresponding function to read xdr->buf->len when we want
> an offset? OK, maybe so.

I was thinking of something like that, but looking over the rest of the
series I don't think it fits in here. I like the way where your series
is moving the XDR code, and that seems enough for now.


2014-03-23 01:12:33

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 48/50] nfsd4: kill WRITE64

From: "J. Bruce Fields" <[email protected]>

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 52 ++++++++++++++++++++++++----------------------------
1 file changed, 24 insertions(+), 28 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 7a827c1..6afa0a1 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1700,10 +1700,6 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp)
DECODE_TAIL;
}

-#define WRITE64(n) do { \
- *p++ = htonl((u32)((n) >> 32)); \
- *p++ = htonl((u32)(n)); \
-} while (0)
#define WRITEMEM(ptr,nbytes) do { if (nbytes > 0) { \
*(p + XDR_QUADLEN(nbytes) -1) = 0; \
memcpy(p, ptr, nbytes); \
@@ -2208,7 +2204,7 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
p = xdr_reserve_space(xdr, 8);
if (!p)
goto out_resource;
- WRITE64(stat.size);
+ p = xdr_encode_hyper(p, stat.size);
}
if (bmval0 & FATTR4_WORD0_LINK_SUPPORT) {
p = xdr_reserve_space(xdr, 4);
@@ -2233,12 +2229,12 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
if (!p)
goto out_resource;
if (exp->ex_fslocs.migrated) {
- WRITE64(NFS4_REFERRAL_FSID_MAJOR);
- WRITE64(NFS4_REFERRAL_FSID_MINOR);
+ p = xdr_encode_hyper(p, NFS4_REFERRAL_FSID_MAJOR);
+ p = xdr_encode_hyper(p, NFS4_REFERRAL_FSID_MINOR);
} else switch(fsid_source(fhp)) {
case FSIDSOURCE_FSID:
- WRITE64((u64)exp->ex_fsid);
- WRITE64((u64)0);
+ p = xdr_encode_hyper(p, (u64)exp->ex_fsid);
+ p = xdr_encode_hyper(p, (u64)0);
break;
case FSIDSOURCE_DEV:
*p++ = cpu_to_be32(0);
@@ -2340,25 +2336,25 @@ out_acl:
p = xdr_reserve_space(xdr, 8);
if (!p)
goto out_resource;
- WRITE64(stat.ino);
+ p = xdr_encode_hyper(p, stat.ino);
}
if (bmval0 & FATTR4_WORD0_FILES_AVAIL) {
p = xdr_reserve_space(xdr, 8);
if (!p)
goto out_resource;
- WRITE64((u64) statfs.f_ffree);
+ p = xdr_encode_hyper(p, (u64) statfs.f_ffree);
}
if (bmval0 & FATTR4_WORD0_FILES_FREE) {
p = xdr_reserve_space(xdr, 8);
if (!p)
goto out_resource;
- WRITE64((u64) statfs.f_ffree);
+ p = xdr_encode_hyper(p, (u64) statfs.f_ffree);
}
if (bmval0 & FATTR4_WORD0_FILES_TOTAL) {
p = xdr_reserve_space(xdr, 8);
if (!p)
goto out_resource;
- WRITE64((u64) statfs.f_files);
+ p = xdr_encode_hyper(p, (u64) statfs.f_files);
}
if (bmval0 & FATTR4_WORD0_FS_LOCATIONS) {
status = nfsd4_encode_fs_locations(xdr, rqstp, exp);
@@ -2375,7 +2371,7 @@ out_acl:
p = xdr_reserve_space(xdr, 8);
if (!p)
goto out_resource;
- WRITE64(exp->ex_path.mnt->mnt_sb->s_maxbytes);
+ p = xdr_encode_hyper(p, exp->ex_path.mnt->mnt_sb->s_maxbytes);
}
if (bmval0 & FATTR4_WORD0_MAXLINK) {
p = xdr_reserve_space(xdr, 4);
@@ -2393,13 +2389,13 @@ out_acl:
p = xdr_reserve_space(xdr, 8);
if (!p)
goto out_resource;
- WRITE64((u64) svc_max_payload(rqstp));
+ p = xdr_encode_hyper(p, (u64) svc_max_payload(rqstp));
}
if (bmval0 & FATTR4_WORD0_MAXWRITE) {
p = xdr_reserve_space(xdr, 8);
if (!p)
goto out_resource;
- WRITE64((u64) svc_max_payload(rqstp));
+ p = xdr_encode_hyper(p, (u64) svc_max_payload(rqstp));
}
if (bmval1 & FATTR4_WORD1_MODE) {
p = xdr_reserve_space(xdr, 4);
@@ -2441,34 +2437,34 @@ out_acl:
if (!p)
goto out_resource;
dummy64 = (u64)statfs.f_bavail * (u64)statfs.f_bsize;
- WRITE64(dummy64);
+ p = xdr_encode_hyper(p, dummy64);
}
if (bmval1 & FATTR4_WORD1_SPACE_FREE) {
p = xdr_reserve_space(xdr, 8);
if (!p)
goto out_resource;
dummy64 = (u64)statfs.f_bfree * (u64)statfs.f_bsize;
- WRITE64(dummy64);
+ p = xdr_encode_hyper(p, dummy64);
}
if (bmval1 & FATTR4_WORD1_SPACE_TOTAL) {
p = xdr_reserve_space(xdr, 8);
if (!p)
goto out_resource;
dummy64 = (u64)statfs.f_blocks * (u64)statfs.f_bsize;
- WRITE64(dummy64);
+ p = xdr_encode_hyper(p, dummy64);
}
if (bmval1 & FATTR4_WORD1_SPACE_USED) {
p = xdr_reserve_space(xdr, 8);
if (!p)
goto out_resource;
dummy64 = (u64)stat.blocks << 9;
- WRITE64(dummy64);
+ p = xdr_encode_hyper(p, dummy64);
}
if (bmval1 & FATTR4_WORD1_TIME_ACCESS) {
p = xdr_reserve_space(xdr, 12);
if (!p)
goto out_resource;
- WRITE64((s64)stat.atime.tv_sec);
+ p = xdr_encode_hyper(p, (s64)stat.atime.tv_sec);
*p++ = cpu_to_be32(stat.atime.tv_nsec);
}
if (bmval1 & FATTR4_WORD1_TIME_DELTA) {
@@ -2483,14 +2479,14 @@ out_acl:
p = xdr_reserve_space(xdr, 12);
if (!p)
goto out_resource;
- WRITE64((s64)stat.ctime.tv_sec);
+ p = xdr_encode_hyper(p, (s64)stat.ctime.tv_sec);
*p++ = cpu_to_be32(stat.ctime.tv_nsec);
}
if (bmval1 & FATTR4_WORD1_TIME_MODIFY) {
p = xdr_reserve_space(xdr, 12);
if (!p)
goto out_resource;
- WRITE64((s64)stat.mtime.tv_sec);
+ p = xdr_encode_hyper(p, (s64)stat.mtime.tv_sec);
*p++ = cpu_to_be32(stat.mtime.tv_nsec);
}
if (bmval1 & FATTR4_WORD1_MOUNTED_ON_FILEID) {
@@ -2504,7 +2500,7 @@ out_acl:
if (ignore_crossmnt == 0 &&
dentry == exp->ex_path.mnt->mnt_root)
get_parent_attributes(exp, &stat);
- WRITE64(stat.ino);
+ p = xdr_encode_hyper(p, stat.ino);
}
if (bmval2 & FATTR4_WORD2_SECURITY_LABEL) {
status = nfsd4_encode_security_label(xdr, rqstp, context, contextlen);
@@ -2889,15 +2885,15 @@ again:
}
return nfserr_resource;
}
- WRITE64(ld->ld_start);
- WRITE64(ld->ld_length);
+ p = xdr_encode_hyper(p, ld->ld_start);
+ p = xdr_encode_hyper(p, ld->ld_length);
*p++ = cpu_to_be32(ld->ld_type);
if (conf->len) {
WRITEMEM(&ld->ld_clientid, 8);
*p++ = cpu_to_be32(conf->len);
WRITEMEM(conf->data, conf->len);
} else { /* non - nfsv4 lock in conflict, no clientid nor owner */
- WRITE64((u64)0); /* clientid */
+ p = xdr_encode_hyper(p, (u64)0); /* clientid */
*p++ = cpu_to_be32(0); /* length of owner name */
}
return nfserr_denied;
@@ -3640,7 +3636,7 @@ nfsd4_encode_exchange_id(struct nfsd4_compoundres *resp, __be32 nfserr,
return nfserr_resource;

/* The server_owner struct */
- WRITE64(minor_id); /* Minor id */
+ p = xdr_encode_hyper(p, minor_id); /* Minor id */
/* major id */
*p++ = cpu_to_be32(major_id_sz);
WRITEMEM(major_id, major_id_sz);
--
1.8.5.3


2014-03-23 06:51:54

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 23/50] nfsd4: "backfill" using write_bytes_to_xdr_buf

Looks good, but shouldn't there be an abstract function to skip writing
into an XDR buf instead of the pointer increment as well?

I wonder if the XDR buf internal should even be entirely hidden to the
consummers and it should only be accessed through procedural interfaces?


2014-03-23 01:12:27

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 16/50] nfsd4: READ, READDIR, etc., are idempotent

From: "J. Bruce Fields" <[email protected]>

OP_MODIFIES_SOMETHING flags operations that we should be careful not to
initiate without being sure we have the buffer space to encode a reply.

None of these ops fall into that category.

We could probably remove a few more, but this isn't a very important
problem at least for ops whose reply size is easy to estimate.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4proc.c | 13 +++----------
1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index afa7ff7..a61f492 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1630,38 +1630,31 @@ static struct nfsd4_operation nfsd4_ops[] = {
[OP_PUTFH] = {
.op_func = (nfsd4op_func)nfsd4_putfh,
.op_flags = ALLOWED_WITHOUT_FH | ALLOWED_ON_ABSENT_FS
- | OP_IS_PUTFH_LIKE | OP_MODIFIES_SOMETHING
- | OP_CLEAR_STATEID,
+ | OP_IS_PUTFH_LIKE | OP_CLEAR_STATEID,
.op_name = "OP_PUTFH",
.op_rsize_bop = (nfsd4op_rsize)nfsd4_only_status_rsize,
},
[OP_PUTPUBFH] = {
.op_func = (nfsd4op_func)nfsd4_putrootfh,
.op_flags = ALLOWED_WITHOUT_FH | ALLOWED_ON_ABSENT_FS
- | OP_IS_PUTFH_LIKE | OP_MODIFIES_SOMETHING
- | OP_CLEAR_STATEID,
+ | OP_IS_PUTFH_LIKE | OP_CLEAR_STATEID,
.op_name = "OP_PUTPUBFH",
.op_rsize_bop = (nfsd4op_rsize)nfsd4_only_status_rsize,
},
[OP_PUTROOTFH] = {
.op_func = (nfsd4op_func)nfsd4_putrootfh,
.op_flags = ALLOWED_WITHOUT_FH | ALLOWED_ON_ABSENT_FS
- | OP_IS_PUTFH_LIKE | OP_MODIFIES_SOMETHING
- | OP_CLEAR_STATEID,
+ | OP_IS_PUTFH_LIKE | OP_CLEAR_STATEID,
.op_name = "OP_PUTROOTFH",
.op_rsize_bop = (nfsd4op_rsize)nfsd4_only_status_rsize,
},
[OP_READ] = {
.op_func = (nfsd4op_func)nfsd4_read,
- .op_flags = OP_MODIFIES_SOMETHING,
- .op_name = "OP_READ",
.op_rsize_bop = (nfsd4op_rsize)nfsd4_read_rsize,
.op_get_currentstateid = (stateid_getter)nfsd4_get_readstateid,
},
[OP_READDIR] = {
.op_func = (nfsd4op_func)nfsd4_readdir,
- .op_flags = OP_MODIFIES_SOMETHING,
- .op_name = "OP_READDIR",
.op_rsize_bop = (nfsd4op_rsize)nfsd4_readdir_rsize,
},
[OP_READLINK] = {
--
1.8.5.3


2014-03-23 01:12:27

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 15/50] nfsd4: allow space for final error return

From: "J. Bruce Fields" <[email protected]>

This post-encoding check should be taking into account the need to
encode at least an out-of-space error to the following op (if any).

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index c150121..ed486e7 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3631,6 +3631,7 @@ void
nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
{
struct nfs4_stateowner *so = resp->cstate.replay_owner;
+ struct svc_rqst *rqstp = resp->rqstp;
__be32 *statp;
nfsd4_enc encoder;
__be32 *p;
@@ -3647,8 +3648,12 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
encoder = nfsd4_enc_ops[op->opnum];
op->status = encoder(resp, op->status, &op->u);
/* nfsd4_check_resp_size guarantees enough room for error status */
- if (!op->status)
- op->status = nfsd4_check_resp_size(resp, 0);
+ if (!op->status) {
+ int space_needed = 0;
+ if (!nfsd4_last_compound_op(rqstp))
+ space_needed = COMPOUND_ERR_SLACK_SPACE;
+ op->status = nfsd4_check_resp_size(resp, space_needed);
+ }
if (op->status == nfserr_resource && nfsd4_has_session(&resp->cstate)) {
struct nfsd4_slot *slot = resp->cstate.slot;

--
1.8.5.3


2014-03-23 06:43:37

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 18/50] nfsd4: use xdr_stream throughout compound encoding

> #define RESERVE_SPACE(nbytes) do { \
> - p = resp->xdr.p; \
> - BUG_ON(p + XDR_QUADLEN(nbytes) > resp->xdr.end); \
> + p = xdr_reserve_space(&resp->xdr, nbytes); \
> + BUG_ON(!p); \
> } while (0)
> -#define ADJUST_ARGS() resp->xdr.p = p
> +#define ADJUST_ARGS() WARN_ON_ONCE(p != resp->xdr.p) \

I think the code would become more clear if you'd start expanding these
macros. IF the RESERVE_SPACE BUG_ON is important enough it should move
into xdr_reserve_space or a variant thereof, or otherweise just removed.

> + if (nfserr) {
> + xdr->p -= 2;
> + xdr->iov->iov_len -= 8;


> + if (nfserr) {
> + xdr->p--;
> + xdr->iov->iov_len -= 4;
> return nfserr;
> + }


> + xdr->p = savep;
> + xdr->iov->iov_len = ((char *)resp->xdr.p)
> + - (char *)resp->xdr.buf->head[0].iov_base;

Where is this coming from? Looks like deep magic to be and would
benefit from comments, symbolic names and/or a helper.

2014-03-23 01:12:29

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 23/50] nfsd4: "backfill" using write_bytes_to_xdr_buf

From: "J. Bruce Fields" <[email protected]>

Normally xdr encoding proceeds in a single pass from start of a buffer
to end, but sometimes we have to write a few bytes to an earlier
position.

Use write_bytes_to_xdr_buf for these cases rather than saving a pointer
to write to. We plan to rewrite xdr_reserve_space to handle encoding
across page boundaries using a scratch buffer, and don't want to risk
writing to a pointer that was contained in a scratch buffer.

Also it will no longer be safe to calculate lengths by subtracting two
pointers, so use xdr_buf offsets instead.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 45 ++++++++++++++++++++++++++-------------------
1 file changed, 26 insertions(+), 19 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index e5cc354..43c2a99 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1759,16 +1759,19 @@ static void write_cinfo(__be32 **p, struct nfsd4_change_info *c)
static __be32 nfsd4_encode_components_esc(struct xdr_stream *xdr, char sep, char *components, char esc_enter, char esc_exit)
{
__be32 *p;
- __be32 *countp;
+ __be32 pathlen;
+ int pathlen_offset;
int strlen, count=0;
char *str, *end, *next;

dprintk("nfsd4_encode_components(%s)\n", components);
+
+ pathlen_offset = xdr->buf->len;
p = xdr_reserve_space(xdr, 4);
if (!p)
return nfserr_resource;
- countp = p;
- WRITE32(0); /* We will fill this in with @count later */
+ p++; /* We will fill this in with @count later */
+
end = str = components;
while (*end) {
bool found_esc = false;
@@ -1801,8 +1804,8 @@ static __be32 nfsd4_encode_components_esc(struct xdr_stream *xdr, char sep, char
end++;
str = end;
}
- p = countp;
- WRITE32(count);
+ pathlen = htonl(xdr->buf->len - pathlen_offset);
+ write_bytes_to_xdr_buf(xdr->buf, pathlen_offset, &pathlen, 4);
return 0;
}

@@ -2044,7 +2047,8 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
struct kstatfs statfs;
__be32 *p;
int starting_len = xdr->buf->len;
- __be32 *attrlenp;
+ int attrlen_offset;
+ __be32 attrlen;
u32 dummy;
u64 dummy64;
u32 rdattr_err = 0;
@@ -2149,10 +2153,12 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
WRITE32(1);
WRITE32(bmval0);
}
+
+ attrlen_offset = xdr->buf->len;
p = xdr_reserve_space(xdr, 4);
if (!p)
goto out_resource;
- attrlenp = p++; /* to be backfilled later */
+ p++; /* to be backfilled later */

if (bmval0 & FATTR4_WORD0_SUPPORTED_ATTRS) {
u32 word0 = nfsd_suppattrs0(minorversion);
@@ -2523,7 +2529,8 @@ out_acl:
WRITE32(NFSD_SUPPATTR_EXCLCREAT_WORD2);
}

- *attrlenp = htonl((char *)xdr->p - (char *)attrlenp - 4);
+ attrlen = htonl(xdr->buf->len - attrlen_offset - 4);
+ write_bytes_to_xdr_buf(xdr->buf, attrlen_offset, &attrlen, 4);
status = nfs_ok;

out:
@@ -3679,16 +3686,16 @@ __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 pad)
void
nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
{
+ struct xdr_stream *xdr = &resp->xdr;
struct nfs4_stateowner *so = resp->cstate.replay_owner;
struct svc_rqst *rqstp = resp->rqstp;
- __be32 *statp;
+ int post_err_offset;
nfsd4_enc encoder;
__be32 *p;

RESERVE_SPACE(8);
WRITE32(op->opnum);
- statp = p++; /* to be backfilled at the end */
- ADJUST_ARGS();
+ post_err_offset = xdr->buf->len;

if (op->opnum == OP_ILLEGAL)
goto status;
@@ -3727,19 +3734,19 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
nfsd4_op_name(op->opnum));
WARN_ON_ONCE(1);
}
- resp->xdr.p = statp + 1;
+ xdr_truncate_encode(xdr, post_err_offset);
}
if (so) {
+ int len = xdr->buf->len - post_err_offset;
+
so->so_replay.rp_status = op->status;
- so->so_replay.rp_buflen = (char *)resp->xdr.p - (char *)(statp+1);
- memcpy(so->so_replay.rp_buf, statp+1, so->so_replay.rp_buflen);
+ so->so_replay.rp_buflen = len;
+ read_bytes_from_xdr_buf(xdr->buf, post_err_offset,
+ so->so_replay.rp_buf, len);
}
status:
- /*
- * Note: We write the status directly, instead of using WRITE32(),
- * since it is already in network byte order.
- */
- *statp = op->status;
+ /* Note that op->status is already in network byte order: */
+ write_bytes_to_xdr_buf(xdr->buf, post_err_offset - 4, &op->status, 4);
}

/*
--
1.8.5.3


2014-03-23 01:12:30

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 29/50] nfsd4: size-checking cleanup

From: "J. Bruce Fields" <[email protected]>

Better variable name, some comments, etc.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4proc.c | 10 ++++++----
fs/nfsd/nfs4xdr.c | 29 +++++++++++++++--------------
2 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 570c7e5..54081d27 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1237,7 +1237,6 @@ nfsd4_proc_compound(struct svc_rqst *rqstp,
struct nfsd4_op *op;
struct nfsd4_operation *opdesc;
struct nfsd4_compound_state *cstate = &resp->cstate;
- u32 plen = 0;
__be32 status;

svcxdr_init_encode(rqstp, resp);
@@ -1303,11 +1302,14 @@ nfsd4_proc_compound(struct svc_rqst *rqstp,
goto encode_op;
}

- /* If op is non-idempotent */
if (opdesc->op_flags & OP_MODIFIES_SOMETHING) {
- plen = opdesc->op_rsize_bop(rqstp, op);
/*
- * If there's still another operation, make sure
+ * Don't execute this op if we couldn't encode a
+ * succesful reply:
+ */
+ u32 plen = opdesc->op_rsize_bop(rqstp, op);
+ /*
+ * Plus if there's another operation, make sure
* we'll have space to at least encode an error:
*/
if (resp->opcnt < args->opcnt)
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 74ddf12..062559c 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3716,35 +3716,36 @@ static nfsd4_enc nfsd4_enc_ops[] = {
};

/*
- * Calculate the total amount of memory that the compound response has taken
- * after encoding the current operation with pad.
+ * Calculate whether we still have space to encode repsize bytes.
+ * There are two considerations:
+ * - For NFS versions >=4.1, the size of the reply must stay within
+ * session limits
+ * - For all NFS versions, we must stay within limited preallocated
+ * buffer space.
*
- * pad: if operation is non-idempotent, pad was calculate by op_rsize_bop()
- * which was specified at nfsd4_operation, else pad is zero.
- *
- * Compare this length to the session se_fmaxresp_sz and se_fmaxresp_cached.
- *
- * Our se_fmaxresp_cached will always be a multiple of PAGE_SIZE, and so
- * will be at least a page and will therefore hold the xdr_buf head.
+ * This is called before the operation is processed, so can only provide
+ * an upper estimate. For some nonidempotent operations (such as
+ * getattr), it's not necessarily a problem if that estimate is wrong,
+ * as we can fail it after processing without significant side effects.
*/
-__be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 pad)
+__be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 respsize)
{
struct xdr_buf *buf = &resp->rqstp->rq_res;
struct nfsd4_session *session = resp->cstate.session;
- struct nfsd4_slot *slot = resp->cstate.slot;
int slack_bytes = (char *)resp->xdr.end - (char *)resp->xdr.p;

if (nfsd4_has_session(&resp->cstate)) {
+ struct nfsd4_slot *slot = resp->cstate.slot;

- if (buf->len + pad > session->se_fchannel.maxresp_sz)
+ if (buf->len + respsize > session->se_fchannel.maxresp_sz)
return nfserr_rep_too_big;

if ((slot->sl_flags & NFSD4_SLOT_CACHETHIS) &&
- buf->len + pad > session->se_fchannel.maxresp_cached)
+ buf->len + respsize > session->se_fchannel.maxresp_cached)
return nfserr_rep_too_big_to_cache;
}

- if (pad > slack_bytes) {
+ if (respsize > slack_bytes) {
WARN_ON_ONCE(nfsd4_has_session(&resp->cstate));
return nfserr_resource;
}
--
1.8.5.3


2014-03-23 01:12:33

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 45/50] nfsd4: separate splice and readv cases

From: "J. Bruce Fields" <[email protected]>

The splice and readv cases are actually quite different--for example the
former case ignores the array of vectors we build up for the latter.

It is probably clearer to separate the two cases entirely.

There's some code duplication between the split out encoders, but this
is only temporary and will be fixed by a later patch.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 153 ++++++++++++++++++++++++++++++++++++++++++------------
fs/nfsd/vfs.c | 121 ++++++++++++++++++++++++++----------------
fs/nfsd/vfs.h | 8 +++
3 files changed, 202 insertions(+), 80 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 8cdc346..e052351 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3068,36 +3068,72 @@ nfsd4_encode_open_downgrade(struct nfsd4_compoundres *resp, __be32 nfserr, struc
return nfserr;
}

-static __be32
-nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
- struct nfsd4_read *read)
+static __be32 nfsd4_encode_splice_read(
+ struct nfsd4_compoundres *resp,
+ struct nfsd4_read *read,
+ struct file *file, unsigned long maxcount)
{
- u32 eof;
- int v;
- struct page *page;
- unsigned long maxcount;
struct xdr_stream *xdr = &resp->xdr;
- int starting_len = xdr->buf->len;
- long len;
+ u32 eof;
+ int starting_len = xdr->buf->len - 8;
+ __be32 nfserr;
+ __be32 tmp;
__be32 *p;

- if (nfserr)
+ /*
+ * Don't inline pages unless we know there's room for eof,
+ * count, and possible padding:
+ */
+ if (xdr->end - xdr->p < 3)
+ return nfserr_resource;
+
+ nfserr = nfsd_splice_read(read->rd_rqstp, file,
+ read->rd_offset, &maxcount);
+ if (nfserr) {
+ xdr->buf->page_len = 0;
return nfserr;
+ }

- p = xdr_reserve_space(xdr, 8); /* eof flag and byte count */
- if (!p)
- return nfserr_resource;
+ eof = (read->rd_offset + maxcount >=
+ read->rd_fhp->fh_dentry->d_inode->i_size);

- /* Make sure there will be room for padding if needed: */
- if (xdr->end - xdr->p < 1)
- return nfserr_resource;
+ tmp = htonl(eof);
+ write_bytes_to_xdr_buf(xdr->buf, starting_len , &tmp, 4);
+ tmp = htonl(maxcount);
+ write_bytes_to_xdr_buf(xdr->buf, starting_len + 4, &tmp, 4);

- if (resp->xdr.buf->page_len)
- return nfserr_resource;
+ resp->xdr.buf->page_len = maxcount;
+ xdr->buf->len += maxcount;
+ xdr->page_ptr += (maxcount + PAGE_SIZE - 1) / PAGE_SIZE;
+ xdr->buf->buflen = maxcount + PAGE_SIZE - 2 * RPC_MAX_AUTH_SIZE;
+ xdr->iov = xdr->buf->tail;

- maxcount = svc_max_payload(resp->rqstp);
- if (maxcount > read->rd_length)
- maxcount = read->rd_length;
+ /* Use rest of head for padding and remaining ops: */
+ resp->xdr.buf->tail[0].iov_base = xdr->p;
+ resp->xdr.buf->tail[0].iov_len = 0;
+ if (maxcount&3) {
+ p = xdr_reserve_space(xdr, 4);
+ WRITE32(0);
+ resp->xdr.buf->tail[0].iov_base += maxcount&3;
+ resp->xdr.buf->tail[0].iov_len = 4 - (maxcount&3);
+ xdr->buf->len -= (maxcount&3);
+ }
+ return 0;
+}
+
+static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
+ struct nfsd4_read *read,
+ struct file *file, unsigned long maxcount)
+{
+ struct xdr_stream *xdr = &resp->xdr;
+ u32 eof;
+ int v;
+ struct page *page;
+ int starting_len = xdr->buf->len - 8;
+ long len;
+ __be32 nfserr;
+ __be32 tmp;
+ __be32 *p;

len = maxcount;
v = 0;
@@ -3118,22 +3154,19 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
}
read->rd_vlen = v;

- nfserr = nfsd_read_file(read->rd_rqstp, read->rd_fhp, read->rd_filp,
- read->rd_offset, resp->rqstp->rq_vec, read->rd_vlen,
- &maxcount);
-
- if (nfserr) {
- xdr->buf->page_len = 0;
- xdr_truncate_encode(xdr, starting_len);
+ nfserr = nfsd_readv(file, read->rd_offset, resp->rqstp->rq_vec,
+ read->rd_vlen, &maxcount);
+ if (nfserr)
return nfserr;
- }
+
eof = (read->rd_offset + maxcount >=
read->rd_fhp->fh_dentry->d_inode->i_size);

- WRITE32(eof);
- WRITE32(maxcount);
- WARN_ON_ONCE(resp->xdr.buf->head[0].iov_len != (char*)p
- - (char*)resp->xdr.buf->head[0].iov_base);
+ tmp = htonl(eof);
+ write_bytes_to_xdr_buf(xdr->buf, starting_len , &tmp, 4);
+ tmp = htonl(maxcount);
+ write_bytes_to_xdr_buf(xdr->buf, starting_len + 4, &tmp, 4);
+
resp->xdr.buf->page_len = maxcount;
xdr->buf->len += maxcount;
xdr->page_ptr += v;
@@ -3141,7 +3174,7 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
xdr->iov = xdr->buf->tail;

/* Use rest of head for padding and remaining ops: */
- resp->xdr.buf->tail[0].iov_base = p;
+ resp->xdr.buf->tail[0].iov_base = xdr->p;
resp->xdr.buf->tail[0].iov_len = 0;
if (maxcount&3) {
p = xdr_reserve_space(xdr, 4);
@@ -3151,6 +3184,58 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
xdr->buf->len -= (maxcount&3);
}
return 0;
+
+}
+
+static __be32
+nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
+ struct nfsd4_read *read)
+{
+ unsigned long maxcount;
+ struct xdr_stream *xdr = &resp->xdr;
+ struct file *file = read->rd_filp;
+ int starting_len = xdr->buf->len;
+ struct raparms *ra;
+ __be32 *p;
+ __be32 err;
+
+ if (nfserr)
+ return nfserr;
+
+ p = xdr_reserve_space(xdr, 8); /* eof flag and byte count */
+ if (!p) {
+ WARN_ON_ONCE(1);
+ return nfserr_resource;
+ }
+
+ if (resp->xdr.buf->page_len)
+ return nfserr_resource;
+
+ xdr_commit_encode(xdr);
+
+ maxcount = svc_max_payload(resp->rqstp);
+ if (maxcount > read->rd_length)
+ maxcount = read->rd_length;
+
+ if (!read->rd_filp) {
+ err = nfsd_get_tmp_read_open(resp->rqstp, read->rd_fhp,
+ &file, &ra);
+ if (err)
+ goto err_truncate;
+ }
+
+ if (file->f_op->splice_read && resp->rqstp->rq_splice_ok)
+ err = nfsd4_encode_splice_read(resp, read, file, maxcount);
+ else
+ err = nfsd4_encode_readv(resp, read, file, maxcount);
+
+ if (!read->rd_filp)
+ nfsd_put_tmp_read_open(file, ra);
+
+err_truncate:
+ if (err)
+ xdr_truncate_encode(xdr, starting_len);
+ return err;
}

static __be32
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index cbf7690..022a2ae 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -820,41 +820,54 @@ static int nfsd_direct_splice_actor(struct pipe_inode_info *pipe,
return __splice_from_pipe(pipe, sd, nfsd_splice_actor);
}

-static __be32
-nfsd_vfs_read(struct svc_rqst *rqstp, struct file *file,
- loff_t offset, struct kvec *vec, int vlen, unsigned long *count)
+__be32 nfsd_finish_read(struct file *file, unsigned long *count, int host_err)
{
- mm_segment_t oldfs;
- __be32 err;
- int host_err;
-
- err = nfserr_perm;
-
- if (file->f_op->splice_read && rqstp->rq_splice_ok) {
- struct splice_desc sd = {
- .len = 0,
- .total_len = *count,
- .pos = offset,
- .u.data = rqstp,
- };
-
- rqstp->rq_next_page = rqstp->rq_respages + 1;
- host_err = splice_direct_to_actor(file, &sd, nfsd_direct_splice_actor);
- } else {
- oldfs = get_fs();
- set_fs(KERNEL_DS);
- host_err = vfs_readv(file, (struct iovec __user *)vec, vlen, &offset);
- set_fs(oldfs);
- }
-
if (host_err >= 0) {
nfsdstats.io_read += host_err;
*count = host_err;
- err = 0;
fsnotify_access(file);
+ return 0;
} else
- err = nfserrno(host_err);
- return err;
+ return nfserrno(host_err);
+}
+
+int nfsd_splice_read(struct svc_rqst *rqstp,
+ struct file *file, loff_t offset, unsigned long *count)
+{
+ struct splice_desc sd = {
+ .len = 0,
+ .total_len = *count,
+ .pos = offset,
+ .u.data = rqstp,
+ };
+ int host_err;
+
+ rqstp->rq_next_page = rqstp->rq_respages + 1;
+ host_err = splice_direct_to_actor(file, &sd, nfsd_direct_splice_actor);
+ return nfsd_finish_read(file, count, host_err);
+}
+
+int nfsd_readv(struct file *file, loff_t offset, struct kvec *vec, int vlen,
+ unsigned long *count)
+{
+ mm_segment_t oldfs;
+ int host_err;
+
+ oldfs = get_fs();
+ set_fs(KERNEL_DS);
+ host_err = vfs_readv(file, (struct iovec __user *)vec, vlen, &offset);
+ set_fs(oldfs);
+ return nfsd_finish_read(file, count, host_err);
+}
+
+static __be32
+nfsd_vfs_read(struct svc_rqst *rqstp, struct file *file,
+ loff_t offset, struct kvec *vec, int vlen, unsigned long *count)
+{
+ if (file->f_op->splice_read && rqstp->rq_splice_ok)
+ return nfsd_splice_read(rqstp, file, offset, count);
+ else
+ return nfsd_readv(file, offset, vec, vlen, count);
}

static void kill_suid(struct dentry *dentry)
@@ -962,33 +975,28 @@ out_nfserr:
return err;
}

-/*
- * Read data from a file. count must contain the requested read count
- * on entry. On return, *count contains the number of bytes actually read.
- * N.B. After this call fhp needs an fh_put
- */
-__be32 nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
- loff_t offset, struct kvec *vec, int vlen, unsigned long *count)
+__be32 nfsd_get_tmp_read_open(struct svc_rqst *rqstp, struct svc_fh *fhp,
+ struct file **file, struct raparms **ra)
{
- struct file *file;
struct inode *inode;
- struct raparms *ra;
__be32 err;

- err = nfsd_open(rqstp, fhp, S_IFREG, NFSD_MAY_READ, &file);
+ err = nfsd_open(rqstp, fhp, S_IFREG, NFSD_MAY_READ, file);
if (err)
return err;

- inode = file_inode(file);
+ inode = file_inode(*file);

/* Get readahead parameters */
- ra = nfsd_get_raparms(inode->i_sb->s_dev, inode->i_ino);
-
- if (ra && ra->p_set)
- file->f_ra = ra->p_ra;
+ *ra = nfsd_get_raparms(inode->i_sb->s_dev, inode->i_ino);

- err = nfsd_vfs_read(rqstp, file, offset, vec, vlen, count);
+ if (*ra && (*ra)->p_set)
+ (*file)->f_ra = (*ra)->p_ra;
+ return nfs_ok;
+}

+void nfsd_put_tmp_read_open(struct file *file, struct raparms *ra)
+{
/* Write back readahead params */
if (ra) {
struct raparm_hbucket *rab = &raparm_hash[ra->p_hindex];
@@ -998,8 +1006,29 @@ __be32 nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
ra->p_count--;
spin_unlock(&rab->pb_lock);
}
-
nfsd_close(file);
+}
+
+/*
+ * Read data from a file. count must contain the requested read count
+ * on entry. On return, *count contains the number of bytes actually read.
+ * N.B. After this call fhp needs an fh_put
+ */
+__be32 nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
+ loff_t offset, struct kvec *vec, int vlen, unsigned long *count)
+{
+ struct file *file;
+ struct raparms *ra;
+ __be32 err;
+
+ err = nfsd_get_tmp_read_open(rqstp, fhp, &file, &ra);
+ if (err)
+ return err;
+
+ err = nfsd_vfs_read(rqstp, file, offset, vec, vlen, count);
+
+ nfsd_put_tmp_read_open(file, ra);
+
return err;
}

diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index fbe90bd..7441e96 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -70,6 +70,14 @@ __be32 nfsd_commit(struct svc_rqst *, struct svc_fh *,
__be32 nfsd_open(struct svc_rqst *, struct svc_fh *, umode_t,
int, struct file **);
void nfsd_close(struct file *);
+struct raparms;
+__be32 nfsd_get_tmp_read_open(struct svc_rqst *, struct svc_fh *,
+ struct file **, struct raparms **);
+void nfsd_put_tmp_read_open(struct file *, struct raparms *);
+int nfsd_splice_read(struct svc_rqst *,
+ struct file *, loff_t, unsigned long *);
+int nfsd_readv(struct file *, loff_t, struct kvec *, int,
+ unsigned long *);
__be32 nfsd_read(struct svc_rqst *, struct svc_fh *,
loff_t, struct kvec *, int, unsigned long *);
__be32 nfsd_read_file(struct svc_rqst *, struct svc_fh *, struct file *,
--
1.8.5.3


2014-03-23 01:12:26

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 12/50] nfsd4: reserve head space for krb5 integ/priv info

From: "J. Bruce Fields" <[email protected]>

Currently if the nfs-level part of a reply would be too large, we'll
return an error to the client. But if the nfs-level part fits and
leaves no room for krb5p or krb5i stuff, then we just drop the request
entirely.

That's no good. Instead, reserve some slack space at the end of the
buffer and make sure we fail outright if we'd come close.

The slack space here is a massive overstimate of what's required, we
should probably try for a tighter limit at some point.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4proc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 80e5921..849b0eb 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1270,7 +1270,7 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp, struct nfsd4_compoundres

xdr->buf = buf;
xdr->p = head->iov_base + head->iov_len;
- xdr->end = head->iov_base + PAGE_SIZE;
+ xdr->end = head->iov_base + PAGE_SIZE - 2 * RPC_MAX_AUTH_SIZE;
}

/*
--
1.8.5.3


2014-03-23 01:12:28

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 21/50] rpc: xdr_truncate_encode

From: "J. Bruce Fields" <[email protected]>

This will be used in the server side in a few cases:
- when certain operations (read, readdir, readlink) fail after
encoding a partial response.
- when we run out of space after encoding a partial response.
- in readlink, where we initially reserve PAGE_SIZE bytes for
data, then truncate to the actual size.

Signed-off-by: J. Bruce Fields <[email protected]>
---
include/linux/sunrpc/xdr.h | 1 +
net/sunrpc/xdr.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 68 insertions(+)

diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index 15f9204..e7bb2e3 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -215,6 +215,7 @@ typedef int (*kxdrdproc_t)(void *rqstp, struct xdr_stream *xdr, void *obj);

extern void xdr_init_encode(struct xdr_stream *xdr, struct xdr_buf *buf, __be32 *p);
extern __be32 *xdr_reserve_space(struct xdr_stream *xdr, size_t nbytes);
+extern void xdr_truncate_encode(struct xdr_stream *xdr, size_t len);
extern void xdr_write_pages(struct xdr_stream *xdr, struct page **pages,
unsigned int base, unsigned int len);
extern unsigned int xdr_stream_pos(const struct xdr_stream *xdr);
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index dd97ba3..8ae8ee7 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -509,6 +509,73 @@ __be32 * xdr_reserve_space(struct xdr_stream *xdr, size_t nbytes)
EXPORT_SYMBOL_GPL(xdr_reserve_space);

/**
+ * xdr_truncate_encode - truncate an encode buffer
+ * @xdr: pointer to xdr_stream
+ * @len: new length of buffer
+ *
+ * Truncates the xdr stream, so that xdr->buf->len == len,
+ * and xdr->p points at offset len from the start of the buffer, and
+ * head, tail, and page lengths are adjusted to correspond.
+ *
+ * If this means moving xdr->p to a different buffer, we assume that
+ * that the end pointer should be set to the end of the current page,
+ * except in the case of the head buffer when we assume the head
+ * buffer's current length represents the end of the available buffer.
+ *
+ * This is *not* safe to use on a buffer that already has inlined page
+ * cache pages (as in a zero-copy server read reply), except for the
+ * simple case of truncating from one position in the tail to another.
+ *
+ */
+void xdr_truncate_encode(struct xdr_stream *xdr, size_t len)
+{
+ struct xdr_buf *buf = xdr->buf;
+ struct kvec *head = buf->head;
+ struct kvec *tail = buf->tail;
+ int fraglen;
+ int new, old;
+
+ if (len > buf->len) {
+ WARN_ON_ONCE(1);
+ return;
+ }
+
+ fraglen = min_t(int, buf->len - len, tail->iov_len);
+ tail->iov_len -= fraglen;
+ buf->len -= fraglen;
+ if (tail->iov_len && buf->len == len) {
+ xdr->p = tail->iov_base + tail->iov_len;
+ /* xdr->end, xdr->iov should be set already */
+ return;
+ }
+ WARN_ON_ONCE(fraglen);
+ fraglen = min_t(int, buf->len - len, buf->page_len);
+ buf->page_len -= fraglen;
+ buf->len -= fraglen;
+
+ new = buf->page_base + buf->page_len;
+ old = new + fraglen;
+ xdr->page_ptr -= (old >> PAGE_SHIFT) - (new >> PAGE_SHIFT);
+
+ if (buf->page_len && buf->len == len) {
+ xdr->p = page_address(*xdr->page_ptr);
+ xdr->end = (void *)xdr->p + PAGE_SIZE;
+ xdr->p = (void *)xdr->p + (new % PAGE_SIZE);
+ /* xdr->iov should already be NULL */
+ return;
+ }
+ if (fraglen)
+ xdr->end = head->iov_base + head->iov_len;
+ /* (otherwise assume xdr->end is already set) */
+ head->iov_len = len;
+ buf->len = len;
+ xdr->p = head->iov_base + head->iov_len;
+ xdr->iov = buf->head;
+ xdr->page_ptr -= 1;
+}
+EXPORT_SYMBOL(xdr_truncate_encode);
+
+/**
* xdr_write_pages - Insert a list of pages into an XDR buffer for sending
* @xdr: pointer to xdr_stream
* @pages: list of pages
--
1.8.5.3


2014-03-23 01:12:25

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 07/50] nfsd4: fix setclientid encode size

From: "J. Bruce Fields" <[email protected]>

---
fs/nfsd/nfs4proc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index b9048e5..9a914e8 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1529,7 +1529,8 @@ static inline u32 nfsd4_setattr_rsize(struct svc_rqst *rqstp, struct nfsd4_op *o

static inline u32 nfsd4_setclientid_rsize(struct svc_rqst *rqstp, struct nfsd4_op *op)
{
- return (op_encode_hdr_size + 2 + 1024) * sizeof(__be32);
+ return (op_encode_hdr_size + 2 + XDR_QUADLEN(NFS4_VERIFIER_SIZE)) *
+ sizeof(__be32);
}

static inline u32 nfsd4_write_rsize(struct svc_rqst *rqstp, struct nfsd4_op *op)
--
1.8.5.3


2014-03-23 01:12:33

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 43/50] nfsd4: turn off zero-copy-read in exotic cases

From: "J. Bruce Fields" <[email protected]>

We currently allow only one read per compound, with operations before
and after whose responses will require no more than about a page to
encode.

While we don't expect clients to violate those limits any time soon,
this limitation isn't really condoned by the spec, so to future proof
the server we should lift the limitation.

At the same time we'd like to continue to support zero-copy reads.

Supporting multiple zero-copy-reads per compound would require a new
data structure to replace struct xdr_buf, which can represent only one
set of included pages.

So for now we plan to modify encode_read() to support either zero-copy
or non-zero-copy reads, and use some heuristics at the start of the
compound processing to decide whether a zero-copy read will work.

This will allow us to support more exotic compounds without introducing
a performance regression in the normal case.

Later patches handle those "exotic compounds", this one just makes sure
zero-copy is turned off in those cases.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index c96a05e..8cdc346 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1630,6 +1630,8 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp)
struct nfsd4_op *op;
bool cachethis = false;
int max_reply = 2 * RPC_MAX_AUTH_SIZE; /* uh, kind of a guess */
+ int readcount = 0;
+ int readbytes = 0;
int i;

READ_BUF(4);
@@ -1680,14 +1682,21 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp)
*/
cachethis |= nfsd4_cache_this_op(op);

- max_reply += nfsd4_max_reply(argp->rqstp, op);
+ if (op->opnum == OP_READ) {
+ readcount++;
+ readbytes += nfsd4_max_reply(argp->rqstp, op);
+ } else
+ max_reply += nfsd4_max_reply(argp->rqstp, op);
}
/* Sessions make the DRC unnecessary: */
if (argp->minorversion)
cachethis = false;
- svc_reserve(argp->rqstp, max_reply);
+ svc_reserve(argp->rqstp, max_reply + readbytes);
argp->rqstp->rq_cachetype = cachethis ? RC_REPLBUFF : RC_NOCACHE;

+ if (readcount > 1 || max_reply > PAGE_SIZE - 2*RPC_MAX_AUTH_SIZE)
+ argp->rqstp->rq_splice_ok = false;
+
DECODE_TAIL;
}

--
1.8.5.3


2014-03-23 01:12:29

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 25/50] nfsd4: teach encoders to handle reserve_space failures

From: "J. Bruce Fields" <[email protected]>

We've tried to prevent running out of space with COMPOUND_SLACK_SPACE
and special checking in those operations (getattr) whose result can vary
enormously.

However:
- COMPOUND_SLACK_SPACE may be difficult to maintain as we add
more protocol.
- BUG_ON or page faulting on failure seems overly fragile.
- Especially in the 4.1 case, we prefer not to fail compounds
just because the returned result came *close* to session
limits. (Though perfect enforcement here may be difficult.)
- I'd prefer encoding to be uniform for all encoders instead of
having special exceptions for encoders containing, for
example, attributes.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4proc.c | 2 +-
fs/nfsd/nfs4xdr.c | 246 +++++++++++++++++++++++++++++++++++++++--------------
fs/nfsd/xdr4.h | 2 +-
3 files changed, 184 insertions(+), 66 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 8360def..d55546c 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1356,7 +1356,7 @@ encode_op:
}
if (op->status == nfserr_replay_me) {
op->replay = &cstate->replay_owner->so_replay;
- nfsd4_encode_replay(resp, op);
+ nfsd4_encode_replay(&resp->xdr, op);
status = op->status = op->replay->rp_status;
} else {
nfsd4_encode_operation(resp, op);
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index a2dbc19..ab84545 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1747,11 +1747,6 @@ static void write_cinfo(__be32 **p, struct nfsd4_change_info *c)
}
}

-#define RESERVE_SPACE(nbytes) do { \
- p = xdr_reserve_space(&resp->xdr, nbytes); \
- BUG_ON(!p); \
-} while (0)
-
/* Encode as an array of strings the string given with components
* separated @sep, escaped with esc_enter and esc_exit.
*/
@@ -2721,23 +2716,29 @@ fail:
return -EINVAL;
}

-static void
-nfsd4_encode_stateid(struct nfsd4_compoundres *resp, stateid_t *sid)
+static __be32
+nfsd4_encode_stateid(struct xdr_stream *xdr, stateid_t *sid)
{
__be32 *p;

- RESERVE_SPACE(sizeof(stateid_t));
+ p = xdr_reserve_space(xdr, sizeof(stateid_t));
+ if (!p)
+ return nfserr_resource;
WRITE32(sid->si_generation);
WRITEMEM(&sid->si_opaque, sizeof(stateid_opaque_t));
+ return 0;
}

static __be32
nfsd4_encode_access(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_access *access)
{
+ struct xdr_stream *xdr = &resp->xdr;
__be32 *p;

if (!nfserr) {
- RESERVE_SPACE(8);
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
+ return nfserr_resource;
WRITE32(access->ac_supported);
WRITE32(access->ac_resp_access);
}
@@ -2746,10 +2747,13 @@ nfsd4_encode_access(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_

static __be32 nfsd4_encode_bind_conn_to_session(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_bind_conn_to_session *bcts)
{
+ struct xdr_stream *xdr = &resp->xdr;
__be32 *p;

if (!nfserr) {
- RESERVE_SPACE(NFS4_MAX_SESSIONID_LEN + 8);
+ p = xdr_reserve_space(xdr, NFS4_MAX_SESSIONID_LEN + 8);
+ if (!p)
+ return nfserr_resource;
WRITEMEM(bcts->sessionid.data, NFS4_MAX_SESSIONID_LEN);
WRITE32(bcts->dir);
/* Sorry, we do not yet support RDMA over 4.1: */
@@ -2761,8 +2765,10 @@ static __be32 nfsd4_encode_bind_conn_to_session(struct nfsd4_compoundres *resp,
static __be32
nfsd4_encode_close(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_close *close)
{
+ struct xdr_stream *xdr = &resp->xdr;
+
if (!nfserr)
- nfsd4_encode_stateid(resp, &close->cl_stateid);
+ nfserr = nfsd4_encode_stateid(xdr, &close->cl_stateid);

return nfserr;
}
@@ -2771,10 +2777,13 @@ nfsd4_encode_close(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_c
static __be32
nfsd4_encode_commit(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_commit *commit)
{
+ struct xdr_stream *xdr = &resp->xdr;
__be32 *p;

if (!nfserr) {
- RESERVE_SPACE(NFS4_VERIFIER_SIZE);
+ p = xdr_reserve_space(xdr, NFS4_VERIFIER_SIZE);
+ if (!p)
+ return nfserr_resource;
WRITEMEM(commit->co_verf.data, NFS4_VERIFIER_SIZE);
}
return nfserr;
@@ -2783,10 +2792,13 @@ nfsd4_encode_commit(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_
static __be32
nfsd4_encode_create(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_create *create)
{
+ struct xdr_stream *xdr = &resp->xdr;
__be32 *p;

if (!nfserr) {
- RESERVE_SPACE(32);
+ p = xdr_reserve_space(xdr, 32);
+ if (!p)
+ return nfserr_resource;
write_cinfo(&p, &create->cr_cinfo);
WRITE32(2);
WRITE32(create->cr_bmval[0]);
@@ -2813,13 +2825,16 @@ nfsd4_encode_getattr(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
static __be32
nfsd4_encode_getfh(struct nfsd4_compoundres *resp, __be32 nfserr, struct svc_fh **fhpp)
{
+ struct xdr_stream *xdr = &resp->xdr;
struct svc_fh *fhp = *fhpp;
unsigned int len;
__be32 *p;

if (!nfserr) {
len = fhp->fh_handle.fh_size;
- RESERVE_SPACE(len + 4);
+ p = xdr_reserve_space(xdr, len + 4);
+ if (!p)
+ return nfserr_resource;
WRITE32(len);
WRITEMEM(&fhp->fh_handle.fh_base, len);
}
@@ -2830,13 +2845,15 @@ nfsd4_encode_getfh(struct nfsd4_compoundres *resp, __be32 nfserr, struct svc_fh
* Including all fields other than the name, a LOCK4denied structure requires
* 8(clientid) + 4(namelen) + 8(offset) + 8(length) + 4(type) = 32 bytes.
*/
-static void
-nfsd4_encode_lock_denied(struct nfsd4_compoundres *resp, struct nfsd4_lock_denied *ld)
+static __be32
+nfsd4_encode_lock_denied(struct xdr_stream *xdr, struct nfsd4_lock_denied *ld)
{
struct xdr_netobj *conf = &ld->ld_owner;
__be32 *p;

- RESERVE_SPACE(32 + XDR_LEN(conf->len));
+ p = xdr_reserve_space(xdr, 32 + XDR_LEN(conf->len));
+ if (!p)
+ return nfserr_resource;
WRITE64(ld->ld_start);
WRITE64(ld->ld_length);
WRITE32(ld->ld_type);
@@ -2849,15 +2866,18 @@ nfsd4_encode_lock_denied(struct nfsd4_compoundres *resp, struct nfsd4_lock_denie
WRITE64((u64)0); /* clientid */
WRITE32(0); /* length of owner name */
}
+ return nfserr_denied;
}

static __be32
nfsd4_encode_lock(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_lock *lock)
{
+ struct xdr_stream *xdr = &resp->xdr;
+
if (!nfserr)
- nfsd4_encode_stateid(resp, &lock->lk_resp_stateid);
+ nfserr = nfsd4_encode_stateid(xdr, &lock->lk_resp_stateid);
else if (nfserr == nfserr_denied)
- nfsd4_encode_lock_denied(resp, &lock->lk_denied);
+ nfserr = nfsd4_encode_lock_denied(xdr, &lock->lk_denied);

return nfserr;
}
@@ -2865,16 +2885,20 @@ nfsd4_encode_lock(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_lo
static __be32
nfsd4_encode_lockt(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_lockt *lockt)
{
+ struct xdr_stream *xdr = &resp->xdr;
+
if (nfserr == nfserr_denied)
- nfsd4_encode_lock_denied(resp, &lockt->lt_denied);
+ nfsd4_encode_lock_denied(xdr, &lockt->lt_denied);
return nfserr;
}

static __be32
nfsd4_encode_locku(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_locku *locku)
{
+ struct xdr_stream *xdr = &resp->xdr;
+
if (!nfserr)
- nfsd4_encode_stateid(resp, &locku->lu_stateid);
+ nfserr = nfsd4_encode_stateid(xdr, &locku->lu_stateid);

return nfserr;
}
@@ -2883,10 +2907,13 @@ nfsd4_encode_locku(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_l
static __be32
nfsd4_encode_link(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_link *link)
{
+ struct xdr_stream *xdr = &resp->xdr;
__be32 *p;

if (!nfserr) {
- RESERVE_SPACE(20);
+ p = xdr_reserve_space(xdr, 20);
+ if (!p)
+ return nfserr_resource;
write_cinfo(&p, &link->li_cinfo);
}
return nfserr;
@@ -2896,13 +2923,18 @@ nfsd4_encode_link(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_li
static __be32
nfsd4_encode_open(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_open *open)
{
+ struct xdr_stream *xdr = &resp->xdr;
__be32 *p;

if (nfserr)
goto out;

- nfsd4_encode_stateid(resp, &open->op_stateid);
- RESERVE_SPACE(40);
+ nfserr = nfsd4_encode_stateid(xdr, &open->op_stateid);
+ if (nfserr)
+ goto out;
+ p = xdr_reserve_space(xdr, 40);
+ if (!p)
+ return nfserr_resource;
write_cinfo(&p, &open->op_cinfo);
WRITE32(open->op_rflags);
WRITE32(2);
@@ -2914,8 +2946,12 @@ nfsd4_encode_open(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_op
case NFS4_OPEN_DELEGATE_NONE:
break;
case NFS4_OPEN_DELEGATE_READ:
- nfsd4_encode_stateid(resp, &open->op_delegate_stateid);
- RESERVE_SPACE(20);
+ nfserr = nfsd4_encode_stateid(xdr, &open->op_delegate_stateid);
+ if (nfserr)
+ return nfserr;
+ p = xdr_reserve_space(xdr, 20);
+ if (!p)
+ return nfserr_resource;
WRITE32(open->op_recall);

/*
@@ -2927,8 +2963,12 @@ nfsd4_encode_open(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_op
WRITE32(0); /* XXX: is NULL principal ok? */
break;
case NFS4_OPEN_DELEGATE_WRITE:
- nfsd4_encode_stateid(resp, &open->op_delegate_stateid);
- RESERVE_SPACE(32);
+ nfserr = nfsd4_encode_stateid(xdr, &open->op_delegate_stateid);
+ if (nfserr)
+ return nfserr;
+ p = xdr_reserve_space(xdr, 32);
+ if (!p)
+ return nfserr_resource;
WRITE32(0);

/*
@@ -2950,12 +2990,16 @@ nfsd4_encode_open(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_op
switch (open->op_why_no_deleg) {
case WND4_CONTENTION:
case WND4_RESOURCE:
- RESERVE_SPACE(8);
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
+ return nfserr_resource;
WRITE32(open->op_why_no_deleg);
WRITE32(0); /* deleg signaling not supported yet */
break;
default:
- RESERVE_SPACE(4);
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
+ return nfserr_resource;
WRITE32(open->op_why_no_deleg);
}
break;
@@ -2970,8 +3014,10 @@ out:
static __be32
nfsd4_encode_open_confirm(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_open_confirm *oc)
{
+ struct xdr_stream *xdr = &resp->xdr;
+
if (!nfserr)
- nfsd4_encode_stateid(resp, &oc->oc_resp_stateid);
+ nfserr = nfsd4_encode_stateid(xdr, &oc->oc_resp_stateid);

return nfserr;
}
@@ -2979,8 +3025,10 @@ nfsd4_encode_open_confirm(struct nfsd4_compoundres *resp, __be32 nfserr, struct
static __be32
nfsd4_encode_open_downgrade(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_open_downgrade *od)
{
+ struct xdr_stream *xdr = &resp->xdr;
+
if (!nfserr)
- nfsd4_encode_stateid(resp, &od->od_stateid);
+ nfserr = nfsd4_encode_stateid(xdr, &od->od_stateid);

return nfserr;
}
@@ -3003,7 +3051,9 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
if (resp->xdr.buf->page_len)
return nfserr_resource;

- RESERVE_SPACE(8); /* eof flag and byte count */
+ p = xdr_reserve_space(xdr, 8); /* eof flag and byte count */
+ if (!p)
+ return nfserr_resource;

maxcount = svc_max_payload(resp->rqstp);
if (maxcount > read->rd_length)
@@ -3050,7 +3100,9 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
resp->xdr.buf->tail[0].iov_base = p;
resp->xdr.buf->tail[0].iov_len = 0;
if (maxcount&3) {
- RESERVE_SPACE(4);
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
+ return nfserr_resource;
WRITE32(0);
resp->xdr.buf->tail[0].iov_base += maxcount&3;
resp->xdr.buf->tail[0].iov_len = 4 - (maxcount&3);
@@ -3078,7 +3130,10 @@ nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd
page = page_address(*(resp->rqstp->rq_next_page++));

maxcount = PAGE_SIZE;
- RESERVE_SPACE(4);
+
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
+ return nfserr_resource;

/*
* XXX: By default, the ->readlink() VFS op will truncate symlinks
@@ -3105,7 +3160,9 @@ nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd
resp->xdr.buf->tail[0].iov_base = p;
resp->xdr.buf->tail[0].iov_len = 0;
if (maxcount&3) {
- RESERVE_SPACE(4);
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
+ return nfserr_resource;
WRITE32(0);
resp->xdr.buf->tail[0].iov_base += maxcount&3;
resp->xdr.buf->tail[0].iov_len = 4 - (maxcount&3);
@@ -3130,7 +3187,9 @@ nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
if (!*resp->rqstp->rq_next_page)
return nfserr_resource;

- RESERVE_SPACE(NFS4_VERIFIER_SIZE);
+ p = xdr_reserve_space(xdr, NFS4_VERIFIER_SIZE);
+ if (!p)
+ return nfserr_resource;

/* XXX: Following NFSv3, we ignore the READDIR verifier for now. */
WRITE32(0);
@@ -3198,10 +3257,13 @@ err_no_verf:
static __be32
nfsd4_encode_remove(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_remove *remove)
{
+ struct xdr_stream *xdr = &resp->xdr;
__be32 *p;

if (!nfserr) {
- RESERVE_SPACE(20);
+ p = xdr_reserve_space(xdr, 20);
+ if (!p)
+ return nfserr_resource;
write_cinfo(&p, &remove->rm_cinfo);
}
return nfserr;
@@ -3210,10 +3272,13 @@ nfsd4_encode_remove(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_
static __be32
nfsd4_encode_rename(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_rename *rename)
{
+ struct xdr_stream *xdr = &resp->xdr;
__be32 *p;

if (!nfserr) {
- RESERVE_SPACE(40);
+ p = xdr_reserve_space(xdr, 40);
+ if (!p)
+ return nfserr_resource;
write_cinfo(&p, &rename->rn_sinfo);
write_cinfo(&p, &rename->rn_tinfo);
}
@@ -3221,7 +3286,7 @@ nfsd4_encode_rename(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_
}

static __be32
-nfsd4_do_encode_secinfo(struct nfsd4_compoundres *resp,
+nfsd4_do_encode_secinfo(struct xdr_stream *xdr,
__be32 nfserr, struct svc_export *exp)
{
u32 i, nflavs, supported;
@@ -3232,6 +3297,7 @@ nfsd4_do_encode_secinfo(struct nfsd4_compoundres *resp,

if (nfserr)
goto out;
+ nfserr = nfserr_resource;
if (exp->ex_nflavors) {
flavs = exp->ex_flavors;
nflavs = exp->ex_nflavors;
@@ -3253,7 +3319,9 @@ nfsd4_do_encode_secinfo(struct nfsd4_compoundres *resp,
}

supported = 0;
- RESERVE_SPACE(4);
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
+ goto out;
flavorsp = p++; /* to be backfilled later */

for (i = 0; i < nflavs; i++) {
@@ -3262,7 +3330,9 @@ nfsd4_do_encode_secinfo(struct nfsd4_compoundres *resp,

if (rpcauth_get_gssinfo(pf, &info) == 0) {
supported++;
- RESERVE_SPACE(4 + 4 + XDR_LEN(info.oid.len) + 4 + 4);
+ p = xdr_reserve_space(xdr, 4 + 4 + XDR_LEN(info.oid.len) + 4 + 4);
+ if (!p)
+ goto out;
WRITE32(RPC_AUTH_GSS);
WRITE32(info.oid.len);
WRITEMEM(info.oid.data, info.oid.len);
@@ -3270,7 +3340,9 @@ nfsd4_do_encode_secinfo(struct nfsd4_compoundres *resp,
WRITE32(info.service);
} else if (pf < RPC_AUTH_MAXFLAVOR) {
supported++;
- RESERVE_SPACE(4);
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
+ goto out;
WRITE32(pf);
} else {
if (report)
@@ -3282,7 +3354,7 @@ nfsd4_do_encode_secinfo(struct nfsd4_compoundres *resp,
if (nflavs != supported)
report = false;
*flavorsp = htonl(supported);
-
+ nfserr = 0;
out:
if (exp)
exp_put(exp);
@@ -3293,14 +3365,18 @@ static __be32
nfsd4_encode_secinfo(struct nfsd4_compoundres *resp, __be32 nfserr,
struct nfsd4_secinfo *secinfo)
{
- return nfsd4_do_encode_secinfo(resp, nfserr, secinfo->si_exp);
+ struct xdr_stream *xdr = &resp->xdr;
+
+ return nfsd4_do_encode_secinfo(xdr, nfserr, secinfo->si_exp);
}

static __be32
nfsd4_encode_secinfo_no_name(struct nfsd4_compoundres *resp, __be32 nfserr,
struct nfsd4_secinfo_no_name *secinfo)
{
- return nfsd4_do_encode_secinfo(resp, nfserr, secinfo->sin_exp);
+ struct xdr_stream *xdr = &resp->xdr;
+
+ return nfsd4_do_encode_secinfo(xdr, nfserr, secinfo->sin_exp);
}

/*
@@ -3310,9 +3386,12 @@ nfsd4_encode_secinfo_no_name(struct nfsd4_compoundres *resp, __be32 nfserr,
static __be32
nfsd4_encode_setattr(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_setattr *setattr)
{
+ struct xdr_stream *xdr = &resp->xdr;
__be32 *p;

- RESERVE_SPACE(16);
+ p = xdr_reserve_space(xdr, 16);
+ if (!p)
+ return nfserr_resource;
if (nfserr) {
WRITE32(3);
WRITE32(0);
@@ -3331,15 +3410,20 @@ nfsd4_encode_setattr(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
static __be32
nfsd4_encode_setclientid(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_setclientid *scd)
{
+ struct xdr_stream *xdr = &resp->xdr;
__be32 *p;

if (!nfserr) {
- RESERVE_SPACE(8 + NFS4_VERIFIER_SIZE);
+ p = xdr_reserve_space(xdr, 8 + NFS4_VERIFIER_SIZE);
+ if (!p)
+ return nfserr_resource;
WRITEMEM(&scd->se_clientid, 8);
WRITEMEM(&scd->se_confirm, NFS4_VERIFIER_SIZE);
}
else if (nfserr == nfserr_clid_inuse) {
- RESERVE_SPACE(8);
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
+ return nfserr_resource;
WRITE32(0);
WRITE32(0);
}
@@ -3349,10 +3433,13 @@ nfsd4_encode_setclientid(struct nfsd4_compoundres *resp, __be32 nfserr, struct n
static __be32
nfsd4_encode_write(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_write *write)
{
+ struct xdr_stream *xdr = &resp->xdr;
__be32 *p;

if (!nfserr) {
- RESERVE_SPACE(16);
+ p = xdr_reserve_space(xdr, 16);
+ if (!p)
+ return nfserr_resource;
WRITE32(write->wr_bytes_written);
WRITE32(write->wr_how_written);
WRITEMEM(write->wr_verifier.data, NFS4_VERIFIER_SIZE);
@@ -3372,6 +3459,7 @@ static __be32
nfsd4_encode_exchange_id(struct nfsd4_compoundres *resp, __be32 nfserr,
struct nfsd4_exchange_id *exid)
{
+ struct xdr_stream *xdr = &resp->xdr;
__be32 *p;
char *major_id;
char *server_scope;
@@ -3387,11 +3475,13 @@ nfsd4_encode_exchange_id(struct nfsd4_compoundres *resp, __be32 nfserr,
server_scope = utsname()->nodename;
server_scope_sz = strlen(server_scope);

- RESERVE_SPACE(
+ p = xdr_reserve_space(xdr,
8 /* eir_clientid */ +
4 /* eir_sequenceid */ +
4 /* eir_flags */ +
4 /* spr_how */);
+ if (!p)
+ return nfserr_resource;

WRITEMEM(&exid->clientid, 8);
WRITE32(exid->seqid);
@@ -3404,7 +3494,9 @@ nfsd4_encode_exchange_id(struct nfsd4_compoundres *resp, __be32 nfserr,
break;
case SP4_MACH_CRED:
/* spo_must_enforce, spo_must_allow */
- RESERVE_SPACE(16);
+ p = xdr_reserve_space(xdr, 16);
+ if (!p)
+ return nfserr_resource;

/* spo_must_enforce bitmap: */
WRITE32(2);
@@ -3418,13 +3510,15 @@ nfsd4_encode_exchange_id(struct nfsd4_compoundres *resp, __be32 nfserr,
WARN_ON_ONCE(1);
}

- RESERVE_SPACE(
+ p = xdr_reserve_space(xdr,
8 /* so_minor_id */ +
4 /* so_major_id.len */ +
(XDR_QUADLEN(major_id_sz) * 4) +
4 /* eir_server_scope.len */ +
(XDR_QUADLEN(server_scope_sz) * 4) +
4 /* eir_server_impl_id.count (0) */);
+ if (!p)
+ return nfserr_resource;

/* The server_owner struct */
WRITE64(minor_id); /* Minor id */
@@ -3445,17 +3539,22 @@ static __be32
nfsd4_encode_create_session(struct nfsd4_compoundres *resp, __be32 nfserr,
struct nfsd4_create_session *sess)
{
+ struct xdr_stream *xdr = &resp->xdr;
__be32 *p;

if (nfserr)
return nfserr;

- RESERVE_SPACE(24);
+ p = xdr_reserve_space(xdr, 24);
+ if (!p)
+ return nfserr_resource;
WRITEMEM(sess->sessionid.data, NFS4_MAX_SESSIONID_LEN);
WRITE32(sess->seqid);
WRITE32(sess->flags);

- RESERVE_SPACE(28);
+ p = xdr_reserve_space(xdr, 28);
+ if (!p)
+ return nfserr_resource;
WRITE32(0); /* headerpadsz */
WRITE32(sess->fore_channel.maxreq_sz);
WRITE32(sess->fore_channel.maxresp_sz);
@@ -3465,11 +3564,15 @@ nfsd4_encode_create_session(struct nfsd4_compoundres *resp, __be32 nfserr,
WRITE32(sess->fore_channel.nr_rdma_attrs);

if (sess->fore_channel.nr_rdma_attrs) {
- RESERVE_SPACE(4);
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
+ return nfserr_resource;
WRITE32(sess->fore_channel.rdma_attrs);
}

- RESERVE_SPACE(28);
+ p = xdr_reserve_space(xdr, 28);
+ if (!p)
+ return nfserr_resource;
WRITE32(0); /* headerpadsz */
WRITE32(sess->back_channel.maxreq_sz);
WRITE32(sess->back_channel.maxresp_sz);
@@ -3479,7 +3582,9 @@ nfsd4_encode_create_session(struct nfsd4_compoundres *resp, __be32 nfserr,
WRITE32(sess->back_channel.nr_rdma_attrs);

if (sess->back_channel.nr_rdma_attrs) {
- RESERVE_SPACE(4);
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
+ return nfserr_resource;
WRITE32(sess->back_channel.rdma_attrs);
}
return 0;
@@ -3489,12 +3594,15 @@ static __be32
nfsd4_encode_sequence(struct nfsd4_compoundres *resp, __be32 nfserr,
struct nfsd4_sequence *seq)
{
+ struct xdr_stream *xdr = &resp->xdr;
__be32 *p;

if (nfserr)
return nfserr;

- RESERVE_SPACE(NFS4_MAX_SESSIONID_LEN + 20);
+ p = xdr_reserve_space(xdr, NFS4_MAX_SESSIONID_LEN + 20);
+ if (!p)
+ return nfserr_resource;
WRITEMEM(seq->sessionid.data, NFS4_MAX_SESSIONID_LEN);
WRITE32(seq->seqid);
WRITE32(seq->slotid);
@@ -3511,13 +3619,16 @@ static __be32
nfsd4_encode_test_stateid(struct nfsd4_compoundres *resp, __be32 nfserr,
struct nfsd4_test_stateid *test_stateid)
{
+ struct xdr_stream *xdr = &resp->xdr;
struct nfsd4_test_stateid_id *stateid, *next;
__be32 *p;

if (nfserr)
return nfserr;

- RESERVE_SPACE(4 + (4 * test_stateid->ts_num_ids));
+ p = xdr_reserve_space(xdr, 4 + (4 * test_stateid->ts_num_ids));
+ if (!p)
+ return nfserr_resource;
*p++ = htonl(test_stateid->ts_num_ids);

list_for_each_entry_safe(stateid, next, &test_stateid->ts_stateid_list, ts_id_list) {
@@ -3656,7 +3767,11 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
nfsd4_enc encoder;
__be32 *p;

- RESERVE_SPACE(8);
+ p = xdr_reserve_space(xdr, 8);
+ if (!p) {
+ WARN_ON_ONCE(1);
+ return;
+ }
WRITE32(op->opnum);
post_err_offset = xdr->buf->len;

@@ -3721,18 +3836,21 @@ status:
* called with nfs4_lock_state() held
*/
void
-nfsd4_encode_replay(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
+nfsd4_encode_replay(struct xdr_stream *xdr, struct nfsd4_op *op)
{
__be32 *p;
struct nfs4_replay *rp = op->replay;

BUG_ON(!rp);

- RESERVE_SPACE(8);
+ p = xdr_reserve_space(xdr, 8 + rp->rp_buflen);
+ if (!p) {
+ WARN_ON_ONCE(1);
+ return;
+ }
WRITE32(op->opnum);
*p++ = rp->rp_status; /* already xdr'ed */

- RESERVE_SPACE(rp->rp_buflen);
WRITEMEM(rp->rp_buf, rp->rp_buflen);
}

diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index fa3a589..19bf3fc 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -614,7 +614,7 @@ int nfs4svc_encode_compoundres(struct svc_rqst *, __be32 *,
struct nfsd4_compoundres *);
__be32 nfsd4_check_resp_size(struct nfsd4_compoundres *, u32);
void nfsd4_encode_operation(struct nfsd4_compoundres *, struct nfsd4_op *);
-void nfsd4_encode_replay(struct nfsd4_compoundres *resp, struct nfsd4_op *op);
+void nfsd4_encode_replay(struct xdr_stream *xdr, struct nfsd4_op *op);
__be32 nfsd4_encode_fattr_to_buf(__be32 **p, int words,
struct svc_fh *fhp, struct svc_export *exp,
struct dentry *dentry,
--
1.8.5.3


2014-03-23 01:12:31

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 34/50] nfsd4: minor encode_read cleanup

From: "J. Bruce Fields" <[email protected]>

---
fs/nfsd/nfs4xdr.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 1ac730d..9f28e59 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3075,18 +3075,20 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,

len = maxcount;
v = 0;
- while (len > 0) {
+ while (len) {
+ int thislen;
+
page = *(resp->rqstp->rq_next_page);
if (!page) { /* ran out of pages */
maxcount -= len;
break;
}
+ thislen = min_t(long, len, PAGE_SIZE);
resp->rqstp->rq_vec[v].iov_base = page_address(page);
- resp->rqstp->rq_vec[v].iov_len =
- len < PAGE_SIZE ? len : PAGE_SIZE;
+ resp->rqstp->rq_vec[v].iov_len = thislen;
resp->rqstp->rq_next_page++;
v++;
- len -= PAGE_SIZE;
+ len -= thislen;
}
read->rd_vlen = v;

--
1.8.5.3


2014-03-23 01:12:29

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 24/50] nfsd4: remove ADJUST_ARGS

From: "J. Bruce Fields" <[email protected]>

It's just uninteresting debugging code at this point.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 39 ---------------------------------------
1 file changed, 39 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 43c2a99..a2dbc19 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1751,7 +1751,6 @@ static void write_cinfo(__be32 **p, struct nfsd4_change_info *c)
p = xdr_reserve_space(&resp->xdr, nbytes); \
BUG_ON(!p); \
} while (0)
-#define ADJUST_ARGS() WARN_ON_ONCE(p != resp->xdr.p) \

/* Encode as an array of strings the string given with components
* separated @sep, escaped with esc_enter and esc_exit.
@@ -2730,7 +2729,6 @@ nfsd4_encode_stateid(struct nfsd4_compoundres *resp, stateid_t *sid)
RESERVE_SPACE(sizeof(stateid_t));
WRITE32(sid->si_generation);
WRITEMEM(&sid->si_opaque, sizeof(stateid_opaque_t));
- ADJUST_ARGS();
}

static __be32
@@ -2742,7 +2740,6 @@ nfsd4_encode_access(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_
RESERVE_SPACE(8);
WRITE32(access->ac_supported);
WRITE32(access->ac_resp_access);
- ADJUST_ARGS();
}
return nfserr;
}
@@ -2757,7 +2754,6 @@ static __be32 nfsd4_encode_bind_conn_to_session(struct nfsd4_compoundres *resp,
WRITE32(bcts->dir);
/* Sorry, we do not yet support RDMA over 4.1: */
WRITE32(0);
- ADJUST_ARGS();
}
return nfserr;
}
@@ -2780,7 +2776,6 @@ nfsd4_encode_commit(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_
if (!nfserr) {
RESERVE_SPACE(NFS4_VERIFIER_SIZE);
WRITEMEM(commit->co_verf.data, NFS4_VERIFIER_SIZE);
- ADJUST_ARGS();
}
return nfserr;
}
@@ -2796,7 +2791,6 @@ nfsd4_encode_create(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_
WRITE32(2);
WRITE32(create->cr_bmval[0]);
WRITE32(create->cr_bmval[1]);
- ADJUST_ARGS();
}
return nfserr;
}
@@ -2828,7 +2822,6 @@ nfsd4_encode_getfh(struct nfsd4_compoundres *resp, __be32 nfserr, struct svc_fh
RESERVE_SPACE(len + 4);
WRITE32(len);
WRITEMEM(&fhp->fh_handle.fh_base, len);
- ADJUST_ARGS();
}
return nfserr;
}
@@ -2856,7 +2849,6 @@ nfsd4_encode_lock_denied(struct nfsd4_compoundres *resp, struct nfsd4_lock_denie
WRITE64((u64)0); /* clientid */
WRITE32(0); /* length of owner name */
}
- ADJUST_ARGS();
}

static __be32
@@ -2896,7 +2888,6 @@ nfsd4_encode_link(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_li
if (!nfserr) {
RESERVE_SPACE(20);
write_cinfo(&p, &link->li_cinfo);
- ADJUST_ARGS();
}
return nfserr;
}
@@ -2918,7 +2909,6 @@ nfsd4_encode_open(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_op
WRITE32(open->op_bmval[0]);
WRITE32(open->op_bmval[1]);
WRITE32(open->op_delegate_type);
- ADJUST_ARGS();

switch (open->op_delegate_type) {
case NFS4_OPEN_DELEGATE_NONE:
@@ -2935,7 +2925,6 @@ nfsd4_encode_open(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_op
WRITE32(0);
WRITE32(0);
WRITE32(0); /* XXX: is NULL principal ok? */
- ADJUST_ARGS();
break;
case NFS4_OPEN_DELEGATE_WRITE:
nfsd4_encode_stateid(resp, &open->op_delegate_stateid);
@@ -2956,7 +2945,6 @@ nfsd4_encode_open(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_op
WRITE32(0);
WRITE32(0);
WRITE32(0); /* XXX: is NULL principal ok? */
- ADJUST_ARGS();
break;
case NFS4_OPEN_DELEGATE_NONE_EXT: /* 4.1 */
switch (open->op_why_no_deleg) {
@@ -2970,7 +2958,6 @@ nfsd4_encode_open(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_op
RESERVE_SPACE(4);
WRITE32(open->op_why_no_deleg);
}
- ADJUST_ARGS();
break;
default:
BUG();
@@ -3053,7 +3040,6 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,

WRITE32(eof);
WRITE32(maxcount);
- ADJUST_ARGS();
WARN_ON_ONCE(resp->xdr.buf->head[0].iov_len != (char*)p
- (char*)resp->xdr.buf->head[0].iov_base);
resp->xdr.buf->page_len = maxcount;
@@ -3069,7 +3055,6 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
resp->xdr.buf->tail[0].iov_base += maxcount&3;
resp->xdr.buf->tail[0].iov_len = 4 - (maxcount&3);
xdr->buf->len -= (maxcount&3);
- ADJUST_ARGS();
}
return 0;
}
@@ -3110,7 +3095,6 @@ nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd
}

WRITE32(maxcount);
- ADJUST_ARGS();
resp->xdr.buf->head[0].iov_len = (char*)p
- (char*)resp->xdr.buf->head[0].iov_base;
resp->xdr.buf->page_len = maxcount;
@@ -3125,7 +3109,6 @@ nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd
WRITE32(0);
resp->xdr.buf->tail[0].iov_base += maxcount&3;
resp->xdr.buf->tail[0].iov_len = 4 - (maxcount&3);
- ADJUST_ARGS();
}
return 0;
}
@@ -3152,7 +3135,6 @@ nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
/* XXX: Following NFSv3, we ignore the READDIR verifier for now. */
WRITE32(0);
WRITE32(0);
- ADJUST_ARGS();
resp->xdr.buf->head[0].iov_len = ((char*)resp->xdr.p)
- (char*)resp->xdr.buf->head[0].iov_base;
tailbase = p;
@@ -3221,7 +3203,6 @@ nfsd4_encode_remove(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_
if (!nfserr) {
RESERVE_SPACE(20);
write_cinfo(&p, &remove->rm_cinfo);
- ADJUST_ARGS();
}
return nfserr;
}
@@ -3235,7 +3216,6 @@ nfsd4_encode_rename(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_
RESERVE_SPACE(40);
write_cinfo(&p, &rename->rn_sinfo);
write_cinfo(&p, &rename->rn_tinfo);
- ADJUST_ARGS();
}
return nfserr;
}
@@ -3275,7 +3255,6 @@ nfsd4_do_encode_secinfo(struct nfsd4_compoundres *resp,
supported = 0;
RESERVE_SPACE(4);
flavorsp = p++; /* to be backfilled later */
- ADJUST_ARGS();

for (i = 0; i < nflavs; i++) {
rpc_authflavor_t pf = flavs[i].pseudoflavor;
@@ -3289,12 +3268,10 @@ nfsd4_do_encode_secinfo(struct nfsd4_compoundres *resp,
WRITEMEM(info.oid.data, info.oid.len);
WRITE32(info.qop);
WRITE32(info.service);
- ADJUST_ARGS();
} else if (pf < RPC_AUTH_MAXFLAVOR) {
supported++;
RESERVE_SPACE(4);
WRITE32(pf);
- ADJUST_ARGS();
} else {
if (report)
pr_warn("NFS: SECINFO: security flavor %u "
@@ -3348,7 +3325,6 @@ nfsd4_encode_setattr(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
WRITE32(setattr->sa_bmval[1]);
WRITE32(setattr->sa_bmval[2]);
}
- ADJUST_ARGS();
return nfserr;
}

@@ -3361,13 +3337,11 @@ nfsd4_encode_setclientid(struct nfsd4_compoundres *resp, __be32 nfserr, struct n
RESERVE_SPACE(8 + NFS4_VERIFIER_SIZE);
WRITEMEM(&scd->se_clientid, 8);
WRITEMEM(&scd->se_confirm, NFS4_VERIFIER_SIZE);
- ADJUST_ARGS();
}
else if (nfserr == nfserr_clid_inuse) {
RESERVE_SPACE(8);
WRITE32(0);
WRITE32(0);
- ADJUST_ARGS();
}
return nfserr;
}
@@ -3382,7 +3356,6 @@ nfsd4_encode_write(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_w
WRITE32(write->wr_bytes_written);
WRITE32(write->wr_how_written);
WRITEMEM(write->wr_verifier.data, NFS4_VERIFIER_SIZE);
- ADJUST_ARGS();
}
return nfserr;
}
@@ -3425,7 +3398,6 @@ nfsd4_encode_exchange_id(struct nfsd4_compoundres *resp, __be32 nfserr,
WRITE32(exid->flags);

WRITE32(exid->spa_how);
- ADJUST_ARGS();

switch (exid->spa_how) {
case SP4_NONE:
@@ -3441,7 +3413,6 @@ nfsd4_encode_exchange_id(struct nfsd4_compoundres *resp, __be32 nfserr,
/* empty spo_must_allow bitmap: */
WRITE32(0);

- ADJUST_ARGS();
break;
default:
WARN_ON_ONCE(1);
@@ -3467,7 +3438,6 @@ nfsd4_encode_exchange_id(struct nfsd4_compoundres *resp, __be32 nfserr,

/* Implementation id */
WRITE32(0); /* zero length nfs_impl_id4 array */
- ADJUST_ARGS();
return 0;
}

@@ -3484,7 +3454,6 @@ nfsd4_encode_create_session(struct nfsd4_compoundres *resp, __be32 nfserr,
WRITEMEM(sess->sessionid.data, NFS4_MAX_SESSIONID_LEN);
WRITE32(sess->seqid);
WRITE32(sess->flags);
- ADJUST_ARGS();

RESERVE_SPACE(28);
WRITE32(0); /* headerpadsz */
@@ -3494,12 +3463,10 @@ nfsd4_encode_create_session(struct nfsd4_compoundres *resp, __be32 nfserr,
WRITE32(sess->fore_channel.maxops);
WRITE32(sess->fore_channel.maxreqs);
WRITE32(sess->fore_channel.nr_rdma_attrs);
- ADJUST_ARGS();

if (sess->fore_channel.nr_rdma_attrs) {
RESERVE_SPACE(4);
WRITE32(sess->fore_channel.rdma_attrs);
- ADJUST_ARGS();
}

RESERVE_SPACE(28);
@@ -3510,12 +3477,10 @@ nfsd4_encode_create_session(struct nfsd4_compoundres *resp, __be32 nfserr,
WRITE32(sess->back_channel.maxops);
WRITE32(sess->back_channel.maxreqs);
WRITE32(sess->back_channel.nr_rdma_attrs);
- ADJUST_ARGS();

if (sess->back_channel.nr_rdma_attrs) {
RESERVE_SPACE(4);
WRITE32(sess->back_channel.rdma_attrs);
- ADJUST_ARGS();
}
return 0;
}
@@ -3538,7 +3503,6 @@ nfsd4_encode_sequence(struct nfsd4_compoundres *resp, __be32 nfserr,
WRITE32(seq->maxslots - 1); /* sr_target_highest_slotid */
WRITE32(seq->status_flags);

- ADJUST_ARGS();
resp->cstate.datap = p; /* DRC cache data pointer */
return 0;
}
@@ -3560,7 +3524,6 @@ nfsd4_encode_test_stateid(struct nfsd4_compoundres *resp, __be32 nfserr,
*p++ = stateid->ts_id_status;
}

- ADJUST_ARGS();
return nfserr;
}

@@ -3768,11 +3731,9 @@ nfsd4_encode_replay(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
RESERVE_SPACE(8);
WRITE32(op->opnum);
*p++ = rp->rp_status; /* already xdr'ed */
- ADJUST_ARGS();

RESERVE_SPACE(rp->rp_buflen);
WRITEMEM(rp->rp_buf, rp->rp_buflen);
- ADJUST_ARGS();
}

int
--
1.8.5.3


2014-03-23 01:12:32

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 42/50] nfsd4: don't treat readlink like a zero-copy operation

From: "J. Bruce Fields" <[email protected]>

There's no advantage to this zero-copy-style readlink encoding, and it
unnecessarily limits the kinds of compounds we can handle. (In practice
I can't see why a client would want e.g. multiple readlink calls in a
comound, but it's probably a spec violation for us not to handle it.)

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 44 +++++++++++++-------------------------------
1 file changed, 13 insertions(+), 31 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index b35b7b1..c96a05e 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3148,8 +3148,9 @@ static __be32
nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_readlink *readlink)
{
int maxcount;
+ __be32 wire_count;
+ int zero = 0;
struct xdr_stream *xdr = &resp->xdr;
- char *page;
int length_offset = xdr->buf->len;
__be32 *p;

@@ -3159,51 +3160,32 @@ nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd
p = xdr_reserve_space(xdr, 4);
if (!p)
return nfserr_resource;
-
- if (resp->xdr.buf->page_len)
- return nfserr_resource;
- if (!*resp->rqstp->rq_next_page)
- return nfserr_resource;
-
- page = page_address(*(resp->rqstp->rq_next_page++));
-
maxcount = PAGE_SIZE;

- if (xdr->end - xdr->p < 1)
+ p = xdr_reserve_space(xdr, maxcount);
+ if (!p)
return nfserr_resource;
-
/*
* XXX: By default, the ->readlink() VFS op will truncate symlinks
* if they would overflow the buffer. Is this kosher in NFSv4? If
* not, one easy fix is: if ->readlink() precisely fills the buffer,
* assume that truncation occurred, and return NFS4ERR_RESOURCE.
*/
- nfserr = nfsd_readlink(readlink->rl_rqstp, readlink->rl_fhp, page, &maxcount);
+ nfserr = nfsd_readlink(readlink->rl_rqstp, readlink->rl_fhp, (void *)p, &maxcount);
+
if (nfserr == nfserr_isdir)
- nfserr = nfserr_inval;
+ nfserr= nfserr_inval;
if (nfserr) {
xdr_truncate_encode(xdr, length_offset);
return nfserr;
}

- WRITE32(maxcount);
- resp->xdr.buf->head[0].iov_len = (char*)p
- - (char*)resp->xdr.buf->head[0].iov_base;
- resp->xdr.buf->page_len = maxcount;
- xdr->buf->len += maxcount;
- xdr->page_ptr += 1;
- xdr->buf->buflen -= PAGE_SIZE;
- xdr->iov = xdr->buf->tail;
-
- /* Use rest of head for padding and remaining ops: */
- resp->xdr.buf->tail[0].iov_base = p;
- resp->xdr.buf->tail[0].iov_len = 0;
- if (maxcount&3) {
- p = xdr_reserve_space(xdr, 4);
- WRITE32(0);
- resp->xdr.buf->tail[0].iov_base += maxcount&3;
- resp->xdr.buf->tail[0].iov_len = 4 - (maxcount&3);
- }
+ wire_count = htonl(maxcount);
+ write_bytes_to_xdr_buf(xdr->buf, length_offset, &wire_count, 4);
+ xdr_truncate_encode(xdr, length_offset + 4 + maxcount);
+ if (maxcount & 3)
+ write_bytes_to_xdr_buf(xdr->buf, length_offset + 4 + maxcount,
+ &zero, 4 - (maxcount&3));
return 0;
}

--
1.8.5.3


2014-03-23 01:12:33

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 44/50] nfsd4: nfsd_vfs_read doesn't use file handle parameter

From: "J. Bruce Fields" <[email protected]>

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/vfs.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index eea5ad1..cbf7690 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -821,7 +821,7 @@ static int nfsd_direct_splice_actor(struct pipe_inode_info *pipe,
}

static __be32
-nfsd_vfs_read(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
+nfsd_vfs_read(struct svc_rqst *rqstp, struct file *file,
loff_t offset, struct kvec *vec, int vlen, unsigned long *count)
{
mm_segment_t oldfs;
@@ -987,7 +987,7 @@ __be32 nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
if (ra && ra->p_set)
file->f_ra = ra->p_ra;

- err = nfsd_vfs_read(rqstp, fhp, file, offset, vec, vlen, count);
+ err = nfsd_vfs_read(rqstp, file, offset, vec, vlen, count);

/* Write back readahead params */
if (ra) {
@@ -1016,7 +1016,7 @@ nfsd_read_file(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
NFSD_MAY_READ|NFSD_MAY_OWNER_OVERRIDE);
if (err)
goto out;
- err = nfsd_vfs_read(rqstp, fhp, file, offset, vec, vlen, count);
+ err = nfsd_vfs_read(rqstp, file, offset, vec, vlen, count);
} else /* Note file may still be NULL in NFSv4 special stateid case: */
err = nfsd_read(rqstp, fhp, offset, vec, vlen, count);
out:
--
1.8.5.3


2014-03-23 06:50:05

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 22/50] nfsd4: use xdr_truncate_encode

Ah, here we got the helper for most of the error handling I asked for
earlier. Might be worth mentioning in that patch that this gets further
improvements soons.

> if (nfserr) {
> - xdr->p -= 2;
> - xdr->iov->iov_len -= 8;
> + xdr->buf->page_len = 0;
> + xdr_truncate_encode(xdr, starting_len);

Why do we still need to manually set the page_len?


2014-03-23 01:12:33

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 46/50] nfsd4: allow exotic read compounds

From: "J. Bruce Fields" <[email protected]>

---
Documentation/filesystems/nfs/nfs41-server.txt | 2 -
fs/nfsd/nfs4xdr.c | 100 +++++++++++++------------
2 files changed, 51 insertions(+), 51 deletions(-)

diff --git a/Documentation/filesystems/nfs/nfs41-server.txt b/Documentation/filesystems/nfs/nfs41-server.txt
index b930ad0..c49cd7e 100644
--- a/Documentation/filesystems/nfs/nfs41-server.txt
+++ b/Documentation/filesystems/nfs/nfs41-server.txt
@@ -176,7 +176,5 @@ Nonstandard compound limitations:
ca_maxrequestsize request and a ca_maxresponsesize reply, so we may
fail to live up to the promise we made in CREATE_SESSION fore channel
negotiation.
-* No more than one read-like operation allowed per compound; encoding
- replies that cross page boundaries (except for read data) not handled.

See also http://wiki.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues.
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index e052351..d92dd1b 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3074,11 +3074,11 @@ static __be32 nfsd4_encode_splice_read(
struct file *file, unsigned long maxcount)
{
struct xdr_stream *xdr = &resp->xdr;
+ struct xdr_buf *buf = xdr->buf;
u32 eof;
- int starting_len = xdr->buf->len - 8;
+ int space_left;
__be32 nfserr;
- __be32 tmp;
- __be32 *p;
+ __be32 *p = xdr->p - 2;

/*
* Don't inline pages unless we know there's room for eof,
@@ -3090,34 +3090,38 @@ static __be32 nfsd4_encode_splice_read(
nfserr = nfsd_splice_read(read->rd_rqstp, file,
read->rd_offset, &maxcount);
if (nfserr) {
- xdr->buf->page_len = 0;
+ buf->page_len = 0;
return nfserr;
}

eof = (read->rd_offset + maxcount >=
read->rd_fhp->fh_dentry->d_inode->i_size);

- tmp = htonl(eof);
- write_bytes_to_xdr_buf(xdr->buf, starting_len , &tmp, 4);
- tmp = htonl(maxcount);
- write_bytes_to_xdr_buf(xdr->buf, starting_len + 4, &tmp, 4);
+ *(p++) = htonl(eof);
+ *(p++) = htonl(maxcount);

- resp->xdr.buf->page_len = maxcount;
- xdr->buf->len += maxcount;
- xdr->page_ptr += (maxcount + PAGE_SIZE - 1) / PAGE_SIZE;
- xdr->buf->buflen = maxcount + PAGE_SIZE - 2 * RPC_MAX_AUTH_SIZE;
- xdr->iov = xdr->buf->tail;
+ buf->page_len = maxcount;
+ buf->len += maxcount;

/* Use rest of head for padding and remaining ops: */
- resp->xdr.buf->tail[0].iov_base = xdr->p;
- resp->xdr.buf->tail[0].iov_len = 0;
+ buf->tail[0].iov_base = xdr->p;
+ buf->tail[0].iov_len = 0;
+ xdr->iov = xdr->buf->tail;
if (maxcount&3) {
- p = xdr_reserve_space(xdr, 4);
- WRITE32(0);
- resp->xdr.buf->tail[0].iov_base += maxcount&3;
- resp->xdr.buf->tail[0].iov_len = 4 - (maxcount&3);
- xdr->buf->len -= (maxcount&3);
+ int pad = 4 - (maxcount&3);
+
+ *(xdr->p++) = 0;
+
+ buf->tail[0].iov_base += maxcount&3;
+ buf->tail[0].iov_len = pad;
+ buf->len += pad;
}
+
+ space_left = min_t(int, (void *)xdr->end - (void *)xdr->p,
+ buf->buflen - buf->len);
+ buf->buflen = buf->len + space_left;
+ xdr->end = (__be32 *)((void *)xdr->end + space_left);
+
return 0;
}

@@ -3128,27 +3132,34 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
struct xdr_stream *xdr = &resp->xdr;
u32 eof;
int v;
- struct page *page;
int starting_len = xdr->buf->len - 8;
long len;
+ int thislen;
__be32 nfserr;
__be32 tmp;
__be32 *p;
+ u32 zzz = 0;
+ int pad;

len = maxcount;
v = 0;
- while (len) {
- int thislen;

- page = *(resp->rqstp->rq_next_page);
- if (!page) { /* ran out of pages */
- maxcount -= len;
- break;
- }
+ thislen = (void *)xdr->end - (void *)xdr->p;
+ if (len < thislen)
+ thislen = len;
+ p = xdr_reserve_space(xdr, (thislen+3)&~3);
+ WARN_ON_ONCE(!p);
+ resp->rqstp->rq_vec[v].iov_base = p;
+ resp->rqstp->rq_vec[v].iov_len = thislen;
+ v++;
+ len -= thislen;
+
+ while (len) {
thislen = min_t(long, len, PAGE_SIZE);
- resp->rqstp->rq_vec[v].iov_base = page_address(page);
+ p = xdr_reserve_space(xdr, (thislen+3)&~3);
+ WARN_ON_ONCE(!p);
+ resp->rqstp->rq_vec[v].iov_base = p;
resp->rqstp->rq_vec[v].iov_len = thislen;
- resp->rqstp->rq_next_page++;
v++;
len -= thislen;
}
@@ -3158,6 +3169,7 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
read->rd_vlen, &maxcount);
if (nfserr)
return nfserr;
+ xdr_truncate_encode(xdr, starting_len + 8 + ((maxcount+3)&~3));

eof = (read->rd_offset + maxcount >=
read->rd_fhp->fh_dentry->d_inode->i_size);
@@ -3167,22 +3179,9 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
tmp = htonl(maxcount);
write_bytes_to_xdr_buf(xdr->buf, starting_len + 4, &tmp, 4);

- resp->xdr.buf->page_len = maxcount;
- xdr->buf->len += maxcount;
- xdr->page_ptr += v;
- xdr->buf->buflen = maxcount + PAGE_SIZE - 2 * RPC_MAX_AUTH_SIZE;
- xdr->iov = xdr->buf->tail;
-
- /* Use rest of head for padding and remaining ops: */
- resp->xdr.buf->tail[0].iov_base = xdr->p;
- resp->xdr.buf->tail[0].iov_len = 0;
- if (maxcount&3) {
- p = xdr_reserve_space(xdr, 4);
- WRITE32(0);
- resp->xdr.buf->tail[0].iov_base += maxcount&3;
- resp->xdr.buf->tail[0].iov_len = 4 - (maxcount&3);
- xdr->buf->len -= (maxcount&3);
- }
+ pad = (maxcount&3) ? 4 - (maxcount&3) : 0;
+ write_bytes_to_xdr_buf(xdr->buf, starting_len + 8 + maxcount,
+ &zzz, pad);
return 0;

}
@@ -3204,16 +3203,19 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,

p = xdr_reserve_space(xdr, 8); /* eof flag and byte count */
if (!p) {
- WARN_ON_ONCE(1);
+ WARN_ON_ONCE(resp->rqstp->rq_splice_ok);
return nfserr_resource;
}

- if (resp->xdr.buf->page_len)
+ if (resp->rqstp->rq_splice_ok && resp->xdr.buf->page_len) {
+ WARN_ON_ONCE(1);
return nfserr_resource;
-
+ }
xdr_commit_encode(xdr);

maxcount = svc_max_payload(resp->rqstp);
+ if (maxcount > xdr->buf->buflen - xdr->buf->len)
+ maxcount = xdr->buf->buflen - xdr->buf->len;
if (maxcount > read->rd_length)
maxcount = read->rd_length;

--
1.8.5.3


2014-03-23 01:12:30

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 31/50] nfsd4: convert 4.1 replay encoding

From: "J. Bruce Fields" <[email protected]>

Limits on maxresp_sz mean that we only ever need to replay rpc's that
are contained entirely in the head.

The one exception is very small zero-copy reads. That's an odd corner
case as clients wouldn't normally ask those to be cached.

in any case, this seems a little more robust.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 28 ++++++++++++++--------------
fs/nfsd/nfs4xdr.c | 2 +-
fs/nfsd/xdr4.h | 2 +-
3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 2910abc..1146cf2 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1547,6 +1547,7 @@ out_err:
void
nfsd4_store_cache_entry(struct nfsd4_compoundres *resp)
{
+ struct xdr_buf *buf = resp->xdr.buf;
struct nfsd4_slot *slot = resp->cstate.slot;
unsigned int base;

@@ -1560,11 +1561,9 @@ nfsd4_store_cache_entry(struct nfsd4_compoundres *resp)
slot->sl_datalen = 0;
return;
}
- slot->sl_datalen = (char *)resp->xdr.p - (char *)resp->cstate.datap;
- base = (char *)resp->cstate.datap -
- (char *)resp->xdr.buf->head[0].iov_base;
- if (read_bytes_from_xdr_buf(resp->xdr.buf, base, slot->sl_data,
- slot->sl_datalen))
+ base = resp->cstate.data_offset;
+ slot->sl_datalen = buf->len - base;
+ if (read_bytes_from_xdr_buf(buf, base, slot->sl_data, slot->sl_datalen))
WARN("%s: sessions DRC could not cache compound\n", __func__);
return;
}
@@ -1605,7 +1604,8 @@ nfsd4_replay_cache_entry(struct nfsd4_compoundres *resp,
struct nfsd4_sequence *seq)
{
struct nfsd4_slot *slot = resp->cstate.slot;
- struct kvec *head = resp->xdr.iov;
+ struct xdr_stream *xdr = &resp->xdr;
+ __be32 *p;
__be32 status;

dprintk("--> %s slot %p\n", __func__, slot);
@@ -1614,16 +1614,16 @@ nfsd4_replay_cache_entry(struct nfsd4_compoundres *resp,
if (status)
return status;

- /* The sequence operation has been encoded, cstate->datap set. */
- memcpy(resp->cstate.datap, slot->sl_data, slot->sl_datalen);
+ p = xdr_reserve_space(xdr, slot->sl_datalen);
+ if (!p) {
+ WARN_ON_ONCE(1);
+ return nfserr_serverfault;
+ }
+ xdr_encode_opaque_fixed(p, slot->sl_data, slot->sl_datalen);
+ xdr_commit_encode(xdr);

resp->opcnt = slot->sl_opcnt;
- resp->xdr.p = resp->cstate.datap + XDR_QUADLEN(slot->sl_datalen);
- head->iov_len = (void *)resp->xdr.p - head->iov_base;
- resp->xdr.buf->len = head->iov_len;
- status = slot->sl_status;
-
- return status;
+ return slot->sl_status;
}

/*
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 8dc65f5..8703cca 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3635,7 +3635,7 @@ nfsd4_encode_sequence(struct nfsd4_compoundres *resp, __be32 nfserr,
WRITE32(seq->maxslots - 1); /* sr_target_highest_slotid */
WRITE32(seq->status_flags);

- resp->cstate.datap = p; /* DRC cache data pointer */
+ resp->cstate.data_offset = xdr->buf->len; /* DRC cache data pointer */
return 0;
}

diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 19bf3fc..d1c6e21 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -58,7 +58,7 @@ struct nfsd4_compound_state {
/* For sessions DRC */
struct nfsd4_session *session;
struct nfsd4_slot *slot;
- __be32 *datap;
+ int data_offset;
size_t iovlen;
u32 minorversion;
__be32 status;
--
1.8.5.3


2014-03-29 19:18:16

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 08/50] nfsd4: fix nfs4err_resource in 4.1 case

I've applied the first 8 patches of this series for 3.15, with more
probably to come. My tree for 3.15 so far is at

git://linux-nfs.org/~bfields/linux.git for-3.15

--b.

On Sat, Mar 22, 2014 at 09:11:39PM -0400, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <[email protected]>
>
> encode_getattr, for example, can return nfserr_resource to indicate it
> ran out of buffer space. That's not a legal error in the 4.1 case. And
> in the 4.1 case, if we ran out of buffer space, we should have exceeded
> a session limit too.
> ---
> fs/nfsd/nfs4xdr.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index 143780d..ea32674 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -3626,6 +3626,14 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
> /* nfsd4_check_resp_size guarantees enough room for error status */
> if (!op->status)
> op->status = nfsd4_check_resp_size(resp, 0);
> + if (op->status == nfserr_resource && nfsd4_has_session(&resp->cstate)) {
> + struct nfsd4_slot *slot = resp->cstate.slot;
> +
> + if (slot->sl_flags & NFSD4_SLOT_CACHETHIS)
> + op->status = nfserr_rep_too_big_to_cache;
> + else
> + op->status = nfserr_rep_too_big;
> + }
> if (so) {
> so->so_replay.rp_status = op->status;
> so->so_replay.rp_buflen = (char *)resp->p - (char *)(statp+1);
> --
> 1.8.5.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2014-03-23 01:12:31

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 35/50] nfsd4: nfsd4_check_resp_size should check against whole buffer

From: "J. Bruce Fields" <[email protected]>

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 9f28e59..6d9f0a53 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3754,7 +3754,6 @@ __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 respsize)
{
struct xdr_buf *buf = &resp->rqstp->rq_res;
struct nfsd4_session *session = resp->cstate.session;
- int slack_bytes = (char *)resp->xdr.end - (char *)resp->xdr.p;

if (nfsd4_has_session(&resp->cstate)) {
struct nfsd4_slot *slot = resp->cstate.slot;
@@ -3767,7 +3766,7 @@ __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 respsize)
return nfserr_rep_too_big_to_cache;
}

- if (respsize > slack_bytes) {
+ if (buf->len + respsize > buf->buflen) {
WARN_ON_ONCE(nfsd4_has_session(&resp->cstate));
return nfserr_resource;
}
--
1.8.5.3


2014-03-23 01:12:25

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 08/50] nfsd4: fix nfs4err_resource in 4.1 case

From: "J. Bruce Fields" <[email protected]>

encode_getattr, for example, can return nfserr_resource to indicate it
ran out of buffer space. That's not a legal error in the 4.1 case. And
in the 4.1 case, if we ran out of buffer space, we should have exceeded
a session limit too.
---
fs/nfsd/nfs4xdr.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 143780d..ea32674 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3626,6 +3626,14 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
/* nfsd4_check_resp_size guarantees enough room for error status */
if (!op->status)
op->status = nfsd4_check_resp_size(resp, 0);
+ if (op->status == nfserr_resource && nfsd4_has_session(&resp->cstate)) {
+ struct nfsd4_slot *slot = resp->cstate.slot;
+
+ if (slot->sl_flags & NFSD4_SLOT_CACHETHIS)
+ op->status = nfserr_rep_too_big_to_cache;
+ else
+ op->status = nfserr_rep_too_big;
+ }
if (so) {
so->so_replay.rp_status = op->status;
so->so_replay.rp_buflen = (char *)resp->p - (char *)(statp+1);
--
1.8.5.3


2014-03-23 01:12:28

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 17/50] nfsd4: use xdr_reserve_space in attribute encoding

From: "J. Bruce Fields" <[email protected]>

This is a cosmetic change for now; no change in behavior.

Note we're just depending on xdr_reserve_space to do the bounds checking
for us, we're not really depending on its adjustment of iovec or xdr_buf
lengths yet, as those are fixed up by as necessary after the fact by
read-link operations and by nfs4svc_encode_compoundres. However we do
have to update xdr->iov on read-like operations to prevent
xdr_reserve_space from messing with the already-fixed-up length of the
the head.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/acl.h | 2 +-
fs/nfsd/idmap.h | 4 +-
fs/nfsd/nfs4acl.c | 11 +--
fs/nfsd/nfs4idmap.c | 38 ++++----
fs/nfsd/nfs4proc.c | 1 +
fs/nfsd/nfs4xdr.c | 274 ++++++++++++++++++++++++++++++----------------------
6 files changed, 186 insertions(+), 144 deletions(-)

diff --git a/fs/nfsd/acl.h b/fs/nfsd/acl.h
index b481e1f..a986ceb 100644
--- a/fs/nfsd/acl.h
+++ b/fs/nfsd/acl.h
@@ -49,7 +49,7 @@ struct svc_rqst;

struct nfs4_acl *nfs4_acl_new(int);
int nfs4_acl_get_whotype(char *, u32);
-__be32 nfs4_acl_write_who(int who, __be32 **p, int *len);
+__be32 nfs4_acl_write_who(struct xdr_stream *xdr, int who);

int nfsd4_get_nfs4_acl(struct svc_rqst *rqstp, struct dentry *dentry,
struct nfs4_acl **acl);
diff --git a/fs/nfsd/idmap.h b/fs/nfsd/idmap.h
index 66e58db..a3f3490 100644
--- a/fs/nfsd/idmap.h
+++ b/fs/nfsd/idmap.h
@@ -56,7 +56,7 @@ static inline void nfsd_idmap_shutdown(struct net *net)

__be32 nfsd_map_name_to_uid(struct svc_rqst *, const char *, size_t, kuid_t *);
__be32 nfsd_map_name_to_gid(struct svc_rqst *, const char *, size_t, kgid_t *);
-__be32 nfsd4_encode_user(struct svc_rqst *, kuid_t, __be32 **, int *);
-__be32 nfsd4_encode_group(struct svc_rqst *, kgid_t, __be32 **, int *);
+__be32 nfsd4_encode_user(struct xdr_stream *, struct svc_rqst *, kuid_t);
+__be32 nfsd4_encode_group(struct xdr_stream *, struct svc_rqst *, kgid_t);

#endif /* LINUX_NFSD_IDMAP_H */
diff --git a/fs/nfsd/nfs4acl.c b/fs/nfsd/nfs4acl.c
index d190e33..02229b5 100644
--- a/fs/nfsd/nfs4acl.c
+++ b/fs/nfsd/nfs4acl.c
@@ -914,20 +914,19 @@ nfs4_acl_get_whotype(char *p, u32 len)
return NFS4_ACL_WHO_NAMED;
}

-__be32 nfs4_acl_write_who(int who, __be32 **p, int *len)
+__be32 nfs4_acl_write_who(struct xdr_stream *xdr, int who)
{
+ __be32 *p;
int i;
- int bytes;

for (i = 0; i < ARRAY_SIZE(s2t_map); i++) {
if (s2t_map[i].type != who)
continue;
- bytes = 4 + (XDR_QUADLEN(s2t_map[i].stringlen) << 2);
- if (bytes > *len)
+ p = xdr_reserve_space(xdr, s2t_map[i].stringlen + 4);
+ if (!p)
return nfserr_resource;
- *p = xdr_encode_opaque(*p, s2t_map[i].string,
+ p = xdr_encode_opaque(p, s2t_map[i].string,
s2t_map[i].stringlen);
- *len -= bytes;
return 0;
}
WARN_ON_ONCE(1);
diff --git a/fs/nfsd/nfs4idmap.c b/fs/nfsd/nfs4idmap.c
index c0dfde6..43afc90 100644
--- a/fs/nfsd/nfs4idmap.c
+++ b/fs/nfsd/nfs4idmap.c
@@ -551,44 +551,42 @@ idmap_name_to_id(struct svc_rqst *rqstp, int type, const char *name, u32 namelen
return 0;
}

-static __be32 encode_ascii_id(u32 id, __be32 **p, int *buflen)
+static __be32 encode_ascii_id(struct xdr_stream *xdr, u32 id)
{
char buf[11];
int len;
- int bytes;
+ __be32 *p;

len = sprintf(buf, "%u", id);
- bytes = 4 + (XDR_QUADLEN(len) << 2);
- if (bytes > *buflen)
+ p = xdr_reserve_space(xdr, len + 4);
+ if (!p)
return nfserr_resource;
- *p = xdr_encode_opaque(*p, buf, len);
- *buflen -= bytes;
+ p = xdr_encode_opaque(p, buf, len);
return 0;
}

-static __be32 idmap_id_to_name(struct svc_rqst *rqstp, int type, u32 id, __be32 **p, int *buflen)
+static __be32 idmap_id_to_name(struct xdr_stream *xdr, struct svc_rqst *rqstp, int type, u32 id)
{
struct ent *item, key = {
.id = id,
.type = type,
};
+ __be32 *p;
int ret;
- int bytes;
struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);

strlcpy(key.authname, rqst_authname(rqstp), sizeof(key.authname));
ret = idmap_lookup(rqstp, idtoname_lookup, &key, nn->idtoname_cache, &item);
if (ret == -ENOENT)
- return encode_ascii_id(id, p, buflen);
+ return encode_ascii_id(xdr, id);
if (ret)
return nfserrno(ret);
ret = strlen(item->name);
WARN_ON_ONCE(ret > IDMAP_NAMESZ);
- bytes = 4 + (XDR_QUADLEN(ret) << 2);
- if (bytes > *buflen)
+ p = xdr_reserve_space(xdr, ret + 4);
+ if (!p)
return nfserr_resource;
- *p = xdr_encode_opaque(*p, item->name, ret);
- *buflen -= bytes;
+ p = xdr_encode_opaque(p, item->name, ret);
cache_put(&item->h, nn->idtoname_cache);
return 0;
}
@@ -622,11 +620,11 @@ do_name_to_id(struct svc_rqst *rqstp, int type, const char *name, u32 namelen, u
return idmap_name_to_id(rqstp, type, name, namelen, id);
}

-static __be32 encode_name_from_id(struct svc_rqst *rqstp, int type, u32 id, __be32 **p, int *buflen)
+static __be32 encode_name_from_id(struct xdr_stream *xdr, struct svc_rqst *rqstp, int type, u32 id)
{
if (nfs4_disable_idmapping && rqstp->rq_cred.cr_flavor < RPC_AUTH_GSS)
- return encode_ascii_id(id, p, buflen);
- return idmap_id_to_name(rqstp, type, id, p, buflen);
+ return encode_ascii_id(xdr, id);
+ return idmap_id_to_name(xdr, rqstp, type, id);
}

__be32
@@ -655,14 +653,14 @@ nfsd_map_name_to_gid(struct svc_rqst *rqstp, const char *name, size_t namelen,
return status;
}

-__be32 nfsd4_encode_user(struct svc_rqst *rqstp, kuid_t uid, __be32 **p, int *buflen)
+__be32 nfsd4_encode_user(struct xdr_stream *xdr, struct svc_rqst *rqstp, kuid_t uid)
{
u32 id = from_kuid(&init_user_ns, uid);
- return encode_name_from_id(rqstp, IDMAP_TYPE_USER, id, p, buflen);
+ return encode_name_from_id(xdr, rqstp, IDMAP_TYPE_USER, id);
}

-__be32 nfsd4_encode_group(struct svc_rqst *rqstp, kgid_t gid, __be32 **p, int *buflen)
+__be32 nfsd4_encode_group(struct xdr_stream *xdr, struct svc_rqst *rqstp, kgid_t gid)
{
u32 id = from_kgid(&init_user_ns, gid);
- return encode_name_from_id(rqstp, IDMAP_TYPE_GROUP, id, p, buflen);
+ return encode_name_from_id(xdr, rqstp, IDMAP_TYPE_GROUP, id);
}
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index a61f492..9e7cb05 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1219,6 +1219,7 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp, struct nfsd4_compoundres
struct kvec *head = buf->head;

xdr->buf = buf;
+ xdr->iov = head;
xdr->p = head->iov_base + head->iov_len;
xdr->end = head->iov_base + PAGE_SIZE - 2 * RPC_MAX_AUTH_SIZE;
}
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index ed486e7..6025713 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1756,18 +1756,18 @@ static void write_cinfo(__be32 **p, struct nfsd4_change_info *c)
/* Encode as an array of strings the string given with components
* separated @sep, escaped with esc_enter and esc_exit.
*/
-static __be32 nfsd4_encode_components_esc(char sep, char *components,
- __be32 **pp, int *buflen,
- char esc_enter, char esc_exit)
+static __be32 nfsd4_encode_components_esc(struct xdr_stream *xdr, char sep, char *components, char esc_enter, char esc_exit)
{
- __be32 *p = *pp;
- __be32 *countp = p;
+ __be32 *p;
+ __be32 *countp;
int strlen, count=0;
char *str, *end, *next;

dprintk("nfsd4_encode_components(%s)\n", components);
- if ((*buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
return nfserr_resource;
+ countp = p;
WRITE32(0); /* We will fill this in with @count later */
end = str = components;
while (*end) {
@@ -1790,7 +1790,8 @@ static __be32 nfsd4_encode_components_esc(char sep, char *components,

strlen = end - str;
if (strlen) {
- if ((*buflen -= ((XDR_QUADLEN(strlen) << 2) + 4)) < 0)
+ p = xdr_reserve_space(xdr, strlen + 4);
+ if (!p)
return nfserr_resource;
WRITE32(strlen);
WRITEMEM(str, strlen);
@@ -1800,7 +1801,6 @@ static __be32 nfsd4_encode_components_esc(char sep, char *components,
end++;
str = end;
}
- *pp = p;
p = countp;
WRITE32(count);
return 0;
@@ -1809,40 +1809,35 @@ static __be32 nfsd4_encode_components_esc(char sep, char *components,
/* Encode as an array of strings the string given with components
* separated @sep.
*/
-static __be32 nfsd4_encode_components(char sep, char *components,
- __be32 **pp, int *buflen)
+static __be32 nfsd4_encode_components(struct xdr_stream *xdr, char sep, char *components)
{
- return nfsd4_encode_components_esc(sep, components, pp, buflen, 0, 0);
+ return nfsd4_encode_components_esc(xdr, sep, components, 0, 0);
}

/*
* encode a location element of a fs_locations structure
*/
-static __be32 nfsd4_encode_fs_location4(struct nfsd4_fs_location *location,
- __be32 **pp, int *buflen)
+static __be32 nfsd4_encode_fs_location4(struct xdr_stream *xdr, struct nfsd4_fs_location *location)
{
__be32 status;
- __be32 *p = *pp;

- status = nfsd4_encode_components_esc(':', location->hosts, &p, buflen,
+ status = nfsd4_encode_components_esc(xdr, ':', location->hosts,
'[', ']');
if (status)
return status;
- status = nfsd4_encode_components('/', location->path, &p, buflen);
+ status = nfsd4_encode_components(xdr, '/', location->path);
if (status)
return status;
- *pp = p;
return 0;
}

/*
* Encode a path in RFC3530 'pathname4' format
*/
-static __be32 nfsd4_encode_path(const struct path *root,
- const struct path *path, __be32 **pp, int *buflen)
+static __be32 nfsd4_encode_path(struct xdr_stream *xdr, const struct path *root, const struct path *path)
{
struct path cur = *path;
- __be32 *p = *pp;
+ __be32 *p;
struct dentry **components = NULL;
unsigned int ncomponents = 0;
__be32 err = nfserr_jukebox;
@@ -1873,9 +1868,9 @@ static __be32 nfsd4_encode_path(const struct path *root,
components[ncomponents++] = cur.dentry;
cur.dentry = dget_parent(cur.dentry);
}
-
- *buflen -= 4;
- if (*buflen < 0)
+ err = nfserr_resource;
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_free;
WRITE32(ncomponents);

@@ -1885,8 +1880,8 @@ static __be32 nfsd4_encode_path(const struct path *root,

spin_lock(&dentry->d_lock);
len = dentry->d_name.len;
- *buflen -= 4 + (XDR_QUADLEN(len) << 2);
- if (*buflen < 0) {
+ p = xdr_reserve_space(xdr, len + 4);
+ if (!p) {
spin_unlock(&dentry->d_lock);
goto out_free;
}
@@ -1898,7 +1893,6 @@ static __be32 nfsd4_encode_path(const struct path *root,
ncomponents--;
}

- *pp = p;
err = 0;
out_free:
dprintk(")\n");
@@ -1909,8 +1903,7 @@ out_free:
return err;
}

-static __be32 nfsd4_encode_fsloc_fsroot(struct svc_rqst *rqstp,
- const struct path *path, __be32 **pp, int *buflen)
+static __be32 nfsd4_encode_fsloc_fsroot(struct xdr_stream *xdr, struct svc_rqst *rqstp, const struct path *path)
{
struct svc_export *exp_ps;
__be32 res;
@@ -1918,7 +1911,7 @@ static __be32 nfsd4_encode_fsloc_fsroot(struct svc_rqst *rqstp,
exp_ps = rqst_find_fsidzero_export(rqstp);
if (IS_ERR(exp_ps))
return nfserrno(PTR_ERR(exp_ps));
- res = nfsd4_encode_path(&exp_ps->ex_path, path, pp, buflen);
+ res = nfsd4_encode_path(xdr, &exp_ps->ex_path, path);
exp_put(exp_ps);
return res;
}
@@ -1926,28 +1919,25 @@ static __be32 nfsd4_encode_fsloc_fsroot(struct svc_rqst *rqstp,
/*
* encode a fs_locations structure
*/
-static __be32 nfsd4_encode_fs_locations(struct svc_rqst *rqstp,
- struct svc_export *exp,
- __be32 **pp, int *buflen)
+static __be32 nfsd4_encode_fs_locations(struct xdr_stream *xdr, struct svc_rqst *rqstp, struct svc_export *exp)
{
__be32 status;
int i;
- __be32 *p = *pp;
+ __be32 *p;
struct nfsd4_fs_locations *fslocs = &exp->ex_fslocs;

- status = nfsd4_encode_fsloc_fsroot(rqstp, &exp->ex_path, &p, buflen);
+ status = nfsd4_encode_fsloc_fsroot(xdr, rqstp, &exp->ex_path);
if (status)
return status;
- if ((*buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
return nfserr_resource;
WRITE32(fslocs->locations_count);
for (i=0; i<fslocs->locations_count; i++) {
- status = nfsd4_encode_fs_location4(&fslocs->locations[i],
- &p, buflen);
+ status = nfsd4_encode_fs_location4(xdr, &fslocs->locations[i]);
if (status)
return status;
}
- *pp = p;
return 0;
}

@@ -1966,15 +1956,14 @@ static u32 nfs4_file_type(umode_t mode)
}

static inline __be32
-nfsd4_encode_aclname(struct svc_rqst *rqstp, struct nfs4_ace *ace,
- __be32 **p, int *buflen)
+nfsd4_encode_aclname(struct xdr_stream *xdr, struct svc_rqst *rqstp, struct nfs4_ace *ace)
{
if (ace->whotype != NFS4_ACL_WHO_NAMED)
- return nfs4_acl_write_who(ace->whotype, p, buflen);
+ return nfs4_acl_write_who(xdr, ace->whotype);
else if (ace->flag & NFS4_ACE_IDENTIFIER_GROUP)
- return nfsd4_encode_group(rqstp, ace->who_gid, p, buflen);
+ return nfsd4_encode_group(xdr, rqstp, ace->who_gid);
else
- return nfsd4_encode_user(rqstp, ace->who_uid, p, buflen);
+ return nfsd4_encode_user(xdr, rqstp, ace->who_uid);
}

#define WORD0_ABSENT_FS_ATTRS (FATTR4_WORD0_FS_LOCATIONS | FATTR4_WORD0_FSID | \
@@ -1983,31 +1972,26 @@ nfsd4_encode_aclname(struct svc_rqst *rqstp, struct nfs4_ace *ace,

#ifdef CONFIG_NFSD_V4_SECURITY_LABEL
static inline __be32
-nfsd4_encode_security_label(struct svc_rqst *rqstp, void *context, int len, __be32 **pp, int *buflen)
+nfsd4_encode_security_label(struct xdr_stream *xdr, struct svc_rqst *rqstp, void *context, int len)
{
__be32 *p = *pp;

- if (*buflen < ((XDR_QUADLEN(len) << 2) + 4 + 4 + 4))
+ p = xdr_reserve_space(xdr, len + 4 + 4 + 4);
+ if (!p)
return nfserr_resource;

/*
* For now we use a 0 here to indicate the null translation; in
* the future we may place a call to translation code here.
*/
- if ((*buflen -= 8) < 0)
- return nfserr_resource;
-
WRITE32(0); /* lfs */
WRITE32(0); /* pi */
p = xdr_encode_opaque(p, context, len);
- *buflen -= (XDR_QUADLEN(len) << 2) + 4;
-
- *pp = p;
return 0;
}
#else
static inline __be32
-nfsd4_encode_security_label(struct svc_rqst *rqstp, void *context, int len, __be32 **pp, int *buflen)
+nfsd4_encode_security_label(struct xdr_stream *xdr, struct svc_rqst *rqstp, void *context, int len)
{ return 0; }
#endif

@@ -2058,8 +2042,8 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
struct kstat stat;
struct svc_fh *tempfh = NULL;
struct kstatfs statfs;
- __be32 *p = xdr->p;
- int buflen = xdr->buf->buflen;
+ __be32 *p;
+ __be32 *start = xdr->p;
__be32 *attrlenp;
u32 dummy;
u64 dummy64;
@@ -2144,24 +2128,30 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
#endif /* CONFIG_NFSD_V4_SECURITY_LABEL */

if (bmval2) {
- if ((buflen -= 16) < 0)
+ p = xdr_reserve_space(xdr, 16);
+ if (!p)
goto out_resource;
WRITE32(3);
WRITE32(bmval0);
WRITE32(bmval1);
WRITE32(bmval2);
} else if (bmval1) {
- if ((buflen -= 12) < 0)
+ p = xdr_reserve_space(xdr, 12);
+ if (!p)
goto out_resource;
WRITE32(2);
WRITE32(bmval0);
WRITE32(bmval1);
} else {
- if ((buflen -= 8) < 0)
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
goto out_resource;
WRITE32(1);
WRITE32(bmval0);
}
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
+ goto out_resource;
attrlenp = p++; /* to be backfilled later */

if (bmval0 & FATTR4_WORD0_SUPPORTED_ATTRS) {
@@ -2174,13 +2164,15 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
if (!contextsupport)
word2 &= ~FATTR4_WORD2_SECURITY_LABEL;
if (!word2) {
- if ((buflen -= 12) < 0)
+ p = xdr_reserve_space(xdr, 12);
+ if (!p)
goto out_resource;
WRITE32(2);
WRITE32(word0);
WRITE32(word1);
} else {
- if ((buflen -= 16) < 0)
+ p = xdr_reserve_space(xdr, 16);
+ if (!p)
goto out_resource;
WRITE32(3);
WRITE32(word0);
@@ -2189,7 +2181,8 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
}
}
if (bmval0 & FATTR4_WORD0_TYPE) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
dummy = nfs4_file_type(stat.mode);
if (dummy == NF4BAD) {
@@ -2199,7 +2192,8 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
WRITE32(dummy);
}
if (bmval0 & FATTR4_WORD0_FH_EXPIRE_TYPE) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
if (exp->ex_flags & NFSEXP_NOSUBTREECHECK)
WRITE32(NFS4_FH_PERSISTENT);
@@ -2207,32 +2201,38 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
WRITE32(NFS4_FH_PERSISTENT|NFS4_FH_VOL_RENAME);
}
if (bmval0 & FATTR4_WORD0_CHANGE) {
- if ((buflen -= 8) < 0)
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
goto out_resource;
write_change(&p, &stat, dentry->d_inode);
}
if (bmval0 & FATTR4_WORD0_SIZE) {
- if ((buflen -= 8) < 0)
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
goto out_resource;
WRITE64(stat.size);
}
if (bmval0 & FATTR4_WORD0_LINK_SUPPORT) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
WRITE32(1);
}
if (bmval0 & FATTR4_WORD0_SYMLINK_SUPPORT) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
WRITE32(1);
}
if (bmval0 & FATTR4_WORD0_NAMED_ATTR) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
WRITE32(0);
}
if (bmval0 & FATTR4_WORD0_FSID) {
- if ((buflen -= 16) < 0)
+ p = xdr_reserve_space(xdr, 16);
+ if (!p)
goto out_resource;
if (exp->ex_fslocs.migrated) {
WRITE64(NFS4_REFERRAL_FSID_MAJOR);
@@ -2254,17 +2254,20 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
}
}
if (bmval0 & FATTR4_WORD0_UNIQUE_HANDLES) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
WRITE32(0);
}
if (bmval0 & FATTR4_WORD0_LEASE_TIME) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
WRITE32(nn->nfsd4_lease);
}
if (bmval0 & FATTR4_WORD0_RDATTR_ERROR) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
WRITE32(rdattr_err);
}
@@ -2272,198 +2275,229 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
struct nfs4_ace *ace;

if (acl == NULL) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;

WRITE32(0);
goto out_acl;
}
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
WRITE32(acl->naces);

for (ace = acl->aces; ace < acl->aces + acl->naces; ace++) {
- if ((buflen -= 4*3) < 0)
+ p = xdr_reserve_space(xdr, 4*3);
+ if (!p)
goto out_resource;
WRITE32(ace->type);
WRITE32(ace->flag);
WRITE32(ace->access_mask & NFS4_ACE_MASK_ALL);
- status = nfsd4_encode_aclname(rqstp, ace, &p, &buflen);
+ status = nfsd4_encode_aclname(xdr, rqstp, ace);
if (status)
goto out;
}
}
out_acl:
if (bmval0 & FATTR4_WORD0_ACLSUPPORT) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
WRITE32(aclsupport ?
ACL4_SUPPORT_ALLOW_ACL|ACL4_SUPPORT_DENY_ACL : 0);
}
if (bmval0 & FATTR4_WORD0_CANSETTIME) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
WRITE32(1);
}
if (bmval0 & FATTR4_WORD0_CASE_INSENSITIVE) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
WRITE32(0);
}
if (bmval0 & FATTR4_WORD0_CASE_PRESERVING) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
WRITE32(1);
}
if (bmval0 & FATTR4_WORD0_CHOWN_RESTRICTED) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
WRITE32(1);
}
if (bmval0 & FATTR4_WORD0_FILEHANDLE) {
- buflen -= (XDR_QUADLEN(fhp->fh_handle.fh_size) << 2) + 4;
- if (buflen < 0)
+ p = xdr_reserve_space(xdr, fhp->fh_handle.fh_size + 4);
+ if (!p)
goto out_resource;
WRITE32(fhp->fh_handle.fh_size);
WRITEMEM(&fhp->fh_handle.fh_base, fhp->fh_handle.fh_size);
}
if (bmval0 & FATTR4_WORD0_FILEID) {
- if ((buflen -= 8) < 0)
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
goto out_resource;
WRITE64(stat.ino);
}
if (bmval0 & FATTR4_WORD0_FILES_AVAIL) {
- if ((buflen -= 8) < 0)
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
goto out_resource;
WRITE64((u64) statfs.f_ffree);
}
if (bmval0 & FATTR4_WORD0_FILES_FREE) {
- if ((buflen -= 8) < 0)
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
goto out_resource;
WRITE64((u64) statfs.f_ffree);
}
if (bmval0 & FATTR4_WORD0_FILES_TOTAL) {
- if ((buflen -= 8) < 0)
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
goto out_resource;
WRITE64((u64) statfs.f_files);
}
if (bmval0 & FATTR4_WORD0_FS_LOCATIONS) {
- status = nfsd4_encode_fs_locations(rqstp, exp, &p, &buflen);
+ status = nfsd4_encode_fs_locations(xdr, rqstp, exp);
if (status)
goto out;
}
if (bmval0 & FATTR4_WORD0_HOMOGENEOUS) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
WRITE32(1);
}
if (bmval0 & FATTR4_WORD0_MAXFILESIZE) {
- if ((buflen -= 8) < 0)
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
goto out_resource;
WRITE64(exp->ex_path.mnt->mnt_sb->s_maxbytes);
}
if (bmval0 & FATTR4_WORD0_MAXLINK) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
WRITE32(255);
}
if (bmval0 & FATTR4_WORD0_MAXNAME) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
WRITE32(statfs.f_namelen);
}
if (bmval0 & FATTR4_WORD0_MAXREAD) {
- if ((buflen -= 8) < 0)
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
goto out_resource;
WRITE64((u64) svc_max_payload(rqstp));
}
if (bmval0 & FATTR4_WORD0_MAXWRITE) {
- if ((buflen -= 8) < 0)
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
goto out_resource;
WRITE64((u64) svc_max_payload(rqstp));
}
if (bmval1 & FATTR4_WORD1_MODE) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
WRITE32(stat.mode & S_IALLUGO);
}
if (bmval1 & FATTR4_WORD1_NO_TRUNC) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
WRITE32(1);
}
if (bmval1 & FATTR4_WORD1_NUMLINKS) {
- if ((buflen -= 4) < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto out_resource;
WRITE32(stat.nlink);
}
if (bmval1 & FATTR4_WORD1_OWNER) {
- status = nfsd4_encode_user(rqstp, stat.uid, &p, &buflen);
+ status = nfsd4_encode_user(xdr, rqstp, stat.uid);
if (status)
goto out;
}
if (bmval1 & FATTR4_WORD1_OWNER_GROUP) {
- status = nfsd4_encode_group(rqstp, stat.gid, &p, &buflen);
+ status = nfsd4_encode_group(xdr, rqstp, stat.gid);
if (status)
goto out;
}
if (bmval1 & FATTR4_WORD1_RAWDEV) {
- if ((buflen -= 8) < 0)
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
goto out_resource;
WRITE32((u32) MAJOR(stat.rdev));
WRITE32((u32) MINOR(stat.rdev));
}
if (bmval1 & FATTR4_WORD1_SPACE_AVAIL) {
- if ((buflen -= 8) < 0)
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
goto out_resource;
dummy64 = (u64)statfs.f_bavail * (u64)statfs.f_bsize;
WRITE64(dummy64);
}
if (bmval1 & FATTR4_WORD1_SPACE_FREE) {
- if ((buflen -= 8) < 0)
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
goto out_resource;
dummy64 = (u64)statfs.f_bfree * (u64)statfs.f_bsize;
WRITE64(dummy64);
}
if (bmval1 & FATTR4_WORD1_SPACE_TOTAL) {
- if ((buflen -= 8) < 0)
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
goto out_resource;
dummy64 = (u64)statfs.f_blocks * (u64)statfs.f_bsize;
WRITE64(dummy64);
}
if (bmval1 & FATTR4_WORD1_SPACE_USED) {
- if ((buflen -= 8) < 0)
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
goto out_resource;
dummy64 = (u64)stat.blocks << 9;
WRITE64(dummy64);
}
if (bmval1 & FATTR4_WORD1_TIME_ACCESS) {
- if ((buflen -= 12) < 0)
+ p = xdr_reserve_space(xdr, 12);
+ if (!p)
goto out_resource;
WRITE64((s64)stat.atime.tv_sec);
WRITE32(stat.atime.tv_nsec);
}
if (bmval1 & FATTR4_WORD1_TIME_DELTA) {
- if ((buflen -= 12) < 0)
+ p = xdr_reserve_space(xdr, 12);
+ if (!p)
goto out_resource;
WRITE32(0);
WRITE32(1);
WRITE32(0);
}
if (bmval1 & FATTR4_WORD1_TIME_METADATA) {
- if ((buflen -= 12) < 0)
+ p = xdr_reserve_space(xdr, 12);
+ if (!p)
goto out_resource;
WRITE64((s64)stat.ctime.tv_sec);
WRITE32(stat.ctime.tv_nsec);
}
if (bmval1 & FATTR4_WORD1_TIME_MODIFY) {
- if ((buflen -= 12) < 0)
+ p = xdr_reserve_space(xdr, 12);
+ if (!p)
goto out_resource;
WRITE64((s64)stat.mtime.tv_sec);
WRITE32(stat.mtime.tv_nsec);
}
if (bmval1 & FATTR4_WORD1_MOUNTED_ON_FILEID) {
- if ((buflen -= 8) < 0)
+ p = xdr_reserve_space(xdr, 8);
+ if (!p)
goto out_resource;
/*
* Get parent's attributes if not ignoring crossmount
@@ -2475,13 +2509,13 @@ out_acl:
WRITE64(stat.ino);
}
if (bmval2 & FATTR4_WORD2_SECURITY_LABEL) {
- status = nfsd4_encode_security_label(rqstp, context,
- contextlen, &p, &buflen);
+ status = nfsd4_encode_security_label(xdr, rqstp, context, contextlen);
if (status)
goto out;
}
if (bmval2 & FATTR4_WORD2_SUPPATTR_EXCLCREAT) {
- if ((buflen -= 16) < 0)
+ p = xdr_reserve_space(xdr, 16);
+ if (!p)
goto out_resource;
WRITE32(3);
WRITE32(NFSD_SUPPATTR_EXCLCREAT_WORD0);
@@ -2489,8 +2523,7 @@ out_acl:
WRITE32(NFSD_SUPPATTR_EXCLCREAT_WORD2);
}

- *attrlenp = htonl((char *)p - (char *)attrlenp - 4);
- xdr->p = p;
+ *attrlenp = htonl((char *)xdr->p - (char *)attrlenp - 4);
status = nfs_ok;

out:
@@ -2501,6 +2534,13 @@ out:
kfree(acl);
if (tempfh)
fh_put(tempfh);
+ if (status) {
+ int nbytes = (char *)xdr->p - (char *)start;
+ /* open code what *should* be xdr_truncate(xdr, len); */
+ xdr->iov->iov_len -= nbytes;
+ xdr->buf->len -= nbytes;
+ xdr->p = start;
+ }
return status;
out_nfserr:
status = nfserrno(err);
@@ -2764,13 +2804,10 @@ nfsd4_encode_getattr(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
{
struct svc_fh *fhp = getattr->ga_fhp;
struct xdr_stream *xdr = &resp->xdr;
- struct xdr_buf *buf = resp->xdr.buf;

if (nfserr)
return nfserr;

- buf->buflen = (void *)resp->xdr.end - (void *)resp->xdr.p
- - COMPOUND_ERR_SLACK_SPACE;
nfserr = nfsd4_encode_fattr(xdr, fhp, fhp->fh_export, fhp->fh_dentry,
getattr->ga_bmval,
resp->rqstp, 0);
@@ -2967,6 +3004,7 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
int v;
struct page *page;
unsigned long maxcount;
+ struct xdr_stream *xdr = &resp->xdr;
long len;
__be32 *p;

@@ -3013,6 +3051,7 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
resp->xdr.buf->head[0].iov_len = (char*)p
- (char*)resp->xdr.buf->head[0].iov_base;
resp->xdr.buf->page_len = maxcount;
+ xdr->iov = xdr->buf->tail;

/* Use rest of head for padding and remaining ops: */
resp->xdr.buf->tail[0].iov_base = p;
@@ -3031,6 +3070,7 @@ static __be32
nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_readlink *readlink)
{
int maxcount;
+ struct xdr_stream *xdr = &resp->xdr;
char *page;
__be32 *p;

@@ -3063,6 +3103,7 @@ nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd
resp->xdr.buf->head[0].iov_len = (char*)p
- (char*)resp->xdr.buf->head[0].iov_base;
resp->xdr.buf->page_len = maxcount;
+ xdr->iov = xdr->buf->tail;

/* Use rest of head for padding and remaining ops: */
resp->xdr.buf->tail[0].iov_base = p;
@@ -3082,6 +3123,7 @@ nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
{
int maxcount;
loff_t offset;
+ struct xdr_stream *xdr = &resp->xdr;
__be32 *page, *savep, *tailbase;
__be32 *p;

@@ -3144,6 +3186,8 @@ nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
resp->xdr.buf->page_len = ((char*)p) -
(char*)page_address(*(resp->rqstp->rq_next_page-1));

+ xdr->iov = xdr->buf->tail;
+
/* Use rest of head for padding and remaining ops: */
resp->xdr.buf->tail[0].iov_base = tailbase;
resp->xdr.buf->tail[0].iov_len = 0;
--
1.8.5.3


2014-03-23 01:12:31

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 36/50] nfsd4: allow larger 4.1 session drc slots

From: "J. Bruce Fields" <[email protected]>

The client is actually asking for 2532 bytes. I suspect that's a
mistake. But maybe we can allow some more. In theory lock needs more
if it might return a maximum-length lockowner in the denied case.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/state.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 424d8f5..66db998 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -123,7 +123,7 @@ static inline struct nfs4_delegation *delegstateid(struct nfs4_stid *s)
/* Maximum number of operations per session compound */
#define NFSD_MAX_OPS_PER_COMPOUND 16
/* Maximum session per slot cache size */
-#define NFSD_SLOT_CACHE_SIZE 1024
+#define NFSD_SLOT_CACHE_SIZE 2048
/* Maximum number of NFSD_SLOT_CACHE_SIZE slots per session */
#define NFSD_CACHE_SIZE_SLOTS_PER_SESSION 32
#define NFSD_MAX_MEM_PER_SESSION \
--
1.8.5.3


2014-03-23 01:12:31

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 40/50] nfsd4: allow large readdirs

From: "J. Bruce Fields" <[email protected]>

Currently we limit readdir results to a single page. This can result in
a performance regression compared to NFSv3 when reading large
directories.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4proc.c | 3 --
fs/nfsd/nfs4xdr.c | 134 +++++++++++++++++++++++++++++------------------------
fs/nfsd/xdr4.h | 5 +-
3 files changed, 76 insertions(+), 66 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 9876de2..d1b4513 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1457,9 +1457,6 @@ static inline u32 nfsd4_readdir_rsize(struct svc_rqst *rqstp, struct nfsd4_op *o
{
u32 rlen = op->u.readdir.rd_maxcount;

- if (rlen > PAGE_SIZE)
- rlen = PAGE_SIZE;
-
return (op_encode_hdr_size + op_encode_verifier_maxsz)
* sizeof(__be32) + rlen;
}
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index e0e486d..fa3ae50 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -2575,8 +2575,8 @@ static inline int attributes_need_mount(u32 *bmval)
}

static __be32
-nfsd4_encode_dirent_fattr(struct nfsd4_readdir *cd,
- const char *name, int namlen, __be32 **p, int buflen)
+nfsd4_encode_dirent_fattr(struct xdr_stream *xdr, struct nfsd4_readdir *cd,
+ const char *name, int namlen)
{
struct svc_export *exp = cd->rd_fhp->fh_export;
struct dentry *dentry;
@@ -2628,7 +2628,7 @@ nfsd4_encode_dirent_fattr(struct nfsd4_readdir *cd,

}
out_encode:
- nfserr = nfsd4_encode_fattr_to_buf(p, buflen, NULL, exp, dentry, cd->rd_bmval,
+ nfserr = nfsd4_encode_fattr(xdr, NULL, exp, dentry, cd->rd_bmval,
cd->rd_rqstp, ignore_crossmnt);
out_put:
dput(dentry);
@@ -2637,9 +2637,12 @@ out_put:
}

static __be32 *
-nfsd4_encode_rdattr_error(__be32 *p, int buflen, __be32 nfserr)
+nfsd4_encode_rdattr_error(struct xdr_stream *xdr, __be32 nfserr)
{
- if (buflen < 6)
+ __be32 *p;
+
+ p = xdr_reserve_space(xdr, 6);
+ if (!p)
return NULL;
*p++ = htonl(2);
*p++ = htonl(FATTR4_WORD0_RDATTR_ERROR); /* bmval0 */
@@ -2656,10 +2659,13 @@ nfsd4_encode_dirent(void *ccdv, const char *name, int namlen,
{
struct readdir_cd *ccd = ccdv;
struct nfsd4_readdir *cd = container_of(ccd, struct nfsd4_readdir, common);
- int buflen;
- __be32 *p = cd->buffer;
- __be32 *cookiep;
+ struct xdr_stream *xdr = cd->xdr;
+ int start_offset = xdr->buf->len;
+ int cookie_offset;
+ int entry_bytes;
__be32 nfserr = nfserr_toosmall;
+ __be64 wire_offset;
+ __be32 *p;

/* In nfsv4, "." and ".." never make it onto the wire.. */
if (name && isdotent(name, namlen)) {
@@ -2667,19 +2673,23 @@ nfsd4_encode_dirent(void *ccdv, const char *name, int namlen,
return 0;
}

- if (cd->offset)
- xdr_encode_hyper(cd->offset, (u64) offset);
+ if (cd->cookie_offset) {
+ wire_offset = cpu_to_be64(offset);
+ write_bytes_to_xdr_buf(xdr->buf, cd->cookie_offset, &wire_offset, 8);
+ }

- buflen = cd->buflen - 4 - XDR_QUADLEN(namlen);
- if (buflen < 0)
+ p = xdr_reserve_space(xdr, 4);
+ if (!p)
goto fail;
-
*p++ = xdr_one; /* mark entry present */
- cookiep = p;
+ cookie_offset = xdr->buf->len;
+ p = xdr_reserve_space(xdr, 3*4 + namlen);
+ if (!p)
+ goto fail;
p = xdr_encode_hyper(p, NFS_OFFSET_MAX); /* offset of next entry */
p = xdr_encode_array(p, name, namlen); /* name length & name */

- nfserr = nfsd4_encode_dirent_fattr(cd, name, namlen, &p, buflen);
+ nfserr = nfsd4_encode_dirent_fattr(xdr, cd, name, namlen);
switch (nfserr) {
case nfs_ok:
break;
@@ -2698,19 +2708,23 @@ nfsd4_encode_dirent(void *ccdv, const char *name, int namlen,
*/
if (!(cd->rd_bmval[0] & FATTR4_WORD0_RDATTR_ERROR))
goto fail;
- p = nfsd4_encode_rdattr_error(p, buflen, nfserr);
+ p = nfsd4_encode_rdattr_error(xdr, nfserr);
if (p == NULL) {
nfserr = nfserr_toosmall;
goto fail;
}
}
- cd->buflen -= (p - cd->buffer);
- cd->buffer = p;
- cd->offset = cookiep;
+ nfserr = nfserr_toosmall;
+ entry_bytes = xdr->buf->len - start_offset;
+ if (entry_bytes > cd->rd_maxcount)
+ goto fail;
+ cd->rd_maxcount -= entry_bytes;
+ cd->cookie_offset = cookie_offset;
skip_entry:
cd->common.err = nfs_ok;
return 0;
fail:
+ xdr_truncate_encode(xdr, start_offset);
cd->common.err = nfserr;
return -EINVAL;
}
@@ -3194,10 +3208,11 @@ static __be32
nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_readdir *readdir)
{
int maxcount;
+ int bytes_left;
loff_t offset;
+ __be64 wire_offset;
struct xdr_stream *xdr = &resp->xdr;
int starting_len = xdr->buf->len;
- __be32 *page, *tailbase;
__be32 *p;

if (nfserr)
@@ -3207,38 +3222,38 @@ nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
if (!p)
return nfserr_resource;

- if (resp->xdr.buf->page_len)
- return nfserr_resource;
- if (!*resp->rqstp->rq_next_page)
- return nfserr_resource;
-
/* XXX: Following NFSv3, we ignore the READDIR verifier for now. */
WRITE32(0);
WRITE32(0);
resp->xdr.buf->head[0].iov_len = ((char*)resp->xdr.p)
- (char*)resp->xdr.buf->head[0].iov_base;
- tailbase = p;
-
- maxcount = PAGE_SIZE;
- if (maxcount > readdir->rd_maxcount)
- maxcount = readdir->rd_maxcount;

/*
- * Convert from bytes to words, account for the two words already
- * written, make sure to leave two words at the end for the next
- * pointer and eof field.
+ * Number of bytes left for directory entries allowing for the
+ * final 8 bytes of the readdir and a following failed op:
+ */
+ bytes_left = xdr->buf->buflen - xdr->buf->len
+ - COMPOUND_ERR_SLACK_SPACE - 8;
+ if (bytes_left < 0) {
+ nfserr = nfserr_resource;
+ goto err_no_verf;
+ }
+ maxcount = min_t(u32, readdir->rd_maxcount, INT_MAX);
+ /*
+ * Note the rfc defines rd_maxcount as the size of the
+ * READDIR4resok structure, which includes the verifier above
+ * and the 8 bytes encoded at the end of this function:
*/
- maxcount = (maxcount >> 2) - 4;
- if (maxcount < 0) {
- nfserr = nfserr_toosmall;
+ if (maxcount < 16) {
+ nfserr = nfserr_toosmall;
goto err_no_verf;
}
+ maxcount = min_t(int, maxcount-16, bytes_left);

- page = page_address(*(resp->rqstp->rq_next_page++));
+ readdir->xdr = xdr;
+ readdir->rd_maxcount = maxcount;
readdir->common.err = 0;
- readdir->buflen = maxcount;
- readdir->buffer = page;
- readdir->offset = NULL;
+ readdir->cookie_offset = 0;

offset = readdir->rd_cookie;
nfserr = nfsd_readdir(readdir->rd_rqstp, readdir->rd_fhp,
@@ -3246,32 +3261,31 @@ nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
&readdir->common, nfsd4_encode_dirent);
if (nfserr == nfs_ok &&
readdir->common.err == nfserr_toosmall &&
- readdir->buffer == page)
- nfserr = nfserr_toosmall;
+ xdr->buf->len == starting_len + 8) {
+ /* nothing encoded; which limit did we hit?: */
+ if (maxcount - 16 < bytes_left)
+ /* It was the fault of rd_maxcount: */
+ nfserr = nfserr_toosmall;
+ else
+ /* We ran out of buffer space: */
+ nfserr = nfserr_resource;
+ }
if (nfserr)
goto err_no_verf;

- if (readdir->offset)
- xdr_encode_hyper(readdir->offset, offset);
+ if (readdir->cookie_offset) {
+ wire_offset = cpu_to_be64(offset);
+ write_bytes_to_xdr_buf(xdr->buf, readdir->cookie_offset,
+ &wire_offset, 8);
+ }

- p = readdir->buffer;
+ p = xdr_reserve_space(xdr, 8);
+ if (!p) {
+ WARN_ON_ONCE(1);
+ goto err_no_verf;
+ }
*p++ = 0; /* no more entries */
*p++ = htonl(readdir->common.err == nfserr_eof);
- resp->xdr.buf->page_len = ((char*)p) -
- (char*)page_address(*(resp->rqstp->rq_next_page-1));
- xdr->buf->len += xdr->buf->page_len;
-
- xdr->iov = xdr->buf->tail;
-
- xdr->page_ptr++;
- xdr->buf->buflen -= PAGE_SIZE;
- xdr->iov = xdr->buf->tail;
-
- /* Use rest of head for padding and remaining ops: */
- resp->xdr.buf->tail[0].iov_base = tailbase;
- resp->xdr.buf->tail[0].iov_len = 0;
- resp->xdr.p = resp->xdr.buf->tail[0].iov_base;
- resp->xdr.end = resp->xdr.p + (PAGE_SIZE - resp->xdr.buf->head[0].iov_len)/4;

return 0;
err_no_verf:
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index d1c6e21..04b8a80 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -287,9 +287,8 @@ struct nfsd4_readdir {
struct svc_fh * rd_fhp; /* response */

struct readdir_cd common;
- __be32 * buffer;
- int buflen;
- __be32 * offset;
+ struct xdr_stream *xdr;
+ int cookie_offset;
};

struct nfsd4_release_lockowner {
--
1.8.5.3


2014-03-23 01:12:24

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 02/50] nfsd4: update comments with obsolete function name

From: "J. Bruce Fields" <[email protected]>

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 2 +-
fs/nfsd/nfs4xdr.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 9354c15..f082961 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1542,7 +1542,7 @@ out_err:
}

/*
- * Cache a reply. nfsd4_check_drc_limit() has bounded the cache size.
+ * Cache a reply. nfsd4_check_resp_size() has bounded the cache size.
*/
void
nfsd4_store_cache_entry(struct nfsd4_compoundres *resp)
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index aa04a6a..ebefc42 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3625,7 +3625,7 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
BUG_ON(op->opnum < 0 || op->opnum >= ARRAY_SIZE(nfsd4_enc_ops) ||
!nfsd4_enc_ops[op->opnum]);
op->status = nfsd4_enc_ops[op->opnum](resp, op->status, &op->u);
- /* nfsd4_check_drc_limit guarantees enough room for error status */
+ /* nfsd4_check_resp_size guarantees enough room for error status */
if (!op->status)
op->status = nfsd4_check_resp_size(resp, 0);
if (so) {
--
1.8.5.3


2014-03-23 01:12:29

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 26/50] nfsd4: reserve space before inlining 0-copy pages

From: "J. Bruce Fields" <[email protected]>

Once we've included page-cache pages in the encoding it's difficult to
remove them and restart encoding. (xdr_truncate_encode doesn't handle
that case.) So, make sure we'll have adequate space to finish the
operation first.

For now COMPOUND_SLACK_SPACE checks should prevent this case happening,
but we want to remove those checks.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index ab84545..704167d 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3055,6 +3055,10 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
if (!p)
return nfserr_resource;

+ /* Make sure there will be room for padding if needed: */
+ if (xdr->end - xdr->p < 1)
+ return nfserr_resource;
+
maxcount = svc_max_payload(resp->rqstp);
if (maxcount > read->rd_length)
maxcount = read->rd_length;
@@ -3101,8 +3105,6 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
resp->xdr.buf->tail[0].iov_len = 0;
if (maxcount&3) {
p = xdr_reserve_space(xdr, 4);
- if (!p)
- return nfserr_resource;
WRITE32(0);
resp->xdr.buf->tail[0].iov_base += maxcount&3;
resp->xdr.buf->tail[0].iov_len = 4 - (maxcount&3);
@@ -3135,6 +3137,9 @@ nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd
if (!p)
return nfserr_resource;

+ if (xdr->end - xdr->p < 1)
+ return nfserr_resource;
+
/*
* XXX: By default, the ->readlink() VFS op will truncate symlinks
* if they would overflow the buffer. Is this kosher in NFSv4? If
@@ -3161,8 +3166,6 @@ nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd
resp->xdr.buf->tail[0].iov_len = 0;
if (maxcount&3) {
p = xdr_reserve_space(xdr, 4);
- if (!p)
- return nfserr_resource;
WRITE32(0);
resp->xdr.buf->tail[0].iov_base += maxcount&3;
resp->xdr.buf->tail[0].iov_len = 4 - (maxcount&3);
--
1.8.5.3


2014-03-23 01:12:30

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 28/50] nfsd4: remove redundant encode buffer size checking

From: "J. Bruce Fields" <[email protected]>

Now that all op encoders can handle running out of space, we no longer
need to check the remaining size for every operation; only nonidempotent
operations need that check, and that can be done by
nfsd4_check_resp_size.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4proc.c | 14 --------------
fs/nfsd/nfs4xdr.c | 22 +++++++++++++---------
2 files changed, 13 insertions(+), 23 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index d55546c..570c7e5 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1237,7 +1237,6 @@ nfsd4_proc_compound(struct svc_rqst *rqstp,
struct nfsd4_op *op;
struct nfsd4_operation *opdesc;
struct nfsd4_compound_state *cstate = &resp->cstate;
- int slack_bytes;
u32 plen = 0;
__be32 status;

@@ -1291,19 +1290,6 @@ nfsd4_proc_compound(struct svc_rqst *rqstp,
goto encode_op;
}

- /* We must be able to encode a successful response to
- * this operation, with enough room left over to encode a
- * failed response to the next operation. If we don't
- * have enough room, fail with ERR_RESOURCE.
- */
- slack_bytes = (char *)resp->xdr.end - (char *)resp->xdr.p;
- if (slack_bytes < COMPOUND_SLACK_SPACE
- + COMPOUND_ERR_SLACK_SPACE) {
- BUG_ON(slack_bytes < COMPOUND_ERR_SLACK_SPACE);
- op->status = nfserr_resource;
- goto encode_op;
- }
-
opdesc = OPDESC(op);

if (!cstate->current_fh.fh_dentry) {
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 0d8bbe4..74ddf12 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3730,20 +3730,24 @@ static nfsd4_enc nfsd4_enc_ops[] = {
__be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 pad)
{
struct xdr_buf *buf = &resp->rqstp->rq_res;
- struct nfsd4_session *session = NULL;
+ struct nfsd4_session *session = resp->cstate.session;
struct nfsd4_slot *slot = resp->cstate.slot;
+ int slack_bytes = (char *)resp->xdr.end - (char *)resp->xdr.p;

- if (!nfsd4_has_session(&resp->cstate))
- return 0;
+ if (nfsd4_has_session(&resp->cstate)) {

- session = resp->cstate.session;
+ if (buf->len + pad > session->se_fchannel.maxresp_sz)
+ return nfserr_rep_too_big;

- if (buf->len + pad > session->se_fchannel.maxresp_sz)
- return nfserr_rep_too_big;
+ if ((slot->sl_flags & NFSD4_SLOT_CACHETHIS) &&
+ buf->len + pad > session->se_fchannel.maxresp_cached)
+ return nfserr_rep_too_big_to_cache;
+ }

- if ((slot->sl_flags & NFSD4_SLOT_CACHETHIS) &&
- buf->len + pad > session->se_fchannel.maxresp_cached)
- return nfserr_rep_too_big_to_cache;
+ if (pad > slack_bytes) {
+ WARN_ON_ONCE(nfsd4_has_session(&resp->cstate));
+ return nfserr_resource;
+ }

return 0;
}
--
1.8.5.3


2014-03-23 06:47:38

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 20/50] nfsd4: keep xdr buf length updated

> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index fae7d02..a3dce3c 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -3051,9 +3051,10 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
> WRITE32(eof);
> WRITE32(maxcount);
> ADJUST_ARGS();
> - resp->xdr.buf->head[0].iov_len = (char*)p
> - - (char*)resp->xdr.buf->head[0].iov_base;
> + WARN_ON_ONCE(resp->xdr.buf->head[0].iov_len != (char*)p
> + - (char*)resp->xdr.buf->head[0].iov_base);

There should be spaces between the typename and the *.

> resp->xdr.buf->page_len = maxcount;
> + xdr->buf->len += maxcount;
> xdr->iov = xdr->buf->tail;
>
> /* Use rest of head for padding and remaining ops: */
> @@ -3064,6 +3065,7 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
> WRITE32(0);
> resp->xdr.buf->tail[0].iov_base += maxcount&3;
> resp->xdr.buf->tail[0].iov_len = 4 - (maxcount&3);
> + xdr->buf->len -= (maxcount&3);
> ADJUST_ARGS();
> }
> return 0;
> @@ -3109,6 +3111,7 @@ nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd
> resp->xdr.buf->head[0].iov_len = (char*)p
> - (char*)resp->xdr.buf->head[0].iov_base;
> resp->xdr.buf->page_len = maxcount;
> + xdr->buf->len += maxcount;
> xdr->iov = xdr->buf->tail;

To me it seems all these manpiluations sream for being split into
well-documented little helpers.


2014-03-23 01:12:32

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 41/50] nfsd4: enforce rd_dircount

From: "J. Bruce Fields" <[email protected]>

As long as we're here, let's enforce the protocol's limit on the number
of directory entries to return in a readdir.

I don't think anyone's ever noticed our lack of enforcement, but maybe
there's more of a chance they will now that we allow larger readdirs.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index fa3ae50..b35b7b1 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1033,7 +1033,7 @@ nfsd4_decode_readdir(struct nfsd4_compoundargs *argp, struct nfsd4_readdir *read
READ_BUF(24);
READ64(readdir->rd_cookie);
COPYMEM(readdir->rd_verf.data, sizeof(readdir->rd_verf.data));
- READ32(readdir->rd_dircount); /* just in case you needed a useless field... */
+ READ32(readdir->rd_dircount);
READ32(readdir->rd_maxcount);
if ((status = nfsd4_decode_bitmap(argp, readdir->rd_bmval)))
goto out;
@@ -2719,6 +2719,9 @@ nfsd4_encode_dirent(void *ccdv, const char *name, int namlen,
if (entry_bytes > cd->rd_maxcount)
goto fail;
cd->rd_maxcount -= entry_bytes;
+ if (!cd->rd_dircount)
+ goto fail;
+ cd->rd_dircount--;
cd->cookie_offset = cookie_offset;
skip_entry:
cd->common.err = nfs_ok;
--
1.8.5.3


2014-03-23 01:12:28

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 19/50] nfsd4: no need for encode_compoundres to adjust lengths

From: "J. Bruce Fields" <[email protected]>

xdr_reserve_space should now be calculating the length correctly as we
go, so there's no longer any need to fix it up here.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 3 +++
fs/nfsd/nfs4xdr.c | 8 +-------
2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index d98e282..2910abc 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1605,6 +1605,7 @@ nfsd4_replay_cache_entry(struct nfsd4_compoundres *resp,
struct nfsd4_sequence *seq)
{
struct nfsd4_slot *slot = resp->cstate.slot;
+ struct kvec *head = resp->xdr.iov;
__be32 status;

dprintk("--> %s slot %p\n", __func__, slot);
@@ -1618,6 +1619,8 @@ nfsd4_replay_cache_entry(struct nfsd4_compoundres *resp,

resp->opcnt = slot->sl_opcnt;
resp->xdr.p = resp->cstate.datap + XDR_QUADLEN(slot->sl_datalen);
+ head->iov_len = (void *)resp->xdr.p - head->iov_base;
+ resp->xdr.buf->len = head->iov_len;
status = slot->sl_status;

return status;
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index d665c60..fae7d02 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3818,19 +3818,13 @@ nfs4svc_encode_compoundres(struct svc_rqst *rqstp, __be32 *p, struct nfsd4_compo
* All that remains is to write the tag and operation count...
*/
struct nfsd4_compound_state *cs = &resp->cstate;
- struct kvec *iov;
+
p = resp->tagp;
*p++ = htonl(resp->taglen);
memcpy(p, resp->tag, resp->taglen);
p += XDR_QUADLEN(resp->taglen);
*p++ = htonl(resp->opcnt);

- if (rqstp->rq_res.page_len)
- iov = &rqstp->rq_res.tail[0];
- else
- iov = &rqstp->rq_res.head[0];
- iov->iov_len = ((char*)resp->xdr.p) - (char*)iov->iov_base;
- BUG_ON(iov->iov_len > PAGE_SIZE);
if (nfsd4_has_session(cs)) {
struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
struct nfs4_client *clp = cs->session->se_client;
--
1.8.5.3


2014-03-23 01:12:25

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 05/50] nfsd4: use more generous NFS4_ACL_MAX

From: "J. Bruce Fields" <[email protected]>

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/acl.h | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/acl.h b/fs/nfsd/acl.h
index a812fd1..b481e1f 100644
--- a/fs/nfsd/acl.h
+++ b/fs/nfsd/acl.h
@@ -39,9 +39,13 @@ struct nfs4_acl;
struct svc_fh;
struct svc_rqst;

-/* Maximum ACL we'll accept from client; chosen (somewhat arbitrarily) to
- * fit in a page: */
-#define NFS4_ACL_MAX 170
+/*
+ * Maximum ACL we'll accept from a client; chosen (somewhat
+ * arbitrarily) so that kmalloc'ing the ACL shouldn't require a
+ * high-order allocation. This allows 204 ACEs on x86_64:
+ */
+#define NFS4_ACL_MAX ((PAGE_SIZE - sizeof(struct nfs4_acl)) \
+ / sizeof(struct nfs4_ace))

struct nfs4_acl *nfs4_acl_new(int);
int nfs4_acl_get_whotype(char *, u32);
--
1.8.5.3


2014-03-23 01:12:33

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 49/50] nfsd4: kill WRITEMEM

From: "J. Bruce Fields" <[email protected]>

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 62 +++++++++++++++++++++++++------------------------------
1 file changed, 28 insertions(+), 34 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 6afa0a1..24e9fde 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1700,12 +1700,6 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp)
DECODE_TAIL;
}

-#define WRITEMEM(ptr,nbytes) do { if (nbytes > 0) { \
- *(p + XDR_QUADLEN(nbytes) -1) = 0; \
- memcpy(p, ptr, nbytes); \
- p += XDR_QUADLEN(nbytes); \
-}} while (0)
-
static void write32(__be32 **p, u32 n)
{
*(*p)++ = htonl(n);
@@ -1784,8 +1778,7 @@ static __be32 nfsd4_encode_components_esc(struct xdr_stream *xdr, char sep, char
p = xdr_reserve_space(xdr, strlen + 4);
if (!p)
return nfserr_resource;
- *p++ = cpu_to_be32(strlen);
- WRITEMEM(str, strlen);
+ p = xdr_encode_opaque(p, str, strlen);
count++;
}
else
@@ -1876,8 +1869,7 @@ static __be32 nfsd4_encode_path(struct xdr_stream *xdr, const struct path *root,
spin_unlock(&dentry->d_lock);
goto out_free;
}
- *p++ = cpu_to_be32(len);
- WRITEMEM(dentry->d_name.name, len);
+ p = xdr_encode_opaque(p, dentry->d_name.name, len);
dprintk("/%s", dentry->d_name.name);
spin_unlock(&dentry->d_lock);
dput(dentry);
@@ -2243,7 +2235,7 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export
*p++ = cpu_to_be32(MINOR(stat.dev));
break;
case FSIDSOURCE_UUID:
- WRITEMEM(exp->ex_uuid, 16);
+ p = xdr_encode_opaque_fixed(p, exp->ex_uuid, 16);
break;
}
}
@@ -2329,8 +2321,8 @@ out_acl:
p = xdr_reserve_space(xdr, fhp->fh_handle.fh_size + 4);
if (!p)
goto out_resource;
- *p++ = cpu_to_be32(fhp->fh_handle.fh_size);
- WRITEMEM(&fhp->fh_handle.fh_base, fhp->fh_handle.fh_size);
+ p = xdr_encode_opaque(p, &fhp->fh_handle.fh_base,
+ fhp->fh_handle.fh_size);
}
if (bmval0 & FATTR4_WORD0_FILEID) {
p = xdr_reserve_space(xdr, 8);
@@ -2745,7 +2737,8 @@ nfsd4_encode_stateid(struct xdr_stream *xdr, stateid_t *sid)
if (!p)
return nfserr_resource;
*p++ = cpu_to_be32(sid->si_generation);
- WRITEMEM(&sid->si_opaque, sizeof(stateid_opaque_t));
+ p = xdr_encode_opaque_fixed(p, &sid->si_opaque,
+ sizeof(stateid_opaque_t));
return 0;
}

@@ -2774,7 +2767,8 @@ static __be32 nfsd4_encode_bind_conn_to_session(struct nfsd4_compoundres *resp,
p = xdr_reserve_space(xdr, NFS4_MAX_SESSIONID_LEN + 8);
if (!p)
return nfserr_resource;
- WRITEMEM(bcts->sessionid.data, NFS4_MAX_SESSIONID_LEN);
+ p = xdr_encode_opaque_fixed(p, bcts->sessionid.data,
+ NFS4_MAX_SESSIONID_LEN);
*p++ = cpu_to_be32(bcts->dir);
/* Sorry, we do not yet support RDMA over 4.1: */
*p++ = cpu_to_be32(0);
@@ -2804,7 +2798,8 @@ nfsd4_encode_commit(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_
p = xdr_reserve_space(xdr, NFS4_VERIFIER_SIZE);
if (!p)
return nfserr_resource;
- WRITEMEM(commit->co_verf.data, NFS4_VERIFIER_SIZE);
+ p = xdr_encode_opaque_fixed(p, commit->co_verf.data,
+ NFS4_VERIFIER_SIZE);
}
return nfserr;
}
@@ -2855,8 +2850,7 @@ nfsd4_encode_getfh(struct nfsd4_compoundres *resp, __be32 nfserr, struct svc_fh
p = xdr_reserve_space(xdr, len + 4);
if (!p)
return nfserr_resource;
- *p++ = cpu_to_be32(len);
- WRITEMEM(&fhp->fh_handle.fh_base, len);
+ p = xdr_encode_opaque(p, &fhp->fh_handle.fh_base, len);
}
return nfserr;
}
@@ -2889,9 +2883,8 @@ again:
p = xdr_encode_hyper(p, ld->ld_length);
*p++ = cpu_to_be32(ld->ld_type);
if (conf->len) {
- WRITEMEM(&ld->ld_clientid, 8);
- *p++ = cpu_to_be32(conf->len);
- WRITEMEM(conf->data, conf->len);
+ p = xdr_encode_opaque_fixed(p, &ld->ld_clientid, 8);
+ p = xdr_encode_opaque(p, conf->data, conf->len);
} else { /* non - nfsv4 lock in conflict, no clientid nor owner */
p = xdr_encode_hyper(p, (u64)0); /* clientid */
*p++ = cpu_to_be32(0); /* length of owner name */
@@ -3449,8 +3442,7 @@ nfsd4_do_encode_secinfo(struct xdr_stream *xdr,
if (!p)
goto out;
*p++ = cpu_to_be32(RPC_AUTH_GSS);
- *p++ = cpu_to_be32(info.oid.len);
- WRITEMEM(info.oid.data, info.oid.len);
+ p = xdr_encode_opaque(p, info.oid.data, info.oid.len);
*p++ = cpu_to_be32(info.qop);
*p++ = cpu_to_be32(info.service);
} else if (pf < RPC_AUTH_MAXFLAVOR) {
@@ -3532,8 +3524,9 @@ nfsd4_encode_setclientid(struct nfsd4_compoundres *resp, __be32 nfserr, struct n
p = xdr_reserve_space(xdr, 8 + NFS4_VERIFIER_SIZE);
if (!p)
return nfserr_resource;
- WRITEMEM(&scd->se_clientid, 8);
- WRITEMEM(&scd->se_confirm, NFS4_VERIFIER_SIZE);
+ p = xdr_encode_opaque_fixed(p, &scd->se_clientid, 8);
+ p = xdr_encode_opaque_fixed(p, &scd->se_confirm,
+ NFS4_VERIFIER_SIZE);
}
else if (nfserr == nfserr_clid_inuse) {
p = xdr_reserve_space(xdr, 8);
@@ -3557,7 +3550,8 @@ nfsd4_encode_write(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_w
return nfserr_resource;
*p++ = cpu_to_be32(write->wr_bytes_written);
*p++ = cpu_to_be32(write->wr_how_written);
- WRITEMEM(write->wr_verifier.data, NFS4_VERIFIER_SIZE);
+ p = xdr_encode_opaque_fixed(p, write->wr_verifier.data,
+ NFS4_VERIFIER_SIZE);
}
return nfserr;
}
@@ -3598,7 +3592,7 @@ nfsd4_encode_exchange_id(struct nfsd4_compoundres *resp, __be32 nfserr,
if (!p)
return nfserr_resource;

- WRITEMEM(&exid->clientid, 8);
+ p = xdr_encode_opaque_fixed(p, &exid->clientid, 8);
*p++ = cpu_to_be32(exid->seqid);
*p++ = cpu_to_be32(exid->flags);

@@ -3638,12 +3632,10 @@ nfsd4_encode_exchange_id(struct nfsd4_compoundres *resp, __be32 nfserr,
/* The server_owner struct */
p = xdr_encode_hyper(p, minor_id); /* Minor id */
/* major id */
- *p++ = cpu_to_be32(major_id_sz);
- WRITEMEM(major_id, major_id_sz);
+ p = xdr_encode_opaque(p, major_id, major_id_sz);

/* Server scope */
- *p++ = cpu_to_be32(server_scope_sz);
- WRITEMEM(server_scope, server_scope_sz);
+ p = xdr_encode_opaque(p, server_scope, server_scope_sz);

/* Implementation id */
*p++ = cpu_to_be32(0); /* zero length nfs_impl_id4 array */
@@ -3663,7 +3655,8 @@ nfsd4_encode_create_session(struct nfsd4_compoundres *resp, __be32 nfserr,
p = xdr_reserve_space(xdr, 24);
if (!p)
return nfserr_resource;
- WRITEMEM(sess->sessionid.data, NFS4_MAX_SESSIONID_LEN);
+ p = xdr_encode_opaque_fixed(p, sess->sessionid.data,
+ NFS4_MAX_SESSIONID_LEN);
*p++ = cpu_to_be32(sess->seqid);
*p++ = cpu_to_be32(sess->flags);

@@ -3718,7 +3711,8 @@ nfsd4_encode_sequence(struct nfsd4_compoundres *resp, __be32 nfserr,
p = xdr_reserve_space(xdr, NFS4_MAX_SESSIONID_LEN + 20);
if (!p)
return nfserr_resource;
- WRITEMEM(seq->sessionid.data, NFS4_MAX_SESSIONID_LEN);
+ p = xdr_encode_opaque_fixed(p, seq->sessionid.data,
+ NFS4_MAX_SESSIONID_LEN);
*p++ = cpu_to_be32(seq->seqid);
*p++ = cpu_to_be32(seq->slotid);
/* Note slotid's are numbered from zero: */
@@ -3952,7 +3946,7 @@ nfsd4_encode_replay(struct xdr_stream *xdr, struct nfsd4_op *op)
*p++ = cpu_to_be32(op->opnum);
*p++ = rp->rp_status; /* already xdr'ed */

- WRITEMEM(rp->rp_buf, rp->rp_buflen);
+ p = xdr_encode_opaque_fixed(p, rp->rp_buf, rp->rp_buflen);
}

int
--
1.8.5.3


2014-03-23 01:12:28

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 20/50] nfsd4: keep xdr buf length updated

From: "J. Bruce Fields" <[email protected]>

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4proc.c | 2 ++
fs/nfsd/nfs4xdr.c | 12 ++++++++++--
2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index e5cc711..8360def 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1222,6 +1222,8 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp, struct nfsd4_compoundres
xdr->iov = head;
xdr->p = head->iov_base + head->iov_len;
xdr->end = head->iov_base + PAGE_SIZE - 2 * RPC_MAX_AUTH_SIZE;
+ /* Tail and page_len should be zero at this point: */
+ buf->len = buf->head[0].iov_len;
}

/*
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index fae7d02..a3dce3c 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3051,9 +3051,10 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
WRITE32(eof);
WRITE32(maxcount);
ADJUST_ARGS();
- resp->xdr.buf->head[0].iov_len = (char*)p
- - (char*)resp->xdr.buf->head[0].iov_base;
+ WARN_ON_ONCE(resp->xdr.buf->head[0].iov_len != (char*)p
+ - (char*)resp->xdr.buf->head[0].iov_base);
resp->xdr.buf->page_len = maxcount;
+ xdr->buf->len += maxcount;
xdr->iov = xdr->buf->tail;

/* Use rest of head for padding and remaining ops: */
@@ -3064,6 +3065,7 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
WRITE32(0);
resp->xdr.buf->tail[0].iov_base += maxcount&3;
resp->xdr.buf->tail[0].iov_len = 4 - (maxcount&3);
+ xdr->buf->len -= (maxcount&3);
ADJUST_ARGS();
}
return 0;
@@ -3109,6 +3111,7 @@ nfsd4_encode_readlink(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd
resp->xdr.buf->head[0].iov_len = (char*)p
- (char*)resp->xdr.buf->head[0].iov_base;
resp->xdr.buf->page_len = maxcount;
+ xdr->buf->len += maxcount;
xdr->iov = xdr->buf->tail;

/* Use rest of head for padding and remaining ops: */
@@ -3191,6 +3194,7 @@ nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
*p++ = htonl(readdir->common.err == nfserr_eof);
resp->xdr.buf->page_len = ((char*)p) -
(char*)page_address(*(resp->rqstp->rq_next_page-1));
+ xdr->buf->len += xdr->buf->page_len;

xdr->iov = xdr->buf->tail;

@@ -3818,6 +3822,10 @@ nfs4svc_encode_compoundres(struct svc_rqst *rqstp, __be32 *p, struct nfsd4_compo
* All that remains is to write the tag and operation count...
*/
struct nfsd4_compound_state *cs = &resp->cstate;
+ struct xdr_buf *buf = resp->xdr.buf;
+
+ WARN_ON_ONCE(buf->len != buf->head[0].iov_len + buf->page_len +
+ buf->tail[0].iov_len);

p = resp->tagp;
*p++ = htonl(resp->taglen);
--
1.8.5.3


2014-03-23 01:12:26

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 10/50] nfsd4: tweak nfsd4_encode_getattr to take xdr_stream

From: "J. Bruce Fields" <[email protected]>

Just change the nfsd4_encode_getattr api. Not changing any code or
adding any new functionality yet.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4proc.c | 6 +++---
fs/nfsd/nfs4xdr.c | 44 ++++++++++++++++++++++++++++++++------------
fs/nfsd/xdr4.h | 7 ++++---
3 files changed, 39 insertions(+), 18 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 27b8268..7248a78 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1072,10 +1072,10 @@ _nfsd4_verify(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
return nfserr_jukebox;

p = buf;
- status = nfsd4_encode_fattr(&cstate->current_fh,
+ status = nfsd4_encode_fattr_to_buf(&p, count, &cstate->current_fh,
cstate->current_fh.fh_export,
- cstate->current_fh.fh_dentry, &p,
- count, verify->ve_bmval,
+ cstate->current_fh.fh_dentry,
+ verify->ve_bmval,
rqstp, 0);
/*
* If nfsd4_encode_fattr() ran out of space, assume that's because
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 484a8aa..2ad423d 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -2046,12 +2046,10 @@ static int get_parent_attributes(struct svc_export *exp, struct kstat *stat)
/*
* Note: @fhp can be NULL; in this case, we might have to compose the filehandle
* ourselves.
- *
- * countp is the buffer size in _words_
*/
__be32
-nfsd4_encode_fattr(struct svc_fh *fhp, struct svc_export *exp,
- struct dentry *dentry, __be32 **buffer, int count, u32 *bmval,
+nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct svc_export *exp,
+ struct dentry *dentry, u32 *bmval,
struct svc_rqst *rqstp, int ignore_crossmnt)
{
u32 bmval0 = bmval[0];
@@ -2060,12 +2058,12 @@ nfsd4_encode_fattr(struct svc_fh *fhp, struct svc_export *exp,
struct kstat stat;
struct svc_fh *tempfh = NULL;
struct kstatfs statfs;
- int buflen = count << 2;
+ __be32 *p = xdr->p;
+ int buflen = xdr->buf->buflen;
__be32 *attrlenp;
u32 dummy;
u64 dummy64;
u32 rdattr_err = 0;
- __be32 *p = *buffer;
__be32 status;
int err;
int aclsupport = 0;
@@ -2492,7 +2490,7 @@ out_acl:
}

*attrlenp = htonl((char *)p - (char *)attrlenp - 4);
- *buffer = p;
+ xdr->p = p;
status = nfs_ok;

out:
@@ -2512,6 +2510,26 @@ out_resource:
goto out;
}

+__be32 nfsd4_encode_fattr_to_buf(__be32 **p, int words,
+ struct svc_fh *fhp, struct svc_export *exp,
+ struct dentry *dentry, u32 *bmval,
+ struct svc_rqst *rqstp, int ignore_crossmnt)
+{
+ struct xdr_buf dummy = {
+ .head[0] = {
+ .iov_base = *p,
+ },
+ .buflen = words << 2,
+ };
+ struct xdr_stream xdr;
+ __be32 ret;
+
+ xdr_init_encode(&xdr, &dummy, NULL);
+ ret = nfsd4_encode_fattr(&xdr, fhp, exp, dentry, bmval, rqstp, ignore_crossmnt);
+ *p = xdr.p;
+ return ret;
+}
+
static inline int attributes_need_mount(u32 *bmval)
{
if (bmval[0] & ~(FATTR4_WORD0_RDATTR_ERROR | FATTR4_WORD0_LEASE_TIME))
@@ -2575,7 +2593,7 @@ nfsd4_encode_dirent_fattr(struct nfsd4_readdir *cd,

}
out_encode:
- nfserr = nfsd4_encode_fattr(NULL, exp, dentry, p, buflen, cd->rd_bmval,
+ nfserr = nfsd4_encode_fattr_to_buf(p, buflen, NULL, exp, dentry, cd->rd_bmval,
cd->rd_rqstp, ignore_crossmnt);
out_put:
dput(dentry);
@@ -2745,14 +2763,16 @@ static __be32
nfsd4_encode_getattr(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_getattr *getattr)
{
struct svc_fh *fhp = getattr->ga_fhp;
- int buflen;
+ struct xdr_stream *xdr = &resp->xdr;
+ struct xdr_buf *buf = resp->xdr.buf;

if (nfserr)
return nfserr;

- buflen = resp->xdr.end - resp->xdr.p - (COMPOUND_ERR_SLACK_SPACE >> 2);
- nfserr = nfsd4_encode_fattr(fhp, fhp->fh_export, fhp->fh_dentry,
- &resp->xdr.p, buflen, getattr->ga_bmval,
+ buf->buflen = (void *)resp->xdr.end - (void *)resp->xdr.p
+ - COMPOUND_ERR_SLACK_SPACE;
+ nfserr = nfsd4_encode_fattr(xdr, fhp, fhp->fh_export, fhp->fh_dentry,
+ getattr->ga_bmval,
resp->rqstp, 0);
return nfserr;
}
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 6884d70..f62a055 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -562,9 +562,10 @@ int nfs4svc_encode_compoundres(struct svc_rqst *, __be32 *,
__be32 nfsd4_check_resp_size(struct nfsd4_compoundres *, u32);
void nfsd4_encode_operation(struct nfsd4_compoundres *, struct nfsd4_op *);
void nfsd4_encode_replay(struct nfsd4_compoundres *resp, struct nfsd4_op *op);
-__be32 nfsd4_encode_fattr(struct svc_fh *fhp, struct svc_export *exp,
- struct dentry *dentry, __be32 **buffer, int countp,
- u32 *bmval, struct svc_rqst *, int ignore_crossmnt);
+__be32 nfsd4_encode_fattr_to_buf(__be32 **p, int words,
+ struct svc_fh *fhp, struct svc_export *exp,
+ struct dentry *dentry,
+ u32 *bmval, struct svc_rqst *, int ignore_crossmnt);
extern __be32 nfsd4_setclientid(struct svc_rqst *rqstp,
struct nfsd4_compound_state *,
struct nfsd4_setclientid *setclid);
--
1.8.5.3


2014-04-05 00:20:15

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 22/50] nfsd4: use xdr_truncate_encode

On Tue, Mar 25, 2014 at 08:36:32AM -0700, Christoph Hellwig wrote:
> On Sun, Mar 23, 2014 at 11:07:22AM -0400, J. Bruce Fields wrote:
> > In the splice case read increments page_len as it goes. This leaves the
> > xdr_buf in somewhat of an inconsistent state and may confuse a later
> > xdr_truncate_encode.
> >
> > It's a little ugly, I'm not sure what to do about it. Currently I'm
> > thinking of just telling nfsd's splice callback to leave page_len alone.
> > That'd mean taking another look at the splice callback and at the v2/v3
> > read code.
>
> How a about just adding a comment explaing this for now?

OK, done.--b.