2017-08-18 15:12:12

by Chuck Lever

[permalink] [raw]
Subject: [PATCH v1 0/3] Handle NFSv4 operations in xdr_buf tail

Hi Bruce-

A little light weekend reading. I got around to implementing
support in the NFSv4 WRITE decoder for dealing with trailing
operations that reside in the receiver buffer's tail iovec.
Let me know what you think of this solution.

---

Chuck Lever (3):
nfsd: Limit end of page list when decoding NFSv4 WRITE
nfsd: Incoming xdr_bufs may have content in tail buffer
svcrdma: Populate tail iovec when receiving


fs/nfsd/nfs4xdr.c | 26 ++++++++-
fs/nfsd/xdr4.h | 1
net/sunrpc/xprtrdma/svc_rdma_rw.c | 108 +++++--------------------------------
3 files changed, 38 insertions(+), 97 deletions(-)

--
Chuck Lever


2017-08-18 15:12:20

by Chuck Lever

[permalink] [raw]
Subject: [PATCH v1 1/3] nfsd: Limit end of page list when decoding NFSv4 WRITE

When processing an NFSv4 WRITE operation, argp->end should never
point past the end of the data in the final page of the page list.
Otherwise, nfsd4_decode_compound can walk into uninitialized memory.

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs4xdr.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 51e729a..7c48d68 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -144,7 +144,7 @@ static void next_decode_page(struct nfsd4_compoundargs *argp)
argp->p = page_address(argp->pagelist[0]);
argp->pagelist++;
if (argp->pagelen < PAGE_SIZE) {
- argp->end = argp->p + (argp->pagelen>>2);
+ argp->end = argp->p + XDR_QUADLEN(argp->pagelen);
argp->pagelen = 0;
} else {
argp->end = argp->p + (PAGE_SIZE>>2);
@@ -1279,9 +1279,7 @@ static __be32 nfsd4_decode_opaque(struct nfsd4_compoundargs *argp, struct xdr_ne
argp->pagelen -= pages * PAGE_SIZE;
len -= pages * PAGE_SIZE;

- argp->p = (__be32 *)page_address(argp->pagelist[0]);
- argp->pagelist++;
- argp->end = argp->p + XDR_QUADLEN(PAGE_SIZE);
+ next_decode_page(argp);
}
argp->p += XDR_QUADLEN(len);



2017-08-18 15:12:29

by Chuck Lever

[permalink] [raw]
Subject: [PATCH v1 2/3] nfsd: Incoming xdr_bufs may have content in tail buffer

Since the beginning, svcsock has built a received RPC Call message
by populating the xdr_buf's head, then placing the remaining
message bytes in the xdr_buf's page list. The xdr_buf's tail is
never populated.

This means that an NFSv4 COMPOUND containing an NFS WRITE operation
plus trailing operations has a page list that contains the WRITE
data payload followed by the trailing operations. NFSv4 XDR decoders
will not look in the xdr_buf's tail, ever, because svcsock never put
anything there.

To support transports that can pass the write payload in the
xdr_buf's pagelist and trailing content in the xdr_buf's tail,
introduce logic in READ_BUF that switches to the xdr_buf's tail vec
when the decoder runs out of content in rq_arg.pages.

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs4xdr.c | 20 ++++++++++++++++++++
fs/nfsd/xdr4.h | 1 +
2 files changed, 21 insertions(+)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 7c48d68..a9f88cf 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -159,6 +159,25 @@ static __be32 *read_buf(struct nfsd4_compoundargs *argp, u32 nbytes)
*/
unsigned int avail = (char *)argp->end - (char *)argp->p;
__be32 *p;
+
+ if (argp->pagelen == 0) {
+ struct kvec *vec = &argp->rqstp->rq_arg.tail[0];
+
+ if (!argp->tail) {
+ argp->tail = true;
+ avail = vec->iov_len;
+ argp->p = vec->iov_base;
+ argp->end = vec->iov_base + avail;
+ }
+
+ if (avail < nbytes)
+ return NULL;
+
+ p = argp->p;
+ argp->p += XDR_QUADLEN(nbytes);
+ return p;
+ }
+
if (avail + argp->pagelen < nbytes)
return NULL;
if (avail + PAGE_SIZE < nbytes) /* need more than a page !! */
@@ -4573,6 +4592,7 @@ void nfsd4_release_compoundargs(struct svc_rqst *rqstp)
args->end = rqstp->rq_arg.head[0].iov_base + rqstp->rq_arg.head[0].iov_len;
args->pagelist = rqstp->rq_arg.pages;
args->pagelen = rqstp->rq_arg.page_len;
+ args->tail = false;
args->tmpp = NULL;
args->to_free = NULL;
args->ops = args->iops;
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 72c6ad1..bac91b1 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -614,6 +614,7 @@ struct nfsd4_compoundargs {
__be32 * end;
struct page ** pagelist;
int pagelen;
+ bool tail;
__be32 tmp[8];
__be32 * tmpp;
struct svcxdr_tmpbuf *to_free;


2017-08-18 15:12:37

by Chuck Lever

[permalink] [raw]
Subject: [PATCH v1 3/3] svcrdma: Populate tail iovec when receiving

So that NFS WRITE payloads can eventually be placed directly into a
file's page cache, enable the RPC-over-RDMA transport to present
these payloads in the xdr_buf's page list, while placing trailing
content (such as a GETATTR operation) in the xdr_buf's tail.

After this change, the RPC-over-RDMA's "copy tail" hack, added by
commit a97c331f9aa9 ("svcrdma: Handle additional inline content"),
is no longer needed and can be removed.

Signed-off-by: Chuck Lever <[email protected]>
---
net/sunrpc/xprtrdma/svc_rdma_rw.c | 108 +++++--------------------------------
1 file changed, 15 insertions(+), 93 deletions(-)

diff --git a/net/sunrpc/xprtrdma/svc_rdma_rw.c b/net/sunrpc/xprtrdma/svc_rdma_rw.c
index 1f34fae..7dcda45 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_rw.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_rw.c
@@ -691,78 +691,6 @@ static int svc_rdma_build_read_chunk(struct svc_rqst *rqstp,
return ret;
}

-/* If there is inline content following the Read chunk, append it to
- * the page list immediately following the data payload. This has to
- * be done after the reader function has determined how many pages
- * were consumed for RDMA Read.
- *
- * On entry, ri_pageno and ri_pageoff point directly to the end of the
- * page list. On exit, both have been updated to the new "next byte".
- *
- * Assumptions:
- * - Inline content fits entirely in rq_pages[0]
- * - Trailing content is only a handful of bytes
- */
-static int svc_rdma_copy_tail(struct svc_rqst *rqstp,
- struct svc_rdma_read_info *info)
-{
- struct svc_rdma_op_ctxt *head = info->ri_readctxt;
- unsigned int tail_length, remaining;
- u8 *srcp, *destp;
-
- /* Assert that all inline content fits in page 0. This is an
- * implementation limit, not a protocol limit.
- */
- if (head->arg.head[0].iov_len > PAGE_SIZE) {
- pr_warn_once("svcrdma: too much trailing inline content\n");
- return -EINVAL;
- }
-
- srcp = head->arg.head[0].iov_base;
- srcp += info->ri_position;
- tail_length = head->arg.head[0].iov_len - info->ri_position;
- remaining = tail_length;
-
- /* If there is room on the last page in the page list, try to
- * fit the trailing content there.
- */
- if (info->ri_pageoff > 0) {
- unsigned int len;
-
- len = min_t(unsigned int, remaining,
- PAGE_SIZE - info->ri_pageoff);
- destp = page_address(rqstp->rq_pages[info->ri_pageno]);
- destp += info->ri_pageoff;
-
- memcpy(destp, srcp, len);
- srcp += len;
- destp += len;
- info->ri_pageoff += len;
- remaining -= len;
-
- if (info->ri_pageoff == PAGE_SIZE) {
- info->ri_pageno++;
- info->ri_pageoff = 0;
- }
- }
-
- /* Otherwise, a fresh page is needed. */
- if (remaining) {
- head->arg.pages[info->ri_pageno] =
- rqstp->rq_pages[info->ri_pageno];
- head->count++;
-
- destp = page_address(rqstp->rq_pages[info->ri_pageno]);
- memcpy(destp, srcp, remaining);
- info->ri_pageoff += remaining;
- }
-
- head->arg.page_len += tail_length;
- head->arg.len += tail_length;
- head->arg.buflen += tail_length;
- return 0;
-}
-
/* Construct RDMA Reads to pull over a normal Read chunk. The chunk
* data lands in the page list of head->arg.pages.
*
@@ -787,34 +715,28 @@ static int svc_rdma_build_normal_read_chunk(struct svc_rqst *rqstp,
if (ret < 0)
goto out;

- /* Read chunk may need XDR round-up (see RFC 5666, s. 3.7).
+ /* Split the Receive buffer between the head and tail
+ * buffers at Read chunk's position. XDR roundup of the
+ * chunk is not included in either the pagelist or in
+ * the tail.
*/
- if (info->ri_chunklen & 3) {
- u32 padlen = 4 - (info->ri_chunklen & 3);
-
- info->ri_chunklen += padlen;
+ head->arg.tail[0].iov_base =
+ head->arg.head[0].iov_base + info->ri_position;
+ head->arg.tail[0].iov_len =
+ head->arg.head[0].iov_len - info->ri_position;
+ head->arg.head[0].iov_len = info->ri_position;

- /* NB: data payload always starts on XDR alignment,
- * thus the pad can never contain a page boundary.
- */
- info->ri_pageoff += padlen;
- if (info->ri_pageoff == PAGE_SIZE) {
- info->ri_pageno++;
- info->ri_pageoff = 0;
- }
- }
+ /* Read chunk may need XDR roundup (see RFC 5666, s. 3.7).
+ *
+ * NFSv2/3 write decoders need the length of the tail to
+ * contain the size of the roundup padding.
+ */
+ head->arg.tail[0].iov_len += 4 - (info->ri_chunklen & 3);

head->arg.page_len = info->ri_chunklen;
head->arg.len += info->ri_chunklen;
head->arg.buflen += info->ri_chunklen;

- if (info->ri_position < head->arg.head[0].iov_len) {
- ret = svc_rdma_copy_tail(rqstp, info);
- if (ret < 0)
- goto out;
- }
- head->arg.head[0].iov_len = info->ri_position;
-
out:
return ret;
}


2017-08-21 21:14:00

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] nfsd: Limit end of page list when decoding NFSv4 WRITE

On Fri, Aug 18, 2017 at 11:12:19AM -0400, Chuck Lever wrote:
> When processing an NFSv4 WRITE operation, argp->end should never
> point past the end of the data in the final page of the page list.
> Otherwise, nfsd4_decode_compound can walk into uninitialized memory.
>
> Signed-off-by: Chuck Lever <[email protected]>
> ---
> fs/nfsd/nfs4xdr.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index 51e729a..7c48d68 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -144,7 +144,7 @@ static void next_decode_page(struct nfsd4_compoundargs *argp)
> argp->p = page_address(argp->pagelist[0]);
> argp->pagelist++;
> if (argp->pagelen < PAGE_SIZE) {
> - argp->end = argp->p + (argp->pagelen>>2);
> + argp->end = argp->p + XDR_QUADLEN(argp->pagelen);
> argp->pagelen = 0;
> } else {
> argp->end = argp->p + (PAGE_SIZE>>2);
> @@ -1279,9 +1279,7 @@ static __be32 nfsd4_decode_opaque(struct nfsd4_compoundargs *argp, struct xdr_ne
> argp->pagelen -= pages * PAGE_SIZE;
> len -= pages * PAGE_SIZE;
>
> - argp->p = (__be32 *)page_address(argp->pagelist[0]);
> - argp->pagelist++;
> - argp->end = argp->p + XDR_QUADLEN(PAGE_SIZE);
> + next_decode_page(argp);

I think there's no change in behavior here *except* for adding a new
argp->pagelen=0 (or argp->pagelen -= PAGE_SIZE).

--b.

> }
> argp->p += XDR_QUADLEN(len);
>

2017-08-21 21:15:42

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] nfsd: Limit end of page list when decoding NFSv4 WRITE


> On Aug 21, 2017, at 5:13 PM, J. Bruce Fields <[email protected]> wrote:
>
> On Fri, Aug 18, 2017 at 11:12:19AM -0400, Chuck Lever wrote:
>> When processing an NFSv4 WRITE operation, argp->end should never
>> point past the end of the data in the final page of the page list.
>> Otherwise, nfsd4_decode_compound can walk into uninitialized memory.
>>
>> Signed-off-by: Chuck Lever <[email protected]>
>> ---
>> fs/nfsd/nfs4xdr.c | 6 ++----
>> 1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
>> index 51e729a..7c48d68 100644
>> --- a/fs/nfsd/nfs4xdr.c
>> +++ b/fs/nfsd/nfs4xdr.c
>> @@ -144,7 +144,7 @@ static void next_decode_page(struct nfsd4_compoundargs *argp)
>> argp->p = page_address(argp->pagelist[0]);
>> argp->pagelist++;
>> if (argp->pagelen < PAGE_SIZE) {
>> - argp->end = argp->p + (argp->pagelen>>2);
>> + argp->end = argp->p + XDR_QUADLEN(argp->pagelen);
>> argp->pagelen = 0;
>> } else {
>> argp->end = argp->p + (PAGE_SIZE>>2);
>> @@ -1279,9 +1279,7 @@ static __be32 nfsd4_decode_opaque(struct nfsd4_compoundargs *argp, struct xdr_ne
>> argp->pagelen -= pages * PAGE_SIZE;
>> len -= pages * PAGE_SIZE;
>>
>> - argp->p = (__be32 *)page_address(argp->pagelist[0]);
>> - argp->pagelist++;
>> - argp->end = argp->p + XDR_QUADLEN(PAGE_SIZE);
>> + next_decode_page(argp);
>
> I think there's no change in behavior here *except* for adding a new
> argp->pagelen=0 (or argp->pagelen -= PAGE_SIZE).

The code around this change is currently working correctly,
so there is no change in behavior AFAICT. This is a defensive
change, but it also replaces duplicate code.


> --b.
>
>> }
>> argp->p += XDR_QUADLEN(len);
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Chuck Lever




2017-08-21 21:21:40

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] nfsd: Limit end of page list when decoding NFSv4 WRITE

On Mon, Aug 21, 2017 at 05:15:38PM -0400, Chuck Lever wrote:
>
> > On Aug 21, 2017, at 5:13 PM, J. Bruce Fields <[email protected]> wrote:
> >
> > On Fri, Aug 18, 2017 at 11:12:19AM -0400, Chuck Lever wrote:
> >> When processing an NFSv4 WRITE operation, argp->end should never
> >> point past the end of the data in the final page of the page list.
> >> Otherwise, nfsd4_decode_compound can walk into uninitialized memory.
> >>
> >> Signed-off-by: Chuck Lever <[email protected]>
> >> ---
> >> fs/nfsd/nfs4xdr.c | 6 ++----
> >> 1 file changed, 2 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> >> index 51e729a..7c48d68 100644
> >> --- a/fs/nfsd/nfs4xdr.c
> >> +++ b/fs/nfsd/nfs4xdr.c
> >> @@ -144,7 +144,7 @@ static void next_decode_page(struct nfsd4_compoundargs *argp)
> >> argp->p = page_address(argp->pagelist[0]);
> >> argp->pagelist++;
> >> if (argp->pagelen < PAGE_SIZE) {
> >> - argp->end = argp->p + (argp->pagelen>>2);
> >> + argp->end = argp->p + XDR_QUADLEN(argp->pagelen);
> >> argp->pagelen = 0;
> >> } else {
> >> argp->end = argp->p + (PAGE_SIZE>>2);
> >> @@ -1279,9 +1279,7 @@ static __be32 nfsd4_decode_opaque(struct nfsd4_compoundargs *argp, struct xdr_ne
> >> argp->pagelen -= pages * PAGE_SIZE;
> >> len -= pages * PAGE_SIZE;
> >>
> >> - argp->p = (__be32 *)page_address(argp->pagelist[0]);
> >> - argp->pagelist++;
> >> - argp->end = argp->p + XDR_QUADLEN(PAGE_SIZE);
> >> + next_decode_page(argp);
> >
> > I think there's no change in behavior here *except* for adding a new
> > argp->pagelen=0 (or argp->pagelen -= PAGE_SIZE).
>
> The code around this change is currently working correctly,
> so there is no change in behavior AFAICT. This is a defensive
> change, but it also replaces duplicate code.

I don't understand. I'm saying that by calling next_decode_page() there
you've added a new argp->pagelen assignment. I don't understand how
that can't change behavior, unless there's another bug in our bounds
checking someplace.

Most likely it could cause subsequent op parsers to believe there's less
space in the argument buffer than there really is, so it might fail to
parse a compound with a write plus some other ops, if that puts the
total call close to the maximum size?

--b.

>
>
> > --b.
> >
> >> }
> >> argp->p += XDR_QUADLEN(len);
> >>
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> --
> Chuck Lever
>
>

2017-08-21 22:08:19

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] nfsd: Limit end of page list when decoding NFSv4 WRITE


> On Aug 21, 2017, at 5:21 PM, J. Bruce Fields <[email protected]> wrote:
>
> On Mon, Aug 21, 2017 at 05:15:38PM -0400, Chuck Lever wrote:
>>
>>> On Aug 21, 2017, at 5:13 PM, J. Bruce Fields <[email protected]> wrote:
>>>
>>> On Fri, Aug 18, 2017 at 11:12:19AM -0400, Chuck Lever wrote:
>>>> When processing an NFSv4 WRITE operation, argp->end should never
>>>> point past the end of the data in the final page of the page list.
>>>> Otherwise, nfsd4_decode_compound can walk into uninitialized memory.
>>>>
>>>> Signed-off-by: Chuck Lever <[email protected]>
>>>> ---
>>>> fs/nfsd/nfs4xdr.c | 6 ++----
>>>> 1 file changed, 2 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
>>>> index 51e729a..7c48d68 100644
>>>> --- a/fs/nfsd/nfs4xdr.c
>>>> +++ b/fs/nfsd/nfs4xdr.c
>>>> @@ -144,7 +144,7 @@ static void next_decode_page(struct nfsd4_compoundargs *argp)
>>>> argp->p = page_address(argp->pagelist[0]);
>>>> argp->pagelist++;
>>>> if (argp->pagelen < PAGE_SIZE) {
>>>> - argp->end = argp->p + (argp->pagelen>>2);
>>>> + argp->end = argp->p + XDR_QUADLEN(argp->pagelen);

^^^^^^^^^^^^^^ A

>>>> argp->pagelen = 0;
>>>> } else {
>>>> argp->end = argp->p + (PAGE_SIZE>>2);
>>>> @@ -1279,9 +1279,7 @@ static __be32 nfsd4_decode_opaque(struct nfsd4_compoundargs *argp, struct xdr_ne
>>>> argp->pagelen -= pages * PAGE_SIZE;
>>>> len -= pages * PAGE_SIZE;
>>>>
>>>> - argp->p = (__be32 *)page_address(argp->pagelist[0]);
>>>> - argp->pagelist++;
>>>> - argp->end = argp->p + XDR_QUADLEN(PAGE_SIZE);

^^^^^^^^^^^^^^ B

>>>> + next_decode_page(argp);
>>>
>>> I think there's no change in behavior here *except* for adding a new
>>> argp->pagelen=0 (or argp->pagelen -= PAGE_SIZE).
>>
>> The code around this change is currently working correctly,
>> so there is no change in behavior AFAICT. This is a defensive
>> change, but it also replaces duplicate code.
>
> I don't understand. I'm saying that by calling next_decode_page() there
> you've added a new argp->pagelen assignment. I don't understand how
> that can't change behavior, unless there's another bug in our bounds
> checking someplace.

Because of line B above, argp->end always points to the
end of the final page in the page list. However, the
buffer might end somewhere in the middle of that page,
in which case, the transport hasn't initialized any of
the bytes between the end of the buffer and the end of
the page.

As long as the other fields in the xdr_buf are set up
properly, the XDR decoder will not walk into that uninit-
ialized section of the last page. But there's nothing
preventing a decoder or transport bug from causing it
to walk into the uninitialized area. And always setting
to the end of the page is confusing when the buffer
itself is actually shorter.

The key is to replace line B above with line A. argp->end
is advanced by the remaining part of the final page rather
than by a whole page.

The next patch uses this new behavior to signal precisely
when it has to move from the page list to the tail iovec.


> Most likely it could cause subsequent op parsers to believe there's less
> space in the argument buffer than there really is, so it might fail to
> parse a compound with a write plus some other ops, if that puts the
> total call close to the maximum size?

Where is argp->pagelen used after the final next_decode_page
call?


> --b.
>
>>
>>
>>> --b.
>>>
>>>> }
>>>> argp->p += XDR_QUADLEN(len);
>>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>> the body of a message to [email protected]
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>> --
>> Chuck Lever
>>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Chuck Lever




2017-08-22 21:45:13

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] nfsd: Limit end of page list when decoding NFSv4 WRITE

On Mon, Aug 21, 2017 at 06:08:15PM -0400, Chuck Lever wrote:
>
> > On Aug 21, 2017, at 5:21 PM, J. Bruce Fields <[email protected]> wrote:
> >
> > On Mon, Aug 21, 2017 at 05:15:38PM -0400, Chuck Lever wrote:
> >>
> >>> On Aug 21, 2017, at 5:13 PM, J. Bruce Fields <[email protected]> wrote:
> >>>
> >>> On Fri, Aug 18, 2017 at 11:12:19AM -0400, Chuck Lever wrote:
> >>>> When processing an NFSv4 WRITE operation, argp->end should never
> >>>> point past the end of the data in the final page of the page list.
> >>>> Otherwise, nfsd4_decode_compound can walk into uninitialized memory.
> >>>>
> >>>> Signed-off-by: Chuck Lever <[email protected]>
> >>>> ---
> >>>> fs/nfsd/nfs4xdr.c | 6 ++----
> >>>> 1 file changed, 2 insertions(+), 4 deletions(-)
> >>>>
> >>>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> >>>> index 51e729a..7c48d68 100644
> >>>> --- a/fs/nfsd/nfs4xdr.c
> >>>> +++ b/fs/nfsd/nfs4xdr.c
> >>>> @@ -144,7 +144,7 @@ static void next_decode_page(struct nfsd4_compoundargs *argp)
> >>>> argp->p = page_address(argp->pagelist[0]);
> >>>> argp->pagelist++;
> >>>> if (argp->pagelen < PAGE_SIZE) {
> >>>> - argp->end = argp->p + (argp->pagelen>>2);
> >>>> + argp->end = argp->p + XDR_QUADLEN(argp->pagelen);
>
> ^^^^^^^^^^^^^^ A
>
> >>>> argp->pagelen = 0;
> >>>> } else {
> >>>> argp->end = argp->p + (PAGE_SIZE>>2);
> >>>> @@ -1279,9 +1279,7 @@ static __be32 nfsd4_decode_opaque(struct nfsd4_compoundargs *argp, struct xdr_ne
> >>>> argp->pagelen -= pages * PAGE_SIZE;
> >>>> len -= pages * PAGE_SIZE;
> >>>>
> >>>> - argp->p = (__be32 *)page_address(argp->pagelist[0]);
> >>>> - argp->pagelist++;
> >>>> - argp->end = argp->p + XDR_QUADLEN(PAGE_SIZE);
>
> ^^^^^^^^^^^^^^ B
>
> >>>> + next_decode_page(argp);
> >>>
> >>> I think there's no change in behavior here *except* for adding a new
> >>> argp->pagelen=0 (or argp->pagelen -= PAGE_SIZE).
> >>
> >> The code around this change is currently working correctly,
> >> so there is no change in behavior AFAICT. This is a defensive
> >> change, but it also replaces duplicate code.
> >
> > I don't understand. I'm saying that by calling next_decode_page() there
> > you've added a new argp->pagelen assignment. I don't understand how
> > that can't change behavior, unless there's another bug in our bounds
> > checking someplace.
>
> Because of line B above, argp->end always points to the
> end of the final page in the page list. However, the
> buffer might end somewhere in the middle of that page,
> in which case, the transport hasn't initialized any of
> the bytes between the end of the buffer and the end of
> the page.
>
> As long as the other fields in the xdr_buf are set up
> properly, the XDR decoder will not walk into that uninit-
> ialized section of the last page. But there's nothing
> preventing a decoder or transport bug from causing it
> to walk into the uninitialized area. And always setting
> to the end of the page is confusing when the buffer
> itself is actually shorter.
>
> The key is to replace line B above with line A. argp->end
> is advanced by the remaining part of the final page rather
> than by a whole page.

Got it, I agree with that part of the change, it's the pagelen change I
was having trouble with.

But looking at it more, I think your patch is a fix and the current code
is wrong.

> The next patch uses this new behavior to signal precisely
> when it has to move from the page list to the tail iovec.
>
>
> > Most likely it could cause subsequent op parsers to believe there's less
> > space in the argument buffer than there really is, so it might fail to
> > parse a compound with a write plus some other ops, if that puts the
> > total call close to the maximum size?
>
> Where is argp->pagelen used after the final next_decode_page
> call?

Well, it's checked in every read_buf and next_decode_page to decide how
much space is left.

It looks to me like the current code is wrong not to be decreasing
page_len at the end there. So I wonder if there's a bug right now.
E.g. maybe a compound with multiple writes could leave the xdr decoding
thinking it has more space than it does and allow someone to write
unrelated memory to some file.

--b.

2017-08-23 18:36:38

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] nfsd: Limit end of page list when decoding NFSv4 WRITE


> On Aug 22, 2017, at 5:45 PM, J. Bruce Fields <[email protected]> wrote:
>
> On Mon, Aug 21, 2017 at 06:08:15PM -0400, Chuck Lever wrote:
>>
>>> On Aug 21, 2017, at 5:21 PM, J. Bruce Fields <[email protected]> wrote:
>>>
>>> On Mon, Aug 21, 2017 at 05:15:38PM -0400, Chuck Lever wrote:
>>>>
>>>>> On Aug 21, 2017, at 5:13 PM, J. Bruce Fields <[email protected]> wrote:
>>>>>
>>>>> On Fri, Aug 18, 2017 at 11:12:19AM -0400, Chuck Lever wrote:
>>>>>> When processing an NFSv4 WRITE operation, argp->end should never
>>>>>> point past the end of the data in the final page of the page list.
>>>>>> Otherwise, nfsd4_decode_compound can walk into uninitialized memory.
>>>>>>
>>>>>> Signed-off-by: Chuck Lever <[email protected]>
>>>>>> ---
>>>>>> fs/nfsd/nfs4xdr.c | 6 ++----
>>>>>> 1 file changed, 2 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
>>>>>> index 51e729a..7c48d68 100644
>>>>>> --- a/fs/nfsd/nfs4xdr.c
>>>>>> +++ b/fs/nfsd/nfs4xdr.c
>>>>>> @@ -144,7 +144,7 @@ static void next_decode_page(struct nfsd4_compoundargs *argp)
>>>>>> argp->p = page_address(argp->pagelist[0]);
>>>>>> argp->pagelist++;
>>>>>> if (argp->pagelen < PAGE_SIZE) {
>>>>>> - argp->end = argp->p + (argp->pagelen>>2);
>>>>>> + argp->end = argp->p + XDR_QUADLEN(argp->pagelen);
>>
>> ^^^^^^^^^^^^^^ A
>>
>>>>>> argp->pagelen = 0;
>>>>>> } else {
>>>>>> argp->end = argp->p + (PAGE_SIZE>>2);
>>>>>> @@ -1279,9 +1279,7 @@ static __be32 nfsd4_decode_opaque(struct nfsd4_compoundargs *argp, struct xdr_ne
>>>>>> argp->pagelen -= pages * PAGE_SIZE;
>>>>>> len -= pages * PAGE_SIZE;
>>>>>>
>>>>>> - argp->p = (__be32 *)page_address(argp->pagelist[0]);
>>>>>> - argp->pagelist++;
>>>>>> - argp->end = argp->p + XDR_QUADLEN(PAGE_SIZE);
>>
>> ^^^^^^^^^^^^^^ B
>>
>>>>>> + next_decode_page(argp);
>>>>>
>>>>> I think there's no change in behavior here *except* for adding a new
>>>>> argp->pagelen=0 (or argp->pagelen -= PAGE_SIZE).
>>>>
>>>> The code around this change is currently working correctly,
>>>> so there is no change in behavior AFAICT. This is a defensive
>>>> change, but it also replaces duplicate code.
>>>
>>> I don't understand. I'm saying that by calling next_decode_page() there
>>> you've added a new argp->pagelen assignment. I don't understand how
>>> that can't change behavior, unless there's another bug in our bounds
>>> checking someplace.
>>
>> Because of line B above, argp->end always points to the
>> end of the final page in the page list. However, the
>> buffer might end somewhere in the middle of that page,
>> in which case, the transport hasn't initialized any of
>> the bytes between the end of the buffer and the end of
>> the page.
>>
>> As long as the other fields in the xdr_buf are set up
>> properly, the XDR decoder will not walk into that uninit-
>> ialized section of the last page. But there's nothing
>> preventing a decoder or transport bug from causing it
>> to walk into the uninitialized area. And always setting
>> to the end of the page is confusing when the buffer
>> itself is actually shorter.
>>
>> The key is to replace line B above with line A. argp->end
>> is advanced by the remaining part of the final page rather
>> than by a whole page.
>
> Got it, I agree with that part of the change, it's the pagelen change I
> was having trouble with.
>
> But looking at it more, I think your patch is a fix and the current code
> is wrong.
>
>> The next patch uses this new behavior to signal precisely
>> when it has to move from the page list to the tail iovec.
>>
>>
>>> Most likely it could cause subsequent op parsers to believe there's less
>>> space in the argument buffer than there really is, so it might fail to
>>> parse a compound with a write plus some other ops, if that puts the
>>> total call close to the maximum size?
>>
>> Where is argp->pagelen used after the final next_decode_page
>> call?
>
> Well, it's checked in every read_buf and next_decode_page to decide how
> much space is left.

Right, and I didn't change the pagelen adjustment that occurs
in the loop. Just the final adjustment should be different.


> It looks to me like the current code is wrong not to be decreasing
> page_len at the end there. So I wonder if there's a bug right now.
> E.g. maybe a compound with multiple writes could leave the xdr decoding
> thinking it has more space than it does and allow someone to write
> unrelated memory to some file.

I believe that's possible.


--
Chuck Lever




2017-08-24 01:18:05

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] nfsd: Limit end of page list when decoding NFSv4 WRITE

On Wed, Aug 23, 2017 at 02:36:33PM -0400, Chuck Lever wrote:
>
> > On Aug 22, 2017, at 5:45 PM, J. Bruce Fields <[email protected]> wrote:
> >
> > On Mon, Aug 21, 2017 at 06:08:15PM -0400, Chuck Lever wrote:
> >>
> >>> On Aug 21, 2017, at 5:21 PM, J. Bruce Fields <[email protected]> wrote:
> >>>
> >>> On Mon, Aug 21, 2017 at 05:15:38PM -0400, Chuck Lever wrote:
> >>>>
> >>>>> On Aug 21, 2017, at 5:13 PM, J. Bruce Fields <[email protected]> wrote:
> >>>>>
> >>>>> On Fri, Aug 18, 2017 at 11:12:19AM -0400, Chuck Lever wrote:
> >>>>>> When processing an NFSv4 WRITE operation, argp->end should never
> >>>>>> point past the end of the data in the final page of the page list.
> >>>>>> Otherwise, nfsd4_decode_compound can walk into uninitialized memory.
> >>>>>>
> >>>>>> Signed-off-by: Chuck Lever <[email protected]>
> >>>>>> ---
> >>>>>> fs/nfsd/nfs4xdr.c | 6 ++----
> >>>>>> 1 file changed, 2 insertions(+), 4 deletions(-)
> >>>>>>
> >>>>>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> >>>>>> index 51e729a..7c48d68 100644
> >>>>>> --- a/fs/nfsd/nfs4xdr.c
> >>>>>> +++ b/fs/nfsd/nfs4xdr.c
> >>>>>> @@ -144,7 +144,7 @@ static void next_decode_page(struct nfsd4_compoundargs *argp)
> >>>>>> argp->p = page_address(argp->pagelist[0]);
> >>>>>> argp->pagelist++;
> >>>>>> if (argp->pagelen < PAGE_SIZE) {
> >>>>>> - argp->end = argp->p + (argp->pagelen>>2);
> >>>>>> + argp->end = argp->p + XDR_QUADLEN(argp->pagelen);
> >>
> >> ^^^^^^^^^^^^^^ A
> >>
> >>>>>> argp->pagelen = 0;
> >>>>>> } else {
> >>>>>> argp->end = argp->p + (PAGE_SIZE>>2);
> >>>>>> @@ -1279,9 +1279,7 @@ static __be32 nfsd4_decode_opaque(struct nfsd4_compoundargs *argp, struct xdr_ne
> >>>>>> argp->pagelen -= pages * PAGE_SIZE;
> >>>>>> len -= pages * PAGE_SIZE;
> >>>>>>
> >>>>>> - argp->p = (__be32 *)page_address(argp->pagelist[0]);
> >>>>>> - argp->pagelist++;
> >>>>>> - argp->end = argp->p + XDR_QUADLEN(PAGE_SIZE);
> >>
> >> ^^^^^^^^^^^^^^ B
> >>
> >>>>>> + next_decode_page(argp);
> >>>>>
> >>>>> I think there's no change in behavior here *except* for adding a new
> >>>>> argp->pagelen=0 (or argp->pagelen -= PAGE_SIZE).
> >>>>
> >>>> The code around this change is currently working correctly,
> >>>> so there is no change in behavior AFAICT. This is a defensive
> >>>> change, but it also replaces duplicate code.
> >>>
> >>> I don't understand. I'm saying that by calling next_decode_page() there
> >>> you've added a new argp->pagelen assignment. I don't understand how
> >>> that can't change behavior, unless there's another bug in our bounds
> >>> checking someplace.
> >>
> >> Because of line B above, argp->end always points to the
> >> end of the final page in the page list. However, the
> >> buffer might end somewhere in the middle of that page,
> >> in which case, the transport hasn't initialized any of
> >> the bytes between the end of the buffer and the end of
> >> the page.
> >>
> >> As long as the other fields in the xdr_buf are set up
> >> properly, the XDR decoder will not walk into that uninit-
> >> ialized section of the last page. But there's nothing
> >> preventing a decoder or transport bug from causing it
> >> to walk into the uninitialized area. And always setting
> >> to the end of the page is confusing when the buffer
> >> itself is actually shorter.
> >>
> >> The key is to replace line B above with line A. argp->end
> >> is advanced by the remaining part of the final page rather
> >> than by a whole page.
> >
> > Got it, I agree with that part of the change, it's the pagelen change I
> > was having trouble with.
> >
> > But looking at it more, I think your patch is a fix and the current code
> > is wrong.
> >
> >> The next patch uses this new behavior to signal precisely
> >> when it has to move from the page list to the tail iovec.
> >>
> >>
> >>> Most likely it could cause subsequent op parsers to believe there's less
> >>> space in the argument buffer than there really is, so it might fail to
> >>> parse a compound with a write plus some other ops, if that puts the
> >>> total call close to the maximum size?
> >>
> >> Where is argp->pagelen used after the final next_decode_page
> >> call?
> >
> > Well, it's checked in every read_buf and next_decode_page to decide how
> > much space is left.
>
> Right, and I didn't change the pagelen adjustment that occurs
> in the loop. Just the final adjustment should be different.
>
>
> > It looks to me like the current code is wrong not to be decreasing
> > page_len at the end there. So I wonder if there's a bug right now.
> > E.g. maybe a compound with multiple writes could leave the xdr decoding
> > thinking it has more space than it does and allow someone to write
> > unrelated memory to some file.
>
> I believe that's possible.

Yeah, I can totally crash the server this way.

I'll send this along for stable. (And see if I can figure out how to get
a test for this kind of thing into pynfs. Unfortunately it's seems to
require some contortions to get it to produce bad xdr.)

--b.

2017-08-24 02:52:48

by Weston Andros Adamson

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] nfsd: Limit end of page list when decoding NFSv4 WRITE


> On Aug 23, 2017, at 9:18 PM, J. Bruce Fields <[email protected]> =
wrote:
>=20
> On Wed, Aug 23, 2017 at 02:36:33PM -0400, Chuck Lever wrote:
>>=20
>>> On Aug 22, 2017, at 5:45 PM, J. Bruce Fields <[email protected]> =
wrote:
>>>=20
>>> On Mon, Aug 21, 2017 at 06:08:15PM -0400, Chuck Lever wrote:
>>>>=20
>>>>> On Aug 21, 2017, at 5:21 PM, J. Bruce Fields =
<[email protected]> wrote:
>>>>>=20
>>>>> On Mon, Aug 21, 2017 at 05:15:38PM -0400, Chuck Lever wrote:
>>>>>>=20
>>>>>>> On Aug 21, 2017, at 5:13 PM, J. Bruce Fields =
<[email protected]> wrote:
>>>>>>>=20
>>>>>>> On Fri, Aug 18, 2017 at 11:12:19AM -0400, Chuck Lever wrote:
>>>>>>>> When processing an NFSv4 WRITE operation, argp->end should =
never
>>>>>>>> point past the end of the data in the final page of the page =
list.
>>>>>>>> Otherwise, nfsd4_decode_compound can walk into uninitialized =
memory.
>>>>>>>>=20
>>>>>>>> Signed-off-by: Chuck Lever <[email protected]>
>>>>>>>> ---
>>>>>>>> fs/nfsd/nfs4xdr.c | 6 ++----
>>>>>>>> 1 file changed, 2 insertions(+), 4 deletions(-)
>>>>>>>>=20
>>>>>>>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
>>>>>>>> index 51e729a..7c48d68 100644
>>>>>>>> --- a/fs/nfsd/nfs4xdr.c
>>>>>>>> +++ b/fs/nfsd/nfs4xdr.c
>>>>>>>> @@ -144,7 +144,7 @@ static void next_decode_page(struct =
nfsd4_compoundargs *argp)
>>>>>>>> argp->p =3D page_address(argp->pagelist[0]);
>>>>>>>> argp->pagelist++;
>>>>>>>> if (argp->pagelen < PAGE_SIZE) {
>>>>>>>> - argp->end =3D argp->p + (argp->pagelen>>2);
>>>>>>>> + argp->end =3D argp->p + =
XDR_QUADLEN(argp->pagelen);
>>>>=20
>>>> ^^^^^^^^^^^^^^ A
>>>>=20
>>>>>>>> argp->pagelen =3D 0;
>>>>>>>> } else {
>>>>>>>> argp->end =3D argp->p + (PAGE_SIZE>>2);
>>>>>>>> @@ -1279,9 +1279,7 @@ static __be32 nfsd4_decode_opaque(struct =
nfsd4_compoundargs *argp, struct xdr_ne
>>>>>>>> argp->pagelen -=3D pages * PAGE_SIZE;
>>>>>>>> len -=3D pages * PAGE_SIZE;
>>>>>>>>=20
>>>>>>>> - argp->p =3D (__be32 =
*)page_address(argp->pagelist[0]);
>>>>>>>> - argp->pagelist++;
>>>>>>>> - argp->end =3D argp->p + XDR_QUADLEN(PAGE_SIZE);
>>>>=20
>>>> ^^^^^^^^^^^^^^ B
>>>>=20
>>>>>>>> + next_decode_page(argp);
>>>>>>>=20
>>>>>>> I think there's no change in behavior here *except* for adding a =
new
>>>>>>> argp->pagelen=3D0 (or argp->pagelen -=3D PAGE_SIZE).
>>>>>>=20
>>>>>> The code around this change is currently working correctly,
>>>>>> so there is no change in behavior AFAICT. This is a defensive
>>>>>> change, but it also replaces duplicate code.
>>>>>=20
>>>>> I don't understand. I'm saying that by calling next_decode_page() =
there
>>>>> you've added a new argp->pagelen assignment. I don't understand =
how
>>>>> that can't change behavior, unless there's another bug in our =
bounds
>>>>> checking someplace.
>>>>=20
>>>> Because of line B above, argp->end always points to the
>>>> end of the final page in the page list. However, the
>>>> buffer might end somewhere in the middle of that page,
>>>> in which case, the transport hasn't initialized any of
>>>> the bytes between the end of the buffer and the end of
>>>> the page.
>>>>=20
>>>> As long as the other fields in the xdr_buf are set up
>>>> properly, the XDR decoder will not walk into that uninit-
>>>> ialized section of the last page. But there's nothing
>>>> preventing a decoder or transport bug from causing it
>>>> to walk into the uninitialized area. And always setting
>>>> to the end of the page is confusing when the buffer
>>>> itself is actually shorter.
>>>>=20
>>>> The key is to replace line B above with line A. argp->end
>>>> is advanced by the remaining part of the final page rather
>>>> than by a whole page.
>>>=20
>>> Got it, I agree with that part of the change, it's the pagelen =
change I
>>> was having trouble with.
>>>=20
>>> But looking at it more, I think your patch is a fix and the current =
code
>>> is wrong.
>>>=20
>>>> The next patch uses this new behavior to signal precisely
>>>> when it has to move from the page list to the tail iovec.
>>>>=20
>>>>=20
>>>>> Most likely it could cause subsequent op parsers to believe =
there's less
>>>>> space in the argument buffer than there really is, so it might =
fail to
>>>>> parse a compound with a write plus some other ops, if that puts =
the
>>>>> total call close to the maximum size?
>>>>=20
>>>> Where is argp->pagelen used after the final next_decode_page
>>>> call?
>>>=20
>>> Well, it's checked in every read_buf and next_decode_page to decide =
how
>>> much space is left.
>>=20
>> Right, and I didn't change the pagelen adjustment that occurs
>> in the loop. Just the final adjustment should be different.
>>=20
>>=20
>>> It looks to me like the current code is wrong not to be decreasing
>>> page_len at the end there. So I wonder if there's a bug right now.
>>> E.g. maybe a compound with multiple writes could leave the xdr =
decoding
>>> thinking it has more space than it does and allow someone to write
>>> unrelated memory to some file.
>>=20
>> I believe that's possible.
>=20
> Yeah, I can totally crash the server this way.
>=20
> I'll send this along for stable. (And see if I can figure out how to =
get
> a test for this kind of thing into pynfs. Unfortunately it's seems to
> require some contortions to get it to produce bad xdr.)

Having a generic way to include bad XDR would be a really useful feature =
for pynfs.

-dros

>=20
> --b.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" =
in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html


2017-08-25 17:46:30

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] nfsd: Incoming xdr_bufs may have content in tail buffer

On Fri, Aug 18, 2017 at 11:12:27AM -0400, Chuck Lever wrote:
> Since the beginning, svcsock has built a received RPC Call message
> by populating the xdr_buf's head, then placing the remaining
> message bytes in the xdr_buf's page list. The xdr_buf's tail is
> never populated.
>
> This means that an NFSv4 COMPOUND containing an NFS WRITE operation
> plus trailing operations has a page list that contains the WRITE
> data payload followed by the trailing operations. NFSv4 XDR decoders
> will not look in the xdr_buf's tail, ever, because svcsock never put
> anything there.
>
> To support transports that can pass the write payload in the
> xdr_buf's pagelist and trailing content in the xdr_buf's tail,
> introduce logic in READ_BUF that switches to the xdr_buf's tail vec
> when the decoder runs out of content in rq_arg.pages.

This is very specialized: it assumes an xdr buffer will never cross the
boundary from pages into the tail, for example. But, I guess we do in
fact get that kind of guarantee from the rdma code, so fine. Might be
worth a comment.

> Signed-off-by: Chuck Lever <[email protected]>
> ---
> fs/nfsd/nfs4xdr.c | 20 ++++++++++++++++++++
> fs/nfsd/xdr4.h | 1 +
> 2 files changed, 21 insertions(+)
>
> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index 7c48d68..a9f88cf 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -159,6 +159,25 @@ static __be32 *read_buf(struct nfsd4_compoundargs *argp, u32 nbytes)
> */
> unsigned int avail = (char *)argp->end - (char *)argp->p;
> __be32 *p;
> +
> + if (argp->pagelen == 0) {
> + struct kvec *vec = &argp->rqstp->rq_arg.tail[0];
> +
> + if (!argp->tail) {

I think we may have other code that does this by checking whether
argp->p is in the range covered by that iovec, but your approach is
probably cleaner, OK.

--b.

> + argp->tail = true;
> + avail = vec->iov_len;
> + argp->p = vec->iov_base;
> + argp->end = vec->iov_base + avail;
> + }
> +
> + if (avail < nbytes)
> + return NULL;
> +
> + p = argp->p;
> + argp->p += XDR_QUADLEN(nbytes);
> + return p;
> + }
> +
> if (avail + argp->pagelen < nbytes)
> return NULL;
> if (avail + PAGE_SIZE < nbytes) /* need more than a page !! */
> @@ -4573,6 +4592,7 @@ void nfsd4_release_compoundargs(struct svc_rqst *rqstp)
> args->end = rqstp->rq_arg.head[0].iov_base + rqstp->rq_arg.head[0].iov_len;
> args->pagelist = rqstp->rq_arg.pages;
> args->pagelen = rqstp->rq_arg.page_len;
> + args->tail = false;
> args->tmpp = NULL;
> args->to_free = NULL;
> args->ops = args->iops;
> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> index 72c6ad1..bac91b1 100644
> --- a/fs/nfsd/xdr4.h
> +++ b/fs/nfsd/xdr4.h
> @@ -614,6 +614,7 @@ struct nfsd4_compoundargs {
> __be32 * end;
> struct page ** pagelist;
> int pagelen;
> + bool tail;
> __be32 tmp[8];
> __be32 * tmpp;
> struct svcxdr_tmpbuf *to_free;