2019-03-04 03:09:13

by NeilBrown

[permalink] [raw]
Subject: [PATCH] nfsd: fix memory corruption caused by readdir


If the result of an NFSv3 readdir{,plus} request results in the
"offset" on one entry having to be split across 2 pages, and is sized
so that the next directory entry doesn't fix in the requested size,
then memory corruption can happen.

When encode_entry() is called after encoding the last entry that fits,
it notices that ->offset and ->offset1 are set, and so stores the
offset value in the two pages as required. It clears ->offset1 but
*does not* clear ->offset.

Normally this omission doesn't matter as encode_entry_baggage() will
be called, and will set ->offset to a suitable value (not on a page
boundary).
But in the case where cd->buflen < elen and nfserr_toosmall is
returned, ->offset is not reset.

This means that nfsd3proc_readdirplus will see ->offset with a value 4
bytes before the end of a page, and ->offset1 set to NULL.
It will try to write 8bytes to ->offset.
If we are lucky, the next page will be read-only, and the system will
BUG: unable to handle kernel paging request at...

If we are unlucky, some innocent page will have the first 4 bytes
corrupted.

nfsd3proc_readdir() doesn't even check for ->offset1, it just blindly
writes 8 bytes to the offset wherever it is.

Fix this by clearing ->offset after it is used, and copying the
->offset handling code from nfsd3_proc_readdirplus into
nfsd3_proc_readdir.

(Note that the commit hash in the Fixes tag is from the 'history'
tree - this bug predates git).
Fixes: eb229d253e6c ("[PATCH] kNFSd: fix two xdr-encode bugs for readdirplus reply")
Cc: [email protected] (v2.6.12+)
Signed-off-by: NeilBrown <[email protected]>
---

Can I still get extra credit for fixing a bug that is 14.5 years old, if
I'm the one who introduced it?

fs/nfsd/nfs3proc.c | 16 ++++++++++++++--
fs/nfsd/nfs3xdr.c | 1 +
2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index 9eb8086ea841..c9cf46e0c040 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -463,8 +463,19 @@ nfsd3_proc_readdir(struct svc_rqst *rqstp)
&resp->common, nfs3svc_encode_entry);
memcpy(resp->verf, argp->verf, 8);
resp->count = resp->buffer - argp->buffer;
- if (resp->offset)
- xdr_encode_hyper(resp->offset, argp->cookie);
+ if (resp->offset) {
+ loff_t offset = argp->cookie;
+
+ if (unlikely(resp->offset1)) {
+ /* we ended up with offset on a page boundary */
+ *resp->offset = htonl(offset >> 32);
+ *resp->offset1 = htonl(offset & 0xffffffff);
+ resp->offset1 = NULL;
+ } else {
+ xdr_encode_hyper(resp->offset, offset);
+ }
+ resp->offset = NULL;
+ }

RETURN_STATUS(nfserr);
}
@@ -533,6 +544,7 @@ nfsd3_proc_readdirplus(struct svc_rqst *rqstp)
} else {
xdr_encode_hyper(resp->offset, offset);
}
+ resp->offset = NULL;
}

RETURN_STATUS(nfserr);
diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
index 9b973f4f7d01..83919116d5cb 100644
--- a/fs/nfsd/nfs3xdr.c
+++ b/fs/nfsd/nfs3xdr.c
@@ -921,6 +921,7 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen,
} else {
xdr_encode_hyper(cd->offset, offset64);
}
+ cd->offset = NULL;
}

/*
--
2.14.0.rc0.dirty


Attachments:
signature.asc (847.00 B)

2019-03-04 17:09:25

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH] nfsd: fix memory corruption caused by readdir

On Mon, Mar 04, 2019 at 02:08:22PM +1100, NeilBrown wrote:
> (Note that the commit hash in the Fixes tag is from the 'history'
> tree - this bug predates git).
> Fixes: eb229d253e6c ("[PATCH] kNFSd: fix two xdr-encode bugs for readdirplus reply")

It'd be nice to provide a URL for that. The one I originally cloned one
seems to have disappeared.

> Cc: [email protected] (v2.6.12+)
> Signed-off-by: NeilBrown <[email protected]>
> ---
>
> Can I still get extra credit for fixing a bug that is 14.5 years old, if
> I'm the one who introduced it?

Good grief, yes! Great fix. Is that a record?

And how did it go undetected so long, and what caused it to surface just
now?

I once thought about converting this over to the xdr_stream api that
NFSv4 uses to hide the page-crossing logic now. But I think it's better
to leave it alone.

--b.

>
> fs/nfsd/nfs3proc.c | 16 ++++++++++++++--
> fs/nfsd/nfs3xdr.c | 1 +
> 2 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
> index 9eb8086ea841..c9cf46e0c040 100644
> --- a/fs/nfsd/nfs3proc.c
> +++ b/fs/nfsd/nfs3proc.c
> @@ -463,8 +463,19 @@ nfsd3_proc_readdir(struct svc_rqst *rqstp)
> &resp->common, nfs3svc_encode_entry);
> memcpy(resp->verf, argp->verf, 8);
> resp->count = resp->buffer - argp->buffer;
> - if (resp->offset)
> - xdr_encode_hyper(resp->offset, argp->cookie);
> + if (resp->offset) {
> + loff_t offset = argp->cookie;
> +
> + if (unlikely(resp->offset1)) {
> + /* we ended up with offset on a page boundary */
> + *resp->offset = htonl(offset >> 32);
> + *resp->offset1 = htonl(offset & 0xffffffff);
> + resp->offset1 = NULL;
> + } else {
> + xdr_encode_hyper(resp->offset, offset);
> + }
> + resp->offset = NULL;
> + }
>
> RETURN_STATUS(nfserr);
> }
> @@ -533,6 +544,7 @@ nfsd3_proc_readdirplus(struct svc_rqst *rqstp)
> } else {
> xdr_encode_hyper(resp->offset, offset);
> }
> + resp->offset = NULL;
> }
>
> RETURN_STATUS(nfserr);
> diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
> index 9b973f4f7d01..83919116d5cb 100644
> --- a/fs/nfsd/nfs3xdr.c
> +++ b/fs/nfsd/nfs3xdr.c
> @@ -921,6 +921,7 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen,
> } else {
> xdr_encode_hyper(cd->offset, offset64);
> }
> + cd->offset = NULL;
> }
>
> /*
> --
> 2.14.0.rc0.dirty
>



2019-03-04 23:49:34

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH] nfsd: fix memory corruption caused by readdir

On Mon, Mar 04 2019, J. Bruce Fields wrote:

> On Mon, Mar 04, 2019 at 02:08:22PM +1100, NeilBrown wrote:
>> (Note that the commit hash in the Fixes tag is from the 'history'
>> tree - this bug predates git).
>> Fixes: eb229d253e6c ("[PATCH] kNFSd: fix two xdr-encode bugs for readdirplus reply")
>
> It'd be nice to provide a URL for that. The one I originally cloned one
> seems to have disappeared.

Fixes-URL: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=eb229d253e6c

Though on reflection, that didn't introduce the bug, it just failed to
fix it properly. It should be:

Fixes: 0b1d57cf7654 ("[PATCH] kNFSd: Fix nfs3 dentry encoding")
Fixes-URL: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=0b1d57cf7654

>

>> Cc: [email protected] (v2.6.12+)
>> Signed-off-by: NeilBrown <[email protected]>
>> ---
>>
>> Can I still get extra credit for fixing a bug that is 14.5 years old, if
>> I'm the one who introduced it?
>
> Good grief, yes! Great fix. Is that a record?
>
> And how did it go undetected so long, and what caused it to surface just
> now?

I suspect two different things need to come together to trigger the bug.
1/ a directory needs to have filename lengths which cause the xdr
encoding of the readdirplus reply to place the offset across a page
boundary.
A typical entry is around 200 bytes, or 50 quads, so there should be
a 1:50 chance of hitting that, assuming name lengths are evenly
distributed (which they aren't).
In the case which triggered the bug, all file names were 43 bytes,
all filehandles 28 bytes. This means 192 bytes per entry.
21 entries fit in a page leaving 64 bytes. This puts the cookie
on the page boundary.

2/ The *next* entry after the one that crosses the page boundary doesn't
fit. In the cases which triggered, the requested size was 0x1110
(4368).
That is enough room for 21 entries, but not for 22.

So presumably the client doesn't run Linux - which always asks
for 4096 bytes of directory entry (from a Linux server).
I have no idea what clients the customer was using, but these clients
seem to have a fairly good chance of triggering the bug (when configured
like the customer configured them - maybe).

>
> I once thought about converting this over to the xdr_stream api that
> NFSv4 uses to hide the page-crossing logic now. But I think it's better
> to leave it alone.

I agree - the code isn't being actively developed, so stability wins
over elegance.


BTW, the readdir (non-plus) code doesn't really need fixing.
nfs3svc_decode_readdirargs() caps the ->count at PAGE_SIZE, so the cookie
can never cross pages. nfs3svc_decode_readdirplusargs() caps it
at max_blocksize. So if you feel like leaving that part of the change
out, I probably wouldn't complain.

Thanks,
NeilBrown

>
> --b.
>
>>
>> fs/nfsd/nfs3proc.c | 16 ++++++++++++++--
>> fs/nfsd/nfs3xdr.c | 1 +
>> 2 files changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
>> index 9eb8086ea841..c9cf46e0c040 100644
>> --- a/fs/nfsd/nfs3proc.c
>> +++ b/fs/nfsd/nfs3proc.c
>> @@ -463,8 +463,19 @@ nfsd3_proc_readdir(struct svc_rqst *rqstp)
>> &resp->common, nfs3svc_encode_entry);
>> memcpy(resp->verf, argp->verf, 8);
>> resp->count = resp->buffer - argp->buffer;
>> - if (resp->offset)
>> - xdr_encode_hyper(resp->offset, argp->cookie);
>> + if (resp->offset) {
>> + loff_t offset = argp->cookie;
>> +
>> + if (unlikely(resp->offset1)) {
>> + /* we ended up with offset on a page boundary */
>> + *resp->offset = htonl(offset >> 32);
>> + *resp->offset1 = htonl(offset & 0xffffffff);
>> + resp->offset1 = NULL;
>> + } else {
>> + xdr_encode_hyper(resp->offset, offset);
>> + }
>> + resp->offset = NULL;
>> + }
>>
>> RETURN_STATUS(nfserr);
>> }
>> @@ -533,6 +544,7 @@ nfsd3_proc_readdirplus(struct svc_rqst *rqstp)
>> } else {
>> xdr_encode_hyper(resp->offset, offset);
>> }
>> + resp->offset = NULL;
>> }
>>
>> RETURN_STATUS(nfserr);
>> diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
>> index 9b973f4f7d01..83919116d5cb 100644
>> --- a/fs/nfsd/nfs3xdr.c
>> +++ b/fs/nfsd/nfs3xdr.c
>> @@ -921,6 +921,7 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen,
>> } else {
>> xdr_encode_hyper(cd->offset, offset64);
>> }
>> + cd->offset = NULL;
>> }
>>
>> /*
>> --
>> 2.14.0.rc0.dirty
>>


Attachments:
signature.asc (847.00 B)

2019-03-05 22:05:24

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH] nfsd: fix memory corruption caused by readdir

On Tue, Mar 05, 2019 at 10:48:45AM +1100, NeilBrown wrote:
> On Mon, Mar 04 2019, J. Bruce Fields wrote:
>
> > On Mon, Mar 04, 2019 at 02:08:22PM +1100, NeilBrown wrote:
> >> (Note that the commit hash in the Fixes tag is from the 'history'
> >> tree - this bug predates git).
> >> Fixes: eb229d253e6c ("[PATCH] kNFSd: fix two xdr-encode bugs for readdirplus reply")
> >
> > It'd be nice to provide a URL for that. The one I originally cloned one
> > seems to have disappeared.
>
> Fixes-URL: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=eb229d253e6c
>
> Though on reflection, that didn't introduce the bug, it just failed to
> fix it properly. It should be:
>
> Fixes: 0b1d57cf7654 ("[PATCH] kNFSd: Fix nfs3 dentry encoding")
> Fixes-URL: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=0b1d57cf7654

Oh, so we can blame Olaf. Even better.

> > And how did it go undetected so long, and what caused it to surface just
> > now?
>
> I suspect two different things need to come together to trigger the bug.
> 1/ a directory needs to have filename lengths which cause the xdr
> encoding of the readdirplus reply to place the offset across a page
> boundary.
> A typical entry is around 200 bytes, or 50 quads, so there should be
> a 1:50 chance of hitting that, assuming name lengths are evenly
> distributed (which they aren't).
> In the case which triggered the bug, all file names were 43 bytes,
> all filehandles 28 bytes. This means 192 bytes per entry.
> 21 entries fit in a page leaving 64 bytes. This puts the cookie
> on the page boundary.
>
> 2/ The *next* entry after the one that crosses the page boundary doesn't
> fit. In the cases which triggered, the requested size was 0x1110
> (4368).
> That is enough room for 21 entries, but not for 22.
>
> So presumably the client doesn't run Linux - which always asks
> for 4096 bytes of directory entry (from a Linux server).
> I have no idea what clients the customer was using, but these clients
> seem to have a fairly good chance of triggering the bug (when configured
> like the customer configured them - maybe).

Thanks for the explanation!

> > I once thought about converting this over to the xdr_stream api that
> > NFSv4 uses to hide the page-crossing logic now. But I think it's better
> > to leave it alone.
>
> I agree - the code isn't being actively developed, so stability wins
> over elegance.
>
>
> BTW, the readdir (non-plus) code doesn't really need fixing.
> nfs3svc_decode_readdirargs() caps the ->count at PAGE_SIZE, so the cookie
> can never cross pages. nfs3svc_decode_readdirplusargs() caps it
> at max_blocksize. So if you feel like leaving that part of the change
> out, I probably wouldn't complain.

Eh, makes sense to me to fix it.

--b.

2019-03-06 22:50:34

by NeilBrown

[permalink] [raw]
Subject: [PATCH] nfsd: allow nfsv3 readdir request to be larger.


nfsd currently reports the NFSv4 dtpref FSINFO parameter
to be PAGE_SIZE, so NFS clients will typically ask for one
page of directory entries at a time. This is needlessly restrictive
as nfsd can handle larger replies easily.

Also, a READDIR request (but not a READDIRPLUS request) has the count
size clipped to PAGE_SIE, again unnecessary.

This patch lifts these limits so that larger readdir requests can be
used.

Signed-off-by: NeilBrown <[email protected]>
---
fs/nfsd/nfs3proc.c | 2 +-
fs/nfsd/nfs3xdr.c | 4 +++-
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index c9cf46e0c040..8f933e84cec1 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -588,7 +588,7 @@ nfsd3_proc_fsinfo(struct svc_rqst *rqstp)
resp->f_wtmax = max_blocksize;
resp->f_wtpref = max_blocksize;
resp->f_wtmult = PAGE_SIZE;
- resp->f_dtpref = PAGE_SIZE;
+ resp->f_dtpref = max_blocksize;
resp->f_maxfilesize = ~(u32) 0;
resp->f_properties = NFS3_FSF_DEFAULT;

diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
index 83919116d5cb..93fea246f676 100644
--- a/fs/nfsd/nfs3xdr.c
+++ b/fs/nfsd/nfs3xdr.c
@@ -573,6 +573,8 @@ int
nfs3svc_decode_readdirargs(struct svc_rqst *rqstp, __be32 *p)
{
struct nfsd3_readdirargs *args = rqstp->rq_argp;
+ u32 max_blocksize = svc_max_payload(rqstp);
+
p = decode_fh(p, &args->fh);
if (!p)
return 0;
@@ -580,7 +582,7 @@ nfs3svc_decode_readdirargs(struct svc_rqst *rqstp, __be32 *p)
args->verf = p; p += 2;
args->dircount = ~0;
args->count = ntohl(*p++);
- args->count = min_t(u32, args->count, PAGE_SIZE);
+ args->count = min_t(u32, args->count, max_blocksize);
args->buffer = page_address(*(rqstp->rq_next_page++));

return xdr_argsize_check(rqstp, p);
--
2.14.0.rc0.dirty


Attachments:
signature.asc (847.00 B)

2019-03-08 19:30:26

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH] nfsd: allow nfsv3 readdir request to be larger.

Thanks! Applied.

On Thu, Mar 07, 2019 at 09:49:46AM +1100, NeilBrown wrote:
>
> nfsd currently reports the NFSv4 dtpref FSINFO parameter

(Just changed NFSv4 here to NFSv3.)--b.

> to be PAGE_SIZE, so NFS clients will typically ask for one
> page of directory entries at a time. This is needlessly restrictive
> as nfsd can handle larger replies easily.
>
> Also, a READDIR request (but not a READDIRPLUS request) has the count
> size clipped to PAGE_SIE, again unnecessary.
>
> This patch lifts these limits so that larger readdir requests can be
> used.
>
> Signed-off-by: NeilBrown <[email protected]>
> ---
> fs/nfsd/nfs3proc.c | 2 +-
> fs/nfsd/nfs3xdr.c | 4 +++-
> 2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
> index c9cf46e0c040..8f933e84cec1 100644
> --- a/fs/nfsd/nfs3proc.c
> +++ b/fs/nfsd/nfs3proc.c
> @@ -588,7 +588,7 @@ nfsd3_proc_fsinfo(struct svc_rqst *rqstp)
> resp->f_wtmax = max_blocksize;
> resp->f_wtpref = max_blocksize;
> resp->f_wtmult = PAGE_SIZE;
> - resp->f_dtpref = PAGE_SIZE;
> + resp->f_dtpref = max_blocksize;
> resp->f_maxfilesize = ~(u32) 0;
> resp->f_properties = NFS3_FSF_DEFAULT;
>
> diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
> index 83919116d5cb..93fea246f676 100644
> --- a/fs/nfsd/nfs3xdr.c
> +++ b/fs/nfsd/nfs3xdr.c
> @@ -573,6 +573,8 @@ int
> nfs3svc_decode_readdirargs(struct svc_rqst *rqstp, __be32 *p)
> {
> struct nfsd3_readdirargs *args = rqstp->rq_argp;
> + u32 max_blocksize = svc_max_payload(rqstp);
> +
> p = decode_fh(p, &args->fh);
> if (!p)
> return 0;
> @@ -580,7 +582,7 @@ nfs3svc_decode_readdirargs(struct svc_rqst *rqstp, __be32 *p)
> args->verf = p; p += 2;
> args->dircount = ~0;
> args->count = ntohl(*p++);
> - args->count = min_t(u32, args->count, PAGE_SIZE);
> + args->count = min_t(u32, args->count, max_blocksize);
> args->buffer = page_address(*(rqstp->rq_next_page++));
>
> return xdr_argsize_check(rqstp, p);
> --
> 2.14.0.rc0.dirty
>