From: "J. Bruce Fields" <[email protected]>
An NFS operation that creates a new symlink includes the symlink data,
which is xdr-encoded as a length followed by the data plus 0 to 3 bytes
of zero-padding as required to reach a 4-byte boundary.
The vfs, on the other hand, wants null-terminated data.
The simple way to handle this would be by copying the data into a newly
allocated buffer with space for the final null.
The current nfsd_symlink code tries to be more clever by skipping that
step in the (likely) case where the byte following the string is already
0.
But that assumes that the byte following the string is ours to look at.
In fact, it might be the first byte of a page that we can't read, or of
some object that another task might modify.
Worse, the NFSv4 code tries to fix the problem by actually writing to
that byte.
In the NFSv2/v3 cases this actually appears to be safe:
- nfs3svc_decode_symlinkargs explicitly null-terminates the data
(after first checking its length and copying it to a new
page).
- NFSv2 limits symlinks to 1k. The buffer holding the rpc
request is always at least a page, and the link data (and
previous fields) have maximum lengths that prevent the request
from reaching the end of a page.
In the NFSv4 case the CREATE op is potentially just one part of a long
compound so can end up on the end of a page if you're unlucky.
The minimal fix here is to copy and null-terminate in the NFSv4 case.
The nfsd_symlink() interface here seems too fragile, though. It should
really either do the copy itself every time or just require a
null-terminated string.
Reported-by: Jeff Layton <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4proc.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 6851b00..453c907 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -601,6 +601,8 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
{
struct svc_fh resfh;
__be32 status;
+ u32 len;
+ char *data;
dev_t rdev;
fh_init(&resfh, NFS4_FHSIZE);
@@ -617,19 +619,21 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
switch (create->cr_type) {
case NF4LNK:
- /* ugh! we have to null-terminate the linktext, or
- * vfs_symlink() will choke. it is always safe to
- * null-terminate by brute force, since at worst we
- * will overwrite the first byte of the create namelen
- * in the XDR buffer, which has already been extracted
- * during XDR decode.
+ len = create->cr_linklen;
+ data = kmalloc(len + 1, GFP_KERNEL);
+ /*
+ * Null-terminating in place isn't safe since
+ * cr_linkname might end on a page boundary.
*/
- create->cr_linkname[create->cr_linklen] = 0;
-
+ if (!data)
+ return nfserr_jukebox;
+ memcpy(data, create->cr_linkname, len + 1);
+ data[len] = '\0';
status = nfsd_symlink(rqstp, &cstate->current_fh,
create->cr_name, create->cr_namelen,
- create->cr_linkname, create->cr_linklen,
+ data, len,
&resfh, &create->cr_iattr);
+ kfree(data);
break;
case NF4BLK:
--
1.7.9.5
On Mon, Jun 23, 2014 at 07:28:16PM -0400, Trond Myklebust wrote:
> On Mon, Jun 23, 2014 at 6:10 PM, J. Bruce Fields <[email protected]> wrote:
> > diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> > index 2d305a1..56bdf4a 100644
> > --- a/fs/nfsd/nfs4xdr.c
> > +++ b/fs/nfsd/nfs4xdr.c
> > @@ -600,7 +600,18 @@ nfsd4_decode_create(struct nfsd4_compoundargs *argp, struct nfsd4_create *create
> > READ_BUF(4);
> > create->cr_linklen = be32_to_cpup(p++);
> > READ_BUF(create->cr_linklen);
> > - SAVEMEM(create->cr_linkname, create->cr_linklen);
> > + /*
> > + * The VFS will want a null-terminated string, and
> > + * null-terminating in place isn't safe since this might
> > + * end on a page boundary:
> > + */
> > + create->cr_linkname =
> > + kmalloc(create->cr_linklen + 1, GFP_KERNEL);
> > + if (!create->cr_linkname)
> > + return nfserr_jukebox;
> > + memcpy(create->cr_linkname, p, create->cr_linklen);
> > + create->cr_linkname[create->cr_linklen] = '\0';
> > + defer_free(argp, kfree, create->cr_linkname);
> > break;
> > case NF4BLK:
> > case NF4CHR:
>
> Note that "defer_free()" does yet another allocation here in order to
> make space for a small 'struct tmpbuf' structure. Unlike the first
> allocation, there is no check for whether or not that second
> allocation is successful above, so you can easily end up with a silent
> memory leakage (ditto for a number of other callers of defer_free()).
>
> Looking around at all the other users, wouldn't it perhaps make sense
> to replace defer_free() with a helper that does just a single
> allocation for both the memory buffer and the struct tmpbuf?
Yeah, thanks, working on it....
--b.
On Mon, Jun 23, 2014 at 6:10 PM, J. Bruce Fields <[email protected]> wrote:
> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index 2d305a1..56bdf4a 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -600,7 +600,18 @@ nfsd4_decode_create(struct nfsd4_compoundargs *argp, struct nfsd4_create *create
> READ_BUF(4);
> create->cr_linklen = be32_to_cpup(p++);
> READ_BUF(create->cr_linklen);
> - SAVEMEM(create->cr_linkname, create->cr_linklen);
> + /*
> + * The VFS will want a null-terminated string, and
> + * null-terminating in place isn't safe since this might
> + * end on a page boundary:
> + */
> + create->cr_linkname =
> + kmalloc(create->cr_linklen + 1, GFP_KERNEL);
> + if (!create->cr_linkname)
> + return nfserr_jukebox;
> + memcpy(create->cr_linkname, p, create->cr_linklen);
> + create->cr_linkname[create->cr_linklen] = '\0';
> + defer_free(argp, kfree, create->cr_linkname);
> break;
> case NF4BLK:
> case NF4CHR:
Note that "defer_free()" does yet another allocation here in order to
make space for a small 'struct tmpbuf' structure. Unlike the first
allocation, there is no check for whether or not that second
allocation is successful above, so you can easily end up with a silent
memory leakage (ditto for a number of other callers of defer_free()).
Looking around at all the other users, wouldn't it perhaps make sense
to replace defer_free() with a helper that does just a single
allocation for both the memory buffer and the struct tmpbuf?
Cheers
Trond
--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
From: "J. Bruce Fields" <[email protected]>
It's simple enough for NFSv2 to null-terminate the symlink data.
A bit weird (it depends on knowing that we've already read the following
byte, which is either padding or part of the mode), but no worse than
the conditional kstrdup it otherwise relies on in nfsd_symlink().
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfsproc.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
index 54c6b3d..aebe23c 100644
--- a/fs/nfsd/nfsproc.c
+++ b/fs/nfsd/nfsproc.c
@@ -403,8 +403,11 @@ nfsd_proc_symlink(struct svc_rqst *rqstp, struct nfsd_symlinkargs *argp,
fh_init(&newfh, NFS_FHSIZE);
/*
- * Create the link, look up new file and set attrs.
+ * Crazy hack: the request fits in a page, and already-decoded
+ * attributes follow argp->tname, so it's safe to just write a
+ * null to ensure it's null-terminated:
*/
+ argp->tname[argp->tlen] = '\0';
nfserr = nfsd_symlink(rqstp, &argp->ffh, argp->fname, argp->flen,
argp->tname, argp->tlen,
&newfh, &argp->attrs);
--
1.7.9.5
From: "J. Bruce Fields" <[email protected]>
Currently nfsd_symlink has a weird hack to serve callers who don't
null-terminate symlink data: it looks ahead at the next byte to see if
it's zero, and copies it to a new buffer to null-terminate if not.
That means callers don't have to null-terminate, but they *do* have to
ensure that the byte following the end of the data is theirs to read.
That's a bit subtle, and the NFSv4 code actually got this wrong.
So let's just throw out that code and let callers pass null-terminated
strings; we've already fixed them to do that.
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs3proc.c | 2 +-
fs/nfsd/nfs4proc.c | 3 +--
fs/nfsd/nfsproc.c | 2 +-
fs/nfsd/vfs.c | 17 +++--------------
fs/nfsd/vfs.h | 2 +-
5 files changed, 7 insertions(+), 19 deletions(-)
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index 4012899..4abf4dc 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -286,7 +286,7 @@ nfsd3_proc_symlink(struct svc_rqst *rqstp, struct nfsd3_symlinkargs *argp,
fh_copy(&resp->dirfh, &argp->ffh);
fh_init(&resp->fh, NFS3_FHSIZE);
nfserr = nfsd_symlink(rqstp, &resp->dirfh, argp->fname, argp->flen,
- argp->tname, argp->tlen,
+ argp->tname,
&resp->fh, &argp->attrs);
RETURN_STATUS(nfserr);
}
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 453c907..4ee72a3 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -631,8 +631,7 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
data[len] = '\0';
status = nfsd_symlink(rqstp, &cstate->current_fh,
create->cr_name, create->cr_namelen,
- data, len,
- &resfh, &create->cr_iattr);
+ data, &resfh, &create->cr_iattr);
kfree(data);
break;
diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
index aebe23c..583ed03 100644
--- a/fs/nfsd/nfsproc.c
+++ b/fs/nfsd/nfsproc.c
@@ -409,7 +409,7 @@ nfsd_proc_symlink(struct svc_rqst *rqstp, struct nfsd_symlinkargs *argp,
*/
argp->tname[argp->tlen] = '\0';
nfserr = nfsd_symlink(rqstp, &argp->ffh, argp->fname, argp->flen,
- argp->tname, argp->tlen,
+ argp->tname,
&newfh, &argp->attrs);
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 140c496..575fa9c 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1504,7 +1504,7 @@ out_nfserr:
__be32
nfsd_symlink(struct svc_rqst *rqstp, struct svc_fh *fhp,
char *fname, int flen,
- char *path, int plen,
+ char *path,
struct svc_fh *resfhp,
struct iattr *iap)
{
@@ -1513,7 +1513,7 @@ nfsd_symlink(struct svc_rqst *rqstp, struct svc_fh *fhp,
int host_err;
err = nfserr_noent;
- if (!flen || !plen)
+ if (!flen || path[0] == '\0')
goto out;
err = nfserr_exist;
if (isdotent(fname, flen))
@@ -1534,18 +1534,7 @@ nfsd_symlink(struct svc_rqst *rqstp, struct svc_fh *fhp,
if (IS_ERR(dnew))
goto out_nfserr;
- if (unlikely(path[plen] != 0)) {
- char *path_alloced = kmalloc(plen+1, GFP_KERNEL);
- if (path_alloced == NULL)
- host_err = -ENOMEM;
- else {
- strncpy(path_alloced, path, plen);
- path_alloced[plen] = 0;
- host_err = vfs_symlink(dentry->d_inode, dnew, path_alloced);
- kfree(path_alloced);
- }
- } else
- host_err = vfs_symlink(dentry->d_inode, dnew, path);
+ host_err = vfs_symlink(dentry->d_inode, dnew, path);
err = nfserrno(host_err);
if (!err)
err = nfserrno(commit_metadata(fhp));
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 91b6ae3..9b9eb91 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -85,7 +85,7 @@ __be32 nfsd_write(struct svc_rqst *, struct svc_fh *,struct file *,
__be32 nfsd_readlink(struct svc_rqst *, struct svc_fh *,
char *, int *);
__be32 nfsd_symlink(struct svc_rqst *, struct svc_fh *,
- char *name, int len, char *path, int plen,
+ char *name, int len, char *path,
struct svc_fh *res, struct iattr *);
__be32 nfsd_link(struct svc_rqst *, struct svc_fh *,
char *, int, struct svc_fh *);
--
1.7.9.5
On Mon, Jun 23, 2014 at 5:34 PM, J. Bruce Fields <[email protected]> wrote:
> From: "J. Bruce Fields" <[email protected]>
>
> An NFS operation that creates a new symlink includes the symlink data,
> which is xdr-encoded as a length followed by the data plus 0 to 3 bytes
> of zero-padding as required to reach a 4-byte boundary.
>
> The vfs, on the other hand, wants null-terminated data.
>
> The simple way to handle this would be by copying the data into a newly
> allocated buffer with space for the final null.
>
> The current nfsd_symlink code tries to be more clever by skipping that
> step in the (likely) case where the byte following the string is already
> 0.
>
> But that assumes that the byte following the string is ours to look at.
> In fact, it might be the first byte of a page that we can't read, or of
> some object that another task might modify.
>
> Worse, the NFSv4 code tries to fix the problem by actually writing to
> that byte.
>
> In the NFSv2/v3 cases this actually appears to be safe:
>
> - nfs3svc_decode_symlinkargs explicitly null-terminates the data
> (after first checking its length and copying it to a new
> page).
> - NFSv2 limits symlinks to 1k. The buffer holding the rpc
> request is always at least a page, and the link data (and
> previous fields) have maximum lengths that prevent the request
> from reaching the end of a page.
>
> In the NFSv4 case the CREATE op is potentially just one part of a long
> compound so can end up on the end of a page if you're unlucky.
>
> The minimal fix here is to copy and null-terminate in the NFSv4 case.
> The nfsd_symlink() interface here seems too fragile, though. It should
> really either do the copy itself every time or just require a
> null-terminated string.
>
> Reported-by: Jeff Layton <[email protected]>
> Signed-off-by: J. Bruce Fields <[email protected]>
> ---
> fs/nfsd/nfs4proc.c | 22 +++++++++++++---------
> 1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index 6851b00..453c907 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -601,6 +601,8 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> {
> struct svc_fh resfh;
> __be32 status;
> + u32 len;
> + char *data;
> dev_t rdev;
>
> fh_init(&resfh, NFS4_FHSIZE);
> @@ -617,19 +619,21 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>
> switch (create->cr_type) {
> case NF4LNK:
> - /* ugh! we have to null-terminate the linktext, or
> - * vfs_symlink() will choke. it is always safe to
> - * null-terminate by brute force, since at worst we
> - * will overwrite the first byte of the create namelen
> - * in the XDR buffer, which has already been extracted
> - * during XDR decode.
> + len = create->cr_linklen;
> + data = kmalloc(len + 1, GFP_KERNEL);
> + /*
> + * Null-terminating in place isn't safe since
> + * cr_linkname might end on a page boundary.
> */
> - create->cr_linkname[create->cr_linklen] = 0;
> -
> + if (!data)
> + return nfserr_jukebox;
> + memcpy(data, create->cr_linkname, len + 1);
Shouldn't that be a copy of 'len' bytes?
> + data[len] = '\0';
> status = nfsd_symlink(rqstp, &cstate->current_fh,
> create->cr_name, create->cr_namelen,
> - create->cr_linkname, create->cr_linklen,
> + data, len,
> &resfh, &create->cr_iattr);
> + kfree(data);
> break;
>
> case NF4BLK:
> --
> 1.7.9.5
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
On Mon, Jun 23, 2014 at 05:41:31PM -0400, Trond Myklebust wrote:
> On Mon, Jun 23, 2014 at 5:34 PM, J. Bruce Fields <[email protected]> wrote:
> > + len = create->cr_linklen;
> > + data = kmalloc(len + 1, GFP_KERNEL);
> > + /*
> > + * Null-terminating in place isn't safe since
> > + * cr_linkname might end on a page boundary.
> > */
> > - create->cr_linkname[create->cr_linklen] = 0;
> > -
> > + if (!data)
> > + return nfserr_jukebox;
> > + memcpy(data, create->cr_linkname, len + 1);
>
> Shouldn't that be a copy of 'len' bytes?
Doh, yes.
After sending out I realized that we'd probably rather than do a single
alloc© instead of two, so I'm thinking of doing the following (still
testing).
But my first draft of that had the same problem that you pointed out
here!
--b.
commit f719db9342235b3ebd4d65b9944fce9168177682
Author: J. Bruce Fields <[email protected]>
Date: Thu Jun 19 16:44:48 2014 -0400
nfsd: fix rare symlink decoding bug
An NFS operation that creates a new symlink includes the symlink data,
which is xdr-encoded as a length followed by the data plus 0 to 3 bytes
of zero-padding as required to reach a 4-byte boundary.
The vfs, on the other hand, wants null-terminated data.
The simple way to handle this would be by copying the data into a newly
allocated buffer with space for the final null.
The current nfsd_symlink code tries to be more clever by skipping that
step in the (likely) case where the byte following the string is already
0.
But that assumes that the byte following the string is ours to look at.
In fact, it might be the first byte of a page that we can't read, or of
some object that another task might modify.
Worse, the NFSv4 code tries to fix the problem by actually writing to
that byte.
In the NFSv2/v3 cases this actually appears to be safe:
- nfs3svc_decode_symlinkargs explicitly null-terminates the data
(after first checking its length and copying it to a new
page).
- NFSv2 limits symlinks to 1k. The buffer holding the rpc
request is always at least a page, and the link data (and
previous fields) have maximum lengths that prevent the request
from reaching the end of a page.
In the NFSv4 case the CREATE op is potentially just one part of a long
compound so can end up on the end of a page if you're unlucky.
The minimal fix here is to copy and null-terminate in the NFSv4 case.
The nfsd_symlink() interface here seems too fragile, though. It should
really either do the copy itself every time or just require a
null-terminated string.
Reported-by: Jeff Layton <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 6851b00..8f029db 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -617,15 +617,6 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
switch (create->cr_type) {
case NF4LNK:
- /* ugh! we have to null-terminate the linktext, or
- * vfs_symlink() will choke. it is always safe to
- * null-terminate by brute force, since at worst we
- * will overwrite the first byte of the create namelen
- * in the XDR buffer, which has already been extracted
- * during XDR decode.
- */
- create->cr_linkname[create->cr_linklen] = 0;
-
status = nfsd_symlink(rqstp, &cstate->current_fh,
create->cr_name, create->cr_namelen,
create->cr_linkname, create->cr_linklen,
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 2d305a1..56bdf4a 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -600,7 +600,18 @@ nfsd4_decode_create(struct nfsd4_compoundargs *argp, struct nfsd4_create *create
READ_BUF(4);
create->cr_linklen = be32_to_cpup(p++);
READ_BUF(create->cr_linklen);
- SAVEMEM(create->cr_linkname, create->cr_linklen);
+ /*
+ * The VFS will want a null-terminated string, and
+ * null-terminating in place isn't safe since this might
+ * end on a page boundary:
+ */
+ create->cr_linkname =
+ kmalloc(create->cr_linklen + 1, GFP_KERNEL);
+ if (!create->cr_linkname)
+ return nfserr_jukebox;
+ memcpy(create->cr_linkname, p, create->cr_linklen);
+ create->cr_linkname[create->cr_linklen] = '\0';
+ defer_free(argp, kfree, create->cr_linkname);
break;
case NF4BLK:
case NF4CHR: