LinuxLists.cc - [PATCH v2 0/5] NFSD size, offset, and count sanity

2022-02-01 20:46:53

Subject: [PATCH v2 0/5] NFSD size, offset, and count sanity

Dan Aloni reported a problem with the way NFSD's READ implementation
deals with the very upper end of file sizes, and I got interested in
how some of the other operations handled it. I found some issues,
and have started a (growing) pile of patches to deal with them.

Since at least the SETATTR case appears to cause a crash on some
filesystems, I think several of these are 5.17-rc fodder (i.e.,
priority bug fixes). Dan's still working on the READ issue.

Changes since RFC:
- Series reordered so priority fixes come first
- Setattr size check is now in a common function
- Patch descriptions clarified

---

Chuck Lever (5):
NFSD: Fix ia_size underflow
NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes
NFSD: Clamp WRITE offsets
NFSD: COMMIT operations must not return NFS?ERR_INVAL
NFSD: Deprecate NFS_OFFSET_MAX

fs/nfsd/nfs3proc.c | 21 ++++++++++++-----
fs/nfsd/nfs3xdr.c | 4 ++--
fs/nfsd/nfs4proc.c | 5 ++--
fs/nfsd/nfs4xdr.c | 2 +-
fs/nfsd/vfs.c | 57 +++++++++++++++++++++++++++++++--------------
fs/nfsd/vfs.h | 4 ++--
include/linux/nfs.h | 8 -------
7 files changed, 63 insertions(+), 38 deletions(-)

--
Chuck Lever

2022-02-01 20:46:55

by Chuck Lever III

[permalink] [raw]

Subject: [PATCH v2 1/5] NFSD: Fix ia_size underflow

iattr::ia_size is a loff_t, which is a signed 64-bit type. NFSv3 and
NFSv4 both define file size as an unsigned 64-bit type. Thus there
is a range of valid file size values an NFS client can send that is
already larger than Linux can handle.

Currently decode_fattr4() dumps a full u64 value into ia_size. If
that value happens to be larger than S64_MAX, then ia_size
underflows. I'm about to fix up the NFSv3 behavior as well, so let's
catch the underflow in the common code path: nfsd_setattr().

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/vfs.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 99c2b9dfbb10..0cccceb105e7 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -435,6 +435,10 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
.ia_size = iap->ia_size,
};

+ host_err = -EFBIG;
+ if (iap->ia_size < 0)
+ goto out_unlock;
+
host_err = notify_change(&init_user_ns, dentry, &size_attr, NULL);
if (host_err)
goto out_unlock;

2022-02-01 20:46:59

by Chuck Lever III

[permalink] [raw]

Subject: [PATCH v2 3/5] NFSD: Clamp WRITE offsets

Ensure that a client cannot specify a WRITE range that falls in a
byte range outside what the kernel's internal types (such as loff_t,
which is signed) can represent. The kiocb iterators, invoked in
nfsd_vfs_write(), should properly limit write operations to within
the underlying file system's s_maxbytes.

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs3proc.c | 6 ++++++
fs/nfsd/nfs4proc.c | 5 +++--
2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index 02edc7074d06..4e939ebba5d5 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -203,6 +203,11 @@ nfsd3_proc_write(struct svc_rqst *rqstp)
(unsigned long long) argp->offset,
argp->stable? " stable" : "");

+ resp->status = nfserr_fbig;
+ if (argp->offset >= OFFSET_MAX ||
+ argp->offset + argp->len >= OFFSET_MAX)
+ goto out;
+
fh_copy(&resp->fh, &argp->fh);
resp->committed = argp->stable;
nvecs = svc_fill_write_vector(rqstp, &argp->payload);
@@ -211,6 +216,7 @@ nfsd3_proc_write(struct svc_rqst *rqstp)
rqstp->rq_vec, nvecs, &cnt,
resp->committed, resp->verf);
resp->count = cnt;
+out:
return rpc_success;
}

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index ed1ee25647be..807f41380e77 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1018,8 +1018,9 @@ nfsd4_write(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
unsigned long cnt;
int nvecs;

- if (write->wr_offset >= OFFSET_MAX)
- return nfserr_inval;
+ if (write->wr_offset >= OFFSET_MAX ||
+ write->wr_offset + write->wr_buflen >= OFFSET_MAX)
+ return nfserr_fbig;

cnt = write->wr_buflen;
trace_nfsd_write_start(rqstp, &cstate->current_fh,

2022-02-01 20:47:11

by Chuck Lever III

[permalink] [raw]

Subject: [PATCH v2 2/5] NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes

iattr::ia_size is a loff_t, so these NFSv3 procedures must be
careful to deal with incoming client size values that are larger
than s64_max without corrupting the value.

Silently capping the value results in storing a different value
than the client passed in which is unexpected behavior, so remove
the min_t() check in decode_sattr3().

Moreover, a large file size is not an XDR error, since anything up
to U64_MAX is permitted for NFSv3 file size values. So it has to be
dealt with in nfs3proc.c, not in the XDR decoder.

Size comparisons like in inode_newsize_ok should now work as
expected -- the VFS returns -EFBIG if the new size is larger than
the underlying filesystem's s_maxbytes.

However, RFC 1813 permits only the WRITE procedure to return
NFS3ERR_FBIG. Extra checks are needed to prevent NFSv3 SETATTR and
CREATE from returning FBIG. Unfortunately RFC 1813 does not provide
a specific status code for either procedure to indicate this
specific failure, so I've chosen NFS3ERR_INVAL for SETATTR and
NFS3ERR_IO for CREATE.

Applications and NFS clients might be better served if the server
stuck with NFS3ERR_FBIG despite what RFC 1813 says.

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs3proc.c | 9 +++++++++
fs/nfsd/nfs3xdr.c | 2 +-
2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index 8ef53f6726ec..02edc7074d06 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -73,6 +73,10 @@ nfsd3_proc_setattr(struct svc_rqst *rqstp)
fh_copy(&resp->fh, &argp->fh);
resp->status = nfsd_setattr(rqstp, &resp->fh, &argp->attrs,
argp->check_guard, argp->guardtime);
+
+ if (resp->status == nfserr_fbig)
+ resp->status = nfserr_inval;
+
return rpc_success;
}

@@ -245,6 +249,11 @@ nfsd3_proc_create(struct svc_rqst *rqstp)
resp->status = do_nfsd_create(rqstp, dirfhp, argp->name, argp->len,
attr, newfhp, argp->createmode,
(u32 *)argp->verf, NULL, NULL);
+
+ /* CREATE must not return NFS3ERR_FBIG */
+ if (resp->status == nfserr_fbig)
+ resp->status = nfserr_io;
+
return rpc_success;
}

diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
index 7c45ba4db61b..2e47a07029f1 100644
--- a/fs/nfsd/nfs3xdr.c
+++ b/fs/nfsd/nfs3xdr.c
@@ -254,7 +254,7 @@ svcxdr_decode_sattr3(struct svc_rqst *rqstp, struct xdr_stream *xdr,
if (xdr_stream_decode_u64(xdr, &newsize) < 0)
return false;
iap->ia_valid |= ATTR_SIZE;
- iap->ia_size = min_t(u64, newsize, NFS_OFFSET_MAX);
+ iap->ia_size = newsize;
}
if (xdr_stream_decode_u32(xdr, &set_it) < 0)
return false;

2022-02-01 20:47:18

by Chuck Lever III

[permalink] [raw]

Subject: [PATCH v2 4/5] NFSD: COMMIT operations must not return NFS?ERR_INVAL

Since, well, forever, the Linux NFS server's nfsd_commit() function
has returned nfserr_inval when the passed-in byte range arguments
were non-sensical.

However, according to RFC 1813 section 3.3.21, NFSv3 COMMIT requests
are permitted to return only the following non-zero status codes:

NFS3ERR_IO
NFS3ERR_STALE
NFS3ERR_BADHANDLE
NFS3ERR_SERVERFAULT

NFS3ERR_INVAL is not included in that list. Likewise, NFS4ERR_INVAL
is not listed in the COMMIT row of Table 6 in RFC 8881.

RFC 7530 does permit COMMIT to return NFS4ERR_INVAL, but does not
specify when it can or should be used.

Instead of dropping or failing a COMMIT request in a byte range that
is not supported, turn it into a valid request by treating one or
both arguments as zero. Offset zero means start-of-file, count zero
means until-end-of-file, so we only ever extend the commit range.
NFS servers are always allowed to commit more and sooner than
requested.

The range check is no longer bounded by NFS_OFFSET_MAX, but rather
by the value that is returned in the maxfilesize field of the NFSv3
FSINFO procedure or the NFSv4 maxfilesize file attribute.

Note that this change results in a new pynfs failure:

CMT4 st_commit.testCommitOverflow : RUNNING
CMT4 st_commit.testCommitOverflow : FAILURE
COMMIT with offset + count overflow should return
NFS4ERR_INVAL, instead got NFS4_OK

IMO the test is not correct as written: RFC 8881 does not allow the
COMMIT operation to return NFS4ERR_INVAL.

Reported-by: Dan Aloni <[email protected]>
Signed-off-by: Chuck Lever <[email protected]>
Reviewed-by: Bruce Fields <[email protected]>
---
fs/nfsd/nfs3proc.c | 6 ------
fs/nfsd/vfs.c | 53 +++++++++++++++++++++++++++++++++++-----------------
fs/nfsd/vfs.h | 4 ++--
3 files changed, 38 insertions(+), 25 deletions(-)

diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index 4e939ebba5d5..79831da12462 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -666,15 +666,9 @@ nfsd3_proc_commit(struct svc_rqst *rqstp)
argp->count,
(unsigned long long) argp->offset);

- if (argp->offset > NFS_OFFSET_MAX) {
- resp->status = nfserr_inval;
- goto out;
- }
-
fh_copy(&resp->fh, &argp->fh);
resp->status = nfsd_commit(rqstp, &resp->fh, argp->offset,
argp->count, resp->verf);
-out:
return rpc_success;
}

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 0cccceb105e7..91600e71be19 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1114,42 +1114,61 @@ nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp, loff_t offset,
}

#ifdef CONFIG_NFSD_V3
-/*
- * Commit all pending writes to stable storage.
+/**
+ * nfsd_commit - Commit pending writes to stable storage
+ * @rqstp: RPC request being processed
+ * @fhp: NFS filehandle
+ * @offset: raw offset from beginning of file
+ * @count: raw count of bytes to sync
+ * @verf: filled in with the server's current write verifier
*
- * Note: we only guarantee that data that lies within the range specified
- * by the 'offset' and 'count' parameters will be synced.
+ * Note: we guarantee that data that lies within the range specified
+ * by the 'offset' and 'count' parameters will be synced. The server
+ * is permitted to sync data that lies outside this range at the
+ * same time.
*
* Unfortunately we cannot lock the file to make sure we return full WCC
* data to the client, as locking happens lower down in the filesystem.
+ *
+ * Return values:
+ * An nfsstat value in network byte order.
*/
__be32
-nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp,
- loff_t offset, unsigned long count, __be32 *verf)
+nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, u64 offset,
+ u32 count, __be32 *verf)
{
+ u64 maxbytes;
+ loff_t start, end;
struct nfsd_net *nn;
struct nfsd_file *nf;
- loff_t end = LLONG_MAX;
- __be32 err = nfserr_inval;
-
- if (offset < 0)
- goto out;
- if (count != 0) {
- end = offset + (loff_t)count - 1;
- if (end < offset)
- goto out;
- }
+ __be32 err;

err = nfsd_file_acquire(rqstp, fhp,
NFSD_MAY_WRITE|NFSD_MAY_NOT_BREAK_LEASE, &nf);
if (err)
goto out;
+
+ /*
+ * Convert the client-provided (offset, count) range to a
+ * (start, end) range. If the client-provided range falls
+ * outside the maximum file size of the underlying FS,
+ * clamp the sync range appropriately.
+ */
+ start = 0;
+ end = LLONG_MAX;
+ maxbytes = (u64)fhp->fh_dentry->d_sb->s_maxbytes;
+ if (offset < maxbytes) {
+ start = offset;
+ if (count && (offset + count - 1 < maxbytes))
+ end = offset + count - 1;
+ }
+
nn = net_generic(nf->nf_net, nfsd_net_id);
if (EX_ISSYNC(fhp->fh_export)) {
errseq_t since = READ_ONCE(nf->nf_file->f_wb_err);
int err2;

- err2 = vfs_fsync_range(nf->nf_file, offset, end, 0);
+ err2 = vfs_fsync_range(nf->nf_file, start, end, 0);
switch (err2) {
case 0:
nfsd_copy_write_verifier(verf, nn);
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 9f56dcb22ff7..2c43d10e3cab 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -74,8 +74,8 @@ __be32 do_nfsd_create(struct svc_rqst *, struct svc_fh *,
char *name, int len, struct iattr *attrs,
struct svc_fh *res, int createmode,
u32 *verifier, bool *truncp, bool *created);
-__be32 nfsd_commit(struct svc_rqst *, struct svc_fh *,
- loff_t, unsigned long, __be32 *verf);
+__be32 nfsd_commit(struct svc_rqst *rqst, struct svc_fh *fhp,
+ u64 offset, u32 count, __be32 *verf);
#endif /* CONFIG_NFSD_V3 */
#ifdef CONFIG_NFSD_V4
__be32 nfsd_getxattr(struct svc_rqst *rqstp, struct svc_fh *fhp,

2022-02-01 20:47:19

by Chuck Lever III

[permalink] [raw]

Subject: [PATCH v2 5/5] NFSD: Deprecate NFS_OFFSET_MAX

NFS_OFFSET_MAX was introduced way back in Linux v2.3.y before there
was a kernel-wide OFFSET_MAX value. As a clean up, replace the last
few uses of it with its generic equivalent, and get rid of it.

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs3xdr.c | 2 +-
fs/nfsd/nfs4xdr.c | 2 +-
include/linux/nfs.h | 8 --------
3 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
index 2e47a07029f1..0293b8d65f10 100644
--- a/fs/nfsd/nfs3xdr.c
+++ b/fs/nfsd/nfs3xdr.c
@@ -1060,7 +1060,7 @@ svcxdr_encode_entry3_common(struct nfsd3_readdirres *resp, const char *name,
return false;
/* cookie */
resp->cookie_offset = dirlist->len;
- if (xdr_stream_encode_u64(xdr, NFS_OFFSET_MAX) < 0)
+ if (xdr_stream_encode_u64(xdr, OFFSET_MAX) < 0)
return false;

return true;
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 7d2217cdaeaa..64d73b750491 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3505,7 +3505,7 @@ nfsd4_encode_dirent(void *ccdv, const char *name, int namlen,
p = xdr_reserve_space(xdr, 3*4 + namlen);
if (!p)
goto fail;
- p = xdr_encode_hyper(p, NFS_OFFSET_MAX); /* offset of next entry */
+ p = xdr_encode_hyper(p, OFFSET_MAX); /* offset of next entry */
p = xdr_encode_array(p, name, namlen); /* name length & name */

nfserr = nfsd4_encode_dirent_fattr(xdr, cd, name, namlen);
diff --git a/include/linux/nfs.h b/include/linux/nfs.h
index 0dc7ad38a0da..b06375e88e58 100644
--- a/include/linux/nfs.h
+++ b/include/linux/nfs.h
@@ -36,14 +36,6 @@ static inline void nfs_copy_fh(struct nfs_fh *target, const struct nfs_fh *sourc
memcpy(target->data, source->data, source->size);
}

-
-/*
- * This is really a general kernel constant, but since nothing like
- * this is defined in the kernel headers, I have to do it here.
- */
-#define NFS_OFFSET_MAX ((__s64)((~(__u64)0) >> 1))
-
-
enum nfs3_stable_how {
NFS_UNSTABLE = 0,
NFS_DATA_SYNC = 1,

2022-02-01 20:47:34

by Trond Myklebust

[permalink] [raw]

Subject: Re: [PATCH v2 2/5] NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes

On Mon, 2022-01-31 at 13:24 -0500, Chuck Lever wrote:
> iattr::ia_size is a loff_t, so these NFSv3 procedures must be
> careful to deal with incoming client size values that are larger
> than s64_max without corrupting the value.
>
> Silently capping the value results in storing a different value
> than the client passed in which is unexpected behavior, so remove
> the min_t() check in decode_sattr3().
>
> Moreover, a large file size is not an XDR error, since anything up
> to U64_MAX is permitted for NFSv3 file size values. So it has to be
> dealt with in nfs3proc.c, not in the XDR decoder.
>
> Size comparisons like in inode_newsize_ok should now work as
> expected -- the VFS returns -EFBIG if the new size is larger than
> the underlying filesystem's s_maxbytes.
>
> However, RFC 1813 permits only the WRITE procedure to return
> NFS3ERR_FBIG. Extra checks are needed to prevent NFSv3 SETATTR and
> CREATE from returning FBIG. Unfortunately RFC 1813 does not provide
> a specific status code for either procedure to indicate this
> specific failure, so I've chosen NFS3ERR_INVAL for SETATTR and
> NFS3ERR_IO for CREATE.
>
> Applications and NFS clients might be better served if the server
> stuck with NFS3ERR_FBIG despite what RFC 1813 says.
>
> Signed-off-by: Chuck Lever <[email protected]>
> ---
> fs/nfsd/nfs3proc.c |    9 +++++++++
> fs/nfsd/nfs3xdr.c |    2 +-
> 2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
> index 8ef53f6726ec..02edc7074d06 100644
> --- a/fs/nfsd/nfs3proc.c
> +++ b/fs/nfsd/nfs3proc.c
> @@ -73,6 +73,10 @@ nfsd3_proc_setattr(struct svc_rqst *rqstp)
>         fh_copy(&resp->fh, &argp->fh);
>         resp->status = nfsd_setattr(rqstp, &resp->fh, &argp->attrs,
>                                     argp->check_guard, argp-
> >guardtime);
> +
> +       if (resp->status == nfserr_fbig)
> +               resp->status = nfserr_inval;
> +
>         return rpc_success;
> }
>
> @@ -245,6 +249,11 @@ nfsd3_proc_create(struct svc_rqst *rqstp)
>         resp->status = do_nfsd_create(rqstp, dirfhp, argp->name,
> argp->len,
>                                       attr, newfhp, argp->createmode,
>                                       (u32 *)argp->verf, NULL, NULL);
> +
> +       /* CREATE must not return NFS3ERR_FBIG */
> +       if (resp->status == nfserr_fbig)
> +               resp->status = nfserr_io;
> +
>         return rpc_success;
> }
>
> diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
> index 7c45ba4db61b..2e47a07029f1 100644
> --- a/fs/nfsd/nfs3xdr.c
> +++ b/fs/nfsd/nfs3xdr.c
> @@ -254,7 +254,7 @@ svcxdr_decode_sattr3(struct svc_rqst *rqstp,
> struct xdr_stream *xdr,
>                 if (xdr_stream_decode_u64(xdr, &newsize) < 0)
>                         return false;
>                 iap->ia_valid |= ATTR_SIZE;
> -               iap->ia_size = min_t(u64, newsize, NFS_OFFSET_MAX);
> +               iap->ia_size = newsize;
>         }
>         if (xdr_stream_decode_u32(xdr, &set_it) < 0)0
>                 return false;
>
>

NACK.

Unlike NFSV4, NFSv3 has reference implementations, not a reference
specification document. There is no need to change those
implementations to deal with the fact that RFC1813 is underspecified.

This change would just serve to break client behaviour, for no good
reason.

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]

2022-02-01 20:47:40

by Chuck Lever III

[permalink] [raw]

Subject: Re: [PATCH v2 2/5] NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes

> On Jan 31, 2022, at 1:37 PM, Trond Myklebust <[email protected]> wrote:
>
> On Mon, 2022-01-31 at 13:24 -0500, Chuck Lever wrote:
>> iattr::ia_size is a loff_t, so these NFSv3 procedures must be
>> careful to deal with incoming client size values that are larger
>> than s64_max without corrupting the value.
>>
>> Silently capping the value results in storing a different value
>> than the client passed in which is unexpected behavior, so remove
>> the min_t() check in decode_sattr3().
>>
>> Moreover, a large file size is not an XDR error, since anything up
>> to U64_MAX is permitted for NFSv3 file size values. So it has to be
>> dealt with in nfs3proc.c, not in the XDR decoder.
>>
>> Size comparisons like in inode_newsize_ok should now work as
>> expected -- the VFS returns -EFBIG if the new size is larger than
>> the underlying filesystem's s_maxbytes.
>>
>> However, RFC 1813 permits only the WRITE procedure to return
>> NFS3ERR_FBIG. Extra checks are needed to prevent NFSv3 SETATTR and
>> CREATE from returning FBIG. Unfortunately RFC 1813 does not provide
>> a specific status code for either procedure to indicate this
>> specific failure, so I've chosen NFS3ERR_INVAL for SETATTR and
>> NFS3ERR_IO for CREATE.
>>
>> Applications and NFS clients might be better served if the server
>> stuck with NFS3ERR_FBIG despite what RFC 1813 says.
>>
>> Signed-off-by: Chuck Lever <[email protected]>
>> ---
>> fs/nfsd/nfs3proc.c | 9 +++++++++
>> fs/nfsd/nfs3xdr.c | 2 +-
>> 2 files changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
>> index 8ef53f6726ec..02edc7074d06 100644
>> --- a/fs/nfsd/nfs3proc.c
>> +++ b/fs/nfsd/nfs3proc.c
>> @@ -73,6 +73,10 @@ nfsd3_proc_setattr(struct svc_rqst *rqstp)
>> fh_copy(&resp->fh, &argp->fh);
>> resp->status = nfsd_setattr(rqstp, &resp->fh, &argp->attrs,
>> argp->check_guard, argp-
>>> guardtime);
>> +
>> + if (resp->status == nfserr_fbig)
>> + resp->status = nfserr_inval;
>> +
>> return rpc_success;
>> }
>>
>> @@ -245,6 +249,11 @@ nfsd3_proc_create(struct svc_rqst *rqstp)
>> resp->status = do_nfsd_create(rqstp, dirfhp, argp->name,
>> argp->len,
>> attr, newfhp, argp->createmode,
>> (u32 *)argp->verf, NULL, NULL);
>> +
>> + /* CREATE must not return NFS3ERR_FBIG */
>> + if (resp->status == nfserr_fbig)
>> + resp->status = nfserr_io;
>> +
>> return rpc_success;
>> }
>>
>> diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
>> index 7c45ba4db61b..2e47a07029f1 100644
>> --- a/fs/nfsd/nfs3xdr.c
>> +++ b/fs/nfsd/nfs3xdr.c
>> @@ -254,7 +254,7 @@ svcxdr_decode_sattr3(struct svc_rqst *rqstp,
>> struct xdr_stream *xdr,
>> if (xdr_stream_decode_u64(xdr, &newsize) < 0)
>> return false;
>> iap->ia_valid |= ATTR_SIZE;
>> - iap->ia_size = min_t(u64, newsize, NFS_OFFSET_MAX);
>> + iap->ia_size = newsize;
>> }
>> if (xdr_stream_decode_u32(xdr, &set_it) < 0)0
>> return false;
>>
>>
>
> NACK.
>
> Unlike NFSV4, NFSv3 has reference implementations, not a reference
> specification document. There is no need to change those
> implementations to deal with the fact that RFC1813 is underspecified.
>
> This change would just serve to break client behaviour, for no good
> reason.

So, I _have_ been asking around. This is not a change that
I'm proposing blithely.

Which part of the change is wrong, and which clients would
break? Solaris NFSv3 server is supposed to return NFS3ERR_FBIG
in this case, I believe. NFSD could return NFS3ERR_FBIG in
these cases instead.

Is there somewhere that the behavior of the reference
implementation is documented? If the current XDR decoder
behavior is a de facto standard, that should be noted in a
comment here.

--
Chuck Lever

2022-02-01 20:48:09

by Trond Myklebust

[permalink] [raw]

Subject: Re: [PATCH v2 2/5] NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes

On Mon, 2022-01-31 at 18:49 +0000, Chuck Lever III wrote:
>
>
> > On Jan 31, 2022, at 1:37 PM, Trond Myklebust
> > <[email protected]> wrote:
> >
> > On Mon, 2022-01-31 at 13:24 -0500, Chuck Lever wrote:
> > > iattr::ia_size is a loff_t, so these NFSv3 procedures must be
> > > careful to deal with incoming client size values that are larger
> > > than s64_max without corrupting the value.
> > >
> > > Silently capping the value results in storing a different value
> > > than the client passed in which is unexpected behavior, so remove
> > > the min_t() check in decode_sattr3().
> > >
> > > Moreover, a large file size is not an XDR error, since anything
> > > up
> > > to U64_MAX is permitted for NFSv3 file size values. So it has to
> > > be
> > > dealt with in nfs3proc.c, not in the XDR decoder.
> > >
> > > Size comparisons like in inode_newsize_ok should now work as
> > > expected -- the VFS returns -EFBIG if the new size is larger than
> > > the underlying filesystem's s_maxbytes.
> > >
> > > However, RFC 1813 permits only the WRITE procedure to return
> > > NFS3ERR_FBIG. Extra checks are needed to prevent NFSv3 SETATTR
> > > and
> > > CREATE from returning FBIG. Unfortunately RFC 1813 does not
> > > provide
> > > a specific status code for either procedure to indicate this
> > > specific failure, so I've chosen NFS3ERR_INVAL for SETATTR and
> > > NFS3ERR_IO for CREATE.
> > >
> > > Applications and NFS clients might be better served if the server
> > > stuck with NFS3ERR_FBIG despite what RFC 1813 says.
> > >
> > > Signed-off-by: Chuck Lever <[email protected]>
> > > ---
> > > fs/nfsd/nfs3proc.c |    9 +++++++++
> > > fs/nfsd/nfs3xdr.c |    2 +-
> > > 2 files changed, 10 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
> > > index 8ef53f6726ec..02edc7074d06 100644
> > > --- a/fs/nfsd/nfs3proc.c
> > > +++ b/fs/nfsd/nfs3proc.c
> > > @@ -73,6 +73,10 @@ nfsd3_proc_setattr(struct svc_rqst *rqstp)
> > >         fh_copy(&resp->fh, &argp->fh);
> > >         resp->status = nfsd_setattr(rqstp, &resp->fh, &argp-
> > > >attrs,
> > >                                     argp->check_guard, argp-
> > > > guardtime);
> > > +
> > > +       if (resp->status == nfserr_fbig)
> > > +               resp->status = nfserr_inval;
> > > +
> > >         return rpc_success;
> > > }
> > >
> > > @@ -245,6 +249,11 @@ nfsd3_proc_create(struct svc_rqst *rqstp)
> > >         resp->status = do_nfsd_create(rqstp, dirfhp, argp->name,
> > > argp->len,
> > >                                       attr, newfhp, argp-
> > > >createmode,
> > >                                       (u32 *)argp->verf, NULL,
> > > NULL);
> > > +
> > > +       /* CREATE must not return NFS3ERR_FBIG */
> > > +       if (resp->status == nfserr_fbig)
> > > +               resp->status = nfserr_io;
> > > +
> > >         return rpc_success;
> > > }
> > >
> > > diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
> > > index 7c45ba4db61b..2e47a07029f1 100644
> > > --- a/fs/nfsd/nfs3xdr.c
> > > +++ b/fs/nfsd/nfs3xdr.c
> > > @@ -254,7 +254,7 @@ svcxdr_decode_sattr3(struct svc_rqst *rqstp,
> > > struct xdr_stream *xdr,
> > >                 if (xdr_stream_decode_u64(xdr, &newsize) < 0)
> > >                         return false;
> > >                 iap->ia_valid |= ATTR_SIZE;
> > > -               iap->ia_size = min_t(u64, newsize,
> > > NFS_OFFSET_MAX);
> > > +               iap->ia_size = newsize;
> > >         }
> > >         if (xdr_stream_decode_u32(xdr, &set_it) < 0)0
> > >                 return false;
> > >
> > >
> >
> > NACK.
> >
> > Unlike NFSV4, NFSv3 has reference implementations, not a reference
> > specification document. There is no need to change those
> > implementations to deal with the fact that RFC1813 is
> > underspecified.
> >
> > This change would just serve to break client behaviour, for no good
> > reason.
>
> So, I _have_ been asking around. This is not a change that
> I'm proposing blithely.
>
> Which part of the change is wrong, and which clients would
> break? Solaris NFSv3 server is supposed to return NFS3ERR_FBIG
> in this case, I believe. NFSD could return NFS3ERR_FBIG in
> these cases instead.
>
> Is there somewhere that the behavior of the reference
> implementation is documented? If the current XDR decoder
> behavior is a de facto standard, that should be noted in a
> comment here.
>
>

Please return NFS3ERR_FBIG in the setattr case, and just drop the
create change (do_nfsd_create() can never return EFBIG given that nfsd
always opens the file with O_LARGEFILE).

There is no document other than the Solaris and Linux NFS code. RFC1813
was never intended as an IETF standard, and never saw any follow up.
Nothing else was published following the Connectathon testing events
which determined the wire protocol.

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]

2022-02-01 20:48:25

by Chuck Lever III

[permalink] [raw]

Subject: Re: [PATCH v2 2/5] NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes

> On Jan 31, 2022, at 1:47 PM, Trond Myklebust <[email protected]> wrote:
>
> On Mon, 2022-01-31 at 13:37 -0500, Trond Myklebust wrote:
>> On Mon, 2022-01-31 at 13:24 -0500, Chuck Lever wrote:
>>> iattr::ia_size is a loff_t, so these NFSv3 procedures must be
>>> careful to deal with incoming client size values that are larger
>>> than s64_max without corrupting the value.
>>>
>>> Silently capping the value results in storing a different value
>>> than the client passed in which is unexpected behavior, so remove
>>> the min_t() check in decode_sattr3().
>>>
>>> Moreover, a large file size is not an XDR error, since anything up
>>> to U64_MAX is permitted for NFSv3 file size values. So it has to be
>>> dealt with in nfs3proc.c, not in the XDR decoder.
>>>
>>> Size comparisons like in inode_newsize_ok should now work as
>>> expected -- the VFS returns -EFBIG if the new size is larger than
>>> the underlying filesystem's s_maxbytes.
>>>
>>> However, RFC 1813 permits only the WRITE procedure to return
>>> NFS3ERR_FBIG. Extra checks are needed to prevent NFSv3 SETATTR and
>>> CREATE from returning FBIG. Unfortunately RFC 1813 does not provide
>>> a specific status code for either procedure to indicate this
>>> specific failure, so I've chosen NFS3ERR_INVAL for SETATTR and
>>> NFS3ERR_IO for CREATE.
>>>
>>> Applications and NFS clients might be better served if the server
>>> stuck with NFS3ERR_FBIG despite what RFC 1813 says.
>>>
>>> Signed-off-by: Chuck Lever <[email protected]>
>>> ---
>>> fs/nfsd/nfs3proc.c | 9 +++++++++
>>> fs/nfsd/nfs3xdr.c | 2 +-
>>> 2 files changed, 10 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
>>> index 8ef53f6726ec..02edc7074d06 100644
>>> --- a/fs/nfsd/nfs3proc.c
>>> +++ b/fs/nfsd/nfs3proc.c
>>> @@ -73,6 +73,10 @@ nfsd3_proc_setattr(struct svc_rqst *rqstp)
>>> fh_copy(&resp->fh, &argp->fh);
>>> resp->status = nfsd_setattr(rqstp, &resp->fh, &argp->attrs,
>>> argp->check_guard, argp-
>>>> guardtime);
>>> +
>>> + if (resp->status == nfserr_fbig)
>>> + resp->status = nfserr_inval;
>>> +
>>> return rpc_success;
>>> }
>>>
>>> @@ -245,6 +249,11 @@ nfsd3_proc_create(struct svc_rqst *rqstp)
>>> resp->status = do_nfsd_create(rqstp, dirfhp, argp->name,
>>> argp->len,
>>> attr, newfhp, argp-
>>>> createmode,
>>> (u32 *)argp->verf, NULL,
>>> NULL);
>>> +
>>> + /* CREATE must not return NFS3ERR_FBIG */
>>> + if (resp->status == nfserr_fbig)
>>> + resp->status = nfserr_io;
>
> BTW: This EFBIG / EOVERFLOW case could only possibly happen due to an
> internal server error.
>
> EFBIG See EOVERFLOW.
>
> EOVERFLOW
> pathname refers to a regular file that is too large to be
> opened. The usual scenario here is that an application compiled
> on a 32-bit platform without -D_FILE_OFFSET_BITS=64 tried to
> open a file whose size exceeds (1<<31)-1 bytes; see also
> O_LARGEFILE above. This is the error specified by POSIX.1; in
> kernels before 2.6.24, Linux gave the error EFBIG for this case.

What if the client has sent a CREATE with attributes that
has a filesize that is smaller than OFFSET_MAX but larger
than the filesystem's s_maxbytes? I believe notify_change()
will return -EFBIG in this case, and correctly so.

NFSD's NFSv3 SETATTR implementation will leak FBIG in
some cases. If that's going to be a problem for certain
important clients, then I'd like it not to do that.

--
Chuck Lever

2022-02-01 20:48:25

by Chuck Lever III

[permalink] [raw]

Subject: Re: [PATCH v2 2/5] NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes

> On Jan 31, 2022, at 1:58 PM, Trond Myklebust <[email protected]> wrote:
>
> On Mon, 2022-01-31 at 18:49 +0000, Chuck Lever III wrote:
>>
>>
>>> On Jan 31, 2022, at 1:37 PM, Trond Myklebust
>>> <[email protected]> wrote:
>>>
>>> On Mon, 2022-01-31 at 13:24 -0500, Chuck Lever wrote:
>>>> iattr::ia_size is a loff_t, so these NFSv3 procedures must be
>>>> careful to deal with incoming client size values that are larger
>>>> than s64_max without corrupting the value.
>>>>
>>>> Silently capping the value results in storing a different value
>>>> than the client passed in which is unexpected behavior, so remove
>>>> the min_t() check in decode_sattr3().
>>>>
>>>> Moreover, a large file size is not an XDR error, since anything
>>>> up
>>>> to U64_MAX is permitted for NFSv3 file size values. So it has to
>>>> be
>>>> dealt with in nfs3proc.c, not in the XDR decoder.
>>>>
>>>> Size comparisons like in inode_newsize_ok should now work as
>>>> expected -- the VFS returns -EFBIG if the new size is larger than
>>>> the underlying filesystem's s_maxbytes.
>>>>
>>>> However, RFC 1813 permits only the WRITE procedure to return
>>>> NFS3ERR_FBIG. Extra checks are needed to prevent NFSv3 SETATTR
>>>> and
>>>> CREATE from returning FBIG. Unfortunately RFC 1813 does not
>>>> provide
>>>> a specific status code for either procedure to indicate this
>>>> specific failure, so I've chosen NFS3ERR_INVAL for SETATTR and
>>>> NFS3ERR_IO for CREATE.
>>>>
>>>> Applications and NFS clients might be better served if the server
>>>> stuck with NFS3ERR_FBIG despite what RFC 1813 says.
>>>>
>>>> Signed-off-by: Chuck Lever <[email protected]>
>>>> ---
>>>> fs/nfsd/nfs3proc.c | 9 +++++++++
>>>> fs/nfsd/nfs3xdr.c | 2 +-
>>>> 2 files changed, 10 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
>>>> index 8ef53f6726ec..02edc7074d06 100644
>>>> --- a/fs/nfsd/nfs3proc.c
>>>> +++ b/fs/nfsd/nfs3proc.c
>>>> @@ -73,6 +73,10 @@ nfsd3_proc_setattr(struct svc_rqst *rqstp)
>>>> fh_copy(&resp->fh, &argp->fh);
>>>> resp->status = nfsd_setattr(rqstp, &resp->fh, &argp-
>>>>> attrs,
>>>> argp->check_guard, argp-
>>>>> guardtime);
>>>> +
>>>> + if (resp->status == nfserr_fbig)
>>>> + resp->status = nfserr_inval;
>>>> +
>>>> return rpc_success;
>>>> }
>>>>
>>>> @@ -245,6 +249,11 @@ nfsd3_proc_create(struct svc_rqst *rqstp)
>>>> resp->status = do_nfsd_create(rqstp, dirfhp, argp->name,
>>>> argp->len,
>>>> attr, newfhp, argp-
>>>>> createmode,
>>>> (u32 *)argp->verf, NULL,
>>>> NULL);
>>>> +
>>>> + /* CREATE must not return NFS3ERR_FBIG */
>>>> + if (resp->status == nfserr_fbig)
>>>> + resp->status = nfserr_io;
>>>> +
>>>> return rpc_success;
>>>> }
>>>>
>>>> diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
>>>> index 7c45ba4db61b..2e47a07029f1 100644
>>>> --- a/fs/nfsd/nfs3xdr.c
>>>> +++ b/fs/nfsd/nfs3xdr.c
>>>> @@ -254,7 +254,7 @@ svcxdr_decode_sattr3(struct svc_rqst *rqstp,
>>>> struct xdr_stream *xdr,
>>>> if (xdr_stream_decode_u64(xdr, &newsize) < 0)
>>>> return false;
>>>> iap->ia_valid |= ATTR_SIZE;
>>>> - iap->ia_size = min_t(u64, newsize,
>>>> NFS_OFFSET_MAX);
>>>> + iap->ia_size = newsize;
>>>> }
>>>> if (xdr_stream_decode_u32(xdr, &set_it) < 0)0
>>>> return false;
>>>>
>>>>
>>>
>>> NACK.
>>>
>>> Unlike NFSV4, NFSv3 has reference implementations, not a reference
>>> specification document. There is no need to change those
>>> implementations to deal with the fact that RFC1813 is
>>> underspecified.
>>>
>>> This change would just serve to break client behaviour, for no good
>>> reason.
>>
>> So, I _have_ been asking around. This is not a change that
>> I'm proposing blithely.
>>
>> Which part of the change is wrong, and which clients would
>> break? Solaris NFSv3 server is supposed to return NFS3ERR_FBIG
>> in this case, I believe. NFSD could return NFS3ERR_FBIG in
>> these cases instead.
>>
>> Is there somewhere that the behavior of the reference
>> implementation is documented? If the current XDR decoder
>> behavior is a de facto standard, that should be noted in a
>> comment here.
>>
>>
>
> Please return NFS3ERR_FBIG in the setattr case, and just drop the
> create change (do_nfsd_create() can never return EFBIG given that nfsd
> always opens the file with O_LARGEFILE).
>
> There is no document other than the Solaris and Linux NFS code. RFC1813
> was never intended as an IETF standard, and never saw any follow up.
> Nothing else was published following the Connectathon testing events
> which determined the wire protocol.

So to make sure I understand you: Drop the hunks that
modify nfsd3_proc_setattr() and nfsd3_proc_create().

I'm fine with that.

--
Chuck Lever

2022-02-01 20:48:52

by Trond Myklebust

[permalink] [raw]

Subject: Re: [PATCH v2 2/5] NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes

On Mon, 2022-01-31 at 13:37 -0500, Trond Myklebust wrote:
> On Mon, 2022-01-31 at 13:24 -0500, Chuck Lever wrote:
> > iattr::ia_size is a loff_t, so these NFSv3 procedures must be
> > careful to deal with incoming client size values that are larger
> > than s64_max without corrupting the value.
> >
> > Silently capping the value results in storing a different value
> > than the client passed in which is unexpected behavior, so remove
> > the min_t() check in decode_sattr3().
> >
> > Moreover, a large file size is not an XDR error, since anything up
> > to U64_MAX is permitted for NFSv3 file size values. So it has to be
> > dealt with in nfs3proc.c, not in the XDR decoder.
> >
> > Size comparisons like in inode_newsize_ok should now work as
> > expected -- the VFS returns -EFBIG if the new size is larger than
> > the underlying filesystem's s_maxbytes.
> >
> > However, RFC 1813 permits only the WRITE procedure to return
> > NFS3ERR_FBIG. Extra checks are needed to prevent NFSv3 SETATTR and
> > CREATE from returning FBIG. Unfortunately RFC 1813 does not provide
> > a specific status code for either procedure to indicate this
> > specific failure, so I've chosen NFS3ERR_INVAL for SETATTR and
> > NFS3ERR_IO for CREATE.
> >
> > Applications and NFS clients might be better served if the server
> > stuck with NFS3ERR_FBIG despite what RFC 1813 says.
> >
> > Signed-off-by: Chuck Lever <[email protected]>
> > ---
> > fs/nfsd/nfs3proc.c |    9 +++++++++
> > fs/nfsd/nfs3xdr.c |    2 +-
> > 2 files changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
> > index 8ef53f6726ec..02edc7074d06 100644
> > --- a/fs/nfsd/nfs3proc.c
> > +++ b/fs/nfsd/nfs3proc.c
> > @@ -73,6 +73,10 @@ nfsd3_proc_setattr(struct svc_rqst *rqstp)
> >         fh_copy(&resp->fh, &argp->fh);
> >         resp->status = nfsd_setattr(rqstp, &resp->fh, &argp->attrs,
> >                                     argp->check_guard, argp-
> > > guardtime);
> > +
> > +       if (resp->status == nfserr_fbig)
> > +               resp->status = nfserr_inval;
> > +
> >         return rpc_success;
> > }
> >
> > @@ -245,6 +249,11 @@ nfsd3_proc_create(struct svc_rqst *rqstp)
> >         resp->status = do_nfsd_create(rqstp, dirfhp, argp->name,
> > argp->len,
> >                                       attr, newfhp, argp-
> > >createmode,
> >                                       (u32 *)argp->verf, NULL,
> > NULL);
> > +
> > +       /* CREATE must not return NFS3ERR_FBIG */
> > +       if (resp->status == nfserr_fbig)
> > +               resp->status = nfserr_io;

BTW: This EFBIG / EOVERFLOW case could only possibly happen due to an
internal server error.

EFBIG See EOVERFLOW.

EOVERFLOW
pathname refers to a regular file that is too large to be
opened. The usual scenario here is that an application compiled
on a 32-bit platform without -D_FILE_OFFSET_BITS=64 tried to
open a file whose size exceeds (1<<31)-1 bytes; see also
O_LARGEFILE above. This is the error specified by POSIX.1; in
kernels before 2.6.24, Linux gave the error EFBIG for this case.

> > +
> >         return rpc_success;
> > }
> >
> > diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
> > index 7c45ba4db61b..2e47a07029f1 100644
> > --- a/fs/nfsd/nfs3xdr.c
> > +++ b/fs/nfsd/nfs3xdr.c
> > @@ -254,7 +254,7 @@ svcxdr_decode_sattr3(struct svc_rqst *rqstp,
> > struct xdr_stream *xdr,
> >                 if (xdr_stream_decode_u64(xdr, &newsize) < 0)
> >                         return false;
> >                 iap->ia_valid |= ATTR_SIZE;
> > -               iap->ia_size = min_t(u64, newsize, NFS_OFFSET_MAX);
> > +               iap->ia_size = newsize;
> >         }
> >         if (xdr_stream_decode_u32(xdr, &set_it) < 0)0
> >                 return false;
> >
> >
>
> NACK.
>
> Unlike NFSV4, NFSv3 has reference implementations, not a reference
> specification document. There is no need to change those
> implementations to deal with the fact that RFC1813 is underspecified.
>
> This change would just serve to break client behaviour, for no good
> reason.
>

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]

2022-02-01 20:50:46

by Trond Myklebust

[permalink] [raw]

Subject: Re: [PATCH v2 2/5] NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes

On Mon, 2022-01-31 at 19:04 +0000, Chuck Lever III wrote:
>
>
> > On Jan 31, 2022, at 1:47 PM, Trond Myklebust
> > <[email protected]> wrote:
> >
> > On Mon, 2022-01-31 at 13:37 -0500, Trond Myklebust wrote:
> > > On Mon, 2022-01-31 at 13:24 -0500, Chuck Lever wrote:
> > > > iattr::ia_size is a loff_t, so these NFSv3 procedures must be
> > > > careful to deal with incoming client size values that are
> > > > larger
> > > > than s64_max without corrupting the value.
> > > >
> > > > Silently capping the value results in storing a different value
> > > > than the client passed in which is unexpected behavior, so
> > > > remove
> > > > the min_t() check in decode_sattr3().
> > > >
> > > > Moreover, a large file size is not an XDR error, since anything
> > > > up
> > > > to U64_MAX is permitted for NFSv3 file size values. So it has
> > > > to be
> > > > dealt with in nfs3proc.c, not in the XDR decoder.
> > > >
> > > > Size comparisons like in inode_newsize_ok should now work as
> > > > expected -- the VFS returns -EFBIG if the new size is larger
> > > > than
> > > > the underlying filesystem's s_maxbytes.
> > > >
> > > > However, RFC 1813 permits only the WRITE procedure to return
> > > > NFS3ERR_FBIG. Extra checks are needed to prevent NFSv3 SETATTR
> > > > and
> > > > CREATE from returning FBIG. Unfortunately RFC 1813 does not
> > > > provide
> > > > a specific status code for either procedure to indicate this
> > > > specific failure, so I've chosen NFS3ERR_INVAL for SETATTR and
> > > > NFS3ERR_IO for CREATE.
> > > >
> > > > Applications and NFS clients might be better served if the
> > > > server
> > > > stuck with NFS3ERR_FBIG despite what RFC 1813 says.
> > > >
> > > > Signed-off-by: Chuck Lever <[email protected]>
> > > > ---
> > > > fs/nfsd/nfs3proc.c |    9 +++++++++
> > > > fs/nfsd/nfs3xdr.c |    2 +-
> > > > 2 files changed, 10 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
> > > > index 8ef53f6726ec..02edc7074d06 100644
> > > > --- a/fs/nfsd/nfs3proc.c
> > > > +++ b/fs/nfsd/nfs3proc.c
> > > > @@ -73,6 +73,10 @@ nfsd3_proc_setattr(struct svc_rqst *rqstp)
> > > >         fh_copy(&resp->fh, &argp->fh);
> > > >         resp->status = nfsd_setattr(rqstp, &resp->fh, &argp-
> > > > >attrs,
> > > >                                     argp->check_guard, argp-
> > > > > guardtime);
> > > > +
> > > > +       if (resp->status == nfserr_fbig)
> > > > +               resp->status = nfserr_inval;
> > > > +
> > > >         return rpc_success;
> > > > }
> > > >
> > > > @@ -245,6 +249,11 @@ nfsd3_proc_create(struct svc_rqst *rqstp)
> > > >         resp->status = do_nfsd_create(rqstp, dirfhp, argp-
> > > > >name,
> > > > argp->len,
> > > >                                       attr, newfhp, argp-
> > > > > createmode,
> > > >                                       (u32 *)argp->verf, NULL,
> > > > NULL);
> > > > +
> > > > +       /* CREATE must not return NFS3ERR_FBIG */
> > > > +       if (resp->status == nfserr_fbig)
> > > > +               resp->status = nfserr_io;
> >
> > BTW: This EFBIG / EOVERFLOW case could only possibly happen due to
> > an
> > internal server error.
> >
> >       EFBIG See EOVERFLOW.
> >
> >       EOVERFLOW
> >              pathname refers to a regular file that is too
> > large to be
> >              opened. The usual scenario here is that an
> > application compiled
> >              on a 32-bit platform without -
> > D_FILE_OFFSET_BITS=64 tried to
> >              open a file whose size exceeds (1<<31)-1 bytes;
> > see also
> >              O_LARGEFILE above.   This is the error specified by
> > POSIX.1; in
> >              kernels before 2.6.24, Linux gave the error EFBIG for
> > this case.
>
> What if the client has sent a CREATE with attributes that
> has a filesize that is smaller than OFFSET_MAX but larger
> than the filesystem's s_maxbytes? I believe notify_change()
> will return -EFBIG in this case, and correctly so.
>

There is no POSIX or BSD function that allows you to do this, so that
would be a very unusual client. Pretty sure that Windows won't allow it
either.

> NFSD's NFSv3 SETATTR implementation will leak FBIG in
> some cases. If that's going to be a problem for certain
> important clients, then I'd like it not to do that.
>

As I said, changing the behaviour at this point, it far more likely to
cause breakage than keeping it would. So I strongly disagree with this
argument.

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]