2014-04-03 16:33:55

by Mark Lord

[permalink] [raw]
Subject: linux-3.14 nfsd regression

This commit from linux-3.14 breaks our NFS-root clients here:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6e14b46b91fee8a049b0940333ce13a820beaaa5


- *p++ = htonl((u32) stat->mode);
+ *p++ = htonl((u32) (stat->mode & S_IALLUGO));


Reverting the one-liner above (on the server) fixes it for us,
as does reverting back to linux-3.13.8 on the server.

The NFS-root clients are on PowerPC (big-endian) architecture,
running linux-3.12.16. The NFS server is on an Intel PC running linux-3.14.

ACL is completely disabled on server and client,
and we're using NFSv2/v3. No support for v4.

I instrumented the function to see what other bits were being cleared
by the (stat->mode & S_IALLUGO) masking. The results are attached.

Cheers
Mark


Attachments:
11_revert_nfsxdr_to_fix_nfsroot.trace (190.04 kB)

2014-04-03 16:44:37

by Mark Lord

[permalink] [raw]
Subject: Re: linux-3.14 nfsd regression

On 14-04-03 12:33 PM, Mark Lord wrote:
> This commit from linux-3.14 breaks our NFS-root clients here:
>
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6e14b46b91fee8a049b0940333ce13a820beaaa5
>
>
> - *p++ = htonl((u32) stat->mode);
> + *p++ = htonl((u32) (stat->mode & S_IALLUGO));
>
>
> Reverting the one-liner above (on the server) fixes it for us,
> as does reverting back to linux-3.13.8 on the server.
>
> The NFS-root clients are on PowerPC (big-endian) architecture,
> running linux-3.12.16. The NFS server is on an Intel PC running linux-3.14.
>
> ACL is completely disabled on server and client,
> and we're using NFSv2/v3. No support for v4.
>
> I instrumented the function to see what other bits were being cleared
> by the (stat->mode & S_IALLUGO) masking. The results are attached.
..


Looking into the (previously attached) trace, the bits that seem to be
getting chopped most often are S_IFREG (0x8000) and S_IFDIR (0x4000).
It appears that one/both of those are needed for mounting nfsroot.


[ 2733.823753] encode_fattr: mode=0x000041ed mask=0x00000fff
[ 2733.824217] encode_fattr: mode=0x000041ed mask=0x00000fff
[ 2733.838769] encode_fattr: mode=0x000041ed mask=0x00000fff
[ 2733.839175] encode_fattr: mode=0x0000a1ff mask=0x00000fff
[ 2733.839895] encode_fattr: mode=0x000081ed mask=0x00000fff
[ 2733.840388] encode_fattr: mode=0x000081ed mask=0x00000fff
[ 2733.840431] encode_fattr: mode=0x000081ed mask=0x00000fff
[ 2733.841256] encode_fattr: mode=0x000041ed mask=0x00000fff
[ 2733.841659] encode_fattr: mode=0x0000a1ff mask=0x00000fff
[ 2733.842379] encode_fattr: mode=0x000081ed mask=0x00000fff
[ 2733.842825] encode_fattr: mode=0x000081ed mask=0x00000fff
[ 2733.842879] encode_fattr: mode=0x000081ed mask=0x00000fff
[ 2733.843876] encode_fattr: mode=0x000081ed mask=0x00000fff
[ 2733.843924] encode_fattr: mode=0x000081ed mask=0x00000fff
...

2014-04-04 13:58:47

by J. Bruce Fields

[permalink] [raw]
Subject: Re: linux-3.14 nfsd regression

On Thu, Apr 03, 2014 at 07:21:46PM -0400, Jeff Layton wrote:
> So according to the RFC you have to encode both the mode bits and the
> ftype for v2. The type bits seem to be removed from the mode in NFSv3
> though, so perhaps we should only be doing that masking in versions
> above v2?

Right, the problematic patch applied the same mask in both v2 and v3
cases, so I'm reverting just the v2 part (see below).

> With a quick check, it looks like the v3 code doesn't rely on those bits
> and I imagine v4 doesn't either.
>
> It might also be nice to have the client v2 decode_fattr function to
> throw a warning if the server sends us mismatched type bits and ftype
> values. That would have helped us catch this sooner...

Yes, that might be a reasonable thing to do, though I don't know if it's
worth it.

--b.

commit 35a8dff14e76c00e5b52140290cfb498dc2454a0
Author: J. Bruce Fields <[email protected]>
Date: Thu Apr 3 15:10:35 2014 -0400

nfsd: revert v2 half of "nfsd: don't return high mode bits"

This reverts the part of commit 6e14b46b91fee8a049b0940333ce13a820beaaa5
that changes NFSv2 behavior.

Mark Lord found that it broke nfs-root for Linux clients, because it
broke NFSv2.

In fact, from RFC 1094:

"Notice that the file type is specified both in the mode bits
and in the file type. This is really a bug in the protocol and
will be fixed in future versions."

So NFSv2 clients really are expected to depend on the high bits of the
mode.

Cc: [email protected]
Reported-by: Mark Lord <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>

diff --git a/fs/nfsd/nfsxdr.c b/fs/nfsd/nfsxdr.c
index b17d932..9c769a4 100644
--- a/fs/nfsd/nfsxdr.c
+++ b/fs/nfsd/nfsxdr.c
@@ -152,7 +152,7 @@ encode_fattr(struct svc_rqst *rqstp, __be32 *p, struct svc_fh *fhp,
type = (stat->mode & S_IFMT);

*p++ = htonl(nfs_ftypes[type >> 12]);
- *p++ = htonl((u32) (stat->mode & S_IALLUGO));
+ *p++ = htonl((u32) stat->mode);
*p++ = htonl((u32) stat->nlink);
*p++ = htonl((u32) from_kuid(&init_user_ns, stat->uid));
*p++ = htonl((u32) from_kgid(&init_user_ns, stat->gid));

2014-04-04 14:08:04

by Jeff Layton

[permalink] [raw]
Subject: Re: linux-3.14 nfsd regression

On Fri, 4 Apr 2014 09:58:43 -0400
"J. Bruce Fields" <[email protected]> wrote:

> On Thu, Apr 03, 2014 at 07:21:46PM -0400, Jeff Layton wrote:
> > So according to the RFC you have to encode both the mode bits and the
> > ftype for v2. The type bits seem to be removed from the mode in NFSv3
> > though, so perhaps we should only be doing that masking in versions
> > above v2?
>
> Right, the problematic patch applied the same mask in both v2 and v3
> cases, so I'm reverting just the v2 part (see below).
>
> > With a quick check, it looks like the v3 code doesn't rely on those bits
> > and I imagine v4 doesn't either.
> >
> > It might also be nice to have the client v2 decode_fattr function to
> > throw a warning if the server sends us mismatched type bits and ftype
> > values. That would have helped us catch this sooner...
>
> Yes, that might be a reasonable thing to do, though I don't know if it's
> worth it.
>
> --b.
>
> commit 35a8dff14e76c00e5b52140290cfb498dc2454a0
> Author: J. Bruce Fields <[email protected]>
> Date: Thu Apr 3 15:10:35 2014 -0400
>
> nfsd: revert v2 half of "nfsd: don't return high mode bits"
>
> This reverts the part of commit 6e14b46b91fee8a049b0940333ce13a820beaaa5
> that changes NFSv2 behavior.
>
> Mark Lord found that it broke nfs-root for Linux clients, because it
> broke NFSv2.
>
> In fact, from RFC 1094:
>
> "Notice that the file type is specified both in the mode bits
> and in the file type. This is really a bug in the protocol and
> will be fixed in future versions."
>
> So NFSv2 clients really are expected to depend on the high bits of the
> mode.
>
> Cc: [email protected]
> Reported-by: Mark Lord <[email protected]>
> Signed-off-by: J. Bruce Fields <[email protected]>
>
> diff --git a/fs/nfsd/nfsxdr.c b/fs/nfsd/nfsxdr.c
> index b17d932..9c769a4 100644
> --- a/fs/nfsd/nfsxdr.c
> +++ b/fs/nfsd/nfsxdr.c
> @@ -152,7 +152,7 @@ encode_fattr(struct svc_rqst *rqstp, __be32 *p, struct svc_fh *fhp,
> type = (stat->mode & S_IFMT);
>
> *p++ = htonl(nfs_ftypes[type >> 12]);
> - *p++ = htonl((u32) (stat->mode & S_IALLUGO));
> + *p++ = htonl((u32) stat->mode);
> *p++ = htonl((u32) stat->nlink);
> *p++ = htonl((u32) from_kuid(&init_user_ns, stat->uid));
> *p++ = htonl((u32) from_kgid(&init_user_ns, stat->gid));

Looks right...

Reviewed-by: Jeff Layton <[email protected]>

2014-04-03 20:57:12

by Mark Lord

[permalink] [raw]
Subject: Re: linux-3.14 nfsd regression

On 14-04-03 04:15 PM, J. Bruce Fields wrote:
> On Thu, Apr 03, 2014 at 01:51:06PM -0400, Mark Lord wrote:
>> On 14-04-03 01:16 PM, J. Bruce Fields wrote:
>>> The original behavior was in practice harmless and changing it broke
>>> something, so I think we should definitely just revert this patch.
>>
>> Yup. Who?
>
> I'll submit this soon.
..

Thanks, Bruce!

--
Mark Lord
Real-Time Remedies Inc.
[email protected]

2014-04-03 20:16:29

by J. Bruce Fields

[permalink] [raw]
Subject: Re: linux-3.14 nfsd regression

On Thu, Apr 03, 2014 at 02:55:04PM -0400, Jeff Layton wrote:
> On Thu, 03 Apr 2014 13:51:06 -0400
> Mark Lord <[email protected]> wrote:
>
> > On 14-04-03 01:16 PM, J. Bruce Fields wrote:
> > > On Thu, Apr 03, 2014 at 12:33:55PM -0400, Mark Lord wrote:
> > >> This commit from linux-3.14 breaks our NFS-root clients here:
> > >>
> > >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6e14b46b91fee8a049b0940333ce13a820beaaa5
> > >>
> > >>
> > >> - *p++ = htonl((u32) stat->mode);
> > >> + *p++ = htonl((u32) (stat->mode & S_IALLUGO));
> > >>
> > >>
> > >> Reverting the one-liner above (on the server) fixes it for us,
> > >> as does reverting back to linux-3.13.8 on the server.
> > >>
> > >> The NFS-root clients are on PowerPC (big-endian) architecture,
> > >> running linux-3.12.16. The NFS server is on an Intel PC running linux-3.14.
> > >>
> > >> ACL is completely disabled on server and client,
> > >> and we're using NFSv2/v3. No support for v4.
> > >>
> > >> I instrumented the function to see what other bits were being cleared
> > >> by the (stat->mode & S_IALLUGO) masking. The results are attached.
> > >
> > > Hm, it sounds like a bug in the client if it's depending on those high
> > > bits.
> >
> > But only for mounting / starting up from the nfsroot, it seems.
> > I wonder if there's an unusual code path for that in there?
> > The regular stuff looks mostly fine:
> >
> > p = xdr_decode_ftype3(p, &fmode);
> > fattr->mode = (be32_to_cpup(p++) & ~S_IFMT) | fmode;
> >
> > Except perhaps that second line ought to use the same mask
> > as the server side is using, just in case there are some other
> > stray high (higher than S_IFMT) bits in there now/someday.
> >
> > > The original behavior was in practice harmless and changing it broke
> > > something, so I think we should definitely just revert this patch.
> >
> > Yup. Who?
> >
> > > But the client may need fixing too.
> >
> > Probably a good thing in the longer term, for better compatibility
> > with non-Linux servers. But we'll still have to keep the revert
> > on the server (nfsd) code for backward compatibility, I think.
> >
> > Cheers
> >
>
> It would be good to understand where this is broken in the client.
>
> It's incorrect for the client to interpret those bits, as I think that
> there's no guarantee that other OS's implement the type bits in the
> same way that Linux does. So if you end up mounting a different OS,
> it's possible that the client will get that wrong...

It turns out these bits actually are defined in rfc 1094, so this is
just an odd NFSv2-specific wart, and the nfsd change was just flat-out
wrong.

--b.

2014-04-03 20:15:36

by J. Bruce Fields

[permalink] [raw]
Subject: Re: linux-3.14 nfsd regression

On Thu, Apr 03, 2014 at 01:51:06PM -0400, Mark Lord wrote:
> On 14-04-03 01:16 PM, J. Bruce Fields wrote:
> > The original behavior was in practice harmless and changing it broke
> > something, so I think we should definitely just revert this patch.
>
> Yup. Who?

I'll submit this soon.

--b.

Author: J. Bruce Fields <[email protected]>
Date: Thu Apr 3 15:10:35 2014 -0400

nfsd: revert v2 half of "nfsd: don't return high mode bits"

This reverts the part of commit 6e14b46b91fee8a049b0940333ce13a820beaaa5
that changes NFSv2 behavior.

Mark Lord found that it broke nfs-root for Linux clients, because it
broke NFSv2.

In fact, from RFC 1094:

"Notice that the file type is specified both in the mode bits
and in the file type. This is really a bug in the protocol and
will be fixed in future versions."

So NFSv2 clients really are expected to depend on the high bits of the
mode.

Cc: [email protected]
Reported-by: Mark Lord <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>

diff --git a/fs/nfsd/nfsxdr.c b/fs/nfsd/nfsxdr.c
index b17d932..9c769a4 100644
--- a/fs/nfsd/nfsxdr.c
+++ b/fs/nfsd/nfsxdr.c
@@ -152,7 +152,7 @@ encode_fattr(struct svc_rqst *rqstp, __be32 *p, struct svc_fh *fhp,
type = (stat->mode & S_IFMT);

*p++ = htonl(nfs_ftypes[type >> 12]);
- *p++ = htonl((u32) (stat->mode & S_IALLUGO));
+ *p++ = htonl((u32) stat->mode);
*p++ = htonl((u32) stat->nlink);
*p++ = htonl((u32) from_kuid(&init_user_ns, stat->uid));
*p++ = htonl((u32) from_kgid(&init_user_ns, stat->gid));

2014-04-03 21:02:13

by Mark Lord

[permalink] [raw]
Subject: Re: linux-3.14 nfsd regression

On 14-04-03 03:30 PM, J. Bruce Fields wrote:
> On Thu, Apr 03, 2014 at 01:51:06PM -0400, Mark Lord wrote:
>> On 14-04-03 01:16 PM, J. Bruce Fields wrote:
>>> On Thu, Apr 03, 2014 at 12:33:55PM -0400, Mark Lord wrote:
>>>> This commit from linux-3.14 breaks our NFS-root clients here:
>>>>
>>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6e14b46b91fee8a049b0940333ce13a820beaaa5
>>>>
>>>>
>>>> - *p++ = htonl((u32) stat->mode);
>>>> + *p++ = htonl((u32) (stat->mode & S_IALLUGO));
>>>>
>>>>
>>>> Reverting the one-liner above (on the server) fixes it for us,
>>>> as does reverting back to linux-3.13.8 on the server.
>>>>
>>>> The NFS-root clients are on PowerPC (big-endian) architecture,
>>>> running linux-3.12.16. The NFS server is on an Intel PC running linux-3.14.
>>>>
>>>> ACL is completely disabled on server and client,
>>>> and we're using NFSv2/v3. No support for v4.
>>>>
>>>> I instrumented the function to see what other bits were being cleared
>>>> by the (stat->mode & S_IALLUGO) masking. The results are attached.
>>>
>>> Hm, it sounds like a bug in the client if it's depending on those high
>>> bits.
>>
>> But only for mounting / starting up from the nfsroot, it seems.
>> I wonder if there's an unusual code path for that in there?
>> The regular stuff looks mostly fine:
>>
>> p = xdr_decode_ftype3(p, &fmode);
>> fattr->mode = (be32_to_cpup(p++) & ~S_IFMT) | fmode;
>
> Hm, but that's in nfs3xdr.c; in nfs2xdr.c we have just
>
> fattr->mode = be32_to_cpup(p+);
>
> and NFSv2 is the default for nfsroot. Do you have some reason to
> believe you're not using NFSv2?

Oh, the client here was using NFS2, absolutely.
I just don't know my way around the code very well yet. :)

But that mask in nfs3xdr.c (client) doesn't match what the server side is using.
--
Mark Lord
Real-Time Remedies Inc.
[email protected]

2014-04-03 20:12:01

by J. Bruce Fields

[permalink] [raw]
Subject: Re: linux-3.14 nfsd regression

On Thu, Apr 03, 2014 at 03:30:24PM -0400, J. Bruce Fields wrote:
> On Thu, Apr 03, 2014 at 01:51:06PM -0400, Mark Lord wrote:
> > On 14-04-03 01:16 PM, J. Bruce Fields wrote:
> > > On Thu, Apr 03, 2014 at 12:33:55PM -0400, Mark Lord wrote:
> > >> This commit from linux-3.14 breaks our NFS-root clients here:
> > >>
> > >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6e14b46b91fee8a049b0940333ce13a820beaaa5
> > >>
> > >>
> > >> - *p++ = htonl((u32) stat->mode);
> > >> + *p++ = htonl((u32) (stat->mode & S_IALLUGO));
> > >>
> > >>
> > >> Reverting the one-liner above (on the server) fixes it for us,
> > >> as does reverting back to linux-3.13.8 on the server.
> > >>
> > >> The NFS-root clients are on PowerPC (big-endian) architecture,
> > >> running linux-3.12.16. The NFS server is on an Intel PC running linux-3.14.
> > >>
> > >> ACL is completely disabled on server and client,
> > >> and we're using NFSv2/v3. No support for v4.
> > >>
> > >> I instrumented the function to see what other bits were being cleared
> > >> by the (stat->mode & S_IALLUGO) masking. The results are attached.
> > >
> > > Hm, it sounds like a bug in the client if it's depending on those high
> > > bits.
> >
> > But only for mounting / starting up from the nfsroot, it seems.
> > I wonder if there's an unusual code path for that in there?
> > The regular stuff looks mostly fine:
> >
> > p = xdr_decode_ftype3(p, &fmode);
> > fattr->mode = (be32_to_cpup(p++) & ~S_IFMT) | fmode;
>
> Hm, but that's in nfs3xdr.c; in nfs2xdr.c we have just
>
> fattr->mode = be32_to_cpup(p+);
>
> and NFSv2 is the default for nfsroot. Do you have some reason to
> believe you're not using NFSv2?

Oh, bah, after actually writing a patch for this I thought to check the
rfc's and in fact rfc 1094 2.3.5 says that v2 *does* encode the file
type both in the type and mode fields of the attributes, though it
describes this as "a bug in the protocol".

So I think the nfsd patch was just flat-out wrong in the v2 case, and
that it probably just isn't worth "fixing" the client.

But patch included below anyway for amusement value.

--b.

commit 86706287828aa5b4deed6b6b1478e89d2e2c9707
Author: J. Bruce Fields <[email protected]>
Date: Thu Apr 3 16:04:59 2014 -0400

nfs: nfsv2 client shouldn't get ftype from mode

The NFSv2 client is using the high bits of the mode to determine the
file type; use the "type" field instead.

XXX: rfc 1094 actually says this behavior is correct for NFSv2, though
this is described as "a bug in the protocol". So probably this isn't
worth changing.

Signed-off-by: J. Bruce Fields <[email protected]>

diff --git a/fs/nfs/nfs2xdr.c b/fs/nfs/nfs2xdr.c
index 62db136..40fb021 100644
--- a/fs/nfs/nfs2xdr.c
+++ b/fs/nfs/nfs2xdr.c
@@ -166,23 +166,28 @@ out_overflow:
}

/*
- * 2.3.2. ftype
- *
- * enum ftype {
- * NFNON = 0,
- * NFREG = 1,
- * NFDIR = 2,
- * NFBLK = 3,
- * NFCHR = 4,
- * NFLNK = 5
- * };
- *
+ * Map file type to S_IFMT bits
*/
-static __be32 *xdr_decode_ftype(__be32 *p, u32 *type)
+static const umode_t nfs2_type2fmt[] = {
+ [NFNON] = 0,
+ [NFREG] = S_IFREG,
+ [NFDIR] = S_IFDIR,
+ [NFBLK] = S_IFBLK,
+ [NFCHR] = S_IFCHR,
+ [NFLNK] = S_IFLNK,
+ [NFSOCK] = S_IFSOCK,
+ [NFBAD] = 0,
+ [NFFIFO] = S_IFIFO,
+};
+
+static __be32 *xdr_decode_ftype(__be32 *p, umode_t *mode)
{
- *type = be32_to_cpup(p++);
- if (unlikely(*type > NF2FIFO))
- *type = NFBAD;
+ u32 type;
+
+ type = be32_to_cpup(p++);
+ if (unlikely(type > NF2FIFO))
+ type = NFBAD;
+ *mode = nfs2_type2fmt[type];
return p;
}

@@ -277,7 +282,8 @@ static __be32 *xdr_decode_time(__be32 *p, struct timespec *timep)
*/
static int decode_fattr(struct xdr_stream *xdr, struct nfs_fattr *fattr)
{
- u32 rdev, type;
+ umode_t fmode;
+ u32 rdev;
__be32 *p;

p = xdr_inline_decode(xdr, NFS_fattr_sz << 2);
@@ -286,9 +292,9 @@ static int decode_fattr(struct xdr_stream *xdr, struct nfs_fattr *fattr)

fattr->valid |= NFS_ATTR_FATTR_V2;

- p = xdr_decode_ftype(p, &type);
+ p = xdr_decode_ftype(p, &fmode);

- fattr->mode = be32_to_cpup(p++);
+ fattr->mode = (be32_to_cpup(p++) & ~S_IFMT) | fmode;
fattr->nlink = be32_to_cpup(p++);
fattr->uid = make_kuid(&init_user_ns, be32_to_cpup(p++));
if (!uid_valid(fattr->uid))
@@ -302,7 +308,7 @@ static int decode_fattr(struct xdr_stream *xdr, struct nfs_fattr *fattr)

rdev = be32_to_cpup(p++);
fattr->rdev = new_decode_dev(rdev);
- if (type == (u32)NFCHR && rdev == (u32)NFS2_FIFO_DEV) {
+ if (fmode == S_IFCHR && rdev == (u32)NFS2_FIFO_DEV) {
fattr->mode = (fattr->mode & ~S_IFMT) | S_IFIFO;
fattr->rdev = 0;
}

2014-04-03 23:21:57

by Jeff Layton

[permalink] [raw]
Subject: Re: linux-3.14 nfsd regression

On Thu, 3 Apr 2014 16:16:27 -0400
"J. Bruce Fields" <[email protected]> wrote:

> On Thu, Apr 03, 2014 at 02:55:04PM -0400, Jeff Layton wrote:
> > On Thu, 03 Apr 2014 13:51:06 -0400
> > Mark Lord <[email protected]> wrote:
> >
> > > On 14-04-03 01:16 PM, J. Bruce Fields wrote:
> > > > On Thu, Apr 03, 2014 at 12:33:55PM -0400, Mark Lord wrote:
> > > >> This commit from linux-3.14 breaks our NFS-root clients here:
> > > >>
> > > >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6e14b46b91fee8a049b0940333ce13a820beaaa5
> > > >>
> > > >>
> > > >> - *p++ = htonl((u32) stat->mode);
> > > >> + *p++ = htonl((u32) (stat->mode & S_IALLUGO));
> > > >>
> > > >>
> > > >> Reverting the one-liner above (on the server) fixes it for us,
> > > >> as does reverting back to linux-3.13.8 on the server.
> > > >>
> > > >> The NFS-root clients are on PowerPC (big-endian) architecture,
> > > >> running linux-3.12.16. The NFS server is on an Intel PC running linux-3.14.
> > > >>
> > > >> ACL is completely disabled on server and client,
> > > >> and we're using NFSv2/v3. No support for v4.
> > > >>
> > > >> I instrumented the function to see what other bits were being cleared
> > > >> by the (stat->mode & S_IALLUGO) masking. The results are attached.
> > > >
> > > > Hm, it sounds like a bug in the client if it's depending on those high
> > > > bits.
> > >
> > > But only for mounting / starting up from the nfsroot, it seems.
> > > I wonder if there's an unusual code path for that in there?
> > > The regular stuff looks mostly fine:
> > >
> > > p = xdr_decode_ftype3(p, &fmode);
> > > fattr->mode = (be32_to_cpup(p++) & ~S_IFMT) | fmode;
> > >
> > > Except perhaps that second line ought to use the same mask
> > > as the server side is using, just in case there are some other
> > > stray high (higher than S_IFMT) bits in there now/someday.
> > >
> > > > The original behavior was in practice harmless and changing it broke
> > > > something, so I think we should definitely just revert this patch.
> > >
> > > Yup. Who?
> > >
> > > > But the client may need fixing too.
> > >
> > > Probably a good thing in the longer term, for better compatibility
> > > with non-Linux servers. But we'll still have to keep the revert
> > > on the server (nfsd) code for backward compatibility, I think.
> > >
> > > Cheers
> > >
> >
> > It would be good to understand where this is broken in the client.
> >
> > It's incorrect for the client to interpret those bits, as I think that
> > there's no guarantee that other OS's implement the type bits in the
> > same way that Linux does. So if you end up mounting a different OS,
> > it's possible that the client will get that wrong...
>
> It turns out these bits actually are defined in rfc 1094, so this is
> just an odd NFSv2-specific wart, and the nfsd change was just flat-out
> wrong.
>
> --b.

Ahh right -- I remember seeing that long ago.

So according to the RFC you have to encode both the mode bits and the
ftype for v2. The type bits seem to be removed from the mode in NFSv3
though, so perhaps we should only be doing that masking in versions
above v2?

With a quick check, it looks like the v3 code doesn't rely on those bits
and I imagine v4 doesn't either.

It might also be nice to have the client v2 decode_fattr function to
throw a warning if the server sends us mismatched type bits and ftype
values. That would have helped us catch this sooner...

--
Jeff Layton <[email protected]>

2014-04-03 17:16:48

by J. Bruce Fields

[permalink] [raw]
Subject: Re: linux-3.14 nfsd regression

On Thu, Apr 03, 2014 at 12:33:55PM -0400, Mark Lord wrote:
> This commit from linux-3.14 breaks our NFS-root clients here:
>
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6e14b46b91fee8a049b0940333ce13a820beaaa5
>
>
> - *p++ = htonl((u32) stat->mode);
> + *p++ = htonl((u32) (stat->mode & S_IALLUGO));
>
>
> Reverting the one-liner above (on the server) fixes it for us,
> as does reverting back to linux-3.13.8 on the server.
>
> The NFS-root clients are on PowerPC (big-endian) architecture,
> running linux-3.12.16. The NFS server is on an Intel PC running linux-3.14.
>
> ACL is completely disabled on server and client,
> and we're using NFSv2/v3. No support for v4.
>
> I instrumented the function to see what other bits were being cleared
> by the (stat->mode & S_IALLUGO) masking. The results are attached.

Hm, it sounds like a bug in the client if it's depending on those high
bits.

The original behavior was in practice harmless and changing it broke
something, so I think we should definitely just revert this patch.

But the client may need fixing too.

--b.

2014-04-03 21:28:49

by J. Bruce Fields

[permalink] [raw]
Subject: Re: linux-3.14 nfsd regression

On Thu, Apr 03, 2014 at 04:48:11PM -0400, Mark Lord wrote:
> On 14-04-03 03:30 PM, J. Bruce Fields wrote:
> > On Thu, Apr 03, 2014 at 01:51:06PM -0400, Mark Lord wrote:
> >> On 14-04-03 01:16 PM, J. Bruce Fields wrote:
> >>> On Thu, Apr 03, 2014 at 12:33:55PM -0400, Mark Lord wrote:
> >>>> This commit from linux-3.14 breaks our NFS-root clients here:
> >>>>
> >>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6e14b46b91fee8a049b0940333ce13a820beaaa5
> >>>>
> >>>>
> >>>> - *p++ = htonl((u32) stat->mode);
> >>>> + *p++ = htonl((u32) (stat->mode & S_IALLUGO));
> >>>>
> >>>>
> >>>> Reverting the one-liner above (on the server) fixes it for us,
> >>>> as does reverting back to linux-3.13.8 on the server.
> >>>>
> >>>> The NFS-root clients are on PowerPC (big-endian) architecture,
> >>>> running linux-3.12.16. The NFS server is on an Intel PC running linux-3.14.
> >>>>
> >>>> ACL is completely disabled on server and client,
> >>>> and we're using NFSv2/v3. No support for v4.
> >>>>
> >>>> I instrumented the function to see what other bits were being cleared
> >>>> by the (stat->mode & S_IALLUGO) masking. The results are attached.
> >>>
> >>> Hm, it sounds like a bug in the client if it's depending on those high
> >>> bits.
> >>
> >> But only for mounting / starting up from the nfsroot, it seems.
> >> I wonder if there's an unusual code path for that in there?
> >> The regular stuff looks mostly fine:
> >>
> >> p = xdr_decode_ftype3(p, &fmode);
> >> fattr->mode = (be32_to_cpup(p++) & ~S_IFMT) | fmode;
> >
> > Hm, but that's in nfs3xdr.c; in nfs2xdr.c we have just
> >
> > fattr->mode = be32_to_cpup(p+);
> >
> > and NFSv2 is the default for nfsroot. Do you have some reason to
> > believe you're not using NFSv2?
>
> Oh, the client here was using NFS2, absolutely.
> I just don't know my way around the code very well yet. :)
>
> But that mask in nfs3xdr.c (client) doesn't match what the server side is using.

Not sure there's anything to see there.

Looking at include/uapi/linux/stat.h and include/linux/stat.h, if I'm
doing this right...

S_ISFMT is 0170000
S_IALLUGO is 0007777

So they're 16-bit complements. And we're storing the result in a short?

--b.

2014-04-03 17:51:17

by Mark Lord

[permalink] [raw]
Subject: Re: linux-3.14 nfsd regression

On 14-04-03 01:16 PM, J. Bruce Fields wrote:
> On Thu, Apr 03, 2014 at 12:33:55PM -0400, Mark Lord wrote:
>> This commit from linux-3.14 breaks our NFS-root clients here:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6e14b46b91fee8a049b0940333ce13a820beaaa5
>>
>>
>> - *p++ = htonl((u32) stat->mode);
>> + *p++ = htonl((u32) (stat->mode & S_IALLUGO));
>>
>>
>> Reverting the one-liner above (on the server) fixes it for us,
>> as does reverting back to linux-3.13.8 on the server.
>>
>> The NFS-root clients are on PowerPC (big-endian) architecture,
>> running linux-3.12.16. The NFS server is on an Intel PC running linux-3.14.
>>
>> ACL is completely disabled on server and client,
>> and we're using NFSv2/v3. No support for v4.
>>
>> I instrumented the function to see what other bits were being cleared
>> by the (stat->mode & S_IALLUGO) masking. The results are attached.
>
> Hm, it sounds like a bug in the client if it's depending on those high
> bits.

But only for mounting / starting up from the nfsroot, it seems.
I wonder if there's an unusual code path for that in there?
The regular stuff looks mostly fine:

p = xdr_decode_ftype3(p, &fmode);
fattr->mode = (be32_to_cpup(p++) & ~S_IFMT) | fmode;

Except perhaps that second line ought to use the same mask
as the server side is using, just in case there are some other
stray high (higher than S_IFMT) bits in there now/someday.

> The original behavior was in practice harmless and changing it broke
> something, so I think we should definitely just revert this patch.

Yup. Who?

> But the client may need fixing too.

Probably a good thing in the longer term, for better compatibility
with non-Linux servers. But we'll still have to keep the revert
on the server (nfsd) code for backward compatibility, I think.

Cheers



2014-04-03 16:53:58

by Mark Lord

[permalink] [raw]
Subject: Re: linux-3.14 nfsd regression

Mark Lord wrote:
> On 14-04-03 12:33 PM, Mark Lord wrote:
>> This commit from linux-3.14 breaks our NFS-root clients here:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6e14b46b91fee8a049b0940333ce13a820beaaa5
>>
>>
>> - *p++ = htonl((u32) stat->mode);
>> + *p++ = htonl((u32) (stat->mode & S_IALLUGO));
>>
>>
>> Reverting the one-liner above (on the server) fixes it for us,
>> as does reverting back to linux-3.13.8 on the server.
>>
>> The NFS-root clients are on PowerPC (big-endian) architecture,
>> running linux-3.12.16. The NFS server is on an Intel PC running linux-3.14.
>>
>> ACL is completely disabled on server and client,
>> and we're using NFSv2/v3. No support for v4.
>>
>> I instrumented the function to see what other bits were being cleared
>> by the (stat->mode & S_IALLUGO) masking. The results are attached.
> ..
>
>
> Looking into the (previously attached) trace, the bits that seem to be
> getting chopped most often are S_IFREG (0x8000) and S_IFDIR (0x4000).
> It appears that one/both of those are needed for mounting nfsroot.
>
>
> [ 2733.823753] encode_fattr: mode=0x000041ed mask=0x00000fff
> [ 2733.824217] encode_fattr: mode=0x000041ed mask=0x00000fff
> [ 2733.838769] encode_fattr: mode=0x000041ed mask=0x00000fff
> [ 2733.839175] encode_fattr: mode=0x0000a1ff mask=0x00000fff
> [ 2733.839895] encode_fattr: mode=0x000081ed mask=0x00000fff
...

Okay, the NFS root filesystem mounts and appears to work if I use this:

*p++ = htonl((u32) (stat->mode & (S_IALLUGO | S_IFDIR | S_IFREG | S_IFCHR)));

But I suspect it may also need S_IFBLK for block device accesses as well.

Cheers


2014-04-03 21:32:56

by Mark Lord

[permalink] [raw]
Subject: Re: linux-3.14 nfsd regression

On 14-04-03 05:28 PM, J. Bruce Fields wrote:
> On Thu, Apr 03, 2014 at 04:48:11PM -0400, Mark Lord wrote:
>> On 14-04-03 03:30 PM, J. Bruce Fields wrote:
>>> On Thu, Apr 03, 2014 at 01:51:06PM -0400, Mark Lord wrote:
>>>> On 14-04-03 01:16 PM, J. Bruce Fields wrote:
>>>>> On Thu, Apr 03, 2014 at 12:33:55PM -0400, Mark Lord wrote:
>>>>>> This commit from linux-3.14 breaks our NFS-root clients here:
>>>>>>
>>>>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6e14b46b91fee8a049b0940333ce13a820beaaa5
>>>>>>
>>>>>>
>>>>>> - *p++ = htonl((u32) stat->mode);
>>>>>> + *p++ = htonl((u32) (stat->mode & S_IALLUGO));
>>>>>>
>>>>>>
>>>>>> Reverting the one-liner above (on the server) fixes it for us,
>>>>>> as does reverting back to linux-3.13.8 on the server.
>>>>>>
>>>>>> The NFS-root clients are on PowerPC (big-endian) architecture,
>>>>>> running linux-3.12.16. The NFS server is on an Intel PC running linux-3.14.
>>>>>>
>>>>>> ACL is completely disabled on server and client,
>>>>>> and we're using NFSv2/v3. No support for v4.
>>>>>>
>>>>>> I instrumented the function to see what other bits were being cleared
>>>>>> by the (stat->mode & S_IALLUGO) masking. The results are attached.
>>>>>
>>>>> Hm, it sounds like a bug in the client if it's depending on those high
>>>>> bits.
>>>>
>>>> But only for mounting / starting up from the nfsroot, it seems.
>>>> I wonder if there's an unusual code path for that in there?
>>>> The regular stuff looks mostly fine:
>>>>
>>>> p = xdr_decode_ftype3(p, &fmode);
>>>> fattr->mode = (be32_to_cpup(p++) & ~S_IFMT) | fmode;
>>>
>>> Hm, but that's in nfs3xdr.c; in nfs2xdr.c we have just
>>>
>>> fattr->mode = be32_to_cpup(p+);
>>>
>>> and NFSv2 is the default for nfsroot. Do you have some reason to
>>> believe you're not using NFSv2?
>>
>> Oh, the client here was using NFS2, absolutely.
>> I just don't know my way around the code very well yet. :)
>>
>> But that mask in nfs3xdr.c (client) doesn't match what the server side is using.
>
> Not sure there's anything to see there.
>
> Looking at include/uapi/linux/stat.h and include/linux/stat.h, if I'm
> doing this right...
>
> S_ISFMT is 0170000
> S_IALLUGO is 0007777
>
> So they're 16-bit complements. And we're storing the result in a short?

Oh, is it a short? I was just going by the be32_to_cpup() macro (not a short).
But if the target field is, then okay for now, until someone expands it.

Cheers
--
Mark Lord
Real-Time Remedies Inc.
[email protected]

2014-04-03 19:55:17

by J. Bruce Fields

[permalink] [raw]
Subject: Re: linux-3.14 nfsd regression

On Thu, Apr 03, 2014 at 01:51:06PM -0400, Mark Lord wrote:
> On 14-04-03 01:16 PM, J. Bruce Fields wrote:
> > On Thu, Apr 03, 2014 at 12:33:55PM -0400, Mark Lord wrote:
> >> This commit from linux-3.14 breaks our NFS-root clients here:
> >>
> >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6e14b46b91fee8a049b0940333ce13a820beaaa5
> >>
> >>
> >> - *p++ = htonl((u32) stat->mode);
> >> + *p++ = htonl((u32) (stat->mode & S_IALLUGO));
> >>
> >>
> >> Reverting the one-liner above (on the server) fixes it for us,
> >> as does reverting back to linux-3.13.8 on the server.
> >>
> >> The NFS-root clients are on PowerPC (big-endian) architecture,
> >> running linux-3.12.16. The NFS server is on an Intel PC running linux-3.14.
> >>
> >> ACL is completely disabled on server and client,
> >> and we're using NFSv2/v3. No support for v4.
> >>
> >> I instrumented the function to see what other bits were being cleared
> >> by the (stat->mode & S_IALLUGO) masking. The results are attached.
> >
> > Hm, it sounds like a bug in the client if it's depending on those high
> > bits.
>
> But only for mounting / starting up from the nfsroot, it seems.
> I wonder if there's an unusual code path for that in there?
> The regular stuff looks mostly fine:
>
> p = xdr_decode_ftype3(p, &fmode);
> fattr->mode = (be32_to_cpup(p++) & ~S_IFMT) | fmode;

Hm, but that's in nfs3xdr.c; in nfs2xdr.c we have just

fattr->mode = be32_to_cpup(p+);

and NFSv2 is the default for nfsroot. Do you have some reason to
believe you're not using NFSv2?

--b.

>
> Except perhaps that second line ought to use the same mask
> as the server side is using, just in case there are some other
> stray high (higher than S_IFMT) bits in there now/someday.
>
> > The original behavior was in practice harmless and changing it broke
> > something, so I think we should definitely just revert this patch.
>
> Yup. Who?
>
> > But the client may need fixing too.
>
> Probably a good thing in the longer term, for better compatibility
> with non-Linux servers. But we'll still have to keep the revert
> on the server (nfsd) code for backward compatibility, I think.
>
> Cheers
>
>

2014-04-03 18:55:15

by Jeff Layton

[permalink] [raw]
Subject: Re: linux-3.14 nfsd regression

On Thu, 03 Apr 2014 13:51:06 -0400
Mark Lord <[email protected]> wrote:

> On 14-04-03 01:16 PM, J. Bruce Fields wrote:
> > On Thu, Apr 03, 2014 at 12:33:55PM -0400, Mark Lord wrote:
> >> This commit from linux-3.14 breaks our NFS-root clients here:
> >>
> >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6e14b46b91fee8a049b0940333ce13a820beaaa5
> >>
> >>
> >> - *p++ = htonl((u32) stat->mode);
> >> + *p++ = htonl((u32) (stat->mode & S_IALLUGO));
> >>
> >>
> >> Reverting the one-liner above (on the server) fixes it for us,
> >> as does reverting back to linux-3.13.8 on the server.
> >>
> >> The NFS-root clients are on PowerPC (big-endian) architecture,
> >> running linux-3.12.16. The NFS server is on an Intel PC running linux-3.14.
> >>
> >> ACL is completely disabled on server and client,
> >> and we're using NFSv2/v3. No support for v4.
> >>
> >> I instrumented the function to see what other bits were being cleared
> >> by the (stat->mode & S_IALLUGO) masking. The results are attached.
> >
> > Hm, it sounds like a bug in the client if it's depending on those high
> > bits.
>
> But only for mounting / starting up from the nfsroot, it seems.
> I wonder if there's an unusual code path for that in there?
> The regular stuff looks mostly fine:
>
> p = xdr_decode_ftype3(p, &fmode);
> fattr->mode = (be32_to_cpup(p++) & ~S_IFMT) | fmode;
>
> Except perhaps that second line ought to use the same mask
> as the server side is using, just in case there are some other
> stray high (higher than S_IFMT) bits in there now/someday.
>
> > The original behavior was in practice harmless and changing it broke
> > something, so I think we should definitely just revert this patch.
>
> Yup. Who?
>
> > But the client may need fixing too.
>
> Probably a good thing in the longer term, for better compatibility
> with non-Linux servers. But we'll still have to keep the revert
> on the server (nfsd) code for backward compatibility, I think.
>
> Cheers
>

It would be good to understand where this is broken in the client.

It's incorrect for the client to interpret those bits, as I think that
there's no guarantee that other OS's implement the type bits in the
same way that Linux does. So if you end up mounting a different OS,
it's possible that the client will get that wrong...

--
Jeff Layton <[email protected]>

2014-05-01 19:59:47

by J. Bruce Fields

[permalink] [raw]
Subject: Re: linux-3.14 nfsd regression

On Thu, May 01, 2014 at 07:50:34AM -0400, Mark Lord wrote:
> Still a regression in 3.14.2 now.
> Anyone got plans to push this patch out to mainline, as well as +stable ?

This is upstream as 082f31a2169bd639785e45bf252f3d5bce0303c6 "nfsd:
revert v2 half of "nfsd: don't return high mode bits"", with a cc to
[email protected], so I assume it's just taking a little time for stable
to pick it up.

--b.

2014-05-01 11:50:43

by Mark Lord

[permalink] [raw]
Subject: Re: linux-3.14 nfsd regression

On 14-04-04 09:58 AM, J. Bruce Fields wrote:
> On Thu, Apr 03, 2014 at 07:21:46PM -0400, Jeff Layton wrote:
>> So according to the RFC you have to encode both the mode bits and the
>> ftype for v2. The type bits seem to be removed from the mode in NFSv3
>> though, so perhaps we should only be doing that masking in versions
>> above v2?
>
> Right, the problematic patch applied the same mask in both v2 and v3
> cases, so I'm reverting just the v2 part (see below).
>
>> With a quick check, it looks like the v3 code doesn't rely on those bits
>> and I imagine v4 doesn't either.
>>
>> It might also be nice to have the client v2 decode_fattr function to
>> throw a warning if the server sends us mismatched type bits and ftype
>> values. That would have helped us catch this sooner...
>
> Yes, that might be a reasonable thing to do, though I don't know if it's
> worth it.
>
> --b.
>
> commit 35a8dff14e76c00e5b52140290cfb498dc2454a0
> Author: J. Bruce Fields <[email protected]>
> Date: Thu Apr 3 15:10:35 2014 -0400
>
> nfsd: revert v2 half of "nfsd: don't return high mode bits"
>
> This reverts the part of commit 6e14b46b91fee8a049b0940333ce13a820beaaa5
> that changes NFSv2 behavior.
>
> Mark Lord found that it broke nfs-root for Linux clients, because it
> broke NFSv2.
>
> In fact, from RFC 1094:
>
> "Notice that the file type is specified both in the mode bits
> and in the file type. This is really a bug in the protocol and
> will be fixed in future versions."
>
> So NFSv2 clients really are expected to depend on the high bits of the
> mode.
>
> Cc: [email protected]
> Reported-by: Mark Lord <[email protected]>
> Signed-off-by: J. Bruce Fields <[email protected]>
>
> diff --git a/fs/nfsd/nfsxdr.c b/fs/nfsd/nfsxdr.c
> index b17d932..9c769a4 100644
> --- a/fs/nfsd/nfsxdr.c
> +++ b/fs/nfsd/nfsxdr.c
> @@ -152,7 +152,7 @@ encode_fattr(struct svc_rqst *rqstp, __be32 *p, struct svc_fh *fhp,
> type = (stat->mode & S_IFMT);
>
> *p++ = htonl(nfs_ftypes[type >> 12]);
> - *p++ = htonl((u32) (stat->mode & S_IALLUGO));
> + *p++ = htonl((u32) stat->mode);
> *p++ = htonl((u32) stat->nlink);
> *p++ = htonl((u32) from_kuid(&init_user_ns, stat->uid));
> *p++ = htonl((u32) from_kgid(&init_user_ns, stat->gid));
>


Still a regression in 3.14.2 now.
Anyone got plans to push this patch out to mainline, as well as +stable ?

--
Mark Lord
Real-Time Remedies Inc.
[email protected]