2015-06-18 08:23:10

by Anders Blomdell

Subject: Mount regression between 4.0.4 client and 2.6.35 server

I have a problem with a 4.0.4 client refusing to mount from a 2.6.35 server
due to NFS4ERR_INVAL returned during nfs4_discover_server_trunking. See
https://bugzilla.redhat.com/show_bug.cgi?id=1228272.

Proposed patch attached.

Regards

Anders Blomdell
--
Anders Blomdell Email: [email protected]
Department of Automatic Control
Lund University Phone: +46 46 222 4625
P.O. Box 118 Fax: +46 46 138118
SE-221 00 Lund, Sweden


Attachments:
nfs4_discover_server_trunking-eio.patch (866.00 B)

2015-06-18 11:49:53

by Trond Myklebust

Subject: Re: Mount regression between 4.0.4 client and 2.6.35 server

On Thu, Jun 18, 2015 at 4:04 AM, Anders Blomdell
<[email protected]> wrote:
>
> I have a problem with a 4.0.4 client refusing to mount from a 2.6.35 server
> due to NFS4ERR_INVAL returned during nfs4_discover_server_trunking. See
> https://bugzilla.redhat.com/show_bug.cgi?id=1228272.


Why should we change the clients if the server is in clear and obvious
violation of the spec?

Cheers
Trond

2015-06-18 12:28:23

by Anders Blomdell

Subject: Re: Mount regression between 4.0.4 client and 2.6.35 server

On 2015-06-18 13:49, Trond Myklebust wrote:
> On Thu, Jun 18, 2015 at 4:04 AM, Anders Blomdell
> <[email protected]> wrote:
>>
>> I have a problem with a 4.0.4 client refusing to mount from a 2.6.35 server
>> due to NFS4ERR_INVAL returned during nfs4_discover_server_trunking. See
>> https://bugzilla.redhat.com/show_bug.cgi?id=1228272.
>
>
> Why should we change the clients if the server is in clear and obvious
> violation of the spec?
To make clients work with servers that worked well with previous versions
of nfs-utils. The culprit is probably commit f9802988, which bumped the default
autonegotiation version to 4.2. All the patch does is negotiate a lower version
in case of errors, which makes 1.3.2 work with servers that worked with
1.3.1 (which only tried version 4[.0]).

Will probably save some people some time.

/Anders

--
Anders Blomdell Email: [email protected]
Department of Automatic Control
Lund University Phone: +46 46 222 4625
P.O. Box 118 Fax: +46 46 138118
SE-221 00 Lund, Sweden


2015-06-18 12:53:59

by Trond Myklebust

Subject: Re: Mount regression between 4.0.4 client and 2.6.35 server

On Thu, Jun 18, 2015 at 8:28 AM, Anders Blomdell
<[email protected]> wrote:
> On 2015-06-18 13:49, Trond Myklebust wrote:
>> On Thu, Jun 18, 2015 at 4:04 AM, Anders Blomdell
>> <[email protected]> wrote:
>>>
>>> I have a problem with a 4.0.4 client refusing to mount from a 2.6.35 server
>>> due to NFS4ERR_INVAL returned during nfs4_discover_server_trunking. See
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1228272.
>>
>>
>> Why should we change the clients if the server is in clear and obvious
>> violation of the spec?
> To make clients work with servers that worked well with previous versions
> of nfs-utils. The culprit is probably commit f9802988, which bumped the default
> autonegotiation version to 4.2. All the patch does is negotiate a lower version
> in case of errors, which makes 1.3.2 work with servers that worked with
> 1.3.1 (which only tried version 4[.0]).
>
> Will probably save some people some time.

This is what /etc/nfsmount.conf is for. We don't fix clients that are
working correctly according to the protocol spec.

Trond

2015-06-18 14:20:41

by Steve Dickson

Subject: Re: Mount regression between 4.0.4 client and 2.6.35 server

Hello,

BTW, I asked Anders to post this patch to
get this discussion going.

On 06/18/2015 08:53 AM, Trond Myklebust wrote:
> On Thu, Jun 18, 2015 at 8:28 AM, Anders Blomdell
> <[email protected]> wrote:
>> On 2015-06-18 13:49, Trond Myklebust wrote:
>>> On Thu, Jun 18, 2015 at 4:04 AM, Anders Blomdell
>>> <[email protected]> wrote:
>>>>
>>>> I have a problem with a 4.0.4 client refusing to mount from a 2.6.35 server
>>>> due to NFS4ERR_INVAL returned during nfs4_discover_server_trunking. See
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1228272.
>>>
>>>
>>> Why should we change the clients if the server is in clear and obvious
>>> violation of the spec?
>> To make clients work with servers that worked well with previous versions
>> of nfs-utils. The culprit is probably commit f9802988, which bumped the default
>> autonegotiation version to 4.2. All the patch does is negotiate a lower version
>> in case of errors, which makes 1.3.2 work with servers that worked with
>> 1.3.1 (which only tried version 4[.0]).
>>
>> Will probably save some people some time.
>
> This is what /etc/nfsmount.conf is for. We don't fix clients that are
> working correctly according to the protocol spec.
>
Using /etc/nfsmount.conf does not scale very well when there is
a large number of clients involved...

When we bumped up the auto-negotiation to 4.2, I was thinking there
would be issues with legacy servers. But I was thinking obscure
servers like AIX would break, not old Linux servers and possibly
Solaris servers (I'm trying to check that now).

Inlining the patch:

--- utils/mount/stropts.c.orig 2015-06-18 09:51:02.091148891 +0200
+++ utils/mount/stropts.c 2015-06-18 09:48:56.859970023 +0200
@@ -838,6 +838,10 @@ check_result:
 		return result;

 	switch (errno) {
+	case EIO:
+		/* Fix to handle nfs4_discover_server_trunking returning
+		 * EIO in case where nfs server returns NFS4ERR_INVAL,
+		 * see https://bugzilla.redhat.com/show_bug.cgi?id=1228272 */
 	case EPROTONOSUPPORT:
 		/* A clear indication that the server or our
 		 * client does not support NFS version 4 and minor */

Now I'm not sure this is the right way to handle this, but
I do think it is a good idea to make the auto-negotiation
mounting code smarter to handle these legacy servers.

steved.
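
The fallback idea behind the patch can be sketched as below. This is a minimal
illustration, not the nfs-utils code: should_try_lower_minor(),
negotiate_minor() and simulated_old_server_mount() are hypothetical names, and
the simulated server's answers (EPROTONOSUPPORT for 4.2, EIO for 4.1) are an
assumption based on the behaviour reported in this thread.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper (not from nfs-utils): decide whether a failed mount
 * attempt at one NFSv4 minor version should be retried at a lower one. */
static bool should_try_lower_minor(int err, bool treat_eio_as_version_error)
{
	switch (err) {
	case EIO:
		/* The proposed patch lets EIO share the EPROTONOSUPPORT
		 * handling; without the patch EIO ends the negotiation. */
		if (!treat_eio_as_version_error)
			return false;
		/* fall through */
	case EPROTONOSUPPORT:
		return true;
	default:
		return false;
	}
}

/* Hypothetical negotiation loop: start at the highest minor version and
 * step down while the failure looks version related. try_mount returns 0
 * on success or an errno value. Returns the minor version that mounted,
 * or -1 if negotiation gave up. */
static int negotiate_minor(int highest_minor, int (*try_mount)(int minor),
			   bool treat_eio_as_version_error)
{
	for (int minor = highest_minor; minor >= 0; minor--) {
		int err = try_mount(minor);

		if (err == 0)
			return minor;
		if (!should_try_lower_minor(err, treat_eio_as_version_error))
			return -1;
	}
	return -1;
}

/* Simulated legacy server (an assumption for illustration only):
 * 4.2 is refused cleanly, 4.1 fails with EIO, 4.0 mounts. */
static int simulated_old_server_mount(int minor)
{
	if (minor >= 2)
		return EPROTONOSUPPORT;
	if (minor == 1)
		return EIO;
	return 0;
}
```

With treat_eio_as_version_error set, negotiate_minor(2,
simulated_old_server_mount, true) steps down to minor version 0. With it
cleared, the same call gives up at 4.1, which is the regression reported at
the top of the thread.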

2015-06-26 12:17:30

by Benjamin Coddington

Subject: Re: Mount regression between 4.0.4 client and 2.6.35 server

On Thu, 18 Jun 2015, Trond Myklebust wrote:

> On Thu, Jun 18, 2015 at 4:04 AM, Anders Blomdell
> <[email protected]> wrote:
> >
> > I have a problem with a 4.0.4 client refusing to mount from a 2.6.35 server
> > due to NFS4ERR_INVAL returned during nfs4_discover_server_trunking. See
> > https://bugzilla.redhat.com/show_bug.cgi?id=1228272.
>
>
> Why should we change the clients if the server is in clear and obvious
> violation of the spec?


What's happening here is knfsd older than 2.6.38 will return NFS4ERR_INVAL
for EXCHANGE_ID that has EXCHGID4_FLAG_BIND_PRINC_STATEID set in the
request.

Should the client's use of stateid/princ binding be optional/configurable?

Ben

2015-06-26 17:50:32

by J. Bruce Fields

Subject: Re: Mount regression between 4.0.4 client and 2.6.35 server

On Fri, Jun 26, 2015 at 08:17:30AM -0400, Benjamin Coddington wrote:
> On Thu, 18 Jun 2015, Trond Myklebust wrote:
>
> > On Thu, Jun 18, 2015 at 4:04 AM, Anders Blomdell
> > <[email protected]> wrote:
> > >
> > > I have a problem with a 4.0.4 client refusing to mount from a 2.6.35 server
> > > due to NFS4ERR_INVAL returned during nfs4_discover_server_trunking. See
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1228272.
> >
> >
> > Why should we change the clients if the server is in clear and obvious
> > violation of the spec?
>
>
> What's happening here is knfsd older than 2.6.38 will return NFS4ERR_INVAL
> for EXCHANGE_ID that has EXCHGID4_FLAG_BIND_PRINC_STATEID set in the
> request.

Note that NFSv4.1 support was intentionally left off by default until 3.11,
exactly because I thought there was a high risk of incompatibility with
future clients at that point.

> Should the client's use of stateid/princ binding be optional/configurable?

So, perhaps there's other reasons for doing this, but we're not going to
justify it for compatibility with out-of-spec behavior in older
experimental server code.

Possibly more annoying is that Solaris servers (I don't know which
versions, I just have one report that says "Solaris 10") are also
returning GARBAGE_ARGS instead of NFS4ERR_MINOR_VERS_MISMATCH on
attempts to use 4.2.

--b.

2015-06-26 17:58:35

by Trond Myklebust

Subject: Re: Mount regression between 4.0.4 client and 2.6.35 server

On Fri, Jun 26, 2015 at 1:50 PM, J. Bruce Fields <[email protected]> wrote:
> On Fri, Jun 26, 2015 at 08:17:30AM -0400, Benjamin Coddington wrote:
>> On Thu, 18 Jun 2015, Trond Myklebust wrote:
>>
>> > On Thu, Jun 18, 2015 at 4:04 AM, Anders Blomdell
>> > <[email protected]> wrote:
>> > >
>> > > I have a problem with a 4.0.4 client refusing to mount from a 2.6.35 server
>> > > due to NFS4ERR_INVAL returned during nfs4_discover_server_trunking. See
>> > > https://bugzilla.redhat.com/show_bug.cgi?id=1228272.
>> >
>> >
>> > Why should we change the clients if the server is in clear and obvious
>> > violation of the spec?
>>
>>
>> What's happening here is knfsd older than 2.6.38 will return NFS4ERR_INVAL
>> for EXCHANGE_ID that has EXCHGID4_FLAG_BIND_PRINC_STATEID set in the
>> request.
>
> Note that NFSv4.1 support was intentionally left off by default until 3.11,
> exactly because I thought there was a high risk of incompatibility with
> future clients at that point.
>
>> Should the client's use of stateid/princ binding be optional/configurable?
>
> So, perhaps there's other reasons for doing this, but we're not going to
> justify it for compatibility with out-of-spec behavior in older
> experimental server code.

The protocol gives the server the option of rejecting the client's
request for bind_princ_stateid by simply turning off the flag in its
reply. See https://tools.ietf.org/html/rfc5661#section-18.35.3

> Possibly more annoying is that Solaris servers (I don't know which
> versions, I just have one report that says "Solaris 10") are also
> returning GARBAGE_ARGS instead of NFS4ERR_MINOR_VERS_MISMATCH on
> attempts to use 4.2.

2015-06-26 18:32:20

by J. Bruce Fields

Subject: Re: Mount regression between 4.0.4 client and 2.6.35 server

On Fri, Jun 26, 2015 at 01:58:35PM -0400, Trond Myklebust wrote:
> On Fri, Jun 26, 2015 at 1:50 PM, J. Bruce Fields <[email protected]> wrote:
> > On Fri, Jun 26, 2015 at 08:17:30AM -0400, Benjamin Coddington wrote:
> >> On Thu, 18 Jun 2015, Trond Myklebust wrote:
> >>
> >> > On Thu, Jun 18, 2015 at 4:04 AM, Anders Blomdell
> >> > <[email protected]> wrote:
> >> > >
> >> > > I have a problem with a 4.0.4 client refusing to mount from a 2.6.35 server
> >> > > due to NFS4ERR_INVAL returned during nfs4_discover_server_trunking. See
> >> > > https://bugzilla.redhat.com/show_bug.cgi?id=1228272.
> >> >
> >> >
> >> > Why should we change the clients if the server is in clear and obvious
> >> > violation of the spec?
> >>
> >>
> >> What's happening here is knfsd older than 2.6.38 will return NFS4ERR_INVAL
> >> for EXCHANGE_ID that has EXCHGID4_FLAG_BIND_PRINC_STATEID set in the
> >> request.
> >
> > Note that NFSv4.1 support was intentionally left off by default until 3.11,
> > exactly because I thought there was a high risk of incompatibility with
> > future clients at that point.
> >
> >> Should the client's use of stateid/princ binding be optional/configurable?
> >
> > So, perhaps there's other reasons for doing this, but we're not going to
> > justify it for compatibility with out-of-spec behavior in older
> > experimental server code.
>
> The protocol gives the server the option of rejecting the client's
> request for bind_princ_stateid by simply turning off the flag in its
> reply. See https://tools.ietf.org/html/rfc5661#section-18.35.3

Yeah, that's what it's doing.

(The failure in this case was because the code just didn't know about
the flag at all--I think it still had constants taken from an earlier
ietf draft. Like I say, experimental code.)

--b.

>
> > Possibly more annoying is that Solaris servers (I don't know which
> > versions, I just have one report that says "Solaris 10") are also
> > returning GARBAGE_ARGS instead of NFS4ERR_MINOR_VERS_MISMATCH on
> > attempts to use 4.2.
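
The EXCHANGE_ID exchange discussed above can be sketched as follows. This is
an illustration only, not knfsd code: handle_exchange_id(), the reply struct
and the two KNOWN_FLAGS_* masks are hypothetical, and the "stale" mask simply
assumes that the old server's constants lacked BIND_PRINC_STATEID. The flag
values and NFS4ERR_INVAL are taken from RFC 5661.

```c
#include <stdint.h>

/* EXCHANGE_ID request flags, values as defined in RFC 5661. */
#define EXCHGID4_FLAG_SUPP_MOVED_REFER    0x00000001u
#define EXCHGID4_FLAG_SUPP_MOVED_MIGR     0x00000002u
#define EXCHGID4_FLAG_BIND_PRINC_STATEID  0x00000100u
#define EXCHGID4_FLAG_MASK_PNFS           0x00070000u
#define EXCHGID4_FLAG_UPD_CONFIRMED_REC_A 0x40000000u

#define NFS4_OK       0
#define NFS4ERR_INVAL 22

/* Mask of request flags a server following the final RFC would accept. */
#define KNOWN_FLAGS_FINAL (EXCHGID4_FLAG_SUPP_MOVED_REFER | \
			   EXCHGID4_FLAG_SUPP_MOVED_MIGR | \
			   EXCHGID4_FLAG_BIND_PRINC_STATEID | \
			   EXCHGID4_FLAG_MASK_PNFS | \
			   EXCHGID4_FLAG_UPD_CONFIRMED_REC_A)

/* Assumed mask for a server built from an earlier draft: the same set
 * without BIND_PRINC_STATEID. The real draft constants are not shown
 * in this thread. */
#define KNOWN_FLAGS_STALE (KNOWN_FLAGS_FINAL & ~EXCHGID4_FLAG_BIND_PRINC_STATEID)

/* Hypothetical, simplified reply: status code plus the flags echoed back. */
struct exchange_id_reply {
	int status;
	uint32_t flags;
};

/* Hypothetical server-side check. Flags outside known_flags are rejected
 * with NFS4ERR_INVAL. A server that knows BIND_PRINC_STATEID but does not
 * want it declines by clearing the bit in its reply. */
static struct exchange_id_reply handle_exchange_id(uint32_t req_flags,
						   uint32_t known_flags)
{
	struct exchange_id_reply reply = { NFS4_OK, 0 };

	if (req_flags & ~known_flags) {
		reply.status = NFS4ERR_INVAL;
		return reply;
	}
	reply.flags = req_flags & ~EXCHGID4_FLAG_BIND_PRINC_STATEID;
	return reply;
}
```

Called with KNOWN_FLAGS_STALE, a request carrying BIND_PRINC_STATEID is
rejected outright, matching the failure described for servers older than
2.6.38. Called with KNOWN_FLAGS_FINAL, the same request succeeds and the
server declines the binding by dropping the bit.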