2010-04-01 22:48:48

by Tom Tucker

Subject: [PATCH,RFC] nfsd: Make INET6 transport creation failure an informational message

Hi Bruce/Chuck,

RDMA Transports are currently broken in 2.6.34 because they don't have a
V4ONLY setsockopt. So what happens is that when write_ports attempts to
create the PF_INET6 transport it fails because the port is already in
use. There is discussion on linux-rdma about how to fix this, but in the
interim and perhaps indefinitely, I propose the following:

Tom

nfsd: Make INET6 transport creation failure an informational message

The write_ports code will fail both the INET4 and INET6 transport creation if
the transport returns an error when PF_INET6 is specified. Some transports
that do not support INET6 return an error other than EAFNOSUPPORT. We should
allow communication on INET4 even if INET6 is not yet supported or fails
for some reason.

Signed-off-by: Tom Tucker <[email protected]>
---

fs/nfsd/nfsctl.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 0f0e77f..934b624 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1008,8 +1008,10 @@ static ssize_t __write_ports_addxprt(char *buf)

 	err = svc_create_xprt(nfsd_serv, transport,
 				PF_INET6, port, SVC_SOCK_ANONYMOUS);
-	if (err < 0 && err != -EAFNOSUPPORT)
-		goto out_close;
+	if (err < 0)
+		printk(KERN_INFO "nfsd: Error creating PF_INET6 listener "
+			"for transport '%s'\n", transport);
+
 	return 0;
 out_close:
 	xprt = svc_find_xprt(nfsd_serv, transport, PF_INET, port);



2010-04-02 18:05:23

by Chuck Lever III

Subject: Re: [PATCH,RFC] nfsd: Make INET6 transport creation failure an informational message

Hi Roland-

On 04/02/2010 01:22 PM, Roland Dreier wrote:
> > > The write_ports code will fail both the INET4 and INET6 transport
> > > creation if
> > > the transport returns an error when PF_INET6 is specified. Some transports
> > > that do not support INET6 return an error other than EAFNOSUPPORT.
> >
> > That's the real bug. Any reason the RDMA RPC transport can't return
> > EAFNOSUPPORT in this case?
>
> I think Tom's changelog is misleading. The problem is that the RDMA
> transport actually does support IPv6, but it doesn't support the
> IPV6ONLY option yet. So if NFS/RDMA binds to a port for IPv4, then the
> IPv6 bind fails because of the port collision.

IPV6ONLY is a requirement for RPC over IPv6. If the underlying
transport does not support IPV6ONLY, then it cannot properly support RPC
over IPv6. It's easy enough to catch listener creation calls for IPv6
on such transports, and simply return EAFNOSUPPORT until support for
IPV6ONLY can be provided.

The __write_ports() interface is specifically designed to silently fall
back to IPv4-only when IPv6 transport creation fails with EAFNOSUPPORT.
I don't see a good reason to change the generic logic in
__write_ports() if there is a problem with implementing RPC over IPv6 in
a specific transport capability. __write_ports() will do the right
thing if the transport returns the correct error code.
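
The fallback described above can be modelled in user space. This is only a sketch, not the nfsctl.c code: create_listener_fn is a hypothetical stand-in for svc_create_xprt(), the two fake transports are invented for illustration, and the real code would also close the IPv4 listener on a hard IPv6 failure.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>   /* PF_INET, PF_INET6 */

/* Hypothetical stand-in for svc_create_xprt(): returns 0 or a negative errno. */
typedef int (*create_listener_fn)(const char *transport, int family,
                                  unsigned short port);

/*
 * Model of the fallback: the IPv4 listener is mandatory, and an IPv6
 * failure is ignored only when the transport reports -EAFNOSUPPORT.
 * Returns 0 on success or the negative errno that aborted the setup.
 * *have_ipv6 is set to 1 only if the IPv6 listener was created.
 */
int add_listeners(create_listener_fn create, const char *transport,
                  unsigned short port, int *have_ipv6)
{
    int err;

    *have_ipv6 = 0;

    err = create(transport, PF_INET, port);
    if (err < 0)
        return err;

    err = create(transport, PF_INET6, port);
    if (err == -EAFNOSUPPORT)
        return 0;      /* silent IPv4-only fallback */
    if (err < 0)
        return err;    /* the kernel code would also drop the IPv4 listener here */

    *have_ipv6 = 1;
    return 0;
}

/* Fake transport that reports "no IPv6" with the expected error code. */
int fake_no_ipv6(const char *transport, int family, unsigned short port)
{
    (void)transport;
    (void)port;
    return family == PF_INET6 ? -EAFNOSUPPORT : 0;
}

/* Fake transport whose IPv6 listener collides with the IPv4 one. */
int fake_port_collision(const char *transport, int family, unsigned short port)
{
    (void)transport;
    (void)port;
    return family == PF_INET6 ? -EADDRINUSE : 0;
}
```

With fake_no_ipv6 the call succeeds with an IPv4-only setup. With fake_port_collision the whole operation fails, which is the behaviour RDMA currently hits.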

> Implementing the IPV6ONLY option for RDMA binding is probably not
> feasible for 2.6.34, so the best band-aid for now seems to be Tom's
> patch.

My recent experience with similar changes suggests the specific solution
Tom proposed will trigger extra bug reports and e-mails, as the change
appears to affect non-RDMA transports as well. This printk might fire,
for example, for INET transports on systems that are built without IPv6
support, or where ipv6.ko is blacklisted in user space.

In other words, I agree that there's a bug that should be addressed in
2.6.34, and I don't have any problem with setting up only an IPv4
listener in this case. But I think the addition of a printk that fires
for all transports in this case is problematic.

It would be better to address this in the RPC/RDMA transport capability,
and not in generic upper level logic. We already have correct behavior
in __write_ports, and the RPC/RDMA transport capability should be
changed to use it.

--
chuck[dot]lever[at]oracle[dot]com

2010-04-02 17:22:21

by Roland Dreier

Subject: Re: [PATCH,RFC] nfsd: Make INET6 transport creation failure an informational message

> > The write_ports code will fail both the INET4 and INET6 transport
> > creation if
> > the transport returns an error when PF_INET6 is specified. Some transports
> > that do not support INET6 return an error other than EAFNOSUPPORT.
>
> That's the real bug. Any reason the RDMA RPC transport can't return
> EAFNOSUPPORT in this case?

I think Tom's changelog is misleading. The problem is that the RDMA
transport actually does support IPv6, but it doesn't support the
IPV6ONLY option yet. So if NFS/RDMA binds to a port for IPv4, then the
IPv6 bind fails because of the port collision.
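
The same kind of collision can be reproduced with ordinary TCP sockets, where the option is spelled IPV6_V6ONLY. The sketch below is an analogy only: it uses the plain sockets API, not the RDMA CM, and bind_wildcard is a hypothetical helper name. It assumes a host with Linux-style dual-stack behaviour.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Bind a TCP socket to the wildcard address of `family` (AF_INET or
 * AF_INET6 expected). *port == 0 asks for an ephemeral port; on success
 * *port holds the port actually bound. For AF_INET6, `v6only` is passed
 * to IPV6_V6ONLY before binding. Returns the fd or a negative errno.
 */
int bind_wildcard(int family, unsigned short *port, int v6only)
{
    struct sockaddr_storage ss;
    socklen_t len;
    int fd, err;

    fd = socket(family, SOCK_STREAM, 0);
    if (fd < 0)
        return -errno;

    memset(&ss, 0, sizeof(ss));
    if (family == AF_INET6) {
        struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&ss;

        if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY,
                       &v6only, sizeof(v6only)) < 0)
            goto fail;
        sin6->sin6_family = AF_INET6;
        sin6->sin6_port = htons(*port);  /* all-zero address is in6addr_any */
        len = sizeof(*sin6);
    } else {
        struct sockaddr_in *sin = (struct sockaddr_in *)&ss;

        sin->sin_family = AF_INET;
        sin->sin_port = htons(*port);    /* all-zero address is INADDR_ANY */
        len = sizeof(*sin);
    }

    if (bind(fd, (struct sockaddr *)&ss, len) < 0)
        goto fail;
    if (getsockname(fd, (struct sockaddr *)&ss, &len) < 0)
        goto fail;

    if (family == AF_INET6)
        *port = ntohs(((struct sockaddr_in6 *)&ss)->sin6_port);
    else
        *port = ntohs(((struct sockaddr_in *)&ss)->sin_port);
    return fd;

fail:
    err = errno;   /* save before close() can overwrite it */
    close(fd);
    return -err;
}
```

Binding AF_INET first and then AF_INET6 on the same port with v6only set to 0 is expected to fail, typically with EADDRINUSE, because the IPv6 wildcard socket would also claim the IPv4 port. With v6only set to 1 the two listeners can coexist.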

Implementing the IPV6ONLY option for RDMA binding is probably not
feasible for 2.6.34, so the best band-aid for now seems to be Tom's
patch.

- R.
--
Roland Dreier <[email protected]> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html

2010-04-02 19:04:58

by Tom Tucker

Subject: Re: [PATCH,RFC] nfsd: Make INET6 transport creation failure an informational message

Chuck Lever wrote:
> Hi Roland-
>
> On 04/02/2010 01:22 PM, Roland Dreier wrote:
>> > > The write_ports code will fail both the INET4 and INET6 transport
>> > > creation if the transport returns an error when PF_INET6 is
>> > > specified. Some transports that do not support INET6 return an
>> > > error other than EAFNOSUPPORT.
>> >
>> > That's the real bug. Any reason the RDMA RPC transport can't return
>> > EAFNOSUPPORT in this case?
>>
>> I think Tom's changelog is misleading. The problem is that the RDMA
>> transport actually does support IPv6, but it doesn't support the
>> IPV6ONLY option yet. So if NFS/RDMA binds to a port for IPv4, then the
>> IPv6 bind fails because of the port collision.
>
> IPV6ONLY is a requirement for RPC over IPv6. If the underlying
> transport does not support IPV6ONLY, then it cannot properly support
> RPC over IPv6. It's easy enough to catch listener creation calls for
> IPv6 on such transports, and simply return EAFNOSUPPORT until support
> for IPV6ONLY can be provided.
>
> The __write_ports() interface is specifically designed to silently
> fall back to IPv4-only when IPv6 transport creation fails with
> EAFNOSUPPORT. I don't see a good reason to change the generic logic
> in __write_ports() if there is a problem with implementing RPC over
> IPv6 in a specific transport capability. __write_ports() will do the
> right thing if the transport returns the correct error code.
>
>> Implementing the IPV6ONLY option for RDMA binding is probably not
>> feasible for 2.6.34, so the best band-aid for now seems to be Tom's
>> patch.
>
> My recent experience with similar changes suggests the specific
> solution Tom proposed will trigger extra bug reports and e-mails, as
> the change appears to affect non-RDMA transports as well. This printk
> might fire, for example, for INET transports on systems that are built
> without IPv6 support, or where ipv6.ko is blacklisted in user space.
>
> In other words, I agree that there's a bug that should be addressed in
> 2.6.34, and I don't have any problem with setting up only an IPv4
> listener in this case. But I think the addition of a printk that
> fires for all transports in this case is problematic.
>
This makes sense to me.

> It would be better to address this in the RPC/RDMA transport
> capability, and not in generic upper level logic. We already have
> correct behavior in __write_ports, and the RPC/RDMA transport
> capability should be changed to use it.
>
So it seems reasonable to me to fail svc_create_xprt with ("rdma",
PF_INET6) with EAFNOSUPPORT because the RDMA transport does not support
the v4only setsockopt.

I will post a patch that does this.
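
As a rough user-space sketch of that idea (not the patch itself), the transport-side check could look like the following. struct xprt_caps and check_listener_family are hypothetical names invented here; nothing like them is claimed to exist in svcrdma.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/socket.h>   /* PF_INET, PF_INET6 */

/* Hypothetical capability record; not a kernel structure. */
struct xprt_caps {
    const char *name;
    bool has_v6only;   /* can an IPv6 listener be restricted to IPv6 only? */
};

/*
 * Decide whether a listener for `family` may be created on this transport.
 * Returns 0 if so, or -EAFNOSUPPORT so that the caller can quietly fall
 * back to an IPv4-only setup.
 */
int check_listener_family(const struct xprt_caps *caps, int family)
{
    switch (family) {
    case PF_INET:
        return 0;
    case PF_INET6:
        /* Without a V6ONLY equivalent the IPv6 bind would collide with IPv4. */
        return caps->has_v6only ? 0 : -EAFNOSUPPORT;
    default:
        return -EAFNOSUPPORT;
    }
}
```

The point of returning exactly -EAFNOSUPPORT is that the generic write_ports logic stays untouched and no new message is printed for other transports.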

Thanks,
Tom


2010-04-02 18:52:41

by Tom Tucker

Subject: Re: [PATCH,RFC] nfsd: Make INET6 transport creation failure an informational message

Roland Dreier wrote:
> > > The write_ports code will fail both the INET4 and INET6 transport
> > > creation if
> > > the transport returns an error when PF_INET6 is specified. Some transports
> > > that do not support INET6 return an error other than EAFNOSUPPORT.
> >
> > That's the real bug. Any reason the RDMA RPC transport can't return
> > EAFNOSUPPORT in this case?
>
> I think Tom's changelog is misleading.
Yes, it should read "A transport may fail for some reason other than
EAFNOSUPPORT."

> The problem is that the RDMA
> transport actually does support IPv6, but it doesn't support the
> IPV6ONLY option yet. So if NFS/RDMA binds to a port for IPv4, then the
> IPv6 bind fails because of the port collision.
>
>

Should we fail INET4 if INET6 fails under any circumstances?

> Implementing the IPV6ONLY option for RDMA binding is probably not
> feasible for 2.6.34, so the best band-aid for now seems to be Tom's
> patch.
>
> - R.
>


2010-04-02 16:46:13

by Chuck Lever III

Subject: Re: [PATCH,RFC] nfsd: Make INET6 transport creation failure an informational message

Hi Tom-

On 04/01/2010 06:48 PM, Tom Tucker wrote:
> Hi Bruce/Chuck,
>
> RDMA Transports are currently broken in 2.6.34 because they don't have a
> V4ONLY setsockopt. So what happens is that when write_ports attempts to
> create the PF_INET6 transport it fails because the port is already in
> use. There is discussion on linux-rdma about how to fix this, but in the
> interim and perhaps indefinitely, I propose the following:
>
> Tom
>
> nfsd: Make INET6 transport creation failure an informational message
>
> The write_ports code will fail both the INET4 and INET6 transport
> creation if
> the transport returns an error when PF_INET6 is specified. Some transports
> that do not support INET6 return an error other than EAFNOSUPPORT.

That's the real bug. Any reason the RDMA RPC transport can't return
EAFNOSUPPORT in this case?

> We should allow communication on INET4 even if INET6 is not yet
> supported or fails for some reason.

Yes, that's why EAFNOSUPPORT is ignored in __write_ports(). People
complain when they see messages like this, even if the result is a
working configuration.

> Signed-off-by: Tom Tucker <[email protected]>
> ---
>
> fs/nfsd/nfsctl.c | 6 ++++--
> 1 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index 0f0e77f..934b624 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -1008,8 +1008,10 @@ static ssize_t __write_ports_addxprt(char *buf)
>
>  	err = svc_create_xprt(nfsd_serv, transport,
>  				PF_INET6, port, SVC_SOCK_ANONYMOUS);
> -	if (err < 0 && err != -EAFNOSUPPORT)
> -		goto out_close;
> +	if (err < 0)
> +		printk(KERN_INFO "nfsd: Error creating PF_INET6 listener "
> +			"for transport '%s'\n", transport);
> +
>  	return 0;
>  out_close:
>  	xprt = svc_find_xprt(nfsd_serv, transport, PF_INET, port);
>


--
chuck[dot]lever[at]oracle[dot]com