2017-12-04 19:13:24

by Chuck Lever

Subject: [PATCH v1 0/4] Use same proto= after traversing a referral

Hi Anna-

Today when the Linux client traverses an NFSv4 referral, it always
chooses proto=tcp on the submount. This means following a referral
on an NFS/RDMA mount will always make the submounts use TCP, which
is not desirable. These patches make the client try NFS/RDMA first
when traversing a referral so that submounts will continue to use
NFS/RDMA.

Please consider this series for v4.16.

---

Chuck Lever (4):
nfs: Define NFS_RDMA_PORT
nfs: Referrals should use the same proto setting as their parent
nfs: Update server port after referral or migration
SUNRPC: Remove rpc_protocol()


fs/nfs/nfs4client.c | 24 +++++++++++++++++++++---
fs/nfs/nfs4namespace.c | 2 --
include/linux/sunrpc/clnt.h | 1 -
include/uapi/linux/nfs.h | 1 +
net/sunrpc/clnt.c | 16 ----------------
5 files changed, 22 insertions(+), 22 deletions(-)

--
Chuck Lever


2017-12-04 19:13:32

by Chuck Lever

Subject: [PATCH v1 1/4] nfs: Define NFS_RDMA_PORT

The NFS/RDMA port assignment is specified in Section 9 of RFC 8267.

Signed-off-by: Chuck Lever <[email protected]>
---
include/uapi/linux/nfs.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/nfs.h b/include/uapi/linux/nfs.h
index 057d22a..946cb62 100644
--- a/include/uapi/linux/nfs.h
+++ b/include/uapi/linux/nfs.h
@@ -12,6 +12,7 @@

#define NFS_PROGRAM 100003
#define NFS_PORT 2049
+#define NFS_RDMA_PORT 20049
#define NFS_MAXDATA 8192
#define NFS_MAXPATHLEN 1024
#define NFS_MAXNAMLEN 255


2017-12-04 19:13:40

by Chuck Lever

Subject: [PATCH v1 2/4] nfs: Referrals should use the same proto setting as their parent

Helen Chao <[email protected]> noticed that when a user
traverses a referral on an NFS/RDMA mount, the resulting submount
always uses TCP.

This behavior does not match the vers= setting when traversing
a referral (vers=4.1 is preserved). It also does not match the
behavior of crossing from the pseudofs into a real filesystem
(proto=rdma is preserved in that case).

The Linux NFS client does not currently support the
fs_locations_info attribute. The situation is similar for all
NFSv4 servers I know of. Therefore until the community has broad
support for fs_locations_info, when following a referral:

- First try to connect with RPC-over-RDMA. This will fail quickly
if the client has no RDMA-capable interfaces.

- If connecting with RPC-over-RDMA fails, or the RPC-over-RDMA
transport is not available, use TCP.

Reported-by: Helen Chao <[email protected]>
Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfs/nfs4client.c | 23 ++++++++++++++++++++---
fs/nfs/nfs4namespace.c | 2 --
2 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
index 12bbab0..a3d5592 100644
--- a/fs/nfs/nfs4client.c
+++ b/fs/nfs/nfs4client.c
@@ -1114,19 +1114,36 @@ struct nfs_server *nfs4_create_referral_server(struct nfs_clone_mount *data,
/* Initialise the client representation from the parent server */
nfs_server_copy_userdata(server, parent_server);

- /* Get a client representation.
- * Note: NFSv4 always uses TCP, */
+ /* Get a client representation */
+#ifdef CONFIG_SUNRPC_XPRT_RDMA
+ rpc_set_port(data->addr, NFS_RDMA_PORT);
error = nfs4_set_client(server, data->hostname,
data->addr,
data->addrlen,
parent_client->cl_ipaddr,
- rpc_protocol(parent_server->client),
+ XPRT_TRANSPORT_RDMA,
+ parent_server->client->cl_timeout,
+ parent_client->cl_mvops->minor_version,
+ parent_client->cl_net);
+ if (!error)
+ goto init_server;
+#endif /* CONFIG_SUNRPC_XPRT_RDMA */
+
+ rpc_set_port(data->addr, NFS_PORT);
+ error = nfs4_set_client(server, data->hostname,
+ data->addr,
+ data->addrlen,
+ parent_client->cl_ipaddr,
+ XPRT_TRANSPORT_TCP,
parent_server->client->cl_timeout,
parent_client->cl_mvops->minor_version,
parent_client->cl_net);
if (error < 0)
goto error;

+#ifdef CONFIG_SUNRPC_XPRT_RDMA
+init_server:
+#endif
error = nfs_init_server_rpcclient(server, parent_server->client->cl_timeout, data->authflavor);
if (error < 0)
goto error;
diff --git a/fs/nfs/nfs4namespace.c b/fs/nfs/nfs4namespace.c
index 8c3f327..24f06dc 100644
--- a/fs/nfs/nfs4namespace.c
+++ b/fs/nfs/nfs4namespace.c
@@ -270,8 +270,6 @@ static struct vfsmount *try_location(struct nfs_clone_mount *mountdata,
if (mountdata->addrlen == 0)
continue;

- rpc_set_port(mountdata->addr, NFS_PORT);
-
memcpy(page2, buf->data, buf->len);
page2[buf->len] = '\0';
mountdata->hostname = page2;
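
The fallback order described in the commit message can be sketched outside the kernel as plain C. This is only an illustration under stated assumptions: `connect_fn`, `connect_referral`, `struct referral_result`, and the two stub servers are hypothetical names invented for the sketch, and the `rdma_built_in` flag stands in for the `CONFIG_SUNRPC_XPRT_RDMA` compile-time check in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's transport identifiers. */
enum transport { TRANSPORT_TCP, TRANSPORT_RDMA };

/* Port values mirrored from the patch series. */
#define NFS_PORT      2049
#define NFS_RDMA_PORT 20049

/* Hypothetical connect hook: returns 0 on success, negative on failure. */
typedef int (*connect_fn)(enum transport t, unsigned short port, void *ctx);

struct referral_result {
	enum transport transport;	/* transport that was finally attempted */
	unsigned short port;		/* port that goes with that transport */
	int error;			/* 0 on success, negative otherwise */
};

/*
 * Try RPC-over-RDMA on its own port first, when RDMA support is built in.
 * If that attempt fails, or RDMA is not built in, fall back to TCP.
 */
struct referral_result connect_referral(connect_fn try_connect, void *ctx,
					int rdma_built_in)
{
	struct referral_result res = { TRANSPORT_TCP, NFS_PORT, 0 };

	assert(try_connect != NULL);

	if (rdma_built_in &&
	    try_connect(TRANSPORT_RDMA, NFS_RDMA_PORT, ctx) == 0) {
		res.transport = TRANSPORT_RDMA;
		res.port = NFS_RDMA_PORT;
		return res;
	}

	res.error = try_connect(TRANSPORT_TCP, NFS_PORT, ctx);
	return res;
}

/* Illustrative stub: a server that accepts both transports. */
int dual_transport_server(enum transport t, unsigned short port, void *ctx)
{
	(void)t; (void)port; (void)ctx;
	return 0;
}

/* Illustrative stub: a server that only answers TCP on the NFS port. */
int tcp_only_server(enum transport t, unsigned short port, void *ctx)
{
	(void)ctx;
	return (t == TRANSPORT_TCP && port == NFS_PORT) ? 0 : -1;
}
```

Calling `connect_referral(tcp_only_server, NULL, 1)` in this sketch ends up on TCP port 2049, while the same call against `dual_transport_server` selects RDMA on port 20049, regardless of which transport the parent mount uses.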


2017-12-04 19:13:48

by Chuck Lever

Subject: [PATCH v1 3/4] nfs: Update server port after referral or migration

After traversing a referral or recovering from a migration event,
ensure that the server port reported in /proc/mounts is updated
to the correct port setting for the new submount.

Reported-by: Helen Chao <[email protected]>
Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfs/nfs4client.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
index a3d5592..5bc722e 100644
--- a/fs/nfs/nfs4client.c
+++ b/fs/nfs/nfs4client.c
@@ -852,6 +852,7 @@ static int nfs4_set_client(struct nfs_server *server,
set_bit(NFS_CS_MIGRATION, &cl_init.init_flags);
if (test_bit(NFS_MIG_TSM_POSSIBLE, &server->mig_status))
set_bit(NFS_CS_TSM_POSSIBLE, &cl_init.init_flags);
+ server->port = rpc_get_port(addr);

/* Allocate or find a client reference we can use */
clp = nfs_get_client(&cl_init);
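
To illustrate why reading the port back from the address is enough, here is a user-space sketch of setting and then retrieving the port in a socket address. It is an analogue only, assuming POSIX socket headers; `sockaddr_set_port` and `sockaddr_get_port` are hypothetical helpers for this example, not the kernel's `rpc_set_port` and `rpc_get_port`.

```c
#include <assert.h>
#include <stddef.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Store a host-order port in an IPv4 or IPv6 socket address.
 * Returns 0 on success, -1 for an unsupported address family.
 */
int sockaddr_set_port(struct sockaddr *sa, unsigned short port)
{
	assert(sa != NULL);

	switch (sa->sa_family) {
	case AF_INET:
		((struct sockaddr_in *)sa)->sin_port = htons(port);
		return 0;
	case AF_INET6:
		((struct sockaddr_in6 *)sa)->sin6_port = htons(port);
		return 0;
	default:
		return -1;
	}
}

/*
 * Read the port back in host order. Returns 0 for an unsupported
 * address family.
 */
unsigned short sockaddr_get_port(const struct sockaddr *sa)
{
	assert(sa != NULL);

	switch (sa->sa_family) {
	case AF_INET:
		return ntohs(((const struct sockaddr_in *)sa)->sin_port);
	case AF_INET6:
		return ntohs(((const struct sockaddr_in6 *)sa)->sin6_port);
	default:
		return 0;
	}
}
```

Because the port lives in the address itself, whichever value the caller set before the connection attempt (20049 for RDMA, 2049 for TCP) is the value a later read returns, which is the property the one-line change above relies on.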


2017-12-04 19:13:57

by Chuck Lever

Subject: [PATCH v1 4/4] SUNRPC: Remove rpc_protocol()

Since nfs4_create_referral_server was the only call site of
rpc_protocol, rpc_protocol can now be removed.

Signed-off-by: Chuck Lever <[email protected]>
---
include/linux/sunrpc/clnt.h | 1 -
net/sunrpc/clnt.c | 16 ----------------
2 files changed, 17 deletions(-)

diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
index 71c237e..ed761f7 100644
--- a/include/linux/sunrpc/clnt.h
+++ b/include/linux/sunrpc/clnt.h
@@ -179,7 +179,6 @@ struct rpc_task *rpc_call_null(struct rpc_clnt *clnt, struct rpc_cred *cred,
int rpc_restart_call_prepare(struct rpc_task *);
int rpc_restart_call(struct rpc_task *);
void rpc_setbufsize(struct rpc_clnt *, unsigned int, unsigned int);
-int rpc_protocol(struct rpc_clnt *);
struct net * rpc_net_ns(struct rpc_clnt *);
size_t rpc_max_payload(struct rpc_clnt *);
size_t rpc_max_bc_payload(struct rpc_clnt *);
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index e2a4184..6e432ec 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1376,22 +1376,6 @@ int rpc_localaddr(struct rpc_clnt *clnt, struct sockaddr *buf, size_t buflen)
EXPORT_SYMBOL_GPL(rpc_setbufsize);

/**
- * rpc_protocol - Get transport protocol number for an RPC client
- * @clnt: RPC client to query
- *
- */
-int rpc_protocol(struct rpc_clnt *clnt)
-{
- int protocol;
-
- rcu_read_lock();
- protocol = rcu_dereference(clnt->cl_xprt)->prot;
- rcu_read_unlock();
- return protocol;
-}
-EXPORT_SYMBOL_GPL(rpc_protocol);
-
-/**
* rpc_net_ns - Get the network namespace for this RPC client
* @clnt: RPC client to query
*


2017-12-05 15:10:36

by Devesh Sharma

Subject: Re: [PATCH v1 1/4] nfs: Define NFS_RDMA_PORT

Hi Chuck,

Will this change avoid the "echo rdma 20049" on the nfs server during
nfs server start?

-Regards
Devesh

On Tue, Dec 5, 2017 at 12:43 AM, Chuck Lever <[email protected]> wrote:
> The NFS/RDMA port assignment is specified in Section 9 of RFC 8267.
>
> Signed-off-by: Chuck Lever <[email protected]>
> ---
> include/uapi/linux/nfs.h | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/include/uapi/linux/nfs.h b/include/uapi/linux/nfs.h
> index 057d22a..946cb62 100644
> --- a/include/uapi/linux/nfs.h
> +++ b/include/uapi/linux/nfs.h
> @@ -12,6 +12,7 @@
>
> #define NFS_PROGRAM 100003
> #define NFS_PORT 2049
> +#define NFS_RDMA_PORT 20049
> #define NFS_MAXDATA 8192
> #define NFS_MAXPATHLEN 1024
> #define NFS_MAXNAMLEN 255

2017-12-05 15:36:26

by Anna Schumaker

Subject: Re: [PATCH v1 2/4] nfs: Referrals should use the same proto setting as their parent

Hi Chuck,

On 12/04/2017 02:13 PM, Chuck Lever wrote:
> Helen Chao <[email protected]> noticed that when a user
> traverses a referral on an NFS/RDMA mount, the resulting submount
> always uses TCP.
>
> This behavior does not match the vers= setting when traversing
> a referral (vers=4.1 is preserved). It also does not match the
> behavior of crossing from the pseudofs into a real filesystem
> (proto=rdma is preserved in that case).
>
> The Linux NFS client does not currently support the
> fs_locations_info attribute. The situation is similar for all
> NFSv4 servers I know of. Therefore until the community has broad
> support for fs_locations_info, when following a referral:
>
> - First try to connect with RPC-over-RDMA. This will fail quickly
> if the client has no RDMA-capable interfaces.
>
> - If connecting with RPC-over-RDMA fails, or the RPC-over-RDMA
> transport is not available, use TCP.

Won't this have the opposite problem if the client and server have RDMA interfaces, but are currently mounted over TCP instead?

Anna


2017-12-05 18:50:01

by Chuck Lever

Subject: Re: [PATCH v1 1/4] nfs: Define NFS_RDMA_PORT


> On Dec 5, 2017, at 10:09 AM, Devesh Sharma <[email protected]> wrote:
>
> Hi Chuck,
>=20
> Will this change avoid the "echo rdma 20049" on the nfs server during
> nfs server start?

I'm not familiar with this issue. Is there a bug report?



--
Chuck Lever




2017-12-05 19:04:59

by Chuck Lever

Subject: Re: [PATCH v1 2/4] nfs: Referrals should use the same proto setting as their parent


> On Dec 5, 2017, at 10:36 AM, Anna Schumaker <[email protected]> wrote:
>
> Hi Chuck,
>
> On 12/04/2017 02:13 PM, Chuck Lever wrote:
>> Helen Chao <[email protected]> noticed that when a user
>> traverses a referral on an NFS/RDMA mount, the resulting submount
>> always uses TCP.
>>
>> This behavior does not match the vers= setting when traversing
>> a referral (vers=4.1 is preserved). It also does not match the
>> behavior of crossing from the pseudofs into a real filesystem
>> (proto=rdma is preserved in that case).
>>
>> The Linux NFS client does not currently support the
>> fs_locations_info attribute. The situation is similar for all
>> NFSv4 servers I know of. Therefore until the community has broad
>> support for fs_locations_info, when following a referral:
>>
>> - First try to connect with RPC-over-RDMA. This will fail quickly
>> if the client has no RDMA-capable interfaces.
>>
>> - If connecting with RPC-over-RDMA fails, or the RPC-over-RDMA
>> transport is not available, use TCP.
>
> Won't this have the opposite problem if the client and server have RDMA interfaces, but are currently mounted over TCP instead?

The new behavior will prefer NFS/RDMA, yes. However, typically
the referral will contain a hostname or IP address that either
supports both or only TCP. In the former case, the client will
now choose RDMA instead of TCP, and in the latter, it should
always use TCP.

IOW the referral has to point to a server IP address that can
do either transport before the client can choose RDMA.



--
Chuck Lever




2017-12-06 06:05:08

by Devesh Sharma

Subject: Re: [PATCH v1 1/4] nfs: Define NFS_RDMA_PORT

Oh! Apologies if I was not clear about what I wanted to say. I was thinking
this change would avoid the step of writing the NFS/RDMA port number into the
/proc/fs/nfsd/portlist file after starting the NFS server. This step is
part of configuring an NFS/RDMA server. For TCP, however, such entries are
present in this file by default.
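
For readers unfamiliar with the step in question, the following C sketch shows the equivalent of the `echo rdma 20049` command mentioned earlier in the thread. It assumes the portlist file accepts a single `<transport> <port>` line; `format_portlist_entry` and `write_portlist_entry` are hypothetical helper names, and the sketch is not part of the patch series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build a "<transport> <port>\n" line, for example "rdma 20049\n".
 * Returns the line length, or -1 if the buffer is too small.
 */
int format_portlist_entry(char *buf, size_t len, const char *transport,
			  unsigned short port)
{
	int n;

	assert(buf != NULL && transport != NULL);

	n = snprintf(buf, len, "%s %hu\n", transport, port);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Write one entry to the given path. On an NFS server the path would be
 * /proc/fs/nfsd/portlist, which normally requires root privileges.
 * Returns 0 on success, -1 on any failure.
 */
int write_portlist_entry(const char *path, const char *transport,
			 unsigned short port)
{
	char line[64];
	FILE *f;

	if (format_portlist_entry(line, sizeof(line), transport, port) < 0)
		return -1;

	f = fopen(path, "w");
	if (f == NULL)
		return -1;

	if (fputs(line, f) == EOF) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}
```

A call such as `write_portlist_entry("/proc/fs/nfsd/portlist", "rdma", 20049)` would correspond to the manual configuration step described above.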

-Regards
Devesh

On Wed, Dec 6, 2017 at 12:19 AM, Chuck Lever <[email protected]> wrote:
>
>> On Dec 5, 2017, at 10:09 AM, Devesh Sharma <[email protected]> wrote:
>>
>> Hi Chuck,
>>
>> Will this change avoid the "echo rdma 20049" on the nfs server during
>> nfs server start?
>
> I'm not familiar with this issue. Is there a bug report ?

2017-12-06 14:52:32

by Chuck Lever

Subject: Re: [PATCH v1 1/4] nfs: Define NFS_RDMA_PORT


> On Dec 6, 2017, at 1:04 AM, Devesh Sharma <[email protected]> wrote:
>
> Oh! Apologies if I was not clear about what I wanted to say. I was thinking
> this change would avoid the step of writing the NFS/RDMA port number into the
> /proc/fs/nfsd/portlist file after starting the NFS server. This step is
> part of configuring an NFS/RDMA server. For TCP, however, such entries are
> present in this file by default.

Hi Devesh-

I don't have to do this manually on my server, but maybe that's
because RHEL 7 system start-up scripts already handle it?

Bring this up in a separate thread on [email protected]
and we can work it out. To answer your original question: no,
I don't believe this patch will change server start-up behavior.



--
Chuck Lever




2017-12-06 16:43:15

by Devesh Sharma

Subject: Re: [PATCH v1 1/4] nfs: Define NFS_RDMA_PORT

Okay, thanks for the response.

On Wed, Dec 6, 2017 at 8:22 PM, Chuck Lever <[email protected]> wrote:
>
>> On Dec 6, 2017, at 1:04 AM, Devesh Sharma <[email protected]> wrote:
>>
>> Oh! Apologies if I was not clear about what I wanted to say. I was thinking
>> this change would avoid the step of writing the NFS/RDMA port number into the
>> /proc/fs/nfsd/portlist file after starting the NFS server. This step is
>> part of configuring an NFS/RDMA server. For TCP, however, such entries are
>> present in this file by default.
>
> Hi Devesh-
>
> I don't have to do this manually on my server, but maybe that's
> because RHEL 7 system start-up scripts already handle it?
>
> Bring this up in a separate thread on [email protected]
> and we can work it out. To answer your original question: no,
> I don't believe this patch will change server start-up behavior.

2017-12-13 15:15:49

by Chuck Lever

Subject: Re: [PATCH v1 2/4] nfs: Referrals should use the same proto setting as their parent


> On Dec 5, 2017, at 2:04 PM, Chuck Lever <[email protected]> wrote:
>
>
>> On Dec 5, 2017, at 10:36 AM, Anna Schumaker <[email protected]> wrote:
>>
>> Hi Chuck,
>>
>> On 12/04/2017 02:13 PM, Chuck Lever wrote:
>>> Helen Chao <[email protected]> noticed that when a user
>>> traverses a referral on an NFS/RDMA mount, the resulting submount
>>> always uses TCP.
>>>
>>> This behavior does not match the vers= setting when traversing
>>> a referral (vers=4.1 is preserved). It also does not match the
>>> behavior of crossing from the pseudofs into a real filesystem
>>> (proto=rdma is preserved in that case).
>>>
>>> The Linux NFS client does not currently support the
>>> fs_locations_info attribute. The situation is similar for all
>>> NFSv4 servers I know of. Therefore until the community has broad
>>> support for fs_locations_info, when following a referral:
>>>
>>> - First try to connect with RPC-over-RDMA. This will fail quickly
>>> if the client has no RDMA-capable interfaces.
>>>
>>> - If connecting with RPC-over-RDMA fails, or the RPC-over-RDMA
>>> transport is not available, use TCP.
>>
>> Won't this have the opposite problem if the client and server have RDMA interfaces, but are currently mounted over TCP instead?
>
> The new behavior will prefer NFS/RDMA, yes. However, typically
> the referral will contain a hostname or IP address that either
> supports both or only TCP. In the former case, the client will
> now choose RDMA instead of TCP, and in the latter, it should
> always use TCP.
>
> IOW the referral has to point to a server IP address that can
> do either transport before the client can choose RDMA.

Any further comment? Is this patch series acceptable?



--
Chuck Lever