2024-05-27 05:16:18

by gaurav gangalwar

Subject: Re: NFS4.0 rdma with referal

Hi,
I am facing one more issue while using referrals with RDMA.
If RDMA is enabled and supported on both client and server, and we
mount the parent over TCP, the referral/submount is still mounted over
RDMA instead of TCP. For a referral/submount the client tries RDMA
first and falls back to TCP only if the RDMA connection fails.

As shown below, the parent /home on .160 is mounted with proto=tcp,
while t1, the referral mount on .157, is mounted with proto=rdma:

> /root/tcp-mnt1 from 10.53.87.160:/home
> Flags: rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.53.87.158,local_lock=none,addr=10.53.87.160
>
> /root/tcp-mnt1/t1 from 10.53.87.157:/:home/t1
> Flags: rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,clientaddr=10.53.87.158,local_lock=none,addr=10.53.87.157


This is the code that tries RDMA first. Can we get the transport type
from the parent and use the same one?

> #if IS_ENABLED(CONFIG_SUNRPC_XPRT_RDMA)
>         rpc_set_port(&ctx->nfs_server.address, NFS_RDMA_PORT);
>         error = nfs4_set_client(server,
>                                 ctx->nfs_server.hostname,
>                                 &ctx->nfs_server._address,
>                                 ctx->nfs_server.addrlen,
>                                 parent_client->cl_ipaddr,
>                                 XPRT_TRANSPORT_RDMA,
>                                 parent_server->client->cl_timeout,
>                                 parent_client->cl_mvops->minor_version,
>                                 parent_client->cl_nconnect,
>                                 parent_client->cl_max_connect,
>                                 parent_client->cl_net,
>                                 &parent_client->cl_xprtsec);
>         if (!error)
>                 goto init_server;
> #endif /* IS_ENABLED(CONFIG_SUNRPC_XPRT_RDMA) */
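
A minimal user-space sketch of the difference between the two policies, not kernel code. The names `enum xprt_proto`, `connect_fn`, `referral_connect_rdma_first` and `referral_connect_inherit` are hypothetical and exist only for illustration. It also assumes that an inherited transport is not followed by any fallback, which is a design choice the question above leaves open.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's XPRT_TRANSPORT_* constants. */
enum xprt_proto { PROTO_TCP, PROTO_RDMA };

/* Hypothetical connect hook: 0 on success, negative errno on failure. */
typedef int (*connect_fn)(enum xprt_proto proto);

/* Behaviour described in this mail: RDMA first, TCP only if RDMA fails. */
static int referral_connect_rdma_first(connect_fn try_connect,
                                       enum xprt_proto *used)
{
    int err;

    assert(try_connect != NULL && used != NULL);
    if (try_connect(PROTO_RDMA) == 0) {
        *used = PROTO_RDMA;
        return 0;
    }
    err = try_connect(PROTO_TCP);
    if (err == 0)
        *used = PROTO_TCP;
    return err;
}

/* Proposed behaviour: try only the transport the parent mount uses. */
static int referral_connect_inherit(enum xprt_proto parent_proto,
                                    connect_fn try_connect,
                                    enum xprt_proto *used)
{
    int err;

    assert(try_connect != NULL && used != NULL);
    err = try_connect(parent_proto);
    if (err == 0)
        *used = parent_proto;
    return err;
}

/* Illustrative servers: one reachable over both transports, one TCP only. */
static int both_ok(enum xprt_proto proto)
{
    (void)proto;
    return 0;
}

static int tcp_only(enum xprt_proto proto)
{
    return proto == PROTO_TCP ? 0 : -ECONNREFUSED;
}
```

With a server that accepts both transports, `referral_connect_rdma_first` ends up on RDMA regardless of the parent, which matches the mount output above, while `referral_connect_inherit(PROTO_TCP, ...)` stays on TCP.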

Regards,
Gaurav Gangalwar

On Tue, Feb 27, 2024 at 6:22 PM gaurav gangalwar
<[email protected]> wrote:
>
> One more issue with the referral code is that there is no retry on connection failure:
>
>> Feb 26 01:49:32 rbt-el7-1 kernel: nfs_create_rpc_client: cannot create RPC client. Error = -111
>> Feb 26 01:49:32 rbt-el7-1 kernel: NFS4: Couldn't follow remote path
>> Feb 26 01:49:32 rbt-el7-1 kernel: <-- nfs4_get_referral_tree() = -111 [error]
>
>
> Since the parent is a hard mount, I was expecting the client to retry when the submount fails, but the submount simply fails.
> I understand that retrying could get stuck in a loop if the fs info is not valid, since the connection would then always fail.
>
> Regards,
> Gaurav Gangalwar
>
> On Mon, Feb 12, 2024 at 7:23 PM Chuck Lever III <[email protected]> wrote:
>>
>>
>>
>> > On Feb 12, 2024, at 12:51 AM, gaurav gangalwar <[email protected]> wrote:
>> >
>> > I think I was using an older kernel version on a client which doesn't have your fix.
>> > I tried with the newer version v5.10, it worked fine.
>> >
>> > The only issue I see is we are not inheriting port from the parent in nfs4_create_referral_server, so even if I use port=20047 in mount it will try referral submount on 20049 only.
>> >
>> > rpc_set_port(data->addr, NFS_RDMA_PORT);
>> >
>> > We could inherit this also from parent?
>>
>> The client is supposed to use the port number information contained
>> in the referral. There's nothing that mandates that the two servers
>> will use the same alternate port.
>>
>> Using a constant here is probably wrong for both the TCP and RDMA
>> cases, though.
>>
>>
>> --
>> Chuck Lever
>>
>>
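
To illustrate the point in the quoted reply about port numbers, here is a minimal user-space sketch, not kernel code. `struct referral_loc` and `referral_port` are hypothetical, and the convention that `port == 0` means "the referral did not specify a port" is an assumption made for the example. The two default values are the well-known NFS and NFS/RDMA ports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NFS_PORT      2049   /* well-known NFS port */
#define NFS_RDMA_PORT 20049  /* well-known NFS/RDMA port */

/* Hypothetical stand-ins for the kernel's XPRT_TRANSPORT_* constants. */
enum xprt_proto { PROTO_TCP, PROTO_RDMA };

/* Hypothetical referral target; port == 0 means "not specified". */
struct referral_loc {
    enum xprt_proto proto;
    uint16_t port;
};

/*
 * Prefer the port carried by the referral itself. Fall back to the
 * transport's default only when the referral carries none. The parent
 * mount's port is deliberately not consulted, since nothing requires the
 * two servers to listen on the same alternate port.
 */
static uint16_t referral_port(const struct referral_loc *loc)
{
    assert(loc != NULL);
    if (loc->port != 0)
        return loc->port;
    return loc->proto == PROTO_RDMA ? NFS_RDMA_PORT : NFS_PORT;
}
```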