Hi,
I'm facing one more issue when using referrals with RDMA.
If RDMA is enabled and supported on both the client and the server and we
mount the parent over TCP, the referral/submount is still done over
RDMA instead of TCP, because for a referral/submount the client
tries RDMA first and falls back to TCP only if the RDMA connection fails.
As shown here, the parent /home on .160 is mounted with proto=tcp, while t1,
the referral mount on .157, is mounted with proto=rdma:
> /root/tcp-mnt1 from 10.53.87.160:/home
> Flags: rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.53.87.158,local_lock=none,addr=10.53.87.160
>
> /root/tcp-mnt1/t1 from 10.53.87.157:/:home/t1
> Flags: rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,clientaddr=10.53.87.158,local_lock=none,addr=10.53.87.157
This is the code that tries RDMA first. Can we get the transport type
from the parent and use the same one for the referral?
> #if IS_ENABLED(CONFIG_SUNRPC_XPRT_RDMA)
>         rpc_set_port(&ctx->nfs_server.address, NFS_RDMA_PORT);
>         error = nfs4_set_client(server,
>                                 ctx->nfs_server.hostname,
>                                 &ctx->nfs_server._address,
>                                 ctx->nfs_server.addrlen,
>                                 parent_client->cl_ipaddr,
>                                 XPRT_TRANSPORT_RDMA,
>                                 parent_server->client->cl_timeout,
>                                 parent_client->cl_mvops->minor_version,
>                                 parent_client->cl_nconnect,
>                                 parent_client->cl_max_connect,
>                                 parent_client->cl_net,
>                                 &parent_client->cl_xprtsec);
>         if (!error)
>                 goto init_server;
> #endif  /* IS_ENABLED(CONFIG_SUNRPC_XPRT_RDMA) */

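To make the question concrete, the following is a rough user-space sketch of the selection logic being asked about. It is not kernel code and not a proposed patch: `enum xprt_kind` and `referral_transport_order()` are hypothetical names standing in for the XPRT_TRANSPORT_* values and for the decision made before calling nfs4_set_client(). It assumes that a TCP parent should only ever produce a TCP submount, and that an RDMA parent keeps today's "RDMA first, then TCP" fallback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's XPRT_TRANSPORT_* constants. */
enum xprt_kind { XPRT_KIND_TCP, XPRT_KIND_RDMA };

/*
 * Fill 'order' with the transports a referral mount would try, in order,
 * and return the number of entries written (1 or 2).
 *
 * parent        - transport the parent mount is using
 * rdma_built_in - non-zero if RDMA support is compiled in
 *                 (models IS_ENABLED(CONFIG_SUNRPC_XPRT_RDMA))
 */
size_t referral_transport_order(enum xprt_kind parent, int rdma_built_in,
                                enum xprt_kind order[2])
{
    size_t n = 0;

    assert(order != NULL);

    if (parent == XPRT_KIND_RDMA && rdma_built_in) {
        /* Parent uses RDMA: try RDMA first, keep the TCP fallback. */
        order[n++] = XPRT_KIND_RDMA;
        order[n++] = XPRT_KIND_TCP;
    } else {
        /* Parent uses TCP (or RDMA is unavailable): never try RDMA. */
        order[n++] = XPRT_KIND_TCP;
    }
    return n;
}
```

With this ordering, the /root/tcp-mnt1/t1 submount in the output above would stay on proto=tcp, because its parent was mounted with proto=tcp.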
Regards,
Gaurav Gangalwar
On Tue, Feb 27, 2024 at 6:22 PM gaurav gangalwar
<[email protected]> wrote:
>
> One more issue with the referral code is that there is no retry on connection failure:
>
>> Feb 26 01:49:32 rbt-el7-1 kernel: nfs_create_rpc_client: cannot create RPC client. Error = -111
>> Feb 26 01:49:32 rbt-el7-1 kernel: NFS4: Couldn't follow remote path
>> Feb 26 01:49:32 rbt-el7-1 kernel: <-- nfs4_get_referral_tree() = -111 [error]
>
>
> I was expecting the client to retry a failed submount when the parent is a hard mount, but the submount simply fails.
> I understand that retrying could get stuck in a loop if the fs info is not valid, since the connection would then always fail.
>
> Regards,
> Gaurav Gangalwar
>
> On Mon, Feb 12, 2024 at 7:23 PM Chuck Lever III <[email protected]> wrote:
>>
>>
>>
>> > On Feb 12, 2024, at 12:51 AM, gaurav gangalwar <[email protected]> wrote:
>> >
>> > I think I was using an older kernel version on a client which doesn't have your fix.
>> > I tried with the newer version v5.10, it worked fine.
>> >
>> > The only issue I see is we are not inheriting port from the parent in nfs4_create_referral_server, so even if I use port=20047 in mount it will try referral submount on 20049 only.
>> >
>> > rpc_set_port(data->addr, NFS_RDMA_PORT);
>> >
>> > We could inherit this also from parent?
>>
>> The client is supposed to use the port number information contained
>> in the referral. There's nothing that mandates that the two servers
>> will use the same alternate port.
>>
>> Using a constant here is probably wrong for both the TCP and RDMA
>> cases, though.
>>
>>
>> --
>> Chuck Lever
>>
>>