Hi, Wenjia and Jan,
When testing SMC-R v2, I found some scenarios where SMC-R v2 should work, but fallback happened due to some restrictions
in the SMC-R v2 implementation. I want to know why these restrictions exist and what would happen if they were removed.
The first is in the function smc_ib_determine_gid_rcu, which requires a subnet match between smcrv2->saddr and the RDMA-related netdev.
The code is here:
static int smc_ib_determine_gid_rcu(...)
{
	...
	in_dev_for_each_ifa_rcu(ifa, in_dev) {
		if (!inet_ifa_match(smcrv2->saddr, ifa))
			continue;
		subnet_match = true;
		break;
	}
	if (!subnet_match)
		goto out;
	...
out:
	return -ENODEV;
}
In my testing environment, on both the server and the client, there are two netdevs: eth0 in netnamespace1 and eth0 in netnamespace2. For
clarity, I will refer to eth0 in netnamespace1 as eth1 and to eth0 in netnamespace2 as eth2 in the following text. The IP of eth1 is
192.168.0.3/32 and the IP of eth2 is 192.168.0.4/24. The netmask of eth1 must be 32 for other reasons. eth1 is an RDMA-related netdev, which
means that the adapter of eth1 has RDMA function. eth2 has been associated with eth1's RDMA device using smc_pnet. When testing a connection
in netnamespace2 (using eth2 for the SMC-R connection), the connection fell back with rsn 0x03010000 because of the above subnet matching
restriction. But I think SMC-R should work in this scenario.
In another testing environment of mine, on both the server and the client, there are two netdevs, eth0 and eth1, both in netnamespace1. The
IP of eth0 is 192.168.0.3/24 and the IP of eth1 is 192.168.1.4/24. eth0 is an RDMA-related netdev, which means that the adapter of eth0 has
RDMA function. eth1 has been associated with eth0's RDMA device using smc_pnet. When testing an SMC-R connection through eth1, the connection
fell back with rsn 0x03010000 because of the above subnet matching restriction. In my environment, eth0 and eth1 have the same network
connectivity even though they are in different subnets. I think SMC-R should work in this scenario.
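To make the failing check concrete, here is a minimal user-space sketch of the subnet match for the two environments above. It is not the kernel code: struct demo_ifa and all demo_* helpers are hypothetical names used only for illustration, and addresses are kept in host byte order for readability. The only part taken from the kernel is the idea behind inet_ifa_match(), namely that an address matches when it agrees with the interface address on every bit covered by the interface's netmask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one address configured on a netdev. */
struct demo_ifa {
	uint32_t address;   /* interface address, host byte order */
	uint8_t  prefixlen; /* netmask length, 0..32 */
};

/* Build a host-order IPv4 address from its four octets. */
static uint32_t demo_ip(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

/* Convert a prefix length into a netmask. /0 is handled separately
 * because shifting a 32-bit value by 32 bits is undefined in C. */
static uint32_t demo_mask(uint8_t prefixlen)
{
	assert(prefixlen <= 32);
	return prefixlen ? (uint32_t)(~(uint32_t)0 << (32 - prefixlen)) : 0;
}

/* Same idea as inet_ifa_match(): compare only the bits under the mask. */
static bool demo_ifa_match(uint32_t addr, const struct demo_ifa *ifa)
{
	return ((addr ^ ifa->address) & demo_mask(ifa->prefixlen)) == 0;
}

/* Mirrors the loop in the excerpt: true if any address of the
 * RDMA-related netdev is in the same subnet as saddr. */
static bool demo_subnet_match(uint32_t saddr, const struct demo_ifa *ifas, int n)
{
	for (int i = 0; i < n; i++)
		if (demo_ifa_match(saddr, &ifas[i]))
			return true;
	return false;
}
```

With saddr 192.168.0.4 and the RDMA-related netdev at 192.168.0.3/32, the /32 mask covers every bit, so only the identical address could match. With saddr 192.168.1.4 and the RDMA-related netdev at 192.168.0.3/24, the third octet differs under the mask. Both cases return false, which corresponds to the "goto out" path in the excerpt.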
The other is in the function smc_connect_rdma_v2_prepare, which requires the routing configuration of client and server to be symmetric. The code is here:
static int smc_connect_rdma_v2_prepare(...)
{
	...
	if (fce->v2_direct) {
		memcpy(ini->smcrv2.nexthop_mac, &aclc->r0.lcl.mac, ETH_ALEN);
		ini->smcrv2.uses_gateway = false;
	} else {
		if (smc_ib_find_route(net, smc->clcsock->sk->sk_rcv_saddr,
				      smc_ib_gid_to_ipv4(aclc->r0.lcl.gid),
				      ini->smcrv2.nexthop_mac,
				      &ini->smcrv2.uses_gateway))
			return SMC_CLC_DECL_NOROUTE;
		if (!ini->smcrv2.uses_gateway) {
			/* mismatch: peer claims indirect, but its direct */
			return SMC_CLC_DECL_NOINDIRECT;
		}
	}
	...
}
In my testing environment, the server's IP is 192.168.0.3/24 and the client's IP is 192.168.0.4/24, regardless of how many netdevs the server
or client has. The server has a special route setting for other reasons, which results in an indirect route from 192.168.0.3/24 to
192.168.0.4/24. Thus, during the CLC handshake, the client gets fce->v2_direct==false, but the client has no special route setting and finds
a direct route from 192.168.0.4/24 to 192.168.0.3/24. Because of the above symmetric routing restriction, the connection fell back with
rsn 0x030f0000. But I think SMC-R should work in this scenario.
Moreover, why is the symmetry of the routing configuration checked only when the server has an indirect route?
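The branch structure above can be summarized in a small user-space model. This is only a sketch of the decision logic in the excerpt: enum demo_result, struct demo_route and demo_v2_prepare() are hypothetical names, and the enum values merely stand in for the SMC_CLC_DECL_* codes without reproducing their numeric values.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_result {
	DEMO_OK,         /* handshake may continue with SMC-R v2 */
	DEMO_NOROUTE,    /* stands in for SMC_CLC_DECL_NOROUTE */
	DEMO_NOINDIRECT, /* stands in for SMC_CLC_DECL_NOINDIRECT */
};

/* Outcome of the client's own route lookup towards the peer's RoCE IP. */
struct demo_route {
	bool found;        /* false models smc_ib_find_route() failing */
	bool uses_gateway; /* true if the client's next hop is a gateway */
};

/* Models only the if/else shown in the excerpt. */
static enum demo_result demo_v2_prepare(bool peer_v2_direct,
					 const struct demo_route *local)
{
	assert(local);
	if (peer_v2_direct)
		return DEMO_OK;         /* peer MAC is used, no local lookup */
	if (!local->found)
		return DEMO_NOROUTE;
	if (!local->uses_gateway)
		return DEMO_NOINDIRECT; /* peer indirect, client direct */
	return DEMO_OK;
}
```

My environment is the case demo_v2_prepare(false, {found, direct}), which yields DEMO_NOINDIRECT. The model also shows the asymmetry I am asking about: when the peer reports a direct route, the client's own route is not looked up in this branch at all.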
Waiting for your reply.
Thanks,
Guangguan Wang
On 07.05.24 07:54, Guangguan Wang wrote:
Hi Guangguan,
Thank you for the questions. We asked ourselves the same questions a
while ago and did some research on them. Unfortunately, that research
is not finished yet, and I had to delay it because of my vacation last
month. Now it's time to pick it up again ;) I'll come back to you as
soon as I can give a definite answer.
Thanks,
Wenjia
On 2024/5/10 17:40, Wenjia Zhang wrote:
Hi, Wenjia.
Following Guangguan's questions, I noticed that in SMCv2, ini->smcrv2.saddr stores clcsock->sk->sk_rcv_saddr
and ini->smcrv2.daddr stores the IP converted from the peer RNIC's gid (smc_ib_gid_to_ipv4(smc_v2_ext->roce)),
e.g. in smc_find_rdma_v2_device_serv(). This is also how the src and dst addresses are treated in many
other places, such as in smc_ib_find_route() mentioned above. I am confused about why the usage is 'asymmetrical' like this:
  * clc src addr    <---->    clc dst addr
    local RNIC gid  <---->  * peer RNIC gid
  (*) means used for saddr or daddr
I guess there might be some reason behind this, and I'd really appreciate it if you have an answer.
Thank you!
On 2024/5/10 17:40, Wenjia Zhang wrote:
Hi, Wenjia,
I'm glad to hear that these questions have also caught your attention, and I'm really looking forward to your answers.
Thanks,
Guangguan Wang
On 2024/5/10 17:40, Wenjia Zhang wrote:
Hi Wenjia, is there any new information on the original intent of these designs? :) Thanks!
On 07.05.24 07:54, Guangguan Wang wrote:
> Hi, Wenjia and Jan,
>
> When testing SMC-R v2, I found some scenarios where SMC-R v2 should work, but fallback happened due to some restrictions
> in the SMC-R v2 implementation. I want to know why these restrictions exist and what would happen if they were removed.
>
Hi Guangguan and Wen,
please see my answer below.
> The first is in the function smc_ib_determine_gid_rcu, which requires a subnet match between smcrv2->saddr and the RDMA-related netdev.
> The code is here:
> static int smc_ib_determine_gid_rcu(...)
> {
> 	...
> 	in_dev_for_each_ifa_rcu(ifa, in_dev) {
> 		if (!inet_ifa_match(smcrv2->saddr, ifa))
> 			continue;
> 		subnet_match = true;
> 		break;
> 	}
> 	if (!subnet_match)
> 		goto out;
> 	...
> out:
> 	return -ENODEV;
> }
> In my testing environment, on both the server and the client, there are two netdevs: eth0 in netnamespace1 and eth0 in netnamespace2. For
> clarity, I will refer to eth0 in netnamespace1 as eth1 and to eth0 in netnamespace2 as eth2 in the following text. The IP of eth1 is
> 192.168.0.3/32 and the IP of eth2 is 192.168.0.4/24. The netmask of eth1 must be 32 for other reasons. eth1 is an RDMA-related netdev, which
> means that the adapter of eth1 has RDMA function. eth2 has been associated with eth1's RDMA device using smc_pnet. When testing a connection
> in netnamespace2 (using eth2 for the SMC-R connection), the connection fell back with rsn 0x03010000 because of the above subnet matching
> restriction. But I think SMC-R should work in this scenario.
> In another testing environment of mine, on both the server and the client, there are two netdevs, eth0 and eth1, both in netnamespace1. The
> IP of eth0 is 192.168.0.3/24 and the IP of eth1 is 192.168.1.4/24. eth0 is an RDMA-related netdev, which means that the adapter of eth0 has
> RDMA function. eth1 has been associated with eth0's RDMA device using smc_pnet. When testing an SMC-R connection through eth1, the connection
> fell back with rsn 0x03010000 because of the above subnet matching restriction. In my environment, eth0 and eth1 have the same network
> connectivity even though they are in different subnets. I think SMC-R should work in this scenario.
>
The purpose of the restriction is to simplify the IP routing topology
by allowing IP routing to use the destination host's subnet route,
because each host must also have a valid IP route to the peer's RoCE IP
address in order to create the RC QP. If that IP route is the same as
the one used by the associated TCP/IP connection, the IP routing
topology can be reused. I think that is what the following sentence in
the doc means:
https://www.ibm.com/support/pages/system/files/inline-files/IBM%20Shared%20Memory%20Communications%20Version%202_2.pdf
"
For HA, multiple RoCE adapters should be provisioned along with multiple
equal cost IP routes to the peer host (i.e., reusing the TCP/IP routing
topology).
"
"Figure 19. SMC-Rv2 with RoCEv2 Connectivity" in the doc also mentions
the restriction.
SMC-Rv2 on Linux is indeed implemented with this purpose in mind. Please
see the function smc_ib_modify_qp_rtr(). During first contact
processing, the MAC address of the next-hop IP address of the IP route
is resolved, e.g. via ARP, and used to create the RoCEv2 RC QP. If that
route cannot be used by the local RoCE IP address to reach the peer's
RoCE IP address, which can happen without this restriction, the UDP/IP
packets would not be delivered correctly.
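To illustrate the next-hop point, here is a rough user-space model of a route lookup that yields the address whose MAC would have to be resolved. It is a sketch under stated assumptions and not the kernel implementation: struct demo_rt_entry, struct demo_nexthop and demo_find_nexthop() are hypothetical names, the table is a plain array searched by longest-prefix match, addresses are in host byte order, and a gateway value of 0 is used to mean "on-link".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical routing table entry. */
struct demo_rt_entry {
	uint32_t prefix;    /* destination network, host byte order */
	uint8_t  prefixlen; /* 0..32 */
	uint32_t gateway;   /* 0 means on-link, i.e. no gateway */
};

/* Result of the lookup: whose MAC has to be resolved. */
struct demo_nexthop {
	uint32_t ip;
	bool uses_gateway;
};

static uint32_t demo_netmask(uint8_t prefixlen)
{
	assert(prefixlen <= 32);
	return prefixlen ? (uint32_t)(~(uint32_t)0 << (32 - prefixlen)) : 0;
}

/* Longest-prefix match. Returns false when no entry covers daddr. */
static bool demo_find_nexthop(const struct demo_rt_entry *tbl, size_t n,
			      uint32_t daddr, struct demo_nexthop *out)
{
	const struct demo_rt_entry *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if ((daddr ^ tbl[i].prefix) & demo_netmask(tbl[i].prefixlen))
			continue; /* entry does not cover daddr */
		if (!best || tbl[i].prefixlen > best->prefixlen)
			best = &tbl[i];
	}
	if (!best)
		return false;
	out->uses_gateway = best->gateway != 0;
	/* Via a gateway the MAC of the gateway is needed, otherwise the
	 * MAC of the destination itself. */
	out->ip = out->uses_gateway ? best->gateway : daddr;
	return true;
}
```

In this model, a host with only an on-link 192.168.0.0/24 route resolves the MAC of the peer itself, while a host that additionally has a host route to the peer via a gateway (a made-up 192.168.0.1 in the test below) resolves the gateway's MAC and reports uses_gateway as true. The RC QP is only usable if the route found this way is valid for the RoCE IP addresses as well.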
BTW, the fallback would still happen without the restriction, because
at the end of the CLC handshake (TCP/IP traffic), the first link is
created by sending and receiving the LLC confirm message (SMC-Rv2
traffic). If a peer can send but not receive the LLC confirm message,
it will send a CLC decline message with the reason "Time Out".
Now let's have a look at your examples above. In both of them, the
RDMA-related device uses a different IP route than the TCP/IP
connection, so the IP routing topology cannot be reused.
Any further thoughts?
> The other is in the function smc_connect_rdma_v2_prepare, which requires the routing configuration of client and server to be symmetric. The code is here:
> static int smc_connect_rdma_v2_prepare(...)
> {
> 	...
> 	if (fce->v2_direct) {
> 		memcpy(ini->smcrv2.nexthop_mac, &aclc->r0.lcl.mac, ETH_ALEN);
> 		ini->smcrv2.uses_gateway = false;
> 	} else {
> 		if (smc_ib_find_route(net, smc->clcsock->sk->sk_rcv_saddr,
> 				      smc_ib_gid_to_ipv4(aclc->r0.lcl.gid),
> 				      ini->smcrv2.nexthop_mac,
> 				      &ini->smcrv2.uses_gateway))
> 			return SMC_CLC_DECL_NOROUTE;
> 		if (!ini->smcrv2.uses_gateway) {
> 			/* mismatch: peer claims indirect, but its direct */
> 			return SMC_CLC_DECL_NOINDIRECT;
> 		}
> 	}
> 	...
> }
> In my testing environment, the server's IP is 192.168.0.3/24 and the client's IP is 192.168.0.4/24, regardless of how many netdevs the server
> or client has. The server has a special route setting for other reasons, which results in an indirect route from 192.168.0.3/24 to
> 192.168.0.4/24. Thus, during the CLC handshake, the client gets fce->v2_direct==false, but the client has no special route setting and finds
> a direct route from 192.168.0.4/24 to 192.168.0.3/24. Because of the above symmetric routing restriction, the connection fell back with
> rsn 0x030f0000. But I think SMC-R should work in this scenario.
> Moreover, why is the symmetry of the routing configuration checked only when the server has an indirect route?
>
That is to check whether the IP routing topology is the same on both
sides. Then I'd like to ask: why do you use asymmetric routing for your
connection? From the perspective of network setup, does it make sense
for the peers to communicate with each other using different IP routing
topologies?
> Waiting for your reply.
>
> Thanks,
> Guangguan Wang
>
On 2024/5/17 15:41, Wenjia Zhang wrote:
>
>
> On 07.05.24 07:54, Guangguan Wang wrote:
>> Hi, Wenjia and Jan,
>>
>> When testing SMC-R v2, I found some scenarios where SMC-R v2 should work, but fallback happened due to some restrictions
>> in the SMC-R v2 implementation. I want to know why these restrictions exist and what would happen if they were removed.
>>
>
> Hi Guangguan and Wen,
>
> please see my answer below.
>> The first is in the function smc_ib_determine_gid_rcu, which requires a subnet match between smcrv2->saddr and the RDMA-related netdev.
>> ...
>>
> The purpose of the restriction is to simplify the IP routing topology by allowing IP routing to use the destination host's subnet route, because each host must also have a valid IP route to the peer's RoCE IP address in order to create the RC QP. If that IP route is the same as the one used by the associated TCP/IP connection, the IP routing topology can be reused. I think that is what the following sentence in the doc means: https://www.ibm.com/support/pages/system/files/inline-files/IBM%20Shared%20Memory%20Communications%20Version%202_2.pdf
>
> "
> For HA, multiple RoCE adapters should be provisioned along with multiple equal cost IP routes to the peer host (i.e., reusing the TCP/IP routing topology).
> "
> "Figure 19. SMC-Rv2 with RoCEv2 Connectivity" in the doc also mentions the restriction.
>
> SMC-Rv2 on Linux is indeed implemented with this purpose in mind. Please see the function smc_ib_modify_qp_rtr(). During first contact processing, the MAC address of the next-hop IP address of the IP route is resolved, e.g. via ARP, and used to create the RoCEv2 RC QP. If that route cannot be used by the local RoCE IP address to reach the peer's RoCE IP address, which can happen without this restriction, the UDP/IP packets would not be delivered correctly.
>
Hi, Wenjia
Thanks for the answer.
The subnet matching restriction is clear to me now.
> BTW, the fallback would still happen without the restriction, because at the end of the CLC handshake (TCP/IP traffic), the first link is created by sending and receiving the LLC confirm message (SMC-Rv2 traffic). If a peer can send but not receive the LLC confirm message, it will send a CLC decline message with the reason "Time Out".
>
> Now let's have a look at your examples above. In both of them, the RDMA-related device uses a different IP route than the TCP/IP connection, so the IP routing topology cannot be reused.
>
> Any further thoughts?
>
>> The other is in the function smc_connect_rdma_v2_prepare, which requires the routing configuration of client and server to be symmetric. The code is here:
>> ...
>> In my testing environment, the server's IP is 192.168.0.3/24 and the client's IP is 192.168.0.4/24, regardless of how many netdevs the server
>> or client has. The server has a special route setting for other reasons, which results in an indirect route from 192.168.0.3/24 to
>> 192.168.0.4/24. Thus, during the CLC handshake, the client gets fce->v2_direct==false, but the client has no special route setting and finds
>> a direct route from 192.168.0.4/24 to 192.168.0.3/24. Because of the above symmetric routing restriction, the connection fell back with
>> rsn 0x030f0000. But I think SMC-R should work in this scenario.
>> Moreover, why is the symmetry of the routing configuration checked only when the server has an indirect route?
>>
> That is to check whether the IP routing topology is the same on both sides. Then I'd like to ask: why do you use asymmetric routing for your connection? From the perspective of network setup, does it make sense for the peers to communicate with each other using different IP routing topologies?
I have looked into the routing table configuration of my testing environment and found that it can be optimized.
The sketch in the attachment describes the topology and route configuration of my testing environment.
After optimizing the route setting, the fallback disappeared.
But it is still not clear to me why the symmetry of the routing configuration is checked only when the server has an indirect route.
Thanks,
Guangguan Wang
On 2024/5/21 18:52, Guangguan Wang wrote:
Sorry, there were some mistakes in the sketch attached to my last email. I have attached the corrected sketch to this email.
Thanks,
Guangguan Wang
> After optimizing the route setting, the fallback disappear.
>
> But why check the symmetric configuration of routing only when server is indirect route is still not clear.
>
>
> Thanks,
> Guangguan Wang
On 21.05.24 12:52, Guangguan Wang wrote:
>
>
> On 2024/5/17 15:41, Wenjia Zhang wrote:
>>
>>
>> On 07.05.24 07:54, Guangguan Wang wrote:
>>> Hi, Wenjia and Jan,
>>>
>>> When testing SMC-R v2, I found some scenarios where SMC-R v2 should be worked, but due to some restrictions in SMC-R v2's implementation,
>>> fallback happened. I want to know why these restrictions exist and what would happen if these restrictions were removed.
>>>
>>
>> Hi Guangguan and Wen,
>>
>> please see my answer below.
>>> The first is in the function smc_ib_determine_gid_rcu, where restricts the subnet matching between smcrv2->saddr and the RDMA related netdev.
>>> ...
>>>
>> The purpose of the restriction is to simplify the IP routing topology allowing IP routing to use the destination host's subnet route. Because each host must also have a valid IP route to the peer’s RoCE IP address to create RC QP. If the IP route used is the same IP Route as the associated TCP/IP connection, the reuse of the IP routing topology could be achieved. I think it is what the following sentence means in the doc https://www.ibm.com/support/pages/system/files/inline-files/IBM%20Shared%20Memory%20Communications%20Version%202_2.pdf
>>
>> "
>> For HA, multiple RoCE adapters should be provisioned along with multiple equal cost IP routes to the peer host (i.e., reusing the TCP/IP routing topology).
>> "
>> And the "Figure 19. SMC-Rv2 with RoCEv2 Connectivity" in the doc also mentions the restriction.
>>
>> The SMCRv2 on linux is indeed implemented with this purpose. Please see the function smc_ib_modify_qp_rtr(). During the first contact processing, the Mac address of the next hop IP address for the IP route is resolved by performing e.g. ARP and used to create the RoCEv2 RC QP. If the route is not usable for the RoCE IP address to reach the peer's RoCE IP address i.e. without this restriction, the UDP/IP packets would not be transported in a right way.
>>
>
> Hi, Wenjia
>
> Thanks for the answer.
>
> I am clear about the restriction of subnet matching.
>
>> BTW, the fallback would still happen without the restriction. Because at the end of the CLC handshake(TCP/IP traffic), the first link will be created by sending and receiving LLC confirm message (SMCRv2 traffic). If one peer can just send but not receive the LLC confirm message, he will send CLC decline message with the reason "Time Out".
>>
>> Now let's have a look at your examples above. Both of your RDMA related device have another IP route as the TCP/IP connection, so that the reuse of the IP routing topology is not possible.
>>
>> Any thought still?
>>
>>> The other is in the function smc_connect_rdma_v2_prepare, where restricts the symmetric configuration of routing between client and server. codes here:
>>> ...
>>> In my testing environment, server's ip is 192.168.0.3/24, client's ip 192.168.0.4/24, regarding how many netdev in server or client. Server has special
>>> route setting due to some other reasons, which results in indirect route from 192.168.0.3/24 to 192.168.0.4/24. Thus, when CLC handshake, client will
>>> get fce->v2_direct==false, but client has no special routing setting and will find direct route from 192.168.0.4/24 to 192.168.0.3/24. Due to the above
>>> symmetric configuration of routing restriction, we got fallback connection, rsn is 0x030f0000. But I think SMC-R should work in this scenario.
>>> And more, why check the symmetric configuration of routing only when server is indirect route?
>>>
>> That is to check if the IP routing topology is the same on both sides. Then I'd like to ask why you use asymmetric routing for your connection? From the perspective of Networking set up, does it make any sense that the peers communicate with each other with different IP routing topology?
>
> I have looked into the configuration of my testing environment's routing table and found that the configuration can be optimized.
> And the sketch in the attachment used to describe the topology and route configuration of my testing environment.
> After optimizing the route setting, the fallback disappear.
>
> But why check the symmetric configuration of routing only when server is indirect route is still not clear.
>
>
> Thanks,
> Guangguan Wang
The optimized configuration looks much more reasonable to me. So why
would we need the symmetry check when the server has a direct route?
Wouldn't we expect a direct route on the client's side as well? If not,
I have to repeat my question: does it make any sense for the peers to
communicate with each other over different IP routing topologies, as in
the first version of your configuration? If so, I need a convincing
argument.
Thanks,
Wenjia
On 2024/5/27 22:57, Wenjia Zhang wrote:
>
>
> On 21.05.24 12:52, Guangguan Wang wrote:
>>
>>
>> On 2024/5/17 15:41, Wenjia Zhang wrote:
>>>
>>>
>>> On 07.05.24 07:54, Guangguan Wang wrote:
>>>> Hi, Wenjia and Jan,
>>>>
>>>> When testing SMC-R v2, I found some scenarios where SMC-R v2 should be worked, but due to some restrictions in SMC-R v2's implementation,
>>>> fallback happened. I want to know why these restrictions exist and what would happen if these restrictions were removed.
>>>>
>>>
>>> Hi Guangguan and Wen,
>>>
>>> please see my answer below.
>>>> The first is in the function smc_ib_determine_gid_rcu, where restricts the subnet matching between smcrv2->saddr and the RDMA related netdev.
>>>> ...
>>>>
>>> The purpose of the restriction is to simplify the IP routing topology allowing IP routing to use the destination host's subnet route. Because each host must also have a valid IP route to the peer’s RoCE IP address to create RC QP. If the IP route used is the same IP Route as the associated TCP/IP connection, the reuse of the IP routing topology could be achieved. I think it is what the following sentence means in the doc https://www.ibm.com/support/pages/system/files/inline-files/IBM%20Shared%20Memory%20Communications%20Version%202_2.pdf
>>>
>>> "
>>> For HA, multiple RoCE adapters should be provisioned along with multiple equal cost IP routes to the peer host (i.e., reusing the TCP/IP routing topology).
>>> "
>>> And the "Figure 19. SMC-Rv2 with RoCEv2 Connectivity" in the doc also mentions the restriction.
>>>
>>> The SMCRv2 on linux is indeed implemented with this purpose. Please see the function smc_ib_modify_qp_rtr(). During the first contact processing, the Mac address of the next hop IP address for the IP route is resolved by performing e.g. ARP and used to create the RoCEv2 RC QP. If the route is not usable for the RoCE IP address to reach the peer's RoCE IP address i.e. without this restriction, the UDP/IP packets would not be transported in a right way.
>>>
>>
>> Hi, Wenjia
>>
>> Thanks for the answer.
>>
>> I am clear about the restriction of subnet matching.
>>
>>> BTW, the fallback would still happen without the restriction. Because at the end of the CLC handshake(TCP/IP traffic), the first link will be created by sending and receiving LLC confirm message (SMCRv2 traffic). If one peer can just send but not receive the LLC confirm message, he will send CLC decline message with the reason "Time Out".
>>>
>>> Now let's have a look at your examples above. Both of your RDMA related device have another IP route as the TCP/IP connection, so that the reuse of the IP routing topology is not possible.
>>>
>>> Any thought still?
>>>
>>>> The other is in the function smc_connect_rdma_v2_prepare, where restricts the symmetric configuration of routing between client and server. codes here:
>>>> ...
>>>> In my testing environment, server's ip is 192.168.0.3/24, client's ip 192.168.0.4/24, regarding how many netdev in server or client. Server has special
>>>> route setting due to some other reasons, which results in indirect route from 192.168.0.3/24 to 192.168.0.4/24. Thus, when CLC handshake, client will
>>>> get fce->v2_direct==false, but client has no special routing setting and will find direct route from 192.168.0.4/24 to 192.168.0.3/24. Due to the above
>>>> symmetric configuration of routing restriction, we got fallback connection, rsn is 0x030f0000. But I think SMC-R should work in this scenario.
>>>> And more, why check the symmetric configuration of routing only when server is indirect route?
>>>>
>>> That is to check if the IP routing topology is the same on both sides. Then I'd like to ask why you use asymmetric routing for your connection? From the perspective of Networking set up, does it make any sense that the peers communicate with each other with different IP routing topology?
>>
>> I have looked into the configuration of my testing environment's routing table and found that the configuration can be optimized.
>> And the sketch in the attachment used to describe the topology and route configuration of my testing environment.
>> After optimizing the route setting, the fallback disappear.
>>
>> But why check the symmetric configuration of routing only when server is indirect route is still not clear.
>>
>>
>> Thanks,
>> Guangguan Wang
>
> The optimized configuration looks much more reasonable to me. Thus, why do we need to do the symmetric check when the server is direct route? Don't we expect for a direct route on the client's side? If not, I have to repeat my question: does it make any sense that the peers communicate with each other with different IP routing topology structures, like your first version of configuration? If yes, I need convincing argument.
I agree that it is more reasonable for peers to communicate with each other over the same IP routing topology.
My question is: when the server has a direct route, why is the route configuration on the client side not checked?
As far as routing configuration is concerned, I think both sides are equal: either the server or the client can be misconfigured.
Thanks,
Guangguan Wang
>
> Thanks,
> Wenjia