2022-09-20 07:09:54

by Wen Gu

Subject: [PATCH net] net/smc: Stop the CLC flow if no link to map buffers on

There might be a potential race between SMC-R buffer map and
link group termination.

smc_smcr_terminate_all()     | smc_connect_rdma()
--------------------------------------------------------------
                             | smc_conn_create()
for links in smcibdev        |
        schedule links down  |
                             | smc_buf_create()
                             |  \- smcr_buf_map_usable_links()
                             |      \- no usable links found,
                             |         (rmb->mr = NULL)
                             |
                             | smc_clc_send_confirm()
                             |  \- access conn->rmb_desc->mr[]->rkey
                             |     (panic)

During reboot and IB device module removal, all links are set
down and no usable links remain in the link groups. In this situation
smcr_buf_map_usable_links() should return an error and stop the
CLC flow from accessing an uninitialized mr.

Fixes: b9247544c1bc ("net/smc: convert static link ID instances to support multiple links")
Signed-off-by: Wen Gu <[email protected]>
---
net/smc/smc_core.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index ebf56cd..df89c2e 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -2239,7 +2239,7 @@ static struct smc_buf_desc *smcr_new_buf_create(struct smc_link_group *lgr,
 static int smcr_buf_map_usable_links(struct smc_link_group *lgr,
 				     struct smc_buf_desc *buf_desc, bool is_rmb)
 {
-	int i, rc = 0;
+	int i, rc = 0, cnt = 0;
 
 	/* protect against parallel link reconfiguration */
 	mutex_lock(&lgr->llc_conf_mutex);
@@ -2252,9 +2252,12 @@ static int smcr_buf_map_usable_links(struct smc_link_group *lgr,
 			rc = -ENOMEM;
 			goto out;
 		}
+		cnt++;
 	}
 out:
 	mutex_unlock(&lgr->llc_conf_mutex);
+	if (!rc && !cnt)
+		rc = -EINVAL;
 	return rc;
 }

--
1.8.3.1
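
For readers following the thread without the kernel tree at hand, the control flow of the fix can be sketched in userspace C. This is only an illustrative model, not the kernel code: every toy_* name, LINKS_MAX, the map_fails flag and the fake rkey values are assumptions made up for the sketch. It shows the two points from the commit message: mapping on zero usable links is reported as -EINVAL, and a caller that checks the return value never dereferences a NULL mr.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define LINKS_MAX 3	/* hypothetical stand-in for SMC_LINKS_PER_LGR_MAX */

struct toy_mr {
	unsigned int rkey;
};

struct toy_link {
	bool usable;		/* models the result of smc_link_usable() */
	bool map_fails;		/* hypothetical knob: force the map step to fail */
	struct toy_mr mr_storage;
};

struct toy_buf {
	struct toy_mr *mr[LINKS_MAX];	/* stays NULL until mapped on that link */
};

/* Map buf on every usable link, following the patched control flow. */
static int toy_buf_map_usable_links(struct toy_link *links, struct toy_buf *buf)
{
	int i, rc = 0, cnt = 0;

	assert(links && buf);
	for (i = 0; i < LINKS_MAX; i++) {
		if (!links[i].usable)
			continue;
		if (links[i].map_fails) {
			rc = -ENOMEM;
			goto out;
		}
		links[i].mr_storage.rkey = 0x1000u + (unsigned int)i;
		buf->mr[i] = &links[i].mr_storage;
		cnt++;
	}
out:
	/* The fix: "no error" with zero mapped links becomes an error. */
	if (!rc && !cnt)
		rc = -EINVAL;
	return rc;
}

/* Caller model: the rkey is only read after a successful mapping. */
static int toy_send_confirm(struct toy_link *links, struct toy_buf *buf,
			    int link_idx, unsigned int *rkey)
{
	int rc = toy_buf_map_usable_links(links, buf);

	if (rc)
		return rc;	/* stop before touching buf->mr[] */
	if (!buf->mr[link_idx])
		return -EINVAL;	/* requested link was not among the mapped ones */
	*rkey = buf->mr[link_idx]->rkey;
	return 0;
}
```

With all links marked unusable, toy_buf_map_usable_links() returns -EINVAL and toy_send_confirm() bails out early, which corresponds to stopping the CLC flow in the race shown above.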


2022-09-22 08:43:19

by Wen Gu

Subject: Re: [PATCH net] net/smc: Stop the CLC flow if no link to map buffers on



On 2022/9/20 14:43, Wen Gu wrote:

> There might be a potential race between SMC-R buffer map and
> link group termination.
>
> smc_smcr_terminate_all()     | smc_connect_rdma()
> --------------------------------------------------------------
>                              | smc_conn_create()
> for links in smcibdev        |
>         schedule links down  |
>                              | smc_buf_create()
>                              |  \- smcr_buf_map_usable_links()
>                              |      \- no usable links found,
>                              |         (rmb->mr = NULL)
>                              |
>                              | smc_clc_send_confirm()
>                              |  \- access conn->rmb_desc->mr[]->rkey
>                              |     (panic)
>
> During reboot and IB device module removal, all links are set
> down and no usable links remain in the link groups. In this situation
> smcr_buf_map_usable_links() should return an error and stop the
> CLC flow from accessing an uninitialized mr.
>
> Fixes: b9247544c1bc ("net/smc: convert static link ID instances to support multiple links")
> Signed-off-by: Wen Gu <[email protected]>
> ---
> net/smc/smc_core.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
> index ebf56cd..df89c2e 100644
> --- a/net/smc/smc_core.c
> +++ b/net/smc/smc_core.c
> @@ -2239,7 +2239,7 @@ static struct smc_buf_desc *smcr_new_buf_create(struct smc_link_group *lgr,
>  static int smcr_buf_map_usable_links(struct smc_link_group *lgr,
>  				     struct smc_buf_desc *buf_desc, bool is_rmb)
>  {
> -	int i, rc = 0;
> +	int i, rc = 0, cnt = 0;
>
>  	/* protect against parallel link reconfiguration */
>  	mutex_lock(&lgr->llc_conf_mutex);
> @@ -2252,9 +2252,12 @@ static int smcr_buf_map_usable_links(struct smc_link_group *lgr,
>  			rc = -ENOMEM;
>  			goto out;
>  		}
> +		cnt++;
>  	}
>  out:
>  	mutex_unlock(&lgr->llc_conf_mutex);
> +	if (!rc && !cnt)
> +		rc = -EINVAL;
>  	return rc;
>  }
>

Any comments or reviews are welcome and appreciated.

Thanks,
Wen Gu

2022-09-22 12:10:01

by patchwork-bot+netdevbpf

Subject: Re: [PATCH net] net/smc: Stop the CLC flow if no link to map buffers on

Hello:

This patch was applied to netdev/net.git (master)
by Paolo Abeni <[email protected]>:

On Tue, 20 Sep 2022 14:43:09 +0800 you wrote:
> There might be a potential race between SMC-R buffer map and
> link group termination.
>
> smc_smcr_terminate_all()     | smc_connect_rdma()
> --------------------------------------------------------------
>                              | smc_conn_create()
> for links in smcibdev        |
>         schedule links down  |
>                              | smc_buf_create()
>                              |  \- smcr_buf_map_usable_links()
>                              |      \- no usable links found,
>                              |         (rmb->mr = NULL)
>                              |
>                              | smc_clc_send_confirm()
>                              |  \- access conn->rmb_desc->mr[]->rkey
>                              |     (panic)
>
> [...]

Here is the summary with links:
- [net] net/smc: Stop the CLC flow if no link to map buffers on
https://git.kernel.org/netdev/net/c/e738455b2c6d

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


2022-09-22 13:36:34

by Wenjia Zhang

Subject: Re: [PATCH net] net/smc: Stop the CLC flow if no link to map buffers on



On 22.09.22 10:29, Wen Gu wrote:
>
>
> On 2022/9/20 14:43, Wen Gu wrote:
>
>> There might be a potential race between SMC-R buffer map and
>> link group termination.
>>
>> smc_smcr_terminate_all()     | smc_connect_rdma()
>> --------------------------------------------------------------
>>                               | smc_conn_create()
>> for links in smcibdev        |
>>          schedule links down  |
>>                               | smc_buf_create()
>>                               |  \- smcr_buf_map_usable_links()
>>                               |      \- no usable links found,
>>                               |         (rmb->mr = NULL)
>>                               |
>>                               | smc_clc_send_confirm()
>>                               |  \- access conn->rmb_desc->mr[]->rkey
>>                               |     (panic)
>>
>> During reboot and IB device module removal, all links are set
>> down and no usable links remain in the link groups. In this situation
>> smcr_buf_map_usable_links() should return an error and stop the
>> CLC flow from accessing an uninitialized mr.
>>
>> Fixes: b9247544c1bc ("net/smc: convert static link ID instances to
>> support multiple links")
>> Signed-off-by: Wen Gu <[email protected]>
>> ---
>>   net/smc/smc_core.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
>> index ebf56cd..df89c2e 100644
>> --- a/net/smc/smc_core.c
>> +++ b/net/smc/smc_core.c
>> @@ -2239,7 +2239,7 @@ static struct smc_buf_desc
>> *smcr_new_buf_create(struct smc_link_group *lgr,
>>   static int smcr_buf_map_usable_links(struct smc_link_group *lgr,
>>                        struct smc_buf_desc *buf_desc, bool is_rmb)
>>   {
>> -    int i, rc = 0;
>> +    int i, rc = 0, cnt = 0;
>>       /* protect against parallel link reconfiguration */
>>       mutex_lock(&lgr->llc_conf_mutex);
>> @@ -2252,9 +2252,12 @@ static int smcr_buf_map_usable_links(struct
>> smc_link_group *lgr,
>>               rc = -ENOMEM;
>>               goto out;
>>           }
>> +        cnt++;
>>       }
>>   out:
>>       mutex_unlock(&lgr->llc_conf_mutex);
>> +    if (!rc && !cnt)
>> +        rc = -EINVAL;
>>       return rc;
>>   }
>
> Any comments or reviews are welcome and appreciated.
>
> Thanks,
> Wen Gu

Sorry for the late answer!
Good catch! Thank you!

Acked-by: Wenjia Zhang <[email protected]>