2018-07-13 23:53:02

by Qing Huang

[permalink] [raw]
Subject: [PATCH] IB/mlx5: avoid binding a new mpi unit to the same devices repeatedly

When a CX5 device is configured in dual-port RoCE mode, after creating
many VFs against port 1, creating the same number of VFs against port 2
will flood kernel/syslog with something like
"mlx5_*:mlx5_ib_bind_slave_port:4266:(pid 5269): port 2 already
affiliated."

So basically, when traversing mlx5_ib_dev_list, mlx5_ib_add_slave_port()
shouldn't repeatedly attempt to bind the new mpi data unit to every device
on the list until it finds an unbound device.

Reported-by: Gerald Gibson <[email protected]>
Signed-off-by: Qing Huang <[email protected]>
---
drivers/infiniband/hw/mlx5/main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index b3ba9a2..1ddd1d3 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -6068,7 +6068,8 @@ static void *mlx5_ib_add_slave_port(struct mlx5_core_dev *mdev, u8 port_num)

mutex_lock(&mlx5_ib_multiport_mutex);
list_for_each_entry(dev, &mlx5_ib_dev_list, ib_dev_list) {
- if (dev->sys_image_guid == mpi->sys_image_guid)
+ if (dev->sys_image_guid == mpi->sys_image_guid &&
+ !dev->port[mlx5_core_native_port_num(mdev) - 1].mp.mpi)
bound = mlx5_ib_bind_slave_port(dev, mpi);

if (bound) {
--
2.9.3



2018-07-14 15:57:51

by Or Gerlitz

[permalink] [raw]
Subject: Re: [PATCH] IB/mlx5: avoid binding a new mpi unit to the same devices repeatedly

On Sat, Jul 14, 2018 at 2:50 AM, Qing Huang <[email protected]> wrote:
> When a CX5 device is configured in dual-port RoCE mode, after creating
> many VFs against port 1, creating the same number of VFs against port 2
> will flood kernel/syslog with something like
> "mlx5_*:mlx5_ib_bind_slave_port:4266:(pid 5269): port 2 already
> affiliated."
>
> So basically, when traversing mlx5_ib_dev_list, mlx5_ib_add_slave_port()
> shouldn't repeatedly attempt to bind the new mpi data unit to every device
> on the list until it finds an unbound device.

Daniel,

What is mpi data unit?

Or.

>
> Reported-by: Gerald Gibson <[email protected]>
> Signed-off-by: Qing Huang <[email protected]>
> ---
> drivers/infiniband/hw/mlx5/main.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> index b3ba9a2..1ddd1d3 100644
> --- a/drivers/infiniband/hw/mlx5/main.c
> +++ b/drivers/infiniband/hw/mlx5/main.c
> @@ -6068,7 +6068,8 @@ static void *mlx5_ib_add_slave_port(struct mlx5_core_dev *mdev, u8 port_num)
>
> mutex_lock(&mlx5_ib_multiport_mutex);
> list_for_each_entry(dev, &mlx5_ib_dev_list, ib_dev_list) {
> - if (dev->sys_image_guid == mpi->sys_image_guid)
> + if (dev->sys_image_guid == mpi->sys_image_guid &&
> + !dev->port[mlx5_core_native_port_num(mdev) - 1].mp.mpi)
> bound = mlx5_ib_bind_slave_port(dev, mpi);
>
> if (bound) {
> --
> 2.9.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2018-07-15 06:05:04

by Leon Romanovsky

[permalink] [raw]
Subject: Re: [PATCH] IB/mlx5: avoid binding a new mpi unit to the same devices repeatedly

On Sat, Jul 14, 2018 at 06:57:01PM +0300, Or Gerlitz wrote:
> On Sat, Jul 14, 2018 at 2:50 AM, Qing Huang <[email protected]> wrote:
> > When a CX5 device is configured in dual-port RoCE mode, after creating
> > many VFs against port 1, creating the same number of VFs against port 2
> > will flood kernel/syslog with something like
> > "mlx5_*:mlx5_ib_bind_slave_port:4266:(pid 5269): port 2 already
> > affiliated."
> >
> > So basically, when traversing mlx5_ib_dev_list, mlx5_ib_add_slave_port()
> > shouldn't repeatedly attempt to bind the new mpi data unit to every device
> > on the list until it finds an unbound device.
>
> Daniel,
>
> What is mpi data unit?

Or,

"mpi" is multi-port information needed for multi-port IB devices
resource sharing.

In simple words, "mpi" holds mapping information between
affiliated ports (master vs. slave) and IB device which it is connected.

Most probably, "mpi data unit" was meant "mpi struct" by Qing.

BTW, Daniel is OOO till 7-23.

Thanks

>
> Or.
>
> >
> > Reported-by: Gerald Gibson <[email protected]>
> > Signed-off-by: Qing Huang <[email protected]>
> > ---
> > drivers/infiniband/hw/mlx5/main.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> > index b3ba9a2..1ddd1d3 100644
> > --- a/drivers/infiniband/hw/mlx5/main.c
> > +++ b/drivers/infiniband/hw/mlx5/main.c
> > @@ -6068,7 +6068,8 @@ static void *mlx5_ib_add_slave_port(struct mlx5_core_dev *mdev, u8 port_num)
> >
> > mutex_lock(&mlx5_ib_multiport_mutex);
> > list_for_each_entry(dev, &mlx5_ib_dev_list, ib_dev_list) {
> > - if (dev->sys_image_guid == mpi->sys_image_guid)
> > + if (dev->sys_image_guid == mpi->sys_image_guid &&
> > + !dev->port[mlx5_core_native_port_num(mdev) - 1].mp.mpi)
> > bound = mlx5_ib_bind_slave_port(dev, mpi);
> >
> > if (bound) {
> > --
> > 2.9.3
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html


Attachments:
(No filename) (2.23 kB)
signature.asc (817.00 B)
Download all attachments

2018-07-15 19:49:54

by Daniel Jurgens

[permalink] [raw]
Subject: Re: [PATCH] IB/mlx5: avoid binding a new mpi unit to the same devices repeatedly


On 7/14/2018 10:57 AM, Or Gerlitz wrote:
> On Sat, Jul 14, 2018 at 2:50 AM, Qing Huang <[email protected]> wrote:
>> When a CX5 device is configured in dual-port RoCE mode, after creating
>> many VFs against port 1, creating the same number of VFs against port 2
>> will flood kernel/syslog with something like
>> "mlx5_*:mlx5_ib_bind_slave_port:4266:(pid 5269): port 2 already
>> affiliated."
>>
>> So basically, when traversing mlx5_ib_dev_list, mlx5_ib_add_slave_port()
>> shouldn't repeatedly attempt to bind the new mpi data unit to every device
>> on the list until it finds an unbound device.
> Daniel,
>
> What is mpi data unit?
It's a structure to keep track affiliated port info in dual port RoCE mode, mpi meaning multi-port info. Parav can review this it my absence, otherwise I can take a closer look when I return to the office.
>
> Or.
>
>> Reported-by: Gerald Gibson <[email protected]>
>> Signed-off-by: Qing Huang <[email protected]>
>> ---
>> drivers/infiniband/hw/mlx5/main.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
>> index b3ba9a2..1ddd1d3 100644
>> --- a/drivers/infiniband/hw/mlx5/main.c
>> +++ b/drivers/infiniband/hw/mlx5/main.c
>> @@ -6068,7 +6068,8 @@ static void *mlx5_ib_add_slave_port(struct mlx5_core_dev *mdev, u8 port_num)
>>
>> mutex_lock(&mlx5_ib_multiport_mutex);
>> list_for_each_entry(dev, &mlx5_ib_dev_list, ib_dev_list) {
>> - if (dev->sys_image_guid == mpi->sys_image_guid)
>> + if (dev->sys_image_guid == mpi->sys_image_guid &&
>> + !dev->port[mlx5_core_native_port_num(mdev) - 1].mp.mpi)
>> bound = mlx5_ib_bind_slave_port(dev, mpi);
>>
>> if (bound) {
>> --
>> 2.9.3
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html


2018-07-23 15:37:53

by Qing Huang

[permalink] [raw]
Subject: Re: [PATCH] IB/mlx5: avoid binding a new mpi unit to the same devices repeatedly



On 7/15/2018 12:48 PM, Daniel Jurgens wrote:
> On 7/14/2018 10:57 AM, Or Gerlitz wrote:
>> On Sat, Jul 14, 2018 at 2:50 AM, Qing Huang <[email protected]> wrote:
>>> When a CX5 device is configured in dual-port RoCE mode, after creating
>>> many VFs against port 1, creating the same number of VFs against port 2
>>> will flood kernel/syslog with something like
>>> "mlx5_*:mlx5_ib_bind_slave_port:4266:(pid 5269): port 2 already
>>> affiliated."
>>>
>>> So basically, when traversing mlx5_ib_dev_list, mlx5_ib_add_slave_port()
>>> shouldn't repeatedly attempt to bind the new mpi data unit to every device
>>> on the list until it finds an unbound device.
>> Daniel,
>>
>> What is mpi data unit?
> It's a structure to keep track affiliated port info in dual port RoCE mode, mpi meaning multi-port info. Parav can review this it my absence, otherwise I can take a closer look when I return to the office.
Hi Daniel/Parav,

Have you got a chance to review this patch? Thanks!

>> Or.
>>
>>> Reported-by: Gerald Gibson <[email protected]>
>>> Signed-off-by: Qing Huang <[email protected]>
>>> ---
>>> drivers/infiniband/hw/mlx5/main.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
>>> index b3ba9a2..1ddd1d3 100644
>>> --- a/drivers/infiniband/hw/mlx5/main.c
>>> +++ b/drivers/infiniband/hw/mlx5/main.c
>>> @@ -6068,7 +6068,8 @@ static void *mlx5_ib_add_slave_port(struct mlx5_core_dev *mdev, u8 port_num)
>>>
>>> mutex_lock(&mlx5_ib_multiport_mutex);
>>> list_for_each_entry(dev, &mlx5_ib_dev_list, ib_dev_list) {
>>> - if (dev->sys_image_guid == mpi->sys_image_guid)
>>> + if (dev->sys_image_guid == mpi->sys_image_guid &&
>>> + !dev->port[mlx5_core_native_port_num(mdev) - 1].mp.mpi)
>>> bound = mlx5_ib_bind_slave_port(dev, mpi);
>>>
>>> if (bound) {
>>> --
>>> 2.9.3
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>>> the body of a message to [email protected]
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html


2018-07-23 16:23:14

by Parav Pandit

[permalink] [raw]
Subject: RE: [PATCH] IB/mlx5: avoid binding a new mpi unit to the same devices repeatedly

Hi Qing,


> -----Original Message-----
> From: Qing Huang [mailto:[email protected]]
> Sent: Monday, July 23, 2018 10:36 AM
> To: Daniel Jurgens <[email protected]>; Or Gerlitz
> <[email protected]>; Parav Pandit <[email protected]>
> Cc: Linux Kernel <[email protected]>; RDMA mailing list <linux-
> [email protected]>; Jason Gunthorpe <[email protected]>; Doug Ledford
> <[email protected]>; Leon Romanovsky <[email protected]>;
> [email protected]
> Subject: Re: [PATCH] IB/mlx5: avoid binding a new mpi unit to the same
> devices repeatedly
>
>
>
> On 7/15/2018 12:48 PM, Daniel Jurgens wrote:
> > On 7/14/2018 10:57 AM, Or Gerlitz wrote:
> >> On Sat, Jul 14, 2018 at 2:50 AM, Qing Huang <[email protected]>
> wrote:
> >>> When a CX5 device is configured in dual-port RoCE mode, after
> >>> creating many VFs against port 1, creating the same number of VFs
> >>> against port 2 will flood kernel/syslog with something like
> >>> "mlx5_*:mlx5_ib_bind_slave_port:4266:(pid 5269): port 2 already
> >>> affiliated."
> >>>
> >>> So basically, when traversing mlx5_ib_dev_list,
> >>> mlx5_ib_add_slave_port() shouldn't repeatedly attempt to bind the
> >>> new mpi data unit to every device on the list until it finds an unbound
> device.
> >> Daniel,
> >>
> >> What is mpi data unit?
> > It's a structure to keep track affiliated port info in dual port RoCE mode,
> mpi meaning multi-port info. Parav can review this it my absence, otherwise I
> can take a closer look when I return to the office.
> Hi Daniel/Parav,
>
> Have you got a chance to review this patch? Thanks!
Didn't have chance yet.
Will do this week.

2018-07-23 18:13:41

by Daniel Jurgens

[permalink] [raw]
Subject: Re: [PATCH] IB/mlx5: avoid binding a new mpi unit to the same devices repeatedly



On 7/23/2018 10:36 AM, Qing Huang wrote:
>
> Hi Daniel/Parav,
>
> Have you got a chance to review this patch? Thanks!
Hi Qing, sorry for the delay, I just got back to the office today. I don't agree with the proposed fix, I provided an alternative suggestion below.
>
>>> Or.
>>>
>>>> Reported-by: Gerald Gibson <[email protected]>
>>>> Signed-off-by: Qing Huang <[email protected]>
>>>> ---
>>>>   drivers/infiniband/hw/mlx5/main.c | 3 ++-
>>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
>>>> index b3ba9a2..1ddd1d3 100644
>>>> --- a/drivers/infiniband/hw/mlx5/main.c
>>>> +++ b/drivers/infiniband/hw/mlx5/main.c
>>>> @@ -6068,7 +6068,8 @@ static void *mlx5_ib_add_slave_port(struct mlx5_core_dev *mdev, u8 port_num)
>>>>
>>>>          mutex_lock(&mlx5_ib_multiport_mutex);
>>>>          list_for_each_entry(dev, &mlx5_ib_dev_list, ib_dev_list) {
>>>> -               if (dev->sys_image_guid == mpi->sys_image_guid)
>>>> +               if (dev->sys_image_guid == mpi->sys_image_guid &&
>>>> +                   !dev->port[mlx5_core_native_port_num(mdev) - 1].mp.mpi)
You shouldn't check the mpi field that without holding the lock in the mp structure. Prefer you change the print from a warning in mlx5_ib_bind_slave_port to a debug message.

>>>>                          bound = mlx5_ib_bind_slave_port(dev, mpi);
>>>>
>>>>                  if (bound) {
>>>> --
>>>> 2.9.3
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>>>> the body of a message to [email protected]
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


2018-07-23 18:23:11

by Qing Huang

[permalink] [raw]
Subject: Re: [PATCH] IB/mlx5: avoid binding a new mpi unit to the same devices repeatedly

Hi Daniel,


On 7/23/2018 11:11 AM, Daniel Jurgens wrote:
>
> On 7/23/2018 10:36 AM, Qing Huang wrote:
>> Hi Daniel/Parav,
>>
>> Have you got a chance to review this patch? Thanks!
> Hi Qing, sorry for the delay, I just got back to the office today. I don't agree with the proposed fix, I provided an alternative suggestion below.
>>>> Or.
>>>>
>>>>> Reported-by: Gerald Gibson <[email protected]>
>>>>> Signed-off-by: Qing Huang <[email protected]>
>>>>> ---
>>>>>   drivers/infiniband/hw/mlx5/main.c | 3 ++-
>>>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
>>>>> index b3ba9a2..1ddd1d3 100644
>>>>> --- a/drivers/infiniband/hw/mlx5/main.c
>>>>> +++ b/drivers/infiniband/hw/mlx5/main.c
>>>>> @@ -6068,7 +6068,8 @@ static void *mlx5_ib_add_slave_port(struct mlx5_core_dev *mdev, u8 port_num)
>>>>>
>>>>>          mutex_lock(&mlx5_ib_multiport_mutex);
>>>>>          list_for_each_entry(dev, &mlx5_ib_dev_list, ib_dev_list) {
>>>>> -               if (dev->sys_image_guid == mpi->sys_image_guid)
>>>>> +               if (dev->sys_image_guid == mpi->sys_image_guid &&
>>>>> +                   !dev->port[mlx5_core_native_port_num(mdev) - 1].mp.mpi)
> You shouldn't check the mpi field that without holding the lock in the mp structure. Prefer you change the print from a warning in mlx5_ib_bind_slave_port to a debug message.
Thanks for the review. That works for us too. Will resend the patch.

Regards,
Qing

>
>>>>>                          bound = mlx5_ib_bind_slave_port(dev, mpi);
>>>>>
>>>>>                  if (bound) {
>>>>> --
>>>>> 2.9.3
>>>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>>>>> the body of a message to [email protected]
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html