2021-06-09 17:18:06

by Leon Romanovsky

Subject: [PATCH v2 rdma-next] RDMA/mlx5: Enable Relaxed Ordering by default for kernel ULPs

From: Avihai Horon <[email protected]>

Relaxed Ordering is a capability that can only benefit users that support
it. All kernel ULPs should support Relaxed Ordering, as they are designed
to read data only after observing the CQE and use the DMA API correctly.

Hence, implicitly enable Relaxed Ordering by default for kernel ULPs.

Signed-off-by: Avihai Horon <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
---
Changelog:
v2:
* Dropped IB/core patch and set RO implicitly in mlx5 exactly like in
eth side of mlx5 driver.
v1: https://lore.kernel.org/lkml/[email protected]
* Enabled by default RO in IB/core instead of changing all users
v0: https://lore.kernel.org/lkml/[email protected]
---
drivers/infiniband/hw/mlx5/mr.c | 10 ++++++----
drivers/infiniband/hw/mlx5/wr.c | 5 ++++-
2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 3363cde85b14..2182e76ae734 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -69,6 +69,7 @@ static void set_mkc_access_pd_addr_fields(void *mkc, int acc, u64 start_addr,
struct ib_pd *pd)
{
struct mlx5_ib_dev *dev = to_mdev(pd->device);
+ bool ro_pci_enabled = pcie_relaxed_ordering_enabled(dev->mdev->pdev);

MLX5_SET(mkc, mkc, a, !!(acc & IB_ACCESS_REMOTE_ATOMIC));
MLX5_SET(mkc, mkc, rw, !!(acc & IB_ACCESS_REMOTE_WRITE));
@@ -78,10 +79,10 @@ static void set_mkc_access_pd_addr_fields(void *mkc, int acc, u64 start_addr,

if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_write))
MLX5_SET(mkc, mkc, relaxed_ordering_write,
- !!(acc & IB_ACCESS_RELAXED_ORDERING));
+ acc & IB_ACCESS_RELAXED_ORDERING && ro_pci_enabled);
if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read))
MLX5_SET(mkc, mkc, relaxed_ordering_read,
- !!(acc & IB_ACCESS_RELAXED_ORDERING));
+ acc & IB_ACCESS_RELAXED_ORDERING && ro_pci_enabled);

MLX5_SET(mkc, mkc, pd, to_mpd(pd)->pdn);
MLX5_SET(mkc, mkc, qpn, 0xffffff);
@@ -812,7 +813,8 @@ struct ib_mr *mlx5_ib_get_dma_mr(struct ib_pd *pd, int acc)

MLX5_SET(mkc, mkc, access_mode_1_0, MLX5_MKC_ACCESS_MODE_PA);
MLX5_SET(mkc, mkc, length64, 1);
- set_mkc_access_pd_addr_fields(mkc, acc, 0, pd);
+ set_mkc_access_pd_addr_fields(mkc, acc | IB_ACCESS_RELAXED_ORDERING, 0,
+ pd);

err = mlx5_ib_create_mkey(dev, &mr->mmkey, in, inlen);
if (err)
@@ -2022,7 +2024,7 @@ static void mlx5_set_umr_free_mkey(struct ib_pd *pd, u32 *in, int ndescs,
mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry);

/* This is only used from the kernel, so setting the PD is OK. */
- set_mkc_access_pd_addr_fields(mkc, 0, 0, pd);
+ set_mkc_access_pd_addr_fields(mkc, IB_ACCESS_RELAXED_ORDERING, 0, pd);
MLX5_SET(mkc, mkc, free, 1);
MLX5_SET(mkc, mkc, translations_octword_size, ndescs);
MLX5_SET(mkc, mkc, access_mode_1_0, access_mode & 0x3);
diff --git a/drivers/infiniband/hw/mlx5/wr.c b/drivers/infiniband/hw/mlx5/wr.c
index 6880627c45be..8841620af82f 100644
--- a/drivers/infiniband/hw/mlx5/wr.c
+++ b/drivers/infiniband/hw/mlx5/wr.c
@@ -866,7 +866,10 @@ static int set_reg_wr(struct mlx5_ib_qp *qp,
bool atomic = wr->access & IB_ACCESS_REMOTE_ATOMIC;
u8 flags = 0;

- /* Matches access in mlx5_set_umr_free_mkey() */
+ /* Matches access in mlx5_set_umr_free_mkey().
+ * Relaxed Ordering is set implicitly in mlx5_set_umr_free_mkey() and
+ * kernel ULPs are not aware of it, so we don't set it here.
+ */
if (!mlx5_ib_can_reconfig_with_umr(dev, 0, wr->access)) {
mlx5_ib_warn(
to_mdev(qp->ibqp.device),
--
2.31.1
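As a sketch of the gating logic this patch adds (a userspace illustration only, not kernel code; the flag value is a hypothetical stand-in for IB_ACCESS_RELAXED_ORDERING, and the boolean stands in for pcie_relaxed_ordering_enabled()): the mkey's relaxed_ordering_* bits are set only when the caller passed the access flag AND the PCI layer still has Relaxed Ordering enabled on the device.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for IB_ACCESS_RELAXED_ORDERING. */
#define ACCESS_RELAXED_ORDERING (1u << 4)

/*
 * Mirrors the check in set_mkc_access_pd_addr_fields(): Relaxed
 * Ordering is granted only if the caller requested it and the PCI
 * config space reports RO enabled for the device. Note that &&
 * binds looser than &, so the mask test needs no extra parentheses.
 */
static bool mkey_relaxed_ordering(unsigned int acc, bool ro_pci_enabled)
{
	return acc & ACCESS_RELAXED_ORDERING && ro_pci_enabled;
}
```

Kernel ULPs then get Relaxed Ordering implicitly because mlx5_ib_get_dma_mr() and mlx5_set_umr_free_mkey() OR the flag into the access bits unconditionally, while the PCI-level setting remains the single off switch.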


2021-06-09 17:27:15

by Christoph Hellwig

Subject: Re: [PATCH v2 rdma-next] RDMA/mlx5: Enable Relaxed Ordering by default for kernel ULPs

On Wed, Jun 09, 2021 at 02:05:03PM +0300, Leon Romanovsky wrote:
> From: Avihai Horon <[email protected]>
>
> Relaxed Ordering is a capability that can only benefit users that support
> it. All kernel ULPs should support Relaxed Ordering, as they are designed
> to read data only after observing the CQE and use the DMA API correctly.
>
> Hence, implicitly enable Relaxed Ordering by default for kernel ULPs.
>
> Signed-off-by: Avihai Horon <[email protected]>
> Signed-off-by: Leon Romanovsky <[email protected]>
> ---
> Changelog:
> v2:
> * Dropped IB/core patch and set RO implicitly in mlx5 exactly like in
> eth side of mlx5 driver.

This looks great in terms of code changes. But can we please also add a
patch to document that PCIe relaxed ordering is fine for kernel ULP usage
somewhere?

2021-06-21 18:11:02

by Jason Gunthorpe

Subject: Re: [PATCH v2 rdma-next] RDMA/mlx5: Enable Relaxed Ordering by default for kernel ULPs

On Wed, Jun 09, 2021 at 02:05:03PM +0300, Leon Romanovsky wrote:
> From: Avihai Horon <[email protected]>
>
> Relaxed Ordering is a capability that can only benefit users that support
> it. All kernel ULPs should support Relaxed Ordering, as they are designed
> to read data only after observing the CQE and use the DMA API correctly.
>
> Hence, implicitly enable Relaxed Ordering by default for kernel ULPs.
>
> Signed-off-by: Avihai Horon <[email protected]>
> Signed-off-by: Leon Romanovsky <[email protected]>
> [...]

Applied to for-next, with the extra comment, thanks

Someone is working on dis-entangling the access flags? It took a long
time to sort out that this mess in wr.c actually does have a
distinct user/kernel call chain too..

Jason

2021-06-21 20:22:14

by Christoph Hellwig

Subject: Re: [PATCH v2 rdma-next] RDMA/mlx5: Enable Relaxed Ordering by default for kernel ULPs

On Mon, Jun 21, 2021 at 03:02:05PM -0300, Jason Gunthorpe wrote:
> Someone is working on dis-entangling the access flags? It took a long
> time to sort out that this mess in wr.c actually does have a
> distinct user/kernel call chain too..

I'd love to see it done, but I won't find time for it anytime soon.

2021-06-21 23:20:08

by Jason Gunthorpe

Subject: Re: [PATCH v2 rdma-next] RDMA/mlx5: Enable Relaxed Ordering by default for kernel ULPs

On Mon, Jun 21, 2021 at 10:20:33PM +0200, Christoph Hellwig wrote:
> On Mon, Jun 21, 2021 at 03:02:05PM -0300, Jason Gunthorpe wrote:
> > Someone is working on dis-entangling the access flags? It took a long
> > time to sort out that this mess in wr.c actually does have a
> > distinct user/kernel call chain too..
>
> I'd love to see it done, but I won't find time for it anytime soon.

Heh, me too..

I did actually once try to get a start on doing something to wr.c but
it rapidly started to get into mire..

I thought I recalled Leon saying he or Avihai would work on the ACCESS
thing anyhow?

Thanks,
Jason

2021-06-22 06:21:31

by Leon Romanovsky

Subject: Re: [PATCH v2 rdma-next] RDMA/mlx5: Enable Relaxed Ordering by default for kernel ULPs

On Mon, Jun 21, 2021 at 08:18:37PM -0300, Jason Gunthorpe wrote:
> On Mon, Jun 21, 2021 at 10:20:33PM +0200, Christoph Hellwig wrote:
> > On Mon, Jun 21, 2021 at 03:02:05PM -0300, Jason Gunthorpe wrote:
> > > Someone is working on dis-entangling the access flags? It took a long
> > > time to sort out that this mess in wr.c actually does have a
> > > distinct user/kernel call chain too..
> >
> > I'd love to see it done, but I won't find time for it anytime soon.
>
> Heh, me too..
>
> I did actually once try to get a start on doing something to wr.c but
> it rapidly started to get into mire..
>
> I thought I recalled Leon saying he or Avihai would work on the ACCESS
> thing anyhow?

Yes, we are planning to do it for the next cycle.

Thanks


2021-06-23 23:09:32

by Max Gurtovoy

Subject: Re: [PATCH v2 rdma-next] RDMA/mlx5: Enable Relaxed Ordering by default for kernel ULPs


On 6/9/2021 2:05 PM, Leon Romanovsky wrote:
> From: Avihai Horon <[email protected]>
>
> Relaxed Ordering is a capability that can only benefit users that support
> it. All kernel ULPs should support Relaxed Ordering, as they are designed
> to read data only after observing the CQE and use the DMA API correctly.
>
> Hence, implicitly enable Relaxed Ordering by default for kernel ULPs.
>
> Signed-off-by: Avihai Horon <[email protected]>
> Signed-off-by: Leon Romanovsky <[email protected]>
> [...]
> diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> index 3363cde85b14..2182e76ae734 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -69,6 +69,7 @@ static void set_mkc_access_pd_addr_fields(void *mkc, int acc, u64 start_addr,
> struct ib_pd *pd)
> {
> struct mlx5_ib_dev *dev = to_mdev(pd->device);
> + bool ro_pci_enabled = pcie_relaxed_ordering_enabled(dev->mdev->pdev);
>
> MLX5_SET(mkc, mkc, a, !!(acc & IB_ACCESS_REMOTE_ATOMIC));
> MLX5_SET(mkc, mkc, rw, !!(acc & IB_ACCESS_REMOTE_WRITE));
> @@ -78,10 +79,10 @@ static void set_mkc_access_pd_addr_fields(void *mkc, int acc, u64 start_addr,
>
> if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_write))
> MLX5_SET(mkc, mkc, relaxed_ordering_write,
> - !!(acc & IB_ACCESS_RELAXED_ORDERING));
> + acc & IB_ACCESS_RELAXED_ORDERING && ro_pci_enabled);
> if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read))
> MLX5_SET(mkc, mkc, relaxed_ordering_read,
> - !!(acc & IB_ACCESS_RELAXED_ORDERING));
> + acc & IB_ACCESS_RELAXED_ORDERING && ro_pci_enabled);

Jason,

If it's still possible to add a small change, it would be nice to avoid
calculating "acc & IB_ACCESS_RELAXED_ORDERING && ro_pci_enabled" twice.


2021-06-24 06:40:57

by Leon Romanovsky

Subject: Re: [PATCH v2 rdma-next] RDMA/mlx5: Enable Relaxed Ordering by default for kernel ULPs

On Thu, Jun 24, 2021 at 02:06:46AM +0300, Max Gurtovoy wrote:
>
> On 6/9/2021 2:05 PM, Leon Romanovsky wrote:
> > [...]
>
> Jason,
>
> If it's still possible to add a small change, it would be nice to avoid
> calculating "acc & IB_ACCESS_RELAXED_ORDERING && ro_pci_enabled" twice.

The patch is part of for-next now, so feel free to send followup patch.

Thanks

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index c1e70c99b70c..c4f246c90c4d 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -69,7 +69,8 @@ static void set_mkc_access_pd_addr_fields(void *mkc, int acc, u64 start_addr,
struct ib_pd *pd)
{
struct mlx5_ib_dev *dev = to_mdev(pd->device);
- bool ro_pci_enabled = pcie_relaxed_ordering_enabled(dev->mdev->pdev);
+ bool ro_pci_enabled = acc & IB_ACCESS_RELAXED_ORDERING &&
+ pcie_relaxed_ordering_enabled(dev->mdev->pdev);

MLX5_SET(mkc, mkc, a, !!(acc & IB_ACCESS_REMOTE_ATOMIC));
MLX5_SET(mkc, mkc, rw, !!(acc & IB_ACCESS_REMOTE_WRITE));
@@ -78,11 +79,9 @@ static void set_mkc_access_pd_addr_fields(void *mkc, int acc, u64 start_addr,
MLX5_SET(mkc, mkc, lr, 1);

if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_write))
- MLX5_SET(mkc, mkc, relaxed_ordering_write,
- (acc & IB_ACCESS_RELAXED_ORDERING) && ro_pci_enabled);
+ MLX5_SET(mkc, mkc, relaxed_ordering_write, ro_pci_enabled);
if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read))
- MLX5_SET(mkc, mkc, relaxed_ordering_read,
- (acc & IB_ACCESS_RELAXED_ORDERING) && ro_pci_enabled);
+ MLX5_SET(mkc, mkc, relaxed_ordering_read, ro_pci_enabled);

MLX5_SET(mkc, mkc, pd, to_mpd(pd)->pdn);
MLX5_SET(mkc, mkc, qpn, 0xffffff);



2021-06-24 07:41:48

by Max Gurtovoy

Subject: Re: [PATCH v2 rdma-next] RDMA/mlx5: Enable Relaxed Ordering by default for kernel ULPs


On 6/24/2021 9:38 AM, Leon Romanovsky wrote:
> On Thu, Jun 24, 2021 at 02:06:46AM +0300, Max Gurtovoy wrote:
>> On 6/9/2021 2:05 PM, Leon Romanovsky wrote:
>>> [...]
>> Jason,
>>
>> If it's still possible to add a small change, it would be nice to avoid
>> calculating "acc & IB_ACCESS_RELAXED_ORDERING && ro_pci_enabled" twice.
> The patch is part of for-next now, so feel free to send followup patch.
>
> Thanks
>
> diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> index c1e70c99b70c..c4f246c90c4d 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -69,7 +69,8 @@ static void set_mkc_access_pd_addr_fields(void *mkc, int acc, u64 start_addr,
> struct ib_pd *pd)
> {
> struct mlx5_ib_dev *dev = to_mdev(pd->device);
> - bool ro_pci_enabled = pcie_relaxed_ordering_enabled(dev->mdev->pdev);
> + bool ro_pci_enabled = acc & IB_ACCESS_RELAXED_ORDERING &&
> + pcie_relaxed_ordering_enabled(dev->mdev->pdev);
>
> MLX5_SET(mkc, mkc, a, !!(acc & IB_ACCESS_REMOTE_ATOMIC));
> MLX5_SET(mkc, mkc, rw, !!(acc & IB_ACCESS_REMOTE_WRITE));
> @@ -78,11 +79,9 @@ static void set_mkc_access_pd_addr_fields(void *mkc, int acc, u64 start_addr,
> MLX5_SET(mkc, mkc, lr, 1);
>
> if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_write))
> - MLX5_SET(mkc, mkc, relaxed_ordering_write,
> - (acc & IB_ACCESS_RELAXED_ORDERING) && ro_pci_enabled);
> + MLX5_SET(mkc, mkc, relaxed_ordering_write, ro_pci_enabled);
> if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read))
> - MLX5_SET(mkc, mkc, relaxed_ordering_read,
> - (acc & IB_ACCESS_RELAXED_ORDERING) && ro_pci_enabled);
> + MLX5_SET(mkc, mkc, relaxed_ordering_read, ro_pci_enabled);
>
> MLX5_SET(mkc, mkc, pd, to_mpd(pd)->pdn);
> MLX5_SET(mkc, mkc, qpn, 0xffffff);
>
Yes, this looks good.

Can you/Avihai create a patch from this, or should I do it?



2021-06-24 11:37:02

by Jason Gunthorpe

Subject: Re: [PATCH v2 rdma-next] RDMA/mlx5: Enable Relaxed Ordering by default for kernel ULPs

On Thu, Jun 24, 2021 at 10:39:16AM +0300, Max Gurtovoy wrote:
>
> On 6/24/2021 9:38 AM, Leon Romanovsky wrote:
> > On Thu, Jun 24, 2021 at 02:06:46AM +0300, Max Gurtovoy wrote:
> > > On 6/9/2021 2:05 PM, Leon Romanovsky wrote:
> > > > [...]
> > > Jason,
> > >
> > > If it's still possible to add a small change, it would be nice to avoid
> > > calculating "acc & IB_ACCESS_RELAXED_ORDERING && ro_pci_enabled" twice.
> > The patch is part of for-next now, so feel free to send followup patch.
> >
> > Thanks
> >
> > [...]
> >
> Yes, this looks good.
>
> Can you/Avihai create a patch from this, or should I do it?

I'd be surprised if it matters.. CSE and all

Jason

2021-06-27 07:32:20

by Leon Romanovsky

Subject: Re: [PATCH v2 rdma-next] RDMA/mlx5: Enable Relaxed Ordering by default for kernel ULPs

On Thu, Jun 24, 2021 at 10:39:16AM +0300, Max Gurtovoy wrote:
>
> On 6/24/2021 9:38 AM, Leon Romanovsky wrote:
> > On Thu, Jun 24, 2021 at 02:06:46AM +0300, Max Gurtovoy wrote:
> > > On 6/9/2021 2:05 PM, Leon Romanovsky wrote:
> > > > [...]
> > > Jason,
> > >
> > > If it's still possible to add a small change, it would be nice to avoid
> > > calculating "acc & IB_ACCESS_RELAXED_ORDERING && ro_pci_enabled" twice.
> > The patch is part of for-next now, so feel free to send followup patch.
> >
> > Thanks
> >
> > [...]
> >
> Yes, this looks good.
>
> Can you/Avihai create a patch from this, or should I do it?

Feel free to send it directly.

Thanks


2021-06-27 07:36:27

by Leon Romanovsky

Subject: Re: [PATCH v2 rdma-next] RDMA/mlx5: Enable Relaxed Ordering by default for kernel ULPs

On Thu, Jun 24, 2021 at 08:36:07AM -0300, Jason Gunthorpe wrote:
> On Thu, Jun 24, 2021 at 10:39:16AM +0300, Max Gurtovoy wrote:
> >
> > On 6/24/2021 9:38 AM, Leon Romanovsky wrote:
> > > On Thu, Jun 24, 2021 at 02:06:46AM +0300, Max Gurtovoy wrote:
> > > > On 6/9/2021 2:05 PM, Leon Romanovsky wrote:
> > > > > [...]
> > > > Jason,
> > > >
> > > > If it's still possible to add a small change, it would be nice to avoid
> > > > calculating "acc & IB_ACCESS_RELAXED_ORDERING && ro_pci_enabled" twice.
> > > The patch is part of for-next now, so feel free to send followup patch.
> > >
> > > Thanks
> > >
> > > [...]
> > >
> > Yes, this looks good.
> >
> > Can you/Avihai create a patch from this, or should I do it?
>
> I'd be surprised if it matters.. CSE and all

From a bytecode/performance POV, it shouldn't change anything.
However, it looks better.

Thanks
