2018-03-20 02:50:40

by Sinan Kaya

Subject: [PATCH v4 6/6] RDMA/qedr: eliminate duplicate barriers on weakly-ordered archs #2

The code contains wmb() followed by writel() in multiple places. writel()
already includes a barrier on some architectures, such as arm64.

As a result, the CPU executes two barriers back to back before performing
the register write.

Since the code already has an explicit barrier call, change writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <[email protected]>
---
drivers/infiniband/hw/qedr/verbs.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
index ccd55f4..db60360 100644
--- a/drivers/infiniband/hw/qedr/verbs.c
+++ b/drivers/infiniband/hw/qedr/verbs.c
@@ -860,7 +860,7 @@ static void doorbell_cq(struct qedr_cq *cq, u32 cons, u8 flags)
wmb();
cq->db.data.agg_flags = flags;
cq->db.data.value = cpu_to_le32(cons);
- writeq(cq->db.raw, cq->db_addr);
+ writeq_relaxed(cq->db.raw, cq->db_addr);

/* Make sure write would stick */
mmiowb();
@@ -3338,7 +3338,7 @@ int qedr_post_recv(struct ib_qp *ibqp, struct ib_recv_wr *wr,

qp->rq.db_data.data.value++;

- writel(qp->rq.db_data.raw, qp->rq.db);
+ writel_relaxed(qp->rq.db_data.raw, qp->rq.db);

/* Make sure write sticks */
mmiowb();
--
2.7.4
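
To make the commit message's reasoning concrete, here is a minimal userspace sketch that counts barriers before and after the change. Every model_* function, barrier_count, and fake_reg is a hypothetical stand-in and not the kernel implementation; the sketch simply assumes an architecture where writel() issues one barrier before the store, as the message says is the case on arm64.

```c
#include <stdint.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel primitives. */
static unsigned int barrier_count;   /* barriers issued since the last reset */
static volatile uint32_t fake_reg;   /* stands in for an MMIO doorbell register */

/* Models an explicit write barrier. */
static void model_wmb(void)
{
	barrier_count++;
}

/* Models writel() on an arch assumed to issue a barrier before the store. */
static void model_writel(uint32_t val, volatile uint32_t *addr)
{
	barrier_count++;
	*addr = val;
}

/* Models writel_relaxed(): the store only, no implied barrier. */
static void model_writel_relaxed(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
}

/* Pattern before the patch: wmb() followed by writel(). */
static unsigned int doorbell_before(uint32_t val)
{
	barrier_count = 0;
	model_wmb();
	model_writel(val, &fake_reg);
	return barrier_count;
}

/* Pattern after the patch: wmb() followed by writel_relaxed(). */
static unsigned int doorbell_after(uint32_t val)
{
	barrier_count = 0;
	model_wmb();
	model_writel_relaxed(val, &fake_reg);
	return barrier_count;
}
```

In this model doorbell_before() reports two barriers and doorbell_after() reports one, while the value that reaches the register is the same. The model only counts calls; it says nothing about the real ordering guarantees of any architecture.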



2018-03-20 07:40:04

by Kalderon, Michal

Subject: RE: [PATCH v4 6/6] RDMA/qedr: eliminate duplicate barriers on weakly-ordered archs #2

> From: Sinan Kaya [mailto:[email protected]]
> Sent: Tuesday, March 20, 2018 4:48 AM
>
> Code includes wmb() followed by writel() in multiple places. writel() already
> has a barrier on some architectures like arm64.
>
> This ends up CPU observing two barriers back to back before executing the
> register write.
>
> Since code already has an explicit barrier call, changing writel() to
> writel_relaxed().
>
> Signed-off-by: Sinan Kaya <[email protected]>
> ---
> drivers/infiniband/hw/qedr/verbs.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/hw/qedr/verbs.c
> b/drivers/infiniband/hw/qedr/verbs.c
> index ccd55f4..db60360 100644
> --- a/drivers/infiniband/hw/qedr/verbs.c
> +++ b/drivers/infiniband/hw/qedr/verbs.c
> @@ -860,7 +860,7 @@ static void doorbell_cq(struct qedr_cq *cq, u32 cons,
> u8 flags)
> wmb();
> cq->db.data.agg_flags = flags;
> cq->db.data.value = cpu_to_le32(cons);
> - writeq(cq->db.raw, cq->db_addr);
> + writeq_relaxed(cq->db.raw, cq->db_addr);
>
> /* Make sure write would stick */
> mmiowb();
> @@ -3338,7 +3338,7 @@ int qedr_post_recv(struct ib_qp *ibqp, struct
> ib_recv_wr *wr,
>
> qp->rq.db_data.data.value++;
>
> - writel(qp->rq.db_data.raw, qp->rq.db);
> + writel_relaxed(qp->rq.db_data.raw, qp->rq.db);
>
> /* Make sure write sticks */
> mmiowb();
> --
> 2.7.4
Acked-by: Michal Kalderon <[email protected]>


2018-03-20 14:57:02

by Jason Gunthorpe

Subject: Re: [PATCH v4 6/6] RDMA/qedr: eliminate duplicate barriers on weakly-ordered archs #2

On Mon, Mar 19, 2018 at 10:47:48PM -0400, Sinan Kaya wrote:
> Code includes wmb() followed by writel() in multiple places. writel()
> already has a barrier on some architectures like arm64.
>
> This ends up CPU observing two barriers back to back before executing the
> register write.
>
> Since code already has an explicit barrier call, changing writel() to
> writel_relaxed().
>
> Signed-off-by: Sinan Kaya <[email protected]>
> ---
> drivers/infiniband/hw/qedr/verbs.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
> index ccd55f4..db60360 100644
> --- a/drivers/infiniband/hw/qedr/verbs.c
> +++ b/drivers/infiniband/hw/qedr/verbs.c
> @@ -860,7 +860,7 @@ static void doorbell_cq(struct qedr_cq *cq, u32 cons, u8 flags)
> wmb();
> cq->db.data.agg_flags = flags;
> cq->db.data.value = cpu_to_le32(cons);
> - writeq(cq->db.raw, cq->db_addr);
> + writeq_relaxed(cq->db.raw, cq->db_addr);
>
> /* Make sure write would stick */
> mmiowb();
> @@ -3338,7 +3338,7 @@ int qedr_post_recv(struct ib_qp *ibqp, struct ib_recv_wr *wr,
>
> qp->rq.db_data.data.value++;
>
> - writel(qp->rq.db_data.raw, qp->rq.db);
> + writel_relaxed(qp->rq.db_data.raw, qp->rq.db);
>
> /* Make sure write sticks */
> mmiowb();

Looks fine, but the next lines should be relaxed too:


/* Make sure write sticks */
mmiowb();

if (rdma_protocol_iwarp(&dev->ibdev, 1)) {
writel(qp->rq.iwarp_db2_data.raw, qp->rq.iwarp_db2);
mmiowb(); /* for second doorbell */
}

mmiowb() is strong enough to order writel, IIRC.

Jason
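
The sequence suggested in the review can be sketched as a small userspace trace, with both doorbell writes relaxed and each followed by mmiowb(). The trace_* helpers, ring_rq_doorbells(), and the two fake registers are hypothetical names introduced only for illustration. The sketch records call order and does not demonstrate that mmiowb() actually orders the writes, which the review itself states only from memory.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace model; it records call order and nothing more. */
static char trace[16];                 /* 'W' = relaxed write, 'M' = mmiowb */
static volatile uint32_t rq_db;        /* stands in for qp->rq.db */
static volatile uint32_t iwarp_db2;    /* stands in for qp->rq.iwarp_db2 */

/* Appends one operation marker to the trace, leaving room for the NUL. */
static void log_op(const char *op)
{
	strncat(trace, op, sizeof(trace) - strlen(trace) - 1);
}

/* Models writel_relaxed(): plain store, logged as 'W'. */
static void trace_writel_relaxed(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
	log_op("W");
}

/* Models mmiowb(): logged as 'M'. */
static void trace_mmiowb(void)
{
	log_op("M");
}

/* Tail of the receive path with both doorbell writes relaxed. */
static const char *ring_rq_doorbells(uint32_t db, uint32_t db2, bool is_iwarp)
{
	trace[0] = '\0';

	trace_writel_relaxed(db, &rq_db);
	trace_mmiowb();                /* make sure write sticks */

	if (is_iwarp) {
		trace_writel_relaxed(db2, &iwarp_db2);
		trace_mmiowb();        /* for second doorbell */
	}
	return trace;
}
```

With is_iwarp set, the recorded sequence is write, mmiowb, write, mmiowb; without it, only the first pair is issued.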