2018-03-23 22:25:04

by Sinan Kaya

[permalink] [raw]
Subject: [PATCH v6 6/6] bnxt_en: Eliminate duplicate barriers on weakly-ordered archs

Code includes wmb() followed by writel(). writel() already has a barrier on
some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Create a new wrapper function with relaxed write operator. Use the new
wrapper when a write is following a wmb().

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Also add mmiowb() so that write code doesn't move outside of scope.

Signed-off-by: Sinan Kaya <[email protected]>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 ++-
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 9 +++++++++
2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 1500243..fc8ea0d 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1922,7 +1922,8 @@ static int bnxt_poll_work(struct bnxt *bp, struct bnxt_napi *bnapi, int budget)
/* Sync BD data before updating doorbell */
wmb();

- bnxt_db_write(bp, db, DB_KEY_TX | prod);
+ bnxt_db_write_relaxed(bp, db, DB_KEY_TX | prod);
+ mmiowb();
}

cpr->cp_raw_cons = raw_cons;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 1989c47..5e453b9 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1401,6 +1401,15 @@ static inline u32 bnxt_tx_avail(struct bnxt *bp, struct bnxt_tx_ring_info *txr)
((txr->tx_prod - txr->tx_cons) & bp->tx_ring_mask);
}

+/* For TX and RX ring doorbells with no ordering guarantee*/
+static inline void bnxt_db_write_relaxed(struct bnxt *bp, void __iomem *db,
+ u32 val)
+{
+ writel_relaxed(val, db);
+ if (bp->flags & BNXT_FLAG_DOUBLE_DB)
+ writel_relaxed(val, db);
+}
+
/* For TX and RX ring doorbells */
static inline void bnxt_db_write(struct bnxt *bp, void __iomem *db, u32 val)
{
--
2.7.4



2018-03-23 22:37:14

by Michael Chan

[permalink] [raw]
Subject: Re: [PATCH v6 6/6] bnxt_en: Eliminate duplicate barriers on weakly-ordered archs

On Fri, Mar 23, 2018 at 3:23 PM, Sinan Kaya <[email protected]> wrote:
> Code includes wmb() followed by writel(). writel() already has a barrier on
> some architectures like arm64.
>
> This ends up CPU observing two barriers back to back before executing the
> register write.
>
> Create a new wrapper function with relaxed write operator. Use the new
> wrapper when a write is following a wmb().
>
> Since code already has an explicit barrier call, changing writel() to
> writel_relaxed().
>
> Also add mmiowb() so that write code doesn't move outside of scope.
>
> Signed-off-by: Sinan Kaya <[email protected]>
> ---
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 ++-
> drivers/net/ethernet/broadcom/bnxt/bnxt.h | 9 +++++++++
> 2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 1500243..fc8ea0d 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -1922,7 +1922,8 @@ static int bnxt_poll_work(struct bnxt *bp, struct bnxt_napi *bnapi, int budget)
> /* Sync BD data before updating doorbell */
> wmb();
>
> - bnxt_db_write(bp, db, DB_KEY_TX | prod);
> + bnxt_db_write_relaxed(bp, db, DB_KEY_TX | prod);
> + mmiowb();

Sorry for the late review. mmiowb() is not required here because we
are in NAPI context, so only one CPU will be updating this doorbell.

Other than that, it looks good.

> }
>
> cpr->cp_raw_cons = raw_cons;
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> index 1989c47..5e453b9 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> @@ -1401,6 +1401,15 @@ static inline u32 bnxt_tx_avail(struct bnxt *bp, struct bnxt_tx_ring_info *txr)
> ((txr->tx_prod - txr->tx_cons) & bp->tx_ring_mask);
> }
>
> +/* For TX and RX ring doorbells with no ordering guarantee*/
> +static inline void bnxt_db_write_relaxed(struct bnxt *bp, void __iomem *db,
> + u32 val)
> +{
> + writel_relaxed(val, db);
> + if (bp->flags & BNXT_FLAG_DOUBLE_DB)
> + writel_relaxed(val, db);
> +}
> +
> /* For TX and RX ring doorbells */
> static inline void bnxt_db_write(struct bnxt *bp, void __iomem *db, u32 val)
> {
> --
> 2.7.4
>

2018-03-24 04:04:03

by Sinan Kaya

[permalink] [raw]
Subject: Re: [PATCH v6 6/6] bnxt_en: Eliminate duplicate barriers on weakly-ordered archs

On 3/23/2018 6:36 PM, Michael Chan wrote:
>> + mmiowb();
> Sorry for the late review. mmiowb() is not required here because we
> are in NAPI context, so only one CPU will be updating this doorbell.
>
> Other than that, it looks good.
>

OK, I'll fix this on the next version.

--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.