From: Sinan Kaya
To: linux-rdma@vger.kernel.org, timur@codeaurora.org, sulrich@codeaurora.org
Cc: linux-arm-msm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    Sinan Kaya, Faisal Latif, Doug Ledford, Jason Gunthorpe,
    linux-kernel@vger.kernel.org
Subject: [PATCH v4 5/6] IB/nes: Eliminate duplicate barriers on weakly-ordered archs
Date: Mon, 19 Mar 2018 22:47:47 -0400
Message-Id: <1521514068-8856-6-git-send-email-okaya@codeaurora.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1521514068-8856-1-git-send-email-okaya@codeaurora.org>
References: <1521514068-8856-1-git-send-email-okaya@codeaurora.org>
X-Mailing-List: linux-kernel@vger.kernel.org

The code calls barrier() and then writel(). On some architectures, such
as arm64, writel() already includes a barrier, so two barriers end up
back to back before the register write.

Create a new wrapper function, nes_write32_relaxed(), around the relaxed
write operator writel_relaxed(). Use the new wrapper wherever a write
follows a barrier().

Signed-off-by: Sinan Kaya
---
 drivers/infiniband/hw/nes/nes.h       |  5 +++++
 drivers/infiniband/hw/nes/nes_hw.c    | 21 ++++++++++++++-------
 drivers/infiniband/hw/nes/nes_mgt.c   | 15 ++++++++++-----
 drivers/infiniband/hw/nes/nes_nic.c   |  2 +-
 drivers/infiniband/hw/nes/nes_utils.c |  3 ++-
 drivers/infiniband/hw/nes/nes_verbs.c |  5 +++--
 6 files changed, 35 insertions(+), 16 deletions(-)

diff --git a/drivers/infiniband/hw/nes/nes.h b/drivers/infiniband/hw/nes/nes.h
index 00c27291..85e007d 100644
--- a/drivers/infiniband/hw/nes/nes.h
+++ b/drivers/infiniband/hw/nes/nes.h
@@ -387,6 +387,11 @@ static inline void nes_write_indexed(struct nes_device *nesdev, u32 reg_index, u
 	spin_unlock_irqrestore(&nesdev->indexed_regs_lock, flags);
 }
 
+static inline void nes_write32_relaxed(void __iomem *addr, u32 val)
+{
+	writel_relaxed(val, addr);
+}
+
 static inline void nes_write32(void __iomem *addr, u32 val)
 {
 	writel(val, addr);
diff --git a/drivers/infiniband/hw/nes/nes_hw.c b/drivers/infiniband/hw/nes/nes_hw.c
index 18a7de1..568e17d 100644
--- a/drivers/infiniband/hw/nes/nes_hw.c
+++ b/drivers/infiniband/hw/nes/nes_hw.c
@@ -1257,7 +1257,8 @@ int nes_destroy_cqp(struct nes_device *nesdev)
 	barrier();
 	/* Ring doorbell (5 WQEs) */
-	nes_write32(nesdev->regs+NES_WQE_ALLOC, 0x05800000 | nesdev->cqp.qp_id);
+	nes_write32_relaxed(nesdev->regs+NES_WQE_ALLOC,
+			0x05800000 | nesdev->cqp.qp_id);
 	spin_unlock_irqrestore(&nesdev->cqp.lock, flags);
@@ -1594,7 +1595,8 @@ static void nes_replenish_nic_rq(struct nes_vnic *nesvnic)
 			atomic_dec(&nesvnic->rx_skbs_needed);
 			barrier();
 			if (++rx_wqes_posted == 255) {
-				nes_write32(nesdev->regs+NES_WQE_ALLOC, (rx_wqes_posted << 24) | nesnic->qp_id);
+				nes_write32_relaxed(nesdev->regs+NES_WQE_ALLOC,
+						(rx_wqes_posted << 24) | nesnic->qp_id);
 				rx_wqes_posted = 0;
 			}
 		} else {
@@ -1612,7 +1614,8 @@ static void nes_replenish_nic_rq(struct nes_vnic *nesvnic)
 	} while (atomic_read(&nesvnic->rx_skbs_needed));
 	barrier();
 	if (rx_wqes_posted)
-		nes_write32(nesdev->regs+NES_WQE_ALLOC, (rx_wqes_posted << 24) | nesnic->qp_id);
+		nes_write32_relaxed(nesdev->regs+NES_WQE_ALLOC,
+				(rx_wqes_posted << 24) | nesnic->qp_id);
 	nesnic->replenishing_rq = 0;
 }
@@ -1795,7 +1798,8 @@ int nes_init_nic_qp(struct nes_device *nesdev, struct net_device *netdev)
 	barrier();
 	/* Ring doorbell (2 WQEs) */
-	nes_write32(nesdev->regs+NES_WQE_ALLOC, 0x02800000 | nesdev->cqp.qp_id);
+	nes_write32_relaxed(nesdev->regs+NES_WQE_ALLOC,
+			0x02800000 | nesdev->cqp.qp_id);
 	spin_unlock_irqrestore(&nesdev->cqp.lock, flags);
 	nes_debug(NES_DBG_INIT, "Waiting for create NIC QP%u to complete.\n",
@@ -1844,7 +1848,8 @@ int nes_init_nic_qp(struct nes_device *nesdev, struct net_device *netdev)
 	do {
 		counter = min(wqe_count, ((u32)255));
 		wqe_count -= counter;
-		nes_write32(nesdev->regs+NES_WQE_ALLOC, (counter << 24) | nesvnic->nic.qp_id);
+		nes_write32_relaxed(nesdev->regs+NES_WQE_ALLOC,
+				(counter << 24) | nesvnic->nic.qp_id);
 	} while (wqe_count);
 	timer_setup(&nesvnic->rq_wqes_timer, nes_rq_wqes_timeout, 0);
 	nes_debug(NES_DBG_INIT, "NAPI support Enabled\n");
@@ -1988,7 +1993,8 @@ void nes_destroy_nic_qp(struct nes_vnic *nesvnic)
 	barrier();
 	/* Ring doorbell (2 WQEs) */
-	nes_write32(nesdev->regs+NES_WQE_ALLOC, 0x02800000 | nesdev->cqp.qp_id);
+	nes_write32_relaxed(nesdev->regs+NES_WQE_ALLOC,
+			0x02800000 | nesdev->cqp.qp_id);
 	spin_unlock_irqrestore(&nesdev->cqp.lock, flags);
 	nes_debug(NES_DBG_SHUTDOWN, "Waiting for CQP, cqp_head=%u, cqp.sq_head=%u,"
@@ -3064,7 +3070,8 @@ static void nes_cqp_ce_handler(struct nes_device *nesdev, struct nes_hw_cq *cq)
 					cqp_request, le32_to_cpu(cqp_wqe->wqe_words[NES_CQP_WQE_OPCODE_IDX])&0x3f, head);
 			/* Ring doorbell (1 WQEs) */
 			barrier();
-			nes_write32(nesdev->regs+NES_WQE_ALLOC, 0x01800000 | nesdev->cqp.qp_id);
+			nes_write32_relaxed(nesdev->regs+NES_WQE_ALLOC,
+					0x01800000 | nesdev->cqp.qp_id);
 		}
 		spin_unlock_irqrestore(&nesdev->cqp.lock, flags);
diff --git a/drivers/infiniband/hw/nes/nes_mgt.c b/drivers/infiniband/hw/nes/nes_mgt.c
index 21e0ebd..5c5073c 100644
--- a/drivers/infiniband/hw/nes/nes_mgt.c
+++ b/drivers/infiniband/hw/nes/nes_mgt.c
@@ -96,7 +96,8 @@ static void nes_replenish_mgt_rq(struct nes_vnic_mgt *mgtvnic)
 			atomic_dec(&mgtvnic->rx_skbs_needed);
 			barrier();
 			if (++rx_wqes_posted == 255) {
-				nes_write32(nesdev->regs + NES_WQE_ALLOC, (rx_wqes_posted << 24) | nesmgt->qp_id);
+				nes_write32_relaxed(nesdev->regs + NES_WQE_ALLOC,
+						(rx_wqes_posted << 24) | nesmgt->qp_id);
 				rx_wqes_posted = 0;
 			}
 		} else {
@@ -115,7 +116,8 @@ static void nes_replenish_mgt_rq(struct nes_vnic_mgt *mgtvnic)
 	} while (atomic_read(&mgtvnic->rx_skbs_needed));
 	barrier();
 	if (rx_wqes_posted)
-		nes_write32(nesdev->regs + NES_WQE_ALLOC, (rx_wqes_posted << 24) | nesmgt->qp_id);
+		nes_write32_relaxed(nesdev->regs + NES_WQE_ALLOC,
+				(rx_wqes_posted << 24) | nesmgt->qp_id);
 	nesmgt->replenishing_rq = 0;
 }
@@ -995,7 +997,8 @@ int nes_init_mgt_qp(struct nes_device *nesdev, struct net_device *netdev, struct
 	barrier();
 	/* Ring doorbell (2 WQEs) */
-	nes_write32(nesdev->regs + NES_WQE_ALLOC, 0x02800000 | nesdev->cqp.qp_id);
+	nes_write32_relaxed(nesdev->regs + NES_WQE_ALLOC,
+			0x02800000 | nesdev->cqp.qp_id);
 	spin_unlock_irqrestore(&nesdev->cqp.lock, flags);
 	nes_debug(NES_DBG_INIT, "Waiting for create MGT QP%u to complete.\n",
@@ -1050,7 +1053,8 @@ int nes_init_mgt_qp(struct nes_device *nesdev, struct net_device *netdev, struct
 	do {
 		counter = min(wqe_count, ((u32)255));
 		wqe_count -= counter;
-		nes_write32(nesdev->regs + NES_WQE_ALLOC, (counter << 24) | mgtvnic->mgt.qp_id);
+		nes_write32_relaxed(nesdev->regs + NES_WQE_ALLOC,
+				(counter << 24) | mgtvnic->mgt.qp_id);
 	} while (wqe_count);
 	nes_write32(nesdev->regs + NES_CQE_ALLOC, NES_CQE_ALLOC_NOTIFY_NEXT |
@@ -1124,7 +1128,8 @@ void nes_destroy_mgt(struct nes_vnic *nesvnic)
 	barrier();
 	/* Ring doorbell (2 WQEs) */
-	nes_write32(nesdev->regs + NES_WQE_ALLOC, 0x02800000 | nesdev->cqp.qp_id);
+	nes_write32_relaxed(nesdev->regs + NES_WQE_ALLOC,
+			0x02800000 | nesdev->cqp.qp_id);
 	spin_unlock_irqrestore(&nesdev->cqp.lock, flags);
 	nes_debug(NES_DBG_SHUTDOWN, "Waiting for CQP, cqp_head=%u, cqp.sq_head=%u,"
diff --git a/drivers/infiniband/hw/nes/nes_nic.c b/drivers/infiniband/hw/nes/nes_nic.c
index 0a75164..0653da0 100644
--- a/drivers/infiniband/hw/nes/nes_nic.c
+++ b/drivers/infiniband/hw/nes/nes_nic.c
@@ -683,7 +683,7 @@ static int nes_netdev_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 	barrier();
 	if (wqe_count)
-		nes_write32(nesdev->regs+NES_WQE_ALLOC,
+		nes_write32_relaxed(nesdev->regs+NES_WQE_ALLOC,
 				(wqe_count << 24) | (1 << 23) | nesvnic->nic.qp_id);
 	netif_trans_update(netdev);
diff --git a/drivers/infiniband/hw/nes/nes_utils.c b/drivers/infiniband/hw/nes/nes_utils.c
index 21b4a83..79a3d98 100644
--- a/drivers/infiniband/hw/nes/nes_utils.c
+++ b/drivers/infiniband/hw/nes/nes_utils.c
@@ -661,7 +661,8 @@ void nes_post_cqp_request(struct nes_device *nesdev,
 		barrier();
 		/* Ring doorbell (1 WQEs) */
-		nes_write32(nesdev->regs+NES_WQE_ALLOC, 0x01800000 | nesdev->cqp.qp_id);
+		nes_write32_relaxed(nesdev->regs+NES_WQE_ALLOC,
+				0x01800000 | nesdev->cqp.qp_id);
 		barrier();
 	} else {
diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
index 162475a..87f8635 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.c
+++ b/drivers/infiniband/hw/nes/nes_verbs.c
@@ -3310,7 +3310,7 @@ static int nes_post_send(struct ib_qp *ibqp, struct ib_send_wr *ib_wr,
 	while (wqe_count) {
 		counter = min(wqe_count, ((u32)255));
 		wqe_count -= counter;
-		nes_write32(nesdev->regs + NES_WQE_ALLOC,
+		nes_write32_relaxed(nesdev->regs + NES_WQE_ALLOC,
 				(counter << 24) | 0x00800000 | nesqp->hwqp.qp_id);
 	}
@@ -3404,7 +3404,8 @@ static int nes_post_recv(struct ib_qp *ibqp, struct ib_recv_wr *ib_wr,
 	while (wqe_count) {
 		counter = min(wqe_count, ((u32)255));
 		wqe_count -= counter;
-		nes_write32(nesdev->regs+NES_WQE_ALLOC, (counter<<24) | nesqp->hwqp.qp_id);
+		nes_write32_relaxed(nesdev->regs+NES_WQE_ALLOC,
+				(counter<<24) | nesqp->hwqp.qp_id);
 	}
 	spin_unlock_irqrestore(&nesqp->lock, flags);
-- 
2.7.4
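
For readers without the driver source at hand, the before/after pattern this patch touches can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions, not driver code: mock_writel(), mock_writel_relaxed(), hw_barriers, fake_wqe_alloc and the two ring_doorbell helpers are hypothetical names invented here. The mock models only the barrier that arm64 issues inside writel(), and the doorbell value follows the (count << 24) | qp_id form used in the hunks above.

```c
#include <stdint.h>

/* Compiler-only barrier: keeps the compiler from reordering memory
 * accesses across it, but emits no CPU instruction. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical userspace stand-ins for the kernel MMIO accessors. */
static unsigned int hw_barriers;          /* simulated hardware barriers issued */
static volatile uint32_t fake_wqe_alloc;  /* stands in for regs + NES_WQE_ALLOC */

static void mock_writel_relaxed(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;    /* store only, no ordering instruction */
}

static void mock_writel(uint32_t val, volatile uint32_t *addr)
{
	hw_barriers++;  /* models the write barrier arm64 issues inside writel() */
	mock_writel_relaxed(val, addr);
}

/* Before the patch: barrier() followed by the fully ordered write. */
static void ring_doorbell(uint32_t wqe_count, uint32_t qp_id)
{
	barrier();
	mock_writel((wqe_count << 24) | qp_id, &fake_wqe_alloc);
}

/* After the patch: barrier() followed by the relaxed write. */
static void ring_doorbell_relaxed(uint32_t wqe_count, uint32_t qp_id)
{
	barrier();
	mock_writel_relaxed((wqe_count << 24) | qp_id, &fake_wqe_alloc);
}
```

In this sketch each ring_doorbell() call bumps hw_barriers by one, while ring_doorbell_relaxed() leaves it unchanged; both store the same doorbell value.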