Received: by 10.213.65.68 with SMTP id h4csp401813imn; Sat, 17 Mar 2018 08:07:17 -0700 (PDT) X-Google-Smtp-Source: AG47ELvu9+LU4KG/TtjGp8G50EWPX6SMS8fHT0H8witgXhsJ2bLkvQBqQCbOj0idl4liTUMUCsUG X-Received: by 2002:a17:902:d03:: with SMTP id 3-v6mr6028317plu.245.1521299236967; Sat, 17 Mar 2018 08:07:16 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1521299236; cv=none; d=google.com; s=arc-20160816; b=bdZKn97kOhhK/WIDiBeNOmLGby+JH1+uH2lt0PP3nICjBV1TcAyFkCvkVNUHpaUzMr Y5S4cm19CBJuG/MSejZfFlPSHJzu93pQO0vejKnX91YezUcYDOeFUUq3RYtJjLtY5fan iVI5iW4ZzOVKk8iyIZOg5zZCot15HyTc+3yjWPn2Kq2Knx/1d4KlSzn28YG0nFFh/2At 7C0VG8s1RG1oDCLNC2zYoXxdSOyHvSsW/RUtXBg/6mW5rb7CJUT/RGlNxhQ6T3fPdtFP lfasHhz/6QXqkQFwwek4Bun7keqHpWi45kfhNPWdRnXaAhAmuzWUo+Srv4aaRlnaAKjP 0gsw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature:arc-authentication-results; bh=BAl1+BppN4QFGv0egYzXnUDXG/bsDMjg00sRRb/0lXI=; b=edGpIsXw1w2Vqd6dO/8sNV53v5JPwa50YUjufMDMGkSmYkjnoXMYLKwH1u7gfcIzwS o9Tu76UAoLBWTz85dTv3/oywXca3Pr9ppDKonbkUCCwvdx6H41/6UUSGzr3lJQv3N2+y Y+YN2MQPhFOi8MSMgLFdK4tVFnul+i9xCm9xjPhjB9gsbM88LnsO+8vz+Z0Wt98meiW/ lopxQkxprJINnhpBIAPiEkCfkKkGB+FU0GM3Bvy9WYQr/n2nEWUuAnyF+PR+caN3DLu5 FGkZ8x3Hr/gyoV9gEB3L6eh9gBSeG9M1Q19VeqEDGhZnxa5h2yBaKuL1+lyB7gXdpsPy WZcg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ziepe.ca header.s=google header.b=TT6hoKSe; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id x3-v6si9114022plw.269.2018.03.17.08.06.51; Sat, 17 Mar 2018 08:07:16 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@ziepe.ca header.s=google header.b=TT6hoKSe; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752836AbeCQPF0 (ORCPT + 99 others); Sat, 17 Mar 2018 11:05:26 -0400 Received: from mail-it0-f66.google.com ([209.85.214.66]:36709 "EHLO mail-it0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752625AbeCQPFX (ORCPT ); Sat, 17 Mar 2018 11:05:23 -0400 Received: by mail-it0-f66.google.com with SMTP id u5-v6so5435140itc.1 for ; Sat, 17 Mar 2018 08:05:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ziepe.ca; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=BAl1+BppN4QFGv0egYzXnUDXG/bsDMjg00sRRb/0lXI=; b=TT6hoKSe6vVShbHSgANJYQHgbuzGf/w/ik0+nLKnfbP6lusLFcFU6KrSgAiKd3mJO2 xkC95eW8iG6S1GfrGK9KzsnUmrINXEOGTeNANGvtbFj84WW6uRS0MXCm6Y4lkODRgqQP PN2t6iH1q/ko6cojhtvUV8NVQGhXaSpT3LstuVSbOh2JZ5BTtT+KDMrDKpctovj6co25 tNVUuP1e5bRiFhl2zlt8yL6R8NT7tW96v/HHxdUeOFkkfo94s8fJVqV6DBa/fv1wfW9L 2TVLdtObqudqq39EDTghD8thr/euOAQ6dypbC5NUvFKP2tPirUEZnObsYhTYPVic5JSe Nmpg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=BAl1+BppN4QFGv0egYzXnUDXG/bsDMjg00sRRb/0lXI=; b=aV7PJRAjU8feXzUPPXZQJML590kbqpqHDy9wsoKN6YtvUYuGetrPWyNmntpSaPM3jQ 14+zg9E3Oa3GccDPaYDroiCFMngBFGdkNQEkv2x5xD/zC66RSsZHiW4T1OACdq3d6ri9 h9YBXN0weLEBP5RLP6aNvCkbtsB7nvj2X3Yi+ZifKTDBRYatHxNebU+0Pv/abgKq+3q3 pLz92lYkzNNpptd8O9tCxp1tDjgC2b9bSReDrU8zWjeQwstr3xGbQx72VUCH1lwEF1LP F4y0PVujZoiq6Mlos+WuT9eWJvOYqkF3A6suhFEgSJcc/pMUcYvXukyhZs/eIr0kzw5V Spow== X-Gm-Message-State: AElRT7G3JckhUbSTUe60PqY6o0kx3tGasuVCyGUAkI0AgEitduP0syn5 n6Nrk1KGw/2xPkblkQcRE4Zbig== X-Received: by 2002:a24:8602:: with SMTP id u2-v6mr5844081itd.99.1521299123041; Sat, 17 Mar 2018 08:05:23 -0700 (PDT) Received: from ziepe.ca (S010614cc2056d97f.ed.shawcable.net. [174.3.196.123]) by smtp.gmail.com with ESMTPSA id 79-v6sm4279680itm.11.2018.03.17.08.05.21 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 17 Mar 2018 08:05:22 -0700 (PDT) Received: from jgg by mlx.ziepe.ca with local (Exim 4.86_2) (envelope-from ) id 1exDOa-0006Ew-Bk; Sat, 17 Mar 2018 09:05:20 -0600 Date: Sat, 17 Mar 2018 09:05:20 -0600 From: Jason Gunthorpe To: Sinan Kaya Cc: Steve Wise , netdev@vger.kernel.org, timur@codeaurora.org, sulrich@codeaurora.org, linux-arm-msm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, 'Steve Wise' , 'Doug Ledford' , linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, 'Michael Werner' , 'Casey Leedom' Subject: Re: [PATCH v3 18/18] infiniband: cxgb4: Eliminate duplicate barriers on weakly-ordered archs Message-ID: <20180317150520.GA23463@ziepe.ca> References: <1521216991-28706-1-git-send-email-okaya@codeaurora.org> <1521216991-28706-19-git-send-email-okaya@codeaurora.org> <003601d3bd6a$783d6970$68b83c50$@opengridcomputing.com> <83387f6e-adcb-14e9-2c22-96abf9493cc6@codeaurora.org> <004501d3bd7b$505e70f0$f11b52d0$@opengridcomputing.com> <740c7d45-450e-c9b3-ceed-7bc7fcefbc5a@codeaurora.org> <71e37a55-537b-d75a-cfde-f188b7cfce8e@codeaurora.org> <1f5e3b14-05a1-08d0-c0cb-00805526448d@codeaurora.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1f5e3b14-05a1-08d0-c0cb-00805526448d@codeaurora.org> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, Mar 17, 2018 at 12:25:14AM -0400, Sinan Kaya wrote: > On 3/17/2018 12:03 AM, Sinan Kaya wrote: > > On 3/16/2018 11:40 PM, Sinan Kaya wrote: > >> I'll change writel_relaxed() with __raw_writel() in the series like you suggested > >> and also look at your other comments. > > > > I spoke too soon. > > > > Now that I realized, code needs to follow one of the following patterns for correctness > > > > 1) > > wmb() > > writel()/writel_relaxed() > > > > or > > > > 2) > > wmb() > > __raw_wrltel() > > mmiowb() > > > > but definitely not > > > > wmb() > > __raw_wrltel() > > > > Since #1 == #2, I'll stick to my current implementation of writel_relaxed() > > > > Changing writel() to writel_relaxed() or __raw_writel() isn't enough. PowerPC needs mmiowb() > > for correctness. ARM's mmiowb() implementation is empty. > > > > So, there is no one size fits all solution with the current state of affairs. > > > > > > I think I finally got what you mean. > > Code seems to have > > wmb() > writel()/writeq() > wmb() > > this can be safely replaced with > > wmb() > __raw_writel()/__raw_writeq() > wmb() > > This will work on all arches. Below is the new version. Let me know if this is OK. > > +++ b/drivers/infiniband/hw/cxgb4/t4.h > @@ -457,7 +457,7 @@ static inline void pio_copy(u64 __iomem *dst, u64 *src) > int count = 8; > > while (count) { > - writeq(*src, dst); > + __raw_writeq(*src, dst); > src++; > dst++; > count--; > @@ -477,15 +477,16 @@ static inline void t4_ring_sq_db(struct t4_wq *wq, u16 inc, union t4_wr *wqe) > (u64 *)wqe); > } else { > pr_debug("DB wq->sq.pidx = %d\n", wq->sq.pidx); > - writel(PIDX_T5_V(inc) | QID_V(wq->sq.bar2_qid), > - wq->sq.bar2_va + SGE_UDB_KDOORBELL); > + __raw_writel(PIDX_T5_V(inc) | QID_V(wq->sq.bar2_qid), > + wq->sq.bar2_va + SGE_UDB_KDOORBELL); > } > > /* Flush user doorbell area writes. */ > wmb(); > return; > } > - writel(QID_V(wq->sq.qid) | PIDX_V(inc), wq->db); > + __raw_writel(QID_V(wq->sq.qid) | PIDX_V(inc), wq->db); > + mmiowmb() > } > > static inline void t4_ring_rq_db(struct t4_wq *wq, u16 inc, > @@ -502,15 +503,16 @@ static inline void t4_ring_rq_db(struct t4_wq *wq, u16 inc, > (void *)wqe); > } else { > pr_debug("DB wq->rq.pidx = %d\n", wq->rq.pidx); > - writel(PIDX_T5_V(inc) | QID_V(wq->rq.bar2_qid), > - wq->rq.bar2_va + SGE_UDB_KDOORBELL); > + __raw_writel(PIDX_T5_V(inc) | QID_V(wq->rq.bar2_qid), > + wq->rq.bar2_va + SGE_UDB_KDOORBELL); > } > > /* Flush user doorbell area writes. */ > wmb(); > return; > } > - writel(QID_V(wq->rq.qid) | PIDX_V(inc), wq->db); > + __raw_writel(QID_V(wq->rq.qid) | PIDX_V(inc), wq->db); > + mmiowmb(); No! NAK! Adding *new* barriers to use the *wrong* accessor is crazy! Your first patch was right, replacing wmb() writel() With wmb(); writel_relaxed() is fine, and helps ARM. The problem with PPC has nothing to do with the writel, it is with the spinlock, and we can't improve it without adding some new infrastructure. Certainly don't hack around the problem by using __raw_writel in multi-arch drivers! In userspace we settled on something like this as a pattern: mmio_wc_spin_lock() writel_wc_mmio() mmio_wc_spin_unlock() Where mmio_wc_spin_unlock incorporates the mmiowmb and is defined to fully contain all MMIO access to WC and UC memory. Due to our userspace the pattern is specifically used with MMIO writes to WC BAR memory, but it could be generalized.. This allows all known arches to use the best code in all cases. x86 can even optimize a lot and combine the mmiowmb() into the spin_unlock, apparently. Jason