Received: by 2002:ac0:a582:0:0:0:0:0 with SMTP id m2-v6csp3009341imm; Sun, 7 Oct 2018 17:43:09 -0700 (PDT) X-Google-Smtp-Source: ACcGV602byOHTKlEdt9gwQ1T6CVxDG0W6t6Dowm8zkq4Qrgcj/45JW25AfQ9eNsN8ZaMoFsUbVBN X-Received: by 2002:a63:31c8:: with SMTP id x191-v6mr18456979pgx.229.1538959389598; Sun, 07 Oct 2018 17:43:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1538959389; cv=none; d=google.com; s=arc-20160816; b=HuAhQiiSaMwP5enECo6moCOz2HrcJ4Ti9d3ZIiQRDrDVhVuynRFbu0e2/qT0MmciK6 v1lio7hFq4Z8q46z8Ty9naOZRD9mFZ4f+XG3Xf1rZetArDbCJpb/GpvEQZrhahwJjf57 HDfniZZDeT20gFcsS9x4RYaYS7LfOKX1zMxXQjWOi4Az6ZJ1mzSN0gBmQE9Ov9vHG7bd VWFC7wjl3rTiOQkIBRe6vFyNqYNwgBAcjCwa1sFCXuEvZZ4nHMSsE/8ZxLFb4BbDLWBc IRcXQ44QpzddDv/EIGDmGdJuXty2D9kmoLQ73aTutl8bASbbQTZ56uxoQ62s5wp1fJlF IyRQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:user-agent:references :message-id:in-reply-to:subject:cc:to:from:date; bh=/fzLuUN6giab7z8wPOKSUJ3hjnYDJDeLhfFVeV/QfVQ=; b=O8Vx3SeVjJ26xpFwwbCdKL2fLSi5DAm0aW53UKS61QIGgfgN5BoeLK14sbqjhP2du+ yXvTJrh0QEnZ7rUmy+U/h067gbhvIo0rX8eiM4wRGUg//gC77XkG7ZWLlWG6MtSKo2Zt Tda6ZxN2d2UQl56lvfwJafDdcxd4OH2CypdIhAaida+y+KTM70OTQcPFfsqqMM9HgAMA VmSlShxWaaL1/YlsAWvVCGZT1Udy5YiX0yBHfkht6TVBLJDCxKJEfZIpfn3+53yjI0Ve sPdEddm+59jdDok1J8QuKSsYtFCLfs38uGQLAKNOs4Gr5Ogw/BXhh6MGfNrFWmAjfa98 jCuQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id t19-v6si18216532pfm.152.2018.10.07.17.42.54; Sun, 07 Oct 2018 17:43:09 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728228AbeJHHqW (ORCPT + 99 others); Mon, 8 Oct 2018 03:46:22 -0400 Received: from eddie.linux-mips.org ([148.251.95.138]:37360 "EHLO cvs.linux-mips.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725754AbeJHHqW (ORCPT ); Mon, 8 Oct 2018 03:46:22 -0400 Received: (from localhost user: 'macro', uid#1010) by eddie.linux-mips.org with ESMTP id S23994541AbeJHAhQNouZL (ORCPT ); Mon, 8 Oct 2018 02:37:16 +0200 Date: Mon, 8 Oct 2018 01:37:16 +0100 (BST) From: "Maciej W. Rozycki" To: Ralf Baechle , Paul Burton cc: linux-mips@linux-mips.org, linux-kernel@vger.kernel.org Subject: [PATCH 3/4] MIPS: Enforce strong ordering for MMIO accessors In-Reply-To: Message-ID: References: User-Agent: Alpine 2.21 (LFD 202 2017-01-01) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Architecturally the MIPS ISA does not specify ordering requirements for uncached bus accesses such as MMIO operations normally use and therefore explicit barriers have to be inserted between MMIO accesses where unspecified ordering of operations would cause unpredictable results. For example the R2020 write buffer implements write gathering and combining[1] and as used with the DECstation models 2100 and 3100 for MMIO accesses it bypasses the read buffer entirely, because conflicts are resolved by the memory controller for DRAM accesses only[2] (NB the R2020 and R3020 buffers are the same except for the maximum clock rate). Consequently if a device has say a 16-bit control register at offset 0, a 16-bit event mask register at offset 2 and a 16-bit reset register at offset 4, and the initial value of the control register is 0x1111, then in the absence of barriers a hypothetical code sequence like this: u16 init_dev(u16 __iomem *dev); u16 x; write16(dev + 2, 0xffff); write16(dev + 0, 0x2222); x = read16(dev + 0); write16(dev + 1, 0x3333); write16(dev + 0, 0x4444); return x; } will return 0x1111 and issue a single 32-bit write of 0x33334444 (in the little-endian bus configuration) to offset 0 on the system bus. This is because the read to set `x' from offset 0 bypasses the write of 0x2222 that is still in the write buffer pending the completion of the write of 0xffff to the reset register. Then the write of 0x3333 to the event mask register is merged with the preceding write to the control register as they share the same word address, making it a 32-bit write of 0x33332222 to offset 0. Finally the write of 0x4444 to the control register is combined with the outstanding 32-bit write of 0x33332222 to offset 0, because, again, it shares the same address. This is an example from a legacy system, given here because it is well documented and affects a machine we actually support. But likewise modern MIPS systems may implement weak MMIO ordering, possibly even without having it clearly documented except for being compliant with the architecture specification with respect to the currently defined SYNC instruction variants[3]. Considering the above and that we are required to implement MMIO accessors such that individual accesses made with them are strongly ordered with respect to each other[4], add the necessary barriers to our `inX', `outX', `readX' and `writeX' handlers, as well the associated special use variants. It's up to platforms then to possibly define the respective barriers so as to expand to nil if no ordering enforcement is actually needed for a given system; SYNC is supposed to be as cheap as a NOP on strongly ordered MIPS implementations though. Retain the option to generate weakly-ordered accessors, so that the arrangement for `war_io_reorder_wmb' is not lost in case we need it for fully raw accessors in the future. The reason for this is that it is unclear from commit 1e820da3c9af ("MIPS: Loongson-3: Introduce CONFIG_LOONGSON3_ENHANCEMENT") and especially commit 8faca49a6731 ("MIPS: Modify core io.h macros to account for the Octeon Errata Core-301.") why they are needed there under the previous assumption that these accessors can be weakly ordered. References: [1] "LR3020 Write Buffer", LSI Logic Corporation, September 1988, Section "Byte Gathering", pp. 6-7 [2] "DECstation 3100 Desktop Workstation Functional Specification", Digital Equipment Corporation, Revision 1.3, August 28, 1990, Section 6.1 "Processor", p. 4 [3] "MIPS Architecture For Programmers, Volume II-A: The MIPS32 Instruction Set Manual", Imagination Technologies LTD, Document Number: MD00086, Revision 6.06, December 15, 2016, Table 5.5 "Encodings of the Bits[10:6] of the SYNC instruction; the SType Field", p. 409 [4] "LINUX KERNEL MEMORY BARRIERS", Documentation/memory-barriers.txt, Section "KERNEL I/O BARRIER EFFECTS" Signed-off-by: Maciej W. Rozycki References: 8faca49a6731 ("MIPS: Modify core io.h macros to account for the Octeon Errata Core-301.") References: 1e820da3c9af ("MIPS: Loongson-3: Introduce CONFIG_LOONGSON3_ENHANCEMENT") --- arch/mips/include/asm/io.h | 28 ++++++++++++++++++++-------- 1 file changed, 20 insertions(+), 8 deletions(-) linux-mips-readx-writex-barrier.patch Index: linux-20180930-3maxp/arch/mips/include/asm/io.h =================================================================== --- linux-20180930-3maxp.orig/arch/mips/include/asm/io.h +++ linux-20180930-3maxp/arch/mips/include/asm/io.h @@ -337,7 +337,7 @@ static inline void iounmap(const volatil #define war_io_reorder_wmb() barrier() #endif -#define __BUILD_MEMORY_SINGLE(pfx, bwlq, type, irq) \ +#define __BUILD_MEMORY_SINGLE(pfx, bwlq, type, barrier, irq) \ \ static inline void pfx##write##bwlq(type val, \ volatile void __iomem *mem) \ @@ -345,7 +345,10 @@ static inline void pfx##write##bwlq(type volatile type *__mem; \ type __val; \ \ - war_io_reorder_wmb(); \ + if (barrier) \ + iobarrier_rw(); \ + else \ + war_io_reorder_wmb(); \ \ __mem = (void *)__swizzle_addr_##bwlq((unsigned long)(mem)); \ \ @@ -382,6 +385,9 @@ static inline type pfx##read##bwlq(const \ __mem = (void *)__swizzle_addr_##bwlq((unsigned long)(mem)); \ \ + if (barrier) \ + iobarrier_rw(); \ + \ if (sizeof(type) != sizeof(u64) || sizeof(u64) == sizeof(long)) \ __val = *__mem; \ else if (cpu_has_64bits) { \ @@ -409,14 +415,17 @@ static inline type pfx##read##bwlq(const return pfx##ioswab##bwlq(__mem, __val); \ } -#define __BUILD_IOPORT_SINGLE(pfx, bwlq, type, p, slow) \ +#define __BUILD_IOPORT_SINGLE(pfx, bwlq, type, barrier, p, slow) \ \ static inline void pfx##out##bwlq##p(type val, unsigned long port) \ { \ volatile type *__addr; \ type __val; \ \ - war_io_reorder_wmb(); \ + if (barrier) \ + iobarrier_rw(); \ + else \ + war_io_reorder_wmb(); \ \ __addr = (void *)__swizzle_addr_##bwlq(mips_io_port_base + port); \ \ @@ -438,6 +447,9 @@ static inline type pfx##in##bwlq##p(unsi \ BUILD_BUG_ON(sizeof(type) > sizeof(unsigned long)); \ \ + if (barrier) \ + iobarrier_rw(); \ + \ __val = *__addr; \ slow; \ \ @@ -448,7 +460,7 @@ static inline type pfx##in##bwlq##p(unsi #define __BUILD_MEMORY_PFX(bus, bwlq, type) \ \ -__BUILD_MEMORY_SINGLE(bus, bwlq, type, 1) +__BUILD_MEMORY_SINGLE(bus, bwlq, type, 1, 1) #define BUILDIO_MEM(bwlq, type) \ \ @@ -462,8 +474,8 @@ BUILDIO_MEM(l, u32) BUILDIO_MEM(q, u64) #define __BUILD_IOPORT_PFX(bus, bwlq, type) \ - __BUILD_IOPORT_SINGLE(bus, bwlq, type, ,) \ - __BUILD_IOPORT_SINGLE(bus, bwlq, type, _p, SLOW_DOWN_IO) + __BUILD_IOPORT_SINGLE(bus, bwlq, type, 1, ,) \ + __BUILD_IOPORT_SINGLE(bus, bwlq, type, 1, _p, SLOW_DOWN_IO) #define BUILDIO_IOPORT(bwlq, type) \ __BUILD_IOPORT_PFX(, bwlq, type) \ @@ -478,7 +490,7 @@ BUILDIO_IOPORT(q, u64) #define __BUILDIO(bwlq, type) \ \ -__BUILD_MEMORY_SINGLE(____raw_, bwlq, type, 0) +__BUILD_MEMORY_SINGLE(____raw_, bwlq, type, 1, 0) __BUILDIO(q, u64)