Date: Mon, 18 Feb 2019 16:50:07 +0000
From: Will Deacon
To: Linus Torvalds
Cc: linux-arch, Linux List Kernel Mailing, "Paul E. McKenney",
    Benjamin Herrenschmidt, Arnd Bergmann, Peter Zijlstra, Andrea Parri,
    Daniel Lustig, David Howells, Alan Stern, Tony Luck
Subject: Re: [RFC PATCH] docs/memory-barriers.txt: Rewrite "KERNEL I/O BARRIER EFFECTS" section
Message-ID: <20190218165007.GC16713@fuggles.cambridge.arm.com>
References: <20190211172948.3322-1-will.deacon@arm.com> <20190213172047.GH6346@brain-police>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Feb 13, 2019 at 10:27:09AM -0800, Linus Torvalds wrote:
> On Wed, Feb 13, 2019 at 9:20 AM Will Deacon wrote:
> > On Mon, Feb 11, 2019 at 02:34:31PM -0800, Linus Torvalds wrote:
> > > IOW, we should seriously just consider making the rule be that locking
> > > will order mmio too. Because that's practically the rule anyway.
> >
> > I would /love/ to get rid of mmiowb() because I think it's both extremely
> > difficult to use and also pretty much never needed. It reminds me a lot of
> > smp_read_barrier_depends(), which we finally pushed into READ_ONCE for
> > Alpha.
>
> Right. Make as much of this implicit as we can.
>
> At least as long as it's _reasonably_ cheap on all architectures that matter.
>
> How expensive would it be on ARM? Does the normal acquire/release
> already mean the IO in between is serialized?

Yeah, that should work, but actually I'm wondering whether that means we
can relax our mandatory barriers as well now that we have a multi-copy
atomic memory model (i.e. if acquire/release works for locks between CPUs,
then it should work for DMA buffers between a CPU and a device). I'll chase
this up with the architects...

Either way, mmiowb() is an empty macro for us.

> > > Powerpc already does it. IO within a locked region will serialize with the
> > > lock.
> >
> > I thought ia64 was the hold out here? Did they actually have machines that
> > needed this in practice?
>
> Note that even if mmiowb() is expensive (and I don't think that's
> actually even the case on ia64), you can - and probably should - do
> what PowerPC does.
>
> Doing an IO barrier on PowerPC is insanely expensive, but they solve
> that simply track the whole "have I done any IO" manually. It's not
> even that expensive, it just uses a percpu flag.
>
> (Admittedly, PowerPC makes it less obvious that it's a percpu variable
> because it's actually in the special "paca" region that is like a
> hyper-local percpu area).
>
> > If so, I think we can either:
> >
> > (a) Add an mmiowb() to their spin_unlock() code, or
> > (b) Remove ia64 altogether if nobody complains
> >
> > I know that Peter has been in favour of (b) for a while...
>
> I don't think we're quite ready for (b), but see above: I don't think
> adding mmiowb() to the ia64 spin unlock code is even all that
> expensive.

Well, I figured it was worth asking the question.

> Yeah, yeah, there's the SGI "SN" platform that apparently has a bug,
> so because of that platform problem maybe it needs the more complex
> "use a flag" model. But even the complex model isn't _hugely_
> complex.
>
> But we *could* first just do the mmiowb() unconditionally in the ia64
> unlocking code, and then see if anybody notices?

I'll hack this up as a starting point. We can always try to be clever later
on if it's deemed necessary.

Will
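
For reference, the "use a flag" model mentioned above could look roughly
like the userspace sketch below. It is only an illustration under stated
assumptions: struct cpu_state, this_cpu and the sim_*() helpers are all
hypothetical names invented here, not kernel APIs, and a single global
stands in for the percpu (or paca) area. The point is just that the
expensive barrier is issued at unlock time only if an MMIO write actually
happened while the lock was held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a percpu area; not a real kernel structure. */
struct cpu_state {
    bool io_pending;      /* set when a simulated MMIO write has occurred */
    int  barriers_issued; /* counts simulated mmiowb() invocations */
};

static struct cpu_state this_cpu;

/* Simulated barrier: a real one would order MMIO writes against unlock. */
static void sim_mmiowb(void)
{
    this_cpu.barriers_issued++;
}

/* Simulated MMIO write: perform the store and remember that IO happened. */
static void sim_writel(unsigned int val, volatile unsigned int *reg)
{
    *reg = val;
    this_cpu.io_pending = true;
}

/* Simulated lock: start the critical section with a clean flag. */
static void sim_spin_lock(int *lock)
{
    assert(*lock == 0);
    *lock = 1;
    this_cpu.io_pending = false;
}

/* Simulated unlock: pay for the barrier only if IO was done under the lock. */
static void sim_spin_unlock(int *lock)
{
    assert(*lock == 1);
    if (this_cpu.io_pending) {
        sim_mmiowb();
        this_cpu.io_pending = false;
    }
    *lock = 0;
}
```

Doing the mmiowb() unconditionally in the unlock path would amount to
dropping the io_pending test in sim_spin_unlock(), which is simpler but
pays for the barrier on every unlock.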