Received: by 2002:ac0:a591:0:0:0:0:0 with SMTP id m17-v6csp753683imm; Thu, 5 Jul 2018 08:19:27 -0700 (PDT) X-Google-Smtp-Source: AAOMgpcHysTIjEBb3AFHK9fzbW5r/WYAjzQjCuzu2t1RgtSR2gmqaWqJku7bBFbvTjQcGMLeoduk X-Received: by 2002:a65:57c9:: with SMTP id q9-v6mr6165521pgr.128.1530803967121; Thu, 05 Jul 2018 08:19:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1530803967; cv=none; d=google.com; s=arc-20160816; b=ivAPyBVCjDGWHvjhsb07lkTYE/RlSimWnbqZTcm5TghsSV5f6+Cuy2DMSwmV6cIPhc xmQGYWdz4+ZZ7zZWvJFH6aaBclV6ZwYjqK9WsQSy0FHEsNFlAgbaSZSpQY6e3dGaQYxY wUcU8ITu0CltA+gi97DprRHrz4tAnlxp3fiq7TxFtqF9infYXtTjo7+0wsbR8zCgcv8a mk7W3vUjIfIQ3FxB/0hT/OYBPecBMoJwxhHgaKYCT8Qipjwv+Ki5KswINOATQml/O3L7 mNgDT2HtqC9Ma+QUv5M+MXbeWZwekAfQ5h1feXQueEV4EwY5RUIL4FGzwKz9BwL3M0Pg Le8g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:arc-authentication-results; bh=OnXsSOnDAH6tO14HchbSTTN/IqiREEOaoVINz2RzeNA=; b=NpoBmf0XWAWwQh01Js8u8T4/uNl22CcWtJmOqLqycG0QnWhDi3AuUqtcLQ0QgcrJb9 tKUoydTXnGtTfka8jULsetrTExZlp5unnPRk9Yo9U1kx9Ph4e4b7q4/Op1FAgK+9mDVD Pq90bpCVmM84LQWLWJivWqJQwBz2TiVeTQwU7PxEU5Ip/pTxuh1jvRa6fD1tVgEKxv7B kpp+lX35MQWJJcwoJEpbgjgIBL5UyY7ZQDZxD+EavjplaIy1KOV/v2XaEJij1drQb3bO VtTAgK1IxcbpzRROHu72wPSBt1psMm7hkxn6QEbnfsKbFD3i6vXoo9rhZ57gZVXDw1dd mI1w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=nvidia.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id h3-v6si6339151pgc.122.2018.07.05.08.19.12; Thu, 05 Jul 2018 08:19:27 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=nvidia.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754187AbeGEPRC (ORCPT + 99 others); Thu, 5 Jul 2018 11:17:02 -0400 Received: from hqemgate15.nvidia.com ([216.228.121.64]:15316 "EHLO hqemgate15.nvidia.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753926AbeGEPRA (ORCPT ); Thu, 5 Jul 2018 11:17:00 -0400 Received: from hqpgpgate102.nvidia.com (Not Verified[216.228.121.13]) by hqemgate15.nvidia.com (using TLS: TLSv1, AES128-SHA) id ; Thu, 05 Jul 2018 08:16:22 -0700 Received: from HQMAIL105.nvidia.com ([172.20.161.6]) by hqpgpgate102.nvidia.com (PGP Universal service); Thu, 05 Jul 2018 08:16:58 -0700 X-PGP-Universal: processed; by hqpgpgate102.nvidia.com on Thu, 05 Jul 2018 08:16:58 -0700 Received: from [10.2.171.197] (10.2.171.197) by HQMAIL105.nvidia.com (172.20.187.12) with Microsoft SMTP Server (TLS) id 15.0.1347.2; Thu, 5 Jul 2018 15:16:59 +0000 Subject: Re: [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks To: Will Deacon , Andrea Parri CC: Alan Stern , LKMM Maintainers -- Akira Yokosawa , Boqun Feng , David Howells , Jade Alglave , Luc Maranget , Nicholas Piggin , "Paul E. McKenney" , Peter Zijlstra , Kernel development list References: <20180625081920.GA5619@andrea> <20180704121103.GB26941@arm.com> <20180705140029.GA5346@andrea> <20180705144424.GE14470@arm.com> From: Daniel Lustig Message-ID: Date: Thu, 5 Jul 2018 08:16:58 -0700 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0 MIME-Version: 1.0 In-Reply-To: <20180705144424.GE14470@arm.com> X-Originating-IP: [10.2.171.197] X-ClientProxiedBy: HQMAIL105.nvidia.com (172.20.187.12) To HQMAIL105.nvidia.com (172.20.187.12) Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 7/5/2018 7:44 AM, Will Deacon wrote: > Andrea, > > On Thu, Jul 05, 2018 at 04:00:29PM +0200, Andrea Parri wrote: >> On Wed, Jul 04, 2018 at 01:11:04PM +0100, Will Deacon wrote: >>> On Tue, Jul 03, 2018 at 01:28:17PM -0400, Alan Stern wrote: >>>> There's also read-write ordering, in the form of the LB pattern: >>>> >>>> P0(int *x, int *y, int *z) >>>> { >>>> r0 = READ_ONCE(*x); >>>> smp_store_release(z, 1); >>>> r1 = smp_load_acquire(z); >>>> WRITE_ONCE(*y, 1); >>>> } >>>> >>>> P1(int *x, int *y) >>>> { >>>> r2 = READ_ONCE(*y); >>>> smp_mp(); >>>> WRITE_ONCE(*x, 1); >>>> } >>>> >>>> exists (0:r0=1 /\ 1:r2=1) >>> >>> The access types are irrelevant to the acquire/release primitives, so yes >>> that's also allowed. >>> >>>> Would this be allowed if smp_load_acquire() was implemented with LDAPR? >>>> If the answer is yes then we will have to remove the rfi-rel-acq and >>>> rel-rf-acq-po relations from the memory model entirely. >>> >>> I don't understand what you mean by "rfi-rel-acq-po", and I assume you mean >>> rel-rfi-acq-po for the other? Sounds like I'm confused here. >> >> [Your reply about 1/2 suggests that you've figured this out now, IAC ...] > > Yeah, the naming threw me because it's not the same order as the composition > in the actual definition (why not?). I typoed an extra 'po' suffix. Well > done for spotting it. > >> "rfi-rel-acq" (as Alan wrote, and as I wrote before my question above...) >> is defined and currently used in linux-kernel.cat (and, FWIW, it has been >> so since when we upstreamed LKMM). >> >> My point is that, as exemplified by the two tests I reported above, this >> relation already prevents you from implementing your acquire with LDAPR; >> so my/our question was not "can you run herd7 for me?" but rather "do you >> think that changes are needed to the .cat file?". > > I mean, you can run herd on the armv8 memory model and see the answer to the > litmus test. So we have two options for the LKMM .cat file (the Arm one is > baked in silicon): > > 1. Say that architectures with RCpc acquire/release instructions must > instead use RCsc instructions (if they have them) or full fences > > 2. Change the LKMM .cat file to allow RCpc acquire/release (note that I'd > still like RCsc lock/unlock semantics, and I think that's actually the > real case that matters here, but the currently difficulty in > distinguishing the two in the .cat file has led to this broader > ordering being enforced as a side-effect) > > I prefer (2), because I think it's a safe and sensible relaxation to make > and accomodates what I consider to be a likely extension to weakly ordered > architectures that we might want to support efficiently. So yes, I think > changes are needed to the LKMM .cat file, but please don't view that as a > criticism. We change stuff all the time as long as it doesn't break userspace. > >> This question goes back _at least_ to: >> >> http://lkml.kernel.org/r/1519301990-11766-1-git-send-email-parri.andrea@gmail.com >> >> (see, in part., the "IMPORTANT" note at the bottom of the commit message) >> and that discussion later resulted in: >> >> 0123f4d76ca63b ("riscv/spinlock: Strengthen implementations with fences") >> 5ce6c1f3535fa8 ("riscv/atomic: Strengthen implementations with fences") >> >> (the latest _draft_ of the RISC-V specification, as pointed out by Daniel, >> >> https://github.com/riscv/riscv-isa-manual, Appendix A.5 >> >> includes our "Linux mapping", although the currently-recommended mapping >> differs and involves a "fence.tso [+ any acquire, including RCpc .aq]"). > > [I think the discussion at hand is broader than RISC-V, and I looped in > Daniel already] Sorry for the delay, it was Independence Day here in the US. I'm aligned with Will on all this so far. As mentioned above, this issue definitely comes up on RISC-V, but for reasons even beyond RCpc vs. RCsc. Why? On RISC-V, we have RCsc atomics, but no native load-acquire or store-release opcodes. You might expect you can use "ld; fence r,rw" and "fence rw,w; sd", respectively, but those don't interact with the RCsc .aq/.rl mappings. For example, suppose we did this: (a) amoswap.w.rl x0, x0, 0(a1) // st-rel using amoswap with dummy dest ... lw a0, 0(a1) // ld-acq using fence-based mapping fence r,rw (b) Nothing here orders (a) before (b) in general. A mapping always using fences would cover what's being asked for (I believe), and using only RCsc operations (like ARM does) would also work if we had native load-acquire and store-release operations, but this mixed mapping doesn't. That's why the spec currently recommends the stronger fence.tso (~= multi-copy atomic Power lwsync) instead. ...but it's very non-obvious. Is there really a hurry to rush all this in? >> My understanding is that your answer to this question is "Yes", but I am >> not sure about the "How/Which changes?"; of course, an answer his question >> _in the form_ of PATCHes would be appreciated! (but please also consider >> that I'll be offline for most of the time until next Monday.) > > C'mon, I'm reviewing patches here. The onus shouldn't be on the reviewer to > come back with the correct version of the patch. I'm also not terribly > worried if LKMM says the wrong thing whilst we work out what the right > thing should be, but I /would/ be worried if we started adding full fences to > an architecture that has acquire/release instructions just because they're > RCpc. If it turns out that no other arch maintainers care, then fine, > because frankly this doesn't affect me, but so far it's basically been > silence and I'd really like some more input before we close the door on > them I care! Thanks for speaking up Will! Dan > > Will >