Received: by 2002:ac0:a581:0:0:0:0:0 with SMTP id m1-v6csp1169781imm; Fri, 22 Jun 2018 11:30:42 -0700 (PDT) X-Google-Smtp-Source: ADUXVKKcvX411jhD/lu9adFQ55Bza9U0ET4gWBlWannRv0VqryW+qc8h51YWp+pVrWlVGCVVA/9T X-Received: by 2002:a63:ab05:: with SMTP id p5-v6mr2371580pgf.280.1529692242010; Fri, 22 Jun 2018 11:30:42 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529692241; cv=none; d=google.com; s=arc-20160816; b=Q4+GWPkbvFt8PkvzQj6imfhB0vf5zyFOAQcyQDdLxQm1jzJUDqYkg9w4AdaFzMdq8j 006LccoHRJljHkco2iDbJvgrx0piYpDjpyBmclzfM2p/tJOM+408b+LY9P7LTPRf78V5 JKit0kGjEChjCuw1XakVx9WNppx/pFeHU79Thg5VW2SN+Vxh6rR6he8gWqoje9PlBStw 39c1xFQ7kgvxJKsdbNh5CKdQptX0hWucTAWAIRsi/62FY5UXm3w+69UC0V4uG4mtEXtE go/gKsBUk/4Eqd0/j1euoMBK2t3LO+rg0Rmx6MIrm7T0xtA+n0PgeRq23DIcEhCxQoo1 tAdQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:arc-authentication-results; bh=zU7n/fYqvqAhQPb/vaS0pyualK10L5VNrE+Hpve0INw=; b=chH0ayK9l6wyMZ7RXCnUAe5ZffXnVz+JiOItJrTvIRR6ENQeVs8+df+sK9DaNDofRp FspJjyuXqVOUeHIEawCPwBF29eTZXMNIWxYClr5ix81cW1hX3t6UBbFWQmepYSEFRegT h7Jb+8zvbeUyRMQcp2mlU6MOwLADST43gjCwhH13cMY2YbtayMawCrrOEFdwJURv33AZ Ufw10SbCKSCHbZVNp+9/OTeFMY3qyKBgkuv00zclelOUYfJXgbE0F1I5pvRojqBfK1dq setWn4w2Y8R3U601dZhP42sC/w64ZuiIMQ6Fkqzk09evRKqabHd051SG2K31pG2CKwFZ evqg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id y20-v6si7538686pfm.186.2018.06.22.11.30.27; Fri, 22 Jun 2018 11:30:41 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934210AbeFVS3d (ORCPT + 99 others); Fri, 22 Jun 2018 14:29:33 -0400 Received: from foss.arm.com ([217.140.101.70]:39450 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933869AbeFVS3c (ORCPT ); Fri, 22 Jun 2018 14:29:32 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id DB15C14; Fri, 22 Jun 2018 11:29:31 -0700 (PDT) Received: from edgewater-inn.cambridge.arm.com (usa-sjc-imap-foss1.foss.arm.com [10.72.51.249]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id AB63E3F2EA; Fri, 22 Jun 2018 11:29:31 -0700 (PDT) Received: by edgewater-inn.cambridge.arm.com (Postfix, from userid 1000) id 351531AE3AA8; Fri, 22 Jun 2018 19:30:08 +0100 (BST) Date: Fri, 22 Jun 2018 19:30:08 +0100 From: Will Deacon To: Alan Stern Cc: LKMM Maintainers -- Akira Yokosawa , Andrea Parri , Boqun Feng , David Howells , Jade Alglave , Luc Maranget , Nicholas Piggin , "Paul E. McKenney" , Peter Zijlstra , Kernel development list Subject: Re: [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks Message-ID: <20180622183007.GD1802@arm.com> References: <20180622080928.GB7601@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Alan, On Fri, Jun 22, 2018 at 02:09:04PM -0400, Alan Stern wrote: > On Fri, 22 Jun 2018, Will Deacon wrote: > > On Thu, Jun 21, 2018 at 01:27:12PM -0400, Alan Stern wrote: > > > More than one kernel developer has expressed the opinion that the LKMM > > > should enforce ordering of writes by release-acquire chains and by > > > locking. In other words, given the following code: > > > > > > WRITE_ONCE(x, 1); > > > spin_unlock(&s): > > > spin_lock(&s); > > > WRITE_ONCE(y, 1); > > > > > > or the following: > > > > > > smp_store_release(&x, 1); > > > r1 = smp_load_acquire(&x); // r1 = 1 > > > WRITE_ONCE(y, 1); > > > > > > the stores to x and y should be propagated in order to all other CPUs, > > > even though those other CPUs might not access the lock s or be part of > > > the release-acquire chain. In terms of the memory model, this means > > > that rel-rf-acq-po should be part of the cumul-fence relation. > > > > > > All the architectures supported by the Linux kernel (including RISC-V) > > > do behave this way, albeit for varying reasons. Therefore this patch > > > changes the model in accordance with the developers' wishes. > > > > Interesting... > > > > I think the second example would preclude us using LDAPR for load-acquire, > > What are the semantics of LDAPR? That instruction isn't included in my > year-old copy of the ARMv8.1 manual; the closest it comes is LDAR and > LDAXP. It's part of 8.3 and is documented in the latest Arm Arm: https://static.docs.arm.com/ddi0487/ca/DDI0487C_a_armv8_arm.pdf It's also included in the upstream armv8.cat file using the 'Q' set. > > so I'm surprised that RISC-V is ok with this. For example, the first test > > below is allowed on arm64. > > Does ARMv8 use LDAPR for smp_load_aquire()? If it doesn't, this is a > moot point. I don't think it's a moot point. We want new architectures to implement acquire/release efficiently, and it's not unlikely that they will have acquire loads that are similar in semantics to LDAPR. This patch prevents them from doing so, and it also breaks Power and RISC-V without any clear justification for the stronger semantics. > > I also think this would break if we used DMB LD to implement load-acquire > > (second test below). > > Same question. Same answer (and RISC-V is a concrete example of an architecture building acquire using a load->load+store fence). > > So I'm not a big fan of this change, and I'm surprised this works on all > > architectures. What's the justification? > > For ARMv8, I've been going by something you wrote in an earlier email > to the effect that store-release and load-acquire are fully ordered, > and therefore a release can never be forwarded to an acquire. Is that > still true? But evidently it only justifies patch 1 in this series, > not patch 2. LDAR and STLR are RCsc, so that remains true. arm64 is not broken by this patch, but I'm still objecting to the change in semantics. > For RISC-V, I've been going by Andrea's and Luc's comments. https://is.gd/WhV1xz From that state of rmem, you can propagate the writes out of order on RISC-V. > > > Reading back some of the old threads [1], it seems the direct > > > translation of the first into acquire-release would be: > > > > > > WRITE_ONCE(x, 1); > > > smp_store_release(&s, 1); > > > r1 = smp_load_acquire(&s); > > > WRITE_ONCE(y, 1); > > > > > > Which is I think easier to make happen than the second example you give. > > > > It's easier, but it will still break on architectures with native support > > for RCpc acquire/release. > > Again, do we want the kernel to support that? Yes, I think we do. That's the most common interpretation of acquire/release, it matches what C11 has done and it facilitates construction of acquire using a load->load+store fence. > For that matter, what would happen if someone were to try using RCpc > semantics for lock/unlock? Or to put it another way, why do you > contemplate the possibility of RCpc acquire/release but not RCpc > lock/unlock? I think lock/unlock is a higher-level abstraction than acquire/release and therefore should be simpler to use and easier to reason about. acquire/release are building blocks for more complicated synchronisation mechanisms and we shouldn't be penalising their implementation without good reason. > > Could we drop the acquire/release stuff from the patch and limit this change > > to locking instead? > > The LKMM uses the same CAT code for acquire/release and lock/unlock. > (In essence, it considers a lock to be an acquire and an unlock to be a > release; everything else follows from that.) Treating one differently > from the other in these tests would require some significant changes. > It wouldn't be easy. It would be boring if it was easy ;) I think this is a case of the tail wagging the dog. Paul -- please can you drop this patch until we've resolved this discussion? Will