Date: Fri, 22 Jun 2018 15:11:37 -0400 (EDT)
From: Alan Stern
To: Will Deacon
cc: LKMM Maintainers -- Akira Yokosawa, Andrea Parri, Boqun Feng,
    David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin, "Paul E.
    McKenney", Peter Zijlstra, Kernel development list
Subject: Re: [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks
In-Reply-To: <20180622183007.GD1802@arm.com>

On Fri, 22 Jun 2018, Will Deacon wrote:

> Hi Alan,
>
> On Fri, Jun 22, 2018 at 02:09:04PM -0400, Alan Stern wrote:
> > On Fri, 22 Jun 2018, Will Deacon wrote:
> > > On Thu, Jun 21, 2018 at 01:27:12PM -0400, Alan Stern wrote:
> > > > More than one kernel developer has expressed the opinion that the LKMM
> > > > should enforce ordering of writes by release-acquire chains and by
> > > > locking. In other words, given the following code:
> > > >
> > > >         WRITE_ONCE(x, 1);
> > > >         spin_unlock(&s);
> > > >         spin_lock(&s);
> > > >         WRITE_ONCE(y, 1);
> > > >
> > > > or the following:
> > > >
> > > >         smp_store_release(&x, 1);
> > > >         r1 = smp_load_acquire(&x);      // r1 = 1
> > > >         WRITE_ONCE(y, 1);
> > > >
> > > > the stores to x and y should be propagated in order to all other CPUs,
> > > > even though those other CPUs might not access the lock s or be part of
> > > > the release-acquire chain. In terms of the memory model, this means
> > > > that rel-rf-acq-po should be part of the cumul-fence relation.
> > > >
> > > > All the architectures supported by the Linux kernel (including RISC-V)
> > > > do behave this way, albeit for varying reasons. Therefore this patch
> > > > changes the model in accordance with the developers' wishes.
> > >
> > > Interesting...
> > >
> > > I think the second example would preclude us using LDAPR for load-acquire,
> >
> > What are the semantics of LDAPR? That instruction isn't included in my
> > year-old copy of the ARMv8.1 manual; the closest it comes is LDAR and
> > LDAXP.
>
> It's part of 8.3 and is documented in the latest Arm Arm:
>
> https://static.docs.arm.com/ddi0487/ca/DDI0487C_a_armv8_arm.pdf
>
> It's also included in the upstream armv8.cat file using the 'Q' set.

I'll have to look at that.

> > > so I'm surprised that RISC-V is ok with this. For example, the first test
> > > below is allowed on arm64.
> >
> > Does ARMv8 use LDAPR for smp_load_acquire()? If it doesn't, this is a
> > moot point.
>
> I don't think it's a moot point. We want new architectures to implement
> acquire/release efficiently, and it's not unlikely that they will have
> acquire loads that are similar in semantics to LDAPR. This patch prevents
> them from doing so, and it also breaks Power and RISC-V without any clear
> justification for the stronger semantics.
>
> > > I also think this would break if we used DMB LD to implement load-acquire
> > > (second test below).
> >
> > Same question.
>
> Same answer (and RISC-V is a concrete example of an architecture building
> acquire using a load->load+store fence).
>
> > > So I'm not a big fan of this change, and I'm surprised this works on all
> > > architectures. What's the justification?
> >
> > For ARMv8, I've been going by something you wrote in an earlier email
> > to the effect that store-release and load-acquire are fully ordered,
> > and therefore a release can never be forwarded to an acquire. Is that
> > still true? But evidently it only justifies patch 1 in this series,
> > not patch 2.
>
> LDAR and STLR are RCsc, so that remains true. arm64 is not broken by this
> patch, but I'm still objecting to the change in semantics.
>
> > For RISC-V, I've been going by Andrea's and Luc's comments.
>
> https://is.gd/WhV1xz
>
> From that state of rmem, you can propagate the writes out of order on
> RISC-V.
>
> > > > Reading back some of the old threads [1], it seems the direct
> > > > translation of the first into acquire-release would be:
> > > >
> > > >         WRITE_ONCE(x, 1);
> > > >         smp_store_release(&s, 1);
> > > >         r1 = smp_load_acquire(&s);
> > > >         WRITE_ONCE(y, 1);
> > > >
> > > > Which is I think easier to make happen than the second example you give.
> > >
> > > It's easier, but it will still break on architectures with native support
> > > for RCpc acquire/release.
> >
> > Again, do we want the kernel to support that?
>
> Yes, I think we do. That's the most common interpretation of
> acquire/release, it matches what C11 has done and it facilitates
> construction of acquire using a load->load+store fence.
>
> > For that matter, what would happen if someone were to try using RCpc
> > semantics for lock/unlock? Or to put it another way, why do you
> > contemplate the possibility of RCpc acquire/release but not RCpc
> > lock/unlock?
>
> I think lock/unlock is a higher-level abstraction than acquire/release
> and therefore should be simpler to use and easier to reason about.
> acquire/release are building blocks for more complicated synchronisation
> mechanisms and we shouldn't be penalising their implementation without good
> reason.
>
> > > Could we drop the acquire/release stuff from the patch and limit this change
> > > to locking instead?
> >
> > The LKMM uses the same CAT code for acquire/release and lock/unlock.
> > (In essence, it considers a lock to be an acquire and an unlock to be a
> > release; everything else follows from that.) Treating one differently
> > from the other in these tests would require some significant changes.
> > It wouldn't be easy.
>
> It would be boring if it was easy ;) I think this is a case of the tail
> wagging the dog.
>
> Paul -- please can you drop this patch until we've resolved this discussion?

Agreed. It sounds like we'll need two versions of the Rel and Acq sets
in the memory model; one for RCpc and one for RCsc.
smp_load_acquire and smp_store_release will use the former, and locking
will use the latter.

Would it suffice to have this duplication just for release, using a
single version of acquire? What would happen on ARMv8 or RISC-V if an
RCsc release was read by an RCpc acquire? Or vice versa?

Alan
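
For readers following along, here is a rough illustration of what "the
stores to x and y should be propagated in order to all other CPUs" would
rule out. It is only a toy enumeration, not the LKMM, herd7 or rmem. It
assumes a single observer CPU that reads y and then x in program order,
and it treats each store's arrival at that observer as one atomic step.
All names in it (PROP_X, weak_outcome_reachable and so on) are made up
for this sketch. The "weak" outcome is the observer seeing y == 1 but
x == 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy events, as seen from one hypothetical observer CPU. */
enum event {
	PROP_X,		/* the store x = 1 reaches the observer */
	PROP_Y,		/* the store y = 1 reaches the observer */
	READ_Y,		/* observer: r1 = y */
	READ_X,		/* observer: r2 = x */
	NEVENTS
};

/* Index of event e within a complete candidate order. */
static int pos(const int order[NEVENTS], int e)
{
	for (int i = 0; i < NEVENTS; i++)
		if (order[i] == e)
			return i;
	assert(0);	/* every event appears exactly once */
	return -1;
}

/* Replay one order and report whether it yields r1 == 1 && r2 == 0. */
static bool shows_weak_outcome(const int order[NEVENTS])
{
	int x = 0, y = 0, r1 = -1, r2 = -1;

	for (int i = 0; i < NEVENTS; i++) {
		switch (order[i]) {
		case PROP_X: x = 1; break;
		case PROP_Y: y = 1; break;
		case READ_Y: r1 = y; break;
		case READ_X: r2 = x; break;
		}
	}
	return r1 == 1 && r2 == 0;
}

/* Try every permutation of the four events, depth first. */
static bool search(int order[NEVENTS], int depth, unsigned int used,
		   bool ordered_propagation)
{
	if (depth == NEVENTS) {
		/* Assumption: the observer's two reads stay in program order. */
		if (pos(order, READ_Y) > pos(order, READ_X))
			return false;
		/* The proposed rule: x must reach the observer before y. */
		if (ordered_propagation &&
		    pos(order, PROP_X) > pos(order, PROP_Y))
			return false;
		return shows_weak_outcome(order);
	}
	for (int e = 0; e < NEVENTS; e++) {
		if (used & (1u << e))
			continue;
		order[depth] = e;
		if (search(order, depth + 1, used | (1u << e),
			   ordered_propagation))
			return true;
	}
	return false;
}

/* True if some allowed order lets the observer see y == 1 and x == 0. */
bool weak_outcome_reachable(bool ordered_propagation)
{
	int order[NEVENTS];

	return search(order, 0, 0, ordered_propagation);
}
```

In this toy, weak_outcome_reachable(false) is true, since y may arrive
first, and weak_outcome_reachable(true) is false. It says nothing about
what any real architecture does; it only spells out the ordering
property being debated.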