Date: Thu, 30 Aug 2018 14:50:45 +0200
From: Andrea Parri
To: "Paul E. McKenney"
Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	mingo@kernel.org, stern@rowland.harvard.edu, will.deacon@arm.com,
	peterz@infradead.org, boqun.feng@gmail.com, npiggin@gmail.com,
	dhowells@redhat.com, j.alglave@ucl.ac.uk, luc.maranget@inria.fr,
	akiyks@gmail.com
Subject: Re: [PATCH RFC LKMM 1/7] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire
Message-ID: <20180830125045.GA6936@andrea>
References: <20180829211018.GA19646@linux.vnet.ibm.com>
 <20180829211053.20531-1-paulmck@linux.vnet.ibm.com>
In-Reply-To: <20180829211053.20531-1-paulmck@linux.vnet.ibm.com>

On Wed, Aug 29, 2018 at 02:10:47PM -0700, Paul E. McKenney wrote:
> From: Alan Stern
>
> More than one kernel developer has expressed the opinion that the LKMM
> should enforce ordering of writes by locking.
> In other words, given
> the following code:
>
>	WRITE_ONCE(x, 1);
>	spin_unlock(&s);
>	spin_lock(&s);
>	WRITE_ONCE(y, 1);
>
> the stores to x and y should be propagated in order to all other CPUs,
> even though those other CPUs might not access the lock s.  In terms of
> the memory model, this means expanding the cumul-fence relation.
>
> Locks should also provide read-read (and read-write) ordering in a
> similar way.  Given:
>
>	READ_ONCE(x);
>	spin_unlock(&s);
>	spin_lock(&s);
>	READ_ONCE(y);		// or WRITE_ONCE(y, 1);
>
> the load of x should be executed before the load of (or store to) y.
> The LKMM already provides this ordering, but it provides it even in
> the case where the two accesses are separated by a release/acquire
> pair of fences rather than unlock/lock.  This would prevent
> architectures from using weakly ordered implementations of release and
> acquire, which seems like an unnecessary restriction.  The patch
> therefore removes the ordering requirement from the LKMM for that
> case.
>
> All the architectures supported by the Linux kernel (including RISC-V)
> do provide this ordering for locks, albeit for varying reasons.
> Therefore this patch changes the model in accordance with the
> developers' wishes.
>
> Signed-off-by: Alan Stern
> Signed-off-by: Paul E. McKenney
> Reviewed-by: Will Deacon
> Acked-by: Peter Zijlstra (Intel)

Round 2 ;-), I guess...  Let me start from the uncontroversial points:

  1) being able to use the LKMM to reason about generic locking code
     is useful and desirable (paraphrasing Peter in [1]);

  2) strengthening the ordering requirements of such code isn't going
     to boost performance (that's "real maths").

This patch is taking (1) away from us and it is formalizing (2), with
almost _no_ reason (no reason at all, if we stick to the commit msg.).

In [2], Will wrote:

  "[...] having them [the RMWs] closer to RCsc[/to the semantics of
   locks] would make it easier to implement and reason about generic
   locking implementations (i.e.
   reduce the number of special
   ordering cases and/or magic barrier macros)"

"magic barrier macros" as in "mmh, if we accept this patch, we _should_
be auditing the various implementations/code to decide where to place a

  smp_barrier_promote_ordinary_release_acquire_to_unlock_lock()  ;-)

or the like, and "special ordering cases" as in "arrgh, (otherwise) we
are forced to reason on a per-arch basis while looking at generic code".

(Remark: ordinary release/acquire are building blocks for code such as
qspinlock, (q)rwlock, mutex, rwsem, ... and what else??)

To avoid further repetition, I conclude by confirming all the concerns
and my assessment of this patch as pointed out in [3]; the subsequent
discussion, although not conclusive, presented several suggestions for
improvement (IMO).

  Andrea

[1] http://lkml.kernel.org/r/20180712134821.GT2494@hirez.programming.kicks-ass.net
[2] http://lkml.kernel.org/r/20180713093524.GC32020@arm.com
[3] http://lkml.kernel.org/r/20180710093821.GA5414@andrea
    http://lkml.kernel.org/r/20180711161717.GA14635@andrea

> ---
>  .../Documentation/explanation.txt             | 186 ++++++++++++++----
>  tools/memory-model/linux-kernel.cat           |   8 +-
>  ...ISA2+pooncelock+pooncelock+pombonce.litmus |   7 +-
>  3 files changed, 150 insertions(+), 51 deletions(-)
>
> diff --git a/tools/memory-model/Documentation/explanation.txt b/tools/memory-model/Documentation/explanation.txt
> index 0cbd1ef8f86d..35bff92cc773 100644
> --- a/tools/memory-model/Documentation/explanation.txt
> +++ b/tools/memory-model/Documentation/explanation.txt
> @@ -28,7 +28,8 @@ Explanation of the Linux-Kernel Memory Consistency Model
>    20. THE HAPPENS-BEFORE RELATION: hb
>    21. THE PROPAGATES-BEFORE RELATION: pb
>    22. RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb
> -  23. ODDS AND ENDS
> +  23. LOCKING
> +  24. ODDS AND ENDS
>
>
>
> @@ -1067,28 +1068,6 @@ allowing out-of-order writes like this to occur.
> The model avoided
>  violating the write-write coherence rule by requiring the CPU not to
>  send the W write to the memory subsystem at all!)
>
> -There is one last example of preserved program order in the LKMM: when
> -a load-acquire reads from an earlier store-release.  For example:
> -
> -	smp_store_release(&x, 123);
> -	r1 = smp_load_acquire(&x);
> -
> -If the smp_load_acquire() ends up obtaining the 123 value that was
> -stored by the smp_store_release(), the LKMM says that the load must be
> -executed after the store; the store cannot be forwarded to the load.
> -This requirement does not arise from the operational model, but it
> -yields correct predictions on all architectures supported by the Linux
> -kernel, although for differing reasons.
> -
> -On some architectures, including x86 and ARMv8, it is true that the
> -store cannot be forwarded to the load.  On others, including PowerPC
> -and ARMv7, smp_store_release() generates object code that starts with
> -a fence and smp_load_acquire() generates object code that ends with a
> -fence.  The upshot is that even though the store may be forwarded to
> -the load, it is still true that any instruction preceding the store
> -will be executed before the load or any following instructions, and
> -the store will be executed before any instruction following the load.
> -
>
>  AND THEN THERE WAS ALPHA
>  ------------------------
>
> @@ -1766,6 +1745,147 @@ before it does, and the critical section in P2 both starts after P1's
>  grace period does and ends after it does.
>
>
> +LOCKING
> +-------
> +
> +The LKMM includes locking.  In fact, there is special code for locking
> +in the formal model, added in order to make tools run faster.
> +However, this special code is intended to be more or less equivalent
> +to concepts we have already covered.
> +A spinlock_t variable is treated
> +the same as an int, and spin_lock(&s) is treated almost the same as:
> +
> +	while (cmpxchg_acquire(&s, 0, 1) != 0)
> +		cpu_relax();
> +
> +This waits until s is equal to 0 and then atomically sets it to 1,
> +and the read part of the cmpxchg operation acts as an acquire fence.
> +An alternate way to express the same thing would be:
> +
> +	r = xchg_acquire(&s, 1);
> +
> +along with a requirement that at the end, r = 0.  Similarly,
> +spin_trylock(&s) is treated almost the same as:
> +
> +	return !cmpxchg_acquire(&s, 0, 1);
> +
> +which atomically sets s to 1 if it is currently equal to 0 and returns
> +true if it succeeds (the read part of the cmpxchg operation acts as an
> +acquire fence only if the operation is successful).  spin_unlock(&s)
> +is treated almost the same as:
> +
> +	smp_store_release(&s, 0);
> +
> +The "almost" qualifiers above need some explanation.  In the LKMM, the
> +store-release in a spin_unlock() and the load-acquire which forms the
> +first half of the atomic rmw update in a spin_lock() or a successful
> +spin_trylock() -- we can call these things lock-releases and
> +lock-acquires -- have two properties beyond those of ordinary releases
> +and acquires.
> +
> +First, when a lock-acquire reads from a lock-release, the LKMM
> +requires that every instruction po-before the lock-release must
> +execute before any instruction po-after the lock-acquire.  This would
> +naturally hold if the release and acquire operations were on different
> +CPUs, but the LKMM says it holds even when they are on the same CPU.
> +For example:
> +
> +	int x, y;
> +	spinlock_t s;
> +
> +	P0()
> +	{
> +		int r1, r2;
> +
> +		spin_lock(&s);
> +		r1 = READ_ONCE(x);
> +		spin_unlock(&s);
> +		spin_lock(&s);
> +		r2 = READ_ONCE(y);
> +		spin_unlock(&s);
> +	}
> +
> +	P1()
> +	{
> +		WRITE_ONCE(y, 1);
> +		smp_wmb();
> +		WRITE_ONCE(x, 1);
> +	}
> +
> +Here the second spin_lock() reads from the first spin_unlock(), and
> +therefore the load of x must execute before the load of y.  Thus we
> +cannot have r1 = 1 and r2 = 0 at the end (this is an instance of the
> +MP pattern).
> +
> +This requirement does not apply to ordinary release and acquire
> +fences, only to lock-related operations.  For instance, suppose P0()
> +in the example had been written as:
> +
> +	P0()
> +	{
> +		int r1, r2, r3;
> +
> +		r1 = READ_ONCE(x);
> +		smp_store_release(&s, 1);
> +		r3 = smp_load_acquire(&s);
> +		r2 = READ_ONCE(y);
> +	}
> +
> +Then the CPU would be allowed to forward the s = 1 value from the
> +smp_store_release() to the smp_load_acquire(), executing the
> +instructions in the following order:
> +
> +	r3 = smp_load_acquire(&s);		// Obtains r3 = 1
> +	r2 = READ_ONCE(y);
> +	r1 = READ_ONCE(x);
> +	smp_store_release(&s, 1);		// Value is forwarded
> +
> +and thus it could load y before x, obtaining r2 = 0 and r1 = 1.
> +
> +Second, when a lock-acquire reads from a lock-release, and some other
> +stores W and W' occur po-before the lock-release and po-after the
> +lock-acquire respectively, the LKMM requires that W must propagate to
> +each CPU before W' does.
> +For example, consider:
> +
> +	int x, y;
> +	spinlock_t s;
> +
> +	P0()
> +	{
> +		spin_lock(&s);
> +		WRITE_ONCE(x, 1);
> +		spin_unlock(&s);
> +	}
> +
> +	P1()
> +	{
> +		int r1;
> +
> +		spin_lock(&s);
> +		r1 = READ_ONCE(x);
> +		WRITE_ONCE(y, 1);
> +		spin_unlock(&s);
> +	}
> +
> +	P2()
> +	{
> +		int r2, r3;
> +
> +		r2 = READ_ONCE(y);
> +		smp_rmb();
> +		r3 = READ_ONCE(x);
> +	}
> +
> +If r1 = 1 at the end then the spin_lock() in P1 must have read from
> +the spin_unlock() in P0.  Hence the store to x must propagate to P2
> +before the store to y does, so we cannot have r2 = 1 and r3 = 0.
> +
> +These two special requirements for lock-release and lock-acquire do
> +not arise from the operational model.  Nevertheless, kernel developers
> +have come to expect and rely on them because they do hold on all
> +architectures supported by the Linux kernel, albeit for various
> +differing reasons.
> +
> +
>  ODDS AND ENDS
>  -------------
>
> @@ -1831,26 +1951,6 @@ they behave as follows:
>  	events and the events preceding them against all po-later
>  	events.
>
> -The LKMM includes locking.  In fact, there is special code for locking
> -in the formal model, added in order to make tools run faster.
> -However, this special code is intended to be exactly equivalent to
> -concepts we have already covered.  A spinlock_t variable is treated
> -the same as an int, and spin_lock(&s) is treated the same as:
> -
> -	while (cmpxchg_acquire(&s, 0, 1) != 0)
> -		cpu_relax();
> -
> -which waits until s is equal to 0 and then atomically sets it to 1,
> -and where the read part of the atomic update is also an acquire fence.
> -An alternate way to express the same thing would be:
> -
> -	r = xchg_acquire(&s, 1);
> -
> -along with a requirement that at the end, r = 0.  spin_unlock(&s) is
> -treated the same as:
> -
> -	smp_store_release(&s, 0);
> -
>  Interestingly, RCU and locking each introduce the possibility of
>  deadlock.
>  When faced with code sequences such as:
>
> diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat
> index 59b5cbe6b624..882fc33274ac 100644
> --- a/tools/memory-model/linux-kernel.cat
> +++ b/tools/memory-model/linux-kernel.cat
> @@ -38,7 +38,7 @@ let strong-fence = mb | gp
>  (* Release Acquire *)
>  let acq-po = [Acquire] ; po ; [M]
>  let po-rel = [M] ; po ; [Release]
> -let rfi-rel-acq = [Release] ; rfi ; [Acquire]
> +let po-unlock-rf-lock-po = po ; [UL] ; rf ; [LKR] ; po
>
>  (**********************************)
>  (* Fundamental coherence ordering *)
> @@ -60,13 +60,13 @@ let dep = addr | data
>  let rwdep = (dep | ctrl) ; [W]
>  let overwrite = co | fr
>  let to-w = rwdep | (overwrite & int)
> -let to-r = addr | (dep ; rfi) | rfi-rel-acq
> +let to-r = addr | (dep ; rfi)
>  let fence = strong-fence | wmb | po-rel | rmb | acq-po
> -let ppo = to-r | to-w | fence
> +let ppo = to-r | to-w | fence | (po-unlock-rf-lock-po & int)
>
>  (* Propagation: Ordering from release operations and strong fences. *)
>  let A-cumul(r) = rfe? ; r
> -let cumul-fence = A-cumul(strong-fence | po-rel) | wmb
> +let cumul-fence = A-cumul(strong-fence | po-rel) | wmb | po-unlock-rf-lock-po
>  let prop = (overwrite & ext)? ; cumul-fence* ; rfe?
>
>  (*
> diff --git a/tools/memory-model/litmus-tests/ISA2+pooncelock+pooncelock+pombonce.litmus b/tools/memory-model/litmus-tests/ISA2+pooncelock+pooncelock+pombonce.litmus
> index 0f749e419b34..094d58df7789 100644
> --- a/tools/memory-model/litmus-tests/ISA2+pooncelock+pooncelock+pombonce.litmus
> +++ b/tools/memory-model/litmus-tests/ISA2+pooncelock+pooncelock+pombonce.litmus
> @@ -1,11 +1,10 @@
>  C ISA2+pooncelock+pooncelock+pombonce
>
>  (*
> - * Result: Sometimes
> + * Result: Never
>   *
> - * This test shows that the ordering provided by a lock-protected S
> - * litmus test (P0() and P1()) are not visible to external process P2().
> - * This is likely to change soon.
> + * This test shows that write-write ordering provided by locks
> + * (in P0() and P1()) is visible to external process P2().
>   *)
>
>  {}
> --
> 2.17.1
>
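And, for completeness, Alan's read-read example (two critical sections
on P0, MP-style writer on P1) rendered as a herd7 litmus test; the file
name is mine, and "Result: Never" is what I expect from the model with
this patch applied, since the second spin_lock() reading from the first
spin_unlock() orders the two loads:

```c
C MP+lock-unlock-lock-sketch

(*
 * Result: Never (expected with this patch applied)
 *
 * The second spin_lock() on P0 reads from the first spin_unlock(),
 * so the load of x must execute before the load of y.
 *)

{}

P0(int *x, int *y, spinlock_t *s)
{
	int r1;
	int r2;

	spin_lock(s);
	r1 = READ_ONCE(*x);
	spin_unlock(s);
	spin_lock(s);
	r2 = READ_ONCE(*y);
	spin_unlock(s);
}

P1(int *x, int *y)
{
	WRITE_ONCE(*y, 1);
	smp_wmb();
	WRITE_ONCE(*x, 1);
}

exists (0:r1=1 /\ 0:r2=0)
```

Replacing the inner unlock/lock pair with smp_store_release()/
smp_load_acquire() on an ordinary int should flip the result back to
Sometimes, which is precisely the weakening this patch introduces.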
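FWIW, for readers without the kernel tree at hand: the reading of
spin_lock()/spin_trylock()/spin_unlock() quoted above from
explanation.txt can be sketched in user space with C11 atomics standing
in for cmpxchg_acquire() and smp_store_release().  This is only an
illustration of that reading, under my own made-up names
(sketch_spinlock_t etc.); a real kernel spinlock looks nothing like it.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical type: a spinlock_t "treated the same as an int". */
typedef struct { atomic_int v; } sketch_spinlock_t;

/* spin_lock(&s): while (cmpxchg_acquire(&s, 0, 1) != 0) cpu_relax(); */
static void sketch_spin_lock(sketch_spinlock_t *s)
{
	int expected;

	do {
		expected = 0;	/* cmpxchg may fail; retry with 0 again */
	} while (!atomic_compare_exchange_weak_explicit(&s->v, &expected, 1,
							memory_order_acquire,
							memory_order_relaxed));
}

/* spin_trylock(&s): return !cmpxchg_acquire(&s, 0, 1); */
static bool sketch_spin_trylock(sketch_spinlock_t *s)
{
	int expected = 0;

	/* Acquire ordering only on the successful exchange. */
	return atomic_compare_exchange_strong_explicit(&s->v, &expected, 1,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* spin_unlock(&s): smp_store_release(&s, 0); */
static void sketch_spin_unlock(sketch_spinlock_t *s)
{
	atomic_store_explicit(&s->v, 0, memory_order_release);
}
```

Of course, the whole point of the discussion above is that the LKMM now
gives an unlock/lock pair on the same variable *more* ordering than
these ordinary release/acquire operations provide, which is exactly
what a generic implementation like this one cannot show.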