Received: by 2002:ac0:a5a6:0:0:0:0:0 with SMTP id m35-v6csp1186761imm; Fri, 14 Sep 2018 12:44:55 -0700 (PDT) X-Google-Smtp-Source: ANB0VdZigQQcPv6MNkCd4FCStaG38i0hUL9jaA/4oK3cMK7MhmNwBdSLV8jIG68rybYCXy0/mBmS X-Received: by 2002:a62:c218:: with SMTP id l24-v6mr14043525pfg.185.1536954295437; Fri, 14 Sep 2018 12:44:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1536954295; cv=none; d=google.com; s=arc-20160816; b=eP/F+TRwuDIqrN9Nh1Fb8M1xlKLqiuOHnhuoClqbNKGlyMglp13XluUIvzDVfOL/0d WUtxBdzoaY/rWMkt/6NoQTqCPtlqiaemQTGr2d2ZVR966PZHiEmlEs4azl50emLLfjae 2jrEgr589rHAQppuKE2VIzIKNWE8zHekFhxygnA4LsFYWrDlNYBMWhTxUNKm8mIxYDlw PcWuYV2itR6cerWj1aaMxtdQ6nfW8vu0c3ooytgRw4LOn3vGf2ofKUNmDW7h9OTdPIOm pVmM2DmNQkbumwc9r7TwMyclOcvXr1ij4b7rY0rXQ/HBQbrV532Ke3NUvkE7QHPe+JOD 7lDw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=7z3Sysg1v6kRkLUOFqAezJoFrE+Oi5ekgzeYgH7c390=; b=toom6ydTL/rPGNgaszFvpIgp8C3FhyWuVh3SagH75wz9G/TrZmwoqGObx16KkoKY2G LH79xYm7DDxs4ulrvknoAoiDP1jGEQh1KUIcWinj8nS/YprQRNXCzF1uefs9+O/W7QCJ 9qAvBVCQmYR88/ZA6C+66dpJsh+VjRf1v/nyRW5LkfSMje6LaX0DGvO9MHMZJjuOSn8s 2w4HyI1zy1J7YgrwL7VoZtZCijnvkg/8u66fMhNBYPZyVcubJjN0C+6t8DXkKgUnXlbf phZGMy9JdrNpmotApaJUDMPyPZvWHgXXd9Q1XcFph8hIPZnxzw1/tZClSsvhPo++Cuwc 0gdg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@amarulasolutions.com header.s=google header.b="c93/4Vsi"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id l1-v6si8033917plg.285.2018.09.14.12.44.39; Fri, 14 Sep 2018 12:44:55 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@amarulasolutions.com header.s=google header.b="c93/4Vsi"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727790AbeIOBAa (ORCPT + 99 others); Fri, 14 Sep 2018 21:00:30 -0400 Received: from mail-ed1-f66.google.com ([209.85.208.66]:39916 "EHLO mail-ed1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726891AbeIOBA3 (ORCPT ); Fri, 14 Sep 2018 21:00:29 -0400 Received: by mail-ed1-f66.google.com with SMTP id h4-v6so8302708edi.6 for ; Fri, 14 Sep 2018 12:44:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amarulasolutions.com; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=7z3Sysg1v6kRkLUOFqAezJoFrE+Oi5ekgzeYgH7c390=; b=c93/4VsipWl1Ob3DFue5NsL9/Pc6md1sxiiIcZn+HriH26sHQnznc/RSA9JmcWP2gp GHkNB1ae/+zhchsl3tPe5C/y6/S/HSwbZr1JiUFEvCpyC5zMAjUpgA80IXnR3sz96KSt XN7xeZALlzbzLxWUebCPVdAdF0jqaJHDu11rs= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=7z3Sysg1v6kRkLUOFqAezJoFrE+Oi5ekgzeYgH7c390=; b=Xq9QZU3L4GycVZBJhYRA4dSfIu0G2/qKHXWvRe/Sc0nGVKPcE0hdg9CrP6X8JTlA9t FYhBPamnQ3B7u+QuNrtQg1nPGOvJv5G3SEeALEm6BUfT4/D3aJ0uhQPdrMzAnqz8w0xC 7Xy1C0Ajp+J/SDDh0zTbfJofh74ULW7aoTC42tvj91V0PAPsKkYN8YXrS2g6dtfLGMbr Ylcdh+VZfwHywqc28lzQtdcDARo8m/ZOZFPd13+PA6benaXwmIQNq6vsoV8Te2ar4qYy bZvFlUvjLHYhbdY5kuPILIohgE1cD/jZg6V+R/nPntYFz1rdAFIavHZFHLg8r1PkMqOX oQcw== X-Gm-Message-State: APzg51BPwHB2SpcCdSrqaNgU5sWz/7N61eJ4hbXznqfIQyTgh3Wu4Dfz g0On5QFihGUhT9JUU6Mz63IiGw== X-Received: by 2002:a50:94c3:: with SMTP id t3-v6mr23181712eda.249.1536954272216; Fri, 14 Sep 2018 12:44:32 -0700 (PDT) Received: from andrea ([94.230.152.15]) by smtp.gmail.com with ESMTPSA id w20-v6sm7881185edc.12.2018.09.14.12.44.30 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 14 Sep 2018 12:44:31 -0700 (PDT) Date: Fri, 14 Sep 2018 21:44:23 +0200 From: Andrea Parri To: Alan Stern Cc: "Paul E. McKenney" , Daniel Lustig , Will Deacon , Andrea Parri , Kernel development list , linux-arch@vger.kernel.org, mingo@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, npiggin@gmail.com, dhowells@redhat.com, Jade Alglave , Luc Maranget , akiyks@gmail.com, Palmer Dabbelt , Linus Torvalds Subject: Re: [PATCH RFC LKMM 1/7] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire Message-ID: <20180914194423.GA3264@andrea> References: <20180914143752.GA7467@andrea> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.4 (2018-02-28) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org > The updated Changelog draft is below. > > Alan. > > -------------------------------------------------------------------------- > More than one kernel developer has expressed the opinion that the LKMM > should enforce ordering of writes by locking. In other words, given > the following code: > > WRITE_ONCE(x, 1); > spin_unlock(&s): > spin_lock(&s); > WRITE_ONCE(y, 1); > > the stores to x and y should be propagated in order to all other CPUs, > even though those other CPUs might not access the lock s. In terms of > the memory model, this means expanding the cumul-fence relation. > > Locks should also provide read-read (and read-write) ordering in a > similar way. Given: > > READ_ONCE(x); > spin_unlock(&s); > spin_lock(&s); > READ_ONCE(y); // or WRITE_ONCE(y, 1); > > the load of x should be executed before the load of (or store to) y. > The LKMM already provides this ordering, but it provides it even in > the case where the two accesses are separated by a release/acquire > pair of fences rather than unlock/lock. This would prevent > architectures from using weakly ordered implementations of release and > acquire, which seems like an unnecessary restriction. The patch > therefore removes the ordering requirement from the LKMM for that > case. > > There are several arguments both for and against this change. Let us > refer to these enhanced ordering properties by saying that the LKMM > would require locks to be RCtso (a bit of a misnomer, but analogous to > RCpc and RCsc) and it would require ordinary acquire/release only to > be RCpc. (Note: In the following, the phrase "all supported > architectures" is meant not to include RISC-V. Although RISC-V is > indeed supported by the kernel, the implementation is still somewhat > in a state of flux and therefore statements about it would be > premature.) > > Pros: > > The kernel already provides RCtso ordering for locks on all > supported architectures, even though this is not stated > explicitly anywhere. Therefore the LKMM should formalize it. > > In theory, guaranteeing RCtso ordering would reduce the need > for additional barrier-like constructs meant to increase the > ordering strength of locks. > > Will Deacon and Peter Zijlstra are strongly in favor of > formalizing the RCtso requirement. Linus Torvalds and Peter > would like to go even further, requiring locks to have RCsc > behavior (ordering preceding writes against later reads), but > they recognize that this would incur a noticeable performance > degradation on the POWER architecture. Linus also points out > that people have made the mistake, in the past, of assuming > that locking has stronger ordering properties than is > currently guaranteed, and this change would reduce the > likelihood of such mistakes. > > Not requiring ordinary acquire/release to be any stronger than > RCpc may prove advantageous for future architectures, allowing > them to implement smp_load_acquire() and smp_store_release() > with more efficient machine instructions than would be > possible if the operations had to be RCtso. Will and Linus > approve this rationale, hypothetical though it is at the > moment (it may end up affecting the RISC-V implementation). > The same argument may or may not apply to RMW-acquire/release; > see also the second Con entry below. > > Linus feels that locks should be easy for people to use > without worrying about memory consistency issues, since they > are so pervasive in the kernel, whereas acquire/release is > much more of an "experts only" tool. Requiring locks to be > RCtso is a step in this direction. > > Cons: > > Andrea Parri and Luc Maranget think that locks should have the > same ordering properties as ordinary acquire/release (indeed, > Luc points out that the names "acquire" and "release" derive > from the usage of locks). Andrea points out that having > different ordering properties for different forms of acquires > and releases is not only unnecessary, it would also be > confusing and unmaintainable. > > Locks are constructed from lower-level primitives, typically > RMW-acquire (for locking) and ordinary release (for unlock). > It is illogical to require stronger ordering properties from > the high-level operations than from the low-level operations > they comprise. Thus, this change would make > > while (cmpxchg_acquire(&s, 0, 1) != 0) > cpu_relax(); > > an incorrect implementation of spin_lock(&s) as far as the > LKMM is concerned. In theory this weakness can be ameliorated > by changing the LKMM even further, requiring > RMW-acquire/release also to be RCtso (which it already is on > all supported architectures). > > As far as I know, nobody has singled out any examples of code > in the kernel that actually relies on locks being RCtso. > (People mumble about RCU and the scheduler, but nobody has > pointed to any actual code. If there are any real cases, > their number is likely quite small.) If RCtso ordering is not > needed, why require it? > > A handful of locking constructs (qspinlocks, qrwlocks, and > mcs_spinlocks) are built on top of smp_cond_load_acquire() > instead of an RMW-acquire instruction. It currently provides > only the ordinary acquire semantics, not the stronger ordering > this patch would require of locks. In theory this could be > ameliorated by requiring smp_cond_load_acquire() in > combination with ordinary release also to be RCtso (which is > currently true on all supported architectures). > > On future weakly ordered architectures, people may be able to > implement locks in a non-RCtso fashion with significant > performance improvement. Meeting the RCtso requirement would > necessarily add run-time overhead. > > Overall, the technical aspects of these arguments seem relatively > minor, and it appears mostly to boil down to a matter of opinion. > Since the opinions of senior kernel maintainers such as Linus, > Peter, and Will carry more weight than those of Luc and Andrea, this > patch changes the model in accordance with the maintainers' wishes. > > Signed-off-by: Alan Stern Thank you for integrating the log. I remain more sensitive to those Cons (and skeptical about that po-unlock-rf-lock-po), but: (1) this doesn't make an objection, and (2) you wore me down. ;-) So unless new arguments are brought to light, feel free to add: Reviewed-by: Andrea Parri Andrea > > v.5: Incorporated feedback from Andrea regarding the updated Changelog. > > v.4: Added pros and cons discussion to the Changelog. > > v.3: Rebased against the dev branch of Paul's linux-rcu tree. > Changed unlock-rf-lock-po to po-unlock-rf-lock-po, making it more > symmetrical and more in accordance with the use of fence.tso for > the release on RISC-V. > > v.2: Restrict the ordering to lock operations, not general release > and acquire fences. >