Received: by 10.223.185.116 with SMTP id b49csp1198641wrg; Fri, 16 Feb 2018 14:23:52 -0800 (PST) X-Google-Smtp-Source: AH8x224/2MRuJnlnLyAr2fI7HwjRctTWBhbAuVuxQWkYondxRoUe9Z3glWXnuO70CPO/GeWoMUN0 X-Received: by 2002:a17:902:aa8c:: with SMTP id d12-v6mr7176762plr.39.1518819832037; Fri, 16 Feb 2018 14:23:52 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518819832; cv=none; d=google.com; s=arc-20160816; b=bu3iDjb/YnLgArtCjrywdUawrgq9LZnl1hifwn7ZX5yvb/i6EKEmlO8jRaQq3NN5uq 9KD++LrnQZQOAmz+jQJp/7cs6Tsa78N4eNndhZClWNGESXG9aE6b8IYIEtv6huuh75G4 9iC6IWGQECZVyILccRBU4ajAboGmPKyg/6pRXX5G73ZN98fwhpwyX0TLVv3RgBYC5Tj2 fKF3/F6J8g5aKYMiP6wJeZN1644UCyAZcCVZmu+SHbCNkDbYwJ+YPT8eYgdo7bKSpZZQ VL1lXCXXU99Hh8oEogzMUNCSVsmdlBFWu6a8/y7lGAylASPZE/tnT+7i2xxvZ/0d3wF8 Lc8g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:message-id:in-reply-to :subject:cc:to:from:date:arc-authentication-results; bh=gqFZgcY3VIKCnXsbWKpllgh0Vbyc/CtB/wfsaL7xtsg=; b=M76ONLb/7hxak4zXdD2VgI0TjqzDkfbFF/hz3uhRHC/FQn8NSpuIDhAa69HO4DjfDk htrsnn+rud6LVqH5rb9NKW6LRXLs4Oa9kb3P0t6TsKCsAo10j61zrZq7305zyBu4Uf8z gdvl3wChFpuDCcr2ieMZiym56PBQ52z10wpYEc6DfPQL5boXUG+hCElJDoyKFEzmg/0C uGLdscmAxj8eQrcIfVXKzRo8gFNTsehW4N0HXmwh/pPuz2CEFjjVg9WRTwMSrEvV0zOy yfmhrDiK6F0OFE6nzD94S4nQOR9wuq252bnVG604hFZ6Uao0ftS9VFCXVkxPMFwRK/5f vCXA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id x32-v6si600575pld.595.2018.02.16.14.23.37; Fri, 16 Feb 2018 14:23:51 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1750838AbeBPWW5 (ORCPT + 99 others); Fri, 16 Feb 2018 17:22:57 -0500 Received: from iolanthe.rowland.org ([192.131.102.54]:54974 "HELO iolanthe.rowland.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1750727AbeBPWW4 (ORCPT ); Fri, 16 Feb 2018 17:22:56 -0500 Received: (qmail 6955 invoked by uid 2102); 16 Feb 2018 17:22:55 -0500 Received: from localhost (sendmail-bs@127.0.0.1) by localhost with SMTP; 16 Feb 2018 17:22:55 -0500 Date: Fri, 16 Feb 2018 17:22:55 -0500 (EST) From: Alan Stern X-X-Sender: stern@iolanthe.rowland.org To: "Paul E. McKenney" cc: Andrea Parri , Akira Yokosawa , Kernel development list , , Will Deacon , , , , , Jade Alglave , Luc Maranget , Patrick Bellasi Subject: [PATCH] tools/memory-model: remove rb-dep, smp_read_barrier_depends, and lockless_dereference In-Reply-To: <20180216165301.GF3617@linux.vnet.ibm.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Since commit 76ebbe78f739 ("locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE()") was merged for the 4.15 kernel, it has not been necessary to use smp_read_barrier_depends(). Similarly, commit 59ecbbe7b31c ("locking/barriers: Kill lockless_dereference()") removed lockless_dereference() from the kernel. Since these primitives are no longer part of the kernel, they do not belong in the Linux Kernel Memory Consistency Model. This patch removes them, along with the internal rb-dep relation, and updates the revelant documentation. Signed-off-by: Alan Stern --- Index: usb-4.x/tools/memory-model/linux-kernel.cat =================================================================== --- usb-4.x/tools/memory-model.orig/linux-kernel.cat +++ usb-4.x/tools/memory-model/linux-kernel.cat @@ -25,7 +25,6 @@ include "lock.cat" (*******************) (* Fences *) -let rb-dep = [R] ; fencerel(Rb_dep) ; [R] let rmb = [R \ Noreturn] ; fencerel(Rmb) ; [R \ Noreturn] let wmb = [W] ; fencerel(Wmb) ; [W] let mb = ([M] ; fencerel(Mb) ; [M]) | @@ -61,11 +60,9 @@ let dep = addr | data let rwdep = (dep | ctrl) ; [W] let overwrite = co | fr let to-w = rwdep | (overwrite & int) -let rrdep = addr | (dep ; rfi) -let strong-rrdep = rrdep+ & rb-dep -let to-r = strong-rrdep | rfi-rel-acq +let to-r = addr | (dep ; rfi) | rfi-rel-acq let fence = strong-fence | wmb | po-rel | rmb | acq-po -let ppo = rrdep* ; (to-r | to-w | fence) +let ppo = to-r | to-w | fence (* Propagation: Ordering from release operations and strong fences. *) let A-cumul(r) = rfe? ; r Index: usb-4.x/tools/memory-model/Documentation/explanation.txt =================================================================== --- usb-4.x/tools/memory-model.orig/Documentation/explanation.txt +++ usb-4.x/tools/memory-model/Documentation/explanation.txt @@ -1,5 +1,5 @@ -Explanation of the Linux-Kernel Memory Model -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Explanation of the Linux-Kernel Memory Consistency Model +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ :Author: Alan Stern :Created: October 2017 @@ -35,25 +35,24 @@ Explanation of the Linux-Kernel Memory M INTRODUCTION ------------ -The Linux-kernel memory model (LKMM) is rather complex and obscure. -This is particularly evident if you read through the linux-kernel.bell -and linux-kernel.cat files that make up the formal version of the -memory model; they are extremely terse and their meanings are far from -clear. +The Linux-kernel memory consistency model (LKMM) is rather complex and +obscure. This is particularly evident if you read through the +linux-kernel.bell and linux-kernel.cat files that make up the formal +version of the model; they are extremely terse and their meanings are +far from clear. This document describes the ideas underlying the LKMM. It is meant -for people who want to understand how the memory model was designed. -It does not go into the details of the code in the .bell and .cat -files; rather, it explains in English what the code expresses -symbolically. +for people who want to understand how the model was designed. It does +not go into the details of the code in the .bell and .cat files; +rather, it explains in English what the code expresses symbolically. Sections 2 (BACKGROUND) through 5 (ORDERING AND CYCLES) are aimed -toward beginners; they explain what memory models are and the basic -notions shared by all such models. People already familiar with these -concepts can skim or skip over them. Sections 6 (EVENTS) through 12 -(THE FROM_READS RELATION) describe the fundamental relations used in -many memory models. Starting in Section 13 (AN OPERATIONAL MODEL), -the workings of the LKMM itself are covered. +toward beginners; they explain what memory consistency models are and +the basic notions shared by all such models. People already familiar +with these concepts can skim or skip over them. Sections 6 (EVENTS) +through 12 (THE FROM_READS RELATION) describe the fundamental +relations used in many models. Starting in Section 13 (AN OPERATIONAL +MODEL), the workings of the LKMM itself are covered. Warning: The code examples in this document are not written in the proper format for litmus tests. They don't include a header line, the @@ -827,8 +826,8 @@ A-cumulative; they only affect the propa executed on C before the fence (i.e., those which precede the fence in program order). -smp_read_barrier_depends(), rcu_read_lock(), rcu_read_unlock(), and -synchronize_rcu() fences have other properties which we discuss later. +read_lock(), rcu_read_unlock(), and synchronize_rcu() fences have +other properties which we discuss later. PROPAGATION ORDER RELATION: cumul-fence @@ -988,8 +987,8 @@ Another possibility, not mentioned earli section, is: X and Y are both loads, X ->addr Y (i.e., there is an address - dependency from X to Y), and an smp_read_barrier_depends() - fence occurs between them. + dependency from X to Y), and X is a READ_ONCE() or an atomic + access. Dependencies can also cause instructions to be executed in program order. This is uncontroversial when the second instruction is a @@ -1015,9 +1014,9 @@ After all, a CPU cannot ask the memory s a particular location before it knows what that location is. However, the split-cache design used by Alpha can cause it to behave in a way that looks as if the loads were executed out of order (see the next -section for more details). For this reason, the LKMM does not include -address dependencies between read events in the ppo relation unless an -smp_read_barrier_depends() fence is present. +section for more details). The kernel includes a workaround for this +problem when the loads come from READ_ONCE(), and therefore the LKMM +includes address dependencies to loads in the ppo relation. On the other hand, dependencies can indirectly affect the ordering of two loads. This happens when there is a dependency from a load to a @@ -1114,11 +1113,12 @@ code such as the following: int *r1; int r2; - r1 = READ_ONCE(ptr); + r1 = ptr; r2 = READ_ONCE(*r1); } -can malfunction on Alpha systems. It is quite possible that r1 = &x +can malfunction on Alpha systems (notice that P1 uses an ordinary load +to read ptr instead of READ_ONCE()). It is quite possible that r1 = &x and r2 = 0 at the end, in spite of the address dependency. At first glance this doesn't seem to make sense. We know that the @@ -1141,11 +1141,15 @@ This could not have happened if the loca incoming stores in FIFO order. In constrast, other architectures maintain at least the appearance of FIFO order. -In practice, this difficulty is solved by inserting an -smp_read_barrier_depends() fence between P1's two loads. The effect -of this fence is to cause the CPU not to execute any po-later -instructions until after the local cache has finished processing all -the stores it has already received. Thus, if the code was changed to: +In practice, this difficulty is solved by inserting a special fence +between P1's two loads when the kernel is compiled for the Alpha +architecture. In fact, as of version 4.15, the kernel automatically +adds this fence (called smp_read_barrier_depends() and defined as +nothing at all on non-Alpha builds) after every READ_ONCE() and atomic +load. The effect of the fence is to cause the CPU not to execute any +po-later instructions until after the local cache has finished +processing all the stores it has already received. Thus, if the code +was changed to: P1() { @@ -1153,13 +1157,15 @@ the stores it has already received. Thu int r2; r1 = READ_ONCE(ptr); - smp_read_barrier_depends(); r2 = READ_ONCE(*r1); } then we would never get r1 = &x and r2 = 0. By the time P1 executed its second load, the x = 1 store would already be fully processed by -the local cache and available for satisfying the read request. +the local cache and available for satisfying the read request. Thus +we have yet another reason why shared data should always be read with +READ_ONCE() or another synchronization primitive rather than accessed +directly. The LKMM requires that smp_rmb(), acquire fences, and strong fences share this property with smp_read_barrier_depends(): They do not allow @@ -1751,11 +1757,10 @@ no further involvement from the CPU. Si the value of x, there is nothing for the smp_rmb() fence to act on. The LKMM defines a few extra synchronization operations in terms of -things we have already covered. In particular, rcu_dereference() and -lockless_dereference() are both treated as a READ_ONCE() followed by -smp_read_barrier_depends() -- which also happens to be how they are -defined in include/linux/rcupdate.h and include/linux/compiler.h, -respectively. +things we have already covered. In particular, rcu_dereference() is +treated as READ_ONCE() and rcu_assign_pointer() is treated as +smp_store_release() -- which is basically how the Linux kernel treats +them. There are a few oddball fences which need special treatment: smp_mb__before_atomic(), smp_mb__after_atomic(), and Index: usb-4.x/tools/memory-model/linux-kernel.bell =================================================================== --- usb-4.x/tools/memory-model.orig/linux-kernel.bell +++ usb-4.x/tools/memory-model/linux-kernel.bell @@ -24,7 +24,6 @@ instructions RMW[{'once,'acquire,'releas enum Barriers = 'wmb (*smp_wmb*) || 'rmb (*smp_rmb*) || 'mb (*smp_mb*) || - 'rb_dep (*smp_read_barrier_depends*) || 'rcu-lock (*rcu_read_lock*) || 'rcu-unlock (*rcu_read_unlock*) || 'sync-rcu (*synchronize_rcu*) || Index: usb-4.x/tools/memory-model/linux-kernel.def =================================================================== --- usb-4.x/tools/memory-model.orig/linux-kernel.def +++ usb-4.x/tools/memory-model/linux-kernel.def @@ -13,14 +13,12 @@ WRITE_ONCE(X,V) { __store{once}(X,V); } smp_store_release(X,V) { __store{release}(*X,V); } smp_load_acquire(X) __load{acquire}(*X) rcu_assign_pointer(X,V) { __store{release}(X,V); } -lockless_dereference(X) __load{lderef}(X) rcu_dereference(X) __load{deref}(X) // Fences smp_mb() { __fence{mb} ; } smp_rmb() { __fence{rmb} ; } smp_wmb() { __fence{wmb} ; } -smp_read_barrier_depends() { __fence{rb_dep}; } smp_mb__before_atomic() { __fence{before-atomic} ; } smp_mb__after_atomic() { __fence{after-atomic} ; } smp_mb__after_spinlock() { __fence{after-spinlock} ; } Index: usb-4.x/tools/memory-model/Documentation/cheatsheet.txt =================================================================== --- usb-4.x/tools/memory-model.orig/Documentation/cheatsheet.txt +++ usb-4.x/tools/memory-model/Documentation/cheatsheet.txt @@ -6,8 +6,7 @@ Store, e.g., WRITE_ONCE() Y Y Load, e.g., READ_ONCE() Y Y Y Unsuccessful RMW operation Y Y Y -smp_read_barrier_depends() Y Y Y -*_dereference() Y Y Y Y +rcu_dereference() Y Y Y Y Successful *_acquire() R Y Y Y Y Y Y Successful *_release() C Y Y Y W Y smp_rmb() Y R Y Y R