Received: by 2002:a05:6a10:1a4d:0:0:0:0 with SMTP id nk13csp1540113pxb; Wed, 2 Feb 2022 07:15:21 -0800 (PST) X-Google-Smtp-Source: ABdhPJzLGGV/GTnn8SFmuAiLCME4Egxn7Ia41h6eEg4mC0E0IR/bD0q26ecQ2kaqtpnTT3qpOc88 X-Received: by 2002:a05:6a00:a8e:: with SMTP id b14mr30108102pfl.24.1643814921442; Wed, 02 Feb 2022 07:15:21 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1643814921; cv=none; d=google.com; s=arc-20160816; b=fwjP02uJZG/vKaCGCBrGMmmajJEsu3cIXeOxlcOlZWxpx+N0r8jRdPlRMXU9LDP4qM gjJnClUGlWtBiVwWv1IMhESDI0XtB1nlPzfIXj6C5B86ychntG4gRH3fVKTxTt5YSFw9 8fgPXUvy38LLgXcKcNLSuE1afJCfFyB/Yfi0QjaPVW7Yyl8JVVQpCbljjtrzW2nlUBZk P+3DXOd70HeK7exOxGtd88xzMUFpmmLsoh1DMx1LJNHkFjhXoYsKrDoCX8Vb1LvdW7OP Sx7Dr28l8BXWgnNWRkx2Qbsof3wydEQ9Z7mPmziSSLjj2fXJUWN7/dvvXeon2iT4GG6d Jo6Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-transfer-encoding :content-disposition:mime-version:message-id:subject:cc:to:from:date; bh=ufDO76sBf6OB2vRN1x8Z9hxyUxzL87xBOe5bkqv1Ykk=; b=Zo1cc8K0I6MgjDwFSiO3vvOenD5A8srns4wb+zgVAuNhHEiAUBosJC2U1T0OU/YR1f 6b4RPFXtZ9AJIXZpOlUV4nwGwOqcQ9xJ4ct85Ip75wyHUMONbBz+fwpktbABf2ffaVrZ KKtLwFhpPXnufmSSdbY0SsoFvIvYr3IJ24CE2vzppHCk05ln6WP1j8yG4H+oRt+9JSnC VfDrkIjoZNzKsjY7wAJp3mJgPxvYLm5VLWJs2gMlPrtkSLeFLTRhhY2pMiYr6YXqGSbJ UwghxCl689RzFSR43o5KlVqGG3qPg6GttUUQF1NjUrSM5ReyW9wy22Kr3WO94zFkYhLD VcjQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id t2si18540861plg.360.2022.02.02.07.15.08; Wed, 02 Feb 2022 07:15:21 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242195AbiBATAN (ORCPT + 99 others); Tue, 1 Feb 2022 14:00:13 -0500 Received: from netrider.rowland.org ([192.131.102.5]:57641 "HELO netrider.rowland.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S242174AbiBATAJ (ORCPT ); Tue, 1 Feb 2022 14:00:09 -0500 Received: (qmail 316474 invoked by uid 1000); 1 Feb 2022 14:00:08 -0500 Date: Tue, 1 Feb 2022 14:00:08 -0500 From: Alan Stern To: "Paul E. McKenney" Cc: Paul =?iso-8859-1?Q?Heidekr=FCger?= , Andrea Parri , Will Deacon , Peter Zijlstra , Boqun Feng , Nicholas Piggin , David Howells , Jade Alglave , Luc Maranget , Akira Yokosawa , Daniel Lustig , Joel Fernandes , =?iso-8859-1?Q?Bj=F6rn_T=F6pel?= , linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org, Marco Elver , Charalampos Mainas , Pramod Bhatotia Subject: [PATCH v2] tools/memory-model: Explain syntactic and semantic dependencies Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Paul Heidekr?ger pointed out that the Linux Kernel Memory Model documentation doesn't mention the distinction between syntactic and semantic dependencies. This is an important difference, because the compiler can easily break dependencies that are only syntactic, not semantic. This patch adds a few paragraphs to the LKMM documentation explaining these issues and illustrating how they can matter. Suggested-by: Paul Heidekr?ger Signed-off-by: Alan Stern --- v2: Incorporate changes suggested by Paul McKenney, along with a few other minor edits. [as1970b] tools/memory-model/Documentation/explanation.txt | 51 +++++++++++++++++++++++ 1 file changed, 51 insertions(+) Index: usb-devel/tools/memory-model/Documentation/explanation.txt =================================================================== --- usb-devel.orig/tools/memory-model/Documentation/explanation.txt +++ usb-devel/tools/memory-model/Documentation/explanation.txt @@ -485,6 +485,57 @@ have R ->po X. It wouldn't make sense f somehow on a value that doesn't get loaded from shared memory until later in the code! +Here's a trick question: When is a dependency not a dependency? Answer: +When it is purely syntactic rather than semantic. We say a dependency +between two accesses is purely syntactic if the second access doesn't +actually depend on the result of the first. Here is a trivial example: + + r1 = READ_ONCE(x); + WRITE_ONCE(y, r1 * 0); + +There appears to be a data dependency from the load of x to the store +of y, since the value to be stored is computed from the value that was +loaded. But in fact, the value stored does not really depend on +anything since it will always be 0. Thus the data dependency is only +syntactic (it appears to exist in the code) but not semantic (the +second access will always be the same, regardless of the value of the +first access). Given code like this, a compiler could simply discard +the value returned by the load from x, which would certainly destroy +any dependency. (The compiler is not permitted to eliminate entirely +the load generated for a READ_ONCE() -- that's one of the nice +properties of READ_ONCE() -- but it is allowed to ignore the load's +value.) + +It's natural to object that no one in their right mind would write +code like the above. However, macro expansions can easily give rise +to this sort of thing, in ways that often are not apparent to the +programmer. + +Another mechanism that can lead to purely syntactic dependencies is +related to the notion of "undefined behavior". Certain program +behaviors are called "undefined" in the C language specification, +which means that when they occur there are no guarantees at all about +the outcome. Consider the following example: + + int a[1]; + int i; + + r1 = READ_ONCE(i); + r2 = READ_ONCE(a[r1]); + +Access beyond the end or before the beginning of an array is one kind +of undefined behavior. Therefore the compiler doesn't have to worry +about what will happen if r1 is nonzero, and it can assume that r1 +will always be zero regardless of the value actually loaded from i. +(If the assumption turns out to be wrong the resulting behavior will +be undefined anyway, so the compiler doesn't care!) Thus the value +from the load can be discarded, breaking the address dependency. + +The LKMM is unaware that purely syntactic dependencies are different +from semantic dependencies and therefore mistakenly predicts that the +accesses in the two examples above will be ordered. This is another +example of how the compiler can undermine the memory model. Be warned. + THE READS-FROM RELATION: rf, rfi, and rfe -----------------------------------------