Received: by 2002:a25:c205:0:0:0:0:0 with SMTP id s5csp957271ybf; Fri, 28 Feb 2020 10:55:53 -0800 (PST) X-Google-Smtp-Source: APXvYqznj6R14L9w9tC9QXlmOisfBWOWky6sNS0UPO9qDZVMCtnDPk08h7/z8y+qp1Q8HTz4Oogx X-Received: by 2002:a9d:638f:: with SMTP id w15mr3770034otk.239.1582916153807; Fri, 28 Feb 2020 10:55:53 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1582916153; cv=none; d=google.com; s=arc-20160816; b=egInL+EunrcPz/gtOMguxNtAcn+p++43e0r2i+RkMLw55hqhBtrQ9IpV7e2CRawOHs X7LSSAFenHKAzW5r92rNOmuD8bGSioVeoqt2dneHiY1OD8YXlF/vkannPTITw9Y1N8MG ULCmK6Uq3tXp9o1qTyGBZcTkGj6yN22y2DBOvnCRK0uUsoLppBYidRr8E5krTnZW0yRt jU8XD+nkKxLQ7cfgdcmrUDTKup7uS+qAG8QjFqEHfD4wJTUK/efsLYldigHw2svRyUeb /fiBoYhWq11uObH3JF9X1GP4Bs1DveL9+8E00jgg2hLdmIPXpLfDfcUBDScq1Ui6tWXR w6Ng== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=CIzu0bxWTlPcGKncH1ZbGeY7b+QCcXLXrxcS18DRBR4=; b=JEvwbWbs7MzMt40aD8V7/ak0tUn/yzlSubsVl0+kqFbO36pOKNai3CQcq8gz92ie4i LY030JGUqN6CRhQ/vcN02DeKBJdSsYgd7LIs1eT61twguBuSOdl2q2MmvBuDMhiT+5B8 /S2VrkNHQjTiv5im8ZqMwOetI9UPEGETMTRym9KnR0/t5Gcj+6xd7va7HxaPBTmy4d11 Sd2s9u+FLkiu8eMg9uzWViUcllowIxQ+fkpDzqxvD3QZc7ydhJDcKTh9ekipqm1hNz/u sKlZnYuD8nzDlouuqnrgdkt1PCbqj7/ZsWsgkhY6K4zgIwqgQW1q3Jv+jhj6cat2jojU Ktcg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=JfzBdla9; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 74si1942011otd.45.2020.02.28.10.55.41; Fri, 28 Feb 2020 10:55:53 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=JfzBdla9; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726687AbgB1SyY (ORCPT + 99 others); Fri, 28 Feb 2020 13:54:24 -0500 Received: from mail-oi1-f194.google.com ([209.85.167.194]:44524 "EHLO mail-oi1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725730AbgB1SyX (ORCPT ); Fri, 28 Feb 2020 13:54:23 -0500 Received: by mail-oi1-f194.google.com with SMTP id d62so3808088oia.11 for ; Fri, 28 Feb 2020 10:54:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=CIzu0bxWTlPcGKncH1ZbGeY7b+QCcXLXrxcS18DRBR4=; b=JfzBdla9ktw4fNC787hZT7zc7j425gzVyF0X5hWY8/xN+j/URuzBbBJxRmnDSDQVOx aHQ3Fn0kKxgik2QOZOClFGPSIlVzAU8xJnSxD50jiKpsTG7ec8eTP8VKV8YVEwXVg7Hz kkecREIVFjJAz+DWyiGrYGv7wzsLIrGaNBE8576fe0ZdJlh3qdW/ZCrWUv+zcNffonvP HbiV8zBzNnh7FWsh0NSA5zjdpyqgCzxD+ifcv1IGOPxHrmIWm4zKpsmZ/2MpFeZQ3uU2 TPz4gOE650Toz2auA7PHb81dtyYUsqxgpPyM95xNNayrcko4S5oWaXejfpB7Bu0T1Ynf SQjQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=CIzu0bxWTlPcGKncH1ZbGeY7b+QCcXLXrxcS18DRBR4=; b=rU0wx0hPY3UZlWTVALnn/rPL061Lei4ruF2p/ymSinY0jouN8hCQnP0b/AxFDzbvPj 2wx+AaYHndSyKGKGrGYi2nzuxMvrgGvSqOxUnbnks1A8K5yz2Od4zcO21y6ZxegTm7Iu y8x67M/l6hM+FsvhqXvUuUhxu17CkV33L4ltEVWhixUe7OUo+NsUblv0I905iroLhpY1 qnHGoTuyA6zOhz1T1YoFbeShoyKCLFXikpVkN3FO2A+NRfn5nZsMvC+0OR9CehdHk4CK 9OSGVTpxJK6oUH4JwngXn5T8XcnUUj/E0M20GsVJFYlMDjLdOEBZZbc0yBf1TneTnYx6 YH9w== X-Gm-Message-State: APjAAAWjd6P6M5dm2QaAFdCo1ik2qoMZBLYmN/2vfWZVPb/yRDYF/lCN S7tbFMyXX0wGSYwgbckeSEK3EsZNqTu3hCUR9hus1w== X-Received: by 2002:a54:4510:: with SMTP id l16mr4143519oil.70.1582916062251; Fri, 28 Feb 2020 10:54:22 -0800 (PST) MIME-Version: 1.0 References: <20200228164621.87523-1-elver@google.com> In-Reply-To: From: Marco Elver Date: Fri, 28 Feb 2020 19:54:10 +0100 Message-ID: Subject: Re: [PATCH] tools/memory-model/Documentation: Fix "conflict" definition To: Alan Stern Cc: "Paul E. McKenney" , Andrey Konovalov , Alexander Potapenko , Dmitry Vyukov , kasan-dev , LKML , Andrea Parri , Will Deacon , Peter Zijlstra , Boqun Feng , Nicholas Piggin , David Howells , Jade Alglave , Luc Maranget , LKMM Maintainers -- Akira Yokosawa , Daniel Lustig , Joel Fernandes , linux-arch Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, 28 Feb 2020 at 18:24, Alan Stern wrote: > > On Fri, 28 Feb 2020, Marco Elver wrote: > > > For language-level memory consistency models that are adaptations of > > data-race-free, the definition of "data race" can be summarized as > > "concurrent conflicting accesses, where at least one is non-sync/plain". > > > > The definition of "conflict" should not include the type of access nor > > whether the accesses are concurrent or not, which this patch addresses > > for explanation.txt. > > Why shouldn't it? Can you provide any references to justify this > assertion? The definition of "conflict" as we know it and is cited by various papers on memory consistency models appeared in [1]: "Two accesses to the same variable conflict if at least one is a write; two operations conflict if they execute conflicting accesses." The LKMM as well as C11 are adaptations of data-race-free, which are based on the work in [2]. Necessarily, we need both conflicting data operations (plain) and synchronization operations (marked). C11's definition is based on [3], which defines a "data race" as: "Two memory operations conflict if they access the same memory location, and at least one of them is a store, atomic store, or atomic read-modify-write operation. In a sequentially consistent execution, two memory operations from different threads form a type 1 data race if they conflict, at least one of them is a data operation, and they are adjacent in Also, note two things: (1) The existing text does not include > concurrency in the definition of "conflict". (2) Your new text does > include the type of access in the definition (you say that at least one > of the accesses must be a write). Yes, "conflict" is defined in terms of "access to the same memory location and at least one performs a write" (can be any operation that performs a write, including RMWs etc.). It should not include concurrency. We can have conflicting operations that are not concurrent, but these will never be data races. > > The definition of "data race" remains unchanged, but the informal > > definition for "conflict" is restored to what can be found in the > > literature. > > It does not remain unchanged. You removed the portion that talks about > accesses executing on different CPUs or threads. Without that > restriction, you raise the nonsensical possibility that a single thread > may by definition have a data race with itself (since modern CPUs use > multiple-instruction dispatch, in which several instructions can > execute at the same time). Andrea raised the point that "occur on different CPUs (or in different threads on the same CPU)" can be interpreted as "in different threads [even if they are serialized via some other synchronization]". Arguably, no sane memory model or abstract machine model permits observable intra-thread concurrency of instructions in the same thread. At the abstract machine level, whether or not there is true parallelism shouldn't be something that the model concerns itself with. Simply talking about "concurrency" is unambiguous, unless the model says intra-thread concurrency is a thing. I can add it back if it helps make this clearer, but we need to mention both. > > Signed-by: Marco Elver > > --- > > tools/memory-model/Documentation/explanation.txt | 15 ++++++--------- > > 1 file changed, 6 insertions(+), 9 deletions(-) > > > > diff --git a/tools/memory-model/Documentation/explanation.txt b/tools/memory-model/Documentation/explanation.txt > > index e91a2eb19592a..11cf89b5b85d9 100644 > > --- a/tools/memory-model/Documentation/explanation.txt > > +++ b/tools/memory-model/Documentation/explanation.txt > > @@ -1986,18 +1986,15 @@ violates the compiler's assumptions, which would render the ultimate > > outcome undefined. > > > > In technical terms, the compiler is allowed to assume that when the > > -program executes, there will not be any data races. A "data race" > > -occurs when two conflicting memory accesses execute concurrently; > > -two memory accesses "conflict" if: > > +program executes, there will not be any data races. A "data race" > > Unnecessary (and inconsistent with the rest of the document) whitespace > change. Reverted. > > +occurs if: > > > > - they access the same location, > > + two concurrent memory accesses "conflict"; > > > > - they occur on different CPUs (or in different threads on the > > - same CPU), > > + and at least one of the accesses is a plain access; > > > > - at least one of them is a plain access, > > - > > - and at least one of them is a store. > > + where two memory accesses "conflict" if they access the same > > + memory location, and at least one performs a write; > > > > The LKMM tries to determine whether a program contains two conflicting > > accesses which may execute concurrently; if it does then the LKMM says > > To tell the truth, the only major change I can see here (apart from the > "differenct CPUs" restriction) is that you want to remove the "at least > one is plain" part from the definition of "conflict" and instead make > it a separate requirement for a data race. That's fine with me in > principle, but there ought to be an easier way of doing it. Yes pretty much. The model needs to be able to talk about "conflicting synchronization accesses" where all accesses are marked. Right now the definition of conflict doesn't permit that. > Furthermore, this section of explanation.txt goes on to use the words > "conflict" and "conflicting" in a way that your patch doesn't address. > For example, shortly after this spot it says "Determining whether two > accesses conflict is easy"; you should change it to say "Determining > whether two accesses conflict and at least one of them is plain is > easy" -- but this looks pretty ungainly. A better approach might be to > introduce a new term, define it to mean "conflicting accesses at least > one of which is plain", and then use it instead throughout. The definition of "conflict" as used in the later text is synonymous with "data race". > Alternatively, you could simply leave the text as it stands and just > add a parenthetical disclaimer pointing out that in the CS literature, > the term "conflict" is used even when both accesses are marked, so the > usage here is somewhat non-standard. The definition of what a "conflict" is, is decades old [1, 2]. I merely thought we should avoid changing fundamental definitions that have not changed in decades, to avoid confusing people. The literature on memory models is confusing enough, so fundamental definitions that are "common ground" shouldn't be changed if it can be avoided. I think here it is pretty trivial to avoid. Thanks, -- Marco