Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965591AbdGTVNE (ORCPT ); Thu, 20 Jul 2017 17:13:04 -0400 Received: from mail-pg0-f49.google.com ([74.125.83.49]:36066 "EHLO mail-pg0-f49.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933285AbdGTVNB (ORCPT ); Thu, 20 Jul 2017 17:13:01 -0400 Subject: Re: [PATCH] documentation: Fix two-CPU control-dependency example To: "Paul E. McKenney" Cc: Boqun Feng , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, Akira Yokosawa References: <0d60072f-3848-a2b7-ce18-e1ed317b5c09@gmail.com> <20170719174334.GH3730@linux.vnet.ibm.com> <101f5108-663e-7fa4-ac2b-e790320e4e6f@gmail.com> <20170719215602.GK3730@linux.vnet.ibm.com> <20170720013112.fmrml6abdhi2nqdt@tardis> <20170720054704.GM3730@linux.vnet.ibm.com> <20170720161152.GQ3730@linux.vnet.ibm.com> From: Akira Yokosawa Message-ID: Date: Fri, 21 Jul 2017 06:12:56 +0900 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 MIME-Version: 1.0 In-Reply-To: <20170720161152.GQ3730@linux.vnet.ibm.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 9958 Lines: 260 On 2017/07/20 09:11:52AM -0700, Paul E. McKenney wrote: > On Thu, Jul 20, 2017 at 09:55:31PM +0900, Akira Yokosawa wrote: >> On 2017/07/20 14:47, Paul E. McKenney wrote: >>> On Thu, Jul 20, 2017 at 09:31:41AM +0800, Boqun Feng wrote: >>>> On Wed, Jul 19, 2017 at 02:56:02PM -0700, Paul E. McKenney wrote: >>>>> On Thu, Jul 20, 2017 at 06:33:26AM +0900, Akira Yokosawa wrote: >>>>>> On 2017/07/20 2:43, Paul E. McKenney wrote: >>>>>>> On Mon, Jul 17, 2017 at 05:24:42PM +0900, Akira Yokosawa wrote: >>>>>>>> >From b798b9b631e237d285aa8699da00bfb8ced33bea Mon Sep 17 00:00:00 2001 >>>>>>>> From: Akira Yokosawa >>>>>>>> Date: Mon, 17 Jul 2017 16:25:33 +0900 >>>>>>>> Subject: [PATCH] documentation: Fix two-CPU control-dependency example >>>>>>>> >>>>>>>> In commit 5646f7acc95f ("memory-barriers: Fix control-ordering >>>>>>>> no-transitivity example"), the operator in "if" statement of >>>>>>>> the two-CPU example was modified from ">=" to ">". >>>>>>>> Now the example misses the point because there is no party >>>>>>>> who will modify "x" nor "y". So each CPU performs only the >>>>>>>> READ_ONCE(). >>>>>>>> >>>>>>>> The point of this example is to use control dependency for ordering, >>>>>>>> and the WRITE_ONCE() should always be executed. >>>>>>>> >>>>>>>> So it was correct prior to the above mentioned commit. Partial >>>>>>>> revert of the commit (with context adjustments regarding other >>>>>>>> changes thereafter) restores the point. >>>>>>>> >>>>>>>> Note that the three-CPU example demonstrating the lack of >>>>>>>> transitivity stands regardless of this partial revert. >>>>>>>> >>>>>>>> Signed-off-by: Akira Yokosawa >>>>>>> >>>>>>> Hello, Akira, >>>>>>> >>>>>>> You are quite right that many compilers would generate straightforward >>>>>>> code for the code fragment below, and in that case, the assertion could >>>>>>> never trigger due to either TSO or control dependencies, depending on >>>>>>> the architecture this was running on. >>>>>>> >>>>>>> However, if the compiler was too smart for our good, it could figure >>>>>>> out that "x" and "y" only take on the values zero and one, so that >>>>>>> the "if" would always be taken. At that point, the compiler could >>>>>>> simply ignore the "if" with the result shown below. >>>>>>> >>>>>>>> --- >>>>>>>> Documentation/memory-barriers.txt | 2 +- >>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>>>> >>>>>>>> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt >>>>>>>> index c4ddfcd..c1ebe99 100644 >>>>>>>> --- a/Documentation/memory-barriers.txt >>>>>>>> +++ b/Documentation/memory-barriers.txt >>>>>>>> @@ -851,7 +851,7 @@ demonstrated by two related examples, with the initial values of >>>>>>>> CPU 0 CPU 1 >>>>>>>> ======================= ======================= >>>>>>>> r1 = READ_ONCE(x); r2 = READ_ONCE(y); >>>>>>>> - if (r1 > 0) if (r2 > 0) >>>>>>>> + if (r1 >= 0) if (r2 >= 0) >>>>>>>> WRITE_ONCE(y, 1); WRITE_ONCE(x, 1); >>>>>>>> >>>>>>>> assert(!(r1 == 1 && r2 == 1)); >>>>>>> >>>>>>> Original program: >>>>>>> >>>>>>> CPU 0 CPU 1 >>>>>>> ======================= ======================= >>>>>>> r1 = READ_ONCE(x); r2 = READ_ONCE(y); >>>>>>> if (r1 >= 0) if (r2 >= 0) >>>>>>> WRITE_ONCE(y, 1); WRITE_ONCE(x, 1); >>>>>>> >>>>>>> assert(!(r1 == 1 && r2 == 1)); >>>>>>> >>>>>>> Enthusiastically optimized program: >>>>>>> >>>>>>> CPU 0 CPU 1 >>>>>>> ======================= ======================= >>>>>>> r1 = READ_ONCE(x); r2 = READ_ONCE(y); >>>>>>> WRITE_ONCE(y, 1); WRITE_ONCE(x, 1); >>>>>>> >>>>>>> assert(!(r1 == 1 && r2 == 1)); >>>>>>> >>>>>>> This optimized version could trigger the assertion. >>>>>>> >>>>>>> So we do need to stick with the ">" comparison. >>>>>> >>>>>> Well but, >>>>>> >>>>>> Current example: >>>>>> >>>>>> CPU 0 CPU 1 >>>>>> ======================= ======================= >>>>>> r1 = READ_ONCE(x); r2 = READ_ONCE(y); >>>>>> if (r1 > 0) if (r2 > 0) >>>>>> WRITE_ONCE(y, 1); WRITE_ONCE(x, 1); >>>>>> >>>>>> assert(!(r1 == 1 && r2 == 1)); >>>>>> >>>>>> Such a clever compiler might be able to prove that "x" and "y" >>>>>> are never modified and end up in the following: >>>>>> >>>> >>>> Hi Akira, >>>> >>>> I wouldn't call that compiler a clever one, it's a broken one ;-) >>>> >>>> So here is the thing: READ_ONCE() is a *volatile* load, which means the >>>> compiler has to generate code that actually does a load, so the values >>>> of r1 and r2 depend on the loads, therefore, a sane compiler will not >>>> optimize the if()s out because the volatile semantics of READ_ONCE(). >>>> Otherwise, I think we have a lot more to worry about than this case. >>>> >>>>>> CPU 0 CPU 1 >>>>>> ======================= ======================= >>>>>> r1 = READ_ONCE(x); r2 = READ_ONCE(y); >>>>>> >>>>>> assert(!(r1 == 1 && r2 == 1)); >>>>>> >>>>>> This means it is impossible to describe this example in C, >>>>>> doesn't it? >>>>> >>>>> That is a matter of some debate, but it has gotten increasingly more >>>>> difficult to get C to do one's bidding over the decades. >>>>> >>>>>> What am I missing here? >>>>> >>>>> The compiler has to work harder in your example case, so it is probably >>>>> just a matter of time. In the original with ">=", all the compiler needed >>>>> to do was look at all the assignments and the initial value. In the >>>>> original, it also had to do reasoning based on control dependencies >>>>> (which are not yet part of the C standard). >>>>> >>>>> But yes, the degree to which a compiler can optimize atomics is a matter >>>>> of some debate. Here is a proposal to let the developer choose: >>>>> >>>> >>>> Hi Paul, >>>> >>>> I know the compiler could optimize atomics in very interesting ways, but >>>> this case is about volatile, so I guess our case is still fine? ;-) >>> >>> Hello, Boqun, >>> >>> When I asked that question, the answer I got was "the compiler must >>> emit the load instruction, but is under no obligation to actually use the >>> value loaded". >> >> So, it sounds like the following description found in memory-barriers.txt >> just above the example of lack of transitivity: >> >>> However, stores are not speculated. This means that ordering -is- provided >>> for load-store control dependencies, as in the following example: >>> >>> q = READ_ONCE(a); >>> if (q) { >>> WRITE_ONCE(b, 1); >>> } >>> >>> Control dependencies pair normally with other types of barriers. >>> That said, please note that neither READ_ONCE() nor WRITE_ONCE() >>> are optional! Without the READ_ONCE(), the compiler might combine the >>> load from 'a' with other loads from 'a'. Without the WRITE_ONCE(), >>> the compiler might combine the store to 'b' with other stores to 'b'. >>> Either can result in highly counterintuitive effects on ordering. >>> >>> Worse yet, if the compiler is able to prove (say) that the value of >>> variable 'a' is always non-zero, it would be well within its rights >>> to optimize the original example by eliminating the "if" statement >>> as follows: >>> >>> q = a; >>> b = 1; /* BUG: Compiler and CPU can both reorder!!! */ >>> >>> So don't leave out the READ_ONCE(). >> >> does not stand if the answer needs to be taken seriously. >> The READ_ONCE() is not good enough to prevent the "if" statement >> from being eliminated. >> >> Hmm... If we do need to care about such an extreme optimization, >> we should not rely on any control dependency in C source code >> at least until the compiler gets educated about control dependency. >> >> Is there some reasonable compromise? > > Right now, what we have are the volatile loads such as from READ_ONCE() > and WRITE_ONCE(). My defense of them has been that they need to properly > handle MMIO. But not all compiler writers agree with me, so we should > program at least a bit defensively. Though we should probably also > be writing tests to verify that the compiler is doing the right thing, > and those litmus tests should probably include your ">=" litmus test. > But we should not encourage developers to rely on it quite yet. > > My current position is that compiler writers are currently unlikely to > make their compilers do deep reasoning around the memory model, and that > they are currently unlikely to do cross-translation-unit assigned-value > analysis (though LTO might be changing that). But given a static atomic > that is visible only within a translation unit, I would expect them > to do value-range analysis. Hence my reluctance to present the ">=" > variant as a pattern to follow. So if we declare "x" and "y" are not static (not file local), can the ">=" variant be exempted from the possible optimization? For example: extern int x, y; CPU 0 CPU 1 ======================= ======================= r1 = READ_ONCE(x); r2 = READ_ONCE(y); if (r1 >= 0) if (r2 >= 0) WRITE_ONCE(y, 1); WRITE_ONCE(x, 1); assert(!(r1 == 1 && r2 == 1)); This looks safe to me. Thoughts? Thanks, Akira > > Thanx, Paul > > >> Thanks, Akira >> >>> >>> I don't happen to like that answer, by the way. ;-) >>> >>> Thanx, Paul >>> >>>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0062r1.html >>>>> >>>> >>>> Great material to wake up mind in the morning! Thanks. >>>> >>>> Regards, >>>> Boqun >>>> >>>>> What are your thoughts on this? >>>>> >>>>> Thanx, Paul >>>>> >>>>>> Thanks, Akira >>>>>> >>>>>>> That said, I very much welcome critical reviews of memory-barriers.txt, >>>>>>> so please do feel free to continue doing that! >>>>>>> >>>>>>> Thanx, Paul >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >>> >> > >