Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754861AbdGUXjF (ORCPT ); Fri, 21 Jul 2017 19:39:05 -0400
Received: from mail-pg0-f49.google.com ([74.125.83.49]:34794 "EHLO
	mail-pg0-f49.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754767AbdGUXjD (ORCPT );
	Fri, 21 Jul 2017 19:39:03 -0400
Subject: Re: [PATCH] documentation: Fix two-CPU control-dependency example
To: "Paul E. McKenney"
Cc: Boqun Feng, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	Akira Yokosawa
References: <20170719174334.GH3730@linux.vnet.ibm.com>
	<101f5108-663e-7fa4-ac2b-e790320e4e6f@gmail.com>
	<20170719215602.GK3730@linux.vnet.ibm.com>
	<20170720013112.fmrml6abdhi2nqdt@tardis>
	<20170720054704.GM3730@linux.vnet.ibm.com>
	<20170720161152.GQ3730@linux.vnet.ibm.com>
	<20170720214234.GY3730@linux.vnet.ibm.com>
	<55457ca1-a8db-213c-3b9c-ead441f97200@gmail.com>
	<20170720230714.GA3730@linux.vnet.ibm.com>
From: Akira Yokosawa
Message-ID: <6ce659de-6c92-dbd8-e1dd-90f3e85521c0@gmail.com>
Date: Sat, 22 Jul 2017 08:38:57 +0900
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101
	Thunderbird/52.2.1
MIME-Version: 1.0
In-Reply-To: <20170720230714.GA3730@linux.vnet.ibm.com>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2782
Lines: 82

On 2017/07/20 16:07:14 -0700, Paul E. McKenney wrote:
> On Fri, Jul 21, 2017 at 07:52:03AM +0900, Akira Yokosawa wrote:
>> On 2017/07/20 14:42:34 -0700, Paul E. McKenney wrote:
[...]
>>> For the compilers I know about at the present time, yes.
>>
>> So if I respin the patch with the extern, would you still feel reluctant?
>
> Yes, because I am not seeing how this change helps.  What is this telling
> the reader that the original did not, and how does it help the reader
> generate good concurrent code?

Well, what bothers me in the ">" version of the two-CPU example is the
explanation in memory-barriers.txt that follows:

> These two examples are the LB and WWC litmus tests from this paper:
> http://www.cl.cam.ac.uk/users/pes20/ppc-supplemental/test6.pdf and this
> site: https://www.cl.cam.ac.uk/~pes20/ppcmem/index.html.

I'm wondering if calling the ">" version an "LB" litmus test is correct,
because it always results in "r1 == 0 && r2 == 0", 100% of the time.

An LB litmus test with full memory barriers would be:

	CPU 0                     CPU 1
	=======================   =======================
	r1 = READ_ONCE(x);        r2 = READ_ONCE(y);
	smp_mb();                 smp_mb();
	WRITE_ONCE(y, 1);         WRITE_ONCE(x, 1);

	assert(!(r1 == 1 && r2 == 1));

and this will result in one of

	r1 == 0 && r2 == 0
	r1 == 0 && r2 == 1
	r1 == 1 && r2 == 0

but never "r1 == 1 && r2 == 1".

The difference in behavior distracts me when reading this part of
memory-barriers.txt.

Your priority seemed to be reducing the chance of the "if" statement
being optimized away.  So I suggested using "extern" as a compromise.

Another way would be to express the ">=" version in a pseudo-asm form.

	CPU 0                     CPU 1
	=======================   =======================
	r1 = LOAD x               r2 = LOAD y
	if (r1 >= 0)              if (r2 >= 0)
	  STORE y = 1               STORE x = 1

	assert(!(r1 == 1 && r2 == 1));

This should eliminate any concern about compiler optimization.

In this final part of the CONTROL DEPENDENCIES section, separating the
problem of optimization from that of transitivity would clarify the point
(at least for me).

Thoughts?

      Regards, Akira

>
> 							Thanx, Paul
>
>>       Regards, Akira
>>
>>>
>>> The underlying problem is that the standard says almost nothing about what
>>> volatile does.  I usually argue that it was intended to be used for MMIO,
>>> so any optimization that would prevent a device driver from working must
>>> be prohibited on volatiles.  A lot of people really don't like volatile,
>>> and would like to eliminate it, and these people also aren't very happy
>>> about any attempt to better define volatile.
>>>
>>> Keeps things entertaining.  ;-)
>>>
>>> 							Thanx, Paul
>>>
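
P.S. To make the difference between the ">" version, the ">=" version, and
the full-barrier LB test concrete, here is a rough sketch that enumerates
the outcomes of each. It is only a toy model, not anything from
memory-barriers.txt: it assumes that smp_mb() (or a control dependency the
compiler has preserved) keeps each CPU's load ahead of its store, so the
possible executions are simply the interleavings that respect program
order. It does not model compiler optimizations or weaker hardware
reorderings. The names interleavings(), outcomes() and guard are made up
for this illustration.

```python
def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def outcomes(guard):
    """Return the set of (r1, r2) results over all interleavings.

    guard(r) models the "if" in front of each store: the store to the
    other variable happens only when guard(loaded value) is true.
    Assumption: load-before-store order is kept on each CPU.
    """
    cpu0 = [("load", "x", "r1"), ("store", "y", "r1")]
    cpu1 = [("load", "y", "r2"), ("store", "x", "r2")]
    results = set()
    for schedule in interleavings(cpu0, cpu1):
        mem = {"x": 0, "y": 0}  # both variables start at zero
        regs = {}
        for op, var, reg in schedule:
            if op == "load":
                regs[reg] = mem[var]
            elif guard(regs[reg]):
                mem[var] = 1
        results.add((regs["r1"], regs["r2"]))
    return results


lb_full_barrier = outcomes(lambda r: True)  # unconditional store after smp_mb()
gt_version = outcomes(lambda r: r > 0)      # if (r > 0) store
ge_version = outcomes(lambda r: r >= 0)     # if (r >= 0) store

for name, result in [("LB with smp_mb()", lb_full_barrier),
                     ('">" version', gt_version),
                     ('">=" version', ge_version)]:
    print(name, sorted(result))
```

Under these assumptions the ">" version can only ever produce
"r1 == 0 && r2 == 0", while the ">=" version produces the same three
outcomes as the full-barrier LB test.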