Date: Sun, 6 Jun 2021 14:41:50 -0400
From: Alan Stern
To: Linus Torvalds
Cc: Segher Boessenkool, "Paul E. McKenney", Peter Zijlstra, Will Deacon,
	Andrea Parri, Boqun Feng, Nick Piggin, David Howells, Jade Alglave,
	Luc Maranget, Akira Yokosawa, Linux Kernel Mailing List,
	linux-toolchains@vger.kernel.org, linux-arch
Subject: Re: [RFC] LKMM: Add volatile_if()
Message-ID: <20210606184150.GA1742067@rowland.harvard.edu>
References: <20210604205600.GB4397@paulmck-ThinkPad-P17-Gen-1>
	<20210604214010.GD4397@paulmck-ThinkPad-P17-Gen-1>
	<20210605145739.GB1712909@rowland.harvard.edu>
	<20210606001418.GH4397@paulmck-ThinkPad-P17-Gen-1>
	<20210606012903.GA1723421@rowland.harvard.edu>

On Sat, Jun 05, 2021 at 08:41:00PM -0700, Linus Torvalds wrote:
> On Sat, Jun 5, 2021 at 6:29 PM Alan Stern wrote:
> >
> > Interesting.
> > And changing one of the branches from barrier() to __asm__
> > __volatile__("nop": : :"memory") also causes a branch to be emitted.  So
> > even though the compiler doesn't "look inside" assembly code, it does
> > compare two pieces at least textually and apparently assumes if they are
> > identical then they do the same thing.
>
> That's actually a feature in some cases, ie the ability to do CSE on
> asm statements (ie the "always has the same output" optimization that
> the docs talk about).
>
> So gcc has always looked at the asm string for that reason, afaik.
>
> I think it's something of a bug when it comes to "asm volatile", but
> the documentation isn't exactly super-specific.
>
> There is a statement of "Under certain circumstances, GCC may
> duplicate (or remove duplicates of) your assembly code when
> optimizing" and a suggestion of using "%=" to generate a unique
> instance of an asm.
>
> Which might actually be a good idea for "barrier()", just in case.
> However, the problem with that is that I don't think we are guaranteed
> to have a universal comment character for asm statements.
>
> IOW, it might be a good idea to do something like
>
>    #define barrier() \
>         __asm__ __volatile__("# barrier %=": : :"memory")
>
> but I'm not 100% convinced that '#' is always a comment in asm code,
> so the above might not actually build everywhere.
>
> However, *testing* the above (in my config, where '#' does work as a
> comment character) shows that gcc doesn't actually consider them to be
> distinct EVEN THEN, and will still merge two barrier statements.
>
> That's distressing.
>
> So the gcc docs are actively wrong, and %= does nothing - it will
> still compare as the exact same inline asm, because the string
> equality testing is apparently done before any expansion.
>
> Something like this *does* seem to work:
>
>    #define ____barrier(id) __asm__ __volatile__("#" #id: : :"memory")
>    #define __barrier(id) ____barrier(id)
>    #define barrier() __barrier(__COUNTER__)
>
> which is "interesting" or "disgusting" depending on how you happen to feel.
>
> And again - the above works only as long as "#" is a valid comment
> character in the assembler. And I have this very dim memory of us
> having comments in inline asm, and it breaking certain configurations
> (for when the assembler that the compiler uses is a special
> human-unfriendly one that only accepts compiler output).
>
> You could make even more disgusting hacks, and have it generate something like
>
>     .pushsection .discard.barrier
>     .long #id
>     .popsection
>
> instead of a comment. We already expect that to work and have generic
> inline asm cases that generate code like that.

I tried the experiment with this code:

#define READ_ONCE(x) (*(volatile typeof(x) *)&(x))
#define WRITE_ONCE(x, val) (READ_ONCE(x) = (val))
#define barrier() __asm__ __volatile__("": : :"memory")

int x, y;

int main(int argc, char *argv[])
{
	if (READ_ONCE(x)) {
		barrier();
		y = 1;
	} else {
		y = 1;
	}
	return 0;
}

The output from gcc -O2 is:

main:
        mov     eax, DWORD PTR x[rip]
        test    eax, eax
        je      .L2
.L2:
        mov     DWORD PTR y[rip], 1

The output from clang is essentially the same (the mov and test are
replaced by a cmp).

This does what we want, but I wouldn't bet against a future optimization
pass getting rid of the "useless" test and branch.

Alan
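
For anyone who wants to repeat the test with a barrier in both arms, here is a minimal sketch of the __COUNTER__ variant quoted above, written as a standalone file for GCC or Clang. The names BARRIER_TEXT, unique_barrier, store_after_check and barrier_texts_differ are made up for illustration (the quoted macros use double-underscore names, which are reserved outside the kernel), and the x86-only guard encodes the assumption that '#' starts an assembler comment there. The sketch shows only that each expansion site gets distinct asm text. Whether a given compiler then keeps the branch has to be checked in its output; nothing here guarantees it.

```c
#include <string.h>

#define READ_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/*
 * Two-level expansion: the argument must be macro-expanded to a number
 * before '#' stringifies it, otherwise every site would read "#__COUNTER__".
 */
#define BARRIER_TEXT_(id) "#" #id
#define BARRIER_TEXT(id)  BARRIER_TEXT_(id)

#if defined(__x86_64__) || defined(__i386__)
/* Assumption: '#' begins a comment for the assembler in use (GNU as on x86). */
#define unique_barrier() \
	__asm__ __volatile__(BARRIER_TEXT(__COUNTER__) : : : "memory")
#else
/* Elsewhere fall back to the plain barrier; '#' may not be a comment here. */
#define unique_barrier() __asm__ __volatile__("" : : : "memory")
#endif

int x, y;

/* Both arms store to y, and each arm has its own, textually distinct barrier. */
void store_after_check(void)
{
	if (READ_ONCE(x)) {
		unique_barrier();
		y = 1;
	} else {
		unique_barrier();
		y = 1;
	}
}

/* Returns nonzero if two expansion sites produced different asm text. */
int barrier_texts_differ(void)
{
	const char *a = BARRIER_TEXT(__COUNTER__);
	const char *b = BARRIER_TEXT(__COUNTER__);

	return strcmp(a, b) != 0;
}
```

Compiling this with gcc -O2 -S and comparing store_after_check() against a version that uses the plain barrier() in both arms is one way to see whether the two arms still get merged.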