Date: Tue, 4 Jul 2023 16:25:45 -0400
From: Alan Stern
To: Olivier Dion
Cc: Mathieu Desnoyers, rnk@google.com, Andrea Parri, Will Deacon,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, "Paul E. McKenney", Nathan Chancellor,
	Nick Desaulniers, Tom Rix, linux-kernel@vger.kernel.org,
	linux-arch@vger.kernel.org, gcc@gcc.gnu.org, llvm@lists.linux.dev
Subject: Re: [RFC] Bridging the gap between the Linux Kernel Memory
	Consistency Model (LKMM) and C11/C++11 atomics
References: <87ttukdcow.fsf@laura> <87ilazd278.fsf@laura>
In-Reply-To: <87ilazd278.fsf@laura>

On Tue, Jul 04, 2023 at 01:19:23PM -0400, Olivier Dion wrote:
> On Mon, 03 Jul 2023, Alan Stern wrote:
> > On Mon, Jul 03, 2023 at 03:20:31PM -0400, Olivier Dion wrote:
> >> This is a request for comments on extending the atomic builtins API
> >> to help avoid redundant memory barriers.  Indeed, there are
> >
> > What atomic builtins API are you talking about?  The kernel's?  That's
> > what it sounded like when I first read this sentence -- why else post
> > your message on a kernel mailing list?
>
> Good point, we meant the `__atomic' builtins from GCC and Clang.  Sorry
> for the confusion.

Oh, is that it?  Then I misunderstood entirely; I thought you were
talking about augmenting the set of functions or macros made available
in liburcu.  I did not realize you intended to change the compilers.

> Indeed, our intent is to discuss the Userspace RCU uatomic API by
> extending the toolchain's atomic builtins, not the LKMM itself.  The
> reason we've reached out to the Linux kernel developers is that the
> original Userspace RCU uatomic API is based on the LKMM.

But why do you want to change the compilers to better support urcu?
That seems like going about things backward; wouldn't it make more
sense to change urcu to better match the facilities offered by the
current compilers?  What if everybody started to do this: modifying the
compilers to better support their pet projects?  The end result would
be chaos!

> > 1. I can see why you have special fences for before/after load,
> >    store, and rmw operations.  But why clear?  In what way is
> >    clearing an atomic variable different from storing a 0 in it?
>
> We could indeed group the clear with the store.
>
> We had two approaches in mind:
>
>   a) A before/after pair by category of operation:
>
>      - load
>      - store
>      - RMW
>
>   b) A before/after pair for every operation:
>
>      - load
>      - store
>      - exchange
>      - compare_exchange
>      - {add,sub,and,xor,or,nand}_fetch
>      - fetch_{add,sub,and,xor,or,nand}
>      - test_and_set
>      - clear
>
> If we go for the grouping in a), we have to take into account that the
> barriers emitted need to cover the worst-case scenario.  As an example,
> Clang can emit a store for an exchange with SEQ_CST on x86-64 if the
> returned value is not used.
>
> Therefore, for the grouping in a), all RMW operations would need to
> emit a memory barrier (with Clang on x86-64).  But with the scheme in
> b), we can emit the barrier explicitly for the exchange operation.  We
> do, however, question the usefulness of this kind of optimization made
> by the compiler, since a user should use a store operation instead.

So in the end you settled on a compromise?

> > 2. You don't have a special fence for use after initializing an
> >    atomic.  This operation can be treated specially, because at the
> >    point where an atomic is initialized, it generally has not yet
> >    been made visible to any other threads.
>
> I assume that you're referring to something like std::atomic_init from
> C++11, deprecated in C++20?  I do not see any scenario on any
> architecture where a compiler would emit an atomic operation for the
> initialization of an atomic variable.
> If a memory barrier is required in this situation, then an explicit
> one can be emitted using the existing API.
>
> In our case -- with the compiler's atomic builtins -- the
> initialization of a variable can be done without any atomic operations
> and does not require any memory barrier.  This is a consequence of
> being able to work with an integral-scalar or pointer type without an
> atomic qualifier.
>
> > Therefore the fence which would normally appear after a store (or
> > clear) generally need not appear after an initialization, and you
> > might want to add a special API to force the generation of such a
> > fence.
>
> I am puzzled by this.  Initialization of a shared variable does not
> need to be atomic until its publication.  Could you expand on this?

In the kernel, I believe it sometimes happens that an atomic variable
may be published before it is initialized.  (If that's wrong, Paul or
Peter can correct me.)  But since this doesn't apply to the situations
you're concerned with, you can forget I mentioned it.

Alan