From: Linus Torvalds
Date: Fri, 26 Aug 2022 09:45:47 -0700
Subject: Re: [PATCH v3] wait_on_bit: add an acquire memory barrier
To: Mikulas Patocka
Cc: Alan Stern, Andrea Parri, Will Deacon, Peter Zijlstra, Boqun Feng,
 Nicholas Piggin, David Howells, Jade Alglave, Luc Maranget,
 "Paul E. McKenney", Akira Yokosawa, Daniel Lustig, Joel Fernandes,
 linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org

On Fri, Aug 26, 2022 at 6:17 AM Mikulas Patocka wrote:
>
> I wouldn't do this for regular test_bit because if you read memory with
> different size/alignment from what you wrote, various CPUs suffer from
> store->load forwarding penalties.

All of the half-way modern CPUs do ok with store->load forwarding as
long as the load is fully contained in the store, so I suspect the
'testb' model is pretty much universally better than loading a word
into a register and testing it there.

So narrowing the load is fine (but you generally never want to narrow
a *store*, because that results in huge problems with subsequent wider
loads).

But it's not a huge deal, and this way if somebody actually runs the
numbers and does any comparisons, we have both versions, and if the
'testb' is better, we can just rename the x86
constant_test_bit_acquire() to just constant_test_bit() and use it for
both cases.

> But for test_bit_acquire this optimization is likely harmless because the
> bit will not be tested a few instructions after writing it.

Note that if we really do that, then we've already lost because of
the volatile access, i.e. if we cared about a "write bit, test bit"
pattern, we should use other operations.
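To make the byte-versus-word comparison above concrete, here is a minimal userspace sketch. It is not the kernel's implementation: test_bit_word() and test_bit_byte() are hypothetical names made up for illustration, and the byte indexing assumes a little-endian layout, as on x86. The byte variant reads only the one byte that holds the bit, so the load is fully contained in any earlier word-sized store to the same location.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Word-wide variant (hypothetical name): load the whole unsigned long
 * that contains bit 'nr', then shift and mask in a register.
 */
static bool test_bit_word(size_t nr, const volatile unsigned long *addr)
{
	return (addr[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL;
}

/*
 * Byte-wide variant (hypothetical name): load only the byte that
 * contains bit 'nr' and test it against a small constant mask.
 * ASSUMPTION: little-endian byte order, so bit 'nr' of the bitmap
 * lives in byte nr / 8 at bit position nr % 8.
 */
static bool test_bit_byte(size_t nr, const volatile unsigned long *addr)
{
	const volatile unsigned char *p = (const volatile unsigned char *)addr;

	return (p[nr / CHAR_BIT] & (1U << (nr % CHAR_BIT))) != 0;
}
```

On a little-endian machine the two functions return the same answer for every bit. Whether a compiler turns the byte variant into a single 'testb' with a memory operand depends on the compiler and flags, so treat any performance conclusion as something to measure rather than assume.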

Now, the new "const_test_bit()" logic (commits bb7379bfa680 "bitops:
define const_*() versions of the non-atomics" and 0e862838f290
"bitops: unify non-atomic bitops prototypes across architectures")
means that as long as you are setting and testing a bit in a local
variable, it gets elided entirely. But in general use you're going to
see that load from memory, and then the wider load is likely worse
(because bigger constants, and because it requires a register).

So maybe there is room to tweak it further, but this version of the
patch looks good to me, and I've applied it.

Thanks,
   Linus