From: Uros Bizjak
Date: Wed, 18 Oct 2023 23:09:10 +0200
Subject: Re: [PATCH v2 -tip] x86/percpu: Use C for arch_raw_cpu_ptr()
To: Linus Torvalds
Cc: Nadav Amit, "the arch/x86 maintainers", Linux Kernel Mailing List,
    Andy Lutomirski, Brian Gerst, Denys Vlasenko, "H. Peter Anvin",
    Peter Zijlstra, Thomas Gleixner, Josh Poimboeuf, Nick Desaulniers

On Wed, Oct 18, 2023 at 10:51 PM Uros Bizjak wrote:
>
> On Wed, Oct 18, 2023 at 10:34 PM Linus Torvalds wrote:
> >
> > On Wed, 18 Oct 2023 at 13:22, Linus Torvalds wrote:
> > >
> > > And yes, sometimes we use actual volatile accesses for them
> > > (READ_ONCE() and WRITE_ONCE()) but those are *horrendous* in general,
> > > and are much too strict. Not only does gcc generally lose its mind
> > > when it sees volatile (ie it stops doing various sane combinations
> > > that would actually be perfectly valid), but it obviously also stops
> > > doing CSE on the loads (as it has to).
> >
> > Note, in case you wonder what I mean by "lose its mind", try this
> > (extremely stupid) test program:
> >
> >     void a(volatile int *i) { ++*i; }
> >     void b(int *i) { ++*i; }
> >
> > and note that the non-volatile version does
> >
> >     addl $1, (%rdi)
> >
> > but the volatile version then refuses to combine the read+write into a
> > rmw instruction, and generates
> >
> >     movl (%rdi), %eax
> >     addl $1, %eax
> >     movl %eax, (%rdi)
> >
> > instead.
> >
> > Sure, it's correct, but it's an example of how 'volatile' ends up
> > disabling a lot of other optimizations than just the "don't remove the
> > access".
> >
> > Doing the volatile as one rmw instruction would still have been very
> > obviously valid - it's still doing a read and a write. You don't need
> > two instructions for that.
>
> FYI: This is the reason RMW instructions in percpu.h are not (blindly)
> converted to C ops. They will remain in their (volatile or not) asm
> form because of the above reason, and due to the fact that they don't
> combine with anything.
>
> > I'm not complaining, and I understand *why* it happens - compiler
> > writers very understandably go "oh, I'm not touching that".
> >
> > I'm just trying to point out that volatile really screws up code
> > generation even aside from the "access _exactly_ once" issue.
> >
> > So using inline asm and relying on gcc doing (minimal) CSE will then
> > generate better code than volatile ever could, even when we just use a
> > simple 'mov" instruction. At least you get that basic combining
> > effect, even if it's not great.
>
> Actually, RMW insns are better written in asm, while simple "mov"
> should be converted to (volatile or not) memory access. On x86 "mov"s
> from memory (reads) will combine nicely with almost all other
> instructions.

BTW: There was a discussion about whether GCC should also construct RMW
instructions when the memory location is marked volatile, but no
resolution was reached. So, the "I'm not touching that" approach
remains. However, GCC *will* combine a volatile read with a follow-up
instruction.

Uros.
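
For anyone who wants to look at the code generation themselves, here is a minimal sketch of the three cases discussed above. The function names are hypothetical (the first two mirror a() and b() from the quoted test program), and the comments only restate what the thread describes for GCC on x86-64; the actual output depends on compiler version and flags, so inspect it with something like "gcc -O2 -S" rather than taking the comments as guaranteed.

```c
/* Hypothetical names; sketch of the cases discussed in this thread. */

/* Plain increment: the compiler is free to fuse the load, add and
 * store into a single read-modify-write instruction. */
void inc_plain(int *p)
{
	++*p;
}

/* Volatile increment: same result, but per the thread GCC keeps
 * the load, the add and the store as separate instructions. */
void inc_volatile(volatile int *p)
{
	++*p;
}

/* Volatile read feeding an add: per the thread, GCC can still fold
 * the load into the following instruction as a memory operand. */
int add_volatile_read(volatile int *p, int x)
{
	return x + *p;
}
```

All three functions compute the same values as their non-volatile counterparts; only the instruction selection is expected to differ.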