From: Uros Bizjak
Date: Wed, 18 Oct 2023 22:51:59 +0200
Subject: Re: [PATCH v2 -tip] x86/percpu: Use C for arch_raw_cpu_ptr()
To: Linus Torvalds
Cc: Nadav Amit, "the arch/x86 maintainers", Linux Kernel Mailing List,
    Andy Lutomirski, Brian Gerst, Denys Vlasenko, "H . Peter Anvin",
    Peter Zijlstra, Thomas Gleixner, Josh Poimboeuf, Nick Desaulniers
References: <20231010164234.140750-1-ubizjak@gmail.com>
    <0617BB2F-D08F-410F-A6EE-4135BB03863C@vmware.com>
    <7D77A452-E61E-4B8B-B49C-949E1C8E257C@vmware.com>
    <9F926586-20D9-4979-AB7A-71124BBAABD3@vmware.com>
    <3F9D776E-AD7E-4814-9E3C-508550AD9287@vmware.com>
    <28B9471C-4FB0-4AB0-81DD-4885C3645E95@vmware.com>
List-ID: linux-kernel@vger.kernel.org

On Wed, Oct 18, 2023 at 10:34 PM Linus Torvalds wrote:
>
> On Wed, 18 Oct 2023 at 13:22, Linus Torvalds wrote:
> >
> > And yes, sometimes we use actual volatile accesses for them
> > (READ_ONCE() and WRITE_ONCE()) but those are *horrendous* in general,
> > and are much too strict. Not only does gcc generally lose its mind
> > when it sees volatile (ie it stops doing various sane combinations
> > that would actually be perfectly valid), but it obviously also stops
> > doing CSE on the loads (as it has to).
>
> Note, in case you wonder what I mean by "lose its mind", try this
> (extremely stupid) test program:
>
>     void a(volatile int *i) { ++*i; }
>     void b(int *i) { ++*i; }
>
> and note that the non-volatile version does
>
>     addl $1, (%rdi)
>
> but the volatile version then refuses to combine the read+write into a
> rmw instruction, and generates
>
>     movl (%rdi), %eax
>     addl $1, %eax
>     movl %eax, (%rdi)
>
> instead.
>
> Sure, it's correct, but it's an example of how 'volatile' ends up
> disabling a lot of other optimizations than just the "don't remove the
> access".
>
> Doing the volatile as one rmw instruction would still have been very
> obviously valid - it's still doing a read and a write. You don't need
> two instructions for that.

FYI: This is the reason RMW instructions in percpu.h are not (blindly)
converted to C ops. They will remain in their (volatile or not) asm
form because of the above reason, and due to the fact that they don't
combine with anything.

> I'm not complaining, and I understand *why* it happens - compiler
> writers very understandably go "oh, I'm not touching that".
>
> I'm just trying to point out that volatile really screws up code
> generation even aside from the "access _exactly_ once" issue.
>
> So using inline asm and relying on gcc doing (minimal) CSE will then
> generate better code than volatile ever could, even when we just use a
> simple 'mov" instruction. At least you get that basic combining
> effect, even if it's not great.

Actually, RMW insns are better written in asm, while simple "mov"
should be converted to (volatile or not) memory access. On x86 "mov"s
from memory (reads) will combine nicely with almost all other
instructions.

> And for memory ops, *not* using volatile is dangerous when they aren't stable.

True.

Uros.
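
For illustration only, here is a minimal user-space sketch of the split
described above. It is not the percpu.h code: the helper names
(inc_plain, inc_volatile, inc_asm, add_loaded) are hypothetical, and the
x86 asm path assumes a GCC-compatible compiler. The increment is kept as
a single RMW instruction in inline asm, while the plain load is left as
an ordinary C memory access that the compiler may fold into the
instruction that uses it.

```c
/* Plain C increment: the compiler is free to emit one RMW
 * instruction, e.g. "addl $1, (%rdi)" on x86-64. */
static inline void inc_plain(int *p)
{
	++*p;
}

/* Volatile increment: same result, but gcc typically emits a
 * separate load, add and store instead of one RMW instruction. */
static inline void inc_volatile(volatile int *p)
{
	++*p;
}

/* Hypothetical helper: the RMW written as inline asm, so it stays a
 * single instruction. Non-x86 targets fall back to plain C here. */
static inline void inc_asm(int *p)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ __volatile__("addl $1, %0" : "+m" (*p) : : "cc");
#else
	++*p;
#endif
}

/* A simple read done as a C memory access: on x86 the compiler can
 * use *p directly as the memory operand of the add. */
static inline int add_loaded(const int *p, int x)
{
	return x + *p;
}
```

Building this with "gcc -O2 -S" and comparing the bodies of inc_plain()
and inc_volatile() should show the difference Linus describes; the exact
output depends on compiler version and flags.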