Received: by 2002:a05:7412:d8a:b0:e2:908c:2ebd with SMTP id b10csp4057213rdg; Wed, 18 Oct 2023 13:43:16 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFUwD5jr7+UlZkhyodAHUzw4/G76wxkd6udGEBmvpKLCEsxj6Dukzw8ZVZQc6mLre3GtuYx X-Received: by 2002:a05:6808:10cf:b0:3a4:225d:8135 with SMTP id s15-20020a05680810cf00b003a4225d8135mr377621ois.31.1697661795872; Wed, 18 Oct 2023 13:43:15 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1697661795; cv=none; d=google.com; s=arc-20160816; b=ZXTHSNlLGgWWOw4UrPtGrRH8SMEeJTMCiBTCh58CrievdzNUfhlZKRdy46rxjWyIt4 J5TDZ9q18n1k2dIQwgYon0FIwzLsj8ChMmsVRjHqMPRVsgr3fTzeT0JQLsIpX6CiEGaE 5mmOm31q5Zy/ufVyOzi1Gvp7gYJlrReFRKGvqwqdBlPmNcNAloc9EQNoIRnPUFSckYNY KFncu06ZVNQYL325LtlMF3j7Ld21hWU8l3b+fGFlrYV/xNHmAKXrD/xGg8MWy6ZjVP6O Om0AvS4bhfFWR00i+prFtfJ3Y51F3aFd8kOKlwCYrZ+NX7SOwLXJlOHYqfSmnkOaM1qg R8Pw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=kGOrB8Us1nq+13prI6dQEJa6mUAXJWRLbgqTNy7bc8g=; fh=/Dnf++/2xG0Xi30TGscvcX3UaoYjzLSxPPap0bUcRsE=; b=T9jLY1kbcHHVfdzIWBkXVplZE0heV2UpWWjGl/jX8jSy277hTN8Q9oLxtMeIMr7lvk ZPVuDd6+a98mFXLKWxrurfO4KyQ7t1HodLQJrV2/nnW0lo99/Mv7hEmjQqHTDQV7Hfvh 0ZSb135/ARgO/KQOuhspU2DbxozJtqTkrIE9SlXgDPMK/To983o8f7IgUYJcXvXlQmAs RjlQqUAyAVsFe+NyWHTKE+ChAye+PX775bsUgpKvUP05symz+9ok8twEbQKFxGHE2DaG 2AZSiQiH/L9gZ/nMhr/g5uY61+MFKtMeEFWx0qAcRAHyqJtsKKo0WJpOe+ZyICTGd1Fa Mp9w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=jNyP4EZD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:5 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from groat.vger.email (groat.vger.email. [2620:137:e000::3:5]) by mx.google.com with ESMTPS id t64-20020a638143000000b005b7dd1d190asi2802012pgd.263.2023.10.18.13.43.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 18 Oct 2023 13:43:15 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:5 as permitted sender) client-ip=2620:137:e000::3:5; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=jNyP4EZD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:5 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by groat.vger.email (Postfix) with ESMTP id B06A480DF980; Wed, 18 Oct 2023 13:43:11 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at groat.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231199AbjJRUm7 (ORCPT + 99 others); Wed, 18 Oct 2023 16:42:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59776 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229552AbjJRUm5 (ORCPT ); Wed, 18 Oct 2023 16:42:57 -0400 Received: from mail-lf1-x12f.google.com (mail-lf1-x12f.google.com [IPv6:2a00:1450:4864:20::12f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 73665F7 for ; Wed, 18 Oct 2023 13:42:55 -0700 (PDT) Received: by mail-lf1-x12f.google.com with SMTP id 2adb3069b0e04-507a3b8b113so6276858e87.0 for ; Wed, 18 Oct 2023 13:42:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697661773; x=1698266573; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=kGOrB8Us1nq+13prI6dQEJa6mUAXJWRLbgqTNy7bc8g=; b=jNyP4EZDXprW43nlh/UR+UJRJozBSVrmIdeF9G7nRddzIG91Lz/LsL7mrwmbBEP22C NMuOF8cdphQGtJdORzPQW8Nkh5BRDfCw2wIpsdiMzZzwI/t1MTzg1JazahJmPgu4bUgz 2iIuQ04FvRmC/hVtn8pG8GsEBUtvYDJiZVkViDOa3FjY51nJL9C+eHc+LUpo84iwGEVs bhC8ccggmrtLdRnl4+QB6t1gdmRAjVv7H5zVLZldduZI4hCenN9BYZXQinMvG73e4gcj DQTAihLT5vk82Sb9KPOp8nLyMmscFhrHbZcmTzJ5XRvR63CX5cPv9wMg4hlRq4GEPnvQ wiOQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697661773; x=1698266573; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=kGOrB8Us1nq+13prI6dQEJa6mUAXJWRLbgqTNy7bc8g=; b=u4MABHQF2Y1/VTisOB3x5uG2ncEBGbdLJIA9kxF+JxdLMbNXF3EnbdTfc/KBD0FhCu 7hNrMN3wNmnIR0L3iIipbo/RU7uw5GyuAP9XPhuptGCgdvQwTD/H5CLvMUyH/tzqxcBa +fmAeLMwmoseuXOccptUZ8Y60rFw0BOkNwqheOKbT3QR1565NZJJXMLwnYECA7sUinbK NbqWfWQSGW7W1hjB5RyijBIYhRO558N0Fpb9nq9fWpkWay1fgkl3oF4imSKvJTAZiFKD 7HzGeHuQEANto7aIDIsBaTdZIx59qWGhEgy9jY/PewDti2T4Uj6DlXOFZ6GaCDjLGBqU 6osQ== X-Gm-Message-State: AOJu0Yx8fUVKNWjpbUnuV0Wyg/FEkk8qnTVvRWP23M886wrNAhnS8dp0 7dwZ8OferSW6ZonmgZaNHLFDSJE3p+iQzulEiPM= X-Received: by 2002:ac2:53a4:0:b0:507:a908:4a93 with SMTP id j4-20020ac253a4000000b00507a9084a93mr70822lfh.61.1697661773227; Wed, 18 Oct 2023 13:42:53 -0700 (PDT) MIME-Version: 1.0 References: <20231010164234.140750-1-ubizjak@gmail.com> <0617BB2F-D08F-410F-A6EE-4135BB03863C@vmware.com> <7D77A452-E61E-4B8B-B49C-949E1C8E257C@vmware.com> <9F926586-20D9-4979-AB7A-71124BBAABD3@vmware.com> <3F9D776E-AD7E-4814-9E3C-508550AD9287@vmware.com> <28B9471C-4FB0-4AB0-81DD-4885C3645E95@vmware.com> In-Reply-To: From: Uros Bizjak Date: Wed, 18 Oct 2023 22:42:40 +0200 Message-ID: Subject: Re: [PATCH v2 -tip] x86/percpu: Use C for arch_raw_cpu_ptr() To: Linus Torvalds Cc: Nadav Amit , "the arch/x86 maintainers" , Linux Kernel Mailing List , Andy Lutomirski , Brian Gerst , Denys Vlasenko , "H . Peter Anvin" , Peter Zijlstra , Thomas Gleixner , Josh Poimboeuf , Nick Desaulniers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-0.6 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on groat.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (groat.vger.email [0.0.0.0]); Wed, 18 Oct 2023 13:43:11 -0700 (PDT) On Wed, Oct 18, 2023 at 10:22=E2=80=AFPM Linus Torvalds wrote: > > On Wed, 18 Oct 2023 at 12:33, Uros Bizjak wrote: > > > > This pach works for me: > > Looks fine. > > But you actually bring up another issue: > > > BTW: I also don't understand the comment from include/linux/smp.h: > > > > /* > > * Allow the architecture to differentiate between a stable and unstabl= e read. > > * For example, x86 uses an IRQ-safe asm-volatile read for the unstable= but a > > * regular asm read for the stable. > > I think the comment is badly worded, but I think the issue may actually b= e real. > > One word: rematerialization. > > The thing is, turning inline asm accesses to regular compiler loads > has a *very* bad semantic problem: the compiler may now feel like it > can not only combine the loads (ok), but also possibly rematerialize > values by re-doing the loads (NOT OK!). > > IOW, the kernel often has very strict requirements of "at most once" > behavior, because doing two loads might give different results. > > The cpu number is a good example of this. > > And yes, sometimes we use actual volatile accesses for them > (READ_ONCE() and WRITE_ONCE()) but those are *horrendous* in general, > and are much too strict. Not only does gcc generally lose its mind > when it sees volatile (ie it stops doing various sane combinations > that would actually be perfectly valid), but it obviously also stops > doing CSE on the loads (as it has to). > > So the "non-volatile asm" has been a great way to get the "at most > one" behavior: it's safe wrt interrupts changing the value, because > you will see *one* value, not two. As far as we know, gcc never > rematerializes the output of an inline asm. So when you use an inline > asm, you may have the result CSE'd, but you'll never see it generate > more than *one* copy of the inline asm. > > (Of course, as with so much about inline asm, that "knowledge" is not > necessarily explicitly spelled out anywhere, and it's just "that's how > it has always worked"). > > IOW, look at code like the one in swiotlb_pool_find_slots(), which does t= his: > > int start =3D raw_smp_processor_id() & (pool->nareas - 1); > > and the use of 'start' really is meant to be just a good heuristic, in > that different concurrent CPU's will start looking in different pools. > So that code is basically "cpu-local by default", but it's purely > about locality, it's not some kind of correctness issue, and it's not > necessarily run when the code is *tied* to a particular CPU. > > But what *is* important is that 'start' have *one* value, and one > value only. So look at that loop, which hasically does > > do { > .. use the 'i' based on 'start' .. > if (++i >=3D pool->nareas) > i =3D 0; > } while (i !=3D start); > > and it is very important indeed that the compiler does *not* think > "Oh, I can rematerialize the 'start' value". > > See what I'm saying? Using 'volatile' for loading the current CPU > value would be bad for performance for no good reason. But loading it > multiple times would be a *bug*. > > Using inline asm is basically perfect here: the compiler can *combine* > two inline asms into one, but once we have a value for 'start', it > won't change, because the compiler is not going to decide "I can drop > this value, and just re-do the inline asm to rematerialize it". > > This all makes me worried about the __seg_fs thing. Please note that there is a difference between this_* and raw_* accessors. this_* has "volatile" qualification, and for sure it won't ever be rematerialized (this would defeat the purpose of "volatile"). Previously, this_* was defined as asm-volatile, and now it is defined as a volatile memory access. GCC will disable almost all optimizations when volatile memory is accessed. IIRC, volatile-asm also won't be combined, since the compiler does not know about secondary effects of asm. Side note: The raw_smp_processor_id() uses this_, so it has volatile qualification. Perhaps we can do +#define __smp_processor_id() raw_cpu_read(pcpu_hot.cpu_number) as this seems like the relaxed version of smp_processor_id(). So, guarantees of asm-volatile are the same as guarantees of volatile memory access. Uros. > > For 'current', this is all perfect. Rematerializing current is > actually better than spilling and reloading the value. > > But for something like raw_smp_processor_id(), rematerializing would > be a correctness problem, and a really horrible one (because in > practice, the code would work 99.9999% of the time, and then once in a > blue moon, it would rematerialize a different value). > > See the problem? > > I guess we could use the stdatomics to try to explain these issues to > the compiler, but I don't even know what the C interfaces look like or > whether they are stable and usable across the range of compilers we > use. > > Linus