From: Linus Torvalds
Date: Wed, 18 Oct 2023 13:22:19 -0700
Subject: Re: [PATCH v2 -tip] x86/percpu: Use C for arch_raw_cpu_ptr()
To: Uros Bizjak
Cc: Nadav Amit, "the arch/x86 maintainers", Linux Kernel Mailing List,
    Andy Lutomirski, Brian Gerst, Denys Vlasenko, "H. Peter Anvin",
    Peter Zijlstra, Thomas Gleixner, Josh Poimboeuf, Nick Desaulniers

On Wed, 18 Oct 2023 at 12:33, Uros Bizjak wrote:
>
> This patch works for me:

Looks fine.

But you actually bring up another issue:

> BTW: I also don't understand the comment from include/linux/smp.h:
>
> /*
>  * Allow the architecture to differentiate between a stable and unstable read.
>  * For example, x86 uses an IRQ-safe asm-volatile read for the unstable but a
>  * regular asm read for the stable.

I think the comment is badly worded, but I think the issue may
actually be real.

One word: rematerialization.

The thing is, turning inline asm accesses to regular compiler loads
has a *very* bad semantic problem: the compiler may now feel like it
can not only combine the loads (ok), but also possibly rematerialize
values by re-doing the loads (NOT OK!).

IOW, the kernel often has very strict requirements of "at most once"
behavior, because doing two loads might give different results.

The cpu number is a good example of this.

And yes, sometimes we use actual volatile accesses for them
(READ_ONCE() and WRITE_ONCE()) but those are *horrendous* in general,
and are much too strict. Not only does gcc generally lose its mind
when it sees volatile (ie it stops doing various sane combinations
that would actually be perfectly valid), but it obviously also stops
doing CSE on the loads (as it has to).

So the "non-volatile asm" has been a great way to get the "at most
one" behavior: it's safe wrt interrupts changing the value, because
you will see *one* value, not two. As far as we know, gcc never
rematerializes the output of an inline asm. So when you use an inline
asm, you may have the result CSE'd, but you'll never see it generate
more than *one* copy of the inline asm.

(Of course, as with so much about inline asm, that "knowledge" is not
necessarily explicitly spelled out anywhere, and it's just "that's how
it has always worked").

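To make the three kinds of access above concrete, here is a rough
user-space sketch. It is not the kernel's implementation: the variable
demo_cpu_number and the helpers read_plain(), read_volatile(),
read_asm() and check_reads_agree() are made-up names for illustration,
the asm assumes gcc or clang, and the "never rematerialized" property
in the last comment is the same unwritten assumption described above,
not something the compiler documents.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-CPU "cpu number" (not a kernel symbol). */
static int demo_cpu_number = 3;

/*
 * Plain C load. The compiler may combine several of these, but it may
 * also drop the value and load it again later (rematerialize it).
 */
static inline int read_plain(void)
{
    return demo_cpu_number;
}

/*
 * Volatile load, in the style of READ_ONCE(). Exactly one load per
 * call, and the compiler will not combine two of them.
 */
static inline int read_volatile(void)
{
    return *(const volatile int *)&demo_cpu_number;
}

/*
 * Non-volatile asm. Two identical asms may be combined (CSE). The
 * assumption is that the compiler does not re-run the asm later to
 * regenerate its output.
 */
static inline int read_asm(void)
{
    int val;
#if defined(__x86_64__) || defined(__i386__)
    __asm__("movl %1, %0" : "=r" (val) : "m" (demo_cpu_number));
#else
    /* Fallback for other architectures: pass the value through an empty asm. */
    __asm__("" : "=r" (val) : "0" (demo_cpu_number));
#endif
    return val;
}

/* With nothing changing the variable, all three flavors must agree. */
static inline void check_reads_agree(void)
{
    assert(read_plain() == read_volatile());
    assert(read_volatile() == read_asm());
}
```

The three helpers return the same value here because nothing changes
the variable behind their back. The difference is only in what the
compiler is allowed to do with the load around them.
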
IOW, look at code like the one in swiotlb_pool_find_slots(), which does this:

        int start = raw_smp_processor_id() & (pool->nareas - 1);

and the use of 'start' really is meant to be just a good heuristic, in
that different concurrent CPUs will start looking in different pools.
So that code is basically "cpu-local by default", but it's purely
about locality, it's not some kind of correctness issue, and it's not
necessarily run when the code is *tied* to a particular CPU.

But what *is* important is that 'start' have *one* value, and one
value only. So look at that loop, which basically does

        do {
                .. use the 'i' based on 'start' ..
                if (++i >= pool->nareas)
                        i = 0;
        } while (i != start);

and it is very important indeed that the compiler does *not* think
"Oh, I can rematerialize the 'start' value".

See what I'm saying? Using 'volatile' for loading the current CPU
value would be bad for performance for no good reason. But loading it
multiple times would be a *bug*.

Using inline asm is basically perfect here: the compiler can *combine*
two inline asms into one, but once we have a value for 'start', it
won't change, because the compiler is not going to decide "I can drop
this value, and just re-do the inline asm to rematerialize it".

This all makes me worried about the __seg_fs thing.

For 'current', this is all perfect. Rematerializing current is
actually better than spilling and reloading the value.

But for something like raw_smp_processor_id(), rematerializing would
be a correctness problem, and a really horrible one (because in
practice, the code would work 99.9999% of the time, and then once in a
blue moon, it would rematerialize a different value).

See the problem?

I guess we could use the stdatomics to try to explain these issues to
the compiler, but I don't even know what the C interfaces look like or
whether they are stable and usable across the range of compilers we
use.

             Linus
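
For reference, the stdatomics idea mentioned above might look roughly
like the sketch below, applied to a loop shaped like the swiotlb one.
This is only an illustration under assumptions: fake_cpu_id and
walk_areas() are made-up names, the code is user-space C11 rather than
kernel code, and whether a relaxed atomic load is any cheaper than
'volatile' with real compilers is not something this shows. The point
is only that, as far as the C11 model goes, one atomic load in the
source yields one value, so 'start' cannot quietly turn into a second
load.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the current CPU number. It may change at any time. */
static _Atomic int fake_cpu_id;

/*
 * Visit all 'nareas' areas (a power of two), starting at an index
 * derived from the CPU number. Returns how many areas were visited.
 */
static unsigned int walk_areas(unsigned int nareas)
{
    unsigned int start, i, visited = 0;

    assert(nareas != 0 && (nareas & (nareas - 1)) == 0);

    /* One relaxed atomic load: 'start' keeps this one value from here on. */
    start = (unsigned int)atomic_load_explicit(&fake_cpu_id,
                                               memory_order_relaxed)
            & (nareas - 1);
    i = start;

    do {
        visited++;
        /* Pretend we got migrated: the "cpu number" changes mid-walk. */
        atomic_store_explicit(&fake_cpu_id, (int)(i + 1),
                              memory_order_relaxed);
        if (++i >= nareas)
            i = 0;
    } while (i != start);

    return visited;
}
```

Even though the fake CPU number changes on every iteration, the loop
terminates after visiting each area exactly once, because 'start' was
read a single time. If 'start' were re-read from the changing value,
the exit test could keep missing and the walk would not stop where it
should.
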