Received: by 2002:a05:7412:d8a:b0:e2:908c:2ebd with SMTP id b10csp4109500rdg; Wed, 18 Oct 2023 15:40:36 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFJWYUfjf7rxYd53WRTa8B4GZhfFGZib0mZWtwrhklAg5xARiYXTCM20R0Bc6tUflcUXwRg X-Received: by 2002:a17:903:5c7:b0:1c7:65e3:e605 with SMTP id kf7-20020a17090305c700b001c765e3e605mr681491plb.36.1697668836499; Wed, 18 Oct 2023 15:40:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1697668836; cv=none; d=google.com; s=arc-20160816; b=rIOh9JaURO4ap3xXRavr5R/GKjl7v6nEbngmYeVuq7/807eXSdPsUis6fmqdfdodSZ l/KPCT1lDETN2YK8IgvTausBy6Ms7AgVkzRyimhdOOL2UjFf8nRTNAah9WaAHdcHZ82c SEPpbo1dw2r9GRyxJeAN00abI4qcF2UzbLbxypFnXEMDbezAP7Dr76zyX+5KK5/t2U8Z lIIuEEjTaIWFQe3ABCoDxAyk7l6B7jdwE+QxVAnnp5UhLt7puMdTckIUHKFGMBYlTVEn i44vdlJIODj1dqURNMX6rIAndU3JveVz7KnaHIOZ12tQCh5EM7zovh/reF+meeQupMuT dhBw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=DQSN/8AyNe3sAydXX0IQmaasInOgjPE21LKV/uZBmFs=; fh=yYaFxW14JnU5qZhSrZaVX0Gi5mVUwcKo866zJECR4bA=; b=RXyCHBLIaV9DZPIiej1bnoHK7KqjRwvX2pja/xFpBKIBZ9eGEL05LU2T+XeuPGwo1y K04FobIo/vfWcdcU8bnvi+c/Cpts26xJLmBEbT56zRK2nlzHGLNyy9NPWsABoi65Caf0 Om57xcHBb/Q6MAZhkgigHOi/NenbbduV+HCTmLHFOdAlBGN1/sGIlCiWA/tmP56BiFCC 0oNT4K67u4SwxYBYBp8TWlJpQnqf6w7gwrEygl26fYcTuuhAl7zl7ZGbhEtoLCzGfZC8 +YFZ8k189+wroUhrR77I4w/ARqFN9c4I7HyvcVCni1aCGM2MvY/QemCRyh0bqDbrSSF3 njdA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux-foundation.org header.s=google header.b=fjReqgnI; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from snail.vger.email (snail.vger.email. [23.128.96.37]) by mx.google.com with ESMTPS id kh15-20020a170903064f00b001c9b5a90a92si771579plb.344.2023.10.18.15.40.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 18 Oct 2023 15:40:36 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) client-ip=23.128.96.37; Authentication-Results: mx.google.com; dkim=pass header.i=@linux-foundation.org header.s=google header.b=fjReqgnI; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 4DF4F820E502; Wed, 18 Oct 2023 15:40:35 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231199AbjJRWk3 (ORCPT + 99 others); Wed, 18 Oct 2023 18:40:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40274 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229487AbjJRWk2 (ORCPT ); Wed, 18 Oct 2023 18:40:28 -0400 Received: from mail-ed1-x52d.google.com (mail-ed1-x52d.google.com [IPv6:2a00:1450:4864:20::52d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AEEAA119 for ; Wed, 18 Oct 2023 15:40:26 -0700 (PDT) Received: by mail-ed1-x52d.google.com with SMTP id 4fb4d7f45d1cf-522bd411679so12186575a12.0 for ; Wed, 18 Oct 2023 15:40:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux-foundation.org; s=google; t=1697668825; x=1698273625; darn=vger.kernel.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=DQSN/8AyNe3sAydXX0IQmaasInOgjPE21LKV/uZBmFs=; b=fjReqgnIB3MnFYnlXAfnW45VENvYQUpfeIAQUpNYpA6JenM7Vhb7MBHqoID3n+Q/Ev mcHPGu5wXc08eFYyG0cow2qGmn0TpEs1CpoCxI9ru6G/qk8XVEAnWx3GSQX1anPiGjfi M93eLqvZeFTI/fFiDJxF7o6SgobphCNPRxc8M= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697668825; x=1698273625; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=DQSN/8AyNe3sAydXX0IQmaasInOgjPE21LKV/uZBmFs=; b=QRIYPT6a5ClXeicvLoq5u99XUE1thkV+JCgSqyqobTan3NdSH+I0Hhj/QaP8LKZk66 UbQqTxtJQhrWN9AD/gqlyFLVwuVK48ZY6Hb+l4pOor9mNamxJ8w/VumTLUnj5+b5Fo/7 Sr/5pNoZGKGDX3gbtc9kEjeeTkQ1yTrHXh84aCqxLD0nemISkvpoX8M4OMvl0etAkqO6 zjEsC+tkfiNsmSfyvmNHqKd6wxv5SrOkCKBHsiUp3CHPNw1cRh9vAtOtdlmPeXg7r6B9 VWIB5BAw4Mwbu6rw7kYFQHkC+g1wdqSpksyE3C1g2loDMj5v/82vG2xczc53A2Bo0eey 308Q== X-Gm-Message-State: AOJu0YzUUNjX8FJCzQJAGU1ngDRzzTWKfRNjO/aUH8CY6vbdk2CQ61wP W1jSY0Uv8yaqnztYFEu7+FjPUuMrCwqOz1n0OVQYBS99 X-Received: by 2002:a17:907:6096:b0:9b2:ccd8:2d2b with SMTP id ht22-20020a170907609600b009b2ccd82d2bmr399718ejc.77.1697668825048; Wed, 18 Oct 2023 15:40:25 -0700 (PDT) Received: from mail-ej1-f52.google.com (mail-ej1-f52.google.com. [209.85.218.52]) by smtp.gmail.com with ESMTPSA id qk17-20020a170906d9d100b009a198078c53sm2421978ejb.214.2023.10.18.15.40.23 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 18 Oct 2023 15:40:23 -0700 (PDT) Received: by mail-ej1-f52.google.com with SMTP id a640c23a62f3a-99c1c66876aso1193285666b.2 for ; Wed, 18 Oct 2023 15:40:23 -0700 (PDT) X-Received: by 2002:a17:906:dc92:b0:9af:9c4f:b579 with SMTP id cs18-20020a170906dc9200b009af9c4fb579mr437316ejc.18.1697668823207; Wed, 18 Oct 2023 15:40:23 -0700 (PDT) MIME-Version: 1.0 References: <20231010164234.140750-1-ubizjak@gmail.com> <0617BB2F-D08F-410F-A6EE-4135BB03863C@vmware.com> <7D77A452-E61E-4B8B-B49C-949E1C8E257C@vmware.com> <9F926586-20D9-4979-AB7A-71124BBAABD3@vmware.com> <3F9D776E-AD7E-4814-9E3C-508550AD9287@vmware.com> <28B9471C-4FB0-4AB0-81DD-4885C3645E95@vmware.com> In-Reply-To: From: Linus Torvalds Date: Wed, 18 Oct 2023 15:40:05 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH v2 -tip] x86/percpu: Use C for arch_raw_cpu_ptr() To: Uros Bizjak , peterz@infradead.org Cc: Nadav Amit , "the arch/x86 maintainers" , Linux Kernel Mailing List , Andy Lutomirski , Brian Gerst , Denys Vlasenko , "H . Peter Anvin" , Thomas Gleixner , Josh Poimboeuf , Nick Desaulniers Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-1.7 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Wed, 18 Oct 2023 15:40:35 -0700 (PDT) On Wed, 18 Oct 2023 at 14:40, Uros Bizjak wrote: > > The ones in "raw" form are not IRQ safe and these are implemented > without volatile qualifier. You are misreading it. Both *are* irq safe - on x86. The difference between "this_cpu_xyz()" and "raw_cpu_xyz()" is that on *other* architectures, "raw_cpu_xyz():" can be a lot more efficient, because other architectures may need to do extra work to make the "this" version be atomic on a particular CPU. See for example __count_vm_event() vs count_vm_event(). In fact, that particular use isn't even in an interrupt-safe context, that's an example of literally "I'd rather be fast that correct for certain statistics that aren't all that important". They two versions generate the same code on x86, but on other architectures, __count_vm_event() can *much* simpler and faster because it doesn't disable interrupts or do other special things. But on x86, the whole "interrupt safety" is a complete red herring. Both of them generate the exact same instruction. On x86, the "volatile" is actually for a completely different reason: to avoid too much CSE by the compiler. See commit b59167ac7baf ("x86/percpu: Fix this_cpu_read()"). In fact, that commit went overboard, and just added "volatile" to *every* percpu read. So then people complained about *that*, and PeterZ did commit 0b9ccc0a9b14 ("x86/percpu: Differentiate this_cpu_{}() and __this_cpu_{}()"), which basically made that "qual or not" be a macro choice. And in the process, it now got added to all the RMW ops, that didn't actually need it or want it in the first place, since they won't be CSE'd, since they depend on the input. So that commit basically generalized the whole thing entirely pointlessly, and caused your current confusion. End result: we should remove 'volatile' from the RMW ops. It doesn't do anything on x86. All it does is make us have two subtly different versions that we don't care about the difference. End result two: we should make it clear that "this_cpu_read()" vs "raw_cpu_read()" are *NOT* about interrupts. Even on architectures where the RMW ops need to have irq protection (so that they are atomic wrt interrupts also modifying the value), the *READ* operation obviously has no such issue. For the raw_cpu_read() vs this_cpu_read() case, the only issue is whether you can CSE the result. And in 99% of all cases, you can - and want to - CSE it. But as that commit b59167ac7baf shows, sometimes you cannot. Side note: the code that caused that problem is this: __always_inline void __cyc2ns_read(struct cyc2ns_data *data) { int seq, idx; do { seq = this_cpu_read(cyc2ns.seq.seqcount.sequence); ... } while (unlikely(seq != this_cpu_read(cyc2ns.seq.seqcount.sequence))); } where the issue is that the this_cpu_read() of that sequence number needs to be ordered. Honestly, that code is just buggy and bad. We should never have "fixed" it by changing the semantics of this_cpu_read() in the first place. The problem is that it re-implements its own locking model, and as so often happens when people do that, they do it completely wrongly. Look at the *REAL* sequence counter code in . Notice how in raw_read_seqcount_begin() we have unsigned _seq = __read_seqcount_begin(s); smp_rmb(); because it actually does the proper barriers. Notice how the garbage code in __cyc2ns_read() doesn't have them - and how it was buggy as a result. (Also notice how this all predates our "we should use load_acquire() instead of smb_rmb()", but whatever). IOW, all the "volatiles" in the x86 file are LITERAL GARBAGE and should not exist, and are due to a historical mistake. Linus