Date: Fri, 13 Oct 2023 14:02:26 -0700
References: <20231011204150.51166-1-ubizjak@gmail.com>
Subject: Re: [PATCH tip] x86/percpu: Rewrite arch_raw_cpu_ptr()
From: Sean Christopherson
To: Uros Bizjak
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, Linus Torvalds,
    Nadav Amit, Ingo Molnar, Andy Lutomirski, Brian Gerst, Denys Vlasenko,
    "H. Peter Anvin", Peter Zijlstra, Thomas Gleixner, Josh Poimboeuf
List-ID: linux-kernel@vger.kernel.org

On Fri, Oct 13, 2023, Uros Bizjak wrote:
> On Fri, Oct 13, 2023 at 6:04 PM Sean Christopherson wrote:
> >
> > On Wed, Oct 11, 2023, Uros Bizjak wrote:
> > > Additionally, the patch introduces 'rdgsbase' alternative for CPUs with
> > > X86_FEATURE_FSGSBASE. The rdgsbase instruction *probably* will end up
> > > only decoding in the first decoder etc. But we're talking single-cycle
> > > kind of effects, and the rdgsbase case should be much better from
> > > a cache perspective and might use fewer memory pipeline resources to
> > > offset the fact that it uses an unusual front end decoder resource...
> >
> > The switch to RDGSBASE should be a separate patch, and should come with actual
> > performance numbers.
>
> This *is* the patch to switch to RDGSBASE. The propagation of
> arguments is a nice side-effect of the patch, due to the explicit
> addition of the offset addend to the %gs base. This patch is
> an alternative implementation of [1]
>
> [1] x86/percpu: Use C for arch_raw_cpu_ptr(),
> https://lore.kernel.org/lkml/20231010164234.140750-1-ubizjak@gmail.com/

Me confused, can't you first switch to MOV with tcp_ptr__ += (unsigned long)(ptr),
and then introduce the RDGSBASE alternative?

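To make the first step of that split concrete, here is a minimal userspace
sketch of "load the base, then add the offset in C". It is only a model, not
the kernel code: every sim_* name is hypothetical, and an ordinary array lookup
stands in for the %gs-relative MOV. Only tcp_ptr__ and the += on it come from
the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_NR_CPUS    4
#define SIM_AREA_LONGS 8

/* Hypothetical stand-in for the per-CPU areas the kernel sets up at boot. */
static long sim_percpu_area[SIM_NR_CPUS][SIM_AREA_LONGS];

/*
 * Hypothetical stand-in for reading this CPU's per-CPU base. In the kernel
 * this would be a single %gs-relative MOV; here it is a plain array lookup.
 */
static uintptr_t sim_read_percpu_base(int cpu)
{
	assert(cpu >= 0 && cpu < SIM_NR_CPUS);
	return (uintptr_t)&sim_percpu_area[cpu][0];
}

/*
 * Fetch the base first, then add the offset as ordinary C arithmetic, so
 * the addition is visible to the compiler instead of hidden inside asm.
 */
static void *sim_raw_cpu_ptr(int cpu, size_t byte_offset)
{
	uintptr_t tcp_ptr__ = sim_read_percpu_base(cpu);

	tcp_ptr__ += (unsigned long)byte_offset;
	return (void *)tcp_ptr__;
}
```

With this shape, how the base is obtained (MOV now, possibly RDGSBASE later)
is confined to sim_read_percpu_base(), which is why the two changes can in
principle land as separate patches.
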

> Unfortunately, I have no idea on how to measure the impact of such a
> low-level feature, so I'll at least need some guidance. The "gut
> feeling" says that special instruction, intended to support the
> feature, is always better than emulating said feature with a memory
> access.

AIUI, {RD,WR}{FS,GS}BASE were added as faster alternatives to {RD,WR}MSR, not to
accelerate actual accesses to per-CPU data, TLS, etc. E.g. loading a 64-bit base
via a MOV to FS/GS is impossible. And presumably saving a userspace-controlled
base by actually accessing FS/GS is dangerous for one reason or another.

The instructions are guarded by a CR4 bit; the ucode cost just to check
CR4.FSGSBASE is probably non-trivial.
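
For reference, the gating mentioned above can be modeled as a pure function of
two register values. This is a sketch under stated assumptions: the sim_* and
SIM_* names are hypothetical, and the inputs are passed in by the caller rather
than read from hardware. The bit positions are the architectural ones: CPUID
leaf 7, subleaf 0, EBX bit 0 enumerates FSGSBASE, and CR4 bit 16 enables the
instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EBX bit 0 enumerates FSGSBASE support. */
#define SIM_CPUID7_EBX_FSGSBASE (UINT32_C(1) << 0)

/* CR4 bit 16 (CR4.FSGSBASE) enables {RD,WR}{FS,GS}BASE. */
#define SIM_CR4_FSGSBASE (UINT64_C(1) << 16)

static_assert(SIM_CR4_FSGSBASE == 0x10000, "CR4.FSGSBASE is bit 16");

/*
 * Hypothetical helper: RDGSBASE is usable only when the CPU enumerates the
 * feature and the CR4 enable bit is set. Both values are supplied by the
 * caller; nothing is read from the actual CPU here.
 */
static bool sim_rdgsbase_usable(uint32_t cpuid7_ebx, uint64_t cr4)
{
	return (cpuid7_ebx & SIM_CPUID7_EBX_FSGSBASE) &&
	       (cr4 & SIM_CR4_FSGSBASE);
}
```

This only models the condition; it says nothing about what checking the CR4
bit costs in microcode on each execution, which is the open question here.
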