References: <20230111123736.20025-1-kirill.shutemov@linux.intel.com> <20230111123736.20025-9-kirill.shutemov@linux.intel.com> <20230117135703.voaumisreld7crfb@box>
From: Linus Torvalds
Date: Tue, 17 Jan 2023 12:10:03 -0800
Subject: Re: [PATCHv14 08/17] x86/mm: Reduce untagged_addr() overhead until the first LAM user
To: Nick Desaulniers
Cc: Peter Zijlstra, "Kirill A. Shutemov", Dave Hansen, Andy Lutomirski, x86@kernel.org, Kostya Serebryany, Andrey Ryabinin, Andrey Konovalov, Alexander Potapenko, Taras Madan, Dmitry Vyukov, "H . J . Lu", Andi Kleen, Rick Edgecombe, Bharata B Rao, Jacob Pan, Ashok Raj, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sami Tolvanen, joao@overdrivepizza.com
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jan 17, 2023 at 11:17 AM Nick Desaulniers wrote:
>
> Perhaps that was compiler version or config specific?

Possible, but...

The clang code generation annoyed me enough that I actually ended up rewriting the unlikely test to be outside the loop in commit ae2a823643d7 ("dcache: move the DCACHE_OP_COMPARE case out of the __d_lookup_rcu loop").

I think that then made clang no longer have the whole "rotate loop with unlikely case in the middle" issue.

And then because clang *still* messed up by trying to be too clever (see

    https://lore.kernel.org/all/CAHk-=wjyOB66pofW0mfzDN7SO8zS1EMRZuR-_2aHeO+7kuSrAg@mail.gmail.com/

for details), I also ended up doing commit c4e34dd99f2e ("x86: simplify load_unaligned_zeropad() implementation").

The end result is that now the compiler almost *cannot* mess up any more.

So the reason clang now does a good job on __d_lookup_rcu() is largely that I took away all the places where it did badly ;)

That said, clang still generates more register pressure than gcc, causing the function prologue and epilogue to be rather bigger (pushing and popping six registers, as opposed to gcc that only needs three).

Gcc is also better able to schedule the prologue and epilogue together with the work of the function, whereas clang seems to always do it as a "push all" and "pop all" sequence.

That scheduling doesn't matter in that particular place (although it does make the unlikely case of calling __d_lookup_rcu_op_compare pointlessly push all regs only to then pop them), but I've seen a few other cases where it ends up meaning that it always does that full function prologue even when the *likely* case then returns early and doesn't actually need any of that work because it didn't use any of those registers.

But yeah, the RCU pathname lookup looks fine these days. And I don't actually think it was due to clang changes ;)

              Linus
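
To illustrate the restructuring described above for commit ae2a823643d7, here is a minimal user-space sketch of moving an unlikely test out of a lookup loop. It is not the kernel code: the `entry` type and the functions `lookup()`, `lookup_custom()` and `compare_nocase()` are hypothetical names made up for this example, and `__builtin_expect` stands in for the kernel's `unlikely()` macro. The idea is only that the rare case is tested once up front and handled in a separate function, so the common loop body has no unlikely branch in the middle for the compiler to rearrange.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a hashed directory entry; not the kernel's dentry. */
struct entry {
    const char *name;
    size_t len;
};

/* Optional custom comparison; returns 0 on a match, like memcmp(). */
typedef int (*compare_fn)(const char *a, size_t alen,
                          const char *b, size_t blen);

/* Example custom comparison (hypothetical): ASCII case-insensitive match. */
static int compare_nocase(const char *a, size_t alen,
                          const char *b, size_t blen)
{
    if (alen != blen)
        return 1;
    for (size_t i = 0; i < alen; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 1;
    }
    return 0;
}

/* Slow path: every candidate goes through the custom comparison. */
static const struct entry *lookup_custom(const struct entry *table, size_t count,
                                         const char *name, size_t len,
                                         compare_fn cmp)
{
    for (size_t i = 0; i < count; i++) {
        if (cmp(table[i].name, table[i].len, name, len) == 0)
            return &table[i];
    }
    return NULL;
}

/*
 * Fast path: the rare "custom compare" case is tested once, before the
 * loop, so the loop itself contains only the common byte comparison.
 */
static const struct entry *lookup(const struct entry *table, size_t count,
                                  const char *name, size_t len,
                                  compare_fn cmp)
{
    assert(table != NULL || count == 0);

    if (__builtin_expect(cmp != NULL, 0))
        return lookup_custom(table, count, name, len, cmp);

    for (size_t i = 0; i < count; i++) {
        if (table[i].len == len && memcmp(table[i].name, name, len) == 0)
            return &table[i];
    }
    return NULL;
}
```

Calling `lookup(table, count, "bin", 3, NULL)` takes the plain loop, while passing `compare_nocase` as the last argument diverts the whole lookup to `lookup_custom()` before the loop is ever entered.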