From: Linus Torvalds
Date: Tue, 17 Jan 2023 10:33:41 -0800
Subject: Re: [PATCHv14 08/17] x86/mm: Reduce untagged_addr() overhead until the first LAM user
To: Nick Desaulniers
Cc: Peter Zijlstra, "Kirill A. Shutemov", Dave Hansen, Andy Lutomirski,
    x86@kernel.org, Kostya Serebryany, Andrey Ryabinin, Andrey Konovalov,
    Alexander Potapenko, Taras Madan, Dmitry Vyukov, "H . J . Lu",
    Andi Kleen, Rick Edgecombe, Bharata B Rao, Jacob Pan, Ashok Raj,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sami Tolvanen,
    joao@overdrivepizza.com
References: <20230111123736.20025-1-kirill.shutemov@linux.intel.com>
    <20230111123736.20025-9-kirill.shutemov@linux.intel.com>
    <20230117135703.voaumisreld7crfb@box>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jan 17, 2023 at 10:26 AM Nick Desaulniers wrote:
>
> On Tue, Jan 17, 2023 at 9:29 AM Linus Torvalds wrote:
> >
> > Side note: that's not something new or unusual. It's been the case
> > since I started testing clang - we have several code-paths where we
> > use "unlikely()" to try to get very unlikely cases to be out-of-line,
> > and clang just mostly ignores it, or treats it as a very weak hint. I
> > think the only way to get clang to treat it as a *strong* hint is to
> > use PGO.
>
> I'd be surprised if that were intentional or by design.
>
> Do you guys have a bug report we could look at?

Heh. I actually sent you an example long ago. Let me go fish it out of
my mail archives and quote some of it below so that you can find it in
yours..

                Linus

[ Time passes. Found this email to you and Bill Wendling from Feb 16,
  2020, Message-ID
  CAHk-=wigVshsByCMjkUiZyQSR5N5zi2aAeQc+VJCzQV=nm8E7g@mail.gmail.com ]:

  Anyway, I'm looking at clang code generation, and comparing it with
  gcc on one of my "this has been optimized to hell and back"
  functions: __d_lookup_rcu().

  It looks like clang does frame pointers, and ignores our
  likely/unlikely annotations.

  So this code:

                if (unlikely(parent->d_flags & DCACHE_OP_COMPARE)) {
                        int tlen;
                        const char *tname;
                        ......

  doesn't actually jump out of line, but instead generates the unlikely
  case as the fallthrough:

        testb   $2, (%r12)
        je      .LBB50_9
        ... unlikely code goes here...

  and then the likely case ends up having unfortunate reloads inside
  the hot loop. Possibly because it has one fewer free registers than
  gcc because of the frame pointer.

  I didn't look into _why_ clang generates frame pointers but gcc
  doesn't. It may be just a compiler default, I think we don't end up
  explicitly asking either way.

[ And then Bill replied with this ]

  It's not a no-op. We add branch probabilities to the IR, whether
  they're honored or not depends on the transformation. But they
  *should* be honored when available. I've seen in the past that
  instead of moving unlikely blocks out of the way (like gcc, which
  moves them below the function's "ret" instruction), LLVM will do
  something like this:

  <...>

  I.e. the loop is rotated and the unlikely code is first and the
  hotter code is closer together but between the unlikely and
  conditional test. Could this be what's going on? Otherwise, maybe
  clang decided that it's not beneficial to move the code out-of-line
  because the benefit was minimal? (I'm guessing here.)
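
For anyone who wants to reproduce the comparison outside the kernel tree, here is a minimal user-space sketch of the pattern discussed above. It is not the __d_lookup_rcu() code. The likely()/unlikely() macros follow the kernel's definitions in include/linux/compiler.h, but struct entry, FLAG_CUSTOM_COMPARE, slow_compare() and lookup() are hypothetical stand-ins invented for this illustration, and the "cold" attribute on the slow path is an extra hint that the quoted kernel code does not necessarily use.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Same shape as the kernel's macros in include/linux/compiler.h. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical flag, standing in for DCACHE_OP_COMPARE. */
#define FLAG_CUSTOM_COMPARE 0x2u

/* Hypothetical table entry, standing in for a dentry. */
struct entry {
	unsigned int flags;
	const char *name;
};

/*
 * Hypothetical rare path: case-insensitive name comparison.
 * Returns 1 when the names match, 0 otherwise.
 * noinline + cold ask the compiler to keep it away from the hot loop.
 */
__attribute__((noinline, cold))
static int slow_compare(const char *a, const char *b)
{
	while (*a && *b) {
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return 0;
		a++;
		b++;
	}
	return *a == *b;
}

/* Returns the index of the first matching entry, or -1 if none matches. */
static int lookup(const struct entry *table, size_t n, const char *name)
{
	assert(name != NULL);

	for (size_t i = 0; i < n; i++) {
		const struct entry *e = &table[i];

		/* Rare case: the entry asks for its own comparison. */
		if (unlikely(e->flags & FLAG_CUSTOM_COMPARE)) {
			if (slow_compare(e->name, name))
				return (int)i;
			continue;
		}

		/* Common case: plain byte-wise comparison. */
		if (strcmp(e->name, name) == 0)
			return (int)i;
	}
	return -1;
}
```

Building this with "gcc -O2 -S" and "clang -O2 -S" and comparing where the slow_compare() call lands relative to the loop body is one way to check whether a given compiler version lays out the unlikely branch as the fallthrough. Results will vary with compiler version and flags, so treat this as a starting point for a bug report rather than as evidence either way.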