From: Linus Torvalds
Date: Fri, 2 Feb 2018 10:50:50 -0800
Subject: Re: [PATCH] KVM: x86: Reduce retpoline performance impact in slot_handle_level_range()
To: David Woodhouse
Cc: Thomas Gleixner, KarimAllah Ahmed, sironi@amazon.de, "the arch/x86 maintainers", KVM list, Paolo Bonzini, Linux Kernel Mailing List, Borislav Petkov, Peter Zijlstra
In-Reply-To: <1517583559-424-1-git-send-email-dwmw@amazon.co.uk>
References: <1517583559-424-1-git-send-email-dwmw@amazon.co.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Feb 2, 2018 at 6:59 AM, David Woodhouse wrote:
> With retpoline, tight loops of "call this function for every XXX" are
> very much pessimised by taking a prediction miss *every* time.
>
> This one showed up very high in our early testing, and it only has five
> things it'll ever call so make it take an 'op' enum instead of a
> function pointer and let's see how that works out...

Umm. May I suggest a different workaround?

Honestly, if this is so performance-critical, the *real* fix is to
actually just mark all those "slot_handle_*()" functions as
"always_inline".

Because that will *really* improve performance, by simply removing the
indirection entirely - since then the functions involved will become
static.
You might get other code improvements too, because I suspect
it will end up removing an extra level of function call due to those
trivial wrapper functions. And there's a couple of "bool
lock_flush_tlb" arguments that will simply become constant and
generate much better code that way.

And maybe you don't want to inline all of the slot_handle_*()
functions, and it's only one or two of them that matter because they
loop over a lot of entries, but honestly, most of those
slot_handle_xyz() functions seem to have just a couple of call sites
anyway. slot_handle_large_level() is probably already inlined by the
compiler because it only has a single call-site.

Will it make for bigger code? Yes. But probably not really all *that*
much bigger, because of how it also will allow the compiler to
simplify some things. And if this really is so critical that those
non-predicted calls were that noticeable, those other simplifications
probably also matter.

And then you get rid of all run-time conditionals, and all the
indirect jumps entirely. Plus the patch will be smaller and simpler
too.

Hmm?

              Linus
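
For illustration only, here is a minimal userspace sketch of the
always_inline idea from the message above. It is not code from the KVM
tree: walk_slots(), clear_slot(), halve_slot() and the two wrappers are
hypothetical names, and the attribute is spelled the GCC/Clang way
(the kernel uses its own __always_inline macro). The point it tries to
show is that once the generic walker is forced inline into each
wrapper, the function-pointer and bool arguments are compile-time
constants at every call site, so the compiler is free to replace the
indirect call with a direct or fully inlined one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-slot handler type; a stand-in, not the kernel's. */
typedef bool (*slot_handler_fn)(int *slot);

/*
 * Generic walker. Because it is always_inline, each wrapper below gets
 * its own copy in which `fn` and `flush` are known constants, so an
 * optimizing compiler can drop the indirect call and the `flush` test.
 */
static inline __attribute__((always_inline)) bool
walk_slots(int *slots, size_t n, slot_handler_fn fn, bool flush,
           int *flushes)
{
    bool changed = false;

    assert(slots != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        changed |= fn(&slots[i]);

    /* Stand-in for a conditional TLB flush: just count it. */
    if (flush && changed)
        (*flushes)++;
    return changed;
}

/* Two trivial handlers; each reports whether it modified the slot. */
static bool clear_slot(int *slot)
{
    bool was_set = (*slot != 0);

    *slot = 0;
    return was_set;
}

static bool halve_slot(int *slot)
{
    int old = *slot;

    *slot /= 2;
    return *slot != old;
}

/* Thin wrappers: the handler and the flush flag are constants here. */
static bool clear_all_slots(int *slots, size_t n, int *flushes)
{
    return walk_slots(slots, n, clear_slot, true, flushes);
}

static bool halve_all_slots(int *slots, size_t n, int *flushes)
{
    return walk_slots(slots, n, halve_slot, false, flushes);
}
```

Whether the indirect call really disappears depends on the compiler
and optimization level, so it is worth checking the generated assembly
(for example with objdump -d) rather than assuming it.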