Subject: Re: [tip:x86/build] x86, retpolines: Raise limit for generating indirect calls from switch-case
To: "H.J. Lu"
Cc: David Woodhouse, Ingo Molnar, bjorn.topel@intel.com, David Miller,
 brouer@redhat.com, magnus.karlsson@intel.com, Andy Lutomirski,
 "H. Peter Anvin", Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
 Linus Torvalds, LKML, ast@kernel.org, linux-tip-commits@vger.kernel.org
References: <20190221221941.29358-1-daniel@iogearbox.net>
 <33bf951448e7d916fd4a6ad41cd3d040e9d1f118.camel@infradead.org>
 <79add9a9-543b-a791-ecbe-79edd49f1bb3@iogearbox.net>
 <4604e680-7962-f1ee-5b79-711247f4e7d5@iogearbox.net>
From: Daniel Borkmann
Message-ID: <627ccc0b-3aca-7935-d370-8ad7b7056a64@iogearbox.net>
Date: Thu, 28 Feb 2019 19:13:03 +0100
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/28/2019 07:09 PM, H.J. Lu wrote:
> On Thu, Feb 28, 2019 at 9:58 AM Daniel Borkmann wrote:
>> On 02/28/2019 05:25 PM, H.J. Lu wrote:
>>> On Thu, Feb 28, 2019 at 8:18 AM Daniel Borkmann wrote:
>>>> On 02/28/2019 01:53 PM, H.J. Lu wrote:
>>>>> On Thu, Feb 28, 2019 at 3:27 AM David Woodhouse wrote:
>>>>>> On Thu, 2019-02-28 at 03:12 -0800, tip-bot for Daniel Borkmann wrote:
>>>>>>> Commit-ID: ce02ef06fcf7a399a6276adb83f37373d10cbbe1
>>>>>>> Gitweb: https://git.kernel.org/tip/ce02ef06fcf7a399a6276adb83f37373d10cbbe1
>>>>>>> Author: Daniel Borkmann
>>>>>>> AuthorDate: Thu, 21 Feb 2019 23:19:41 +0100
>>>>>>> Committer: Thomas Gleixner
>>>>>>> CommitDate: Thu, 28 Feb 2019 12:10:31 +0100
>>>>>>>
>>>>>>> x86, retpolines: Raise limit for generating indirect calls from switch-case
>>>>>>>
>>>>>>> From networking side, there are numerous attempts to get rid of indirect
>>>>>>> calls in fast-path wherever feasible in order to avoid the cost of
>>>>>>> retpolines, for example, just to name a few:
>>>>>>>
>>>>>>> * 283c16a2dfd3 ("indirect call wrappers: helpers to speed-up indirect calls of builtin")
>>>>>>> * aaa5d90b395a ("net: use indirect call wrappers at GRO network layer")
>>>>>>> * 028e0a476684 ("net: use indirect call wrappers at GRO transport layer")
>>>>>>> * 356da6d0cde3 ("dma-mapping: bypass indirect calls for dma-direct")
>>>>>>> * 09772d92cd5a ("bpf: avoid retpoline for lookup/update/delete calls on maps")
>>>>>>> * 10870dd89e95 ("netfilter: nf_tables: add direct calls for all builtin expressions")
>>>>>>> [...]
>>>>>>>
>>>>>>> Recent work on XDP from Björn and Magnus additionally found that manually
>>>>>>> transforming the XDP return code switch statement with more than 5 cases
>>>>>>> into if-else combination would result in a considerable speedup in XDP
>>>>>>> layer due to avoidance of indirect calls in CONFIG_RETPOLINE enabled
>>>>>>> builds.
>>>>>>
>>>>>> +HJL
>>>>>>
>>>>>> This is a GCC bug, surely? It should know how expensive each
>>>>>> instruction is, and choose which to use accordingly. That should be
>>>>>> true even when the indirect branch "instruction" is a retpoline, and
>>>>>> thus enormously expensive.
>>>>>>
>>>>>> I believe this is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86952 so
>>>>>> please at least reference that bug, and be prepared to turn this hack
>>>>>> off when GCC is fixed.
>>>>>
>>>>> We couldn't find a testcase to show jump table with indirect branch
>>>>> is slower than direct branches.
>>>>
>>>> Ok, I've just checked https://github.com/marxin/microbenchmark/tree/retpoline-table
>>>> with the below on top.
>>>>
>>>>  Makefile | 6 +++---
>>>>  switch.c | 2 +-
>>>>  test.c   | 6 ++++--
>>>>  3 files changed, 8 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/Makefile b/Makefile
>>>> index bd83233..ea81520 100644
>>>> --- a/Makefile
>>>> +++ b/Makefile
>>>> @@ -1,16 +1,16 @@
>>>>  CC=gcc
>>>>  CFLAGS=-g -I.
>>>> -CFLAGS+=-O2 -mindirect-branch=thunk
>>>> +CFLAGS+=-O2 -mindirect-branch=thunk-inline -mindirect-branch-register
>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>
>>> Does slowdown show up only with -mindirect-branch=thunk-inline?
>>
>> Not really, numbers are in similar range / outcome. Additionally, I also tried
>> on a bit bigger machine (Xeon Gold 5120 this time). First is thunk-inline, second
>> is thunk, and third is w/o raising limit for comparison; first test (from last
>> mail) on that machine:
>
> Please re-open:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86952
>
> with new info.

Yeah will do, thanks!
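
As a rough illustration of the transformation described in the quoted commit message (a switch statement with more than 5 cases rewritten by hand as an if-else combination), a minimal C sketch could look like the following. The enum, struct and function names are hypothetical and are not taken from the kernel or from the microbenchmark repository. Whether a compiler actually lowers the switch to a jump table depends on its version, options and heuristics, so the comments only say what may happen.

```c
#include <string.h>

/* Hypothetical verdict codes; the names are illustrative only. */
enum action { ACT_ABORTED, ACT_DROP, ACT_PASS, ACT_TX, ACT_REDIRECT, ACT_OTHER };

/* Hypothetical per-verdict counters, so each case does distinct work. */
struct stats {
	unsigned long aborted, dropped, passed, sent, redirected, other, unknown;
};

/*
 * Switch form: with this many dense cases the compiler may emit a jump
 * table, i.e. an indirect branch, which becomes a retpoline thunk when
 * building with an option such as -mindirect-branch=thunk.
 */
int account_switch(struct stats *s, int act)
{
	switch (act) {
	case ACT_ABORTED:  s->aborted++;    return 0;
	case ACT_DROP:     s->dropped++;    return 0;
	case ACT_PASS:     s->passed++;     return 1;
	case ACT_TX:       s->sent++;       return 1;
	case ACT_REDIRECT: s->redirected++; return 1;
	case ACT_OTHER:    s->other++;      return 1;
	default:           s->unknown++;    return -1;
	}
}

/*
 * Hand-written if-else form with the same behaviour: written as a chain
 * of comparisons, which is typically compiled to direct branches.
 */
int account_ifelse(struct stats *s, int act)
{
	if (act == ACT_ABORTED) {
		s->aborted++;
		return 0;
	} else if (act == ACT_DROP) {
		s->dropped++;
		return 0;
	} else if (act == ACT_PASS) {
		s->passed++;
		return 1;
	} else if (act == ACT_TX) {
		s->sent++;
		return 1;
	} else if (act == ACT_REDIRECT) {
		s->redirected++;
		return 1;
	} else if (act == ACT_OTHER) {
		s->other++;
		return 1;
	}
	s->unknown++;
	return -1;
}

/* Returns 1 if both variants give the same return values and counters. */
int variants_agree(void)
{
	struct stats a, b;
	int act;

	memset(&a, 0, sizeof(a));
	memset(&b, 0, sizeof(b));
	/* Cover every known action plus one unknown value on each side. */
	for (act = -1; act <= ACT_OTHER + 1; act++)
		if (account_switch(&a, act) != account_ifelse(&b, act))
			return 0;
	return memcmp(&a, &b, sizeof(a)) == 0;
}
```

Compiling both functions with the flags from the Makefile diff quoted above (-O2 -mindirect-branch=thunk) and comparing the generated assembly is one way to check whether the switch variant ended up as a jump table on a given compiler.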