Date: Mon, 8 Jul 2019 17:38:34 -0500
From: Josh Poimboeuf
To: Alexei Starovoitov
Cc: Ingo Molnar, Borislav Petkov, "H. Peter Anvin", Song Liu,
    Thomas Gleixner, Steven Rostedt, Kairui Song, Daniel Borkmann,
    Alexei Starovoitov, Peter Zijlstra, LKML,
    linux-tip-commits@vger.kernel.org
Subject: Re: [tip:x86/urgent] bpf: Fix ORC unwinding in non-JIT BPF code
Message-ID: <20190708223834.zx7u45a4uuu2yyol@treble>
References: <881939122b88f32be4c374d248c09d7527a87e35.1561685471.git.jpoimboe@redhat.com>
    <20190706202942.GA123403@gmail.com>
    <20190707013206.don22x3tfldec4zm@treble>
    <20190707055209.xqyopsnxfurhrkxw@treble>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 08, 2019 at 03:15:37PM -0700, Alexei Starovoitov wrote:
> > 2)
> >
> > After doing the first optimization, GCC then does another one which is
> > a little trickier.  It replaces:
> >
> >     select_insn:
> >         jmp *jumptable(, %rax, 8)
> >         ...
> >     ALU64_ADD_X:
> >         ...
> >         jmp *jumptable(, %rax, 8)
> >     ALU_ADD_X:
> >         ...
> >         jmp *jumptable(, %rax, 8)
> >
> > with
> >
> >     select_insn:
> >         mov jumptable, %r12
> >         jmp *(%r12, %rax, 8)
> >         ...
> >     ALU64_ADD_X:
> >         ...
> >         jmp *(%r12, %rax, 8)
> >     ALU_ADD_X:
> >         ...
> >         jmp *(%r12, %rax, 8)
> >
> > The problem is that it only moves the jumptable address into %r12
> > once, for the entire function, then it goes through multiple recursive
> > indirect jumps which rely on that %r12 value.  But objtool isn't yet
> > smart enough to be able to track the value across multiple recursive
> > indirect jumps through the jump table.
> >
> > After some digging I found that the quick and easy fix is to disable
> > -fgcse.
> > In fact, this seems to be recommended by the GCC manual, for
> > code like this:
> >
> >   -fgcse
> >     Perform a global common subexpression elimination pass.  This
> >     pass also performs global constant and copy propagation.
> >
> >     Note: When compiling a program using computed gotos, a GCC
> >     extension, you may get better run-time performance if you
> >     disable the global common subexpression elimination pass by
> >     adding -fno-gcse to the command line.
> >
> >     Enabled at levels -O2, -O3, -Os.
> >
> > This code indeed relies extensively on computed gotos.  I don't know
> > *why* disabling this optimization would improve performance.  In fact
> > I really don't see how it could make much of a difference either way.
> >
> > Anyway, using -fno-gcse makes optimization #2 go away and makes
> > objtool happy, with only a fix for #1 needed.
> >
> > If -fno-gcse isn't an option, we might be able to fix objtool by using
> > the "first_jump_src" thing which Peter added, improving it such that
> > it also takes table jumps into account.
>
> Sorry for delay. I'm mostly offgrid until next week.
> As far as -fno-gcse.. I don't mind as long as it doesn't hurt performance.
> Which I suspect it will :(
> All these indirect gotos are there for performance.
> Single indirect goto and a bunch of jmp select_insn
> are way slower, since there is only one instruction
> for cpu branch predictor to work with.
> When every insn is followed by "jmp *jumptable"
> there is more room for cpu to speculate.
> It's been long time, but when I wrote it the difference
> between all indirect goto vs single indirect goto was almost 2x.

Just to clarify, -fno-gcse doesn't get rid of any of the indirect jumps.
It still has 166 indirect jumps.  It just gets rid of the second
optimization, where the jumptable address is placed in a register.

If you have a benchmark which is relatively easy to use, I could try to
run some tests.

--
Josh
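
For anyone who wants to look at the dispatch pattern outside the kernel,
here is a minimal sketch of a computed-goto interpreter in which every
handler ends with its own indirect jump through the table.  It is only an
illustration of the "jmp *jumptable" shape discussed above, not the kernel's
BPF interpreter: the opcodes, struct insn, run_threaded() and DISPATCH()
are all made up for this example, and the program is assumed to end with
OP_EXIT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for illustration; these are not BPF opcodes. */
enum { OP_ADD, OP_SUB, OP_EXIT, OP_COUNT };

/* Hypothetical instruction layout: one opcode and one immediate. */
struct insn {
	uint8_t op;
	int64_t imm;
};

/*
 * Threaded dispatch: each handler finishes with its own indirect jump
 * through the jump table, rather than branching back to a single
 * shared dispatch point.  The caller must terminate the program with
 * OP_EXIT.
 */
static int64_t run_threaded(const struct insn *insn)
{
	/* "Labels as values" is a GCC extension (also accepted by Clang). */
	static const void * const jumptable[OP_COUNT] = {
		[OP_ADD]  = &&op_add,
		[OP_SUB]  = &&op_sub,
		[OP_EXIT] = &&op_exit,
	};
	int64_t acc = 0;

/* The assert is a debug-only bounds check; -DNDEBUG compiles it out. */
#define DISPATCH()					\
	do {						\
		assert(insn->op < OP_COUNT);		\
		goto *jumptable[insn->op];		\
	} while (0)

	DISPATCH();

op_add:
	acc += insn->imm;
	insn++;
	DISPATCH();
op_sub:
	acc -= insn->imm;
	insn++;
	DISPATCH();
op_exit:
	return acc;
#undef DISPATCH
}
```

Building this with "gcc -O2 -S" and again with "gcc -O2 -fno-gcse -S" and
comparing the indirect jumps in the output is one way to check whether the
table address gets placed in a register.  What a given GCC version does
with a function this small may well differ from what it does with the real
interpreter.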