Date: Thu, 4 Jan 2018 10:36:01 -0800
From: Alexei Starovoitov
To: Linus Torvalds
Cc: David Woodhouse, Andi Kleen, Paul Turner, LKML, Greg Kroah-Hartman,
    Tim Chen, Dave Hansen, Thomas Gleixner, Kees Cook, Rik van Riel,
    Peter Zijlstra, Andy Lutomirski, Jiri Kosina, One Thousand Gnomes
Subject: Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support
Message-ID: <20180104183559.wlqoxmp7rf4d44ku@ast-mbp>
References: <1515058213.12987.89.camel@amazon.co.uk>
    <20180104143710.8961-1-dwmw@amazon.co.uk>
    <20180104181744.komdplek7nfdvlsw@ast-mbp>

On Thu, Jan 04, 2018 at 10:25:35AM -0800, Linus Torvalds wrote:
> On Thu, Jan 4, 2018 at 10:17 AM, Alexei Starovoitov wrote:
> >
> > Clearly Paul's approach to retpoline without lfence is faster.
> > I'm guessing it wasn't shared with amazon/intel until now and
> > this set of patches going to adopt it, right?
> >
> > Paul, could you share a link to a set of alternative gcc patches
> > that do retpoline similar to llvm diff ?
>
> What is the alternative approach? Is it literally just doing a
>
>     call 1f
> 1:  mov real_target,(%rsp)
>     ret
>
> on the assumption that the "ret" will always just predict to that "1"
> due to the call stack?

Pretty much.
Paul's writeup: https://support.google.com/faqs/answer/7625886

tldr: jmp *%r11 gets converted to:

	call set_up_target;
capture_spec:
	pause;
	jmp capture_spec;
set_up_target:
	mov %r11, (%rsp);
	ret;

where capture_spec part will be looping speculatively.
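
To make the "predict due to the call stack" reasoning concrete, here is a rough toy model in Python. It is only a sketch under one assumption: that "ret" is predicted from a return stack buffer (RSB) filled by "call", while the real target is read from memory at (%rsp). The names retpoline_jmp, stack and rsb are made up for illustration; real predictors are microarchitecture-specific and this does not model timing or the pause loop itself.

```python
# Toy model only: real return predictors are microarchitecture-specific.
# Labels mirror the tldr sequence; "real_target" stands for whatever %r11 holds.

def retpoline_jmp(r11):
    """Model `jmp *%r11` rewritten as the call/pause/ret sequence.

    Returns (predicted, actual): where `ret` is predicted to go, and
    where it architecturally goes.
    """
    stack = []  # architectural stack (memory at %rsp)
    rsb = []    # assumed return stack buffer: the CPU's own record of calls

    # call set_up_target: the return address (the instruction after the
    # call, i.e. capture_spec) is pushed on the stack and recorded in the RSB.
    stack.append("capture_spec")
    rsb.append("capture_spec")

    # set_up_target: mov %r11, (%rsp) overwrites the return address in
    # memory only; the RSB entry is not touched by the store.
    stack[-1] = r11

    # ret: the prediction comes from the RSB, the real target from the stack.
    predicted = rsb.pop()
    actual = stack.pop()
    return predicted, actual


predicted, actual = retpoline_jmp("real_target")
print("speculation goes to:", predicted)  # capture_spec (the pause; jmp loop)
print("execution goes to:", actual)       # real_target
```

In this model the speculative path never depends on the value in %r11, so under the stated assumption the indirect branch predictor is not consulted at all, and the mispredicted "ret" just spins in capture_spec until it resolves.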