From: Paul Turner
Date: Fri, 5 Jan 2018 04:20:01 -0800
Subject: Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support
To: Paolo Bonzini
Cc: David Woodhouse, Alexei Starovoitov, Linus Torvalds, Andi Kleen, LKML, Greg Kroah-Hartman, Tim Chen, Dave Hansen, Thomas Gleixner, Kees Cook, Rik van Riel, Peter Zijlstra, Andy Lutomirski, Jiri Kosina, One Thousand Gnomes

On Fri, Jan 5, 2018 at 3:26 AM, Paolo Bonzini wrote:
> On 05/01/2018 11:28, Paul Turner wrote:
>>
>> The "pause; jmp" sequence proved minutely faster than "lfence;jmp" which is why
>> it was chosen.
>>
>>   "pause; jmp"   33.231 cycles/call 9.517 ns/call
>>   "lfence; jmp"  33.354 cycles/call 9.552 ns/call
>
> Do you have timings for a non-retpolined indirect branch with the
> predictor suppressed via IBRS=1?  So at least we can compute the break
> even point.

The data I collected here previously had the run-time cost as a wash.
On Skylake, an indirect branch with IBRS=1 and a retpolined indirect
branch cost within a few cycles of each other.

The costs to consider when making a choice here are:

- The transition overheads.
  This is how frequently you will be switching in and out of protected
  code (as IBRS needs to be enabled and disabled at these boundaries).
- The frequency at which you will be executing protected code on one
  sibling, and unprotected code on another (enabling IBRS may affect
  sibling execution, depending on SKU).
- The implementation cost (retpoline requires auditing/rebuilding your
  target, while IBRS can be used out of the box).

>
> Paolo
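
For the break-even question, a rough sketch of the arithmetic may help. It models only the first item in the list above (transition overhead) plus the per-call cost; sibling effects and implementation cost are not modeled. The function names are made up for illustration, and the IBRS per-call and per-transition cycle counts in the example are placeholders, not measurements. Only the 33.231 cycles/call figure comes from the numbers quoted above.

```python
def retpoline_cost(calls, retp_cycles_per_call):
    """Total cycles for `calls` retpolined indirect branches."""
    return calls * retp_cycles_per_call


def ibrs_cost(calls, ibrs_cycles_per_call, transitions, cycles_per_transition):
    """Total cycles with IBRS: per-call cost plus the cost of enabling and
    disabling IBRS at each entry/exit of protected code."""
    return calls * ibrs_cycles_per_call + transitions * cycles_per_transition


def break_even_calls_per_transition(retp_cycles, ibrs_cycles, transition_cycles):
    """Indirect calls per transition at which both approaches cost the same.

    Returns None when IBRS is not cheaper per call, since it can then never
    amortize its transition overhead.
    """
    saving_per_call = retp_cycles - ibrs_cycles
    if saving_per_call <= 0:
        return None
    return transition_cycles / saving_per_call


if __name__ == "__main__":
    RETP_CYCLES = 33.231          # "pause; jmp" figure quoted above
    IBRS_CYCLES = 30.0            # hypothetical placeholder
    TRANSITION_CYCLES = 200.0     # hypothetical placeholder
    print(break_even_calls_per_transition(RETP_CYCLES, IBRS_CYCLES, TRANSITION_CYCLES))
```

If the per-call costs really are a wash, the per-call saving is close to zero and the break-even count becomes very large (or undefined), which is why the transition frequency ends up dominating the choice.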