Date: Tue, 23 Jan 2018 08:53:58 +0100
From: Ingo Molnar
To: David Woodhouse
Cc: Linus Torvalds, KarimAllah Ahmed, Linux Kernel Mailing List,
	Andi Kleen, Andrea Arcangeli, Andy Lutomirski, Arjan van de Ven,
	Ashok Raj, Asit Mallick, Borislav Petkov, Dan Williams, Dave Hansen,
	Greg Kroah-Hartman, "H. Peter Anvin", Ingo Molnar,
	Janakarajan Natarajan, Joerg Roedel, Jun Nakajima, Laura Abbott,
	Masami Hiramatsu, Paolo Bonzini, Peter Zijlstra, Radim Krčmář,
	Thomas Gleixner, Tim Chen, Tom Lendacky, KVM list,
	the arch/x86 maintainers, Arjan Van De Ven
Subject: Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict
	Indirect Branch Speculation
Message-ID: <20180123075358.nztpyxympwfkyi2a@gmail.com>
References: <1516476182-5153-1-git-send-email-karahmed@amazon.de>
	<1516476182-5153-10-git-send-email-karahmed@amazon.de>
	<1516566497.9814.78.camel@infradead.org>
	<1516572013.9814.109.camel@infradead.org>
	<1516638426.9521.20.camel@infradead.org>
	<20180123072930.soz25cyky3u4hpgv@gmail.com>
In-Reply-To: <20180123072930.soz25cyky3u4hpgv@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org


* Ingo Molnar wrote:

> * David Woodhouse wrote:
>
> > But wait, why did I say "mostly"? Well, not everyone has a retpoline
> > compiler yet... but OK, screw them; they need to update.
> >
> > Then there's Skylake, and that generation of CPU cores. For complicated
> > reasons they actually end up being vulnerable not just on indirect
> > branches, but also on a 'ret' in some circumstances (such as 16+ CALLs
> > in a deep chain).
> >
> > The IBRS solution, ugly though it is, did address that. Retpoline
> > doesn't. There are patches being floated to detect and prevent deep
> > stacks, and deal with some of the other special cases that bite on SKL,
> > but those are icky too. And in fact IBRS performance isn't anywhere
> > near as bad on this generation of CPUs as it is on earlier CPUs
> > *anyway*, which makes it not quite so insane to *contemplate* using it
> > as Intel proposed.
>
> There's another possible method to avoid deep stacks on Skylake, without compiler
> support:
>
>   - Use the existing mcount based function tracing live patching machinery
>     (CONFIG_FUNCTION_TRACER=y) to install a _very_ fast and simple stack depth
>     tracking tracer which would issue a retpoline when stack depth crosses
>     boundaries of ~16 entries.

The patch below demonstrates the principle: it forcibly enables dynamic ftrace
patching (CONFIG_DYNAMIC_FTRACE=y et al) and turns mcount/__fentry__ into a RET:

  ffffffff81a01a40 <__fentry__>:
  ffffffff81a01a40:       c3                      retq

This would have to be extended with (very simple) call stack depth tracking (just
3 more instructions would do in the fast path, I believe) and a suitable Skylake
workaround (and it also has to play nice with the ftrace callbacks).

On non-Skylake the overhead would be 0 cycles.

On Skylake this would add an overhead of maybe 2-3 cycles per function call, and
obviously all this code and data would be very cache hot. Given that the average
number of function calls per system call is around a dozen, this would be _much_
faster than any microcode/MSR based approach.

Is there a testcase for the Skylake 16-deep-call-stack problem that I could run?
Is there a description of the exact speculative execution vulnerability that has
to be addressed to begin with?

If this approach is workable I'd much prefer it to any MSR writes in the syscall
entry path, not just because it's fast enough in practice to not be turned off by
everyone, but also because everyone would agree that per function call overhead
needs to go away on new CPUs. Deployment and backporting are also _much_ more
flexible, simpler, faster and more complete than with microcode/firmware or
compiler based solutions.

Assuming the vulnerability can be addressed via this route, that is, which is a
big assumption!
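
For illustration only, here is a minimal user-space sketch in C of the depth
tracking idea described above. It is not the proposed kernel code: the names
call_depth, depth_enter(), depth_leave(), skylake_workaround() and
simulate_nested_calls() are all hypothetical, the boundary of 16 is taken from
the "~16 entries" figure, and the return-side hook is an assumption, since the
mail does not say how the depth would be decremented (__fentry__ itself only
sees function entry). The fast path is an increment, a mask test and a rarely
taken branch.

```c
#define DEPTH_BOUNDARY 16u	/* assumed, from the "~16 entries" figure */

static unsigned int call_depth;		/* hypothetical per-CPU depth counter */
static unsigned int slow_path_hits;	/* how often the workaround ran */

/* Hypothetical slow path: stands in for whatever Skylake workaround is chosen. */
static void skylake_workaround(void)
{
	slow_path_hits++;
}

/* Hypothetical entry hook: fires when the depth crosses a multiple of 16. */
static void depth_enter(void)
{
	if ((++call_depth & (DEPTH_BOUNDARY - 1)) == 0)
		skylake_workaround();
}

/* Hypothetical return-side hook (an assumption of this sketch). */
static void depth_leave(void)
{
	call_depth--;
}

/* A call chain that is 'levels' frames deep, instrumented on entry and exit. */
static void nested(unsigned int levels)
{
	depth_enter();
	if (levels > 1)
		nested(levels - 1);
	depth_leave();
}

/* Run one simulated call chain and report how often the slow path was taken. */
unsigned int simulate_nested_calls(unsigned int levels)
{
	call_depth = 0;
	slow_path_hits = 0;
	if (levels > 0)
		nested(levels);
	return slow_path_hits;
}
```

Calling simulate_nested_calls(40) takes the slow path twice (at depths 16 and
32), while a chain of 15 frames never leaves the fast path. That is the
property the 2-3 cycle estimate relies on: shallow call chains only pay for
the counter update.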

Thanks,

	Ingo

 arch/x86/Kconfig            | 3 +++
 arch/x86/kernel/ftrace_64.S | 1 +
 2 files changed, 4 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 423e4b64e683..df471538a79c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -133,6 +133,8 @@ config X86
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS
+	select DYNAMIC_FTRACE
+	select DYNAMIC_FTRACE_WITH_REGS
 	select HAVE_EBPF_JIT			if X86_64
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS
 	select HAVE_EXIT_THREAD
@@ -140,6 +142,7 @@ config X86
 	select HAVE_FTRACE_MCOUNT_RECORD
 	select HAVE_FUNCTION_GRAPH_TRACER
 	select HAVE_FUNCTION_TRACER
+	select FUNCTION_TRACER
 	select HAVE_GCC_PLUGINS
 	select HAVE_HW_BREAKPOINT
 	select HAVE_IDE
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 7cb8ba08beb9..1e219e0f2887 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -19,6 +19,7 @@ EXPORT_SYMBOL(__fentry__)
 # define function_hook	mcount
 EXPORT_SYMBOL(mcount)
 #endif
+	ret
 
 /* All cases save the original rbp (8 bytes) */
 #ifdef CONFIG_FRAME_POINTER