Date: Fri, 12 Jan 2018 17:41:18 -0600
From: Josh Poimboeuf
To: David Woodhouse
Cc: Andrew Cooper, Andi Kleen, Paul Turner, LKML, Linus Torvalds,
    Greg Kroah-Hartman, Tim Chen, Dave Hansen, tglx@linutronix.de,
    Kees Cook, Rik van Riel, Peter Zijlstra, Andy Lutomirski,
    Jiri Kosina, gnomes@lxorguk.ukuu.org.uk, x86@kernel.org,
    thomas.lendacky@amd.com
Subject: Re: [PATCH] x86/retpoline: Fill RSB on context switch for affected CPUs
Message-ID: <20180112234118.3rfikgt5ndvuf7lu@treble>
References: <1515779365-9032-1-git-send-email-dwmw@amazon.co.uk>
    <1515783378.22302.482.camel@infradead.org>
In-Reply-To: <1515783378.22302.482.camel@infradead.org>

On Fri, Jan 12, 2018 at 06:56:18PM +0000, David Woodhouse wrote:
> On Fri, 2018-01-12 at 18:05 +0000, Andrew Cooper wrote:
> >
> > If you unconditionally fill the RSB on every entry to supervisor mode,
> > then there are never guest-controlled RSB values to be found.
> >
> > With that property (and IBRS to protect Skylake+), you shouldn't need
> > RSB filling anywhere in the middle.
>
> Yes, that's right.
>
> We have a choice: we can do it on kernel entry (in the interrupt and
> syscall and NMI paths), and that's nice and easy and really safe
> because we know there's *never* a bad RSB entry lurking while we're in
> the kernel.
>
> The alternative, which is what we seem to be leaning towards now in
> the latest tables from Dave (https://goo.gl/pXbvBE and
> https://goo.gl/Grbuhf), is to do it on context switch when we might be
> switching from a shallow call stack to a deeper one. Which has much
> better performance characteristics for processes which make
> non-sleeping syscalls.
>
> The caveat with the latter approach is that we do depend on the fact
> that context switches are the only imbalance in the kernel. But that's
> OK; we don't have a longjmp or anything else like that. Especially
> that goes into a *deeper* call stack. Do we?

At least some generated code might create RSB imbalances.  Function
graph tracing and kretprobes, for example.  They mess with the return
path and could probably underflow the RSB pretty easily.  I guess they'd
need to be reworked a bit so they only do a single ret.

-- 
Josh
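
To make the imbalance argument concrete, here is a rough toy model of the RSB as a bounded stack.  It is only a sketch: the names (ToyRSB, extra_ret_underflows, switch_underflows) are made up for illustration, the depth of 16 is an assumption, and real hardware behaviour on underflow is more involved than a counter.  It models two cases from the thread: a return path that executes more rets than there were calls, and a switch from a shallow call stack to a deeper one, with and without a fill.

```python
from collections import deque

RSB_DEPTH = 16  # assumed depth, for illustration only


class ToyRSB:
    """Hypothetical model: calls push a prediction, rets pop one."""

    def __init__(self, depth=RSB_DEPTH):
        # A full deque drops its oldest entry on append, which loosely
        # mirrors a fixed-size hardware buffer being overwritten.
        self.entries = deque(maxlen=depth)
        self.underflows = 0

    def call(self, return_addr):
        self.entries.append(return_addr)

    def ret(self):
        if self.entries:
            return self.entries.pop()
        # Empty buffer: the prediction has to come from somewhere else.
        self.underflows += 1
        return None

    def fill(self, benign="benign"):
        # Overwrite every slot with a harmless entry.
        for _ in range(self.entries.maxlen):
            self.entries.append(benign)


def extra_ret_underflows(calls, rets):
    """Underflows when a return path executes `rets` rets after `calls` calls."""
    rsb = ToyRSB()
    for i in range(calls):
        rsb.call(f"ret_{i}")
    for _ in range(rets):
        rsb.ret()
    return rsb.underflows


def switch_underflows(outgoing_depth, incoming_depth, fill_on_switch):
    """Underflows when a task unwinds `incoming_depth` frames after a switch."""
    rsb = ToyRSB()
    for i in range(outgoing_depth):
        rsb.call(f"old_{i}")  # entries left behind by the outgoing task
    if fill_on_switch:
        rsb.fill()
    for _ in range(incoming_depth):
        rsb.ret()
    return rsb.underflows
```

In this model a balanced path never underflows, a single extra ret underflows once, and a fill at the switch covers an incoming stack up to the assumed depth.  It does not capture stale-but-present entries being wrong predictions, only the empty case.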