From: Jim Mattson
Date: Tue, 9 Jan 2018 13:19:40 -0800
Subject: Re: [PATCH 6/7] x86/svm: Set IBPB when running a different VCPU
To: Konrad Rzeszutek Wilk
Cc: jun.nakajima@intel.com, Paolo Bonzini, Arjan van de Ven, Liran Alon, dwmw@amazon.co.uk, bp@alien8.de, aliguori@amazon.com, Tom Lendacky, LKML, kvm list
In-Reply-To: <20180109211112.GO19756@char.us.oracle.com>
References: <74e86dd8-804e-c9f2-098f-773283ac7065@redhat.com> <1255f660-55c5-86f0-07d0-b5846af35c4a@redhat.com> <20180109203909.GG19756@char.us.oracle.com> <20180109204715.GL19756@char.us.oracle.com> <20180109211112.GO19756@char.us.oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

If my documentation is up-to-date, writing IBRS does not clear the RSB
(except for parts which contain an RSB that is not filled by 32
CALLs).

On Tue, Jan 9, 2018 at 1:11 PM, Konrad Rzeszutek Wilk wrote:
> On Tue, Jan 09, 2018 at 12:57:38PM -0800, Jim Mattson wrote:
>> Before VM-entry, don't we need to flush the BHB and the RSB to avoid
>> revealing KASLR information to the guest? (Thanks to Liran for
>> pointing this out.)
>
> Exactly.
>
> Or is touching with any value good enough?
>
> (Removing 't@char.us.oracle.com') from the email. Adding Jun.
>>
>> On Tue, Jan 9, 2018 at 12:47 PM, Konrad Rzeszutek Wilk wrote:
>> > On Tue, Jan 09, 2018 at 03:39:09PM -0500, Konrad Rzeszutek Wilk wrote:
>> >> On Tue, Jan 09, 2018 at 05:49:08PM +0100, Paolo Bonzini wrote:
>> >> > On 09/01/2018 17:23, Arjan van de Ven wrote:
>> >> > > On 1/9/2018 8:17 AM, Paolo Bonzini wrote:
>> >> > >> On 09/01/2018 16:19, Arjan van de Ven wrote:
>> >> > >>> On 1/9/2018 7:00 AM, Liran Alon wrote:
>> >> > >>>>
>> >> > >>>> ----- arjan@linux.intel.com wrote:
>> >> > >>>>
>> >> > >>>>> On 1/9/2018 3:41 AM, Paolo Bonzini wrote:
>> >> > >>>>>> The above ("IBRS simply disables the indirect branch predictor") was my
>> >> > >>>>>> take-away message from private discussion with Intel. My guess is that
>> >> > >>>>>> the vendors are just handwaving a spec that doesn't match what they have
>> >> > >>>>>> implemented, because honestly a microcode update is unlikely to do much
>> >> > >>>>>> more than an old-fashioned chicken bit. Maybe on Skylake it does
>> >> > >>>>>> though, since the performance characteristics of IBRS are so different
>> >> > >>>>>> from previous processors. Let's ask Arjan who might have more
>> >> > >>>>>> information about it, and hope he actually can disclose it...
>> >> > >>>>>
>> >> > >>>>> IBRS will ensure that, when set after the ring transition, no earlier
>> >> > >>>>> branch prediction data is used for indirect branches while IBRS is set
>> >> > >>
>> >> > >> Let me ask you my questions, which are independent of L0/L1/L2
>> >> > >> terminology.
>> >> > >>
>> >> > >> 1) Is vmentry/vmexit considered a ring transition, even if the guest is
>> >> > >> running in ring 0? If IBRS=1 in the guest and the host is using IBRS,
>> >> > >> the host will not do a wrmsr on exit. Is this safe for the host kernel?
>> >> > >
>> >> > > I think the CPU folks would want us to write the msr again.
>> >> >
>> >> > Want us, or need us---and if we don't do that, what happens? And if we
>> >> > have to do it, how is IBRS=1 different from an IBPB?...
>> >>
>> >> Arjan says 'ring transition' but I am pretty sure it is more of 'prediction
>> >> mode change'. And from what I have gathered so far moving from lower (guest)
>> >> to higher (hypervisor) has no bearing on the branch predictor. Meaning
>> >> the guest ring0 can attack us if we don't touch this MSR.
>> >>
>> >> We have to WRMSR 0x48 to 1 to flush out lower prediction. Aka this is a
>> >> 'reset' button and at every 'prediction mode' you have to hit this.
>> >
>> > I suppose this means that when we VMENTER the original fix (where we
>> > compare the host to guest) can stay - as we are entering a lower prediction
>> > mode. I wonder then what does writing 0 do to it? A nop?
>> >
>> >>
>> >>
>> >> Can we have a discussion on making a kvm-security mailing list
>> >> where we can figure all this out during embargo and not have these
>> >> misunderstandings.
>> >>
>> >> >
>> >> > Since I am at it, what happens on *current generation* CPUs if you
>> >> > always leave IBRS=1? Slow and safe, or fast and unsafe?
>> >> >
>> >> > >> 2) How will the future processors work where IBRS should always be =1?
>> >> > >
>> >> > > IBRS=1 should be "fire and forget this ever happened".
>> >> > > This is the only time anyone should use IBRS in practice
>> >> >
>> >> > And IBPB too I hope? But besides that, I need to know exactly how that
>> >> > is implemented to ensure that it's doing the right thing.
>> >> >
>> >> > > (and then the host turns it on and makes sure to not expose it to the
>> >> > > guests I hope)
>> >> >
>> >> > That's not that easy, because guests might have support for SPEC_CTRL
>> >> > but not for IA32_ARCH_CAPABILITIES.
>> >> >
>> >> > You could disable the SPEC_CTRL bit, but then the guest might think it
>> >> > is not secure.
>> >> > It might also actually *be* insecure, if you migrated to
>> >> > an older CPU where IBRS is not fire-and-forget.
>> >> >
>> >> > Paolo
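
For anyone trying to follow the two policies being debated above, here is
a rough model of them, not kernel code. It assumes Konrad's reading, which
the thread does not confirm: that a write to MSR 0x48 acts as a 'reset'
button that must be hit on every VM-exit when the host uses IBRS, while on
VM-entry the original compare-host-to-guest check can stay. FakeCpu,
vmentry, vmexit and round_trip are hypothetical names made up for this
sketch; only the MSR number (0x48) and the IBRS bit position are real.

```python
MSR_IA32_SPEC_CTRL = 0x48   # the "MSR 0x48" named in the thread
SPEC_CTRL_IBRS = 1 << 0     # IBRS is bit 0 of that MSR


class FakeCpu:
    """Hypothetical stand-in for a CPU: it only records MSR writes."""

    def __init__(self, spec_ctrl=0):
        self.spec_ctrl = spec_ctrl
        self.writes = []          # log of (msr, value) pairs

    def wrmsr(self, msr, value):
        if msr == MSR_IA32_SPEC_CTRL:
            self.spec_ctrl = value
        self.writes.append((msr, value))


def vmentry(cpu, host, guest):
    """Entry policy ("original fix"): write only if guest differs from host."""
    if guest != host:
        cpu.wrmsr(MSR_IA32_SPEC_CTRL, guest)


def vmexit(cpu, host, guest):
    """Exit policy under the assumed 'reset button' reading: a host that
    uses IBRS rewrites the MSR even when the value would not change."""
    if (host & SPEC_CTRL_IBRS) or guest != host:
        cpu.wrmsr(MSR_IA32_SPEC_CTRL, host)


def round_trip(host, guest):
    """Run one entry/exit pair and return how many MSR writes were issued."""
    cpu = FakeCpu(spec_ctrl=host)
    vmentry(cpu, host, guest)
    vmexit(cpu, host, guest)
    return len(cpu.writes)


if __name__ == "__main__":
    for host, guest in [(1, 1), (1, 0), (0, 0), (0, 1)]:
        print(f"host={host} guest={guest} writes={round_trip(host, guest)}")
```

The case Paolo asks about in question 1 is host=1, guest=1: a pure
compare-and-skip policy would issue no write on exit, whereas this model
issues one. Whether that write is actually required is the open question
in the thread, and nothing here settles it. The model also says nothing
about the RSB or BHB, which Jim raises separately.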