In-Reply-To: <1517252472.18619.28.camel@infradead.org>
References: <1517167750-23485-1-git-send-email-karahmed@amazon.de> <1517252472.18619.28.camel@infradead.org>
From: Jim Mattson
Date: Mon, 29 Jan 2018 11:04:27 -0800
Subject: Re: [PATCH] x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL
To: David Woodhouse
Cc: KarimAllah Ahmed, kvm list, LKML, Asit Mallick, Arjan Van De Ven, Dave Hansen, Andi Kleen, Andrea Arcangeli, Linus Torvalds, Tim Chen, Thomas Gleixner, Dan Williams, Jun Nakajima, Paolo Bonzini, Greg KH, Andy Lutomirski, Ashok Raj
X-Mailing-List: linux-kernel@vger.kernel.org

Can I assume you'll send out a new version with the fixes?

On Mon, Jan 29, 2018 at 11:01 AM, David Woodhouse wrote:
>
> (Top-posting; sorry.)
>
> Much of that is already fixed during our day, in
> http://git.infradead.org/linux-retpoline.git/shortlog/refs/heads/ibpb
>
> I forgot to fix up the wrong-MSR typo though, and we do still need to address reset.
>
> On Mon, 2018-01-29 at 10:43 -0800, Jim Mattson wrote:
>> On Sun, Jan 28, 2018 at 11:29 AM, KarimAllah Ahmed wrote:
>> >
>> > Add direct access to MSR_IA32_SPEC_CTRL for guests.
>> > This is needed for guests
>> > that will only mitigate Spectre V2 through IBRS+IBPB and will not be using a
>> > retpoline+IBPB based approach.
>> >
>> > To avoid the overhead of atomically saving and restoring the MSR_IA32_SPEC_CTRL
>> > for guests that do not actually use the MSR, only add_atomic_switch_msr when a
>> > non-zero is written to it.
>> >
>> > Cc: Asit Mallick
>> > Cc: Arjan Van De Ven
>> > Cc: Dave Hansen
>> > Cc: Andi Kleen
>> > Cc: Andrea Arcangeli
>> > Cc: Linus Torvalds
>> > Cc: Tim Chen
>> > Cc: Thomas Gleixner
>> > Cc: Dan Williams
>> > Cc: Jun Nakajima
>> > Cc: Paolo Bonzini
>> > Cc: David Woodhouse
>> > Cc: Greg KH
>> > Cc: Andy Lutomirski
>> > Signed-off-by: KarimAllah Ahmed
>> > Signed-off-by: Ashok Raj
>> > ---
>> >  arch/x86/kvm/cpuid.c |  4 +++-
>> >  arch/x86/kvm/cpuid.h |  1 +
>> >  arch/x86/kvm/vmx.c   | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>> >  3 files changed, 67 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> > index 0099e10..dc78095 100644
>> > --- a/arch/x86/kvm/cpuid.c
>> > +++ b/arch/x86/kvm/cpuid.c
>> > @@ -70,6 +70,7 @@ u64 kvm_supported_xcr0(void)
>> >  /* These are scattered features in cpufeatures.h. */
>> >  #define KVM_CPUID_BIT_AVX512_4VNNIW     2
>> >  #define KVM_CPUID_BIT_AVX512_4FMAPS     3
>> > +#define KVM_CPUID_BIT_SPEC_CTRL         26
>> >  #define KF(x) bit(KVM_CPUID_BIT_##x)
>> >
>> >  int kvm_update_cpuid(struct kvm_vcpu *vcpu)
>> > @@ -392,7 +393,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>> >
>> >  	/* cpuid 7.0.edx*/
>> >  	const u32 kvm_cpuid_7_0_edx_x86_features =
>> > -		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
>> > +		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
>> > +		(boot_cpu_has(X86_FEATURE_SPEC_CTRL) ? KF(SPEC_CTRL) : 0);
>> Isn't 'boot_cpu_has()' superfluous here? And aren't there two bits to
>> pass through for existing CPUs (26 and 27)?
>>
>> >
>> >
>> >  	/* all calls to cpuid_count() should be made on the same cpu */
>> >  	get_cpu();
>> > diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
>> > index cdc70a3..dcfe227 100644
>> > --- a/arch/x86/kvm/cpuid.h
>> > +++ b/arch/x86/kvm/cpuid.h
>> > @@ -54,6 +54,7 @@ static const struct cpuid_reg reverse_cpuid[] = {
>> >  	[CPUID_8000_000A_EDX] = {0x8000000a, 0, CPUID_EDX},
>> >  	[CPUID_7_ECX]         = {         7, 0, CPUID_ECX},
>> >  	[CPUID_8000_0007_EBX] = {0x80000007, 0, CPUID_EBX},
>> > +	[CPUID_7_EDX]         = {         7, 0, CPUID_EDX},
>> >  };
>> >
>> >  static __always_inline struct cpuid_reg x86_feature_cpuid(unsigned x86_feature)
>> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> > index aa8638a..1b743a0 100644
>> > --- a/arch/x86/kvm/vmx.c
>> > +++ b/arch/x86/kvm/vmx.c
>> > @@ -920,6 +920,9 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked);
>> >  static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12,
>> >  					    u16 error_code);
>> >  static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
>> > +static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
>> > +							  u32 msr, int type);
>> > +
>> >
>> >  static DEFINE_PER_CPU(struct vmcs *, vmxarea);
>> >  static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
>> > @@ -2007,6 +2010,28 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
>> >  	m->host[i].value = host_val;
>> >  }
>> >
>> > +/* do not touch guest_val and host_val if the msr is not found */
>> > +static int read_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
>> > +				  u64 *guest_val, u64 *host_val)
>> > +{
>> > +	unsigned i;
>> > +	struct msr_autoload *m = &vmx->msr_autoload;
>> > +
>> > +	for (i = 0; i < m->nr; ++i)
>> > +		if (m->guest[i].index == msr)
>> > +			break;
>> > +
>> > +	if (i == m->nr)
>> > +		return 1;
>> > +
>> > +	if (guest_val)
>> > +		*guest_val = m->guest[i].value;
>> > +	if (host_val)
>> > +		*host_val = m->host[i].value;
>> > +
>> > +	return 0;
>> > +}
>> > +
>> >  static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
>> >  {
>> >  	u64 guest_efer = vmx->vcpu.arch.efer;
>> > @@ -3203,7 +3228,9 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
>> >   */
>> >  static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>> >  {
>> > +	u64 spec_ctrl = 0;
>> >  	struct shared_msr_entry *msr;
>> > +	struct vcpu_vmx *vmx = to_vmx(vcpu);
>> >
>> >  	switch (msr_info->index) {
>> >  #ifdef CONFIG_X86_64
>> > @@ -3223,6 +3250,19 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>> >  	case MSR_IA32_TSC:
>> >  		msr_info->data = guest_read_tsc(vcpu);
>> >  		break;
>> > +	case MSR_IA32_SPEC_CTRL:
>> > +		if (!msr_info->host_initiated &&
>> > +		    !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
>> Shouldn't this conjunct be:
>>         !(guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
>>           guest_cpuid_has(vcpu, X86_FEATURE_STIBP))?
>>
>> >
>> > +			return 1;
>> What if !boot_cpu_has(X86_FEATURE_SPEC_CTRL) &&
>> !boot_cpu_has(X86_FEATURE_STIBP)? That should also return 1, I think.
>>
>> >
>> > +
>> > +		/*
>> > +		 * If the MSR is not in the atomic list yet, then it was never
>> > +		 * written to. So the MSR value will be '0'.
>> > +		 */
>> > +		read_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, &spec_ctrl, NULL);
>> Why not just add msr_ia32_spec_ctrl to struct vcpu_vmx, so that you
>> don't have to search the atomic switch list?
>>
>> >
>> > +
>> > +		msr_info->data = spec_ctrl;
>> > +		break;
>> >  	case MSR_IA32_SYSENTER_CS:
>> >  		msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
>> >  		break;
>> > @@ -3289,6 +3329,13 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>> >  	int ret = 0;
>> >  	u32 msr_index = msr_info->index;
>> >  	u64 data = msr_info->data;
>> > +	unsigned long *msr_bitmap;
>> > +
>> > +	/*
>> > +	 * IBRS is not used (yet) to protect the host. Once it does, this
>> > +	 * variable needs to be a bit smarter.
>> > +	 */
>> > +	u64 host_spec_ctrl = 0;
>> >
>> >  	switch (msr_index) {
>> >  	case MSR_EFER:
>> > @@ -3330,6 +3377,22 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>> >  	case MSR_IA32_TSC:
>> >  		kvm_write_tsc(vcpu, msr_info);
>> >  		break;
>> > +	case MSR_IA32_SPEC_CTRL:
>> > +		if (!msr_info->host_initiated &&
>> > +		    !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
>> > +			return 1;
>> This looks incomplete. As above, what if
>> !boot_cpu_has(X86_FEATURE_SPEC_CTRL) &&
>> !boot_cpu_has(X86_FEATURE_STIBP)?
>> If the host doesn't support MSR_IA32_SPEC_CTRL, you'll get a VMX-abort
>> on loading the host MSRs from the VM-exit MSR load list.
>>
>> Also, what if the value being written is illegal?
>>
>> /*
>>  * Processors that support IBRS but not STIBP
>>  * (CPUID.(EAX=07H, ECX=0):EDX[27:26] = 01b) will
>>  * ignore attempts to set STIBP instead of causing an
>>  * exception due to setting that reserved bit.
>>  */
>> if ((data & ~(u64)(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP)) ||
>>     ((data & SPEC_CTRL_IBRS) &&
>>      !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL)))
>>         return 1;
>>
>> >
>> > +
>> > +		/*
>> > +		 * Now we know that the guest is actually using the MSR, so
>> > +		 * atomically load and save the SPEC_CTRL MSR and pass it
>> > +		 * through to the guest.
>> > +		 */
>> > +		add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, msr_info->data,
>> > +				      host_spec_ctrl);
>> > +		msr_bitmap = vmx->vmcs01.msr_bitmap;
>> > +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
>> I assume you mean MSR_IA32_SPEC_CTRL rather than MSR_FS_BASE.
>>
>> Also, what if the host and the guest support a different set of bits
>> in MSR_IA32_SPEC_CTRL, due to a userspace modification of the guest's
>> CPUID info?
>>
>> >
>> > +
>> > +		break;
>> >  	case MSR_IA32_CR_PAT:
>> >  		if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
>> >  			if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
>> > --
>> > 2.7.4
>> >
>> Where do you preserve the guest's MSR_IA32_SPEC_CTRL value on VM-exit,
>> if the guest has been given permission to write the MSR?
>>
>> You also have to clear the guest's MSR_IA32_SPEC_CTRL on
>> vmx_vcpu_reset, don't you?
>>
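
For illustration only, three of the review points quoted above (reject illegal
values, keep the guest value in a per-vCPU field instead of searching the
atomic switch list, and clear it on reset) can be sketched as a small
user-space C model. This is a sketch under stated assumptions, not KVM code:
struct vcpu_model and every model_* function are hypothetical names invented
here. It covers only the value check, not the guest/host feature-presence
guards discussed earlier in the thread. SPEC_CTRL_IBRS and SPEC_CTRL_STIBP are
taken to be bits 0 and 1 of IA32_SPEC_CTRL.

```c
/*
 * User-space model only. None of these names exist in KVM; they are
 * hypothetical stand-ins for the review suggestions quoted above.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IA32_SPEC_CTRL bit layout: bit 0 is IBRS, bit 1 is STIBP. */
#define SPEC_CTRL_IBRS  (1ULL << 0)
#define SPEC_CTRL_STIBP (1ULL << 1)

/* Hypothetical per-vCPU state; stands in for a field in struct vcpu_vmx. */
struct vcpu_model {
	bool guest_has_spec_ctrl; /* guest CPUID.(EAX=07H,ECX=0):EDX[26] */
	uint64_t spec_ctrl;       /* cached guest value, 0 after reset */
};

/* Returns 0 on success, 1 if the write should be rejected. */
static int model_set_spec_ctrl(struct vcpu_model *v, uint64_t data)
{
	/* Any bit other than IBRS or STIBP is reserved. */
	if (data & ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP))
		return 1;
	/* IBRS may only be set if the guest CPUID advertises it. */
	if ((data & SPEC_CTRL_IBRS) && !v->guest_has_spec_ctrl)
		return 1;
	v->spec_ctrl = data;
	return 0;
}

/* Reads come from the cached field, so no list search is needed. */
static uint64_t model_get_spec_ctrl(const struct vcpu_model *v)
{
	return v->spec_ctrl;
}

/* Models the reset path: the guest-visible value goes back to 0. */
static void model_reset(struct vcpu_model *v)
{
	v->spec_ctrl = 0;
}

/* Small usage example exercising the three paths above. */
void model_selfcheck(void)
{
	struct vcpu_model v = { .guest_has_spec_ctrl = true, .spec_ctrl = 0 };

	assert(model_set_spec_ctrl(&v, SPEC_CTRL_IBRS) == 0);
	assert(model_get_spec_ctrl(&v) == SPEC_CTRL_IBRS);
	assert(model_set_spec_ctrl(&v, 1ULL << 2) == 1); /* reserved bit */
	model_reset(&v);
	assert(model_get_spec_ctrl(&v) == 0);
}
```

A STIBP-only write is accepted even when the guest lacks the IBRS bit, which
mirrors the comment in the quoted check about processors that ignore STIBP
rather than faulting. Whether the real code should also mask the value against
what the host supports is the open question raised above and is not modelled.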