Date: Mon, 18 Mar 2019 09:23:36 -0700
From: Sean Christopherson
To: Xiaoyao Li
Cc: kvm@vger.kernel.org, Paolo Bonzini, Radim Krčmář, Thomas Gleixner,
    Ingo Molnar, Borislav Petkov, "H. Peter Anvin",
    linux-kernel@vger.kernel.org, chao.gao@intel.com
Subject: Re: [PATCH v2 1/2] kvm/vmx: avoid CPUID faulting leaking to guest
Message-ID: <20190318162336.GA13528@linux.intel.com>
References: <20190318114324.14198-1-xiaoyao.li@linux.intel.com>
    <20190318114324.14198-2-xiaoyao.li@linux.intel.com>
In-Reply-To: <20190318114324.14198-2-xiaoyao.li@linux.intel.com>

On Mon, Mar 18, 2019 at 07:43:23PM +0800, Xiaoyao Li wrote:
> cpuid faulting is a feature about CPUID instruction. When cpuif faulting
                                                            ^^^^^
                                                            cpuid
> is enabled, all execution of the CPUID instruction outside system-management
> mode (SMM) cause a general-protection (#GP) if the CPL > 0.
>
> About this feature, detailed information can be found at
> https://www.intel.com/content/dam/www/public/us/en/documents/application-notes/virtualization-technology-flexmigration-application-note.pdf
>
> Current KVM provides software emulation of this feature for guest.
> However, because cpuid faulting takes higher priority over CPUID vm exit (Intel
> SDM vol3.25.1.1), there is a risk of leaking cpuid faulting to guest when host
> enables it. If host enables cpuid faulting by setting the bit 0 of
> MSR_MISC_FEATURES_ENABLES, it will pass to guest since there is no handling of
> MSR_MISC_FEATURES_ENABLES yet. As a result, when guest calls CPUID instruction
> in CPL > 0, it will generate a #GP instead of CPUID vm eixt.
>
> This issue will cause guest boot failure when guest uses *modprobe*
> to load modules. *modprobe* calls CPUID instruction, thus causing #GP in
> guest. Since there is no handling of cpuid faulting in #GP handler, guest
> fails boot.
>
> To fix this issue, we should switch cpuid faulting bit between host and guest.
> Since MSR_MISC_FEATURES_ENABLES is intel-specific, this patch implement the
> switching only in vmx. It clears the cpuid faulting bit and save host's
> value before switching to guest, and restores the cpuid faulting settings of
> host before switching to host.
>
> Because kvm provides the software emulation of cpuid faulting, we can
> just clear the cpuid faulting bit in hardware MSR when switching to
> guest.
>
> Signed-off-by: Xiaoyao Li
> ---
> Changes in v2:
> - move the save/restore of cpuid faulting bit to
>   vmx_prepare_swich_to_guest/vmx_prepare_swich_to_host to avoid every
>   vmentry RDMSR, based on Paolo's comment.
>
> ---
>  arch/x86/kvm/vmx/vmx.c | 34 ++++++++++++++++++++++++++++++++++
>  arch/x86/kvm/vmx/vmx.h |  2 ++
>  2 files changed, 36 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 541d442edd4e..2c59e0209e36 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1035,6 +1035,23 @@ static void pt_guest_exit(struct vcpu_vmx *vmx)
>  	wrmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
>  }
>
> +static void vmx_save_host_cpuid_fault(struct vcpu_vmx *vmx)

Maybe vmx_load_guest_misc_features_enables()? See below for reasoning.
Alternatively you can drop the helpers altogether.

> +{
> +	u64 host_val;
> +
> +	if (!boot_cpu_has(X86_FEATURE_CPUID_FAULT))
> +		return;
> +
> +	rdmsrl(MSR_MISC_FEATURES_ENABLES, host_val);
> +	vmx->host_msr_misc_features_enables = host_val;

There's no need to cache the host value in struct vcpu_vmx, just use the
kernel's shadow value.

> +
> +	/* clear cpuid fault bit to avoid it leak to guest */

Personally I find the comment unnecessary and somewhat misleading.

> +	if (host_val & MSR_MISC_FEATURES_ENABLES_CPUID_FAULT) {
> +		wrmsrl(MSR_MISC_FEATURES_ENABLES,
> +		       host_val & ~MSR_MISC_FEATURES_ENABLES_CPUID_FAULT);

I think we can also skip WRMSR if CPUID faulting is also enabled in the guest.
It probably doesn't make sense to install the guest's value if CPUID faulting
is enabled in the guest and not host since intercepting CPUID likely provides
better overall performance than switching the MSR on entry and exit. And last
but not least, since there are other bits in the MSR that we don't want to
expose to the guest, e.g. RING3MWAIT, checking for a non-zero host value is
probably better than checking for individual feature bits. Same reasoning for
writing the guest's value instead of clearing just the CPUID faulting bit.

So something like:

	u64 msrval = this_cpu_read(msr_misc_features_shadow);

	if (!msrval || msrval == vcpu->arch.msr_misc_features_enables)
		return;

	wrmsrl(MSR_MISC_FEATURES_ENABLES, vcpu->arch.msr_misc_features_enables);

or if you drop the helpers:

	msrval = this_cpu_read(msr_misc_features_shadow);
	if (msrval && msrval != vcpu->arch.msr_misc_features_enables)
		wrmsrl(MSR_MISC_FEATURES_ENABLES, vcpu->arch.msr_misc_features_enables);

> +	}
> +}
> +
>  void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> @@ -1068,6 +1085,8 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
>  	vmx->loaded_cpu_state = vmx->loaded_vmcs;
>  	host_state = &vmx->loaded_cpu_state->host_state;
>
> +	vmx_save_host_cpuid_fault(vmx);
> +
>  	/*
>  	 * Set host fs and gs selectors. Unfortunately, 22.2.3 does not
>  	 * allow segment selectors with cpl > 0 or ti == 1.
> @@ -1124,6 +1143,19 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
>  	}
>  }
>
> +static void vmx_restore_host_cpuid_fault(struct vcpu_vmx *vmx)

If you keep the helpers, maybe vmx_load_host_misc_features_enables() to pair
with the new function name for loading guest state?
> +{
> +	u64 msrval;
> +
> +	if (!boot_cpu_has(X86_FEATURE_CPUID_FAULT))
> +		return;
> +
> +	rdmsrl(MSR_MISC_FEATURES_ENABLES, msrval);
> +	msrval |= vmx->host_msr_misc_features_enables &
> +			MSR_MISC_FEATURES_ENABLES_CPUID_FAULT;

Again, there's no need for RDMSR, the host's value can be pulled from
msr_misc_features_shadow, and the WRMSR can be skipped if the host and guest
have the same value, i.e. we didn't install the guest's value.

	u64 msrval = this_cpu_read(msr_misc_features_shadow);

	if (!msrval || msrval == vcpu->arch.msr_misc_features_enables)
		return;

	wrmsrl(MSR_MISC_FEATURES_ENABLES, msrval);

> +	wrmsrl(MSR_MISC_FEATURES_ENABLES, msrval);
> +}
> +
>  static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx)
>  {
>  	struct vmcs_host_state *host_state;
> @@ -1137,6 +1169,8 @@ static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx)
>  	++vmx->vcpu.stat.host_state_reload;
>  	vmx->loaded_cpu_state = NULL;
>
> +	vmx_restore_host_cpuid_fault(vmx);
> +
>  #ifdef CONFIG_X86_64
>  	rdmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
>  #endif
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 5df73b36fa49..ba867bbc5676 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -268,6 +268,8 @@ struct vcpu_vmx {
>  	u64 msr_ia32_feature_control_valid_bits;
>  	u64 ept_pointer;
>
> +	u64 host_msr_misc_features_enables;

As mentioned above, this isn't needed.

> +
>  	struct pt_desc pt_desc;
>  };
>
> --
> 2.19.1
>
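
For anyone who wants to poke at the skip-the-WRMSR logic outside the kernel
tree, the load-guest/load-host pair suggested in the review can be modeled in
plain user-space C roughly as below. This is only a sketch under assumptions:
hw_msr, host_shadow, wrmsr_count, sim_wrmsr() and both helper names are
hypothetical stand-ins for the hardware MSR, msr_misc_features_shadow and
wrmsrl(); none of it is kernel code.

```c
#include <stdint.h>

/* Hypothetical stand-ins for kernel/hardware state (not real kernel symbols). */
static uint64_t hw_msr;           /* models hardware MSR_MISC_FEATURES_ENABLES */
static uint64_t host_shadow;      /* models the per-CPU msr_misc_features_shadow */
static unsigned int wrmsr_count;  /* number of simulated WRMSRs issued */

/* Simulated WRMSR: update the fake MSR and count the (expensive) write. */
static void sim_wrmsr(uint64_t val)
{
	hw_msr = val;
	wrmsr_count++;
}

/*
 * Before running the guest: install the guest's value, but only when the
 * host has some bit set and the guest's value actually differs.
 */
static void load_guest_misc_features(uint64_t guest_val)
{
	uint64_t msrval = host_shadow;

	if (!msrval || msrval == guest_val)
		return;

	sim_wrmsr(guest_val);
}

/*
 * Before returning to the host: restore the host's value under the same
 * condition, i.e. only if the guest's value was installed on entry.
 */
static void load_host_misc_features(uint64_t guest_val)
{
	uint64_t msrval = host_shadow;

	if (!msrval || msrval == guest_val)
		return;

	sim_wrmsr(msrval);
}
```

With a host value of 1 (bit 0, CPUID faulting) and a guest value of 0, the
entry path writes 0 and the exit path writes 1 back. If the host value is 0,
or host and guest values match, neither path touches the MSR.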