2018-01-29 03:02:02

by KarimAllah Ahmed

[permalink] [raw]
Subject: [PATCH v2 0/4] KVM: Expose speculation control feature to guests

Add direct access to speculation control MSRs for KVM guests. This allows the
guest to protect itself against Spectre V2 using IBRS+IBPB instead of a
retpoline+IBPB based approach.

It also exposes the ARCH_CAPABILITIES MSR which is going to be used by future
Intel processors to indicate RDCL_NO and IBRS_ALL.

Ashok Raj (1):
x86/kvm: Add IBPB support

KarimAllah Ahmed (3):
x86: kvm: Update the reverse_cpuid list to include CPUID_7_EDX
x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL
x86: vmx: Allow direct access to MSR_IA32_ARCH_CAPABILITIES

arch/x86/kvm/cpuid.c | 6 ++++-
arch/x86/kvm/cpuid.h | 1 +
arch/x86/kvm/svm.c | 14 +++++++++++
arch/x86/kvm/vmx.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++++
arch/x86/kvm/x86.c | 1 +
5 files changed, 92 insertions(+), 1 deletion(-)

Cc: Andi Kleen <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: Ashok Raj <[email protected]>
Cc: Asit Mallick <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Janakarajan Natarajan <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Jun Nakajima <[email protected]>
Cc: Laura Abbott <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]

--
2.7.4



2018-01-29 03:00:40

by KarimAllah Ahmed

[permalink] [raw]
Subject: [PATCH v2 1/4] x86: kvm: Update the reverse_cpuid list to include CPUID_7_EDX

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: KarimAllah Ahmed <[email protected]>
---
arch/x86/kvm/cpuid.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index cdc70a3..dcfe227 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -54,6 +54,7 @@ static const struct cpuid_reg reverse_cpuid[] = {
[CPUID_8000_000A_EDX] = {0x8000000a, 0, CPUID_EDX},
[CPUID_7_ECX] = { 7, 0, CPUID_ECX},
[CPUID_8000_0007_EBX] = {0x80000007, 0, CPUID_EBX},
+ [CPUID_7_EDX] = { 7, 0, CPUID_EDX},
};

static __always_inline struct cpuid_reg x86_feature_cpuid(unsigned x86_feature)
--
2.7.4


2018-01-29 03:00:55

by KarimAllah Ahmed

[permalink] [raw]
Subject: [PATCH v2 2/4] x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL

Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for
guests that will only mitigate Spectre V2 through IBRS+IBPB and will not
be using a retpoline+IBPB based approach.

To avoid the overhead of atomically saving and restoring the
MSR_IA32_SPEC_CTRL for guests that do not actually use the MSR, only
add_atomic_switch_msr when a non-zero is written to it.

Cc: Asit Mallick <[email protected]>
Cc: Arjan Van De Ven <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Jun Nakajima <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ashok Raj <[email protected]>
Signed-off-by: KarimAllah Ahmed <[email protected]>

---
v2:
- remove 'host_spec_ctrl' in favor of only a comment (dwmw@).
- special case writing '0' in SPEC_CTRL to avoid confusing live-migration
when the instance never used the MSR (dwmw@).
- depend on X86_FEATURE_IBRS instead of X86_FEATURE_SPEC_CTRL (dwmw@).
- add MSR_IA32_SPEC_CTRL to the list of MSRs to save (dropped it by accident).
---
arch/x86/kvm/cpuid.c | 4 +++-
arch/x86/kvm/vmx.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++
arch/x86/kvm/x86.c | 1 +
3 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 0099e10..32c0c14 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -70,6 +70,7 @@ u64 kvm_supported_xcr0(void)
/* These are scattered features in cpufeatures.h. */
#define KVM_CPUID_BIT_AVX512_4VNNIW 2
#define KVM_CPUID_BIT_AVX512_4FMAPS 3
+#define KVM_CPUID_BIT_IBRS 26
#define KF(x) bit(KVM_CPUID_BIT_##x)

int kvm_update_cpuid(struct kvm_vcpu *vcpu)
@@ -392,7 +393,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,

/* cpuid 7.0.edx*/
const u32 kvm_cpuid_7_0_edx_x86_features =
- KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
+ KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
+ (boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0);

/* all calls to cpuid_count() should be made on the same cpu */
get_cpu();
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index aa8638a..dac564d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -920,6 +920,8 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked);
static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12,
u16 error_code);
static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
+static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
+ u32 msr, int type);

static DEFINE_PER_CPU(struct vmcs *, vmxarea);
static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
@@ -2007,6 +2009,28 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
m->host[i].value = host_val;
}

+/* do not touch guest_val and host_val if the msr is not found */
+static int read_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
+ u64 *guest_val, u64 *host_val)
+{
+ unsigned i;
+ struct msr_autoload *m = &vmx->msr_autoload;
+
+ for (i = 0; i < m->nr; ++i)
+ if (m->guest[i].index == msr)
+ break;
+
+ if (i == m->nr)
+ return 1;
+
+ if (guest_val)
+ *guest_val = m->guest[i].value;
+ if (host_val)
+ *host_val = m->host[i].value;
+
+ return 0;
+}
+
static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
{
u64 guest_efer = vmx->vcpu.arch.efer;
@@ -3203,7 +3227,9 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
*/
static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
{
+ u64 spec_ctrl = 0;
struct shared_msr_entry *msr;
+ struct vcpu_vmx *vmx = to_vmx(vcpu);

switch (msr_info->index) {
#ifdef CONFIG_X86_64
@@ -3223,6 +3249,20 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_IA32_TSC:
msr_info->data = guest_read_tsc(vcpu);
break;
+ case MSR_IA32_SPEC_CTRL:
+ if (!msr_info->host_initiated &&
+ !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
+ return 1;
+
+ /*
+ * If the MSR is not in the atomic list yet, then the guest
+ * never wrote a non-zero value to it yet i.e. the MSR value is
+ * '0'.
+ */
+ read_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, &spec_ctrl, NULL);
+
+ msr_info->data = spec_ctrl;
+ break;
case MSR_IA32_SYSENTER_CS:
msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
break;
@@ -3289,6 +3329,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
int ret = 0;
u32 msr_index = msr_info->index;
u64 data = msr_info->data;
+ unsigned long *msr_bitmap;

switch (msr_index) {
case MSR_EFER:
@@ -3330,6 +3371,30 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_IA32_TSC:
kvm_write_tsc(vcpu, msr_info);
break;
+ case MSR_IA32_SPEC_CTRL:
+ if (!msr_info->host_initiated &&
+ !guest_cpuid_has(vcpu, X86_FEATURE_IBRS))
+ return 1;
+
+ if (!msr_info->data)
+ break;
+
+ /*
+ * Now we know that the guest is actually using the MSR, so
+ * atomically load and save the SPEC_CTRL MSR and pass it
+ * through to the guest.
+ *
+ * NOTE:
+ * IBRS is not supported yet as a mitigation for the host. Once
+ * it is supported, the "host_value" will need to be '1'
+ * instead of '0' if IBRS is used also by the host.
+ */
+ add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, msr_info->data, 0);
+
+ msr_bitmap = vmx->vmcs01.msr_bitmap;
+ vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
+
+ break;
case MSR_IA32_CR_PAT:
if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 03869eb..cabaad3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1006,6 +1006,7 @@ static u32 msrs_to_save[] = {
#endif
MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS, MSR_TSC_AUX,
+ MSR_IA32_SPEC_CTRL
};

static unsigned num_msrs_to_save;
--
2.7.4


2018-01-29 03:01:16

by KarimAllah Ahmed

[permalink] [raw]
Subject: [PATCH v2 4/4] x86: vmx: Allow direct access to MSR_IA32_ARCH_CAPABILITIES

Add direct access to MSR_IA32_SPEC_CTRL for guests. Future intel processors
will use this MSR to indicate RDCL_NO (bit 0) and IBRS_ALL (bit 1).

Cc: Asit Mallick <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Arjan Van De Ven <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Jun Nakajima <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Ashok Raj <[email protected]>
Signed-off-by: KarimAllah Ahmed <[email protected]>
---
arch/x86/kvm/cpuid.c | 4 +++-
arch/x86/kvm/vmx.c | 2 ++
2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 32c0c14..2339b1a 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -71,6 +71,7 @@ u64 kvm_supported_xcr0(void)
#define KVM_CPUID_BIT_AVX512_4VNNIW 2
#define KVM_CPUID_BIT_AVX512_4FMAPS 3
#define KVM_CPUID_BIT_IBRS 26
+#define KVM_CPUID_BIT_ARCH_CAPABILITIES 29
#define KF(x) bit(KVM_CPUID_BIT_##x)

int kvm_update_cpuid(struct kvm_vcpu *vcpu)
@@ -394,7 +395,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
/* cpuid 7.0.edx*/
const u32 kvm_cpuid_7_0_edx_x86_features =
KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
- (boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0);
+ (boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0) | \
+ (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES) ? KF(ARCH_CAPABILITIES) : 0);

/* all calls to cpuid_count() should be made on the same cpu */
get_cpu();
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index f82a44c..99cb761 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9617,6 +9617,8 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)

if (boot_cpu_has(X86_FEATURE_IBPB))
vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PRED_CMD, MSR_TYPE_RW);
+ if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
+ vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_ARCH_CAPABILITIES, MSR_TYPE_R);
vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
vmx_disable_intercept_for_msr(msr_bitmap, MSR_GS_BASE, MSR_TYPE_RW);
vmx_disable_intercept_for_msr(msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
--
2.7.4


2018-01-29 03:01:34

by KarimAllah Ahmed

[permalink] [raw]
Subject: [PATCH v2 3/4] x86/kvm: Add IBPB support

From: Ashok Raj <[email protected]>

Add MSR passthrough for MSR_IA32_PRED_CMD and place branch predictor
barriers on switching between VMs to avoid inter VM Spectre-v2 attacks.

[peterz: rebase and changelog rewrite]
[karahmed: - rebase
- vmx: expose PRED_CMD whenever it is available
- svm: only pass through IBPB if it is available]
Cc: Asit Mallick <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Arjan Van De Ven <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Jun Nakajima <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Signed-off-by: Ashok Raj <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: KarimAllah Ahmed <[email protected]>
---
arch/x86/kvm/svm.c | 14 ++++++++++++++
arch/x86/kvm/vmx.c | 4 ++++
2 files changed, 18 insertions(+)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 2744b973..c886e46 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -529,6 +529,7 @@ struct svm_cpu_data {
struct kvm_ldttss_desc *tss_desc;

struct page *save_area;
+ struct vmcb *current_vmcb;
};

static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
@@ -918,6 +919,9 @@ static void svm_vcpu_init_msrpm(u32 *msrpm)

set_msr_interception(msrpm, direct_access_msrs[i].index, 1, 1);
}
+
+ if (boot_cpu_has(X86_FEATURE_IBPB))
+ set_msr_interception(msrpm, MSR_IA32_PRED_CMD, 1, 1);
}

static void add_msr_offset(u32 offset)
@@ -1706,11 +1710,17 @@ static void svm_free_vcpu(struct kvm_vcpu *vcpu)
__free_pages(virt_to_page(svm->nested.msrpm), MSRPM_ALLOC_ORDER);
kvm_vcpu_uninit(vcpu);
kmem_cache_free(kvm_vcpu_cache, svm);
+ /*
+ * The vmcb page can be recycled, causing a false negative in
+ * svm_vcpu_load(). So do a full IBPB now.
+ */
+ indirect_branch_prediction_barrier();
}

static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
+ struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
int i;

if (unlikely(cpu != vcpu->cpu)) {
@@ -1739,6 +1749,10 @@ static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
if (static_cpu_has(X86_FEATURE_RDTSCP))
wrmsrl(MSR_TSC_AUX, svm->tsc_aux);

+ if (sd->current_vmcb != svm->vmcb) {
+ sd->current_vmcb = svm->vmcb;
+ indirect_branch_prediction_barrier();
+ }
avic_vcpu_load(vcpu, cpu);
}

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index dac564d..f82a44c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2296,6 +2296,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
vmcs_load(vmx->loaded_vmcs->vmcs);
+ indirect_branch_prediction_barrier();
}

if (!already_loaded) {
@@ -9613,6 +9614,9 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
goto free_msrs;

msr_bitmap = vmx->vmcs01.msr_bitmap;
+
+ if (boot_cpu_has(X86_FEATURE_IBPB))
+ vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PRED_CMD, MSR_TYPE_RW);
vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
vmx_disable_intercept_for_msr(msr_bitmap, MSR_GS_BASE, MSR_TYPE_RW);
vmx_disable_intercept_for_msr(msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
--
2.7.4


2018-01-29 08:16:57

by David Woodhouse

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL

On Mon, 2018-01-29 at 01:58 +0100, KarimAllah Ahmed wrote:
> Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for
> guests that will only mitigate Spectre V2 through IBRS+IBPB and will not
> be using a retpoline+IBPB based approach.
>
> To avoid the overhead of atomically saving and restoring the
> MSR_IA32_SPEC_CTRL for guests that do not actually use the MSR, only
> add_atomic_switch_msr when a non-zero is written to it.
>
> Cc: Asit Mallick <[email protected]>
> Cc: Arjan Van De Ven <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Andi Kleen <[email protected]>
> Cc: Andrea Arcangeli <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Cc: Tim Chen <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Dan Williams <[email protected]>
> Cc: Jun Nakajima <[email protected]>
> Cc: Paolo Bonzini <[email protected]>
> Cc: David Woodhouse <[email protected]>
> Cc: Greg KH <[email protected]>
> Cc: Andy Lutomirski <[email protected]>
> Cc: Ashok Raj <[email protected]>
> Signed-off-by: KarimAllah Ahmed <[email protected]>
>
> ---
> v2:
> - remove 'host_spec_ctrl' in favor of only a comment (dwmw@).
> - special case writing '0' in SPEC_CTRL to avoid confusing live-migration
>   when the instance never used the MSR (dwmw@).

Possibly wants a comment in the code explaining this in slightly more
detail. The point being that if we migrate a guest which has never used
the MSR, we don't want the act of setting it to zero on resume to flip
it into the auto-saved mode.
 
> - depend on X86_FEATURE_IBRS instead of X86_FEATURE_SPEC_CTRL (dwmw@).
> - add MSR_IA32_SPEC_CTRL to the list of MSRs to save (dropped it by accident).
> ---
>  arch/x86/kvm/cpuid.c |  4 +++-
>  arch/x86/kvm/vmx.c   | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  arch/x86/kvm/x86.c   |  1 +
>  3 files changed, 69 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 0099e10..32c0c14 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -70,6 +70,7 @@ u64 kvm_supported_xcr0(void)
>  /* These are scattered features in cpufeatures.h. */
>  #define KVM_CPUID_BIT_AVX512_4VNNIW     2
>  #define KVM_CPUID_BIT_AVX512_4FMAPS     3
> +#define KVM_CPUID_BIT_IBRS              26
>  #define KF(x) bit(KVM_CPUID_BIT_##x)
>  
>  int kvm_update_cpuid(struct kvm_vcpu *vcpu)
> @@ -392,7 +393,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  
>   /* cpuid 7.0.edx*/
>   const u32 kvm_cpuid_7_0_edx_x86_features =
> - KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
> + KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
> + (boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0);
>   /* all calls to cpuid_count() should be made on the same cpu */
>   get_cpu();

I think we need to expose more feature bits than that. See
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/pti&id=2961298efe1ea1b6fc0d7ee8b76018fa6c0bcef2

There are three AMD bits for IBRS, IBPB & STIBP which are the user-
visible ones in /proc/cpuinfo, and the ones we use within the kernel to
indicate the hardware availability (there are separate feature bits for
when we're *using* IBPB etc., but that's only because feature bits are
the only thing that ALTERNATIVEs can work from).

In addition to those bits, Intel has its own. The Intel SPEC_CTRL CPUID
bit (which you're setting above) indicates *both* IBRS and IBPB
capability. The kernel sets the corresponding AMD bits when it sees
SPEC_CTRL. Likewise Intel has a different bit for STIBP.

You could construct a set of CPUID bits for the guest based on what the
host has. So all three of the AMD IBRS/IBPB/STIBP bits in 80000008/EBX
should just be passed through, and you could set the Intel SPEC_CTRL
bit (7/EDX bit 26 that you're looking at above) only when you have
X86_FEATURE_IBPB && X86_FEATURE_IBRS. And the Intel STIBP when you have
X86_FEATURE_STIBP.

The Intel ARCH_CAPABILITIES CPUID bit is separate. Pass that through if
you have it, and expose the corresponding MSR read-only.


> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index aa8638a..dac564d 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -920,6 +920,8 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked);
>  static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12,
>       u16 error_code);
>  static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
> +static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
> +   u32 msr, int type);
>  
>  static DEFINE_PER_CPU(struct vmcs *, vmxarea);
>  static DEFINE_PER_CPU(struct vmcs *, current_vmcs);

Perhaps move that whole function further up?

>  static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
>  {
>   u64 guest_efer = vmx->vcpu.arch.efer;
> @@ -3203,7 +3227,9 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
>   */
>  static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  {
> + u64 spec_ctrl = 0;

Could you ditch this additional variable and...

>   struct shared_msr_entry *msr;
> + struct vcpu_vmx *vmx = to_vmx(vcpu);
>  
>   switch (msr_info->index) {
>  #ifdef CONFIG_X86_64
> @@ -3223,6 +3249,20 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>   case MSR_IA32_TSC:
>   msr_info->data = guest_read_tsc(vcpu);
>   break;
> + case MSR_IA32_SPEC_CTRL:
> + if (!msr_info->host_initiated &&
> +     !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
> + return 1;
> +
> + /*
> +  * If the MSR is not in the atomic list yet, then the guest
> +  * never wrote a non-zero value to it yet i.e. the MSR value is
> +  * '0'.
> +  */

...
    if (read_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, &msr_info->data, NULL))
msr_info->data = 0;


Attachments:
smime.p7s (5.09 kB)

2018-01-29 10:38:03

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] x86: kvm: Update the reverse_cpuid list to include CPUID_7_EDX

On 29/01/2018 01:58, KarimAllah Ahmed wrote:
> Cc: Paolo Bonzini <[email protected]>
> Cc: Radim Krčmář <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: H. Peter Anvin <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: KarimAllah Ahmed <[email protected]>
> ---
> arch/x86/kvm/cpuid.h | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index cdc70a3..dcfe227 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -54,6 +54,7 @@ static const struct cpuid_reg reverse_cpuid[] = {
> [CPUID_8000_000A_EDX] = {0x8000000a, 0, CPUID_EDX},
> [CPUID_7_ECX] = { 7, 0, CPUID_ECX},
> [CPUID_8000_0007_EBX] = {0x80000007, 0, CPUID_EBX},
> + [CPUID_7_EDX] = { 7, 0, CPUID_EDX},
> };
>
> static __always_inline struct cpuid_reg x86_feature_cpuid(unsigned x86_feature)
>

You can also stop using KF(...) and KVM_CPUID_BIT_* for bits that come
from this leaf. F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(IBRS) will be
enough.

Paolo

2018-01-29 10:44:59

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL

On 29/01/2018 01:58, KarimAllah Ahmed wrote:
> Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for
> guests that will only mitigate Spectre V2 through IBRS+IBPB and will not
> be using a retpoline+IBPB based approach.
>
> To avoid the overhead of atomically saving and restoring the
> MSR_IA32_SPEC_CTRL for guests that do not actually use the MSR, only
> add_atomic_switch_msr when a non-zero is written to it.

You are not storing the guest's MSR value on though vmexit, aren't you?
Also, there's an obvious typo here:

+ add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, msr_info->data, 0);
+
+ msr_bitmap = vmx->vmcs01.msr_bitmap;
+ vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
+

Finally, apparently add_atomic_switch_msr is slower than just rdmsr/wrmsr
on vmexit. Can you reuse the patches I had posted mid January instead? They
are also assuming no IBRS usage on the host, so the changes shouldn't be large,
and limited mostly to using actual X86_FEATURE_* bits instead of cpuid_count().

They lack the code to only read/write SPEC_CTRL if the direct access is enabled,
but that's small too... Enabling the direct access on the first write, as in
this patches, is okay.

Thanks,

Paolo

> Cc: Asit Mallick <[email protected]>
> Cc: Arjan Van De Ven <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Andi Kleen <[email protected]>
> Cc: Andrea Arcangeli <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Cc: Tim Chen <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Dan Williams <[email protected]>
> Cc: Jun Nakajima <[email protected]>
> Cc: Paolo Bonzini <[email protected]>
> Cc: David Woodhouse <[email protected]>
> Cc: Greg KH <[email protected]>
> Cc: Andy Lutomirski <[email protected]>
> Cc: Ashok Raj <[email protected]>
> Signed-off-by: KarimAllah Ahmed <[email protected]>
>
> ---
> v2:
> - remove 'host_spec_ctrl' in favor of only a comment (dwmw@).
> - special case writing '0' in SPEC_CTRL to avoid confusing live-migration
> when the instance never used the MSR (dwmw@).
> - depend on X86_FEATURE_IBRS instead of X86_FEATURE_SPEC_CTRL (dwmw@).
> - add MSR_IA32_SPEC_CTRL to the list of MSRs to save (dropped it by accident).
> ---
> arch/x86/kvm/cpuid.c | 4 +++-
> arch/x86/kvm/vmx.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> arch/x86/kvm/x86.c | 1 +
> 3 files changed, 69 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 0099e10..32c0c14 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -70,6 +70,7 @@ u64 kvm_supported_xcr0(void)
> /* These are scattered features in cpufeatures.h. */
> #define KVM_CPUID_BIT_AVX512_4VNNIW 2
> #define KVM_CPUID_BIT_AVX512_4FMAPS 3
> +#define KVM_CPUID_BIT_IBRS 26
> #define KF(x) bit(KVM_CPUID_BIT_##x)
>
> int kvm_update_cpuid(struct kvm_vcpu *vcpu)
> @@ -392,7 +393,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>
> /* cpuid 7.0.edx*/
> const u32 kvm_cpuid_7_0_edx_x86_features =
> - KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
> + KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
> + (boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0);
>
> /* all calls to cpuid_count() should be made on the same cpu */
> get_cpu();
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index aa8638a..dac564d 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -920,6 +920,8 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked);
> static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12,
> u16 error_code);
> static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
> +static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
> + u32 msr, int type);
>
> static DEFINE_PER_CPU(struct vmcs *, vmxarea);
> static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
> @@ -2007,6 +2009,28 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
> m->host[i].value = host_val;
> }
>
> +/* do not touch guest_val and host_val if the msr is not found */
> +static int read_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
> + u64 *guest_val, u64 *host_val)
> +{
> + unsigned i;
> + struct msr_autoload *m = &vmx->msr_autoload;
> +
> + for (i = 0; i < m->nr; ++i)
> + if (m->guest[i].index == msr)
> + break;
> +
> + if (i == m->nr)
> + return 1;
> +
> + if (guest_val)
> + *guest_val = m->guest[i].value;
> + if (host_val)
> + *host_val = m->host[i].value;
> +
> + return 0;
> +}
> +
> static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
> {
> u64 guest_efer = vmx->vcpu.arch.efer;
> @@ -3203,7 +3227,9 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
> */
> static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> {
> + u64 spec_ctrl = 0;
> struct shared_msr_entry *msr;
> + struct vcpu_vmx *vmx = to_vmx(vcpu);
>
> switch (msr_info->index) {
> #ifdef CONFIG_X86_64
> @@ -3223,6 +3249,20 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> case MSR_IA32_TSC:
> msr_info->data = guest_read_tsc(vcpu);
> break;
> + case MSR_IA32_SPEC_CTRL:
> + if (!msr_info->host_initiated &&
> + !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
> + return 1;
> +
> + /*
> + * If the MSR is not in the atomic list yet, then the guest
> + * never wrote a non-zero value to it yet i.e. the MSR value is
> + * '0'.
> + */
> + read_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, &spec_ctrl, NULL);
> +
> + msr_info->data = spec_ctrl;
> + break;
> case MSR_IA32_SYSENTER_CS:
> msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
> break;
> @@ -3289,6 +3329,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> int ret = 0;
> u32 msr_index = msr_info->index;
> u64 data = msr_info->data;
> + unsigned long *msr_bitmap;
>
> switch (msr_index) {
> case MSR_EFER:
> @@ -3330,6 +3371,30 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> case MSR_IA32_TSC:
> kvm_write_tsc(vcpu, msr_info);
> break;
> + case MSR_IA32_SPEC_CTRL:
> + if (!msr_info->host_initiated &&
> + !guest_cpuid_has(vcpu, X86_FEATURE_IBRS))
> + return 1;
> +
> + if (!msr_info->data)
> + break;
> +
> + /*
> + * Now we know that the guest is actually using the MSR, so
> + * atomically load and save the SPEC_CTRL MSR and pass it
> + * through to the guest.
> + *
> + * NOTE:
> + * IBRS is not supported yet as a mitigation for the host. Once
> + * it is supported, the "host_value" will need to be '1'
> + * instead of '0' if IBRS is used also by the host.
> + */
> + add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, msr_info->data, 0);
> +
> + msr_bitmap = vmx->vmcs01.msr_bitmap;
> + vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
> +
> + break;
> case MSR_IA32_CR_PAT:
> if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
> if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 03869eb..cabaad3 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1006,6 +1006,7 @@ static u32 msrs_to_save[] = {
> #endif
> MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
> MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS, MSR_TSC_AUX,
> + MSR_IA32_SPEC_CTRL
> };
>
> static unsigned num_msrs_to_save;
>


2018-01-29 10:46:48

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] x86: vmx: Allow direct access to MSR_IA32_ARCH_CAPABILITIES

On 29/01/2018 01:58, KarimAllah Ahmed wrote:
> Add direct access to MSR_IA32_SPEC_CTRL for guests. Future intel processors
> will use this MSR to indicate RDCL_NO (bit 0) and IBRS_ALL (bit 1).

This has to be customizable per-VM (similar to the patches Amazon posted
a while ago for UCODE_REV for example). It's not urgent anyway given
that no processor has it yet.

Thanks,

Paolo

> Cc: Asit Mallick <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Arjan Van De Ven <[email protected]>
> Cc: Tim Chen <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Cc: Andrea Arcangeli <[email protected]>
> Cc: Andi Kleen <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Dan Williams <[email protected]>
> Cc: Jun Nakajima <[email protected]>
> Cc: Andy Lutomirski <[email protected]>
> Cc: Greg KH <[email protected]>
> Cc: Paolo Bonzini <[email protected]>
> Cc: Ashok Raj <[email protected]>
> Signed-off-by: KarimAllah Ahmed <[email protected]>
> ---
> arch/x86/kvm/cpuid.c | 4 +++-
> arch/x86/kvm/vmx.c | 2 ++
> 2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 32c0c14..2339b1a 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -71,6 +71,7 @@ u64 kvm_supported_xcr0(void)
> #define KVM_CPUID_BIT_AVX512_4VNNIW 2
> #define KVM_CPUID_BIT_AVX512_4FMAPS 3
> #define KVM_CPUID_BIT_IBRS 26
> +#define KVM_CPUID_BIT_ARCH_CAPABILITIES 29
> #define KF(x) bit(KVM_CPUID_BIT_##x)
>
> int kvm_update_cpuid(struct kvm_vcpu *vcpu)
> @@ -394,7 +395,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
> /* cpuid 7.0.edx*/
> const u32 kvm_cpuid_7_0_edx_x86_features =
> KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
> - (boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0);
> + (boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0) | \
> + (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES) ? KF(ARCH_CAPABILITIES) : 0);
>
> /* all calls to cpuid_count() should be made on the same cpu */
> get_cpu();
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index f82a44c..99cb761 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -9617,6 +9617,8 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
>
> if (boot_cpu_has(X86_FEATURE_IBPB))
> vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PRED_CMD, MSR_TYPE_RW);
> + if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
> + vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_ARCH_CAPABILITIES, MSR_TYPE_R);
> vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
> vmx_disable_intercept_for_msr(msr_bitmap, MSR_GS_BASE, MSR_TYPE_RW);
> vmx_disable_intercept_for_msr(msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
>


2018-01-29 12:17:39

by KarimAllah Ahmed

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL

On 01/29/2018 11:44 AM, Paolo Bonzini wrote:
> On 29/01/2018 01:58, KarimAllah Ahmed wrote:
>> Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for
>> guests that will only mitigate Spectre V2 through IBRS+IBPB and will not
>> be using a retpoline+IBPB based approach.
>>
>> To avoid the overhead of atomically saving and restoring the
>> MSR_IA32_SPEC_CTRL for guests that do not actually use the MSR, only
>> add_atomic_switch_msr when a non-zero is written to it.
>
> You are not storing the guest's MSR value on though vmexit, aren't you?

I originally thought that atomic_switch was also saving the guest MSR on
VM-exit. Now I know it is not.

> Also, there's an obvious typo here:
>
> + add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, msr_info->data, 0);
> +
> + msr_bitmap = vmx->vmcs01.msr_bitmap;
> + vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
> +

oops! copy & paste error :)

>
> Finally, apparently add_atomic_switch_msr is slower than just rdmsr/wrmsr
> on vmexit. Can you reuse the patches I had posted mid January instead? They
> are also assuming no IBRS usage on the host, so the changes shouldn't be large,
> and limited mostly to using actual X86_FEATURE_* bits instead of cpuid_count().
>
> They lack the code to only read/write SPEC_CTRL if the direct access is enabled,
> but that's small too... Enabling the direct access on the first write, as in
> this patches, is okay.
>
> Thanks,
>
> Paolo
>
>> Cc: Asit Mallick <[email protected]>
>> Cc: Arjan Van De Ven <[email protected]>
>> Cc: Dave Hansen <[email protected]>
>> Cc: Andi Kleen <[email protected]>
>> Cc: Andrea Arcangeli <[email protected]>
>> Cc: Linus Torvalds <[email protected]>
>> Cc: Tim Chen <[email protected]>
>> Cc: Thomas Gleixner <[email protected]>
>> Cc: Dan Williams <[email protected]>
>> Cc: Jun Nakajima <[email protected]>
>> Cc: Paolo Bonzini <[email protected]>
>> Cc: David Woodhouse <[email protected]>
>> Cc: Greg KH <[email protected]>
>> Cc: Andy Lutomirski <[email protected]>
>> Cc: Ashok Raj <[email protected]>
>> Signed-off-by: KarimAllah Ahmed <[email protected]>
>>
>> ---
>> v2:
>> - remove 'host_spec_ctrl' in favor of only a comment (dwmw@).
>> - special case writing '0' in SPEC_CTRL to avoid confusing live-migration
>> when the instance never used the MSR (dwmw@).
>> - depend on X86_FEATURE_IBRS instead of X86_FEATURE_SPEC_CTRL (dwmw@).
>> - add MSR_IA32_SPEC_CTRL to the list of MSRs to save (dropped it by accident).
>> ---
>> arch/x86/kvm/cpuid.c | 4 +++-
>> arch/x86/kvm/vmx.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>> arch/x86/kvm/x86.c | 1 +
>> 3 files changed, 69 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index 0099e10..32c0c14 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -70,6 +70,7 @@ u64 kvm_supported_xcr0(void)
>> /* These are scattered features in cpufeatures.h. */
>> #define KVM_CPUID_BIT_AVX512_4VNNIW 2
>> #define KVM_CPUID_BIT_AVX512_4FMAPS 3
>> +#define KVM_CPUID_BIT_IBRS 26
>> #define KF(x) bit(KVM_CPUID_BIT_##x)
>>
>> int kvm_update_cpuid(struct kvm_vcpu *vcpu)
>> @@ -392,7 +393,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>>
>> /* cpuid 7.0.edx*/
>> const u32 kvm_cpuid_7_0_edx_x86_features =
>> - KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
>> + KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
>> + (boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0);
>>
>> /* all calls to cpuid_count() should be made on the same cpu */
>> get_cpu();
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index aa8638a..dac564d 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -920,6 +920,8 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked);
>> static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12,
>> u16 error_code);
>> static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
>> +static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
>> + u32 msr, int type);
>>
>> static DEFINE_PER_CPU(struct vmcs *, vmxarea);
>> static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
>> @@ -2007,6 +2009,28 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
>> m->host[i].value = host_val;
>> }
>>
>> +/* do not touch guest_val and host_val if the msr is not found */
>> +static int read_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
>> + u64 *guest_val, u64 *host_val)
>> +{
>> + unsigned i;
>> + struct msr_autoload *m = &vmx->msr_autoload;
>> +
>> + for (i = 0; i < m->nr; ++i)
>> + if (m->guest[i].index == msr)
>> + break;
>> +
>> + if (i == m->nr)
>> + return 1;
>> +
>> + if (guest_val)
>> + *guest_val = m->guest[i].value;
>> + if (host_val)
>> + *host_val = m->host[i].value;
>> +
>> + return 0;
>> +}
>> +
>> static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
>> {
>> u64 guest_efer = vmx->vcpu.arch.efer;
>> @@ -3203,7 +3227,9 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
>> */
>> static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>> {
>> + u64 spec_ctrl = 0;
>> struct shared_msr_entry *msr;
>> + struct vcpu_vmx *vmx = to_vmx(vcpu);
>>
>> switch (msr_info->index) {
>> #ifdef CONFIG_X86_64
>> @@ -3223,6 +3249,20 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>> case MSR_IA32_TSC:
>> msr_info->data = guest_read_tsc(vcpu);
>> break;
>> + case MSR_IA32_SPEC_CTRL:
>> + if (!msr_info->host_initiated &&
>> + !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
>> + return 1;
>> +
>> + /*
>> + * If the MSR is not in the atomic list yet, then the guest
>> + * never wrote a non-zero value to it yet i.e. the MSR value is
>> + * '0'.
>> + */
>> + read_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, &spec_ctrl, NULL);
>> +
>> + msr_info->data = spec_ctrl;
>> + break;
>> case MSR_IA32_SYSENTER_CS:
>> msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
>> break;
>> @@ -3289,6 +3329,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>> int ret = 0;
>> u32 msr_index = msr_info->index;
>> u64 data = msr_info->data;
>> + unsigned long *msr_bitmap;
>>
>> switch (msr_index) {
>> case MSR_EFER:
>> @@ -3330,6 +3371,30 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>> case MSR_IA32_TSC:
>> kvm_write_tsc(vcpu, msr_info);
>> break;
>> + case MSR_IA32_SPEC_CTRL:
>> + if (!msr_info->host_initiated &&
>> + !guest_cpuid_has(vcpu, X86_FEATURE_IBRS))
>> + return 1;
>> +
>> + if (!msr_info->data)
>> + break;
>> +
>> + /*
>> + * Now we know that the guest is actually using the MSR, so
>> + * atomically load and save the SPEC_CTRL MSR and pass it
>> + * through to the guest.
>> + *
>> + * NOTE:
>> + * IBRS is not supported yet as a mitigation for the host. Once
>> + * it is supported, the "host_value" will need to be '1'
>> + * instead of '0' if IBRS is used also by the host.
>> + */
>> + add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, msr_info->data, 0);
>> +
>> + msr_bitmap = vmx->vmcs01.msr_bitmap;
>> + vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
>> +
>> + break;
>> case MSR_IA32_CR_PAT:
>> if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
>> if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 03869eb..cabaad3 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -1006,6 +1006,7 @@ static u32 msrs_to_save[] = {
>> #endif
>> MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
>> MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS, MSR_TSC_AUX,
>> + MSR_IA32_SPEC_CTRL
>> };
>>
>> static unsigned num_msrs_to_save;
>>
>
>
Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B

2018-01-29 14:18:55

by Van De Ven, Arjan

[permalink] [raw]
Subject: RE: [PATCH v2 4/4] x86: vmx: Allow direct access to MSR_IA32_ARCH_CAPABILITIES

> On 29/01/2018 01:58, KarimAllah Ahmed wrote:
> > Add direct access to MSR_IA32_SPEC_CTRL for guests. Future intel processors
> > will use this MSR to indicate RDCL_NO (bit 0) and IBRS_ALL (bit 1).
>
> This has to be customizable per-VM (similar to the patches Amazon posted
> a while ago for UCODE_REV for example). It's not urgent anyway given
> that no processor has it yet.


Apollo lake has it once you update the microcode.

2018-01-29 18:55:47

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] x86: vmx: Allow direct access to MSR_IA32_ARCH_CAPABILITIES

Why should this MSR be pass-through? I doubt that it would be accessed
frequently.

On Sun, Jan 28, 2018 at 4:58 PM, KarimAllah Ahmed <[email protected]> wrote:
> Add direct access to MSR_IA32_SPEC_CTRL for guests. Future intel processors
> will use this MSR to indicate RDCL_NO (bit 0) and IBRS_ALL (bit 1).
>
> Cc: Asit Mallick <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Arjan Van De Ven <[email protected]>
> Cc: Tim Chen <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Cc: Andrea Arcangeli <[email protected]>
> Cc: Andi Kleen <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Dan Williams <[email protected]>
> Cc: Jun Nakajima <[email protected]>
> Cc: Andy Lutomirski <[email protected]>
> Cc: Greg KH <[email protected]>
> Cc: Paolo Bonzini <[email protected]>
> Cc: Ashok Raj <[email protected]>
> Signed-off-by: KarimAllah Ahmed <[email protected]>
> ---
> arch/x86/kvm/cpuid.c | 4 +++-
> arch/x86/kvm/vmx.c | 2 ++
> 2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 32c0c14..2339b1a 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -71,6 +71,7 @@ u64 kvm_supported_xcr0(void)
> #define KVM_CPUID_BIT_AVX512_4VNNIW 2
> #define KVM_CPUID_BIT_AVX512_4FMAPS 3
> #define KVM_CPUID_BIT_IBRS 26
> +#define KVM_CPUID_BIT_ARCH_CAPABILITIES 29
> #define KF(x) bit(KVM_CPUID_BIT_##x)
>
> int kvm_update_cpuid(struct kvm_vcpu *vcpu)
> @@ -394,7 +395,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
> /* cpuid 7.0.edx*/
> const u32 kvm_cpuid_7_0_edx_x86_features =
> KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
> - (boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0);
> + (boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0) | \
> + (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES) ? KF(ARCH_CAPABILITIES) : 0);
>
> /* all calls to cpuid_count() should be made on the same cpu */
> get_cpu();
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index f82a44c..99cb761 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -9617,6 +9617,8 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
>
> if (boot_cpu_has(X86_FEATURE_IBPB))
> vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PRED_CMD, MSR_TYPE_RW);
> + if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
> + vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_ARCH_CAPABILITIES, MSR_TYPE_R);
> vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
> vmx_disable_intercept_for_msr(msr_bitmap, MSR_GS_BASE, MSR_TYPE_RW);
> vmx_disable_intercept_for_msr(msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
> --
> 2.7.4
>

2018-01-29 18:59:20

by KarimAllah Ahmed

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] x86: vmx: Allow direct access to MSR_IA32_ARCH_CAPABILITIES

On 01/29/2018 07:55 PM, Jim Mattson wrote:
> Why should this MSR be pass-through? I doubt that it would be accessed
> frequently.

True. Will update it to be emulated and allow user-space to set the
value exposed.

>
> On Sun, Jan 28, 2018 at 4:58 PM, KarimAllah Ahmed <[email protected]> wrote:
>> Add direct access to MSR_IA32_SPEC_CTRL for guests. Future intel processors
>> will use this MSR to indicate RDCL_NO (bit 0) and IBRS_ALL (bit 1).
>>
>> Cc: Asit Mallick <[email protected]>
>> Cc: Dave Hansen <[email protected]>
>> Cc: Arjan Van De Ven <[email protected]>
>> Cc: Tim Chen <[email protected]>
>> Cc: Linus Torvalds <[email protected]>
>> Cc: Andrea Arcangeli <[email protected]>
>> Cc: Andi Kleen <[email protected]>
>> Cc: Thomas Gleixner <[email protected]>
>> Cc: Dan Williams <[email protected]>
>> Cc: Jun Nakajima <[email protected]>
>> Cc: Andy Lutomirski <[email protected]>
>> Cc: Greg KH <[email protected]>
>> Cc: Paolo Bonzini <[email protected]>
>> Cc: Ashok Raj <[email protected]>
>> Signed-off-by: KarimAllah Ahmed <[email protected]>
>> ---
>> arch/x86/kvm/cpuid.c | 4 +++-
>> arch/x86/kvm/vmx.c | 2 ++
>> 2 files changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index 32c0c14..2339b1a 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -71,6 +71,7 @@ u64 kvm_supported_xcr0(void)
>> #define KVM_CPUID_BIT_AVX512_4VNNIW 2
>> #define KVM_CPUID_BIT_AVX512_4FMAPS 3
>> #define KVM_CPUID_BIT_IBRS 26
>> +#define KVM_CPUID_BIT_ARCH_CAPABILITIES 29
>> #define KF(x) bit(KVM_CPUID_BIT_##x)
>>
>> int kvm_update_cpuid(struct kvm_vcpu *vcpu)
>> @@ -394,7 +395,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>> /* cpuid 7.0.edx*/
>> const u32 kvm_cpuid_7_0_edx_x86_features =
>> KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
>> - (boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0);
>> + (boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0) | \
>> + (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES) ? KF(ARCH_CAPABILITIES) : 0);
>>
>> /* all calls to cpuid_count() should be made on the same cpu */
>> get_cpu();
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index f82a44c..99cb761 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -9617,6 +9617,8 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
>>
>> if (boot_cpu_has(X86_FEATURE_IBPB))
>> vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PRED_CMD, MSR_TYPE_RW);
>> + if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
>> + vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_ARCH_CAPABILITIES, MSR_TYPE_R);
>> vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
>> vmx_disable_intercept_for_msr(msr_bitmap, MSR_GS_BASE, MSR_TYPE_RW);
>> vmx_disable_intercept_for_msr(msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
>> --
>> 2.7.4
>>
>
Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B

2018-01-29 19:01:00

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] x86: kvm: Update the reverse_cpuid list to include CPUID_7_EDX

On Sun, Jan 28, 2018 at 4:58 PM, KarimAllah Ahmed <[email protected]> wrote:
> Cc: Paolo Bonzini <[email protected]>
> Cc: Radim Krčmář <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: H. Peter Anvin <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: KarimAllah Ahmed <[email protected]>
> ---
> arch/x86/kvm/cpuid.h | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index cdc70a3..dcfe227 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -54,6 +54,7 @@ static const struct cpuid_reg reverse_cpuid[] = {
> [CPUID_8000_000A_EDX] = {0x8000000a, 0, CPUID_EDX},
> [CPUID_7_ECX] = { 7, 0, CPUID_ECX},
> [CPUID_8000_0007_EBX] = {0x80000007, 0, CPUID_EBX},
> + [CPUID_7_EDX] = { 7, 0, CPUID_EDX},

I haven't seen the change that introduces CPUID_7_EDX, but shouldn't
it be CPUID_7_0_EDX?

> };
>
> static __always_inline struct cpuid_reg x86_feature_cpuid(unsigned x86_feature)
> --
> 2.7.4
>

2018-01-29 19:03:55

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL

On Sun, Jan 28, 2018 at 4:58 PM, KarimAllah Ahmed <[email protected]> wrote:
> Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for
> guests that will only mitigate Spectre V2 through IBRS+IBPB and will not
> be using a retpoline+IBPB based approach.
>
> To avoid the overhead of atomically saving and restoring the
> MSR_IA32_SPEC_CTRL for guests that do not actually use the MSR, only
> add_atomic_switch_msr when a non-zero is written to it.
>
> Cc: Asit Mallick <[email protected]>
> Cc: Arjan Van De Ven <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Andi Kleen <[email protected]>
> Cc: Andrea Arcangeli <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Cc: Tim Chen <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Dan Williams <[email protected]>
> Cc: Jun Nakajima <[email protected]>
> Cc: Paolo Bonzini <[email protected]>
> Cc: David Woodhouse <[email protected]>
> Cc: Greg KH <[email protected]>
> Cc: Andy Lutomirski <[email protected]>
> Cc: Ashok Raj <[email protected]>
> Signed-off-by: KarimAllah Ahmed <[email protected]>
>
> ---
> v2:
> - remove 'host_spec_ctrl' in favor of only a comment (dwmw@).
> - special case writing '0' in SPEC_CTRL to avoid confusing live-migration
> when the instance never used the MSR (dwmw@).
> - depend on X86_FEATURE_IBRS instead of X86_FEATURE_SPEC_CTRL (dwmw@).
> - add MSR_IA32_SPEC_CTRL to the list of MSRs to save (dropped it by accident).
> ---
> arch/x86/kvm/cpuid.c | 4 +++-
> arch/x86/kvm/vmx.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> arch/x86/kvm/x86.c | 1 +
> 3 files changed, 69 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 0099e10..32c0c14 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -70,6 +70,7 @@ u64 kvm_supported_xcr0(void)
> /* These are scattered features in cpufeatures.h. */
> #define KVM_CPUID_BIT_AVX512_4VNNIW 2
> #define KVM_CPUID_BIT_AVX512_4FMAPS 3
> +#define KVM_CPUID_BIT_IBRS 26
> #define KF(x) bit(KVM_CPUID_BIT_##x)
>
> int kvm_update_cpuid(struct kvm_vcpu *vcpu)
> @@ -392,7 +393,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>
> /* cpuid 7.0.edx*/
> const u32 kvm_cpuid_7_0_edx_x86_features =
> - KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
> + KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
> + (boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0);
>
> /* all calls to cpuid_count() should be made on the same cpu */
> get_cpu();
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index aa8638a..dac564d 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -920,6 +920,8 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked);
> static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12,
> u16 error_code);
> static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
> +static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
> + u32 msr, int type);
>
> static DEFINE_PER_CPU(struct vmcs *, vmxarea);
> static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
> @@ -2007,6 +2009,28 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
> m->host[i].value = host_val;
> }
>
> +/* do not touch guest_val and host_val if the msr is not found */
> +static int read_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
> + u64 *guest_val, u64 *host_val)
> +{
> + unsigned i;
> + struct msr_autoload *m = &vmx->msr_autoload;
> +
> + for (i = 0; i < m->nr; ++i)
> + if (m->guest[i].index == msr)
> + break;
> +
> + if (i == m->nr)
> + return 1;
> +
> + if (guest_val)
> + *guest_val = m->guest[i].value;
> + if (host_val)
> + *host_val = m->host[i].value;
> +
> + return 0;
> +}
> +
> static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
> {
> u64 guest_efer = vmx->vcpu.arch.efer;
> @@ -3203,7 +3227,9 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
> */
> static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> {
> + u64 spec_ctrl = 0;
> struct shared_msr_entry *msr;
> + struct vcpu_vmx *vmx = to_vmx(vcpu);
>
> switch (msr_info->index) {
> #ifdef CONFIG_X86_64
> @@ -3223,6 +3249,20 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> case MSR_IA32_TSC:
> msr_info->data = guest_read_tsc(vcpu);
> break;
> + case MSR_IA32_SPEC_CTRL:
> + if (!msr_info->host_initiated &&
> + !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
> + return 1;
> +
> + /*
> + * If the MSR is not in the atomic list yet, then the guest
> + * never wrote a non-zero value to it yet i.e. the MSR value is
> + * '0'.
> + */
> + read_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, &spec_ctrl, NULL);
> +
> + msr_info->data = spec_ctrl;
> + break;
> case MSR_IA32_SYSENTER_CS:
> msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
> break;
> @@ -3289,6 +3329,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> int ret = 0;
> u32 msr_index = msr_info->index;
> u64 data = msr_info->data;
> + unsigned long *msr_bitmap;
>
> switch (msr_index) {
> case MSR_EFER:
> @@ -3330,6 +3371,30 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> case MSR_IA32_TSC:
> kvm_write_tsc(vcpu, msr_info);
> break;
> + case MSR_IA32_SPEC_CTRL:
> + if (!msr_info->host_initiated &&
> + !guest_cpuid_has(vcpu, X86_FEATURE_IBRS))
> + return 1;
> +
> + if (!msr_info->data)
> + break;

What if the MSR has already been set? Doesn't this throw away the new
zero value? (e.g. on vcpu reset)?

> +
> + /*
> + * Now we know that the guest is actually using the MSR, so
> + * atomically load and save the SPEC_CTRL MSR and pass it
> + * through to the guest.
> + *
> + * NOTE:
> + * IBRS is not supported yet as a mitigation for the host. Once
> + * it is supported, the "host_value" will need to be '1'
> + * instead of '0' if IBRS is used also by the host.
> + */
> + add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, msr_info->data, 0);
> +
> + msr_bitmap = vmx->vmcs01.msr_bitmap;
> + vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
> +
> + break;
> case MSR_IA32_CR_PAT:
> if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
> if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 03869eb..cabaad3 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1006,6 +1006,7 @@ static u32 msrs_to_save[] = {
> #endif
> MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
> MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS, MSR_TSC_AUX,
> + MSR_IA32_SPEC_CTRL
> };
>
> static unsigned num_msrs_to_save;
> --
> 2.7.4
>

2018-01-29 19:20:44

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] x86/kvm: Add IBPB support

On Sun, Jan 28, 2018 at 4:58 PM, KarimAllah Ahmed <[email protected]> wrote:
> From: Ashok Raj <[email protected]>
>
> Add MSR passthrough for MSR_IA32_PRED_CMD and place branch predictor
> barriers on switching between VMs to avoid inter VM Spectre-v2 attacks.
>
> [peterz: rebase and changelog rewrite]
> [karahmed: - rebase
> - vmx: expose PRED_CMD whenever it is available
> - svm: only pass through IBPB if it is available]
> Cc: Asit Mallick <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Arjan Van De Ven <[email protected]>
> Cc: Tim Chen <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Cc: Andrea Arcangeli <[email protected]>
> Cc: Andi Kleen <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Dan Williams <[email protected]>
> Cc: Jun Nakajima <[email protected]>
> Cc: Andy Lutomirski <[email protected]>
> Cc: Greg KH <[email protected]>
> Cc: Paolo Bonzini <[email protected]>
> Signed-off-by: Ashok Raj <[email protected]>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]
> Signed-off-by: David Woodhouse <[email protected]>
> Signed-off-by: KarimAllah Ahmed <[email protected]>
> ---
> arch/x86/kvm/svm.c | 14 ++++++++++++++
> arch/x86/kvm/vmx.c | 4 ++++
> 2 files changed, 18 insertions(+)
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 2744b973..c886e46 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -529,6 +529,7 @@ struct svm_cpu_data {
> struct kvm_ldttss_desc *tss_desc;
>
> struct page *save_area;
> + struct vmcb *current_vmcb;
> };
>
> static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
> @@ -918,6 +919,9 @@ static void svm_vcpu_init_msrpm(u32 *msrpm)
>
> set_msr_interception(msrpm, direct_access_msrs[i].index, 1, 1);
> }
> +
> + if (boot_cpu_has(X86_FEATURE_IBPB))
> + set_msr_interception(msrpm, MSR_IA32_PRED_CMD, 1, 1);

What if the guest doesn't have X86_FEATURE_IBPB?

> }
>
> static void add_msr_offset(u32 offset)
> @@ -1706,11 +1710,17 @@ static void svm_free_vcpu(struct kvm_vcpu *vcpu)
> __free_pages(virt_to_page(svm->nested.msrpm), MSRPM_ALLOC_ORDER);
> kvm_vcpu_uninit(vcpu);
> kmem_cache_free(kvm_vcpu_cache, svm);
> + /*
> + * The vmcb page can be recycled, causing a false negative in
> + * svm_vcpu_load(). So do a full IBPB now.
> + */
> + indirect_branch_prediction_barrier();
> }
>
> static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> {
> struct vcpu_svm *svm = to_svm(vcpu);
> + struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
> int i;
>
> if (unlikely(cpu != vcpu->cpu)) {
> @@ -1739,6 +1749,10 @@ static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> if (static_cpu_has(X86_FEATURE_RDTSCP))
> wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
>
> + if (sd->current_vmcb != svm->vmcb) {
> + sd->current_vmcb = svm->vmcb;
> + indirect_branch_prediction_barrier();
> + }
> avic_vcpu_load(vcpu, cpu);
> }
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index dac564d..f82a44c 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2296,6 +2296,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
> per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
> vmcs_load(vmx->loaded_vmcs->vmcs);
> + indirect_branch_prediction_barrier();
> }
>
> if (!already_loaded) {
> @@ -9613,6 +9614,9 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
> goto free_msrs;
>
> msr_bitmap = vmx->vmcs01.msr_bitmap;
> +
> + if (boot_cpu_has(X86_FEATURE_IBPB))
> + vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PRED_CMD, MSR_TYPE_RW);

What if the guest doesn't have X86_FEATURE_IBPB?

> vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
> vmx_disable_intercept_for_msr(msr_bitmap, MSR_GS_BASE, MSR_TYPE_RW);
> vmx_disable_intercept_for_msr(msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
> --
> 2.7.4
>