LinuxLists.cc - [PATCH v2 0/7] This patch-set is to enable Guest CET support

2019-01-23 14:07:57

by Yang, Weijiang

[permalink] [raw]

Subject: [PATCH v2 0/7] This patch-set is to enable Guest CET support

Control-flow Enforcement Technology (CET) provides protection against
return/jump-oriented programming (ROP) attacks. To make kvm Guest OS own
the capability, this patch-set is required. It enables CET related CPUID
report, xsaves/xrstors, vmx entry configuration etc. for Guest OS.

PATCH 1 : Define CET VMCS fields and bits.
PATCH 2/3 : Report CET feature support in CPUID.
PATCH 4 : Fix xsaves size calculation issue.
PATCH 5 : Pass through CET MSRs to Guest.
PATCH 6 : Set Guest CET state auto loading bit.
PATCH 7 : Enable CET xsaves bits support in XSS.

Changelog:
v2:
- Re-ordered patch sequence, combined one patch.
- Added more description for CET related VMCS fields.
- Added Host CET capability check while enabling Guest CET loading bit.
- Added Host CET capability check while reporting Guest CPUID(EAX=7,
EXC=0).
- Modified code in reporting Guest CPUID(EAX=D,ECX>=1), make it clearer.
- Added Host and Guest XSS mask check while setting bits for Guest XSS.

Yang Weijiang (7):
KVM:VMX: Define CET VMCS fields and bits
KVM:CPUID: Define CET CPUID bits and CR4.CET master enable bit.
KVM:CPUID: Add CPUID support for CET xsaves component query.
KVM:CPUID: Fix xsaves area size calculation for CPUID.(EAX=0xD,ECX=1).
KVM:VMX: Pass through host CET related MSRs to Guest.
KVM:VMX: Load Guest CET via VMCS when CET is enabled in Guest
KVM:X86: Enable MSR_IA32_XSS bit 11 and 12 for CET xsaves/xrstors.

arch/x86/include/asm/kvm_host.h | 3 +-
arch/x86/include/asm/vmx.h | 8 ++++
arch/x86/kvm/cpuid.c | 67 ++++++++++++++++++++++++---------
arch/x86/kvm/vmx.c | 60 +++++++++++++++++++++++++++--
arch/x86/kvm/x86.c | 4 ++
arch/x86/kvm/x86.h | 4 ++
6 files changed, 125 insertions(+), 21 deletions(-)

--
2.17.1

2019-01-23 14:08:03

by Yang, Weijiang

[permalink] [raw]

Subject: [PATCH v2 2/7] KVM:CPUID: Define CET CPUID bits and CR4.CET master enable bit.

Guest queries CET SHSTK and IBT support by CPUID.(EAX=0x7,ECX=0),
in return, ECX[bit 7] corresponds to SHSTK feature, and EDX[bit 20]
corresponds to IBT feature.
CR4.CET[bit 23] is CET master enable bit, it controls CET feature
availability in guest OS.

Note: Although SHSTK or IBT can be enabled independently,
either of the features is controlled by CR4.CET.

Signed-off-by: Zhang Yi Z <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 3 ++-
arch/x86/kvm/cpuid.c | 4 ++--
2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 55e51ff7e421..df002936088f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -90,7 +90,8 @@
| X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
| X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
| X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
- | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+ | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
+ | X86_CR4_CET))

#define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 7bcfa61375c0..cb1aece25b17 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -406,12 +406,12 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ |
F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
- F(CLDEMOTE);
+ F(CLDEMOTE) | F(SHSTK);

/* cpuid 7.0.edx*/
const u32 kvm_cpuid_7_0_edx_x86_features =
F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
- F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES);
+ F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(IBT);

/* all calls to cpuid_count() should be made on the same cpu */
get_cpu();
--
2.17.1

2019-01-23 14:08:08

by Yang, Weijiang

[permalink] [raw]

Subject: [PATCH v2 4/7] KVM:CPUID: Fix xsaves area size calculation for CPUID.(EAX=0xD,ECX=1).

According to latest Software Development Manual vol.2/3.2,
for CPUID.(EAX=0xD,ECX=1), it should report xsaves area size
containing all states enabled by XCR0|IA32_MSR_XSS.

Signed-off-by: Zhang Yi Z <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/cpuid.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index dbeb4e7904eb..67b464d1d749 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -136,7 +136,8 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)

best = kvm_find_cpuid_entry(vcpu, 0xD, 1);
if (best && (best->eax & (F(XSAVES) | F(XSAVEC))))
- best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
+ best->ebx = xstate_required_size(vcpu->arch.xcr0 |
+ kvm_supported_xss(), true);

/*
* The existing code assumes virtual address is 48-bit or 57-bit in the
--
2.17.1

2019-01-23 14:08:20

by Yang, Weijiang

[permalink] [raw]

Subject: [PATCH v2 6/7] KVM:VMX: Load Guest CET via VMCS when CET is enabled in Guest

"Load Guest CET state" bit controls whether guest CET states
will be loaded on Guest entry. Before doing that, KVM needs
to check if CET feature is exposed to Guest.

Signed-off-by: Zhang Yi Z <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/vmx.c | 33 +++++++++++++++++++++++++++++++++
1 file changed, 33 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 68c0e5e41cb1..9c8cecac80ea 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -55,6 +55,7 @@
#include <asm/mmu_context.h>
#include <asm/spec-ctrl.h>
#include <asm/mshyperv.h>
+#include <asm/cet.h>

#include "trace.h"
#include "pmu.h"
@@ -4065,6 +4066,18 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
return !(val & ~valid_bits);
}

+static int vmx_guest_cet_cap(struct kvm_vcpu *vcpu)
+{
+ struct kvm_cpuid_entry2 *best;
+ int r = 0;
+
+ best = kvm_find_cpuid_entry(vcpu, 7, 0);
+ if (best && best->function == 0x7)
+ r = (best->ecx & bit(X86_FEATURE_SHSTK)) |
+ (best->edx & bit(X86_FEATURE_IBT)) ? 1 : 0;
+ return r;
+}
+
static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
{
switch (msr->index) {
@@ -5409,6 +5422,26 @@ static int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
return 1;
}

+ /*
+ * To enable Guest CET, first check if Guest CET feature is
+ * available, if it's not available but its CR4.CET is being set,
+ * return a fault to Guest; then check if Host CET is enabled and
+ * CR4.CET is toggled, if they are, then enable loading CET state
+ * bit in entry control, otherwise, clear the bit to
+ * disable guest CET state loading.
+ */
+ if (vmx_guest_cet_cap(vcpu)) {
+ if (hw_cr4 & cr4 & X86_CR4_CET) {
+ vmcs_set_bits(VM_ENTRY_CONTROLS,
+ VM_ENTRY_LOAD_GUEST_CET_STATE);
+ } else {
+ vmcs_clear_bits(VM_ENTRY_CONTROLS,
+ VM_ENTRY_LOAD_GUEST_CET_STATE);
+ }
+ } else if (cr4 & X86_CR4_CET) {
+ return 1;
+ }
+
if (to_vmx(vcpu)->nested.vmxon && !nested_cr4_valid(vcpu, cr4))
return 1;

--
2.17.1

2019-01-23 14:09:07

by Yang, Weijiang

[permalink] [raw]

Subject: [PATCH v2 5/7] KVM:VMX: Pass through host CET related MSRs to Guest.

The CET runtime settings, i.e., CET state control bits(IA32_U_CET/
IA32_S_CET), CET SSP(IA32_PL3_SSP/IA32_PL0_SSP) and SSP table address
(IA32_INTERRUPT_SSP_TABLE_ADDR) are task/thread specific, therefore,
OS needs to save/restore the states properly during context switch,
e.g., task/thread switching, interrupt/exception handling, it uses
xsaves/xrstors to achieve that.

The difference between VMCS CET area fields and xsave CET area, is that
the former is for state retention during Guest/Host context
switch while the latter is for state retention during OS execution.

Linux currently doesn't support CPL1 and CPL2, so SSPs for these level
are skipped here.

Signed-off-by: Zhang Yi Z <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/vmx.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 7bbb8b26e901..68c0e5e41cb1 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11531,6 +11531,7 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_CS, MSR_TYPE_RW);
vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW);
vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW);
+
vmx->msr_bitmap_mode = 0;

vmx->loaded_vmcs = &vmx->vmcs01;
@@ -11769,6 +11770,8 @@ static void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
+ struct kvm_cpuid_entry2 *best;
+ unsigned long *msr_bitmap;

if (cpu_has_secondary_exec_ctrls()) {
vmx_compute_secondary_exec_control(vmx);
@@ -11786,6 +11789,19 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
nested_vmx_cr_fixed1_bits_update(vcpu);
nested_vmx_entry_exit_ctls_update(vcpu);
}
+
+ msr_bitmap = vmx->vmcs01.msr_bitmap;
+ best = kvm_find_cpuid_entry(vcpu, 7, 0);
+ if ((best && best->function == 0x7) &&
+ ((best->ecx & bit(X86_FEATURE_SHSTK)) |
+ (best->edx & bit(X86_FEATURE_IBT)))) {
+ vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_U_CET, MSR_TYPE_RW);
+ vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_S_CET, MSR_TYPE_RW);
+ vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW);
+ vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PL0_SSP, MSR_TYPE_RW);
+ vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PL3_SSP, MSR_TYPE_RW);
+ }
+
}

static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
--
2.17.1

2019-01-23 14:10:06

by Yang, Weijiang

[permalink] [raw]

Subject: [PATCH v2 3/7] KVM:CPUID: Add CPUID support for CET xsaves component query.

CET xsaves component size is queried through CPUID.(EAX=0xD, ECX=11)
and CPUID.(EAX=0xD, ECX=12).

Signed-off-by: Zhang Yi Z <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/cpuid.c | 60 +++++++++++++++++++++++++++++++++-----------
arch/x86/kvm/x86.c | 4 +++
arch/x86/kvm/x86.h | 4 +++
3 files changed, 54 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index cb1aece25b17..dbeb4e7904eb 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -27,6 +27,8 @@
#include "trace.h"
#include "pmu.h"

+extern u64 host_xss;
+
static u32 xstate_required_size(u64 xstate_bv, bool compacted)
{
int feature_bit = 0;
@@ -65,6 +67,19 @@ u64 kvm_supported_xcr0(void)
return xcr0;
}

+u64 kvm_supported_xss(void)
+{
+ u64 xss = host_xss & KVM_SUPPORTED_XSS;
+
+ /*
+ * Either SHSTK or IBT feature depends on the xsaves component.
+ */
+ if (!boot_cpu_has(X86_FEATURE_SHSTK) && !boot_cpu_has(X86_FEATURE_IBT))
+ xss &= ~(XFEATURE_MASK_SHSTK_USER | XFEATURE_MASK_SHSTK_KERNEL);
+
+ return xss;
+}
+
#define F(x) bit(X86_FEATURE_##x)

/* For scattered features from cpufeatures.h; we currently expose none */
@@ -503,6 +518,16 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
* if the host doesn't support it.
*/
entry->edx |= F(ARCH_CAPABILITIES);
+ /*
+ * If host doesn't have CET capability,
+ * do not report CET related info.
+ */
+ if (!boot_cpu_has(X86_FEATURE_SHSTK))
+ entry->ecx &= ~F(SHSTK);
+
+ if (!boot_cpu_has(X86_FEATURE_IBT))
+ entry->edx &= ~F(IBT);
+
} else {
entry->ebx = 0;
entry->ecx = 0;
@@ -564,14 +589,17 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
}
case 0xd: {
int idx, i;
- u64 supported = kvm_supported_xcr0();
+ u64 u_supported = kvm_supported_xcr0();
+ u64 s_supported = kvm_supported_xss();
+ u64 supported;
+ int compacted;

- entry->eax &= supported;
- entry->ebx = xstate_required_size(supported, false);
+ entry->eax &= u_supported;
+ entry->ebx = xstate_required_size(u_supported, false);
entry->ecx = entry->ebx;
- entry->edx &= supported >> 32;
+ entry->edx &= u_supported >> 32;
entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
- if (!supported)
+ if (!u_supported)
break;

for (idx = 1, i = 1; idx < 64; ++idx) {
@@ -583,19 +611,23 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
if (idx == 1) {
entry[i].eax &= kvm_cpuid_D_1_eax_x86_features;
cpuid_mask(&entry[i].eax, CPUID_D_1_EAX);
- entry[i].ebx = 0;
- if (entry[i].eax & (F(XSAVES)|F(XSAVEC)))
- entry[i].ebx =
- xstate_required_size(supported,
- true);
+ supported = u_supported | s_supported;
+ compacted = entry[i].eax &
+ (F(XSAVES) | F(XSAVEC));
+ entry[i].ebx = xstate_required_size(supported,
+ compacted);
+ entry[i].ecx &= s_supported;
+ entry[i].edx = 0;
} else {
+ supported = (entry[i].ecx & 1) ? s_supported :
+ u_supported;
if (entry[i].eax == 0 || !(supported & mask))
continue;
- if (WARN_ON_ONCE(entry[i].ecx & 1))
- continue;
+ entry[i].ecx &= 1;
+ entry[i].edx = 0;
+ if (entry[i].ecx)
+ entry[i].ebx = 0;
}
- entry[i].ecx = 0;
- entry[i].edx = 0;
entry[i].flags |=
KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
++*nent;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a0f8b71b2132..b0ae24913423 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -212,6 +212,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
};

u64 __read_mostly host_xcr0;
+u64 __read_mostly host_xss;

static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt);

@@ -6838,6 +6839,9 @@ int kvm_arch_init(void *opaque)
if (boot_cpu_has(X86_FEATURE_XSAVE))
host_xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);

+ if (boot_cpu_has(X86_FEATURE_XSAVES))
+ rdmsrl(MSR_IA32_XSS, host_xss);
+
kvm_lapic_init();
#ifdef CONFIG_X86_64
pvclock_gtod_register_notifier(&pvclock_gtod_notifier);
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 224cd0a47568..c61da41c3c5c 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -283,6 +283,10 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2,
| XFEATURE_MASK_YMM | XFEATURE_MASK_BNDREGS \
| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
| XFEATURE_MASK_PKRU)
+
+#define KVM_SUPPORTED_XSS (XFEATURE_MASK_SHSTK_USER \
+ | XFEATURE_MASK_SHSTK_KERNEL)
+
extern u64 host_xcr0;

extern u64 kvm_supported_xcr0(void);
--
2.17.1

2019-01-23 14:48:56

by Yang, Weijiang

[permalink] [raw]

Subject: [PATCH v2 7/7] KVM:X86: Enable MSR_IA32_XSS bit 11 and 12 for CET xsaves/xrstors.

For kvm Guest OS, right now, only bit 11(user mode CET) and bit 12
(supervisor CET) are supported in XSS MSR, if other bits are being set,
the write to XSS will be skipped.

Signed-off-by: Zhang Yi Z <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/vmx.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9c8cecac80ea..25ac22b3923a 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -47,6 +47,7 @@
#include <asm/virtext.h>
#include <asm/mce.h>
#include <asm/fpu/internal.h>
+#include <asm/fpu/types.h>
#include <asm/perf_event.h>
#include <asm/debugreg.h>
#include <asm/kexec.h>
@@ -4334,12 +4335,16 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_IA32_XSS:
if (!vmx_xsaves_supported())
return 1;
+
/*
- * The only supported bit as of Skylake is bit 8, but
- * it is not supported on KVM.
+ * Right now, only support XSS_CET_U[bit 11] and
+ * XSS_CET_S[bit 12] in MSR_IA32_XSS.
*/
- if (data != 0)
+
+ if (!vmx_guest_cet_cap(vcpu) ||
+ data & ~(KVM_SUPPORTED_XSS & host_xss))
return 1;
+
vcpu->arch.ia32_xss = data;
if (vcpu->arch.ia32_xss != host_xss)
add_atomic_switch_msr(vmx, MSR_IA32_XSS,
--
2.17.1

2019-01-23 14:48:56

by Yang, Weijiang

[permalink] [raw]

Subject: [PATCH v2 1/7] KVM:VMX: Define CET VMCS fields and bits

On processors that support CET, VMX saves/restores
the states of IA32_S_CET, SSP and IA32_INTERRUPT_SSP_TABLE_ADDR MSR
to the VMCS area for Guest/Host unconditionally.

If VM_EXIT_LOAD_HOST_CET_STATE = 1, the host CET MSRs are
restored from VMCS host-state area at VM exit as follows:

- HOST_IA32_S_CET: Host supervisor mode IA32_S_CET MSR is loaded
from this field.

- HOST_SSP : Host SSP is loaded from this field.

- HOST_INTR_SSP_TABLE_ADDR : Host IA32_INTERRUPT_SSP_TABLE_ADDR
MSR is loaded from this field.

If VM_ENTRY_LOAD_GUEST_CET_STATE = 1, the guest CET MSRs are loaded
from VMCS guest-state area at VM entry as follows:

- GUEST_IA32_S_CET : Guest supervisor mode IA32_S_CET MSR is loaded
from this field.

- GUEST_SSP : Guest SSP is loaded from this field.

- GUEST_INTR_SSP_TABL_ADDR : Guest IA32_INTERRUPT_SSP_TABLE_ADDR
MSR is loaded from this field.

Additionally, to context switch guest and host CET states, the VMM
uses xsaves/xrstors instructions to save/restore the guest CET states
at VM exit/entry. The CET xsave area is within thread_struct.fpu area.
If OS execution flow changes during task switch/interrupt/exception etc.,
the OS also relies on xsaves/xrstors to switch CET states accordingly.

Note: Although these VMCS fields are 64-bit, they don't have high fields.

Signed-off-by: Zhang Yi Z <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/include/asm/vmx.h | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index ade0f153947d..395c1f7e5938 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -98,6 +98,7 @@
#define VM_EXIT_LOAD_IA32_EFER 0x00200000
#define VM_EXIT_SAVE_VMX_PREEMPTION_TIMER 0x00400000
#define VM_EXIT_CLEAR_BNDCFGS 0x00800000
+#define VM_EXIT_LOAD_HOST_CET_STATE 0x10000000

#define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR 0x00036dff

@@ -109,6 +110,7 @@
#define VM_ENTRY_LOAD_IA32_PAT 0x00004000
#define VM_ENTRY_LOAD_IA32_EFER 0x00008000
#define VM_ENTRY_LOAD_BNDCFGS 0x00010000
+#define VM_ENTRY_LOAD_GUEST_CET_STATE 0x00100000

#define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR 0x000011ff

@@ -325,6 +327,9 @@ enum vmcs_field {
GUEST_PENDING_DBG_EXCEPTIONS = 0x00006822,
GUEST_SYSENTER_ESP = 0x00006824,
GUEST_SYSENTER_EIP = 0x00006826,
+ GUEST_IA32_S_CET = 0x00006828,
+ GUEST_SSP = 0x0000682a,
+ GUEST_INTR_SSP_TABL_ADDR = 0x0000682c,
HOST_CR0 = 0x00006c00,
HOST_CR3 = 0x00006c02,
HOST_CR4 = 0x00006c04,
@@ -337,6 +342,9 @@ enum vmcs_field {
HOST_IA32_SYSENTER_EIP = 0x00006c12,
HOST_RSP = 0x00006c14,
HOST_RIP = 0x00006c16,
+ HOST_IA32_S_CET = 0x00006c18,
+ HOST_SSP = 0x00006c1a,
+ HOST_INTR_SSP_TABL_ADDR = 0x00006c1c
};

/*
--
2.17.1

2019-01-25 17:57:55

by Paolo Bonzini

[permalink] [raw]

Subject: Re: [PATCH v2 3/7] KVM:CPUID: Add CPUID support for CET xsaves component query.

On 22/01/19 21:59, Yang Weijiang wrote:
> */
> entry->edx |= F(ARCH_CAPABILITIES);
> + /*
> + * If host doesn't have CET capability,
> + * do not report CET related info.
> + */
> + if (!boot_cpu_has(X86_FEATURE_SHSTK))
> + entry->ecx &= ~F(SHSTK);
> +
> + if (!boot_cpu_has(X86_FEATURE_IBT))
> + entry->edx &= ~F(IBT);
> +

This is not needed, it is implied by

cpuid_mask(&entry->ecx, CPUID_7_ECX);
cpuid_mask(&entry->edx, CPUID_7_EDX);

earlier in the function.

Paolo

2019-01-25 18:03:15

by Paolo Bonzini

[permalink] [raw]

Subject: Re: [PATCH v2 1/7] KVM:VMX: Define CET VMCS fields and bits

On 22/01/19 21:59, Yang Weijiang wrote:
> On processors that support CET, VMX saves/restores
> the states of IA32_S_CET, SSP and IA32_INTERRUPT_SSP_TABLE_ADDR MSR
> to the VMCS area for Guest/Host unconditionally.
>
> If VM_EXIT_LOAD_HOST_CET_STATE = 1, the host CET MSRs are
> restored from VMCS host-state area at VM exit as follows:
>
> - HOST_IA32_S_CET: Host supervisor mode IA32_S_CET MSR is loaded
> from this field.
>
> - HOST_SSP : Host SSP is loaded from this field.
>
> - HOST_INTR_SSP_TABLE_ADDR : Host IA32_INTERRUPT_SSP_TABLE_ADDR
> MSR is loaded from this field.
>
> If VM_ENTRY_LOAD_GUEST_CET_STATE = 1, the guest CET MSRs are loaded
> from VMCS guest-state area at VM entry as follows:
>
> - GUEST_IA32_S_CET : Guest supervisor mode IA32_S_CET MSR is loaded
> from this field.
>
> - GUEST_SSP : Guest SSP is loaded from this field.
>
> - GUEST_INTR_SSP_TABL_ADDR : Guest IA32_INTERRUPT_SSP_TABLE_ADDR
> MSR is loaded from this field.

There is no code in this series to pass these fields to and from
userspace, and also to save/restore U_CET, INT_SSP_TAB, PL0_SSP and
PL3_SSP across context switches.

In addition, PL1_SSP and PL2_SSP should be supported even if the guest
doesn't use them. It makes sense to avoid intercepting them, but they
should still be supported and switched (possibly only if nonzero).

Am I missing something, for example a dependency on host CET support?
If not, how was this series tested?

Paolo

2019-01-25 22:30:53

by Sean Christopherson

[permalink] [raw]

Subject: Re: [PATCH v2 1/7] KVM:VMX: Define CET VMCS fields and bits

On Wed, Jan 23, 2019 at 04:59:03AM +0800, Yang Weijiang wrote:
> On processors that support CET, VMX saves/restores
> the states of IA32_S_CET, SSP and IA32_INTERRUPT_SSP_TABLE_ADDR MSR

It'd be helpful to spell out CET and SSP on their initial usage here,
especially since this is the first patch in the series.

If you're going to abbreviate INTERRUPT below, might as well do so here.

> to the VMCS area for Guest/Host unconditionally.
>
> If VM_EXIT_LOAD_HOST_CET_STATE = 1, the host CET MSRs are
> restored from VMCS host-state area at VM exit as follows:
>
> - HOST_IA32_S_CET: Host supervisor mode IA32_S_CET MSR is loaded
> from this field.
>
> - HOST_SSP : Host SSP is loaded from this field.
>
> - HOST_INTR_SSP_TABLE_ADDR : Host IA32_INTERRUPT_SSP_TABLE_ADDR
> MSR is loaded from this field.
>
> If VM_ENTRY_LOAD_GUEST_CET_STATE = 1, the guest CET MSRs are loaded
> from VMCS guest-state area at VM entry as follows:
>
> - GUEST_IA32_S_CET : Guest supervisor mode IA32_S_CET MSR is loaded
> from this field.
>
> - GUEST_SSP : Guest SSP is loaded from this field.
>
> - GUEST_INTR_SSP_TABL_ADDR : Guest IA32_INTERRUPT_SSP_TABLE_ADDR

/s/TABL_/TABLE_

> MSR is loaded from this field.
>
> Additionally, to context switch guest and host CET states, the VMM
> uses xsaves/xrstors instructions to save/restore the guest CET states
> at VM exit/entry. The CET xsave area is within thread_struct.fpu area.
> If OS execution flow changes during task switch/interrupt/exception etc.,
> the OS also relies on xsaves/xrstors to switch CET states accordingly.
>
> Note: Although these VMCS fields are 64-bit, they don't have high fields.

...that are documented. I'm still betting it's a doc bug and not a
divergence from every other VMCS field in existence.

> Signed-off-by: Zhang Yi Z <[email protected]>
> Signed-off-by: Yang Weijiang <[email protected]>
> ---
> arch/x86/include/asm/vmx.h | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
> index ade0f153947d..395c1f7e5938 100644
> --- a/arch/x86/include/asm/vmx.h
> +++ b/arch/x86/include/asm/vmx.h
> @@ -98,6 +98,7 @@
> #define VM_EXIT_LOAD_IA32_EFER 0x00200000
> #define VM_EXIT_SAVE_VMX_PREEMPTION_TIMER 0x00400000
> #define VM_EXIT_CLEAR_BNDCFGS 0x00800000
> +#define VM_EXIT_LOAD_HOST_CET_STATE 0x10000000
>
> #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR 0x00036dff
>
> @@ -109,6 +110,7 @@
> #define VM_ENTRY_LOAD_IA32_PAT 0x00004000
> #define VM_ENTRY_LOAD_IA32_EFER 0x00008000
> #define VM_ENTRY_LOAD_BNDCFGS 0x00010000
> +#define VM_ENTRY_LOAD_GUEST_CET_STATE 0x00100000
>
> #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR 0x000011ff
>
> @@ -325,6 +327,9 @@ enum vmcs_field {
> GUEST_PENDING_DBG_EXCEPTIONS = 0x00006822,
> GUEST_SYSENTER_ESP = 0x00006824,
> GUEST_SYSENTER_EIP = 0x00006826,
> + GUEST_IA32_S_CET = 0x00006828,
> + GUEST_SSP = 0x0000682a,
> + GUEST_INTR_SSP_TABL_ADDR = 0x0000682c,

/s/TABL/TABLE

> HOST_CR0 = 0x00006c00,
> HOST_CR3 = 0x00006c02,
> HOST_CR4 = 0x00006c04,
> @@ -337,6 +342,9 @@ enum vmcs_field {
> HOST_IA32_SYSENTER_EIP = 0x00006c12,
> HOST_RSP = 0x00006c14,
> HOST_RIP = 0x00006c16,
> + HOST_IA32_S_CET = 0x00006c18,
> + HOST_SSP = 0x00006c1a,
> + HOST_INTR_SSP_TABL_ADDR = 0x00006c1c

/s/TABL/TABLE

> };
>
> /*
> --
> 2.17.1
>

2019-01-25 22:41:56

by Sean Christopherson

[permalink] [raw]

Subject: Re: [PATCH v2 3/7] KVM:CPUID: Add CPUID support for CET xsaves component query.

On Wed, Jan 23, 2019 at 04:59:05AM +0800, Yang Weijiang wrote:
> CET xsaves component size is queried through CPUID.(EAX=0xD, ECX=11)
> and CPUID.(EAX=0xD, ECX=12).
>
> Signed-off-by: Zhang Yi Z <[email protected]>
> Signed-off-by: Yang Weijiang <[email protected]>
> ---
> arch/x86/kvm/cpuid.c | 60 +++++++++++++++++++++++++++++++++-----------
> arch/x86/kvm/x86.c | 4 +++
> arch/x86/kvm/x86.h | 4 +++
> 3 files changed, 54 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index cb1aece25b17..dbeb4e7904eb 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -27,6 +27,8 @@
> #include "trace.h"
> #include "pmu.h"
>
> +extern u64 host_xss;

Probably better to put this in x86.h next to host_xcr0.

> +
> static u32 xstate_required_size(u64 xstate_bv, bool compacted)
> {
> int feature_bit = 0;
> @@ -65,6 +67,19 @@ u64 kvm_supported_xcr0(void)
> return xcr0;
> }
>
> +u64 kvm_supported_xss(void)
> +{
> + u64 xss = host_xss & KVM_SUPPORTED_XSS;
> +
> + /*
> + * Either SHSTK or IBT feature depends on the xsaves component.
> + */
> + if (!boot_cpu_has(X86_FEATURE_SHSTK) && !boot_cpu_has(X86_FEATURE_IBT))
> + xss &= ~(XFEATURE_MASK_SHSTK_USER | XFEATURE_MASK_SHSTK_KERNEL);

This looks wrong, e.g. the SHSTK bits are allowed for SHSTK=false && IBT=true?

And isn't this redundant, i.e. if the features aren't supported by the
boot cpu then shouldn't the associated bits be cleared in host_xss?

> +
> + return xss;
> +}
> +
> #define F(x) bit(X86_FEATURE_##x)
>
> /* For scattered features from cpufeatures.h; we currently expose none */
> @@ -503,6 +518,16 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
> * if the host doesn't support it.
> */
> entry->edx |= F(ARCH_CAPABILITIES);
> + /*
> + * If host doesn't have CET capability,
> + * do not report CET related info.
> + */
> + if (!boot_cpu_has(X86_FEATURE_SHSTK))
> + entry->ecx &= ~F(SHSTK);
> +
> + if (!boot_cpu_has(X86_FEATURE_IBT))
> + entry->edx &= ~F(IBT);
> +
> } else {
> entry->ebx = 0;
> entry->ecx = 0;
> @@ -564,14 +589,17 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
> }
> case 0xd: {
> int idx, i;
> - u64 supported = kvm_supported_xcr0();
> + u64 u_supported = kvm_supported_xcr0();
> + u64 s_supported = kvm_supported_xss();
> + u64 supported;
> + int compacted;
>
> - entry->eax &= supported;
> - entry->ebx = xstate_required_size(supported, false);
> + entry->eax &= u_supported;
> + entry->ebx = xstate_required_size(u_supported, false);
> entry->ecx = entry->ebx;
> - entry->edx &= supported >> 32;
> + entry->edx &= u_supported >> 32;
> entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> - if (!supported)
> + if (!u_supported)

Should this be '!u_supported && !s_supported'?

> break;
>
> for (idx = 1, i = 1; idx < 64; ++idx) {
> @@ -583,19 +611,23 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
> if (idx == 1) {
> entry[i].eax &= kvm_cpuid_D_1_eax_x86_features;
> cpuid_mask(&entry[i].eax, CPUID_D_1_EAX);
> - entry[i].ebx = 0;
> - if (entry[i].eax & (F(XSAVES)|F(XSAVEC)))
> - entry[i].ebx =
> - xstate_required_size(supported,
> - true);
> + supported = u_supported | s_supported;
> + compacted = entry[i].eax &
> + (F(XSAVES) | F(XSAVEC));
> + entry[i].ebx = xstate_required_size(supported,
> + compacted);
> + entry[i].ecx &= s_supported;
> + entry[i].edx = 0;
> } else {
> + supported = (entry[i].ecx & 1) ? s_supported :
> + u_supported;
> if (entry[i].eax == 0 || !(supported & mask))
> continue;
> - if (WARN_ON_ONCE(entry[i].ecx & 1))
> - continue;
> + entry[i].ecx &= 1;
> + entry[i].edx = 0;
> + if (entry[i].ecx)
> + entry[i].ebx = 0;
> }
> - entry[i].ecx = 0;
> - entry[i].edx = 0;
> entry[i].flags |=
> KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> ++*nent;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index a0f8b71b2132..b0ae24913423 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -212,6 +212,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
> };
>
> u64 __read_mostly host_xcr0;
> +u64 __read_mostly host_xss;
>
> static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt);
>
> @@ -6838,6 +6839,9 @@ int kvm_arch_init(void *opaque)
> if (boot_cpu_has(X86_FEATURE_XSAVE))
> host_xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
>
> + if (boot_cpu_has(X86_FEATURE_XSAVES))
> + rdmsrl(MSR_IA32_XSS, host_xss);
> +
> kvm_lapic_init();
> #ifdef CONFIG_X86_64
> pvclock_gtod_register_notifier(&pvclock_gtod_notifier);
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 224cd0a47568..c61da41c3c5c 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -283,6 +283,10 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2,
> | XFEATURE_MASK_YMM | XFEATURE_MASK_BNDREGS \
> | XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
> | XFEATURE_MASK_PKRU)
> +
> +#define KVM_SUPPORTED_XSS (XFEATURE_MASK_SHSTK_USER \
> + | XFEATURE_MASK_SHSTK_KERNEL)
> +
> extern u64 host_xcr0;
>
> extern u64 kvm_supported_xcr0(void);
> --
> 2.17.1
>

2019-01-25 22:48:53

by Sean Christopherson

[permalink] [raw]

Subject: Re: [PATCH v2 4/7] KVM:CPUID: Fix xsaves area size calculation for CPUID.(EAX=0xD,ECX=1).

On Wed, Jan 23, 2019 at 04:59:06AM +0800, Yang Weijiang wrote:
> According to latest Software Development Manual vol.2/3.2,
> for CPUID.(EAX=0xD,ECX=1), it should report xsaves area size
> containing all states enabled by XCR0|IA32_MSR_XSS.
>
> Signed-off-by: Zhang Yi Z <[email protected]>
> Signed-off-by: Yang Weijiang <[email protected]>
> ---
> arch/x86/kvm/cpuid.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index dbeb4e7904eb..67b464d1d749 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -136,7 +136,8 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
>
> best = kvm_find_cpuid_entry(vcpu, 0xD, 1);
> if (best && (best->eax & (F(XSAVES) | F(XSAVEC))))
> - best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
> + best->ebx = xstate_required_size(vcpu->arch.xcr0 |
> + kvm_supported_xss(), true);

kvm_supported_xss() provides the wrong value, that's what's *supported*.
CPUID reflects what's actually enabled, e.g. vcpu->arch.ia32_xss.

If you're going to bother adding spaces to align things, might as well
align the parameters, e.g.:

best->ebx = xstate_required_size(vcpu->arch.xcr0 |
vcpu->arch.ia32_xss, true);

>
> /*
> * The existing code assumes virtual address is 48-bit or 57-bit in the
> --
> 2.17.1
>

2019-01-25 22:50:34

by Sean Christopherson

[permalink] [raw]

Subject: Re: [PATCH v2 5/7] KVM:VMX: Pass through host CET related MSRs to Guest.

On Wed, Jan 23, 2019 at 04:59:07AM +0800, Yang Weijiang wrote:
> The CET runtime settings, i.e., CET state control bits(IA32_U_CET/
> IA32_S_CET), CET SSP(IA32_PL3_SSP/IA32_PL0_SSP) and SSP table address
> (IA32_INTERRUPT_SSP_TABLE_ADDR) are task/thread specific, therefore,
> OS needs to save/restore the states properly during context switch,
> e.g., task/thread switching, interrupt/exception handling, it uses
> xsaves/xrstors to achieve that.
>
> The difference between VMCS CET area fields and xsave CET area, is that
> the former is for state retention during Guest/Host context
> switch while the latter is for state retention during OS execution.
>
> Linux currently doesn't support CPL1 and CPL2, so SSPs for these level
> are skipped here.
>
> Signed-off-by: Zhang Yi Z <[email protected]>
> Signed-off-by: Yang Weijiang <[email protected]>
> ---
> arch/x86/kvm/vmx.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 7bbb8b26e901..68c0e5e41cb1 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -11531,6 +11531,7 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
> vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_CS, MSR_TYPE_RW);
> vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW);
> vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW);
> +

Spurious whitespace change.

> vmx->msr_bitmap_mode = 0;
>
> vmx->loaded_vmcs = &vmx->vmcs01;
> @@ -11769,6 +11770,8 @@ static void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
> static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
> {
> struct vcpu_vmx *vmx = to_vmx(vcpu);
> + struct kvm_cpuid_entry2 *best;
> + unsigned long *msr_bitmap;
>
> if (cpu_has_secondary_exec_ctrls()) {
> vmx_compute_secondary_exec_control(vmx);
> @@ -11786,6 +11789,19 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
> nested_vmx_cr_fixed1_bits_update(vcpu);
> nested_vmx_entry_exit_ctls_update(vcpu);
> }
> +
> + msr_bitmap = vmx->vmcs01.msr_bitmap;
> + best = kvm_find_cpuid_entry(vcpu, 7, 0);
> + if ((best && best->function == 0x7) &&
> + ((best->ecx & bit(X86_FEATURE_SHSTK)) |
> + (best->edx & bit(X86_FEATURE_IBT)))) {
> + vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_U_CET, MSR_TYPE_RW);
> + vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_S_CET, MSR_TYPE_RW);
> + vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW);
> + vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PL0_SSP, MSR_TYPE_RW);
> + vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PL3_SSP, MSR_TYPE_RW);

kvm_cpuid_update() an be called multiple times, don't we need to look
for a change in status as opposed to the bits being enabled? And at
that point toggling interception should probably be wrapped in a helper
function.

> + }
> +
> }
>
> static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
> --
> 2.17.1
>

2019-01-25 22:56:44

by Sean Christopherson

[permalink] [raw]

Subject: Re: [PATCH v2 6/7] KVM:VMX: Load Guest CET via VMCS when CET is enabled in Guest

On Wed, Jan 23, 2019 at 04:59:08AM +0800, Yang Weijiang wrote:
> "Load Guest CET state" bit controls whether guest CET states
> will be loaded on Guest entry. Before doing that, KVM needs
> to check if CET feature is exposed to Guest.
>
> Signed-off-by: Zhang Yi Z <[email protected]>
> Signed-off-by: Yang Weijiang <[email protected]>
> ---
> arch/x86/kvm/vmx.c | 33 +++++++++++++++++++++++++++++++++
> 1 file changed, 33 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 68c0e5e41cb1..9c8cecac80ea 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -55,6 +55,7 @@
> #include <asm/mmu_context.h>
> #include <asm/spec-ctrl.h>
> #include <asm/mshyperv.h>
> +#include <asm/cet.h>
>
> #include "trace.h"
> #include "pmu.h"
> @@ -4065,6 +4066,18 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
> return !(val & ~valid_bits);
> }
>
> +static int vmx_guest_cet_cap(struct kvm_vcpu *vcpu)
> +{
> + struct kvm_cpuid_entry2 *best;
> + int r = 0;
> +
> + best = kvm_find_cpuid_entry(vcpu, 7, 0);
> + if (best && best->function == 0x7)
> + r = (best->ecx & bit(X86_FEATURE_SHSTK)) |
> + (best->edx & bit(X86_FEATURE_IBT)) ? 1 : 0;
> + return r;
> +}
> +
> static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
> {
> switch (msr->index) {
> @@ -5409,6 +5422,26 @@ static int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
> return 1;
> }
>
> + /*
> + * To enable Guest CET, first check if Guest CET feature is
> + * available, if it's not available but its CR4.CET is being set,
> + * return a fault to Guest; then check if Host CET is enabled and
> + * CR4.CET is toggled, if they are, then enable loading CET state

Comment doesn't match the code. Comment says "toggled", code is just
looking at "enabled".

> + * bit in entry control, otherwise, clear the bit to
> + * disable guest CET state loading.

What happens to CET state if the control is clear? Is host state
retained but inaccessible?

> + */
> + if (vmx_guest_cet_cap(vcpu)) {

Why not?

if (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
guest_cpuid_has(vcpu, X86_FEATURE_IBT)) {

> + if (hw_cr4 & cr4 & X86_CR4_CET) {
> + vmcs_set_bits(VM_ENTRY_CONTROLS,
> + VM_ENTRY_LOAD_GUEST_CET_STATE);
> + } else {
> + vmcs_clear_bits(VM_ENTRY_CONTROLS,
> + VM_ENTRY_LOAD_GUEST_CET_STATE);
> + }
> + } else if (cr4 & X86_CR4_CET) {
> + return 1;
> + }
> +
> if (to_vmx(vcpu)->nested.vmxon && !nested_cr4_valid(vcpu, cr4))
> return 1;
>
> --
> 2.17.1
>

2019-01-25 23:04:05

by Sean Christopherson

[permalink] [raw]

Subject: Re: [PATCH v2 7/7] KVM:X86: Enable MSR_IA32_XSS bit 11 and 12 for CET xsaves/xrstors.

On Wed, Jan 23, 2019 at 04:59:09AM +0800, Yang Weijiang wrote:
> For kvm Guest OS, right now, only bit 11(user mode CET) and bit 12
> (supervisor CET) are supported in XSS MSR, if other bits are being set,
> the write to XSS will be skipped.
>
> Signed-off-by: Zhang Yi Z <[email protected]>
> Signed-off-by: Yang Weijiang <[email protected]>
> ---
> arch/x86/kvm/vmx.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 9c8cecac80ea..25ac22b3923a 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -47,6 +47,7 @@
> #include <asm/virtext.h>
> #include <asm/mce.h>
> #include <asm/fpu/internal.h>
> +#include <asm/fpu/types.h>
> #include <asm/perf_event.h>
> #include <asm/debugreg.h>
> #include <asm/kexec.h>
> @@ -4334,12 +4335,16 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> case MSR_IA32_XSS:
> if (!vmx_xsaves_supported())
> return 1;
> +
> /*
> - * The only supported bit as of Skylake is bit 8, but
> - * it is not supported on KVM.
> + * Right now, only support XSS_CET_U[bit 11] and
> + * XSS_CET_S[bit 12] in MSR_IA32_XSS.
> */
> - if (data != 0)
> +
> + if (!vmx_guest_cet_cap(vcpu) ||

This isn't super intuitive, as evidenced by the additional comment.
If you mask off the unsupported bits then you can make a clean check and
don't need a comment, e.g.:

supported = kvm_supported_xss();
if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
!guest_cpuid_has(vcpu, X86_FEATURE_IBT))
supported &= ~(XSS_CET_S | XSS_CET_U);

if (data & ~supported)
return 1;

> + data & ~(KVM_SUPPORTED_XSS & host_xss))

Didn't you add kvm_supported_xss() in an earlier patch? Might as well
use it here.

> return 1;
> +
> vcpu->arch.ia32_xss = data;
> if (vcpu->arch.ia32_xss != host_xss)
> add_atomic_switch_msr(vmx, MSR_IA32_XSS,
> --
> 2.17.1
>

2019-01-29 03:42:07

by Yang, Weijiang

[permalink] [raw]

Subject: Re: [PATCH v2 1/7] KVM:VMX: Define CET VMCS fields and bits

On Fri, Jan 25, 2019 at 07:02:37PM +0100, Paolo Bonzini wrote:
> On 22/01/19 21:59, Yang Weijiang wrote:
> > On processors that support CET, VMX saves/restores
> > the states of IA32_S_CET, SSP and IA32_INTERRUPT_SSP_TABLE_ADDR MSR
> > to the VMCS area for Guest/Host unconditionally.
> >
> > If VM_EXIT_LOAD_HOST_CET_STATE = 1, the host CET MSRs are
> > restored from VMCS host-state area at VM exit as follows:
> >
> > - HOST_IA32_S_CET: Host supervisor mode IA32_S_CET MSR is loaded
> > from this field.
> >
> > - HOST_SSP : Host SSP is loaded from this field.
> >
> > - HOST_INTR_SSP_TABLE_ADDR : Host IA32_INTERRUPT_SSP_TABLE_ADDR
> > MSR is loaded from this field.
> >
> > If VM_ENTRY_LOAD_GUEST_CET_STATE = 1, the guest CET MSRs are loaded
> > from VMCS guest-state area at VM entry as follows:
> >
> > - GUEST_IA32_S_CET : Guest supervisor mode IA32_S_CET MSR is loaded
> > from this field.
> >
> > - GUEST_SSP : Guest SSP is loaded from this field.
> >
> > - GUEST_INTR_SSP_TABL_ADDR : Guest IA32_INTERRUPT_SSP_TABLE_ADDR
> > MSR is loaded from this field.
>
Thanks for review.

> There is no code in this series to pass these fields to and from
> userspace, and also to save/restore U_CET, INT_SSP_TAB, PL0_SSP and
> PL3_SSP across context switches.
>
The kernel consumes these MSRs, please see kernel CET patch:
https://lkml.org/lkml/fancy/2018/11/20/225

> In addition, PL1_SSP and PL2_SSP should be supported even if the guest
> doesn't use them. It makes sense to avoid intercepting them, but they
> should still be supported and switched (possibly only if nonzero).
>
> Am I missing something, for example a dependency on host CET support?
> If not, how was this series tested?
>
The guest CET feature is tested with kernel CET patches on internal
virtual platform.

> Paolo

2019-01-29 15:21:30

by Paolo Bonzini

[permalink] [raw]

Subject: Re: [PATCH v2 1/7] KVM:VMX: Define CET VMCS fields and bits

On 28/01/19 11:33, Yang Weijiang wrote:
>> There is no code in this series to pass these fields to and from
>> userspace, and also to save/restore U_CET, INT_SSP_TAB, PL0_SSP and
>> PL3_SSP across context switches.
>>
> The kernel consumes these MSRs, please see kernel CET patch:
> https://lkml.org/lkml/fancy/2018/11/20/225

Still, even if the kernel saves these fields across context switch in
XSAVE areas, KVM must support accesses to the MSRs from userspace, for
example in order to perform live migration.

For example, when reading/writing these in kvm_set_msr or
kvm_get_msr_common, you would have to do a read/write from the host
MSRs. You also have to put kvm_load_guest_fpu/kvm_put_guest_fpu calls
in __msr_io.

Thanks,

Paolo

>> In addition, PL1_SSP and PL2_SSP should be supported even if the guest
>> doesn't use them. It makes sense to avoid intercepting them, but they
>> should still be supported and switched (possibly only if nonzero).
>>
>> Am I missing something, for example a dependency on host CET support?
>> If not, how was this series tested?
>>
> The guest CET feature is tested with kernel CET patches on internal
> virtual platform.
>

2019-01-29 17:47:39

[permalink] [raw]

Subject: Re: [PATCH v2 1/7] KVM:VMX: Define CET VMCS fields and bits

On Wed, Jan 23, 2019 at 6:06 AM Yang Weijiang <[email protected]> wrote:
> Note: Although these VMCS fields are 64-bit, they don't have high fields.

This statement directly contradicts the SDM, volume 3, appendix B.2:

"A value of 1 in bits 14:13 of an encoding indicates a 64-bit field.
There are 64-bit fields only for controls and for guest state. As
noted in Section 24.11.2, every 64-bit field has two encodings, which
differ on bit 0, the access type. Thus, each such field has an even
encoding for full access and an odd encoding for high access."

2019-01-29 18:01:54

[permalink] [raw]

Subject: Re: [PATCH v2 1/7] KVM:VMX: Define CET VMCS fields and bits

On Tue, Jan 29, 2019 at 9:47 AM Jim Mattson <[email protected]> wrote:
>
> On Wed, Jan 23, 2019 at 6:06 AM Yang Weijiang <[email protected]> wrote:
> > Note: Although these VMCS fields are 64-bit, they don't have high fields.
>
> This statement directly contradicts the SDM, volume 3, appendix B.2:
>
> "A value of 1 in bits 14:13 of an encoding indicates a 64-bit field.
> There are 64-bit fields only for controls and for guest state. As
> noted in Section 24.11.2, every 64-bit field has two encodings, which
> differ on bit 0, the access type. Thus, each such field has an even
> encoding for full access and an odd encoding for high access."

Ah! They're not actually 64-bit fields. If you look at the encodings
(0x68XX and 0x6cxx), they're natural-width fields! Natural-width
fields don't have a high component.

2019-01-30 01:35:54

by Yang, Weijiang

[permalink] [raw]

Subject: Re: [PATCH v2 1/7] KVM:VMX: Define CET VMCS fields and bits

On Tue, Jan 29, 2019 at 04:19:34PM +0100, Paolo Bonzini wrote:
> On 28/01/19 11:33, Yang Weijiang wrote:
> >> There is no code in this series to pass these fields to and from
> >> userspace, and also to save/restore U_CET, INT_SSP_TAB, PL0_SSP and
> >> PL3_SSP across context switches.
> >>
> > The kernel consumes these MSRs, please see kernel CET patch:
> > https://lkml.org/lkml/fancy/2018/11/20/225
>
> Still, even if the kernel saves these fields across context switch in
> XSAVE areas, KVM must support accesses to the MSRs from userspace, for
> example in order to perform live migration.
>
> For example, when reading/writing these in kvm_set_msr or
> kvm_get_msr_common, you would have to do a read/write from the host
> MSRs. You also have to put kvm_load_guest_fpu/kvm_put_guest_fpu calls
> in __msr_io.
Thanks, you're right, if we want to support live migration we need do that.
However, this feature relies on quite a few external
dependencies, e.g., cpu capabilities, kernel version and configuration, we
don't support live migration right now. what's your opinion?
>
> Thanks,
>
> Paolo
>
> >> In addition, PL1_SSP and PL2_SSP should be supported even if the guest
> >> doesn't use them. It makes sense to avoid intercepting them, but they
> >> should still be supported and switched (possibly only if nonzero).
> >>
> >> Am I missing something, for example a dependency on host CET support?
> >> If not, how was this series tested?
> >>
> > The guest CET feature is tested with kernel CET patches on internal
> > virtual platform.
> >

2019-01-30 01:41:34

by Yang, Weijiang

[permalink] [raw]

Subject: Re: [PATCH v2 1/7] KVM:VMX: Define CET VMCS fields and bits

On Tue, Jan 29, 2019 at 10:27:50AM -0800, Sean Christopherson wrote:
> On Tue, Jan 29, 2019 at 10:01:15AM -0800, Jim Mattson wrote:
> > On Tue, Jan 29, 2019 at 9:47 AM Jim Mattson <[email protected]> wrote:
> > >
> > > On Wed, Jan 23, 2019 at 6:06 AM Yang Weijiang <[email protected]> wrote:
> > > > Note: Although these VMCS fields are 64-bit, they don't have high fields.
> > >
> > > This statement directly contradicts the SDM, volume 3, appendix B.2:
> > >
> > > "A value of 1 in bits 14:13 of an encoding indicates a 64-bit field.
> > > There are 64-bit fields only for controls and for guest state. As
> > > noted in Section 24.11.2, every 64-bit field has two encodings, which
> > > differ on bit 0, the access type. Thus, each such field has an even
> > > encoding for full access and an odd encoding for high access."
> >
> > Ah! They're not actually 64-bit fields. If you look at the encodings
> > (0x68XX and 0x6cxx), they're natural-width fields! Natural-width
> > fields don't have a high component.
>
> They're indeed natural width (I actually looked at the spec this time).
>
> The "_FULL" postfix on VMX_HOST_IA32_INTERRUPT_SSP_TABLE_ADDR_FULL and
> VMX_HOST_SSP_FULL is confusing as it generally only shows up on 64-bit
> fields. I'll see if we can get the fields renamed to drop "_FULL". I
> suggest we preemptively do the same for KVM.
Thank you for making it clear! I should have modified the annotation
correctly. I'll add a note for this in next version.

2019-01-30 08:33:32

by Paolo Bonzini

[permalink] [raw]

Subject: Re: [PATCH v2 1/7] KVM:VMX: Define CET VMCS fields and bits

On 29/01/19 09:29, Yang Weijiang wrote:
> On Tue, Jan 29, 2019 at 04:19:34PM +0100, Paolo Bonzini wrote:
>> On 28/01/19 11:33, Yang Weijiang wrote:
>>>> There is no code in this series to pass these fields to and from
>>>> userspace, and also to save/restore U_CET, INT_SSP_TAB, PL0_SSP and
>>>> PL3_SSP across context switches.
>>>>
>>> The kernel consumes these MSRs, please see kernel CET patch:
>>> https://lkml.org/lkml/fancy/2018/11/20/225
>>
>> Still, even if the kernel saves these fields across context switch in
>> XSAVE areas, KVM must support accesses to the MSRs from userspace, for
>> example in order to perform live migration.
>>
>> For example, when reading/writing these in kvm_set_msr or
>> kvm_get_msr_common, you would have to do a read/write from the host
>> MSRs. You also have to put kvm_load_guest_fpu/kvm_put_guest_fpu calls
>> in __msr_io.
> Thanks, you're right, if we want to support live migration we need do that.
> However, this feature relies on quite a few external
> dependencies, e.g., cpu capabilities, kernel version and configuration, we
> don't support live migration right now. what's your opinion?

Live migration is generally a requirement for a feature to be accepted
in KVM.

Paolo

>> Thanks,
>>
>> Paolo
>>
>>>> In addition, PL1_SSP and PL2_SSP should be supported even if the guest
>>>> doesn't use them. It makes sense to avoid intercepting them, but they
>>>> should still be supported and switched (possibly only if nonzero).
>>>>
>>>> Am I missing something, for example a dependency on host CET support?
>>>> If not, how was this series tested?
>>>>
>>> The guest CET feature is tested with kernel CET patches on internal
>>> virtual platform.
>>>

2019-01-31 08:25:20

by Yang, Weijiang

[permalink] [raw]

Subject: Re: [PATCH v2 6/7] KVM:VMX: Load Guest CET via VMCS when CET is enabled in Guest

On Fri, Jan 25, 2019 at 02:56:25PM -0800, Sean Christopherson wrote:
> On Wed, Jan 23, 2019 at 04:59:08AM +0800, Yang Weijiang wrote:
> > "Load Guest CET state" bit controls whether guest CET states
> > will be loaded on Guest entry. Before doing that, KVM needs
> > to check if CET feature is exposed to Guest.
> >
> > Signed-off-by: Zhang Yi Z <[email protected]>
> > Signed-off-by: Yang Weijiang <[email protected]>
> > ---
> > arch/x86/kvm/vmx.c | 33 +++++++++++++++++++++++++++++++++
> > 1 file changed, 33 insertions(+)
> >
> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > index 68c0e5e41cb1..9c8cecac80ea 100644
> > --- a/arch/x86/kvm/vmx.c
> > +++ b/arch/x86/kvm/vmx.c
> > @@ -55,6 +55,7 @@
> > #include <asm/mmu_context.h>
> > #include <asm/spec-ctrl.h>
> > #include <asm/mshyperv.h>
> > +#include <asm/cet.h>
> >
> > #include "trace.h"
> > #include "pmu.h"
> > @@ -4065,6 +4066,18 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
> > return !(val & ~valid_bits);
> > }
> >
> > +static int vmx_guest_cet_cap(struct kvm_vcpu *vcpu)
> > +{
> > + struct kvm_cpuid_entry2 *best;
> > + int r = 0;
> > +
> > + best = kvm_find_cpuid_entry(vcpu, 7, 0);
> > + if (best && best->function == 0x7)
> > + r = (best->ecx & bit(X86_FEATURE_SHSTK)) |
> > + (best->edx & bit(X86_FEATURE_IBT)) ? 1 : 0;
> > + return r;
> > +}
> > +
> > static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
> > {
> > switch (msr->index) {
> > @@ -5409,6 +5422,26 @@ static int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
> > return 1;
> > }
> >
> > + /*
> > + * To enable Guest CET, first check if Guest CET feature is
> > + * available, if it's not available but its CR4.CET is being set,
> > + * return a fault to Guest; then check if Host CET is enabled and
> > + * CR4.CET is toggled, if they are, then enable loading CET state
>
> Comment doesn't match the code. Comment says "toggled", code is just
> looking at "enabled".
>
> > + * bit in entry control, otherwise, clear the bit to
> > + * disable guest CET state loading.
>
> What happens to CET state if the control is clear? Is host state
> retained but inaccessible?
>
Hi, Sean,
I consulted the CET arch for the feature enabling in guest, actually, it
can be enabled independent to host CET feature, i.e., guest CET feature
can work even without host CET feature off, I did experiments, it's
proved true. So, in next version, I'll remove host CET dependencies to
make the KVM code look much cleaner. Thanks for your comments.

> > + */
> > + if (vmx_guest_cet_cap(vcpu)) {
>
> Why not?
>
> if (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
> guest_cpuid_has(vcpu, X86_FEATURE_IBT)) {
>
> > + if (hw_cr4 & cr4 & X86_CR4_CET) {
> > + vmcs_set_bits(VM_ENTRY_CONTROLS,
> > + VM_ENTRY_LOAD_GUEST_CET_STATE);
> > + } else {
> > + vmcs_clear_bits(VM_ENTRY_CONTROLS,
> > + VM_ENTRY_LOAD_GUEST_CET_STATE);
> > + }
> > + } else if (cr4 & X86_CR4_CET) {
> > + return 1;
> > + }
> > +
> > if (to_vmx(vcpu)->nested.vmxon && !nested_cr4_valid(vcpu, cr4))
> > return 1;
> >
> > --
> > 2.17.1
> >

2019-03-04 18:59:08

by Sean Christopherson

[permalink] [raw]

Subject: Re: [PATCH v2 1/7] KVM:VMX: Define CET VMCS fields and bits

On Tue, Jan 29, 2019 at 04:19:34PM +0100, Paolo Bonzini wrote:
> On 28/01/19 11:33, Yang Weijiang wrote:
> >> There is no code in this series to pass these fields to and from
> >> userspace, and also to save/restore U_CET, INT_SSP_TAB, PL0_SSP and
> >> PL3_SSP across context switches.
> >>
> > The kernel consumes these MSRs, please see kernel CET patch:
> > https://lkml.org/lkml/fancy/2018/11/20/225
>
> Still, even if the kernel saves these fields across context switch in
> XSAVE areas, KVM must support accesses to the MSRs from userspace, for
> example in order to perform live migration.
>
> For example, when reading/writing these in kvm_set_msr or
> kvm_get_msr_common, you would have to do a read/write from the host
> MSRs. You also have to put kvm_load_guest_fpu/kvm_put_guest_fpu calls
> in __msr_io.

Paolo, can you elaborate on why KVM would read the host MSRs? Wouldn't
kvm_{get,set}_msr() pull the values from the VMCS when necessary?

2019-03-08 09:16:38

by Paolo Bonzini

[permalink] [raw]

Subject: Re: [PATCH v2 1/7] KVM:VMX: Define CET VMCS fields and bits

On 04/03/19 19:56, Sean Christopherson wrote:
> On Tue, Jan 29, 2019 at 04:19:34PM +0100, Paolo Bonzini wrote:
>> On 28/01/19 11:33, Yang Weijiang wrote:
>>>> There is no code in this series to pass these fields to and from
>>>> userspace, and also to save/restore U_CET, INT_SSP_TAB, PL0_SSP and
>>>> PL3_SSP across context switches.
>>>>
>>> The kernel consumes these MSRs, please see kernel CET patch:
>>> https://lkml.org/lkml/fancy/2018/11/20/225
>>
>> Still, even if the kernel saves these fields across context switch in
>> XSAVE areas, KVM must support accesses to the MSRs from userspace, for
>> example in order to perform live migration.
>>
>> For example, when reading/writing these in kvm_set_msr or
>> kvm_get_msr_common, you would have to do a read/write from the host
>> MSRs. You also have to put kvm_load_guest_fpu/kvm_put_guest_fpu calls
>> in __msr_io.
>
> Paolo, can you elaborate on why KVM would read the host MSRs? Wouldn't
> kvm_{get,set}_msr() pull the values from the VMCS when necessary?

Not all MSRs are in the VMCS; IA32_U_CET and IA32_PL*_SSP are not.

Paolo

2019-03-08 15:51:30

by Sean Christopherson

[permalink] [raw]

Subject: Re: [PATCH v2 1/7] KVM:VMX: Define CET VMCS fields and bits

On Fri, Mar 08, 2019 at 10:15:15AM +0100, Paolo Bonzini wrote:
> On 04/03/19 19:56, Sean Christopherson wrote:
> > On Tue, Jan 29, 2019 at 04:19:34PM +0100, Paolo Bonzini wrote:
> >> On 28/01/19 11:33, Yang Weijiang wrote:
> >>>> There is no code in this series to pass these fields to and from
> >>>> userspace, and also to save/restore U_CET, INT_SSP_TAB, PL0_SSP and
> >>>> PL3_SSP across context switches.
> >>>>
> >>> The kernel consumes these MSRs, please see kernel CET patch:
> >>> https://lkml.org/lkml/fancy/2018/11/20/225
> >>
> >> Still, even if the kernel saves these fields across context switch in
> >> XSAVE areas, KVM must support accesses to the MSRs from userspace, for
> >> example in order to perform live migration.
> >>
> >> For example, when reading/writing these in kvm_set_msr or
> >> kvm_get_msr_common, you would have to do a read/write from the host
> >> MSRs. You also have to put kvm_load_guest_fpu/kvm_put_guest_fpu calls
> >> in __msr_io.
> >
> > Paolo, can you elaborate on why KVM would read the host MSRs? Wouldn't
> > kvm_{get,set}_msr() pull the values from the VMCS when necessary?
>
> Not all MSRs are in the VMCS; IA32_U_CET and IA32_PL*_SSP are not.

Ah, "host MSRs" confused me. I though you meant the host's version of
the MSRs, but you're saying do an XRSTORS to load the guest's FPU state
and then {RD,WR}MSR to pull the guest's value from hardware.

2019-03-08 16:37:30

by Paolo Bonzini

[permalink] [raw]

Subject: Re: [PATCH v2 1/7] KVM:VMX: Define CET VMCS fields and bits

On 08/03/19 16:50, Sean Christopherson wrote:
> On Fri, Mar 08, 2019 at 10:15:15AM +0100, Paolo Bonzini wrote:
>> On 04/03/19 19:56, Sean Christopherson wrote:
>>> On Tue, Jan 29, 2019 at 04:19:34PM +0100, Paolo Bonzini wrote:
>>>> On 28/01/19 11:33, Yang Weijiang wrote:
>>>>>> There is no code in this series to pass these fields to and from
>>>>>> userspace, and also to save/restore U_CET, INT_SSP_TAB, PL0_SSP and
>>>>>> PL3_SSP across context switches.
>>>>>>
>>>>> The kernel consumes these MSRs, please see kernel CET patch:
>>>>> https://lkml.org/lkml/fancy/2018/11/20/225
>>>>
>>>> Still, even if the kernel saves these fields across context switch in
>>>> XSAVE areas, KVM must support accesses to the MSRs from userspace, for
>>>> example in order to perform live migration.
>>>>
>>>> For example, when reading/writing these in kvm_set_msr or
>>>> kvm_get_msr_common, you would have to do a read/write from the host
>>>> MSRs. You also have to put kvm_load_guest_fpu/kvm_put_guest_fpu calls
>>>> in __msr_io.
>>>
>>> Paolo, can you elaborate on why KVM would read the host MSRs? Wouldn't
>>> kvm_{get,set}_msr() pull the values from the VMCS when necessary?
>>
>> Not all MSRs are in the VMCS; IA32_U_CET and IA32_PL*_SSP are not.
>
> Ah, "host MSRs" confused me. I though you meant the host's version of
> the MSRs, but you're saying do an XRSTORS to load the guest's FPU state
> and then {RD,WR}MSR to pull the guest's value from hardware.

Yes, that's what Weijang's patches are already doing in the next version.

Paolo