2021-02-03 11:23:46

by Yang, Weijiang

Subject: [PATCH v15 00/14] Introduce support for guest CET feature

Control-flow Enforcement Technology (CET) provides protection against
Return/Jump-Oriented Programming (ROP/JOP) attacks. There are two CET
sub-features: Shadow Stack (SHSTK) and Indirect Branch Tracking (IBT).
SHSTK prevents ROP attacks and IBT prevents JOP attacks.

Several parts in KVM have been updated to provide guest CET support, including:
CPUID/XSAVES settings, MSR passthrough, user-space MSR access interface,
vmentry/vmexit config, nested VM etc. These patches are dependent on CET
kernel patches for XSAVES support and CET definitions, e.g., MSR and related
feature flags.

CET kernel patches: refer to [1], [2].

Previous CET KVM patches: refer to [3].

CET QEMU patches: refer to [4].

CET KVM unit-test patch: refer to [5].

[1]: CET Shadow Stack patches v18:
https://lkml.kernel.org/linux-api/[email protected]/

[2]: Indirect Branch Tracking patches v18:
https://lkml.kernel.org/linux-api/[email protected]/

[3]: CET KVM patches v14:
https://lkml.kernel.org/kvm/[email protected]/

[4]: CET QEMU patches:
https://patchwork.ozlabs.org/project/qemu-devel/patch/[email protected]/

[5]: CET KVM unit-test patch:
https://patchwork.kernel.org/project/kvm/patch/[email protected]/

Changes in v15:
- Changed patches per Paolo's review feedback on v14.
- Added a new patch for GUEST_SSP save/restore in guest SMM case.
- Fixed guest call-trace issue due to CET MSR interception.
- Removed unnecessary guest CET state cleanup in VMCS.
- Rebased patches to 5.11-rc6.


Sean Christopherson (2):
KVM: x86: Report XSS as an MSR to be saved if there are supported
features
KVM: x86: Load guest fpu state when accessing MSRs managed by XSAVES

Yang Weijiang (12):
KVM: x86: Refresh CPUID on writes to MSR_IA32_XSS
KVM: x86: Add #CP support in guest exception dispatch
KVM: VMX: Introduce CET VMCS fields and flags
KVM: x86: Add fault checks for CR4.CET
KVM: VMX: Emulate reads and writes to CET MSRs
KVM: VMX: Add a synthetic MSR to allow userspace VMM to access
GUEST_SSP
KVM: x86: Report CET MSRs as to-be-saved if CET is supported
KVM: x86: Enable CET virtualization for VMX and advertise CET to
userspace
KVM: VMX: Pass through CET MSRs to the guest when supported
KVM: nVMX: Add helper to check the vmcs01 MSR bitmap for MSR
pass-through
KVM: nVMX: Enable CET support for nested VMX
KVM: x86: Save/Restore GUEST_SSP to/from SMRAM

arch/x86/include/asm/kvm_host.h | 4 +-
arch/x86/include/asm/vmx.h | 8 ++
arch/x86/include/uapi/asm/kvm.h | 1 +
arch/x86/include/uapi/asm/kvm_para.h | 1 +
arch/x86/kvm/cpuid.c | 26 +++-
arch/x86/kvm/emulate.c | 11 ++
arch/x86/kvm/vmx/capabilities.h | 5 +
arch/x86/kvm/vmx/nested.c | 57 ++++++--
arch/x86/kvm/vmx/vmcs12.c | 6 +
arch/x86/kvm/vmx/vmcs12.h | 14 +-
arch/x86/kvm/vmx/vmx.c | 202 ++++++++++++++++++++++++++-
arch/x86/kvm/x86.c | 67 ++++++++-
arch/x86/kvm/x86.h | 10 +-
13 files changed, 387 insertions(+), 25 deletions(-)

--
2.26.2


2021-02-03 11:23:49

by Yang, Weijiang

Subject: [PATCH v15 01/14] KVM: x86: Report XSS as an MSR to be saved if there are supported features

From: Sean Christopherson <[email protected]>

Add MSR_IA32_XSS to the list of MSRs reported to userspace if
supported_xss is non-zero, i.e. KVM supports at least one XSS based
feature.

Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/x86.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 76bce832cade..a657706358e8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1255,6 +1255,8 @@ static const u32 msrs_to_save_all[] = {
MSR_ARCH_PERFMON_EVENTSEL0 + 12, MSR_ARCH_PERFMON_EVENTSEL0 + 13,
MSR_ARCH_PERFMON_EVENTSEL0 + 14, MSR_ARCH_PERFMON_EVENTSEL0 + 15,
MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17,
+
+ MSR_IA32_XSS,
};

static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_all)];
@@ -5795,6 +5797,10 @@ static void kvm_init_msr_list(void)
min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
continue;
break;
+ case MSR_IA32_XSS:
+ if (!supported_xss)
+ continue;
+ break;
default:
break;
}
--
2.26.2

2021-02-03 11:24:32

by Yang, Weijiang

Subject: [PATCH v15 03/14] KVM: x86: Load guest fpu state when accessing MSRs managed by XSAVES

From: Sean Christopherson <[email protected]>

A handful of CET MSRs are not context switched through "traditional"
methods, e.g. VMCS or manual switching, but rather are passed through
to the guest and are saved and restored by XSAVES/XRSTORS, i.e. in the
guest's FPU state.

Load the guest's FPU state if userspace is accessing MSRs whose values are
managed by XSAVES so that the MSR helper, e.g. vmx_{get,set}_xsave_msr(),
can simply do {RD,WR}MSR to access the guest's value.

Because __msr_io() is also used for the KVM_GET_MSRS device ioctl(),
explicitly check that @vcpu is non-null before attempting to load guest
state. The XSS-supporting MSRs cannot be retrieved via the device
ioctl() without loading guest FPU state (which doesn't exist).

Note that guest_cpuid_has() is not queried as host userspace is allowed
to access MSRs that have not been exposed to the guest, e.g. it might do
KVM_SET_MSRS prior to KVM_SET_CPUID2.

Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/x86.c | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 30a07caf077c..99f787152d12 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -110,6 +110,8 @@ static void enter_smm(struct kvm_vcpu *vcpu);
static void __kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags);
static void store_regs(struct kvm_vcpu *vcpu);
static int sync_regs(struct kvm_vcpu *vcpu);
+static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
+static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);

struct kvm_x86_ops kvm_x86_ops __read_mostly;
EXPORT_SYMBOL_GPL(kvm_x86_ops);
@@ -3618,6 +3620,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
}
EXPORT_SYMBOL_GPL(kvm_get_msr_common);

+static bool is_xsaves_msr(u32 index)
+{
+ return index == MSR_IA32_U_CET ||
+ (index >= MSR_IA32_PL0_SSP && index <= MSR_IA32_PL3_SSP);
+}
+
/*
* Read or write a bunch of msrs. All parameters are kernel addresses.
*
@@ -3628,11 +3636,20 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
int (*do_msr)(struct kvm_vcpu *vcpu,
unsigned index, u64 *data))
{
+ bool fpu_loaded = false;
int i;

- for (i = 0; i < msrs->nmsrs; ++i)
+ for (i = 0; i < msrs->nmsrs; ++i) {
+ if (vcpu && !fpu_loaded && supported_xss &&
+ is_xsaves_msr(entries[i].index)) {
+ kvm_load_guest_fpu(vcpu);
+ fpu_loaded = true;
+ }
if (do_msr(vcpu, entries[i].index, &entries[i].data))
break;
+ }
+ if (fpu_loaded)
+ kvm_put_guest_fpu(vcpu);

return i;
}
--
2.26.2

2021-02-03 11:25:44

by Yang, Weijiang

Subject: [PATCH v15 04/14] KVM: x86: Add #CP support in guest exception dispatch

Add handling for Control Protection (#CP) exceptions, vector 21, which
are introduced by Intel's Control-Flow Enforcement Technology (CET) and
raised on a relevant CET violation. See Intel's SDM for details.

Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/include/uapi/asm/kvm.h | 1 +
arch/x86/kvm/x86.c | 1 +
arch/x86/kvm/x86.h | 2 +-
3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 8e76d3701db3..507263d1d0b2 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -32,6 +32,7 @@
#define MC_VECTOR 18
#define XM_VECTOR 19
#define VE_VECTOR 20
+#define CP_VECTOR 21

/* Select x86 specific features in <linux/kvm.h> */
#define __KVM_HAVE_PIT
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 99f787152d12..d9d3bae40a8c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -436,6 +436,7 @@ static int exception_class(int vector)
case NP_VECTOR:
case SS_VECTOR:
case GP_VECTOR:
+ case CP_VECTOR:
return EXCPT_CONTRIBUTORY;
default:
break;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index c5ee0f5ce0f1..bdbd0b023ecc 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -116,7 +116,7 @@ static inline bool x86_exception_has_error_code(unsigned int vector)
{
static u32 exception_has_error_code = BIT(DF_VECTOR) | BIT(TS_VECTOR) |
BIT(NP_VECTOR) | BIT(SS_VECTOR) | BIT(GP_VECTOR) |
- BIT(PF_VECTOR) | BIT(AC_VECTOR);
+ BIT(PF_VECTOR) | BIT(AC_VECTOR) | BIT(CP_VECTOR);

return (1U << vector) & exception_has_error_code;
}
--
2.26.2

2021-02-03 11:26:27

by Yang, Weijiang

Subject: [PATCH v15 06/14] KVM: x86: Add fault checks for CR4.CET

Add the fault checks for CR4.CET, which is the master control for all
CET features (SHSTK and IBT). In addition to basic support checks, CET
can be enabled if and only if CR0.WP==1, i.e. setting CR4.CET=1 faults
if CR0.WP==0, and clearing CR0.WP faults if CR4.CET==1.

Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/x86.c | 6 ++++++
arch/x86/kvm/x86.h | 3 +++
2 files changed, 9 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d9d3bae40a8c..6af240d87a33 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -868,6 +868,9 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
if (!(cr0 & X86_CR0_PG) && kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE))
return 1;

+ if (!(cr0 & X86_CR0_WP) && kvm_read_cr4_bits(vcpu, X86_CR4_CET))
+ return 1;
+
kvm_x86_ops.set_cr0(vcpu, cr0);

kvm_post_set_cr0(vcpu, old_cr0, cr0);
@@ -1034,6 +1037,9 @@ int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
return 1;
}

+ if ((cr4 & X86_CR4_CET) && !(kvm_read_cr0(vcpu) & X86_CR0_WP))
+ return 1;
+
kvm_x86_ops.set_cr4(vcpu, cr4);

kvm_post_set_cr4(vcpu, old_cr4, cr4);
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index bdbd0b023ecc..fd8c46da2030 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -425,6 +425,9 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type);
__reserved_bits |= X86_CR4_UMIP; \
if (!__cpu_has(__c, X86_FEATURE_VMX)) \
__reserved_bits |= X86_CR4_VMXE; \
+ if (!__cpu_has(__c, X86_FEATURE_SHSTK) && \
+ !__cpu_has(__c, X86_FEATURE_IBT)) \
+ __reserved_bits |= X86_CR4_CET; \
__reserved_bits; \
})

--
2.26.2

2021-02-03 11:26:27

by Yang, Weijiang

Subject: [PATCH v15 02/14] KVM: x86: Refresh CPUID on writes to MSR_IA32_XSS

Update CPUID.0xD.0x1, which reports the currently required storage size
of all features enabled via XCR0 | XSS, when the guest's XSS is modified.

Note, KVM does not yet support any XSS based features, i.e. supported_xss
is guaranteed to be zero at this time.

Co-developed-by: Zhang Yi Z <[email protected]>
Signed-off-by: Zhang Yi Z <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/cpuid.c | 21 ++++++++++++++++++---
arch/x86/kvm/x86.c | 7 +++++--
3 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3d6616f6f6ef..1734f872712d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -611,6 +611,7 @@ struct kvm_vcpu_arch {

u64 xcr0;
u64 guest_supported_xcr0;
+ u64 guest_supported_xss;

struct kvm_pio_request pio;
void *pio_data;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 13036cf0b912..6d7d9d59fd5b 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -129,9 +129,24 @@ void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
best->ebx = xstate_required_size(vcpu->arch.xcr0, false);

best = kvm_find_cpuid_entry(vcpu, 0xD, 1);
- if (best && (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
- cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
- best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
+ if (best) {
+ if (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
+ cpuid_entry_has(best, X86_FEATURE_XSAVEC)) {
+ u64 xstate = vcpu->arch.xcr0 | vcpu->arch.ia32_xss;
+
+ best->ebx = xstate_required_size(xstate, true);
+ }
+
+ if (!cpuid_entry_has(best, X86_FEATURE_XSAVES)) {
+ best->ecx = 0;
+ best->edx = 0;
+ }
+ vcpu->arch.guest_supported_xss =
+ (((u64)best->edx << 32) | best->ecx) & supported_xss;
+
+ } else {
+ vcpu->arch.guest_supported_xss = 0;
+ }

best = kvm_find_cpuid_entry(vcpu, KVM_CPUID_FEATURES, 0);
if (kvm_hlt_in_guest(vcpu->kvm) && best &&
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a657706358e8..30a07caf077c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3133,9 +3133,12 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
* IA32_XSS[bit 8]. Guests have to use RDMSR/WRMSR rather than
* XSAVES/XRSTORS to save/restore PT MSRs.
*/
- if (data & ~supported_xss)
+ if (data & ~vcpu->arch.guest_supported_xss)
return 1;
- vcpu->arch.ia32_xss = data;
+ if (vcpu->arch.ia32_xss != data) {
+ vcpu->arch.ia32_xss = data;
+ kvm_update_cpuid_runtime(vcpu);
+ }
break;
case MSR_SMI_COUNT:
if (!msr_info->host_initiated)
--
2.26.2

2021-02-03 11:28:21

by Yang, Weijiang

Subject: [PATCH v15 07/14] KVM: VMX: Emulate reads and writes to CET MSRs

Add support for emulating read and write accesses to CET MSRs. CET MSRs
are universally "special" as they are either context switched via
dedicated VMCS fields or via XSAVES, i.e. no additional in-memory
tracking is needed, but emulated reads/writes are more expensive.

MSRs that are switched through XSAVES are especially annoying due to the
possibility of the kernel's FPU being used in IRQ context. Disable IRQs
and ensure the guest's FPU state is loaded when accessing such MSRs.

Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 105 +++++++++++++++++++++++++++++++++++++++++
arch/x86/kvm/x86.h | 5 ++
2 files changed, 110 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index cc60b1fc3ee7..694879c2b0b7 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1787,6 +1787,66 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
}
}

+static void vmx_get_xsave_msr(struct msr_data *msr_info)
+{
+ local_irq_disable();
+ if (test_thread_flag(TIF_NEED_FPU_LOAD))
+ switch_fpu_return();
+ rdmsrl(msr_info->index, msr_info->data);
+ local_irq_enable();
+}
+
+static void vmx_set_xsave_msr(struct msr_data *msr_info)
+{
+ local_irq_disable();
+ if (test_thread_flag(TIF_NEED_FPU_LOAD))
+ switch_fpu_return();
+ wrmsrl(msr_info->index, msr_info->data);
+ local_irq_enable();
+}
+
+static bool cet_is_ssp_msr_accessible(struct kvm_vcpu *vcpu,
+ struct msr_data *msr)
+{
+ u64 mask;
+
+ if (!kvm_cet_supported())
+ return false;
+
+ if (msr->host_initiated)
+ return true;
+
+ if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
+ return false;
+
+ if (msr->index == MSR_IA32_INT_SSP_TAB)
+ return true;
+
+ mask = (msr->index == MSR_IA32_PL3_SSP) ? XFEATURE_MASK_CET_USER :
+ XFEATURE_MASK_CET_KERNEL;
+ return !!(vcpu->arch.guest_supported_xss & mask);
+}
+
+static bool cet_is_control_msr_accessible(struct kvm_vcpu *vcpu,
+ struct msr_data *msr)
+{
+ u64 mask;
+
+ if (!kvm_cet_supported())
+ return false;
+
+ if (msr->host_initiated)
+ return true;
+
+ if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
+ !guest_cpuid_has(vcpu, X86_FEATURE_IBT))
+ return false;
+
+ mask = (msr->index == MSR_IA32_U_CET) ? XFEATURE_MASK_CET_USER :
+ XFEATURE_MASK_CET_KERNEL;
+ return !!(vcpu->arch.guest_supported_xss & mask);
+}
+
/*
* Reads an msr value (of 'msr_index') into 'pdata'.
* Returns 0 on success, non-0 otherwise.
@@ -1919,6 +1979,26 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
else
msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
break;
+ case MSR_IA32_S_CET:
+ if (!cet_is_control_msr_accessible(vcpu, msr_info))
+ return 1;
+ msr_info->data = vmcs_readl(GUEST_S_CET);
+ break;
+ case MSR_IA32_U_CET:
+ if (!cet_is_control_msr_accessible(vcpu, msr_info))
+ return 1;
+ vmx_get_xsave_msr(msr_info);
+ break;
+ case MSR_IA32_INT_SSP_TAB:
+ if (!cet_is_ssp_msr_accessible(vcpu, msr_info))
+ return 1;
+ msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
+ break;
+ case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
+ if (!cet_is_ssp_msr_accessible(vcpu, msr_info))
+ return 1;
+ vmx_get_xsave_msr(msr_info);
+ break;
case MSR_TSC_AUX:
if (!msr_info->host_initiated &&
!guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
@@ -2188,6 +2268,31 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
else
vmx->pt_desc.guest.addr_a[index / 2] = data;
break;
+ case MSR_IA32_S_CET:
+ case MSR_IA32_U_CET:
+ if (!cet_is_control_msr_accessible(vcpu, msr_info))
+ return 1;
+ if (data & GENMASK(9, 6))
+ return 1;
+ if (msr_index == MSR_IA32_S_CET)
+ vmcs_writel(GUEST_S_CET, data);
+ else
+ vmx_set_xsave_msr(msr_info);
+ break;
+ case MSR_IA32_INT_SSP_TAB:
+ if (!cet_is_control_msr_accessible(vcpu, msr_info))
+ return 1;
+ if (is_noncanonical_address(data, vcpu))
+ return 1;
+ vmcs_writel(GUEST_INTR_SSP_TABLE, data);
+ break;
+ case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
+ if (!cet_is_ssp_msr_accessible(vcpu, msr_info))
+ return 1;
+ if ((data & GENMASK(2, 0)) || is_noncanonical_address(data, vcpu))
+ return 1;
+ vmx_set_xsave_msr(msr_info);
+ break;
case MSR_TSC_AUX:
if (!msr_info->host_initiated &&
!guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index fd8c46da2030..16c661d94349 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -288,6 +288,11 @@ static inline bool kvm_mpx_supported(void)
== (XFEATURE_MASK_BNDREGS | XFEATURE_MASK_BNDCSR);
}

+static inline bool kvm_cet_supported(void)
+{
+ return supported_xss & XFEATURE_MASK_CET_USER;
+}
+
extern unsigned int min_timer_period_us;

extern bool enable_vmware_backdoor;
--
2.26.2

2021-02-03 11:28:41

by Yang, Weijiang

Subject: [PATCH v15 10/14] KVM: x86: Enable CET virtualization for VMX and advertise CET to userspace

Set the feature bits so that CET capabilities can be seen in the guest
via CPUID enumeration. Add CR4.CET bit support in order to allow the
guest to set the CET master control bit (CR4.CET).

Disable KVM CET feature if unrestricted_guest is unsupported/disabled as
KVM does not support emulating CET.

Don't expose the CET feature if the dependent CET bits are cleared in
host XSS, or if XSAVES isn't supported. Updating the CET features in
common x86 is a little ugly, but there is no clean solution without
risking breakage of SVM if SVM hardware ever gains support for CET,
e.g. moving everything to common x86 would prematurely expose CET on
SVM. The alternative is to put all the logic in VMX, but that means
rereading host_xss in VMX and duplicating the XSAVES check across VMX
and SVM.

Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 3 +-
arch/x86/kvm/cpuid.c | 5 +--
arch/x86/kvm/vmx/capabilities.h | 5 +++
arch/x86/kvm/vmx/vmx.c | 55 +++++++++++++++++++++++++++++++--
arch/x86/kvm/x86.c | 8 +++++
5 files changed, 70 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1734f872712d..3955e76dce96 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -100,7 +100,8 @@
| X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
| X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
| X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
- | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+ | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
+ | X86_CR4_CET))

#define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 6d7d9d59fd5b..46087bca9418 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -417,7 +417,8 @@ void kvm_set_cpu_caps(void)
F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
- F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/
+ F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
+ F(SHSTK)
);
/* Set LA57 based on hardware capability. */
if (cpuid_ecx(7) & F(LA57))
@@ -434,7 +435,7 @@ void kvm_set_cpu_caps(void)
F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) |
- F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16)
+ F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16) | F(IBT)
);

/* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */
diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 3a1861403d73..58cb57b08697 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -103,6 +103,11 @@ static inline bool cpu_has_load_perf_global_ctrl(void)
(vmcs_config.vmexit_ctrl & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL);
}

+static inline bool cpu_has_load_cet_ctrl(void)
+{
+ return (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_CET_STATE) &&
+ (vmcs_config.vmexit_ctrl & VM_EXIT_LOAD_CET_STATE);
+}
static inline bool cpu_has_vmx_mpx(void)
{
return (vmcs_config.vmexit_ctrl & VM_EXIT_CLEAR_BNDCFGS) &&
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4cd6a9710a51..c2242fc1f71a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2278,7 +2278,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_IA32_U_CET:
if (!cet_is_control_msr_accessible(vcpu, msr_info))
return 1;
- if (data & GENMASK(9, 6))
+ if ((data & GENMASK(9, 6)) || is_noncanonical_address(data, vcpu))
return 1;
if (msr_index == MSR_IA32_S_CET)
vmcs_writel(GUEST_S_CET, data);
@@ -2593,7 +2593,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
VM_EXIT_LOAD_IA32_EFER |
VM_EXIT_CLEAR_BNDCFGS |
VM_EXIT_PT_CONCEAL_PIP |
- VM_EXIT_CLEAR_IA32_RTIT_CTL;
+ VM_EXIT_CLEAR_IA32_RTIT_CTL |
+ VM_EXIT_LOAD_CET_STATE;
if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
&_vmexit_control) < 0)
return -EIO;
@@ -2617,7 +2618,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
VM_ENTRY_LOAD_IA32_EFER |
VM_ENTRY_LOAD_BNDCFGS |
VM_ENTRY_PT_CONCEAL_PIP |
- VM_ENTRY_LOAD_IA32_RTIT_CTL;
+ VM_ENTRY_LOAD_IA32_RTIT_CTL |
+ VM_ENTRY_LOAD_CET_STATE;
if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
&_vmentry_control) < 0)
return -EIO;
@@ -2645,6 +2647,15 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
}
}

+ /*
+ * The CET entry and exit controls need to be synchronized, e.g. to
+ * avoid loading guest state but not restoring host state.
+ */
+ if (!(_vmentry_control & VM_ENTRY_LOAD_CET_STATE) ||
+ !(_vmexit_control & VM_EXIT_LOAD_CET_STATE)) {
+ _vmentry_control &= ~VM_ENTRY_LOAD_CET_STATE;
+ _vmexit_control &= ~VM_EXIT_LOAD_CET_STATE;
+ }

rdmsr(MSR_IA32_VMX_BASIC, vmx_msr_low, vmx_msr_high);

@@ -5943,6 +5954,12 @@ void dump_vmcs(void)
pr_err("InterruptStatus = %04x\n",
vmcs_read16(GUEST_INTR_STATUS));

+ if (vmentry_ctl & VM_ENTRY_LOAD_CET_STATE) {
+ pr_err("S_CET = 0x%016lx\n", vmcs_readl(GUEST_S_CET));
+ pr_err("SSP = 0x%016lx\n", vmcs_readl(GUEST_SSP));
+ pr_err("SSP TABLE = 0x%016lx\n",
+ vmcs_readl(GUEST_INTR_SSP_TABLE));
+ }
pr_err("*** Host State ***\n");
pr_err("RIP = 0x%016lx RSP = 0x%016lx\n",
vmcs_readl(HOST_RIP), vmcs_readl(HOST_RSP));
@@ -6017,6 +6034,12 @@ void dump_vmcs(void)
if (secondary_exec_control & SECONDARY_EXEC_ENABLE_VPID)
pr_err("Virtual processor ID = 0x%04x\n",
vmcs_read16(VIRTUAL_PROCESSOR_ID));
+ if (vmexit_ctl & VM_EXIT_LOAD_CET_STATE) {
+ pr_err("S_CET = 0x%016lx\n", vmcs_readl(HOST_S_CET));
+ pr_err("SSP = 0x%016lx\n", vmcs_readl(HOST_SSP));
+ pr_err("SSP TABLE = 0x%016lx\n",
+ vmcs_readl(HOST_INTR_SSP_TABLE));
+ }
}

/*
@@ -7395,6 +7418,15 @@ static __init void vmx_set_cpu_caps(void)

if (cpu_has_vmx_waitpkg())
kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);
+
+ if (!cpu_has_load_cet_ctrl() || !enable_unrestricted_guest) {
+ kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
+ kvm_cpu_cap_clear(X86_FEATURE_IBT);
+ } else if (kvm_cpu_cap_has(X86_FEATURE_SHSTK) ||
+ kvm_cpu_cap_has(X86_FEATURE_IBT)) {
+ supported_xss |= XFEATURE_MASK_CET_USER |
+ XFEATURE_MASK_CET_KERNEL;
+ }
}

static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu)
@@ -7833,6 +7865,8 @@ static __init int hardware_setup(void)
unsigned long host_bndcfgs;
struct desc_ptr dt;
int r, i, ept_lpage_level;
+ u64 cet_msr;
+ bool accessible;

store_idt(&dt);
host_idt_base = dt.address;
@@ -7846,6 +7880,21 @@ static __init int hardware_setup(void)
if (boot_cpu_has(X86_FEATURE_NX))
kvm_enable_efer_bits(EFER_NX);

+ accessible = (supported_xss & XFEATURE_MASK_CET_KERNEL) &&
+ (boot_cpu_has(X86_FEATURE_IBT) ||
+ boot_cpu_has(X86_FEATURE_SHSTK));
+ if (accessible) {
+ rdmsrl(MSR_IA32_S_CET, cet_msr);
+ WARN_ONCE(cet_msr, "KVM: CET S_CET in host will be lost!\n");
+ }
+
+ accessible = (supported_xss & XFEATURE_MASK_CET_KERNEL) &&
+ boot_cpu_has(X86_FEATURE_SHSTK);
+ if (accessible) {
+ rdmsrl(MSR_IA32_PL0_SSP, cet_msr);
+ WARN_ONCE(cet_msr, "KVM: CET PL0_SSP in host will be lost!\n");
+ }
+
if (boot_cpu_has(X86_FEATURE_MPX)) {
rdmsrl(MSR_IA32_BNDCFGS, host_bndcfgs);
WARN_ONCE(host_bndcfgs, "KVM: BNDCFGS in host will be lost");
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 059e101daf94..22eb6b8626a8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10329,6 +10329,14 @@ int kvm_arch_hardware_setup(void *opaque)

if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
supported_xss = 0;
+ else
+ supported_xss &= host_xss;
+
+ /* Update CET features now that supported_xss is finalized. */
+ if (!kvm_cet_supported()) {
+ kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
+ kvm_cpu_cap_clear(X86_FEATURE_IBT);
+ }

#define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_);
--
2.26.2

2021-02-03 11:28:56

by Yang, Weijiang

Subject: [PATCH v15 09/14] KVM: x86: Report CET MSRs as to-be-saved if CET is supported

Report all CET MSRs, including the synthetic GUEST_SSP MSR, as
to-be-saved, e.g. for migration, if CET is supported by KVM.

Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/x86.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6af240d87a33..059e101daf94 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1266,6 +1266,8 @@ static const u32 msrs_to_save_all[] = {
MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17,

MSR_IA32_XSS,
+ MSR_IA32_U_CET, MSR_IA32_S_CET, MSR_IA32_INT_SSP_TAB, MSR_KVM_GUEST_SSP,
+ MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP, MSR_IA32_PL2_SSP, MSR_IA32_PL3_SSP,
};

static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_all)];
@@ -5828,6 +5830,14 @@ static void kvm_init_msr_list(void)
if (!supported_xss)
continue;
break;
+ case MSR_KVM_GUEST_SSP:
+ case MSR_IA32_U_CET:
+ case MSR_IA32_S_CET:
+ case MSR_IA32_INT_SSP_TAB:
+ case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
+ if (!kvm_cet_supported())
+ continue;
+ break;
default:
break;
}
--
2.26.2

2021-02-03 11:29:24

by Yang, Weijiang

Subject: [PATCH v15 05/14] KVM: VMX: Introduce CET VMCS fields and flags

CET (Control-flow Enforcement Technology) is a CPU feature used to prevent
Return/Jump-Oriented Programming (ROP/JOP) attacks. CET introduces a new
exception type, Control Protection (#CP), and two sub-features to defend
against ROP/JOP style control-flow subversion attacks:

Shadow Stack (SHSTK):
A shadow stack is a second stack used exclusively for control transfer
operations. The shadow stack is separate from the data/normal stack and
can be enabled individually in user and kernel mode. When shadow stacks
are enabled, CALL pushes the return address on both the data and shadow
stack. RET pops the return address from both stacks and compares them.
If the return addresses from the two stacks do not match, the processor
signals a #CP.

Indirect Branch Tracking (IBT):
IBT adds a new instruction, ENDBRANCH, that is used to mark valid target
addresses of indirect branches (CALL, JMP, ENCLU[EEXIT], etc...). If an
indirect branch is executed and the next instruction is _not_ an
ENDBRANCH, the processor signals a #CP.

Several new CET MSRs are defined to support CET:
MSR_IA32_{U,S}_CET: Controls the CET settings for user mode and kernel
mode respectively.

MSR_IA32_PL{0,1,2,3}_SSP: Stores shadow stack pointers for CPL-0,1,2,3
protection respectively.

MSR_IA32_INT_SSP_TAB: Stores base address of shadow stack pointer table.

Two XSAVES state bits are introduced for CET:
IA32_XSS:[bit 11]: Control saving/restoring user mode CET states
IA32_XSS:[bit 12]: Control saving/restoring kernel mode CET states.

Six VMCS fields are introduced for CET:
{HOST,GUEST}_S_CET: Stores CET settings for kernel mode.
{HOST,GUEST}_SSP: Stores shadow stack pointer of current task/thread.
{HOST,GUEST}_INTR_SSP_TABLE: Stores base address of shadow stack pointer
table.

If VM_EXIT_LOAD_HOST_CET_STATE = 1, the host CET states are restored from
the following VMCS fields at VM-Exit:
HOST_S_CET
HOST_SSP
HOST_INTR_SSP_TABLE

If VM_ENTRY_LOAD_GUEST_CET_STATE = 1, the guest CET states are loaded from
the following VMCS fields at VM-Entry:
GUEST_S_CET
GUEST_SSP
GUEST_INTR_SSP_TABLE

Co-developed-by: Zhang Yi Z <[email protected]>
Signed-off-by: Zhang Yi Z <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/include/asm/vmx.h | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 38ca445a8429..54e996527a3a 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -94,6 +94,7 @@
#define VM_EXIT_CLEAR_BNDCFGS 0x00800000
#define VM_EXIT_PT_CONCEAL_PIP 0x01000000
#define VM_EXIT_CLEAR_IA32_RTIT_CTL 0x02000000
+#define VM_EXIT_LOAD_CET_STATE 0x10000000

#define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR 0x00036dff

@@ -107,6 +108,7 @@
#define VM_ENTRY_LOAD_BNDCFGS 0x00010000
#define VM_ENTRY_PT_CONCEAL_PIP 0x00020000
#define VM_ENTRY_LOAD_IA32_RTIT_CTL 0x00040000
+#define VM_ENTRY_LOAD_CET_STATE 0x00100000

#define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR 0x000011ff

@@ -329,6 +331,9 @@ enum vmcs_field {
GUEST_PENDING_DBG_EXCEPTIONS = 0x00006822,
GUEST_SYSENTER_ESP = 0x00006824,
GUEST_SYSENTER_EIP = 0x00006826,
+ GUEST_S_CET = 0x00006828,
+ GUEST_SSP = 0x0000682a,
+ GUEST_INTR_SSP_TABLE = 0x0000682c,
HOST_CR0 = 0x00006c00,
HOST_CR3 = 0x00006c02,
HOST_CR4 = 0x00006c04,
@@ -341,6 +346,9 @@ enum vmcs_field {
HOST_IA32_SYSENTER_EIP = 0x00006c12,
HOST_RSP = 0x00006c14,
HOST_RIP = 0x00006c16,
+ HOST_S_CET = 0x00006c18,
+ HOST_SSP = 0x00006c1a,
+ HOST_INTR_SSP_TABLE = 0x00006c1c
};

/*
--
2.26.2

2021-02-03 11:29:25

by Yang, Weijiang

Subject: [PATCH v15 14/14] KVM: x86: Save/Restore GUEST_SSP to/from SMRAM

Save GUEST_SSP to SMRAM when the guest exits to SMM due to an SMI, and
restore it when the guest exits SMM back to the interrupted normal
non-root mode.

Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/emulate.c | 11 +++++++++++
arch/x86/kvm/x86.c | 10 ++++++++++
2 files changed, 21 insertions(+)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 56cae1ff9e3f..6d4a3181d8bd 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2615,6 +2615,17 @@ static int rsm_load_state_64(struct x86_emulate_ctxt *ctxt,
return r;
}

+ if (kvm_cet_supported()) {
+ struct msr_data msr;
+
+ val = GET_SMSTATE(u64, smstate, 0x7ec8);
+ msr.index = MSR_KVM_GUEST_SSP;
+ msr.host_initiated = true;
+ msr.data = val;
+ /* Mimic host_initiated access to bypass ssp access check. */
+ kvm_x86_ops.set_msr(ctxt->vcpu, &msr);
+ }
+
return X86EMUL_CONTINUE;
}
#endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 22eb6b8626a8..f63b713cd71f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8592,6 +8592,16 @@ static void enter_smm_save_state_64(struct kvm_vcpu *vcpu, char *buf)

for (i = 0; i < 6; i++)
enter_smm_save_seg_64(vcpu, buf, i);
+
+ if (kvm_cet_supported()) {
+ struct msr_data msr;
+
+ msr.index = MSR_KVM_GUEST_SSP;
+ msr.host_initiated = true;
+ /* GUEST_SSP is stored in VMCS at vm-exit. */
+ kvm_x86_ops.get_msr(vcpu, &msr);
+ put_smstate(u64, buf, 0x7ec8, msr.data);
+ }
}
#endif

--
2.26.2

2021-02-03 11:29:26

by Yang, Weijiang

Subject: [PATCH v15 08/14] KVM: VMX: Add a synthetic MSR to allow userspace VMM to access GUEST_SSP

Introduce a host-only synthetic MSR, MSR_KVM_GUEST_SSP, so that the VMM
can read/write the guest's SSP, e.g. to migrate CET state. Use a
synthetic MSR, as opposed to e.g. a VCPU_REG_, since GUEST_SSP is subject
to the same consistency checks as the PL*_SSP MSRs and can thus share code.

Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/include/uapi/asm/kvm_para.h | 1 +
arch/x86/kvm/vmx/vmx.c | 14 ++++++++++++--
2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 950afebfba88..8dd562a94724 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -54,6 +54,7 @@
#define MSR_KVM_POLL_CONTROL 0x4b564d05
#define MSR_KVM_ASYNC_PF_INT 0x4b564d06
#define MSR_KVM_ASYNC_PF_ACK 0x4b564d07
+#define MSR_KVM_GUEST_SSP 0x4b564d08

struct kvm_steal_time {
__u64 steal;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 694879c2b0b7..4cd6a9710a51 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1816,7 +1816,8 @@ static bool cet_is_ssp_msr_accessible(struct kvm_vcpu *vcpu,
if (msr->host_initiated)
return true;

- if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
+ if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
+ msr->index == MSR_KVM_GUEST_SSP)
return false;

if (msr->index == MSR_IA32_INT_SSP_TAB)
@@ -1994,6 +1995,11 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
return 1;
msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
break;
+ case MSR_KVM_GUEST_SSP:
+ if (!cet_is_ssp_msr_accessible(vcpu, msr_info))
+ return 1;
+ msr_info->data = vmcs_readl(GUEST_SSP);
+ break;
case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
if (!cet_is_ssp_msr_accessible(vcpu, msr_info))
return 1;
@@ -2286,12 +2292,16 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
return 1;
vmcs_writel(GUEST_INTR_SSP_TABLE, data);
break;
+ case MSR_KVM_GUEST_SSP:
case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
if (!cet_is_ssp_msr_accessible(vcpu, msr_info))
return 1;
if ((data & GENMASK(2, 0)) || is_noncanonical_address(data, vcpu))
return 1;
- vmx_set_xsave_msr(msr_info);
+ if (msr_index == MSR_KVM_GUEST_SSP)
+ vmcs_writel(GUEST_SSP, data);
+ else
+ vmx_set_xsave_msr(msr_info);
break;
case MSR_TSC_AUX:
if (!msr_info->host_initiated &&
--
2.26.2

2021-02-03 11:29:42

by Yang, Weijiang

Subject: [PATCH v15 12/14] KVM: nVMX: Add helper to check the vmcs01 MSR bitmap for MSR pass-through

Add a helper to perform the check on the vmcs01/l01 MSR bitmap when
disabling interception of an MSR for L2. This reduces the boilerplate
for the existing cases, and will be used heavily in a future patch for
CET MSRs.

Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/vmx/nested.c | 27 +++++++++++++++++----------
1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index f2b9bfb58206..3b405ebabb6e 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -544,6 +544,17 @@ static void nested_vmx_disable_intercept_for_msr(unsigned long *msr_bitmap_l1,
}
}

+static void nested_vmx_cond_disable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
+ unsigned long *bitmap_12,
+ unsigned long *bitmap_02,
+ int type)
+{
+ if (msr_write_intercepted_l01(vcpu, msr))
+ return;
+
+ nested_vmx_disable_intercept_for_msr(bitmap_12, bitmap_02, msr, type);
+}
+
static inline void enable_x2apic_msr_intercepts(unsigned long *msr_bitmap)
{
int msr;
@@ -640,17 +651,13 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
* updated to reflect this when L1 (or its L2s) actually write to
* the MSR.
*/
- if (!msr_write_intercepted_l01(vcpu, MSR_IA32_SPEC_CTRL))
- nested_vmx_disable_intercept_for_msr(
- msr_bitmap_l1, msr_bitmap_l0,
- MSR_IA32_SPEC_CTRL,
- MSR_TYPE_R | MSR_TYPE_W);
+ nested_vmx_cond_disable_intercept_for_msr(vcpu, MSR_IA32_SPEC_CTRL,
+ msr_bitmap_l1, msr_bitmap_l0,
+ MSR_TYPE_R | MSR_TYPE_W);

- if (!msr_write_intercepted_l01(vcpu, MSR_IA32_PRED_CMD))
- nested_vmx_disable_intercept_for_msr(
- msr_bitmap_l1, msr_bitmap_l0,
- MSR_IA32_PRED_CMD,
- MSR_TYPE_W);
+ nested_vmx_cond_disable_intercept_for_msr(vcpu, MSR_IA32_PRED_CMD,
+ msr_bitmap_l1, msr_bitmap_l0,
+ MSR_TYPE_W);

kvm_vcpu_unmap(vcpu, &to_vmx(vcpu)->nested.msr_bitmap_map, false);

--
2.26.2

2021-02-03 11:29:46

by Yang, Weijiang

Subject: [PATCH v15 13/14] KVM: nVMX: Enable CET support for nested VMX

Add vmcs12 fields for all CET state, pass through CET MSRs to L2 when
possible, and enumerate the VMCS controls and CR4 bit as supported.

Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/vmx/nested.c | 30 ++++++++++++++++++++++++++++--
arch/x86/kvm/vmx/vmcs12.c | 6 ++++++
arch/x86/kvm/vmx/vmcs12.h | 14 +++++++++++++-
arch/x86/kvm/vmx/vmx.c | 1 +
4 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 3b405ebabb6e..9728efd529a1 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -638,6 +638,29 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
nested_vmx_disable_intercept_for_msr(msr_bitmap_l1, msr_bitmap_l0,
MSR_KERNEL_GS_BASE, MSR_TYPE_RW);

+ /* Pass CET MSRs to nested VM if L0 and L1 are set to pass-through. */
+ nested_vmx_cond_disable_intercept_for_msr(vcpu, MSR_IA32_U_CET,
+ msr_bitmap_l1, msr_bitmap_l0,
+ MSR_TYPE_RW);
+ nested_vmx_cond_disable_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP,
+ msr_bitmap_l1, msr_bitmap_l0,
+ MSR_TYPE_RW);
+ nested_vmx_cond_disable_intercept_for_msr(vcpu, MSR_IA32_S_CET,
+ msr_bitmap_l1, msr_bitmap_l0,
+ MSR_TYPE_RW);
+ nested_vmx_cond_disable_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP,
+ msr_bitmap_l1, msr_bitmap_l0,
+ MSR_TYPE_RW);
+ nested_vmx_cond_disable_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP,
+ msr_bitmap_l1, msr_bitmap_l0,
+ MSR_TYPE_RW);
+ nested_vmx_cond_disable_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP,
+ msr_bitmap_l1, msr_bitmap_l0,
+ MSR_TYPE_RW);
+ nested_vmx_cond_disable_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB,
+ msr_bitmap_l1, msr_bitmap_l0,
+ MSR_TYPE_RW);
+
/*
* Checking the L0->L1 bitmap is trying to verify two things:
*
@@ -6386,7 +6409,9 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
VM_EXIT_HOST_ADDR_SPACE_SIZE |
#endif
VM_EXIT_LOAD_IA32_PAT | VM_EXIT_SAVE_IA32_PAT |
- VM_EXIT_CLEAR_BNDCFGS | VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+ VM_EXIT_CLEAR_BNDCFGS | VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
+ VM_EXIT_LOAD_CET_STATE;
+
msrs->exit_ctls_high |=
VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR |
VM_EXIT_LOAD_IA32_EFER | VM_EXIT_SAVE_IA32_EFER |
@@ -6406,7 +6431,8 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
VM_ENTRY_IA32E_MODE |
#endif
VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS |
- VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+ VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL | VM_ENTRY_LOAD_CET_STATE;
+
msrs->entry_ctls_high |=
(VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR | VM_ENTRY_LOAD_IA32_EFER);

diff --git a/arch/x86/kvm/vmx/vmcs12.c b/arch/x86/kvm/vmx/vmcs12.c
index c8e51c004f78..8fd8e0beecef 100644
--- a/arch/x86/kvm/vmx/vmcs12.c
+++ b/arch/x86/kvm/vmx/vmcs12.c
@@ -137,6 +137,9 @@ const unsigned short vmcs_field_to_offset_table[] = {
FIELD(GUEST_PENDING_DBG_EXCEPTIONS, guest_pending_dbg_exceptions),
FIELD(GUEST_SYSENTER_ESP, guest_sysenter_esp),
FIELD(GUEST_SYSENTER_EIP, guest_sysenter_eip),
+ FIELD(GUEST_S_CET, guest_s_cet),
+ FIELD(GUEST_SSP, guest_ssp),
+ FIELD(GUEST_INTR_SSP_TABLE, guest_ssp_tbl),
FIELD(HOST_CR0, host_cr0),
FIELD(HOST_CR3, host_cr3),
FIELD(HOST_CR4, host_cr4),
@@ -149,5 +152,8 @@ const unsigned short vmcs_field_to_offset_table[] = {
FIELD(HOST_IA32_SYSENTER_EIP, host_ia32_sysenter_eip),
FIELD(HOST_RSP, host_rsp),
FIELD(HOST_RIP, host_rip),
+ FIELD(HOST_S_CET, host_s_cet),
+ FIELD(HOST_SSP, host_ssp),
+ FIELD(HOST_INTR_SSP_TABLE, host_ssp_tbl),
};
const unsigned int nr_vmcs12_fields = ARRAY_SIZE(vmcs_field_to_offset_table);
diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
index 80232daf00ff..016896c9e701 100644
--- a/arch/x86/kvm/vmx/vmcs12.h
+++ b/arch/x86/kvm/vmx/vmcs12.h
@@ -115,7 +115,13 @@ struct __packed vmcs12 {
natural_width host_ia32_sysenter_eip;
natural_width host_rsp;
natural_width host_rip;
- natural_width paddingl[8]; /* room for future expansion */
+ natural_width host_s_cet;
+ natural_width host_ssp;
+ natural_width host_ssp_tbl;
+ natural_width guest_s_cet;
+ natural_width guest_ssp;
+ natural_width guest_ssp_tbl;
+ natural_width paddingl[2]; /* room for future expansion */
u32 pin_based_vm_exec_control;
u32 cpu_based_vm_exec_control;
u32 exception_bitmap;
@@ -295,6 +301,12 @@ static inline void vmx_check_vmcs12_offsets(void)
CHECK_OFFSET(host_ia32_sysenter_eip, 656);
CHECK_OFFSET(host_rsp, 664);
CHECK_OFFSET(host_rip, 672);
+ CHECK_OFFSET(host_s_cet, 680);
+ CHECK_OFFSET(host_ssp, 688);
+ CHECK_OFFSET(host_ssp_tbl, 696);
+ CHECK_OFFSET(guest_s_cet, 704);
+ CHECK_OFFSET(guest_ssp, 712);
+ CHECK_OFFSET(guest_ssp_tbl, 720);
CHECK_OFFSET(pin_based_vm_exec_control, 744);
CHECK_OFFSET(cpu_based_vm_exec_control, 748);
CHECK_OFFSET(exception_bitmap, 752);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index b6657117191b..5856c5b81084 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7257,6 +7257,7 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
cr4_fixed1_update(X86_CR4_PKE, ecx, feature_bit(PKU));
cr4_fixed1_update(X86_CR4_UMIP, ecx, feature_bit(UMIP));
cr4_fixed1_update(X86_CR4_LA57, ecx, feature_bit(LA57));
+ cr4_fixed1_update(X86_CR4_CET, ecx, feature_bit(SHSTK));

#undef cr4_fixed1_update
}
--
2.26.2

2021-02-03 11:30:39

by Yang, Weijiang

Subject: [PATCH v15 11/14] KVM: VMX: Pass through CET MSRs to the guest when supported

Pass through all CET MSRs when the associated CET component (kernel vs.
user) is enabled to improve guest performance. All CET MSRs are context
switched, either via dedicated VMCS fields or XSAVES.

Co-developed-by: Zhang Yi Z <[email protected]>
Signed-off-by: Zhang Yi Z <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 33 +++++++++++++++++++++++++++++++++
1 file changed, 33 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c2242fc1f71a..b6657117191b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -659,6 +659,10 @@ static bool is_valid_passthrough_msr(u32 msr)
case MSR_IA32_RTIT_ADDR0_A ... MSR_IA32_RTIT_ADDR3_B:
/* PT MSRs. These are handled in pt_update_intercept_for_msr() */
return true;
+ case MSR_IA32_U_CET:
+ case MSR_IA32_S_CET:
+ case MSR_IA32_PL0_SSP ... MSR_IA32_INT_SSP_TAB:
+ return true;
}

r = possible_passthrough_msr_slot(msr) != -ENOENT;
@@ -7343,6 +7347,32 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4));
}

+static bool is_cet_state_supported(struct kvm_vcpu *vcpu, u32 xss_state)
+{
+ return (vcpu->arch.guest_supported_xss & xss_state) &&
+ (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
+ guest_cpuid_has(vcpu, X86_FEATURE_IBT));
+}
+
+static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu)
+{
+ bool incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER);
+
+ vmx_set_intercept_for_msr(vcpu, MSR_IA32_U_CET, MSR_TYPE_RW, incpt);
+
+ incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
+ vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, MSR_TYPE_RW, incpt);
+
+ incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_KERNEL);
+ vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, MSR_TYPE_RW, incpt);
+
+ incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
+ vmx_set_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW, incpt);
+ vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP, MSR_TYPE_RW, incpt);
+ vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP, MSR_TYPE_RW, incpt);
+ vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP, MSR_TYPE_RW, incpt);
+}
+
static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -7386,6 +7416,9 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)

/* Refresh #PF interception to account for MAXPHYADDR changes. */
update_exception_bitmap(vcpu);
+
+ if (kvm_cet_supported())
+ vmx_update_intercept_for_cet_msr(vcpu);
}

static __init void vmx_set_cpu_caps(void)
--
2.26.2

2021-02-03 12:01:53

by Paolo Bonzini

Subject: Re: [PATCH v15 07/14] KVM: VMX: Emulate reads and writes to CET MSRs

On 03/02/21 12:34, Yang Weijiang wrote:
> MSRs that are switched through XSAVES are especially annoying due to
> the possibility of the kernel's FPU being used in IRQ context. Disable
> IRQs and ensure the guest's FPU state is loaded when accessing such MSRs.

Good catch! This should be in x86.h and named kvm_get/set_xsave_msr
because it's not VMX specific. The commit message should also be there
as a comment.

In addition,

> + case MSR_IA32_S_CET:
> + if (!cet_is_control_msr_accessible(vcpu, msr_info))
> + return 1;
> + msr_info->data = vmcs_readl(GUEST_S_CET);
> + break;
> + case MSR_IA32_U_CET:
> + if (!cet_is_control_msr_accessible(vcpu, msr_info))
> + return 1;
> + vmx_get_xsave_msr(msr_info);
> + break;

these two might as well be the same "case" for symmetry with the
handling of WRMSR.

I've fixed this up locally, since these patches will not be pushed to
Linus until the corresponding bare metal support is there.

Paolo

2021-02-03 12:10:55

by Paolo Bonzini

Subject: Re: [PATCH v15 14/14] KVM: x86: Save/Restore GUEST_SSP to/from SMRAM

On 03/02/21 12:34, Yang Weijiang wrote:
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 22eb6b8626a8..f63b713cd71f 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8592,6 +8592,16 @@ static void enter_smm_save_state_64(struct kvm_vcpu *vcpu, char *buf)
>
> for (i = 0; i < 6; i++)
> enter_smm_save_seg_64(vcpu, buf, i);
> +
> + if (kvm_cet_supported()) {
> + struct msr_data msr;
> +
> + msr.index = MSR_KVM_GUEST_SSP;
> + msr.host_initiated = true;
> + /* GUEST_SSP is stored in VMCS at vm-exit. */
> + kvm_x86_ops.get_msr(vcpu, &msr);
> + put_smstate(u64, buf, 0x7ec8, msr.data);
> + }
> }
> #endif
>
>

0x7ec8 is used for I/O instruction restart and auto-halt restart.
0x7f08 is a free spot. We should really document the KVM state save
area format.

Paolo

2021-02-03 12:41:23

by Yang, Weijiang

Subject: Re: [PATCH v15 07/14] KVM: VMX: Emulate reads and writes to CET MSRs

On Wed, Feb 03, 2021 at 12:57:41PM +0100, Paolo Bonzini wrote:
> On 03/02/21 12:34, Yang Weijiang wrote:
> > MSRs that are switched through XSAVES are especially annoying due to the
> > possibility of the kernel's FPU being used in IRQ context. Disable IRQs
> > and ensure the guest's FPU state is loaded when accessing such MSRs.
>
> Good catch! This should be in x86.h and named kvm_get/set_xsave_msr because
> it's not VMX specific. The commit message should also be there as a
> comment.
Thanks a lot for reviewing! These words came from Sean! :-D
>
> In addition,
>
> > + case MSR_IA32_S_CET:
> > + if (!cet_is_control_msr_accessible(vcpu, msr_info))
> > + return 1;
> > + msr_info->data = vmcs_readl(GUEST_S_CET);
> > + break;
> > + case MSR_IA32_U_CET:
> > + if (!cet_is_control_msr_accessible(vcpu, msr_info))
> > + return 1;
> > + vmx_get_xsave_msr(msr_info);
> > + break;
>
> these two might as well be the same "case" for symmetry with the handling of
> WRMSR.
>
> I've fixed this up locally, since these patches will not be pushed to Linus
> until the corresponding bare metal support is there.
Got it!
>
> Paolo

2021-02-03 12:43:28

by Paolo Bonzini

Subject: Re: [PATCH v15 00/14] Introduce support for guest CET feature

On 03/02/21 12:34, Yang Weijiang wrote:
> Control-flow Enforcement Technology (CET) provides protection against
> Return/Jump-Oriented Programming (ROP/JOP) attack. There're two CET
> subfeatures: Shadow Stack (SHSTK) and Indirect Branch Tracking (IBT).
> SHSTK is to prevent ROP and IBT is to prevent JOP.
>
> Several parts in KVM have been updated to provide guest CET support, including:
> CPUID/XSAVES settings, MSR passthrough, user-space MSR access interface,
> vmentry/vmexit config, nested VM etc. These patches are dependent on CET
> kernel patches for XSAVES support and CET definitions, e.g., MSR and related
> feature flags.
>
> CET kernel patches: refer to [1], [2].
>
> Previous CET KVM patches: refer to [3].
>
> CET QEMU patches: refer to [4].
>
> CET KVM unit-test patch: refer to [5].
>
> [1]: CET Shadow Stack patches v18:
> https://lkml.kernel.org/linux-api/[email protected]/
>
> [2]: Indirect Branch Tracking patches v18:
> https://lkml.kernel.org/linux-api/[email protected]/
>
> [3]: CET KVM patches v14:
> https://lkml.kernel.org/kvm/[email protected]/
>
> [4]: CET QEMU patches:
> https://patchwork.ozlabs.org/project/qemu-devel/patch/[email protected]/
>
> [5]: CET KVM unit-test patch:
> https://patchwork.kernel.org/project/kvm/patch/[email protected]/
>
> Changes in v15:
> - Changed patches per Paolo's review feedback on v14.
> - Added a new patch for GUEST_SSP save/restore in guest SMM case.
> - Fixed guest call-trace issue due to CET MSR interception.
> - Removed unnecessary guest CET state cleanup in VMCS.
> - Rebased patches to 5.11-rc6.
>
>
> Sean Christopherson (2):
> KVM: x86: Report XSS as an MSR to be saved if there are supported
> features
> KVM: x86: Load guest fpu state when accessing MSRs managed by XSAVES
>
> Yang Weijiang (12):
> KVM: x86: Refresh CPUID on writes to MSR_IA32_XSS
> KVM: x86: Add #CP support in guest exception dispatch
> KVM: VMX: Introduce CET VMCS fields and flags
> KVM: x86: Add fault checks for CR4.CET
> KVM: VMX: Emulate reads and writes to CET MSRs
> KVM: VMX: Add a synthetic MSR to allow userspace VMM to access
> GUEST_SSP
> KVM: x86: Report CET MSRs as to-be-saved if CET is supported
> KVM: x86: Enable CET virtualization for VMX and advertise CET to
> userspace
> KVM: VMX: Pass through CET MSRs to the guest when supported
> KVM: nVMX: Add helper to check the vmcs01 MSR bitmap for MSR
> pass-through
> KVM: nVMX: Enable CET support for nested VMX
> KVM: x86: Save/Restore GUEST_SSP to/from SMRAM
>
> arch/x86/include/asm/kvm_host.h | 4 +-
> arch/x86/include/asm/vmx.h | 8 ++
> arch/x86/include/uapi/asm/kvm.h | 1 +
> arch/x86/include/uapi/asm/kvm_para.h | 1 +
> arch/x86/kvm/cpuid.c | 26 +++-
> arch/x86/kvm/emulate.c | 11 ++
> arch/x86/kvm/vmx/capabilities.h | 5 +
> arch/x86/kvm/vmx/nested.c | 57 ++++++--
> arch/x86/kvm/vmx/vmcs12.c | 6 +
> arch/x86/kvm/vmx/vmcs12.h | 14 +-
> arch/x86/kvm/vmx/vmx.c | 202 ++++++++++++++++++++++++++-
> arch/x86/kvm/x86.c | 67 ++++++++-
> arch/x86/kvm/x86.h | 10 +-
> 13 files changed, 387 insertions(+), 25 deletions(-)
>

Queued, though not for 5.12 unless the bare metal support is there too.

Paolo

2021-02-03 12:55:27

by Yang, Weijiang

Subject: Re: [PATCH v15 14/14] KVM: x86: Save/Restore GUEST_SSP to/from SMRAM

On Wed, Feb 03, 2021 at 01:07:53PM +0100, Paolo Bonzini wrote:
> On 03/02/21 12:34, Yang Weijiang wrote:
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 22eb6b8626a8..f63b713cd71f 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -8592,6 +8592,16 @@ static void enter_smm_save_state_64(struct kvm_vcpu *vcpu, char *buf)
> > for (i = 0; i < 6; i++)
> > enter_smm_save_seg_64(vcpu, buf, i);
> > +
> > + if (kvm_cet_supported()) {
> > + struct msr_data msr;
> > +
> > + msr.index = MSR_KVM_GUEST_SSP;
> > + msr.host_initiated = true;
> > + /* GUEST_SSP is stored in VMCS at vm-exit. */
> > + kvm_x86_ops.get_msr(vcpu, &msr);
> > + put_smstate(u64, buf, 0x7ec8, msr.data);
> > + }
> > }
> > #endif
> >
>
> 0x7ec8 is used for I/O instruction restart and auto-halt restart. 0x7f08 is
> a free spot. We should really document the KVM state save area format.
Thanks for catching the documentation error!
>
> Paolo

2021-02-03 21:50:24

by Sean Christopherson

Subject: Re: [PATCH v15 04/14] KVM: x86: Add #CP support in guest exception dispatch

On Wed, Feb 03, 2021, Yang Weijiang wrote:
> Add handling for Control Protection (#CP) exceptions, vector 21, which is
> introduced by Intel's Control-Flow Enforcement Technology (CET) and raised
> on a CET violation. See Intel's SDM for details.
>
> Signed-off-by: Yang Weijiang <[email protected]>
> ---
> arch/x86/include/uapi/asm/kvm.h | 1 +
> arch/x86/kvm/x86.c | 1 +
> arch/x86/kvm/x86.h | 2 +-
> 3 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index 8e76d3701db3..507263d1d0b2 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -32,6 +32,7 @@
> #define MC_VECTOR 18
> #define XM_VECTOR 19
> #define VE_VECTOR 20
> +#define CP_VECTOR 21
>
> /* Select x86 specific features in <linux/kvm.h> */
> #define __KVM_HAVE_PIT
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 99f787152d12..d9d3bae40a8c 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -436,6 +436,7 @@ static int exception_class(int vector)
> case NP_VECTOR:
> case SS_VECTOR:
> case GP_VECTOR:
> + case CP_VECTOR:
> return EXCPT_CONTRIBUTORY;
> default:
> break;
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index c5ee0f5ce0f1..bdbd0b023ecc 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -116,7 +116,7 @@ static inline bool x86_exception_has_error_code(unsigned int vector)
> {
> static u32 exception_has_error_code = BIT(DF_VECTOR) | BIT(TS_VECTOR) |
> BIT(NP_VECTOR) | BIT(SS_VECTOR) | BIT(GP_VECTOR) |
> - BIT(PF_VECTOR) | BIT(AC_VECTOR);
> + BIT(PF_VECTOR) | BIT(AC_VECTOR) | BIT(CP_VECTOR);

These need to be conditional on CET being exposed to the guest. TBD exceptions
are non-contributory and don't have an error code. Found when running unit
tests in L1 with a kvm/queue as L1, but an older L0. cr4_guest_rsvd_bits can be
used to avoid guest_cpuid_has() lookups.

The SDM also gets this wrong. Section 26.2.1.3, VM-Entry Control Fields, needs
to be updated to add #CP to the list.

— The field's deliver-error-code bit (bit 11) is 1 if each of the following
holds: (1) the interruption type is hardware exception; (2) bit 0
(corresponding to CR0.PE) is set in the CR0 field in the guest-state area;
(3) IA32_VMX_BASIC[56] is read as 0 (see Appendix A.1); and (4) the vector
indicates one of the following exceptions: #DF (vector 8), #TS (10),
#NP (11), #SS (12), #GP (13), #PF (14), or #AC (17).

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index dbca1687ae8e..0b6dab6915a3 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2811,7 +2811,7 @@ static int nested_check_vm_entry_controls(struct kvm_vcpu *vcpu,
/* VM-entry interruption-info field: deliver error code */
should_have_error_code =
intr_type == INTR_TYPE_HARD_EXCEPTION && prot_mode &&
- x86_exception_has_error_code(vector);
+ x86_exception_has_error_code(vcpu, vector);
if (CC(has_error_code != should_have_error_code))
return -EINVAL;

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 28fea7ff7a86..0288d6a364bd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -437,17 +437,20 @@ EXPORT_SYMBOL_GPL(kvm_spurious_fault);
#define EXCPT_CONTRIBUTORY 1
#define EXCPT_PF 2

-static int exception_class(int vector)
+static int exception_class(struct kvm_vcpu *vcpu, int vector)
{
switch (vector) {
case PF_VECTOR:
return EXCPT_PF;
+ case CP_VECTOR:
+ if (vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET)
+ return EXCPT_BENIGN;
+ return EXCPT_CONTRIBUTORY;
case DE_VECTOR:
case TS_VECTOR:
case NP_VECTOR:
case SS_VECTOR:
case GP_VECTOR:
- case CP_VECTOR:
return EXCPT_CONTRIBUTORY;
default:
break;
@@ -588,8 +591,8 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
return;
}
- class1 = exception_class(prev_nr);
- class2 = exception_class(nr);
+ class1 = exception_class(vcpu, prev_nr);
+ class2 = exception_class(vcpu, nr);
if ((class1 == EXCPT_CONTRIBUTORY && class2 == EXCPT_CONTRIBUTORY)
|| (class1 == EXCPT_PF && class2 != EXCPT_BENIGN)) {
/*
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index a14da36a30ed..dce756ffb577 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -120,12 +120,16 @@ static inline bool is_la57_mode(struct kvm_vcpu *vcpu)
#endif
}

-static inline bool x86_exception_has_error_code(unsigned int vector)
+static inline bool x86_exception_has_error_code(struct kvm_vcpu *vcpu,
+ unsigned int vector)
{
static u32 exception_has_error_code = BIT(DF_VECTOR) | BIT(TS_VECTOR) |
BIT(NP_VECTOR) | BIT(SS_VECTOR) | BIT(GP_VECTOR) |
BIT(PF_VECTOR) | BIT(AC_VECTOR) | BIT(CP_VECTOR);

+ if (vector == CP_VECTOR && (vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET))
+ return false;
+
return (1U << vector) & exception_has_error_code;
}




2021-02-04 07:13:01

by Yang, Weijiang

Subject: Re: [PATCH v15 04/14] KVM: x86: Add #CP support in guest exception dispatch

On Wed, Feb 03, 2021 at 01:46:42PM -0800, Sean Christopherson wrote:
> On Wed, Feb 03, 2021, Yang Weijiang wrote:
> > Add handling for Control Protection (#CP) exceptions, vector 21, which is
> > introduced by Intel's Control-Flow Enforcement Technology (CET) and raised
> > on a CET violation. See Intel's SDM for details.
> >
> > Signed-off-by: Yang Weijiang <[email protected]>
> > ---
> > arch/x86/include/uapi/asm/kvm.h | 1 +
> > arch/x86/kvm/x86.c | 1 +
> > arch/x86/kvm/x86.h | 2 +-
> > 3 files changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> > index 8e76d3701db3..507263d1d0b2 100644
> > --- a/arch/x86/include/uapi/asm/kvm.h
> > +++ b/arch/x86/include/uapi/asm/kvm.h
> > @@ -32,6 +32,7 @@
> > #define MC_VECTOR 18
> > #define XM_VECTOR 19
> > #define VE_VECTOR 20
> > +#define CP_VECTOR 21
> >
> > /* Select x86 specific features in <linux/kvm.h> */
> > #define __KVM_HAVE_PIT
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 99f787152d12..d9d3bae40a8c 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -436,6 +436,7 @@ static int exception_class(int vector)
> > case NP_VECTOR:
> > case SS_VECTOR:
> > case GP_VECTOR:
> > + case CP_VECTOR:
> > return EXCPT_CONTRIBUTORY;
> > default:
> > break;
> > diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> > index c5ee0f5ce0f1..bdbd0b023ecc 100644
> > --- a/arch/x86/kvm/x86.h
> > +++ b/arch/x86/kvm/x86.h
> > @@ -116,7 +116,7 @@ static inline bool x86_exception_has_error_code(unsigned int vector)
> > {
> > static u32 exception_has_error_code = BIT(DF_VECTOR) | BIT(TS_VECTOR) |
> > BIT(NP_VECTOR) | BIT(SS_VECTOR) | BIT(GP_VECTOR) |
> > - BIT(PF_VECTOR) | BIT(AC_VECTOR);
> > + BIT(PF_VECTOR) | BIT(AC_VECTOR) | BIT(CP_VECTOR);
>
> These need to be conditional on CET being exposed to the guest. TBD exceptions
> are non-contributory and don't have an error code. Found when running unit
> tests in L1 with a kvm/queue as L1, but an older L0. cr4_guest_rsvd_bits can be
> used to avoid guest_cpuid_has() lookups.
>
> The SDM also gets this wrong. Section 26.2.1.3, VM-Entry Control Fields, needs
> to be updated to add #CP to the list.
>
> — The field's deliver-error-code bit (bit 11) is 1 if each of the following
> holds: (1) the interruption type is hardware exception; (2) bit 0
> (corresponding to CR0.PE) is set in the CR0 field in the guest-state area;
> (3) IA32_VMX_BASIC[56] is read as 0 (see Appendix A.1); and (4) the vector
> indicates one of the following exceptions: #DF (vector 8), #TS (10),
> #NP (11), #SS (12), #GP (13), #PF (14), or #AC (17).
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index dbca1687ae8e..0b6dab6915a3 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2811,7 +2811,7 @@ static int nested_check_vm_entry_controls(struct kvm_vcpu *vcpu,
> /* VM-entry interruption-info field: deliver error code */
> should_have_error_code =
> intr_type == INTR_TYPE_HARD_EXCEPTION && prot_mode &&
> - x86_exception_has_error_code(vector);
> + x86_exception_has_error_code(vcpu, vector);
> if (CC(has_error_code != should_have_error_code))
> return -EINVAL;
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 28fea7ff7a86..0288d6a364bd 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -437,17 +437,20 @@ EXPORT_SYMBOL_GPL(kvm_spurious_fault);
> #define EXCPT_CONTRIBUTORY 1
> #define EXCPT_PF 2
>
> -static int exception_class(int vector)
> +static int exception_class(struct kvm_vcpu *vcpu, int vector)
> {
> switch (vector) {
> case PF_VECTOR:
> return EXCPT_PF;
> + case CP_VECTOR:
> + if (vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET)
> + return EXCPT_BENIGN;
> + return EXCPT_CONTRIBUTORY;
> case DE_VECTOR:
> case TS_VECTOR:
> case NP_VECTOR:
> case SS_VECTOR:
> case GP_VECTOR:
> - case CP_VECTOR:
> return EXCPT_CONTRIBUTORY;
> default:
> break;
> @@ -588,8 +591,8 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
> kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
> return;
> }
> - class1 = exception_class(prev_nr);
> - class2 = exception_class(nr);
> + class1 = exception_class(vcpu, prev_nr);
> + class2 = exception_class(vcpu, nr);
> if ((class1 == EXCPT_CONTRIBUTORY && class2 == EXCPT_CONTRIBUTORY)
> || (class1 == EXCPT_PF && class2 != EXCPT_BENIGN)) {
> /*
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index a14da36a30ed..dce756ffb577 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -120,12 +120,16 @@ static inline bool is_la57_mode(struct kvm_vcpu *vcpu)
> #endif
> }
>
> -static inline bool x86_exception_has_error_code(unsigned int vector)
> +static inline bool x86_exception_has_error_code(struct kvm_vcpu *vcpu,
> + unsigned int vector)
> {
> static u32 exception_has_error_code = BIT(DF_VECTOR) | BIT(TS_VECTOR) |
> BIT(NP_VECTOR) | BIT(SS_VECTOR) | BIT(GP_VECTOR) |
> BIT(PF_VECTOR) | BIT(AC_VECTOR) | BIT(CP_VECTOR);
>
> + if (vector == CP_VECTOR && (vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET))
> + return false;
> +
> return (1U << vector) & exception_has_error_code;
> }
Thanks Sean for catching this!

Hi, Paolo,
Do I need to send another version to include Sean's change?

>
>
>

2021-02-04 08:27:33

by Paolo Bonzini

Subject: Re: [PATCH v15 04/14] KVM: x86: Add #CP support in guest exception dispatch

On 03/02/21 22:46, Sean Christopherson wrote:
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index dbca1687ae8e..0b6dab6915a3 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2811,7 +2811,7 @@ static int nested_check_vm_entry_controls(struct kvm_vcpu *vcpu,
> /* VM-entry interruption-info field: deliver error code */
> should_have_error_code =
> intr_type == INTR_TYPE_HARD_EXCEPTION && prot_mode &&
> - x86_exception_has_error_code(vector);
> + x86_exception_has_error_code(vcpu, vector);
> if (CC(has_error_code != should_have_error_code))
> return -EINVAL;
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 28fea7ff7a86..0288d6a364bd 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -437,17 +437,20 @@ EXPORT_SYMBOL_GPL(kvm_spurious_fault);
> #define EXCPT_CONTRIBUTORY 1
> #define EXCPT_PF 2
>
> -static int exception_class(int vector)
> +static int exception_class(struct kvm_vcpu *vcpu, int vector)
> {
> switch (vector) {
> case PF_VECTOR:
> return EXCPT_PF;
> + case CP_VECTOR:
> + if (vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET)
> + return EXCPT_BENIGN;
> + return EXCPT_CONTRIBUTORY;
> case DE_VECTOR:
> case TS_VECTOR:
> case NP_VECTOR:
> case SS_VECTOR:
> case GP_VECTOR:
> - case CP_VECTOR:
> return EXCPT_CONTRIBUTORY;
> default:
> break;
> @@ -588,8 +591,8 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
> kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
> return;
> }
> - class1 = exception_class(prev_nr);
> - class2 = exception_class(nr);
> + class1 = exception_class(vcpu, prev_nr);
> + class2 = exception_class(vcpu, nr);
> if ((class1 == EXCPT_CONTRIBUTORY && class2 == EXCPT_CONTRIBUTORY)
> || (class1 == EXCPT_PF && class2 != EXCPT_BENIGN)) {
> /*
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index a14da36a30ed..dce756ffb577 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -120,12 +120,16 @@ static inline bool is_la57_mode(struct kvm_vcpu *vcpu)
> #endif
> }
>
> -static inline bool x86_exception_has_error_code(unsigned int vector)
> +static inline bool x86_exception_has_error_code(struct kvm_vcpu *vcpu,
> + unsigned int vector)
> {
> static u32 exception_has_error_code = BIT(DF_VECTOR) | BIT(TS_VECTOR) |
> BIT(NP_VECTOR) | BIT(SS_VECTOR) | BIT(GP_VECTOR) |
> BIT(PF_VECTOR) | BIT(AC_VECTOR) | BIT(CP_VECTOR);
>
> + if (vector == CP_VECTOR && (vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET))
> + return false;
> +
> return (1U << vector) & exception_has_error_code;
> }
>
>
>
>

Squashed, thanks. I made a small change to the last hunk:

if (!((1U << vector) & exception_has_error_code))
return false;

if (vector == CP_VECTOR)
return !(vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET);

return true;
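
Paolo's restructuring, applied to the whole helper, would read roughly as
follows. This is a self-contained user-space sketch with stand-in vector
definitions and a plain u64 in place of vcpu->arch.cr4_guest_rsvd_bits, not
the committed kernel code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Vector numbers as defined by the x86 architecture. */
#define DF_VECTOR  8
#define TS_VECTOR 10
#define NP_VECTOR 11
#define SS_VECTOR 12
#define GP_VECTOR 13
#define PF_VECTOR 14
#define AC_VECTOR 17
#define CP_VECTOR 21

#define X86_CR4_CET (1ULL << 23)
#define BIT(n) (1U << (n))

/*
 * Stand-in for x86_exception_has_error_code(); the first parameter models
 * vcpu->arch.cr4_guest_rsvd_bits.  #CP only delivers an error code when
 * CR4.CET is settable for the guest; if CET is a reserved bit, treat #CP
 * as having no error code.
 */
static bool x86_exception_has_error_code(uint64_t cr4_guest_rsvd_bits,
					 unsigned int vector)
{
	static const uint32_t exception_has_error_code =
		BIT(DF_VECTOR) | BIT(TS_VECTOR) | BIT(NP_VECTOR) |
		BIT(SS_VECTOR) | BIT(GP_VECTOR) | BIT(PF_VECTOR) |
		BIT(AC_VECTOR) | BIT(CP_VECTOR);

	if (!((1U << vector) & exception_has_error_code))
		return false;

	if (vector == CP_VECTOR)
		return !(cr4_guest_rsvd_bits & X86_CR4_CET);

	return true;
}
```

The early-return ordering keeps the common lookup first, so the CET special
case is only evaluated for vectors that can carry an error code at all.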

2021-02-04 08:33:03

by Paolo Bonzini

Subject: Re: [PATCH v15 04/14] KVM: x86: Add #CP support in guest exception dispatch

On 04/02/21 08:22, Yang Weijiang wrote:
> Thanks Sean for catching this!
>
> Hi, Paolo,
> Do I need to send another version to include Sean's change?

No, it's okay.

Paolo

2021-02-05 01:01:50

by Sean Christopherson

Subject: Re: [PATCH v15 04/14] KVM: x86: Add #CP support in guest exception dispatch

On Thu, Feb 04, 2021, Paolo Bonzini wrote:
> On 03/02/21 22:46, Sean Christopherson wrote:
> >
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index dbca1687ae8e..0b6dab6915a3 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -2811,7 +2811,7 @@ static int nested_check_vm_entry_controls(struct kvm_vcpu *vcpu,
> > /* VM-entry interruption-info field: deliver error code */
> > should_have_error_code =
> > intr_type == INTR_TYPE_HARD_EXCEPTION && prot_mode &&
> > - x86_exception_has_error_code(vector);
> > + x86_exception_has_error_code(vcpu, vector);
> > if (CC(has_error_code != should_have_error_code))
> > return -EINVAL;
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 28fea7ff7a86..0288d6a364bd 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -437,17 +437,20 @@ EXPORT_SYMBOL_GPL(kvm_spurious_fault);
> > #define EXCPT_CONTRIBUTORY 1
> > #define EXCPT_PF 2
> >
> > -static int exception_class(int vector)
> > +static int exception_class(struct kvm_vcpu *vcpu, int vector)
> > {
> > switch (vector) {
> > case PF_VECTOR:
> > return EXCPT_PF;
> > + case CP_VECTOR:
> > + if (vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET)
> > + return EXCPT_BENIGN;
> > + return EXCPT_CONTRIBUTORY;
> > case DE_VECTOR:
> > case TS_VECTOR:
> > case NP_VECTOR:
> > case SS_VECTOR:
> > case GP_VECTOR:
> > - case CP_VECTOR:

This removal got lost when squashing.

arch/x86/kvm/x86.c: In function ‘exception_class’:
arch/x86/kvm/x86.c:455:2: error: duplicate case value
455 | case CP_VECTOR:
| ^~~~
arch/x86/kvm/x86.c:446:2: note: previously used here
446 | case CP_VECTOR:
| ^~~~

> > return EXCPT_CONTRIBUTORY;
> > default:
> > break;
> > @@ -588,8 +591,8 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
> > kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
> > return;
> > }
> > - class1 = exception_class(prev_nr);
> > - class2 = exception_class(nr);
> > + class1 = exception_class(vcpu, prev_nr);
> > + class2 = exception_class(vcpu, nr);
> > if ((class1 == EXCPT_CONTRIBUTORY && class2 == EXCPT_CONTRIBUTORY)
> > || (class1 == EXCPT_PF && class2 != EXCPT_BENIGN)) {
> > /*
> > diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> > index a14da36a30ed..dce756ffb577 100644
> > --- a/arch/x86/kvm/x86.h
> > +++ b/arch/x86/kvm/x86.h
> > @@ -120,12 +120,16 @@ static inline bool is_la57_mode(struct kvm_vcpu *vcpu)
> > #endif
> > }
> >
> > -static inline bool x86_exception_has_error_code(unsigned int vector)
> > +static inline bool x86_exception_has_error_code(struct kvm_vcpu *vcpu,
> > + unsigned int vector)
> > {
> > static u32 exception_has_error_code = BIT(DF_VECTOR) | BIT(TS_VECTOR) |
> > BIT(NP_VECTOR) | BIT(SS_VECTOR) | BIT(GP_VECTOR) |
> > BIT(PF_VECTOR) | BIT(AC_VECTOR) | BIT(CP_VECTOR);
> >
> > + if (vector == CP_VECTOR && (vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET))
> > + return false;
> > +
> > return (1U << vector) & exception_has_error_code;
> > }
> >
> >
> >
> >
>
> Squashed, thanks. I made a small change to the last hunk:
>
> if (!((1U << vector) & exception_has_error_code))
> return false;
>
> if (vector == CP_VECTOR)
> return !(vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET);
>
> return true;

Ha, I guessed wrong, that was my first pass at it :-)

2021-02-05 01:09:10

by Paolo Bonzini

Subject: Re: [PATCH v15 04/14] KVM: x86: Add #CP support in guest exception dispatch

On 04/02/21 17:42, Sean Christopherson wrote:
> On Thu, Feb 04, 2021, Paolo Bonzini wrote:
>> On 03/02/21 22:46, Sean Christopherson wrote:
>>>
>>> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
>>> index dbca1687ae8e..0b6dab6915a3 100644
>>> --- a/arch/x86/kvm/vmx/nested.c
>>> +++ b/arch/x86/kvm/vmx/nested.c
>>> @@ -2811,7 +2811,7 @@ static int nested_check_vm_entry_controls(struct kvm_vcpu *vcpu,
>>> /* VM-entry interruption-info field: deliver error code */
>>> should_have_error_code =
>>> intr_type == INTR_TYPE_HARD_EXCEPTION && prot_mode &&
>>> - x86_exception_has_error_code(vector);
>>> + x86_exception_has_error_code(vcpu, vector);
>>> if (CC(has_error_code != should_have_error_code))
>>> return -EINVAL;
>>>
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 28fea7ff7a86..0288d6a364bd 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -437,17 +437,20 @@ EXPORT_SYMBOL_GPL(kvm_spurious_fault);
>>> #define EXCPT_CONTRIBUTORY 1
>>> #define EXCPT_PF 2
>>>
>>> -static int exception_class(int vector)
>>> +static int exception_class(struct kvm_vcpu *vcpu, int vector)
>>> {
>>> switch (vector) {
>>> case PF_VECTOR:
>>> return EXCPT_PF;
>>> + case CP_VECTOR:
>>> + if (vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET)
>>> + return EXCPT_BENIGN;
>>> + return EXCPT_CONTRIBUTORY;
>>> case DE_VECTOR:
>>> case TS_VECTOR:
>>> case NP_VECTOR:
>>> case SS_VECTOR:
>>> case GP_VECTOR:
>>> - case CP_VECTOR:
>
> This removal got lost when squashing.
>
> arch/x86/kvm/x86.c: In function ‘exception_class’:
> arch/x86/kvm/x86.c:455:2: error: duplicate case value
> 455 | case CP_VECTOR:
> | ^~~~
> arch/x86/kvm/x86.c:446:2: note: previously used here
> 446 | case CP_VECTOR:
> | ^~~~

Well, it shows that I haven't even started including those
unlikely-for-5.12 patches (CET and #DB bus lock) in my builds, since
today I was focusing on getting a kvm/next push done.

I'll probably push all those to kvm/intel-queue and remove them from
everyone's view, for now.

Paolo

2022-05-18 15:59:47

by John Allen

Subject: Re: [PATCH v15 07/14] KVM: VMX: Emulate reads and writes to CET MSRs

On Wed, Feb 03, 2021 at 07:34:14PM +0800, Yang Weijiang wrote:
> Add support for emulating read and write accesses to CET MSRs. CET MSRs
> are universally "special" as they are either context switched via
> dedicated VMCS fields or via XSAVES, i.e. no additional in-memory
> tracking is needed, but emulated reads/writes are more expensive.
>
> MSRs that are switched through XSAVES are especially annoying due to the
> possibility of the kernel's FPU being used in IRQ context. Disable IRQs
> and ensure the guest's FPU state is loaded when accessing such MSRs.
>
> Co-developed-by: Sean Christopherson <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>
> Signed-off-by: Yang Weijiang <[email protected]>
> ---
> arch/x86/kvm/vmx/vmx.c | 105 +++++++++++++++++++++++++++++++++++++++++
> arch/x86/kvm/x86.h | 5 ++
> 2 files changed, 110 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index cc60b1fc3ee7..694879c2b0b7 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1787,6 +1787,66 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
> }
> }
>
> +static void vmx_get_xsave_msr(struct msr_data *msr_info)
> +{
> + local_irq_disable();
> + if (test_thread_flag(TIF_NEED_FPU_LOAD))
> + switch_fpu_return();
> + rdmsrl(msr_info->index, msr_info->data);
> + local_irq_enable();
> +}
> +
> +static void vmx_set_xsave_msr(struct msr_data *msr_info)
> +{
> + local_irq_disable();
> + if (test_thread_flag(TIF_NEED_FPU_LOAD))
> + switch_fpu_return();
> + wrmsrl(msr_info->index, msr_info->data);
> + local_irq_enable();
> +}
> +
> +static bool cet_is_ssp_msr_accessible(struct kvm_vcpu *vcpu,
> + struct msr_data *msr)
> +{
> + u64 mask;
> +
> + if (!kvm_cet_supported())
> + return false;
> +
> + if (msr->host_initiated)
> + return true;
> +
> + if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
> + return false;
> +
> + if (msr->index == MSR_IA32_INT_SSP_TAB)
> + return true;
> +
> + mask = (msr->index == MSR_IA32_PL3_SSP) ? XFEATURE_MASK_CET_USER :
> + XFEATURE_MASK_CET_KERNEL;
> + return !!(vcpu->arch.guest_supported_xss & mask);
> +}
> +
> +static bool cet_is_control_msr_accessible(struct kvm_vcpu *vcpu,
> + struct msr_data *msr)
> +{
> + u64 mask;
> +
> + if (!kvm_cet_supported())
> + return false;
> +
> + if (msr->host_initiated)
> + return true;
> +
> + if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
> + !guest_cpuid_has(vcpu, X86_FEATURE_IBT))
> + return false;
> +
> + mask = (msr->index == MSR_IA32_U_CET) ? XFEATURE_MASK_CET_USER :
> + XFEATURE_MASK_CET_KERNEL;
> + return !!(vcpu->arch.guest_supported_xss & mask);
> +}
> +
> /*
> * Reads an msr value (of 'msr_index') into 'pdata'.
> * Returns 0 on success, non-0 otherwise.
> @@ -1919,6 +1979,26 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> else
> msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
> break;
> + case MSR_IA32_S_CET:
> + if (!cet_is_control_msr_accessible(vcpu, msr_info))
> + return 1;
> + msr_info->data = vmcs_readl(GUEST_S_CET);
> + break;
> + case MSR_IA32_U_CET:
> + if (!cet_is_control_msr_accessible(vcpu, msr_info))
> + return 1;
> + vmx_get_xsave_msr(msr_info);
> + break;
> + case MSR_IA32_INT_SSP_TAB:
> + if (!cet_is_ssp_msr_accessible(vcpu, msr_info))
> + return 1;
> + msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
> + break;
> + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> + if (!cet_is_ssp_msr_accessible(vcpu, msr_info))
> + return 1;
> + vmx_get_xsave_msr(msr_info);
> + break;
> case MSR_TSC_AUX:
> if (!msr_info->host_initiated &&
> !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> @@ -2188,6 +2268,31 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> else
> vmx->pt_desc.guest.addr_a[index / 2] = data;
> break;
> + case MSR_IA32_S_CET:
> + case MSR_IA32_U_CET:
> + if (!cet_is_control_msr_accessible(vcpu, msr_info))
> + return 1;
> + if (data & GENMASK(9, 6))
> + return 1;
> + if (msr_index == MSR_IA32_S_CET)
> + vmcs_writel(GUEST_S_CET, data);
> + else
> + vmx_set_xsave_msr(msr_info);
> + break;
> + case MSR_IA32_INT_SSP_TAB:
> + if (!cet_is_control_msr_accessible(vcpu, msr_info))
> + return 1;
> + if (is_noncanonical_address(data, vcpu))
> + return 1;
> + vmcs_writel(GUEST_INTR_SSP_TABLE, data);
> + break;
> + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> + if (!cet_is_ssp_msr_accessible(vcpu, msr_info))
> + return 1;
> + if ((data & GENMASK(2, 0)) || is_noncanonical_address(data, vcpu))

Sorry to revive this old thread. I'm working on the corresponding SVM
bits for shadow stack and I noticed the above check. Why isn't this
GENMASK(1, 0)? The *SSP MSRs should be a 4-byte aligned canonical
> address, meaning that just bits 1 and 0 should always be zero. I was
looking through the previous versions of the set and found that this
changed between versions 11 and 12, but I don't see any discussion
related to this on the list.

Thanks,
John

> + return 1;
> + vmx_set_xsave_msr(msr_info);
> + break;
> case MSR_TSC_AUX:
> if (!msr_info->host_initiated &&
> !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index fd8c46da2030..16c661d94349 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -288,6 +288,11 @@ static inline bool kvm_mpx_supported(void)
> == (XFEATURE_MASK_BNDREGS | XFEATURE_MASK_BNDCSR);
> }
>
> +static inline bool kvm_cet_supported(void)
> +{
> + return supported_xss & XFEATURE_MASK_CET_USER;
> +}
> +
> extern unsigned int min_timer_period_us;
>
> extern bool enable_vmware_backdoor;
> --
> 2.26.2
>

2022-05-18 16:17:19

by Sean Christopherson

Subject: Re: [PATCH v15 07/14] KVM: VMX: Emulate reads and writes to CET MSRs

On Wed, May 18, 2022, John Allen wrote:
> On Wed, Feb 03, 2021 at 07:34:14PM +0800, Yang Weijiang wrote:
> > + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> > + if (!cet_is_ssp_msr_accessible(vcpu, msr_info))
> > + return 1;
> > + if ((data & GENMASK(2, 0)) || is_noncanonical_address(data, vcpu))
>
> Sorry to revive this old thread. I'm working on the corresponding SVM
> bits for shadow stack and I noticed the above check. Why isn't this
> GENMASK(1, 0)? The *SSP MSRs should be a 4-byte aligned canonical
> address meaning that just bits 1 and 0 should always be zero. I was
> looking through the previous versions of the set and found that this
> changed between versions 11 and 12, but I don't see any discussion
> related to this on the list.

Huh. I'm not entirely sure what to make of the SDM's wording:

The linear address written must be aligned to 8 bytes and bits 2:0 must be 0
(hardware requires bits 1:0 to be 0).

Looking at the rest of the CET stuff, I believe requiring 8-byte alignment is
correct, and that the "hardware requires" blurb is trying to call out that the
SSP stored in hardware will always be 4-byte aligned but not necessarily 8-byte
aligned in order to play nice with 32-bit/compatibility mode. But "base" addresses
that come from software, e.g. via MSRs and whatnot, must always be 8-byte aligned.
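
Under that reading, the guest-write path enforces 8-byte alignment (bits 2:0
clear) on software-supplied SSP values, in addition to the canonicality
check. A minimal user-space sketch of just the alignment part, with the
canonical-address check omitted:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified 64-bit GENMASK(), matching the kernel's semantics. */
#define GENMASK(h, l) (((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

/*
 * Software-written SSP "base" values (e.g. the PL[0-3]_SSP MSRs) must be
 * 8-byte aligned per the SDM, hence bits 2:0 must be zero.  Hardware only
 * guarantees 4-byte alignment of the live SSP (to play nice with 32-bit
 * mode), which is what the "(hardware requires bits 1:0 to be 0)"
 * parenthetical covers.
 */
static bool ssp_msr_value_aligned(uint64_t data)
{
	return !(data & GENMASK(2, 0));
}
```

With GENMASK(1, 0) instead, a 4-byte-aligned but not 8-byte-aligned value
such as 0x1004 would be accepted, which is the difference John asked about.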

2022-05-20 07:12:03

by Yang, Weijiang

Subject: Re: [PATCH v15 07/14] KVM: VMX: Emulate reads and writes to CET MSRs


On 5/19/2022 12:16 AM, Sean Christopherson wrote:
> On Wed, May 18, 2022, John Allen wrote:
>> On Wed, Feb 03, 2021 at 07:34:14PM +0800, Yang Weijiang wrote:
>>> + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
>>> + if (!cet_is_ssp_msr_accessible(vcpu, msr_info))
>>> + return 1;
>>> + if ((data & GENMASK(2, 0)) || is_noncanonical_address(data, vcpu))
>> Sorry to revive this old thread. I'm working on the corresponding SVM
>> bits for shadow stack and I noticed the above check. Why isn't this
>> GENMASK(1, 0)? The *SSP MSRs should be a 4-byte aligned canonical
>> address meaning that just bits 1 and 0 should always be zero. I was
>> looking through the previous versions of the set and found that this
>> changed between versions 11 and 12, but I don't see any discussion
>> related to this on the list.
> Huh. I'm not entirely sure what to make of the SDM's wording:
>
> The linear address written must be aligned to 8 bytes and bits 2:0 must be 0
> (hardware requires bits 1:0 to be 0).
>
> Looking at the rest of the CET stuff, I believe requiring 8-byte alignment is
> correct, and that the "hardware requires" blurb is trying to call out that the
> SSP stored in hardware will always be 4-byte aligned but not necessarily 8-byte
> aligned in order to play nice with 32-bit/compatibility mode. But "base" addresses
> that come from software, e.g. via MSRs and whatnot, must always be 8-byte aligned.
Thanks Sean, I cannot agree more ;-)