CET (Control-flow Enforcement Technology) is a CPU feature used to prevent\n
Return/Jump-Oriented Programming (ROP/JOP) attacks. CET introduces a new\n
exception type, Control Protection (#CP), and two sub-features(SHSTK,IBT)\n
to defend against ROP/JOP style control-flow subversion attacks.\n
Shadow Stack (SHSTK):
A shadow stack is a second stack used exclusively for control transfer
operations. The shadow stack is separate from the data/normal stack and
can be enabled individually in user and kernel mode. When shadow stacks
are enabled, CALL pushes the return address on both the data and shadow
stack. RET pops the return address from both stacks and compares them.
If the return addresses from the two stacks do not match, the processor
generates a #CP.
Indirect Branch Tracking (IBT):
IBT adds new instrutions, ENDBRANCH{32|64}, to mark valid target addresses
of indirect branches (CALL, JMP etc...). If an indirect branch is executed
and the next instruction is _not_ an ENDBRANCH, the processor generates a #CP.
Build dependency:
--------------------------------------------------------------------------
The first 5 patches are took over from CET native series [1] in linux-next,
they must be included in kernel tree when build host kernel for testing CET
in guest. Will remove them once the native series landed in mainline kernel
tree. It's just for build and test purpose.
Implementation:
--------------------------------------------------------------------------
Historically, the early KVM patches can support both user SHSTK and IBT,
and most of the early patches are carried forward with changes by this new
series. Then with kernel IBT feature merged in 5.18, a new patch was added
to support the feature for guest. The last patch is introduced to support
supervisor SHSTK but the feature is not enabled on Intel platform for now,
the main purpose of this patch is to facilitate AMD folks to enable the
feature.
In summary, this new series enables CET user SHSTK/IBT and kernel IBT, but
doesn't fully support CET supervisor SHSTK, the enabling work is left for
the future.
Supported CET sub-features:
|
User SHSTK | User IBT (user mode)
--------------------------------------------------
s-SHSTK (X) | Kernel IBT (kernel mode)
|
The user SHSTK/IBT relies on host side XSAVES support(XSS[bit 11]) for user
mode CET states. The kernel IBT doesn't have dependency on host XSAVES.
The supervisor SHSTK relies on host side XSAVES support(XSS[bit 12]) for
supervisor mode CET states.
This version removed unnecessary checks for host CET enabling status before
enabling guest CET support, making guest CET support apart from that of host.
By doing so, it's expected to be more friendly to cloud computing scenarios.
CET states management:
--------------------------------------------------------------------------
CET user mode states, MSR_IA32_{U_CET,PL3_SSP} depends on {XSAVES|XRSTORS}
instructions to swap guest/host context when vm-exit/vm-entry happens.
On vm-exit, the guest CET states are stored to guest fpu area and host user
mode states are loaded from thread/process context before vCPU returns to
userspace, vice-versa on vm-entry. See details in kvm_{load|put}_guest_fpu().
So the user mode state validity depends on host side U_CET bit set in MSR_XSS.
CET supervisor mode states are grouped into two categories - XSAVES dependent
and non-dependent, the former includes MSR_IA32_PL{0,1,2}_SSP, the later
consists of MSR_IA32_S_CET and MSR_IA32_INTR_SSP_TBL. The XSAVES dependent
MSR's save/restore depends on S_CET bit set in MSR_XSS. Since the native series
doesn't enable S_CET support, these s-SHSTK shadow stack pointers are invalid.
Moveover, new VMCS fields, {GUEST|HOST}_{S_CET,SSP,INTR_SSP_TABL}, are
introduced for guest/host supervisor state switch. When CET entry/exit load
bits are set, the guest/host MSR_IA32_{S_CET,INTR_SSP_TBL,SSP} are swapped
automatically at vm-exit/entry. With these new fields, current guest kernel
IBT enalbing doesn't depend on host {XSAVES|XRSTORS} support.
Tests:
--------------------------------------------------------------------------
This series passed basic CET user shadow stack test and kernel IBT test in
L1 and L2 guest. It also passed CET KUT test which has been merged there.
Executed all KUT tests and KVM selftests against this series, all test cases
passes except the vmx test, the failure is due to CR4_CET bit testing in
test_vmxon_bad_cr(). After add CR4_CET bit to skip list, the test passed.
I'll send a patch to fix this issue later.
To run user shadow stack test and kernel IBT test in VM, you need an CET
capable platform, e.g., Sapphire Rapids server, and follow below steps to
build host/guest kernel properly:
1. Buld host kernel. Patch this series to kernel tree and build kernel
with CET capable gcc version(e.g., >=8.5.0).
2. Build guest kernel. Patch CET native series to kernel tree and opt-in
CONFIG_X86_KERNEL_IBT and CONFIG_X86_USER_SHADOW_STACK options.
3. Launch a VM with QEMU built with CET enabling patches [2].
Check kernel selftest test_shadow_stack_64 output:
[INFO] new_ssp = 7f8c82100ff8, *new_ssp = 7f8c82101001
[INFO] changing ssp from 7f8c82900ff0 to 7f8c82100ff8
[INFO] ssp is now 7f8c82101000
[OK] Shadow stack pivot
[OK] Shadow stack faults
[INFO] Corrupting shadow stack
[INFO] Generated shadow stack violation successfully
[OK] Shadow stack violation test
[INFO] Gup read -> shstk access success
[INFO] Gup write -> shstk access success
[INFO] Violation from normal write
[INFO] Gup read -> write access success
[INFO] Violation from normal write
[INFO] Gup write -> write access success
[INFO] Cow gup write -> write access success
[OK] Shadow gup test
[INFO] Violation from shstk access
[OK] mprotect() test
[SKIP] Userfaultfd unavailable.
[OK] 32 bit test
Check kernel IBT with dmesg | grep CET:
CET detected: Indirect Branch Tracking enabled
--------------------------------------------------------------------------
Changes in v2:
1. Remove excessive checks on host CET Kconfig options in v1 patchset [3].
2. Make CET CPUIDs, MSRs and control flags enabling independent to host CET status.
3. Introduce supervisor SHSTK support to make the patch set complete.
4. Refactor patches to accommodate above changes.
5. Rebase on kvm-x86/next [4].
[1]: linux-next: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/?h=next-20230420
[2]: QEMU patch: https://lore.kernel.org/all/[email protected]/
[3]: v1 patchset: https://lore.kernel.org/all/[email protected]/
[4]: Rebase branch: https://github.com/kvm-x86/linux.git, commit: 7b632f72528d (tag: kvm-x86-next-2023.04.14)
Rick Edgecombe (5):
x86/shstk: Add Kconfig option for shadow stack
x86/cpufeatures: Add CPU feature flags for shadow stacks
x86/cpufeatures: Enable CET CR4 bit for shadow stack
x86/fpu/xstate: Introduce CET MSR and XSAVES supervisor states
x86/fpu: Add helper for modifying xstate
Sean Christopherson (2):
KVM:x86: Report XSS as to-be-saved if there are supported features
KVM:x86: Load guest FPU state when accessing xsaves-managed MSRs
Yang Weijiang (14):
KVM:x86: Refresh CPUID on write to guest MSR_IA32_XSS
KVM:x86: Init kvm_caps.supported_xss with supported feature bits
KVM:x86: Add #CP support in guest exception classification
KVM:VMX: Introduce CET VMCS fields and control bits
KVM:x86: Add fault checks for guest CR4.CET setting
KVM:VMX: Emulate reads and writes to CET MSRs
KVM:VMX: Add a synthetic MSR to allow userspace VMM to access
GUEST_SSP
KVM:x86: Report CET MSRs as to-be-saved if CET is supported
KVM:x86: Save/Restore GUEST_SSP to/from SMM state save area
KVM:VMX: Pass through user CET MSRs to the guest
KVM:x86: Enable CET virtualization for VMX and advertise to userspace
KVM:nVMX: Enable user CET support for nested VMX
KVM:x86: Enable supervisor IBT support for guest
KVM:x86: Support CET supervisor shadow stack MSR access
arch/x86/Kconfig | 24 ++++
arch/x86/Kconfig.assembler | 5 +
arch/x86/include/asm/cpufeatures.h | 2 +
arch/x86/include/asm/disabled-features.h | 8 +-
arch/x86/include/asm/fpu/api.h | 9 ++
arch/x86/include/asm/fpu/types.h | 16 ++-
arch/x86/include/asm/fpu/xstate.h | 6 +-
arch/x86/include/asm/kvm_host.h | 3 +-
arch/x86/include/asm/vmx.h | 8 ++
arch/x86/include/uapi/asm/kvm.h | 1 +
arch/x86/include/uapi/asm/kvm_para.h | 1 +
arch/x86/kernel/cpu/common.c | 35 ++++--
arch/x86/kernel/cpu/cpuid-deps.c | 1 +
arch/x86/kernel/fpu/core.c | 19 +++
arch/x86/kernel/fpu/xstate.c | 90 +++++++-------
arch/x86/kvm/cpuid.c | 23 +++-
arch/x86/kvm/cpuid.h | 6 +
arch/x86/kvm/smm.c | 20 +++
arch/x86/kvm/vmx/capabilities.h | 4 +
arch/x86/kvm/vmx/nested.c | 29 ++++-
arch/x86/kvm/vmx/vmcs12.c | 6 +
arch/x86/kvm/vmx/vmcs12.h | 14 ++-
arch/x86/kvm/vmx/vmx.c | 150 ++++++++++++++++++++++-
arch/x86/kvm/vmx/vmx.h | 6 +-
arch/x86/kvm/x86.c | 79 ++++++++++--
arch/x86/kvm/x86.h | 46 ++++++-
26 files changed, 528 insertions(+), 83 deletions(-)
base-commit: 7b632f72528d5fa3f0265358a393f534da47d9dd
--
2.27.0
From: Rick Edgecombe <[email protected]>
Shadow stack provides protection for applications against function return
address corruption. It is active when the processor supports it, the
kernel has CONFIG_X86_SHADOW_STACK enabled, and the application is built
for the feature. This is only implemented for the 64-bit kernel. When it
is enabled, legacy non-shadow stack applications continue to work, but
without protection.
Since there is another feature that utilizes CET (Kernel IBT) that will
share implementation with shadow stacks, create CONFIG_CET to signify
that at least one CET feature is configured.
Co-developed-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Rick Edgecombe <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Tested-by: John Allen <[email protected]>
Tested-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/all/20230319001535.23210-3-rick.p.edgecombe%40intel.com
---
arch/x86/Kconfig | 24 ++++++++++++++++++++++++
arch/x86/Kconfig.assembler | 5 +++++
2 files changed, 29 insertions(+)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index a825bf031f49..f03791b73f9f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1851,6 +1851,11 @@ config CC_HAS_IBT
(CC_IS_CLANG && CLANG_VERSION >= 140000)) && \
$(as-instr,endbr64)
+config X86_CET
+ def_bool n
+ help
+ CET features configured (Shadow stack or IBT)
+
config X86_KERNEL_IBT
prompt "Indirect Branch Tracking"
def_bool y
@@ -1858,6 +1863,7 @@ config X86_KERNEL_IBT
# https://github.com/llvm/llvm-project/commit/9d7001eba9c4cb311e03cd8cdc231f9e579f2d0f
depends on !LD_IS_LLD || LLD_VERSION >= 140000
select OBJTOOL
+ select X86_CET
help
Build the kernel with support for Indirect Branch Tracking, a
hardware support course-grain forward-edge Control Flow Integrity
@@ -1952,6 +1958,24 @@ config X86_SGX
If unsure, say N.
+config X86_USER_SHADOW_STACK
+ bool "X86 userspace shadow stack"
+ depends on AS_WRUSS
+ depends on X86_64
+ select ARCH_USES_HIGH_VMA_FLAGS
+ select X86_CET
+ help
+ Shadow stack protection is a hardware feature that detects function
+ return address corruption. This helps mitigate ROP attacks.
+ Applications must be enabled to use it, and old userspace does not
+ get protection "for free".
+
+ CPUs supporting shadow stacks were first released in 2020.
+
+ See Documentation/x86/shstk.rst for more information.
+
+ If unsure, say N.
+
config EFI
bool "EFI runtime service support"
depends on ACPI
diff --git a/arch/x86/Kconfig.assembler b/arch/x86/Kconfig.assembler
index b88f784cb02e..8ad41da301e5 100644
--- a/arch/x86/Kconfig.assembler
+++ b/arch/x86/Kconfig.assembler
@@ -24,3 +24,8 @@ config AS_GFNI
def_bool $(as-instr,vgf2p8mulb %xmm0$(comma)%xmm1$(comma)%xmm2)
help
Supported by binutils >= 2.30 and LLVM integrated assembler
+
+config AS_WRUSS
+ def_bool $(as-instr,wrussq %rax$(comma)(%rbx))
+ help
+ Supported by binutils >= 2.31 and LLVM integrated assembler
--
2.27.0
Initialize kvm_caps.supported_xss with host XSS msr value AND XSS mask.
KVM_SUPPORTED_XSS holds all potential supported feature bits, the result
represents all KVM supported feature bits which is used for swapping guest
and host FPU contents.
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 1 -
arch/x86/kvm/x86.c | 6 +++++-
2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 44fb619803b8..c872a5aafa50 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7806,7 +7806,6 @@ static __init void vmx_set_cpu_caps(void)
kvm_cpu_cap_set(X86_FEATURE_UMIP);
/* CPUID 0xD.1 */
- kvm_caps.supported_xss = 0;
if (!cpu_has_vmx_xsaves())
kvm_cpu_cap_clear(X86_FEATURE_XSAVES);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ab3360a10933..d2975ca96ac5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -223,6 +223,8 @@ static struct kvm_user_return_msrs __percpu *user_return_msrs;
| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
| XFEATURE_MASK_PKRU | XFEATURE_MASK_XTILE)
+#define KVM_SUPPORTED_XSS 0
+
u64 __read_mostly host_efer;
EXPORT_SYMBOL_GPL(host_efer);
@@ -9472,8 +9474,10 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
rdmsrl_safe(MSR_EFER, &host_efer);
- if (boot_cpu_has(X86_FEATURE_XSAVES))
+ if (boot_cpu_has(X86_FEATURE_XSAVES)) {
rdmsrl(MSR_IA32_XSS, host_xss);
+ kvm_caps.supported_xss = host_xss & KVM_SUPPORTED_XSS;
+ }
kvm_init_pmu_capability(ops->pmu_ops);
--
2.27.0
Introduce a host-only synthetic MSR, MSR_KVM_GUEST_SSP, so that the VMM
can read/write the guest's SSP, e.g. to migrate CET state. Use a synthetic
MSR, e.g. as opposed to a VCPU_REG_, as GUEST_SSP is subject to the same
consistency checks as the PL*_SSP MSRs, i.e. can share code.
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/include/uapi/asm/kvm_para.h | 1 +
arch/x86/kvm/vmx/vmx.c | 15 ++++++++++++---
2 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 6e64b27b2c1e..7af465e4e0bd 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -58,6 +58,7 @@
#define MSR_KVM_ASYNC_PF_INT 0x4b564d06
#define MSR_KVM_ASYNC_PF_ACK 0x4b564d07
#define MSR_KVM_MIGRATION_CONTROL 0x4b564d08
+#define MSR_KVM_GUEST_SSP 0x4b564d09
struct kvm_steal_time {
__u64 steal;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ae816c1c7367..42211ae40650 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1968,7 +1968,8 @@ static bool cet_is_msr_accessible(struct kvm_vcpu *vcpu,
!guest_cpuid_has(vcpu, X86_FEATURE_IBT))
return false;
- if (msr->index == MSR_IA32_PL3_SSP &&
+ if ((msr->index == MSR_IA32_PL3_SSP ||
+ msr->index == MSR_KVM_GUEST_SSP) &&
!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
return false;
@@ -2115,9 +2116,13 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
break;
case MSR_IA32_U_CET:
case MSR_IA32_PL3_SSP:
+ case MSR_KVM_GUEST_SSP:
if (!cet_is_msr_accessible(vcpu, msr_info))
return 1;
- kvm_get_xsave_msr(msr_info);
+ if (msr_info->index == MSR_KVM_GUEST_SSP)
+ msr_info->data = vmcs_readl(GUEST_SSP);
+ else
+ kvm_get_xsave_msr(msr_info);
break;
case MSR_IA32_DEBUGCTLMSR:
msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
@@ -2440,12 +2445,16 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
kvm_set_xsave_msr(msr_info);
break;
case MSR_IA32_PL3_SSP:
+ case MSR_KVM_GUEST_SSP:
if (!cet_is_msr_accessible(vcpu, msr_info))
return 1;
if ((data & GENMASK(2, 0)) ||
is_noncanonical_address(data, vcpu))
return 1;
- kvm_set_xsave_msr(msr_info);
+ if (msr_index == MSR_KVM_GUEST_SSP)
+ vmcs_writel(GUEST_SSP, data);
+ else
+ kvm_set_xsave_msr(msr_info);
break;
case MSR_IA32_PERF_CAPABILITIES:
if (data && !vcpu_to_pmu(vcpu)->version)
--
2.27.0
Check potential faults for CR4.CET setting per Intel SDM.
CR4.CET is the master control bit for CET features (SHSTK and IBT).
In addition to basic support checks, CET can be enabled if and only
if CR0.WP==1, i.e. setting CR4.CET=1 faults if CR0.WP==0 and setting
CR0.WP=0 fails if CR4.CET==1.
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/x86.c | 6 ++++++
arch/x86/kvm/x86.h | 3 +++
2 files changed, 9 insertions(+)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a768cbf3fbb7..7cd7f6755acd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -995,6 +995,9 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
(is_64_bit_mode(vcpu) || kvm_is_cr4_bit_set(vcpu, X86_CR4_PCIDE)))
return 1;
+ if (!(cr0 & X86_CR0_WP) && kvm_read_cr4_bits(vcpu, X86_CR4_CET))
+ return 1;
+
static_call(kvm_x86_set_cr0)(vcpu, cr0);
kvm_post_set_cr0(vcpu, old_cr0, cr0);
@@ -1210,6 +1213,9 @@ int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
return 1;
}
+ if ((cr4 & X86_CR4_CET) && !(kvm_read_cr0(vcpu) & X86_CR0_WP))
+ return 1;
+
static_call(kvm_x86_set_cr4)(vcpu, cr4);
kvm_post_set_cr4(vcpu, old_cr4, cr4);
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 2ba7c7fc4846..daadd5330dae 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -536,6 +536,9 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type);
__reserved_bits |= X86_CR4_VMXE; \
if (!__cpu_has(__c, X86_FEATURE_PCID)) \
__reserved_bits |= X86_CR4_PCIDE; \
+ if (!__cpu_has(__c, X86_FEATURE_SHSTK) && \
+ !__cpu_has(__c, X86_FEATURE_IBT)) \
+ __reserved_bits |= X86_CR4_CET; \
__reserved_bits; \
})
--
2.27.0
From: Rick Edgecombe <[email protected]>
Setting CR4.CET is a prerequisite for utilizing any CET features, most of
which also require setting MSRs.
Kernel IBT already enables the CET CR4 bit when it detects IBT HW support
and is configured with kernel IBT. However, future patches that enable
userspace shadow stack support will need the bit set as well. So change
the logic to enable it in either case.
Clear MSR_IA32_U_CET in cet_disable() so that it can't live to see
userspace in a new kexec-ed kernel that has CR4.CET set from kernel IBT.
Co-developed-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Rick Edgecombe <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Tested-by: John Allen <[email protected]>
Tested-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/all/20230319001535.23210-5-rick.p.edgecombe%40intel.com
---
arch/x86/kernel/cpu/common.c | 35 +++++++++++++++++++++++++++--------
1 file changed, 27 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 8cd4126d8253..cc686e5039be 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -600,27 +600,43 @@ __noendbr void ibt_restore(u64 save)
static __always_inline void setup_cet(struct cpuinfo_x86 *c)
{
- u64 msr = CET_ENDBR_EN;
+ bool user_shstk, kernel_ibt;
- if (!HAS_KERNEL_IBT ||
- !cpu_feature_enabled(X86_FEATURE_IBT))
+ if (!IS_ENABLED(CONFIG_X86_CET))
return;
- wrmsrl(MSR_IA32_S_CET, msr);
+ kernel_ibt = HAS_KERNEL_IBT && cpu_feature_enabled(X86_FEATURE_IBT);
+ user_shstk = cpu_feature_enabled(X86_FEATURE_SHSTK) &&
+ IS_ENABLED(CONFIG_X86_USER_SHADOW_STACK);
+
+ if (!kernel_ibt && !user_shstk)
+ return;
+
+ if (user_shstk)
+ set_cpu_cap(c, X86_FEATURE_USER_SHSTK);
+
+ if (kernel_ibt)
+ wrmsrl(MSR_IA32_S_CET, CET_ENDBR_EN);
+ else
+ wrmsrl(MSR_IA32_S_CET, 0);
+
cr4_set_bits(X86_CR4_CET);
- if (!ibt_selftest()) {
+ if (kernel_ibt && !ibt_selftest()) {
pr_err("IBT selftest: Failed!\n");
wrmsrl(MSR_IA32_S_CET, 0);
setup_clear_cpu_cap(X86_FEATURE_IBT);
- return;
}
}
__noendbr void cet_disable(void)
{
- if (cpu_feature_enabled(X86_FEATURE_IBT))
- wrmsrl(MSR_IA32_S_CET, 0);
+ if (!(cpu_feature_enabled(X86_FEATURE_IBT) ||
+ cpu_feature_enabled(X86_FEATURE_SHSTK)))
+ return;
+
+ wrmsrl(MSR_IA32_S_CET, 0);
+ wrmsrl(MSR_IA32_U_CET, 0);
}
/*
@@ -1482,6 +1498,9 @@ static void __init cpu_parse_early_param(void)
if (cmdline_find_option_bool(boot_command_line, "noxsaves"))
setup_clear_cpu_cap(X86_FEATURE_XSAVES);
+ if (cmdline_find_option_bool(boot_command_line, "nousershstk"))
+ setup_clear_cpu_cap(X86_FEATURE_USER_SHSTK);
+
arglen = cmdline_find_option(boot_command_line, "clearcpuid", arg, sizeof(arg));
if (arglen <= 0)
return;
--
2.27.0
Save GUEST_SSP to SMM state save area when guest exits to SMM
due to SMI and restore it VMCS field when guest exits SMM.
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/smm.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/arch/x86/kvm/smm.c b/arch/x86/kvm/smm.c
index b42111a24cc2..c54d3eb2b7e4 100644
--- a/arch/x86/kvm/smm.c
+++ b/arch/x86/kvm/smm.c
@@ -275,6 +275,16 @@ static void enter_smm_save_state_64(struct kvm_vcpu *vcpu,
enter_smm_save_seg_64(vcpu, &smram->gs, VCPU_SREG_GS);
smram->int_shadow = static_call(kvm_x86_get_interrupt_shadow)(vcpu);
+
+ if (kvm_cet_user_supported()) {
+ struct msr_data msr;
+
+ msr.index = MSR_KVM_GUEST_SSP;
+ msr.host_initiated = true;
+ /* GUEST_SSP is stored in VMCS at vm-exit. */
+ static_call(kvm_x86_get_msr)(vcpu, &msr);
+ smram->ssp = msr.data;
+ }
}
#endif
@@ -565,6 +575,16 @@ static int rsm_load_state_64(struct x86_emulate_ctxt *ctxt,
static_call(kvm_x86_set_interrupt_shadow)(vcpu, 0);
ctxt->interruptibility = (u8)smstate->int_shadow;
+ if (kvm_cet_user_supported()) {
+ struct msr_data msr;
+
+ msr.index = MSR_KVM_GUEST_SSP;
+ msr.host_initiated = true;
+ msr.data = smstate->ssp;
+ /* Mimic host_initiated access to bypass ssp access check. */
+ static_call(kvm_x86_set_msr)(vcpu, &msr);
+ }
+
return X86EMUL_CONTINUE;
}
#endif
--
2.27.0
Add handling for Control Protection (#CP) exceptions(vector 21).
The new vector is introduced for Intel's Control-Flow Enforcement
Technology (CET) relevant violation cases.
See Intel's SDM for details.
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/include/uapi/asm/kvm.h | 1 +
arch/x86/kvm/vmx/nested.c | 2 +-
arch/x86/kvm/x86.c | 10 +++++++---
arch/x86/kvm/x86.h | 13 ++++++++++---
4 files changed, 19 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 7f467fe05d42..1c002abe2be8 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -33,6 +33,7 @@
#define MC_VECTOR 18
#define XM_VECTOR 19
#define VE_VECTOR 20
+#define CP_VECTOR 21
/* Select x86 specific features in <linux/kvm.h> */
#define __KVM_HAVE_PIT
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 96ede74a6067..7bc62cd72748 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2850,7 +2850,7 @@ static int nested_check_vm_entry_controls(struct kvm_vcpu *vcpu,
/* VM-entry interruption-info field: deliver error code */
should_have_error_code =
intr_type == INTR_TYPE_HARD_EXCEPTION && prot_mode &&
- x86_exception_has_error_code(vector);
+ x86_exception_has_error_code(vcpu, vector);
if (CC(has_error_code != should_have_error_code))
return -EINVAL;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7788646bbf1f..a768cbf3fbb7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -520,11 +520,15 @@ EXPORT_SYMBOL_GPL(kvm_spurious_fault);
#define EXCPT_CONTRIBUTORY 1
#define EXCPT_PF 2
-static int exception_class(int vector)
+static int exception_class(struct kvm_vcpu *vcpu, int vector)
{
switch (vector) {
case PF_VECTOR:
return EXCPT_PF;
+ case CP_VECTOR:
+ if (vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET)
+ return EXCPT_BENIGN;
+ return EXCPT_CONTRIBUTORY;
case DE_VECTOR:
case TS_VECTOR:
case NP_VECTOR:
@@ -707,8 +711,8 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
return;
}
- class1 = exception_class(prev_nr);
- class2 = exception_class(nr);
+ class1 = exception_class(vcpu, prev_nr);
+ class2 = exception_class(vcpu, nr);
if ((class1 == EXCPT_CONTRIBUTORY && class2 == EXCPT_CONTRIBUTORY) ||
(class1 == EXCPT_PF && class2 != EXCPT_BENIGN)) {
/*
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index c544602d07a3..2ba7c7fc4846 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -171,13 +171,20 @@ static inline bool is_64_bit_hypercall(struct kvm_vcpu *vcpu)
return vcpu->arch.guest_state_protected || is_64_bit_mode(vcpu);
}
-static inline bool x86_exception_has_error_code(unsigned int vector)
+static inline bool x86_exception_has_error_code(struct kvm_vcpu *vcpu,
+ unsigned int vector)
{
static u32 exception_has_error_code = BIT(DF_VECTOR) | BIT(TS_VECTOR) |
BIT(NP_VECTOR) | BIT(SS_VECTOR) | BIT(GP_VECTOR) |
- BIT(PF_VECTOR) | BIT(AC_VECTOR);
+ BIT(PF_VECTOR) | BIT(AC_VECTOR) | BIT(CP_VECTOR);
- return (1U << vector) & exception_has_error_code;
+ if (!((1U << vector) & exception_has_error_code))
+ return false;
+
+ if (vector == CP_VECTOR)
+ return !(vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET);
+
+ return true;
}
static inline bool mmu_is_nested(struct kvm_vcpu *vcpu)
--
2.27.0
Add support for emulating read and write accesses to CET MSRs.
CET MSRs are universally "special" as they are either context switched
via dedicated VMCS fields or via XSAVES, i.e. no additional in-memory
tracking is needed, but emulated reads/writes are more expensive.
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kernel/fpu/core.c | 1 +
arch/x86/kvm/vmx/vmx.c | 42 ++++++++++++++++++++++++++++++++++++++
arch/x86/kvm/x86.h | 30 +++++++++++++++++++++++++++
3 files changed, 73 insertions(+)
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index f851558b673f..b4e28487882c 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -770,6 +770,7 @@ void fpregs_lock_and_load(void)
if (test_thread_flag(TIF_NEED_FPU_LOAD))
fpregs_restore_userregs();
}
+EXPORT_SYMBOL_GPL(fpregs_lock_and_load);
#ifdef CONFIG_X86_DEBUG_FPU
/*
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c872a5aafa50..ae816c1c7367 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1955,6 +1955,26 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
}
}
+static bool cet_is_msr_accessible(struct kvm_vcpu *vcpu,
+ struct msr_data *msr)
+{
+ if (!kvm_cet_user_supported())
+ return false;
+
+ if (msr->host_initiated)
+ return true;
+
+ if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
+ !guest_cpuid_has(vcpu, X86_FEATURE_IBT))
+ return false;
+
+ if (msr->index == MSR_IA32_PL3_SSP &&
+ !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
+ return false;
+
+ return true;
+}
+
/*
* Reads an msr value (of 'msr_info->index') into 'msr_info->data'.
* Returns 0 on success, non-0 otherwise.
@@ -2093,6 +2113,12 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
else
msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
break;
+ case MSR_IA32_U_CET:
+ case MSR_IA32_PL3_SSP:
+ if (!cet_is_msr_accessible(vcpu, msr_info))
+ return 1;
+ kvm_get_xsave_msr(msr_info);
+ break;
case MSR_IA32_DEBUGCTLMSR:
msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
break;
@@ -2405,6 +2431,22 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
else
vmx->pt_desc.guest.addr_a[index / 2] = data;
break;
+ case MSR_IA32_U_CET:
+ if (!cet_is_msr_accessible(vcpu, msr_info))
+ return 1;
+ if ((data & GENMASK(9, 6)) ||
+ is_noncanonical_address(data, vcpu))
+ return 1;
+ kvm_set_xsave_msr(msr_info);
+ break;
+ case MSR_IA32_PL3_SSP:
+ if (!cet_is_msr_accessible(vcpu, msr_info))
+ return 1;
+ if ((data & GENMASK(2, 0)) ||
+ is_noncanonical_address(data, vcpu))
+ return 1;
+ kvm_set_xsave_msr(msr_info);
+ break;
case MSR_IA32_PERF_CAPABILITIES:
if (data && !vcpu_to_pmu(vcpu)->version)
return 1;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index daadd5330dae..52cd02a6bfec 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -2,6 +2,7 @@
#ifndef ARCH_X86_KVM_X86_H
#define ARCH_X86_KVM_X86_H
+#include <asm/fpu/api.h>
#include <linux/kvm_host.h>
#include <asm/fpu/xstate.h>
#include <asm/mce.h>
@@ -370,6 +371,16 @@ static inline bool kvm_mpx_supported(void)
== (XFEATURE_MASK_BNDREGS | XFEATURE_MASK_BNDCSR);
}
+/*
+ * Guest CET user mode states depend on host XSAVES/XRSTORS to save/restore
+ * when vCPU enter/exit user space. If host doesn't support CET user bit in
+ * XSS msr, then treat this case as KVM doesn't support CET user mode.
+ */
+static inline bool kvm_cet_user_supported(void)
+{
+ return !!(kvm_caps.supported_xss & XFEATURE_MASK_CET_USER);
+}
+
extern unsigned int min_timer_period_us;
extern bool enable_vmware_backdoor;
@@ -550,4 +561,23 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
unsigned int port, void *data, unsigned int count,
int in);
+/*
+ * We've already loaded guest MSRs in __msr_io() after check the MSR index.
+ * In case vcpu has been preempted, we need to disable preemption, check
+ * and reload the guest fpu states before read/write xsaves-managed MSRs.
+ */
+static inline void kvm_get_xsave_msr(struct msr_data *msr_info)
+{
+ fpregs_lock_and_load();
+ rdmsrl(msr_info->index, msr_info->data);
+ fpregs_unlock();
+}
+
+static inline void kvm_set_xsave_msr(struct msr_data *msr_info)
+{
+ fpregs_lock_and_load();
+ wrmsrl(msr_info->index, msr_info->data);
+ fpregs_unlock();
+}
+
#endif
--
2.27.0
Add all CET fields to vmcs12 as L1 KVM touches them when CET is
enabled for L2. Pass through CET MSRs to L2 when L1 can support
and enumerate the VMCS control bits together with CR4 bit as
supported.
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/vmx/nested.c | 12 ++++++++++--
arch/x86/kvm/vmx/vmcs12.c | 6 ++++++
arch/x86/kvm/vmx/vmcs12.h | 14 +++++++++++++-
arch/x86/kvm/vmx/vmx.c | 2 ++
4 files changed, 31 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 7bc62cd72748..522ac27d2534 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -660,6 +660,13 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
MSR_IA32_FLUSH_CMD, MSR_TYPE_W);
+ /* Pass CET MSRs to nested VM if L0 and L1 are set to pass-through. */
+ nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
+ MSR_IA32_U_CET, MSR_TYPE_RW);
+
+ nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
+ MSR_IA32_PL3_SSP, MSR_TYPE_RW);
+
kvm_vcpu_unmap(vcpu, &vmx->nested.msr_bitmap_map, false);
vmx->nested.force_msr_bitmap_recalc = false;
@@ -6785,7 +6792,7 @@ static void nested_vmx_setup_exit_ctls(struct vmcs_config *vmcs_conf,
VM_EXIT_HOST_ADDR_SPACE_SIZE |
#endif
VM_EXIT_LOAD_IA32_PAT | VM_EXIT_SAVE_IA32_PAT |
- VM_EXIT_CLEAR_BNDCFGS;
+ VM_EXIT_CLEAR_BNDCFGS | VM_EXIT_LOAD_CET_STATE;
msrs->exit_ctls_high |=
VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR |
VM_EXIT_LOAD_IA32_EFER | VM_EXIT_SAVE_IA32_EFER |
@@ -6807,7 +6814,8 @@ static void nested_vmx_setup_entry_ctls(struct vmcs_config *vmcs_conf,
#ifdef CONFIG_X86_64
VM_ENTRY_IA32E_MODE |
#endif
- VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS;
+ VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS |
+ VM_ENTRY_LOAD_CET_STATE;
msrs->entry_ctls_high |=
(VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR | VM_ENTRY_LOAD_IA32_EFER |
VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL);
diff --git a/arch/x86/kvm/vmx/vmcs12.c b/arch/x86/kvm/vmx/vmcs12.c
index 106a72c923ca..4233b5ca9461 100644
--- a/arch/x86/kvm/vmx/vmcs12.c
+++ b/arch/x86/kvm/vmx/vmcs12.c
@@ -139,6 +139,9 @@ const unsigned short vmcs12_field_offsets[] = {
FIELD(GUEST_PENDING_DBG_EXCEPTIONS, guest_pending_dbg_exceptions),
FIELD(GUEST_SYSENTER_ESP, guest_sysenter_esp),
FIELD(GUEST_SYSENTER_EIP, guest_sysenter_eip),
+ FIELD(GUEST_S_CET, guest_s_cet),
+ FIELD(GUEST_SSP, guest_ssp),
+ FIELD(GUEST_INTR_SSP_TABLE, guest_ssp_tbl),
FIELD(HOST_CR0, host_cr0),
FIELD(HOST_CR3, host_cr3),
FIELD(HOST_CR4, host_cr4),
@@ -151,5 +154,8 @@ const unsigned short vmcs12_field_offsets[] = {
FIELD(HOST_IA32_SYSENTER_EIP, host_ia32_sysenter_eip),
FIELD(HOST_RSP, host_rsp),
FIELD(HOST_RIP, host_rip),
+ FIELD(HOST_S_CET, host_s_cet),
+ FIELD(HOST_SSP, host_ssp),
+ FIELD(HOST_INTR_SSP_TABLE, host_ssp_tbl),
};
const unsigned int nr_vmcs12_fields = ARRAY_SIZE(vmcs12_field_offsets);
diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
index 01936013428b..3884489e7f7e 100644
--- a/arch/x86/kvm/vmx/vmcs12.h
+++ b/arch/x86/kvm/vmx/vmcs12.h
@@ -117,7 +117,13 @@ struct __packed vmcs12 {
natural_width host_ia32_sysenter_eip;
natural_width host_rsp;
natural_width host_rip;
- natural_width paddingl[8]; /* room for future expansion */
+ natural_width host_s_cet;
+ natural_width host_ssp;
+ natural_width host_ssp_tbl;
+ natural_width guest_s_cet;
+ natural_width guest_ssp;
+ natural_width guest_ssp_tbl;
+ natural_width paddingl[2]; /* room for future expansion */
u32 pin_based_vm_exec_control;
u32 cpu_based_vm_exec_control;
u32 exception_bitmap;
@@ -292,6 +298,12 @@ static inline void vmx_check_vmcs12_offsets(void)
CHECK_OFFSET(host_ia32_sysenter_eip, 656);
CHECK_OFFSET(host_rsp, 664);
CHECK_OFFSET(host_rip, 672);
+ CHECK_OFFSET(host_s_cet, 680);
+ CHECK_OFFSET(host_ssp, 688);
+ CHECK_OFFSET(host_ssp_tbl, 696);
+ CHECK_OFFSET(guest_s_cet, 704);
+ CHECK_OFFSET(guest_ssp, 712);
+ CHECK_OFFSET(guest_ssp_tbl, 720);
CHECK_OFFSET(pin_based_vm_exec_control, 744);
CHECK_OFFSET(cpu_based_vm_exec_control, 748);
CHECK_OFFSET(exception_bitmap, 752);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index dec7a8b81388..db4aacbcba7f 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7669,6 +7669,8 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
cr4_fixed1_update(X86_CR4_PKE, ecx, feature_bit(PKU));
cr4_fixed1_update(X86_CR4_UMIP, ecx, feature_bit(UMIP));
cr4_fixed1_update(X86_CR4_LA57, ecx, feature_bit(LA57));
+ cr4_fixed1_update(X86_CR4_CET, ecx, feature_bit(SHSTK));
+ cr4_fixed1_update(X86_CR4_CET, edx, feature_bit(IBT));
#undef cr4_fixed1_update
}
--
2.27.0
Set the feature bits so that CET capabilities can be seen in guest via
CPUID enumeration. Add CR4.CET bit support in order to allow guest set
CET master control bit(CR4.CET).
Disable KVM CET feature if unrestricted_guest is unsupported/disabled as
KVM does not support emulating CET.
Don't expose CET feature if dependent CET bits are cleared in host XSS,
or if XSAVES isn't supported. Updating the CET features in common x86 is
a little ugly, but there is no clean solution without risking breakage of
SVM if SVM hardware ever gains support for CET, e.g. moving everything to
common x86 would prematurely expose CET on SVM. The alternative is to
put all the logic in VMX, but that means rereading host_xss in VMX and
duplicating the XSAVES check across VMX and SVM.
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 3 ++-
arch/x86/kvm/cpuid.c | 12 ++++++++++--
arch/x86/kvm/vmx/capabilities.h | 4 ++++
arch/x86/kvm/vmx/vmx.c | 19 +++++++++++++++++++
arch/x86/kvm/vmx/vmx.h | 6 ++++--
arch/x86/kvm/x86.c | 21 ++++++++++++++++++++-
6 files changed, 59 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2865c3cb3501..58e20d5895d1 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -125,7 +125,8 @@
| X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
| X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
| X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
- | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+ | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
+ | X86_CR4_CET))
#define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index dd6d5150d86a..033a2f1a5c3f 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -634,7 +634,7 @@ void kvm_set_cpu_caps(void)
F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
- F(SGX_LC) | F(BUS_LOCK_DETECT)
+ F(SGX_LC) | F(BUS_LOCK_DETECT) | F(SHSTK)
);
/* Set LA57 based on hardware capability. */
if (cpuid_ecx(7) & F(LA57))
@@ -652,7 +652,8 @@ void kvm_set_cpu_caps(void)
F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) |
F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16) |
- F(AMX_TILE) | F(AMX_INT8) | F(AMX_BF16) | F(FLUSH_L1D)
+ F(AMX_TILE) | F(AMX_INT8) | F(AMX_BF16) | F(FLUSH_L1D) |
+ F(IBT)
);
/* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */
@@ -665,6 +666,13 @@ void kvm_set_cpu_caps(void)
kvm_cpu_cap_set(X86_FEATURE_INTEL_STIBP);
if (boot_cpu_has(X86_FEATURE_AMD_SSBD))
kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL_SSBD);
+ /*
+ * The feature bit in boot_cpu_data.x86_capability could have been
+ * cleared due to ibt=off cmdline option, then add it back if CPU
+ * supports IBT.
+ */
+ if (cpuid_edx(7) & F(IBT))
+ kvm_cpu_cap_set(X86_FEATURE_IBT);
kvm_cpu_cap_mask(CPUID_7_1_EAX,
F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) |
diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 45162c1bcd8f..85cffeae7f10 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -106,6 +106,10 @@ static inline bool cpu_has_load_perf_global_ctrl(void)
return vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
}
+static inline bool cpu_has_load_cet_ctrl(void)
+{
+ return (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_CET_STATE);
+}
static inline bool cpu_has_vmx_mpx(void)
{
return vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_BNDCFGS;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 1ec7835c3060..dec7a8b81388 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2631,6 +2631,7 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
{ VM_ENTRY_LOAD_IA32_EFER, VM_EXIT_LOAD_IA32_EFER },
{ VM_ENTRY_LOAD_BNDCFGS, VM_EXIT_CLEAR_BNDCFGS },
{ VM_ENTRY_LOAD_IA32_RTIT_CTL, VM_EXIT_CLEAR_IA32_RTIT_CTL },
+ { VM_ENTRY_LOAD_CET_STATE, VM_EXIT_LOAD_CET_STATE },
};
memset(vmcs_conf, 0, sizeof(*vmcs_conf));
@@ -6340,6 +6341,12 @@ void dump_vmcs(struct kvm_vcpu *vcpu)
if (vmcs_read32(VM_EXIT_MSR_STORE_COUNT) > 0)
vmx_dump_msrs("guest autostore", &vmx->msr_autostore.guest);
+ if (vmentry_ctl & VM_ENTRY_LOAD_CET_STATE) {
+ pr_err("S_CET = 0x%016lx\n", vmcs_readl(GUEST_S_CET));
+ pr_err("SSP = 0x%016lx\n", vmcs_readl(GUEST_SSP));
+ pr_err("INTR SSP TABLE = 0x%016lx\n",
+ vmcs_readl(GUEST_INTR_SSP_TABLE));
+ }
pr_err("*** Host State ***\n");
pr_err("RIP = 0x%016lx RSP = 0x%016lx\n",
vmcs_readl(HOST_RIP), vmcs_readl(HOST_RSP));
@@ -6417,6 +6424,12 @@ void dump_vmcs(struct kvm_vcpu *vcpu)
if (secondary_exec_control & SECONDARY_EXEC_ENABLE_VPID)
pr_err("Virtual processor ID = 0x%04x\n",
vmcs_read16(VIRTUAL_PROCESSOR_ID));
+ if (vmexit_ctl & VM_EXIT_LOAD_CET_STATE) {
+ pr_err("S_CET = 0x%016lx\n", vmcs_readl(HOST_S_CET));
+ pr_err("SSP = 0x%016lx\n", vmcs_readl(HOST_SSP));
+ pr_err("INTR SSP TABLE = 0x%016lx\n",
+ vmcs_readl(HOST_INTR_SSP_TABLE));
+ }
}
/*
@@ -7891,6 +7904,12 @@ static __init void vmx_set_cpu_caps(void)
if (cpu_has_vmx_waitpkg())
kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);
+
+ if (!cpu_has_load_cet_ctrl() || !enable_unrestricted_guest) {
+ kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
+ kvm_cpu_cap_clear(X86_FEATURE_IBT);
+ kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_USER;
+ }
}
static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 9e66531861cf..5e3ba69006f9 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -493,7 +493,8 @@ static inline u8 vmx_get_rvi(void)
VM_ENTRY_LOAD_IA32_EFER | \
VM_ENTRY_LOAD_BNDCFGS | \
VM_ENTRY_PT_CONCEAL_PIP | \
- VM_ENTRY_LOAD_IA32_RTIT_CTL)
+ VM_ENTRY_LOAD_IA32_RTIT_CTL | \
+ VM_ENTRY_LOAD_CET_STATE)
#define __KVM_REQUIRED_VMX_VM_EXIT_CONTROLS \
(VM_EXIT_SAVE_DEBUG_CONTROLS | \
@@ -515,7 +516,8 @@ static inline u8 vmx_get_rvi(void)
VM_EXIT_LOAD_IA32_EFER | \
VM_EXIT_CLEAR_BNDCFGS | \
VM_EXIT_PT_CONCEAL_PIP | \
- VM_EXIT_CLEAR_IA32_RTIT_CTL)
+ VM_EXIT_CLEAR_IA32_RTIT_CTL | \
+ VM_EXIT_LOAD_CET_STATE)
#define KVM_REQUIRED_VMX_PIN_BASED_VM_EXEC_CONTROL \
(PIN_BASED_EXT_INTR_MASK | \
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 95dba3c3df5f..ba82b102600d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -226,7 +226,7 @@ static struct kvm_user_return_msrs __percpu *user_return_msrs;
| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
| XFEATURE_MASK_PKRU | XFEATURE_MASK_XTILE)
-#define KVM_SUPPORTED_XSS 0
+#define KVM_SUPPORTED_XSS (XFEATURE_MASK_CET_USER)
u64 __read_mostly host_efer;
EXPORT_SYMBOL_GPL(host_efer);
@@ -9525,6 +9525,25 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
kvm_ops_update(ops);
+ /*
+ * Check CET user bit is still set in kvm_caps.supported_xss,
+ * if not, clear the cap bits as the user parts depends on
+ * XSAVES support.
+ */
+ if (!kvm_cet_user_supported()) {
+ kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
+ kvm_cpu_cap_clear(X86_FEATURE_IBT);
+ }
+
+ /*
+ * If SHSTK and IBT are available in KVM, clear CET user bit in
+ * kvm_caps.supported_xss so that kvm_cet_user_supported() returns
+ * false when called.
+ */
+ if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
+ !kvm_cpu_cap_has(X86_FEATURE_IBT))
+ kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_USER;
+
for_each_online_cpu(cpu) {
smp_call_function_single(cpu, kvm_x86_check_cpu_compat, &r, 1);
if (r < 0)
--
2.27.0
From: Sean Christopherson <[email protected]>
Load the guest's FPU state if userspace is accessing MSRs whose values are
managed by XSAVES so that the MSR helpers, e.g. kvm_{get,set}_xsave_msr(),
can simply do {RD,WR}MSR to access the guest's value.
If new feature MSRs supported in XSS are passed through to the guest they
are saved and restored by XSAVES/XRSTORS, i.e. in the guest's FPU state.
Because is also used for the KVM_GET_MSRS device ioctl(), explicitly check
@vcpu is non-null before attempting to load guest state. The XSS supporting
MSRs cannot be retrieved via the device ioctl() without loading guest FPU
state (which doesn't exist).
Note that guest_cpuid_has() is not queried as host userspace is allowed
to access MSRs that have not been exposed to the guest, e.g. it might do
KVM_SET_MSRS prior to KVM_SET_CPUID2.
Signed-off-by: Sean Christopherson <[email protected]>
Co-developed-by: Yang Weijiang <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/x86.c | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d2975ca96ac5..7788646bbf1f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -130,6 +130,9 @@ static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
static DEFINE_MUTEX(vendor_module_lock);
+static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
+static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
+
struct kvm_x86_ops kvm_x86_ops __read_mostly;
#define KVM_X86_OP(func) \
@@ -4336,6 +4339,21 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
}
EXPORT_SYMBOL_GPL(kvm_get_msr_common);
+static const u32 xsave_msrs[] = {
+ MSR_IA32_U_CET, MSR_IA32_PL3_SSP,
+};
+
+static bool is_xsaves_msr(u32 index)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(xsave_msrs); i++) {
+ if (index == xsave_msrs[i])
+ return true;
+ }
+ return false;
+}
+
/*
* Read or write a bunch of msrs. All parameters are kernel addresses.
*
@@ -4346,11 +4364,20 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
int (*do_msr)(struct kvm_vcpu *vcpu,
unsigned index, u64 *data))
{
+ bool fpu_loaded = false;
int i;
- for (i = 0; i < msrs->nmsrs; ++i)
+ for (i = 0; i < msrs->nmsrs; ++i) {
+ if (vcpu && !fpu_loaded && kvm_caps.supported_xss &&
+ is_xsaves_msr(entries[i].index)) {
+ kvm_load_guest_fpu(vcpu);
+ fpu_loaded = true;
+ }
if (do_msr(vcpu, entries[i].index, &entries[i].data))
break;
+ }
+ if (fpu_loaded)
+ kvm_put_guest_fpu(vcpu);
return i;
}
--
2.27.0
Add support for MSR_IA32_S_CET and GUEST_S_CET access.
Mainline Linux kernel now supports supervisor IBT for kernel code,
to make s-IBT work in guest(nested guest), pass through MSR_IA32_S_CET
to guest(nested guest) if host kernel and KVM enabled IBT.
Note, s-IBT can work independent to host xsaves support because guest
MSR_IA32_S_CET is {stored|loaded} from VMCS GUEST_S_CET field.
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/vmx/nested.c | 3 +++
arch/x86/kvm/vmx/vmx.c | 21 ++++++++++++++++++---
arch/x86/kvm/x86.c | 1 +
3 files changed, 22 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 522ac27d2534..bf690827bfee 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -664,6 +664,9 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
MSR_IA32_U_CET, MSR_TYPE_RW);
+ nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
+ MSR_IA32_S_CET, MSR_TYPE_RW);
+
nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
MSR_IA32_PL3_SSP, MSR_TYPE_RW);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index db4aacbcba7f..6eab3e452bbb 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -711,6 +711,7 @@ static bool is_valid_passthrough_msr(u32 msr)
return true;
case MSR_IA32_U_CET:
case MSR_IA32_PL3_SSP:
+ case MSR_IA32_S_CET:
return true;
}
@@ -1961,7 +1962,8 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
static bool cet_is_msr_accessible(struct kvm_vcpu *vcpu,
struct msr_data *msr)
{
- if (!kvm_cet_user_supported())
+ if (!kvm_cet_user_supported() &&
+ !kvm_cpu_cap_has(X86_FEATURE_IBT))
return false;
if (msr->host_initiated)
@@ -1971,6 +1973,9 @@ static bool cet_is_msr_accessible(struct kvm_vcpu *vcpu,
!guest_cpuid_has(vcpu, X86_FEATURE_IBT))
return false;
+ if (msr->index == MSR_IA32_S_CET)
+ return guest_cpuid_has(vcpu, X86_FEATURE_IBT);
+
if ((msr->index == MSR_IA32_PL3_SSP ||
msr->index == MSR_KVM_GUEST_SSP) &&
!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
@@ -2120,10 +2125,13 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_IA32_U_CET:
case MSR_IA32_PL3_SSP:
case MSR_KVM_GUEST_SSP:
+ case MSR_IA32_S_CET:
if (!cet_is_msr_accessible(vcpu, msr_info))
return 1;
if (msr_info->index == MSR_KVM_GUEST_SSP)
msr_info->data = vmcs_readl(GUEST_SSP);
+ else if (msr_info->index == MSR_IA32_S_CET)
+ msr_info->data = vmcs_readl(GUEST_S_CET);
else
kvm_get_xsave_msr(msr_info);
break;
@@ -2440,12 +2448,16 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
vmx->pt_desc.guest.addr_a[index / 2] = data;
break;
case MSR_IA32_U_CET:
+ case MSR_IA32_S_CET:
if (!cet_is_msr_accessible(vcpu, msr_info))
return 1;
if ((data & GENMASK(9, 6)) ||
is_noncanonical_address(data, vcpu))
return 1;
- kvm_set_xsave_msr(msr_info);
+ if (msr_index == MSR_IA32_S_CET)
+ vmcs_writel(GUEST_S_CET, data);
+ else
+ kvm_set_xsave_msr(msr_info);
break;
case MSR_IA32_PL3_SSP:
case MSR_KVM_GUEST_SSP:
@@ -7759,6 +7771,9 @@ static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu)
incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, MSR_TYPE_RW, incpt);
+
+ incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_IBT);
+ vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, MSR_TYPE_RW, incpt);
}
static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
@@ -7829,7 +7844,7 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
/* Refresh #PF interception to account for MAXPHYADDR changes. */
vmx_update_exception_bitmap(vcpu);
- if (kvm_cet_user_supported())
+ if (kvm_cet_user_supported() || kvm_cpu_cap_has(X86_FEATURE_IBT))
vmx_update_intercept_for_cet_msr(vcpu);
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ba82b102600d..51fccbd2d3e7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1471,6 +1471,7 @@ static const u32 msrs_to_save_base[] = {
MSR_IA32_XFD, MSR_IA32_XFD_ERR,
MSR_IA32_XSS,
MSR_IA32_U_CET, MSR_IA32_PL3_SSP, MSR_KVM_GUEST_SSP,
+ MSR_IA32_S_CET,
};
static const u32 msrs_to_save_pmu[] = {
--
2.27.0
Add MSR access interfaces for supervisor shadow stack, i.e.,
MSR_IA32_PL{0,1,2} and MSR_IA32_INT_SSP_TAB, meanwhile pass through
them to {L1,L2} guests when {L0,L1} KVM supports supervisor shadow
stack.
Note, currently supervisor shadow stack is not supported on Intel
platforms, i.e., VMX always clears CPUID(EAX=07H,ECX=1).EDX.[bit 18].
The main purpose of this patch is to facilitate AMD folks to enable
supervisor shadow stack for their platforms.
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/cpuid.h | 6 +++++
arch/x86/kvm/vmx/nested.c | 12 +++++++++
arch/x86/kvm/vmx/vmx.c | 51 ++++++++++++++++++++++++++++++++++-----
3 files changed, 63 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index b1658c0de847..019a16b25b88 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -232,4 +232,10 @@ static __always_inline bool guest_pv_has(struct kvm_vcpu *vcpu,
return vcpu->arch.pv_cpuid.features & (1u << kvm_feature);
}
+static __always_inline bool kvm_cet_kernel_shstk_supported(void)
+{
+ return !IS_ENABLED(CONFIG_KVM_INTEL) &&
+ kvm_cpu_cap_has(X86_FEATURE_SHSTK);
+}
+
#endif
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index bf690827bfee..aaaae92dc9f6 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -670,6 +670,18 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
MSR_IA32_PL3_SSP, MSR_TYPE_RW);
+ nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
+ MSR_IA32_PL0_SSP, MSR_TYPE_RW);
+
+ nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
+ MSR_IA32_PL1_SSP, MSR_TYPE_RW);
+
+ nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
+ MSR_IA32_PL2_SSP, MSR_TYPE_RW);
+
+ nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
+ MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW);
+
kvm_vcpu_unmap(vcpu, &vmx->nested.msr_bitmap_map, false);
vmx->nested.force_msr_bitmap_recalc = false;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6eab3e452bbb..074b618f1a07 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -713,6 +713,9 @@ static bool is_valid_passthrough_msr(u32 msr)
case MSR_IA32_PL3_SSP:
case MSR_IA32_S_CET:
return true;
+ case MSR_IA32_PL0_SSP ... MSR_IA32_PL2_SSP:
+ case MSR_IA32_INT_SSP_TAB:
+ return true;
}
r = possible_passthrough_msr_slot(msr) != -ENOENT;
@@ -1962,8 +1965,11 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
static bool cet_is_msr_accessible(struct kvm_vcpu *vcpu,
struct msr_data *msr)
{
+ u64 mask;
+
if (!kvm_cet_user_supported() &&
- !kvm_cpu_cap_has(X86_FEATURE_IBT))
+ !(kvm_cpu_cap_has(X86_FEATURE_IBT) ||
+ kvm_cpu_cap_has(X86_FEATURE_SHSTK)))
return false;
if (msr->host_initiated)
@@ -1973,15 +1979,27 @@ static bool cet_is_msr_accessible(struct kvm_vcpu *vcpu,
!guest_cpuid_has(vcpu, X86_FEATURE_IBT))
return false;
+ if (msr->index == MSR_IA32_U_CET)
+ return true;
+
if (msr->index == MSR_IA32_S_CET)
- return guest_cpuid_has(vcpu, X86_FEATURE_IBT);
+ return guest_cpuid_has(vcpu, X86_FEATURE_IBT) ||
+ kvm_cet_kernel_shstk_supported();
- if ((msr->index == MSR_IA32_PL3_SSP ||
- msr->index == MSR_KVM_GUEST_SSP) &&
+ if (msr->index == MSR_KVM_GUEST_SSP)
+ return guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
+
+ if (msr->index == MSR_IA32_INT_SSP_TAB)
+ return guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
+ kvm_cet_kernel_shstk_supported();
+
+ if (msr->index == MSR_IA32_PL3_SSP &&
!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
return false;
- return true;
+ mask = (msr->index == MSR_IA32_PL3_SSP) ? XFEATURE_MASK_CET_USER :
+ XFEATURE_MASK_CET_KERNEL;
+ return !!(kvm_caps.supported_xss & mask);
}
/*
@@ -2135,6 +2153,12 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
else
kvm_get_xsave_msr(msr_info);
break;
+ case MSR_IA32_PL0_SSP ... MSR_IA32_PL2_SSP:
+ case MSR_IA32_INT_SSP_TAB:
+ if (!cet_is_msr_accessible(vcpu, msr_info))
+ return 1;
+ kvm_get_xsave_msr(msr_info);
+ break;
case MSR_IA32_DEBUGCTLMSR:
msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
break;
@@ -2471,6 +2495,12 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
else
kvm_set_xsave_msr(msr_info);
break;
+ case MSR_IA32_PL0_SSP ... MSR_IA32_PL2_SSP:
+ case MSR_IA32_INT_SSP_TAB:
+ if (!cet_is_msr_accessible(vcpu, msr_info))
+ return 1;
+ kvm_set_xsave_msr(msr_info);
+ break;
case MSR_IA32_PERF_CAPABILITIES:
if (data && !vcpu_to_pmu(vcpu)->version)
return 1;
@@ -7774,6 +7804,14 @@ static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu)
incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_IBT);
vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, MSR_TYPE_RW, incpt);
+
+ incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_KERNEL);
+ incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
+
+ vmx_set_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW, incpt);
+ vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP, MSR_TYPE_RW, incpt);
+ vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP, MSR_TYPE_RW, incpt);
+ vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP, MSR_TYPE_RW, incpt);
}
static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
@@ -7844,7 +7882,8 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
/* Refresh #PF interception to account for MAXPHYADDR changes. */
vmx_update_exception_bitmap(vcpu);
- if (kvm_cet_user_supported() || kvm_cpu_cap_has(X86_FEATURE_IBT))
+ if (kvm_cet_user_supported() || kvm_cpu_cap_has(X86_FEATURE_IBT) ||
+ kvm_cpu_cap_has(X86_FEATURE_SHSTK))
vmx_update_intercept_for_cet_msr(vcpu);
}
--
2.27.0
From: Sean Christopherson <[email protected]>
Add MSR_IA32_XSS to the list of MSRs reported to userspace if
supported_xss is non-zero, i.e. KVM supports at least one XSS based
feature.
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/x86.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e7f78fe79b32..33a780fe820b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1454,6 +1454,7 @@ static const u32 msrs_to_save_base[] = {
MSR_IA32_UMWAIT_CONTROL,
MSR_IA32_XFD, MSR_IA32_XFD_ERR,
+ MSR_IA32_XSS,
};
static const u32 msrs_to_save_pmu[] = {
--
2.27.0
Pass through CET user mode MSRs when the associated CET component
is enabled to improve guest performance. All CET MSRs are context
switched, either via dedicated VMCS fields or XSAVES.
Co-developed-by: Zhang Yi Z <[email protected]>
Signed-off-by: Zhang Yi Z <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 42211ae40650..1ec7835c3060 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -709,6 +709,9 @@ static bool is_valid_passthrough_msr(u32 msr)
case MSR_LBR_CORE_TO ... MSR_LBR_CORE_TO + 8:
/* LBR MSRs. These are handled in vmx_update_intercept_for_lbr_msrs() */
return true;
+ case MSR_IA32_U_CET:
+ case MSR_IA32_PL3_SSP:
+ return true;
}
r = possible_passthrough_msr_slot(msr) != -ENOENT;
@@ -7726,6 +7729,23 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4));
}
+static bool is_cet_state_supported(struct kvm_vcpu *vcpu, u32 xss_state)
+{
+ return (kvm_caps.supported_xss & xss_state) &&
+ (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
+ guest_cpuid_has(vcpu, X86_FEATURE_IBT));
+}
+
+static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu)
+{
+ bool incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER);
+
+ vmx_set_intercept_for_msr(vcpu, MSR_IA32_U_CET, MSR_TYPE_RW, incpt);
+
+ incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
+ vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, MSR_TYPE_RW, incpt);
+}
+
static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -7793,6 +7813,9 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
/* Refresh #PF interception to account for MAXPHYADDR changes. */
vmx_update_exception_bitmap(vcpu);
+
+ if (kvm_cet_user_supported())
+ vmx_update_intercept_for_cet_msr(vcpu);
}
static u64 vmx_get_perf_capabilities(void)
--
2.27.0
Report all CET MSRs, including the synthetic GUEST_SSP MSR, as
to-be-saved, e.g. for migration, if CET is supported by KVM.
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/x86.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7cd7f6755acd..95dba3c3df5f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1470,6 +1470,7 @@ static const u32 msrs_to_save_base[] = {
MSR_IA32_XFD, MSR_IA32_XFD_ERR,
MSR_IA32_XSS,
+ MSR_IA32_U_CET, MSR_IA32_PL3_SSP, MSR_KVM_GUEST_SSP,
};
static const u32 msrs_to_save_pmu[] = {
--
2.27.0
Hi,
On Fri, Apr 21, 2023 at 09:45:54AM -0400, Yang Weijiang wrote:
>
> Tests:
> --------------------------------------------------------------------------
> This series passed basic CET user shadow stack test and kernel IBT test in
> L1 and L2 guest. It also passed CET KUT test which has been merged there.
>
> Executed all KUT tests and KVM selftests against this series, all test cases
> passes except the vmx test, the failure is due to CR4_CET bit testing in
> test_vmxon_bad_cr(). After add CR4_CET bit to skip list, the test passed.
> I'll send a patch to fix this issue later.
>
>
> To run user shadow stack test and kernel IBT test in VM, you need an CET
> capable platform, e.g., Sapphire Rapids server, and follow below steps to
> build host/guest kernel properly:
>
> 1. Buld host kernel. Patch this series to kernel tree and build kernel
> with CET capable gcc version(e.g., >=8.5.0).
>
> 2. Build guest kernel. Patch CET native series to kernel tree and opt-in
> CONFIG_X86_KERNEL_IBT and CONFIG_X86_USER_SHADOW_STACK options.
>
> 3. Launch a VM with QEMU built with CET enabling patches [2].
>
> Check kernel selftest test_shadow_stack_64 output:
> [INFO] new_ssp = 7f8c82100ff8, *new_ssp = 7f8c82101001
> [INFO] changing ssp from 7f8c82900ff0 to 7f8c82100ff8
> [INFO] ssp is now 7f8c82101000
> [OK] Shadow stack pivot
> [OK] Shadow stack faults
> [INFO] Corrupting shadow stack
> [INFO] Generated shadow stack violation successfully
> [OK] Shadow stack violation test
> [INFO] Gup read -> shstk access success
> [INFO] Gup write -> shstk access success
> [INFO] Violation from normal write
> [INFO] Gup read -> write access success
> [INFO] Violation from normal write
> [INFO] Gup write -> write access success
> [INFO] Cow gup write -> write access success
> [OK] Shadow gup test
> [INFO] Violation from shstk access
> [OK] mprotect() test
> [SKIP] Userfaultfd unavailable.
> [OK] 32 bit test
>
> Check kernel IBT with dmesg | grep CET:
> CET detected: Indirect Branch Tracking enabled
>
> --------------------------------------------------------------------------
> Changes in v2:
> 1. Remove excessive checks on host CET Kconfig options in v1 patchset [3].
> 2. Make CET CPUIDs, MSRs and control flags enabling independent to host CET status.
> 3. Introduce supervisor SHSTK support to make the patch set complete.
> 4. Refactor patches to accommodate above changes.
> 5. Rebase on kvm-x86/next [4].
>
>
> [1]: linux-next: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/?h=next-20230420
> [2]: QEMU patch: https://lore.kernel.org/all/[email protected]/
> [3]: v1 patchset: https://lore.kernel.org/all/[email protected]/
> [4]: Rebase branch: https://github.com/kvm-x86/linux.git, commit: 7b632f72528d (tag: kvm-x86-next-2023.04.14)
I played a bit with KVM support for shadow stacks on AMD machines and I
rebased v1 patches along with John's SVM series
https://lore.kernel.org/kvm/[email protected]/
on top of v6.3-rc4 and Rick's series for host shadow stack support. I've
put this at
https://git.kernel.org/pub/scm/linux/kernel/git/rppt/linux.git/log/?h=shstk/kvm
if anybody is interested. With this I could successfully run shadow stack
self test in a guest on an AMD Zen3 machine.
One thing I've noticed while rebasing is that John's patches move
cet_is_msr_accessible() from vmx/ to x86.c and I also had to make such move
for cet_is_ssp_msr_accessible().
Would make sense to have them available for both VMX and SVM from the
start.
> Rick Edgecombe (5):
> x86/shstk: Add Kconfig option for shadow stack
> x86/cpufeatures: Add CPU feature flags for shadow stacks
> x86/cpufeatures: Enable CET CR4 bit for shadow stack
> x86/fpu/xstate: Introduce CET MSR and XSAVES supervisor states
> x86/fpu: Add helper for modifying xstate
>
> Sean Christopherson (2):
> KVM:x86: Report XSS as to-be-saved if there are supported features
> KVM:x86: Load guest FPU state when accessing xsaves-managed MSRs
>
> Yang Weijiang (14):
> KVM:x86: Refresh CPUID on write to guest MSR_IA32_XSS
> KVM:x86: Init kvm_caps.supported_xss with supported feature bits
> KVM:x86: Add #CP support in guest exception classification
> KVM:VMX: Introduce CET VMCS fields and control bits
> KVM:x86: Add fault checks for guest CR4.CET setting
> KVM:VMX: Emulate reads and writes to CET MSRs
> KVM:VMX: Add a synthetic MSR to allow userspace VMM to access
> GUEST_SSP
> KVM:x86: Report CET MSRs as to-be-saved if CET is supported
> KVM:x86: Save/Restore GUEST_SSP to/from SMM state save area
> KVM:VMX: Pass through user CET MSRs to the guest
> KVM:x86: Enable CET virtualization for VMX and advertise to userspace
> KVM:nVMX: Enable user CET support for nested VMX
> KVM:x86: Enable supervisor IBT support for guest
> KVM:x86: Support CET supervisor shadow stack MSR access
>
> arch/x86/Kconfig | 24 ++++
> arch/x86/Kconfig.assembler | 5 +
> arch/x86/include/asm/cpufeatures.h | 2 +
> arch/x86/include/asm/disabled-features.h | 8 +-
> arch/x86/include/asm/fpu/api.h | 9 ++
> arch/x86/include/asm/fpu/types.h | 16 ++-
> arch/x86/include/asm/fpu/xstate.h | 6 +-
> arch/x86/include/asm/kvm_host.h | 3 +-
> arch/x86/include/asm/vmx.h | 8 ++
> arch/x86/include/uapi/asm/kvm.h | 1 +
> arch/x86/include/uapi/asm/kvm_para.h | 1 +
> arch/x86/kernel/cpu/common.c | 35 ++++--
> arch/x86/kernel/cpu/cpuid-deps.c | 1 +
> arch/x86/kernel/fpu/core.c | 19 +++
> arch/x86/kernel/fpu/xstate.c | 90 +++++++-------
> arch/x86/kvm/cpuid.c | 23 +++-
> arch/x86/kvm/cpuid.h | 6 +
> arch/x86/kvm/smm.c | 20 +++
> arch/x86/kvm/vmx/capabilities.h | 4 +
> arch/x86/kvm/vmx/nested.c | 29 ++++-
> arch/x86/kvm/vmx/vmcs12.c | 6 +
> arch/x86/kvm/vmx/vmcs12.h | 14 ++-
> arch/x86/kvm/vmx/vmx.c | 150 ++++++++++++++++++++++-
> arch/x86/kvm/vmx/vmx.h | 6 +-
> arch/x86/kvm/x86.c | 79 ++++++++++--
> arch/x86/kvm/x86.h | 46 ++++++-
> 26 files changed, 528 insertions(+), 83 deletions(-)
>
>
> base-commit: 7b632f72528d5fa3f0265358a393f534da47d9dd
> --
> 2.27.0
>
--
Sincerely yours,
Mike.
On Fri, Apr 21, 2023 at 09:45:54AM -0400, Yang Weijiang wrote:
> Implementation:
> --------------------------------------------------------------------------
> Historically, the early KVM patches can support both user SHSTK and IBT,
> and most of the early patches are carried forward with changes by this new
> series. Then with kernel IBT feature merged in 5.18, a new patch was added
> to support the feature for guest. The last patch is introduced to support
Yeah, at the time I had to hack up kernel IBT guest support, because the
platform I had to use (tgl-nuc) didn't have serial and so I had to use
KVM :/
> supervisor SHSTK but the feature is not enabled on Intel platform for now,
> the main purpose of this patch is to facilitate AMD folks to enable the
> feature.
>
> In summary, this new series enables CET user SHSTK/IBT and kernel IBT, but
> doesn't fully support CET supervisor SHSTK, the enabling work is left for
> the future.
>
> Supported CET sub-features:
>
> |
> User SHSTK | User IBT (user mode)
> --------------------------------------------------
> s-SHSTK (X) | Kernel IBT (kernel mode)
> |
>
> The user SHSTK/IBT relies on host side XSAVES support(XSS[bit 11]) for user
> mode CET states. The kernel IBT doesn't have dependency on host XSAVES.
> The supervisor SHSTK relies on host side XSAVES support(XSS[bit 12]) for
> supervisor mode CET states.
>
> This version removed unnecessary checks for host CET enabling status before
> enabling guest CET support, making guest CET support apart from that of host.
> By doing so, it's expected to be more friendly to cloud computing scenarios.
I've on ideas about cloud stuff, but there is fundamentally no relation
bewteen the host making use of IBT/SHSTK and a guest doing so, so there
should be no dependency there.
> To run user shadow stack test and kernel IBT test in VM, you need an CET
> capable platform, e.g., Sapphire Rapids server, and follow below steps to
> build host/guest kernel properly:
>
> 1. Buld host kernel. Patch this series to kernel tree and build kernel
> with CET capable gcc version(e.g., >=8.5.0).
Why does the host kernel require a CET capable toolchain if the host
kernel does not in fact need to make use of these features in order to
provide them to the guest?
On 4/22/2023 9:02 PM, Peter Zijlstra wrote:
> On Fri, Apr 21, 2023 at 09:45:54AM -0400, Yang Weijiang wrote:
>
>> Implementation:
>> --------------------------------------------------------------------------
>> Historically, the early KVM patches can support both user SHSTK and IBT,
>> and most of the early patches are carried forward with changes by this new
>> series. Then with kernel IBT feature merged in 5.18, a new patch was added
>> to support the feature for guest. The last patch is introduced to support
> Yeah, at the time I had to hack up kernel IBT guest support, because the
> platform I had to use (tgl-nuc) didn't have serial and so I had to use
> KVM :/
You did it and beat all the hurdles :-)
>> supervisor SHSTK but the feature is not enabled on Intel platform for now,
>> the main purpose of this patch is to facilitate AMD folks to enable the
>> feature.
>>
>> In summary, this new series enables CET user SHSTK/IBT and kernel IBT, but
>> doesn't fully support CET supervisor SHSTK, the enabling work is left for
>> the future.
>>
>> Supported CET sub-features:
>>
>> |
>> User SHSTK | User IBT (user mode)
>> --------------------------------------------------
>> s-SHSTK (X) | Kernel IBT (kernel mode)
>> |
>>
>> The user SHSTK/IBT relies on host side XSAVES support(XSS[bit 11]) for user
>> mode CET states. The kernel IBT doesn't have dependency on host XSAVES.
>> The supervisor SHSTK relies on host side XSAVES support(XSS[bit 12]) for
>> supervisor mode CET states.
>>
>> This version removed unnecessary checks for host CET enabling status before
>> enabling guest CET support, making guest CET support apart from that of host.
>> By doing so, it's expected to be more friendly to cloud computing scenarios.
> I've on ideas about cloud stuff, but there is fundamentally no relation
> bewteen the host making use of IBT/SHSTK and a guest doing so, so there
> should be no dependency there.
Definitely as long as there's no quirk required for the features! Also
eliminated the
upgrade efforts for host end users in order to play with CET.
>> To run user shadow stack test and kernel IBT test in VM, you need an CET
>> capable platform, e.g., Sapphire Rapids server, and follow below steps to
>> build host/guest kernel properly:
>>
>> 1. Buld host kernel. Patch this series to kernel tree and build kernel
>> with CET capable gcc version(e.g., >=8.5.0).
> Why does the host kernel require a CET capable toolchain if the host
> kernel does not in fact need to make use of these features in order to
> provide them to the guest?
Oops, this should be a typo, guest instead of host build requires
qualified gcc version.
Thanks for the comments!
On 4/22/2023 5:54 AM, Mike Rapoport wrote:
> Hi,
>
> On Fri, Apr 21, 2023 at 09:45:54AM -0400, Yang Weijiang wrote:
>> [...]
>>
>> [1]: linux-next: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/?h=next-20230420
>> [2]: QEMU patch: https://lore.kernel.org/all/[email protected]/
>> [3]: v1 patchset: https://lore.kernel.org/all/[email protected]/
>> [4]: Rebase branch: https://github.com/kvm-x86/linux.git, commit: 7b632f72528d (tag: kvm-x86-next-2023.04.14)
>
> I played a bit with KVM support for shadow stacks on AMD machines and I
> rebased v1 patches along with John's SVM series
>
> https://lore.kernel.org/kvm/[email protected]/
>
> on top of v6.3-rc4 and Rick's series for host shadow stack support. I've
> put this at
>
> https://git.kernel.org/pub/scm/linux/kernel/git/rppt/linux.git/log/?h=shstk/kvm
>
> if anybody is interested. With this I could successfully run shadow stack
> self test in a guest on an AMD Zen3 machine.
>
> One thing I've noticed while rebasing is that John's patches move
> cet_is_msr_accessible() from vmx/ to x86.c and I also had to make such move
> for cet_is_ssp_msr_accessible().
>
> Would make sense to have them available for both VMX and SVM from the
> start.
Hi, Mike,
Yes, it makes sense to do so. I'll include the change in next version so
that John's patchset can
omit the work, thanks!
[...]
On 4/21/2023 9:45 PM, Yang Weijiang wrote:
> CET (Control-flow Enforcement Technology) is a CPU feature used to prevent\n
> Return/Jump-Oriented Programming (ROP/JOP) attacks. CET introduces a new\n
> exception type, Control Protection (#CP), and two sub-features(SHSTK,IBT)\n
> to defend against ROP/JOP style control-flow subversion attacks.\n
>
> Shadow Stack (SHSTK):
> A shadow stack is a second stack used exclusively for control transfer
> operations. The shadow stack is separate from the data/normal stack and
> can be enabled individually in user and kernel mode. When shadow stacks
> are enabled, CALL pushes the return address on both the data and shadow
> stack. RET pops the return address from both stacks and compares them.
> If the return addresses from the two stacks do not match, the processor
> generates a #CP.
>
> Indirect Branch Tracking (IBT):
> IBT adds new instrutions, ENDBRANCH{32|64}, to mark valid target addresses
/s/instrutions/instructions
> of indirect branches (CALL, JMP etc...). If an indirect branch is executed
> and the next instruction is _not_ an ENDBRANCH, the processor generates a #CP.
IMHO, it's better to mention the behavior of the new instructions when
IBT is not enabled or
on the old platforms that don't support the feature.
>
>
> Build dependency:
> --------------------------------------------------------------------------
> The first 5 patches are took over from CET native series [1] in linux-next,
> they must be included in kernel tree when build host kernel for testing CET
> in guest. Will remove them once the native series landed in mainline kernel
> tree. It's just for build and test purpose.
>
>
> Implementation:
> --------------------------------------------------------------------------
> Historically, the early KVM patches can support both user SHSTK and IBT,
> and most of the early patches are carried forward with changes by this new
> series. Then with kernel IBT feature merged in 5.18, a new patch was added
> to support the feature for guest. The last patch is introduced to support
> supervisor SHSTK but the feature is not enabled on Intel platform for now,
> the main purpose of this patch is to facilitate AMD folks to enable the
> feature.
>
> In summary, this new series enables CET user SHSTK/IBT and kernel IBT, but
> doesn't fully support CET supervisor SHSTK, the enabling work is left for
> the future.
>
> Supported CET sub-features:
>
> |
> User SHSTK | User IBT (user mode)
> --------------------------------------------------
> s-SHSTK (X) | Kernel IBT (kernel mode)
> |
>
> The user SHSTK/IBT relies on host side XSAVES support(XSS[bit 11]) for user
> mode CET states. The kernel IBT doesn't have dependency on host XSAVES.
> The supervisor SHSTK relies on host side XSAVES support(XSS[bit 12]) for
> supervisor mode CET states.
>
> This version removed unnecessary checks for host CET enabling status before
> enabling guest CET support, making guest CET support apart from that of host.
> By doing so, it's expected to be more friendly to cloud computing scenarios.
>
>
> CET states management:
> --------------------------------------------------------------------------
> CET user mode states, MSR_IA32_{U_CET,PL3_SSP} depends on {XSAVES|XRSTORS}
> instructions to swap guest/host context when vm-exit/vm-entry happens.
> On vm-exit, the guest CET states are stored to guest fpu area and host user
> mode states are loaded from thread/process context before vCPU returns to
> userspace, vice-versa on vm-entry. See details in kvm_{load|put}_guest_fpu().
> So the user mode state validity depends on host side U_CET bit set in MSR_XSS.
>
> CET supervisor mode states are grouped into two categories - XSAVES dependent
> and non-dependent, the former includes MSR_IA32_PL{0,1,2}_SSP, the later
> consists of MSR_IA32_S_CET and MSR_IA32_INTR_SSP_TBL. The XSAVES dependent
> MSR's save/restore depends on S_CET bit set in MSR_XSS. Since the native series
> doesn't enable S_CET support,
Do you know the reason why native patch doesn't enable S_CET support?
> these s-SHSTK shadow stack pointers are invalid.
>
> Moveover, new VMCS fields, {GUEST|HOST}_{S_CET,SSP,INTR_SSP_TABL}, are
> introduced for guest/host supervisor state switch. When CET entry/exit load
> bits are set, the guest/host MSR_IA32_{S_CET,INTR_SSP_TBL,SSP} are swapped
> automatically at vm-exit/entry. With these new fields, current guest kernel
> IBT enalbing doesn't depend on host {XSAVES|XRSTORS} support.
/s/enalbing/enabling
>
>
> Tests:
> --------------------------------------------------------------------------
> This series passed basic CET user shadow stack test and kernel IBT test in
> L1 and L2 guest. It also passed CET KUT test which has been merged there.
>
> Executed all KUT tests and KVM selftests against this series, all test cases
> passes except the vmx test, the failure is due to CR4_CET bit testing in
> test_vmxon_bad_cr(). After add CR4_CET bit to skip list, the test passed.
> I'll send a patch to fix this issue later.
>
>
> To run user shadow stack test and kernel IBT test in VM, you need an CET
> capable platform, e.g., Sapphire Rapids server, and follow below steps to
> build host/guest kernel properly:
>
> 1. Buld host kernel. Patch this series to kernel tree and build kernel
/s/Buld/Build
> with CET capable gcc version(e.g., >=8.5.0).
I guess these should be some compiler option(s), can you also list it
here if any?
>
> 2. Build guest kernel. Patch CET native series to kernel tree and opt-in
> CONFIG_X86_KERNEL_IBT and CONFIG_X86_USER_SHADOW_STACK options.
I guess guest kernel also needs to be built using CET capable tool chain
with CET enabled?
>
> 3. Launch a VM with QEMU built with CET enabling patches [2].
>
> Check kernel selftest test_shadow_stack_64 output:
> [INFO] new_ssp = 7f8c82100ff8, *new_ssp = 7f8c82101001
> [INFO] changing ssp from 7f8c82900ff0 to 7f8c82100ff8
> [INFO] ssp is now 7f8c82101000
> [OK] Shadow stack pivot
> [OK] Shadow stack faults
> [INFO] Corrupting shadow stack
> [INFO] Generated shadow stack violation successfully
> [OK] Shadow stack violation test
> [INFO] Gup read -> shstk access success
> [INFO] Gup write -> shstk access success
> [INFO] Violation from normal write
> [INFO] Gup read -> write access success
> [INFO] Violation from normal write
> [INFO] Gup write -> write access success
> [INFO] Cow gup write -> write access success
> [OK] Shadow gup test
> [INFO] Violation from shstk access
> [OK] mprotect() test
> [SKIP] Userfaultfd unavailable.
> [OK] 32 bit test
>
> Check kernel IBT with dmesg | grep CET:
> CET detected: Indirect Branch Tracking enabled
>
> --------------------------------------------------------------------------
> Changes in v2:
> 1. Remove excessive checks on host CET Kconfig options in v1 patchset [3].
> 2. Make CET CPUIDs, MSRs and control flags enabling independent to host CET status.
> 3. Introduce supervisor SHSTK support to make the patch set complete.
> 4. Refactor patches to accommodate above changes.
> 5. Rebase on kvm-x86/next [4].
>
>
> [1]: linux-next: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/?h=next-20230420
> [2]: QEMU patch: https://lore.kernel.org/all/[email protected]/
> [3]: v1 patchset: https://lore.kernel.org/all/[email protected]/
> [4]: Rebase branch: https://github.com/kvm-x86/linux.git, commit: 7b632f72528d (tag: kvm-x86-next-2023.04.14)
>
>
> Rick Edgecombe (5):
> x86/shstk: Add Kconfig option for shadow stack
> x86/cpufeatures: Add CPU feature flags for shadow stacks
> x86/cpufeatures: Enable CET CR4 bit for shadow stack
> x86/fpu/xstate: Introduce CET MSR and XSAVES supervisor states
> x86/fpu: Add helper for modifying xstate
>
> Sean Christopherson (2):
> KVM:x86: Report XSS as to-be-saved if there are supported features
> KVM:x86: Load guest FPU state when accessing xsaves-managed MSRs
>
> Yang Weijiang (14):
> KVM:x86: Refresh CPUID on write to guest MSR_IA32_XSS
> KVM:x86: Init kvm_caps.supported_xss with supported feature bits
> KVM:x86: Add #CP support in guest exception classification
> KVM:VMX: Introduce CET VMCS fields and control bits
> KVM:x86: Add fault checks for guest CR4.CET setting
> KVM:VMX: Emulate reads and writes to CET MSRs
> KVM:VMX: Add a synthetic MSR to allow userspace VMM to access
> GUEST_SSP
> KVM:x86: Report CET MSRs as to-be-saved if CET is supported
> KVM:x86: Save/Restore GUEST_SSP to/from SMM state save area
> KVM:VMX: Pass through user CET MSRs to the guest
> KVM:x86: Enable CET virtualization for VMX and advertise to userspace
> KVM:nVMX: Enable user CET support for nested VMX
> KVM:x86: Enable supervisor IBT support for guest
> KVM:x86: Support CET supervisor shadow stack MSR access
>
> arch/x86/Kconfig | 24 ++++
> arch/x86/Kconfig.assembler | 5 +
> arch/x86/include/asm/cpufeatures.h | 2 +
> arch/x86/include/asm/disabled-features.h | 8 +-
> arch/x86/include/asm/fpu/api.h | 9 ++
> arch/x86/include/asm/fpu/types.h | 16 ++-
> arch/x86/include/asm/fpu/xstate.h | 6 +-
> arch/x86/include/asm/kvm_host.h | 3 +-
> arch/x86/include/asm/vmx.h | 8 ++
> arch/x86/include/uapi/asm/kvm.h | 1 +
> arch/x86/include/uapi/asm/kvm_para.h | 1 +
> arch/x86/kernel/cpu/common.c | 35 ++++--
> arch/x86/kernel/cpu/cpuid-deps.c | 1 +
> arch/x86/kernel/fpu/core.c | 19 +++
> arch/x86/kernel/fpu/xstate.c | 90 +++++++-------
> arch/x86/kvm/cpuid.c | 23 +++-
> arch/x86/kvm/cpuid.h | 6 +
> arch/x86/kvm/smm.c | 20 +++
> arch/x86/kvm/vmx/capabilities.h | 4 +
> arch/x86/kvm/vmx/nested.c | 29 ++++-
> arch/x86/kvm/vmx/vmcs12.c | 6 +
> arch/x86/kvm/vmx/vmcs12.h | 14 ++-
> arch/x86/kvm/vmx/vmx.c | 150 ++++++++++++++++++++++-
> arch/x86/kvm/vmx/vmx.h | 6 +-
> arch/x86/kvm/x86.c | 79 ++++++++++--
> arch/x86/kvm/x86.h | 46 ++++++-
> 26 files changed, 528 insertions(+), 83 deletions(-)
>
>
> base-commit: 7b632f72528d5fa3f0265358a393f534da47d9dd
On 4/23/2023 4:30 PM, Binbin Wu wrote:
>
> On 4/21/2023 9:45 PM, Yang Weijiang wrote:
[...]
>> Indirect Branch Tracking (IBT):
>> IBT adds new instrutions, ENDBRANCH{32|64}, to mark valid target
>> addresses
>
> /s/instrutions/instructions
Thanks for review, I'll fix these spelling issues.
>
>
>> of indirect branches (CALL, JMP etc...). If an indirect branch is
>> executed
>> and the next instruction is _not_ an ENDBRANCH, the processor
>> generates a #CP.
>
> IMHO, it's better to mention the behavior of the new instructions when
> IBT is not enabled or
> on the old platforms that don't support the feature.
Sure, will do it.
>
> [...]
>>
>> CET supervisor mode states are grouped into two categories - XSAVES
>> dependent
>> and non-dependent, the former includes MSR_IA32_PL{0,1,2}_SSP, the later
>> consists of MSR_IA32_S_CET and MSR_IA32_INTR_SSP_TBL. The XSAVES
>> dependent
>> MSR's save/restore depends on S_CET bit set in MSR_XSS. Since the
>> native series
>> doesn't enable S_CET support,
>
> Do you know the reason why native patch doesn't enable S_CET support?
This bit controls XSAVES for supervisor SHSTK, but the landing native
part is only for user SHSTK
so it doesn't need to touch it. Anyway, enabling the bit requires
additional FPU area size(3*8byte).
>
>
>> these s-SHSTK shadow stack pointers are invalid.
>>
>> Moveover, new VMCS fields, {GUEST|HOST}_{S_CET,SSP,INTR_SSP_TABL}, are
>> introduced for guest/host supervisor state switch. When CET
>> entry/exit load
>> bits are set, the guest/host MSR_IA32_{S_CET,INTR_SSP_TBL,SSP} are
>> swapped
>> automatically at vm-exit/entry. With these new fields, current guest
>> kernel
>> IBT enalbing doesn't depend on host {XSAVES|XRSTORS} support.
>
> /s/enalbing/enabling
>
>
>>
>>
>> Tests:
>> --------------------------------------------------------------------------
>>
>> This series passed basic CET user shadow stack test and kernel IBT
>> test in
>> L1 and L2 guest. It also passed CET KUT test which has been merged
>> there.
>>
>> Executed all KUT tests and KVM selftests against this series, all
>> test cases
>> passes except the vmx test, the failure is due to CR4_CET bit testing in
>> test_vmxon_bad_cr(). After add CR4_CET bit to skip list, the test
>> passed.
>> I'll send a patch to fix this issue later.
>>
>>
>> To run user shadow stack test and kernel IBT test in VM, you need an CET
>> capable platform, e.g., Sapphire Rapids server, and follow below
>> steps to
>> build host/guest kernel properly:
>>
>> 1. Buld host kernel. Patch this series to kernel tree and build kernel
>
> /s/Buld/Build
>
>
>> with CET capable gcc version(e.g., >=8.5.0).
>
> I guess these should be some compiler option(s), can you also list it
> here if any?
Please check gcc options: -fcf-protection=[full|branch|return|none]
>
>
>>
>> 2. Build guest kernel. Patch CET native series to kernel tree and opt-in
>> CONFIG_X86_KERNEL_IBT and CONFIG_X86_USER_SHADOW_STACK options.
>
> I guess guest kernel also needs to be built using CET capable tool
> chain with CET enabled?
Yes.
For kernel IBT testing, KERNEL_IBT option + CET enabled tool chain is a
must.
>
>
[...]
On Sat, Apr 22, 2023 at 12:54:11AM +0300, Mike Rapoport wrote:
> Hi,
>
> On Fri, Apr 21, 2023 at 09:45:54AM -0400, Yang Weijiang wrote:
> >
> > Tests:
> > --------------------------------------------------------------------------
> > This series passed basic CET user shadow stack test and kernel IBT test in
> > L1 and L2 guest. It also passed CET KUT test which has been merged there.
> >
> > Executed all KUT tests and KVM selftests against this series, all test cases
> > passes except the vmx test, the failure is due to CR4_CET bit testing in
> > test_vmxon_bad_cr(). After add CR4_CET bit to skip list, the test passed.
> > I'll send a patch to fix this issue later.
> >
> >
> > To run user shadow stack test and kernel IBT test in VM, you need an CET
> > capable platform, e.g., Sapphire Rapids server, and follow below steps to
> > build host/guest kernel properly:
> >
> > 1. Buld host kernel. Patch this series to kernel tree and build kernel
> > with CET capable gcc version(e.g., >=8.5.0).
> >
> > 2. Build guest kernel. Patch CET native series to kernel tree and opt-in
> > CONFIG_X86_KERNEL_IBT and CONFIG_X86_USER_SHADOW_STACK options.
> >
> > 3. Launch a VM with QEMU built with CET enabling patches [2].
> >
> > Check kernel selftest test_shadow_stack_64 output:
> > [INFO] new_ssp = 7f8c82100ff8, *new_ssp = 7f8c82101001
> > [INFO] changing ssp from 7f8c82900ff0 to 7f8c82100ff8
> > [INFO] ssp is now 7f8c82101000
> > [OK] Shadow stack pivot
> > [OK] Shadow stack faults
> > [INFO] Corrupting shadow stack
> > [INFO] Generated shadow stack violation successfully
> > [OK] Shadow stack violation test
> > [INFO] Gup read -> shstk access success
> > [INFO] Gup write -> shstk access success
> > [INFO] Violation from normal write
> > [INFO] Gup read -> write access success
> > [INFO] Violation from normal write
> > [INFO] Gup write -> write access success
> > [INFO] Cow gup write -> write access success
> > [OK] Shadow gup test
> > [INFO] Violation from shstk access
> > [OK] mprotect() test
> > [SKIP] Userfaultfd unavailable.
> > [OK] 32 bit test
> >
> > Check kernel IBT with dmesg | grep CET:
> > CET detected: Indirect Branch Tracking enabled
> >
> > --------------------------------------------------------------------------
> > Changes in v2:
> > 1. Remove excessive checks on host CET Kconfig options in v1 patchset [3].
> > 2. Make CET CPUIDs, MSRs and control flags enabling independent to host CET status.
> > 3. Introduce supervisor SHSTK support to make the patch set complete.
> > 4. Refactor patches to accommodate above changes.
> > 5. Rebase on kvm-x86/next [4].
> >
> >
> > [1]: linux-next: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/?h=next-20230420
> > [2]: QEMU patch: https://lore.kernel.org/all/[email protected]/
> > [3]: v1 patchset: https://lore.kernel.org/all/[email protected]/
> > [4]: Rebase branch: https://github.com/kvm-x86/linux.git, commit: 7b632f72528d (tag: kvm-x86-next-2023.04.14)
>
> I played a bit with KVM support for shadow stacks on AMD machines and I
> rebased v1 patches along with John's SVM series
>
> https://lore.kernel.org/kvm/[email protected]/
>
> on top of v6.3-rc4 and Rick's series for host shadow stack support. I've
> put this at
>
> https://git.kernel.org/pub/scm/linux/kernel/git/rppt/linux.git/log/?h=shstk/kvm
>
> if anybody is interested. With this I could successfully run shadow stack
> self test in a guest on an AMD Zen3 machine.
That's great news! Thanks for testing!
Thanks,
John
>
> One thing I've noticed while rebasing is that John's patches move
> cet_is_msr_accessible() from vmx/ to x86.c and I also had to make such move
> for cet_is_ssp_msr_accessible().
>
> Would make sense to have them available for both VMX and SVM from the
> start.
>
> > Rick Edgecombe (5):
> > x86/shstk: Add Kconfig option for shadow stack
> > x86/cpufeatures: Add CPU feature flags for shadow stacks
> > x86/cpufeatures: Enable CET CR4 bit for shadow stack
> > x86/fpu/xstate: Introduce CET MSR and XSAVES supervisor states
> > x86/fpu: Add helper for modifying xstate
> >
> > Sean Christopherson (2):
> > KVM:x86: Report XSS as to-be-saved if there are supported features
> > KVM:x86: Load guest FPU state when accessing xsaves-managed MSRs
> >
> > Yang Weijiang (14):
> > KVM:x86: Refresh CPUID on write to guest MSR_IA32_XSS
> > KVM:x86: Init kvm_caps.supported_xss with supported feature bits
> > KVM:x86: Add #CP support in guest exception classification
> > KVM:VMX: Introduce CET VMCS fields and control bits
> > KVM:x86: Add fault checks for guest CR4.CET setting
> > KVM:VMX: Emulate reads and writes to CET MSRs
> > KVM:VMX: Add a synthetic MSR to allow userspace VMM to access
> > GUEST_SSP
> > KVM:x86: Report CET MSRs as to-be-saved if CET is supported
> > KVM:x86: Save/Restore GUEST_SSP to/from SMM state save area
> > KVM:VMX: Pass through user CET MSRs to the guest
> > KVM:x86: Enable CET virtualization for VMX and advertise to userspace
> > KVM:nVMX: Enable user CET support for nested VMX
> > KVM:x86: Enable supervisor IBT support for guest
> > KVM:x86: Support CET supervisor shadow stack MSR access
> >
> > arch/x86/Kconfig | 24 ++++
> > arch/x86/Kconfig.assembler | 5 +
> > arch/x86/include/asm/cpufeatures.h | 2 +
> > arch/x86/include/asm/disabled-features.h | 8 +-
> > arch/x86/include/asm/fpu/api.h | 9 ++
> > arch/x86/include/asm/fpu/types.h | 16 ++-
> > arch/x86/include/asm/fpu/xstate.h | 6 +-
> > arch/x86/include/asm/kvm_host.h | 3 +-
> > arch/x86/include/asm/vmx.h | 8 ++
> > arch/x86/include/uapi/asm/kvm.h | 1 +
> > arch/x86/include/uapi/asm/kvm_para.h | 1 +
> > arch/x86/kernel/cpu/common.c | 35 ++++--
> > arch/x86/kernel/cpu/cpuid-deps.c | 1 +
> > arch/x86/kernel/fpu/core.c | 19 +++
> > arch/x86/kernel/fpu/xstate.c | 90 +++++++-------
> > arch/x86/kvm/cpuid.c | 23 +++-
> > arch/x86/kvm/cpuid.h | 6 +
> > arch/x86/kvm/smm.c | 20 +++
> > arch/x86/kvm/vmx/capabilities.h | 4 +
> > arch/x86/kvm/vmx/nested.c | 29 ++++-
> > arch/x86/kvm/vmx/vmcs12.c | 6 +
> > arch/x86/kvm/vmx/vmcs12.h | 14 ++-
> > arch/x86/kvm/vmx/vmx.c | 150 ++++++++++++++++++++++-
> > arch/x86/kvm/vmx/vmx.h | 6 +-
> > arch/x86/kvm/x86.c | 79 ++++++++++--
> > arch/x86/kvm/x86.h | 46 ++++++-
> > 26 files changed, 528 insertions(+), 83 deletions(-)
> >
> >
> > base-commit: 7b632f72528d5fa3f0265358a393f534da47d9dd
> > --
> > 2.27.0
> >
>
> --
> Sincerely yours,
> Mike.
On 4/21/2023 9:46 PM, Yang Weijiang wrote:
> From: Sean Christopherson <[email protected]>
>
> Load the guest's FPU state if userspace is accessing MSRs whose values are
> managed by XSAVES so that the MSR helpers, e.g. kvm_{get,set}_xsave_msr(),
So far, kvm_{get,set}_xsave_msr() is not introduced yet.
IMO, it is a bit confusing to understand the whole picture without the
following patches.
May be add some description or adjust the order?
> can simply do {RD,WR}MSR to access the guest's value.
>
> If new feature MSRs supported in XSS are passed through to the guest they
> are saved and restored by XSAVES/XRSTORS, i.e. in the guest's FPU state.
>
> Because is also used for the KVM_GET_MSRS device ioctl(), explicitly check
> @vcpu is non-null before attempting to load guest state. The XSS supporting
> MSRs cannot be retrieved via the device ioctl() without loading guest FPU
> state (which doesn't exist).
>
> Note that guest_cpuid_has() is not queried as host userspace is allowed
> to access MSRs that have not been exposed to the guest, e.g. it might do
> KVM_SET_MSRS prior to KVM_SET_CPUID2.
>
> Signed-off-by: Sean Christopherson <[email protected]>
> Co-developed-by: Yang Weijiang <[email protected]>
> Signed-off-by: Yang Weijiang <[email protected]>
> ---
> arch/x86/kvm/x86.c | 29 ++++++++++++++++++++++++++++-
> 1 file changed, 28 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index d2975ca96ac5..7788646bbf1f 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -130,6 +130,9 @@ static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
> static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
>
> static DEFINE_MUTEX(vendor_module_lock);
> +static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
> +static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
> +
> struct kvm_x86_ops kvm_x86_ops __read_mostly;
>
> #define KVM_X86_OP(func) \
> @@ -4336,6 +4339,21 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> }
> EXPORT_SYMBOL_GPL(kvm_get_msr_common);
>
> +static const u32 xsave_msrs[] = {
> + MSR_IA32_U_CET, MSR_IA32_PL3_SSP,
> +};
> +
> +static bool is_xsaves_msr(u32 index)
> +{
> + int i;
> +
> + for (i = 0; i < ARRAY_SIZE(xsave_msrs); i++) {
> + if (index == xsave_msrs[i])
> + return true;
> + }
> + return false;
> +}
> +
> /*
> * Read or write a bunch of msrs. All parameters are kernel addresses.
> *
> @@ -4346,11 +4364,20 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
> int (*do_msr)(struct kvm_vcpu *vcpu,
> unsigned index, u64 *data))
> {
> + bool fpu_loaded = false;
> int i;
>
> - for (i = 0; i < msrs->nmsrs; ++i)
> + for (i = 0; i < msrs->nmsrs; ++i) {
> + if (vcpu && !fpu_loaded && kvm_caps.supported_xss &&
> + is_xsaves_msr(entries[i].index)) {
> + kvm_load_guest_fpu(vcpu);
> + fpu_loaded = true;
> + }
> if (do_msr(vcpu, entries[i].index, &entries[i].data))
> break;
> + }
> + if (fpu_loaded)
> + kvm_put_guest_fpu(vcpu);
>
> return i;
> }
On 4/27/2023 11:46 AM, Binbin Wu wrote:
>
>
> On 4/21/2023 9:46 PM, Yang Weijiang wrote:
>> From: Sean Christopherson <[email protected]>
>>
>> Load the guest's FPU state if userspace is accessing MSRs whose
>> values are
>> managed by XSAVES so that the MSR helpers, e.g.
>> kvm_{get,set}_xsave_msr(),
> So far, kvm_{get,set}_xsave_msr() is not introduced yet.
> IMO, it is a bit confusing to understand the whole picture without the
> following patches.
> May be add some description or adjust the order?
Sure, will add blurbs for the two helpers., thanks!
> [...]
On 4/21/2023 9:46 PM, Yang Weijiang wrote:
> Add handling for Control Protection (#CP) exceptions(vector 21).
> The new vector is introduced for Intel's Control-Flow Enforcement
> Technology (CET) relevant violation cases.
> See Intel's SDM for details.
>
> Signed-off-by: Yang Weijiang <[email protected]>
> ---
> arch/x86/include/uapi/asm/kvm.h | 1 +
> arch/x86/kvm/vmx/nested.c | 2 +-
> arch/x86/kvm/x86.c | 10 +++++++---
> arch/x86/kvm/x86.h | 13 ++++++++++---
> 4 files changed, 19 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index 7f467fe05d42..1c002abe2be8 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -33,6 +33,7 @@
> #define MC_VECTOR 18
> #define XM_VECTOR 19
> #define VE_VECTOR 20
> +#define CP_VECTOR 21
>
> /* Select x86 specific features in <linux/kvm.h> */
> #define __KVM_HAVE_PIT
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 96ede74a6067..7bc62cd72748 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2850,7 +2850,7 @@ static int nested_check_vm_entry_controls(struct kvm_vcpu *vcpu,
> /* VM-entry interruption-info field: deliver error code */
> should_have_error_code =
> intr_type == INTR_TYPE_HARD_EXCEPTION && prot_mode &&
> - x86_exception_has_error_code(vector);
> + x86_exception_has_error_code(vcpu, vector);
> if (CC(has_error_code != should_have_error_code))
> return -EINVAL;
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 7788646bbf1f..a768cbf3fbb7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -520,11 +520,15 @@ EXPORT_SYMBOL_GPL(kvm_spurious_fault);
> #define EXCPT_CONTRIBUTORY 1
> #define EXCPT_PF 2
>
> -static int exception_class(int vector)
> +static int exception_class(struct kvm_vcpu *vcpu, int vector)
> {
> switch (vector) {
> case PF_VECTOR:
> return EXCPT_PF;
> + case CP_VECTOR:
> + if (vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET)
> + return EXCPT_BENIGN;
> + return EXCPT_CONTRIBUTORY;
By definition, #CP is Contributory.
Can you explain more about this change here which treats #CP as
EXCPT_BENIGN when CET is not enabled in guest?
In current KVM code, there is suppose no #CP triggered in guest if CET
is not enalbed in guest, right?
> case DE_VECTOR:
> case TS_VECTOR:
> case NP_VECTOR:
> @@ -707,8 +711,8 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
> kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
> return;
> }
> - class1 = exception_class(prev_nr);
> - class2 = exception_class(nr);
> + class1 = exception_class(vcpu, prev_nr);
> + class2 = exception_class(vcpu, nr);
> if ((class1 == EXCPT_CONTRIBUTORY && class2 == EXCPT_CONTRIBUTORY) ||
> (class1 == EXCPT_PF && class2 != EXCPT_BENIGN)) {
> /*
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index c544602d07a3..2ba7c7fc4846 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -171,13 +171,20 @@ static inline bool is_64_bit_hypercall(struct kvm_vcpu *vcpu)
> return vcpu->arch.guest_state_protected || is_64_bit_mode(vcpu);
> }
>
> -static inline bool x86_exception_has_error_code(unsigned int vector)
> +static inline bool x86_exception_has_error_code(struct kvm_vcpu *vcpu,
> + unsigned int vector)
> {
> static u32 exception_has_error_code = BIT(DF_VECTOR) | BIT(TS_VECTOR) |
> BIT(NP_VECTOR) | BIT(SS_VECTOR) | BIT(GP_VECTOR) |
> - BIT(PF_VECTOR) | BIT(AC_VECTOR);
> + BIT(PF_VECTOR) | BIT(AC_VECTOR) | BIT(CP_VECTOR);
>
> - return (1U << vector) & exception_has_error_code;
> + if (!((1U << vector) & exception_has_error_code))
> + return false;
> +
> + if (vector == CP_VECTOR)
> + return !(vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET);
> +
> + return true;
> }
>
> static inline bool mmu_is_nested(struct kvm_vcpu *vcpu)
On Fri, 2023-04-21 at 09:46 -0400, Yang Weijiang wrote:
> @@ -2471,6 +2495,12 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu,
> struct msr_data *msr_info)
> else
> kvm_set_xsave_msr(msr_info);
> break;
> + case MSR_IA32_PL0_SSP ... MSR_IA32_PL2_SSP:
> + case MSR_IA32_INT_SSP_TAB:
> + if (!cet_is_msr_accessible(vcpu, msr_info))
> + return 1;
> + kvm_set_xsave_msr(msr_info);
> + break;
These are supposed to be canonical too, right?
On Fri, 2023-04-21 at 09:46 -0400, Yang Weijiang wrote:
> +
> + incpt = !is_cet_state_supported(vcpu,
> XFEATURE_MASK_CET_KERNEL);
> + incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
> +
> + vmx_set_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB,
> MSR_TYPE_RW, incpt);
> + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP,
> MSR_TYPE_RW, incpt);
> + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP,
> MSR_TYPE_RW, incpt);
> + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP,
> MSR_TYPE_RW, incpt);
> }
Why is this tied to XFEATURE_MASK_CET_KERNEL? I don't know how the SVM
side works, but the host kernel doesn't use this xfeature. Just not
clear on what the intention is. Why not use
kvm_cet_kernel_shstk_supported() again?
On Fri, 2023-04-21 at 09:46 -0400, Yang Weijiang wrote:
> Introduce a host-only synthetic MSR, MSR_KVM_GUEST_SSP, so that the
> VMM
> can read/write the guest's SSP, e.g. to migrate CET state. Use a
> synthetic
> MSR, e.g. as opposed to a VCPU_REG_, as GUEST_SSP is subject to the
> same
> consistency checks as the PL*_SSP MSRs, i.e. can share code.
It seems this is exposed to the guest? I'm thinking maybe it should not
be. IA32_PL0_SSP comes with some extra checks, so MSR_KVM_GUEST_SSP
seems a bit powerful. I think the guest doesn't need it either.
On 5/4/2023 1:06 AM, Edgecombe, Rick P wrote:
> On Fri, 2023-04-21 at 09:46 -0400, Yang Weijiang wrote:
>> @@ -2471,6 +2495,12 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu,
>> struct msr_data *msr_info)
>> else
>> kvm_set_xsave_msr(msr_info);
>> break;
>> + case MSR_IA32_PL0_SSP ... MSR_IA32_PL2_SSP:
>> + case MSR_IA32_INT_SSP_TAB:
>> + if (!cet_is_msr_accessible(vcpu, msr_info))
>> + return 1;
>> + kvm_set_xsave_msr(msr_info);
>> + break;
> These are supposed to be canonical too, right?
Yes, I'll add check in next version, thanks!
On 5/4/2023 1:07 AM, Edgecombe, Rick P wrote:
> On Fri, 2023-04-21 at 09:46 -0400, Yang Weijiang wrote:
>> +
>> + incpt = !is_cet_state_supported(vcpu,
>> XFEATURE_MASK_CET_KERNEL);
>> + incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
>> +
>> + vmx_set_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB,
>> MSR_TYPE_RW, incpt);
>> + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP,
>> MSR_TYPE_RW, incpt);
>> + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP,
>> MSR_TYPE_RW, incpt);
>> + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP,
>> MSR_TYPE_RW, incpt);
>> }
> Why is this tied to XFEATURE_MASK_CET_KERNEL? I don't know how the SVM
> side works, but the host kernel doesn't use this xfeature. Just not
> clear on what the intention is. Why not use
> kvm_cet_kernel_shstk_supported() again?
I don't know how SVM supports supervisor SHSTK either, here just follows
the spec.
to add the dependency check. Maybe you're right, I need to use
kvm_cet_kernel_shstk_supported()
in my patch set and leave the work to SVM enabling patches. I'll change
it, thanks!
On 5/4/2023 1:08 AM, Edgecombe, Rick P wrote:
> On Fri, 2023-04-21 at 09:46 -0400, Yang Weijiang wrote:
>> Introduce a host-only synthetic MSR, MSR_KVM_GUEST_SSP, so that the
>> VMM
>> can read/write the guest's SSP, e.g. to migrate CET state. Use a
>> synthetic
>> MSR, e.g. as opposed to a VCPU_REG_, as GUEST_SSP is subject to the
>> same
>> consistency checks as the PL*_SSP MSRs, i.e. can share code.
> It seems this is exposed to the guest? I'm thinking maybe it should not
> be. IA32_PL0_SSP comes with some extra checks, so MSR_KVM_GUEST_SSP
> seems a bit powerful. I think the guest doesn't need it either.
Make sense. The MSR is just for live migration purpose, no need to
expose it to guest,
will change it, thanks!
On 4/28/2023 2:09 PM, Binbin Wu wrote:
>
>
> On 4/21/2023 9:46 PM, Yang Weijiang wrote:
>> Add handling for Control Protection (#CP) exceptions(vector 21).
>> The new vector is introduced for Intel's Control-Flow Enforcement
>> Technology (CET) relevant violation cases.
>> See Intel's SDM for details.
>>
[...]
>> -static int exception_class(int vector)
>> +static int exception_class(struct kvm_vcpu *vcpu, int vector)
>> {
>> switch (vector) {
>> case PF_VECTOR:
>> return EXCPT_PF;
>> + case CP_VECTOR:
>> + if (vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET)
>> + return EXCPT_BENIGN;
>> + return EXCPT_CONTRIBUTORY;
> By definition, #CP is Contributory.
> Can you explain more about this change here which treats #CP as
> EXCPT_BENIGN when CET is not enabled in guest?
I check the history of this patch, found maintainer modified the patch
due to some unit test issue in L1. You can check the
details here:
Re: [PATCH v15 04/14] KVM: x86: Add #CP support in guest exception
dispatch - Sean Christopherson (kernel.org)
<https://lore.kernel.org/all/[email protected]/>
>
> In current KVM code, there is suppose no #CP triggered in guest if CET
> is not enalbed in guest, right?
Yes.
>> case DE_VECTOR:
>> case TS_VECTOR:
>> case NP_VECTOR:
[...]
On Thu, 2023-05-04 at 09:20 +0800, Yang, Weijiang wrote:
>
> On 5/4/2023 1:07 AM, Edgecombe, Rick P wrote:
> > On Fri, 2023-04-21 at 09:46 -0400, Yang Weijiang wrote:
> > > +
> > > + incpt = !is_cet_state_supported(vcpu,
> > > XFEATURE_MASK_CET_KERNEL);
> > > + incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
> > > +
> > > + vmx_set_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB,
> > > MSR_TYPE_RW, incpt);
> > > + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP,
> > > MSR_TYPE_RW, incpt);
> > > + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP,
> > > MSR_TYPE_RW, incpt);
> > > + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP,
> > > MSR_TYPE_RW, incpt);
> > > }
> > Why is this tied to XFEATURE_MASK_CET_KERNEL? I don't know how the
> > SVM
> > side works, but the host kernel doesn't use this xfeature. Just not
> > clear on what the intention is. Why not use
> > kvm_cet_kernel_shstk_supported() again?
>
> I don't know how SVM supports supervisor SHSTK either, here just
> follows
> the spec.
What aspect of the spec is this?
>
> to add the dependency check. Maybe you're right, I need to use
> kvm_cet_kernel_shstk_supported()
>
> in my patch set and leave the work to SVM enabling patches. I'll
> change
> it, thanks!
Oh, I see the the SVM patch [0] is adding XFEATURE_MASK_CET_KERNEL to
kvm_caps.supported_xss as long as kvm_cpu_cap_has(X86_FEATURE_SHSTK).
And it does not look to be checking XSS host support like how
kvm_caps.supported_xss is set in your patch. It should depend on host
support, right? Is that the intent of kvm_caps.supported_xss?
Separate from all that, the code above is in VMX, so not sure how it
affects SVM in any case.
I might be confused here. The code just looked suspicious.
[0]
https://lore.kernel.org/kvm/[email protected]/
On 5/4/2023 11:41 AM, Yang, Weijiang wrote:
>
> On 4/28/2023 2:09 PM, Binbin Wu wrote:
>>
>>
>> On 4/21/2023 9:46 PM, Yang Weijiang wrote:
>>> Add handling for Control Protection (#CP) exceptions(vector 21).
>>> The new vector is introduced for Intel's Control-Flow Enforcement
>>> Technology (CET) relevant violation cases.
>>> See Intel's SDM for details.
>>>
> [...]
>>> -static int exception_class(int vector)
>>> +static int exception_class(struct kvm_vcpu *vcpu, int vector)
>>> {
>>> switch (vector) {
>>> case PF_VECTOR:
>>> return EXCPT_PF;
>>> + case CP_VECTOR:
>>> + if (vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET)
>>> + return EXCPT_BENIGN;
>>> + return EXCPT_CONTRIBUTORY;
>> By definition, #CP is Contributory.
>> Can you explain more about this change here which treats #CP as
>> EXCPT_BENIGN when CET is not enabled in guest?
>
> I check the history of this patch, found maintainer modified the patch
> due to some unit test issue in L1. You can check the
>
> details here:
>
> Re: [PATCH v15 04/14] KVM: x86: Add #CP support in guest exception
> dispatch - Sean Christopherson (kernel.org)
> <https://lore.kernel.org/all/[email protected]/>
>
OK, is it better to add the reason in changelog?
IIUC, a new contributory exception vector (if any) should be handled
similarly (i.e., treated as contributory conditionally) in the future,
right?
>
>>
>> In current KVM code, there is suppose no #CP triggered in guest if
>> CET is not enalbed in guest, right?
>
> Yes.
>
>>> case DE_VECTOR:
>>> case TS_VECTOR:
>>> case NP_VECTOR:
>
>
> [...]
>
On 5/4/2023 12:17 PM, Edgecombe, Rick P wrote:
> On Thu, 2023-05-04 at 09:20 +0800, Yang, Weijiang wrote:
>> On 5/4/2023 1:07 AM, Edgecombe, Rick P wrote:
>>> On Fri, 2023-04-21 at 09:46 -0400, Yang Weijiang wrote:
>>>> +
>>>> + incpt = !is_cet_state_supported(vcpu,
>>>> XFEATURE_MASK_CET_KERNEL);
>>>> + incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
>>>> +
>>>> + vmx_set_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB,
>>>> MSR_TYPE_RW, incpt);
>>>> + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP,
>>>> MSR_TYPE_RW, incpt);
>>>> + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP,
>>>> MSR_TYPE_RW, incpt);
>>>> + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP,
>>>> MSR_TYPE_RW, incpt);
>>>> }
>>> Why is this tied to XFEATURE_MASK_CET_KERNEL? I don't know how the
>>> SVM
>>> side works, but the host kernel doesn't use this xfeature. Just not
>>> clear on what the intention is. Why not use
>>> kvm_cet_kernel_shstk_supported() again?
>> I don't know how SVM supports supervisor SHSTK either, here just
>> follows
>> the spec.
> What aspect of the spec is this?
I assumed the supervisor SHSTK states are backed via XSAVES/SRSTORS with
XFEATURE_MASK_CET_KERNEL set in XSS. This is arguable since implementation
is not determined, but XSAVES is an efficient way to manage the states
compared with
manually save/restore the MSRs.
>
>> to add the dependency check. Maybe you're right, I need to use
>> kvm_cet_kernel_shstk_supported()
>>
>> in my patch set and leave the work to SVM enabling patches. I'll
>> change
>> it, thanks!
> Oh, I see the the SVM patch [0] is adding XFEATURE_MASK_CET_KERNEL to
> kvm_caps.supported_xss as long as kvm_cpu_cap_has(X86_FEATURE_SHSTK).
> And it does not look to be checking XSS host support like how
> kvm_caps.supported_xss is set in your patch. It should depend on host
> support, right?
Yes, it should rely on host to back the states as long as the supervisor
SHSTK MSRs are implemented as XSAVES/XRSTORS managed.
> Is that the intent of kvm_caps.supported_xss?
Yes, it's used to indicate all host XSS supported guest features.
>
> Separate from all that, the code above is in VMX, so not sure how it
> affects SVM in any case.
I was confused a bit. Yes, the pass-through check is specific to VMX,
there could
be other implementation in SVM.
>
> I might be confused here. The code just looked suspicious.
>
> [0]
> https://lore.kernel.org/kvm/[email protected]/
IMO, above patch is not necessary as kvm_caps.supported_xss is
initialized in x86 part and
shared by both SVM and VMX.
On 5/4/2023 1:36 PM, Binbin Wu wrote:
>
>
> On 5/4/2023 11:41 AM, Yang, Weijiang wrote:
>>
>> On 4/28/2023 2:09 PM, Binbin Wu wrote:
>>>
>>>
>>> On 4/21/2023 9:46 PM, Yang Weijiang wrote:
>>>> Add handling for Control Protection (#CP) exceptions(vector 21).
>>>> The new vector is introduced for Intel's Control-Flow Enforcement
>>>> Technology (CET) relevant violation cases.
>>>> See Intel's SDM for details.
>>>>
>> [...]
>>>> -static int exception_class(int vector)
>>>> +static int exception_class(struct kvm_vcpu *vcpu, int vector)
>>>> {
>>>> switch (vector) {
>>>> case PF_VECTOR:
>>>> return EXCPT_PF;
>>>> + case CP_VECTOR:
>>>> + if (vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET)
>>>> + return EXCPT_BENIGN;
>>>> + return EXCPT_CONTRIBUTORY;
>>> By definition, #CP is Contributory.
>>> Can you explain more about this change here which treats #CP as
>>> EXCPT_BENIGN when CET is not enabled in guest?
>>
>> I check the history of this patch, found maintainer modified the
>> patch due to some unit test issue in L1. You can check the
>>
>> details here:
>>
>> Re: [PATCH v15 04/14] KVM: x86: Add #CP support in guest exception
>> dispatch - Sean Christopherson (kernel.org)
>> <https://lore.kernel.org/all/[email protected]/>
>>
> OK, is it better to add the reason in changelog?
>
> IIUC, a new contributory exception vector (if any) should be handled
> similarly (i.e., treated as contributory conditionally) in the future,
> right?
Agree although the issue happens in an uncommon case, I'll add some
description in changelog in following version, thanks!
>
>
>>
>>>
>>> In current KVM code, there is suppose no #CP triggered in guest if
>>> CET is not enalbed in guest, right?
>>
>> Yes.
>>
>>>> case DE_VECTOR:
>>>> case TS_VECTOR:
>>>> case NP_VECTOR:
>>
>>
>> [...]
>>
>
On 4/21/2023 9:46 PM, Yang Weijiang wrote:
> Check potential faults for CR4.CET setting per Intel SDM.
> CR4.CET is the master control bit for CET features (SHSTK and IBT).
> In addition to basic support checks, CET can be enabled if and only
> if CR0.WP==1, i.e. setting CR4.CET=1 faults if CR0.WP==0 and setting
> CR0.WP=0 fails if CR4.CET==1.
>
> Co-developed-by: Sean Christopherson <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>
> Signed-off-by: Yang Weijiang <[email protected]>
> ---
> arch/x86/kvm/x86.c | 6 ++++++
> arch/x86/kvm/x86.h | 3 +++
> 2 files changed, 9 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index a768cbf3fbb7..7cd7f6755acd 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -995,6 +995,9 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
> (is_64_bit_mode(vcpu) || kvm_is_cr4_bit_set(vcpu, X86_CR4_PCIDE)))
> return 1;
>
> + if (!(cr0 & X86_CR0_WP) && kvm_read_cr4_bits(vcpu, X86_CR4_CET))
You can use kvm_is_cr4_bit_set() instead of kvm_read_cr4_bits()
> + return 1;
> +
> static_call(kvm_x86_set_cr0)(vcpu, cr0);
>
> kvm_post_set_cr0(vcpu, old_cr0, cr0);
> @@ -1210,6 +1213,9 @@ int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
> return 1;
> }
>
> + if ((cr4 & X86_CR4_CET) && !(kvm_read_cr0(vcpu) & X86_CR0_WP))
You can use kvm_is_cr0_bit_set() to check X86_CR0_WP
> + return 1;
> +
> static_call(kvm_x86_set_cr4)(vcpu, cr4);
>
> kvm_post_set_cr4(vcpu, old_cr4, cr4);
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 2ba7c7fc4846..daadd5330dae 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -536,6 +536,9 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type);
> __reserved_bits |= X86_CR4_VMXE; \
> if (!__cpu_has(__c, X86_FEATURE_PCID)) \
> __reserved_bits |= X86_CR4_PCIDE; \
> + if (!__cpu_has(__c, X86_FEATURE_SHSTK) && \
> + !__cpu_has(__c, X86_FEATURE_IBT)) \
> + __reserved_bits |= X86_CR4_CET; \
IMO, it is a bit wired to split this part from the change of
CR4_RESERVED_BITS.
> __reserved_bits; \
> })
>
On 5/5/2023 1:01 PM, Binbin Wu wrote:
>
>
> On 4/21/2023 9:46 PM, Yang Weijiang wrote:
[...]
>> @@ -995,6 +995,9 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned
>> long cr0)
>> (is_64_bit_mode(vcpu) || kvm_is_cr4_bit_set(vcpu,
>> X86_CR4_PCIDE)))
>> return 1;
>> + if (!(cr0 & X86_CR0_WP) && kvm_read_cr4_bits(vcpu, X86_CR4_CET))
> You can use kvm_is_cr4_bit_set() instead of kvm_read_cr4_bits()
Good suggestion, thanks!
>
>> + return 1;
>> +
>> static_call(kvm_x86_set_cr0)(vcpu, cr0);
>> kvm_post_set_cr0(vcpu, old_cr0, cr0);
>> @@ -1210,6 +1213,9 @@ int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned
>> long cr4)
>> return 1;
>> }
>> + if ((cr4 & X86_CR4_CET) && !(kvm_read_cr0(vcpu) & X86_CR0_WP))
> You can use kvm_is_cr0_bit_set() to check X86_CR0_WP
OK.
>
>> + return 1;
>> +
>>
[...]
>> @@ -536,6 +536,9 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32
>> index, u32 type);
>> __reserved_bits |= X86_CR4_VMXE; \
>> if (!__cpu_has(__c, X86_FEATURE_PCID)) \
>> __reserved_bits |= X86_CR4_PCIDE; \
>> + if (!__cpu_has(__c, X86_FEATURE_SHSTK) && \
>> + !__cpu_has(__c, X86_FEATURE_IBT)) \
>> + __reserved_bits |= X86_CR4_CET; \
> IMO, it is a bit wired to split this part from the change of
> CR4_RESERVED_BITS.
Make sense, will move these lines to other patch.
>
>
>> __reserved_bits; \
>> })
>