2020-07-16 03:20:38

by Yang, Weijiang

[permalink] [raw]
Subject: [RESEND PATCH v13 00/11] Introduce support for guest CET feature

Control-flow Enforcement Technology (CET) provides protection against
Return/Jump-Oriented Programming (ROP/JOP) attack. There're two CET
sub-features: Shadow Stack (SHSTK) and Indirect Branch Tracking (IBT).
SHSTK is to prevent ROP programming and IBT is to prevent JOP programming.

Several parts in KVM have been updated to provide VM CET support, including:
CPUID/XSAVES config, MSR pass-through, user space MSR access interface,
vmentry/vmexit config, nested VM etc. These patches have dependency on CET
kernel patches for xsaves support and CET definitions, e.g., MSR and related
feature flags.

CET kernel patches are here:
https://lkml.kernel.org/r/[email protected]

v13:
- Added CET definitions as a separate patch to facilitate KVM test.
- Disabled CET support in KVM if unrestricted_guest is turned off since
in this case CET related instructions/infrastructure cannot be emulated
well.
- Don't expose CET feature to guest if host kernel doesn't support CET.
- Rebased the series to v5.8-rc5, commit: 11ba468877bb

v12:
- Fixed a few issues per Sean and Paolo's review feeback.
- Refactored patches to make them properly arranged.
- Removed unnecessary hard-coded CET states for host/guest.
- Added compile-time assertions for vmcs_field_to_offset_table to detect
mismatch of the field type and field encoding number.
- Added a custom MSR MSR_KVM_GUEST_SSP for guest active SSP save/restore.
- Rebased patches to 5.7-rc3.

v11:
- Fixed a guest vmentry failure issue when guest reboots.
- Used vm_xxx_control_{set, clear}bit() to avoid side effect, it'll
clear cached data instead of pure VMCS field bits.
- Added vcpu->arch.guest_supported_xss dedidated for guest runtime mask,
this avoids supported_xss overwritten issue caused by an old qemu.
- Separated vmentry/vmexit state setting with CR0/CR4 dependency check
to make the patch more clear.
- Added CET VMCS states in dump_vmcs() for debugging purpose.
- Other refactor based on testing.
- This patch serial is built on top of below branch and CET kernel patches
for seeking xsaves support:
https://git.kernel.org/pub/scm/virt/kvm/kvm.git/log/?h=cpu-caps

v10:
- Refactored code per Sean's review feedback.
- Added CET support for nested VM.
- Removed fix-patch for CPUID(0xd,N) enumeration as this part is done
by Paolo and Sean.
- This new patchset is based on Paolo's queued cpu_caps branch.
- Modified patch per XSAVES related change.
- Consolidated KVM unit-test patch with KVM patches.

v9:
- Refactored msr-check functions per Sean's feedback.
- Fixed a few issues per Sean's suggestion.
- Rebased patch to kernel-v5.4.
- Moved CET CPUID feature bits and CR4.CET to last patch.

v8:
- Addressed Jim and Sean's feedback on: 1) CPUID(0xD,i) enumeration. 2)
sanity check when configure guest CET. 3) function improvement.
- Added more sanity check functions.
- Set host vmexit default status so that guest won't leak CET status to
host when vmexit.
- Added CR0.WP vs. CR4.CET mutual constrains.

v7:
- Rebased patch to kernel v5.3
- Sean suggested to change CPUID(0xd, n) enumeration code as alined with
existing one, and I think it's better to make the fix as an independent patch
since XSS MSR are being used widely on X86 platforms.
- Check more host and guest status before configure guest CET
per Sean's feedback.
- Add error-check before guest accesses CET MSRs per Sean's feedback.
- Other minor fixes suggested by Sean.

v6:
- Rebase patch to kernel v5.2.
- Move CPUID(0xD, n>=1) helper to a seperate patch.
- Merge xsave size fix with other patch.
- Other minor fixes per community feedback.

v5:
- Rebase patch to kernel v5.1.
- Wrap CPUID(0xD, n>=1) code to a helper function.
- Pass through MSR_IA32_PL1_SSP and MSR_IA32_PL2_SSP to Guest.
- Add Co-developed-by expression in patch description.
- Refine patch description.

v4:
- Add Sean's patch for loading Guest fpu state before access XSAVES
managed CET MSRs.
- Melt down CET bits setting into CPUID configuration patch.
- Add VMX interface to query Host XSS.
- Check Host and Guest XSS support bits before set Guest XSS.
- Make Guest SHSTK and IBT feature enabling independent.
- Do not report CET support to Guest when Host CET feature is Disabled.

v3:
- Modified patches to make Guest CET independent to Host enabling.
- Added patch 8 to add user space access for Guest CET MSR access.
- Modified code comments and patch description to reflect changes.

v2:
- Re-ordered patch sequence, combined one patch.
- Added more description for CET related VMCS fields.
- Added Host CET capability check while enabling Guest CET loading bit.
- Added Host CET capability check while reporting Guest CPUID(EAX=7, EXC=0).
- Modified code in reporting Guest CPUID(EAX=D,ECX>=1), make it clearer.
- Added Host and Guest XSS mask check while setting bits for Guest XSS.

Sean Christopherson (1):
KVM: x86: Load guest fpu state when access MSRs managed by XSAVES

Yang Weijiang (10):
KVM: x86: Include CET definitions for KVM test purpose
KVM: VMX: Introduce CET VMCS fields and flags
KVM: VMX: Set guest CET MSRs per KVM and host configuration
KVM: VMX: Configure CET settings upon guest CR0/4 changing
KVM: x86: Refresh CPUID once guest changes XSS bits
KVM: x86: Add userspace access interface for CET MSRs
KVM: VMX: Enable CET support for nested VM
KVM: VMX: Add VMCS dump and sanity check for CET states
KVM: x86: Add #CP support in guest exception dispatch
KVM: x86: Enable CET virtualization and advertise CET to userspace

arch/x86/include/asm/kvm_host.h | 4 +-
arch/x86/include/asm/vmx.h | 8 +
arch/x86/include/uapi/asm/kvm.h | 1 +
arch/x86/include/uapi/asm/kvm_para.h | 7 +-
arch/x86/kvm/cpuid.c | 28 ++-
arch/x86/kvm/vmx/capabilities.h | 5 +
arch/x86/kvm/vmx/nested.c | 34 ++++
arch/x86/kvm/vmx/vmcs12.c | 267 ++++++++++++++++-----------
arch/x86/kvm/vmx/vmcs12.h | 14 +-
arch/x86/kvm/vmx/vmx.c | 262 +++++++++++++++++++++++++-
arch/x86/kvm/x86.c | 53 +++++-
arch/x86/kvm/x86.h | 2 +-
include/linux/kvm_host.h | 32 ++++
13 files changed, 590 insertions(+), 127 deletions(-)

--
2.17.2


2020-07-16 03:21:03

by Yang, Weijiang

[permalink] [raw]
Subject: [RESEND v13 05/11] KVM: x86: Refresh CPUID once guest changes XSS bits

CPUID(0xd, 1) reports the current required storage size of XCR0 | XSS,
when guest updates the XSS, it's necessary to update the CPUID leaf, otherwise
guest will fetch old state size, and results to some WARN traces during guest
running.

supported_xss is initialized to host_xss & KVM_SUPPORTED_XSS to indicate current
MSR_IA32_XSS bits supported in KVM, but actual XSS bits seen in guest depends
on the setting of CPUID(0xd,1).{ECX, EDX} for guest.

Co-developed-by: Zhang Yi Z <[email protected]>
Signed-off-by: Zhang Yi Z <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/cpuid.c | 23 +++++++++++++++++++----
arch/x86/kvm/x86.c | 12 ++++++++----
3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index be5363b21540..e8c749596ba2 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -654,6 +654,7 @@ struct kvm_vcpu_arch {

u64 xcr0;
u64 guest_supported_xcr0;
+ u64 guest_supported_xss;

struct kvm_pio_request pio;
void *pio_data;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 8a294f9747aa..d97b2a6e8a8c 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -88,14 +88,29 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
vcpu->arch.guest_supported_xcr0 = 0;
} else {
vcpu->arch.guest_supported_xcr0 =
- (best->eax | ((u64)best->edx << 32)) & supported_xcr0;
+ (((u64)best->edx << 32) | best->eax) & supported_xcr0;
best->ebx = xstate_required_size(vcpu->arch.xcr0, false);
}

best = kvm_find_cpuid_entry(vcpu, 0xD, 1);
- if (best && (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
- cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
- best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
+ if (best) {
+ if (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
+ cpuid_entry_has(best, X86_FEATURE_XSAVEC)) {
+ u64 xstate = vcpu->arch.xcr0 | vcpu->arch.ia32_xss;
+
+ best->ebx = xstate_required_size(xstate, true);
+ }
+
+ if (!cpuid_entry_has(best, X86_FEATURE_XSAVES)) {
+ best->ecx = 0;
+ best->edx = 0;
+ }
+ vcpu->arch.guest_supported_xss =
+ (((u64)best->edx << 32) | best->ecx) & supported_xss;
+
+ } else {
+ vcpu->arch.guest_supported_xss = 0;
+ }

/*
* The existing code assumes virtual address is 48-bit or 57-bit in the
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 906e07039d59..8aed32ff9c0c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2912,9 +2912,12 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
* IA32_XSS[bit 8]. Guests have to use RDMSR/WRMSR rather than
* XSAVES/XRSTORS to save/restore PT MSRs.
*/
- if (data & ~supported_xss)
+ if (data & ~vcpu->arch.guest_supported_xss)
return 1;
- vcpu->arch.ia32_xss = data;
+ if (vcpu->arch.ia32_xss != data) {
+ vcpu->arch.ia32_xss = data;
+ kvm_update_cpuid(vcpu);
+ }
break;
case MSR_SMI_COUNT:
if (!msr_info->host_initiated)
@@ -9779,8 +9782,9 @@ int kvm_arch_hardware_setup(void *opaque)

memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));

- if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
- supported_xss = 0;
+ supported_xss = 0;
+ if (kvm_cpu_cap_has(X86_FEATURE_XSAVES))
+ supported_xss = host_xss & KVM_SUPPORTED_XSS;

#define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_);
--
2.17.2

2020-07-16 03:21:05

by Yang, Weijiang

[permalink] [raw]
Subject: [RESEND v13 03/11] KVM: VMX: Set guest CET MSRs per KVM and host configuration

CET MSRs pass through guest directly to enhance performance. CET runtime
control settings are stored in MSR_IA32_{U,S}_CET, Shadow Stack Pointer(SSP)
are stored in MSR_IA32_PL{0,1,2,3}_SSP, SSP table base address is stored in
MSR_IA32_INT_SSP_TAB, these MSRs are defined in kernel and re-used here.

MSR_IA32_U_CET and MSR_IA32_PL3_SSP are used for user-mode protection,the MSR
contents are switched between threads during scheduling, it makes sense to pass
through them so that the guest kernel can use xsaves/xrstors to operate them
efficiently. Other MSRs are used for non-user mode protection. See SDM for detailed
info.

The difference between CET VMCS fields and CET MSRs is that,the former are used
during VMEnter/VMExit, whereas the latter are used for CET state storage between
task/thread scheduling.

Co-developed-by: Zhang Yi Z <[email protected]>
Signed-off-by: Zhang Yi Z <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 46 ++++++++++++++++++++++++++++++++++++++++++
arch/x86/kvm/x86.c | 3 +++
2 files changed, 49 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 13745f2a5ecd..a9f135c52cbc 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3126,6 +3126,13 @@ void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long pgd)
vmcs_writel(GUEST_CR3, guest_cr3);
}

+static bool is_cet_state_supported(struct kvm_vcpu *vcpu, u32 xss_states)
+{
+ return ((supported_xss & xss_states) &&
+ (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
+ guest_cpuid_has(vcpu, X86_FEATURE_IBT)));
+}
+
int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -7230,6 +7237,42 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4));
}

+static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu)
+{
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
+ unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;
+ bool incpt;
+
+ incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER);
+ /*
+ * U_CET is required for USER CET, and U_CET, PL3_SPP are bound as
+ * one component and controlled by IA32_XSS[bit 11].
+ */
+ vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_U_CET, MSR_TYPE_RW,
+ incpt);
+ vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL3_SSP, MSR_TYPE_RW,
+ incpt);
+
+ incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_KERNEL);
+ /*
+ * S_CET is required for KERNEL CET, and PL0_SSP ... PL2_SSP are
+ * bound as one component and controlled by IA32_XSS[bit 12].
+ */
+ vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_S_CET, MSR_TYPE_RW,
+ incpt);
+ vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL0_SSP, MSR_TYPE_RW,
+ incpt);
+ vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL1_SSP, MSR_TYPE_RW,
+ incpt);
+ vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL2_SSP, MSR_TYPE_RW,
+ incpt);
+
+ incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
+ /* SSP_TAB is only available for KERNEL SHSTK.*/
+ vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW,
+ incpt);
+}
+
static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -7268,6 +7311,9 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
vmx_set_guest_msr(vmx, msr, enabled ? 0 : TSX_CTRL_RTM_DISABLE);
}
}
+
+ if (supported_xss & (XFEATURE_MASK_CET_KERNEL | XFEATURE_MASK_CET_USER))
+ vmx_update_intercept_for_cet_msr(vcpu);
}

static __init void vmx_set_cpu_caps(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 88c593f83b28..ea8a9dc9fbad 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -184,6 +184,9 @@ static struct kvm_shared_msrs __percpu *shared_msrs;
| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
| XFEATURE_MASK_PKRU)

+#define KVM_SUPPORTED_XSS (XFEATURE_MASK_CET_USER | \
+ XFEATURE_MASK_CET_KERNEL)
+
u64 __read_mostly host_efer;
EXPORT_SYMBOL_GPL(host_efer);

--
2.17.2

2020-07-22 19:48:44

by Sean Christopherson

[permalink] [raw]
Subject: Re: [RESEND PATCH v13 00/11] Introduce support for guest CET feature

On Thu, Jul 16, 2020 at 11:16:16AM +0800, Yang Weijiang wrote:
> Control-flow Enforcement Technology (CET) provides protection against
> Return/Jump-Oriented Programming (ROP/JOP) attack. There're two CET
> sub-features: Shadow Stack (SHSTK) and Indirect Branch Tracking (IBT).
> SHSTK is to prevent ROP programming and IBT is to prevent JOP programming.
>
> Several parts in KVM have been updated to provide VM CET support, including:
> CPUID/XSAVES config, MSR pass-through, user space MSR access interface,
> vmentry/vmexit config, nested VM etc. These patches have dependency on CET
> kernel patches for xsaves support and CET definitions, e.g., MSR and related
> feature flags.
>
> CET kernel patches are here:
> https://lkml.kernel.org/r/[email protected]
>
> v13:
> - Added CET definitions as a separate patch to facilitate KVM test.

What I actually want to do is pull in actual kernel patches themselves so
that we can upstream KVM support without having to wait for the kernel to
sort out the ABI, which seems like it's going to drag on.

I was thinking that we'd only need the MSR/CR4/CPUID definitions, but forgot
that KVM also needs XSAVES context switching, so it's not as simple as I was
thinking. It's still relatively simple, but it means there would be
functional changes in the kernel.

I'll respond to the main SSP series to pose the question of taking the two
small-ish kernel patches through the KVM tree.

> arch/x86/include/asm/kvm_host.h | 4 +-
> arch/x86/include/asm/vmx.h | 8 +
> arch/x86/include/uapi/asm/kvm.h | 1 +
> arch/x86/include/uapi/asm/kvm_para.h | 7 +-
> arch/x86/kvm/cpuid.c | 28 ++-
> arch/x86/kvm/vmx/capabilities.h | 5 +
> arch/x86/kvm/vmx/nested.c | 34 ++++
> arch/x86/kvm/vmx/vmcs12.c | 267 ++++++++++++++++-----------
> arch/x86/kvm/vmx/vmcs12.h | 14 +-
> arch/x86/kvm/vmx/vmx.c | 262 +++++++++++++++++++++++++-
> arch/x86/kvm/x86.c | 53 +++++-
> arch/x86/kvm/x86.h | 2 +-
> include/linux/kvm_host.h | 32 ++++
> 13 files changed, 590 insertions(+), 127 deletions(-)

I have quite a few comments/changes (will respond to individual patches),
but have done all the updates/rework and, assuming I haven't broken things,
we're nearing the point where I can carry this and push it past the finish
line, e.g. get acks from tip/x86 maintainers for the kernel patches and
send a pull request to Paolo.

I pushed the result to:

https://github.com/sean-jc/linux/releases/tag/kvm-cet-v14-rc1

can you please review and test? If everything looks good, I'll post v14.
If not, I'll work offline with you to get it into shape.

Thanks!

2020-07-22 20:16:47

by Sean Christopherson

[permalink] [raw]
Subject: Re: [RESEND v13 03/11] KVM: VMX: Set guest CET MSRs per KVM and host configuration

On Thu, Jul 16, 2020 at 11:16:19AM +0800, Yang Weijiang wrote:
> CET MSRs pass through guest directly to enhance performance. CET runtime
> control settings are stored in MSR_IA32_{U,S}_CET, Shadow Stack Pointer(SSP)
> are stored in MSR_IA32_PL{0,1,2,3}_SSP, SSP table base address is stored in
> MSR_IA32_INT_SSP_TAB, these MSRs are defined in kernel and re-used here.
>
> MSR_IA32_U_CET and MSR_IA32_PL3_SSP are used for user-mode protection,the MSR
> contents are switched between threads during scheduling, it makes sense to pass
> through them so that the guest kernel can use xsaves/xrstors to operate them
> efficiently. Other MSRs are used for non-user mode protection. See SDM for detailed
> info.
>
> The difference between CET VMCS fields and CET MSRs is that,the former are used
> during VMEnter/VMExit, whereas the latter are used for CET state storage between
> task/thread scheduling.

I moved this patch until after CET virtualization is enabled. Functionally,
KVM should be able to virtualize CET without allowing the guest to access
the MSRs directly. Moving this after CET enabling will allow bisecting this
specific patch, e.g. if there's a problem with pass-through but not basic
support, or vice versa, and will also allow better testing of the MSR access
code. At least, that's the theory.

> Co-developed-by: Zhang Yi Z <[email protected]>
> Signed-off-by: Zhang Yi Z <[email protected]>
> Signed-off-by: Yang Weijiang <[email protected]>
> ---
> arch/x86/kvm/vmx/vmx.c | 46 ++++++++++++++++++++++++++++++++++++++++++
> arch/x86/kvm/x86.c | 3 +++
> 2 files changed, 49 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 13745f2a5ecd..a9f135c52cbc 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -3126,6 +3126,13 @@ void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long pgd)
> vmcs_writel(GUEST_CR3, guest_cr3);
> }
>
> +static bool is_cet_state_supported(struct kvm_vcpu *vcpu, u32 xss_states)

s/xss_states/xss_state, i.e. make it singular instead of plural to match the
function name, and because the below check is at best ambiguous for multiple
states, e.g. it arguably should be ((supported_xss & xss_states) == xss_states).

> +{
> + return ((supported_xss & xss_states) &&

Nit, the () around the entire statement is unnecessary.

Checking supported_xss is incorrect, this needs to check guest_supported_xss.

> + (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
> + guest_cpuid_has(vcpu, X86_FEATURE_IBT)));

Nit, please align inner statements, e.g. so it looks like:

(guest_cpuid_has(...) ||
guest_cpuid_has(...)))

> +}
> +
> int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
> {
> struct vcpu_vmx *vmx = to_vmx(vcpu);
> @@ -7230,6 +7237,42 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
> vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4));
> }
>
> +static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu)
> +{
> + struct vcpu_vmx *vmx = to_vmx(vcpu);
> + unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;

Naming the local 'bitmap' will avoid wrapping lines. Everything except
INT_SSP_TAB fits under 80 chars, and for that one it's ok to poke out a bit.

> + bool incpt;
> +
> + incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER);

> + /*
> + * U_CET is required for USER CET, and U_CET, PL3_SPP are bound as
> + * one component and controlled by IA32_XSS[bit 11].
> + */
> + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_U_CET, MSR_TYPE_RW,
> + incpt);
> + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL3_SSP, MSR_TYPE_RW,
> + incpt);

This is wrong, PL3_SSP should be intercepted if IBT is supported by SHSTK is
not. Weird XSAVES virtualization hole aside, we need to be consistent with
the emulation of the MSRs.

> +
> + incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_KERNEL);
> + /*
> + * S_CET is required for KERNEL CET, and PL0_SSP ... PL2_SSP are
> + * bound as one component and controlled by IA32_XSS[bit 12].
> + */
> + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_S_CET, MSR_TYPE_RW,
> + incpt);
> + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL0_SSP, MSR_TYPE_RW,
> + incpt);
> + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL1_SSP, MSR_TYPE_RW,
> + incpt);
> + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL2_SSP, MSR_TYPE_RW,
> + incpt);
> +
> + incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
> + /* SSP_TAB is only available for KERNEL SHSTK.*/
> + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW,
> + incpt);
> +}
> +
> static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
> {
> struct vcpu_vmx *vmx = to_vmx(vcpu);
> @@ -7268,6 +7311,9 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
> vmx_set_guest_msr(vmx, msr, enabled ? 0 : TSX_CTRL_RTM_DISABLE);
> }
> }
> +
> + if (supported_xss & (XFEATURE_MASK_CET_KERNEL | XFEATURE_MASK_CET_USER))

Given that the proposed kernel support bundles USER and KERNEL together, we
can simplify the KVM implementation by adding kvm_cet_supported(), with a
similar implementation to MPX:

static inline bool kvm_cet_supported(void)
{
const u64 mask = XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL;

return (supported_xss & mask) == mask;
}

> + vmx_update_intercept_for_cet_msr(vcpu);
> }
>
> static __init void vmx_set_cpu_caps(void)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 88c593f83b28..ea8a9dc9fbad 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -184,6 +184,9 @@ static struct kvm_shared_msrs __percpu *shared_msrs;
> | XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
> | XFEATURE_MASK_PKRU)
>
> +#define KVM_SUPPORTED_XSS (XFEATURE_MASK_CET_USER | \
> + XFEATURE_MASK_CET_KERNEL)

Xiaoyao called out that this belongs in a later patch, which it does, but we
can actually do away with it entirely. Because CET is dependent on multiple
features and feature flags, we can't do a straight mask in any case, i.e.
having KVM_SUPPORTED_XSS doesn't add value as VMX needs to manually set the
CET flags anyways.

> u64 __read_mostly host_efer;
> EXPORT_SYMBOL_GPL(host_efer);
>
> --
> 2.17.2
>

2020-07-22 20:34:31

by Sean Christopherson

[permalink] [raw]
Subject: Re: [RESEND v13 05/11] KVM: x86: Refresh CPUID once guest changes XSS bits

On Thu, Jul 16, 2020 at 11:16:21AM +0800, Yang Weijiang wrote:
> CPUID(0xd, 1) reports the current required storage size of XCR0 | XSS,
> when guest updates the XSS, it's necessary to update the CPUID leaf, otherwise
> guest will fetch old state size, and results to some WARN traces during guest
> running.
>
> supported_xss is initialized to host_xss & KVM_SUPPORTED_XSS to indicate current
> MSR_IA32_XSS bits supported in KVM, but actual XSS bits seen in guest depends
> on the setting of CPUID(0xd,1).{ECX, EDX} for guest.
>
> Co-developed-by: Zhang Yi Z <[email protected]>
> Signed-off-by: Zhang Yi Z <[email protected]>
> Signed-off-by: Yang Weijiang <[email protected]>
> ---
> arch/x86/include/asm/kvm_host.h | 1 +
> arch/x86/kvm/cpuid.c | 23 +++++++++++++++++++----
> arch/x86/kvm/x86.c | 12 ++++++++----
> 3 files changed, 28 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index be5363b21540..e8c749596ba2 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -654,6 +654,7 @@ struct kvm_vcpu_arch {
>
> u64 xcr0;
> u64 guest_supported_xcr0;
> + u64 guest_supported_xss;
>
> struct kvm_pio_request pio;
> void *pio_data;
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 8a294f9747aa..d97b2a6e8a8c 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -88,14 +88,29 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
> vcpu->arch.guest_supported_xcr0 = 0;
> } else {
> vcpu->arch.guest_supported_xcr0 =
> - (best->eax | ((u64)best->edx << 32)) & supported_xcr0;
> + (((u64)best->edx << 32) | best->eax) & supported_xcr0;

While I don't necessarily disagree with the change, it doesn't belong in
this patch.

> best->ebx = xstate_required_size(vcpu->arch.xcr0, false);
> }
>
> best = kvm_find_cpuid_entry(vcpu, 0xD, 1);
> - if (best && (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
> - cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
> - best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
> + if (best) {
> + if (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
> + cpuid_entry_has(best, X86_FEATURE_XSAVEC)) {
> + u64 xstate = vcpu->arch.xcr0 | vcpu->arch.ia32_xss;
> +
> + best->ebx = xstate_required_size(xstate, true);
> + }
> +
> + if (!cpuid_entry_has(best, X86_FEATURE_XSAVES)) {
> + best->ecx = 0;
> + best->edx = 0;
> + }
> + vcpu->arch.guest_supported_xss =
> + (((u64)best->edx << 32) | best->ecx) & supported_xss;
> +
> + } else {
> + vcpu->arch.guest_supported_xss = 0;
> + }
>
> /*
> * The existing code assumes virtual address is 48-bit or 57-bit in the
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 906e07039d59..8aed32ff9c0c 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2912,9 +2912,12 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> * IA32_XSS[bit 8]. Guests have to use RDMSR/WRMSR rather than
> * XSAVES/XRSTORS to save/restore PT MSRs.
> */
> - if (data & ~supported_xss)
> + if (data & ~vcpu->arch.guest_supported_xss)
> return 1;
> - vcpu->arch.ia32_xss = data;
> + if (vcpu->arch.ia32_xss != data) {
> + vcpu->arch.ia32_xss = data;
> + kvm_update_cpuid(vcpu);
> + }
> break;
> case MSR_SMI_COUNT:
> if (!msr_info->host_initiated)
> @@ -9779,8 +9782,9 @@ int kvm_arch_hardware_setup(void *opaque)
>
> memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
>
> - if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
> - supported_xss = 0;
> + supported_xss = 0;
> + if (kvm_cpu_cap_has(X86_FEATURE_XSAVES))
> + supported_xss = host_xss & KVM_SUPPORTED_XSS;

Updating supported_xss in the actual enabling patch.

>
> #define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
> cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_);
> --
> 2.17.2
>

2020-07-23 03:12:00

by Yang, Weijiang

[permalink] [raw]
Subject: Re: [RESEND PATCH v13 00/11] Introduce support for guest CET feature

On Wed, Jul 22, 2020 at 12:48:05PM -0700, Sean Christopherson wrote:
> On Thu, Jul 16, 2020 at 11:16:16AM +0800, Yang Weijiang wrote:
> > Control-flow Enforcement Technology (CET) provides protection against
> > Return/Jump-Oriented Programming (ROP/JOP) attack. There're two CET
> > sub-features: Shadow Stack (SHSTK) and Indirect Branch Tracking (IBT).
> > SHSTK is to prevent ROP programming and IBT is to prevent JOP programming.
> >
> > Several parts in KVM have been updated to provide VM CET support, including:
> > CPUID/XSAVES config, MSR pass-through, user space MSR access interface,
> > vmentry/vmexit config, nested VM etc. These patches have dependency on CET
> > kernel patches for xsaves support and CET definitions, e.g., MSR and related
> > feature flags.
> >
> > CET kernel patches are here:
> > https://lkml.kernel.org/r/[email protected]
> >
> > v13:
> > - Added CET definitions as a separate patch to facilitate KVM test.
>
> What I actually want to do is pull in actual kernel patches themselves so
> that we can upstream KVM support without having to wait for the kernel to
> sort out the ABI, which seems like it's going to drag on.
That's an innovative idea and beyond my imagination, great!:-)
>
> I was thinking that we'd only need the MSR/CR4/CPUID definitions, but forgot
> that KVM also needs XSAVES context switching, so it's not as simple as I was
> thinking. It's still relatively simple, but it means there would be
> functional changes in the kernel.
>
> I'll respond to the main SSP series to pose the question of taking the two
> small-ish kernel patches through the KVM tree.
>
> > arch/x86/include/asm/kvm_host.h | 4 +-
> > arch/x86/include/asm/vmx.h | 8 +
> > arch/x86/include/uapi/asm/kvm.h | 1 +
> > arch/x86/include/uapi/asm/kvm_para.h | 7 +-
> > arch/x86/kvm/cpuid.c | 28 ++-
> > arch/x86/kvm/vmx/capabilities.h | 5 +
> > arch/x86/kvm/vmx/nested.c | 34 ++++
> > arch/x86/kvm/vmx/vmcs12.c | 267 ++++++++++++++++-----------
> > arch/x86/kvm/vmx/vmcs12.h | 14 +-
> > arch/x86/kvm/vmx/vmx.c | 262 +++++++++++++++++++++++++-
> > arch/x86/kvm/x86.c | 53 +++++-
> > arch/x86/kvm/x86.h | 2 +-
> > include/linux/kvm_host.h | 32 ++++
> > 13 files changed, 590 insertions(+), 127 deletions(-)
>
> I have quite a few comments/changes (will respond to individual patches),
> but have done all the updates/rework and, assuming I haven't broken things,
> we're nearing the point where I can carry this and push it past the finish
> line, e.g. get acks from tip/x86 maintainers for the kernel patches and
> send a pull request to Paolo.
>
> I pushed the result to:
>
> https://github.com/sean-jc/linux/releases/tag/kvm-cet-v14-rc1
>
> can you please review and test? If everything looks good, I'll post v14.
> If not, I'll work offline with you to get it into shape.
>
Thanks a lot for the efforts! I'll review and test the new patches and
let you know the status.

> Thanks!