2019-11-01 09:36:07

by Yang Weijiang

Subject: [PATCH v8 0/7] Introduce support for guest CET feature

Control-flow Enforcement Technology (CET) provides protection against
Return/Jump-Oriented Programming (ROP/JOP) attacks. It includes two
sub-features: Shadow Stack (SHSTK) and Indirect Branch Tracking (IBT).

KVM changes are required to support the guest CET feature.
This patch series implements CET-related CPUID/XSAVES enumeration, MSRs,
and vmentry/vmexit configuration, etc., so that the guest kernel can set up
the CET runtime infrastructure based on them. Some CET MSRs and related
feature flags used here reference the definitions in the kernel patchset.

The CET kernel patches are here:
https://lkml.org/lkml/2019/8/13/1110
https://lkml.org/lkml/2019/8/13/1109

v7 -> v8:
- Addressed Jim's and Sean's feedback on: 1) CPUID(0xD,i) enumeration; 2)
sanity checks when configuring guest CET; 3) function improvements.
- Added more sanity-check functions.
- Set the host vmexit default status so that the guest won't leak CET
status to the host on vmexit.
- Added CR0.WP vs. CR4.CET mutual constraints.

v6 -> v7:
- Rebased the patches to kernel v5.3.
- Sean suggested changing the CPUID(0xd, n) enumeration code to align with
the existing code; I think it's better to make the fix an independent patch
since the XSS MSR is widely used on x86 platforms.
- Checked more host and guest status before configuring guest CET,
per Sean's feedback.
- Added error checks before the guest accesses CET MSRs, per Sean's feedback.
- Other minor fixes suggested by Sean.

v5 -> v6:
- Rebase patch to kernel v5.2.
- Move the CPUID(0xD, n>=1) helper to a separate patch.
- Merge the xsave size fix into another patch.
- Other minor fixes per community feedback.

v4 -> v5:
- Rebase patch to kernel v5.1.
- Wrap CPUID(0xD, n>=1) code to a helper function.
- Pass through MSR_IA32_PL1_SSP and MSR_IA32_PL2_SSP to Guest.
- Add a Co-developed-by tag in the patch description.
- Refine patch description.

v3 -> v4:
- Add Sean's patch for loading Guest FPU state before accessing XSAVES-
managed CET MSRs.
- Merge the CET bits setting into the CPUID configuration patch.
- Add VMX interface to query Host XSS.
- Check Host and Guest XSS support bits before setting Guest XSS.
- Make Guest SHSTK and IBT feature enabling independent.
- Do not report CET support to the Guest when the Host CET feature is disabled.

v2 -> v3:
- Modified the patches to make Guest CET enabling independent of Host enabling.
- Added patch 8 for user-space access to Guest CET MSRs.
- Modified code comments and patch description to reflect changes.

v1 -> v2:
- Re-ordered patch sequence, combined one patch.
- Added more description for CET related VMCS fields.
- Added Host CET capability check while enabling Guest CET loading bit.
- Added Host CET capability check while reporting Guest CPUID(EAX=7, ECX=0).
- Modified the code reporting Guest CPUID(EAX=0xD, ECX>=1) to make it clearer.
- Added Host and Guest XSS mask check while setting bits for Guest XSS.


PATCH 1 : Fix CPUID(0xD, n) enumeration to support XSS MSR.
PATCH 2 : Define CET VMCS fields and bits.
PATCH 3 : Pass through CET MSRs to guest.
PATCH 4 : Load guest/host CET states on vmentry/vmexit.
PATCH 5 : Enable update to IA32_XSS for guest.
PATCH 6 : Load Guest FPU states for XSAVES managed MSRs.
PATCH 7 : Add user-space CET MSR access interface.



Sean Christopherson (1):
KVM: X86: Load guest fpu state when accessing MSRs managed by XSAVES

Yang Weijiang (6):
KVM: CPUID: Fix IA32_XSS support in CPUID(0xd,i) enumeration
KVM: VMX: Define CET VMCS fields and #CP flag
KVM: VMX: Pass through CET related MSRs
KVM: VMX: Load CET states on vmentry/vmexit
KVM: X86: Enable CET bits update in IA32_XSS
KVM: X86: Add user-space access interface for CET MSRs

arch/x86/include/asm/kvm_host.h | 5 +-
arch/x86/include/asm/vmx.h | 8 +
arch/x86/include/uapi/asm/kvm.h | 1 +
arch/x86/kvm/cpuid.c | 121 +++++++++-----
arch/x86/kvm/cpuid.h | 2 +
arch/x86/kvm/svm.c | 7 +
arch/x86/kvm/vmx/capabilities.h | 10 ++
arch/x86/kvm/vmx/vmx.c | 272 +++++++++++++++++++++++++++++++-
arch/x86/kvm/x86.c | 25 ++-
arch/x86/kvm/x86.h | 10 +-
10 files changed, 415 insertions(+), 46 deletions(-)

--
2.17.2


2019-11-01 09:36:08

by Yang Weijiang

Subject: [PATCH v8 1/7] KVM: CPUID: Fix IA32_XSS support in CPUID(0xd,i) enumeration

The control bits in the IA32_XSS MSR are being used for new features,
but the current CPUID(0xd,i) enumeration code doesn't support them, so
fix the existing code first.

The supervisor states in IA32_XSS haven't been used in public KVM code
yet, so set KVM_SUPPORTED_XSS to 0 for now. Anyone developing an
IA32_XSS-related feature may expand the macro to add the CPUID support;
otherwise, CPUID(0xd,i>1) always reports 0 for the subleaf to the guest.

The old code is extracted into a new filter, keeping the same flavor as
the other filters.

This patch passed selftests on a few Intel platforms.

Suggested-by: Sean Christopherson <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/cpuid.c | 105 ++++++++++++++++++++++----------
arch/x86/kvm/svm.c | 7 +++
arch/x86/kvm/vmx/vmx.c | 6 ++
arch/x86/kvm/x86.h | 7 +++
5 files changed, 94 insertions(+), 32 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 74e88e5edd9c..d018df8c5f32 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1209,6 +1209,7 @@ struct kvm_x86_ops {
uint16_t (*nested_get_evmcs_version)(struct kvm_vcpu *vcpu);

bool (*need_emulation_on_page_fault)(struct kvm_vcpu *vcpu);
+ u64 (*supported_xss)(void);
};

struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 22c2720cd948..dd387a785c1e 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -62,6 +62,17 @@ u64 kvm_supported_xcr0(void)
return xcr0;
}

+u64 kvm_supported_xss(void)
+{
+ u64 kvm_xss = KVM_SUPPORTED_XSS & kvm_x86_ops->supported_xss();
+
+ if (!kvm_x86_ops->xsaves_supported())
+ return 0;
+
+ return kvm_xss;
+}
+EXPORT_SYMBOL_GPL(kvm_supported_xss);
+
#define F(x) bit(X86_FEATURE_##x)

int kvm_update_cpuid(struct kvm_vcpu *vcpu)
@@ -414,6 +425,58 @@ static inline void do_cpuid_7_mask(struct kvm_cpuid_entry2 *entry, int index)
}
}

+static inline bool do_cpuid_0xd_mask(struct kvm_cpuid_entry2 *entry, int index)
+{
+ unsigned int f_xsaves = kvm_x86_ops->xsaves_supported() ? F(XSAVES) : 0;
+ /* cpuid 0xD.1.eax */
+ const u32 kvm_cpuid_D_1_eax_x86_features =
+ F(XSAVEOPT) | F(XSAVEC) | F(XGETBV1) | f_xsaves;
+ u64 u_supported = kvm_supported_xcr0();
+ u64 s_supported = kvm_supported_xss();
+ u64 supported;
+
+ switch (index) {
+ case 0:
+ entry->eax &= u_supported;
+ entry->ebx = xstate_required_size(u_supported, false);
+ entry->ecx = entry->ebx;
+ entry->edx &= u_supported >> 32;
+
+ if (!u_supported) {
+ entry->eax = 0;
+ entry->ebx = 0;
+ entry->ecx = 0;
+ entry->edx = 0;
+ return false;
+ }
+ break;
+ case 1:
+ supported = u_supported | s_supported;
+ entry->eax &= kvm_cpuid_D_1_eax_x86_features;
+ cpuid_mask(&entry->eax, CPUID_D_1_EAX);
+ entry->ebx = 0;
+ entry->edx &= s_supported >> 32;
+ entry->ecx &= s_supported;
+ if (entry->eax & (F(XSAVES) | F(XSAVEC)))
+ entry->ebx = xstate_required_size(supported, true);
+
+ break;
+ default:
+ supported = (entry->ecx & 0x1) ? s_supported : u_supported;
+ if (!(supported & (BIT_ULL(index)))) {
+ entry->eax = 0;
+ entry->ebx = 0;
+ entry->ecx = 0;
+ entry->edx = 0;
+ return false;
+ }
+ if (entry->ecx & 0x1)
+ entry->ebx = 0;
+ break;
+ }
+ return true;
+}
+
static inline int __do_cpuid_func(struct kvm_cpuid_entry2 *entry, u32 function,
int *nent, int maxnent)
{
@@ -428,7 +491,6 @@ static inline int __do_cpuid_func(struct kvm_cpuid_entry2 *entry, u32 function,
unsigned f_lm = 0;
#endif
unsigned f_rdtscp = kvm_x86_ops->rdtscp_supported() ? F(RDTSCP) : 0;
- unsigned f_xsaves = kvm_x86_ops->xsaves_supported() ? F(XSAVES) : 0;
unsigned f_intel_pt = kvm_x86_ops->pt_supported() ? F(INTEL_PT) : 0;

/* cpuid 1.edx */
@@ -482,10 +544,6 @@ static inline int __do_cpuid_func(struct kvm_cpuid_entry2 *entry, u32 function,
F(ACE2) | F(ACE2_EN) | F(PHE) | F(PHE_EN) |
F(PMM) | F(PMM_EN);

- /* cpuid 0xD.1.eax */
- const u32 kvm_cpuid_D_1_eax_x86_features =
- F(XSAVEOPT) | F(XSAVEC) | F(XGETBV1) | f_xsaves;
-
/* all calls to cpuid_count() should be made on the same cpu */
get_cpu();

@@ -622,38 +680,21 @@ static inline int __do_cpuid_func(struct kvm_cpuid_entry2 *entry, u32 function,
break;
}
case 0xd: {
- int idx, i;
- u64 supported = kvm_supported_xcr0();
+ int i, idx;

- entry->eax &= supported;
- entry->ebx = xstate_required_size(supported, false);
- entry->ecx = entry->ebx;
- entry->edx &= supported >> 32;
- if (!supported)
+ if (!do_cpuid_0xd_mask(&entry[0], 0))
break;
-
- for (idx = 1, i = 1; idx < 64; ++idx) {
- u64 mask = ((u64)1 << idx);
+ for (i = 1, idx = 1; idx < 64; ++idx) {
if (*nent >= maxnent)
goto out;
-
do_host_cpuid(&entry[i], function, idx);
- if (idx == 1) {
- entry[i].eax &= kvm_cpuid_D_1_eax_x86_features;
- cpuid_mask(&entry[i].eax, CPUID_D_1_EAX);
- entry[i].ebx = 0;
- if (entry[i].eax & (F(XSAVES)|F(XSAVEC)))
- entry[i].ebx =
- xstate_required_size(supported,
- true);
- } else {
- if (entry[i].eax == 0 || !(supported & mask))
- continue;
- if (WARN_ON_ONCE(entry[i].ecx & 1))
- continue;
- }
- entry[i].ecx = 0;
- entry[i].edx = 0;
+ if (entry[i].eax == 0 && entry[i].ebx == 0 &&
+ entry[i].ecx == 0 && entry[i].edx == 0)
+ continue;
+
+ if (!do_cpuid_0xd_mask(&entry[i], idx))
+ continue;
+
++*nent;
++i;
}
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index e0368076a1ef..be967bf9a81d 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -7193,6 +7193,11 @@ static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
return false;
}

+static u64 svm_supported_xss(void)
+{
+ return 0;
+}
+
static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
.cpu_has_kvm_support = has_svm,
.disabled_by_bios = is_disabled,
@@ -7329,6 +7334,8 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
.nested_get_evmcs_version = NULL,

.need_emulation_on_page_fault = svm_need_emulation_on_page_fault,
+
+ .supported_xss = svm_supported_xss,
};

static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c6f6b05004d9..a84198cff397 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1651,6 +1651,11 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
return !(val & ~valid_bits);
}

+static inline u64 vmx_supported_xss(void)
+{
+ return host_xss;
+}
+
static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
{
switch (msr->index) {
@@ -7799,6 +7804,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
.nested_enable_evmcs = NULL,
.nested_get_evmcs_version = NULL,
.need_emulation_on_page_fault = vmx_need_emulation_on_page_fault,
+ .supported_xss = vmx_supported_xss,
};

static void vmx_cleanup_l1d_flush(void)
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 6594020c0691..f10c12b5197d 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -293,6 +293,13 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2,
| XFEATURE_MASK_YMM | XFEATURE_MASK_BNDREGS \
| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
| XFEATURE_MASK_PKRU)
+
+/*
+ * Right now, no XSS states are used on x86 platform,
+ * expand the macro for new features.
+ */
+#define KVM_SUPPORTED_XSS 0
+
extern u64 host_xcr0;

extern u64 kvm_supported_xcr0(void);
--
2.17.2

2019-11-01 09:36:16

by Yang Weijiang

Subject: [PATCH v8 6/7] KVM: X86: Load guest fpu state when accessing MSRs managed by XSAVES

From: Sean Christopherson <[email protected]>

A handful of CET MSRs are not context switched through "traditional"
methods, e.g. VMCS or manual switching, but rather are passed through
to the guest and are saved and restored by XSAVES/XRSTORS, i.e. the
guest's FPU state.

Load the guest's FPU state if userspace is accessing MSRs whose values
are managed by XSAVES so that the MSR helper, e.g. vmx_{get,set}_msr(),
can simply do {RD,WR}MSR to access the guest's value.

Note that guest_cpuid_has() is not queried as host userspace is allowed
to access MSRs that have not been exposed to the guest, e.g. it might do
KVM_SET_MSRS prior to KVM_SET_CPUID2.

Signed-off-by: Sean Christopherson <[email protected]>
Co-developed-by: Yang Weijiang <[email protected]>
Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/x86.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 540490d5385f..6275a75d5802 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -104,6 +104,8 @@ static void enter_smm(struct kvm_vcpu *vcpu);
static void __kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags);
static void store_regs(struct kvm_vcpu *vcpu);
static int sync_regs(struct kvm_vcpu *vcpu);
+static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
+static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);

struct kvm_x86_ops *kvm_x86_ops __read_mostly;
EXPORT_SYMBOL_GPL(kvm_x86_ops);
@@ -3000,6 +3002,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
}
EXPORT_SYMBOL_GPL(kvm_get_msr_common);

+static bool is_xsaves_msr(u32 index)
+{
+ return index == MSR_IA32_U_CET ||
+ (index >= MSR_IA32_PL0_SSP && index <= MSR_IA32_PL3_SSP);
+}
+
/*
* Read or write a bunch of msrs. All parameters are kernel addresses.
*
@@ -3010,11 +3018,22 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
int (*do_msr)(struct kvm_vcpu *vcpu,
unsigned index, u64 *data))
{
+ bool fpu_loaded = false;
int i;
+ const u64 cet_bits = XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL;
+ bool cet_xss = kvm_supported_xss() & cet_bits;

- for (i = 0; i < msrs->nmsrs; ++i)
+ for (i = 0; i < msrs->nmsrs; ++i) {
+ if (!fpu_loaded && cet_xss &&
+ is_xsaves_msr(entries[i].index)) {
+ kvm_load_guest_fpu(vcpu);
+ fpu_loaded = true;
+ }
if (do_msr(vcpu, entries[i].index, &entries[i].data))
break;
+ }
+ if (fpu_loaded)
+ kvm_put_guest_fpu(vcpu);

return i;
}
--
2.17.2

2019-11-01 09:36:18

by Yang Weijiang

Subject: [PATCH v8 7/7] KVM: X86: Add user-space access interface for CET MSRs

Guest CET states are stored in two different places. States managed with
XSAVES/XRSTORS, as saved/restored in the previous patch, can be
read/written directly from/to the MSRs. States stored in VMCS fields are
accessed via vmcs_read/vmcs_write.

Signed-off-by: Yang Weijiang <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 140 +++++++++++++++++++++++++++++++++++++++++
arch/x86/kvm/x86.c | 3 +
2 files changed, 143 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index bfb1b922a9ac..71aba264b5d2 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1671,6 +1671,98 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
return 0;
}

+#define CET_MSR_RSVD_BITS_1 0x3
+#define CET_MSR_RSVD_BITS_2 (0xF << 6)
+
+static bool cet_msr_write_allowed(struct kvm_vcpu *vcpu, struct msr_data *msr)
+{
+ u32 index = msr->index;
+ u64 data = msr->data;
+ u32 high_word = data >> 32;
+
+ if ((index == MSR_IA32_U_CET || index == MSR_IA32_S_CET) &&
+ (data & CET_MSR_RSVD_BITS_2))
+ return false;
+
+ if (is_64_bit_mode(vcpu)) {
+ if (is_noncanonical_address(data & PAGE_MASK, vcpu))
+ return false;
+ else if ((index == MSR_IA32_PL0_SSP ||
+ index == MSR_IA32_PL1_SSP ||
+ index == MSR_IA32_PL2_SSP ||
+ index == MSR_IA32_PL3_SSP) &&
+ (data & CET_MSR_RSVD_BITS_1))
+ return false;
+ } else {
+ if (msr->index == MSR_IA32_INT_SSP_TAB)
+ return false;
+ else if ((index == MSR_IA32_U_CET ||
+ index == MSR_IA32_S_CET ||
+ index == MSR_IA32_PL0_SSP ||
+ index == MSR_IA32_PL1_SSP ||
+ index == MSR_IA32_PL2_SSP ||
+ index == MSR_IA32_PL3_SSP) &&
+ (high_word & ~0ul))
+ return false;
+ }
+
+ return true;
+}
+
+static bool cet_msr_access_allowed(struct kvm_vcpu *vcpu, struct msr_data *msr)
+{
+ u64 kvm_xss;
+ u32 index = msr->index;
+
+ if (is_guest_mode(vcpu))
+ return false;
+
+ kvm_xss = kvm_supported_xss();
+
+ switch (index) {
+ case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
+ if (!boot_cpu_has(X86_FEATURE_SHSTK))
+ return false;
+ if (!msr->host_initiated) {
+ if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
+ return false;
+ } else {
+ if (index == MSR_IA32_PL3_SSP) {
+ if (!(kvm_xss & XFEATURE_MASK_CET_USER))
+ return false;
+ } else {
+ if (!(kvm_xss & XFEATURE_MASK_CET_KERNEL))
+ return false;
+ }
+ }
+ break;
+ case MSR_IA32_U_CET:
+ case MSR_IA32_S_CET:
+ if (!boot_cpu_has(X86_FEATURE_SHSTK) &&
+ !boot_cpu_has(X86_FEATURE_IBT))
+ return false;
+
+ if (!msr->host_initiated) {
+ if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
+ !guest_cpuid_has(vcpu, X86_FEATURE_IBT))
+ return false;
+ } else if (index == MSR_IA32_U_CET &&
+ !(kvm_xss & XFEATURE_MASK_CET_USER))
+ return false;
+ break;
+ case MSR_IA32_INT_SSP_TAB:
+ if (!boot_cpu_has(X86_FEATURE_SHSTK))
+ return false;
+
+ if (!msr->host_initiated &&
+ !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
+ return false;
+ break;
+ default:
+ return false;
+ }
+ return true;
+}
/*
* Reads an msr value (of 'msr_index') into 'pdata'.
* Returns 0 on success, non-0 otherwise.
@@ -1788,6 +1880,26 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
else
msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
break;
+ case MSR_IA32_S_CET:
+ if (!cet_msr_access_allowed(vcpu, msr_info))
+ return 1;
+ msr_info->data = vmcs_readl(GUEST_S_CET);
+ break;
+ case MSR_IA32_INT_SSP_TAB:
+ if (!cet_msr_access_allowed(vcpu, msr_info))
+ return 1;
+ msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
+ break;
+ case MSR_IA32_U_CET:
+ if (!cet_msr_access_allowed(vcpu, msr_info))
+ return 1;
+ rdmsrl(MSR_IA32_U_CET, msr_info->data);
+ break;
+ case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
+ if (!cet_msr_access_allowed(vcpu, msr_info))
+ return 1;
+ rdmsrl(msr_info->index, msr_info->data);
+ break;
case MSR_TSC_AUX:
if (!msr_info->host_initiated &&
!guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
@@ -2039,6 +2151,34 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
else
vmx->pt_desc.guest.addr_a[index / 2] = data;
break;
+ case MSR_IA32_S_CET:
+ if (!cet_msr_access_allowed(vcpu, msr_info))
+ return 1;
+ if (!cet_msr_write_allowed(vcpu, msr_info))
+ return 1;
+ vmcs_writel(GUEST_S_CET, data);
+ break;
+ case MSR_IA32_INT_SSP_TAB:
+ if (!cet_msr_access_allowed(vcpu, msr_info))
+ return 1;
+ if (!cet_msr_write_allowed(vcpu, msr_info))
+ return 1;
+ vmcs_writel(GUEST_INTR_SSP_TABLE, data);
+ break;
+ case MSR_IA32_U_CET:
+ if (!cet_msr_access_allowed(vcpu, msr_info))
+ return 1;
+ if (!cet_msr_write_allowed(vcpu, msr_info))
+ return 1;
+ wrmsrl(MSR_IA32_U_CET, data);
+ break;
+ case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
+ if (!cet_msr_access_allowed(vcpu, msr_info))
+ return 1;
+ if (!cet_msr_write_allowed(vcpu, msr_info))
+ return 1;
+ wrmsrl(msr_info->index, data);
+ break;
case MSR_TSC_AUX:
if (!msr_info->host_initiated &&
!guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6275a75d5802..1bbe4550da90 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1143,6 +1143,9 @@ static u32 msrs_to_save[] = {
MSR_IA32_RTIT_ADDR1_A, MSR_IA32_RTIT_ADDR1_B,
MSR_IA32_RTIT_ADDR2_A, MSR_IA32_RTIT_ADDR2_B,
MSR_IA32_RTIT_ADDR3_A, MSR_IA32_RTIT_ADDR3_B,
+ MSR_IA32_XSS, MSR_IA32_U_CET, MSR_IA32_S_CET,
+ MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP, MSR_IA32_PL2_SSP,
+ MSR_IA32_PL3_SSP, MSR_IA32_INT_SSP_TAB,
};

static unsigned num_msrs_to_save;
--
2.17.2

2019-12-10 21:28:31

by Sean Christopherson

Subject: Re: [PATCH v8 6/7] KVM: X86: Load guest fpu state when accessing MSRs managed by XSAVES

On Fri, Nov 01, 2019 at 04:52:21PM +0800, Yang Weijiang wrote:
> From: Sean Christopherson <[email protected]>
>
> A handful of CET MSRs are not context switched through "traditional"
> methods, e.g. VMCS or manual switching, but rather are passed through
> to the guest and are saved and restored by XSAVES/XRSTORS, i.e. the
> guest's FPU state.
>
> Load the guest's FPU state if userspace is accessing MSRs whose values
> are managed by XSAVES so that the MSR helper, e.g. vmx_{get,set}_msr(),
> can simply do {RD,WR}MSR to access the guest's value.
>
> Note that guest_cpuid_has() is not queried as host userspace is allowed
> to access MSRs that have not been exposed to the guest, e.g. it might do
> KVM_SET_MSRS prior to KVM_SET_CPUID2.
>
> Signed-off-by: Sean Christopherson <[email protected]>
> Co-developed-by: Yang Weijiang <[email protected]>
> Signed-off-by: Yang Weijiang <[email protected]>
> ---
> arch/x86/kvm/x86.c | 21 ++++++++++++++++++++-
> 1 file changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 540490d5385f..6275a75d5802 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -104,6 +104,8 @@ static void enter_smm(struct kvm_vcpu *vcpu);
> static void __kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags);
> static void store_regs(struct kvm_vcpu *vcpu);
> static int sync_regs(struct kvm_vcpu *vcpu);
> +static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
> +static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
>
> struct kvm_x86_ops *kvm_x86_ops __read_mostly;
> EXPORT_SYMBOL_GPL(kvm_x86_ops);
> @@ -3000,6 +3002,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> }
> EXPORT_SYMBOL_GPL(kvm_get_msr_common);
>
> +static bool is_xsaves_msr(u32 index)
> +{
> + return index == MSR_IA32_U_CET ||
> + (index >= MSR_IA32_PL0_SSP && index <= MSR_IA32_PL3_SSP);
> +}
> +
> /*
> * Read or write a bunch of msrs. All parameters are kernel addresses.
> *
> @@ -3010,11 +3018,22 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
> int (*do_msr)(struct kvm_vcpu *vcpu,
> unsigned index, u64 *data))
> {
> + bool fpu_loaded = false;
> int i;
> + const u64 cet_bits = XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL;
> + bool cet_xss = kvm_supported_xss() & cet_bits;
>
> - for (i = 0; i < msrs->nmsrs; ++i)
> + for (i = 0; i < msrs->nmsrs; ++i) {
> + if (!fpu_loaded && cet_xss &&
> + is_xsaves_msr(entries[i].index)) {
> + kvm_load_guest_fpu(vcpu);

This needs to also check for a non-NULL @vcpu. KVM_GET_MSR can be called
on the VM to invoke do_get_msr_feature().

> + fpu_loaded = true;
> + }
> if (do_msr(vcpu, entries[i].index, &entries[i].data))
> break;
> + }
> + if (fpu_loaded)
> + kvm_put_guest_fpu(vcpu);
>
> return i;
> }
> --
> 2.17.2
>

2019-12-10 22:02:21

by Sean Christopherson

Subject: Re: [PATCH v8 7/7] KVM: X86: Add user-space access interface for CET MSRs

On Fri, Nov 01, 2019 at 04:52:22PM +0800, Yang Weijiang wrote:
> There're two different places storing Guest CET states, states
> managed with XSAVES/XRSTORS, as restored/saved
> in previous patch, can be read/write directly from/to the MSRs.
> For those stored in VMCS fields, they're access via vmcs_read/
> vmcs_write.
>
> Signed-off-by: Yang Weijiang <[email protected]>
> ---
> arch/x86/kvm/vmx/vmx.c | 140 +++++++++++++++++++++++++++++++++++++++++
> arch/x86/kvm/x86.c | 3 +
> 2 files changed, 143 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index bfb1b922a9ac..71aba264b5d2 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1671,6 +1671,98 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
> return 0;
> }
>
> +#define CET_MSR_RSVD_BITS_1 0x3
> +#define CET_MSR_RSVD_BITS_2 (0xF << 6)
> +
> +static bool cet_msr_write_allowed(struct kvm_vcpu *vcpu, struct msr_data *msr)
> +{
> + u32 index = msr->index;
> + u64 data = msr->data;
> + u32 high_word = data >> 32;
> +
> + if ((index == MSR_IA32_U_CET || index == MSR_IA32_S_CET) &&
> + (data & CET_MSR_RSVD_BITS_2))
> + return false;
> +
> + if (is_64_bit_mode(vcpu)) {
> + if (is_noncanonical_address(data & PAGE_MASK, vcpu))

I don't think this is correct. MSRs that contain an address usually only
fault on a non-canonical value and do the non-canonical check regardless
of mode. E.g. VM-Enter's consistency checks on SYSENTER_E{I,S}P only care
about a canonical address and are not dependent on mode, and SYSENTER
itself states that bits 63:32 are ignored in 32-bit mode. I assume the
same is true here.

If that is indeed the case, what about adding these to the common canonical
check in __kvm_set_msr()? That'd cut down on the boilerplate here and
might make it easier to audit KVM's canonical checks.

> + return false;
> + else if ((index == MSR_IA32_PL0_SSP ||
> + index == MSR_IA32_PL1_SSP ||
> + index == MSR_IA32_PL2_SSP ||
> + index == MSR_IA32_PL3_SSP) &&
> + (data & CET_MSR_RSVD_BITS_1))
> + return false;
> + } else {
> + if (msr->index == MSR_IA32_INT_SSP_TAB)
> + return false;
> + else if ((index == MSR_IA32_U_CET ||
> + index == MSR_IA32_S_CET ||
> + index == MSR_IA32_PL0_SSP ||
> + index == MSR_IA32_PL1_SSP ||
> + index == MSR_IA32_PL2_SSP ||
> + index == MSR_IA32_PL3_SSP) &&
> + (high_word & ~0ul))
> + return false;
> + }
> +
> + return true;
> +}

This helper seems like overkill, e.g. it's filled with index-specific
checks, but is called from code that has already switched on the index.
Open coding the individual checks is likely more readable and would require
less code, especially if the canonical checks are cleaned up.

> +
> +static bool cet_msr_access_allowed(struct kvm_vcpu *vcpu, struct msr_data *msr)
> +{
> + u64 kvm_xss;
> + u32 index = msr->index;
> +
> + if (is_guest_mode(vcpu))
> + return false;

I may have missed this in an earlier discussion, does CET not support
nesting?

> +
> + kvm_xss = kvm_supported_xss();
> +
> + switch (index) {
> + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> + if (!boot_cpu_has(X86_FEATURE_SHSTK))
> + return false;
> + if (!msr->host_initiated) {
> + if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
> + return false;
> + } else {

This looks wrong, WRMSR from the guest only checks CPUID, it doesn't check
kvm_xss.

> + if (index == MSR_IA32_PL3_SSP) {
> + if (!(kvm_xss & XFEATURE_MASK_CET_USER))
> + return false;
> + } else {
> + if (!(kvm_xss & XFEATURE_MASK_CET_KERNEL))
> + return false;
> + }
> + }
> + break;
> + case MSR_IA32_U_CET:
> + case MSR_IA32_S_CET:

Rather than bundle everything in a single access_allowed() helper, it might
be easier to have separate helpers for each class of MSR. Except for the
guest_mode() check, there's no overlap between the classes.

> + if (!boot_cpu_has(X86_FEATURE_SHSTK) &&
> + !boot_cpu_has(X86_FEATURE_IBT))
> + return false;
> +
> + if (!msr->host_initiated) {
> + if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
> + !guest_cpuid_has(vcpu, X86_FEATURE_IBT))
> + return false;
> + } else if (index == MSR_IA32_U_CET &&
> + !(kvm_xss & XFEATURE_MASK_CET_USER))

Same comment about guest not checking kvm_xss.

> + return false;
> + break;
> + case MSR_IA32_INT_SSP_TAB:
> + if (!boot_cpu_has(X86_FEATURE_SHSTK))
> + return false;
> +
> + if (!msr->host_initiated &&
> + !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
> + return false;
> + break;
> + default:
> + return false;
> + }
> + return true;
> +}
> /*
> * Reads an msr value (of 'msr_index') into 'pdata'.
> * Returns 0 on success, non-0 otherwise.
> @@ -1788,6 +1880,26 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> else
> msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
> break;
> + case MSR_IA32_S_CET:
> + if (!cet_msr_access_allowed(vcpu, msr_info))
> + return 1;
> + msr_info->data = vmcs_readl(GUEST_S_CET);
> + break;
> + case MSR_IA32_INT_SSP_TAB:
> + if (!cet_msr_access_allowed(vcpu, msr_info))
> + return 1;
> + msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
> + break;
> + case MSR_IA32_U_CET:
> + if (!cet_msr_access_allowed(vcpu, msr_info))
> + return 1;
> + rdmsrl(MSR_IA32_U_CET, msr_info->data);
> + break;
> + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> + if (!cet_msr_access_allowed(vcpu, msr_info))
> + return 1;
> + rdmsrl(msr_info->index, msr_info->data);
> + break;
> case MSR_TSC_AUX:
> if (!msr_info->host_initiated &&
> !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> @@ -2039,6 +2151,34 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> else
> vmx->pt_desc.guest.addr_a[index / 2] = data;
> break;
> + case MSR_IA32_S_CET:
> + if (!cet_msr_access_allowed(vcpu, msr_info))
> + return 1;
> + if (!cet_msr_write_allowed(vcpu, msr_info))
> + return 1;
> + vmcs_writel(GUEST_S_CET, data);
> + break;
> + case MSR_IA32_INT_SSP_TAB:
> + if (!cet_msr_access_allowed(vcpu, msr_info))
> + return 1;
> + if (!cet_msr_write_allowed(vcpu, msr_info))
> + return 1;
> + vmcs_writel(GUEST_INTR_SSP_TABLE, data);
> + break;
> + case MSR_IA32_U_CET:
> + if (!cet_msr_access_allowed(vcpu, msr_info))
> + return 1;
> + if (!cet_msr_write_allowed(vcpu, msr_info))
> + return 1;
> + wrmsrl(MSR_IA32_U_CET, data);
> + break;
> + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> + if (!cet_msr_access_allowed(vcpu, msr_info))
> + return 1;
> + if (!cet_msr_write_allowed(vcpu, msr_info))
> + return 1;
> + wrmsrl(msr_info->index, data);
> + break;
> case MSR_TSC_AUX:
> if (!msr_info->host_initiated &&
> !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 6275a75d5802..1bbe4550da90 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1143,6 +1143,9 @@ static u32 msrs_to_save[] = {
> MSR_IA32_RTIT_ADDR1_A, MSR_IA32_RTIT_ADDR1_B,
> MSR_IA32_RTIT_ADDR2_A, MSR_IA32_RTIT_ADDR2_B,
> MSR_IA32_RTIT_ADDR3_A, MSR_IA32_RTIT_ADDR3_B,
> + MSR_IA32_XSS, MSR_IA32_U_CET, MSR_IA32_S_CET,
> + MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP, MSR_IA32_PL2_SSP,
> + MSR_IA32_PL3_SSP, MSR_IA32_INT_SSP_TAB,
> };
>
> static unsigned num_msrs_to_save;
> --
> 2.17.2
>

2019-12-11 02:04:09

by Yang Weijiang

Subject: Re: [PATCH v8 6/7] KVM: X86: Load guest fpu state when accessing MSRs managed by XSAVES

On Tue, Dec 10, 2019 at 01:27:48PM -0800, Sean Christopherson wrote:
> On Fri, Nov 01, 2019 at 04:52:21PM +0800, Yang Weijiang wrote:
> > From: Sean Christopherson <[email protected]>
> >

> > - for (i = 0; i < msrs->nmsrs; ++i)
> > + for (i = 0; i < msrs->nmsrs; ++i) {
> > + if (!fpu_loaded && cet_xss &&
> > + is_xsaves_msr(entries[i].index)) {
> > + kvm_load_guest_fpu(vcpu);
>
> This needs to also check for a non-NULL @vcpu. KVM_GET_MSR can be called
> on the VM to invoke do_get_msr_feature().
>
Yeah, I need to add the check, thanks!

> > + fpu_loaded = true;
> > + }
> > if (do_msr(vcpu, entries[i].index, &entries[i].data))
> > break;
> > + }
> > + if (fpu_loaded)
> > + kvm_put_guest_fpu(vcpu);
> >
> > return i;
> > }
> > --
> > 2.17.2
> >

2019-12-11 02:19:37

by Yang Weijiang

Subject: Re: [PATCH v8 7/7] KVM: X86: Add user-space access interface for CET MSRs

On Tue, Dec 10, 2019 at 01:58:59PM -0800, Sean Christopherson wrote:
> On Fri, Nov 01, 2019 at 04:52:22PM +0800, Yang Weijiang wrote:
> > There're two different places storing Guest CET states, states
> > managed with XSAVES/XRSTORS, as restored/saved
> > in previous patch, can be read/write directly from/to the MSRs.
> > For those stored in VMCS fields, they're access via vmcs_read/
> > vmcs_write.
> >
> >
> > +#define CET_MSR_RSVD_BITS_1 0x3
> > +#define CET_MSR_RSVD_BITS_2 (0xF << 6)
> > +
> > +static bool cet_msr_write_allowed(struct kvm_vcpu *vcpu, struct msr_data *msr)
> > +{
> > + u32 index = msr->index;
> > + u64 data = msr->data;
> > + u32 high_word = data >> 32;
> > +
> > + if ((index == MSR_IA32_U_CET || index == MSR_IA32_S_CET) &&
> > + (data & CET_MSR_RSVD_BITS_2))
> > + return false;
> > +
> > + if (is_64_bit_mode(vcpu)) {
> > + if (is_noncanonical_address(data & PAGE_MASK, vcpu))
>
> I don't think this is correct. MSRs that contain an address usually only
> fault on a non-canonical value and do the non-canonical check regardless
> of mode. E.g. VM-Enter's consistency checks on SYSENTER_E{I,S}P only care
> about a canonical address and are not dependent on mode, and SYSENTER
> itself states that bits 63:32 are ignored in 32-bit mode. I assume the
> same is true here.
The spec reads like this: "Must be machine canonical when written on
parts that support 64 bit mode. On parts that do not support 64 bit
mode, the bits 63:32 are reserved and must be 0."

> If that is indeed the case, what about adding these to the common canonical
> check in __kvm_set_msr()? That'd cut down on the boilerplate here and
> might make it easier to audit KVM's canonical checks.
>
> > + return false;
> > + else if ((index == MSR_IA32_PL0_SSP ||
> > + index == MSR_IA32_PL1_SSP ||
> > + index == MSR_IA32_PL2_SSP ||
> > + index == MSR_IA32_PL3_SSP) &&
> > + (data & CET_MSR_RSVD_BITS_1))
> > + return false;
> > + } else {
> > + if (msr->index == MSR_IA32_INT_SSP_TAB)
> > + return false;
> > + else if ((index == MSR_IA32_U_CET ||
> > + index == MSR_IA32_S_CET ||
> > + index == MSR_IA32_PL0_SSP ||
> > + index == MSR_IA32_PL1_SSP ||
> > + index == MSR_IA32_PL2_SSP ||
> > + index == MSR_IA32_PL3_SSP) &&
> > + (high_word & ~0ul))
> > + return false;
> > + }
> > +
> > + return true;
> > +}
>
> This helper seems like overkill, e.g. it's filled with index-specific
> checks, but is called from code that has already switched on the index.
> Open coding the individual checks is likely more readable and would require
> less code, especially if the canonical checks are cleaned up.
>
I'm afraid that if the checks are not wrapped in a helper, there will be
a lot of repeated checking code; that's why I'm using a wrapper.

> > +
> > +static bool cet_msr_access_allowed(struct kvm_vcpu *vcpu, struct msr_data *msr)
> > +{
> > + u64 kvm_xss;
> > + u32 index = msr->index;
> > +
> > + if (is_guest_mode(vcpu))
> > + return false;
>
> I may have missed this in an earlier discussion, does CET not support
> nesting?
>
I don't want to make CET available to nested guests for the time being;
the first step is to make it available to L1 guests. So I need to avoid
exposing any CET CPUID/MSRs to a nested guest.

> > +
> > + kvm_xss = kvm_supported_xss();
> > +
> > + switch (index) {
> > + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> > + if (!boot_cpu_has(X86_FEATURE_SHSTK))
> > + return false;
> > + if (!msr->host_initiated) {
> > + if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
> > + return false;
> > + } else {
>
> This looks wrong, WRMSR from the guest only checks CPUID, it doesn't check
> kvm_xss.
>
Oops, I need to add the check, thank you!

> > + if (index == MSR_IA32_PL3_SSP) {
> > + if (!(kvm_xss & XFEATURE_MASK_CET_USER))
> > + return false;
> > + } else {
> > + if (!(kvm_xss & XFEATURE_MASK_CET_KERNEL))
> > + return false;
> > + }
> > + }
> > + break;
> > + case MSR_IA32_U_CET:
> > + case MSR_IA32_S_CET:
>
> Rather than bundle everything in a single access_allowed() helper, it might
> be easier to have separate helpers for each class of MSR. Except for the
> guest_mode() check, there's no overlap between the classes.
>
Sure, let me double check the code.

> > + if (!boot_cpu_has(X86_FEATURE_SHSTK) &&
> > + !boot_cpu_has(X86_FEATURE_IBT))
> > + return false;
> > +
> > + if (!msr->host_initiated) {
> > + if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
> > + !guest_cpuid_has(vcpu, X86_FEATURE_IBT))
> > + return false;
> > + } else if (index == MSR_IA32_U_CET &&
> > + !(kvm_xss & XFEATURE_MASK_CET_USER))
>
> Same comment about guest not checking kvm_xss.
>
OK.

> > + return false;
> > + break;
> > + case MSR_IA32_INT_SSP_TAB:
> > + if (!boot_cpu_has(X86_FEATURE_SHSTK))
> > + return false;
> > +
> > + if (!msr->host_initiated &&
> > + !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
> > + return false;
> > + break;
> > + default:
> > + return false;
> > + }
> > + return true;
> > +}
> > /*
> > * Reads an msr value (of 'msr_index') into 'pdata'.
> > * Returns 0 on success, non-0 otherwise.
> > @@ -1788,6 +1880,26 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> > else
> > msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
> > break;
> > + case MSR_IA32_S_CET:
> > + if (!cet_msr_access_allowed(vcpu, msr_info))
> > + return 1;
> > + msr_info->data = vmcs_readl(GUEST_S_CET);
> > + break;
> > + case MSR_IA32_INT_SSP_TAB:
> > + if (!cet_msr_access_allowed(vcpu, msr_info))
> > + return 1;
> > + msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
> > + break;
> > + case MSR_IA32_U_CET:
> > + if (!cet_msr_access_allowed(vcpu, msr_info))
> > + return 1;
> > + rdmsrl(MSR_IA32_U_CET, msr_info->data);
> > + break;
> > + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> > + if (!cet_msr_access_allowed(vcpu, msr_info))
> > + return 1;
> > + rdmsrl(msr_info->index, msr_info->data);
> > + break;
> > case MSR_TSC_AUX:
> > if (!msr_info->host_initiated &&
> > !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> > @@ -2039,6 +2151,34 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> > else
> > vmx->pt_desc.guest.addr_a[index / 2] = data;
> > break;
> > + case MSR_IA32_S_CET:
> > + if (!cet_msr_access_allowed(vcpu, msr_info))
> > + return 1;
> > + if (!cet_msr_write_allowed(vcpu, msr_info))
> > + return 1;
> > + vmcs_writel(GUEST_S_CET, data);
> > + break;
> > + case MSR_IA32_INT_SSP_TAB:
> > + if (!cet_msr_access_allowed(vcpu, msr_info))
> > + return 1;
> > + if (!cet_msr_write_allowed(vcpu, msr_info))
> > + return 1;
> > + vmcs_writel(GUEST_INTR_SSP_TABLE, data);
> > + break;
> > + case MSR_IA32_U_CET:
> > + if (!cet_msr_access_allowed(vcpu, msr_info))
> > + return 1;
> > + if (!cet_msr_write_allowed(vcpu, msr_info))
> > + return 1;
> > + wrmsrl(MSR_IA32_U_CET, data);
> > + break;
> > + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> > + if (!cet_msr_access_allowed(vcpu, msr_info))
> > + return 1;
> > + if (!cet_msr_write_allowed(vcpu, msr_info))
> > + return 1;
> > + wrmsrl(msr_info->index, data);
> > + break;
> > case MSR_TSC_AUX:
> > if (!msr_info->host_initiated &&
> > !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 6275a75d5802..1bbe4550da90 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -1143,6 +1143,9 @@ static u32 msrs_to_save[] = {
> > MSR_IA32_RTIT_ADDR1_A, MSR_IA32_RTIT_ADDR1_B,
> > MSR_IA32_RTIT_ADDR2_A, MSR_IA32_RTIT_ADDR2_B,
> > MSR_IA32_RTIT_ADDR3_A, MSR_IA32_RTIT_ADDR3_B,
> > + MSR_IA32_XSS, MSR_IA32_U_CET, MSR_IA32_S_CET,
> > + MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP, MSR_IA32_PL2_SSP,
> > + MSR_IA32_PL3_SSP, MSR_IA32_INT_SSP_TAB,
> > };
> >
> > static unsigned num_msrs_to_save;
> > --
> > 2.17.2
> >

2019-12-11 16:29:12

by Sean Christopherson

Subject: Re: [PATCH v8 7/7] KVM: X86: Add user-space access interface for CET MSRs

On Wed, Dec 11, 2019 at 10:19:51AM +0800, Yang Weijiang wrote:
> On Tue, Dec 10, 2019 at 01:58:59PM -0800, Sean Christopherson wrote:
> > On Fri, Nov 01, 2019 at 04:52:22PM +0800, Yang Weijiang wrote:
> > > There are two different places storing guest CET states: the states
> > > managed with XSAVES/XRSTORS, as saved/restored in the previous patch,
> > > can be read/written directly from/to the MSRs; those stored in VMCS
> > > fields are accessed via vmcs_read/vmcs_write.
> > >
> > >
> > > +#define CET_MSR_RSVD_BITS_1 0x3
> > > +#define CET_MSR_RSVD_BITS_2 (0xF << 6)
> > > +
> > > +static bool cet_msr_write_allowed(struct kvm_vcpu *vcpu, struct msr_data *msr)
> > > +{
> > > + u32 index = msr->index;
> > > + u64 data = msr->data;
> > > + u32 high_word = data >> 32;
> > > +
> > > + if ((index == MSR_IA32_U_CET || index == MSR_IA32_S_CET) &&
> > > + (data & CET_MSR_RSVD_BITS_2))
> > > + return false;
> > > +
> > > + if (is_64_bit_mode(vcpu)) {
> > > + if (is_noncanonical_address(data & PAGE_MASK, vcpu))
> >
> > I don't think this is correct. MSRs that contain an address usually only
> > fault on a non-canonical value and do the non-canonical check regardless
> > of mode. E.g. VM-Enter's consistency checks on SYSENTER_E{I,S}P only care
> > about a canonical address and are not dependent on mode, and SYSENTER
> > itself states that bits 63:32 are ignored in 32-bit mode. I assume the
> > same is true here.
> The spec. reads like this: Must be machine canonical when written on parts
> that support 64 bit mode. On parts that do not support 64 bit mode, the bits
> 63:32 are reserved and must be 0.

Yes, that agrees with me. The key word is "support", i.e. "on parts that
support 64 bit mode" means "on parts with CPUID.0x80000001.EDX.LM=1."

The reason the architecture works this way is that unless hardware clears
the MSRs on transition from 64->32, bits 63:32 need to be ignored on the
way out instead of being validated on the way in, e.g. software writes a
64-bit value to the MSR and then transitions to 32-bit mode. Clearing the
MSRs would be painful, slow and error prone, so it's easier for hardware
to simply ignore bits 63:32 in 32-bit mode.

> > If that is indeed the case, what about adding these to the common canonical
> > check in __kvm_set_msr()? That'd cut down on the boilerplate here and
> > might make it easier to audit KVM's canonical checks.
> >
> > > + return false;
> > > + else if ((index == MSR_IA32_PL0_SSP ||
> > > + index == MSR_IA32_PL1_SSP ||
> > > + index == MSR_IA32_PL2_SSP ||
> > > + index == MSR_IA32_PL3_SSP) &&
> > > + (data & CET_MSR_RSVD_BITS_1))
> > > + return false;
> > > + } else {
> > > + if (msr->index == MSR_IA32_INT_SSP_TAB)
> > > + return false;
> > > + else if ((index == MSR_IA32_U_CET ||
> > > + index == MSR_IA32_S_CET ||
> > > + index == MSR_IA32_PL0_SSP ||
> > > + index == MSR_IA32_PL1_SSP ||
> > > + index == MSR_IA32_PL2_SSP ||
> > > + index == MSR_IA32_PL3_SSP) &&
> > > + (high_word & ~0ul))
> > > + return false;
> > > + }
> > > +
> > > + return true;
> > > +}
> >
> > This helper seems like overkill, e.g. it's filled with index-specific
> > checks, but is called from code that has already switched on the index.
> > Open coding the individual checks is likely more readable and would require
> > less code, especially if the canonical checks are cleaned up.
> >
> I'm afraid that if the checks are not wrapped in a helper, there will be
> a lot of repeated checking code; that's why I'm using a wrapper.

But you're adding almost as much, if not more, code to re-split the checks
in this helper.

> > > +
> > > +static bool cet_msr_access_allowed(struct kvm_vcpu *vcpu, struct msr_data *msr)
> > > +{
> > > + u64 kvm_xss;
> > > + u32 index = msr->index;
> > > +
> > > + if (is_guest_mode(vcpu))
> > > + return false;
> >
> > I may have missed this in an earlier discussion, does CET not support
> > nesting?
> >
> I don't want to make CET available to nested guests for the time being;
> the first step is to make it available to L1 guests. So I need to avoid
> exposing any CET CPUID/MSRs to a nested guest.

2019-12-12 00:42:55

by Yang Weijiang

Subject: Re: [PATCH v8 7/7] KVM: X86: Add user-space access interface for CET MSRs

On Wed, Dec 11, 2019 at 08:27:02AM -0800, Sean Christopherson wrote:
> On Wed, Dec 11, 2019 at 10:19:51AM +0800, Yang Weijiang wrote:
> > On Tue, Dec 10, 2019 at 01:58:59PM -0800, Sean Christopherson wrote:
> > > On Fri, Nov 01, 2019 at 04:52:22PM +0800, Yang Weijiang wrote:
> > > > There are two different places storing guest CET states: the states
> > > > managed with XSAVES/XRSTORS, as saved/restored in the previous patch,
> > > > can be read/written directly from/to the MSRs; those stored in VMCS
> > > > fields are accessed via vmcs_read/vmcs_write.
> > > >
> > > >
> > > > +#define CET_MSR_RSVD_BITS_1 0x3
> > > > +#define CET_MSR_RSVD_BITS_2 (0xF << 6)
> > > > +
> > > > +static bool cet_msr_write_allowed(struct kvm_vcpu *vcpu, struct msr_data *msr)
> > > > +{
> > > > + u32 index = msr->index;
> > > > + u64 data = msr->data;
> > > > + u32 high_word = data >> 32;
> > > > +
> > > > + if ((index == MSR_IA32_U_CET || index == MSR_IA32_S_CET) &&
> > > > + (data & CET_MSR_RSVD_BITS_2))
> > > > + return false;
> > > > +
> > > > + if (is_64_bit_mode(vcpu)) {
> > > > + if (is_noncanonical_address(data & PAGE_MASK, vcpu))
> > >
> > > I don't think this is correct. MSRs that contain an address usually only
> > > fault on a non-canonical value and do the non-canonical check regardless
> > > of mode. E.g. VM-Enter's consistency checks on SYSENTER_E{I,S}P only care
> > > about a canonical address and are not dependent on mode, and SYSENTER
> > > itself states that bits 63:32 are ignored in 32-bit mode. I assume the
> > > same is true here.
> > The spec. reads like this: Must be machine canonical when written on parts
> > that support 64 bit mode. On parts that do not support 64 bit mode, the bits
> > 63:32 are reserved and must be 0.
>
> Yes, that agrees with me. The key word is "support", i.e. "on parts that
> support 64 bit mode" means "on parts with CPUID.0x80000001.EDX.LM=1."
>
> The reason the architecture works this way is that unless hardware clears
> the MSRs on transition from 64->32, bits 63:32 need to be ignored on the
> way out instead of being validated on the way in, e.g. software writes a
> 64-bit value to the MSR and then transitions to 32-bit mode. Clearing the
> MSRs would be painful, slow and error prone, so it's easier for hardware
> to simply ignore bits 63:32 in 32-bit mode.
>
Makes sense, I'll move the canonical check up into kvm_set_msr() like the
other MSRs, thanks!

> > > If that is indeed the case, what about adding these to the common canonical
> > > check in __kvm_set_msr()? That'd cut down on the boilerplate here and
> > > might make it easier to audit KVM's canonical checks.
> > >
> > > > + return false;
> > > > + else if ((index == MSR_IA32_PL0_SSP ||
> > > > + index == MSR_IA32_PL1_SSP ||
> > > > + index == MSR_IA32_PL2_SSP ||
> > > > + index == MSR_IA32_PL3_SSP) &&
> > > > + (data & CET_MSR_RSVD_BITS_1))
> > > > + return false;
> > > > + } else {
> > > > + if (msr->index == MSR_IA32_INT_SSP_TAB)
> > > > + return false;
> > > > + else if ((index == MSR_IA32_U_CET ||
> > > > + index == MSR_IA32_S_CET ||
> > > > + index == MSR_IA32_PL0_SSP ||
> > > > + index == MSR_IA32_PL1_SSP ||
> > > > + index == MSR_IA32_PL2_SSP ||
> > > > + index == MSR_IA32_PL3_SSP) &&
> > > > + (high_word & ~0ul))
> > > > + return false;
> > > > + }
> > > > +
> > > > + return true;
> > > > +}
> > >
> > > This helper seems like overkill, e.g. it's filled with index-specific
> > > checks, but is called from code that has already switched on the index.
> > > Open coding the individual checks is likely more readable and would require
> > > less code, especially if the canonical checks are cleaned up.
> > >
> > I'm afraid that if the checks are not wrapped in a helper, there will be
> > a lot of repeated checking code; that's why I'm using a wrapper.
>
> But you're adding almost as much, if not more, code to re-split the checks
> in this helper.
>
Sure, thanks!

> > > > +
> > > > +static bool cet_msr_access_allowed(struct kvm_vcpu *vcpu, struct msr_data *msr)
> > > > +{
> > > > + u64 kvm_xss;
> > > > + u32 index = msr->index;
> > > > +
> > > > + if (is_guest_mode(vcpu))
> > > > + return false;
> > >
> > > I may have missed this in an earlier discussion, does CET not support
> > > nesting?
> > >
> > I don't want to make CET available to nested guests for the time being;
> > the first step is to make it available to L1 guests. So I need to avoid
> > exposing any CET CPUID/MSRs to a nested guest.

2019-12-12 16:03:24

by Konrad Rzeszutek Wilk

Subject: Re: [PATCH v8 0/7] Introduce support for guest CET feature

On Fri, Nov 01, 2019 at 04:52:15PM +0800, Yang Weijiang wrote:
> Control-flow Enforcement Technology (CET) provides protection against
> Return/Jump-Oriented Programming (ROP/JOP) attack. It includes two
> sub-features: Shadow Stack (SHSTK) and Indirect Branch Tracking (IBT).
>
> KVM change is required to support guest CET feature.
> This patch serial implemented CET related CPUID/XSAVES enumeration, MSRs
> and vmentry/vmexit configuration etc.so that guest kernel can setup CET
> runtime infrastructure based on them. Some CET MSRs and related feature
> flags used reference the definitions in kernel patchset.
>
> CET kernel patches is here:
> https://lkml.org/lkml/2019/8/13/1110
> https://lkml.org/lkml/2019/8/13/1109

Is there a git tree with all of them against v5.5-rc1 (so all three series)?
I tried your github tree: https://github.com/yyu168/linux_cet.git #cet
but sadly that does not apply against 5.5-rc1 :-(

Thanks!

2019-12-13 00:54:27

by Yang Weijiang

Subject: Re: [PATCH v8 0/7] Introduce support for guest CET feature

On Thu, Dec 12, 2019 at 11:03:45AM -0500, Konrad Rzeszutek Wilk wrote:
> On Fri, Nov 01, 2019 at 04:52:15PM +0800, Yang Weijiang wrote:
> > Control-flow Enforcement Technology (CET) provides protection against
> > Return/Jump-Oriented Programming (ROP/JOP) attack. It includes two
> > sub-features: Shadow Stack (SHSTK) and Indirect Branch Tracking (IBT).
> >
> > KVM change is required to support guest CET feature.
> > This patch serial implemented CET related CPUID/XSAVES enumeration, MSRs
> > and vmentry/vmexit configuration etc.so that guest kernel can setup CET
> > runtime infrastructure based on them. Some CET MSRs and related feature
> > flags used reference the definitions in kernel patchset.
> >
> > CET kernel patches is here:
> > https://lkml.org/lkml/2019/8/13/1110
> > https://lkml.org/lkml/2019/8/13/1109
>
> Is there a git tree with all of them against v5.5-rc1 (so all three series)?
> I tried your github tree: https://github.com/yyu168/linux_cet.git #cet
> but sadly that does not apply against 5.5-rc1 :-(
>
> Thanks!
Hi,
The CET patches include two parts: one on the kernel side and the other
in KVM. The kernel patches on GitHub are maintained by my colleague;
he'll rebase them onto the latest kernel tree shortly, after resolving
some issues.
Thank you for your interest!