2023-04-14 06:28:23

by Chao Gao

[permalink] [raw]
Subject: [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization

Changes since RFC v1:
* add two kselftests (patch 10-11)
* set virtual MSRs also on APs [Pawan]
* enable "virtualize IA32_SPEC_CTRL" for L2 to prevent L2 from changing
some bits of IA32_SPEC_CTRL (patch 4)
* other misc cleanup and cosmetic changes

RFC v1: https://lore.kernel.org/lkml/[email protected]/


This series introduces "virtualize IA32_SPEC_CTRL" support. Below is an
introduction to the feature and its use cases.

### Virtualize IA32_SPEC_CTRL

"Virtualize IA32_SPEC_CTRL" [1] is a new VMX feature on Intel CPUs. This feature
allows VMM to lock some bits of IA32_SPEC_CTRL MSR even when the MSR is
pass-thru'd to a guest.


### Use cases of "virtualize IA32_SPEC_CTRL" [2]

Software mitigations like Retpoline and the software BHB-clearing sequence
depend on the CPU microarchitecture, and a guest cannot know exactly which
microarchitecture it is running on. When a guest is migrated between processors
of different microarchitectures, software mitigations that work perfectly on the
previous microarchitecture may not be effective on the new one. To fix this,
some hardware mitigations should be used in conjunction with software
mitigations. Using "virtualize IA32_SPEC_CTRL", the VMM can enforce hardware
mitigations transparently to guests and prevent those hardware mitigations from
being unintentionally disabled when a guest writes the IA32_SPEC_CTRL MSR.


### Intention of this series

This series adds to KVM the capability of enforcing hardware mitigations for
guests transparently and efficiently (i.e., without intercepting IA32_SPEC_CTRL
MSR accesses). The capability can be used to solve the VM migration issue in
a pool consisting of processors of different microarchitectures.

Specifically, below are two target scenarios of this series:

Scenario 1: If retpoline is used by a VM to mitigate IMBTI in CPL0, the VMM can
set RRSBA_DIS_S on parts that enumerate RRSBA. Note that the VM is
presented with a microarchitecture that doesn't enumerate RRSBA.

Scenario 2: If a VM uses the software BHB-clearing sequence on transitions into
CPL0 to mitigate BHI, the VMM can use "virtualize IA32_SPEC_CTRL" to
set BHI_DIS_S on new parts that don't enumerate BHI_NO.

Intel defines some virtual MSRs [2] for guests to report in-use software
mitigations. This allows guests to opt in to the VMM deploying hardware
mitigations on their behalf if the guests are running on, or are later migrated
to, a system on which the in-use software mitigations are not effective. The
virtual MSR interface is also added in this series.
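
For illustration, below is a minimal sketch of the guest-side opt-in flow
(it mirrors what patch 5 implements; the function name is made up for this
example):

  /* Sketch: guest reports an in-use software mitigation to the VMM */
  static void report_mitigations_to_vmm(void)
  {
          u64 arch_cap, virt_enum, miti_enum;

          rdmsrl(MSR_IA32_ARCH_CAPABILITIES, arch_cap);
          if (!(arch_cap & ARCH_CAP_VIRTUAL_ENUM))        /* bit 63 */
                  return;

          rdmsrl(MSR_VIRTUAL_ENUMERATION, virt_enum);     /* 0x50000000 */
          if (!(virt_enum & VIRT_ENUM_MITIGATION_CTRL_SUPPORT))
                  return;

          rdmsrl(MSR_VIRTUAL_MITIGATION_ENUM, miti_enum); /* 0x50000001 */
          if (miti_enum & MITI_ENUM_RETPOLINE_S_SUPPORT)
                  /* ask the VMM to deploy RRSBA_DIS_S on the guest's behalf */
                  msr_set_bit(MSR_VIRTUAL_MITIGATION_CTRL,
                              MITI_CTRL_RETPOLINE_S_USED_BIT);
  }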

### Organization of this series

1. Patch 1-3 Advertise RRSBA_CTRL and BHI_CTRL to guest
2. Patch 4 Add "virtualize IA32_SPEC_CTRL" support
3. Patch 5-9 Allow guests to report in-use software mitigations to KVM so
that KVM can enable hardware mitigations for guests.
4. Patch 10-11 Add kselftest for virtual MSRs and IA32_SPEC_CTRL

[1]: https://cdrdv2.intel.com/v1/dl/getContent/671368 Ref. #319433-047 Chapter 12
[2]: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html

Chao Gao (3):
KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT
KVM: selftests: Add tests for virtual enumeration/mitigation MSRs
KVM: selftests: Add tests for IA32_SPEC_CTRL MSR

Pawan Gupta (1):
x86/bugs: Use Virtual MSRs to request hardware mitigations

Zhang Chen (7):
x86/msr-index: Add bit definitions for BHI_DIS_S and BHI_NO
KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support
KVM: x86: Advertise BHI_CTRL support
KVM: VMX: Add IA32_SPEC_CTRL virtualization support
KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support
KVM: VMX: Advertise MITIGATION_CTRL support
KVM: VMX: Advertise MITI_CTRL_BHB_CLEAR_SEQ_S_SUPPORT

arch/x86/include/asm/msr-index.h | 33 +++-
arch/x86/include/asm/vmx.h | 5 +
arch/x86/include/asm/vmxfeatures.h | 2 +
arch/x86/kernel/cpu/bugs.c | 25 +++
arch/x86/kvm/cpuid.c | 22 ++-
arch/x86/kvm/reverse_cpuid.h | 8 +
arch/x86/kvm/svm/svm.c | 3 +
arch/x86/kvm/vmx/capabilities.h | 5 +
arch/x86/kvm/vmx/nested.c | 13 ++
arch/x86/kvm/vmx/vmcs.h | 2 +
arch/x86/kvm/vmx/vmx.c | 112 ++++++++++-
arch/x86/kvm/vmx/vmx.h | 43 ++++-
arch/x86/kvm/x86.c | 19 +-
tools/arch/x86/include/asm/msr-index.h | 37 +++-
tools/testing/selftests/kvm/Makefile | 2 +
.../selftests/kvm/include/x86_64/processor.h | 5 +
.../selftests/kvm/x86_64/spec_ctrl_msr_test.c | 178 ++++++++++++++++++
.../kvm/x86_64/virtual_mitigation_msr_test.c | 175 +++++++++++++++++
18 files changed, 676 insertions(+), 13 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/spec_ctrl_msr_test.c
create mode 100644 tools/testing/selftests/kvm/x86_64/virtual_mitigation_msr_test.c


base-commit: 400d2132288edbd6d500f45eab5d85526ca94e46
--
2.40.0


2023-04-14 06:28:33

by Chao Gao

[permalink] [raw]
Subject: [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support

From: Zhang Chen <[email protected]>

Currently, KVM disables interception of IA32_SPEC_CTRL after the guest writes
a non-zero value to it. From then on, the guest is allowed to write any value
directly to the hardware MSR.

"virtualize IA32_SPEC_CTRL" is a new tertiary vm-exec control. This
feature allows KVM to specify that certain bits of the IA32_SPEC_CTRL
MSR cannot be modified by guest software.

Two VMCS fields are added:

IA32_SPEC_CTRL_MASK: bits that guest software cannot modify
IA32_SPEC_CTRL_SHADOW: value that guest software expects to be in the
IA32_SPEC_CTRL MSR

On RDMSR, the shadow value is returned. On WRMSR, EDX:EAX is written
to IA32_SPEC_CTRL_SHADOW and (cur_val & mask) | (EDX:EAX & ~mask)
is written to the IA32_SPEC_CTRL MSR, where
* cur_val is the original value of the IA32_SPEC_CTRL MSR
* mask is the value of IA32_SPEC_CTRL_MASK
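
Expressed as an illustrative C model (not actual KVM code; the variable and
field names below are made up):

  /* RDMSR of IA32_SPEC_CTRL: the guest reads the shadow value */
  u64 virt_spec_ctrl_read(void)
  {
          return vmcs.ia32_spec_ctrl_shadow;
  }

  /* WRMSR of IA32_SPEC_CTRL: masked bits are preserved, others follow guest */
  void virt_spec_ctrl_write(u64 guest_val)
  {
          u64 cur_val = ia32_spec_ctrl_msr;         /* current MSR value */
          u64 mask    = vmcs.ia32_spec_ctrl_mask;   /* bits guest can't modify */

          vmcs.ia32_spec_ctrl_shadow = guest_val;
          ia32_spec_ctrl_msr = (cur_val & mask) | (guest_val & ~mask);
  }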

Add a mask, i.e. loaded_vmcs->spec_ctrl_mask, to represent the bits the guest
shouldn't change. It is 0 for now; some bits will be added by following
patches. Use a per-VMCS cache to avoid unnecessary vmcs_write() calls on
nested transitions, because the mask is expected to rarely change and to be
the same for vmcs01 and vmcs02.

To prevent the guest from changing the bits in the mask, enable "virtualize
IA32_SPEC_CTRL" if supported, or emulate its behavior by intercepting the
IA32_SPEC_CTRL MSR. Emulating the "virtualize IA32_SPEC_CTRL" behavior mainly
gives the same capability to KVM when it runs on potentially broken hardware
or as an L1 guest.

To prevent L2 from evading the enforcement, enable "virtualize IA32_SPEC_CTRL"
in vmcs02. Always update the guest (shadow) value of the IA32_SPEC_CTRL MSR
and the mask to preserve them across nested transitions. Note that the
shadow value may change because L2 may access IA32_SPEC_CTRL directly, and
the mask may change due to migration while L2 vCPUs are running.

Co-developed-by: Chao Gao <[email protected]>
Signed-off-by: Chao Gao <[email protected]>
Signed-off-by: Zhang Chen <[email protected]>
Signed-off-by: Chao Gao <[email protected]>
Tested-by: Jiaan Lu <[email protected]>
---
arch/x86/include/asm/vmx.h | 5 ++++
arch/x86/include/asm/vmxfeatures.h | 2 ++
arch/x86/kvm/vmx/capabilities.h | 5 ++++
arch/x86/kvm/vmx/nested.c | 13 ++++++++++
arch/x86/kvm/vmx/vmcs.h | 2 ++
arch/x86/kvm/vmx/vmx.c | 34 ++++++++++++++++++++-----
arch/x86/kvm/vmx/vmx.h | 40 +++++++++++++++++++++++++++++-
7 files changed, 94 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 498dc600bd5c..b9f88ecf20c3 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -81,6 +81,7 @@
* Definitions of Tertiary Processor-Based VM-Execution Controls.
*/
#define TERTIARY_EXEC_IPI_VIRT VMCS_CONTROL_BIT(IPI_VIRT)
+#define TERTIARY_EXEC_SPEC_CTRL_VIRT VMCS_CONTROL_BIT(SPEC_CTRL_VIRT)

#define PIN_BASED_EXT_INTR_MASK VMCS_CONTROL_BIT(INTR_EXITING)
#define PIN_BASED_NMI_EXITING VMCS_CONTROL_BIT(NMI_EXITING)
@@ -233,6 +234,10 @@ enum vmcs_field {
TERTIARY_VM_EXEC_CONTROL_HIGH = 0x00002035,
PID_POINTER_TABLE = 0x00002042,
PID_POINTER_TABLE_HIGH = 0x00002043,
+ IA32_SPEC_CTRL_MASK = 0x0000204A,
+ IA32_SPEC_CTRL_MASK_HIGH = 0x0000204B,
+ IA32_SPEC_CTRL_SHADOW = 0x0000204C,
+ IA32_SPEC_CTRL_SHADOW_HIGH = 0x0000204D,
GUEST_PHYSICAL_ADDRESS = 0x00002400,
GUEST_PHYSICAL_ADDRESS_HIGH = 0x00002401,
VMCS_LINK_POINTER = 0x00002800,
diff --git a/arch/x86/include/asm/vmxfeatures.h b/arch/x86/include/asm/vmxfeatures.h
index c6a7eed03914..c70d0769b7d0 100644
--- a/arch/x86/include/asm/vmxfeatures.h
+++ b/arch/x86/include/asm/vmxfeatures.h
@@ -89,4 +89,6 @@

/* Tertiary Processor-Based VM-Execution Controls, word 3 */
#define VMX_FEATURE_IPI_VIRT ( 3*32+ 4) /* Enable IPI virtualization */
+#define VMX_FEATURE_SPEC_CTRL_VIRT ( 3*32+ 7) /* Enable IA32_SPEC_CTRL virtualization */
+
#endif /* _ASM_X86_VMXFEATURES_H */
diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 45162c1bcd8f..a7ab70b55acf 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -138,6 +138,11 @@ static inline bool cpu_has_tertiary_exec_ctrls(void)
CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
}

+static __always_inline bool cpu_has_spec_ctrl_virt(void)
+{
+ return vmcs_config.cpu_based_3rd_exec_ctrl & TERTIARY_EXEC_SPEC_CTRL_VIRT;
+}
+
static inline bool cpu_has_vmx_virtualize_apic_accesses(void)
{
return vmcs_config.cpu_based_2nd_exec_ctrl &
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index f63b28f46a71..96861a42316d 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -276,6 +276,8 @@ static void vmx_switch_vmcs(struct kvm_vcpu *vcpu, struct loaded_vmcs *vmcs)
vmx->loaded_vmcs = vmcs;
vmx_vcpu_load_vmcs(vcpu, cpu, prev);
vmx_sync_vmcs_host_state(vmx, prev);
+ vmx_set_spec_ctrl_mask(vmx, prev->spec_ctrl_mask);
+ vmx_set_guest_spec_ctrl(vmx, vmx_get_guest_spec_ctrl(vmx));
put_cpu();

vcpu->arch.regs_avail = ~VMX_REGS_LAZY_LOAD_SET;
@@ -2342,6 +2344,17 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct loaded_vmcs *vmcs0
secondary_exec_controls_set(vmx, exec_control);
}

+ /*
+ * TERTIARY EXEC CONTROLS
+ */
+ if (cpu_has_tertiary_exec_ctrls()) {
+ exec_control = __tertiary_exec_controls_get(vmcs01);
+
+ /* IPI virtualization for L2 isn't supported */
+ exec_control &= ~TERTIARY_EXEC_IPI_VIRT;
+ tertiary_exec_controls_set(vmx, exec_control);
+ }
+
/*
* ENTRY CONTROLS
*
diff --git a/arch/x86/kvm/vmx/vmcs.h b/arch/x86/kvm/vmx/vmcs.h
index 7c1996b433e2..5f1fc43c8be0 100644
--- a/arch/x86/kvm/vmx/vmcs.h
+++ b/arch/x86/kvm/vmx/vmcs.h
@@ -73,6 +73,8 @@ struct loaded_vmcs {
struct list_head loaded_vmcss_on_cpu_link;
struct vmcs_host_state host_state;
struct vmcs_controls_shadow controls_shadow;
+ /* write-through cache of the SPEC_CTRL_MASK field in VMCS */
+ u64 spec_ctrl_mask;
};

static __always_inline bool is_intr_type(u32 intr_info, u32 type)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 56e0c7ae961d..9f6919bec2b3 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1996,7 +1996,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
!guest_has_spec_ctrl_msr(vcpu))
return 1;

- msr_info->data = to_vmx(vcpu)->spec_ctrl;
+ msr_info->data = vmx->guest_spec_ctrl;
break;
case MSR_IA32_SYSENTER_CS:
msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
@@ -2145,7 +2145,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
struct vmx_uret_msr *msr;
int ret = 0;
u32 msr_index = msr_info->index;
- u64 data = msr_info->data;
+ u64 data = msr_info->data, spec_ctrl_mask;
u32 index;

switch (msr_index) {
@@ -2256,13 +2256,22 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
!guest_has_spec_ctrl_msr(vcpu))
return 1;

- if (kvm_spec_ctrl_test_value(data))
+ spec_ctrl_mask = vmx_get_spec_ctrl_mask(vmx);
+ if (kvm_spec_ctrl_test_value(data | spec_ctrl_mask))
return 1;

- vmx->spec_ctrl = data;
+ vmx_set_guest_spec_ctrl(vmx, data);
+
if (!data)
break;

+ /*
+ * If "virtualize IA32_SPEC_CTRL" isn't supported, keep the MSR
+ * intercepted so that the guest can't change the masked bits.
+ */
+ if (spec_ctrl_mask && !cpu_has_spec_ctrl_virt())
+ break;
+
/*
* For non-nested:
* When it's written (to non-zero) for the first time, pass
@@ -4822,7 +4831,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
__vmx_vcpu_reset(vcpu);

vmx->rmode.vm86_active = 0;
- vmx->spec_ctrl = 0;
+ vmx_set_spec_ctrl_mask(vmx, 0);
+ vmx_set_guest_spec_ctrl(vmx, 0);

vmx->msr_ia32_umwait_control = 0;

@@ -7117,8 +7127,20 @@ void noinstr vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx,
if (!cpu_feature_enabled(X86_FEATURE_MSR_SPEC_CTRL))
return;

- if (flags & VMX_RUN_SAVE_SPEC_CTRL)
+ if (flags & VMX_RUN_SAVE_SPEC_CTRL) {
vmx->spec_ctrl = __rdmsr(MSR_IA32_SPEC_CTRL);
+ /*
+ * For simplicity, always keep vmx->guest_spec_ctrl
+ * up-to-date. The guest value is needed for userspace reads
+ * (e.g. live migration) and nested transitions. At a later
+ * point, it is hard to tell which one is the latest guest
+ * value since the intercept state may change.
+ */
+ if (cpu_has_spec_ctrl_virt())
+ vmx->guest_spec_ctrl = vmcs_read64(IA32_SPEC_CTRL_SHADOW);
+ else
+ vmx->guest_spec_ctrl = vmx->spec_ctrl;
+ }

/*
* If the guest/host SPEC_CTRL values differ, restore the host value.
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index cb766f65a3eb..021d86b52e18 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -291,6 +291,7 @@ struct vcpu_vmx {
#endif

u64 spec_ctrl;
+ u64 guest_spec_ctrl;
u32 msr_ia32_umwait_control;

/*
@@ -589,7 +590,8 @@ static inline u8 vmx_get_rvi(void)

#define KVM_REQUIRED_VMX_TERTIARY_VM_EXEC_CONTROL 0
#define KVM_OPTIONAL_VMX_TERTIARY_VM_EXEC_CONTROL \
- (TERTIARY_EXEC_IPI_VIRT)
+ (TERTIARY_EXEC_IPI_VIRT | \
+ TERTIARY_EXEC_SPEC_CTRL_VIRT)

#define BUILD_CONTROLS_SHADOW(lname, uname, bits) \
static inline void lname##_controls_set(struct vcpu_vmx *vmx, u##bits val) \
@@ -624,6 +626,20 @@ BUILD_CONTROLS_SHADOW(exec, CPU_BASED_VM_EXEC_CONTROL, 32)
BUILD_CONTROLS_SHADOW(secondary_exec, SECONDARY_VM_EXEC_CONTROL, 32)
BUILD_CONTROLS_SHADOW(tertiary_exec, TERTIARY_VM_EXEC_CONTROL, 64)

+static inline void vmx_set_spec_ctrl_mask(struct vcpu_vmx *vmx, u64 mask)
+{
+ if (vmx->loaded_vmcs->spec_ctrl_mask != mask) {
+ if (cpu_has_spec_ctrl_virt())
+ vmcs_write64(IA32_SPEC_CTRL_MASK, mask);
+ vmx->loaded_vmcs->spec_ctrl_mask = mask;
+ }
+}
+
+static inline u64 vmx_get_spec_ctrl_mask(struct vcpu_vmx *vmx)
+{
+ return vmx->loaded_vmcs->spec_ctrl_mask;
+}
+
/*
* VMX_REGS_LAZY_LOAD_SET - The set of registers that will be updated in the
* cache on demand. Other registers not listed here are synced to
@@ -750,4 +766,26 @@ static inline bool guest_cpuid_has_evmcs(struct kvm_vcpu *vcpu)
to_vmx(vcpu)->nested.enlightened_vmcs_enabled;
}

+static inline u64 vmx_get_guest_spec_ctrl(struct vcpu_vmx *vmx)
+{
+ return vmx->guest_spec_ctrl;
+}
+
+static inline void vmx_set_guest_spec_ctrl(struct vcpu_vmx *vmx, u64 val)
+{
+ vmx->guest_spec_ctrl = val;
+
+ /*
+ * For simplicity, always keep IA32_SPEC_CTRL_SHADOW up-to-date,
+ * regardless of the MSR intercept state.
+ */
+ if (cpu_has_spec_ctrl_virt())
+ vmcs_write64(IA32_SPEC_CTRL_SHADOW, val);
+
+ /*
+ * Update the effective value of IA32_SPEC_CTRL to reflect changes to
+ * guest's IA32_SPEC_CTRL. Bits in the mask should always be set.
+ */
+ vmx->spec_ctrl = val | vmx_get_spec_ctrl_mask(vmx);
+}
#endif /* __KVM_X86_VMX_H */
--
2.40.0

2023-04-14 06:28:36

by Chao Gao

[permalink] [raw]
Subject: [RFC PATCH v2 01/11] x86/msr-index: Add bit definitions for BHI_DIS_S and BHI_NO

From: Zhang Chen <[email protected]>

To ensure that migrating a VM from a system where a software mitigation
works to a system where it doesn't won't lower the guest's security level,
KVM must mitigate BHI attacks for guests, since migration is transparent
to guests and guests won't and can't react to it.

For example, the simple BHB clearing sequence [1] is effective in mitigating
BHI attacks on processors prior to Alder Lake, but it is not on Alder Lake.
Guests migrated from a pre-Alder Lake host to an Alder Lake host become
vulnerable to BHI attacks even if the simple BHB clearing sequence is
deployed. In this case, KVM can enable the hardware mitigation for guests by
setting the BHI_DIS_S bit of the IA32_SPEC_CTRL MSR.

Define the SPEC_CTRL_BHI_DIS_S bit of the IA32_SPEC_CTRL MSR and the BHI_NO
bit of the IA32_ARCH_CAPABILITIES MSR, which will be used by KVM later.

[1]: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html#inpage-nav-2-4

Signed-off-by: Zhang Chen <[email protected]>
Co-developed-by: Chao Gao <[email protected]>
Signed-off-by: Chao Gao <[email protected]>
Tested-by: Jiaan Lu <[email protected]>
---
arch/x86/include/asm/msr-index.h | 8 +++++++-
tools/arch/x86/include/asm/msr-index.h | 8 +++++++-
2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index ad35355ee43e..60b25d87b82c 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -48,8 +48,10 @@
#define SPEC_CTRL_STIBP BIT(SPEC_CTRL_STIBP_SHIFT) /* STIBP mask */
#define SPEC_CTRL_SSBD_SHIFT 2 /* Speculative Store Bypass Disable bit */
#define SPEC_CTRL_SSBD BIT(SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */
-#define SPEC_CTRL_RRSBA_DIS_S_SHIFT 6 /* Disable RRSBA behavior */
+#define SPEC_CTRL_RRSBA_DIS_S_SHIFT 6 /* Disable RRSBA behavior in supervisor mode */
#define SPEC_CTRL_RRSBA_DIS_S BIT(SPEC_CTRL_RRSBA_DIS_S_SHIFT)
+#define SPEC_CTRL_BHI_DIS_S_SHIFT 10 /* Disable BHI behavior in supervisor mode */
+#define SPEC_CTRL_BHI_DIS_S BIT(SPEC_CTRL_BHI_DIS_S_SHIFT)

/* A mask for bits which the kernel toggles when controlling mitigations */
#define SPEC_CTRL_MITIGATIONS_MASK (SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD \
@@ -151,6 +153,10 @@
* are restricted to targets in
* kernel.
*/
+#define ARCH_CAP_BHI_NO BIT(20) /*
+ * Not susceptible to Branch History
+ * Injection.
+ */
#define ARCH_CAP_PBRSB_NO BIT(24) /*
* Not susceptible to Post-Barrier
* Return Stack Buffer Predictions.
diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
index ad35355ee43e..6079a5fdb40b 100644
--- a/tools/arch/x86/include/asm/msr-index.h
+++ b/tools/arch/x86/include/asm/msr-index.h
@@ -48,8 +48,10 @@
#define SPEC_CTRL_STIBP BIT(SPEC_CTRL_STIBP_SHIFT) /* STIBP mask */
#define SPEC_CTRL_SSBD_SHIFT 2 /* Speculative Store Bypass Disable bit */
#define SPEC_CTRL_SSBD BIT(SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */
-#define SPEC_CTRL_RRSBA_DIS_S_SHIFT 6 /* Disable RRSBA behavior */
+#define SPEC_CTRL_RRSBA_DIS_S_SHIFT 6 /* Disable RRSBA behavior in supervisor mode */
#define SPEC_CTRL_RRSBA_DIS_S BIT(SPEC_CTRL_RRSBA_DIS_S_SHIFT)
+#define SPEC_CTRL_BHI_DIS_S_SHIFT 10 /* Disable BHI behavior in supervisor mode */
+#define SPEC_CTRL_BHI_DIS_S BIT(SPEC_CTRL_BHI_DIS_S_SHIFT)

/* A mask for bits which the kernel toggles when controlling mitigations */
#define SPEC_CTRL_MITIGATIONS_MASK (SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD \
@@ -151,6 +153,10 @@
* are restricted to targets in
* kernel.
*/
+#define ARCH_CAP_BHI_NO BIT(20) /*
+ * Not susceptible to Branch History
+ * Injection.
+ */
#define ARCH_CAP_PBRSB_NO BIT(24) /*
* Not susceptible to Post-Barrier
* Return Stack Buffer Predictions.
--
2.40.0

2023-04-14 06:28:58

by Chao Gao

[permalink] [raw]
Subject: [RFC PATCH v2 03/11] KVM: x86: Advertise BHI_CTRL support

From: Zhang Chen <[email protected]>

Add a 100% KVM-only feature flag for BHI_CTRL because the kernel doesn't use
the feature at all.

BHI_CTRL is enumerated by CPUID.7.2.EDX[4]. If supported, BHI_DIS_S (bit
10) of IA32_SPEC_CTRL MSR can be used to enable BHI_DIS_S behavior.

Note that KVM does not intercept guests' IA32_SPEC_CTRL MSR accesses
after a non-zero value is written to the MSR. Therefore, guests can already
toggle the BHI_DIS_S bit if the host supports BHI_CTRL, and no extra
code is needed to allow guests to toggle the bit.

Signed-off-by: Zhang Chen <[email protected]>
Signed-off-by: Chao Gao <[email protected]>
Tested-by: Jiaan Lu <[email protected]>
---
arch/x86/kvm/cpuid.c | 2 +-
arch/x86/kvm/reverse_cpuid.h | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index f024c3ac2203..7cdd859d09a2 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -686,7 +686,7 @@ void kvm_set_cpu_caps(void)
);

kvm_cpu_cap_init_kvm_defined(CPUID_7_2_EDX,
- SF(RRSBA_CTRL)
+ SF(RRSBA_CTRL) | F(BHI_CTRL)
);

kvm_cpu_cap_mask(CPUID_8000_0001_ECX,
diff --git a/arch/x86/kvm/reverse_cpuid.h b/arch/x86/kvm/reverse_cpuid.h
index 72bad8314a9c..e7e70c9aa384 100644
--- a/arch/x86/kvm/reverse_cpuid.h
+++ b/arch/x86/kvm/reverse_cpuid.h
@@ -50,6 +50,7 @@ enum kvm_only_cpuid_leafs {

/* Intel-defined sub-features, CPUID level 0x00000007:2 (EDX) */
#define KVM_X86_FEATURE_RRSBA_CTRL KVM_X86_FEATURE(CPUID_7_2_EDX, 2)
+#define X86_FEATURE_BHI_CTRL KVM_X86_FEATURE(CPUID_7_2_EDX, 4)

struct cpuid_reg {
u32 function;
--
2.40.0

2023-04-14 06:29:14

by Chao Gao

[permalink] [raw]
Subject: [RFC PATCH v2 05/11] x86/bugs: Use Virtual MSRs to request hardware mitigations

From: Pawan Gupta <[email protected]>

Guests that have a different family/model than the host may not be aware
of hardware mitigations (such as RRSBA_DIS_S) available on the host. This is
particularly true when guests migrate. To solve this problem, Intel
processors have added a virtual MSR interface through which guests can
report their mitigation status and request the VMM to deploy relevant
hardware mitigations.

Use this virtualized MSR interface to request relevant hardware controls
for retpoline mitigation.

Signed-off-by: Pawan Gupta <[email protected]>
Co-developed-by: Zhang Chen <[email protected]>
Signed-off-by: Zhang Chen <[email protected]>
Signed-off-by: Chao Gao <[email protected]>
Tested-by: Jiaan Lu <[email protected]>
---
arch/x86/include/asm/msr-index.h | 25 +++++++++++++++++++++++++
arch/x86/kernel/cpu/bugs.c | 25 +++++++++++++++++++++++++
2 files changed, 50 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 60b25d87b82c..aec213f0c6fc 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -166,6 +166,7 @@
* IA32_XAPIC_DISABLE_STATUS MSR
* supported
*/
+#define ARCH_CAP_VIRTUAL_ENUM BIT_ULL(63) /* MSR_VIRTUAL_ENUMERATION supported */

#define MSR_IA32_FLUSH_CMD 0x0000010b
#define L1D_FLUSH BIT(0) /*
@@ -1103,6 +1104,30 @@
#define MSR_IA32_VMX_MISC_INTEL_PT (1ULL << 14)
#define MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS (1ULL << 29)
#define MSR_IA32_VMX_MISC_PREEMPTION_TIMER_SCALE 0x1F
+
+/* Intel virtual MSRs */
+#define MSR_VIRTUAL_ENUMERATION 0x50000000
+#define VIRT_ENUM_MITIGATION_CTRL_SUPPORT BIT(0) /*
+ * Mitigation ctrl via virtual
+ * MSRs supported
+ */
+
+#define MSR_VIRTUAL_MITIGATION_ENUM 0x50000001
+#define MITI_ENUM_BHB_CLEAR_SEQ_S_SUPPORT BIT(0) /* VMM supports BHI_DIS_S */
+#define MITI_ENUM_RETPOLINE_S_SUPPORT BIT(1) /* VMM supports RRSBA_DIS_S */
+
+#define MSR_VIRTUAL_MITIGATION_CTRL 0x50000002
+#define MITI_CTRL_BHB_CLEAR_SEQ_S_USED_BIT 0 /*
+ * Request VMM to deploy
+ * BHI_DIS_S mitigation
+ */
+#define MITI_CTRL_BHB_CLEAR_SEQ_S_USED BIT(MITI_CTRL_BHB_CLEAR_SEQ_S_USED_BIT)
+#define MITI_CTRL_RETPOLINE_S_USED_BIT 1 /*
+ * Request VMM to deploy
+ * RRSBA_DIS_S mitigation
+ */
+#define MITI_CTRL_RETPOLINE_S_USED BIT(MITI_CTRL_RETPOLINE_S_USED_BIT)
+
/* AMD-V MSRs */

#define MSR_VM_CR 0xc0010114
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index f9d060e71c3e..5326c03d9d5e 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1435,6 +1435,27 @@ static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_
dump_stack();
}

+/* Speculation control using virtualized MSRs */
+static void spec_ctrl_setup_virtualized_msr(void)
+{
+ u64 msr_virt_enum, msr_mitigation_enum;
+
+ /* When retpoline is being used, request relevant hardware controls */
+ if (!boot_cpu_has(X86_FEATURE_RETPOLINE))
+ return;
+
+ if (!(x86_read_arch_cap_msr() & ARCH_CAP_VIRTUAL_ENUM))
+ return;
+
+ rdmsrl(MSR_VIRTUAL_ENUMERATION, msr_virt_enum);
+ if (!(msr_virt_enum & VIRT_ENUM_MITIGATION_CTRL_SUPPORT))
+ return;
+
+ rdmsrl(MSR_VIRTUAL_MITIGATION_ENUM, msr_mitigation_enum);
+ if (msr_mitigation_enum & MITI_ENUM_RETPOLINE_S_SUPPORT)
+ msr_set_bit(MSR_VIRTUAL_MITIGATION_CTRL, MITI_CTRL_RETPOLINE_S_USED_BIT);
+}
+
static void __init spectre_v2_select_mitigation(void)
{
enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
@@ -1546,6 +1567,8 @@ static void __init spectre_v2_select_mitigation(void)
mode == SPECTRE_V2_RETPOLINE)
spec_ctrl_disable_kernel_rrsba();

+ spec_ctrl_setup_virtualized_msr();
+
spectre_v2_enabled = mode;
pr_info("%s\n", spectre_v2_strings[mode]);

@@ -2115,6 +2138,8 @@ void x86_spec_ctrl_setup_ap(void)

if (ssb_mode == SPEC_STORE_BYPASS_DISABLE)
x86_amd_ssb_disable();
+
+ spec_ctrl_setup_virtualized_msr();
}

bool itlb_multihit_kvm_mitigation;
--
2.40.0

2023-04-14 06:29:38

by Chao Gao

[permalink] [raw]
Subject: [RFC PATCH v2 06/11] KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support

From: Zhang Chen <[email protected]>

Bit 63 of the IA32_ARCH_CAPABILITIES MSR indicates availability of
MSR_VIRTUAL_ENUMERATION (index 0x50000000), which enumerates features such as
mitigation enumeration, used by guests to hint to the VMM which software
mitigations are in use.

Advertise ARCH_CAP_VIRTUAL_ENUM support for VMX and emulate reads and writes
of MSR_VIRTUAL_ENUMERATION. For now, MSR_VIRTUAL_ENUMERATION is always 0.

Signed-off-by: Zhang Chen <[email protected]>
Co-developed-by: Chao Gao <[email protected]>
Signed-off-by: Chao Gao <[email protected]>
Tested-by: Jiaan Lu <[email protected]>
---
arch/x86/kvm/svm/svm.c | 1 +
arch/x86/kvm/vmx/vmx.c | 19 +++++++++++++++++++
arch/x86/kvm/vmx/vmx.h | 1 +
arch/x86/kvm/x86.c | 16 +++++++++++++++-
4 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 57f241c5a371..195d0cf9309a 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4093,6 +4093,7 @@ static bool svm_has_emulated_msr(struct kvm *kvm, u32 index)
{
switch (index) {
case MSR_IA32_MCG_EXT_CTL:
+ case MSR_VIRTUAL_ENUMERATION:
case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
return false;
case MSR_IA32_SMBASE:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9f6919bec2b3..85419137decb 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1943,6 +1943,8 @@ static inline bool is_vmx_feature_control_msr_valid(struct vcpu_vmx *vmx,
return !(msr->data & ~valid_bits);
}

+#define VIRTUAL_ENUMERATION_VALID_BITS 0ULL
+
static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
{
switch (msr->index) {
@@ -1950,6 +1952,9 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
if (!nested)
return 1;
return vmx_get_vmx_msr(&vmcs_config.nested, msr->index, &msr->data);
+ case MSR_VIRTUAL_ENUMERATION:
+ msr->data = VIRTUAL_ENUMERATION_VALID_BITS;
+ return 0;
default:
return KVM_MSR_RET_INVALID;
}
@@ -2096,6 +2101,12 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_IA32_DEBUGCTLMSR:
msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
break;
+ case MSR_VIRTUAL_ENUMERATION:
+ if (!msr_info->host_initiated &&
+ !(vcpu->arch.arch_capabilities & ARCH_CAP_VIRTUAL_ENUM))
+ return 1;
+ msr_info->data = vmx->msr_virtual_enumeration;
+ break;
default:
find_uret_msr:
msr = vmx_find_uret_msr(vmx, msr_info->index);
@@ -2437,6 +2448,14 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
}
ret = kvm_set_msr_common(vcpu, msr_info);
break;
+ case MSR_VIRTUAL_ENUMERATION:
+ if (!msr_info->host_initiated)
+ return 1;
+ if (data & ~VIRTUAL_ENUMERATION_VALID_BITS)
+ return 1;
+
+ vmx->msr_virtual_enumeration = data;
+ break;

default:
find_uret_msr:
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 021d86b52e18..a7faaf9fdc26 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -292,6 +292,7 @@ struct vcpu_vmx {

u64 spec_ctrl;
u64 guest_spec_ctrl;
+ u64 msr_virtual_enumeration;
u32 msr_ia32_umwait_control;

/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3c58dbae7b4c..a1bc52bebdcc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1537,6 +1537,7 @@ static const u32 emulated_msrs_all[] = {

MSR_K7_HWCR,
MSR_KVM_POLL_CONTROL,
+ MSR_VIRTUAL_ENUMERATION,
};

static u32 emulated_msrs[ARRAY_SIZE(emulated_msrs_all)];
@@ -1570,6 +1571,7 @@ static const u32 msr_based_features_all[] = {
MSR_IA32_UCODE_REV,
MSR_IA32_ARCH_CAPABILITIES,
MSR_IA32_PERF_CAPABILITIES,
+ MSR_VIRTUAL_ENUMERATION,
};

static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all)];
@@ -1591,7 +1593,8 @@ static unsigned int num_msr_based_features;
ARCH_CAP_SKIP_VMENTRY_L1DFLUSH | ARCH_CAP_SSB_NO | ARCH_CAP_MDS_NO | \
ARCH_CAP_PSCHANGE_MC_NO | ARCH_CAP_TSX_CTRL_MSR | ARCH_CAP_TAA_NO | \
ARCH_CAP_SBDR_SSDP_NO | ARCH_CAP_FBSDP_NO | ARCH_CAP_PSDP_NO | \
- ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO)
+ ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | \
+ ARCH_CAP_VIRTUAL_ENUM)

static u64 kvm_get_arch_capabilities(void)
{
@@ -1610,6 +1613,17 @@ static u64 kvm_get_arch_capabilities(void)
*/
data |= ARCH_CAP_PSCHANGE_MC_NO;

+ /*
+ * Virtual enumeration is a paravirt feature. The only usage for now
+ * is to bridge the gap caused by microarchitecture changes between
+ * different Intel processors. And its usage is linked to "virtualize
+ * IA32_SPEC_CTRL" which is a VMX feature. Whether AMD SVM can benefit
+ * from the same usage and how to implement it is still unclear. Limit
+ * virtual enumeration to VMX.
+ */
+ if (static_call(kvm_x86_has_emulated_msr)(NULL, MSR_VIRTUAL_ENUMERATION))
+ data |= ARCH_CAP_VIRTUAL_ENUM;
+
/*
* If we're doing cache flushes (either "always" or "cond")
* we will do one whenever the guest does a vmlaunch/vmresume.
--
2.40.0

2023-04-14 06:30:13

by Chao Gao

[permalink] [raw]
Subject: [RFC PATCH v2 07/11] KVM: VMX: Advertise MITIGATION_CTRL support

From: Zhang Chen <[email protected]>

Advertise MITIGATION_CTRL support and emulate accesses to two associated
MSRs.

MITIGATION_CTRL is enumerated by bit 0 of MSR_VIRTUAL_ENUMERATION. If
supported, two virtual MSRs, MSR_VIRTUAL_MITIGATION_ENUM (0x50000001) and
MSR_VIRTUAL_MITIGATION_CTRL (0x50000002), are available.

The two MSRs are used by the guest to report its software mitigation status.
This information is preserved across live migration, so KVM can leverage it
to deploy the necessary hardware mitigations and guarantee that guests
maintain the same security level after migration.
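
As an illustration, a userspace VMM could carry this state across migration
with the standard KVM MSR save/restore ioctls (a hedged sketch; the ioctls
and structures are the existing KVM uAPI, the helper itself is hypothetical,
and in practice the read runs on the source host and the write on the
destination host):

  #include <linux/kvm.h>
  #include <sys/ioctl.h>

  #define MSR_VIRTUAL_MITIGATION_CTRL 0x50000002

  /* Hypothetical sketch: migrate the guest-reported mitigation status */
  static void copy_virtual_mitigation_ctrl(int src_vcpu_fd, int dst_vcpu_fd)
  {
          struct {
                  struct kvm_msrs hdr;
                  struct kvm_msr_entry entry;
          } msrs = {
                  .hdr.nmsrs   = 1,
                  .entry.index = MSR_VIRTUAL_MITIGATION_CTRL,
          };

          /* source: host-initiated read of the MSR */
          ioctl(src_vcpu_fd, KVM_GET_MSRS, &msrs);
          /* destination: host-initiated write to restore it */
          ioctl(dst_vcpu_fd, KVM_SET_MSRS, &msrs);
  }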

Note that MSR_VIRTUAL_MITIGATION_ENUM is also a feature MSR since each
bit in the MSR represents a software mitigation that the underlying VMM
understands.

Signed-off-by: Zhang Chen <[email protected]>
Co-developed-by: Chao Gao <[email protected]>
Signed-off-by: Chao Gao <[email protected]>
Tested-by: Jiaan Lu <[email protected]>
---
arch/x86/kvm/svm/svm.c | 2 ++
arch/x86/kvm/vmx/vmx.c | 36 +++++++++++++++++++++++++++++++++++-
arch/x86/kvm/vmx/vmx.h | 2 ++
arch/x86/kvm/x86.c | 3 +++
4 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 195d0cf9309a..80bb7a62e9b2 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4094,6 +4094,8 @@ static bool svm_has_emulated_msr(struct kvm *kvm, u32 index)
switch (index) {
case MSR_IA32_MCG_EXT_CTL:
case MSR_VIRTUAL_ENUMERATION:
+ case MSR_VIRTUAL_MITIGATION_ENUM:
+ case MSR_VIRTUAL_MITIGATION_CTRL:
case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
return false;
case MSR_IA32_SMBASE:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 85419137decb..980498c4c30c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1943,7 +1943,9 @@ static inline bool is_vmx_feature_control_msr_valid(struct vcpu_vmx *vmx,
return !(msr->data & ~valid_bits);
}

-#define VIRTUAL_ENUMERATION_VALID_BITS 0ULL
+#define VIRTUAL_ENUMERATION_VALID_BITS VIRT_ENUM_MITIGATION_CTRL_SUPPORT
+#define MITI_ENUM_VALID_BITS 0ULL
+#define MITI_CTRL_VALID_BITS 0ULL

static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
{
@@ -1955,6 +1957,9 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
case MSR_VIRTUAL_ENUMERATION:
msr->data = VIRTUAL_ENUMERATION_VALID_BITS;
return 0;
+ case MSR_VIRTUAL_MITIGATION_ENUM:
+ msr->data = MITI_ENUM_VALID_BITS;
+ return 0;
default:
return KVM_MSR_RET_INVALID;
}
@@ -2107,6 +2112,18 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
return 1;
msr_info->data = vmx->msr_virtual_enumeration;
break;
+ case MSR_VIRTUAL_MITIGATION_ENUM:
+ if (!msr_info->host_initiated &&
+ !(vmx->msr_virtual_enumeration & VIRT_ENUM_MITIGATION_CTRL_SUPPORT))
+ return 1;
+ msr_info->data = vmx->msr_virtual_mitigation_enum;
+ break;
+ case MSR_VIRTUAL_MITIGATION_CTRL:
+ if (!msr_info->host_initiated &&
+ !(vmx->msr_virtual_enumeration & VIRT_ENUM_MITIGATION_CTRL_SUPPORT))
+ return 1;
+ msr_info->data = vmx->msr_virtual_mitigation_ctrl;
+ break;
default:
find_uret_msr:
msr = vmx_find_uret_msr(vmx, msr_info->index);
@@ -2456,7 +2473,23 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)

vmx->msr_virtual_enumeration = data;
break;
+ case MSR_VIRTUAL_MITIGATION_ENUM:
+ if (!msr_info->host_initiated)
+ return 1;
+ if (data & ~MITI_ENUM_VALID_BITS)
+ return 1;
+
+ vmx->msr_virtual_mitigation_enum = data;
+ break;
+ case MSR_VIRTUAL_MITIGATION_CTRL:
+ if (!msr_info->host_initiated &&
+ !(vmx->msr_virtual_enumeration & VIRT_ENUM_MITIGATION_CTRL_SUPPORT))
+ return 1;
+ if (data & ~MITI_CTRL_VALID_BITS)
+ return 1;

+ vmx->msr_virtual_mitigation_ctrl = data;
+ break;
default:
find_uret_msr:
msr = vmx_find_uret_msr(vmx, msr_index);
@@ -4852,6 +4885,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
vmx->rmode.vm86_active = 0;
vmx_set_spec_ctrl_mask(vmx, 0);
vmx_set_guest_spec_ctrl(vmx, 0);
+ vmx->msr_virtual_mitigation_ctrl = 0;

vmx->msr_ia32_umwait_control = 0;

diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index a7faaf9fdc26..b81480c879b5 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -293,6 +293,8 @@ struct vcpu_vmx {
u64 spec_ctrl;
u64 guest_spec_ctrl;
u64 msr_virtual_enumeration;
+ u64 msr_virtual_mitigation_enum;
+ u64 msr_virtual_mitigation_ctrl;
u32 msr_ia32_umwait_control;

/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a1bc52bebdcc..3b567dc03b27 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1538,6 +1538,8 @@ static const u32 emulated_msrs_all[] = {
MSR_K7_HWCR,
MSR_KVM_POLL_CONTROL,
MSR_VIRTUAL_ENUMERATION,
+ MSR_VIRTUAL_MITIGATION_ENUM,
+ MSR_VIRTUAL_MITIGATION_CTRL,
};

static u32 emulated_msrs[ARRAY_SIZE(emulated_msrs_all)];
@@ -1572,6 +1574,7 @@ static const u32 msr_based_features_all[] = {
MSR_IA32_ARCH_CAPABILITIES,
MSR_IA32_PERF_CAPABILITIES,
MSR_VIRTUAL_ENUMERATION,
+ MSR_VIRTUAL_MITIGATION_ENUM,
};

static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all)];
--
2.40.0

2023-04-14 06:30:17

by Chao Gao

[permalink] [raw]
Subject: [RFC PATCH v2 09/11] KVM: VMX: Advertise MITI_CTRL_BHB_CLEAR_SEQ_S_SUPPORT

From: Zhang Chen <[email protected]>

Allow the guest to query whether the underlying VMM understands the
BHB-clearing sequence and to report the status of the BHB-clearing sequence
in supervisor mode (i.e. CPL < 3) via MSR_VIRTUAL_MITIGATION_ENUM/CTRL.

Enable BHI_DIS_S for the guest if the guest is using the BHB-clearing
sequence and the sequence isn't effective on the processor.

Signed-off-by: Zhang Chen <[email protected]>
Signed-off-by: Chao Gao <[email protected]>
Tested-by: Jiaan Lu <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 25afb4c3e55e..abbbf8e66b53 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1944,8 +1944,10 @@ static inline bool is_vmx_feature_control_msr_valid(struct vcpu_vmx *vmx,
}

#define VIRTUAL_ENUMERATION_VALID_BITS VIRT_ENUM_MITIGATION_CTRL_SUPPORT
-#define MITI_ENUM_VALID_BITS MITI_ENUM_RETPOLINE_S_SUPPORT
-#define MITI_CTRL_VALID_BITS MITI_CTRL_RETPOLINE_S_USED
+#define MITI_ENUM_VALID_BITS (MITI_ENUM_RETPOLINE_S_SUPPORT | \
+ MITI_ENUM_BHB_CLEAR_SEQ_S_SUPPORT)
+#define MITI_CTRL_VALID_BITS (MITI_CTRL_RETPOLINE_S_USED | \
+ MITI_CTRL_BHB_CLEAR_SEQ_S_USED)

static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
{
@@ -2496,6 +2498,11 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
arch_msr & ARCH_CAP_RRSBA)
spec_ctrl_mask |= SPEC_CTRL_RRSBA_DIS_S;

+ if (data & MITI_CTRL_BHB_CLEAR_SEQ_S_USED &&
+ kvm_cpu_cap_has(X86_FEATURE_BHI_CTRL) &&
+ !(arch_msr & ARCH_CAP_BHI_NO))
+ spec_ctrl_mask |= SPEC_CTRL_BHI_DIS_S;
+
/*
* Intercept IA32_SPEC_CTRL to disallow guest from changing
* certain bits.
--
2.40.0

2023-04-14 06:30:19

by Chao Gao

[permalink] [raw]
Subject: [RFC PATCH v2 08/11] KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT

Allow the guest to query whether the underlying VMM understands retpoline
and to report the status of retpoline in supervisor mode (i.e. CPL < 3) via
MSR_VIRTUAL_MITIGATION_ENUM/CTRL.

Disable RRSBA behavior by setting RRSBA_DIS_S for the guest if the guest is
using retpoline and the processor exhibits RRSBA behavior.

Signed-off-by: Zhang Chen <[email protected]>
Signed-off-by: Chao Gao <[email protected]>
Tested-by: Jiaan Lu <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 24 +++++++++++++++++++++---
1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 980498c4c30c..25afb4c3e55e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1944,8 +1944,8 @@ static inline bool is_vmx_feature_control_msr_valid(struct vcpu_vmx *vmx,
}

#define VIRTUAL_ENUMERATION_VALID_BITS VIRT_ENUM_MITIGATION_CTRL_SUPPORT
-#define MITI_ENUM_VALID_BITS 0ULL
-#define MITI_CTRL_VALID_BITS 0ULL
+#define MITI_ENUM_VALID_BITS MITI_ENUM_RETPOLINE_S_SUPPORT
+#define MITI_CTRL_VALID_BITS MITI_CTRL_RETPOLINE_S_USED

static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
{
@@ -2173,7 +2173,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
struct vmx_uret_msr *msr;
int ret = 0;
u32 msr_index = msr_info->index;
- u64 data = msr_info->data, spec_ctrl_mask;
+ u64 data = msr_info->data, arch_msr = 0, spec_ctrl_mask = 0;
u32 index;

switch (msr_index) {
@@ -2488,6 +2488,24 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
if (data & ~MITI_CTRL_VALID_BITS)
return 1;

+ if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
+ rdmsrl(MSR_IA32_ARCH_CAPABILITIES, arch_msr);
+
+ if (data & MITI_CTRL_RETPOLINE_S_USED &&
+ kvm_cpu_cap_has(X86_FEATURE_RRSBA_CTRL) &&
+ arch_msr & ARCH_CAP_RRSBA)
+ spec_ctrl_mask |= SPEC_CTRL_RRSBA_DIS_S;
+
+ /*
+ * Intercept IA32_SPEC_CTRL to disallow guest from changing
+ * certain bits.
+ */
+ if (spec_ctrl_mask && !cpu_has_spec_ctrl_virt())
+ vmx_enable_intercept_for_msr(vcpu, MSR_IA32_SPEC_CTRL, MSR_TYPE_RW);
+
+ vmx_set_spec_ctrl_mask(vmx, spec_ctrl_mask);
+ vmx_set_guest_spec_ctrl(vmx, vmx_get_guest_spec_ctrl(vmx));
+
vmx->msr_virtual_mitigation_ctrl = data;
break;
default:
--
2.40.0

2023-04-14 06:31:20

by Chao Gao

[permalink] [raw]
Subject: [RFC PATCH v2 10/11] KVM: selftests: Add tests for virtual enumeration/mitigation MSRs

Three virtual MSRs are added for guests to report the usage of software
mitigations. They are enumerated in an architectural way. Access the
three MSRs to ensure the behavior is as expected. Specifically:

1. the following three cases should cause #GP:
* accessing a non-present MSR
* writing to a read-only MSR
* toggling a reserved bit of a writeable MSR

2. rdmsr/wrmsr in other cases should succeed

3. rdmsr should return the value last written

Signed-off-by: Chao Gao <[email protected]>
---
tools/arch/x86/include/asm/msr-index.h | 23 +++
tools/testing/selftests/kvm/Makefile | 1 +
.../kvm/x86_64/virtual_mitigation_msr_test.c | 175 ++++++++++++++++++
3 files changed, 199 insertions(+)
create mode 100644 tools/testing/selftests/kvm/x86_64/virtual_mitigation_msr_test.c

diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
index 6079a5fdb40b..55f75e9ebbb7 100644
--- a/tools/arch/x86/include/asm/msr-index.h
+++ b/tools/arch/x86/include/asm/msr-index.h
@@ -166,6 +166,7 @@
* IA32_XAPIC_DISABLE_STATUS MSR
* supported
*/
+#define ARCH_CAP_VIRTUAL_ENUM BIT_ULL(63) /* MSR_VIRTUAL_ENUMERATION supported */

#define MSR_IA32_FLUSH_CMD 0x0000010b
#define L1D_FLUSH BIT(0) /*
@@ -1103,6 +1104,28 @@
#define MSR_IA32_VMX_MISC_INTEL_PT (1ULL << 14)
#define MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS (1ULL << 29)
#define MSR_IA32_VMX_MISC_PREEMPTION_TIMER_SCALE 0x1F
+
+/* Intel virtual MSRs */
+#define MSR_VIRTUAL_ENUMERATION 0x50000000
+#define VIRT_ENUM_MITIGATION_CTRL_SUPPORT BIT(0) /*
+ * Mitigation ctrl via virtual
+ * MSRs supported
+ */
+
+#define MSR_VIRTUAL_MITIGATION_ENUM 0x50000001
+#define MITI_ENUM_BHB_CLEAR_SEQ_S_SUPPORT BIT(0) /* VMM supports BHI_DIS_S */
+#define MITI_ENUM_RETPOLINE_S_SUPPORT BIT(1) /* VMM supports RRSBA_DIS_S */
+
+#define MSR_VIRTUAL_MITIGATION_CTRL 0x50000002
+#define MITI_CTRL_BHB_CLEAR_SEQ_S_USED BIT(0) /*
+ * Request VMM to deploy
+ * BHI_DIS_S mitigation
+ */
+#define MITI_CTRL_RETPOLINE_S_USED BIT(1) /*
+ * Request VMM to deploy
+ * RRSBA_DIS_S mitigation
+ */
+
/* AMD-V MSRs */

#define MSR_VM_CR 0xc0010114
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 84a627c43795..9db9a7e49a54 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -115,6 +115,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/sev_migrate_tests
TEST_GEN_PROGS_x86_64 += x86_64/amx_test
TEST_GEN_PROGS_x86_64 += x86_64/max_vcpuid_cap_test
TEST_GEN_PROGS_x86_64 += x86_64/triple_fault_event_test
+TEST_GEN_PROGS_x86_64 += x86_64/virtual_mitigation_msr_test
TEST_GEN_PROGS_x86_64 += access_tracking_perf_test
TEST_GEN_PROGS_x86_64 += demand_paging_test
TEST_GEN_PROGS_x86_64 += dirty_log_test
diff --git a/tools/testing/selftests/kvm/x86_64/virtual_mitigation_msr_test.c b/tools/testing/selftests/kvm/x86_64/virtual_mitigation_msr_test.c
new file mode 100644
index 000000000000..4d924a0cf2dd
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/virtual_mitigation_msr_test.c
@@ -0,0 +1,175 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2023, Intel, Inc.
+ *
+ * tests for virtual mitigation MSR accesses
+ */
+
+#include <fcntl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+
+#include "test_util.h"
+
+#include "kvm_util.h"
+#include "processor.h"
+
+static int guest_exception_count;
+static int expected_exception_count;
+static void guest_gp_handler(struct ex_regs *regs)
+{
+ /* RDMSR/WRMSR are 2 bytes */
+ regs->rip += 2;
+ ++guest_exception_count;
+}
+
+static void write_msr_expect_gp(uint32_t msr, uint64_t val)
+{
+ uint64_t old_val;
+
+ old_val = rdmsr(msr);
+ wrmsr(msr, val);
+ expected_exception_count++;
+ GUEST_ASSERT_2(guest_exception_count == expected_exception_count,
+ guest_exception_count, expected_exception_count);
+ GUEST_ASSERT_2(rdmsr(msr) == old_val, rdmsr(msr), old_val);
+}
+
+static void write_msr_expect_no_gp(uint32_t msr, uint64_t val)
+{
+ wrmsr(msr, val);
+ GUEST_ASSERT_EQ(guest_exception_count, expected_exception_count);
+ GUEST_ASSERT_EQ(rdmsr(msr), val);
+}
+
+static void read_msr_expect_gp(uint32_t msr)
+{
+ (void)rdmsr(msr);
+ expected_exception_count++;
+ GUEST_ASSERT_2(guest_exception_count == expected_exception_count,
+ guest_exception_count, expected_exception_count);
+}
+
+static void guest_code_with_virtual_mitigation_ctrl(void)
+{
+ uint64_t val, miti_ctrl = 0;
+ int i;
+
+ val = rdmsr(MSR_VIRTUAL_ENUMERATION);
+ /* MSR_VIRTUAL_ENUMERATION is read-only. #GP is expected on write */
+ write_msr_expect_gp(MSR_VIRTUAL_ENUMERATION, val);
+
+ val = rdmsr(MSR_VIRTUAL_MITIGATION_ENUM);
+ /* MSR_VIRTUAL_MITIGATION_ENUM is read-only. #GP is expected on write */
+ write_msr_expect_gp(MSR_VIRTUAL_MITIGATION_ENUM, val);
+
+ for (i = 0; i < 64; i++) {
+ if (val & BIT_ULL(i)) {
+ miti_ctrl |= BIT_ULL(i);
+ write_msr_expect_no_gp(MSR_VIRTUAL_MITIGATION_CTRL, miti_ctrl);
+ } else {
+ write_msr_expect_gp(MSR_VIRTUAL_MITIGATION_CTRL, miti_ctrl | BIT_ULL(i));
+ }
+ }
+
+ write_msr_expect_no_gp(MSR_VIRTUAL_MITIGATION_CTRL, 0);
+ GUEST_DONE();
+}
+
+static void guest_code_no_virtual_enumeration(void)
+{
+ read_msr_expect_gp(MSR_VIRTUAL_ENUMERATION);
+ read_msr_expect_gp(MSR_VIRTUAL_MITIGATION_ENUM);
+ read_msr_expect_gp(MSR_VIRTUAL_MITIGATION_CTRL);
+ GUEST_DONE();
+}
+
+bool kvm_cpu_has_virtual_mitigation_ctrl(void)
+{
+ const struct kvm_msr_list *feature_list;
+ u64 virt_enum = 0;
+ int i;
+
+ feature_list = kvm_get_feature_msr_index_list();
+ for (i = 0; i < feature_list->nmsrs; i++) {
+ if (feature_list->indices[i] == MSR_VIRTUAL_ENUMERATION)
+ virt_enum = kvm_get_feature_msr(MSR_VIRTUAL_ENUMERATION);
+ }
+
+ return virt_enum & VIRT_ENUM_MITIGATION_CTRL_SUPPORT;
+}
+
+static void enable_virtual_mitigation_ctrl(struct kvm_vcpu *vcpu)
+{
+ vcpu_set_msr(vcpu, MSR_IA32_ARCH_CAPABILITIES, ARCH_CAP_VIRTUAL_ENUM);
+ vcpu_set_msr(vcpu, MSR_VIRTUAL_ENUMERATION, VIRT_ENUM_MITIGATION_CTRL_SUPPORT);
+ vcpu_set_msr(vcpu, MSR_VIRTUAL_MITIGATION_ENUM,
+ kvm_get_feature_msr(MSR_VIRTUAL_MITIGATION_ENUM));
+}
+
+static void disable_virtual_enumeration(struct kvm_vcpu *vcpu)
+{
+ vcpu_set_msr(vcpu, MSR_IA32_ARCH_CAPABILITIES, 0);
+}
+
+static void test_virtual_mitigation_ctrl(bool enable)
+{
+ struct kvm_vcpu *vcpu;
+ struct kvm_run *run;
+ struct kvm_vm *vm;
+ struct ucall uc;
+ void *guest_code;
+
+ guest_code = enable ? guest_code_with_virtual_mitigation_ctrl :
+ guest_code_no_virtual_enumeration;
+
+ vm = vm_create_with_one_vcpu(&vcpu, guest_code);
+ run = vcpu->run;
+
+ if (enable)
+ enable_virtual_mitigation_ctrl(vcpu);
+ else
+ disable_virtual_enumeration(vcpu);
+
+
+ /* Register #GP handler */
+ vm_init_descriptor_tables(vm);
+ vcpu_init_descriptor_tables(vcpu);
+ vm_install_exception_handler(vm, GP_VECTOR, guest_gp_handler);
+
+ while (1) {
+ vcpu_run(vcpu);
+
+ TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+ "Unexpected exit reason: %u (%s),\n",
+ run->exit_reason,
+ exit_reason_str(run->exit_reason));
+
+ switch (get_ucall(vcpu, &uc)) {
+ case UCALL_ABORT:
+ REPORT_GUEST_ASSERT_2(uc, "real %ld expected %ld");
+ break;
+ case UCALL_DONE:
+ goto done;
+ default:
+ TEST_FAIL("Unknown ucall %lu", uc.cmd);
+ }
+ }
+
+done:
+ kvm_vm_free(vm);
+}
+
+int main(int argc, char *argv[])
+{
+ TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_ARCH_CAPABILITIES));
+ TEST_REQUIRE(kvm_has_cap(KVM_CAP_GET_MSR_FEATURES));
+ TEST_REQUIRE(kvm_cpu_has_virtual_mitigation_ctrl());
+
+ test_virtual_mitigation_ctrl(true);
+ test_virtual_mitigation_ctrl(false);
+
+ return 0;
+}
--
2.40.0

2023-04-14 06:32:55

by Chao Gao

[permalink] [raw]
Subject: [RFC PATCH v2 11/11] KVM: selftests: Add tests for IA32_SPEC_CTRL MSR

Toggle the supported bits of IA32_SPEC_CTRL and verify the result. Also
verify that the MSR value is preserved across nested transitions.

Signed-off-by: Chao Gao <[email protected]>
---
tools/arch/x86/include/asm/msr-index.h | 6 +
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/include/x86_64/processor.h | 5 +
.../selftests/kvm/x86_64/spec_ctrl_msr_test.c | 178 ++++++++++++++++++
4 files changed, 190 insertions(+)
create mode 100644 tools/testing/selftests/kvm/x86_64/spec_ctrl_msr_test.c

diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
index 55f75e9ebbb7..9ad6c307c0d0 100644
--- a/tools/arch/x86/include/asm/msr-index.h
+++ b/tools/arch/x86/include/asm/msr-index.h
@@ -48,6 +48,12 @@
#define SPEC_CTRL_STIBP BIT(SPEC_CTRL_STIBP_SHIFT) /* STIBP mask */
#define SPEC_CTRL_SSBD_SHIFT 2 /* Speculative Store Bypass Disable bit */
#define SPEC_CTRL_SSBD BIT(SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */
+#define SPEC_CTRL_IPRED_DIS_U_SHIFT 3 /* Disable IPRED behavior in user mode */
+#define SPEC_CTRL_IPRED_DIS_U BIT(SPEC_CTRL_IPRED_DIS_U_SHIFT)
+#define SPEC_CTRL_IPRED_DIS_S_SHIFT 4 /* Disable IPRED behavior in supervisor mode */
+#define SPEC_CTRL_IPRED_DIS_S BIT(SPEC_CTRL_IPRED_DIS_S_SHIFT)
+#define SPEC_CTRL_RRSBA_DIS_U_SHIFT 5 /* Disable RRSBA behavior in user mode */
+#define SPEC_CTRL_RRSBA_DIS_U BIT(SPEC_CTRL_RRSBA_DIS_U_SHIFT)
#define SPEC_CTRL_RRSBA_DIS_S_SHIFT 6 /* Disable RRSBA behavior in supervisor mode */
#define SPEC_CTRL_RRSBA_DIS_S BIT(SPEC_CTRL_RRSBA_DIS_S_SHIFT)
#define SPEC_CTRL_BHI_DIS_S_SHIFT 10 /* Disable BHI behavior in supervisor mode */
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 9db9a7e49a54..9f117cf80477 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -116,6 +116,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/amx_test
TEST_GEN_PROGS_x86_64 += x86_64/max_vcpuid_cap_test
TEST_GEN_PROGS_x86_64 += x86_64/triple_fault_event_test
TEST_GEN_PROGS_x86_64 += x86_64/virtual_mitigation_msr_test
+TEST_GEN_PROGS_x86_64 += x86_64/spec_ctrl_msr_test
TEST_GEN_PROGS_x86_64 += access_tracking_perf_test
TEST_GEN_PROGS_x86_64 += demand_paging_test
TEST_GEN_PROGS_x86_64 += dirty_log_test
diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 90387ddcb2a9..355aba25dfef 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -125,8 +125,13 @@ struct kvm_x86_cpu_feature {
#define X86_FEATURE_IBT KVM_X86_CPU_FEATURE(0x7, 0, EDX, 20)
#define X86_FEATURE_AMX_TILE KVM_X86_CPU_FEATURE(0x7, 0, EDX, 24)
#define X86_FEATURE_SPEC_CTRL KVM_X86_CPU_FEATURE(0x7, 0, EDX, 26)
+#define X86_FEATURE_INTEL_STIBP KVM_X86_CPU_FEATURE(0x7, 0, EDX, 27)
+#define X86_FEATURE_SPEC_CTRL_SSBD KVM_X86_CPU_FEATURE(0x7, 0, EDX, 31)
#define X86_FEATURE_ARCH_CAPABILITIES KVM_X86_CPU_FEATURE(0x7, 0, EDX, 29)
#define X86_FEATURE_PKS KVM_X86_CPU_FEATURE(0x7, 0, ECX, 31)
+#define X86_FEATURE_IPRED_CTRL KVM_X86_CPU_FEATURE(0x7, 2, EDX, 1)
+#define X86_FEATURE_RRSBA_CTRL KVM_X86_CPU_FEATURE(0x7, 2, EDX, 2)
+#define X86_FEATURE_BHI_CTRL KVM_X86_CPU_FEATURE(0x7, 2, EDX, 4)
#define X86_FEATURE_XTILECFG KVM_X86_CPU_FEATURE(0xD, 0, EAX, 17)
#define X86_FEATURE_XTILEDATA KVM_X86_CPU_FEATURE(0xD, 0, EAX, 18)
#define X86_FEATURE_XSAVES KVM_X86_CPU_FEATURE(0xD, 1, EAX, 3)
diff --git a/tools/testing/selftests/kvm/x86_64/spec_ctrl_msr_test.c b/tools/testing/selftests/kvm/x86_64/spec_ctrl_msr_test.c
new file mode 100644
index 000000000000..ced4640ee92e
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/spec_ctrl_msr_test.c
@@ -0,0 +1,178 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2023, Intel, Inc.
+ *
+ * tests for IA32_SPEC_CTRL MSR accesses
+ */
+
+#include <fcntl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+
+#include "test_util.h"
+
+#include "kvm_util.h"
+#include "vmx.h"
+#include "processor.h"
+
+static void set_spec_ctrl(u64 val)
+{
+ /* Set the bit and verify the result */
+ wrmsr(MSR_IA32_SPEC_CTRL, val);
+ GUEST_ASSERT_2(rdmsr(MSR_IA32_SPEC_CTRL) == val, rdmsr(MSR_IA32_SPEC_CTRL), val);
+
+ /* Clear the bit and verify the result */
+ val = 0;
+ wrmsr(MSR_IA32_SPEC_CTRL, val);
+ GUEST_ASSERT_2(rdmsr(MSR_IA32_SPEC_CTRL) == val, rdmsr(MSR_IA32_SPEC_CTRL), val);
+}
+
+static void guest_code(void)
+{
+ set_spec_ctrl(SPEC_CTRL_IBRS);
+
+ if (this_cpu_has(X86_FEATURE_INTEL_STIBP))
+ set_spec_ctrl(SPEC_CTRL_STIBP);
+
+ if (this_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD))
+ set_spec_ctrl(SPEC_CTRL_SSBD);
+
+ if (this_cpu_has(X86_FEATURE_IPRED_CTRL)) {
+ set_spec_ctrl(SPEC_CTRL_IPRED_DIS_S);
+ set_spec_ctrl(SPEC_CTRL_IPRED_DIS_U);
+ }
+
+ if (this_cpu_has(X86_FEATURE_RRSBA_CTRL)) {
+ set_spec_ctrl(SPEC_CTRL_RRSBA_DIS_S);
+ set_spec_ctrl(SPEC_CTRL_RRSBA_DIS_U);
+ }
+
+ if (this_cpu_has(X86_FEATURE_BHI_CTRL))
+ set_spec_ctrl(SPEC_CTRL_BHI_DIS_S);
+
+ GUEST_DONE();
+}
+
+static void test_spec_ctrl_access(void)
+{
+ struct kvm_vcpu *vcpu;
+ struct kvm_run *run;
+ struct kvm_vm *vm;
+ struct ucall uc;
+
+ vm = vm_create_with_one_vcpu(&vcpu, guest_code);
+ run = vcpu->run;
+
+ while (1) {
+ vcpu_run(vcpu);
+
+ TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+ "Unexpected exit reason: %u (%s),\n",
+ run->exit_reason,
+ exit_reason_str(run->exit_reason));
+
+ switch (get_ucall(vcpu, &uc)) {
+ case UCALL_ABORT:
+ REPORT_GUEST_ASSERT_2(uc, "real %ld expected %ld");
+ break;
+ case UCALL_DONE:
+ goto done;
+ default:
+ TEST_FAIL("Unknown ucall %lu", uc.cmd);
+ }
+ }
+
+done:
+ kvm_vm_free(vm);
+}
+
+static void l2_guest_code(void)
+{
+ GUEST_ASSERT(rdmsr(MSR_IA32_SPEC_CTRL) == SPEC_CTRL_IBRS);
+ wrmsr(MSR_IA32_SPEC_CTRL, 0);
+
+ /* Exit to L1 */
+ __asm__ __volatile__("vmcall");
+}
+
+static void l1_guest_code(struct vmx_pages *vmx_pages)
+{
+#define L2_GUEST_STACK_SIZE 64
+ unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
+ uint32_t control;
+
+ /*
+ * Try to disable interception of writes to SPEC_CTRL by writing a
+ * non-0 value. This test is intended to verify that SPEC_CTRL is
+ * preserved across nested transitions, particularly when writes to
+ * the MSR aren't intercepted by the L0 VMM or the L1 VMM.
+ */
+ wrmsr(MSR_IA32_SPEC_CTRL, SPEC_CTRL_IBRS);
+
+ GUEST_ASSERT(vmx_pages->vmcs_gpa);
+ GUEST_ASSERT(prepare_for_vmx_operation(vmx_pages));
+ GUEST_ASSERT(load_vmcs(vmx_pages));
+ GUEST_ASSERT(vmptrstz() == vmx_pages->vmcs_gpa);
+ prepare_vmcs(vmx_pages, l2_guest_code,
+ &l2_guest_stack[L2_GUEST_STACK_SIZE]);
+
+ control = vmreadz(CPU_BASED_VM_EXEC_CONTROL);
+ control |= CPU_BASED_USE_MSR_BITMAPS;
+ vmwrite(CPU_BASED_VM_EXEC_CONTROL, control);
+
+ GUEST_ASSERT(!vmlaunch());
+
+ GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_VMCALL);
+ GUEST_ASSERT(rdmsr(MSR_IA32_SPEC_CTRL) == 0);
+
+ GUEST_DONE();
+}
+
+static void test_spec_ctrl_vmx_transition(void)
+{
+ vm_vaddr_t vmx_pages_gva;
+ struct kvm_vcpu *vcpu;
+ struct kvm_run *run;
+ struct kvm_vm *vm;
+ struct ucall uc;
+
+ TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_VMX));
+
+ vm = vm_create_with_one_vcpu(&vcpu, l1_guest_code);
+ vcpu_alloc_vmx(vm, &vmx_pages_gva);
+ vcpu_args_set(vcpu, 1, vmx_pages_gva);
+ run = vcpu->run;
+
+ while (1) {
+ vcpu_run(vcpu);
+
+ TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+ "Unexpected exit reason: %u (%s),\n",
+ run->exit_reason,
+ exit_reason_str(run->exit_reason));
+
+ switch (get_ucall(vcpu, &uc)) {
+ case UCALL_ABORT:
+ REPORT_GUEST_ASSERT(uc);
+ break;
+ case UCALL_DONE:
+ goto done;
+ default:
+ TEST_FAIL("Unknown ucall %lu", uc.cmd);
+ }
+ }
+
+done:
+ kvm_vm_free(vm);
+}
+
+int main(int argc, char *argv[])
+{
+ TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_SPEC_CTRL));
+ test_spec_ctrl_access();
+ test_spec_ctrl_vmx_transition();
+
+ return 0;
+}
--
2.40.0

2023-04-14 09:57:06

by Binbin Wu

[permalink] [raw]
Subject: Re: [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization


On 4/14/2023 2:25 PM, Chao Gao wrote:
> Changes since RFC v1:
> * add two kselftests (patch 10-11)
> * set virtual MSRs also on APs [Pawan]
> * enable "virtualize IA32_SPEC_CTRL" for L2 to prevent L2 from changing
> some bits of IA32_SPEC_CTRL (patch 4)
> * other misc cleanup and cosmetic changes
>
> RFC v1: https://lore.kernel.org/lkml/[email protected]/
>
>
> This series introduces "virtualize IA32_SPEC_CTRL" support. Here are
> introduction and use cases of this new feature.
>
> ### Virtualize IA32_SPEC_CTRL
>
> "Virtualize IA32_SPEC_CTRL" [1] is a new VMX feature on Intel CPUs. This feature
> allows VMM to lock some bits of IA32_SPEC_CTRL MSR even when the MSR is
> pass-thru'd to a guest.
>
>
> ### Use cases of "virtualize IA32_SPEC_CTRL" [2]
>
> Software mitigations like Retpoline and software BHB-clearing sequence depend on
> CPU microarchitectures. And guest cannot know exactly the underlying
> microarchitecture. When a guest is migrated between processors of different
> microarchitectures, software mitigations which work perfectly on previous
> microachitecture may be not effective on the new one. To fix the problem, some
> hardware mitigations should be used in conjunction with software mitigations.

So even when the hardware mitigations are enabled, the software mitigations
are still needed, right?


> Using virtual IA32_SPEC_CTRL, VMM can enforce hardware mitigations transparently
> to guests and avoid those hardware mitigations being unintentionally disabled
> when guest changes IA32_SPEC_CTRL MSR.
>
>
> ### Intention of this series
>
> This series adds the capability of enforcing hardware mitigations for guests
> transparently and efficiently (i.e., without intecepting IA32_SPEC_CTRL MSR

/s/intecepting/intercepting


> accesses) to kvm. The capability can be used to solve the VM migration issue in
> a pool consisting of processors of different microarchitectures.
>
> Specifically, below are two target scenarios of this series:
>
> Scenario 1: If retpoline is used by a VM to mitigate IMBTI in CPL0, VMM can set
> RRSBA_DIS_S on parts enumerates RRSBA. Note that the VM is presented
> with a microarchitecture doesn't enumerate RRSBA.
>
> Scenario 2: If a VM uses software BHB-clearing sequence on transitions into CPL0
> to mitigate BHI, VMM can use "virtualize IA32_SPEC_CTRL" to set
> BHI_DIS_S on new parts which doesn't enumerate BHI_NO.
>
> Intel defines some virtual MSRs [2] for guests to report in-use software
> mitigations. This allows guests to opt in VMM's deploying hardware mitigations
> for them if the guests are either running or later migrated to a system on which
> in-use software mitigations are not effective. The virtual MSRs interface is
> also added in this series.
>
> ### Organization of this series
>
> 1. Patch 1-3 Advertise RRSBA_CTRL and BHI_CTRL to guest
> 2. Patch 4 Add "virtualize IA32_SPEC_CTRL" support
> 3. Patch 5-9 Allow guests to report in-use software mitigations to KVM so
> that KVM can enable hardware mitigations for guests.
> 4. Patch 10-11 Add kselftest for virtual MSRs and IA32_SPEC_CTRL
>
> [1]: https://cdrdv2.intel.com/v1/dl/getContent/671368 Ref. #319433-047 Chapter 12
> [2]: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html
>
> Chao Gao (3):
> KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT
> KVM: selftests: Add tests for virtual enumeration/mitigation MSRs
> KVM: selftests: Add tests for IA32_SPEC_CTRL MSR
>
> Pawan Gupta (1):
> x86/bugs: Use Virtual MSRs to request hardware mitigations
>
> Zhang Chen (7):
> x86/msr-index: Add bit definitions for BHI_DIS_S and BHI_NO
> KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support
> KVM: x86: Advertise BHI_CTRL support
> KVM: VMX: Add IA32_SPEC_CTRL virtualization support
> KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support
> KVM: VMX: Advertise MITIGATION_CTRL support
> KVM: VMX: Advertise MITI_CTRL_BHB_CLEAR_SEQ_S_SUPPORT
>
> arch/x86/include/asm/msr-index.h | 33 +++-
> arch/x86/include/asm/vmx.h | 5 +
> arch/x86/include/asm/vmxfeatures.h | 2 +
> arch/x86/kernel/cpu/bugs.c | 25 +++
> arch/x86/kvm/cpuid.c | 22 ++-
> arch/x86/kvm/reverse_cpuid.h | 8 +
> arch/x86/kvm/svm/svm.c | 3 +
> arch/x86/kvm/vmx/capabilities.h | 5 +
> arch/x86/kvm/vmx/nested.c | 13 ++
> arch/x86/kvm/vmx/vmcs.h | 2 +
> arch/x86/kvm/vmx/vmx.c | 112 ++++++++++-
> arch/x86/kvm/vmx/vmx.h | 43 ++++-
> arch/x86/kvm/x86.c | 19 +-
> tools/arch/x86/include/asm/msr-index.h | 37 +++-
> tools/testing/selftests/kvm/Makefile | 2 +
> .../selftests/kvm/include/x86_64/processor.h | 5 +
> .../selftests/kvm/x86_64/spec_ctrl_msr_test.c | 178 ++++++++++++++++++
> .../kvm/x86_64/virtual_mitigation_msr_test.c | 175 +++++++++++++++++
> 18 files changed, 676 insertions(+), 13 deletions(-)
> create mode 100644 tools/testing/selftests/kvm/x86_64/spec_ctrl_msr_test.c
> create mode 100644 tools/testing/selftests/kvm/x86_64/virtual_mitigation_msr_test.c
>
>
> base-commit: 400d2132288edbd6d500f45eab5d85526ca94e46

2023-04-14 09:57:38

by Binbin Wu

[permalink] [raw]
Subject: Re: [RFC PATCH v2 01/11] x86/msr-index: Add bit definitions for BHI_DIS_S and BHI_NO


On 4/14/2023 2:25 PM, Chao Gao wrote:
> From: Zhang Chen <[email protected]>
>
> To ensure VM migration from a system where software mitigation works to
> a system where it doesn't won't harm guest's security level, KVM must
> mitigate BHI attacks for guests since migration is transparent to guests
> and guests won't and can't react to VM migration.
>
> For example, simple BHB clear sequence [1] is effective in mitigating BHI
> attacks on processors prior to Alder Lake, but it is not on Alder Lake.
> Guests migrated from prior to Alder Lake host to Alder Lake host become
> vulnerable to BHI attacks even if the simmple BHB clear sequence is

/s/simmple/simple


> deployed. In this case, KVM can enable hardware mitigation for guests by
> setting BHI_DIS_S bit of IA32_SPEC_CTRL MSR.
>
> Define the SPEC_CTRL_BHI_DIS_S of IA32_SPEC_CTRL MSR and BHI_NO bits in
> arch_capabilities, which will be used by KVM later.
>
> [1]: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html#inpage-nav-2-4
>
> Signed-off-by: Zhang Chen <[email protected]>
> Co-developed-by: Chao Gao <[email protected]>
> Signed-off-by: Chao Gao <[email protected]>
> Tested-by: Jiaan Lu <[email protected]>
> ---
> arch/x86/include/asm/msr-index.h | 8 +++++++-
> tools/arch/x86/include/asm/msr-index.h | 8 +++++++-
> 2 files changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index ad35355ee43e..60b25d87b82c 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -48,8 +48,10 @@
> #define SPEC_CTRL_STIBP BIT(SPEC_CTRL_STIBP_SHIFT) /* STIBP mask */
> #define SPEC_CTRL_SSBD_SHIFT 2 /* Speculative Store Bypass Disable bit */
> #define SPEC_CTRL_SSBD BIT(SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */
> -#define SPEC_CTRL_RRSBA_DIS_S_SHIFT 6 /* Disable RRSBA behavior */
> +#define SPEC_CTRL_RRSBA_DIS_S_SHIFT 6 /* Disable RRSBA behavior in supervisor mode */
> #define SPEC_CTRL_RRSBA_DIS_S BIT(SPEC_CTRL_RRSBA_DIS_S_SHIFT)
> +#define SPEC_CTRL_BHI_DIS_S_SHIFT 10 /* Disable BHI behavior in supervisor mode */
> +#define SPEC_CTRL_BHI_DIS_S BIT(SPEC_CTRL_BHI_DIS_S_SHIFT)
>
> /* A mask for bits which the kernel toggles when controlling mitigations */
> #define SPEC_CTRL_MITIGATIONS_MASK (SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD \
> @@ -151,6 +153,10 @@
> * are restricted to targets in
> * kernel.
> */
> +#define ARCH_CAP_BHI_NO BIT(20) /*
> + * Not susceptible to Branch History
> + * Injection.
> + */
> #define ARCH_CAP_PBRSB_NO BIT(24) /*
> * Not susceptible to Post-Barrier
> * Return Stack Buffer Predictions.
> diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
> index ad35355ee43e..6079a5fdb40b 100644
> --- a/tools/arch/x86/include/asm/msr-index.h
> +++ b/tools/arch/x86/include/asm/msr-index.h
> @@ -48,8 +48,10 @@
> #define SPEC_CTRL_STIBP BIT(SPEC_CTRL_STIBP_SHIFT) /* STIBP mask */
> #define SPEC_CTRL_SSBD_SHIFT 2 /* Speculative Store Bypass Disable bit */
> #define SPEC_CTRL_SSBD BIT(SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */
> -#define SPEC_CTRL_RRSBA_DIS_S_SHIFT 6 /* Disable RRSBA behavior */
> +#define SPEC_CTRL_RRSBA_DIS_S_SHIFT 6 /* Disable RRSBA behavior in supervisor mode */
> #define SPEC_CTRL_RRSBA_DIS_S BIT(SPEC_CTRL_RRSBA_DIS_S_SHIFT)
> +#define SPEC_CTRL_BHI_DIS_S_SHIFT 10 /* Disable BHI behavior in supervisor mode */
> +#define SPEC_CTRL_BHI_DIS_S BIT(SPEC_CTRL_BHI_DIS_S_SHIFT)
>
> /* A mask for bits which the kernel toggles when controlling mitigations */
> #define SPEC_CTRL_MITIGATIONS_MASK (SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD \
> @@ -151,6 +153,10 @@
> * are restricted to targets in
> * kernel.
> */
> +#define ARCH_CAP_BHI_NO BIT(20) /*
> + * Not susceptible to Branch History
> + * Injection.
> + */
> #define ARCH_CAP_PBRSB_NO BIT(24) /*
> * Not susceptible to Post-Barrier
> * Return Stack Buffer Predictions.

2023-04-14 22:12:54

by Pawan Gupta

[permalink] [raw]
Subject: Re: [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization

On Fri, Apr 14, 2023 at 05:51:43PM +0800, Binbin Wu wrote:
>
> On 4/14/2023 2:25 PM, Chao Gao wrote:
> > Changes since RFC v1:
> > * add two kselftests (patch 10-11)
> > * set virtual MSRs also on APs [Pawan]
> > * enable "virtualize IA32_SPEC_CTRL" for L2 to prevent L2 from changing
> > some bits of IA32_SPEC_CTRL (patch 4)
> > * other misc cleanup and cosmetic changes
> >
> > RFC v1: https://lore.kernel.org/lkml/[email protected]/
> >
> >
> > This series introduces "virtualize IA32_SPEC_CTRL" support. Here are
> > introduction and use cases of this new feature.
> >
> > ### Virtualize IA32_SPEC_CTRL
> >
> > "Virtualize IA32_SPEC_CTRL" [1] is a new VMX feature on Intel CPUs. This feature
> > allows VMM to lock some bits of IA32_SPEC_CTRL MSR even when the MSR is
> > pass-thru'd to a guest.
> >
> >
> > ### Use cases of "virtualize IA32_SPEC_CTRL" [2]
> >
> > Software mitigations like Retpoline and software BHB-clearing sequence depend on
> > CPU microarchitectures. And guest cannot know exactly the underlying
> > microarchitecture. When a guest is migrated between processors of different
> > microarchitectures, software mitigations which work perfectly on previous
> > microachitecture may be not effective on the new one. To fix the problem, some
> > hardware mitigations should be used in conjunction with software mitigations.
>
> So even if the hardware mitigations are enabled, the software mitigations
> are still needed, right?

The retpoline mitigation is not fully effective when a RET can take a
prediction from an alternate predictor. Newer hardware provides a way to
disable this behavior (the RRSBA_DIS_S bit in MSR SPEC_CTRL).

eIBRS is the preferred way to mitigate BTI, but if a guest has for some
reason deployed retpoline, the VMM can make it more effective by
deploying the relevant hardware control. That is why the above text
says:

"... hardware mitigations should be used in conjunction with software
mitigations."

2023-04-17 03:22:01

by Binbin Wu

[permalink] [raw]
Subject: Re: [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support


On 4/14/2023 2:25 PM, Chao Gao wrote:
> From: Zhang Chen <[email protected]>
>
> Currently KVM disables interception of IA32_SPEC_CTRL after a non-0 is
> written to IA32_SPEC_CTRL by guest. Then, guest is allowed to write any
> value to hardware.
>
> "virtualize IA32_SPEC_CTRL" is a new tertiary vm-exec control. This
> feature allows KVM to specify that certain bits of the IA32_SPEC_CTRL
> MSR cannot be modified by guest software.
>
> Two VMCS fields are added:
>
> IA32_SPEC_CTRL_MASK: bits that guest software cannot modify
> IA32_SPEC_CTRL_SHADOW: value that guest software expects to be in the
> IA32_SPEC_CTRL MSR
>
> On rdmsr, the shadow value is returned. on wrmsr, EDX:EAX is written
> to the IA32_SPEC_CTRL_SHADOW and (cur_val & mask) | (EDX:EAX & ~mask)
> is written to the IA32_SPEC_CTRL MSR, where
> * cur_val is the original value of IA32_SPEC_CTRL MSR
> * mask is the value of IA32_SPEC_CTRL_MASK
>
> Add a mask e.g.,

e.g. or i.e. ?


> loaded_vmcs->spec_ctrl_mask to represent the bits guest
> shouldn't change. It is 0 for now and some bits will be added by
> following patches. Use per-vmcs cache to avoid unnecessary vmcs_write()
> on nested transition because the mask is expected to be rarely changed
> and the same for vmcs01 and vmcs02.
>
> To prevent guest from changing the bits in the mask, enable "virtualize
> IA32_SPEC_CTRL" if supported or emulate its behavior by intercepting
> the IA32_SPEC_CTRL msr. Emulating "virtualize IA32_SPEC_CTRL" behavior
> is mainly to give the same capability to KVM running on potential broken
> hardware or L1 guests.
>
> To avoid L2 evading the enforcement, enable "virtualize IA32_SPEC_CTRL"
> in vmcs02. Always update the guest (shadow) value of IA32_SPEC_CTRL MSR
> and the mask to preserve them across nested transitions. Note that the
> shadow value may be changed because L2 may access the IA32_SPEC_CTRL
> directly and the mask may be changed due to migration when L2 vCPUs are
> running.
>
> Co-developed-by: Chao Gao <[email protected]>
> Signed-off-by: Chao Gao <[email protected]>
> Signed-off-by: Zhang Chen <[email protected]>
> Signed-off-by: Chao Gao <[email protected]>
> Tested-by: Jiaan Lu <[email protected]>
> ---
> arch/x86/include/asm/vmx.h | 5 ++++
> arch/x86/include/asm/vmxfeatures.h | 2 ++
> arch/x86/kvm/vmx/capabilities.h | 5 ++++
> arch/x86/kvm/vmx/nested.c | 13 ++++++++++
> arch/x86/kvm/vmx/vmcs.h | 2 ++
> arch/x86/kvm/vmx/vmx.c | 34 ++++++++++++++++++++-----
> arch/x86/kvm/vmx/vmx.h | 40 +++++++++++++++++++++++++++++-
> 7 files changed, 94 insertions(+), 7 deletions(-)
>
[...]

> @@ -750,4 +766,26 @@ static inline bool guest_cpuid_has_evmcs(struct kvm_vcpu *vcpu)
> to_vmx(vcpu)->nested.enlightened_vmcs_enabled;
> }
>
> +static inline u64 vmx_get_guest_spec_ctrl(struct vcpu_vmx *vmx)
> +{
> + return vmx->guest_spec_ctrl;
> +}
> +
> +static inline void vmx_set_guest_spec_ctrl(struct vcpu_vmx *vmx, u64 val)
> +{
> + vmx->guest_spec_ctrl = val;
> +
> + /*
> + * For simplicity, always keep IA32_SPEC_CTRL_SHADOW up-to-date,
> + * regardless of the MSR intercept state.

It is better to use "IA32_SPEC_CTRL" explicitly instead of "the MSR" to
avoid misunderstanding.


> + */
> + if (cpu_has_spec_ctrl_virt())
> + vmcs_write64(IA32_SPEC_CTRL_SHADOW, val);
> +
> + /*
> + * Update the effective value of IA32_SPEC_CTRL to reflect changes to
> + * guest's IA32_SPEC_CTRL. Bits in the mask should always be set.
> + */
> + vmx->spec_ctrl = val | vmx_get_spec_ctrl_mask(vmx);
> +}
> #endif /* __KVM_X86_VMX_H */

2023-04-17 06:51:02

by Chenyi Qiang

[permalink] [raw]
Subject: Re: [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support



On 4/14/2023 2:25 PM, Chao Gao wrote:
> From: Zhang Chen <[email protected]>
>
> Currently KVM disables interception of IA32_SPEC_CTRL after a non-0 is
> written to IA32_SPEC_CTRL by guest. Then, guest is allowed to write any
> value to hardware.
>
> "virtualize IA32_SPEC_CTRL" is a new tertiary vm-exec control. This
> feature allows KVM to specify that certain bits of the IA32_SPEC_CTRL
> MSR cannot be modified by guest software.
>
> Two VMCS fields are added:
>
> IA32_SPEC_CTRL_MASK: bits that guest software cannot modify
> IA32_SPEC_CTRL_SHADOW: value that guest software expects to be in the
> IA32_SPEC_CTRL MSR
>
> On rdmsr, the shadow value is returned. on wrmsr, EDX:EAX is written
> to the IA32_SPEC_CTRL_SHADOW and (cur_val & mask) | (EDX:EAX & ~mask)
> is written to the IA32_SPEC_CTRL MSR, where
> * cur_val is the original value of IA32_SPEC_CTRL MSR
> * mask is the value of IA32_SPEC_CTRL_MASK
>
> Add a mask e.g., loaded_vmcs->spec_ctrl_mask to represent the bits guest
> shouldn't change. It is 0 for now and some bits will be added by
> following patches. Use per-vmcs cache to avoid unnecessary vmcs_write()
> on nested transition because the mask is expected to be rarely changed
> and the same for vmcs01 and vmcs02.
>
> To prevent guest from changing the bits in the mask, enable "virtualize
> IA32_SPEC_CTRL" if supported or emulate its behavior by intercepting
> the IA32_SPEC_CTRL msr. Emulating "virtualize IA32_SPEC_CTRL" behavior
> is mainly to give the same capability to KVM running on potential broken
> hardware or L1 guests.
>
> To avoid L2 evading the enforcement, enable "virtualize IA32_SPEC_CTRL"
> in vmcs02. Always update the guest (shadow) value of IA32_SPEC_CTRL MSR
> and the mask to preserve them across nested transitions. Note that the
> shadow value may be changed because L2 may access the IA32_SPEC_CTRL
> directly and the mask may be changed due to migration when L2 vCPUs are
> running.
>
> Co-developed-by: Chao Gao <[email protected]>
> Signed-off-by: Chao Gao <[email protected]>
> Signed-off-by: Zhang Chen <[email protected]>
> Signed-off-by: Chao Gao <[email protected]>

Duplicated SOB. Move the Co-developed-by down like other patches.

> Tested-by: Jiaan Lu <[email protected]>
> ---
> arch/x86/include/asm/vmx.h | 5 ++++
> arch/x86/include/asm/vmxfeatures.h | 2 ++
> arch/x86/kvm/vmx/capabilities.h | 5 ++++
> arch/x86/kvm/vmx/nested.c | 13 ++++++++++
> arch/x86/kvm/vmx/vmcs.h | 2 ++
> arch/x86/kvm/vmx/vmx.c | 34 ++++++++++++++++++++-----
> arch/x86/kvm/vmx/vmx.h | 40 +++++++++++++++++++++++++++++-
> 7 files changed, 94 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
> index 498dc600bd5c..b9f88ecf20c3 100644
> --- a/arch/x86/include/asm/vmx.h
> +++ b/arch/x86/include/asm/vmx.h
> @@ -81,6 +81,7 @@
> * Definitions of Tertiary Processor-Based VM-Execution Controls.
> */
> #define TERTIARY_EXEC_IPI_VIRT VMCS_CONTROL_BIT(IPI_VIRT)
> +#define TERTIARY_EXEC_SPEC_CTRL_VIRT VMCS_CONTROL_BIT(SPEC_CTRL_VIRT)
>
> #define PIN_BASED_EXT_INTR_MASK VMCS_CONTROL_BIT(INTR_EXITING)
> #define PIN_BASED_NMI_EXITING VMCS_CONTROL_BIT(NMI_EXITING)
> @@ -233,6 +234,10 @@ enum vmcs_field {
> TERTIARY_VM_EXEC_CONTROL_HIGH = 0x00002035,
> PID_POINTER_TABLE = 0x00002042,
> PID_POINTER_TABLE_HIGH = 0x00002043,
> + IA32_SPEC_CTRL_MASK = 0x0000204A,
> + IA32_SPEC_CTRL_MASK_HIGH = 0x0000204B,
> + IA32_SPEC_CTRL_SHADOW = 0x0000204C,
> + IA32_SPEC_CTRL_SHADOW_HIGH = 0x0000204D,
> GUEST_PHYSICAL_ADDRESS = 0x00002400,
> GUEST_PHYSICAL_ADDRESS_HIGH = 0x00002401,
> VMCS_LINK_POINTER = 0x00002800,
> diff --git a/arch/x86/include/asm/vmxfeatures.h b/arch/x86/include/asm/vmxfeatures.h
> index c6a7eed03914..c70d0769b7d0 100644
> --- a/arch/x86/include/asm/vmxfeatures.h
> +++ b/arch/x86/include/asm/vmxfeatures.h
> @@ -89,4 +89,6 @@
>
> /* Tertiary Processor-Based VM-Execution Controls, word 3 */
> #define VMX_FEATURE_IPI_VIRT ( 3*32+ 4) /* Enable IPI virtualization */
> +#define VMX_FEATURE_SPEC_CTRL_VIRT ( 3*32+ 7) /* Enable IA32_SPEC_CTRL virtualization */
> +
> #endif /* _ASM_X86_VMXFEATURES_H */
> diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
> index 45162c1bcd8f..a7ab70b55acf 100644
> --- a/arch/x86/kvm/vmx/capabilities.h
> +++ b/arch/x86/kvm/vmx/capabilities.h
> @@ -138,6 +138,11 @@ static inline bool cpu_has_tertiary_exec_ctrls(void)
> CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
> }
>
> +static __always_inline bool cpu_has_spec_ctrl_virt(void)

Do we need to use __always_inline to force inlining, or should we just
align with the other cpu_has_xxx() functions and use the plain inline
annotation?

> +{
> + return vmcs_config.cpu_based_3rd_exec_ctrl & TERTIARY_EXEC_SPEC_CTRL_VIRT;
> +}
> +
> static inline bool cpu_has_vmx_virtualize_apic_accesses(void)
> {
> return vmcs_config.cpu_based_2nd_exec_ctrl &

2023-04-17 07:37:23

by Chao Gao

[permalink] [raw]
Subject: Re: [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support

On Mon, Apr 17, 2023 at 02:48:56PM +0800, Chenyi Qiang wrote:
>> Co-developed-by: Chao Gao <[email protected]>
>> Signed-off-by: Chao Gao <[email protected]>
>> Signed-off-by: Zhang Chen <[email protected]>
>> Signed-off-by: Chao Gao <[email protected]>
>
>Duplicated SOB. Move the Co-developed-by down like other patches.
>

OK. Will do

>> Tested-by: Jiaan Lu <[email protected]>
>> ---
>> arch/x86/include/asm/vmx.h | 5 ++++
>> arch/x86/include/asm/vmxfeatures.h | 2 ++
>> arch/x86/kvm/vmx/capabilities.h | 5 ++++
>> arch/x86/kvm/vmx/nested.c | 13 ++++++++++
>> arch/x86/kvm/vmx/vmcs.h | 2 ++
>> arch/x86/kvm/vmx/vmx.c | 34 ++++++++++++++++++++-----
>> arch/x86/kvm/vmx/vmx.h | 40 +++++++++++++++++++++++++++++-
>> 7 files changed, 94 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
>> index 498dc600bd5c..b9f88ecf20c3 100644
>> --- a/arch/x86/include/asm/vmx.h
>> +++ b/arch/x86/include/asm/vmx.h
>> @@ -81,6 +81,7 @@
>> * Definitions of Tertiary Processor-Based VM-Execution Controls.
>> */
>> #define TERTIARY_EXEC_IPI_VIRT VMCS_CONTROL_BIT(IPI_VIRT)
>> +#define TERTIARY_EXEC_SPEC_CTRL_VIRT VMCS_CONTROL_BIT(SPEC_CTRL_VIRT)
>>
>> #define PIN_BASED_EXT_INTR_MASK VMCS_CONTROL_BIT(INTR_EXITING)
>> #define PIN_BASED_NMI_EXITING VMCS_CONTROL_BIT(NMI_EXITING)
>> @@ -233,6 +234,10 @@ enum vmcs_field {
>> TERTIARY_VM_EXEC_CONTROL_HIGH = 0x00002035,
>> PID_POINTER_TABLE = 0x00002042,
>> PID_POINTER_TABLE_HIGH = 0x00002043,
>> + IA32_SPEC_CTRL_MASK = 0x0000204A,
>> + IA32_SPEC_CTRL_MASK_HIGH = 0x0000204B,
>> + IA32_SPEC_CTRL_SHADOW = 0x0000204C,
>> + IA32_SPEC_CTRL_SHADOW_HIGH = 0x0000204D,
>> GUEST_PHYSICAL_ADDRESS = 0x00002400,
>> GUEST_PHYSICAL_ADDRESS_HIGH = 0x00002401,
>> VMCS_LINK_POINTER = 0x00002800,
>> diff --git a/arch/x86/include/asm/vmxfeatures.h b/arch/x86/include/asm/vmxfeatures.h
>> index c6a7eed03914..c70d0769b7d0 100644
>> --- a/arch/x86/include/asm/vmxfeatures.h
>> +++ b/arch/x86/include/asm/vmxfeatures.h
>> @@ -89,4 +89,6 @@
>>
>> /* Tertiary Processor-Based VM-Execution Controls, word 3 */
>> #define VMX_FEATURE_IPI_VIRT ( 3*32+ 4) /* Enable IPI virtualization */
>> +#define VMX_FEATURE_SPEC_CTRL_VIRT ( 3*32+ 7) /* Enable IA32_SPEC_CTRL virtualization */
>> +
>> #endif /* _ASM_X86_VMXFEATURES_H */
>> diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
>> index 45162c1bcd8f..a7ab70b55acf 100644
>> --- a/arch/x86/kvm/vmx/capabilities.h
>> +++ b/arch/x86/kvm/vmx/capabilities.h
>> @@ -138,6 +138,11 @@ static inline bool cpu_has_tertiary_exec_ctrls(void)
>> CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
>> }
>>
>> +static __always_inline bool cpu_has_spec_ctrl_virt(void)
>
>Do we need to use __always_inline to force inlining, or should we just
>align with the other cpu_has_xxx() functions and use the plain inline
>annotation?

__always_inline is needed because this function is called from
vmx_spec_ctrl_restore_host(), which is in the .noinstr section. With just
inline, objtool may complain:

vmlinux.o: warning: objtool: vmx_spec_ctrl_restore_host+0xb3: call to cpu_has_spec_ctrl_virt() leaves .noinstr.text section
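
For reference, a simplified sketch of the constraint (the caller's name and
body below are hypothetical; the real caller is vmx_spec_ctrl_restore_host()
on the VM-exit path):

        /* .noinstr.text: every callee must be noinstr or fully inlined */
        static noinstr void restore_host_spec_ctrl(void)
        {
                if (cpu_has_spec_ctrl_virt())   /* must fold in at build time;
                                                 * an out-of-line call trips
                                                 * the objtool warning above */
                        native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);   /* illustrative */
        }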

>
>> +{
>> + return vmcs_config.cpu_based_3rd_exec_ctrl & TERTIARY_EXEC_SPEC_CTRL_VIRT;
>> +}
>> +
>> static inline bool cpu_has_vmx_virtualize_apic_accesses(void)
>> {
>> return vmcs_config.cpu_based_2nd_exec_ctrl &
>

2023-04-17 13:59:56

by Binbin Wu

[permalink] [raw]
Subject: Re: [RFC PATCH v2 05/11] x86/bugs: Use Virtual MSRs to request hardware mitigations


On 4/14/2023 2:25 PM, Chao Gao wrote:
> From: Pawan Gupta <[email protected]>
>
> Guests that have different family/model than the host may not be aware
> of hardware mitigations(such as RRSBA_DIS_S) available on host. This is
> particularly true when guests migrate. To solve this problem Intel
> processors have added a virtual MSR interface through which guests can
> report their mitigation status and request VMM to deploy relevant
> hardware mitigations.
>
> Use this virtualized MSR interface to request relevant hardware controls
> for retpoline mitigation.
>
> Signed-off-by: Pawan Gupta <[email protected]>
> Co-developed-by: Zhang Chen <[email protected]>
> Signed-off-by: Zhang Chen <[email protected]>
> Signed-off-by: Chao Gao <[email protected]>
> Tested-by: Jiaan Lu <[email protected]>
> ---
> arch/x86/include/asm/msr-index.h | 25 +++++++++++++++++++++++++
> arch/x86/kernel/cpu/bugs.c | 25 +++++++++++++++++++++++++
> 2 files changed, 50 insertions(+)
>
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 60b25d87b82c..aec213f0c6fc 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -166,6 +166,7 @@
> * IA32_XAPIC_DISABLE_STATUS MSR
> * supported
> */
> +#define ARCH_CAP_VIRTUAL_ENUM BIT_ULL(63) /* MSR_VIRTUAL_ENUMERATION supported */
>
> #define MSR_IA32_FLUSH_CMD 0x0000010b
> #define L1D_FLUSH BIT(0) /*
> @@ -1103,6 +1104,30 @@
> #define MSR_IA32_VMX_MISC_INTEL_PT (1ULL << 14)
> #define MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS (1ULL << 29)
> #define MSR_IA32_VMX_MISC_PREEMPTION_TIMER_SCALE 0x1F
> +
> +/* Intel virtual MSRs */
> +#define MSR_VIRTUAL_ENUMERATION 0x50000000
> +#define VIRT_ENUM_MITIGATION_CTRL_SUPPORT BIT(0) /*
> + * Mitigation ctrl via virtual
> + * MSRs supported
> + */
> +
> +#define MSR_VIRTUAL_MITIGATION_ENUM 0x50000001
> +#define MITI_ENUM_BHB_CLEAR_SEQ_S_SUPPORT BIT(0) /* VMM supports BHI_DIS_S */
> +#define MITI_ENUM_RETPOLINE_S_SUPPORT BIT(1) /* VMM supports RRSBA_DIS_S */
> +
> +#define MSR_VIRTUAL_MITIGATION_CTRL 0x50000002
> +#define MITI_CTRL_BHB_CLEAR_SEQ_S_USED_BIT 0 /*
> + * Request VMM to deploy
> + * BHI_DIS_S mitigation
> + */
> +#define MITI_CTRL_BHB_CLEAR_SEQ_S_USED BIT(MITI_CTRL_BHB_CLEAR_SEQ_S_USED_BIT)

Seems it is defined, but not used to request VMM to deploy BHI_DIS_S
mitigation?


And IMO, it is more natural to put this patch after the four capability
advertising patches.


> +#define MITI_CTRL_RETPOLINE_S_USED_BIT 1 /*
> + * Request VMM to deploy
> + * RRSBA_DIS_S mitigation
> + */
> +#define MITI_CTRL_RETPOLINE_S_USED BIT(MITI_CTRL_RETPOLINE_S_USED_BIT)
> +
> /* AMD-V MSRs */
>
> #define MSR_VM_CR 0xc0010114
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index f9d060e71c3e..5326c03d9d5e 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -1435,6 +1435,27 @@ static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_
> dump_stack();
> }
>
> +/* Speculation control using virtualized MSRs */
> +static void spec_ctrl_setup_virtualized_msr(void)
> +{
> + u64 msr_virt_enum, msr_mitigation_enum;
> +
> + /* When retpoline is being used, request relevant hardware controls */
> + if (!boot_cpu_has(X86_FEATURE_RETPOLINE))
> + return;
> +
> + if (!(x86_read_arch_cap_msr() & ARCH_CAP_VIRTUAL_ENUM))
> + return;
> +
> + rdmsrl(MSR_VIRTUAL_ENUMERATION, msr_virt_enum);
> + if (!(msr_virt_enum & VIRT_ENUM_MITIGATION_CTRL_SUPPORT))
> + return;
> +
> + rdmsrl(MSR_VIRTUAL_MITIGATION_ENUM, msr_mitigation_enum);
> + if (msr_mitigation_enum & MITI_ENUM_RETPOLINE_S_SUPPORT)
> + msr_set_bit(MSR_VIRTUAL_MITIGATION_CTRL, MITI_CTRL_RETPOLINE_S_USED_BIT);
> +}
> +
> static void __init spectre_v2_select_mitigation(void)
> {
> enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
> @@ -1546,6 +1567,8 @@ static void __init spectre_v2_select_mitigation(void)
> mode == SPECTRE_V2_RETPOLINE)
> spec_ctrl_disable_kernel_rrsba();
>
> + spec_ctrl_setup_virtualized_msr();
> +
> spectre_v2_enabled = mode;
> pr_info("%s\n", spectre_v2_strings[mode]);
>
> @@ -2115,6 +2138,8 @@ void x86_spec_ctrl_setup_ap(void)
>
> if (ssb_mode == SPEC_STORE_BYPASS_DISABLE)
> x86_amd_ssb_disable();
> +
> + spec_ctrl_setup_virtualized_msr();
> }
>
> bool itlb_multihit_kvm_mitigation;

2023-04-18 02:14:17

by Chao Gao

[permalink] [raw]
Subject: Re: [RFC PATCH v2 05/11] x86/bugs: Use Virtual MSRs to request hardware mitigations

On Mon, Apr 17, 2023 at 09:43:59PM +0800, Binbin Wu wrote:
>
>On 4/14/2023 2:25 PM, Chao Gao wrote:
>> From: Pawan Gupta <[email protected]>
>>
>> Guests that have different family/model than the host may not be aware
>> of hardware mitigations(such as RRSBA_DIS_S) available on host. This is
>> particularly true when guests migrate. To solve this problem Intel
>> processors have added a virtual MSR interface through which guests can
>> report their mitigation status and request VMM to deploy relevant
>> hardware mitigations.
>>
>> Use this virtualized MSR interface to request relevant hardware controls
>> for retpoline mitigation.
>>
>> Signed-off-by: Pawan Gupta <[email protected]>
>> Co-developed-by: Zhang Chen <[email protected]>
>> Signed-off-by: Zhang Chen <[email protected]>
>> Signed-off-by: Chao Gao <[email protected]>
>> Tested-by: Jiaan Lu <[email protected]>
>> ---
>> arch/x86/include/asm/msr-index.h | 25 +++++++++++++++++++++++++
>> arch/x86/kernel/cpu/bugs.c | 25 +++++++++++++++++++++++++
>> 2 files changed, 50 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
>> index 60b25d87b82c..aec213f0c6fc 100644
>> --- a/arch/x86/include/asm/msr-index.h
>> +++ b/arch/x86/include/asm/msr-index.h
>> @@ -166,6 +166,7 @@
>> * IA32_XAPIC_DISABLE_STATUS MSR
>> * supported
>> */
>> +#define ARCH_CAP_VIRTUAL_ENUM BIT_ULL(63) /* MSR_VIRTUAL_ENUMERATION supported */
>> #define MSR_IA32_FLUSH_CMD 0x0000010b
>> #define L1D_FLUSH BIT(0) /*
>> @@ -1103,6 +1104,30 @@
>> #define MSR_IA32_VMX_MISC_INTEL_PT (1ULL << 14)
>> #define MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS (1ULL << 29)
>> #define MSR_IA32_VMX_MISC_PREEMPTION_TIMER_SCALE 0x1F
>> +
>> +/* Intel virtual MSRs */
>> +#define MSR_VIRTUAL_ENUMERATION 0x50000000
>> +#define VIRT_ENUM_MITIGATION_CTRL_SUPPORT BIT(0) /*
>> + * Mitigation ctrl via virtual
>> + * MSRs supported
>> + */
>> +
>> +#define MSR_VIRTUAL_MITIGATION_ENUM 0x50000001
>> +#define MITI_ENUM_BHB_CLEAR_SEQ_S_SUPPORT BIT(0) /* VMM supports BHI_DIS_S */
>> +#define MITI_ENUM_RETPOLINE_S_SUPPORT BIT(1) /* VMM supports RRSBA_DIS_S */
>> +
>> +#define MSR_VIRTUAL_MITIGATION_CTRL 0x50000002
>> +#define MITI_CTRL_BHB_CLEAR_SEQ_S_USED_BIT 0 /*
>> + * Request VMM to deploy
>> + * BHI_DIS_S mitigation
>> + */
>> +#define MITI_CTRL_BHB_CLEAR_SEQ_S_USED BIT(MITI_CTRL_BHB_CLEAR_SEQ_S_USED_BIT)
>
>Seems it is defined, but not used to request VMM to deploy BHI_DIS_S
>mitigation?

Because the Linux kernel doesn't use the BHB-clearing sequence. Instead,
"disable unprivileged eBPF by default" + SMAP + eIBRS are used.

KVM uses this bit when checking whether guests, which may not be running
Linux, are using the BHB-clearing sequence.

>
>
>And IMO, it is more natural to put this patch after the four capability
>advertising patches.

Makes sense. I will organize the series in that order.

2023-04-18 02:16:29

by Chao Gao

[permalink] [raw]
Subject: Re: [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support

On Mon, Apr 17, 2023 at 11:17:36AM +0800, Binbin Wu wrote:
>
>On 4/14/2023 2:25 PM, Chao Gao wrote:
>> From: Zhang Chen <[email protected]>
>>
>> Currently KVM disables interception of IA32_SPEC_CTRL after a non-0 is
>> written to IA32_SPEC_CTRL by guest. Then, guest is allowed to write any
>> value to hardware.
>>
>> "virtualize IA32_SPEC_CTRL" is a new tertiary vm-exec control. This
>> feature allows KVM to specify that certain bits of the IA32_SPEC_CTRL
>> MSR cannot be modified by guest software.
>>
>> Two VMCS fields are added:
>>
>> IA32_SPEC_CTRL_MASK: bits that guest software cannot modify
>> IA32_SPEC_CTRL_SHADOW: value that guest software expects to be in the
>> IA32_SPEC_CTRL MSR
>>
>> On rdmsr, the shadow value is returned. on wrmsr, EDX:EAX is written
>> to the IA32_SPEC_CTRL_SHADOW and (cur_val & mask) | (EDX:EAX & ~mask)
>> is written to the IA32_SPEC_CTRL MSR, where
>> * cur_val is the original value of IA32_SPEC_CTRL MSR
>> * mask is the value of IA32_SPEC_CTRL_MASK
>>
>> Add a mask e.g.,
>
>e.g. or i.e. ?

Yes, here should be "i.e.".

>> +static inline void vmx_set_guest_spec_ctrl(struct vcpu_vmx *vmx, u64 val)
>> +{
>> + vmx->guest_spec_ctrl = val;
>> +
>> + /*
>> + * For simplicity, always keep IA32_SPEC_CTRL_SHADOW up-to-date,
>> + * regardless of the MSR intercept state.
>
>It is better to use "IA32_SPEC_CTRL" explicitly instead of "the MSR" to
>avoid misunderstanding.

Agreed. Will do.

2023-05-15 07:31:04

by Xiaoyao Li

[permalink] [raw]
Subject: Re: [RFC PATCH v2 03/11] KVM: x86: Advertise BHI_CTRL support

On 4/14/2023 2:25 PM, Chao Gao wrote:
> From: Zhang Chen <[email protected]>
>
> Add 100% kvm-only feature for BHI_CTRL because the kernel doesn't use it
> at all.
>
> BHI_CTRL is enumerated by CPUID.7.2.EDX[4]. If supported, BHI_DIS_S (bit
> 10) of IA32_SPEC_CTRL MSR can be used to enable BHI_DIS_S behavior.
>
> Note that KVM does not intercept guests' IA32_SPEC_CTRL MSR accesses
> after a non-zero is written to the MSR. Therefore, guests can already
> toggle the BHI_DIS_S bit if the host supports BHI_CTRL, and no extra
> code is needed to allow guests to toggle the bit.

Same as Patch 2, please first fix virtualization of MSR_IA32_SPEC_CTRL.
Otherwise this patch makes no sense. E.g., if only
X86_FEATURE_BHI_CTRL is exposed to the guest without any other CPUID bits
related to MSR_IA32_SPEC_CTRL, then the guest cannot write
MSR_IA32_SPEC_CTRL at all.
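
For context, a guest's access to IA32_SPEC_CTRL is gated on CPUID; a rough,
paraphrased sketch of that gate (the helper name here is invented for
illustration, it is not the actual vmx.c code):

        /* guest may access IA32_SPEC_CTRL iff some enumerating bit is set */
        static bool guest_may_access_spec_ctrl(struct kvm_vcpu *vcpu)
        {
                return guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
                       guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) ||
                       guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD);
        }

So exposing only X86_FEATURE_BHI_CTRL leaves every guest rdmsr/wrmsr of
IA32_SPEC_CTRL failing with #GP.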

> Signed-off-by: Zhang Chen <[email protected]>
> Signed-off-by: Chao Gao <[email protected]>
> Tested-by: Jiaan Lu <[email protected]>
> ---
> arch/x86/kvm/cpuid.c | 2 +-
> arch/x86/kvm/reverse_cpuid.h | 1 +
> 2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index f024c3ac2203..7cdd859d09a2 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -686,7 +686,7 @@ void kvm_set_cpu_caps(void)
> );
>
> kvm_cpu_cap_init_kvm_defined(CPUID_7_2_EDX,
> - SF(RRSBA_CTRL)
> + SF(RRSBA_CTRL) | F(BHI_CTRL)
> );
>
> kvm_cpu_cap_mask(CPUID_8000_0001_ECX,
> diff --git a/arch/x86/kvm/reverse_cpuid.h b/arch/x86/kvm/reverse_cpuid.h
> index 72bad8314a9c..e7e70c9aa384 100644
> --- a/arch/x86/kvm/reverse_cpuid.h
> +++ b/arch/x86/kvm/reverse_cpuid.h
> @@ -50,6 +50,7 @@ enum kvm_only_cpuid_leafs {
>
> /* Intel-defined sub-features, CPUID level 0x00000007:2 (EDX) */
> #define KVM_X86_FEATURE_RRSBA_CTRL KVM_X86_FEATURE(CPUID_7_2_EDX, 2)
> +#define X86_FEATURE_BHI_CTRL KVM_X86_FEATURE(CPUID_7_2_EDX, 4)
>
> struct cpuid_reg {
> u32 function;


2023-05-16 07:25:26

by Xiaoyao Li

[permalink] [raw]
Subject: Re: [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support

On 4/14/2023 2:25 PM, Chao Gao wrote:

...

> +static inline void vmx_set_guest_spec_ctrl(struct vcpu_vmx *vmx, u64 val)
> +{
> + vmx->guest_spec_ctrl = val;
> +
> + /*
> + * For simplicity, always keep IA32_SPEC_CTRL_SHADOW up-to-date,
> + * regardless of the MSR intercept state.
> + */
> + if (cpu_has_spec_ctrl_virt())
> + vmcs_write64(IA32_SPEC_CTRL_SHADOW, val);
> +
> + /*
> + * Update the effective value of IA32_SPEC_CTRL to reflect changes to
> + * guest's IA32_SPEC_CTRL. Bits in the mask should always be set.
> + */

Why should bits in the mask always be set?

A bit set in the mask only means it cannot be modified by the guest.
KVM can use the mask to force the bits to 0 as well.

> + vmx->spec_ctrl = val | vmx_get_spec_ctrl_mask(vmx);
> +}
> #endif /* __KVM_X86_VMX_H */


2023-05-16 09:26:11

by Chao Gao

[permalink] [raw]
Subject: Re: [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support

On Tue, May 16, 2023 at 03:16:59PM +0800, Xiaoyao Li wrote:
>On 4/14/2023 2:25 PM, Chao Gao wrote:
>
>...
>
>> +static inline void vmx_set_guest_spec_ctrl(struct vcpu_vmx *vmx, u64 val)
>> +{
>> + vmx->guest_spec_ctrl = val;
>> +
>> + /*
>> + * For simplicity, always keep IA32_SPEC_CTRL_SHADOW up-to-date,
>> + * regardless of the MSR intercept state.
>> + */
>> + if (cpu_has_spec_ctrl_virt())
>> + vmcs_write64(IA32_SPEC_CTRL_SHADOW, val);
>> +
>> + /*
>> + * Update the effective value of IA32_SPEC_CTRL to reflect changes to
>> + * guest's IA32_SPEC_CTRL. Bits in the mask should always be set.
>> + */
>
>Why should bits in the mask always be set?
>
>A bit set in the mask only means it cannot be modified by the guest. KVM can
>use the mask to force the bits to 0 as well.

Yes.

Because there is no use case for VMMs to lock some bits to 0 behind guests, this
isn't used in this series. There was a note in v1's changelog [1]:

Note "virtual IA32_SPEC_CTRL" is now used by VMM to enforce some bits
of IA32_SPEC_CTRL to 1 (i.e., enabled some HW mitigations transparently
for guests). In theory, VMM can disable some HW mitigations behind guests.
But to keep this series simple, we leave that for future work.


But somehow I dropped it (when I tried to slim down the changelog). Will add it
back and add a comment above the definition of spec_ctrl_mask.

[1]: https://lore.kernel.org/lkml/[email protected]/
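
As a concrete illustration of the lock-to-1 usage described above (values
invented for the example, applying the wrmsr behavior quoted from the patch):

        u64 mask    = SPEC_CTRL_RRSBA_DIS_S;    /* bits the guest must not clear */
        u64 cur_val = SPEC_CTRL_RRSBA_DIS_S;    /* current hardware value        */
        u64 wrval   = SPEC_CTRL_IBRS;           /* value the guest writes        */

        /* what the CPU puts into the real MSR on the guest's wrmsr */
        u64 hw = (cur_val & mask) | (wrval & ~mask);    /* IBRS | RRSBA_DIS_S */

        /* a later rdmsr in the guest returns the shadow, i.e. just IBRS */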

>
>> + vmx->spec_ctrl = val | vmx_get_spec_ctrl_mask(vmx);
>> +}
>> #endif /* __KVM_X86_VMX_H */
>

2023-05-18 10:35:27

by Xiaoyao Li

[permalink] [raw]
Subject: Re: [RFC PATCH v2 08/11] KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT

On 4/14/2023 2:25 PM, Chao Gao wrote:
> Allow guest to query if the underying VMM understands Retpoline and
> report the statue of Retpoline in suprevisor mode i.e. CPL < 3 via
> MSR_VIRTUAL_MITIGATION_ENUM/CTRL.
>
> Disable RRSBA behavior by setting RRSBA_DIS_S for guest if guest is
> using retpoline and the processor has the behavior.
>
> Signed-off-by: Zhang Chen <[email protected]>
> Signed-off-by: Chao Gao <[email protected]>
> Tested-by: Jiaan Lu <[email protected]>
> ---
> arch/x86/kvm/vmx/vmx.c | 24 +++++++++++++++++++++---
> 1 file changed, 21 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 980498c4c30c..25afb4c3e55e 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1944,8 +1944,8 @@ static inline bool is_vmx_feature_control_msr_valid(struct vcpu_vmx *vmx,
> }
>
> #define VIRTUAL_ENUMERATION_VALID_BITS VIRT_ENUM_MITIGATION_CTRL_SUPPORT
> -#define MITI_ENUM_VALID_BITS 0ULL
> -#define MITI_CTRL_VALID_BITS 0ULL
> +#define MITI_ENUM_VALID_BITS MITI_ENUM_RETPOLINE_S_SUPPORT
> +#define MITI_CTRL_VALID_BITS MITI_CTRL_RETPOLINE_S_USED
>
> static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
> {
> @@ -2173,7 +2173,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> struct vmx_uret_msr *msr;
> int ret = 0;
> u32 msr_index = msr_info->index;
> - u64 data = msr_info->data, spec_ctrl_mask;
> + u64 data = msr_info->data, arch_msr = 0, spec_ctrl_mask = 0;

Suggest making arch_msr and spec_ctrl_mask local variables of each
case {} block.

> u32 index;
>
> switch (msr_index) {
> @@ -2488,6 +2488,24 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> if (data & ~MITI_CTRL_VALID_BITS)
> return 1;
>
> + if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
> + rdmsrl(MSR_IA32_ARCH_CAPABILITIES, arch_msr);
> +
> + if (data & MITI_CTRL_RETPOLINE_S_USED &&
> + kvm_cpu_cap_has(X86_FEATURE_RRSBA_CTRL) &&

Why is kvm_cpu_cap_has() used here? It indicates whether KVM supports
exposing this feature to the guest. But what we need here is whether the
host supports this feature. Though they might give the same result, we'd
better use boot_cpu_has() or even read CPUID directly (since CPUID info
can be changed by the clearcpuid magic) to avoid confusion.

> + arch_msr & ARCH_CAP_RRSBA)
> + spec_ctrl_mask |= SPEC_CTRL_RRSBA_DIS_S;
> +
> + /*
> + * Intercept IA32_SPEC_CTRL to disallow guest from changing
> + * certain bits.
> + */
> + if (spec_ctrl_mask && !cpu_has_spec_ctrl_virt())
> + vmx_enable_intercept_for_msr(vcpu, MSR_IA32_SPEC_CTRL, MSR_TYPE_RW);
> +
> + vmx_set_spec_ctrl_mask(vmx, spec_ctrl_mask);
> + vmx_set_guest_spec_ctrl(vmx, vmx_get_guest_spec_ctrl(vmx));
> +
> vmx->msr_virtual_mitigation_ctrl = data;
> break;
> default:


2023-05-18 10:46:48

by Xiaoyao Li

[permalink] [raw]
Subject: Re: [RFC PATCH v2 06/11] KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support

On 4/14/2023 2:25 PM, Chao Gao wrote:
> From: Zhang Chen <[email protected]>
>
> Bit 63 of IA32_ARCH_CAPABILITIES MSR indicates availablility of the
> VIRTUAL_ENUMERATION_MSR (index 0x50000000) that enumerates features like
> e.g., mitigation enumeration which is used for guest to hint VMMs the
> software mitigations in use.
>
> Advertise ARCH_CAP_VIRTUAL_ENUM support for VMX and emulate read/write
> of the VIRTUAL_ENUMERATION_MSR. Now VIRTUAL_ENUMERATION_MSR is always 0.
>
> Signed-off-by: Zhang Chen <[email protected]>
> Co-developed-by: Chao Gao <[email protected]>
> Signed-off-by: Chao Gao <[email protected]>
> Tested-by: Jiaan Lu <[email protected]>
> ---
> arch/x86/kvm/svm/svm.c | 1 +
> arch/x86/kvm/vmx/vmx.c | 19 +++++++++++++++++++
> arch/x86/kvm/vmx/vmx.h | 1 +
> arch/x86/kvm/x86.c | 16 +++++++++++++++-
> 4 files changed, 36 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 57f241c5a371..195d0cf9309a 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4093,6 +4093,7 @@ static bool svm_has_emulated_msr(struct kvm *kvm, u32 index)
> {
> switch (index) {
> case MSR_IA32_MCG_EXT_CTL:
> + case MSR_VIRTUAL_ENUMERATION:
> case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
> return false;
> case MSR_IA32_SMBASE:
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 9f6919bec2b3..85419137decb 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1943,6 +1943,8 @@ static inline bool is_vmx_feature_control_msr_valid(struct vcpu_vmx *vmx,
> return !(msr->data & ~valid_bits);
> }
>
> +#define VIRTUAL_ENUMERATION_VALID_BITS 0ULL
> +
> static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
> {
> switch (msr->index) {
> @@ -1950,6 +1952,9 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
> if (!nested)
> return 1;
> return vmx_get_vmx_msr(&vmcs_config.nested, msr->index, &msr->data);
> + case MSR_VIRTUAL_ENUMERATION:
> + msr->data = VIRTUAL_ENUMERATION_VALID_BITS;
> + return 0;
> default:
> return KVM_MSR_RET_INVALID;
> }
> @@ -2096,6 +2101,12 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> case MSR_IA32_DEBUGCTLMSR:
> msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
> break;
> + case MSR_VIRTUAL_ENUMERATION:
> + if (!msr_info->host_initiated &&
> + !(vcpu->arch.arch_capabilities & ARCH_CAP_VIRTUAL_ENUM))
> + return 1;
> + msr_info->data = vmx->msr_virtual_enumeration;
> + break;
> default:
> find_uret_msr:
> msr = vmx_find_uret_msr(vmx, msr_info->index);
> @@ -2437,6 +2448,14 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> }
> ret = kvm_set_msr_common(vcpu, msr_info);
> break;
> + case MSR_VIRTUAL_ENUMERATION:
> + if (!msr_info->host_initiated)
> + return 1;
> + if (data & ~VIRTUAL_ENUMERATION_VALID_BITS)
> + return 1;
> +
> + vmx->msr_virtual_enumeration = data;
> + break;
>
> default:
> find_uret_msr:
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 021d86b52e18..a7faaf9fdc26 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -292,6 +292,7 @@ struct vcpu_vmx {
>
> u64 spec_ctrl;
> u64 guest_spec_ctrl;
> + u64 msr_virtual_enumeration;
> u32 msr_ia32_umwait_control;
>
> /*
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 3c58dbae7b4c..a1bc52bebdcc 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1537,6 +1537,7 @@ static const u32 emulated_msrs_all[] = {
>
> MSR_K7_HWCR,
> MSR_KVM_POLL_CONTROL,
> + MSR_VIRTUAL_ENUMERATION,
> };
>
> static u32 emulated_msrs[ARRAY_SIZE(emulated_msrs_all)];
> @@ -1570,6 +1571,7 @@ static const u32 msr_based_features_all[] = {
> MSR_IA32_UCODE_REV,
> MSR_IA32_ARCH_CAPABILITIES,
> MSR_IA32_PERF_CAPABILITIES,
> + MSR_VIRTUAL_ENUMERATION,
> };
>
> static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all)];
> @@ -1591,7 +1593,8 @@ static unsigned int num_msr_based_features;
> ARCH_CAP_SKIP_VMENTRY_L1DFLUSH | ARCH_CAP_SSB_NO | ARCH_CAP_MDS_NO | \
> ARCH_CAP_PSCHANGE_MC_NO | ARCH_CAP_TSX_CTRL_MSR | ARCH_CAP_TAA_NO | \
> ARCH_CAP_SBDR_SSDP_NO | ARCH_CAP_FBSDP_NO | ARCH_CAP_PSDP_NO | \
> - ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO)
> + ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | \
> + ARCH_CAP_VIRTUAL_ENUM)

We cannot do this.

Otherwise, an AMD L1 with X86_FEATURE_ARCH_CAPABILITIES configured could
expose MSR_VIRTUAL_ENUMERATION to an L2 without actually supporting it.

>
> static u64 kvm_get_arch_capabilities(void)
> {
> @@ -1610,6 +1613,17 @@ static u64 kvm_get_arch_capabilities(void)
> */
> data |= ARCH_CAP_PSCHANGE_MC_NO;
>
> + /*
> + * Virtual enumeration is a paravirt feature. The only usage for now
> + * is to bridge the gap caused by microarchitecture changes between
> + * different Intel processors. And its usage is linked to "virtualize
> + * IA32_SPEC_CTRL" which is a VMX feature. Whether AMD SVM can benefit
> + * from the same usage and how to implement it is still unclear. Limit
> + * virtual enumeration to VMX.
> + */
> + if (static_call(kvm_x86_has_emulated_msr)(NULL, MSR_VIRTUAL_ENUMERATION))
> + data |= ARCH_CAP_VIRTUAL_ENUM;
> +
> /*
> * If we're doing cache flushes (either "always" or "cond")
> * we will do one whenever the guest does a vmlaunch/vmresume.


2023-05-19 10:03:30

by Chao Gao

[permalink] [raw]
Subject: Re: [RFC PATCH v2 06/11] KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support

On Thu, May 18, 2023 at 06:14:40PM +0800, Xiaoyao Li wrote:
>> static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all)];
>> @@ -1591,7 +1593,8 @@ static unsigned int num_msr_based_features;
>> ARCH_CAP_SKIP_VMENTRY_L1DFLUSH | ARCH_CAP_SSB_NO | ARCH_CAP_MDS_NO | \
>> ARCH_CAP_PSCHANGE_MC_NO | ARCH_CAP_TSX_CTRL_MSR | ARCH_CAP_TAA_NO | \
>> ARCH_CAP_SBDR_SSDP_NO | ARCH_CAP_FBSDP_NO | ARCH_CAP_PSDP_NO | \
>> - ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO)
>> + ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | \
>> + ARCH_CAP_VIRTUAL_ENUM)
>
>We cannot do this.
>
>Otherwise, an AMD L1 with X86_FEATURE_ARCH_CAPABILITIES configured could
>expose MSR_VIRTUAL_ENUMERATION to an L2 without actually supporting it.

How does AMD L1 see the ARCH_CAP_VIRTUAL_ENUM feature in the first
place? because ...

>
>> static u64 kvm_get_arch_capabilities(void)
>> {
>> @@ -1610,6 +1613,17 @@ static u64 kvm_get_arch_capabilities(void)
>> */
>> data |= ARCH_CAP_PSCHANGE_MC_NO;
>> + /*
>> + * Virtual enumeration is a paravirt feature. The only usage for now
>> + * is to bridge the gap caused by microarchitecture changes between
>> + * different Intel processors. And its usage is linked to "virtualize
>> + * IA32_SPEC_CTRL" which is a VMX feature. Whether AMD SVM can benefit
>> + * from the same usage and how to implement it is still unclear. Limit
>> + * virtual enumeration to VMX.
>> + */
>> + if (static_call(kvm_x86_has_emulated_msr)(NULL, MSR_VIRTUAL_ENUMERATION))
>> + data |= ARCH_CAP_VIRTUAL_ENUM;

the feature is exposed on Intel CPUs only.

Do you mean an AMD L1 created on an Intel L0, where the Intel L0 even
emulates nested (SVM) support for the L1? This sounds like a very
contrived case.

>> +
>> /*
>> * If we're doing cache flushes (either "always" or "cond")
>> * we will do one whenever the guest does a vmlaunch/vmresume.
>

2023-05-19 10:42:09

by Chao Gao

[permalink] [raw]
Subject: Re: [RFC PATCH v2 08/11] KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT

On Thu, May 18, 2023 at 06:25:30PM +0800, Xiaoyao Li wrote:
>> @@ -2173,7 +2173,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>> struct vmx_uret_msr *msr;
>> int ret = 0;
>> u32 msr_index = msr_info->index;
>> - u64 data = msr_info->data, spec_ctrl_mask;
>> + u64 data = msr_info->data, arch_msr = 0, spec_ctrl_mask = 0;
>
>Suggest making arch_msr and spec_ctrl_mask local variables of each case {}
>block.

Sure. Will do

>
>> u32 index;
>> switch (msr_index) {
>> @@ -2488,6 +2488,24 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>> if (data & ~MITI_CTRL_VALID_BITS)
>> return 1;
>> + if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
>> + rdmsrl(MSR_IA32_ARCH_CAPABILITIES, arch_msr);
>> +
>> + if (data & MITI_CTRL_RETPOLINE_S_USED &&
>> + kvm_cpu_cap_has(X86_FEATURE_RRSBA_CTRL) &&
>
>Why is kvm_cpu_cap_has() used here? It indicates whether KVM supports
>exposing this feature to the guest. But what we need here is whether the
>host supports this feature. Though they might give the same result, we'd
>better use boot_cpu_has() or even read CPUID directly (since CPUID info
>can be changed by the clearcpuid magic) to avoid confusion.

OK. This makes sense. I will use boot_cpu_has(). clearcpuid is sometimes
helpful for debugging, so I prefer to honor it.

Thanks.

2023-05-22 01:13:56

by Xiaoyao Li

[permalink] [raw]
Subject: Re: [RFC PATCH v2 06/11] KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support

On 5/19/2023 5:57 PM, Chao Gao wrote:
> On Thu, May 18, 2023 at 06:14:40PM +0800, Xiaoyao Li wrote:
>>> static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all)];
>>> @@ -1591,7 +1593,8 @@ static unsigned int num_msr_based_features;
>>> ARCH_CAP_SKIP_VMENTRY_L1DFLUSH | ARCH_CAP_SSB_NO | ARCH_CAP_MDS_NO | \
>>> ARCH_CAP_PSCHANGE_MC_NO | ARCH_CAP_TSX_CTRL_MSR | ARCH_CAP_TAA_NO | \
>>> ARCH_CAP_SBDR_SSDP_NO | ARCH_CAP_FBSDP_NO | ARCH_CAP_PSDP_NO | \
>>> - ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO)
>>> + ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | \
>>> + ARCH_CAP_VIRTUAL_ENUM)
>>
>> We cannot do this.
>>
>> Otherwise, an AMD L1 with X86_FEATURE_ARCH_CAPABILITIES configured could
>> expose MSR_VIRTUAL_ENUMERATION to an L2 without actually supporting it.
>
> How does AMD L1 see the ARCH_CAP_VIRTUAL_ENUM feature in the first
> place? because ...
>
>>
>>> static u64 kvm_get_arch_capabilities(void)
>>> {
>>> @@ -1610,6 +1613,17 @@ static u64 kvm_get_arch_capabilities(void)
>>> */
>>> data |= ARCH_CAP_PSCHANGE_MC_NO;
>>> + /*
>>> + * Virtual enumeration is a paravirt feature. The only usage for now
>>> + * is to bridge the gap caused by microarchitecture changes between
>>> + * different Intel processors. And its usage is linked to "virtualize
>>> + * IA32_SPEC_CTRL" which is a VMX feature. Whether AMD SVM can benefit
>>> + * from the same usage and how to implement it is still unclear. Limit
>>> + * virtual enumeration to VMX.
>>> + */
>>> + if (static_call(kvm_x86_has_emulated_msr)(NULL, MSR_VIRTUAL_ENUMERATION))
>>> + data |= ARCH_CAP_VIRTUAL_ENUM;
>
> the feature is exposed on Intel CPUs only.
>
> Do you mean an AMD L1 created on an Intel L0, where the Intel L0 even
> emulates nested (SVM) support for the L1? This sounds like a very
> contrived case.

You are right. I was thinking of a rare case but ignored the fact that
VMX doesn't nest SVM.

Sorry for that.

>>> +
>>> /*
>>> * If we're doing cache flushes (either "always" or "cond")
>>> * we will do one whenever the guest does a vmlaunch/vmresume.
>>


2023-05-22 09:55:38

by Jingqi Liu

[permalink] [raw]
Subject: Re: [RFC PATCH v2 10/11] KVM: selftests: Add tests for virtual enumeration/mitigation MSRs

On 4/14/2023 2:25 PM, Chao Gao wrote:
> Three virtual MSRs added for guest to report the usage of software
Seems it's better like below.
s/Three virtual MSRs added/Add three virtual MSRs ?
> mitigations. They are enumerated in an architectural way. Try to
> access the three MSRs to ensure the behavior is expected:
> Specifically,
>
> 1. below three cases should cause #GP:
> * access to a non-present MSR
> * write to read-only MSRs
> * toggling reserved bit of a writeable MSR
>
> 2. rdmsr/wrmsr in other cases should succeed
>
> 3. rdmsr should return the value last written
>
> Signed-off-by: Chao Gao <[email protected]>
> ---
Thanks,
Jingqi

2023-05-22 10:09:18

by Jingqi Liu

[permalink] [raw]
Subject: Re: [RFC PATCH v2 08/11] KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT

On 4/14/2023 2:25 PM, Chao Gao wrote:
> Allow guest to query if the underying VMM understands Retpoline and
> report the statue of Retpoline in suprevisor mode i.e. CPL < 3 via
s/statue/status
> MSR_VIRTUAL_MITIGATION_ENUM/CTRL.
>
> Disable RRSBA behavior by setting RRSBA_DIS_S for guest if guest is
> using retpoline and the processor has the behavior.
>
> Signed-off-by: Zhang Chen <[email protected]>
> Signed-off-by: Chao Gao <[email protected]>
> Tested-by: Jiaan Lu <[email protected]>
> ---
Thanks,
Jingqi

2023-05-22 10:09:27

by Jingqi Liu

[permalink] [raw]
Subject: Re: [RFC PATCH v2 09/11] KVM: VMX: Advertise MITI_CTRL_BHB_CLEAR_SEQ_S_SUPPORT


On 4/14/2023 2:25 PM, Chao Gao wrote:
> From: Zhang Chen <[email protected]>
>
> Allow guest to query if the underying VMM understands BHB-clearing
> sequence and report the statue of BHB-clearing sequence in suprevisor
s/statue/status
> mode i.e. CPL < 3 via MSR_VIRTUAL_MITIGATION_ENUM/CTRL.
>
> Enable BHI_DIS_S for guest if guest is using BHI-clearing sequence and
> the sequence isn't effective on the processor.
>
> Signed-off-by: Zhang Chen <[email protected]>
> Signed-off-by: Chao Gao <[email protected]>
> Tested-by: Jiaan Lu <[email protected]>
> ---
Thanks,
Jingqi