This patch series includes KVM enabling patches for Linear-address masking
(LAM) v11 and Linear Address Space Separation (LASS) v3 since the two features
have overlapping prep work and concepts. Sent as a single series to reduce the
probability of conflicts.
The patch series is organized as follows:
- Patch 1-4: Common prep work for both LAM and LASS.
- Patch 5-13: LAM part.
- Patch 14-16: LASS part.
Dependency:
- LAM has no other dependency.
- LASS patches depend on the LASS kernel enabling patches, which are not merged yet.
https://lore.kernel.org/all/[email protected]/
==== LAM v11 ====
Linear-address masking (LAM) [1] modifies the checking that is applied to
*64-bit* linear addresses, allowing software to use the untranslated address
bits for metadata; the metadata bits are masked off before the addresses are
used to access memory.
When the feature is virtualized and exposed to the guest, it can be used for
an efficient address sanitizer (ASAN) implementation and for optimizations in
JITs and virtual machines.
The patch series brings LAM virtualization support to KVM.
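For reference, here is a minimal userspace sketch of the untagging rule
(illustrative only; it mirrors what the series later implements in
vmx_get_untagged_addr(), with lam_bit being 47 for LAM48 and 56 for LAM57):

#include <stdint.h>

/*
 * Illustrative only: strip LAM metadata bits from a 64-bit linear address by
 * sign-extending from the LAM "sign" bit (bit 47 for LAM48, bit 56 for LAM57),
 * while preserving bit 63 so a user address never turns into a supervisor
 * address (or vice versa).  Assumes arithmetic right shift on signed types.
 */
static inline uint64_t lam_untag(uint64_t va, int lam_bit)
{
        uint64_t sign63 = va & (1ULL << 63);
        int shift = 63 - lam_bit;
        uint64_t ext = (uint64_t)(((int64_t)(va << shift)) >> shift);

        return (ext & ~(1ULL << 63)) | sign63;
}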
Please review and consider applying.
LAM QEMU patch:
https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg04160.html
LAM kvm-unit-tests patch:
https://lore.kernel.org/kvm/[email protected]/
--- Test ---
1. Add test cases in kvm-unit-tests for LAM [2], including LAM_SUP and LAM_{U57,U48}.
For supervisor pointers, the tests cover CR4 LAM_SUP bit toggling, memory/MMIO
access with tagged pointers, and some special instructions (INVLPG, INVPCID,
INVVPID); the INVVPID cases are also used to cover the VMX instruction VM-exit path.
For user pointers, the tests cover CR3 LAM bit toggling and memory/MMIO access
with tagged pointers.
MMIO cases are used to trigger the instruction emulation path.
Run the unit tests with the LAM feature both on and off (i.e. including negative cases).
Run the unit tests in an L1 guest with the LAM feature both on and off.
2. Run the kernel LAM kselftests in the guest, with both EPT=Y and EPT=N.
3. Launch a nested guest and run tests listed in 1 & 2.
All tests have passed on real machine supporting LAM.
[1] Intel ISE https://cdrdv2.intel.com/v1/dl/getContent/671368
Chapter Linear Address Masking (LAM)
[2] https://lore.kernel.org/kvm/[email protected]/
----------
Changelog
v11:
- A separate patch to drop non-PA bits when getting GFN for guest's PGD [Sean]
- Add a patch to remove kvm_vcpu_is_illegal_gpa() [Isaku]
- Squash CR4 LAM bit handling with the address untag for supervisor pointers. [Sean]
- Squash CR3 LAM bits handling with the address untag for user pointers. [Sean]
- Adopt KVM-governed feature framework to track "LAM enabled" as a separate
optimization patch, and add the reason in patch change log. [Sean, Kai]
- Some comment modifications/additions according to reviews [Sean]
v10:
https://lore.kernel.org/kvm/[email protected]/
==== LASS v3 ====
Linear Address Space Separation (LASS) [1] is a new mechanism that
enforces the same mode-based protections as paging, i.e. SMAP/SMEP,
but without traversing the paging structures. Because the protections
enforced by LASS are applied before paging, "probes" by malicious
software will provide no paging-based timing information.
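As a rough sketch of the rule (illustrative only, assuming CR4.LASS is set and
the vCPU is in 64-bit mode; it ignores the INVLPG/branch-target exemptions and
the boundary-straddling case that the series handles in vmx_is_lass_violation()):

#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative only: LASS treats bit 63 of a 64-bit linear address as the
 * user (0) vs. supervisor (1) space selector.  User-mode accesses to the
 * supervisor space always violate; supervisor-mode accesses to the user
 * space violate like SMEP for fetches and like SMAP for data accesses
 * (honoring RFLAGS.AC for explicit accesses).
 */
static bool lass_violation(uint64_t addr, bool user_mode, bool fetch,
                           bool implicit, bool cr4_smap, bool rflags_ac)
{
        bool supervisor_addr = addr & (1ULL << 63);

        if (user_mode && !implicit)
                return supervisor_addr;

        if (!fetch) {
                if (!cr4_smap)
                        return false;
                if (!implicit && rflags_ac)
                        return false;
        }

        return !supervisor_addr;
}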
This patch series provides a LASS KVM solution and depends on the kernel
enabling that can be found at [2].
--- Test ---
1. Test the basic function of LASS virtualization, including LASS
enumeration and enabling in guest and nested environments.
2. Run selftests with the following cases:
- data access to user address space in supervisor mode
- data access to supervisor address space in user mode
- data access to a linear address across the space boundary
- using the KVM FEP mechanism to run the test cases above
- VMX instruction execution with VMCS structure in user
address space
- instruction fetch from user address space in supervisor mode
- instruction fetch from supervisor address space in user mode
All tests have passed on real machine supporting LASS.
[1] Intel ISE spec https://cdrdv2.intel.com/v1/dl/getContent/671368
Chapter Linear Address Space Separation (LASS)
[2] LASS kernel patch series
https://lore.kernel.org/all/[email protected]/
----------
Change log
v3:
1. Refine commit message [Sean/Chao Gao]
2. Enhance the implementation of LASS violation check [Sean]
3. Re-organize patches per Sean's suggestion [Sean]
v2:
https://lore.kernel.org/all/[email protected]/
Binbin Wu (10):
KVM: x86: Consolidate flags for __linearize()
KVM: x86: Use a new flag for branch targets
KVM: x86: Add an emulation flag for implicit system access
KVM: x86: Add X86EMUL_F_INVLPG and pass it in em_invlpg()
KVM: x86/mmu: Drop non-PA bits when getting GFN for guest's PGD
KVM: x86: Add & use kvm_vcpu_is_legal_cr3() to check CR3's legality
KVM: x86: Remove kvm_vcpu_is_illegal_gpa()
KVM: x86: Introduce get_untagged_addr() in kvm_x86_ops and call it in
emulator
KVM: x86: Untag address for vmexit handlers when LAM applicable
KVM: x86: Use KVM-governed feature framework to track "LAM enabled"
Robert Hoo (3):
KVM: x86: Virtualize LAM for supervisor pointer
KVM: x86: Virtualize LAM for user pointer
KVM: x86: Advertise and enable LAM (user and supervisor)
Zeng Guang (3):
KVM: emulator: Add emulation of LASS violation checks on linear
address
KVM: VMX: Virtualize LASS
KVM: x86: Advertise LASS CPUID to user space
arch/x86/include/asm/kvm-x86-ops.h | 4 +-
arch/x86/include/asm/kvm_host.h | 8 ++-
arch/x86/kvm/cpuid.c | 4 +-
arch/x86/kvm/cpuid.h | 13 ++--
arch/x86/kvm/emulate.c | 39 +++++++----
arch/x86/kvm/governed_features.h | 1 +
arch/x86/kvm/kvm_emulate.h | 13 ++++
arch/x86/kvm/mmu.h | 8 +++
arch/x86/kvm/mmu/mmu.c | 2 +-
arch/x86/kvm/mmu/mmu_internal.h | 1 +
arch/x86/kvm/mmu/paging_tmpl.h | 2 +-
arch/x86/kvm/svm/nested.c | 4 +-
arch/x86/kvm/vmx/nested.c | 14 ++--
arch/x86/kvm/vmx/sgx.c | 4 +-
arch/x86/kvm/vmx/vmx.c | 106 ++++++++++++++++++++++++++++-
arch/x86/kvm/vmx/vmx.h | 5 ++
arch/x86/kvm/x86.c | 28 +++++++-
arch/x86/kvm/x86.h | 4 ++
18 files changed, 226 insertions(+), 34 deletions(-)
base-commit: 0bb80ecc33a8fb5a682236443c1e740d5c917d1d
prerequisite-patch-id: 51db36ad7156234d05f8c4004ec6a31ef609b81a
--
2.25.1
Use the governed feature framework to track if Linear Address Masking (LAM)
is "enabled", i.e. if LAM can be used by the guest.
Use the framework to avoid the relatively expensive guest_cpuid_has() calls
in the CR3 and VM-exit handling paths for LAM.
No functional change intended.
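A toy model of the pattern (not the kernel code; the names below are
stand-ins for kvm_cpu_cap_has()/guest_cpuid_has() and guest_can_use()):

#include <stdbool.h>
#include <stdint.h>

/*
 * Toy model: resolve "host supports LAM && guest CPUID has LAM" once when
 * userspace sets guest CPUID, cache it as a per-vCPU bit, and let hot paths
 * (CR3 loads, VM-exit handling) test the cached bit instead of re-walking
 * guest CPUID entries.
 */
struct toy_vcpu {
        uint64_t governed;              /* cached per-vCPU feature bits   */
        bool host_has_lam;              /* stand-in for kvm_cpu_cap_has() */
        bool guest_cpuid_has_lam;       /* stand-in for guest_cpuid_has() */
};

#define TOY_LAM_BIT 0

static void toy_after_set_cpuid(struct toy_vcpu *v)
{
        if (v->host_has_lam && v->guest_cpuid_has_lam)
                v->governed |= 1ULL << TOY_LAM_BIT;     /* expensive check, once */
}

static bool toy_guest_can_use_lam(struct toy_vcpu *v)
{
        return v->governed & (1ULL << TOY_LAM_BIT);     /* cheap, on hot paths */
}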
Signed-off-by: Binbin Wu <[email protected]>
Tested-by: Xuelian Guo <[email protected]>
---
arch/x86/kvm/cpuid.h | 3 +--
arch/x86/kvm/governed_features.h | 1 +
arch/x86/kvm/mmu.h | 3 +--
arch/x86/kvm/vmx/vmx.c | 1 +
4 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 3c579ce2f60f..93c63ba29337 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -275,8 +275,7 @@ static __always_inline bool guest_can_use(struct kvm_vcpu *vcpu,
static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
{
- if (kvm_cpu_cap_has(X86_FEATURE_LAM) &&
- guest_cpuid_has(vcpu, X86_FEATURE_LAM))
+ if (guest_can_use(vcpu, X86_FEATURE_LAM))
cr3 &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
return kvm_vcpu_is_legal_gpa(vcpu, cr3);
diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
index 423a73395c10..ad463b1ed4e4 100644
--- a/arch/x86/kvm/governed_features.h
+++ b/arch/x86/kvm/governed_features.h
@@ -16,6 +16,7 @@ KVM_GOVERNED_X86_FEATURE(PAUSEFILTER)
KVM_GOVERNED_X86_FEATURE(PFTHRESHOLD)
KVM_GOVERNED_X86_FEATURE(VGIF)
KVM_GOVERNED_X86_FEATURE(VNMI)
+KVM_GOVERNED_X86_FEATURE(LAM)
#undef KVM_GOVERNED_X86_FEATURE
#undef KVM_GOVERNED_FEATURE
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index e700f1f854ae..f04cc5ade1cd 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -148,8 +148,7 @@ static inline unsigned long kvm_get_active_pcid(struct kvm_vcpu *vcpu)
static inline unsigned long kvm_get_active_cr3_lam_bits(struct kvm_vcpu *vcpu)
{
- if (!kvm_cpu_cap_has(X86_FEATURE_LAM) ||
- !guest_cpuid_has(vcpu, X86_FEATURE_LAM))
+ if (!guest_can_use(vcpu, X86_FEATURE_LAM))
return 0;
return kvm_read_cr3(vcpu) & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 23eac6bb4fac..3bdeebee71cc 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7767,6 +7767,7 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_XSAVES);
kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VMX);
+ kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_LAM);
vmx_setup_uret_msrs(vmx);
--
2.25.1
Add and call vmx_get_untagged_addr() for 64-bit memory operands in VM-exit
handlers when LAM is applicable, and wire up the get_untagged_addr() interface.
As of now, vmx_get_untagged_addr() doesn't untag anything yet.
For VM-exit handlers related to 64-bit linear addresses:
- Cases that need address untagging (handled in get_vmx_mem_address()):
  Operand(s) of VMX instructions and INVPCID.
  Operand(s) of SGX ENCLS.
- Cases LAM doesn't apply to (no change needed):
Operand of INVLPG.
Linear address in INVPCID descriptor.
Linear address in INVVPID descriptor.
BASEADDR specified in SESC of ECREATE.
Note:
LAM doesn't apply to writes to control registers or MSRs.
LAM masking applies before paging, so the faulting linear address in CR2
doesn't contain the metadata.
The guest linear address saved in VMCS doesn't contain metadata.
Signed-off-by: Binbin Wu <[email protected]>
Reviewed-by: Chao Gao <[email protected]>
Tested-by: Xuelian Guo <[email protected]>
---
arch/x86/kvm/vmx/nested.c | 5 +++++
arch/x86/kvm/vmx/sgx.c | 1 +
arch/x86/kvm/vmx/vmx.c | 7 +++++++
arch/x86/kvm/vmx/vmx.h | 2 ++
arch/x86/kvm/x86.c | 4 ++++
5 files changed, 19 insertions(+)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 51622878d6e4..4ba46e1b29d2 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4980,6 +4980,7 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
else
*ret = off;
+ *ret = vmx_get_untagged_addr(vcpu, *ret, 0);
/* Long mode: #GP(0)/#SS(0) if the memory address is in a
* non-canonical form. This is the only check on the memory
* destination for long mode!
@@ -5797,6 +5798,10 @@ static int handle_invvpid(struct kvm_vcpu *vcpu)
vpid02 = nested_get_vpid02(vcpu);
switch (type) {
case VMX_VPID_EXTENT_INDIVIDUAL_ADDR:
+ /*
+ * LAM doesn't apply to addresses that are inputs to TLB
+ * invalidation.
+ */
if (!operand.vpid ||
is_noncanonical_address(operand.gla, vcpu))
return nested_vmx_fail(vcpu,
diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
index 3e822e582497..6fef01e0536e 100644
--- a/arch/x86/kvm/vmx/sgx.c
+++ b/arch/x86/kvm/vmx/sgx.c
@@ -37,6 +37,7 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
if (!IS_ALIGNED(*gva, alignment)) {
fault = true;
} else if (likely(is_64_bit_mode(vcpu))) {
+ *gva = vmx_get_untagged_addr(vcpu, *gva, 0);
fault = is_noncanonical_address(*gva, vcpu);
} else {
*gva &= 0xffffffff;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6eba8c08eff6..b572cfe27342 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8209,6 +8209,11 @@ static void vmx_vm_destroy(struct kvm *kvm)
free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
}
+gva_t vmx_get_untagged_addr(struct kvm_vcpu *vcpu, gva_t gva, unsigned int flags)
+{
+ return gva;
+}
+
static struct kvm_x86_ops vmx_x86_ops __initdata = {
.name = KBUILD_MODNAME,
@@ -8349,6 +8354,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
.complete_emulated_msr = kvm_complete_insn_gp,
.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
+
+ .get_untagged_addr = vmx_get_untagged_addr,
};
static unsigned int vmx_handle_intel_pt_intr(void)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index c2130d2c8e24..45cee1a8bc0a 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -420,6 +420,8 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type);
u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
+gva_t vmx_get_untagged_addr(struct kvm_vcpu *vcpu, gva_t gva, unsigned int flags);
+
static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
int type, bool value)
{
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e03313287816..4c2cdfcae79d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -13396,6 +13396,10 @@ int kvm_handle_invpcid(struct kvm_vcpu *vcpu, unsigned long type, gva_t gva)
switch (type) {
case INVPCID_TYPE_INDIV_ADDR:
+ /*
+ * LAM doesn't apply to addresses that are inputs to TLB
+ * invalidation.
+ */
if ((!pcid_enabled && (operand.pcid != 0)) ||
is_noncanonical_address(operand.gla, vcpu)) {
kvm_inject_gp(vcpu, 0);
--
2.25.1
Remove kvm_vcpu_is_illegal_gpa() and use !kvm_vcpu_is_legal_gpa() instead.
No functional change intended.
Signed-off-by: Binbin Wu <[email protected]>
Tested-by: Xuelian Guo <[email protected]>
---
arch/x86/kvm/cpuid.h | 5 -----
arch/x86/kvm/vmx/nested.c | 2 +-
arch/x86/kvm/vmx/vmx.c | 2 +-
3 files changed, 2 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index ca1fdab31d1e..31b7def60282 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -47,11 +47,6 @@ static inline bool kvm_vcpu_is_legal_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
return !(gpa & vcpu->arch.reserved_gpa_bits);
}
-static inline bool kvm_vcpu_is_illegal_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
-{
- return !kvm_vcpu_is_legal_gpa(vcpu, gpa);
-}
-
static inline bool kvm_vcpu_is_legal_aligned_gpa(struct kvm_vcpu *vcpu,
gpa_t gpa, gpa_t alignment)
{
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index db61cf8e3128..51622878d6e4 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2717,7 +2717,7 @@ static bool nested_vmx_check_eptp(struct kvm_vcpu *vcpu, u64 new_eptp)
}
/* Reserved bits should not be set */
- if (CC(kvm_vcpu_is_illegal_gpa(vcpu, new_eptp) || ((new_eptp >> 7) & 0x1f)))
+ if (CC(!kvm_vcpu_is_legal_gpa(vcpu, new_eptp) || ((new_eptp >> 7) & 0x1f)))
return false;
/* AD, if set, should be supported */
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 72e3943f3693..6eba8c08eff6 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5782,7 +5782,7 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu)
* would also use advanced VM-exit information for EPT violations to
* reconstruct the page fault error code.
*/
- if (unlikely(allow_smaller_maxphyaddr && kvm_vcpu_is_illegal_gpa(vcpu, gpa)))
+ if (unlikely(allow_smaller_maxphyaddr && !kvm_vcpu_is_legal_gpa(vcpu, gpa)))
return kvm_emulate_instruction(vcpu, 0);
return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0);
--
2.25.1
Drop non-PA bits when getting GFN for guest's PGD with the maximum theoretical
mask for guest MAXPHYADDR.
Do it unconditionally because it's harmless for 32-bit guests, querying 64-bit
mode would be more expensive, and for EPT the mask isn't tied to guest mode.
Using PT_BASE_ADDR_MASK would be technically wrong (PAE paging has 64-bit
elements _except_ for CR3, which has only 32 valid bits), but it wouldn't
matter in practice.
Opportunistically use GENMASK_ULL() to define __PT_BASE_ADDR_MASK.
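A tiny standalone example of the masking (the CR3 value below is hypothetical;
it just shows that high non-PA bits such as LAM_U48/LAM_U57 and the low
flag/PCID bits get dropped before forming the GFN):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
        /* GENMASK_ULL(51, 12): bits 51..12 set, i.e. 0x000ffffffffff000. */
        uint64_t mask = ((1ULL << (51 - 12 + 1)) - 1) << 12;
        /* Hypothetical guest CR3 with LAM_U48|LAM_U57 (bits 62/61) and low flag bits set. */
        uint64_t pgd = (3ULL << 61) | 0x123456789000ULL | 0x18ULL;
        uint64_t gfn = (pgd & mask) >> 12;

        /* Without the mask, "pgd >> 12" would keep bits 62/61 and yield a bogus GFN. */
        printf("mask=%#llx gfn=%#llx\n",
               (unsigned long long)mask, (unsigned long long)gfn);
        return 0;
}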
Signed-off-by: Binbin Wu <[email protected]>
Tested-by: Xuelian Guo <[email protected]>
---
arch/x86/kvm/mmu/mmu.c | 2 +-
arch/x86/kvm/mmu/mmu_internal.h | 1 +
arch/x86/kvm/mmu/paging_tmpl.h | 2 +-
3 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index e1d011c67cc6..f316df038e61 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3774,7 +3774,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
hpa_t root;
root_pgd = kvm_mmu_get_guest_pgd(vcpu, mmu);
- root_gfn = root_pgd >> PAGE_SHIFT;
+ root_gfn = (root_pgd & __PT_BASE_ADDR_MASK) >> PAGE_SHIFT;
if (!kvm_vcpu_is_visible_gfn(vcpu, root_gfn)) {
mmu->root.hpa = kvm_mmu_get_dummy_root();
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index b102014e2c60..b5aca7560fd0 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -13,6 +13,7 @@
#endif
/* Page table builder macros common to shadow (host) PTEs and guest PTEs. */
+#define __PT_BASE_ADDR_MASK GENMASK_ULL(51, 12)
#define __PT_LEVEL_SHIFT(level, bits_per_level) \
(PAGE_SHIFT + ((level) - 1) * (bits_per_level))
#define __PT_INDEX(address, level, bits_per_level) \
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index c85255073f67..4d4e98fe4f35 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -62,7 +62,7 @@
#endif
/* Common logic, but per-type values. These also need to be undefined. */
-#define PT_BASE_ADDR_MASK ((pt_element_t)(((1ULL << 52) - 1) & ~(u64)(PAGE_SIZE-1)))
+#define PT_BASE_ADDR_MASK ((pt_element_t)__PT_BASE_ADDR_MASK)
#define PT_LVL_ADDR_MASK(lvl) __PT_LVL_ADDR_MASK(PT_BASE_ADDR_MASK, lvl, PT_LEVEL_BITS)
#define PT_LVL_OFFSET_MASK(lvl) __PT_LVL_OFFSET_MASK(PT_BASE_ADDR_MASK, lvl, PT_LEVEL_BITS)
#define PT_INDEX(addr, lvl) __PT_INDEX(addr, lvl, PT_LEVEL_BITS)
--
2.25.1
Add and use kvm_vcpu_is_legal_cr3() to check CR3's legality to provide
a clear distinction between CR3 and GPA checks, so that kvm_vcpu_is_legal_cr3()
can be adjusted according to new features.
No functional change intended.
Signed-off-by: Binbin Wu <[email protected]>
Tested-by: Xuelian Guo <[email protected]>
---
arch/x86/kvm/cpuid.h | 5 +++++
arch/x86/kvm/svm/nested.c | 4 ++--
arch/x86/kvm/vmx/nested.c | 4 ++--
arch/x86/kvm/x86.c | 4 ++--
4 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 284fa4704553..ca1fdab31d1e 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -278,4 +278,9 @@ static __always_inline bool guest_can_use(struct kvm_vcpu *vcpu,
vcpu->arch.governed_features.enabled);
}
+static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
+{
+ return kvm_vcpu_is_legal_gpa(vcpu, cr3);
+}
+
#endif
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index dd496c9e5f91..c63aa6624e7f 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -311,7 +311,7 @@ static bool __nested_vmcb_check_save(struct kvm_vcpu *vcpu,
if ((save->efer & EFER_LME) && (save->cr0 & X86_CR0_PG)) {
if (CC(!(save->cr4 & X86_CR4_PAE)) ||
CC(!(save->cr0 & X86_CR0_PE)) ||
- CC(kvm_vcpu_is_illegal_gpa(vcpu, save->cr3)))
+ CC(!kvm_vcpu_is_legal_cr3(vcpu, save->cr3)))
return false;
}
@@ -520,7 +520,7 @@ static void nested_svm_transition_tlb_flush(struct kvm_vcpu *vcpu)
static int nested_svm_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3,
bool nested_npt, bool reload_pdptrs)
{
- if (CC(kvm_vcpu_is_illegal_gpa(vcpu, cr3)))
+ if (CC(!kvm_vcpu_is_legal_cr3(vcpu, cr3)))
return -EINVAL;
if (reload_pdptrs && !nested_npt && is_pae_paging(vcpu) &&
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index c5ec0ef51ff7..db61cf8e3128 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1085,7 +1085,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3,
bool nested_ept, bool reload_pdptrs,
enum vm_entry_failure_code *entry_failure_code)
{
- if (CC(kvm_vcpu_is_illegal_gpa(vcpu, cr3))) {
+ if (CC(!kvm_vcpu_is_legal_cr3(vcpu, cr3))) {
*entry_failure_code = ENTRY_FAIL_DEFAULT;
return -EINVAL;
}
@@ -2912,7 +2912,7 @@ static int nested_vmx_check_host_state(struct kvm_vcpu *vcpu,
if (CC(!nested_host_cr0_valid(vcpu, vmcs12->host_cr0)) ||
CC(!nested_host_cr4_valid(vcpu, vmcs12->host_cr4)) ||
- CC(kvm_vcpu_is_illegal_gpa(vcpu, vmcs12->host_cr3)))
+ CC(!kvm_vcpu_is_legal_cr3(vcpu, vmcs12->host_cr3)))
return -EINVAL;
if (CC(is_noncanonical_address(vmcs12->host_ia32_sysenter_esp, vcpu)) ||
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6c9c81e82e65..ea48ba87dacf 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1284,7 +1284,7 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
* stuff CR3, e.g. for RSM emulation, and there is no guarantee that
* the current vCPU mode is accurate.
*/
- if (kvm_vcpu_is_illegal_gpa(vcpu, cr3))
+ if (!kvm_vcpu_is_legal_cr3(vcpu, cr3))
return 1;
if (is_pae_paging(vcpu) && !load_pdptrs(vcpu, cr3))
@@ -11468,7 +11468,7 @@ static bool kvm_is_valid_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
*/
if (!(sregs->cr4 & X86_CR4_PAE) || !(sregs->efer & EFER_LMA))
return false;
- if (kvm_vcpu_is_illegal_gpa(vcpu, sregs->cr3))
+ if (!kvm_vcpu_is_legal_cr3(vcpu, sregs->cr3))
return false;
} else {
/*
--
2.25.1
From: Zeng Guang <[email protected]>
Virtualize CR4.LASS and implement the LASS violation check to achieve the
mode-based protection in VMX.
Virtualize CR4.LASS[bit 27] under KVM control instead of making it guest-owned,
as CR4.LASS is generally set once for each vCPU at boot time and won't be
toggled at runtime. Meanwhile, KVM allows guest software to set CR4.LASS only
if the VM has the LASS capability enumerated with
CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6]. Update cr4_fixed1 to set the CR4.LASS
bit in the emulated IA32_VMX_CR4_FIXED1 MSR for guests and allow guests to
enable LASS in nested VMX operation. It's noteworthy that setting the CR4.LASS
bit enables LASS only in IA-32e mode and has no effect in legacy mode.
The LASS violation check takes effect in KVM emulation of instruction fetches
and data accesses, including implicit accesses, when the vCPU is running in
long mode, and is also involved in emulation of VMX instructions and the SGX
ENCLS instruction to enforce the mode-based protections before paging.
Linear addresses used for TLB invalidation (INVLPG, INVPCID, and INVVPID) and
branch targets are not subject to LASS enforcement.
Signed-off-by: Zeng Guang <[email protected]>
Signed-off-by: Binbin Wu <[email protected]>
Tested-by: Xuelian Guo <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 2 +-
arch/x86/kvm/vmx/nested.c | 5 ++--
arch/x86/kvm/vmx/sgx.c | 3 +-
arch/x86/kvm/vmx/vmx.c | 50 +++++++++++++++++++++++++++++++++
arch/x86/kvm/vmx/vmx.h | 3 ++
arch/x86/kvm/x86.c | 2 +-
arch/x86/kvm/x86.h | 2 ++
7 files changed, 62 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3e73fc45c8e6..2972fde1ad9e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -126,7 +126,7 @@
| X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
| X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
| X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
- | X86_CR4_LAM_SUP))
+ | X86_CR4_LASS | X86_CR4_LAM_SUP))
#define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 4ba46e1b29d2..821763335cf6 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4985,7 +4985,8 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
* non-canonical form. This is the only check on the memory
* destination for long mode!
*/
- exn = is_noncanonical_address(*ret, vcpu);
+ exn = is_noncanonical_address(*ret, vcpu) ||
+ vmx_is_lass_violation(vcpu, *ret, len, 0);
} else {
/*
* When not in long mode, the virtual/linear address is
@@ -5799,7 +5800,7 @@ static int handle_invvpid(struct kvm_vcpu *vcpu)
switch (type) {
case VMX_VPID_EXTENT_INDIVIDUAL_ADDR:
/*
- * LAM doesn't apply to addresses that are inputs to TLB
+ * LAM and LASS don't apply to addresses that are inputs to TLB
* invalidation.
*/
if (!operand.vpid ||
diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
index 6fef01e0536e..ac70da799df9 100644
--- a/arch/x86/kvm/vmx/sgx.c
+++ b/arch/x86/kvm/vmx/sgx.c
@@ -38,7 +38,8 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
fault = true;
} else if (likely(is_64_bit_mode(vcpu))) {
*gva = vmx_get_untagged_addr(vcpu, *gva, 0);
- fault = is_noncanonical_address(*gva, vcpu);
+ fault = is_noncanonical_address(*gva, vcpu) ||
+ vmx_is_lass_violation(vcpu, *gva, size, 0);
} else {
*gva &= 0xffffffff;
fault = (s.unusable) ||
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 3bdeebee71cc..aa2949cd547b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7680,6 +7680,7 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
entry = kvm_find_cpuid_entry_index(vcpu, 0x7, 1);
cr4_fixed1_update(X86_CR4_LAM_SUP, eax, feature_bit(LAM));
+ cr4_fixed1_update(X86_CR4_LASS, eax, feature_bit(LASS));
#undef cr4_fixed1_update
}
@@ -8259,6 +8260,53 @@ gva_t vmx_get_untagged_addr(struct kvm_vcpu *vcpu, gva_t gva, unsigned int flags
return (sign_extend64(gva, lam_bit) & ~BIT_ULL(63)) | (gva & BIT_ULL(63));
}
+bool vmx_is_lass_violation(struct kvm_vcpu *vcpu, unsigned long addr,
+ unsigned int size, unsigned int flags)
+{
+ const bool is_supervisor_address = !!(addr & BIT_ULL(63));
+ const bool implicit_supervisor = !!(flags & X86EMUL_F_IMPLICIT);
+ const bool fetch = !!(flags & X86EMUL_F_FETCH);
+
+ if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS) || !is_long_mode(vcpu))
+ return false;
+
+ /*
+ * INVLPG isn't subject to LASS, e.g. to allow invalidating userspace
+ * addresses without toggling RFLAGS.AC. Branch targets aren't subject
+ * to LASS in order to simplify far control transfers (the subsequent
+ * fetch will enforce LASS as appropriate).
+ */
+ if (flags & (X86EMUL_F_BRANCH | X86EMUL_F_INVLPG))
+ return false;
+
+ if (!implicit_supervisor && vmx_get_cpl(vcpu) == 3)
+ return is_supervisor_address;
+
+ /*
+ * LASS enforcement for supervisor-mode data accesses depends on SMAP
+ * being enabled, and like SMAP ignores explicit accesses if RFLAGS.AC=1.
+ */
+ if (!fetch) {
+ if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP))
+ return false;
+
+ if (!implicit_supervisor && (kvm_get_rflags(vcpu) & X86_EFLAGS_AC))
+ return false;
+ }
+
+ /*
+ * The entire access must be in the appropriate address space. Note,
+ * if LAM is supported, @addr has already been untagged, so barring a
+ * massive architecture change to expand the canonical address range,
+ * it's impossible for a user access to straddle user and supervisor
+ * address spaces.
+ */
+ if (size && !((addr + size - 1) & BIT_ULL(63)))
+ return true;
+
+ return !is_supervisor_address;
+}
+
static struct kvm_x86_ops vmx_x86_ops __initdata = {
.name = KBUILD_MODNAME,
@@ -8401,6 +8449,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
.get_untagged_addr = vmx_get_untagged_addr,
+
+ .is_lass_violation = vmx_is_lass_violation,
};
static unsigned int vmx_handle_intel_pt_intr(void)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 45cee1a8bc0a..4cafe99a2d94 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -422,6 +422,9 @@ u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
gva_t vmx_get_untagged_addr(struct kvm_vcpu *vcpu, gva_t gva, unsigned int flags);
+bool vmx_is_lass_violation(struct kvm_vcpu *vcpu, unsigned long addr,
+ unsigned int size, unsigned int flags);
+
static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
int type, bool value)
{
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 58d7a9241630..49fc73205720 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -13407,7 +13407,7 @@ int kvm_handle_invpcid(struct kvm_vcpu *vcpu, unsigned long type, gva_t gva)
switch (type) {
case INVPCID_TYPE_INDIV_ADDR:
/*
- * LAM doesn't apply to addresses that are inputs to TLB
+ * LAM and LASS don't apply to addresses that are inputs to TLB
* invalidation.
*/
if ((!pcid_enabled && (operand.pcid != 0)) ||
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 53e883721e71..6c766fe1301c 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -531,6 +531,8 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type);
__reserved_bits |= X86_CR4_PCIDE; \
if (!__cpu_has(__c, X86_FEATURE_LAM)) \
__reserved_bits |= X86_CR4_LAM_SUP; \
+ if (!__cpu_has(__c, X86_FEATURE_LASS)) \
+ __reserved_bits |= X86_CR4_LASS; \
__reserved_bits; \
})
--
2.25.1
From: Robert Hoo <[email protected]>
Add support to allow guests to set the new CR3 control bits for LAM and add
implementation to get untagged address for user pointers.
LAM modifies the checking that is applied to 64-bit linear addresses, allowing
software to use the untranslated address bits for metadata; the metadata bits
are masked off before the addresses are used to access memory. LAM uses two
new CR3 non-address bits, LAM_U48 (bit 62) and LAM_U57 (bit 61), to configure
LAM for user pointers. LAM also changes VM entry to allow both bits to be set
in VMCS's HOST_CR3 and GUEST_CR3 for virtualization.
When EPT is on, CR3 is not trapped by KVM and it's up to the guest to set either
of the two LAM control bits. However, when EPT is off, the actual CR3 used by
the guest is generated from the shadow MMU root, which is different from the
CR3 that is *set* by the guest, and KVM needs to manually apply any active
control bits to VMCS's GUEST_CR3 based on the cached CR3 *seen* by the guest.
KVM manually checks the guest's CR3 to make sure it points to a valid guest
physical address (i.e. to support a smaller guest MAXPHYADDR). Extend this
check to allow the two LAM control bits to be set. After the check, the LAM
bits of the guest CR3 are stripped off to extract the guest physical address.
In the nested case, for a guest which supports LAM, both VMCS12's HOST_CR3 and
GUEST_CR3 are allowed to have the new LAM control bits set, i.e. when L0 enters
L1 to emulate a VM-exit from L2 to L1 or when L0 enters L2 directly. KVM also
manually checks that VMCS12's HOST_CR3 and GUEST_CR3 are valid physical
addresses. Extend those checks to allow the new LAM control bits too.
Note, LAM doesn't have a global control bit to turn it on/off completely, but
purely depends on the hardware's CPUID to determine whether it can be enabled.
That means, when EPT is on, even if KVM doesn't expose LAM to the guest, the
guest can still set the LAM control bits in CR3 without causing problems. This
is an unfortunate virtualization hole. KVM could choose to intercept CR3 in
this case and inject a fault, but this would hurt performance when running a
normal VM without LAM support, which is undesirable. Simply let the guest do
such an illegal thing; the worst case is the guest being killed when KVM
eventually finds out about the illegal behaviour, and the guest is to blame.
Suggested-by: Sean Christopherson <[email protected]>
Signed-off-by: Robert Hoo <[email protected]>
Co-developed-by: Binbin Wu <[email protected]>
Signed-off-by: Binbin Wu <[email protected]>
Reviewed-by: Kai Huang <[email protected]>
Reviewed-by: Chao Gao <[email protected]>
Tested-by: Xuelian Guo <[email protected]>
---
arch/x86/kvm/cpuid.h | 4 ++++
arch/x86/kvm/mmu.h | 9 +++++++++
arch/x86/kvm/vmx/vmx.c | 12 +++++++++---
3 files changed, 22 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 31b7def60282..3c579ce2f60f 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -275,6 +275,10 @@ static __always_inline bool guest_can_use(struct kvm_vcpu *vcpu,
static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
{
+ if (kvm_cpu_cap_has(X86_FEATURE_LAM) &&
+ guest_cpuid_has(vcpu, X86_FEATURE_LAM))
+ cr3 &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
+
return kvm_vcpu_is_legal_gpa(vcpu, cr3);
}
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 253fb2093d5d..e700f1f854ae 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -146,6 +146,15 @@ static inline unsigned long kvm_get_active_pcid(struct kvm_vcpu *vcpu)
return kvm_get_pcid(vcpu, kvm_read_cr3(vcpu));
}
+static inline unsigned long kvm_get_active_cr3_lam_bits(struct kvm_vcpu *vcpu)
+{
+ if (!kvm_cpu_cap_has(X86_FEATURE_LAM) ||
+ !guest_cpuid_has(vcpu, X86_FEATURE_LAM))
+ return 0;
+
+ return kvm_read_cr3(vcpu) & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
+}
+
static inline void kvm_mmu_load_pgd(struct kvm_vcpu *vcpu)
{
u64 root_hpa = vcpu->arch.mmu->root.hpa;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ee35a91aa584..23eac6bb4fac 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3400,7 +3400,8 @@ static void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa,
update_guest_cr3 = false;
vmx_ept_load_pdptrs(vcpu);
} else {
- guest_cr3 = root_hpa | kvm_get_active_pcid(vcpu);
+ guest_cr3 = root_hpa | kvm_get_active_pcid(vcpu) |
+ kvm_get_active_cr3_lam_bits(vcpu);
}
if (update_guest_cr3)
@@ -8222,6 +8223,7 @@ static void vmx_vm_destroy(struct kvm *kvm)
gva_t vmx_get_untagged_addr(struct kvm_vcpu *vcpu, gva_t gva, unsigned int flags)
{
int lam_bit;
+ unsigned long cr3_bits;
if (flags & (X86EMUL_F_FETCH | X86EMUL_F_BRANCH | X86EMUL_F_IMPLICIT |
X86EMUL_F_INVLPG))
@@ -8235,8 +8237,12 @@ gva_t vmx_get_untagged_addr(struct kvm_vcpu *vcpu, gva_t gva, unsigned int flags
* or a supervisor address.
*/
if (!(gva & BIT_ULL(63))) {
- /* KVM doesn't yet virtualize LAM_U{48,57}. */
- return gva;
+ cr3_bits = kvm_get_active_cr3_lam_bits(vcpu);
+ if (!(cr3_bits & (X86_CR3_LAM_U57 | X86_CR3_LAM_U48)))
+ return gva;
+
+ /* LAM_U48 is ignored if LAM_U57 is set. */
+ lam_bit = cr3_bits & X86_CR3_LAM_U57 ? 56 : 47;
} else {
if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_LAM_SUP))
return gva;
--
2.25.1
Use the new flag X86EMUL_F_BRANCH instead of X86EMUL_F_FETCH in assign_eip()
to distinguish instruction fetches from branch target computations for features
that handle them differently, e.g. Linear Address Space Separation (LASS).
As of this patch, X86EMUL_F_BRANCH and X86EMUL_F_FETCH are identical as far
as KVM is concerned.
No functional change intended.
Signed-off-by: Binbin Wu <[email protected]>
Tested-by: Xuelian Guo <[email protected]>
---
arch/x86/kvm/emulate.c | 5 +++--
arch/x86/kvm/kvm_emulate.h | 1 +
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 87ee1802166a..274d6e7aa0c1 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -721,7 +721,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
(flags & X86EMUL_F_WRITE))
goto bad;
/* unreadable code segment */
- if (!(flags & X86EMUL_F_FETCH) && (desc.type & 8) && !(desc.type & 2))
+ if (!(flags & (X86EMUL_F_FETCH | X86EMUL_F_BRANCH)) &&
+ (desc.type & 8) && !(desc.type & 2))
goto bad;
lim = desc_limit_scaled(&desc);
if (!(desc.type & 8) && (desc.type & 4)) {
@@ -772,7 +773,7 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
if (ctxt->op_bytes != sizeof(unsigned long))
addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
rc = __linearize(ctxt, addr, &max_size, 1, ctxt->mode, &linear,
- X86EMUL_F_FETCH);
+ X86EMUL_F_BRANCH);
if (rc == X86EMUL_CONTINUE)
ctxt->_eip = addr.ea;
return rc;
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index e24c8ac7b930..e1fd83908334 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -91,6 +91,7 @@ struct x86_instruction_info {
/* x86-specific emulation flags */
#define X86EMUL_F_WRITE BIT(0)
#define X86EMUL_F_FETCH BIT(1)
+#define X86EMUL_F_BRANCH BIT(2)
struct x86_emulate_ops {
void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
--
2.25.1
From: Zeng Guang <[email protected]>
When Intel Linear Address Space Separation (LASS) is enabled, the
processor applies a LASS violation check to every access to a linear
address. To align with hardware behavior, KVM needs to perform the
same check in instruction emulation.
Define a new function in x86_emulate_ops to perform the LASS violation
check in the KVM emulator. The function accepts an address and a size, which
delimit the memory access, and a flags value, which provides extra information
about the access that is necessary for LASS violation checks, e.g., whether
the access is an instruction fetch or an implicit access.
emulator_is_lass_violation() is just a placeholder for now; it will be wired
up to the VMX/SVM implementation by a later patch.
Signed-off-by: Zeng Guang <[email protected]>
Signed-off-by: Binbin Wu <[email protected]>
Tested-by: Xuelian Guo <[email protected]>
---
arch/x86/include/asm/kvm-x86-ops.h | 3 ++-
arch/x86/include/asm/kvm_host.h | 3 +++
arch/x86/kvm/emulate.c | 11 +++++++++++
arch/x86/kvm/kvm_emulate.h | 3 +++
arch/x86/kvm/x86.c | 10 ++++++++++
5 files changed, 29 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 179931b73876..fc9945e80177 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -133,8 +133,9 @@ KVM_X86_OP_OPTIONAL(migrate_timers)
KVM_X86_OP(msr_filter_changed)
KVM_X86_OP(complete_emulated_msr)
KVM_X86_OP(vcpu_deliver_sipi_vector)
-KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
+KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons)
KVM_X86_OP(get_untagged_addr)
+KVM_X86_OP_OPTIONAL_RET0(is_lass_violation)
#undef KVM_X86_OP
#undef KVM_X86_OP_OPTIONAL
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d4e3657b840a..3e73fc45c8e6 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1754,6 +1754,9 @@ struct kvm_x86_ops {
unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
gva_t (*get_untagged_addr)(struct kvm_vcpu *vcpu, gva_t gva, unsigned int flags);
+
+ bool (*is_lass_violation)(struct kvm_vcpu *vcpu, unsigned long addr,
+ unsigned int size, unsigned int flags);
};
struct kvm_x86_nested_ops {
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 7af58b8d57ac..cbd08daeae9e 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -742,6 +742,10 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
}
break;
}
+
+ if (ctxt->ops->is_lass_violation(ctxt, *linear, size, flags))
+ goto bad;
+
if (la & (insn_alignment(ctxt, size) - 1))
return emulate_gp(ctxt, 0);
return X86EMUL_CONTINUE;
@@ -848,6 +852,9 @@ static inline int jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
static int linear_read_system(struct x86_emulate_ctxt *ctxt, ulong linear,
void *data, unsigned size)
{
+ if (ctxt->ops->is_lass_violation(ctxt, linear, size, X86EMUL_F_IMPLICIT))
+ return emulate_gp(ctxt, 0);
+
return ctxt->ops->read_std(ctxt, linear, data, size, &ctxt->exception, true);
}
@@ -855,6 +862,10 @@ static int linear_write_system(struct x86_emulate_ctxt *ctxt,
ulong linear, void *data,
unsigned int size)
{
+ if (ctxt->ops->is_lass_violation(ctxt, linear, size,
+ X86EMUL_F_IMPLICIT | X86EMUL_F_WRITE))
+ return emulate_gp(ctxt, 0);
+
return ctxt->ops->write_std(ctxt, linear, data, size, &ctxt->exception, true);
}
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index 26f402616604..a76baa51fa16 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -234,6 +234,9 @@ struct x86_emulate_ops {
gva_t (*get_untagged_addr)(struct x86_emulate_ctxt *ctxt, gva_t addr,
unsigned int flags);
+
+ bool (*is_lass_violation)(struct x86_emulate_ctxt *ctxt, unsigned long addr,
+ unsigned int size, unsigned int flags);
};
/* Type, address-of, and value of an instruction's operand. */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4c2cdfcae79d..58d7a9241630 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8317,6 +8317,15 @@ static gva_t emulator_get_untagged_addr(struct x86_emulate_ctxt *ctxt,
return static_call(kvm_x86_get_untagged_addr)(emul_to_vcpu(ctxt), addr, flags);
}
+static bool emulator_is_lass_violation(struct x86_emulate_ctxt *ctxt,
+ unsigned long addr,
+ unsigned int size,
+ unsigned int flags)
+{
+ return static_call(kvm_x86_is_lass_violation)(emul_to_vcpu(ctxt),
+ addr, size, flags);
+}
+
static const struct x86_emulate_ops emulate_ops = {
.vm_bugged = emulator_vm_bugged,
.read_gpr = emulator_read_gpr,
@@ -8362,6 +8371,7 @@ static const struct x86_emulate_ops emulate_ops = {
.triple_fault = emulator_triple_fault,
.set_xcr = emulator_set_xcr,
.get_untagged_addr = emulator_get_untagged_addr,
+ .is_lass_violation = emulator_is_lass_violation,
};
static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
--
2.25.1
Add an emulation flag X86EMUL_F_INVLPG, which is used to identify an
instruction that does TLB invalidation without true memory access.
Only invlpg & invlpga implemented in the emulator are of this kind, and
invlpga doesn't need the additional information for emulation, so just pass
the flag to em_invlpg().
Linear Address Masking (LAM) and Linear Address Space Separation (LASS)
don't apply to addresses that are inputs to TLB invalidation. The flag
will be consumed to support LAM/LASS virtualization.
Signed-off-by: Binbin Wu <[email protected]>
Tested-by: Xuelian Guo <[email protected]>
---
arch/x86/kvm/emulate.c | 4 +++-
arch/x86/kvm/kvm_emulate.h | 1 +
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 274d6e7aa0c1..f54e1a2cafa9 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3441,8 +3441,10 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
{
int rc;
ulong linear;
+ unsigned int max_size;
- rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
+ rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, ctxt->mode,
+ &linear, X86EMUL_F_INVLPG);
if (rc == X86EMUL_CONTINUE)
ctxt->ops->invlpg(ctxt, linear);
/* Disable writeback. */
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index 5f9869018332..70f329a685fe 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -93,6 +93,7 @@ struct x86_instruction_info {
#define X86EMUL_F_FETCH BIT(1)
#define X86EMUL_F_BRANCH BIT(2)
#define X86EMUL_F_IMPLICIT BIT(3)
+#define X86EMUL_F_INVLPG BIT(4)
struct x86_emulate_ops {
void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
--
2.25.1
From: Robert Hoo <[email protected]>
Add support to allow guests to set the new CR4 control bit for LAM and add
implementation to get untagged address for supervisor pointers.
LAM modifies the canonicality check applied to 64-bit linear addresses for
data accesses, allowing software to use the untranslated address bits for
metadata; the metadata bits are masked off before the addresses are used
to access memory. LAM uses CR4.LAM_SUP (bit 28) to configure and enable LAM
for supervisor pointers. It also changes VM entry to allow the bit to be set
in VMCS's HOST_CR4 and GUEST_CR4 to support virtualization. Note CR4.LAM_SUP
is allowed to be set even when not in 64-bit mode, but it will not take effect
since LAM only applies to 64-bit linear addresses.
Move CR4.LAM_SUP out of CR4_RESERVED_BITS; whether it is reserved now depends
on whether the vCPU supports LAM. Leave the bit intercepted to prevent the
guest from setting it if LAM is not exposed to the guest, as well as to avoid
a VMREAD every time KVM fetches its value, with the expectation that the guest
won't toggle the bit frequently.
Set the CR4.LAM_SUP bit in the emulated IA32_VMX_CR4_FIXED1 MSR for guests to
allow guests to enable LAM for supervisor pointers in nested VMX operation.
Hardware is not required to do a TLB flush when CR4.LAM_SUP is toggled, so KVM
doesn't need to emulate a TLB flush based on it.
There's no other feature/vmx_exec_controls connection, so no other code is
needed in {kvm,vmx}_set_cr4().
Skip address untagging for instruction fetches, branch targets and the operand
of INVLPG, which LAM doesn't apply to. Also skip address untagging for implicit
system accesses since LAM doesn't apply to the loading of base addresses of
memory management registers and segment registers; their values still need to
be canonical (for now, the get_untagged_addr() interface is not called for
implicit system accesses, this is just future proofing).
Signed-off-by: Robert Hoo <[email protected]>
Co-developed-by: Binbin Wu <[email protected]>
Signed-off-by: Binbin Wu <[email protected]>
Reviewed-by: Chao Gao <[email protected]>
Reviewed-by: Kai Huang <[email protected]>
Tested-by: Xuelian Guo <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 3 ++-
arch/x86/kvm/vmx/vmx.c | 40 ++++++++++++++++++++++++++++++++-
arch/x86/kvm/x86.h | 2 ++
3 files changed, 43 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 08e94f30d376..d4e3657b840a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -125,7 +125,8 @@
| X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
| X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
| X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
- | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+ | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
+ | X86_CR4_LAM_SUP))
#define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index b572cfe27342..ee35a91aa584 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7677,6 +7677,9 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
cr4_fixed1_update(X86_CR4_UMIP, ecx, feature_bit(UMIP));
cr4_fixed1_update(X86_CR4_LA57, ecx, feature_bit(LA57));
+ entry = kvm_find_cpuid_entry_index(vcpu, 0x7, 1);
+ cr4_fixed1_update(X86_CR4_LAM_SUP, eax, feature_bit(LAM));
+
#undef cr4_fixed1_update
}
@@ -8209,9 +8212,44 @@ static void vmx_vm_destroy(struct kvm *kvm)
free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
}
+/*
+ * Note, the SDM states that the linear address is masked *after* the modified
+ * canonicality check, whereas KVM masks (untags) the address and then performs
+ * a "normal" canonicality check. Functionally, the two methods are identical,
+ * and when the masking occurs relative to the canonicality check isn't visible
+ * to software, i.e. KVM's behavior doesn't violate the SDM.
+ */
gva_t vmx_get_untagged_addr(struct kvm_vcpu *vcpu, gva_t gva, unsigned int flags)
{
- return gva;
+ int lam_bit;
+
+ if (flags & (X86EMUL_F_FETCH | X86EMUL_F_BRANCH | X86EMUL_F_IMPLICIT |
+ X86EMUL_F_INVLPG))
+ return gva;
+
+ if (!is_64_bit_mode(vcpu))
+ return gva;
+
+ /*
+ * Bit 63 determines if the address should be treated as user address
+ * or a supervisor address.
+ */
+ if (!(gva & BIT_ULL(63))) {
+ /* KVM doesn't yet virtualize LAM_U{48,57}. */
+ return gva;
+ } else {
+ if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_LAM_SUP))
+ return gva;
+
+ lam_bit = kvm_is_cr4_bit_set(vcpu, X86_CR4_LA57) ? 56 : 47;
+ }
+
+ /*
+ * Untag the address by sign-extending the lam_bit, but NOT to bit 63.
+ * Bit 63 is retained from the raw virtual address so that untagging
+ * doesn't change a user access to a supervisor access, and vice versa.
+ */
+ return (sign_extend64(gva, lam_bit) & ~BIT_ULL(63)) | (gva & BIT_ULL(63));
}
static struct kvm_x86_ops vmx_x86_ops __initdata = {
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 1e7be1f6ab29..53e883721e71 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -529,6 +529,8 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type);
__reserved_bits |= X86_CR4_VMXE; \
if (!__cpu_has(__c, X86_FEATURE_PCID)) \
__reserved_bits |= X86_CR4_PCIDE; \
+ if (!__cpu_has(__c, X86_FEATURE_LAM)) \
+ __reserved_bits |= X86_CR4_LAM_SUP; \
__reserved_bits; \
})
--
2.25.1
Consolidate @write and @fetch of __linearize() into a set of flags so that
additional flags can be added without needing more/new boolean parameters,
to precisely identify the access type.
No functional change intended.
Signed-off-by: Binbin Wu <[email protected]>
Reviewed-by: Chao Gao <[email protected]>
Acked-by: Kai Huang <[email protected]>
Tested-by: Xuelian Guo <[email protected]>
---
arch/x86/kvm/emulate.c | 21 +++++++++++----------
arch/x86/kvm/kvm_emulate.h | 4 ++++
2 files changed, 15 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 2673cd5c46cb..87ee1802166a 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -687,8 +687,8 @@ static unsigned insn_alignment(struct x86_emulate_ctxt *ctxt, unsigned size)
static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
struct segmented_address addr,
unsigned *max_size, unsigned size,
- bool write, bool fetch,
- enum x86emul_mode mode, ulong *linear)
+ enum x86emul_mode mode, ulong *linear,
+ unsigned int flags)
{
struct desc_struct desc;
bool usable;
@@ -717,11 +717,11 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
if (!usable)
goto bad;
/* code segment in protected mode or read-only data segment */
- if ((((ctxt->mode != X86EMUL_MODE_REAL) && (desc.type & 8))
- || !(desc.type & 2)) && write)
+ if ((((ctxt->mode != X86EMUL_MODE_REAL) && (desc.type & 8)) || !(desc.type & 2)) &&
+ (flags & X86EMUL_F_WRITE))
goto bad;
/* unreadable code segment */
- if (!fetch && (desc.type & 8) && !(desc.type & 2))
+ if (!(flags & X86EMUL_F_FETCH) && (desc.type & 8) && !(desc.type & 2))
goto bad;
lim = desc_limit_scaled(&desc);
if (!(desc.type & 8) && (desc.type & 4)) {
@@ -757,8 +757,8 @@ static int linearize(struct x86_emulate_ctxt *ctxt,
ulong *linear)
{
unsigned max_size;
- return __linearize(ctxt, addr, &max_size, size, write, false,
- ctxt->mode, linear);
+ return __linearize(ctxt, addr, &max_size, size, ctxt->mode, linear,
+ write ? X86EMUL_F_WRITE : 0);
}
static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
@@ -771,7 +771,8 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
if (ctxt->op_bytes != sizeof(unsigned long))
addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
- rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode, &linear);
+ rc = __linearize(ctxt, addr, &max_size, 1, ctxt->mode, &linear,
+ X86EMUL_F_FETCH);
if (rc == X86EMUL_CONTINUE)
ctxt->_eip = addr.ea;
return rc;
@@ -907,8 +908,8 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
* boundary check itself. Instead, we use max_size to check
* against op_size.
*/
- rc = __linearize(ctxt, addr, &max_size, 0, false, true, ctxt->mode,
- &linear);
+ rc = __linearize(ctxt, addr, &max_size, 0, ctxt->mode, &linear,
+ X86EMUL_F_FETCH);
if (unlikely(rc != X86EMUL_CONTINUE))
return rc;
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index be7aeb9b8ea3..e24c8ac7b930 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -88,6 +88,10 @@ struct x86_instruction_info {
#define X86EMUL_IO_NEEDED 5 /* IO is needed to complete emulation */
#define X86EMUL_INTERCEPTED 6 /* Intercepted by nested VMCB/VMCS */
+/* x86-specific emulation flags */
+#define X86EMUL_F_WRITE BIT(0)
+#define X86EMUL_F_FETCH BIT(1)
+
struct x86_emulate_ops {
void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
/*
--
2.25.1
From: Zeng Guang <[email protected]>
Linear address space separation (LASS) is an independent mechanism
that enforces the mode-based protection that can prevent user-mode
accesses to supervisor-mode addresses, and vice versa. Because the
LASS protections are applied before paging, malicious software cannot
acquire any paging-based timing information to compromise the
security of the system.
The CPUID bit definition to support LASS:
CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6]
Advertise LASS to user space to support LASS virtualization.
Note: KVM's exposure of LASS also depends on the CPUID capability held
by the host kernel. LASS will be masked off for the guest if the host
vsyscall is in emulate mode, which effectively disables LASS.
Signed-off-by: Zeng Guang <[email protected]>
Signed-off-by: Binbin Wu <[email protected]>
Tested-by: Xuelian Guo <[email protected]>
---
arch/x86/kvm/cpuid.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index a0db266bab73..81a52218c20f 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -675,7 +675,7 @@ void kvm_set_cpu_caps(void)
kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL_SSBD);
kvm_cpu_cap_mask(CPUID_7_1_EAX,
- F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) |
+ F(AVX_VNNI) | F(AVX512_BF16) | F(LASS) | F(CMPCCXADD) |
F(FZRM) | F(FSRS) | F(FSRC) |
F(AMX_FP16) | F(AVX_IFMA) | F(LAM)
);
--
2.25.1
Hi Sean,
Does this version of LAM patch set have the chance to be pulled for 6.7?
On 9/13/2023 8:42 PM, Binbin Wu wrote:
> This patch series includes KVM enabling patches for Linear-address masking
> (LAM) v11 and Linear Address Space Separation (LASS) v3 since the two features
> have overlapping prep work and concepts. Sent as a single series to reduce the
> probability of conflicts.
> [...]
On Sun, Oct 08, 2023, Binbin Wu wrote:
> Hi Sean,
>
> Does this version of LAM patch set have the chance to be pulled for 6.7?
There's still a chance, but I haven't looked at this version yet, so I can't give
a more confident answer, sorry. For a variety of reasons, my review time this
cycle has been much more limited than I anticipated.
On Wed, Sep 13, 2023, Binbin Wu wrote:
> Binbin Wu (10):
> KVM: x86: Consolidate flags for __linearize()
> KVM: x86: Use a new flag for branch targets
> KVM: x86: Add an emulation flag for implicit system access
> KVM: x86: Add X86EMUL_F_INVLPG and pass it in em_invlpg()
> KVM: x86/mmu: Drop non-PA bits when getting GFN for guest's PGD
> KVM: x86: Add & use kvm_vcpu_is_legal_cr3() to check CR3's legality
> KVM: x86: Remove kvm_vcpu_is_illegal_gpa()
> KVM: x86: Introduce get_untagged_addr() in kvm_x86_ops and call it in
> emulator
> KVM: x86: Untag address for vmexit handlers when LAM applicable
> KVM: x86: Use KVM-governed feature framework to track "LAM enabled"
>
> Robert Hoo (3):
> KVM: x86: Virtualize LAM for supervisor pointer
> KVM: x86: Virtualize LAM for user pointer
> KVM: x86: Advertise and enable LAM (user and supervisor)
>
> Zeng Guang (3):
> KVM: emulator: Add emulation of LASS violation checks on linear
> address
> KVM: VMX: Virtualize LASS
> KVM: x86: Advertise LASS CPUID to user space
This all looks good! I have a few minor nits, but nothing I can't tweak when
applying. Assuming nothing explodes in testing, I'll get this applied for 6.8
next week.
My apologies for not getting to this sooner and missing 6.7 :-(
On Fri, Oct 20, 2023, Sean Christopherson wrote:
> On Wed, Sep 13, 2023, Binbin Wu wrote:
> > Binbin Wu (10):
> > KVM: x86: Consolidate flags for __linearize()
> > KVM: x86: Use a new flag for branch targets
> > KVM: x86: Add an emulation flag for implicit system access
> > KVM: x86: Add X86EMUL_F_INVLPG and pass it in em_invlpg()
> > KVM: x86/mmu: Drop non-PA bits when getting GFN for guest's PGD
> > KVM: x86: Add & use kvm_vcpu_is_legal_cr3() to check CR3's legality
> > KVM: x86: Remove kvm_vcpu_is_illegal_gpa()
> > KVM: x86: Introduce get_untagged_addr() in kvm_x86_ops and call it in
> > emulator
> > KVM: x86: Untag address for vmexit handlers when LAM applicable
> > KVM: x86: Use KVM-governed feature framework to track "LAM enabled"
> >
> > Robert Hoo (3):
> > KVM: x86: Virtualize LAM for supervisor pointer
> > KVM: x86: Virtualize LAM for user pointer
> > KVM: x86: Advertise and enable LAM (user and supervisor)
> >
> > Zeng Guang (3):
> > KVM: emulator: Add emulation of LASS violation checks on linear
> > address
> > KVM: VMX: Virtualize LASS
> > KVM: x86: Advertise LASS CPUID to user space
>
> This all looks good! I have a few minor nits, but nothing I can't tweak when
> applying. Assuming nothing explodes in testing, I'll get this applied for 6.8
> next week.
Gah, by "this" I meant the LAM parts. LASS is going to have to wait until the
kernel support lands.
On 10/21/2023 8:34 AM, Sean Christopherson wrote:
> On Fri, Oct 20, 2023, Sean Christopherson wrote:
>> On Wed, Sep 13, 2023, Binbin Wu wrote:
>>> Binbin Wu (10):
>>> KVM: x86: Consolidate flags for __linearize()
>>> KVM: x86: Use a new flag for branch targets
>>> KVM: x86: Add an emulation flag for implicit system access
>>> KVM: x86: Add X86EMUL_F_INVLPG and pass it in em_invlpg()
>>> KVM: x86/mmu: Drop non-PA bits when getting GFN for guest's PGD
>>> KVM: x86: Add & use kvm_vcpu_is_legal_cr3() to check CR3's legality
>>> KVM: x86: Remove kvm_vcpu_is_illegal_gpa()
>>> KVM: x86: Introduce get_untagged_addr() in kvm_x86_ops and call it in
>>> emulator
>>> KVM: x86: Untag address for vmexit handlers when LAM applicable
>>> KVM: x86: Use KVM-governed feature framework to track "LAM enabled"
>>>
>>> Robert Hoo (3):
>>> KVM: x86: Virtualize LAM for supervisor pointer
>>> KVM: x86: Virtualize LAM for user pointer
>>> KVM: x86: Advertise and enable LAM (user and supervisor)
>>>
>>> Zeng Guang (3):
>>> KVM: emulator: Add emulation of LASS violation checks on linear
>>> address
>>> KVM: VMX: Virtualize LASS
>>> KVM: x86: Advertise LASS CPUID to user space
>> This all looks good! I have a few minor nits, but nothing I can't tweak when
>> applying. Assuming nothing explodes in testing, I'll get this applied for 6.8
>> next week.
Thanks very much!
> Gah, by "this" I meant the LAM parts. LASS is going to have to wait until the
> kernel support lands.
On Wed, Sep 13, 2023, Binbin Wu wrote:
> Use the new flag X86EMUL_F_BRANCH instead of X86EMUL_F_FETCH in assign_eip()
> to distinguish instruction fetch and branch target computation for features
> that handle differently on them, e.g. Linear Address Space Separation (LASS).
A slightly different shortlog+changelog:
KVM: x86: Add an emulator flag to differentiate branch targets from fetches
Add an emulator flag, X86EMUL_F_BRANCH, and use it instead of
X86EMUL_F_FETCH in assign_eip() to distinguish between instruction fetch
and branch target computation for features that handle them differently,
e.g. Intel's upcoming Linear Address Space Separation (LASS) applies to
code fetches but not branch target calculations.
The shortlog in particular is far too vague.
> As of this patch, X86EMUL_F_BRANCH and X86EMUL_F_FETCH are identical as far
> as KVM is concerned.
This patch looks good, but I'm going to skip it for now as it's not needed until
LASS is supported, since LAM doesn't differentiate between the two. I.e. this
should have been the first patch in the LASS portion of the series. No need to
repost, it's trivially easy to tweak vmx_get_untagged_addr().
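For the archives, a rough sketch of what that tweak amounts to -- illustrative only,
not the applied code: vmx_get_untagged_addr() simply needs to treat X86EMUL_F_BRANCH
(the flag added by the skipped patch) exactly like X86EMUL_F_FETCH, since LAM never
applies to instruction pointers, whereas a future LASS check will care about the
difference.

  /* Illustrative sketch only, not the applied patch. */
  gva_t vmx_get_untagged_addr(struct kvm_vcpu *vcpu, gva_t gva,
                              unsigned int flags)
  {
          /*
           * LAM untags only data addresses: code fetches, branch targets,
           * implicit system accesses, and INVLPG operands are left as-is.
           */
          if (flags & (X86EMUL_F_FETCH | X86EMUL_F_BRANCH |
                       X86EMUL_F_IMPLICIT | X86EMUL_F_INVLPG))
                  return gva;

          /* ... the existing CR3/CR4-based LAM untagging of data addresses ... */
          return gva;
  }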
On Wed, 13 Sep 2023 20:42:11 +0800, Binbin Wu wrote:
> This patch series includes KVM enabling patches for Linear-address masking
> (LAM) v11 and Linear Address Space Separation (LASS) v3 since the two features
> have overlapping prep work and concepts. Sent as a single series to reduce the
> probability of conflicts.
>
> The patch series is organized as follows:
> - Patch 1-4: Common prep work for both LAM and LASS.
> - Patch 5-13: LAM part.
> - Patch 14-16: LASS part.
>
> [...]
Applied to kvm-x86 lam (for 6.8)! I skipped the LASS patches, including patch 2
(the branch targets patch). I kept the IMPLICIT emulator flag even though it's
not strictly needed as it's a nice way to document non-existent code.
I massaged a few changelogs and fixed the KVM_X86_OP_OPTIONAL() issue, but
otherwise I don't think I made any code changes (it's been a long day :-) ).
Please take a look to make sure it all looks good.
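For reference, the hook in question is get_untagged_addr(). A rough illustration of
the optional-op pattern involved -- not necessarily the exact fixup that was applied,
and the helper name below is just for the example:

  /* Illustrative only -- the optional-hook pattern, not the exact applied diff. */

  /* arch/x86/include/asm/kvm-x86-ops.h: the hook has no SVM implementation. */
  KVM_X86_OP_OPTIONAL(get_untagged_addr)

  /* Callers of an optional op must tolerate a NULL hook; untagging becomes a nop. */
  static inline gva_t kvm_get_untagged_addr(struct kvm_vcpu *vcpu, gva_t gva,
                                            unsigned int flags)
  {
          if (!kvm_x86_ops.get_untagged_addr)
                  return gva;

          return static_call(kvm_x86_get_untagged_addr)(vcpu, gva, flags);
  }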
Thanks!
[01/16] KVM: x86: Consolidate flags for __linearize()
https://github.com/kvm-x86/linux/commit/81c940395b14
[02/16] KVM: x86: Use a new flag for branch targets
(no commit info)
[03/16] KVM: x86: Add an emulation flag for implicit system access
https://github.com/kvm-x86/linux/commit/90532843aebf
[04/16] KVM: x86: Add X86EMUL_F_INVLPG and pass it in em_invlpg()
https://github.com/kvm-x86/linux/commit/34b4ed7c1eaf
[05/16] KVM: x86/mmu: Drop non-PA bits when getting GFN for guest's PGD
https://github.com/kvm-x86/linux/commit/8b83853c5c98
[06/16] KVM: x86: Add & use kvm_vcpu_is_legal_cr3() to check CR3's legality
https://github.com/kvm-x86/linux/commit/82ba7169837e
[07/16] KVM: x86: Remove kvm_vcpu_is_illegal_gpa()
https://github.com/kvm-x86/linux/commit/95df55ee42fe
[08/16] KVM: x86: Introduce get_untagged_addr() in kvm_x86_ops and call it in emulator
https://github.com/kvm-x86/linux/commit/7a747b6c84a1
[09/16] KVM: x86: Untag address for vmexit handlers when LAM applicable
https://github.com/kvm-x86/linux/commit/ef99001b30a8
[10/16] KVM: x86: Virtualize LAM for supervisor pointer
https://github.com/kvm-x86/linux/commit/4daea9a5183f
[11/16] KVM: x86: Virtualize LAM for user pointer
https://github.com/kvm-x86/linux/commit/0cadc474eff0
[12/16] KVM: x86: Advertise and enable LAM (user and supervisor)
https://github.com/kvm-x86/linux/commit/6ef90ee226f1
[13/16] KVM: x86: Use KVM-governed feature framework to track "LAM enabled"
https://github.com/kvm-x86/linux/commit/b291db540763
[14/16] KVM: emulator: Add emulation of LASS violation checks on linear address
(no commit info)
[15/16] KVM: VMX: Virtualize LASS
(no commit info)
[16/16] KVM: x86: Advertise LASS CPUID to user space
(no commit info)
--
https://github.com/kvm-x86/linux/tree/next
On 10/24/2023 7:43 AM, Sean Christopherson wrote:
> On Wed, 13 Sep 2023 20:42:11 +0800, Binbin Wu wrote:
>> This patch series includes KVM enabling patches for Linear-address masking
>> (LAM) v11 and Linear Address Space Separation (LASS) v3 since the two features
>> have overlapping prep work and concepts. Sent as a single series to reduce the
>> probability of conflicts.
>>
>> The patch series is organized as follows:
>> - Patch 1-4: Common prep work for both LAM and LASS.
>> - Patch 5-13: LAM part.
>> - Patch 14-16: LASS part.
>>
>> [...]
> Applied to kvm-x86 lam (for 6.8)! I skipped the LASS patches, including patch 2
> (the branch targets patch). I kept the IMPLICIT emulator flag even though it's
> not strictly needed as it's a nice way to document non-existent code.
>
> I massaged a few changelogs and fixed the KVM_X86_OP_OPTIONAL() issue, but
> otherwise I don't think I made any code changes (it's been a long day :-) ).
> Please take a look to make sure it all looks good.
Hi Sean,
Thanks for massaging the changelogs and fixing the KVM_X86_OP_OPTIONAL() issue.
The LAM patches were applied as expected.