2023-07-19 03:31:34

by Zeng Guang

Subject: [PATCH v2 0/8] LASS KVM virtualization support

Linear Address Space Separation (LASS)[1] is a new mechanism that
enforces the same mode-based protections as paging, i.e. SMAP/SMEP,
but without traversing the paging structures. Because the protections
enforced by LASS are applied before paging, "probes" by malicious
software provide no paging-based timing information.

LASS partitions the 64-bit linear address space into two halves:
user-mode addresses (LA[bit 63]=0) and supervisor-mode addresses
(LA[bit 63]=1).
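
Expressed in C, the partition is a single bit test. A minimal
illustrative sketch (the helper name is ours, not part of the
patches):

    static inline bool lass_is_user_address(u64 la)
    {
            /* LA[bit 63]: 0 = user half, 1 = supervisor half */
            return !(la & BIT_ULL(63));
    }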

LASS aims to prevent any attempt by user mode to probe supervisor-mode
addresses, and likewise to stop any attempt by supervisor mode to
access (if SMAP is enabled) or execute user-mode addresses.

When the platform has the LASS capability, KVM needs to expose the
feature to the guest VM, enumerated by
CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6], and to allow the guest to
enable it via CR4.LASS[bit 27] on demand. For instructions executed
directly in the guest, hardware performs the check. But KVM also needs
to behave the same as hardware and apply LASS to all kinds of guest
memory accesses when emulating instructions in software.

KVM performs the following LASS violation checks on the emulation path
(condensed into a C sketch after the rules):
User-mode access to supervisor space address:
    LA[bit 63] && (CPL == 3)
Supervisor-mode access to user space address:
    Instruction fetch: !LA[bit 63] && (CPL < 3)
    Data access: !LA[bit 63] && (CR4.SMAP==1) &&
                 ((RFLAGS.AC == 0 && CPL < 3) || Implicit supervisor access)
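
Condensed into C, the rules look roughly as follows. This is an
illustrative sketch only (the real check is vmx_is_lass_violation()
in patch 6; wraparound across the address space boundary is omitted
here for brevity):

    static bool lass_violation(u64 la, unsigned int cpl, u64 cr4,
                               u64 rflags, bool fetch, bool implicit)
    {
            bool user_half = lass_is_user_address(la);

            /* User-mode access to supervisor space address */
            if (!implicit && cpl == 3)
                    return !user_half;

            /* Supervisor-mode instruction fetch from user space */
            if (fetch)
                    return user_half;

            /* Supervisor-mode data access to user space */
            return user_half && (cr4 & X86_CR4_SMAP) &&
                   (implicit || !(rflags & X86_EFLAGS_AC));
    }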

This patch series provides the KVM virtualization support for LASS and
depends on the kernel enabling series that can be found at [2].

We tested the basic functionality of LASS virtualization, including
LASS enumeration and enabling, in both non-root and nested
environments. As the KVM unit test framework is not compatible with
the LASS rules, we used a kernel module and an application test to
trigger LASS violations instead. With the KVM forced emulation
mechanism, we also verified that instruction fetches and data accesses
on several emulation paths behave the same as on hardware.

How to extend kselftest to support LASS is under investigation.

[1] Intel ISE spec https://cdrdv2.intel.com/v1/dl/getContent/671368
Chapter Linear Address Space Separation (LASS)

[2] LASS kernel patch series
https://lore.kernel.org/all/[email protected]/

------------------------------------------------------------------------

v1->v2
1. Refactor and optimize the instruction emulation interface by
introducing a new set of operation type definitions prefixed with
"X86EMUL_F_" to distinguish access types.
2. Reorganize the patches so that each KVM area is better isolated.
3. Refine the LASS violation check design to account for wraparound
accesses across the address space boundary.

v0->v1
1. Adapt to the new __linearize() API.
2. Refactor vmx_check_lass().
3. Refine commit messages to be more precise.
4. Drop LASS KVM capability detection depending on hardware capability.

Binbin Wu (4):
KVM: x86: Consolidate flags for __linearize()
KVM: x86: Use a new flag for branch instructions
KVM: x86: Add an emulation flag for implicit system access
KVM: x86: Add X86EMUL_F_INVTLB and pass it in em_invlpg()

Zeng Guang (4):
KVM: emulator: Add emulation of LASS violation checks on linear
address
KVM: VMX: Implement and apply vmx_is_lass_violation() for LASS
protection
KVM: x86: Virtualize CR4.LASS
KVM: x86: Advertise LASS CPUID to user space

arch/x86/include/asm/kvm-x86-ops.h | 3 ++-
arch/x86/include/asm/kvm_host.h | 5 +++-
arch/x86/kvm/cpuid.c | 5 ++--
arch/x86/kvm/emulate.c | 37 ++++++++++++++++++++---------
arch/x86/kvm/kvm_emulate.h | 9 +++++++
arch/x86/kvm/vmx/nested.c | 3 ++-
arch/x86/kvm/vmx/sgx.c | 4 ++++
arch/x86/kvm/vmx/vmx.c | 38 ++++++++++++++++++++++++++++++
arch/x86/kvm/vmx/vmx.h | 3 +++
arch/x86/kvm/x86.c | 10 ++++++++
arch/x86/kvm/x86.h | 2 ++
11 files changed, 102 insertions(+), 17 deletions(-)

--
2.27.0



2023-07-19 03:32:53

by Zeng Guang

Subject: [PATCH v2 4/8] KVM: x86: Add X86EMUL_F_INVTLB and pass it in em_invlpg()

From: Binbin Wu <[email protected]>

Add an emulation flag X86EMUL_F_INVTLB, which is used to identify an
instruction that does TLB invalidation without true memory access.

Only invlpg & invlpga implemented in emulator belong to this kind.
invlpga doesn't need additional information for emulation. Just pass
the flag to em_invlpg().

Signed-off-by: Binbin Wu <[email protected]>
Signed-off-by: Zeng Guang <[email protected]>
---
arch/x86/kvm/emulate.c | 4 +++-
arch/x86/kvm/kvm_emulate.h | 1 +
2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 8e706d19ae45..9b4b3ce6d52a 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3443,8 +3443,10 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
{
int rc;
ulong linear;
+ unsigned max_size;

- rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
+ rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, ctxt->mode,
+ &linear, X86EMUL_F_INVTLB);
if (rc == X86EMUL_CONTINUE)
ctxt->ops->invlpg(ctxt, linear);
/* Disable writeback. */
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index c0e48f4fa7c4..c944055091e1 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -93,6 +93,7 @@ struct x86_instruction_info {
#define X86EMUL_F_FETCH BIT(1)
#define X86EMUL_F_BRANCH BIT(2)
#define X86EMUL_F_IMPLICIT BIT(3)
+#define X86EMUL_F_INVTLB BIT(4)

struct x86_emulate_ops {
void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
--
2.27.0


2023-07-19 03:33:00

by Zeng Guang

Subject: [PATCH v2 6/8] KVM: VMX: Implement and apply vmx_is_lass_violation() for LASS protection

Implement and wire up vmx_is_lass_violation() in kvm_x86_ops for VMX.

The LASS violation check takes effect in KVM emulation of instruction
fetches and data accesses, including implicit accesses, when the vCPU
is running in long mode. It is also applied in emulation of VMX
instructions and the SGX ENCLS instruction to enforce the mode-based
protections before paging.

As exceptions, the target memory addresses of emulated TLB invalidation
and branch instructions are not subject to LASS.

Signed-off-by: Zeng Guang <[email protected]>
Tested-by: Xuelian Guo <[email protected]>
---
arch/x86/kvm/vmx/nested.c | 3 ++-
arch/x86/kvm/vmx/sgx.c | 4 ++++
arch/x86/kvm/vmx/vmx.c | 35 +++++++++++++++++++++++++++++++++++
arch/x86/kvm/vmx/vmx.h | 3 +++
4 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index e35cf0bd0df9..72e78566a3b6 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4985,7 +4985,8 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
* non-canonical form. This is the only check on the memory
* destination for long mode!
*/
- exn = is_noncanonical_address(*ret, vcpu);
+ exn = is_noncanonical_address(*ret, vcpu) ||
+ vmx_is_lass_violation(vcpu, *ret, len, 0);
} else {
/*
* When not in long mode, the virtual/linear address is
diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
index 2261b684a7d4..f8de637ce634 100644
--- a/arch/x86/kvm/vmx/sgx.c
+++ b/arch/x86/kvm/vmx/sgx.c
@@ -46,6 +46,10 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
((s.base != 0 || s.limit != 0xffffffff) &&
(((u64)*gva + size - 1) > s.limit + 1));
}
+
+ if (!fault)
+ fault = vmx_is_lass_violation(vcpu, *gva, size, 0);
+
if (fault)
kvm_inject_gp(vcpu, 0);
return fault ? -EINVAL : 0;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 44fb619803b8..15a7c6e7a25d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8127,6 +8127,40 @@ static void vmx_vm_destroy(struct kvm *kvm)
free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
}

+bool vmx_is_lass_violation(struct kvm_vcpu *vcpu, unsigned long addr,
+ unsigned int size, unsigned int flags)
+{
+ const bool is_supervisor_address = !!(addr & BIT_ULL(63));
+ const bool implicit = !!(flags & X86EMUL_F_IMPLICIT);
+ const bool fetch = !!(flags & X86EMUL_F_FETCH);
+ const bool is_wraparound_access = size ? (addr + size - 1) < addr : false;
+
+ if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS) || !is_long_mode(vcpu))
+ return false;
+
+ /*
+ * INVTLB isn't subject to LASS, e.g. to allow invalidating userspace
+ * addresses without toggling RFLAGS.AC. Branch targets aren't subject
+ * to LASS in order to simplifiy far control transfers (the subsequent
+ * fetch will enforce LASS as appropriate).
+ */
+ if (flags & (X86EMUL_F_BRANCH | X86EMUL_F_INVTLB))
+ return false;
+
+ if (!implicit && vmx_get_cpl(vcpu) == 3)
+ return is_supervisor_address;
+
+ /* LASS is enforced for supervisor-mode access iff SMAP is enabled. */
+ if (!fetch && !kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP))
+ return false;
+
+ /* Like SMAP, RFLAGS.AC disables LASS checks in supervisor mode. */
+ if (!fetch && !implicit && (kvm_get_rflags(vcpu) & X86_EFLAGS_AC))
+ return false;
+
+ return is_wraparound_access ? true : !is_supervisor_address;
+}
+
static struct kvm_x86_ops vmx_x86_ops __initdata = {
.name = KBUILD_MODNAME,

@@ -8266,6 +8300,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
.complete_emulated_msr = kvm_complete_insn_gp,

.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
+ .is_lass_violation = vmx_is_lass_violation,
};

static unsigned int vmx_handle_intel_pt_intr(void)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 9e66531861cf..c1e541a790bb 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -433,6 +433,9 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type);
u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);

+bool vmx_is_lass_violation(struct kvm_vcpu *vcpu, unsigned long addr,
+ unsigned int size, unsigned int flags);
+
static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
int type, bool value)
{
--
2.27.0


2023-07-19 04:17:02

by Zeng Guang

Subject: [PATCH v2 7/8] KVM: x86: Virtualize CR4.LASS

Virtualize CR4.LASS[bit 27] under KVM control instead of making it
guest-owned, as CR4.LASS is generally set once for each vCPU at boot
time and won't be toggled at runtime. Besides, KVM allows guest
software to set CR4.LASS only if the VM has the LASS capability
enumerated by CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6].

Update cr4_fixed1 to set the CR4.LASS bit in the emulated
IA32_VMX_CR4_FIXED1 MSR for guests, allowing guests to enable LASS in
nested VMX operation as well.

Note: setting CR4.LASS to 1 enables LASS only in IA-32e mode. It takes
no effect in legacy mode even if CR4.LASS is set.
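
For reference, a guest kernel would be expected to enable the feature
roughly as follows. This is an illustrative sketch only; the
X86_FEATURE_LASS and X86_CR4_LASS definitions come from the kernel
enabling series, not from this patch.

    /* Enable LASS only when enumerated; effective only in IA-32e mode. */
    if (cpu_feature_enabled(X86_FEATURE_LASS))
            cr4_set_bits(X86_CR4_LASS);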

Signed-off-by: Zeng Guang <[email protected]>
Tested-by: Xuelian Guo <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 2 +-
arch/x86/kvm/vmx/vmx.c | 3 +++
arch/x86/kvm/x86.h | 2 ++
3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 791f0dd48cd9..a881b0518a18 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -125,7 +125,7 @@
| X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
| X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
| X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
- | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+ | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP | X86_CR4_LASS))

#define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 15a7c6e7a25d..e74991bed362 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7603,6 +7603,9 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
cr4_fixed1_update(X86_CR4_UMIP, ecx, feature_bit(UMIP));
cr4_fixed1_update(X86_CR4_LA57, ecx, feature_bit(LA57));

+ entry = kvm_find_cpuid_entry_index(vcpu, 0x7, 1);
+ cr4_fixed1_update(X86_CR4_LASS, eax, feature_bit(LASS));
+
#undef cr4_fixed1_update
}

diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index c544602d07a3..e1295f490308 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -529,6 +529,8 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type);
__reserved_bits |= X86_CR4_VMXE; \
if (!__cpu_has(__c, X86_FEATURE_PCID)) \
__reserved_bits |= X86_CR4_PCIDE; \
+ if (!__cpu_has(__c, X86_FEATURE_LASS)) \
+ __reserved_bits |= X86_CR4_LASS; \
__reserved_bits; \
})

--
2.27.0


2023-07-19 04:18:09

by Zeng Guang

Subject: [PATCH v2 2/8] KVM: x86: Use a new flag for branch instructions

From: Binbin Wu <[email protected]>

Use the new flag X86EMUL_F_BRANCH instead of X86EMUL_F_FETCH in
assign_eip(), since strictly speaking it is not behavior of instruction
fetch.

Another reason is to distinguish instruction fetch and execution of branch
instruction for feature(s) that handle differently on them.

Branch instruction is not data access instruction, so skip checking against
execute-only code segment as instruction fetch.

Signed-off-by: Binbin Wu <[email protected]>
Signed-off-by: Zeng Guang <[email protected]>
---
arch/x86/kvm/emulate.c | 5 +++--
arch/x86/kvm/kvm_emulate.h | 1 +
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 3ddfbc99fa4f..8e706d19ae45 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -721,7 +721,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
(flags & X86EMUL_F_WRITE))
goto bad;
/* unreadable code segment */
- if (!(flags & X86EMUL_F_FETCH) && (desc.type & 8) && !(desc.type & 2))
+ if (!(flags & (X86EMUL_F_FETCH | X86EMUL_F_BRANCH))
+ && (desc.type & 8) && !(desc.type & 2))
goto bad;
lim = desc_limit_scaled(&desc);
if (!(desc.type & 8) && (desc.type & 4)) {
@@ -772,7 +773,7 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
if (ctxt->op_bytes != sizeof(unsigned long))
addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
rc = __linearize(ctxt, addr, &max_size, 1, ctxt->mode, &linear,
- X86EMUL_F_FETCH);
+ X86EMUL_F_BRANCH);
if (rc == X86EMUL_CONTINUE)
ctxt->_eip = addr.ea;
return rc;
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index 86bbe997162d..9fc7d34a4ac1 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -91,6 +91,7 @@ struct x86_instruction_info {
/* x86-specific emulation flags */
#define X86EMUL_F_WRITE BIT(0)
#define X86EMUL_F_FETCH BIT(1)
+#define X86EMUL_F_BRANCH BIT(2)

struct x86_emulate_ops {
void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
--
2.27.0


2023-07-19 04:18:29

by Zeng Guang

Subject: [PATCH v2 1/8] KVM: x86: Consolidate flags for __linearize()

From: Binbin Wu <[email protected]>

Consolidate @write and @fetch of __linearize() into a set of flags so that
additional flags can be added without needing more/new boolean parameters,
to precisely identify the access type.

No functional change intended.

Signed-off-by: Binbin Wu <[email protected]>
Reviewed-by: Chao Gao <[email protected]>
Acked-by: Kai Huang <[email protected]>
Signed-off-by: Zeng Guang <[email protected]>
---
arch/x86/kvm/emulate.c | 21 +++++++++++----------
arch/x86/kvm/kvm_emulate.h | 4 ++++
2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 936a397a08cd..3ddfbc99fa4f 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -687,8 +687,8 @@ static unsigned insn_alignment(struct x86_emulate_ctxt *ctxt, unsigned size)
static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
struct segmented_address addr,
unsigned *max_size, unsigned size,
- bool write, bool fetch,
- enum x86emul_mode mode, ulong *linear)
+ enum x86emul_mode mode, ulong *linear,
+ unsigned int flags)
{
struct desc_struct desc;
bool usable;
@@ -717,11 +717,11 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
if (!usable)
goto bad;
/* code segment in protected mode or read-only data segment */
- if ((((ctxt->mode != X86EMUL_MODE_REAL) && (desc.type & 8))
- || !(desc.type & 2)) && write)
+ if ((((ctxt->mode != X86EMUL_MODE_REAL) && (desc.type & 8)) || !(desc.type & 2)) &&
+ (flags & X86EMUL_F_WRITE))
goto bad;
/* unreadable code segment */
- if (!fetch && (desc.type & 8) && !(desc.type & 2))
+ if (!(flags & X86EMUL_F_FETCH) && (desc.type & 8) && !(desc.type & 2))
goto bad;
lim = desc_limit_scaled(&desc);
if (!(desc.type & 8) && (desc.type & 4)) {
@@ -757,8 +757,8 @@ static int linearize(struct x86_emulate_ctxt *ctxt,
ulong *linear)
{
unsigned max_size;
- return __linearize(ctxt, addr, &max_size, size, write, false,
- ctxt->mode, linear);
+ return __linearize(ctxt, addr, &max_size, size, ctxt->mode, linear,
+ write ? X86EMUL_F_WRITE : 0);
}

static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
@@ -771,7 +771,8 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)

if (ctxt->op_bytes != sizeof(unsigned long))
addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
- rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode, &linear);
+ rc = __linearize(ctxt, addr, &max_size, 1, ctxt->mode, &linear,
+ X86EMUL_F_FETCH);
if (rc == X86EMUL_CONTINUE)
ctxt->_eip = addr.ea;
return rc;
@@ -907,8 +908,8 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
* boundary check itself. Instead, we use max_size to check
* against op_size.
*/
- rc = __linearize(ctxt, addr, &max_size, 0, false, true, ctxt->mode,
- &linear);
+ rc = __linearize(ctxt, addr, &max_size, 0, ctxt->mode, &linear,
+ X86EMUL_F_FETCH);
if (unlikely(rc != X86EMUL_CONTINUE))
return rc;

diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index ab65f3a47dfd..86bbe997162d 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -88,6 +88,10 @@ struct x86_instruction_info {
#define X86EMUL_IO_NEEDED 5 /* IO is needed to complete emulation */
#define X86EMUL_INTERCEPTED 6 /* Intercepted by nested VMCB/VMCS */

+/* x86-specific emulation flags */
+#define X86EMUL_F_WRITE BIT(0)
+#define X86EMUL_F_FETCH BIT(1)
+
struct x86_emulate_ops {
void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
/*
--
2.27.0


2023-07-19 04:18:48

by Zeng Guang

Subject: [PATCH v2 3/8] KVM: x86: Add an emulation flag for implicit system access

From: Binbin Wu <[email protected]>

Add an emulation flag X86EMUL_F_IMPLICIT to identify implicit system
accesses in instruction emulation.

Signed-off-by: Binbin Wu <[email protected]>
Signed-off-by: Zeng Guang <[email protected]>
---
arch/x86/kvm/kvm_emulate.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index 9fc7d34a4ac1..c0e48f4fa7c4 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -92,6 +92,7 @@ struct x86_instruction_info {
#define X86EMUL_F_WRITE BIT(0)
#define X86EMUL_F_FETCH BIT(1)
#define X86EMUL_F_BRANCH BIT(2)
+#define X86EMUL_F_IMPLICIT BIT(3)

struct x86_emulate_ops {
void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
--
2.27.0


2023-07-19 04:22:30

by Zeng Guang

Subject: [PATCH v2 5/8] KVM: emulator: Add emulation of LASS violation checks on linear address

When the Intel CPU feature Linear Address Space Separation (LASS) is
enabled, the KVM emulator performs a LASS violation check on every
access to guest memory via a linear address.

Define a new function prototype in kvm_x86_ops as the emulator's
interface for identifying whether a LASS violation occurs. Vendor
code can then provide an implementation according to vendor-specific
requirements.

The emulator uses the given (address, size) pair and the instruction
operation type (flags) to enforce LASS protection when KVM emulates
instruction fetches and data accesses, including implicit accesses to
system data structures.

Signed-off-by: Zeng Guang <[email protected]>
Tested-by: Xuelian Guo <[email protected]>
---
arch/x86/include/asm/kvm-x86-ops.h | 3 ++-
arch/x86/include/asm/kvm_host.h | 3 +++
arch/x86/kvm/emulate.c | 11 +++++++++++
arch/x86/kvm/kvm_emulate.h | 2 ++
arch/x86/kvm/x86.c | 10 ++++++++++
5 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 13bc212cd4bc..a301f0a46381 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -132,7 +132,8 @@ KVM_X86_OP_OPTIONAL(migrate_timers)
KVM_X86_OP(msr_filter_changed)
KVM_X86_OP(complete_emulated_msr)
KVM_X86_OP(vcpu_deliver_sipi_vector)
-KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
+KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons)
+KVM_X86_OP_OPTIONAL_RET0(is_lass_violation)

#undef KVM_X86_OP
#undef KVM_X86_OP_OPTIONAL
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index fb9d1f2d6136..791f0dd48cd9 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1731,6 +1731,9 @@ struct kvm_x86_ops {
* Returns vCPU specific APICv inhibit reasons
*/
unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
+
+ bool (*is_lass_violation)(struct kvm_vcpu *vcpu, unsigned long addr,
+ unsigned int size, unsigned int flags);
};

struct kvm_x86_nested_ops {
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 9b4b3ce6d52a..7bb595811486 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -742,6 +742,10 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
}
break;
}
+
+ if (ctxt->ops->is_lass_violation(ctxt, *linear, size, flags))
+ goto bad;
+
if (la & (insn_alignment(ctxt, size) - 1))
return emulate_gp(ctxt, 0);
return X86EMUL_CONTINUE;
@@ -848,6 +852,9 @@ static inline int jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
static int linear_read_system(struct x86_emulate_ctxt *ctxt, ulong linear,
void *data, unsigned size)
{
+ if (ctxt->ops->is_lass_violation(ctxt, linear, size, X86EMUL_F_IMPLICIT))
+ return emulate_gp(ctxt, 0);
+
return ctxt->ops->read_std(ctxt, linear, data, size, &ctxt->exception, true);
}

@@ -855,6 +862,10 @@ static int linear_write_system(struct x86_emulate_ctxt *ctxt,
ulong linear, void *data,
unsigned int size)
{
+ if (ctxt->ops->is_lass_violation(ctxt, linear, size,
+ X86EMUL_F_IMPLICIT | X86EMUL_F_WRITE))
+ return emulate_gp(ctxt, 0);
+
return ctxt->ops->write_std(ctxt, linear, data, size, &ctxt->exception, true);
}

diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index c944055091e1..6f0996d0da56 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -232,6 +232,8 @@ struct x86_emulate_ops {
int (*leave_smm)(struct x86_emulate_ctxt *ctxt);
void (*triple_fault)(struct x86_emulate_ctxt *ctxt);
int (*set_xcr)(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr);
+ bool (*is_lass_violation)(struct x86_emulate_ctxt *ctxt, unsigned long addr,
+ unsigned int size, unsigned int flags);
};

/* Type, address-of, and value of an instruction's operand. */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 04b57a336b34..6448ff706539 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8287,6 +8287,15 @@ static void emulator_vm_bugged(struct x86_emulate_ctxt *ctxt)
kvm_vm_bugged(kvm);
}

+static bool emulator_is_lass_violation(struct x86_emulate_ctxt *ctxt,
+ unsigned long addr,
+ unsigned int size,
+ unsigned int flags)
+{
+ return static_call(kvm_x86_is_lass_violation)(emul_to_vcpu(ctxt),
+ addr, size, flags);
+}
+
static const struct x86_emulate_ops emulate_ops = {
.vm_bugged = emulator_vm_bugged,
.read_gpr = emulator_read_gpr,
@@ -8332,6 +8341,7 @@ static const struct x86_emulate_ops emulate_ops = {
.leave_smm = emulator_leave_smm,
.triple_fault = emulator_triple_fault,
.set_xcr = emulator_set_xcr,
+ .is_lass_violation = emulator_is_lass_violation,
};

static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
--
2.27.0


2023-07-19 05:03:54

by Zeng Guang

Subject: [PATCH v2 8/8] KVM: x86: Advertise LASS CPUID to user space

Linear Address Space Separation (LASS) is an independent mechanism
that enforces mode-based protections, preventing user-mode accesses
to supervisor-mode addresses and vice versa. Because the LASS
protections are applied before paging, malicious software cannot
acquire any paging-based timing information to compromise system
security.

The CPUID bit enumerating LASS support:
CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6]

Advertise LASS to user space to support LASS virtualization.
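
For reference, a guest (or any application running on LASS-capable
hardware) can verify the enumeration with a sub-leaf CPUID query.
An illustrative sketch using GCC's <cpuid.h> helper (not part of
this patch):

    #include <cpuid.h>
    #include <stdbool.h>

    static bool cpu_has_lass(void)
    {
            unsigned int eax, ebx, ecx, edx;

            /* CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6] */
            if (!__get_cpuid_count(7, 1, &eax, &ebx, &ecx, &edx))
                    return false;
            return !!(eax & (1U << 6));
    }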

Signed-off-by: Zeng Guang <[email protected]>
Tested-by: Xuelian Guo <[email protected]>
---
arch/x86/kvm/cpuid.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 0c9660a07b23..a7fafe99ffe4 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -646,9 +646,8 @@ void kvm_set_cpu_caps(void)
kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL_SSBD);

kvm_cpu_cap_mask(CPUID_7_1_EAX,
- F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) |
- F(FZRM) | F(FSRS) | F(FSRC) |
- F(AMX_FP16) | F(AVX_IFMA)
+ F(AVX_VNNI) | F(AVX512_BF16) | F(LASS) | F(CMPCCXADD) |
+ F(FZRM) | F(FSRS) | F(FSRC) | F(AMX_FP16) | F(AVX_IFMA)
);

kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
--
2.27.0


2023-08-07 07:27:04

by Binbin Wu

Subject: Re: [PATCH v2 6/8] KVM: VMX: Implement and apply vmx_is_lass_violation() for LASS protection



On 7/19/2023 10:45 AM, Zeng Guang wrote:
> Implement and wire up vmx_is_lass_violation() in kvm_x86_ops for VMX.
>
> The LASS violation check takes effect in KVM emulation of instruction
> fetches and data accesses, including implicit accesses, when the vCPU
> is running in long mode. It is also applied in emulation of VMX
> instructions and the SGX ENCLS instruction to enforce the mode-based
> protections before paging.
>
> As exceptions, the target memory addresses of emulated TLB invalidation
> and branch instructions are not subject to LASS.
>
> Signed-off-by: Zeng Guang <[email protected]>
> Tested-by: Xuelian Guo <[email protected]>
> ---
> arch/x86/kvm/vmx/nested.c | 3 ++-
> arch/x86/kvm/vmx/sgx.c | 4 ++++
> arch/x86/kvm/vmx/vmx.c | 35 +++++++++++++++++++++++++++++++++++
> arch/x86/kvm/vmx/vmx.h | 3 +++
> 4 files changed, 44 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index e35cf0bd0df9..72e78566a3b6 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -4985,7 +4985,8 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
> * non-canonical form. This is the only check on the memory
> * destination for long mode!
> */
> - exn = is_noncanonical_address(*ret, vcpu);
> + exn = is_noncanonical_address(*ret, vcpu) ||
> + vmx_is_lass_violation(vcpu, *ret, len, 0);
> } else {
> /*
> * When not in long mode, the virtual/linear address is
> diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
> index 2261b684a7d4..f8de637ce634 100644
> --- a/arch/x86/kvm/vmx/sgx.c
> +++ b/arch/x86/kvm/vmx/sgx.c
> @@ -46,6 +46,10 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
> ((s.base != 0 || s.limit != 0xffffffff) &&
> (((u64)*gva + size - 1) > s.limit + 1));
> }
> +
> + if (!fault)
> + fault = vmx_is_lass_violation(vcpu, *gva, size, 0);
> +
> if (fault)
> kvm_inject_gp(vcpu, 0);
> return fault ? -EINVAL : 0;
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 44fb619803b8..15a7c6e7a25d 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -8127,6 +8127,40 @@ static void vmx_vm_destroy(struct kvm *kvm)
> free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
> }
>
> +bool vmx_is_lass_violation(struct kvm_vcpu *vcpu, unsigned long addr,
> + unsigned int size, unsigned int flags)
> +{
> + const bool is_supervisor_address = !!(addr & BIT_ULL(63));
> + const bool implicit = !!(flags & X86EMUL_F_IMPLICIT);
> + const bool fetch = !!(flags & X86EMUL_F_FETCH);
> + const bool is_wraparound_access = size ? (addr + size - 1) < addr : false;
> +
> + if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS) || !is_long_mode(vcpu))
> + return false;
> +
> + /*
> + * INVTLB isn't subject to LASS, e.g. to allow invalidating userspace
> + * addresses without toggling RFLAGS.AC. Branch targets aren't subject
> + * to LASS in order to simplifiy far control transfers (the subsequent
s/simplifiy/simplify

> + * fetch will enforce LASS as appropriate).
> + */
> + if (flags & (X86EMUL_F_BRANCH | X86EMUL_F_INVTLB))
> + return false;
> +
> + if (!implicit && vmx_get_cpl(vcpu) == 3)
> + return is_supervisor_address;
> +
> + /* LASS is enforced for supervisor-mode access iff SMAP is enabled. */
To be more accurate, supervisor-mode data access.
Also, is "iff" here a typo for "if", or does it stand for "if and only
if"?
It is not accurate to use "if and only if" here because besides SMAP,
there are other conditions, i.e. implicit accesses and RFLAGS.AC.

> + if (!fetch && !kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP))
> + return false;
> +
> + /* Like SMAP, RFLAGS.AC disables LASS checks in supervisor mode. */
> + if (!fetch && !implicit && (kvm_get_rflags(vcpu) & X86_EFLAGS_AC))
> + return false;
> +
> + return is_wraparound_access ? true : !is_supervisor_address;
> +}
> +
> static struct kvm_x86_ops vmx_x86_ops __initdata = {
> .name = KBUILD_MODNAME,
>
> @@ -8266,6 +8300,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
> .complete_emulated_msr = kvm_complete_insn_gp,
>
> .vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
> + .is_lass_violation = vmx_is_lass_violation,
> };
>
> static unsigned int vmx_handle_intel_pt_intr(void)
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 9e66531861cf..c1e541a790bb 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -433,6 +433,9 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type);
> u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
> u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
>
> +bool vmx_is_lass_violation(struct kvm_vcpu *vcpu, unsigned long addr,
> + unsigned int size, unsigned int flags);
> +
> static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
> int type, bool value)
> {


2023-08-19 11:41:51

by Binbin Wu

Subject: Re: [PATCH v2 4/8] KVM: x86: Add X86EMUL_F_INVTLB and pass it in em_invlpg()



On 8/16/2023 7:11 AM, Sean Christopherson wrote:
> On Wed, Jul 19, 2023, Zeng Guang wrote:
>> From: Binbin Wu <[email protected]>
>>
>> Add an emulation flag X86EMUL_F_INVTLB, which is used to identify an
>> instruction that does TLB invalidation without true memory access.
>>
>> Only invlpg & invlpga implemented in emulator belong to this kind.
>> invlpga doesn't need additional information for emulation. Just pass
>> the flag to em_invlpg().
> Please add a paragraph explaining *why* this flag is being added. Ideally, the
> previous patch would also explain the need for an IMPLICIT flag, but that one
> doesn't bug me all that much because implicit accesses are known to be special
> snowflakes, i.e. it's easy to imagine that KVM would need to identify such
> accesses. But for INVLPG, without already knowing the details of LASS (or LAM),
> it's harder to think of why it needs to exist.
OK, will add the reason for this case and for IMPLICIT as well.
Thanks.


>
>> Signed-off-by: Binbin Wu <[email protected]>
>> Signed-off-by: Zeng Guang <[email protected]>
>> ---
>> arch/x86/kvm/emulate.c | 4 +++-
>> arch/x86/kvm/kvm_emulate.h | 1 +
>> 2 files changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>> index 8e706d19ae45..9b4b3ce6d52a 100644
>> --- a/arch/x86/kvm/emulate.c
>> +++ b/arch/x86/kvm/emulate.c
>> @@ -3443,8 +3443,10 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
>> {
>> int rc;
>> ulong linear;
>> + unsigned max_size;
> unsigned int
Let me think about why I used 'unsigned'...
It's because the existing code uses 'unsigned'.
I suppose it is considered bad practice?
I will clean up the existing code as well. Is it OK to clean it up
opportunistically inside this patch?


>> - rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
>> + rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, ctxt->mode,
>> + &linear, X86EMUL_F_INVTLB);
> Align indentation:
Will update it.

>
> rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, ctxt->mode,
> &linear, X86EMUL_F_INVTLB);
>
>> if (rc == X86EMUL_CONTINUE)
>> ctxt->ops->invlpg(ctxt, linear);
>> /* Disable writeback. */
>> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
>> index c0e48f4fa7c4..c944055091e1 100644
>> --- a/arch/x86/kvm/kvm_emulate.h
>> +++ b/arch/x86/kvm/kvm_emulate.h
>> @@ -93,6 +93,7 @@ struct x86_instruction_info {
>> #define X86EMUL_F_FETCH BIT(1)
>> #define X86EMUL_F_BRANCH BIT(2)
>> #define X86EMUL_F_IMPLICIT BIT(3)
>> +#define X86EMUL_F_INVTLB BIT(4)
> Why F_INVTLB instead of X86EMUL_F_INVLPG? Ah, because LAM is ignored for the
> linear address in the INVPCID and INVVPID descriptors. Hrm.
>
> I think my vote is to call this X86EMUL_F_INVLPG even though *in theory* it's not
> strictly limited to INVLPG. Odds are good KVM's emulator will never support
> INVPCID or INVVPID,
One case is kvm_handle_invpcid(), which is in the common KVM x86 code.
LAM doesn't apply to the address in the INVPCID descriptor though, but
I am not sure if there will be a need for SVM in the future.
For now, F_INVLPG is OK if you think F_INVTLB brings confusion.


> and IMO even though F_INVLPG would be somewhat of a misnomer,
> it's much more intuitive even for INVPCID and INVVPID descriptors. F_INVTLB makes
> me think more of the actual act of invalidating the TLB.
>
> I'm not dead set against INVTLB if someone really likes it, but I did scratch my
> head for a second when I saw it.


2023-08-19 13:57:57

by Sean Christopherson

Subject: Re: [PATCH v2 2/8] KVM: x86: Use a new flag for branch instructions

On Wed, Aug 16, 2023, Binbin Wu wrote:
>
>
> On 8/16/2023 6:51 AM, Sean Christopherson wrote:
> > Branch *targets*, not branch instructions.
> >
> > On Wed, Jul 19, 2023, Zeng Guang wrote:
> > > From: Binbin Wu <[email protected]>
> > >
> > > Use the new flag X86EMUL_F_BRANCH instead of X86EMUL_F_FETCH in
> > > assign_eip(), since strictly speaking it is not behavior of instruction
> > > fetch.
> > Eh, I'd just drop this paragraph, as evidenced by this code existing as-is for
> > years, we wouldn't introduce X86EMUL_F_BRANCH just because resolving a branch
> > target isn't strictly an instruction fetch.
> >
> > > Another reason is to distinguish instruction fetch and execution of branch
> > > instruction for feature(s) that handle differently on them.
> > Similar to the shortlog, it's about computing the branch target, not executing a
> > branch instruction. That distinction matters, e.g. a Jcc that is not taken will
> > *not* follow the branch target, but the instruction is still *executed*. And there
> > exist instructions that compute branch targets, but aren't what most people would
> > typically consider a branch instruction, e.g. XBEGIN.
> >
> > > Branch instruction is not data access instruction, so skip checking against
> > > execute-only code segment as instruction fetch.
> > Rather than call out individual use case, I would simply state that as of this
> > patch, X86EMUL_F_BRANCH and X86EMUL_F_FETCH are identical as far as KVM is
> > concernered. That let's the reader know that (a) there's no intended change in
> > behavior and (b) that the intent is to effectively split all consumption of
> > X86EMUL_F_FETCH into (X86EMUL_F_FETCH | X86EMUL_F_BRANCH).
>
> How about this:
>
>     KVM: x86: Use a new flag for branch targets
>
>     Use the new flag X86EMUL_F_BRANCH instead of X86EMUL_F_FETCH in
> assign_eip()
>     to distinguish instruction fetch and branch target computation for
> feature(s)

Just "features", i.e. no parentheses...

>     that handle differently on them.

...and tack on ", e.g. LASS and LAM." at the end. There's zero reason not to more
explicitly call out why the flag is being added. Trying to predict the future in
changelogs is generally discouraged, but having understandable changelogs is more
important.

>     As of this patch, X86EMUL_F_BRANCH and X86EMUL_F_FETCH are identical as
> far as
>     KVM is concerned.
>
>     No functional change intended.

Heh, you need to fix whatever is forcefully wrapping lines, but other than the
nit above, the content itself is good.

2023-08-19 18:01:30

by Binbin Wu

Subject: Re: [PATCH v2 2/8] KVM: x86: Use a new flag for branch instructions



On 8/16/2023 6:51 AM, Sean Christopherson wrote:
> Branch *targets*, not branch instructions.
>
> On Wed, Jul 19, 2023, Zeng Guang wrote:
>> From: Binbin Wu <[email protected]>
>>
>> Use the new flag X86EMUL_F_BRANCH instead of X86EMUL_F_FETCH in
>> assign_eip(), since strictly speaking it is not behavior of instruction
>> fetch.
> Eh, I'd just drop this paragraph, as evidenced by this code existing as-is for
> years, we wouldn't introduce X86EMUL_F_BRANCH just because resolving a branch
> target isn't strictly an instruction fetch.
>
>> Another reason is to distinguish instruction fetch and execution of branch
>> instruction for feature(s) that handle differently on them.
> Similar to the shortlog, it's about computing the branch target, not executing a
> branch instruction. That distinction matters, e.g. a Jcc that is not taken will
> *not* follow the branch target, but the instruction is still *executed*. And there
> exist instructions that compute branch targets, but aren't what most people would
> typically consider a branch instruction, e.g. XBEGIN.
>
>> Branch instruction is not data access instruction, so skip checking against
>> execute-only code segment as instruction fetch.
> Rather than call out individual use case, I would simply state that as of this
> patch, X86EMUL_F_BRANCH and X86EMUL_F_FETCH are identical as far as KVM is
> concerned. That lets the reader know that (a) there's no intended change in
> behavior and (b) that the intent is to effectively split all consumption of
> X86EMUL_F_FETCH into (X86EMUL_F_FETCH | X86EMUL_F_BRANCH).

How about this:

    KVM: x86: Use a new flag for branch targets

    Use the new flag X86EMUL_F_BRANCH instead of X86EMUL_F_FETCH in
assign_eip()
    to distinguish instruction fetch and branch target computation for
feature(s)
    that handle differently on them.

    As of this patch, X86EMUL_F_BRANCH and X86EMUL_F_FETCH are
identical as far as
    KVM is concerned.

    No functional change intended.


>> Signed-off-by: Binbin Wu <[email protected]>
>> Signed-off-by: Zeng Guang <[email protected]>
>> ---
>> arch/x86/kvm/emulate.c | 5 +++--
>> arch/x86/kvm/kvm_emulate.h | 1 +
>> 2 files changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>> index 3ddfbc99fa4f..8e706d19ae45 100644
>> --- a/arch/x86/kvm/emulate.c
>> +++ b/arch/x86/kvm/emulate.c
>> @@ -721,7 +721,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>> (flags & X86EMUL_F_WRITE))
>> goto bad;
>> /* unreadable code segment */
>> - if (!(flags & X86EMUL_F_FETCH) && (desc.type & 8) && !(desc.type & 2))
>> + if (!(flags & (X86EMUL_F_FETCH | X86EMUL_F_BRANCH))
>> + && (desc.type & 8) && !(desc.type & 2))
> Put the && on the first line, and align indentation.
I should have been more careful on the alignment & indentation.
Will update it. Thanks.

>
> /* unreadable code segment */
> if (!(flags & (X86EMUL_F_FETCH | X86EMUL_F_BRANCH)) &&
> (desc.type & 8) && !(desc.type & 2))
> goto bad;


2023-08-19 20:21:31

by Sean Christopherson

Subject: Re: [PATCH v2 4/8] KVM: x86: Add X86EMUL_F_INVTLB and pass it in em_invlpg()

On Wed, Jul 19, 2023, Zeng Guang wrote:
> From: Binbin Wu <[email protected]>
>
> Add an emulation flag X86EMUL_F_INVTLB, which is used to identify an
> instruction that does TLB invalidation without true memory access.
>
> Only invlpg & invlpga implemented in emulator belong to this kind.
> invlpga doesn't need additional information for emulation. Just pass
> the flag to em_invlpg().

Please add a paragraph explaining *why* this flag is being added. Ideally, the
previous patch would also explain the need for an IMPLICIT flag, but that one
doesn't bug me all that much because implicit accesses are known to be special
snowflakes, i.e. it's easy to imagine that KVM would need to identify such
accesses. But for INVLPG, without already knowing the details of LASS (or LAM),
it's harder to think of why it needs to exist.

> Signed-off-by: Binbin Wu <[email protected]>
> Signed-off-by: Zeng Guang <[email protected]>
> ---
> arch/x86/kvm/emulate.c | 4 +++-
> arch/x86/kvm/kvm_emulate.h | 1 +
> 2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 8e706d19ae45..9b4b3ce6d52a 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -3443,8 +3443,10 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
> {
> int rc;
> ulong linear;
> + unsigned max_size;

unsigned int

> - rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
> + rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, ctxt->mode,
> + &linear, X86EMUL_F_INVTLB);

Align indentation:

rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, ctxt->mode,
&linear, X86EMUL_F_INVTLB);

> if (rc == X86EMUL_CONTINUE)
> ctxt->ops->invlpg(ctxt, linear);
> /* Disable writeback. */
> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
> index c0e48f4fa7c4..c944055091e1 100644
> --- a/arch/x86/kvm/kvm_emulate.h
> +++ b/arch/x86/kvm/kvm_emulate.h
> @@ -93,6 +93,7 @@ struct x86_instruction_info {
> #define X86EMUL_F_FETCH BIT(1)
> #define X86EMUL_F_BRANCH BIT(2)
> #define X86EMUL_F_IMPLICIT BIT(3)
> +#define X86EMUL_F_INVTLB BIT(4)

Why F_INVTLB instead of X86EMUL_F_INVLPG? Ah, because LAM is ignored for the
linear address in the INVPCID and INVVPID descriptors. Hrm.

I think my vote is to call this X86EMUL_F_INVLPG even though *in theory* it's not
strictly limited to INVLPG. Odds are good KVM's emulator will never support
INVPCID or INVVPID, and IMO even though F_INVLPG would be somewhat of a misnomer,
it's much more intuitive even for INVPCID and INVVPID descriptors. F_INVTLB makes
me think more of the actual act of invalidating the TLB.

I'm not dead set against INVTLB if someone really likes it, but I did scratch my
head for a second when I saw it.