2023-04-20 14:19:13

by Zeng Guang

Subject: [PATCH 0/6] LASS KVM virtualization support

Linear Address Space Separation (LASS)[1] is a new mechanism that
enforces the same mode-based protections as paging, i.e. SMAP/SMEP but
without traversing the paging structures. Because the protections
enforced by LASS are applied before paging, "probes" by malicious
software will provide no paging-based timing information.

LASS works in long mode and partitions the 64-bit canonical linear
address space into two halves:
1. Lower half (LA[63]=0) --> user space
2. Upper half (LA[63]=1) --> kernel space

When LASS is enabled, a general protection fault (#GP) or a stack fault
(#SS) will be generated if software accesses an address in the half
other than the one in which it resides, e.g., either from user space to
the upper half, or from kernel space to the lower half. This protection
applies to data accesses and code execution.

This series adds KVM LASS virtualization support.

When the platform has LASS capability, KVM can expose this feature to
the guest VM, enumerated by CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6], and
allow the guest to enable it via CR4.LASS[bit 27] on demand. For
instructions executed directly in the guest, hardware performs the LASS
violation check, while KVM also needs to apply LASS to instructions
emulated in software and inject a #GP or #SS fault into the guest.
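For illustration, once the feature is exposed, a guest kernel could turn
it on roughly as follows. This is a sketch only; X86_CR4_LASS is an
assumed macro name for the architectural bit 27, and cr4_set_bits() is
the usual kernel CR4 accessor.

/* Sketch: guest-side enabling, assuming X86_CR4_LASS = 1UL << 27. */
#define X86_CR4_LASS	(1UL << 27)

static void lass_enable(void)
{
	cr4_set_bits(X86_CR4_LASS);	/* set CR4.LASS[bit 27] */
}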

The following LASS violation checks are performed on the KVM emulation
path:
User-mode access to supervisor space address:
    LA[bit 63] && (CPL == 3)
Supervisor-mode access to user space address:
    Instruction fetch: !LA[bit 63] && (CPL < 3)
    Data access: !LA[bit 63] && (CR4.SMAP==1) &&
                 ((RFLAGS.AC == 0 && CPL < 3) || Implicit supervisor access)
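For illustration, the checks above can be condensed into a small C
sketch. This is not the hook the series actually adds (that is the
kvm_x86_ops check_lass op); the helper name and parameters here are
hypothetical.

/*
 * Illustrative only. "implicit" marks implicit supervisor accesses
 * such as GDT/IDT reads; BIT_ULL() is the kernel bit helper.
 */
static bool lass_violation(u64 la, int cpl, bool fetch, bool implicit,
			   bool cr4_smap, bool rflags_ac)
{
	bool user_half = !(la & BIT_ULL(63));

	if (cpl == 3)			/* user-mode access */
		return !user_half;	/* ... to supervisor space */

	if (fetch)			/* supervisor-mode fetch */
		return user_half;	/* ... from user space */

	/* supervisor-mode data access to user space */
	return user_half && cr4_smap && (implicit || !rflags_ac);
}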

We tested the basic function of LASS virtualization, including LASS
enumeration and enabling, in non-root and nested environments. As the
current KVM unit test framework is not compatible with the LASS rule
that the kernel should run in the upper half, we instead used a kernel
module and an application test to verify LASS functionality in the
guest. The data-access-related x86 emulator code was verified with the
forced emulation prefix (FEP) mechanism. Other test cases are a work in
progress.
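For reference, forced emulation works by prefixing an instruction with a
magic byte sequence that KVM recognizes in its #UD handler when the host
enables the kvm.force_emulation_prefix module parameter. A minimal
sketch of how a guest test could push a data access through the
emulator, assuming the "ud2; 'kvm'" prefix convention used by
KVM-unit-tests:

/*
 * Sketch only: force KVM to emulate the load so that the emulator's
 * LASS check is exercised. Assumes kvm.force_emulation_prefix is
 * enabled on the host.
 */
#define KVM_FEP "ud2; .byte 'k', 'v', 'm';"

static unsigned long emulated_read(unsigned long *addr)
{
	unsigned long val;

	asm volatile(KVM_FEP "mov (%1), %0"
		     : "=r"(val) : "r"(addr) : "memory");
	return val;
}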

How to add tests for LASS in KUT or kselftest is still under investigation.

[1] Intel Architecture Instruction Set Extensions and Future Features
Programming Reference: Chapter Linear Address Space Separation (LASS)
https://cdrdv2.intel.com/v1/dl/getContent/671368

Zeng Guang (6):
KVM: x86: Virtualize CR4.LASS
KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
KVM: x86: Add emulator helper for LASS violation check
KVM: x86: LASS protection on KVM emulation when LASS enabled
KVM: x86: Advertise LASS CPUID to user space
KVM: x86: Set KVM LASS based on hardware capability

arch/x86/include/asm/cpuid.h | 36 +++++++++++++++++++
arch/x86/include/asm/kvm-x86-ops.h | 1 +
arch/x86/include/asm/kvm_host.h | 7 +++-
arch/x86/kvm/cpuid.c | 8 +++--
arch/x86/kvm/emulate.c | 36 ++++++++++++++++---
arch/x86/kvm/kvm_emulate.h | 1 +
arch/x86/kvm/vmx/nested.c | 3 ++
arch/x86/kvm/vmx/sgx.c | 2 ++
arch/x86/kvm/vmx/vmx.c | 58 ++++++++++++++++++++++++++++++
arch/x86/kvm/vmx/vmx.h | 2 ++
arch/x86/kvm/x86.c | 9 +++++
arch/x86/kvm/x86.h | 2 ++
12 files changed, 157 insertions(+), 8 deletions(-)

--
2.27.0


2023-04-20 14:19:21

by Zeng Guang

Subject: [PATCH 3/6] KVM: x86: Add emulator helper for LASS violation check

When LASS is enabled, KVM needs to apply the LASS violation check to
instruction emulation. Add a helper for the x86 emulator to perform the
LASS protection check.

Signed-off-by: Zeng Guang <[email protected]>
---
arch/x86/kvm/kvm_emulate.h | 1 +
arch/x86/kvm/x86.c | 9 +++++++++
2 files changed, 10 insertions(+)

diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index 2d9662be8333..1c55247d52d7 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -224,6 +224,7 @@ struct x86_emulate_ops {
int (*leave_smm)(struct x86_emulate_ctxt *ctxt);
void (*triple_fault)(struct x86_emulate_ctxt *ctxt);
int (*set_xcr)(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr);
+ bool (*check_lass)(struct x86_emulate_ctxt *ctxt, u64 access, u64 la, u64 flags);
};

/* Type, address-of, and value of an instruction's operand. */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 87feb1249ad6..704c5e4b9e76 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8193,6 +8193,14 @@ static void emulator_vm_bugged(struct x86_emulate_ctxt *ctxt)
kvm_vm_bugged(kvm);
}

+static bool emulator_check_lass(struct x86_emulate_ctxt *ctxt,
+ u64 access, u64 la, u64 flags)
+{
+ struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
+
+ return static_call(kvm_x86_check_lass)(vcpu, access, la, flags);
+}
+
static const struct x86_emulate_ops emulate_ops = {
.vm_bugged = emulator_vm_bugged,
.read_gpr = emulator_read_gpr,
@@ -8237,6 +8245,7 @@ static const struct x86_emulate_ops emulate_ops = {
.leave_smm = emulator_leave_smm,
.triple_fault = emulator_triple_fault,
.set_xcr = emulator_set_xcr,
+ .check_lass = emulator_check_lass,
};

static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
--
2.27.0

2023-04-20 14:19:32

by Zeng Guang

Subject: [PATCH 4/6] KVM: x86: LASS protection on KVM emulation when LASS enabled

Do the LASS violation check for instructions emulated by KVM. Note that
for instructions executed directly in the guest, hardware performs the
check.

Not all instruction emulation leads to accesses to guest linear
addresses, because 1) some instructions, such as CPUID and RDMSR, don't
take memory operands, and 2) instruction fetch is in most cases already
done inside the guest.

Four cases in which KVM may access guest linear addresses are identified
by code inspection:
- The KVM emulator uses a segmented address for instruction fetches or
data accesses.
- For implicit data access, the KVM emulator gets the address of a
system data structure (GDT/LDT/IDT/TR).
- For VMX instruction emulation, KVM gets the address from the "VM-exit
instruction information" field in the VMCS.
- For SGX ENCLS instruction emulation, KVM gets the address from
registers.

The LASS violation check applies to these linear addresses so as to
enforce mode-based protections, just as hardware does.

As exceptions, the target memory addresses of emulated invlpg, branch,
and call instructions don't require the LASS violation check.

Signed-off-by: Zeng Guang <[email protected]>
---
arch/x86/kvm/emulate.c | 36 +++++++++++++++++++++++++++++++-----
arch/x86/kvm/vmx/nested.c | 3 +++
arch/x86/kvm/vmx/sgx.c | 2 ++
3 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 5cc3efa0e21c..a9a022fd712e 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -687,7 +687,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
struct segmented_address addr,
unsigned *max_size, unsigned size,
bool write, bool fetch,
- enum x86emul_mode mode, ulong *linear)
+ enum x86emul_mode mode, ulong *linear,
+ u64 flags)
{
struct desc_struct desc;
bool usable;
@@ -695,6 +696,7 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
u32 lim;
u16 sel;
u8 va_bits;
+ u64 access = fetch ? PFERR_FETCH_MASK : 0;

la = seg_base(ctxt, addr.seg) + addr.ea;
*max_size = 0;
@@ -740,6 +742,10 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
}
break;
}
+
+ if (ctxt->ops->check_lass(ctxt, access, *linear, flags))
+ goto bad;
+
if (la & (insn_alignment(ctxt, size) - 1))
return emulate_gp(ctxt, 0);
return X86EMUL_CONTINUE;
@@ -757,7 +763,7 @@ static int linearize(struct x86_emulate_ctxt *ctxt,
{
unsigned max_size;
return __linearize(ctxt, addr, &max_size, size, write, false,
- ctxt->mode, linear);
+ ctxt->mode, linear, 0);
}

static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
@@ -770,7 +776,10 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)

if (ctxt->op_bytes != sizeof(unsigned long))
addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
- rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode, &linear);
+
+ /* LASS doesn't apply to address for branch and call instructions */
+ rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode,
+ &linear, KVM_X86_EMULFLAG_SKIP_LASS);
if (rc == X86EMUL_CONTINUE)
ctxt->_eip = addr.ea;
return rc;
@@ -845,6 +854,13 @@ static inline int jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
static int linear_read_system(struct x86_emulate_ctxt *ctxt, ulong linear,
void *data, unsigned size)
{
+ if (ctxt->ops->check_lass(ctxt, PFERR_IMPLICIT_ACCESS, linear, 0)) {
+ ctxt->exception.vector = GP_VECTOR;
+ ctxt->exception.error_code = 0;
+ ctxt->exception.error_code_valid = true;
+ return X86EMUL_PROPAGATE_FAULT;
+ }
+
return ctxt->ops->read_std(ctxt, linear, data, size, &ctxt->exception, true);
}

@@ -852,6 +868,13 @@ static int linear_write_system(struct x86_emulate_ctxt *ctxt,
ulong linear, void *data,
unsigned int size)
{
+ if (ctxt->ops->check_lass(ctxt, PFERR_IMPLICIT_ACCESS, linear, 0)) {
+ ctxt->exception.vector = GP_VECTOR;
+ ctxt->exception.error_code = 0;
+ ctxt->exception.error_code_valid = true;
+ return X86EMUL_PROPAGATE_FAULT;
+ }
+
return ctxt->ops->write_std(ctxt, linear, data, size, &ctxt->exception, true);
}

@@ -907,7 +930,7 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
* against op_size.
*/
rc = __linearize(ctxt, addr, &max_size, 0, false, true, ctxt->mode,
- &linear);
+ &linear, 0);
if (unlikely(rc != X86EMUL_CONTINUE))
return rc;

@@ -3432,8 +3455,11 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
{
int rc;
ulong linear;
+ unsigned max_size;

- rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
+ /* LASS doesn't apply to the memory address for invlpg */
+ rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, false, false,
+ ctxt->mode, &linear, KVM_X86_EMULFLAG_SKIP_LASS);
if (rc == X86EMUL_CONTINUE)
ctxt->ops->invlpg(ctxt, linear);
/* Disable writeback. */
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index c8ae9d0e59b3..55c88c4593a6 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4974,6 +4974,9 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
* destination for long mode!
*/
exn = is_noncanonical_address(*ret, vcpu);
+
+ if (!exn)
+ exn = __vmx_check_lass(vcpu, 0, *ret, 0);
} else {
/*
* When not in long mode, the virtual/linear address is
diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
index b12da2a6dec9..30cb5d0980be 100644
--- a/arch/x86/kvm/vmx/sgx.c
+++ b/arch/x86/kvm/vmx/sgx.c
@@ -37,6 +37,8 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
fault = true;
} else if (likely(is_long_mode(vcpu))) {
fault = is_noncanonical_address(*gva, vcpu);
+ if (!fault)
+ fault = __vmx_check_lass(vcpu, 0, *gva, 0);
} else {
*gva &= 0xffffffff;
fault = (s.unusable) ||
--
2.27.0

2023-04-20 14:19:54

by Zeng Guang

Subject: [PATCH 6/6] KVM: x86: Set KVM LASS based on hardware capability

The host kernel may clear the LASS capability in
boot_cpu_data.x86_capability for reasons besides an explicit clearcpuid
parameter. That would leave the guest unable to manage LASS
independently. So set KVM's LASS capability directly based on the
hardware capability to eliminate the dependency.

Add new helper functions to facilitate getting the result of a CPUID
sub-leaf.

Signed-off-by: Zeng Guang <[email protected]>
---
arch/x86/include/asm/cpuid.h | 36 ++++++++++++++++++++++++++++++++++++
arch/x86/kvm/cpuid.c | 4 ++++
2 files changed, 40 insertions(+)

diff --git a/arch/x86/include/asm/cpuid.h b/arch/x86/include/asm/cpuid.h
index 9bee3e7bf973..a25dd00b7c0a 100644
--- a/arch/x86/include/asm/cpuid.h
+++ b/arch/x86/include/asm/cpuid.h
@@ -127,6 +127,42 @@ static inline unsigned int cpuid_edx(unsigned int op)
return edx;
}

+static inline unsigned int cpuid_count_eax(unsigned int op, int count)
+{
+ unsigned int eax, ebx, ecx, edx;
+
+ cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
+
+ return eax;
+}
+
+static inline unsigned int cpuid_count_ebx(unsigned int op, int count)
+{
+ unsigned int eax, ebx, ecx, edx;
+
+ cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
+
+ return ebx;
+}
+
+static inline unsigned int cpuid_count_ecx(unsigned int op, int count)
+{
+ unsigned int eax, ebx, ecx, edx;
+
+ cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
+
+ return ecx;
+}
+
+static inline unsigned int cpuid_count_edx(unsigned int op, int count)
+{
+ unsigned int eax, ebx, ecx, edx;
+
+ cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
+
+ return edx;
+}
+
static __always_inline bool cpuid_function_is_indexed(u32 function)
{
switch (function) {
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 5facb8037140..e99b99ebe1fe 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -667,6 +667,10 @@ void kvm_set_cpu_caps(void)
F(AMX_FP16) | F(AVX_IFMA)
);

+ /* Set LASS based on hardware capability */
+ if (cpuid_count_eax(7, 1) & F(LASS))
+ kvm_cpu_cap_set(X86_FEATURE_LASS);
+
kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
F(AVX_VNNI_INT8) | F(AVX_NE_CONVERT) | F(PREFETCHITI)
);
--
2.27.0

2023-04-20 14:20:48

by Zeng Guang

Subject: [PATCH 5/6] KVM: x86: Advertise LASS CPUID to user space

LASS (Linear Address Space Separation) is an independent mechanism to
enforce mode-based protection: it can prevent user-mode accesses to
supervisor-mode addresses, and vice versa. Because the LASS protections
are applied before paging, malicious software cannot acquire any
paging-based timing information to compromise the security of the
system.

The CPUID bit definition to support LASS:
CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6]

Advertise LASS to user space to support LASS virtualization.

Signed-off-by: Zeng Guang <[email protected]>
---
arch/x86/kvm/cpuid.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index ba7f7abc8964..5facb8037140 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -663,8 +663,8 @@ void kvm_set_cpu_caps(void)
kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL_SSBD);

kvm_cpu_cap_mask(CPUID_7_1_EAX,
- F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) | F(AMX_FP16) |
- F(AVX_IFMA)
+ F(AVX_VNNI) | F(AVX512_BF16) | F(LASS) | F(CMPCCXADD) |
+ F(AMX_FP16) | F(AVX_IFMA)
);

kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
--
2.27.0

2023-04-24 01:45:17

by Binbin Wu

Subject: Re: [PATCH 0/6] LASS KVM virtualization support


On 4/20/2023 9:37 PM, Zeng Guang wrote:
> Linear Address Space Separation (LASS)[1] is a new mechanism that
> enforces the same mode-based protections as paging, i.e. SMAP/SMEP but
> without traversing the paging structures. Because the protections
> enforced by LASS are applied before paging, "probes" by malicious
> software will provide no paging-based timing information.
>
> LASS works in long mode and partitions the 64-bit canonical linear
> address space into two halves:
> 1. Lower half (LA[63]=0) --> user space
> 2. Upper half (LA[63]=1) --> kernel space
>
> When LASS is enabled, a general protection #GP fault or a stack fault
> #SS will be generated if software accesses the address from the half
> in which it resides to another half,

The accessor's mode is based on CPL, not the address range,
so the description "in which it resides" feels a bit inaccurate.


> e.g., either from user space to
> upper half, or from kernel space to lower half. This protection applies
> to data access, code execution.
>
> This series add KVM LASS virtualization support.
>
> When platform has LASS capability, KVM requires to expose this feature
> to guest VM enumerated by CPUID.(EAX=07H.ECX=1):EAX.LASS[bit 6], and
> allow guest to enable it via CR4.LASS[bit 27] on demand. For instruction
> executed in the guest directly, hardware will perform the LASS violation
> check, while KVM also needs to apply LASS to instructions emulated by
> software and injects #GP or #SS fault to the guest.
>
> Following LASS voilations check will be taken on KVM emulation path.

/s/voilations/violations


> User-mode access to supervisor space address:
> LA[bit 63] && (CPL == 3)
> Supervisor-mode access to user space address:
> Instruction fetch: !LA[bit 63] && (CPL < 3)
> Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
> CPL < 3) || Implicit supervisor access)
>
> We tested the basic function of LASS virtualization including LASS
> enumeration and enabling in non-root and nested environment. As current
> KVM unittest framework is not compatible to LASS rule that kernel should
> run in the upper half, we use kernel module and application test to verify
> LASS functionalities in guest instead. The data access related x86 emulator
> code is verified with forced emulation prefix (FEP) mechanism. Other test
> cases are working in progress.
>
> How to add tests for LASS in KUT or kselftest is still under investigation.
>
> [1] Intel Architecutre Instruction Set Extensions and Future Features

/s/Architecutre/Architecture


> Programming Reference: Chapter Linear Address Space Separation (LASS)
> https://cdrdv2.intel.com/v1/dl/getContent/671368
>
> Zeng Guang (6):
> KVM: x86: Virtualize CR4.LASS
> KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
> KVM: x86: Add emulator helper for LASS violation check
> KVM: x86: LASS protection on KVM emulation when LASS enabled
> KVM: x86: Advertise LASS CPUID to user space
> KVM: x86: Set KVM LASS based on hardware capability
>
> arch/x86/include/asm/cpuid.h | 36 +++++++++++++++++++
> arch/x86/include/asm/kvm-x86-ops.h | 1 +
> arch/x86/include/asm/kvm_host.h | 7 +++-
> arch/x86/kvm/cpuid.c | 8 +++--
> arch/x86/kvm/emulate.c | 36 ++++++++++++++++---
> arch/x86/kvm/kvm_emulate.h | 1 +
> arch/x86/kvm/vmx/nested.c | 3 ++
> arch/x86/kvm/vmx/sgx.c | 2 ++
> arch/x86/kvm/vmx/vmx.c | 58 ++++++++++++++++++++++++++++++
> arch/x86/kvm/vmx/vmx.h | 2 ++
> arch/x86/kvm/x86.c | 9 +++++
> arch/x86/kvm/x86.h | 2 ++
> 12 files changed, 157 insertions(+), 8 deletions(-)
>

2023-04-25 02:11:10

by Zeng Guang

Subject: Re: [PATCH 0/6] LASS KVM virtualization support


On 4/24/2023 9:20 AM, Binbin Wu wrote:
> On 4/20/2023 9:37 PM, Zeng Guang wrote:
>> Linear Address Space Separation (LASS)[1] is a new mechanism that
>> enforces the same mode-based protections as paging, i.e. SMAP/SMEP but
>> without traversing the paging structures. Because the protections
>> enforced by LASS are applied before paging, "probes" by malicious
>> software will provide no paging-based timing information.
>>
>> LASS works in long mode and partitions the 64-bit canonical linear
>> address space into two halves:
>> 1. Lower half (LA[63]=0) --> user space
>> 2. Upper half (LA[63]=1) --> kernel space
>>
>> When LASS is enabled, a general protection #GP fault or a stack fault
>> #SS will be generated if software accesses the address from the half
>> in which it resides to another half,
> The accessor's mode is based on CPL, not the address range,
> so it feels a bit inaccurate of descripton "in which it resides".
>
This is an alternative description to implicitly signify the privilege
level, i.e., code running in the upper half means it is in supervisor
mode; otherwise it's in user mode.  :)

>> e.g., either from user space to
>> upper half, or from kernel space to lower half. This protection applies
>> to data access, code execution.
>>
>> This series add KVM LASS virtualization support.
>>
>> When platform has LASS capability, KVM requires to expose this feature
>> to guest VM enumerated by CPUID.(EAX=07H.ECX=1):EAX.LASS[bit 6], and
>> allow guest to enable it via CR4.LASS[bit 27] on demand. For instruction
>> executed in the guest directly, hardware will perform the LASS violation
>> check, while KVM also needs to apply LASS to instructions emulated by
>> software and injects #GP or #SS fault to the guest.
>>
>> Following LASS voilations check will be taken on KVM emulation path.
> /s/voilations/violations
>
>
>> User-mode access to supervisor space address:
>> LA[bit 63] && (CPL == 3)
>> Supervisor-mode access to user space address:
>> Instruction fetch: !LA[bit 63] && (CPL < 3)
>> Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
>> CPL < 3) || Implicit supervisor access)
>>
>> We tested the basic function of LASS virtualization including LASS
>> enumeration and enabling in non-root and nested environment. As current
>> KVM unittest framework is not compatible to LASS rule that kernel should
>> run in the upper half, we use kernel module and application test to verify
>> LASS functionalities in guest instead. The data access related x86 emulator
>> code is verified with forced emulation prefix (FEP) mechanism. Other test
>> cases are working in progress.
>>
>> How to add tests for LASS in KUT or kselftest is still under investigation.
>>
>> [1] Intel Architecutre Instruction Set Extensions and Future Features
> /s/Architecutre/Architecture
>
Sorry for typos above. Thanks.
>> Programming Reference: Chapter Linear Address Space Separation (LASS)
>> https://cdrdv2.intel.com/v1/dl/getContent/671368
>>
>> Zeng Guang (6):
>> KVM: x86: Virtualize CR4.LASS
>> KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
>> KVM: x86: Add emulator helper for LASS violation check
>> KVM: x86: LASS protection on KVM emulation when LASS enabled
>> KVM: x86: Advertise LASS CPUID to user space
>> KVM: x86: Set KVM LASS based on hardware capability
>>
>> arch/x86/include/asm/cpuid.h | 36 +++++++++++++++++++
>> arch/x86/include/asm/kvm-x86-ops.h | 1 +
>> arch/x86/include/asm/kvm_host.h | 7 +++-
>> arch/x86/kvm/cpuid.c | 8 +++--
>> arch/x86/kvm/emulate.c | 36 ++++++++++++++++---
>> arch/x86/kvm/kvm_emulate.h | 1 +
>> arch/x86/kvm/vmx/nested.c | 3 ++
>> arch/x86/kvm/vmx/sgx.c | 2 ++
>> arch/x86/kvm/vmx/vmx.c | 58 ++++++++++++++++++++++++++++++
>> arch/x86/kvm/vmx/vmx.h | 2 ++
>> arch/x86/kvm/x86.c | 9 +++++
>> arch/x86/kvm/x86.h | 2 ++
>> 12 files changed, 157 insertions(+), 8 deletions(-)
>>

2023-04-25 02:53:31

by Binbin Wu

Subject: Re: [PATCH 4/6] KVM: x86: LASS protection on KVM emulation when LASS enabled



On 4/20/2023 9:37 PM, Zeng Guang wrote:
> Do LASS violation check for instructions emulated by KVM. Note that for
> instructions executed in the guest directly, hardware will perform the
> check.
>
> Not all instruction emulation leads to accesses to guest linear addresses
> because 1) some instrutions like CPUID, RDMSR, don't take memory as

/s/instrutions/instructions
> operands 2) instruction fetch in most cases is already done inside the
> guest.
What are the instruction fetch cases not covered in non-root mode?
And IIUC, the patch actually doesn't distinguish them and always checks
LASS violation for instruction fetch in instruction emulation, right?

>
> Four cases in which kvm may access guest linear addresses are identified
> by code inspection:
> - KVM emulator uses segmented address for instruction fetches or data
> accesses.
> - For implicit data access, KVM emulator gets address to a system data
to or from?

> structure(GDT/LDT/IDT/TR).
> - For VMX instruction emulation, KVM gets the address from "VM-exit
> instruction information" field in VMCS.
> - For SGX ENCLS instruction emulation, KVM gets the address from registers.
>
> LASS violation check applies to these linear address so as to enforce
address -> addresses

> mode-based protections as hardware behaves.
>
> As exceptions, the target memory address of emulation of invlpg, branch
> and call instructions doesn't require LASS violation check.
>
> Signed-off-by: Zeng Guang <[email protected]>
> ---
> arch/x86/kvm/emulate.c | 36 +++++++++++++++++++++++++++++++-----
> arch/x86/kvm/vmx/nested.c | 3 +++
> arch/x86/kvm/vmx/sgx.c | 2 ++
> 3 files changed, 36 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 5cc3efa0e21c..a9a022fd712e 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -687,7 +687,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
> struct segmented_address addr,
> unsigned *max_size, unsigned size,
> bool write, bool fetch,
> - enum x86emul_mode mode, ulong *linear)
> + enum x86emul_mode mode, ulong *linear,
> + u64 flags)
> {
> struct desc_struct desc;
> bool usable;
> @@ -695,6 +696,7 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
> u32 lim;
> u16 sel;
> u8 va_bits;
> + u64 access = fetch ? PFERR_FETCH_MASK : 0;
>
> la = seg_base(ctxt, addr.seg) + addr.ea;
> *max_size = 0;
> @@ -740,6 +742,10 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
> }
> break;
> }
> +
> + if (ctxt->ops->check_lass(ctxt, access, *linear, flags))
> + goto bad;
> +
> if (la & (insn_alignment(ctxt, size) - 1))
> return emulate_gp(ctxt, 0);
> return X86EMUL_CONTINUE;
> @@ -757,7 +763,7 @@ static int linearize(struct x86_emulate_ctxt *ctxt,
> {
> unsigned max_size;
> return __linearize(ctxt, addr, &max_size, size, write, false,
> - ctxt->mode, linear);
> + ctxt->mode, linear, 0);
> }
>
> static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
> @@ -770,7 +776,10 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
>
> if (ctxt->op_bytes != sizeof(unsigned long))
> addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
> - rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode, &linear);
> +
> + /* LASS doesn't apply to address for branch and call instructions */
> + rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode,
> + &linear, KVM_X86_EMULFLAG_SKIP_LASS);
> if (rc == X86EMUL_CONTINUE)
> ctxt->_eip = addr.ea;
> return rc;
> @@ -845,6 +854,13 @@ static inline int jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
> static int linear_read_system(struct x86_emulate_ctxt *ctxt, ulong linear,
> void *data, unsigned size)
> {
> + if (ctxt->ops->check_lass(ctxt, PFERR_IMPLICIT_ACCESS, linear, 0)) {
> + ctxt->exception.vector = GP_VECTOR;
> + ctxt->exception.error_code = 0;
> + ctxt->exception.error_code_valid = true;
> + return X86EMUL_PROPAGATE_FAULT;
> + }
> +
> return ctxt->ops->read_std(ctxt, linear, data, size, &ctxt->exception, true);
> }
>
> @@ -852,6 +868,13 @@ static int linear_write_system(struct x86_emulate_ctxt *ctxt,
> ulong linear, void *data,
> unsigned int size)
> {
> + if (ctxt->ops->check_lass(ctxt, PFERR_IMPLICIT_ACCESS, linear, 0)) {
> + ctxt->exception.vector = GP_VECTOR;
> + ctxt->exception.error_code = 0;
> + ctxt->exception.error_code_valid = true;
> + return X86EMUL_PROPAGATE_FAULT;
> + }
> +
> return ctxt->ops->write_std(ctxt, linear, data, size, &ctxt->exception, true);
> }
>
> @@ -907,7 +930,7 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
> * against op_size.
> */
> rc = __linearize(ctxt, addr, &max_size, 0, false, true, ctxt->mode,
> - &linear);
> + &linear, 0);
> if (unlikely(rc != X86EMUL_CONTINUE))
> return rc;
>
> @@ -3432,8 +3455,11 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
> {
> int rc;
> ulong linear;
> + unsigned max_size;
>
> - rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
> + /* LASS doesn't apply to the memory address for invlpg */
> + rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, false, false,
> + ctxt->mode, &linear, KVM_X86_EMULFLAG_SKIP_LASS);
> if (rc == X86EMUL_CONTINUE)
> ctxt->ops->invlpg(ctxt, linear);
> /* Disable writeback. */
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index c8ae9d0e59b3..55c88c4593a6 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -4974,6 +4974,9 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
> * destination for long mode!
> */
> exn = is_noncanonical_address(*ret, vcpu);
> +
> + if (!exn)
> + exn = __vmx_check_lass(vcpu, 0, *ret, 0);
> } else {
> /*
> * When not in long mode, the virtual/linear address is
> diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
> index b12da2a6dec9..30cb5d0980be 100644
> --- a/arch/x86/kvm/vmx/sgx.c
> +++ b/arch/x86/kvm/vmx/sgx.c
> @@ -37,6 +37,8 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
> fault = true;
> } else if (likely(is_long_mode(vcpu))) {
> fault = is_noncanonical_address(*gva, vcpu);
> + if (!fault)
> + fault = __vmx_check_lass(vcpu, 0, *gva, 0);
> } else {
> *gva &= 0xffffffff;
> fault = (s.unusable) ||

2023-04-25 03:09:14

by Binbin Wu

Subject: Re: [PATCH 6/6] KVM: x86: Set KVM LASS based on hardware capability



On 4/20/2023 9:37 PM, Zeng Guang wrote:
> Host kernel may clear LASS capability in boot_cpu_data.x86_capability
Is there some option to do it?

> besides explicitly using clearcpuid parameter. That will cause guest
> not being able to manage LASS independently. So set KVM LASS directly
> based on hardware capability to eliminate the dependency.
>
> Add new helper functions to facilitate getting result of CPUID sub-leaf.
>
> Signed-off-by: Zeng Guang <[email protected]>
> ---
> arch/x86/include/asm/cpuid.h | 36 ++++++++++++++++++++++++++++++++++++
> arch/x86/kvm/cpuid.c | 4 ++++
> 2 files changed, 40 insertions(+)
>
> diff --git a/arch/x86/include/asm/cpuid.h b/arch/x86/include/asm/cpuid.h
> index 9bee3e7bf973..a25dd00b7c0a 100644
> --- a/arch/x86/include/asm/cpuid.h
> +++ b/arch/x86/include/asm/cpuid.h
> @@ -127,6 +127,42 @@ static inline unsigned int cpuid_edx(unsigned int op)
> return edx;
> }
>
> +static inline unsigned int cpuid_count_eax(unsigned int op, int count)
> +{
> + unsigned int eax, ebx, ecx, edx;
> +
> + cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
> +
> + return eax;
> +}
> +
> +static inline unsigned int cpuid_count_ebx(unsigned int op, int count)
> +{
> + unsigned int eax, ebx, ecx, edx;
> +
> + cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
> +
> + return ebx;
> +}
> +
> +static inline unsigned int cpuid_count_ecx(unsigned int op, int count)
> +{
> + unsigned int eax, ebx, ecx, edx;
> +
> + cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
> +
> + return ecx;
> +}
> +
> +static inline unsigned int cpuid_count_edx(unsigned int op, int count)
> +{
> + unsigned int eax, ebx, ecx, edx;
> +
> + cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
> +
> + return edx;
> +}
> +
> static __always_inline bool cpuid_function_is_indexed(u32 function)
> {
> switch (function) {
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 5facb8037140..e99b99ebe1fe 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -667,6 +667,10 @@ void kvm_set_cpu_caps(void)
> F(AMX_FP16) | F(AVX_IFMA)
> );
>
> + /* Set LASS based on hardware capability */
> + if (cpuid_count_eax(7, 1) & F(LASS))
> + kvm_cpu_cap_set(X86_FEATURE_LASS);
> +
> kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
> F(AVX_VNNI_INT8) | F(AVX_NE_CONVERT) | F(PREFETCHITI)
> );

2023-04-25 06:51:30

by Zeng Guang

Subject: Re: [PATCH 4/6] KVM: x86: LASS protection on KVM emulation when LASS enabled


On 4/25/2023 10:52 AM, Binbin Wu wrote:
>
> On 4/20/2023 9:37 PM, Zeng Guang wrote:
>> Do LASS violation check for instructions emulated by KVM. Note that for
>> instructions executed in the guest directly, hardware will perform the
>> check.
>>
>> Not all instruction emulation leads to accesses to guest linear addresses
>> because 1) some instrutions like CPUID, RDMSR, don't take memory as
> /s/instrutions/instructions
Oops. :P
>> operands 2) instruction fetch in most cases is already done inside the
>> guest.
> What are the instruction fetch cases not covered in non-root mode?
> And IIUC, the patch actually doesn't distinguish them and alway checks
> LASS voilation
> for instruction fetch in instruction emulation, right?

This states that most instructions needn't be fetched by KVM. KVM
intercepts most privileged instructions and completes the emulation
directly. But some instructions require KVM to fetch the code and
emulate further, e.g. lgdt/sgdt. KVM will always do the LASS violation
check on an instruction fetch once it happens.

>> Four cases in which kvm may access guest linear addresses are identified
>> by code inspection:
>> - KVM emulator uses segmented address for instruction fetches or data
>> accesses.
>> - For implicit data access, KVM emulator gets address to a system data
> to or from?

It means the address pointing *to* a system data structure.

>> structure(GDT/LDT/IDT/TR).
>> - For VMX instruction emulation, KVM gets the address from "VM-exit
>> instruction information" field in VMCS.
>> - For SGX ENCLS instruction emulation, KVM gets the address from registers.
>>
>> LASS violation check applies to these linear address so as to enforce
> address -> addresses
OK.
>
>> mode-based protections as hardware behaves.
>>
>> As exceptions, the target memory address of emulation of invlpg, branch
>> and call instructions doesn't require LASS violation check.
>>
>> Signed-off-by: Zeng Guang <[email protected]>
>> ---
>> arch/x86/kvm/emulate.c | 36 +++++++++++++++++++++++++++++++-----
>> arch/x86/kvm/vmx/nested.c | 3 +++
>> arch/x86/kvm/vmx/sgx.c | 2 ++
>> 3 files changed, 36 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>> index 5cc3efa0e21c..a9a022fd712e 100644
>> --- a/arch/x86/kvm/emulate.c
>> +++ b/arch/x86/kvm/emulate.c
>> @@ -687,7 +687,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>> struct segmented_address addr,
>> unsigned *max_size, unsigned size,
>> bool write, bool fetch,
>> - enum x86emul_mode mode, ulong *linear)
>> + enum x86emul_mode mode, ulong *linear,
>> + u64 flags)
>> {
>> struct desc_struct desc;
>> bool usable;
>> @@ -695,6 +696,7 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>> u32 lim;
>> u16 sel;
>> u8 va_bits;
>> + u64 access = fetch ? PFERR_FETCH_MASK : 0;
>>
>> la = seg_base(ctxt, addr.seg) + addr.ea;
>> *max_size = 0;
>> @@ -740,6 +742,10 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>> }
>> break;
>> }
>> +
>> + if (ctxt->ops->check_lass(ctxt, access, *linear, flags))
>> + goto bad;
>> +
>> if (la & (insn_alignment(ctxt, size) - 1))
>> return emulate_gp(ctxt, 0);
>> return X86EMUL_CONTINUE;
>> @@ -757,7 +763,7 @@ static int linearize(struct x86_emulate_ctxt *ctxt,
>> {
>> unsigned max_size;
>> return __linearize(ctxt, addr, &max_size, size, write, false,
>> - ctxt->mode, linear);
>> + ctxt->mode, linear, 0);
>> }
>>
>> static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
>> @@ -770,7 +776,10 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
>>
>> if (ctxt->op_bytes != sizeof(unsigned long))
>> addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
>> - rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode, &linear);
>> +
>> + /* LASS doesn't apply to address for branch and call instructions */
>> + rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode,
>> + &linear, KVM_X86_EMULFLAG_SKIP_LASS);
>> if (rc == X86EMUL_CONTINUE)
>> ctxt->_eip = addr.ea;
>> return rc;
>> @@ -845,6 +854,13 @@ static inline int jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
>> static int linear_read_system(struct x86_emulate_ctxt *ctxt, ulong linear,
>> void *data, unsigned size)
>> {
>> + if (ctxt->ops->check_lass(ctxt, PFERR_IMPLICIT_ACCESS, linear, 0)) {
>> + ctxt->exception.vector = GP_VECTOR;
>> + ctxt->exception.error_code = 0;
>> + ctxt->exception.error_code_valid = true;
>> + return X86EMUL_PROPAGATE_FAULT;
>> + }
>> +
>> return ctxt->ops->read_std(ctxt, linear, data, size, &ctxt->exception, true);
>> }
>>
>> @@ -852,6 +868,13 @@ static int linear_write_system(struct x86_emulate_ctxt *ctxt,
>> ulong linear, void *data,
>> unsigned int size)
>> {
>> + if (ctxt->ops->check_lass(ctxt, PFERR_IMPLICIT_ACCESS, linear, 0)) {
>> + ctxt->exception.vector = GP_VECTOR;
>> + ctxt->exception.error_code = 0;
>> + ctxt->exception.error_code_valid = true;
>> + return X86EMUL_PROPAGATE_FAULT;
>> + }
>> +
>> return ctxt->ops->write_std(ctxt, linear, data, size, &ctxt->exception, true);
>> }
>>
>> @@ -907,7 +930,7 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
>> * against op_size.
>> */
>> rc = __linearize(ctxt, addr, &max_size, 0, false, true, ctxt->mode,
>> - &linear);
>> + &linear, 0);
>> if (unlikely(rc != X86EMUL_CONTINUE))
>> return rc;
>>
>> @@ -3432,8 +3455,11 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
>> {
>> int rc;
>> ulong linear;
>> + unsigned max_size;
>>
>> - rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
>> + /* LASS doesn't apply to the memory address for invlpg */
>> + rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, false, false,
>> + ctxt->mode, &linear, KVM_X86_EMULFLAG_SKIP_LASS);
>> if (rc == X86EMUL_CONTINUE)
>> ctxt->ops->invlpg(ctxt, linear);
>> /* Disable writeback. */
>> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
>> index c8ae9d0e59b3..55c88c4593a6 100644
>> --- a/arch/x86/kvm/vmx/nested.c
>> +++ b/arch/x86/kvm/vmx/nested.c
>> @@ -4974,6 +4974,9 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
>> * destination for long mode!
>> */
>> exn = is_noncanonical_address(*ret, vcpu);
>> +
>> + if (!exn)
>> + exn = __vmx_check_lass(vcpu, 0, *ret, 0);
>> } else {
>> /*
>> * When not in long mode, the virtual/linear address is
>> diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
>> index b12da2a6dec9..30cb5d0980be 100644
>> --- a/arch/x86/kvm/vmx/sgx.c
>> +++ b/arch/x86/kvm/vmx/sgx.c
>> @@ -37,6 +37,8 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
>> fault = true;
>> } else if (likely(is_long_mode(vcpu))) {
>> fault = is_noncanonical_address(*gva, vcpu);
>> + if (!fault)
>> + fault = __vmx_check_lass(vcpu, 0, *gva, 0);
>> } else {
>> *gva &= 0xffffffff;
>> fault = (s.unusable) ||

2023-04-25 06:53:32

by Zeng Guang

Subject: Re: [PATCH 6/6] KVM: x86: Set KVM LASS based on hardware capability


On 4/25/2023 10:57 AM, Binbin Wu wrote:
>
> On 4/20/2023 9:37 PM, Zeng Guang wrote:
>> Host kernel may clear LASS capability in boot_cpu_data.x86_capability
> Is there some option to do it?

A kernel supporting LASS will turn off the LASS capability when booted
with a specific option, e.g. "vsyscall=emulate".

>> besides explicitly using clearcpuid parameter. That will cause guest
>> not being able to manage LASS independently. So set KVM LASS directly
>> based on hardware capability to eliminate the dependency.
>>
>> Add new helper functions to facilitate getting result of CPUID sub-leaf.
>>
>> Signed-off-by: Zeng Guang <[email protected]>
>> ---
>> arch/x86/include/asm/cpuid.h | 36 ++++++++++++++++++++++++++++++++++++
>> arch/x86/kvm/cpuid.c | 4 ++++
>> 2 files changed, 40 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/cpuid.h b/arch/x86/include/asm/cpuid.h
>> index 9bee3e7bf973..a25dd00b7c0a 100644
>> --- a/arch/x86/include/asm/cpuid.h
>> +++ b/arch/x86/include/asm/cpuid.h
>> @@ -127,6 +127,42 @@ static inline unsigned int cpuid_edx(unsigned int op)
>> return edx;
>> }
>>
>> +static inline unsigned int cpuid_count_eax(unsigned int op, int count)
>> +{
>> + unsigned int eax, ebx, ecx, edx;
>> +
>> + cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
>> +
>> + return eax;
>> +}
>> +
>> +static inline unsigned int cpuid_count_ebx(unsigned int op, int count)
>> +{
>> + unsigned int eax, ebx, ecx, edx;
>> +
>> + cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
>> +
>> + return ebx;
>> +}
>> +
>> +static inline unsigned int cpuid_count_ecx(unsigned int op, int count)
>> +{
>> + unsigned int eax, ebx, ecx, edx;
>> +
>> + cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
>> +
>> + return ecx;
>> +}
>> +
>> +static inline unsigned int cpuid_count_edx(unsigned int op, int count)
>> +{
>> + unsigned int eax, ebx, ecx, edx;
>> +
>> + cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
>> +
>> + return edx;
>> +}
>> +
>> static __always_inline bool cpuid_function_is_indexed(u32 function)
>> {
>> switch (function) {
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index 5facb8037140..e99b99ebe1fe 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -667,6 +667,10 @@ void kvm_set_cpu_caps(void)
>> F(AMX_FP16) | F(AVX_IFMA)
>> );
>>
>> + /* Set LASS based on hardware capability */
>> + if (cpuid_count_eax(7, 1) & F(LASS))
>> + kvm_cpu_cap_set(X86_FEATURE_LASS);
>> +
>> kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
>> F(AVX_VNNI_INT8) | F(AVX_NE_CONVERT) | F(PREFETCHITI)
>> );

2023-04-25 07:32:54

by Chao Gao

Subject: Re: [PATCH 6/6] KVM: x86: Set KVM LASS based on hardware capability

On Thu, Apr 20, 2023 at 09:37:24PM +0800, Zeng Guang wrote:
>Host kernel may clear LASS capability in boot_cpu_data.x86_capability
>besides explicitly using clearcpuid parameter. That will cause guest
>not being able to manage LASS independently. So set KVM LASS directly
>based on hardware capability to eliminate the dependency.

...

>+ /* Set LASS based on hardware capability */
>+ if (cpuid_count_eax(7, 1) & F(LASS))
>+ kvm_cpu_cap_set(X86_FEATURE_LASS);
>+

What if LASS is cleared in boot_cpu_data because not all CPUs support LASS?

In arch/x86/kernel/cpu/common.c, identify_cpu() clears features which are
not supported by all CPUs:

/*
* On SMP, boot_cpu_data holds the common feature set between
* all CPUs; so make sure that we indicate which features are
* common between the CPUs. The first time this routine gets
* executed, c == &boot_cpu_data.
*/
if (c != &boot_cpu_data) {
/* AND the already accumulated flags with these */
for (i = 0; i < NCAPINTS; i++)
boot_cpu_data.x86_capability[i] &= c->x86_capability[i];

LA57 seems to have the same issue. We may need to add some checks for LA57
in KVM's cpu hotplug callback.

> kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
> F(AVX_VNNI_INT8) | F(AVX_NE_CONVERT) | F(PREFETCHITI)
> );
>--
>2.27.0
>

2023-04-26 01:44:42

by Yuan Yao

Subject: Re: [PATCH 4/6] KVM: x86: LASS protection on KVM emulation when LASS enabled

On Thu, Apr 20, 2023 at 09:37:22PM +0800, Zeng Guang wrote:
> Do LASS violation check for instructions emulated by KVM. Note that for
> instructions executed in the guest directly, hardware will perform the
> check.
>
> Not all instruction emulation leads to accesses to guest linear addresses
> because 1) some instrutions like CPUID, RDMSR, don't take memory as
> operands 2) instruction fetch in most cases is already done inside the
> guest.
>
> Four cases in which kvm may access guest linear addresses are identified
> by code inspection:
> - KVM emulator uses segmented address for instruction fetches or data
> accesses.
> - For implicit data access, KVM emulator gets address to a system data
> structure(GDT/LDT/IDT/TR).
> - For VMX instruction emulation, KVM gets the address from "VM-exit
> instruction information" field in VMCS.
> - For SGX ENCLS instruction emulation, KVM gets the address from registers.
>
> LASS violation check applies to these linear address so as to enforce
> mode-based protections as hardware behaves.
>
> As exceptions, the target memory address of emulation of invlpg, branch
> and call instructions doesn't require LASS violation check.
>
> Signed-off-by: Zeng Guang <[email protected]>
> ---
> arch/x86/kvm/emulate.c | 36 +++++++++++++++++++++++++++++++-----
> arch/x86/kvm/vmx/nested.c | 3 +++
> arch/x86/kvm/vmx/sgx.c | 2 ++
> 3 files changed, 36 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 5cc3efa0e21c..a9a022fd712e 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -687,7 +687,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
> struct segmented_address addr,
> unsigned *max_size, unsigned size,
> bool write, bool fetch,
> - enum x86emul_mode mode, ulong *linear)
> + enum x86emul_mode mode, ulong *linear,
> + u64 flags)
> {
> struct desc_struct desc;
> bool usable;
> @@ -695,6 +696,7 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
> u32 lim;
> u16 sel;
> u8 va_bits;
> + u64 access = fetch ? PFERR_FETCH_MASK : 0;
>
> la = seg_base(ctxt, addr.seg) + addr.ea;
> *max_size = 0;
> @@ -740,6 +742,10 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
> }
> break;
> }
> +
> + if (ctxt->ops->check_lass(ctxt, access, *linear, flags))
> + goto bad;
> +
> if (la & (insn_alignment(ctxt, size) - 1))
> return emulate_gp(ctxt, 0);
> return X86EMUL_CONTINUE;
> @@ -757,7 +763,7 @@ static int linearize(struct x86_emulate_ctxt *ctxt,
> {
> unsigned max_size;
> return __linearize(ctxt, addr, &max_size, size, write, false,
> - ctxt->mode, linear);
> + ctxt->mode, linear, 0);
> }
>
> static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
> @@ -770,7 +776,10 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
>
> if (ctxt->op_bytes != sizeof(unsigned long))
> addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
> - rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode, &linear);
> +
> + /* LASS doesn't apply to address for branch and call instructions */
> + rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode,
> + &linear, KVM_X86_EMULFLAG_SKIP_LASS);

The emulator.c is a common part of x86, so maybe a more common
abstraction like permission_check_before_paging would be better?
Let's also wait for other people's input on this.

> if (rc == X86EMUL_CONTINUE)
> ctxt->_eip = addr.ea;
> return rc;
> @@ -845,6 +854,13 @@ static inline int jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
> static int linear_read_system(struct x86_emulate_ctxt *ctxt, ulong linear,
> void *data, unsigned size)
> {
> + if (ctxt->ops->check_lass(ctxt, PFERR_IMPLICIT_ACCESS, linear, 0)) {
> + ctxt->exception.vector = GP_VECTOR;
> + ctxt->exception.error_code = 0;
> + ctxt->exception.error_code_valid = true;
> + return X86EMUL_PROPAGATE_FAULT;
> + }
> +
> return ctxt->ops->read_std(ctxt, linear, data, size, &ctxt->exception, true);
> }
>
> @@ -852,6 +868,13 @@ static int linear_write_system(struct x86_emulate_ctxt *ctxt,
> ulong linear, void *data,
> unsigned int size)
> {
> + if (ctxt->ops->check_lass(ctxt, PFERR_IMPLICIT_ACCESS, linear, 0)) {
> + ctxt->exception.vector = GP_VECTOR;
> + ctxt->exception.error_code = 0;
> + ctxt->exception.error_code_valid = true;
> + return X86EMUL_PROPAGATE_FAULT;
> + }
> +
> return ctxt->ops->write_std(ctxt, linear, data, size, &ctxt->exception, true);
> }
>
> @@ -907,7 +930,7 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
> * against op_size.
> */
> rc = __linearize(ctxt, addr, &max_size, 0, false, true, ctxt->mode,
> - &linear);
> + &linear, 0);
> if (unlikely(rc != X86EMUL_CONTINUE))
> return rc;
>
> @@ -3432,8 +3455,11 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
> {
> int rc;
> ulong linear;
> + unsigned max_size;
>
> - rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
> + /* LASS doesn't apply to the memory address for invlpg */
> + rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, false, false,
> + ctxt->mode, &linear, KVM_X86_EMULFLAG_SKIP_LASS);
> if (rc == X86EMUL_CONTINUE)
> ctxt->ops->invlpg(ctxt, linear);
> /* Disable writeback. */
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index c8ae9d0e59b3..55c88c4593a6 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -4974,6 +4974,9 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
> * destination for long mode!
> */
> exn = is_noncanonical_address(*ret, vcpu);
> +
> + if (!exn)
> + exn = __vmx_check_lass(vcpu, 0, *ret, 0);
> } else {
> /*
> * When not in long mode, the virtual/linear address is
> diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
> index b12da2a6dec9..30cb5d0980be 100644
> --- a/arch/x86/kvm/vmx/sgx.c
> +++ b/arch/x86/kvm/vmx/sgx.c
> @@ -37,6 +37,8 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
> fault = true;
> } else if (likely(is_long_mode(vcpu))) {
> fault = is_noncanonical_address(*gva, vcpu);
> + if (!fault)
> + fault = __vmx_check_lass(vcpu, 0, *gva, 0);
> } else {
> *gva &= 0xffffffff;
> fault = (s.unusable) ||
> --
> 2.27.0
>