2023-07-18 14:15:21

by Zeng Guang

Subject: [PATCH v2 0/8] LASS KVM virtualization support

Linear Address Space Separation (LASS)[1] is a new mechanism that
enforces the same mode-based protections as paging, i.e. SMAP/SMEP
but without traversing the paging structures. Because the protections
enforced by LASS are applied before paging, "probes" by malicious
software will provide no paging-based timing information.

Based on linear-address organization, LASS partitions the 64-bit linear
address space into two halves: user-mode addresses (LA[bit 63]=0) and
supervisor-mode addresses (LA[bit 63]=1).

LASS aims to prevent any attempt to probe supervisor-mode addresses from
user mode, and likewise to stop any attempt to access (if SMAP is enabled)
or execute user-mode addresses from supervisor mode.

When the platform has LASS capability, KVM needs to expose the feature to
the guest VM, enumerated by CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6], and
allow the guest to enable it via CR4.LASS[bit 27] on demand. For
instructions executed directly in the guest, the hardware performs the
check. But KVM also needs to behave the same as hardware and apply LASS to
all kinds of guest memory accesses when emulating instructions in software.

KVM performs the following LASS violation checks on the emulation path
(a simplified C sketch follows below).
User-mode access to a supervisor-space address:
    LA[bit 63] && (CPL == 3)
Supervisor-mode access to a user-space address:
    Instruction fetch: !LA[bit 63] && (CPL < 3)
    Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
                 CPL < 3) || Implicit supervisor access)
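
For illustration, here is a simplified C sketch of those checks. The
helper name and parameter list are made up for this sketch; the series'
actual helper, vmx_is_lass_violation(), is introduced later in the series.

#include <stdbool.h>
#include <stdint.h>

static bool lass_violation_sketch(uint64_t la, unsigned int cpl, bool fetch,
				  bool implicit, bool cr4_smap, bool rflags_ac)
{
	bool user_half = !(la & (1ULL << 63));

	/* User-mode access to a supervisor-space address */
	if (cpl == 3)
		return !user_half;

	/* Supervisor-mode (CPL < 3) access to a user-space address */
	if (user_half) {
		if (fetch)
			return true;
		/* Data access, including implicit supervisor accesses */
		return cr4_smap && (!rflags_ac || implicit);
	}

	return false;
}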

This patch series provides the LASS KVM solution and depends on the kernel
enabling series that can be found at
https://lore.kernel.org/all/[email protected]/

We tested the basic functionality of LASS virtualization, including LASS
enumeration and enabling, in non-root and nested environments. As the KVM
unit test framework is not compatible with the LASS rules, we instead used
a kernel module and an application test to trigger LASS violations. With
the KVM forced emulation mechanism, we also verified that LASS enforcement
on some emulation paths, for both instruction fetch and data access,
behaves the same as hardware.

How to extend kselftest to support LASS is still under investigation and
experimentation.

[1] Intel ISE https://cdrdv2.intel.com/v1/dl/getContent/671368
Chapter Linear Address Space Separation (LASS)

------------------------------------------------------------------------

v1->v2
1. Refactor and optimize the instruction emulation interface by
   introducing a new set of operation type definitions prefixed with
   "X86EMUL_F_" to distinguish access types.
2. Reorganize the patches so that each affected area of KVM is better
   isolated.
3. Refine the LASS violation check design to take wraparound accesses
   across the address space boundary into account.

v0->v1
1. Adapt to new __linearize() API
2. Function refactor of vmx_check_lass()
3. Refine commit message to be more precise
4. Drop LASS kvm cap detection depending on hardware capability

Binbin Wu (4):
KVM: x86: Consolidate flags for __linearize()
KVM: x86: Use a new flag for branch instructions
KVM: x86: Add an emulation flag for implicit system access
KVM: x86: Add X86EMUL_F_INVTLB and pass it in em_invlpg()

Zeng Guang (4):
KVM: emulator: Add emulation of LASS violation checks on linear
address
KVM: VMX: Implement and apply vmx_is_lass_violation() for LASS
protection
KVM: x86: Virtualize CR4.LASS
KVM: x86: Advertise LASS CPUID to user space

arch/x86/include/asm/kvm-x86-ops.h | 3 ++-
arch/x86/include/asm/kvm_host.h | 5 +++-
arch/x86/kvm/cpuid.c | 5 ++--
arch/x86/kvm/emulate.c | 37 ++++++++++++++++++++---------
arch/x86/kvm/kvm_emulate.h | 9 +++++++
arch/x86/kvm/vmx/nested.c | 3 ++-
arch/x86/kvm/vmx/sgx.c | 4 ++++
arch/x86/kvm/vmx/vmx.c | 38 ++++++++++++++++++++++++++++++
arch/x86/kvm/vmx/vmx.h | 3 +++
arch/x86/kvm/x86.c | 10 ++++++++
arch/x86/kvm/x86.h | 2 ++
11 files changed, 102 insertions(+), 17 deletions(-)

--
2.27.0



2023-07-18 14:19:37

by Zeng Guang

Subject: [PATCH v2 7/8] KVM: x86: Virtualize CR4.LASS

Virtualize CR4.LASS[bit 27] under KVM control instead of making it
guest-owned, as CR4.LASS is generally set once for each vCPU at boot time
and won't be toggled at runtime. Besides, KVM allows guest software to set
CR4.LASS only if the VM has the LASS capability enumerated by
CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6].

Update cr4_fixed1 to set the CR4.LASS bit in the emulated
IA32_VMX_CR4_FIXED1 MSR for guests, allowing guests to enable LASS in
nested VMX operation as well.

Note: Setting CR4.LASS to 1 enables LASS in IA-32e mode. It doesn't take
effect in legacy mode even if CR4.LASS is set.
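
For illustration only (hypothetical guest-side code, not part of this
patch; X86_FEATURE_LASS and X86_CR4_LASS come from the dependent kernel
enabling series), a 64-bit guest kernel could turn the feature on once it
sees the CPUID bit:

	/* guest kernel running in IA-32e mode */
	if (boot_cpu_has(X86_FEATURE_LASS))	/* CPUID.(EAX=07H,ECX=1):EAX[6] */
		cr4_set_bits(X86_CR4_LASS);	/* sets CR4 bit 27 */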

Signed-off-by: Zeng Guang <[email protected]>
Tested-by: Xuelian Guo <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 2 +-
arch/x86/kvm/vmx/vmx.c | 3 +++
arch/x86/kvm/x86.h | 2 ++
3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 791f0dd48cd9..a881b0518a18 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -125,7 +125,7 @@
| X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
| X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
| X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
- | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+ | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP | X86_CR4_LASS))

#define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 15a7c6e7a25d..e74991bed362 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7603,6 +7603,9 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
cr4_fixed1_update(X86_CR4_UMIP, ecx, feature_bit(UMIP));
cr4_fixed1_update(X86_CR4_LA57, ecx, feature_bit(LA57));

+ entry = kvm_find_cpuid_entry_index(vcpu, 0x7, 1);
+ cr4_fixed1_update(X86_CR4_LASS, eax, feature_bit(LASS));
+
#undef cr4_fixed1_update
}

diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index c544602d07a3..e1295f490308 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -529,6 +529,8 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type);
__reserved_bits |= X86_CR4_VMXE; \
if (!__cpu_has(__c, X86_FEATURE_PCID)) \
__reserved_bits |= X86_CR4_PCIDE; \
+ if (!__cpu_has(__c, X86_FEATURE_LASS)) \
+ __reserved_bits |= X86_CR4_LASS; \
__reserved_bits; \
})

--
2.27.0


2023-07-18 14:23:01

by Zeng Guang

Subject: [PATCH v2 4/8] KVM: x86: Add X86EMUL_F_INVTLB and pass it in em_invlpg()

From: Binbin Wu <[email protected]>

Add an emulation flag X86EMUL_F_INVTLB, which is used to identify an
instruction that does TLB invalidation without a true memory access.

Of the instructions implemented in the emulator, only invlpg & invlpga
are of this kind. invlpga doesn't need additional information for
emulation, so just pass the flag to em_invlpg().
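
As a rough sketch of how a consumer might use the new flag (the helper
below is hypothetical and not part of this patch; the real LASS check
comes later in the series):

	static inline bool emul_is_tlb_invalidation(unsigned int flags)
	{
		/*
		 * invlpg/invlpga take a linear address operand but only
		 * invalidate TLB entries; they never read or write memory
		 * at that address, so checks meant for true memory
		 * accesses can treat them specially.
		 */
		return flags & X86EMUL_F_INVTLB;
	}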

Signed-off-by: Binbin Wu <[email protected]>
Signed-off-by: Zeng Guang <[email protected]>
---
arch/x86/kvm/emulate.c | 4 +++-
arch/x86/kvm/kvm_emulate.h | 1 +
2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 8e706d19ae45..9b4b3ce6d52a 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3443,8 +3443,10 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
{
int rc;
ulong linear;
+ unsigned max_size;

- rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
+ rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, ctxt->mode,
+ &linear, X86EMUL_F_INVTLB);
if (rc == X86EMUL_CONTINUE)
ctxt->ops->invlpg(ctxt, linear);
/* Disable writeback. */
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index c0e48f4fa7c4..c944055091e1 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -93,6 +93,7 @@ struct x86_instruction_info {
#define X86EMUL_F_FETCH BIT(1)
#define X86EMUL_F_BRANCH BIT(2)
#define X86EMUL_F_IMPLICIT BIT(3)
+#define X86EMUL_F_INVTLB BIT(4)

struct x86_emulate_ops {
void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
--
2.27.0


2023-07-18 14:33:03

by Zeng Guang

Subject: [PATCH v2 1/8] KVM: x86: Consolidate flags for __linearize()

From: Binbin Wu <[email protected]>

Consolidate @write and @fetch of __linearize() into a set of flags so that
additional flags can be added without needing more/new boolean parameters,
to precisely identify the access type.

No functional change intended.
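
For illustration (a hypothetical future caller, not in this patch): with
flags, a new access kind such as the branch flag added later in this
series can be expressed without growing the parameter list, e.g.:

	rc = __linearize(ctxt, addr, &max_size, size, ctxt->mode, &linear,
			 X86EMUL_F_FETCH | X86EMUL_F_BRANCH);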

Signed-off-by: Binbin Wu <[email protected]>
Reviewed-by: Chao Gao <[email protected]>
Acked-by: Kai Huang <[email protected]>
Signed-off-by: Zeng Guang <[email protected]>
---
arch/x86/kvm/emulate.c | 21 +++++++++++----------
arch/x86/kvm/kvm_emulate.h | 4 ++++
2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 936a397a08cd..3ddfbc99fa4f 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -687,8 +687,8 @@ static unsigned insn_alignment(struct x86_emulate_ctxt *ctxt, unsigned size)
static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
struct segmented_address addr,
unsigned *max_size, unsigned size,
- bool write, bool fetch,
- enum x86emul_mode mode, ulong *linear)
+ enum x86emul_mode mode, ulong *linear,
+ unsigned int flags)
{
struct desc_struct desc;
bool usable;
@@ -717,11 +717,11 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
if (!usable)
goto bad;
/* code segment in protected mode or read-only data segment */
- if ((((ctxt->mode != X86EMUL_MODE_REAL) && (desc.type & 8))
- || !(desc.type & 2)) && write)
+ if ((((ctxt->mode != X86EMUL_MODE_REAL) && (desc.type & 8)) || !(desc.type & 2)) &&
+ (flags & X86EMUL_F_WRITE))
goto bad;
/* unreadable code segment */
- if (!fetch && (desc.type & 8) && !(desc.type & 2))
+ if (!(flags & X86EMUL_F_FETCH) && (desc.type & 8) && !(desc.type & 2))
goto bad;
lim = desc_limit_scaled(&desc);
if (!(desc.type & 8) && (desc.type & 4)) {
@@ -757,8 +757,8 @@ static int linearize(struct x86_emulate_ctxt *ctxt,
ulong *linear)
{
unsigned max_size;
- return __linearize(ctxt, addr, &max_size, size, write, false,
- ctxt->mode, linear);
+ return __linearize(ctxt, addr, &max_size, size, ctxt->mode, linear,
+ write ? X86EMUL_F_WRITE : 0);
}

static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
@@ -771,7 +771,8 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)

if (ctxt->op_bytes != sizeof(unsigned long))
addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
- rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode, &linear);
+ rc = __linearize(ctxt, addr, &max_size, 1, ctxt->mode, &linear,
+ X86EMUL_F_FETCH);
if (rc == X86EMUL_CONTINUE)
ctxt->_eip = addr.ea;
return rc;
@@ -907,8 +908,8 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
* boundary check itself. Instead, we use max_size to check
* against op_size.
*/
- rc = __linearize(ctxt, addr, &max_size, 0, false, true, ctxt->mode,
- &linear);
+ rc = __linearize(ctxt, addr, &max_size, 0, ctxt->mode, &linear,
+ X86EMUL_F_FETCH);
if (unlikely(rc != X86EMUL_CONTINUE))
return rc;

diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index ab65f3a47dfd..86bbe997162d 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -88,6 +88,10 @@ struct x86_instruction_info {
#define X86EMUL_IO_NEEDED 5 /* IO is needed to complete emulation */
#define X86EMUL_INTERCEPTED 6 /* Intercepted by nested VMCB/VMCS */

+/* x86-specific emulation flags */
+#define X86EMUL_F_WRITE BIT(0)
+#define X86EMUL_F_FETCH BIT(1)
+
struct x86_emulate_ops {
void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
/*
--
2.27.0


2023-07-19 03:27:15

by Zeng Guang

Subject: Re: [PATCH v2 0/8] LASS KVM virtualization support

Please ignore this patch set as I posted the wrong one by mistake.
I will submit the correct patch series soon. Sorry for the noise.

On 7/18/2023 9:18 PM, Zeng, Guang wrote:
> [...]

2023-07-20 02:59:17

by H. Peter Anvin

Subject: Re: [PATCH v2 0/8] LASS KVM virtualization support

On July 18, 2023 6:18:36 AM PDT, Zeng Guang <[email protected]> wrote:
>Linear Address Space Separation (LASS)[1] is a new mechanism that
>enforces the same mode-based protections as paging, i.e. SMAP/SMEP
>but without traversing the paging structures. Because the protections
>enforced by LASS are applied before paging, "probes" by malicious
>software will provide no paging-based timing information.
>
>[...]

Equating this with SMEP/SMAP is backwards.

LASS is something completely different: it makes it so *user space accesses* cannot even walk the kernel page tables (specifically, the negative half of the linear address space.)

Such an access will immediately #GP: it is similar to always having U=0 in the uppermost level of the page tables, except with LASS enabled the CPU will not even touch the page tables in memory.
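
For illustration (an addition here, not from the original mail), a minimal
user-space probe of the supervisor half; the address is arbitrary, any
linear address with bit 63 set behaves the same way:

#include <stdint.h>

int main(void)
{
	/* supervisor-half linear address (bit 63 set); value is illustrative */
	volatile uint64_t *p = (volatile uint64_t *)0xffff800000000000ULL;

	/*
	 * With CR4.LASS=1 this load raises #GP before any page-table walk,
	 * so the probe gains no paging-based timing information.  Without
	 * LASS the same load would walk the page tables and take a #PF
	 * (U/S=0 or not-present).  In either case the process typically
	 * receives SIGSEGV.
	 */
	return (int)*p;
}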