2024-03-18 23:38:58

by Paolo Bonzini

[permalink] [raw]
Subject: [PATCH v4 00/15] KVM: SEV: allow customizing VMSA features

[Dave: there is a small arch/x86/kernel/fpu change in patch 9;
I am CCing you in the cover letter just for context. - Paolo]

The idea that no parameter would ever be necessary when enabling SEV or
SEV-ES for a VM was decidedly optimistic. The first source of variability
that was encountered is the desired set of VMSA features, as that affects
the measurement of the VM's initial state and cannot be changed
arbitrarily by the hypervisor.

This series adds all the APIs that are needed to customize the features,
with room for future enhancements:

- a new /dev/kvm device attribute to retrieve the set of supported
features (right now, only debug swap)

- a new sub-operation for KVM_MEM_ENCRYPT_OP that can take a struct,
replacing the existing KVM_SEV_INIT and KVM_SEV_ES_INIT

It then puts the new op to work by including the VMSA features as a field
of the The existing KVM_SEV_INIT and KVM_SEV_ES_INIT use the full set of
supported VMSA features for backwards compatibility; but I am considering
also making them use zero as the feature mask, and will gladly adjust the
patches if so requested.

In order to avoid creating *two* new KVM_MEM_ENCRYPT_OPs, I decided that
I could as well make SEV and SEV-ES use VM types. And then, why not make
a SEV-ES VM, when created with the new VM type instead of KVM_SEV_ES_INIT,
reject KVM_GET_REGS/KVM_SET_REGS and friends on the vCPU file descriptor
once the VMSA has been encrypted... Which is how the API should have
always behaved.

Note: despite having the same number of patches as v3, #9 and #15 are new!
The series is structured as follows:

- patches 1 and 2 change sev.c so that it is compiled only if SEV is enabled
in kconfig

- patches 3 to 6 introduce the new device attribute to retrieve supported
VMSA features

- patches 7 and 8 introduce new infrastructure for VM types, replacing
the similar code in the TDX patches

- patch 9 allows setting the FPU and AVX state prior to encryption of the
VMSA

- patches 10 to 12 introduce the new VM types for SEV and
SEV-ES, and KVM_SEV_INIT2 as a new sub-operation for KVM_MEM_ENCRYPT_OP.

- patch 13 reenables DebugSwap, now that there is an API that allows doing
so without breaking backwards compatibility

- patches 14 and 15 test the new ioctl.

The idea is that SEV SNP will only ever support KVM_SEV_INIT2. I have
placed patches for QEMU to support this new API at branch sevinit2 of
https://gitlab.com/bonzini/qemu.

I haven't fully tested patch 9 and it really deserves a selftest,
it is a bit tricky though without ucall infrastructure for SEV.
I will look at it tomorrow.

The series is at branch kvm-coco-queue of kvm.git, and I would like to
include it in kvm/next as soon as possible after the release of -rc1.

Thanks,

Paolo

v3->v4:
- moved patches 1 and 5 to separate "fixes" series for 6.9
- do not conditionalize prototypes for functions that are called by common
SVM code
- rebased on top of SEV selftest infrastructure from 6.9 merge window;
include new patch to drop the "subtype" concept
- rebased on top of SEV-SNP patches from 6.9 merge window
- rebased on top of patch to disable DebugSwap from 6.8 rc;
drop "warn" once SEV_ES_INIT stops enabling VMSA features and
finally re-enable DebugSwap
- simplified logic to return -EINVAL from ioctls
- also block KVM_GET/SET_FPU for protected-state guests
- move logic to set kvm_>arch.has_protected_state to svm_vm_init
- fix "struct struct" in documentation

Paolo Bonzini (14):
KVM: SVM: Compile sev.c if and only if CONFIG_KVM_AMD_SEV=y
KVM: x86: use u64_to_user_addr()
KVM: introduce new vendor op for KVM_GET_DEVICE_ATTR
KVM: SEV: publish supported VMSA features
KVM: SEV: store VMSA features in kvm_sev_info
KVM: x86: add fields to struct kvm_arch for CoCo features
KVM: x86: Add supported_vm_types to kvm_caps
KVM: SEV: sync FPU and AVX state at LAUNCH_UPDATE_VMSA time
KVM: SEV: introduce to_kvm_sev_info
KVM: SEV: define VM types for SEV and SEV-ES
KVM: SEV: introduce KVM_SEV_INIT2 operation
KVM: SEV: allow SEV-ES DebugSwap again
selftests: kvm: add tests for KVM_SEV_INIT2
selftests: kvm: switch to using KVM_X86_*_VM

Sean Christopherson (1):
KVM: SVM: Invert handling of SEV and SEV_ES feature flags

Documentation/virt/kvm/api.rst | 2 +
.../virt/kvm/x86/amd-memory-encryption.rst | 52 +++++-
arch/x86/include/asm/fpu/api.h | 3 +
arch/x86/include/asm/kvm-x86-ops.h | 1 +
arch/x86/include/asm/kvm_host.h | 8 +-
arch/x86/include/uapi/asm/kvm.h | 12 ++
arch/x86/kernel/fpu/xstate.h | 2 -
arch/x86/kvm/Makefile | 7 +-
arch/x86/kvm/cpuid.c | 2 +-
arch/x86/kvm/svm/sev.c | 172 ++++++++++++++----
arch/x86/kvm/svm/svm.c | 27 ++-
arch/x86/kvm/svm/svm.h | 43 ++++-
arch/x86/kvm/x86.c | 170 +++++++++++------
arch/x86/kvm/x86.h | 2 +
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/include/kvm_util_base.h | 11 +-
.../selftests/kvm/include/x86_64/processor.h | 6 -
.../selftests/kvm/include/x86_64/sev.h | 16 +-
tools/testing/selftests/kvm/lib/kvm_util.c | 1 -
.../selftests/kvm/lib/x86_64/processor.c | 14 +-
tools/testing/selftests/kvm/lib/x86_64/sev.c | 30 ++-
.../selftests/kvm/set_memory_region_test.c | 8 +-
.../selftests/kvm/x86_64/sev_init2_tests.c | 149 +++++++++++++++
23 files changed, 569 insertions(+), 170 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/sev_init2_tests.c

--
2.43.0



2024-03-18 23:38:58

by Paolo Bonzini

[permalink] [raw]
Subject: [PATCH v4 03/15] KVM: x86: use u64_to_user_addr()

There is no danger to the kernel if 32-bit userspace provides a 64-bit
value that has the high bits set, but for whatever reason happens to
resolve to an address that has something mapped there. KVM uses the
checked version of get_user() and put_user(), so any faults are caught
properly.

Suggested-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kvm/x86.c | 24 +++---------------------
1 file changed, 3 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 47d9f03b7778..3d2029402513 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4842,25 +4842,13 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
return r;
}

-static inline void __user *kvm_get_attr_addr(struct kvm_device_attr *attr)
-{
- void __user *uaddr = (void __user*)(unsigned long)attr->addr;
-
- if ((u64)(unsigned long)uaddr != attr->addr)
- return ERR_PTR_USR(-EFAULT);
- return uaddr;
-}
-
static int kvm_x86_dev_get_attr(struct kvm_device_attr *attr)
{
- u64 __user *uaddr = kvm_get_attr_addr(attr);
+ u64 __user *uaddr = u64_to_user_ptr(attr->addr);

if (attr->group)
return -ENXIO;

- if (IS_ERR(uaddr))
- return PTR_ERR(uaddr);
-
switch (attr->attr) {
case KVM_X86_XCOMP_GUEST_SUPP:
if (put_user(kvm_caps.supported_xcr0, uaddr))
@@ -5712,12 +5700,9 @@ static int kvm_arch_tsc_has_attr(struct kvm_vcpu *vcpu,
static int kvm_arch_tsc_get_attr(struct kvm_vcpu *vcpu,
struct kvm_device_attr *attr)
{
- u64 __user *uaddr = kvm_get_attr_addr(attr);
+ u64 __user *uaddr = u64_to_user_ptr(attr->addr);
int r;

- if (IS_ERR(uaddr))
- return PTR_ERR(uaddr);
-
switch (attr->attr) {
case KVM_VCPU_TSC_OFFSET:
r = -EFAULT;
@@ -5735,13 +5720,10 @@ static int kvm_arch_tsc_get_attr(struct kvm_vcpu *vcpu,
static int kvm_arch_tsc_set_attr(struct kvm_vcpu *vcpu,
struct kvm_device_attr *attr)
{
- u64 __user *uaddr = kvm_get_attr_addr(attr);
+ u64 __user *uaddr = u64_to_user_ptr(attr->addr);
struct kvm *kvm = vcpu->kvm;
int r;

- if (IS_ERR(uaddr))
- return PTR_ERR(uaddr);
-
switch (attr->attr) {
case KVM_VCPU_TSC_OFFSET: {
u64 offset, tsc, ns;
--
2.43.0



2024-03-18 23:39:31

by Paolo Bonzini

[permalink] [raw]
Subject: [PATCH v4 07/15] KVM: x86: add fields to struct kvm_arch for CoCo features

Some VM types have characteristics in common; in fact, the only use
of VM types right now is kvm_arch_has_private_mem and it assumes that
_all_ nonzero VM types have private memory.

We will soon introduce a VM type for SEV and SEV-ES VMs, and at that
point we will have two special characteristics of confidential VMs
that depend on the VM type: not just if memory is private, but
also whether guest state is protected. For the latter we have
kvm->arch.guest_state_protected, which is only set on a fully initialized
VM.

For VM types with protected guest state, we can actually fix a problem in
the SEV-ES implementation, where ioctls to set registers do not cause an
error even if the VM has been initialized and the guest state encrypted.
Make sure that when using VM types that will become an error.

Signed-off-by: Paolo Bonzini <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 7 ++-
arch/x86/kvm/x86.c | 93 ++++++++++++++++++++++++++-------
2 files changed, 79 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f6cc7bfb5462..7380877bc9b5 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1279,12 +1279,14 @@ enum kvm_apicv_inhibit {
};

struct kvm_arch {
- unsigned long vm_type;
unsigned long n_used_mmu_pages;
unsigned long n_requested_mmu_pages;
unsigned long n_max_mmu_pages;
unsigned int indirect_shadow_pages;
u8 mmu_valid_gen;
+ u8 vm_type;
+ bool has_private_mem;
+ bool has_protected_state;
struct hlist_head mmu_page_hash[KVM_NUM_MMU_PAGES];
struct list_head active_mmu_pages;
struct list_head zapped_obsolete_pages;
@@ -2153,8 +2155,9 @@ void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd);
void kvm_configure_mmu(bool enable_tdp, int tdp_forced_root_level,
int tdp_max_root_level, int tdp_huge_page_level);

+
#ifdef CONFIG_KVM_PRIVATE_MEM
-#define kvm_arch_has_private_mem(kvm) ((kvm)->arch.vm_type != KVM_X86_DEFAULT_VM)
+#define kvm_arch_has_private_mem(kvm) ((kvm)->arch.has_private_mem)
#else
#define kvm_arch_has_private_mem(kvm) false
#endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e8253aa8ef5e..98b7979b4698 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5560,11 +5560,15 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
return 0;
}

-static void kvm_vcpu_ioctl_x86_get_debugregs(struct kvm_vcpu *vcpu,
- struct kvm_debugregs *dbgregs)
+static int kvm_vcpu_ioctl_x86_get_debugregs(struct kvm_vcpu *vcpu,
+ struct kvm_debugregs *dbgregs)
{
unsigned int i;

+ if (vcpu->kvm->arch.has_protected_state &&
+ vcpu->arch.guest_state_protected)
+ return -EINVAL;
+
memset(dbgregs, 0, sizeof(*dbgregs));

BUILD_BUG_ON(ARRAY_SIZE(vcpu->arch.db) != ARRAY_SIZE(dbgregs->db));
@@ -5573,6 +5577,7 @@ static void kvm_vcpu_ioctl_x86_get_debugregs(struct kvm_vcpu *vcpu,

dbgregs->dr6 = vcpu->arch.dr6;
dbgregs->dr7 = vcpu->arch.dr7;
+ return 0;
}

static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,
@@ -5580,6 +5585,10 @@ static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,
{
unsigned int i;

+ if (vcpu->kvm->arch.has_protected_state &&
+ vcpu->arch.guest_state_protected)
+ return -EINVAL;
+
if (dbgregs->flags)
return -EINVAL;

@@ -5600,8 +5609,8 @@ static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,
}


-static void kvm_vcpu_ioctl_x86_get_xsave2(struct kvm_vcpu *vcpu,
- u8 *state, unsigned int size)
+static int kvm_vcpu_ioctl_x86_get_xsave2(struct kvm_vcpu *vcpu,
+ u8 *state, unsigned int size)
{
/*
* Only copy state for features that are enabled for the guest. The
@@ -5619,24 +5628,25 @@ static void kvm_vcpu_ioctl_x86_get_xsave2(struct kvm_vcpu *vcpu,
XFEATURE_MASK_FPSSE;

if (fpstate_is_confidential(&vcpu->arch.guest_fpu))
- return;
+ return vcpu->kvm->arch.has_protected_state ? -EINVAL : 0;

fpu_copy_guest_fpstate_to_uabi(&vcpu->arch.guest_fpu, state, size,
supported_xcr0, vcpu->arch.pkru);
+ return 0;
}

-static void kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu,
- struct kvm_xsave *guest_xsave)
+static int kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu,
+ struct kvm_xsave *guest_xsave)
{
- kvm_vcpu_ioctl_x86_get_xsave2(vcpu, (void *)guest_xsave->region,
- sizeof(guest_xsave->region));
+ return kvm_vcpu_ioctl_x86_get_xsave2(vcpu, (void *)guest_xsave->region,
+ sizeof(guest_xsave->region));
}

static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu,
struct kvm_xsave *guest_xsave)
{
if (fpstate_is_confidential(&vcpu->arch.guest_fpu))
- return 0;
+ return vcpu->kvm->arch.has_protected_state ? -EINVAL : 0;

return fpu_copy_uabi_to_guest_fpstate(&vcpu->arch.guest_fpu,
guest_xsave->region,
@@ -5644,18 +5654,23 @@ static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu,
&vcpu->arch.pkru);
}

-static void kvm_vcpu_ioctl_x86_get_xcrs(struct kvm_vcpu *vcpu,
- struct kvm_xcrs *guest_xcrs)
+static int kvm_vcpu_ioctl_x86_get_xcrs(struct kvm_vcpu *vcpu,
+ struct kvm_xcrs *guest_xcrs)
{
+ if (vcpu->kvm->arch.has_protected_state &&
+ vcpu->arch.guest_state_protected)
+ return -EINVAL;
+
if (!boot_cpu_has(X86_FEATURE_XSAVE)) {
guest_xcrs->nr_xcrs = 0;
- return;
+ return 0;
}

guest_xcrs->nr_xcrs = 1;
guest_xcrs->flags = 0;
guest_xcrs->xcrs[0].xcr = XCR_XFEATURE_ENABLED_MASK;
guest_xcrs->xcrs[0].value = vcpu->arch.xcr0;
+ return 0;
}

static int kvm_vcpu_ioctl_x86_set_xcrs(struct kvm_vcpu *vcpu,
@@ -5663,6 +5678,10 @@ static int kvm_vcpu_ioctl_x86_set_xcrs(struct kvm_vcpu *vcpu,
{
int i, r = 0;

+ if (vcpu->kvm->arch.has_protected_state &&
+ vcpu->arch.guest_state_protected)
+ return -EINVAL;
+
if (!boot_cpu_has(X86_FEATURE_XSAVE))
return -EINVAL;

@@ -6045,7 +6064,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
case KVM_GET_DEBUGREGS: {
struct kvm_debugregs dbgregs;

- kvm_vcpu_ioctl_x86_get_debugregs(vcpu, &dbgregs);
+ r = kvm_vcpu_ioctl_x86_get_debugregs(vcpu, &dbgregs);
+ if (r < 0)
+ break;

r = -EFAULT;
if (copy_to_user(argp, &dbgregs,
@@ -6075,7 +6096,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
if (!u.xsave)
break;

- kvm_vcpu_ioctl_x86_get_xsave(vcpu, u.xsave);
+ r = kvm_vcpu_ioctl_x86_get_xsave(vcpu, u.xsave);
+ if (r < 0)
+ break;

r = -EFAULT;
if (copy_to_user(argp, u.xsave, sizeof(struct kvm_xsave)))
@@ -6104,7 +6127,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
if (!u.xsave)
break;

- kvm_vcpu_ioctl_x86_get_xsave2(vcpu, u.buffer, size);
+ r = kvm_vcpu_ioctl_x86_get_xsave2(vcpu, u.buffer, size);
+ if (r < 0)
+ break;

r = -EFAULT;
if (copy_to_user(argp, u.xsave, size))
@@ -6120,7 +6145,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
if (!u.xcrs)
break;

- kvm_vcpu_ioctl_x86_get_xcrs(vcpu, u.xcrs);
+ r = kvm_vcpu_ioctl_x86_get_xcrs(vcpu, u.xcrs);
+ if (r < 0)
+ break;

r = -EFAULT;
if (copy_to_user(argp, u.xcrs,
@@ -6264,6 +6291,11 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
}
#endif
case KVM_GET_SREGS2: {
+ r = -EINVAL;
+ if (vcpu->kvm->arch.has_protected_state &&
+ vcpu->arch.guest_state_protected)
+ goto out;
+
u.sregs2 = kzalloc(sizeof(struct kvm_sregs2), GFP_KERNEL);
r = -ENOMEM;
if (!u.sregs2)
@@ -6276,6 +6308,11 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
break;
}
case KVM_SET_SREGS2: {
+ r = -EINVAL;
+ if (vcpu->kvm->arch.has_protected_state &&
+ vcpu->arch.guest_state_protected)
+ goto out;
+
u.sregs2 = memdup_user(argp, sizeof(struct kvm_sregs2));
if (IS_ERR(u.sregs2)) {
r = PTR_ERR(u.sregs2);
@@ -11483,6 +11520,10 @@ static void __get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)

int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
{
+ if (vcpu->kvm->arch.has_protected_state &&
+ vcpu->arch.guest_state_protected)
+ return -EINVAL;
+
vcpu_load(vcpu);
__get_regs(vcpu, regs);
vcpu_put(vcpu);
@@ -11524,6 +11565,10 @@ static void __set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)

int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
{
+ if (vcpu->kvm->arch.has_protected_state &&
+ vcpu->arch.guest_state_protected)
+ return -EINVAL;
+
vcpu_load(vcpu);
__set_regs(vcpu, regs);
vcpu_put(vcpu);
@@ -11596,6 +11641,10 @@ static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2)
int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
struct kvm_sregs *sregs)
{
+ if (vcpu->kvm->arch.has_protected_state &&
+ vcpu->arch.guest_state_protected)
+ return -EINVAL;
+
vcpu_load(vcpu);
__get_sregs(vcpu, sregs);
vcpu_put(vcpu);
@@ -11863,6 +11912,10 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
{
int ret;

+ if (vcpu->kvm->arch.has_protected_state &&
+ vcpu->arch.guest_state_protected)
+ return -EINVAL;
+
vcpu_load(vcpu);
ret = __set_sregs(vcpu, sregs);
vcpu_put(vcpu);
@@ -11980,7 +12033,7 @@ int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
struct fxregs_state *fxsave;

if (fpstate_is_confidential(&vcpu->arch.guest_fpu))
- return 0;
+ return vcpu->kvm->arch.has_protected_state ? -EINVAL : 0;

vcpu_load(vcpu);

@@ -12003,7 +12056,7 @@ int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
struct fxregs_state *fxsave;

if (fpstate_is_confidential(&vcpu->arch.guest_fpu))
- return 0;
+ return vcpu->kvm->arch.has_protected_state ? -EINVAL : 0;

vcpu_load(vcpu);

@@ -12529,6 +12582,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
return -EINVAL;

kvm->arch.vm_type = type;
+ kvm->arch.has_private_mem =
+ (type == KVM_X86_SW_PROTECTED_VM);

ret = kvm_page_track_init(kvm);
if (ret)
--
2.43.0



2024-03-18 23:39:41

by Paolo Bonzini

[permalink] [raw]
Subject: [PATCH v4 15/15] selftests: kvm: switch to using KVM_X86_*_VM

This removes the concept of "subtypes", instead letting the tests use proper
VM types that were recently added. While the sev_init_vm() and sev_es_init_vm()
are still able to operate with the legacy KVM_SEV_INIT and KVM_SEV_ES_INIT
ioctls, this is limited to VMs that are created manually with
vm_create_barebones().

Signed-off-by: Paolo Bonzini <[email protected]>
---
.../selftests/kvm/include/kvm_util_base.h | 5 ++--
.../selftests/kvm/include/x86_64/processor.h | 6 ----
.../selftests/kvm/include/x86_64/sev.h | 16 ++--------
tools/testing/selftests/kvm/lib/kvm_util.c | 1 -
.../selftests/kvm/lib/x86_64/processor.c | 14 +++++----
tools/testing/selftests/kvm/lib/x86_64/sev.c | 30 +++++++++++++++++--
6 files changed, 40 insertions(+), 32 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 7c06ceb36643..8acca8237687 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -93,7 +93,6 @@ enum kvm_mem_region_type {
struct kvm_vm {
int mode;
unsigned long type;
- uint8_t subtype;
int kvm_fd;
int fd;
unsigned int pgtable_levels;
@@ -200,8 +199,8 @@ enum vm_guest_mode {
struct vm_shape {
uint32_t type;
uint8_t mode;
- uint8_t subtype;
- uint16_t padding;
+ uint8_t pad0;
+ uint16_t pad1;
};

kvm_static_assert(sizeof(struct vm_shape) == sizeof(uint64_t));
diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 3bd03b088dda..3c8dfd8180b1 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -23,12 +23,6 @@
extern bool host_cpu_is_intel;
extern bool host_cpu_is_amd;

-enum vm_guest_x86_subtype {
- VM_SUBTYPE_NONE = 0,
- VM_SUBTYPE_SEV,
- VM_SUBTYPE_SEV_ES,
-};
-
/* Forced emulation prefix, used to invoke the emulator unconditionally. */
#define KVM_FEP "ud2; .byte 'k', 'v', 'm';"

diff --git a/tools/testing/selftests/kvm/include/x86_64/sev.h b/tools/testing/selftests/kvm/include/x86_64/sev.h
index 8a1bf88474c9..0719f083351a 100644
--- a/tools/testing/selftests/kvm/include/x86_64/sev.h
+++ b/tools/testing/selftests/kvm/include/x86_64/sev.h
@@ -67,20 +67,8 @@ kvm_static_assert(SEV_RET_SUCCESS == 0);
__TEST_ASSERT_VM_VCPU_IOCTL(!ret, #cmd, ret, vm); \
})

-static inline void sev_vm_init(struct kvm_vm *vm)
-{
- vm->arch.sev_fd = open_sev_dev_path_or_exit();
-
- vm_sev_ioctl(vm, KVM_SEV_INIT, NULL);
-}
-
-
-static inline void sev_es_vm_init(struct kvm_vm *vm)
-{
- vm->arch.sev_fd = open_sev_dev_path_or_exit();
-
- vm_sev_ioctl(vm, KVM_SEV_ES_INIT, NULL);
-}
+void sev_vm_init(struct kvm_vm *vm);
+void sev_es_vm_init(struct kvm_vm *vm);

static inline void sev_register_encrypted_memory(struct kvm_vm *vm,
struct userspace_mem_region *region)
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index b2262b5fad9e..9da388100f3a 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -276,7 +276,6 @@ struct kvm_vm *____vm_create(struct vm_shape shape)

vm->mode = shape.mode;
vm->type = shape.type;
- vm->subtype = shape.subtype;

vm->pa_bits = vm_guest_mode_params[vm->mode].pa_bits;
vm->va_bits = vm_guest_mode_params[vm->mode].va_bits;
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index 74a4c736c9ae..9f87ca8b7ab6 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -578,10 +578,11 @@ void kvm_arch_vm_post_create(struct kvm_vm *vm)
sync_global_to_guest(vm, host_cpu_is_intel);
sync_global_to_guest(vm, host_cpu_is_amd);

- if (vm->subtype == VM_SUBTYPE_SEV)
- sev_vm_init(vm);
- else if (vm->subtype == VM_SUBTYPE_SEV_ES)
- sev_es_vm_init(vm);
+ if (vm->type == KVM_X86_SEV_VM || vm->type == KVM_X86_SEV_ES_VM) {
+ struct kvm_sev_init init = { 0 };
+
+ vm_sev_ioctl(vm, KVM_SEV_INIT2, &init);
+ }
}

void vcpu_arch_set_entry_point(struct kvm_vcpu *vcpu, void *guest_code)
@@ -1081,9 +1082,12 @@ void kvm_get_cpu_address_width(unsigned int *pa_bits, unsigned int *va_bits)

void kvm_init_vm_address_properties(struct kvm_vm *vm)
{
- if (vm->subtype == VM_SUBTYPE_SEV || vm->subtype == VM_SUBTYPE_SEV_ES) {
+ if (vm->type == KVM_X86_SEV_VM || vm->type == KVM_X86_SEV_ES_VM) {
+ vm->arch.sev_fd = open_sev_dev_path_or_exit();
vm->arch.c_bit = BIT_ULL(this_cpu_property(X86_PROPERTY_SEV_C_BIT));
vm->gpa_tag_mask = vm->arch.c_bit;
+ } else {
+ vm->arch.sev_fd = -1;
}
}

diff --git a/tools/testing/selftests/kvm/lib/x86_64/sev.c b/tools/testing/selftests/kvm/lib/x86_64/sev.c
index e248d3364b9c..597994fa4f41 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/sev.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/sev.c
@@ -35,6 +35,32 @@ static void encrypt_region(struct kvm_vm *vm, struct userspace_mem_region *regio
}
}

+void sev_vm_init(struct kvm_vm *vm)
+{
+ if (vm->type == KVM_X86_DEFAULT_VM) {
+ assert(vm->arch.sev_fd == -1);
+ vm->arch.sev_fd = open_sev_dev_path_or_exit();
+ vm_sev_ioctl(vm, KVM_SEV_INIT, NULL);
+ } else {
+ struct kvm_sev_init init = { 0 };
+ assert(vm->type == KVM_X86_SEV_VM);
+ vm_sev_ioctl(vm, KVM_SEV_INIT2, &init);
+ }
+}
+
+void sev_es_vm_init(struct kvm_vm *vm)
+{
+ if (vm->type == KVM_X86_DEFAULT_VM) {
+ assert(vm->arch.sev_fd == -1);
+ vm->arch.sev_fd = open_sev_dev_path_or_exit();
+ vm_sev_ioctl(vm, KVM_SEV_ES_INIT, NULL);
+ } else {
+ struct kvm_sev_init init = { 0 };
+ assert(vm->type == KVM_X86_SEV_ES_VM);
+ vm_sev_ioctl(vm, KVM_SEV_INIT2, &init);
+ }
+}
+
void sev_vm_launch(struct kvm_vm *vm, uint32_t policy)
{
struct kvm_sev_launch_start launch_start = {
@@ -91,10 +117,8 @@ struct kvm_vm *vm_sev_create_with_one_vcpu(uint32_t policy, void *guest_code,
struct kvm_vcpu **cpu)
{
struct vm_shape shape = {
- .type = VM_TYPE_DEFAULT,
.mode = VM_MODE_DEFAULT,
- .subtype = policy & SEV_POLICY_ES ? VM_SUBTYPE_SEV_ES :
- VM_SUBTYPE_SEV,
+ .type = policy & SEV_POLICY_ES ? KVM_X86_SEV_ES_VM : KVM_X86_SEV_VM,
};
struct kvm_vm *vm;
struct kvm_vcpu *cpus[1];
--
2.43.0


--
2.43.0


2024-03-19 02:20:57

by Michael Roth

[permalink] [raw]
Subject: Re: [PATCH v4 00/15] KVM: SEV: allow customizing VMSA features

On Mon, Mar 18, 2024 at 07:33:37PM -0400, Paolo Bonzini wrote:
>
> The idea is that SEV SNP will only ever support KVM_SEV_INIT2. I have
> placed patches for QEMU to support this new API at branch sevinit2 of
> https://gitlab.com/bonzini/qemu.

I don't see any references to KVM_SEV_INIT2 in the sevinit2 branch. Has
everything been pushed already?

-Mike

2024-03-19 19:44:11

by Paolo Bonzini

[permalink] [raw]
Subject: [PATCH v4 18/15] selftests: kvm: add test for transferring FPU state into the VMSA

Test that CRn, XCR0 and FPU state are correctly moved from KVM's internal
state to the VMSA by SEV_LAUNCH_UPDATE_VMSA.

Signed-off-by: Paolo Bonzini <[email protected]>
---
.../selftests/kvm/x86_64/sev_smoke_test.c | 87 +++++++++++++++++++
1 file changed, 87 insertions(+)

diff --git a/tools/testing/selftests/kvm/x86_64/sev_smoke_test.c b/tools/testing/selftests/kvm/x86_64/sev_smoke_test.c
index 234c80dd344d..195150bc5013 100644
--- a/tools/testing/selftests/kvm/x86_64/sev_smoke_test.c
+++ b/tools/testing/selftests/kvm/x86_64/sev_smoke_test.c
@@ -4,6 +4,7 @@
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
+#include <math.h>

#include "test_util.h"
#include "kvm_util.h"
@@ -13,6 +14,8 @@
#include "sev.h"


+#define XFEATURE_MASK_X87_AVX (XFEATURE_MASK_FP | XFEATURE_MASK_SSE | XFEATURE_MASK_YMM)
+
static void guest_sev_es_code(void)
{
/* TODO: Check CPUID after GHCB-based hypercall support is added. */
@@ -35,6 +38,84 @@ static void guest_sev_code(void)
GUEST_DONE();
}

+/* Stash state passed via VMSA before any compiled code runs. */
+extern void guest_code_xsave(void);
+asm("guest_code_xsave:\n"
+ "mov $-1, %eax\n"
+ "mov $-1, %edx\n"
+ "xsave (%rdi)\n"
+ "jmp guest_sev_es_code");
+
+static void compare_xsave(u8 *from_host, u8 *from_guest)
+{
+ int i;
+ bool bad = false;
+ for (i = 0; i < 4095; i++) {
+ if (from_host[i] != from_guest[i]) {
+ printf("mismatch at %02hhx | %02hhx %02hhx\n", i, from_host[i], from_guest[i]);
+ bad = true;
+ }
+ }
+
+ if (bad)
+ abort();
+}
+
+static void test_sync_vmsa(uint32_t policy)
+{
+ struct kvm_vcpu *vcpu;
+ struct kvm_vm *vm;
+ vm_vaddr_t gva;
+ void *hva;
+
+ double x87val = M_PI;
+ struct kvm_xsave __attribute__((aligned(64))) xsave = { 0 };
+ struct kvm_sregs sregs;
+ struct kvm_xcrs xcrs = {
+ .nr_xcrs = 1,
+ .xcrs[0].xcr = 0,
+ .xcrs[0].value = XFEATURE_MASK_X87_AVX,
+ };
+
+ vm = vm_sev_create_with_one_vcpu(KVM_X86_SEV_ES_VM, guest_code_xsave, &vcpu);
+ gva = vm_vaddr_alloc_shared(vm, PAGE_SIZE, KVM_UTIL_MIN_VADDR,
+ MEM_REGION_TEST_DATA);
+ hva = addr_gva2hva(vm, gva);
+
+ vcpu_args_set(vcpu, 1, gva);
+
+ vcpu_sregs_get(vcpu, &sregs);
+ sregs.cr4 |= X86_CR4_OSFXSR | X86_CR4_OSXSAVE;
+ vcpu_sregs_set(vcpu, &sregs);
+
+ vcpu_xcrs_set(vcpu, &xcrs);
+ asm("fninit; fldl %3\n"
+ "vpcmpeqb %%ymm4, %%ymm4, %%ymm4\n"
+ "xsave (%2)"
+ : "=m"(xsave)
+ : "A"(XFEATURE_MASK_X87_AVX), "r"(&xsave), "m" (x87val)
+ : "ymm4", "st", "st(1)", "st(2)", "st(3)", "st(4)", "st(5)", "st(6)", "st(7)");
+ vcpu_xsave_set(vcpu, &xsave);
+
+ vm_sev_launch(vm, SEV_POLICY_ES | policy, NULL);
+
+ /* This page is shared, so make it decrypted. */
+ memset(hva, 0, 4096);
+
+ vcpu_run(vcpu);
+
+ TEST_ASSERT(vcpu->run->exit_reason == KVM_EXIT_SYSTEM_EVENT,
+ "Wanted SYSTEM_EVENT, got %s",
+ exit_reason_str(vcpu->run->exit_reason));
+ TEST_ASSERT_EQ(vcpu->run->system_event.type, KVM_SYSTEM_EVENT_SEV_TERM);
+ TEST_ASSERT_EQ(vcpu->run->system_event.ndata, 1);
+ TEST_ASSERT_EQ(vcpu->run->system_event.data[0], GHCB_MSR_TERM_REQ);
+
+ compare_xsave((u8 *)&xsave, (u8 *)hva);
+
+ kvm_vm_free(vm);
+}
+
static void test_sev(void *guest_code, uint64_t policy)
{
struct kvm_vcpu *vcpu;
@@ -87,6 +168,12 @@ int main(int argc, char *argv[])
if (kvm_cpu_has(X86_FEATURE_SEV_ES)) {
test_sev(guest_sev_es_code, SEV_POLICY_ES | SEV_POLICY_NO_DBG);
test_sev(guest_sev_es_code, SEV_POLICY_ES);
+
+ if (kvm_has_cap(KVM_CAP_XCRS) &&
+ (xgetbv(0) & XFEATURE_MASK_X87_AVX) == XFEATURE_MASK_X87_AVX) {
+ test_sync_vmsa(0);
+ test_sync_vmsa(SEV_POLICY_NO_DBG);
+ }
}

return 0;
--
2.43.0


2024-03-19 19:44:37

by Paolo Bonzini

[permalink] [raw]
Subject: [PATCH v4 17/15] selftests: kvm: split "launch" phase of SEV VM creation

Allow the caller to set the initial state of the VM. Doing this
before sev_vm_launch() matters for SEV-ES, since that is the
place where the VMSA is updated and after which the guest state
becomes sealed.

Signed-off-by: Paolo Bonzini <[email protected]>
---
tools/testing/selftests/kvm/include/x86_64/sev.h | 3 ++-
tools/testing/selftests/kvm/lib/x86_64/sev.c | 16 ++++++++++------
.../selftests/kvm/x86_64/sev_smoke_test.c | 7 ++++++-
3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/sev.h b/tools/testing/selftests/kvm/include/x86_64/sev.h
index 0719f083351a..82c11c81a956 100644
--- a/tools/testing/selftests/kvm/include/x86_64/sev.h
+++ b/tools/testing/selftests/kvm/include/x86_64/sev.h
@@ -31,8 +31,9 @@ void sev_vm_launch(struct kvm_vm *vm, uint32_t policy);
void sev_vm_launch_measure(struct kvm_vm *vm, uint8_t *measurement);
void sev_vm_launch_finish(struct kvm_vm *vm);

-struct kvm_vm *vm_sev_create_with_one_vcpu(uint32_t policy, void *guest_code,
+struct kvm_vm *vm_sev_create_with_one_vcpu(uint32_t type, void *guest_code,
struct kvm_vcpu **cpu);
+void vm_sev_launch(struct kvm_vm *vm, uint32_t policy, uint8_t *measurement);

kvm_static_assert(SEV_RET_SUCCESS == 0);

diff --git a/tools/testing/selftests/kvm/lib/x86_64/sev.c b/tools/testing/selftests/kvm/lib/x86_64/sev.c
index 597994fa4f41..d482029b6004 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/sev.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/sev.c
@@ -113,26 +113,30 @@ void sev_vm_launch_finish(struct kvm_vm *vm)
TEST_ASSERT_EQ(status.state, SEV_GUEST_STATE_RUNNING);
}

-struct kvm_vm *vm_sev_create_with_one_vcpu(uint32_t policy, void *guest_code,
+struct kvm_vm *vm_sev_create_with_one_vcpu(uint32_t type, void *guest_code,
struct kvm_vcpu **cpu)
{
struct vm_shape shape = {
.mode = VM_MODE_DEFAULT,
- .type = policy & SEV_POLICY_ES ? KVM_X86_SEV_ES_VM : KVM_X86_SEV_VM,
+ .type = type,
};
struct kvm_vm *vm;
struct kvm_vcpu *cpus[1];
- uint8_t measurement[512];

vm = __vm_create_with_vcpus(shape, 1, 0, guest_code, cpus);
*cpu = cpus[0];

+ return vm;
+}
+
+void vm_sev_launch(struct kvm_vm *vm, uint32_t policy, uint8_t *measurement)
+{
sev_vm_launch(vm, policy);

- /* TODO: Validate the measurement is as expected. */
+ if (!measurement)
+ measurement = alloca(256);
+
sev_vm_launch_measure(vm, measurement);

sev_vm_launch_finish(vm);
-
- return vm;
}
diff --git a/tools/testing/selftests/kvm/x86_64/sev_smoke_test.c b/tools/testing/selftests/kvm/x86_64/sev_smoke_test.c
index 026779f3ed06..234c80dd344d 100644
--- a/tools/testing/selftests/kvm/x86_64/sev_smoke_test.c
+++ b/tools/testing/selftests/kvm/x86_64/sev_smoke_test.c
@@ -41,7 +41,12 @@ static void test_sev(void *guest_code, uint64_t policy)
struct kvm_vm *vm;
struct ucall uc;

- vm = vm_sev_create_with_one_vcpu(policy, guest_code, &vcpu);
+ uint32_t type = policy & SEV_POLICY_ES ? KVM_X86_SEV_ES_VM : KVM_X86_SEV_VM;
+
+ vm = vm_sev_create_with_one_vcpu(type, guest_code, &vcpu);
+
+ /* TODO: Validate the measurement is as expected. */
+ vm_sev_launch(vm, policy, NULL);

for (;;) {
vcpu_run(vcpu);
--
2.43.0



2024-03-19 19:44:49

by Paolo Bonzini

[permalink] [raw]
Subject: [PATCH v4 16/15] fixup! KVM: SEV: sync FPU and AVX state at LAUNCH_UPDATE_VMSA time

A small change to add EXPORT_SYMBOL_GPL, and especially to actually match
the format in which the processor expects x87 registers in the VMSA.

Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kernel/fpu/xstate.c | 1 +
arch/x86/kvm/svm/sev.c | 12 ++++++++++--
2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 117e74c44e75..eeaf4ec9243d 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -990,6 +990,7 @@ void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr)

return __raw_xsave_addr(xsave, xfeature_nr);
}
+EXPORT_SYMBOL_GPL(get_xsave_addr);

#ifdef CONFIG_ARCH_HAS_PKEYS

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index cee6beb2bf29..66fa852b48b3 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -679,9 +679,17 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
save->x87_rip = xsave->i387.rip;

for (i = 0; i < 8; i++) {
- d = save->fpreg_x87 + i * 10;
+ /*
+ * The format of the x87 save area is totally undocumented,
+ * and definitely not what you would expect. It consists
+ * of an 8*8 bytes area with bytes 0-7 and an 8*2 bytes area
+ * with bytes 8-9 of each register.
+ */
+ d = save->fpreg_x87 + i * 8;
s = ((u8 *)xsave->i387.st_space) + i * 16;
- memcpy(d, s, 10);
+ memcpy(d, s, 8);
+ save->fpreg_x87[64 + i * 2] = s[8];
+ save->fpreg_x87[64 + i * 2 + 1] = s[9];
}
memcpy(save->fpreg_xmm, xsave->i387.xmm_space, 256);

--
2.43.0