2022-01-13 13:37:26

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 0/5] KVM: x86: Partially allow KVM_SET_CPUID{,2} after KVM_RUN for CPU hotplug

Recently, KVM made it illegal to change CPUID after KVM_RUN but
unfortunately this change is not fully compatible with existing VMMs.
In particular, QEMU reuses vCPU fds for CPU hotplug after unplug and it
calls KVM_SET_CPUID2. Relax the requirement by implementing an allowlist
of entries which are allowed to change.

Going forward, VMMs are supposed to change the behavior. There is no real
need to change CPUID information. For hotplug purposes (and if reusing vCPU
fds is still considered being worthy), VMMs can be a bit smarter and always
pick the fd with the required LAPIC/x2APIC id to eliminate the need to
change this info later.

Vitaly Kuznetsov (5):
KVM: x86: Fix indentation in kvm_set_cpuid()
KVM: x86: Do runtime CPUID update before updating
vcpu->arch.cpuid_entries
KVM: x86: Partially allow KVM_SET_CPUID{,2} after KVM_RUN
KVM: selftests: Rename 'get_cpuid_test' to 'cpuid_test'
KVM: selftests: Test KVM_SET_CPUID2 after KVM_RUN

arch/x86/kvm/cpuid.c | 120 ++++++++++++++----
arch/x86/kvm/x86.c | 19 ---
tools/testing/selftests/kvm/.gitignore | 2 +-
tools/testing/selftests/kvm/Makefile | 5 +-
.../selftests/kvm/include/x86_64/processor.h | 7 +
.../selftests/kvm/lib/x86_64/processor.c | 33 ++++-
.../x86_64/{get_cpuid_test.c => cpuid_test.c} | 78 ++++++++++++
7 files changed, 216 insertions(+), 48 deletions(-)
rename tools/testing/selftests/kvm/x86_64/{get_cpuid_test.c => cpuid_test.c} (66%)

--
2.34.1



2022-01-13 13:37:37

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 1/5] KVM: x86: Fix indentation in kvm_set_cpuid()

Indent kvm_set_cpuid() properly.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/cpuid.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 07e9215e911d..89af3c7390d3 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -276,21 +276,21 @@ u64 kvm_vcpu_reserved_gpa_bits_raw(struct kvm_vcpu *vcpu)
static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
int nent)
{
- int r;
+ int r;

- r = kvm_check_cpuid(e2, nent);
- if (r)
- return r;
+ r = kvm_check_cpuid(e2, nent);
+ if (r)
+ return r;

- kvfree(vcpu->arch.cpuid_entries);
- vcpu->arch.cpuid_entries = e2;
- vcpu->arch.cpuid_nent = nent;
+ kvfree(vcpu->arch.cpuid_entries);
+ vcpu->arch.cpuid_entries = e2;
+ vcpu->arch.cpuid_nent = nent;

- kvm_update_kvm_cpuid_base(vcpu);
- kvm_update_cpuid_runtime(vcpu);
- kvm_vcpu_after_set_cpuid(vcpu);
+ kvm_update_kvm_cpuid_base(vcpu);
+ kvm_update_cpuid_runtime(vcpu);
+ kvm_vcpu_after_set_cpuid(vcpu);

- return 0;
+ return 0;
}

/* when an old userspace process fills a new kernel module */
--
2.34.1


2022-01-13 13:37:38

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 2/5] KVM: x86: Do runtime CPUID update before updating vcpu->arch.cpuid_entries

kvm_update_cpuid_runtime() mangles CPUID data coming from userspace
VMM after updating 'vcpu->arch.cpuid_entries', this makes it
impossible to compare an update with what was previously
supplied. Introduce __kvm_update_cpuid_runtime() version which can be
used to tweak the input before it goes to 'vcpu->arch.cpuid_entries'
so the upcoming update check can compare tweaked data.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/cpuid.c | 34 ++++++++++++++++++++++++----------
1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 89af3c7390d3..c4d4c350afbe 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -125,14 +125,21 @@ static void kvm_update_kvm_cpuid_base(struct kvm_vcpu *vcpu)
}
}

-static struct kvm_cpuid_entry2 *kvm_find_kvm_cpuid_features(struct kvm_vcpu *vcpu)
+static struct kvm_cpuid_entry2 *__kvm_find_kvm_cpuid_features(struct kvm_vcpu *vcpu,
+ struct kvm_cpuid_entry2 *entries, int nent)
{
u32 base = vcpu->arch.kvm_cpuid_base;

if (!base)
return NULL;

- return kvm_find_cpuid_entry(vcpu, base | KVM_CPUID_FEATURES, 0);
+ return cpuid_entry2_find(entries, nent, base | KVM_CPUID_FEATURES, 0);
+}
+
+static struct kvm_cpuid_entry2 *kvm_find_kvm_cpuid_features(struct kvm_vcpu *vcpu)
+{
+ return __kvm_find_kvm_cpuid_features(vcpu, vcpu->arch.cpuid_entries,
+ vcpu->arch.cpuid_nent);
}

void kvm_update_pv_runtime(struct kvm_vcpu *vcpu)
@@ -147,11 +154,12 @@ void kvm_update_pv_runtime(struct kvm_vcpu *vcpu)
vcpu->arch.pv_cpuid.features = best->eax;
}

-void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
+static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
+ int nent)
{
struct kvm_cpuid_entry2 *best;

- best = kvm_find_cpuid_entry(vcpu, 1, 0);
+ best = cpuid_entry2_find(entries, nent, 1, 0);
if (best) {
/* Update OSXSAVE bit */
if (boot_cpu_has(X86_FEATURE_XSAVE))
@@ -162,33 +170,38 @@ void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
}

- best = kvm_find_cpuid_entry(vcpu, 7, 0);
+ best = cpuid_entry2_find(entries, nent, 7, 0);
if (best && boot_cpu_has(X86_FEATURE_PKU) && best->function == 0x7)
cpuid_entry_change(best, X86_FEATURE_OSPKE,
kvm_read_cr4_bits(vcpu, X86_CR4_PKE));

- best = kvm_find_cpuid_entry(vcpu, 0xD, 0);
+ best = cpuid_entry2_find(entries, nent, 0xD, 0);
if (best)
best->ebx = xstate_required_size(vcpu->arch.xcr0, false);

- best = kvm_find_cpuid_entry(vcpu, 0xD, 1);
+ best = cpuid_entry2_find(entries, nent, 0xD, 1);
if (best && (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
best->ebx = xstate_required_size(vcpu->arch.xcr0, true);

- best = kvm_find_kvm_cpuid_features(vcpu);
+ best = __kvm_find_kvm_cpuid_features(vcpu, entries, nent);
if (kvm_hlt_in_guest(vcpu->kvm) && best &&
(best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);

if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)) {
- best = kvm_find_cpuid_entry(vcpu, 0x1, 0);
+ best = cpuid_entry2_find(entries, nent, 0x1, 0);
if (best)
cpuid_entry_change(best, X86_FEATURE_MWAIT,
vcpu->arch.ia32_misc_enable_msr &
MSR_IA32_MISC_ENABLE_MWAIT);
}
}
+
+void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
+{
+ __kvm_update_cpuid_runtime(vcpu, vcpu->arch.cpuid_entries, vcpu->arch.cpuid_nent);
+}
EXPORT_SYMBOL_GPL(kvm_update_cpuid_runtime);

static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
@@ -278,6 +291,8 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
{
int r;

+ __kvm_update_cpuid_runtime(vcpu, e2, nent);
+
r = kvm_check_cpuid(e2, nent);
if (r)
return r;
@@ -287,7 +302,6 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
vcpu->arch.cpuid_nent = nent;

kvm_update_kvm_cpuid_base(vcpu);
- kvm_update_cpuid_runtime(vcpu);
kvm_vcpu_after_set_cpuid(vcpu);

return 0;
--
2.34.1


2022-01-13 13:37:40

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 3/5] KVM: x86: Partially allow KVM_SET_CPUID{,2} after KVM_RUN

Commit feb627e8d6f6 ("KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN")
forbade changing CPUID altogether but unfortunately this is not fully
compatible with existing VMMs. In particular, QEMU reuses vCPU fds for
CPU hotplug after unplug and it calls KVM_SET_CPUID2. Checking for
full equality of the data with what's in 'vcpu->arch.cpuid_entries' is
not feasible for two reasons: the data is tweaked by KVM itself
(__kvm_update_cpuid_runtime()) and QEMU may use vCPU fd for some other
virtual CPU so it may want to change LAPIC/x2APIC ids. Relax the
requirement by implementing an allowlist of entries which are allowed
to change.

Reported-by: Igor Mammedov <[email protected]>
Fixes: feb627e8d6f6 ("KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN")
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/cpuid.c | 70 +++++++++++++++++++++++++++++++++++++++++---
arch/x86/kvm/x86.c | 19 ------------
2 files changed, 66 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index c4d4c350afbe..4a8e1c8ada6c 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -80,9 +80,11 @@ static inline struct kvm_cpuid_entry2 *cpuid_entry2_find(
return NULL;
}

-static int kvm_check_cpuid(struct kvm_cpuid_entry2 *entries, int nent)
+static int kvm_check_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
+ int nent, bool is_update)
{
- struct kvm_cpuid_entry2 *best;
+ struct kvm_cpuid_entry2 *best, *e;
+ int i;

/*
* The existing code assumes virtual address is 48-bit or 57-bit in the
@@ -96,6 +98,59 @@ static int kvm_check_cpuid(struct kvm_cpuid_entry2 *entries, int nent)
return -EINVAL;
}

+ if (!is_update)
+ return 0;
+
+ /*
+ * KVM does not correctly handle changing guest CPUID after KVM_RUN, as
+ * MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
+ * tracked in kvm_mmu_page_role. As a result, KVM may miss guest page
+ * faults due to reusing SPs/SPTEs. In practice no sane VMM mucks with
+ * the core vCPU model on the fly. It would've been better to forbid any
+ * KVM_SET_CPUID{,2} calls after KVM_RUN altogether but unfortunately
+ * some VMMs (e.g. QEMU) reuse vCPU fds for CPU hotplug/unplug and they
+ * need to set CPUID to e.g. change [x2]APIC id. Implement an allowlist
+ * of CPUIDs which are allowed to change.
+ */
+ for (i = 0; i < nent; i++) {
+ e = &entries[i];
+
+ best = kvm_find_cpuid_entry(vcpu, e->function, e->index);
+ if (!best)
+ return -EINVAL;
+
+ switch (e->function) {
+ case 0x1:
+ /* Only initial LAPIC id is allowed to change */
+ if (e->eax != best->eax ||
+ (e->ebx & ~0xff000000u) != (best->ebx & ~0xff000000u) ||
+ e->ecx != best->ecx || e->edx != best->edx)
+ return -EINVAL;
+ break;
+ case 0x3:
+ /* processor serial number */
+ case 0x4:
+ /* cache parameters */
+ case 0xb:
+ case 0x1f:
+ /* x2APIC id and CPU topology */
+ case 0x80000005:
+ /* AMD l1 cache information */
+ case 0x80000006:
+ /* l2 cache information */
+ case 0x8000001d:
+ /* AMD cache topology */
+ case 0x8000001e:
+ /* AMD processor topology */
+ break;
+ default:
+ if (e->eax != best->eax || e->ebx != best->ebx ||
+ e->ecx != best->ecx || e->edx != best->edx)
+ return -EINVAL;
+ break;
+ }
+ }
+
return 0;
}

@@ -290,10 +345,11 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
int nent)
{
int r;
+ bool is_update = vcpu->arch.last_vmentry_cpu != -1;

__kvm_update_cpuid_runtime(vcpu, e2, nent);

- r = kvm_check_cpuid(e2, nent);
+ r = kvm_check_cpuid(vcpu, e2, nent, is_update);
if (r)
return r;

@@ -302,7 +358,13 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
vcpu->arch.cpuid_nent = nent;

kvm_update_kvm_cpuid_base(vcpu);
- kvm_vcpu_after_set_cpuid(vcpu);
+
+ /*
+ * KVM_SET_CPUID{,2} after KVM_RUN is not allowed to change vCPU features, see
+ * kvm_check_cpuid().
+ */
+ if (!is_update)
+ kvm_vcpu_after_set_cpuid(vcpu);

return 0;
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 829d03fcb481..935b60c37c6d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5148,17 +5148,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
struct kvm_cpuid __user *cpuid_arg = argp;
struct kvm_cpuid cpuid;

- /*
- * KVM does not correctly handle changing guest CPUID after KVM_RUN, as
- * MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
- * tracked in kvm_mmu_page_role. As a result, KVM may miss guest page
- * faults due to reusing SPs/SPTEs. In practice no sane VMM mucks with
- * the core vCPU model on the fly, so fail.
- */
- r = -EINVAL;
- if (vcpu->arch.last_vmentry_cpu != -1)
- goto out;
-
r = -EFAULT;
if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
goto out;
@@ -5169,14 +5158,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
struct kvm_cpuid2 __user *cpuid_arg = argp;
struct kvm_cpuid2 cpuid;

- /*
- * KVM_SET_CPUID{,2} after KVM_RUN is forbidded, see the comment in
- * KVM_SET_CPUID case above.
- */
- r = -EINVAL;
- if (vcpu->arch.last_vmentry_cpu != -1)
- goto out;
-
r = -EFAULT;
if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
goto out;
--
2.34.1


2022-01-13 13:37:42

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 4/5] KVM: selftests: Rename 'get_cpuid_test' to 'cpuid_test'

In preparation to reusing the existing 'get_cpuid_test' for testing
"KVM_SET_CPUID{,2} after KVM_RUN" rename it to 'cpuid_test' to avoid
the confusion.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
tools/testing/selftests/kvm/.gitignore | 2 +-
tools/testing/selftests/kvm/Makefile | 5 +++--
.../selftests/kvm/x86_64/{get_cpuid_test.c => cpuid_test.c} | 0
3 files changed, 4 insertions(+), 3 deletions(-)
rename tools/testing/selftests/kvm/x86_64/{get_cpuid_test.c => cpuid_test.c} (100%)

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index 3cb5ac5da087..289d4cc32d2f 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -7,11 +7,11 @@
/s390x/memop
/s390x/resets
/s390x/sync_regs_test
+/x86_64/cpuid_test
/x86_64/cr4_cpuid_sync_test
/x86_64/debug_regs
/x86_64/evmcs_test
/x86_64/emulator_error_test
-/x86_64/get_cpuid_test
/x86_64/get_msr_index_features
/x86_64/kvm_clock_test
/x86_64/kvm_pv_test
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 17342b575e85..0ac99fa29dce 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -38,11 +38,12 @@ LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x8
LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handlers.S lib/aarch64/spinlock.c lib/aarch64/gic.c lib/aarch64/gic_v3.c lib/aarch64/vgic.c
LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c

-TEST_GEN_PROGS_x86_64 = x86_64/cr4_cpuid_sync_test
+TEST_GEN_PROGS_x86_64 = x86_64/cpuid_test
+TEST_GEN_PROGS_x86_64 += x86_64/cr4_cpuid_sync_test
TEST_GEN_PROGS_x86_64 += x86_64/get_msr_index_features
TEST_GEN_PROGS_x86_64 += x86_64/evmcs_test
TEST_GEN_PROGS_x86_64 += x86_64/emulator_error_test
-TEST_GEN_PROGS_x86_64 += x86_64/get_cpuid_test
+TEST_GEN_PROGS_x86_64 += x86_64/cpuid_test
TEST_GEN_PROGS_x86_64 += x86_64/hyperv_clock
TEST_GEN_PROGS_x86_64 += x86_64/hyperv_cpuid
TEST_GEN_PROGS_x86_64 += x86_64/hyperv_features
diff --git a/tools/testing/selftests/kvm/x86_64/get_cpuid_test.c b/tools/testing/selftests/kvm/x86_64/cpuid_test.c
similarity index 100%
rename from tools/testing/selftests/kvm/x86_64/get_cpuid_test.c
rename to tools/testing/selftests/kvm/x86_64/cpuid_test.c
--
2.34.1


2022-01-13 13:37:46

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 5/5] KVM: selftests: Test KVM_SET_CPUID2 after KVM_RUN

KVM keeps an allowlist of things which can be changed with
KVM_SET_CPUID2 after KVM_RUN was performed on a vCPU, changing all
other entries is forbidden. Test this.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
.../selftests/kvm/include/x86_64/processor.h | 7 ++
.../selftests/kvm/lib/x86_64/processor.c | 33 +++++++-
.../testing/selftests/kvm/x86_64/cpuid_test.c | 78 +++++++++++++++++++
3 files changed, 114 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 05e65ca1c30c..5a3a4809b49a 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -358,6 +358,8 @@ uint64_t kvm_get_feature_msr(uint64_t msr_index);
struct kvm_cpuid2 *kvm_get_supported_cpuid(void);

struct kvm_cpuid2 *vcpu_get_cpuid(struct kvm_vm *vm, uint32_t vcpuid);
+int __vcpu_set_cpuid(struct kvm_vm *vm, uint32_t vcpuid,
+ struct kvm_cpuid2 *cpuid);
void vcpu_set_cpuid(struct kvm_vm *vm, uint32_t vcpuid,
struct kvm_cpuid2 *cpuid);

@@ -401,6 +403,11 @@ uint64_t vm_get_page_table_entry(struct kvm_vm *vm, int vcpuid, uint64_t vaddr);
void vm_set_page_table_entry(struct kvm_vm *vm, int vcpuid, uint64_t vaddr,
uint64_t pte);

+/*
+ * get_cpuid() - find matching CPUID entry and return pointer to it.
+ */
+struct kvm_cpuid_entry2 *get_cpuid(struct kvm_cpuid2 *cpuid, uint32_t function,
+ uint32_t index);
/*
* set_cpuid() - overwrites a matching cpuid entry with the provided value.
* matches based on ent->function && ent->index. returns true
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index eef7b34756d5..6441b03c46a9 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -847,6 +847,17 @@ kvm_get_supported_cpuid_index(uint32_t function, uint32_t index)
return entry;
}

+
+int __vcpu_set_cpuid(struct kvm_vm *vm, uint32_t vcpuid,
+ struct kvm_cpuid2 *cpuid)
+{
+ struct vcpu *vcpu = vcpu_find(vm, vcpuid);
+
+ TEST_ASSERT(vcpu != NULL, "vcpu not found, vcpuid: %u", vcpuid);
+
+ return ioctl(vcpu->fd, KVM_SET_CPUID2, cpuid);
+}
+
/*
* VM VCPU CPUID Set
*
@@ -864,12 +875,9 @@ kvm_get_supported_cpuid_index(uint32_t function, uint32_t index)
void vcpu_set_cpuid(struct kvm_vm *vm,
uint32_t vcpuid, struct kvm_cpuid2 *cpuid)
{
- struct vcpu *vcpu = vcpu_find(vm, vcpuid);
int rc;

- TEST_ASSERT(vcpu != NULL, "vcpu not found, vcpuid: %u", vcpuid);
-
- rc = ioctl(vcpu->fd, KVM_SET_CPUID2, cpuid);
+ rc = __vcpu_set_cpuid(vm, vcpuid, cpuid);
TEST_ASSERT(rc == 0, "KVM_SET_CPUID2 failed, rc: %i errno: %i",
rc, errno);

@@ -1337,6 +1345,23 @@ void assert_on_unhandled_exception(struct kvm_vm *vm, uint32_t vcpuid)
}
}

+struct kvm_cpuid_entry2 *get_cpuid(struct kvm_cpuid2 *cpuid, uint32_t function,
+ uint32_t index)
+{
+ int i;
+
+ for (i = 0; i < cpuid->nent; i++) {
+ struct kvm_cpuid_entry2 *cur = &cpuid->entries[i];
+
+ if (cur->function == function && cur->index == index)
+ return cur;
+ }
+
+ TEST_FAIL("CPUID function 0x%x index 0x%x not found ", function, index);
+
+ return NULL;
+}
+
bool set_cpuid(struct kvm_cpuid2 *cpuid,
struct kvm_cpuid_entry2 *ent)
{
diff --git a/tools/testing/selftests/kvm/x86_64/cpuid_test.c b/tools/testing/selftests/kvm/x86_64/cpuid_test.c
index a711f83749ea..8ed09772a741 100644
--- a/tools/testing/selftests/kvm/x86_64/cpuid_test.c
+++ b/tools/testing/selftests/kvm/x86_64/cpuid_test.c
@@ -154,6 +154,82 @@ struct kvm_cpuid2 *vcpu_alloc_cpuid(struct kvm_vm *vm, vm_vaddr_t *p_gva, struct
return guest_cpuids;
}

+static inline struct kvm_cpuid_entry2 *cpuid_entry2_find(
+ struct kvm_cpuid_entry2 *entries, int nent, u32 function, u32 index)
+{
+ struct kvm_cpuid_entry2 *e;
+ int i;
+
+ for (i = 0; i < nent; i++) {
+ e = &entries[i];
+
+ if (e->function == function &&
+ (!(e->flags & KVM_CPUID_FLAG_SIGNIFCANT_INDEX) || e->index == index))
+ return e;
+ }
+
+ return NULL;
+}
+
+static void set_cpuid_after_run(struct kvm_vm *vm, struct kvm_cpuid2 *cpuid)
+{
+ struct kvm_cpuid_entry2 *ent;
+ int rc;
+ u32 eax, ebx, x;
+
+ /* Setting unmodified CPUID is allowed */
+ rc = __vcpu_set_cpuid(vm, VCPU_ID, cpuid);
+ TEST_ASSERT(!rc, "Setting unmodified CPUID after KVM_RUN failed: %d", rc);
+
+ /* Changing initial APIC id is allowed */
+ ent = get_cpuid(cpuid, 0x1, 0);
+ x = ent->ebx >> 24;
+ ent->ebx = (ent->ebx & ~0xff000000u) | (x + 2) << 24;
+ rc = __vcpu_set_cpuid(vm, VCPU_ID, cpuid);
+ TEST_ASSERT(!rc, "Changing initial APIC id failed: %d", rc);
+
+ /* Changing other bits in CPUID.01H is forbidden */
+ eax = ent->eax;
+ ebx = ent->ebx;
+ ent->eax++;
+ rc = __vcpu_set_cpuid(vm, VCPU_ID, cpuid);
+ TEST_ASSERT(rc, "Changing CPUID.01H.EAX should fail");
+ ent->eax = eax;
+ ent->ebx++;
+ rc = __vcpu_set_cpuid(vm, VCPU_ID, cpuid);
+ TEST_ASSERT(rc, "Changing lower bits of CPUID.01H.EBX should fail");
+ ent->ebx = ebx;
+
+ /* Changing x2APIC id is allowed */
+ ent = get_cpuid(cpuid, 0xb, 0);
+ ent->edx += 2;
+ rc = __vcpu_set_cpuid(vm, VCPU_ID, cpuid);
+ TEST_ASSERT(!rc, "Changing x2APIC id failed: %d", rc);
+
+ /* Changing MAXPHYADDR is forbidden */
+ ent = get_cpuid(cpuid, 0x80000008, 0);
+ eax = ent->eax;
+ x = eax & 0xff;
+ ent->eax = (eax & ~0xffu) | (x - 1);
+ rc = __vcpu_set_cpuid(vm, VCPU_ID, cpuid);
+ TEST_ASSERT(rc, "Changing MAXPHYADDR should fail");
+ ent->eax = eax;
+
+ /* Changing processor serial number is allowed */
+ ent = get_cpuid(cpuid, 0x3, 0);
+ ent->ecx += 1;
+ rc = __vcpu_set_cpuid(vm, VCPU_ID, cpuid);
+ TEST_ASSERT(!rc, "Changing processor serial number failed: %d", rc);
+
+ /* Changing CPU features is forbidden */
+ ent = get_cpuid(cpuid, 0x7, 0);
+ ebx = ent->ebx;
+ ent->ebx--;
+ rc = __vcpu_set_cpuid(vm, VCPU_ID, cpuid);
+ TEST_ASSERT(rc, "Changing CPU features should fail");
+ ent->ebx = ebx;
+}
+
int main(void)
{
struct kvm_cpuid2 *supp_cpuid, *cpuid2;
@@ -175,5 +251,7 @@ int main(void)
for (stage = 0; stage < 3; stage++)
run_vcpu(vm, VCPU_ID, stage);

+ set_cpuid_after_run(vm, cpuid2);
+
kvm_vm_free(vm);
}
--
2.34.1


2022-01-13 20:00:15

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 0/5] KVM: x86: Partially allow KVM_SET_CPUID{,2} after KVM_RUN for CPU hotplug

On Thu, Jan 13, 2022, Vitaly Kuznetsov wrote:
> Recently, KVM made it illegal to change CPUID after KVM_RUN but
> unfortunately this change is not fully compatible with existing VMMs.
> In particular, QEMU reuses vCPU fds for CPU hotplug after unplug and it
> calls KVM_SET_CPUID2. Relax the requirement by implementing an allowlist
> of entries which are allowed to change.

Honestly, I'd prefer we give up and just revert feb627e8d6f6 ("KVM: x86: Forbid
KVM_SET_CPUID{,2} after KVM_RUN"). Attempting to retroactively restrict the
existing ioctls is becoming a mess, and I'm more than a bit concerned that this
will be a maintenance nightmare in the future, without all that much benefit to
anyone.

I also don't love that the set of volatile entries is nothing more than "this is
what QEMU needs today". There's no architectural justification, and the few cases
that do architecturally allow CPUID bits to change are disallowed. E.g. OSXSAVE,
MONITOR/MWAIT, CPUID.0x12.EAX.SGX1 are all _architecturally_ defined scenarios
where CPUID can change, yet none of those appear in this list. Some of those are
explicitly handled by KVM (runtime CPUID updates), but why should it be illegal
for userspace to intercept writes to MISC_ENABLE and do its own CPUID emulation?

2022-01-14 08:12:46

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH 0/5] KVM: x86: Partially allow KVM_SET_CPUID{,2} after KVM_RUN for CPU hotplug

Sean Christopherson <[email protected]> writes:

> On Thu, Jan 13, 2022, Vitaly Kuznetsov wrote:
>> Recently, KVM made it illegal to change CPUID after KVM_RUN but
>> unfortunately this change is not fully compatible with existing VMMs.
>> In particular, QEMU reuses vCPU fds for CPU hotplug after unplug and it
>> calls KVM_SET_CPUID2. Relax the requirement by implementing an allowlist
>> of entries which are allowed to change.
>
> Honestly, I'd prefer we give up and just revert feb627e8d6f6 ("KVM: x86: Forbid
> KVM_SET_CPUID{,2} after KVM_RUN"). Attempting to retroactively restrict the
> existing ioctls is becoming a mess, and I'm more than a bit concerned that this
> will be a maintenance nightmare in the future, without all that much benefit to
> anyone.

I cannot say I disagree)

>
> I also don't love that the set of volatile entries is nothing more than "this is
> what QEMU needs today". There's no architectural justification, and the few cases
> that do architecturally allow CPUID bits to change are disallowed. E.g. OSXSAVE,
> MONITOR/MWAIT, CPUID.0x12.EAX.SGX1 are all _architecturally_ defined scenarios
> where CPUID can change, yet none of those appear in this list. Some of those are
> explicitly handled by KVM (runtime CPUID updates), but why should it be illegal
> for userspace to intercept writes to MISC_ENABLE and do its own CPUID emulation?

I see. Another approach would be to switch from the current allowlist
approach to a blocklist of things which we forbid to change
("MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, ...") after the
first KVM_RUN.

--
Vitaly


2022-01-14 09:08:57

by Igor Mammedov

[permalink] [raw]
Subject: Re: [PATCH 0/5] KVM: x86: Partially allow KVM_SET_CPUID{,2} after KVM_RUN for CPU hotplug

On Thu, 13 Jan 2022 20:00:08 +0000
Sean Christopherson <[email protected]> wrote:

> On Thu, Jan 13, 2022, Vitaly Kuznetsov wrote:
> > Recently, KVM made it illegal to change CPUID after KVM_RUN but
> > unfortunately this change is not fully compatible with existing VMMs.
> > In particular, QEMU reuses vCPU fds for CPU hotplug after unplug and it
> > calls KVM_SET_CPUID2. Relax the requirement by implementing an allowlist
> > of entries which are allowed to change.
>
> Honestly, I'd prefer we give up and just revert feb627e8d6f6 ("KVM: x86: Forbid
> KVM_SET_CPUID{,2} after KVM_RUN"). Attempting to retroactively restrict the
> existing ioctls is becoming a mess, and I'm more than a bit concerned that this
> will be a maintenance nightmare in the future, without all that much benefit to
> anyone.

in 63f5a1909f9 ("KVM: x86: Alert userspace that KVM_SET_CPUID{,2} after KVM_RUN is broken")
you mention heterogeneous configuration, and that implies that
a userspace (not upstream qemu today) might attempt to change CPUID
and that would be wrong. Do we still care about that?


> I also don't love that the set of volatile entries is nothing more than "this is
> what QEMU needs today". There's no architectural justification, and the few cases
> that do architecturally allow CPUID bits to change are disallowed. E.g. OSXSAVE,
> MONITOR/MWAIT, CPUID.0x12.EAX.SGX1 are all _architecturally_ defined scenarios
> where CPUID can change, yet none of those appear in this list. Some of those are
> explicitly handled by KVM (runtime CPUID updates), but why should it be illegal
> for userspace to intercept writes to MISC_ENABLE and do its own CPUID emulation?
>


2022-01-14 22:50:04

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 0/5] KVM: x86: Partially allow KVM_SET_CPUID{,2} after KVM_RUN for CPU hotplug

On Fri, Jan 14, 2022, Igor Mammedov wrote:
> On Thu, 13 Jan 2022 20:00:08 +0000
> Sean Christopherson <[email protected]> wrote:
>
> > On Thu, Jan 13, 2022, Vitaly Kuznetsov wrote:
> > > Recently, KVM made it illegal to change CPUID after KVM_RUN but
> > > unfortunately this change is not fully compatible with existing VMMs.
> > > In particular, QEMU reuses vCPU fds for CPU hotplug after unplug and it
> > > calls KVM_SET_CPUID2. Relax the requirement by implementing an allowlist
> > > of entries which are allowed to change.
> >
> > Honestly, I'd prefer we give up and just revert feb627e8d6f6 ("KVM: x86: Forbid
> > KVM_SET_CPUID{,2} after KVM_RUN"). Attempting to retroactively restrict the
> > existing ioctls is becoming a mess, and I'm more than a bit concerned that this
> > will be a maintenance nightmare in the future, without all that much benefit to
> > anyone.
>
> in 63f5a1909f9 ("KVM: x86: Alert userspace that KVM_SET_CPUID{,2} after KVM_RUN is broken")
> you mention heterogeneous configuration, and that implies that
> a userspace (not upstream qemu today) might attempt to change CPUID
> and that would be wrong. Do we still care about that?

We still care, and I really do like the idea in theory, but if we're stuck choosing
between taking on a pile of ugly code in KVM and letting userspace shoot themselves
in the foot, I choose the latter :-)