2018-12-10 17:23:55

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v2 0/7] x86/kvm/hyper-v: Implement KVM_GET_SUPPORTED_HV_CPUID

Changes since v1:
- Add Michael's R-b tags.
- Document KVM_GET_SUPPORTED_HV_CPUID and KVM_CAP_HYPERV_CPUID.
- Fix error handling in kvm_vcpu_ioctl_get_hv_cpuid().
- Check for -E2BIG in the selftest. PATCH6 is added to support the change.

As suggested by Paolo, introduce new KVM_GET_SUPPORTED_HV_CPUID returning
all Hyper-V CPUID feature leaves, this way we won't need to add a new
KVM_CAP_HYPERV_* for every new Hyper-V enlightenment we implement in
KVM.

(Using the existing KVM_GET_SUPPORTED_CPUID doesn't seem to be possible:
Hyper-V CPUID feature leaves intersect with KVM's (e.g. 0x40000000,
0x40000001) and we would probably confuse userspace in case we decide to
return these twice).

This patch series also does some housekeeping work in hyperv-tlfs.h, adds a
simple selftest (that's how I actually checked that the newly added ioctl
works) and removes recently added KVM_CAP_HYPERV_STIMER_DIRECT before it's
too late.

Vitaly Kuznetsov (7):
x86/hyper-v: Do some housekeeping in hyperv-tlfs.h
x86/hyper-v: Drop HV_X64_CONFIGURE_PROFILER definition
x86/kvm/hyper-v: Introduce nested_get_evmcs_version() helper
x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID
x86/kvm/hyper-v: Drop KVM_CAP_HYPERV_STIMER_DIRECT
KVM: selftests: implement an unchecked version of vcpu_ioctl()
KVM: selftests: Add hyperv_cpuid test

Documentation/virtual/kvm/api.txt | 57 ++++++
arch/x86/include/asm/hyperv-tlfs.h | 185 +++++++++---------
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/hyperv.c | 125 +++++++++++-
arch/x86/kvm/hyperv.h | 2 +
arch/x86/kvm/svm.c | 7 +
arch/x86/kvm/vmx.c | 24 ++-
arch/x86/kvm/x86.c | 21 +-
include/uapi/linux/kvm.h | 5 +-
tools/testing/selftests/kvm/Makefile | 1 +
.../testing/selftests/kvm/include/kvm_util.h | 2 +
tools/testing/selftests/kvm/lib/kvm_util.c | 14 +-
.../selftests/kvm/x86_64/hyperv_cpuid.c | 157 +++++++++++++++
13 files changed, 493 insertions(+), 108 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/hyperv_cpuid.c

--
2.19.2



2018-12-10 17:24:05

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v2 3/7] x86/kvm/hyper-v: Introduce nested_get_evmcs_version() helper

The upcoming KVM_GET_SUPPORTED_HV_CPUID ioctl will need to return
Enlightened VMCS version in HYPERV_CPUID_NESTED_FEATURES.EAX when
it was enabled.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/svm.c | 7 +++++++
arch/x86/kvm/vmx.c | 24 +++++++++++++++++-------
3 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e1a40e649cdc..184669d48d80 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1185,6 +1185,7 @@ struct kvm_x86_ops {

int (*nested_enable_evmcs)(struct kvm_vcpu *vcpu,
uint16_t *vmcs_version);
+ uint16_t (*nested_get_evmcs_version)(struct kvm_vcpu *vcpu);
};

struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 0e21ccc46792..472f950ccab7 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -7037,6 +7037,12 @@ static int svm_unregister_enc_region(struct kvm *kvm,
return ret;
}

+static uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu)
+{
+ /* Not supported */
+ return 0;
+}
+
static int nested_enable_evmcs(struct kvm_vcpu *vcpu,
uint16_t *vmcs_version)
{
@@ -7175,6 +7181,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
.mem_enc_unreg_region = svm_unregister_enc_region,

.nested_enable_evmcs = nested_enable_evmcs,
+ .nested_get_evmcs_version = nested_get_evmcs_version,
};

static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 4555077d69ce..f5d865e88095 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1605,6 +1605,21 @@ static inline void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf) {}
static inline void evmcs_touch_msr_bitmap(void) {}
#endif /* IS_ENABLED(CONFIG_HYPERV) */

+static uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu)
+{
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
+ /*
+ * vmcs_version represents the range of supported Enlightened VMCS
+ * versions: lower 8 bits is the minimal version, higher 8 bits is the
+ * maximum supported version. KVM supports versions from 1 to
+ * KVM_EVMCS_VERSION.
+ */
+ if (vmx->nested.enlightened_vmcs_enabled)
+ return (KVM_EVMCS_VERSION << 8) | 1;
+
+ return 0;
+}
+
static int nested_enable_evmcs(struct kvm_vcpu *vcpu,
uint16_t *vmcs_version)
{
@@ -1616,14 +1631,8 @@ static int nested_enable_evmcs(struct kvm_vcpu *vcpu,

vmx->nested.enlightened_vmcs_enabled = true;

- /*
- * vmcs_version represents the range of supported Enlightened VMCS
- * versions: lower 8 bits is the minimal version, higher 8 bits is the
- * maximum supported version. KVM supports versions from 1 to
- * KVM_EVMCS_VERSION.
- */
if (vmcs_version)
- *vmcs_version = (KVM_EVMCS_VERSION << 8) | 1;
+ *vmcs_version = nested_get_evmcs_version(vcpu);

vmx->nested.msrs.pinbased_ctls_high &= ~EVMCS1_UNSUPPORTED_PINCTRL;
vmx->nested.msrs.entry_ctls_high &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
@@ -15107,6 +15116,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
.enable_smi_window = enable_smi_window,

.nested_enable_evmcs = nested_enable_evmcs,
+ .nested_get_evmcs_version = nested_get_evmcs_version,
};

static void vmx_cleanup_l1d_flush(void)
--
2.19.2


2018-12-10 17:24:18

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v2 6/7] KVM: selftests: implement an unchecked version of vcpu_ioctl()

In case we want to test failing ioctls we need an option to not
fail. Following _vcpu_run() precedent implement _vcpu_ioctl().

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
tools/testing/selftests/kvm/include/kvm_util.h | 2 ++
tools/testing/selftests/kvm/lib/kvm_util.c | 14 ++++++++++++--
2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index a4e59e3b4826..539028f91582 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -78,6 +78,8 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,

void vcpu_ioctl(struct kvm_vm *vm, uint32_t vcpuid, unsigned long ioctl,
void *arg);
+int _vcpu_ioctl(struct kvm_vm *vm, uint32_t vcpuid, unsigned long ioctl,
+ void *arg);
void vm_ioctl(struct kvm_vm *vm, unsigned long ioctl, void *arg);
void vm_mem_region_set_flags(struct kvm_vm *vm, uint32_t slot, uint32_t flags);
void vm_vcpu_add(struct kvm_vm *vm, uint32_t vcpuid, int pgd_memslot,
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 1b41e71283d5..9635090cd699 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1269,6 +1269,16 @@ int _vcpu_sregs_set(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_sregs *sregs)
*/
void vcpu_ioctl(struct kvm_vm *vm, uint32_t vcpuid,
unsigned long cmd, void *arg)
+{
+ int ret;
+
+ ret = _vcpu_ioctl(vm, vcpuid, cmd, arg);
+ TEST_ASSERT(ret == 0, "vcpu ioctl %lu failed, rc: %i errno: %i (%s)",
+ cmd, ret, errno, strerror(errno));
+}
+
+int _vcpu_ioctl(struct kvm_vm *vm, uint32_t vcpuid,
+ unsigned long cmd, void *arg)
{
struct vcpu *vcpu = vcpu_find(vm, vcpuid);
int ret;
@@ -1276,8 +1286,8 @@ void vcpu_ioctl(struct kvm_vm *vm, uint32_t vcpuid,
TEST_ASSERT(vcpu != NULL, "vcpu not found, vcpuid: %u", vcpuid);

ret = ioctl(vcpu->fd, cmd, arg);
- TEST_ASSERT(ret == 0, "vcpu ioctl %lu failed, rc: %i errno: %i (%s)",
- cmd, ret, errno, strerror(errno));
+
+ return ret;
}

/*
--
2.19.2


2018-12-10 17:24:23

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v2 7/7] KVM: selftests: Add hyperv_cpuid test

Add a simple (and stupid) hyperv_cpuid test: check that we got the
expected number of entries with and without Enlightened VMCS enabled
and that all currently reserved fields are zeroed.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
Changes since v1:
- Check for -E2BIG
---
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/x86_64/hyperv_cpuid.c | 157 ++++++++++++++++++
2 files changed, 158 insertions(+)
create mode 100644 tools/testing/selftests/kvm/x86_64/hyperv_cpuid.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 01a219229238..a5f0f11f4381 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -14,6 +14,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/vmx_tsc_adjust_test
TEST_GEN_PROGS_x86_64 += x86_64/cr4_cpuid_sync_test
TEST_GEN_PROGS_x86_64 += x86_64/state_test
TEST_GEN_PROGS_x86_64 += x86_64/evmcs_test
+TEST_GEN_PROGS_x86_64 += x86_64/hyperv_cpuid
TEST_GEN_PROGS_x86_64 += dirty_log_test

TEST_GEN_PROGS_aarch64 += dirty_log_test
diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_cpuid.c b/tools/testing/selftests/kvm/x86_64/hyperv_cpuid.c
new file mode 100644
index 000000000000..264425f75806
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/hyperv_cpuid.c
@@ -0,0 +1,157 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Test for x86 KVM_CAP_HYPERV_CPUID
+ *
+ * Copyright (C) 2018, Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ *
+ */
+
+#define _GNU_SOURCE /* for program_invocation_short_name */
+#include <fcntl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "processor.h"
+
+#define VCPU_ID 0
+
+static void guest_code(void)
+{
+}
+
+static void test_hv_cpuid(struct kvm_cpuid2 *hv_cpuid_entries,
+ int evmcs_enabled)
+{
+ int i;
+
+ if (!evmcs_enabled)
+ TEST_ASSERT(hv_cpuid_entries->nent == 6,
+ "KVM_GET_SUPPORTED_HV_CPUID should return 6 entries"
+ " when Enlightened VMCS is disabled (returned %d)",
+ hv_cpuid_entries->nent);
+ else
+ TEST_ASSERT(hv_cpuid_entries->nent == 7,
+ "KVM_GET_SUPPORTED_HV_CPUID should return 7 entries"
+ " when Enlightened VMCS is enabled (returned %d)",
+ hv_cpuid_entries->nent);
+
+ for (i = 0; i < hv_cpuid_entries->nent; i++) {
+ struct kvm_cpuid_entry2 *entry = &hv_cpuid_entries->entries[i];
+
+ TEST_ASSERT((entry->function >= 0x40000000) &&
+ (entry->function <= 0x4000000A),
+ "function %lx is our of supported range",
+ entry->function);
+
+ TEST_ASSERT(entry->index == 0,
+ ".index field should be zero");
+
+ TEST_ASSERT(entry->index == 0,
+ ".index field should be zero");
+
+ TEST_ASSERT(entry->flags == 0,
+ ".flags field should be zero");
+
+ TEST_ASSERT(entry->padding[0] == entry->padding[1]
+ == entry->padding[2] == 0,
+ ".index field should be zero");
+
+ /*
+ * If needed for debug:
+ * fprintf(stdout,
+ * "CPUID%lx EAX=0x%lx EBX=0x%lx ECX=0x%lx EDX=0x%lx\n",
+ * entry->function, entry->eax, entry->ebx, entry->ecx,
+ * entry->edx);
+ */
+ }
+
+}
+
+void test_hv_cpuid_e2big(struct kvm_vm *vm)
+{
+ static struct kvm_cpuid2 cpuid = {.nent = 0};
+ int ret;
+
+ ret = _vcpu_ioctl(vm, VCPU_ID, KVM_GET_SUPPORTED_HV_CPUID, &cpuid);
+
+ TEST_ASSERT(ret == -1 && errno == E2BIG,
+ "KVM_GET_SUPPORTED_HV_CPUID didn't fail with -E2BIG when"
+ " it should have: %d %d", ret, errno);
+}
+
+
+struct kvm_cpuid2 *kvm_get_supported_hv_cpuid(struct kvm_vm *vm)
+{
+ int nent = 20; /* should be enough */
+ static struct kvm_cpuid2 *cpuid;
+ int ret;
+
+ cpuid = malloc(sizeof(*cpuid) + nent * sizeof(struct kvm_cpuid_entry2));
+
+ if (!cpuid) {
+ perror("malloc");
+ abort();
+ }
+
+ cpuid->nent = nent;
+
+ vcpu_ioctl(vm, VCPU_ID, KVM_GET_SUPPORTED_HV_CPUID, cpuid);
+
+ return cpuid;
+}
+
+
+int main(int argc, char *argv[])
+{
+ struct kvm_vm *vm;
+ int rv;
+ uint16_t evmcs_ver;
+ struct kvm_cpuid2 *hv_cpuid_entries;
+ struct kvm_enable_cap enable_evmcs_cap = {
+ .cap = KVM_CAP_HYPERV_ENLIGHTENED_VMCS,
+ .args[0] = (unsigned long)&evmcs_ver
+ };
+
+ /* Tell stdout not to buffer its content */
+ setbuf(stdout, NULL);
+
+ rv = kvm_check_cap(KVM_CAP_HYPERV_CPUID);
+ if (!rv) {
+ fprintf(stderr,
+ "KVM_CAP_HYPERV_CPUID not supported, skip test\n");
+ exit(KSFT_SKIP);
+ }
+
+ /* Create VM */
+ vm = vm_create_default(VCPU_ID, 0, guest_code);
+
+ test_hv_cpuid_e2big(vm);
+
+ hv_cpuid_entries = kvm_get_supported_hv_cpuid(vm);
+ if (!hv_cpuid_entries)
+ return 1;
+
+ test_hv_cpuid(hv_cpuid_entries, 0);
+
+ free(hv_cpuid_entries);
+
+ vcpu_ioctl(vm, VCPU_ID, KVM_ENABLE_CAP, &enable_evmcs_cap);
+
+ hv_cpuid_entries = kvm_get_supported_hv_cpuid(vm);
+ if (!hv_cpuid_entries)
+ return 1;
+
+ test_hv_cpuid(hv_cpuid_entries, 1);
+
+ free(hv_cpuid_entries);
+
+ kvm_vm_free(vm);
+
+ return 0;
+}
--
2.19.2


2018-12-10 17:24:41

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v2 5/7] x86/kvm/hyper-v: Drop KVM_CAP_HYPERV_STIMER_DIRECT

The newly introduced KVM_GET_SUPPORTED_HV_CPUID covers Direct Mode stimers
feature so we can drop KVM_CAP_HYPERV_STIMER_DIRECT and reuse its number.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
- This patch only makes sense befor 4.21 merge window, disregard if it
doesn't make it in time.
---
arch/x86/kvm/x86.c | 1 -
include/uapi/linux/kvm.h | 3 +--
2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 142776435166..67ba42929c11 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2997,7 +2997,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_HYPERV_TLBFLUSH:
case KVM_CAP_HYPERV_SEND_IPI:
case KVM_CAP_HYPERV_ENLIGHTENED_VMCS:
- case KVM_CAP_HYPERV_STIMER_DIRECT:
case KVM_CAP_HYPERV_CPUID:
case KVM_CAP_PCI_SEGMENT:
case KVM_CAP_DEBUGREGS:
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 9d0038eae002..a9121c9bada5 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -975,8 +975,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_HYPERV_ENLIGHTENED_VMCS 163
#define KVM_CAP_EXCEPTION_PAYLOAD 164
#define KVM_CAP_ARM_VM_IPA_SIZE 165
-#define KVM_CAP_HYPERV_STIMER_DIRECT 166
-#define KVM_CAP_HYPERV_CPUID 167
+#define KVM_CAP_HYPERV_CPUID 166

#ifdef KVM_CAP_IRQ_ROUTING

--
2.19.2


2018-12-10 17:24:52

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v2 2/7] x86/hyper-v: Drop HV_X64_CONFIGURE_PROFILER definition

BIT(13) in HYPERV_CPUID_FEATURES.EBX is described as "ConfigureProfiler" in
TLFS v4.0 but starting 5.0 it is replaced with 'Reserved'. As we don't
currently us it in kernel it can just be dropped.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Reviewed-by: Michael Kelley <[email protected]>
---
arch/x86/include/asm/hyperv-tlfs.h | 1 -
1 file changed, 1 deletion(-)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 9e6f70e989c2..11546619a65c 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -93,7 +93,6 @@
#define HV_X64_ACCESS_STATS BIT(8)
#define HV_X64_DEBUGGING BIT(11)
#define HV_X64_CPU_POWER_MANAGEMENT BIT(12)
-#define HV_X64_CONFIGURE_PROFILER BIT(13)

/*
* Feature identification. EDX indicates which miscellaneous features
--
2.19.2


2018-12-10 17:25:07

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v2 1/7] x86/hyper-v: Do some housekeeping in hyperv-tlfs.h

hyperv-tlfs.h is a bit messy: CPUID feature bits are not always sorted,
it's hard to get which CPUID they belong to, some items are duplicated
(e.g. HV_X64_MSR_CRASH_CTL_NOTIFY/HV_CRASH_CTL_CRASH_NOTIFY).

Do some housekeeping work. While on it, replace all (1 << X) with BIT(X)
macro.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Reviewed-by: Michael Kelley <[email protected]>
---
arch/x86/include/asm/hyperv-tlfs.h | 186 ++++++++++++++---------------
arch/x86/kvm/hyperv.c | 4 +-
2 files changed, 93 insertions(+), 97 deletions(-)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index ebfed56976d2..9e6f70e989c2 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -30,158 +30,151 @@
/*
* Feature identification. EAX indicates which features are available
* to the partition based upon the current partition privileges.
+ * These are HYPERV_CPUID_FEATURES.EAX bits.
*/

/* VP Runtime (HV_X64_MSR_VP_RUNTIME) available */
-#define HV_X64_MSR_VP_RUNTIME_AVAILABLE (1 << 0)
+#define HV_X64_MSR_VP_RUNTIME_AVAILABLE BIT(0)
/* Partition Reference Counter (HV_X64_MSR_TIME_REF_COUNT) available*/
-#define HV_MSR_TIME_REF_COUNT_AVAILABLE (1 << 1)
-/* Partition reference TSC MSR is available */
-#define HV_MSR_REFERENCE_TSC_AVAILABLE (1 << 9)
-/* Partition Guest IDLE MSR is available */
-#define HV_X64_MSR_GUEST_IDLE_AVAILABLE (1 << 10)
-
-/* A partition's reference time stamp counter (TSC) page */
-#define HV_X64_MSR_REFERENCE_TSC 0x40000021
-
-/*
- * There is a single feature flag that signifies if the partition has access
- * to MSRs with local APIC and TSC frequencies.
- */
-#define HV_X64_ACCESS_FREQUENCY_MSRS (1 << 11)
-
-/* AccessReenlightenmentControls privilege */
-#define HV_X64_ACCESS_REENLIGHTENMENT BIT(13)
-
+#define HV_MSR_TIME_REF_COUNT_AVAILABLE BIT(1)
/*
* Basic SynIC MSRs (HV_X64_MSR_SCONTROL through HV_X64_MSR_EOM
* and HV_X64_MSR_SINT0 through HV_X64_MSR_SINT15) available
*/
-#define HV_X64_MSR_SYNIC_AVAILABLE (1 << 2)
+#define HV_X64_MSR_SYNIC_AVAILABLE BIT(2)
/*
* Synthetic Timer MSRs (HV_X64_MSR_STIMER0_CONFIG through
* HV_X64_MSR_STIMER3_COUNT) available
*/
-#define HV_MSR_SYNTIMER_AVAILABLE (1 << 3)
+#define HV_MSR_SYNTIMER_AVAILABLE BIT(3)
/*
* APIC access MSRs (HV_X64_MSR_EOI, HV_X64_MSR_ICR and HV_X64_MSR_TPR)
* are available
*/
-#define HV_X64_MSR_APIC_ACCESS_AVAILABLE (1 << 4)
+#define HV_X64_MSR_APIC_ACCESS_AVAILABLE BIT(4)
/* Hypercall MSRs (HV_X64_MSR_GUEST_OS_ID and HV_X64_MSR_HYPERCALL) available*/
-#define HV_X64_MSR_HYPERCALL_AVAILABLE (1 << 5)
+#define HV_X64_MSR_HYPERCALL_AVAILABLE BIT(5)
/* Access virtual processor index MSR (HV_X64_MSR_VP_INDEX) available*/
-#define HV_X64_MSR_VP_INDEX_AVAILABLE (1 << 6)
+#define HV_X64_MSR_VP_INDEX_AVAILABLE BIT(6)
/* Virtual system reset MSR (HV_X64_MSR_RESET) is available*/
-#define HV_X64_MSR_RESET_AVAILABLE (1 << 7)
- /*
- * Access statistics pages MSRs (HV_X64_MSR_STATS_PARTITION_RETAIL_PAGE,
- * HV_X64_MSR_STATS_PARTITION_INTERNAL_PAGE, HV_X64_MSR_STATS_VP_RETAIL_PAGE,
- * HV_X64_MSR_STATS_VP_INTERNAL_PAGE) available
- */
-#define HV_X64_MSR_STAT_PAGES_AVAILABLE (1 << 8)
-
-/* Frequency MSRs available */
-#define HV_FEATURE_FREQUENCY_MSRS_AVAILABLE (1 << 8)
-
-/* Crash MSR available */
-#define HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE (1 << 10)
-
-/* stimer Direct Mode is available */
-#define HV_STIMER_DIRECT_MODE_AVAILABLE (1 << 19)
+#define HV_X64_MSR_RESET_AVAILABLE BIT(7)
+/*
+ * Access statistics pages MSRs (HV_X64_MSR_STATS_PARTITION_RETAIL_PAGE,
+ * HV_X64_MSR_STATS_PARTITION_INTERNAL_PAGE, HV_X64_MSR_STATS_VP_RETAIL_PAGE,
+ * HV_X64_MSR_STATS_VP_INTERNAL_PAGE) available
+ */
+#define HV_X64_MSR_STAT_PAGES_AVAILABLE BIT(8)
+/* Partition reference TSC MSR is available */
+#define HV_MSR_REFERENCE_TSC_AVAILABLE BIT(9)
+/* Partition Guest IDLE MSR is available */
+#define HV_X64_MSR_GUEST_IDLE_AVAILABLE BIT(10)
+/*
+ * There is a single feature flag that signifies if the partition has access
+ * to MSRs with local APIC and TSC frequencies.
+ */
+#define HV_X64_ACCESS_FREQUENCY_MSRS BIT(11)
+/* AccessReenlightenmentControls privilege */
+#define HV_X64_ACCESS_REENLIGHTENMENT BIT(13)

/*
- * Feature identification: EBX indicates which flags were specified at
- * partition creation. The format is the same as the partition creation
- * flag structure defined in section Partition Creation Flags.
+ * Feature identification: indicates which flags were specified at partition
+ * creation. The format is the same as the partition creation flag structure
+ * defined in section Partition Creation Flags.
+ * These are HYPERV_CPUID_FEATURES.EBX bits.
*/
-#define HV_X64_CREATE_PARTITIONS (1 << 0)
-#define HV_X64_ACCESS_PARTITION_ID (1 << 1)
-#define HV_X64_ACCESS_MEMORY_POOL (1 << 2)
-#define HV_X64_ADJUST_MESSAGE_BUFFERS (1 << 3)
-#define HV_X64_POST_MESSAGES (1 << 4)
-#define HV_X64_SIGNAL_EVENTS (1 << 5)
-#define HV_X64_CREATE_PORT (1 << 6)
-#define HV_X64_CONNECT_PORT (1 << 7)
-#define HV_X64_ACCESS_STATS (1 << 8)
-#define HV_X64_DEBUGGING (1 << 11)
-#define HV_X64_CPU_POWER_MANAGEMENT (1 << 12)
-#define HV_X64_CONFIGURE_PROFILER (1 << 13)
+#define HV_X64_CREATE_PARTITIONS BIT(0)
+#define HV_X64_ACCESS_PARTITION_ID BIT(1)
+#define HV_X64_ACCESS_MEMORY_POOL BIT(2)
+#define HV_X64_ADJUST_MESSAGE_BUFFERS BIT(3)
+#define HV_X64_POST_MESSAGES BIT(4)
+#define HV_X64_SIGNAL_EVENTS BIT(5)
+#define HV_X64_CREATE_PORT BIT(6)
+#define HV_X64_CONNECT_PORT BIT(7)
+#define HV_X64_ACCESS_STATS BIT(8)
+#define HV_X64_DEBUGGING BIT(11)
+#define HV_X64_CPU_POWER_MANAGEMENT BIT(12)
+#define HV_X64_CONFIGURE_PROFILER BIT(13)

/*
* Feature identification. EDX indicates which miscellaneous features
* are available to the partition.
+ * These are HYPERV_CPUID_FEATURES.EDX bits.
*/
/* The MWAIT instruction is available (per section MONITOR / MWAIT) */
-#define HV_X64_MWAIT_AVAILABLE (1 << 0)
+#define HV_X64_MWAIT_AVAILABLE BIT(0)
/* Guest debugging support is available */
-#define HV_X64_GUEST_DEBUGGING_AVAILABLE (1 << 1)
+#define HV_X64_GUEST_DEBUGGING_AVAILABLE BIT(1)
/* Performance Monitor support is available*/
-#define HV_X64_PERF_MONITOR_AVAILABLE (1 << 2)
+#define HV_X64_PERF_MONITOR_AVAILABLE BIT(2)
/* Support for physical CPU dynamic partitioning events is available*/
-#define HV_X64_CPU_DYNAMIC_PARTITIONING_AVAILABLE (1 << 3)
+#define HV_X64_CPU_DYNAMIC_PARTITIONING_AVAILABLE BIT(3)
/*
* Support for passing hypercall input parameter block via XMM
* registers is available
*/
-#define HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLE (1 << 4)
+#define HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLE BIT(4)
/* Support for a virtual guest idle state is available */
-#define HV_X64_GUEST_IDLE_STATE_AVAILABLE (1 << 5)
-/* Guest crash data handler available */
-#define HV_X64_GUEST_CRASH_MSR_AVAILABLE (1 << 10)
+#define HV_X64_GUEST_IDLE_STATE_AVAILABLE BIT(5)
+/* Frequency MSRs available */
+#define HV_FEATURE_FREQUENCY_MSRS_AVAILABLE BIT(8)
+/* Crash MSR available */
+#define HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE BIT(10)
+/* stimer Direct Mode is available */
+#define HV_STIMER_DIRECT_MODE_AVAILABLE BIT(19)

/*
* Implementation recommendations. Indicates which behaviors the hypervisor
* recommends the OS implement for optimal performance.
+ * These are HYPERV_CPUID_ENLIGHTMENT_INFO.EAX bits.
+ */
+/*
+ * Recommend using hypercall for address space switches rather
+ * than MOV to CR3 instruction
*/
- /*
- * Recommend using hypercall for address space switches rather
- * than MOV to CR3 instruction
- */
-#define HV_X64_AS_SWITCH_RECOMMENDED (1 << 0)
+#define HV_X64_AS_SWITCH_RECOMMENDED BIT(0)
/* Recommend using hypercall for local TLB flushes rather
* than INVLPG or MOV to CR3 instructions */
-#define HV_X64_LOCAL_TLB_FLUSH_RECOMMENDED (1 << 1)
+#define HV_X64_LOCAL_TLB_FLUSH_RECOMMENDED BIT(1)
/*
* Recommend using hypercall for remote TLB flushes rather
* than inter-processor interrupts
*/
-#define HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED (1 << 2)
+#define HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED BIT(2)
/*
* Recommend using MSRs for accessing APIC registers
* EOI, ICR and TPR rather than their memory-mapped counterparts
*/
-#define HV_X64_APIC_ACCESS_RECOMMENDED (1 << 3)
+#define HV_X64_APIC_ACCESS_RECOMMENDED BIT(3)
/* Recommend using the hypervisor-provided MSR to initiate a system RESET */
-#define HV_X64_SYSTEM_RESET_RECOMMENDED (1 << 4)
+#define HV_X64_SYSTEM_RESET_RECOMMENDED BIT(4)
/*
* Recommend using relaxed timing for this partition. If used,
* the VM should disable any watchdog timeouts that rely on the
* timely delivery of external interrupts
*/
-#define HV_X64_RELAXED_TIMING_RECOMMENDED (1 << 5)
+#define HV_X64_RELAXED_TIMING_RECOMMENDED BIT(5)

/*
* Recommend not using Auto End-Of-Interrupt feature
*/
-#define HV_DEPRECATING_AEOI_RECOMMENDED (1 << 9)
+#define HV_DEPRECATING_AEOI_RECOMMENDED BIT(9)

/*
* Recommend using cluster IPI hypercalls.
*/
-#define HV_X64_CLUSTER_IPI_RECOMMENDED (1 << 10)
+#define HV_X64_CLUSTER_IPI_RECOMMENDED BIT(10)

/* Recommend using the newer ExProcessorMasks interface */
-#define HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED (1 << 11)
+#define HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED BIT(11)

/* Recommend using enlightened VMCS */
-#define HV_X64_ENLIGHTENED_VMCS_RECOMMENDED (1 << 14)
+#define HV_X64_ENLIGHTENED_VMCS_RECOMMENDED BIT(14)

-/*
- * Crash notification flags.
- */
-#define HV_CRASH_CTL_CRASH_NOTIFY_MSG BIT_ULL(62)
-#define HV_CRASH_CTL_CRASH_NOTIFY BIT_ULL(63)
+/* Nested features. These are HYPERV_CPUID_NESTED_FEATURES.EAX bits. */
+#define HV_X64_NESTED_GUEST_MAPPING_FLUSH BIT(18)
+#define HV_X64_NESTED_MSR_BITMAP BIT(19)
+
+/* Hyper-V specific model specific registers (MSRs) */

/* MSR used to identify the guest OS. */
#define HV_X64_MSR_GUEST_OS_ID 0x40000000
@@ -201,6 +194,9 @@
/* MSR used to read the per-partition time reference counter */
#define HV_X64_MSR_TIME_REF_COUNT 0x40000020

+/* A partition's reference time stamp counter (TSC) page */
+#define HV_X64_MSR_REFERENCE_TSC 0x40000021
+
/* MSR used to retrieve the TSC frequency */
#define HV_X64_MSR_TSC_FREQUENCY 0x40000022

@@ -258,9 +254,11 @@
#define HV_X64_MSR_CRASH_P3 0x40000103
#define HV_X64_MSR_CRASH_P4 0x40000104
#define HV_X64_MSR_CRASH_CTL 0x40000105
-#define HV_X64_MSR_CRASH_CTL_NOTIFY (1ULL << 63)
-#define HV_X64_MSR_CRASH_PARAMS \
- (1 + (HV_X64_MSR_CRASH_P4 - HV_X64_MSR_CRASH_P0))
+
+/* TSC emulation after migration */
+#define HV_X64_MSR_REENLIGHTENMENT_CONTROL 0x40000106
+#define HV_X64_MSR_TSC_EMULATION_CONTROL 0x40000107
+#define HV_X64_MSR_TSC_EMULATION_STATUS 0x40000108

/*
* Declare the MSR used to setup pages used to communicate with the hypervisor.
@@ -311,13 +309,6 @@ struct ms_hyperv_tsc_page {

#define HV_LINUX_VENDOR_ID 0x8100

-/* TSC emulation after migration */
-#define HV_X64_MSR_REENLIGHTENMENT_CONTROL 0x40000106
-
-/* Nested features (CPUID 0x4000000A) EAX */
-#define HV_X64_NESTED_GUEST_MAPPING_FLUSH BIT(18)
-#define HV_X64_NESTED_MSR_BITMAP BIT(19)
-
struct hv_reenlightenment_control {
__u64 vector:8;
__u64 reserved1:8;
@@ -326,9 +317,6 @@ struct hv_reenlightenment_control {
__u64 target_vp:32;
};

-#define HV_X64_MSR_TSC_EMULATION_CONTROL 0x40000107
-#define HV_X64_MSR_TSC_EMULATION_STATUS 0x40000108
-
struct hv_tsc_emulation_control {
__u64 enabled:1;
__u64 reserved:63;
@@ -344,6 +332,14 @@ struct hv_tsc_emulation_status {
#define HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_MASK \
(~((1ull << HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT) - 1))

+/*
+ * Crash notification (HV_X64_MSR_CRASH_CTL) flags.
+ */
+#define HV_CRASH_CTL_CRASH_NOTIFY_MSG BIT_ULL(62)
+#define HV_CRASH_CTL_CRASH_NOTIFY BIT_ULL(63)
+#define HV_X64_MSR_CRASH_PARAMS \
+ (1 + (HV_X64_MSR_CRASH_P4 - HV_X64_MSR_CRASH_P0))
+
#define HV_IPI_LOW_VECTOR 0x10
#define HV_IPI_HIGH_VECTOR 0xff

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index e6a2a085644a..cecf907e4c49 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -861,9 +861,9 @@ static int kvm_hv_msr_set_crash_ctl(struct kvm_vcpu *vcpu, u64 data, bool host)
struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;

if (host)
- hv->hv_crash_ctl = data & HV_X64_MSR_CRASH_CTL_NOTIFY;
+ hv->hv_crash_ctl = data & HV_CRASH_CTL_CRASH_NOTIFY;

- if (!host && (data & HV_X64_MSR_CRASH_CTL_NOTIFY)) {
+ if (!host && (data & HV_CRASH_CTL_CRASH_NOTIFY)) {

vcpu_debug(vcpu, "hv crash (0x%llx 0x%llx 0x%llx 0x%llx 0x%llx)\n",
hv->hv_crash_param[0],
--
2.19.2


2018-12-10 17:25:21

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v2 4/7] x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID

With every new Hyper-V Enlightenment we implement we're forced to add a
KVM_CAP_HYPERV_* capability. While this approach works it is fairly
inconvenient: the majority of the enlightenments we do have corresponding
CPUID feature bit(s) and userspace has to know this anyways to be able to
expose the feature to the guest.

Add KVM_GET_SUPPORTED_HV_CPUID ioctl (backed by KVM_CAP_HYPERV_CPUID, "one
cap to rule them all!") returning all Hyper-V CPUID feature leaves.

Using the existing KVM_GET_SUPPORTED_CPUID doesn't seem to be possible:
Hyper-V CPUID feature leaves intersect with KVM's (e.g. 0x40000000,
0x40000001) and we would probably confuse userspace in case we decide to
return these twice.

KVM_CAP_HYPERV_CPUID's number is interim: we're intended to drop
KVM_CAP_HYPERV_STIMER_DIRECT and use its number instead.

Suggested-by: Paolo Bonzini <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
Changes since v1:
- Document KVM_GET_SUPPORTED_HV_CPUID and KVM_CAP_HYPERV_CPUID.
- Fix error handling in kvm_vcpu_ioctl_get_hv_cpuid()
---
Documentation/virtual/kvm/api.txt | 57 ++++++++++++++
arch/x86/kvm/hyperv.c | 121 ++++++++++++++++++++++++++++++
arch/x86/kvm/hyperv.h | 2 +
arch/x86/kvm/x86.c | 20 +++++
include/uapi/linux/kvm.h | 4 +
5 files changed, 204 insertions(+)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index cd209f7730af..5b72de0af24d 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -3753,6 +3753,63 @@ Coalesced pio is based on coalesced mmio. There is little difference
between coalesced mmio and pio except that coalesced pio records accesses
to I/O ports.

+4.117 KVM_GET_SUPPORTED_HV_CPUID
+
+Capability: KVM_CAP_HYPERV_CPUID
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_cpuid2 (in/out)
+Returns: 0 on success, -1 on error
+
+struct kvm_cpuid2 {
+ __u32 nent;
+ __u32 padding;
+ struct kvm_cpuid_entry2 entries[0];
+};
+
+struct kvm_cpuid_entry2 {
+ __u32 function;
+ __u32 index;
+ __u32 flags;
+ __u32 eax;
+ __u32 ebx;
+ __u32 ecx;
+ __u32 edx;
+ __u32 padding[3];
+};
+
+This ioctl returns x86 cpuid features leaves related to Hyper-V emulation in
+KVM. Userspace can use the information returned by this ioctl to construct
+cpuid information presented to guests consuming Hyper-V enlightenments (e.g.
+Windows or Hyper-V guests).
+
+CPUID feature leaves returned by this ioctl are defined by Hyper-V Top Level
+Functional Specification (TLFS). These leaves can't be obtained with
+KVM_GET_SUPPORTED_CPUID ioctl because some of them intersect with KVM feature
+leaves (0x40000000, 0x40000001).
+
+Currently, the following list of CPUID leaves are returned:
+ HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS
+ HYPERV_CPUID_INTERFACE
+ HYPERV_CPUID_VERSION
+ HYPERV_CPUID_FEATURES
+ HYPERV_CPUID_ENLIGHTMENT_INFO
+ HYPERV_CPUID_IMPLEMENT_LIMITS
+ HYPERV_CPUID_NESTED_FEATURES
+
+HYPERV_CPUID_NESTED_FEATURES leaf is only exposed when Enlightened VMCS was
+enabled on the corresponding vCPU (KVM_CAP_HYPERV_ENLIGHTENED_VMCS).
+
+Userspace invokes KVM_GET_SUPPORTED_CPUID by passing a kvm_cpuid2 structure
+with the 'nent' field indicating the number of entries in the variable-size
+array 'entries'. If the number of entries is too low to describe all Hyper-V
+feature leaves, an error (E2BIG) is returned. If the number is more or equal
+to the number of Hyper-V feature leaves, the 'nent' field is adjusted to the
+number of valid entries in the 'entries' array, which is then filled.
+
+'index' and 'flags' fields in 'struct kvm_cpuid_entry2' are currently reserved,
+userspace should not expect to get any particular value there.
+
5. The kvm_run structure
------------------------

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index cecf907e4c49..04c3cdf3389e 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1804,3 +1804,124 @@ int kvm_vm_ioctl_hv_eventfd(struct kvm *kvm, struct kvm_hyperv_eventfd *args)
return kvm_hv_eventfd_deassign(kvm, args->conn_id);
return kvm_hv_eventfd_assign(kvm, args->conn_id, args->fd);
}
+
+int kvm_vcpu_ioctl_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
+ struct kvm_cpuid_entry2 __user *entries)
+{
+ uint16_t evmcs_ver = kvm_x86_ops->nested_get_evmcs_version(vcpu);
+ struct kvm_cpuid_entry2 cpuid_entries[] = {
+ { .function = HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS },
+ { .function = HYPERV_CPUID_INTERFACE },
+ { .function = HYPERV_CPUID_VERSION },
+ { .function = HYPERV_CPUID_FEATURES },
+ { .function = HYPERV_CPUID_ENLIGHTMENT_INFO },
+ { .function = HYPERV_CPUID_IMPLEMENT_LIMITS },
+ { .function = HYPERV_CPUID_NESTED_FEATURES },
+ };
+ int i, nent = ARRAY_SIZE(cpuid_entries);
+
+ /* Skip NESTED_FEATURES if eVMCS is not supported */
+ if (!evmcs_ver)
+ --nent;
+
+ if (cpuid->nent < nent)
+ return -E2BIG;
+
+ if (cpuid->nent > nent)
+ cpuid->nent = nent;
+
+ for (i = 0; i < nent; i++) {
+ struct kvm_cpuid_entry2 *ent = &cpuid_entries[i];
+ u32 signature[3];
+
+ switch (ent->function) {
+ case HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS:
+ memcpy(signature, "Linux KVM Hv", 12);
+
+ ent->eax = HYPERV_CPUID_NESTED_FEATURES;
+ ent->ebx = signature[0];
+ ent->ecx = signature[1];
+ ent->edx = signature[2];
+ break;
+
+ case HYPERV_CPUID_INTERFACE:
+ memcpy(signature, "Hv#1\0\0\0\0\0\0\0\0", 12);
+ ent->eax = signature[0];
+ break;
+
+ case HYPERV_CPUID_VERSION:
+ /*
+ * We implement some Hyper-V 2016 functions so let's use
+ * this version.
+ */
+ ent->eax = 0x00003839;
+ ent->ebx = 0x000A0000;
+ break;
+
+ case HYPERV_CPUID_FEATURES:
+ ent->eax |= HV_X64_MSR_VP_RUNTIME_AVAILABLE;
+ ent->eax |= HV_MSR_TIME_REF_COUNT_AVAILABLE;
+ ent->eax |= HV_X64_MSR_SYNIC_AVAILABLE;
+ ent->eax |= HV_MSR_SYNTIMER_AVAILABLE;
+ ent->eax |= HV_X64_MSR_APIC_ACCESS_AVAILABLE;
+ ent->eax |= HV_X64_MSR_HYPERCALL_AVAILABLE;
+ ent->eax |= HV_X64_MSR_VP_INDEX_AVAILABLE;
+ ent->eax |= HV_X64_MSR_RESET_AVAILABLE;
+ ent->eax |= HV_MSR_REFERENCE_TSC_AVAILABLE;
+ ent->eax |= HV_X64_MSR_GUEST_IDLE_AVAILABLE;
+ ent->eax |= HV_X64_ACCESS_FREQUENCY_MSRS;
+ ent->eax |= HV_X64_ACCESS_REENLIGHTENMENT;
+
+ ent->ebx |= HV_X64_POST_MESSAGES;
+ ent->ebx |= HV_X64_SIGNAL_EVENTS;
+
+ ent->edx |= HV_FEATURE_FREQUENCY_MSRS_AVAILABLE;
+ ent->edx |= HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE;
+ ent->edx |= HV_STIMER_DIRECT_MODE_AVAILABLE;
+
+ break;
+
+ case HYPERV_CPUID_ENLIGHTMENT_INFO:
+ ent->eax |= HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED;
+ ent->eax |= HV_X64_APIC_ACCESS_RECOMMENDED;
+ ent->eax |= HV_X64_SYSTEM_RESET_RECOMMENDED;
+ ent->eax |= HV_X64_RELAXED_TIMING_RECOMMENDED;
+ ent->eax |= HV_X64_CLUSTER_IPI_RECOMMENDED;
+ ent->eax |= HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED;
+ ent->eax |= HV_X64_ENLIGHTENED_VMCS_RECOMMENDED;
+
+ /*
+ * Default number of spinlock retry attempts, matches
+ * HyperV 2016.
+ */
+ ent->ebx = 0x00000FFF;
+
+ break;
+
+ case HYPERV_CPUID_IMPLEMENT_LIMITS:
+ /* Maximum number of virtual processors */
+ ent->eax = KVM_MAX_VCPUS;
+ /*
+ * Maximum number of logical processors, matches
+ * HyperV 2016.
+ */
+ ent->ebx = 64;
+
+ break;
+
+ case HYPERV_CPUID_NESTED_FEATURES:
+ ent->eax = evmcs_ver;
+
+ break;
+
+ default:
+ break;
+ }
+ }
+
+ if (copy_to_user(entries, cpuid_entries,
+ nent * sizeof(struct kvm_cpuid_entry2)))
+ return -EFAULT;
+
+ return 0;
+}
diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index 0e66c12ed2c3..25428f5966e6 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -95,5 +95,7 @@ void kvm_hv_setup_tsc_page(struct kvm *kvm,
void kvm_hv_init_vm(struct kvm *kvm);
void kvm_hv_destroy_vm(struct kvm *kvm);
int kvm_vm_ioctl_hv_eventfd(struct kvm *kvm, struct kvm_hyperv_eventfd *args);
+int kvm_vcpu_ioctl_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
+ struct kvm_cpuid_entry2 __user *entries);

#endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b21b5ceb8d26..142776435166 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2998,6 +2998,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_HYPERV_SEND_IPI:
case KVM_CAP_HYPERV_ENLIGHTENED_VMCS:
case KVM_CAP_HYPERV_STIMER_DIRECT:
+ case KVM_CAP_HYPERV_CPUID:
case KVM_CAP_PCI_SEGMENT:
case KVM_CAP_DEBUGREGS:
case KVM_CAP_X86_ROBUST_SINGLESTEP:
@@ -4191,6 +4192,25 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
r = kvm_x86_ops->set_nested_state(vcpu, user_kvm_nested_state, &kvm_state);
break;
}
+ case KVM_GET_SUPPORTED_HV_CPUID: {
+ struct kvm_cpuid2 __user *cpuid_arg = argp;
+ struct kvm_cpuid2 cpuid;
+
+ r = -EFAULT;
+ if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
+ goto out;
+
+ r = kvm_vcpu_ioctl_get_hv_cpuid(vcpu, &cpuid,
+ cpuid_arg->entries);
+ if (r)
+ goto out;
+
+ r = -EFAULT;
+ if (copy_to_user(cpuid_arg, &cpuid, sizeof(cpuid)))
+ goto out;
+ r = 0;
+ break;
+ }
default:
r = -EINVAL;
}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index b8da14cee8e5..9d0038eae002 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -976,6 +976,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_EXCEPTION_PAYLOAD 164
#define KVM_CAP_ARM_VM_IPA_SIZE 165
#define KVM_CAP_HYPERV_STIMER_DIRECT 166
+#define KVM_CAP_HYPERV_CPUID 167

#ifdef KVM_CAP_IRQ_ROUTING

@@ -1422,6 +1423,9 @@ struct kvm_enc_region {
#define KVM_GET_NESTED_STATE _IOWR(KVMIO, 0xbe, struct kvm_nested_state)
#define KVM_SET_NESTED_STATE _IOW(KVMIO, 0xbf, struct kvm_nested_state)

+/* Available with KVM_CAP_HYPERV_CPUID */
+#define KVM_GET_SUPPORTED_HV_CPUID _IOWR(KVMIO, 0xc0, struct kvm_cpuid2)
+
/* Secure Encrypted Virtualization command */
enum sev_cmd_id {
/* Guest initialization commands */
--
2.19.2


2018-12-10 22:01:08

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2 1/7] x86/hyper-v: Do some housekeeping in hyperv-tlfs.h

On Mon, 10 Dec 2018, Vitaly Kuznetsov wrote:

> hyperv-tlfs.h is a bit messy: CPUID feature bits are not always sorted,
> it's hard to get which CPUID they belong to, some items are duplicated
> (e.g. HV_X64_MSR_CRASH_CTL_NOTIFY/HV_CRASH_CTL_CRASH_NOTIFY).
>
> Do some housekeeping work. While on it, replace all (1 << X) with BIT(X)
> macro.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> Reviewed-by: Michael Kelley <[email protected]>

Acked-by: Thomas Gleixner <[email protected]>

2018-12-11 00:04:21

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2 2/7] x86/hyper-v: Drop HV_X64_CONFIGURE_PROFILER definition

On Mon, 10 Dec 2018, Vitaly Kuznetsov wrote:

> BIT(13) in HYPERV_CPUID_FEATURES.EBX is described as "ConfigureProfiler" in
> TLFS v4.0 but starting 5.0 it is replaced with 'Reserved'. As we don't
> currently us it in kernel it can just be dropped.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> Reviewed-by: Michael Kelley <[email protected]>

Acked-by: Thomas Gleixner <[email protected]>

2018-12-11 12:51:53

by Roman Kagan

[permalink] [raw]
Subject: Re: [PATCH v2 4/7] x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID

On Mon, Dec 10, 2018 at 06:21:56PM +0100, Vitaly Kuznetsov wrote:
> With every new Hyper-V Enlightenment we implement we're forced to add a
> KVM_CAP_HYPERV_* capability. While this approach works it is fairly
> inconvenient: the majority of the enlightenments we do have corresponding
> CPUID feature bit(s) and userspace has to know this anyways to be able to
> expose the feature to the guest.
>
> Add KVM_GET_SUPPORTED_HV_CPUID ioctl (backed by KVM_CAP_HYPERV_CPUID, "one
> cap to rule them all!") returning all Hyper-V CPUID feature leaves.
>
> Using the existing KVM_GET_SUPPORTED_CPUID doesn't seem to be possible:
> Hyper-V CPUID feature leaves intersect with KVM's (e.g. 0x40000000,
> 0x40000001) and we would probably confuse userspace in case we decide to
> return these twice.
>
> KVM_CAP_HYPERV_CPUID's number is interim: we're intended to drop
> KVM_CAP_HYPERV_STIMER_DIRECT and use its number instead.
>
> Suggested-by: Paolo Bonzini <[email protected]>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> Changes since v1:
> - Document KVM_GET_SUPPORTED_HV_CPUID and KVM_CAP_HYPERV_CPUID.
> - Fix error handling in kvm_vcpu_ioctl_get_hv_cpuid()
> ---
> Documentation/virtual/kvm/api.txt | 57 ++++++++++++++
> arch/x86/kvm/hyperv.c | 121 ++++++++++++++++++++++++++++++
> arch/x86/kvm/hyperv.h | 2 +
> arch/x86/kvm/x86.c | 20 +++++
> include/uapi/linux/kvm.h | 4 +
> 5 files changed, 204 insertions(+)
>
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index cd209f7730af..5b72de0af24d 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -3753,6 +3753,63 @@ Coalesced pio is based on coalesced mmio. There is little difference
> between coalesced mmio and pio except that coalesced pio records accesses
> to I/O ports.
>
> +4.117 KVM_GET_SUPPORTED_HV_CPUID
> +
> +Capability: KVM_CAP_HYPERV_CPUID
> +Architectures: x86
> +Type: vcpu ioctl
> +Parameters: struct kvm_cpuid2 (in/out)
> +Returns: 0 on success, -1 on error
> +
> +struct kvm_cpuid2 {
> + __u32 nent;
> + __u32 padding;
> + struct kvm_cpuid_entry2 entries[0];
> +};
> +
> +struct kvm_cpuid_entry2 {
> + __u32 function;
> + __u32 index;
> + __u32 flags;
> + __u32 eax;
> + __u32 ebx;
> + __u32 ecx;
> + __u32 edx;
> + __u32 padding[3];
> +};
> +
> +This ioctl returns x86 cpuid features leaves related to Hyper-V emulation in
> +KVM. Userspace can use the information returned by this ioctl to construct
> +cpuid information presented to guests consuming Hyper-V enlightenments (e.g.
> +Windows or Hyper-V guests).
> +
> +CPUID feature leaves returned by this ioctl are defined by Hyper-V Top Level
> +Functional Specification (TLFS). These leaves can't be obtained with
> +KVM_GET_SUPPORTED_CPUID ioctl because some of them intersect with KVM feature
> +leaves (0x40000000, 0x40000001).
> +
> +Currently, the following list of CPUID leaves are returned:
> + HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS
> + HYPERV_CPUID_INTERFACE
> + HYPERV_CPUID_VERSION
> + HYPERV_CPUID_FEATURES
> + HYPERV_CPUID_ENLIGHTMENT_INFO
> + HYPERV_CPUID_IMPLEMENT_LIMITS
> + HYPERV_CPUID_NESTED_FEATURES
> +
> +HYPERV_CPUID_NESTED_FEATURES leaf is only exposed when Enlightened VMCS was
> +enabled on the corresponding vCPU (KVM_CAP_HYPERV_ENLIGHTENED_VMCS).

IOW the output of ioctl(KVM_GET_SUPPORTED_HV_CPUID) depends on
whether ioctl(KVM_ENABLE_CAP, KVM_CAP_HYPERV_ENLIGHTENED_VMCS) has
already been called on that vcpu? I wonder if this fits the intended
usage?

Roman.

> +
> +Userspace invokes KVM_GET_SUPPORTED_CPUID by passing a kvm_cpuid2 structure
> +with the 'nent' field indicating the number of entries in the variable-size
> +array 'entries'. If the number of entries is too low to describe all Hyper-V
> +feature leaves, an error (E2BIG) is returned. If the number is more or equal
> +to the number of Hyper-V feature leaves, the 'nent' field is adjusted to the
> +number of valid entries in the 'entries' array, which is then filled.
> +
> +'index' and 'flags' fields in 'struct kvm_cpuid_entry2' are currently reserved,
> +userspace should not expect to get any particular value there.
> +
> 5. The kvm_run structure
> ------------------------
>
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index cecf907e4c49..04c3cdf3389e 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -1804,3 +1804,124 @@ int kvm_vm_ioctl_hv_eventfd(struct kvm *kvm, struct kvm_hyperv_eventfd *args)
> return kvm_hv_eventfd_deassign(kvm, args->conn_id);
> return kvm_hv_eventfd_assign(kvm, args->conn_id, args->fd);
> }
> +
> +int kvm_vcpu_ioctl_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
> + struct kvm_cpuid_entry2 __user *entries)
> +{
> + uint16_t evmcs_ver = kvm_x86_ops->nested_get_evmcs_version(vcpu);
> + struct kvm_cpuid_entry2 cpuid_entries[] = {
> + { .function = HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS },
> + { .function = HYPERV_CPUID_INTERFACE },
> + { .function = HYPERV_CPUID_VERSION },
> + { .function = HYPERV_CPUID_FEATURES },
> + { .function = HYPERV_CPUID_ENLIGHTMENT_INFO },
> + { .function = HYPERV_CPUID_IMPLEMENT_LIMITS },
> + { .function = HYPERV_CPUID_NESTED_FEATURES },
> + };
> + int i, nent = ARRAY_SIZE(cpuid_entries);
> +
> + /* Skip NESTED_FEATURES if eVMCS is not supported */
> + if (!evmcs_ver)
> + --nent;
> +
> + if (cpuid->nent < nent)
> + return -E2BIG;
> +
> + if (cpuid->nent > nent)
> + cpuid->nent = nent;
> +
> + for (i = 0; i < nent; i++) {
> + struct kvm_cpuid_entry2 *ent = &cpuid_entries[i];
> + u32 signature[3];
> +
> + switch (ent->function) {
> + case HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS:
> + memcpy(signature, "Linux KVM Hv", 12);
> +
> + ent->eax = HYPERV_CPUID_NESTED_FEATURES;
> + ent->ebx = signature[0];
> + ent->ecx = signature[1];
> + ent->edx = signature[2];
> + break;
> +
> + case HYPERV_CPUID_INTERFACE:
> + memcpy(signature, "Hv#1\0\0\0\0\0\0\0\0", 12);
> + ent->eax = signature[0];
> + break;
> +
> + case HYPERV_CPUID_VERSION:
> + /*
> + * We implement some Hyper-V 2016 functions so let's use
> + * this version.
> + */
> + ent->eax = 0x00003839;
> + ent->ebx = 0x000A0000;
> + break;
> +
> + case HYPERV_CPUID_FEATURES:
> + ent->eax |= HV_X64_MSR_VP_RUNTIME_AVAILABLE;
> + ent->eax |= HV_MSR_TIME_REF_COUNT_AVAILABLE;
> + ent->eax |= HV_X64_MSR_SYNIC_AVAILABLE;
> + ent->eax |= HV_MSR_SYNTIMER_AVAILABLE;
> + ent->eax |= HV_X64_MSR_APIC_ACCESS_AVAILABLE;
> + ent->eax |= HV_X64_MSR_HYPERCALL_AVAILABLE;
> + ent->eax |= HV_X64_MSR_VP_INDEX_AVAILABLE;
> + ent->eax |= HV_X64_MSR_RESET_AVAILABLE;
> + ent->eax |= HV_MSR_REFERENCE_TSC_AVAILABLE;
> + ent->eax |= HV_X64_MSR_GUEST_IDLE_AVAILABLE;
> + ent->eax |= HV_X64_ACCESS_FREQUENCY_MSRS;
> + ent->eax |= HV_X64_ACCESS_REENLIGHTENMENT;
> +
> + ent->ebx |= HV_X64_POST_MESSAGES;
> + ent->ebx |= HV_X64_SIGNAL_EVENTS;
> +
> + ent->edx |= HV_FEATURE_FREQUENCY_MSRS_AVAILABLE;
> + ent->edx |= HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE;
> + ent->edx |= HV_STIMER_DIRECT_MODE_AVAILABLE;
> +
> + break;
> +
> + case HYPERV_CPUID_ENLIGHTMENT_INFO:
> + ent->eax |= HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED;
> + ent->eax |= HV_X64_APIC_ACCESS_RECOMMENDED;
> + ent->eax |= HV_X64_SYSTEM_RESET_RECOMMENDED;
> + ent->eax |= HV_X64_RELAXED_TIMING_RECOMMENDED;
> + ent->eax |= HV_X64_CLUSTER_IPI_RECOMMENDED;
> + ent->eax |= HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED;
> + ent->eax |= HV_X64_ENLIGHTENED_VMCS_RECOMMENDED;
> +
> + /*
> + * Default number of spinlock retry attempts, matches
> + * HyperV 2016.
> + */
> + ent->ebx = 0x00000FFF;
> +
> + break;
> +
> + case HYPERV_CPUID_IMPLEMENT_LIMITS:
> + /* Maximum number of virtual processors */
> + ent->eax = KVM_MAX_VCPUS;
> + /*
> + * Maximum number of logical processors, matches
> + * HyperV 2016.
> + */
> + ent->ebx = 64;
> +
> + break;
> +
> + case HYPERV_CPUID_NESTED_FEATURES:
> + ent->eax = evmcs_ver;
> +
> + break;
> +
> + default:
> + break;
> + }
> + }
> +
> + if (copy_to_user(entries, cpuid_entries,
> + nent * sizeof(struct kvm_cpuid_entry2)))
> + return -EFAULT;
> +
> + return 0;
> +}
> diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
> index 0e66c12ed2c3..25428f5966e6 100644
> --- a/arch/x86/kvm/hyperv.h
> +++ b/arch/x86/kvm/hyperv.h
> @@ -95,5 +95,7 @@ void kvm_hv_setup_tsc_page(struct kvm *kvm,
> void kvm_hv_init_vm(struct kvm *kvm);
> void kvm_hv_destroy_vm(struct kvm *kvm);
> int kvm_vm_ioctl_hv_eventfd(struct kvm *kvm, struct kvm_hyperv_eventfd *args);
> +int kvm_vcpu_ioctl_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
> + struct kvm_cpuid_entry2 __user *entries);
>
> #endif
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index b21b5ceb8d26..142776435166 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2998,6 +2998,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> case KVM_CAP_HYPERV_SEND_IPI:
> case KVM_CAP_HYPERV_ENLIGHTENED_VMCS:
> case KVM_CAP_HYPERV_STIMER_DIRECT:
> + case KVM_CAP_HYPERV_CPUID:
> case KVM_CAP_PCI_SEGMENT:
> case KVM_CAP_DEBUGREGS:
> case KVM_CAP_X86_ROBUST_SINGLESTEP:
> @@ -4191,6 +4192,25 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
> r = kvm_x86_ops->set_nested_state(vcpu, user_kvm_nested_state, &kvm_state);
> break;
> }
> + case KVM_GET_SUPPORTED_HV_CPUID: {
> + struct kvm_cpuid2 __user *cpuid_arg = argp;
> + struct kvm_cpuid2 cpuid;
> +
> + r = -EFAULT;
> + if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
> + goto out;
> +
> + r = kvm_vcpu_ioctl_get_hv_cpuid(vcpu, &cpuid,
> + cpuid_arg->entries);
> + if (r)
> + goto out;
> +
> + r = -EFAULT;
> + if (copy_to_user(cpuid_arg, &cpuid, sizeof(cpuid)))
> + goto out;
> + r = 0;
> + break;
> + }
> default:
> r = -EINVAL;
> }
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index b8da14cee8e5..9d0038eae002 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -976,6 +976,7 @@ struct kvm_ppc_resize_hpt {
> #define KVM_CAP_EXCEPTION_PAYLOAD 164
> #define KVM_CAP_ARM_VM_IPA_SIZE 165
> #define KVM_CAP_HYPERV_STIMER_DIRECT 166
> +#define KVM_CAP_HYPERV_CPUID 167
>
> #ifdef KVM_CAP_IRQ_ROUTING
>
> @@ -1422,6 +1423,9 @@ struct kvm_enc_region {
> #define KVM_GET_NESTED_STATE _IOWR(KVMIO, 0xbe, struct kvm_nested_state)
> #define KVM_SET_NESTED_STATE _IOW(KVMIO, 0xbf, struct kvm_nested_state)
>
> +/* Available with KVM_CAP_HYPERV_CPUID */
> +#define KVM_GET_SUPPORTED_HV_CPUID _IOWR(KVMIO, 0xc0, struct kvm_cpuid2)
> +
> /* Secure Encrypted Virtualization command */
> enum sev_cmd_id {
> /* Guest initialization commands */
> --
> 2.19.2
>

2018-12-11 13:30:45

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH v2 4/7] x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID

Roman Kagan <[email protected]> writes:

> On Mon, Dec 10, 2018 at 06:21:56PM +0100, Vitaly Kuznetsov wrote:

>> +
>> +Currently, the following list of CPUID leaves are returned:
>> + HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS
>> + HYPERV_CPUID_INTERFACE
>> + HYPERV_CPUID_VERSION
>> + HYPERV_CPUID_FEATURES
>> + HYPERV_CPUID_ENLIGHTMENT_INFO
>> + HYPERV_CPUID_IMPLEMENT_LIMITS
>> + HYPERV_CPUID_NESTED_FEATURES
>> +
>> +HYPERV_CPUID_NESTED_FEATURES leaf is only exposed when Enlightened VMCS was
>> +enabled on the corresponding vCPU (KVM_CAP_HYPERV_ENLIGHTENED_VMCS).
>
> IOW the output of ioctl(KVM_GET_SUPPORTED_HV_CPUID) depends on
> whether ioctl(KVM_ENABLE_CAP, KVM_CAP_HYPERV_ENLIGHTENED_VMCS) has
> already been called on that vcpu? I wonder if this fits the intended
> usage?

I added HYPERV_CPUID_NESTED_FEATURES in the list (and made the new ioctl
per-cpu and not per-vm) for consistency. *In theory*
KVM_CAP_HYPERV_ENLIGHTENED_VMCS is also enabled per-vcpu so some
hypothetical userspace can later check enabled eVMCS versions (which can
differ across vCPUs!) with KVM_GET_SUPPORTED_HV_CPUID. We will also have
direct tlb flush and other nested features there so to avoid addning new
KVM_CAP_* for them we need the CPUID.

Another thing I'm thinking about is something like 'hv_all' cpu flag for
Qemu which would enable everything by setting guest CPUIDs to what
KVM_GET_SUPPORTED_HV_CPUID returns. In that case it would also be
convenient to have HYPERV_CPUID_NESTED_FEATURES properly filled (or not
filled when eVMCS was not enabled).

--
Vitaly

2018-12-11 14:30:18

by Roman Kagan

[permalink] [raw]
Subject: Re: [PATCH v2 4/7] x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID

On Tue, Dec 11, 2018 at 02:28:14PM +0100, Vitaly Kuznetsov wrote:
> Roman Kagan <[email protected]> writes:
>
> > On Mon, Dec 10, 2018 at 06:21:56PM +0100, Vitaly Kuznetsov wrote:
>
> >> +
> >> +Currently, the following list of CPUID leaves are returned:
> >> + HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS
> >> + HYPERV_CPUID_INTERFACE
> >> + HYPERV_CPUID_VERSION
> >> + HYPERV_CPUID_FEATURES
> >> + HYPERV_CPUID_ENLIGHTMENT_INFO
> >> + HYPERV_CPUID_IMPLEMENT_LIMITS
> >> + HYPERV_CPUID_NESTED_FEATURES
> >> +
> >> +HYPERV_CPUID_NESTED_FEATURES leaf is only exposed when Enlightened VMCS was
> >> +enabled on the corresponding vCPU (KVM_CAP_HYPERV_ENLIGHTENED_VMCS).
> >
> > IOW the output of ioctl(KVM_GET_SUPPORTED_HV_CPUID) depends on
> > whether ioctl(KVM_ENABLE_CAP, KVM_CAP_HYPERV_ENLIGHTENED_VMCS) has
> > already been called on that vcpu? I wonder if this fits the intended
> > usage?
>
> I added HYPERV_CPUID_NESTED_FEATURES in the list (and made the new ioctl
> per-cpu and not per-vm) for consistency. *In theory*
> KVM_CAP_HYPERV_ENLIGHTENED_VMCS is also enabled per-vcpu so some
> hypothetical userspace can later check enabled eVMCS versions (which can
> differ across vCPUs!) with KVM_GET_SUPPORTED_HV_CPUID. We will also have
> direct tlb flush and other nested features there so to avoid addning new
> KVM_CAP_* for them we need the CPUID.

This is different from how KVM_GET_SUPPORTED_CPUID is used: QEMU assumes
that its output doesn't change between calls, and even caches the result
calling the ioctl only once.

> Another thing I'm thinking about is something like 'hv_all' cpu flag for
> Qemu which would enable everything by setting guest CPUIDs to what
> KVM_GET_SUPPORTED_HV_CPUID returns. In that case it would also be
> convenient to have HYPERV_CPUID_NESTED_FEATURES properly filled (or not
> filled when eVMCS was not enabled).

I think this is orthogonal to the way you obtain capability info from
the kernel.

Roman.

2018-12-11 15:11:46

by Roman Kagan

[permalink] [raw]
Subject: Re: [PATCH v2 4/7] x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID

On Tue, Dec 11, 2018 at 04:04:01PM +0100, Vitaly Kuznetsov wrote:
> Roman Kagan <[email protected]> writes:
>
> > On Tue, Dec 11, 2018 at 02:28:14PM +0100, Vitaly Kuznetsov wrote:
> >> Roman Kagan <[email protected]> writes:
> >>
> >> > On Mon, Dec 10, 2018 at 06:21:56PM +0100, Vitaly Kuznetsov wrote:
> >>
> >> >> +
> >> >> +Currently, the following list of CPUID leaves are returned:
> >> >> + HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS
> >> >> + HYPERV_CPUID_INTERFACE
> >> >> + HYPERV_CPUID_VERSION
> >> >> + HYPERV_CPUID_FEATURES
> >> >> + HYPERV_CPUID_ENLIGHTMENT_INFO
> >> >> + HYPERV_CPUID_IMPLEMENT_LIMITS
> >> >> + HYPERV_CPUID_NESTED_FEATURES
> >> >> +
> >> >> +HYPERV_CPUID_NESTED_FEATURES leaf is only exposed when Enlightened VMCS was
> >> >> +enabled on the corresponding vCPU (KVM_CAP_HYPERV_ENLIGHTENED_VMCS).
> >> >
> >> > IOW the output of ioctl(KVM_GET_SUPPORTED_HV_CPUID) depends on
> >> > whether ioctl(KVM_ENABLE_CAP, KVM_CAP_HYPERV_ENLIGHTENED_VMCS) has
> >> > already been called on that vcpu? I wonder if this fits the intended
> >> > usage?
> >>
> >> I added HYPERV_CPUID_NESTED_FEATURES in the list (and made the new ioctl
> >> per-cpu and not per-vm) for consistency. *In theory*
> >> KVM_CAP_HYPERV_ENLIGHTENED_VMCS is also enabled per-vcpu so some
> >> hypothetical userspace can later check enabled eVMCS versions (which can
> >> differ across vCPUs!) with KVM_GET_SUPPORTED_HV_CPUID. We will also have
> >> direct tlb flush and other nested features there so to avoid addning new
> >> KVM_CAP_* for them we need the CPUID.
> >
> > This is different from how KVM_GET_SUPPORTED_CPUID is used: QEMU assumes
> > that its output doesn't change between calls, and even caches the result
> > calling the ioctl only once.
> >
>
> Yes, I'm not sure if we have to have full consistency between
> KVM_GET_SUPPORTED_CPUID and KVM_GET_SUPPORTED_HV_CPUID.

Neither do I. I just noticed the difference and thought it might
matter.

> >> Another thing I'm thinking about is something like 'hv_all' cpu flag for
> >> Qemu which would enable everything by setting guest CPUIDs to what
> >> KVM_GET_SUPPORTED_HV_CPUID returns. In that case it would also be
> >> convenient to have HYPERV_CPUID_NESTED_FEATURES properly filled (or not
> >> filled when eVMCS was not enabled).
> >
> > I think this is orthogonal to the way you obtain capability info from
> > the kernel.
>
> Not necessarily. If very dumb userspace does 'host passthrough' for
> Hyper-V features without doing anything (e.g. not enabling Enlightened
> VMCS) it will just put the result of KVM_GET_SUPPORTED_HV_CPUID in guest
> facing CPUIDs and it will all work. In case eVMCS was previously enabled
> it again just copies everything and this still works.
>
> We don't probably need this for Qemu though. If you think it would be
> better to have HYPERV_CPUID_NESTED_FEATURES returned regardless of eVMCS
> enablement I'm ready to budge)

I have no opinion on this. I hope Paolo, who requested the feature, can
explain the desired semantics :)

Roman.

2018-12-11 17:29:00

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH v2 4/7] x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID

Roman Kagan <[email protected]> writes:

> On Tue, Dec 11, 2018 at 02:28:14PM +0100, Vitaly Kuznetsov wrote:
>> Roman Kagan <[email protected]> writes:
>>
>> > On Mon, Dec 10, 2018 at 06:21:56PM +0100, Vitaly Kuznetsov wrote:
>>
>> >> +
>> >> +Currently, the following list of CPUID leaves are returned:
>> >> + HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS
>> >> + HYPERV_CPUID_INTERFACE
>> >> + HYPERV_CPUID_VERSION
>> >> + HYPERV_CPUID_FEATURES
>> >> + HYPERV_CPUID_ENLIGHTMENT_INFO
>> >> + HYPERV_CPUID_IMPLEMENT_LIMITS
>> >> + HYPERV_CPUID_NESTED_FEATURES
>> >> +
>> >> +HYPERV_CPUID_NESTED_FEATURES leaf is only exposed when Enlightened VMCS was
>> >> +enabled on the corresponding vCPU (KVM_CAP_HYPERV_ENLIGHTENED_VMCS).
>> >
>> > IOW the output of ioctl(KVM_GET_SUPPORTED_HV_CPUID) depends on
>> > whether ioctl(KVM_ENABLE_CAP, KVM_CAP_HYPERV_ENLIGHTENED_VMCS) has
>> > already been called on that vcpu? I wonder if this fits the intended
>> > usage?
>>
>> I added HYPERV_CPUID_NESTED_FEATURES in the list (and made the new ioctl
>> per-cpu and not per-vm) for consistency. *In theory*
>> KVM_CAP_HYPERV_ENLIGHTENED_VMCS is also enabled per-vcpu so some
>> hypothetical userspace can later check enabled eVMCS versions (which can
>> differ across vCPUs!) with KVM_GET_SUPPORTED_HV_CPUID. We will also have
>> direct tlb flush and other nested features there so to avoid addning new
>> KVM_CAP_* for them we need the CPUID.
>
> This is different from how KVM_GET_SUPPORTED_CPUID is used: QEMU assumes
> that its output doesn't change between calls, and even caches the result
> calling the ioctl only once.
>

Yes, I'm not sure if we have to have full consistency between
KVM_GET_SUPPORTED_CPUID and KVM_GET_SUPPORTED_HV_CPUID.

>> Another thing I'm thinking about is something like 'hv_all' cpu flag for
>> Qemu which would enable everything by setting guest CPUIDs to what
>> KVM_GET_SUPPORTED_HV_CPUID returns. In that case it would also be
>> convenient to have HYPERV_CPUID_NESTED_FEATURES properly filled (or not
>> filled when eVMCS was not enabled).
>
> I think this is orthogonal to the way you obtain capability info from
> the kernel.

Not necessarily. If very dumb userspace does 'host passthrough' for
Hyper-V features without doing anything (e.g. not enabling Enlightened
VMCS) it will just put the result of KVM_GET_SUPPORTED_HV_CPUID in guest
facing CPUIDs and it will all work. In case eVMCS was previously enabled
it again just copies everything and this still works.

We don't probably need this for Qemu though. If you think it would be
better to have HYPERV_CPUID_NESTED_FEATURES returned regardless of eVMCS
enablement I'm ready to budge)

--
Vitaly

2018-12-14 10:52:27

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v2 0/7] x86/kvm/hyper-v: Implement KVM_GET_SUPPORTED_HV_CPUID

On 10/12/18 18:21, Vitaly Kuznetsov wrote:
> Changes since v1:
> - Add Michael's R-b tags.
> - Document KVM_GET_SUPPORTED_HV_CPUID and KVM_CAP_HYPERV_CPUID.
> - Fix error handling in kvm_vcpu_ioctl_get_hv_cpuid().
> - Check for -E2BIG in the selftest. PATCH6 is added to support the change.
>
> As suggested by Paolo, introduce new KVM_GET_SUPPORTED_HV_CPUID returning
> all Hyper-V CPUID feature leaves, this way we won't need to add a new
> KVM_CAP_HYPERV_* for every new Hyper-V enlightenment we implement in
> KVM.
>
> (Using the existing KVM_GET_SUPPORTED_CPUID doesn't seem to be possible:
> Hyper-V CPUID feature leaves intersect with KVM's (e.g. 0x40000000,
> 0x40000001) and we would probably confuse userspace in case we decide to
> return these twice).
>
> This patch series also does some housekeeping work in hyperv-tlfs.h, adds a
> simple selftest (that's how I actually checked that the newly added ioctl
> works) and removes recently added KVM_CAP_HYPERV_STIMER_DIRECT before it's
> too late.
>
> Vitaly Kuznetsov (7):
> x86/hyper-v: Do some housekeeping in hyperv-tlfs.h
> x86/hyper-v: Drop HV_X64_CONFIGURE_PROFILER definition
> x86/kvm/hyper-v: Introduce nested_get_evmcs_version() helper
> x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID
> x86/kvm/hyper-v: Drop KVM_CAP_HYPERV_STIMER_DIRECT
> KVM: selftests: implement an unchecked version of vcpu_ioctl()
> KVM: selftests: Add hyperv_cpuid test
>
> Documentation/virtual/kvm/api.txt | 57 ++++++
> arch/x86/include/asm/hyperv-tlfs.h | 185 +++++++++---------
> arch/x86/include/asm/kvm_host.h | 1 +
> arch/x86/kvm/hyperv.c | 125 +++++++++++-
> arch/x86/kvm/hyperv.h | 2 +
> arch/x86/kvm/svm.c | 7 +
> arch/x86/kvm/vmx.c | 24 ++-
> arch/x86/kvm/x86.c | 21 +-
> include/uapi/linux/kvm.h | 5 +-
> tools/testing/selftests/kvm/Makefile | 1 +
> .../testing/selftests/kvm/include/kvm_util.h | 2 +
> tools/testing/selftests/kvm/lib/kvm_util.c | 14 +-
> .../selftests/kvm/x86_64/hyperv_cpuid.c | 157 +++++++++++++++
> 13 files changed, 493 insertions(+), 108 deletions(-)
> create mode 100644 tools/testing/selftests/kvm/x86_64/hyperv_cpuid.c
>

Queued, thanks. I moved this above the direct EOI series so that
KVM_CAP_HYPERV_STIMER_DIRECT need not exist at any point of the history.

Paolo

2018-12-17 10:52:42

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH v2 0/7] x86/kvm/hyper-v: Implement KVM_GET_SUPPORTED_HV_CPUID

Paolo Bonzini <[email protected]> writes:

>
> Queued, thanks. I moved this above the direct EOI series so that
> KVM_CAP_HYPERV_STIMER_DIRECT need not exist at any point of the history.
>

Thanks! Just to make sure (and to conclude our discussion with Roman):
with your Qemu maintainer hat on, do you agree with the design decision
that KVM_GET_SUPPORTED_HV_CPUID's output value changes when
KVM_CAP_HYPERV_ENLIGHTENED_VMCS gets enabled? This differs from
KVM_GET_SUPPORTED_CPUID (where we always list all feature bits even if
they require explicit enablement)?

--
Vitaly

2018-12-17 19:49:41

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v2 0/7] x86/kvm/hyper-v: Implement KVM_GET_SUPPORTED_HV_CPUID

On 17/12/18 11:30, Vitaly Kuznetsov wrote:
>> Queued, thanks. I moved this above the direct EOI series so that
>> KVM_CAP_HYPERV_STIMER_DIRECT need not exist at any point of the history.
>>
> Thanks! Just to make sure (and to conclude our discussion with Roman):
> with your Qemu maintainer hat on, do you agree with the design decision
> that KVM_GET_SUPPORTED_HV_CPUID's output value changes when
> KVM_CAP_HYPERV_ENLIGHTENED_VMCS gets enabled? This differs from
> KVM_GET_SUPPORTED_CPUID (where we always list all feature bits even if
> they require explicit enablement)?

It doesn't really matter, since the old capability cannot be removed.
If you want to change it with a patch on top, that's also okay with me.

Paolo

2018-12-18 08:17:14

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH v2 0/7] x86/kvm/hyper-v: Implement KVM_GET_SUPPORTED_HV_CPUID

Paolo Bonzini <[email protected]> writes:

> On 17/12/18 11:30, Vitaly Kuznetsov wrote:
>>> Queued, thanks. I moved this above the direct EOI series so that
>>> KVM_CAP_HYPERV_STIMER_DIRECT need not exist at any point of the history.
>>>
>> Thanks! Just to make sure (and to conclude our discussion with Roman):
>> with your Qemu maintainer hat on, do you agree with the design decision
>> that KVM_GET_SUPPORTED_HV_CPUID's output value changes when
>> KVM_CAP_HYPERV_ENLIGHTENED_VMCS gets enabled? This differs from
>> KVM_GET_SUPPORTED_CPUID (where we always list all feature bits even if
>> they require explicit enablement)?
>
> It doesn't really matter, since the old capability cannot be removed.
> If you want to change it with a patch on top, that's also okay with me.

Ok, let's leave it as it is: maybe some other userspace (which doesn't
use legacy capabilities) will be grateful :-)

--
Vitaly