2021-04-15 13:44:54

by Vineeth Pillai

[permalink] [raw]
Subject: [PATCH v2 0/7] Hyper-V nested virt enlightenments for SVM


This patch series enables the nested virtualization enlightenments for
SVM. This is very similar to the enlightenments for VMX except for the
fact that there is no enlightened VMCS. For SVM, VMCB is already an
architectural in-memory data structure.

The supported enlightenments are:

Enlightened TLB Flush: If this is enabled, ASID invalidations invalidate
only gva -> hpa entries. To flush entries derived from NPT, hyper-v
provided hypercalls (HvFlushGuestPhysicalAddressSpace or
HvFlushGuestPhysicalAddressList) should be used.

Enlightened MSR bitmap(TLFS 16.5.3): "When enabled, L0 hypervisor does
not monitor the MSR bitmaps for changes. Instead, the L1 hypervisor must
invalidate the corresponding clean field after making changes to one of
the MSR bitmaps."

Direct Virtual Flush(TLFS 16.8): The hypervisor exposes hypercalls
(HvFlushVirtualAddressSpace, HvFlushVirtualAddressSpaceEx,
HvFlushVirtualAddressList, and HvFlushVirtualAddressListEx) that allow
operating systems to more efficiently manage the virtual TLB. The L1
hypervisor can choose to allow its guest to use those hypercalls and
delegate the responsibility to handle them to the L0 hypervisor. This
requires the use of a partition assist page."

L2 Windows boot time was measured with and without the patch. Time was
measured from power on to the login screen and was averaged over a
consecutive 5 trials:
Without the patch: 42 seconds
With the patch: 29 seconds
--

Changes from v1:
- Move the remote TLB flush related fields from kvm_vcpu_hv and kvm_hv
to kvm_vcpu_arch and kvm_arch.
- Modify the VMCB clean mask runtime based on whether L1 hypervisor
is running on Hyper-V or not.
- Detect Hyper-V nested enlightenments based on
HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS.
- Address other minor review comments.
---

Vineeth Pillai (7):
hyperv: Detect Nested virtualization support for SVM
hyperv: SVM enlightened TLB flush support flag
KVM: x86: hyper-v: Move the remote TLB flush logic out of vmx
KVM: SVM: hyper-v: Nested enlightenments in VMCB
KVM: SVM: hyper-v: Remote TLB flush for SVM
KVM: SVM: hyper-v: Enlightened MSR-Bitmap support
KVM: SVM: hyper-v: Direct Virtual Flush support

arch/x86/include/asm/hyperv-tlfs.h | 9 +++
arch/x86/include/asm/kvm_host.h | 14 ++++
arch/x86/include/asm/svm.h | 24 +++++-
arch/x86/kernel/cpu/mshyperv.c | 10 ++-
arch/x86/kvm/hyperv.c | 87 +++++++++++++++++++++
arch/x86/kvm/hyperv.h | 20 +++++
arch/x86/kvm/svm/svm.c | 120 +++++++++++++++++++++++++++++
arch/x86/kvm/svm/svm.h | 30 +++++++-
arch/x86/kvm/vmx/vmx.c | 97 ++---------------------
arch/x86/kvm/vmx/vmx.h | 10 ---
arch/x86/kvm/x86.c | 9 ++-
11 files changed, 323 insertions(+), 107 deletions(-)

--
2.25.1


2021-04-15 13:45:08

by Vineeth Pillai

[permalink] [raw]
Subject: [PATCH v2 1/7] hyperv: Detect Nested virtualization support for SVM

Detect nested features exposed by Hyper-V if SVM is enabled.

Signed-off-by: Vineeth Pillai <[email protected]>
---
arch/x86/kernel/cpu/mshyperv.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 3546d3e21787..c6f812851e37 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -252,6 +252,7 @@ static void __init hv_smp_prepare_cpus(unsigned int max_cpus)

static void __init ms_hyperv_init_platform(void)
{
+ int hv_max_functions_eax;
int hv_host_info_eax;
int hv_host_info_ebx;
int hv_host_info_ecx;
@@ -269,6 +270,8 @@ static void __init ms_hyperv_init_platform(void)
ms_hyperv.misc_features = cpuid_edx(HYPERV_CPUID_FEATURES);
ms_hyperv.hints = cpuid_eax(HYPERV_CPUID_ENLIGHTMENT_INFO);

+ hv_max_functions_eax = cpuid_eax(HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS);
+
pr_info("Hyper-V: privilege flags low 0x%x, high 0x%x, hints 0x%x, misc 0x%x\n",
ms_hyperv.features, ms_hyperv.priv_high, ms_hyperv.hints,
ms_hyperv.misc_features);
@@ -298,8 +301,7 @@ static void __init ms_hyperv_init_platform(void)
/*
* Extract host information.
*/
- if (cpuid_eax(HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS) >=
- HYPERV_CPUID_VERSION) {
+ if (hv_max_functions_eax >= HYPERV_CPUID_VERSION) {
hv_host_info_eax = cpuid_eax(HYPERV_CPUID_VERSION);
hv_host_info_ebx = cpuid_ebx(HYPERV_CPUID_VERSION);
hv_host_info_ecx = cpuid_ecx(HYPERV_CPUID_VERSION);
@@ -325,9 +327,11 @@ static void __init ms_hyperv_init_platform(void)
ms_hyperv.isolation_config_a, ms_hyperv.isolation_config_b);
}

- if (ms_hyperv.hints & HV_X64_ENLIGHTENED_VMCS_RECOMMENDED) {
+ if (hv_max_functions_eax >= HYPERV_CPUID_NESTED_FEATURES) {
ms_hyperv.nested_features =
cpuid_eax(HYPERV_CPUID_NESTED_FEATURES);
+ pr_info("Hyper-V: Nested features: 0x%x\n",
+ ms_hyperv.nested_features);
}

/*
--
2.25.1

2021-04-15 13:45:14

by Vineeth Pillai

[permalink] [raw]
Subject: [PATCH v2 2/7] hyperv: SVM enlightened TLB flush support flag

Bit 22 of HYPERV_CPUID_FEATURES.EDX is specific to SVM and specifies
support for enlightened TLB flush. With this enlightenment enabled,
ASID invalidations flushes only gva->hpa entries. To flush TLB entries
derived from NPT, hypercalls should be used
(HvFlushGuestPhysicalAddressSpace or HvFlushGuestPhysicalAddressList)

Signed-off-by: Vineeth Pillai <[email protected]>
---
arch/x86/include/asm/hyperv-tlfs.h | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 606f5cc579b2..005bf14d0449 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -133,6 +133,15 @@
#define HV_X64_NESTED_GUEST_MAPPING_FLUSH BIT(18)
#define HV_X64_NESTED_MSR_BITMAP BIT(19)

+/*
+ * This is specific to AMD and specifies that enlightened TLB flush is
+ * supported. If guest opts in to this feature, ASID invalidations only
+ * flushes gva -> hpa mapping entries. To flush the TLB entries derived
+ * from NPT, hypercalls should be used (HvFlushGuestPhysicalAddressSpace
+ * or HvFlushGuestPhysicalAddressList).
+ */
+#define HV_X64_NESTED_ENLIGHTENED_TLB BIT(22)
+
/* HYPERV_CPUID_ISOLATION_CONFIG.EAX bits. */
#define HV_PARAVISOR_PRESENT BIT(0)

--
2.25.1

2021-04-15 13:45:20

by Vineeth Pillai

[permalink] [raw]
Subject: [PATCH v2 3/7] KVM: x86: hyper-v: Move the remote TLB flush logic out of vmx

Currently the remote TLB flush logic is specific to VMX.
Move it to a common place so that SVM can use it as well.

Signed-off-by: Vineeth Pillai <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 14 +++++
arch/x86/kvm/hyperv.c | 87 +++++++++++++++++++++++++++++
arch/x86/kvm/hyperv.h | 20 +++++++
arch/x86/kvm/vmx/vmx.c | 97 +++------------------------------
arch/x86/kvm/vmx/vmx.h | 10 ----
arch/x86/kvm/x86.c | 9 ++-
6 files changed, 136 insertions(+), 101 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 877a4025d8da..ed84c15d18bc 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -838,6 +838,15 @@ struct kvm_vcpu_arch {

/* Protected Guests */
bool guest_state_protected;
+
+#if IS_ENABLED(CONFIG_HYPERV)
+ /*
+ * Two Dimensional paging CR3
+ * EPTP for Intel
+ * nCR3 for AMD
+ */
+ u64 tdp_pointer;
+#endif
};

struct kvm_lpage_info {
@@ -1079,6 +1088,11 @@ struct kvm_arch {
*/
spinlock_t tdp_mmu_pages_lock;
#endif /* CONFIG_X86_64 */
+
+#if IS_ENABLED(CONFIG_HYPERV)
+ int tdp_pointers_match;
+ spinlock_t tdp_pointer_lock;
+#endif
};

struct kvm_vm_stat {
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 58fa8c029867..614b4448a028 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -32,6 +32,7 @@
#include <linux/eventfd.h>

#include <asm/apicdef.h>
+#include <asm/mshyperv.h>
#include <trace/events/kvm.h>

#include "trace.h"
@@ -2180,3 +2181,89 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,

return 0;
}
+
+#if IS_ENABLED(CONFIG_HYPERV)
+/* check_tdp_pointer() should be under protection of tdp_pointer_lock. */
+static void check_tdp_pointer_match(struct kvm *kvm)
+{
+ u64 tdp_pointer = INVALID_PAGE;
+ bool valid_tdp = false;
+ struct kvm_vcpu *vcpu;
+ int i;
+
+ kvm_for_each_vcpu(i, vcpu, kvm) {
+ if (!valid_tdp) {
+ tdp_pointer = vcpu->arch.tdp_pointer;
+ valid_tdp = true;
+ continue;
+ }
+
+ if (tdp_pointer != vcpu->arch.tdp_pointer) {
+ kvm->arch.tdp_pointers_match = TDP_POINTERS_MISMATCH;
+ return;
+ }
+ }
+
+ kvm->arch.tdp_pointers_match = TDP_POINTERS_MATCH;
+}
+
+static int kvm_fill_hv_flush_list_func(struct hv_guest_mapping_flush_list *flush,
+ void *data)
+{
+ struct kvm_tlb_range *range = data;
+
+ return hyperv_fill_flush_guest_mapping_list(flush, range->start_gfn,
+ range->pages);
+}
+
+static inline int __hv_remote_flush_tlb_with_range(struct kvm *kvm,
+ struct kvm_vcpu *vcpu, struct kvm_tlb_range *range)
+{
+ u64 tdp_pointer = vcpu->arch.tdp_pointer;
+
+ /*
+ * FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE hypercall needs address
+ * of the base of EPT PML4 table, strip off EPT configuration
+ * information.
+ */
+ if (range)
+ return hyperv_flush_guest_mapping_range(tdp_pointer & PAGE_MASK,
+ kvm_fill_hv_flush_list_func, (void *)range);
+ else
+ return hyperv_flush_guest_mapping(tdp_pointer & PAGE_MASK);
+}
+
+int kvm_hv_remote_flush_tlb_with_range(struct kvm *kvm,
+ struct kvm_tlb_range *range)
+{
+ struct kvm_vcpu *vcpu;
+ int ret = 0, i;
+
+ spin_lock(&kvm->arch.tdp_pointer_lock);
+
+ if (kvm->arch.tdp_pointers_match == TDP_POINTERS_CHECK)
+ check_tdp_pointer_match(kvm);
+
+ if (kvm->arch.tdp_pointers_match != TDP_POINTERS_MATCH) {
+ kvm_for_each_vcpu(i, vcpu, kvm) {
+ /* If tdp_pointer is invalid pointer, bypass flush request. */
+ if (VALID_PAGE(vcpu->arch.tdp_pointer))
+ ret |= __hv_remote_flush_tlb_with_range(
+ kvm, vcpu, range);
+ }
+ } else {
+ ret = __hv_remote_flush_tlb_with_range(kvm,
+ kvm_get_vcpu(kvm, 0), range);
+ }
+
+ spin_unlock(&kvm->arch.tdp_pointer_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(kvm_hv_remote_flush_tlb_with_range);
+
+int kvm_hv_remote_flush_tlb(struct kvm *kvm)
+{
+ return kvm_hv_remote_flush_tlb_with_range(kvm, NULL);
+}
+EXPORT_SYMBOL_GPL(kvm_hv_remote_flush_tlb);
+#endif
diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index e951af1fcb2c..b27c6f47f58d 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -50,6 +50,12 @@
/* Hyper-V HV_X64_MSR_SYNDBG_OPTIONS bits */
#define HV_X64_SYNDBG_OPTION_USE_HCALLS BIT(2)

+enum tdp_pointers_status {
+ TDP_POINTERS_CHECK = 0,
+ TDP_POINTERS_MATCH = 1,
+ TDP_POINTERS_MISMATCH = 2
+};
+
static inline struct kvm_hv *to_kvm_hv(struct kvm *kvm)
{
return &kvm->arch.hyperv;
@@ -141,4 +147,18 @@ int kvm_vm_ioctl_hv_eventfd(struct kvm *kvm, struct kvm_hyperv_eventfd *args);
int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
struct kvm_cpuid_entry2 __user *entries);

+#if IS_ENABLED(CONFIG_HYPERV)
+static inline void kvm_update_arch_tdp_pointer(struct kvm *kvm,
+ struct kvm_vcpu *vcpu, u64 tdp_pointer)
+{
+ spin_lock(&kvm->arch.tdp_pointer_lock);
+ vcpu->arch.tdp_pointer = tdp_pointer;
+ kvm->arch.tdp_pointers_match = TDP_POINTERS_CHECK;
+ spin_unlock(&kvm->arch.tdp_pointer_lock);
+}
+
+int kvm_hv_remote_flush_tlb(struct kvm *kvm);
+int kvm_hv_remote_flush_tlb_with_range(struct kvm *kvm,
+ struct kvm_tlb_range *range);
+#endif
#endif
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 50810d471462..67f607319eb7 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -62,6 +62,7 @@
#include "vmcs12.h"
#include "vmx.h"
#include "x86.h"
+#include "hyperv.h"

MODULE_AUTHOR("Qumranet");
MODULE_LICENSE("GPL");
@@ -472,83 +473,6 @@ static const u32 vmx_uret_msrs_list[] = {
static bool __read_mostly enlightened_vmcs = true;
module_param(enlightened_vmcs, bool, 0444);

-/* check_ept_pointer() should be under protection of ept_pointer_lock. */
-static void check_ept_pointer_match(struct kvm *kvm)
-{
- struct kvm_vcpu *vcpu;
- u64 tmp_eptp = INVALID_PAGE;
- int i;
-
- kvm_for_each_vcpu(i, vcpu, kvm) {
- if (!VALID_PAGE(tmp_eptp)) {
- tmp_eptp = to_vmx(vcpu)->ept_pointer;
- } else if (tmp_eptp != to_vmx(vcpu)->ept_pointer) {
- to_kvm_vmx(kvm)->ept_pointers_match
- = EPT_POINTERS_MISMATCH;
- return;
- }
- }
-
- to_kvm_vmx(kvm)->ept_pointers_match = EPT_POINTERS_MATCH;
-}
-
-static int kvm_fill_hv_flush_list_func(struct hv_guest_mapping_flush_list *flush,
- void *data)
-{
- struct kvm_tlb_range *range = data;
-
- return hyperv_fill_flush_guest_mapping_list(flush, range->start_gfn,
- range->pages);
-}
-
-static inline int __hv_remote_flush_tlb_with_range(struct kvm *kvm,
- struct kvm_vcpu *vcpu, struct kvm_tlb_range *range)
-{
- u64 ept_pointer = to_vmx(vcpu)->ept_pointer;
-
- /*
- * FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE hypercall needs address
- * of the base of EPT PML4 table, strip off EPT configuration
- * information.
- */
- if (range)
- return hyperv_flush_guest_mapping_range(ept_pointer & PAGE_MASK,
- kvm_fill_hv_flush_list_func, (void *)range);
- else
- return hyperv_flush_guest_mapping(ept_pointer & PAGE_MASK);
-}
-
-static int hv_remote_flush_tlb_with_range(struct kvm *kvm,
- struct kvm_tlb_range *range)
-{
- struct kvm_vcpu *vcpu;
- int ret = 0, i;
-
- spin_lock(&to_kvm_vmx(kvm)->ept_pointer_lock);
-
- if (to_kvm_vmx(kvm)->ept_pointers_match == EPT_POINTERS_CHECK)
- check_ept_pointer_match(kvm);
-
- if (to_kvm_vmx(kvm)->ept_pointers_match != EPT_POINTERS_MATCH) {
- kvm_for_each_vcpu(i, vcpu, kvm) {
- /* If ept_pointer is invalid pointer, bypass flush request. */
- if (VALID_PAGE(to_vmx(vcpu)->ept_pointer))
- ret |= __hv_remote_flush_tlb_with_range(
- kvm, vcpu, range);
- }
- } else {
- ret = __hv_remote_flush_tlb_with_range(kvm,
- kvm_get_vcpu(kvm, 0), range);
- }
-
- spin_unlock(&to_kvm_vmx(kvm)->ept_pointer_lock);
- return ret;
-}
-static int hv_remote_flush_tlb(struct kvm *kvm)
-{
- return hv_remote_flush_tlb_with_range(kvm, NULL);
-}
-
static int hv_enable_direct_tlbflush(struct kvm_vcpu *vcpu)
{
struct hv_enlightened_vmcs *evmcs;
@@ -3115,13 +3039,10 @@ static void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long pgd,
eptp = construct_eptp(vcpu, pgd, pgd_level);
vmcs_write64(EPT_POINTER, eptp);

- if (kvm_x86_ops.tlb_remote_flush) {
- spin_lock(&to_kvm_vmx(kvm)->ept_pointer_lock);
- to_vmx(vcpu)->ept_pointer = eptp;
- to_kvm_vmx(kvm)->ept_pointers_match
- = EPT_POINTERS_CHECK;
- spin_unlock(&to_kvm_vmx(kvm)->ept_pointer_lock);
- }
+#if IS_ENABLED(CONFIG_HYPERV)
+ if (kvm_x86_ops.tlb_remote_flush)
+ kvm_update_arch_tdp_pointer(kvm, vcpu, eptp);
+#endif

if (!enable_unrestricted_guest && !is_paging(vcpu))
guest_cr3 = to_kvm_vmx(kvm)->ept_identity_map_addr;
@@ -6989,8 +6910,6 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
vmx->pi_desc.nv = POSTED_INTR_VECTOR;
vmx->pi_desc.sn = 1;

- vmx->ept_pointer = INVALID_PAGE;
-
return 0;

free_vmcs:
@@ -7007,8 +6926,6 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)

static int vmx_vm_init(struct kvm *kvm)
{
- spin_lock_init(&to_kvm_vmx(kvm)->ept_pointer_lock);
-
if (!ple_gap)
kvm->arch.pause_in_guest = true;

@@ -7818,9 +7735,9 @@ static __init int hardware_setup(void)
#if IS_ENABLED(CONFIG_HYPERV)
if (ms_hyperv.nested_features & HV_X64_NESTED_GUEST_MAPPING_FLUSH
&& enable_ept) {
- vmx_x86_ops.tlb_remote_flush = hv_remote_flush_tlb;
+ vmx_x86_ops.tlb_remote_flush = kvm_hv_remote_flush_tlb;
vmx_x86_ops.tlb_remote_flush_with_range =
- hv_remote_flush_tlb_with_range;
+ kvm_hv_remote_flush_tlb_with_range;
}
#endif

diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 89da5e1251f1..d2e2ab46f5bb 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -325,7 +325,6 @@ struct vcpu_vmx {
*/
u64 msr_ia32_feature_control;
u64 msr_ia32_feature_control_valid_bits;
- u64 ept_pointer;

struct pt_desc pt_desc;
struct lbr_desc lbr_desc;
@@ -338,21 +337,12 @@ struct vcpu_vmx {
} shadow_msr_intercept;
};

-enum ept_pointers_status {
- EPT_POINTERS_CHECK = 0,
- EPT_POINTERS_MATCH = 1,
- EPT_POINTERS_MISMATCH = 2
-};
-
struct kvm_vmx {
struct kvm kvm;

unsigned int tss_addr;
bool ept_identity_pagetable_done;
gpa_t ept_identity_map_addr;
-
- enum ept_pointers_status ept_pointers_match;
- spinlock_t ept_pointer_lock;
};

bool nested_vmx_allowed(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2a20ce60152e..f566e78b59b9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10115,6 +10115,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
vcpu->arch.pending_external_vector = -1;
vcpu->arch.preempted_in_kernel = false;

+#if IS_ENABLED(CONFIG_HYPERV)
+ vcpu->arch.tdp_pointer = INVALID_PAGE;
+#endif
+
r = static_call(kvm_x86_vcpu_create)(vcpu);
if (r)
goto free_guest_fpu;
@@ -10470,7 +10474,6 @@ void kvm_arch_free_vm(struct kvm *kvm)
vfree(kvm);
}

-
int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
{
if (type)
@@ -10498,6 +10501,10 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)

kvm->arch.guest_can_read_msr_platform_info = true;

+#if IS_ENABLED(CONFIG_HYPERV)
+ spin_lock_init(&kvm->arch.tdp_pointer_lock);
+#endif
+
INIT_DELAYED_WORK(&kvm->arch.kvmclock_update_work, kvmclock_update_fn);
INIT_DELAYED_WORK(&kvm->arch.kvmclock_sync_work, kvmclock_sync_fn);

--
2.25.1

2021-04-15 13:45:44

by Vineeth Pillai

[permalink] [raw]
Subject: [PATCH v2 6/7] KVM: SVM: hyper-v: Enlightened MSR-Bitmap support

Enlightened MSR-Bitmap as per TLFS:

"The L1 hypervisor may collaborate with the L0 hypervisor to make MSR
accesses more efficient. It can enable enlightened MSR bitmaps by setting
the corresponding field in the enlightened VMCS to 1. When enabled, L0
hypervisor does not monitor the MSR bitmaps for changes. Instead, the L1
hypervisor must invalidate the corresponding clean field after making
changes to one of the MSR bitmaps."

Enable this for SVM.

Related VMX changes:
commit ceef7d10dfb6 ("KVM: x86: VMX: hyper-v: Enlightened MSR-Bitmap support")

Signed-off-by: Vineeth Pillai <[email protected]>
---
arch/x86/kvm/svm/svm.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index de141d5ae5fb..f59f03b5c722 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -649,6 +649,27 @@ static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
return !!test_bit(bit_write, &tmp);
}

+#if IS_ENABLED(CONFIG_HYPERV)
+static inline void hv_vmcb_dirty_nested_enlightenments(struct kvm_vcpu *vcpu)
+{
+ struct vmcb *vmcb = to_svm(vcpu)->vmcb;
+
+ /*
+ * vmcb can be NULL if called during early vcpu init.
+ * And its okay not to mark vmcb dirty during vcpu init
+ * as we mark it dirty unconditionally towards end of vcpu
+ * init phase.
+ */
+ if (vmcb && vmcb_is_clean(vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS) &&
+ vmcb->hv_enlightenments.hv_enlightenments_control.msr_bitmap)
+ vmcb_mark_dirty(vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS);
+}
+#else
+static inline void hv_vmcb_dirty_nested_enlightenments(struct kvm_vcpu *vcpu)
+{
+}
+#endif
+
static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,
u32 msr, int read, int write)
{
@@ -680,6 +701,9 @@ static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,
write ? clear_bit(bit_write, &tmp) : set_bit(bit_write, &tmp);

msrpm[offset] = tmp;
+
+ hv_vmcb_dirty_nested_enlightenments(vcpu);
+
}

void set_msr_interception(struct kvm_vcpu *vcpu, u32 *msrpm, u32 msr,
@@ -1143,6 +1167,9 @@ static void hv_init_vmcb(struct vmcb *vmcb)
if (npt_enabled &&
ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB)
hve->hv_enlightenments_control.enlightened_npt_tlb = 1;
+
+ if (ms_hyperv.nested_features & HV_X64_NESTED_MSR_BITMAP)
+ hve->hv_enlightenments_control.msr_bitmap = 1;
}
#else
static inline void hv_init_vmcb(struct vmcb *vmcb)
--
2.25.1

2021-04-15 13:45:48

by Vineeth Pillai

[permalink] [raw]
Subject: [PATCH v2 7/7] KVM: SVM: hyper-v: Direct Virtual Flush support

From Hyper-V TLFS:
"The hypervisor exposes hypercalls (HvFlushVirtualAddressSpace,
HvFlushVirtualAddressSpaceEx, HvFlushVirtualAddressList, and
HvFlushVirtualAddressListEx) that allow operating systems to more
efficiently manage the virtual TLB. The L1 hypervisor can choose to
allow its guest to use those hypercalls and delegate the responsibility
to handle them to the L0 hypervisor. This requires the use of a
partition assist page."

Add the Direct Virtual Flush support for SVM.

Related VMX changes:
commit 6f6a657c9998 ("KVM/Hyper-V/VMX: Add direct tlb flush support")

Signed-off-by: Vineeth Pillai <[email protected]>
---
arch/x86/kvm/svm/svm.c | 48 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 48 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index f59f03b5c722..cff01256c47e 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -443,6 +443,32 @@ static void svm_init_osvw(struct kvm_vcpu *vcpu)
vcpu->arch.osvw.status |= 1;
}

+#if IS_ENABLED(CONFIG_HYPERV)
+static int hv_enable_direct_tlbflush(struct kvm_vcpu *vcpu)
+{
+ struct hv_enlightenments *hve;
+ struct hv_partition_assist_pg **p_hv_pa_pg =
+ &to_kvm_hv(vcpu->kvm)->hv_pa_pg;
+
+ if (!*p_hv_pa_pg)
+ *p_hv_pa_pg = kzalloc(PAGE_SIZE, GFP_KERNEL);
+
+ if (!*p_hv_pa_pg)
+ return -ENOMEM;
+
+ hve = (struct hv_enlightenments *)&to_svm(vcpu)->vmcb->hv_enlightenments;
+
+ hve->partition_assist_page = __pa(*p_hv_pa_pg);
+ hve->hv_vm_id = (unsigned long)vcpu->kvm;
+ if (!hve->hv_enlightenments_control.nested_flush_hypercall) {
+ hve->hv_enlightenments_control.nested_flush_hypercall = 1;
+ vmcb_mark_dirty(to_svm(vcpu)->vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS);
+ }
+
+ return 0;
+}
+#endif
+
static int has_svm(void)
{
const char *msg;
@@ -1037,6 +1063,21 @@ static __init int svm_hardware_setup(void)
svm_x86_ops.tlb_remote_flush_with_range =
kvm_hv_remote_flush_tlb_with_range;
}
+
+ if (ms_hyperv.nested_features & HV_X64_NESTED_DIRECT_FLUSH) {
+ pr_info("kvm: Hyper-V Direct TLB Flush enabled\n");
+ for_each_online_cpu(cpu) {
+ struct hv_vp_assist_page *vp_ap =
+ hv_get_vp_assist_page(cpu);
+
+ if (!vp_ap)
+ continue;
+
+ vp_ap->nested_control.features.directhypercall = 1;
+ }
+ svm_x86_ops.enable_direct_tlbflush =
+ hv_enable_direct_tlbflush;
+ }
#endif

if (nrips) {
@@ -3921,6 +3962,13 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
}
svm->vmcb->save.cr2 = vcpu->arch.cr2;

+#if IS_ENABLED(CONFIG_HYPERV)
+ if (svm->vmcb->hv_enlightenments.hv_vp_id != to_hv_vcpu(vcpu)->vp_index) {
+ svm->vmcb->hv_enlightenments.hv_vp_id = to_hv_vcpu(vcpu)->vp_index;
+ vmcb_mark_dirty(svm->vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS);
+ }
+#endif
+
/*
* Run with all-zero DR6 unless needed, so that we can get the exact cause
* of a #DB.
--
2.25.1

2021-04-15 13:46:33

by Vineeth Pillai

[permalink] [raw]
Subject: [PATCH v2 4/7] KVM: SVM: hyper-v: Nested enlightenments in VMCB

Add Hyper-V specific fields in VMCB to support SVM enlightenments.
Also a small refactoring of VMCB clean bits handling.

Signed-off-by: Vineeth Pillai <[email protected]>
---
arch/x86/include/asm/svm.h | 24 +++++++++++++++++++++++-
arch/x86/kvm/svm/svm.c | 8 ++++++++
arch/x86/kvm/svm/svm.h | 30 ++++++++++++++++++++++++++++--
3 files changed, 59 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 1c561945b426..3586d7523ce8 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -322,9 +322,31 @@ static inline void __unused_size_checks(void)
BUILD_BUG_ON(sizeof(struct ghcb) != EXPECTED_GHCB_SIZE);
}

+
+#if IS_ENABLED(CONFIG_HYPERV)
+struct __packed hv_enlightenments {
+ struct __packed hv_enlightenments_control {
+ u32 nested_flush_hypercall:1;
+ u32 msr_bitmap:1;
+ u32 enlightened_npt_tlb: 1;
+ u32 reserved:29;
+ } hv_enlightenments_control;
+ u32 hv_vp_id;
+ u64 hv_vm_id;
+ u64 partition_assist_page;
+ u64 reserved;
+};
+#define VMCB_CONTROL_END 992 // 32 bytes for Hyper-V
+#else
+#define VMCB_CONTROL_END 1024
+#endif
+
struct vmcb {
struct vmcb_control_area control;
- u8 reserved_control[1024 - sizeof(struct vmcb_control_area)];
+ u8 reserved_control[VMCB_CONTROL_END - sizeof(struct vmcb_control_area)];
+#if IS_ENABLED(CONFIG_HYPERV)
+ struct hv_enlightenments hv_enlightenments;
+#endif
struct vmcb_save_area save;
} __packed;

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index baee91c1e936..2ad1f55c88d0 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -31,6 +31,7 @@
#include <asm/tlbflush.h>
#include <asm/desc.h>
#include <asm/debugreg.h>
+#include <asm/hypervisor.h>
#include <asm/kvm_para.h>
#include <asm/irq_remapping.h>
#include <asm/spec-ctrl.h>
@@ -122,6 +123,8 @@ bool npt_enabled = true;
bool npt_enabled;
#endif

+u32 __read_mostly vmcb_all_clean_mask = VMCB_ALL_CLEAN_MASK;
+
/*
* These 2 parameters are used to config the controls for Pause-Loop Exiting:
* pause_filter_count: On processors that support Pause filtering(indicated
@@ -1051,6 +1054,11 @@ static __init int svm_hardware_setup(void)
*/
allow_smaller_maxphyaddr = !npt_enabled;

+#if IS_ENABLED(CONFIG_HYPERV)
+ if (hypervisor_is_type(X86_HYPER_MS_HYPERV))
+ vmcb_all_clean_mask |= VMCB_HYPERV_CLEAN_MASK;
+#endif
+
return 0;

err:
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 39e071fdab0c..63ed05c8027b 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -33,6 +33,11 @@ static const u32 host_save_user_msrs[] = {
extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
extern bool npt_enabled;

+/*
+ * Clean bits in VMCB.
+ * VMCB_ALL_CLEAN_MASK and VMCB_HYPERV_CLEAN_MASK might
+ * also need to be updated if this enum is modified.
+ */
enum {
VMCB_INTERCEPTS, /* Intercept vectors, TSC offset,
pause filter count */
@@ -50,12 +55,28 @@ enum {
* AVIC PHYSICAL_TABLE pointer,
* AVIC LOGICAL_TABLE pointer
*/
- VMCB_DIRTY_MAX,
+#if IS_ENABLED(CONFIG_HYPERV)
+ VMCB_HV_NESTED_ENLIGHTENMENTS = 31,
+#endif
};

+#define VMCB_ALL_CLEAN_MASK ( \
+ (1U << VMCB_INTERCEPTS) | (1U << VMCB_PERM_MAP) | \
+ (1U << VMCB_ASID) | (1U << VMCB_INTR) | \
+ (1U << VMCB_NPT) | (1U << VMCB_CR) | (1U << VMCB_DR) | \
+ (1U << VMCB_DT) | (1U << VMCB_SEG) | (1U << VMCB_CR2) | \
+ (1U << VMCB_LBR) | (1U << VMCB_AVIC) \
+ )
+
+#if IS_ENABLED(CONFIG_HYPERV)
+#define VMCB_HYPERV_CLEAN_MASK (1U << VMCB_HV_NESTED_ENLIGHTENMENTS)
+#endif
+
/* TPR and CR2 are always written before VMRUN */
#define VMCB_ALWAYS_DIRTY_MASK ((1U << VMCB_INTR) | (1U << VMCB_CR2))

+extern u32 vmcb_all_clean_mask __read_mostly;
+
struct kvm_sev_info {
bool active; /* SEV enabled guest */
bool es_active; /* SEV-ES enabled guest */
@@ -230,7 +251,7 @@ static inline void vmcb_mark_all_dirty(struct vmcb *vmcb)

static inline void vmcb_mark_all_clean(struct vmcb *vmcb)
{
- vmcb->control.clean = ((1 << VMCB_DIRTY_MAX) - 1)
+ vmcb->control.clean = vmcb_all_clean_mask
& ~VMCB_ALWAYS_DIRTY_MASK;
}

@@ -239,6 +260,11 @@ static inline void vmcb_mark_dirty(struct vmcb *vmcb, int bit)
vmcb->control.clean &= ~(1 << bit);
}

+static inline bool vmcb_is_clean(struct vmcb *vmcb, int bit)
+{
+ return (vmcb->control.clean & (1 << bit));
+}
+
static inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu)
{
return container_of(vcpu, struct vcpu_svm, vcpu);
--
2.25.1

2021-04-15 13:48:04

by Vineeth Pillai

[permalink] [raw]
Subject: [PATCH v2 5/7] KVM: SVM: hyper-v: Remote TLB flush for SVM

Enable remote TLB flush for SVM.

Signed-off-by: Vineeth Pillai <[email protected]>
---
arch/x86/kvm/svm/svm.c | 37 +++++++++++++++++++++++++++++++++++++
1 file changed, 37 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 2ad1f55c88d0..de141d5ae5fb 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -37,6 +37,7 @@
#include <asm/spec-ctrl.h>
#include <asm/cpu_device_id.h>
#include <asm/traps.h>
+#include <asm/mshyperv.h>

#include <asm/virtext.h>
#include "trace.h"
@@ -44,6 +45,8 @@
#include "svm.h"
#include "svm_ops.h"

+#include "hyperv.h"
+
#define __ex(x) __kvm_handle_fault_on_reboot(x)

MODULE_AUTHOR("Qumranet");
@@ -931,6 +934,8 @@ static __init void svm_set_cpu_caps(void)
kvm_cpu_cap_set(X86_FEATURE_VIRT_SSBD);
}

+static struct kvm_x86_ops svm_x86_ops;
+
static __init int svm_hardware_setup(void)
{
int cpu;
@@ -1000,6 +1005,16 @@ static __init int svm_hardware_setup(void)
kvm_configure_mmu(npt_enabled, get_max_npt_level(), PG_LEVEL_1G);
pr_info("kvm: Nested Paging %sabled\n", npt_enabled ? "en" : "dis");

+#if IS_ENABLED(CONFIG_HYPERV)
+ if (ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB
+ && npt_enabled) {
+ pr_info("kvm: Hyper-V enlightened NPT TLB flush enabled\n");
+ svm_x86_ops.tlb_remote_flush = kvm_hv_remote_flush_tlb;
+ svm_x86_ops.tlb_remote_flush_with_range =
+ kvm_hv_remote_flush_tlb_with_range;
+ }
+#endif
+
if (nrips) {
if (!boot_cpu_has(X86_FEATURE_NRIPS))
nrips = false;
@@ -1120,6 +1135,21 @@ static void svm_check_invpcid(struct vcpu_svm *svm)
}
}

+#if IS_ENABLED(CONFIG_HYPERV)
+static void hv_init_vmcb(struct vmcb *vmcb)
+{
+ struct hv_enlightenments *hve = &vmcb->hv_enlightenments;
+
+ if (npt_enabled &&
+ ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB)
+ hve->hv_enlightenments_control.enlightened_npt_tlb = 1;
+}
+#else
+static inline void hv_init_vmcb(struct vmcb *vmcb)
+{
+}
+#endif
+
static void init_vmcb(struct vcpu_svm *svm)
{
struct vmcb_control_area *control = &svm->vmcb->control;
@@ -1282,6 +1312,8 @@ static void init_vmcb(struct vcpu_svm *svm)
}
}

+ hv_init_vmcb(svm->vmcb);
+
vmcb_mark_all_dirty(svm->vmcb);

enable_gif(svm);
@@ -3975,6 +4007,11 @@ static void svm_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long root,
svm->vmcb->control.nested_cr3 = cr3;
vmcb_mark_dirty(svm->vmcb, VMCB_NPT);

+#if IS_ENABLED(CONFIG_HYPERV)
+ if (kvm_x86_ops.tlb_remote_flush)
+ kvm_update_arch_tdp_pointer(vcpu->kvm, vcpu, cr3);
+#endif
+
/* Loading L2's CR3 is handled by enter_svm_guest_mode. */
if (!test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail))
return;
--
2.25.1

2021-04-16 08:48:26

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] KVM: x86: hyper-v: Move the remote TLB flush logic out of vmx

On 16/04/21 10:36, Vitaly Kuznetsov wrote:
> - Create a dedicated set of files, e.g. 'kvmonhyperv.[ch]' (I also
> thought about 'hyperv_host.[ch]' but then I realized it's equally
> misleading as one can read this as 'KVM is acting as Hyper-V host').
>
> Personally, I'd vote for the later. Besides eliminating confusion, the
> benefit of having dedicated files is that we can avoid compiling them
> completely when !IS_ENABLED(CONFIG_HYPERV) (#ifdefs in C are ugly).

Indeed. For the file, kvm-on-hv.[ch] can do.

Paolo

2021-04-16 09:02:40

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH v2 4/7] KVM: SVM: hyper-v: Nested enlightenments in VMCB

Vineeth Pillai <[email protected]> writes:

> Add Hyper-V specific fields in VMCB to support SVM enlightenments.
> Also a small refactoring of VMCB clean bits handling.
>
> Signed-off-by: Vineeth Pillai <[email protected]>
> ---
> arch/x86/include/asm/svm.h | 24 +++++++++++++++++++++++-
> arch/x86/kvm/svm/svm.c | 8 ++++++++
> arch/x86/kvm/svm/svm.h | 30 ++++++++++++++++++++++++++++--
> 3 files changed, 59 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index 1c561945b426..3586d7523ce8 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -322,9 +322,31 @@ static inline void __unused_size_checks(void)
> BUILD_BUG_ON(sizeof(struct ghcb) != EXPECTED_GHCB_SIZE);
> }
>
> +
> +#if IS_ENABLED(CONFIG_HYPERV)
> +struct __packed hv_enlightenments {
> + struct __packed hv_enlightenments_control {
> + u32 nested_flush_hypercall:1;
> + u32 msr_bitmap:1;
> + u32 enlightened_npt_tlb: 1;
> + u32 reserved:29;
> + } hv_enlightenments_control;
> + u32 hv_vp_id;
> + u64 hv_vm_id;
> + u64 partition_assist_page;
> + u64 reserved;
> +};

Enlightened VMCS seems to have the same part:

struct {
u32 nested_flush_hypercall:1;
u32 msr_bitmap:1;
u32 reserved:30;
} __packed hv_enlightenments_control;
u32 hv_vp_id;
u64 hv_vm_id;
u64 partition_assist_page;

Would it maybe make sense to unify these two (in case they are the same
thing in Hyper-V, of course)?


> +#define VMCB_CONTROL_END 992 // 32 bytes for Hyper-V
> +#else
> +#define VMCB_CONTROL_END 1024
> +#endif
> +
> struct vmcb {
> struct vmcb_control_area control;
> - u8 reserved_control[1024 - sizeof(struct vmcb_control_area)];
> + u8 reserved_control[VMCB_CONTROL_END - sizeof(struct vmcb_control_area)];
> +#if IS_ENABLED(CONFIG_HYPERV)
> + struct hv_enlightenments hv_enlightenments;
> +#endif
> struct vmcb_save_area save;
> } __packed;
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index baee91c1e936..2ad1f55c88d0 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -31,6 +31,7 @@
> #include <asm/tlbflush.h>
> #include <asm/desc.h>
> #include <asm/debugreg.h>
> +#include <asm/hypervisor.h>
> #include <asm/kvm_para.h>
> #include <asm/irq_remapping.h>
> #include <asm/spec-ctrl.h>
> @@ -122,6 +123,8 @@ bool npt_enabled = true;
> bool npt_enabled;
> #endif
>
> +u32 __read_mostly vmcb_all_clean_mask = VMCB_ALL_CLEAN_MASK;
> +
> /*
> * These 2 parameters are used to config the controls for Pause-Loop Exiting:
> * pause_filter_count: On processors that support Pause filtering(indicated
> @@ -1051,6 +1054,11 @@ static __init int svm_hardware_setup(void)
> */
> allow_smaller_maxphyaddr = !npt_enabled;
>
> +#if IS_ENABLED(CONFIG_HYPERV)
> + if (hypervisor_is_type(X86_HYPER_MS_HYPERV))
> + vmcb_all_clean_mask |= VMCB_HYPERV_CLEAN_MASK;
> +#endif
> +
> return 0;
>
> err:
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 39e071fdab0c..63ed05c8027b 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -33,6 +33,11 @@ static const u32 host_save_user_msrs[] = {
> extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
> extern bool npt_enabled;
>
> +/*
> + * Clean bits in VMCB.
> + * VMCB_ALL_CLEAN_MASK and VMCB_HYPERV_CLEAN_MASK might
> + * also need to be updated if this enum is modified.
> + */
> enum {
> VMCB_INTERCEPTS, /* Intercept vectors, TSC offset,
> pause filter count */
> @@ -50,12 +55,28 @@ enum {
> * AVIC PHYSICAL_TABLE pointer,
> * AVIC LOGICAL_TABLE pointer
> */
> - VMCB_DIRTY_MAX,
> +#if IS_ENABLED(CONFIG_HYPERV)
> + VMCB_HV_NESTED_ENLIGHTENMENTS = 31,
> +#endif
> };
>
> +#define VMCB_ALL_CLEAN_MASK ( \
> + (1U << VMCB_INTERCEPTS) | (1U << VMCB_PERM_MAP) | \
> + (1U << VMCB_ASID) | (1U << VMCB_INTR) | \
> + (1U << VMCB_NPT) | (1U << VMCB_CR) | (1U << VMCB_DR) | \
> + (1U << VMCB_DT) | (1U << VMCB_SEG) | (1U << VMCB_CR2) | \
> + (1U << VMCB_LBR) | (1U << VMCB_AVIC) \
> + )

What if we preserve VMCB_DIRTY_MAX and drop this newly introduced
VMCB_ALL_CLEAN_MASK (which basically lists all the members of the enum
above)? '1 << VMCB_DIRTY_MAX' can still work. (If the 'VMCB_DIRTY_MAX'
name becomes misleading we can e.g. rename it to VMCB_NATIVE_DIRTY_MAX
or something but I'm not sure it's worth it)

> +
> +#if IS_ENABLED(CONFIG_HYPERV)
> +#define VMCB_HYPERV_CLEAN_MASK (1U << VMCB_HV_NESTED_ENLIGHTENMENTS)
> +#endif

VMCB_HYPERV_CLEAN_MASK is a single bit, why do we need it at all
(BIT(VMCB_HV_NESTED_ENLIGHTENMENTS) is not super long)

> +
> /* TPR and CR2 are always written before VMRUN */
> #define VMCB_ALWAYS_DIRTY_MASK ((1U << VMCB_INTR) | (1U << VMCB_CR2))
>
> +extern u32 vmcb_all_clean_mask __read_mostly;
> +
> struct kvm_sev_info {
> bool active; /* SEV enabled guest */
> bool es_active; /* SEV-ES enabled guest */
> @@ -230,7 +251,7 @@ static inline void vmcb_mark_all_dirty(struct vmcb *vmcb)
>
> static inline void vmcb_mark_all_clean(struct vmcb *vmcb)
> {
> - vmcb->control.clean = ((1 << VMCB_DIRTY_MAX) - 1)
> + vmcb->control.clean = vmcb_all_clean_mask
> & ~VMCB_ALWAYS_DIRTY_MASK;
> }
>
> @@ -239,6 +260,11 @@ static inline void vmcb_mark_dirty(struct vmcb *vmcb, int bit)
> vmcb->control.clean &= ~(1 << bit);
> }
>
> +static inline bool vmcb_is_clean(struct vmcb *vmcb, int bit)
> +{
> + return (vmcb->control.clean & (1 << bit));
> +}
> +
> static inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu)
> {
> return container_of(vcpu, struct vcpu_svm, vcpu);

--
Vitaly

2021-04-16 09:07:06

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH v2 5/7] KVM: SVM: hyper-v: Remote TLB flush for SVM

Vineeth Pillai <[email protected]> writes:

> Enable remote TLB flush for SVM.
>
> Signed-off-by: Vineeth Pillai <[email protected]>
> ---
> arch/x86/kvm/svm/svm.c | 37 +++++++++++++++++++++++++++++++++++++
> 1 file changed, 37 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 2ad1f55c88d0..de141d5ae5fb 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -37,6 +37,7 @@
> #include <asm/spec-ctrl.h>
> #include <asm/cpu_device_id.h>
> #include <asm/traps.h>
> +#include <asm/mshyperv.h>
>
> #include <asm/virtext.h>
> #include "trace.h"
> @@ -44,6 +45,8 @@
> #include "svm.h"
> #include "svm_ops.h"
>
> +#include "hyperv.h"
> +
> #define __ex(x) __kvm_handle_fault_on_reboot(x)
>
> MODULE_AUTHOR("Qumranet");
> @@ -931,6 +934,8 @@ static __init void svm_set_cpu_caps(void)
> kvm_cpu_cap_set(X86_FEATURE_VIRT_SSBD);
> }
>
> +static struct kvm_x86_ops svm_x86_ops;
> +
> static __init int svm_hardware_setup(void)
> {
> int cpu;
> @@ -1000,6 +1005,16 @@ static __init int svm_hardware_setup(void)
> kvm_configure_mmu(npt_enabled, get_max_npt_level(), PG_LEVEL_1G);
> pr_info("kvm: Nested Paging %sabled\n", npt_enabled ? "en" : "dis");
>
> +#if IS_ENABLED(CONFIG_HYPERV)
> + if (ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB
> + && npt_enabled) {
> + pr_info("kvm: Hyper-V enlightened NPT TLB flush enabled\n");
> + svm_x86_ops.tlb_remote_flush = kvm_hv_remote_flush_tlb;
> + svm_x86_ops.tlb_remote_flush_with_range =
> + kvm_hv_remote_flush_tlb_with_range;
> + }
> +#endif
> +
> if (nrips) {
> if (!boot_cpu_has(X86_FEATURE_NRIPS))
> nrips = false;
> @@ -1120,6 +1135,21 @@ static void svm_check_invpcid(struct vcpu_svm *svm)
> }
> }
>
> +#if IS_ENABLED(CONFIG_HYPERV)
> +static void hv_init_vmcb(struct vmcb *vmcb)
> +{
> + struct hv_enlightenments *hve = &vmcb->hv_enlightenments;
> +
> + if (npt_enabled &&
> + ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB)

Nitpick: we can probably have a 'static inline' for

"npt_enabled && ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB"

e.g. 'hv_svm_enlightened_tlbflush()'

> + hve->hv_enlightenments_control.enlightened_npt_tlb = 1;
> +}
> +#else
> +static inline void hv_init_vmcb(struct vmcb *vmcb)
> +{
> +}
> +#endif
> +
> static void init_vmcb(struct vcpu_svm *svm)
> {
> struct vmcb_control_area *control = &svm->vmcb->control;
> @@ -1282,6 +1312,8 @@ static void init_vmcb(struct vcpu_svm *svm)
> }
> }
>
> + hv_init_vmcb(svm->vmcb);
> +
> vmcb_mark_all_dirty(svm->vmcb);
>
> enable_gif(svm);
> @@ -3975,6 +4007,11 @@ static void svm_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long root,
> svm->vmcb->control.nested_cr3 = cr3;
> vmcb_mark_dirty(svm->vmcb, VMCB_NPT);
>
> +#if IS_ENABLED(CONFIG_HYPERV)
> + if (kvm_x86_ops.tlb_remote_flush)
> + kvm_update_arch_tdp_pointer(vcpu->kvm, vcpu, cr3);
> +#endif
> +
> /* Loading L2's CR3 is handled by enter_svm_guest_mode. */
> if (!test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail))
> return;

--
Vitaly

2021-04-16 09:46:59

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH v2 1/7] hyperv: Detect Nested virtualization support for SVM

Vineeth Pillai <[email protected]> writes:

> Detect nested features exposed by Hyper-V if SVM is enabled.
>

It may make sense to expand this a bit as it is probably unclear how the
change is related to SVM.

Something like:

HYPERV_CPUID_NESTED_FEATURES CPUID leaf can be present on both Intel and
AMD Hyper-V guests. Previously, the code was using
HV_X64_ENLIGHTENED_VMCS_RECOMMENDED feature bit to determine the
availability of nested features leaf and this complies to TLFS:
"Recommend a nested hypervisor using the enlightened VMCS interface.
Also indicates that additional nested enlightenments may be available
(see leaf 0x4000000A)". Enlightened VMCS, however, is an Intel only
feature so the detection method doesn't work for AMD. Use
HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS.EAX CPUID information ("The
maximum input value for hypervisor CPUID information.") instead, this
works for both AMD and Intel.


> Signed-off-by: Vineeth Pillai <[email protected]>
> ---
> arch/x86/kernel/cpu/mshyperv.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 3546d3e21787..c6f812851e37 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -252,6 +252,7 @@ static void __init hv_smp_prepare_cpus(unsigned int max_cpus)
>
> static void __init ms_hyperv_init_platform(void)
> {
> + int hv_max_functions_eax;
> int hv_host_info_eax;
> int hv_host_info_ebx;
> int hv_host_info_ecx;
> @@ -269,6 +270,8 @@ static void __init ms_hyperv_init_platform(void)
> ms_hyperv.misc_features = cpuid_edx(HYPERV_CPUID_FEATURES);
> ms_hyperv.hints = cpuid_eax(HYPERV_CPUID_ENLIGHTMENT_INFO);
>
> + hv_max_functions_eax = cpuid_eax(HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS);
> +
> pr_info("Hyper-V: privilege flags low 0x%x, high 0x%x, hints 0x%x, misc 0x%x\n",
> ms_hyperv.features, ms_hyperv.priv_high, ms_hyperv.hints,
> ms_hyperv.misc_features);
> @@ -298,8 +301,7 @@ static void __init ms_hyperv_init_platform(void)
> /*
> * Extract host information.
> */
> - if (cpuid_eax(HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS) >=
> - HYPERV_CPUID_VERSION) {
> + if (hv_max_functions_eax >= HYPERV_CPUID_VERSION) {
> hv_host_info_eax = cpuid_eax(HYPERV_CPUID_VERSION);
> hv_host_info_ebx = cpuid_ebx(HYPERV_CPUID_VERSION);
> hv_host_info_ecx = cpuid_ecx(HYPERV_CPUID_VERSION);
> @@ -325,9 +327,11 @@ static void __init ms_hyperv_init_platform(void)
> ms_hyperv.isolation_config_a, ms_hyperv.isolation_config_b);
> }
>
> - if (ms_hyperv.hints & HV_X64_ENLIGHTENED_VMCS_RECOMMENDED) {
> + if (hv_max_functions_eax >= HYPERV_CPUID_NESTED_FEATURES) {
> ms_hyperv.nested_features =
> cpuid_eax(HYPERV_CPUID_NESTED_FEATURES);
> + pr_info("Hyper-V: Nested features: 0x%x\n",
> + ms_hyperv.nested_features);
> }
>
> /*

With the commit message expanded,

Reviewed-by: Vitaly Kuznetsov <[email protected]>

--
Vitaly

2021-04-16 09:51:11

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] KVM: x86: hyper-v: Move the remote TLB flush logic out of vmx

Vineeth Pillai <[email protected]> writes:

> Currently the remote TLB flush logic is specific to VMX.
> Move it to a common place so that SVM can use it as well.
>
> Signed-off-by: Vineeth Pillai <[email protected]>
> ---
> arch/x86/include/asm/kvm_host.h | 14 +++++
> arch/x86/kvm/hyperv.c | 87 +++++++++++++++++++++++++++++
> arch/x86/kvm/hyperv.h | 20 +++++++
> arch/x86/kvm/vmx/vmx.c | 97 +++------------------------------
> arch/x86/kvm/vmx/vmx.h | 10 ----
> arch/x86/kvm/x86.c | 9 ++-
> 6 files changed, 136 insertions(+), 101 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 877a4025d8da..ed84c15d18bc 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -838,6 +838,15 @@ struct kvm_vcpu_arch {
>
> /* Protected Guests */
> bool guest_state_protected;
> +
> +#if IS_ENABLED(CONFIG_HYPERV)
> + /*
> + * Two Dimensional paging CR3
> + * EPTP for Intel
> + * nCR3 for AMD
> + */
> + u64 tdp_pointer;
> +#endif
> };
>
> struct kvm_lpage_info {
> @@ -1079,6 +1088,11 @@ struct kvm_arch {
> */
> spinlock_t tdp_mmu_pages_lock;
> #endif /* CONFIG_X86_64 */
> +
> +#if IS_ENABLED(CONFIG_HYPERV)
> + int tdp_pointers_match;
> + spinlock_t tdp_pointer_lock;
> +#endif
> };
>
> struct kvm_vm_stat {
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 58fa8c029867..614b4448a028 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c

I still think that using arch/x86/kvm/hyperv.[ch] for KVM-on-Hyper-V is
misleading. Currently, these are dedicated to emulating Hyper-V
interface to KVM guests and this is orthogonal to nesting KVM on
Hyper-V. As a solution, I'd suggest you either:
- Put the stuff in x86.c
- Create a dedicated set of files, e.g. 'kvmonhyperv.[ch]' (I also
thought about 'hyperv_host.[ch]' but then I realized it's equally
misleading as one can read this as 'KVM is acting as Hyper-V host').

Personally, I'd vote for the later. Besides eliminating confusion, the
benefit of having dedicated files is that we can avoid compiling them
completely when !IS_ENABLED(CONFIG_HYPERV) (#ifdefs in C are ugly).


> @@ -32,6 +32,7 @@
> #include <linux/eventfd.h>
>
> #include <asm/apicdef.h>
> +#include <asm/mshyperv.h>
> #include <trace/events/kvm.h>
>
> #include "trace.h"
> @@ -2180,3 +2181,89 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>
> return 0;
> }
> +
> +#if IS_ENABLED(CONFIG_HYPERV)
> +/* check_tdp_pointer() should be under protection of tdp_pointer_lock. */
> +static void check_tdp_pointer_match(struct kvm *kvm)
> +{
> + u64 tdp_pointer = INVALID_PAGE;
> + bool valid_tdp = false;
> + struct kvm_vcpu *vcpu;
> + int i;
> +
> + kvm_for_each_vcpu(i, vcpu, kvm) {
> + if (!valid_tdp) {
> + tdp_pointer = vcpu->arch.tdp_pointer;
> + valid_tdp = true;
> + continue;
> + }
> +
> + if (tdp_pointer != vcpu->arch.tdp_pointer) {
> + kvm->arch.tdp_pointers_match = TDP_POINTERS_MISMATCH;
> + return;
> + }
> + }
> +
> + kvm->arch.tdp_pointers_match = TDP_POINTERS_MATCH;
> +}
> +
> +static int kvm_fill_hv_flush_list_func(struct hv_guest_mapping_flush_list *flush,
> + void *data)
> +{
> + struct kvm_tlb_range *range = data;
> +
> + return hyperv_fill_flush_guest_mapping_list(flush, range->start_gfn,
> + range->pages);
> +}
> +
> +static inline int __hv_remote_flush_tlb_with_range(struct kvm *kvm,
> + struct kvm_vcpu *vcpu, struct kvm_tlb_range *range)
> +{
> + u64 tdp_pointer = vcpu->arch.tdp_pointer;
> +
> + /*
> + * FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE hypercall needs address
> + * of the base of EPT PML4 table, strip off EPT configuration
> + * information.
> + */
> + if (range)
> + return hyperv_flush_guest_mapping_range(tdp_pointer & PAGE_MASK,
> + kvm_fill_hv_flush_list_func, (void *)range);
> + else
> + return hyperv_flush_guest_mapping(tdp_pointer & PAGE_MASK);
> +}
> +
> +int kvm_hv_remote_flush_tlb_with_range(struct kvm *kvm,
> + struct kvm_tlb_range *range)
> +{
> + struct kvm_vcpu *vcpu;
> + int ret = 0, i;
> +
> + spin_lock(&kvm->arch.tdp_pointer_lock);
> +
> + if (kvm->arch.tdp_pointers_match == TDP_POINTERS_CHECK)
> + check_tdp_pointer_match(kvm);
> +
> + if (kvm->arch.tdp_pointers_match != TDP_POINTERS_MATCH) {
> + kvm_for_each_vcpu(i, vcpu, kvm) {
> + /* If tdp_pointer is invalid pointer, bypass flush request. */
> + if (VALID_PAGE(vcpu->arch.tdp_pointer))
> + ret |= __hv_remote_flush_tlb_with_range(
> + kvm, vcpu, range);
> + }
> + } else {
> + ret = __hv_remote_flush_tlb_with_range(kvm,
> + kvm_get_vcpu(kvm, 0), range);
> + }
> +
> + spin_unlock(&kvm->arch.tdp_pointer_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(kvm_hv_remote_flush_tlb_with_range);
> +
> +int kvm_hv_remote_flush_tlb(struct kvm *kvm)
> +{
> + return kvm_hv_remote_flush_tlb_with_range(kvm, NULL);
> +}
> +EXPORT_SYMBOL_GPL(kvm_hv_remote_flush_tlb);
> +#endif
> diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
> index e951af1fcb2c..b27c6f47f58d 100644
> --- a/arch/x86/kvm/hyperv.h
> +++ b/arch/x86/kvm/hyperv.h
> @@ -50,6 +50,12 @@
> /* Hyper-V HV_X64_MSR_SYNDBG_OPTIONS bits */
> #define HV_X64_SYNDBG_OPTION_USE_HCALLS BIT(2)
>
> +enum tdp_pointers_status {
> + TDP_POINTERS_CHECK = 0,
> + TDP_POINTERS_MATCH = 1,
> + TDP_POINTERS_MISMATCH = 2
> +};
> +
> static inline struct kvm_hv *to_kvm_hv(struct kvm *kvm)
> {
> return &kvm->arch.hyperv;
> @@ -141,4 +147,18 @@ int kvm_vm_ioctl_hv_eventfd(struct kvm *kvm, struct kvm_hyperv_eventfd *args);
> int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
> struct kvm_cpuid_entry2 __user *entries);
>
> +#if IS_ENABLED(CONFIG_HYPERV)
> +static inline void kvm_update_arch_tdp_pointer(struct kvm *kvm,
> + struct kvm_vcpu *vcpu, u64 tdp_pointer)
> +{
> + spin_lock(&kvm->arch.tdp_pointer_lock);
> + vcpu->arch.tdp_pointer = tdp_pointer;
> + kvm->arch.tdp_pointers_match = TDP_POINTERS_CHECK;
> + spin_unlock(&kvm->arch.tdp_pointer_lock);
> +}
> +
> +int kvm_hv_remote_flush_tlb(struct kvm *kvm);
> +int kvm_hv_remote_flush_tlb_with_range(struct kvm *kvm,
> + struct kvm_tlb_range *range);
> +#endif
> #endif
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 50810d471462..67f607319eb7 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -62,6 +62,7 @@
> #include "vmcs12.h"
> #include "vmx.h"
> #include "x86.h"
> +#include "hyperv.h"
>
> MODULE_AUTHOR("Qumranet");
> MODULE_LICENSE("GPL");
> @@ -472,83 +473,6 @@ static const u32 vmx_uret_msrs_list[] = {
> static bool __read_mostly enlightened_vmcs = true;
> module_param(enlightened_vmcs, bool, 0444);
>
> -/* check_ept_pointer() should be under protection of ept_pointer_lock. */
> -static void check_ept_pointer_match(struct kvm *kvm)
> -{
> - struct kvm_vcpu *vcpu;
> - u64 tmp_eptp = INVALID_PAGE;
> - int i;
> -
> - kvm_for_each_vcpu(i, vcpu, kvm) {
> - if (!VALID_PAGE(tmp_eptp)) {
> - tmp_eptp = to_vmx(vcpu)->ept_pointer;
> - } else if (tmp_eptp != to_vmx(vcpu)->ept_pointer) {
> - to_kvm_vmx(kvm)->ept_pointers_match
> - = EPT_POINTERS_MISMATCH;
> - return;
> - }
> - }
> -
> - to_kvm_vmx(kvm)->ept_pointers_match = EPT_POINTERS_MATCH;
> -}
> -
> -static int kvm_fill_hv_flush_list_func(struct hv_guest_mapping_flush_list *flush,
> - void *data)
> -{
> - struct kvm_tlb_range *range = data;
> -
> - return hyperv_fill_flush_guest_mapping_list(flush, range->start_gfn,
> - range->pages);
> -}
> -
> -static inline int __hv_remote_flush_tlb_with_range(struct kvm *kvm,
> - struct kvm_vcpu *vcpu, struct kvm_tlb_range *range)
> -{
> - u64 ept_pointer = to_vmx(vcpu)->ept_pointer;
> -
> - /*
> - * FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE hypercall needs address
> - * of the base of EPT PML4 table, strip off EPT configuration
> - * information.
> - */
> - if (range)
> - return hyperv_flush_guest_mapping_range(ept_pointer & PAGE_MASK,
> - kvm_fill_hv_flush_list_func, (void *)range);
> - else
> - return hyperv_flush_guest_mapping(ept_pointer & PAGE_MASK);
> -}
> -
> -static int hv_remote_flush_tlb_with_range(struct kvm *kvm,
> - struct kvm_tlb_range *range)
> -{
> - struct kvm_vcpu *vcpu;
> - int ret = 0, i;
> -
> - spin_lock(&to_kvm_vmx(kvm)->ept_pointer_lock);
> -
> - if (to_kvm_vmx(kvm)->ept_pointers_match == EPT_POINTERS_CHECK)
> - check_ept_pointer_match(kvm);
> -
> - if (to_kvm_vmx(kvm)->ept_pointers_match != EPT_POINTERS_MATCH) {
> - kvm_for_each_vcpu(i, vcpu, kvm) {
> - /* If ept_pointer is invalid pointer, bypass flush request. */
> - if (VALID_PAGE(to_vmx(vcpu)->ept_pointer))
> - ret |= __hv_remote_flush_tlb_with_range(
> - kvm, vcpu, range);
> - }
> - } else {
> - ret = __hv_remote_flush_tlb_with_range(kvm,
> - kvm_get_vcpu(kvm, 0), range);
> - }
> -
> - spin_unlock(&to_kvm_vmx(kvm)->ept_pointer_lock);
> - return ret;
> -}
> -static int hv_remote_flush_tlb(struct kvm *kvm)
> -{
> - return hv_remote_flush_tlb_with_range(kvm, NULL);
> -}
> -
> static int hv_enable_direct_tlbflush(struct kvm_vcpu *vcpu)
> {
> struct hv_enlightened_vmcs *evmcs;
> @@ -3115,13 +3039,10 @@ static void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long pgd,
> eptp = construct_eptp(vcpu, pgd, pgd_level);
> vmcs_write64(EPT_POINTER, eptp);
>
> - if (kvm_x86_ops.tlb_remote_flush) {
> - spin_lock(&to_kvm_vmx(kvm)->ept_pointer_lock);
> - to_vmx(vcpu)->ept_pointer = eptp;
> - to_kvm_vmx(kvm)->ept_pointers_match
> - = EPT_POINTERS_CHECK;
> - spin_unlock(&to_kvm_vmx(kvm)->ept_pointer_lock);
> - }
> +#if IS_ENABLED(CONFIG_HYPERV)
> + if (kvm_x86_ops.tlb_remote_flush)
> + kvm_update_arch_tdp_pointer(kvm, vcpu, eptp);
> +#endif
>
> if (!enable_unrestricted_guest && !is_paging(vcpu))
> guest_cr3 = to_kvm_vmx(kvm)->ept_identity_map_addr;
> @@ -6989,8 +6910,6 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
> vmx->pi_desc.nv = POSTED_INTR_VECTOR;
> vmx->pi_desc.sn = 1;
>
> - vmx->ept_pointer = INVALID_PAGE;
> -
> return 0;
>
> free_vmcs:
> @@ -7007,8 +6926,6 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
>
> static int vmx_vm_init(struct kvm *kvm)
> {
> - spin_lock_init(&to_kvm_vmx(kvm)->ept_pointer_lock);
> -
> if (!ple_gap)
> kvm->arch.pause_in_guest = true;
>
> @@ -7818,9 +7735,9 @@ static __init int hardware_setup(void)
> #if IS_ENABLED(CONFIG_HYPERV)
> if (ms_hyperv.nested_features & HV_X64_NESTED_GUEST_MAPPING_FLUSH
> && enable_ept) {
> - vmx_x86_ops.tlb_remote_flush = hv_remote_flush_tlb;
> + vmx_x86_ops.tlb_remote_flush = kvm_hv_remote_flush_tlb;
> vmx_x86_ops.tlb_remote_flush_with_range =
> - hv_remote_flush_tlb_with_range;
> + kvm_hv_remote_flush_tlb_with_range;
> }
> #endif
>
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 89da5e1251f1..d2e2ab46f5bb 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -325,7 +325,6 @@ struct vcpu_vmx {
> */
> u64 msr_ia32_feature_control;
> u64 msr_ia32_feature_control_valid_bits;
> - u64 ept_pointer;
>
> struct pt_desc pt_desc;
> struct lbr_desc lbr_desc;
> @@ -338,21 +337,12 @@ struct vcpu_vmx {
> } shadow_msr_intercept;
> };
>
> -enum ept_pointers_status {
> - EPT_POINTERS_CHECK = 0,
> - EPT_POINTERS_MATCH = 1,
> - EPT_POINTERS_MISMATCH = 2
> -};
> -
> struct kvm_vmx {
> struct kvm kvm;
>
> unsigned int tss_addr;
> bool ept_identity_pagetable_done;
> gpa_t ept_identity_map_addr;
> -
> - enum ept_pointers_status ept_pointers_match;
> - spinlock_t ept_pointer_lock;
> };
>
> bool nested_vmx_allowed(struct kvm_vcpu *vcpu);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 2a20ce60152e..f566e78b59b9 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10115,6 +10115,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> vcpu->arch.pending_external_vector = -1;
> vcpu->arch.preempted_in_kernel = false;
>
> +#if IS_ENABLED(CONFIG_HYPERV)
> + vcpu->arch.tdp_pointer = INVALID_PAGE;
> +#endif
> +
> r = static_call(kvm_x86_vcpu_create)(vcpu);
> if (r)
> goto free_guest_fpu;
> @@ -10470,7 +10474,6 @@ void kvm_arch_free_vm(struct kvm *kvm)
> vfree(kvm);
> }
>
> -

Stray change?

> int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
> {
> if (type)
> @@ -10498,6 +10501,10 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>
> kvm->arch.guest_can_read_msr_platform_info = true;
>
> +#if IS_ENABLED(CONFIG_HYPERV)
> + spin_lock_init(&kvm->arch.tdp_pointer_lock);
> +#endif
> +
> INIT_DELAYED_WORK(&kvm->arch.kvmclock_update_work, kvmclock_update_fn);
> INIT_DELAYED_WORK(&kvm->arch.kvmclock_sync_work, kvmclock_sync_fn);

--
Vitaly

2021-04-16 16:28:18

by Vineeth Pillai

[permalink] [raw]
Subject: Re: [PATCH v2 1/7] hyperv: Detect Nested virtualization support for SVM


> It may make sense to expand this a bit as it is probably unclear how the
> change is related to SVM.
>
> Something like:
>
> HYPERV_CPUID_NESTED_FEATURES CPUID leaf can be present on both Intel and
> AMD Hyper-V guests. Previously, the code was using
> HV_X64_ENLIGHTENED_VMCS_RECOMMENDED feature bit to determine the
> availability of nested features leaf and this complies to TLFS:
> "Recommend a nested hypervisor using the enlightened VMCS interface.
> Also indicates that additional nested enlightenments may be available
> (see leaf 0x4000000A)". Enlightened VMCS, however, is an Intel only
> feature so the detection method doesn't work for AMD. Use
> HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS.EAX CPUID information ("The
> maximum input value for hypervisor CPUID information.") instead, this
> works for both AMD and Intel.
Thanks for the input. Will update the commit message in next revision.

Thanks,
Vineeth

2021-04-16 18:52:59

by Vineeth Pillai

[permalink] [raw]
Subject: Re: [PATCH v2 4/7] KVM: SVM: hyper-v: Nested enlightenments in VMCB


On 4/16/2021 4:58 AM, Vitaly Kuznetsov wrote:
>
>> +
>> +#if IS_ENABLED(CONFIG_HYPERV)
>> +struct __packed hv_enlightenments {
>> + struct __packed hv_enlightenments_control {
>> + u32 nested_flush_hypercall:1;
>> + u32 msr_bitmap:1;
>> + u32 enlightened_npt_tlb: 1;
>> + u32 reserved:29;
>> + } hv_enlightenments_control;
>> + u32 hv_vp_id;
>> + u64 hv_vm_id;
>> + u64 partition_assist_page;
>> + u64 reserved;
>> +};
> Enlightened VMCS seems to have the same part:
>
> struct {
> u32 nested_flush_hypercall:1;
> u32 msr_bitmap:1;
> u32 reserved:30;
> } __packed hv_enlightenments_control;
> u32 hv_vp_id;
> u64 hv_vm_id;
> u64 partition_assist_page;
>
> Would it maybe make sense to unify these two (in case they are the same
> thing in Hyper-V, of course)?
They are very similar but,  the individual bits are a bit different. SVM
struct has an
additional bit 'enlightened_npt_tlb'. There might be future changes as
well if new
enlightenments are designed for performance optimization. So I feel, we
can have
it as separate structs.


>>
>> +#define VMCB_ALL_CLEAN_MASK ( \
>> + (1U << VMCB_INTERCEPTS) | (1U << VMCB_PERM_MAP) | \
>> + (1U << VMCB_ASID) | (1U << VMCB_INTR) | \
>> + (1U << VMCB_NPT) | (1U << VMCB_CR) | (1U << VMCB_DR) | \
>> + (1U << VMCB_DT) | (1U << VMCB_SEG) | (1U << VMCB_CR2) | \
>> + (1U << VMCB_LBR) | (1U << VMCB_AVIC) \
>> + )
> What if we preserve VMCB_DIRTY_MAX and drop this newly introduced
> VMCB_ALL_CLEAN_MASK (which basically lists all the members of the enum
> above)? '1 << VMCB_DIRTY_MAX' can still work. (If the 'VMCB_DIRTY_MAX'
> name becomes misleading we can e.g. rename it to VMCB_NATIVE_DIRTY_MAX
> or something but I'm not sure it's worth it)

I thought of keeping this code because, if we have non-contiguous bits
in future, we
would need this kinda logic anyways. But I get your point. Will revert this.


>
>> +#if IS_ENABLED(CONFIG_HYPERV)
>> +#define VMCB_HYPERV_CLEAN_MASK (1U << VMCB_HV_NESTED_ENLIGHTENMENTS)
>> +#endif
> VMCB_HYPERV_CLEAN_MASK is a single bit, why do we need it at all
> (BIT(VMCB_HV_NESTED_ENLIGHTENMENTS) is not super long)

Agreed. Will change it in next revision.

Thanks,
Vineeth

2021-04-16 18:56:24

by Vineeth Pillai

[permalink] [raw]
Subject: Re: [PATCH v2 5/7] KVM: SVM: hyper-v: Remote TLB flush for SVM


On 4/16/2021 5:04 AM, Vitaly Kuznetsov wrote:
> Vineeth Pillai <[email protected]> writes:
>
>
>
> +#if IS_ENABLED(CONFIG_HYPERV)
> +static void hv_init_vmcb(struct vmcb *vmcb)
> +{
> + struct hv_enlightenments *hve = &vmcb->hv_enlightenments;
> +
> + if (npt_enabled &&
> + ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB)
> Nitpick: we can probably have a 'static inline' for
>
> "npt_enabled && ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB"
>
> e.g. 'hv_svm_enlightened_tlbflush()'
Makes sense, will do.

Thanks,
Vineeth




2021-04-16 20:39:17

by Vineeth Pillai

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] KVM: x86: hyper-v: Move the remote TLB flush logic out of vmx


On 4/16/2021 4:36 AM, Vitaly Kuznetsov wrote:
>
>> struct kvm_vm_stat {
>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>> index 58fa8c029867..614b4448a028 100644
>> --- a/arch/x86/kvm/hyperv.c
>> +++ b/arch/x86/kvm/hyperv.c
> I still think that using arch/x86/kvm/hyperv.[ch] for KVM-on-Hyper-V is
> misleading. Currently, these are dedicated to emulating Hyper-V
> interface to KVM guests and this is orthogonal to nesting KVM on
> Hyper-V. As a solution, I'd suggest you either:
> - Put the stuff in x86.c
> - Create a dedicated set of files, e.g. 'kvmonhyperv.[ch]' (I also
> thought about 'hyperv_host.[ch]' but then I realized it's equally
> misleading as one can read this as 'KVM is acting as Hyper-V host').
>
> Personally, I'd vote for the later. Besides eliminating confusion, the
> benefit of having dedicated files is that we can avoid compiling them
> completely when !IS_ENABLED(CONFIG_HYPERV) (#ifdefs in C are ugly).
Makes sense, creating new set of files looks good to me. The default
hyperv.c
for hyperv emulation also seems misleading - probably we should rename it
to hyperv_host_emul.[ch] or similar. That way, probably I can use
hyperv.[ch]
for kvm on hyperv code. If you feel, thats too big of a churn, I shall use
kvm_on_hyperv.[ch] (to avoid reading the file differently). What do you
think?


>> @@ -10470,7 +10474,6 @@ void kvm_arch_free_vm(struct kvm *kvm)
>> vfree(kvm);
>> }
>>
>> -
> Stray change?
It was kinda leftover, but I thought I'd keep it as it removes and
unnecessary line.

Thanks,
Vineeth

2021-04-20 15:59:19

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] KVM: x86: hyper-v: Move the remote TLB flush logic out of vmx

Vineeth Pillai <[email protected]> writes:

> On 4/16/2021 4:36 AM, Vitaly Kuznetsov wrote:
>>
>>> struct kvm_vm_stat {
>>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>>> index 58fa8c029867..614b4448a028 100644
>>> --- a/arch/x86/kvm/hyperv.c
>>> +++ b/arch/x86/kvm/hyperv.c
>> I still think that using arch/x86/kvm/hyperv.[ch] for KVM-on-Hyper-V is
>> misleading. Currently, these are dedicated to emulating Hyper-V
>> interface to KVM guests and this is orthogonal to nesting KVM on
>> Hyper-V. As a solution, I'd suggest you either:
>> - Put the stuff in x86.c
>> - Create a dedicated set of files, e.g. 'kvmonhyperv.[ch]' (I also
>> thought about 'hyperv_host.[ch]' but then I realized it's equally
>> misleading as one can read this as 'KVM is acting as Hyper-V host').
>>
>> Personally, I'd vote for the later. Besides eliminating confusion, the
>> benefit of having dedicated files is that we can avoid compiling them
>> completely when !IS_ENABLED(CONFIG_HYPERV) (#ifdefs in C are ugly).
> Makes sense, creating new set of files looks good to me. The default
> hyperv.c
> for hyperv emulation also seems misleading - probably we should rename it
> to hyperv_host_emul.[ch] or similar. That way, probably I can use
> hyperv.[ch]
> for kvm on hyperv code. If you feel, thats too big of a churn, I shall use
> kvm_on_hyperv.[ch] (to avoid reading the file differently). What do you
> think?

I agree that 'hyperv.[ch]' is not ideal but I'm on the fence whether
renaming it is worth it. If we were to rename it, I'd suggest just
'hyperv_emul.[ch]' to indicate that here we're emulating Hyper-V.

I don't think reusing 'hyperv.[ch]' for KVM-on-Hyper-V is a good idea,
it would be doubly misleading and not friendly to backporters. Let's not
do that.

>
>
>>> @@ -10470,7 +10474,6 @@ void kvm_arch_free_vm(struct kvm *kvm)
>>> vfree(kvm);
>>> }
>>>
>>> -
>> Stray change?
> It was kinda leftover, but I thought I'd keep it as it removes and
> unnecessary line.

The idea is to have meaninful patches as concise as possible splitting
off cleanup / preparatory patches which don't actually change anything;
this way big series are much easier to review.

>
> Thanks,
> Vineeth
>

--
Vitaly

2021-04-21 10:28:42

by Wei Liu

[permalink] [raw]
Subject: Re: [PATCH v2 2/7] hyperv: SVM enlightened TLB flush support flag

On Thu, Apr 15, 2021 at 01:43:37PM +0000, Vineeth Pillai wrote:
> Bit 22 of HYPERV_CPUID_FEATURES.EDX is specific to SVM and specifies
> support for enlightened TLB flush. With this enlightenment enabled,
> ASID invalidations flushes only gva->hpa entries. To flush TLB entries
> derived from NPT, hypercalls should be used
> (HvFlushGuestPhysicalAddressSpace or HvFlushGuestPhysicalAddressList)
>
> Signed-off-by: Vineeth Pillai <[email protected]>
> ---
> arch/x86/include/asm/hyperv-tlfs.h | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
> index 606f5cc579b2..005bf14d0449 100644
> --- a/arch/x86/include/asm/hyperv-tlfs.h
> +++ b/arch/x86/include/asm/hyperv-tlfs.h
> @@ -133,6 +133,15 @@
> #define HV_X64_NESTED_GUEST_MAPPING_FLUSH BIT(18)
> #define HV_X64_NESTED_MSR_BITMAP BIT(19)
>
> +/*
> + * This is specific to AMD and specifies that enlightened TLB flush is
> + * supported. If guest opts in to this feature, ASID invalidations only
> + * flushes gva -> hpa mapping entries. To flush the TLB entries derived
> + * from NPT, hypercalls should be used (HvFlushGuestPhysicalAddressSpace
> + * or HvFlushGuestPhysicalAddressList).
> + */
> +#define HV_X64_NESTED_ENLIGHTENED_TLB BIT(22)
> +

This is not yet documented in TLFS, right? I can't find this bit in the
latest edition (6.0b).

My first thought is the comment says this is AMD specific but the name
is rather generic. That looks a bit odd to begin with.

Wei.

> /* HYPERV_CPUID_ISOLATION_CONFIG.EAX bits. */
> #define HV_PARAVISOR_PRESENT BIT(0)
>
> --
> 2.25.1
>

2021-04-21 14:00:41

by Vineeth Pillai

[permalink] [raw]
Subject: Re: [PATCH v2 2/7] hyperv: SVM enlightened TLB flush support flag



On 4/21/21 6:00 AM, Wei Liu wrote:
> On Thu, Apr 15, 2021 at 01:43:37PM +0000, Vineeth Pillai wrote:
>>
>> +/*
>> + * This is specific to AMD and specifies that enlightened TLB flush is
>> + * supported. If guest opts in to this feature, ASID invalidations only
>> + * flushes gva -> hpa mapping entries. To flush the TLB entries derived
>> + * from NPT, hypercalls should be used (HvFlushGuestPhysicalAddressSpace
>> + * or HvFlushGuestPhysicalAddressList).
>> + */
>> +#define HV_X64_NESTED_ENLIGHTENED_TLB BIT(22)
>> +
> c
> This is not yet documented in TLFS, right? I can't find this bit in the
> latest edition (6.0b).
This would be documented in the TLFS update which is soon to be
released.

>
> My first thought is the comment says this is AMD specific but the name
> is rather generic. That looks a bit odd to begin with.
I thought of of keeping the name generic to avoid renaming Intel
specific ones also. If I understand correctly, the TLFS would also
be having generic name for this and just translated the generic
name here in this header.

Thanks,
Vineeth

2021-04-21 17:15:27

by Wei Liu

[permalink] [raw]
Subject: Re: [PATCH v2 2/7] hyperv: SVM enlightened TLB flush support flag

On Wed, Apr 21, 2021 at 07:15:54AM -0400, Vineeth Pillai wrote:
>
>
> On 4/21/21 6:00 AM, Wei Liu wrote:
> > On Thu, Apr 15, 2021 at 01:43:37PM +0000, Vineeth Pillai wrote:
> > > +/*
> > > + * This is specific to AMD and specifies that enlightened TLB flush is
> > > + * supported. If guest opts in to this feature, ASID invalidations only
> > > + * flushes gva -> hpa mapping entries. To flush the TLB entries derived
> > > + * from NPT, hypercalls should be used (HvFlushGuestPhysicalAddressSpace
> > > + * or HvFlushGuestPhysicalAddressList).
> > > + */
> > > +#define HV_X64_NESTED_ENLIGHTENED_TLB BIT(22)
> > > +
> > c
> > This is not yet documented in TLFS, right? I can't find this bit in the
> > latest edition (6.0b).
> This would be documented in the TLFS update which is soon to be
> released.

Okay.

>
> >
> > My first thought is the comment says this is AMD specific but the name
> > is rather generic. That looks a bit odd to begin with.
> I thought of of keeping the name generic to avoid renaming Intel
> specific ones also. If I understand correctly, the TLFS would also
> be having generic name for this and just translated the generic
> name here in this header.

Okay. Let's match what is written in TLFS.

Wei.

>
> Thanks,
> Vineeth
>

2021-04-21 18:56:00

by Vineeth Pillai

[permalink] [raw]
Subject: Re: [PATCH v2 5/7] KVM: SVM: hyper-v: Remote TLB flush for SVM


On 4/16/2021 1:26 PM, Vineeth Pillai wrote:
>
>>
>>   +#if IS_ENABLED(CONFIG_HYPERV)
>> +static void hv_init_vmcb(struct vmcb *vmcb)
>> +{
>> +    struct hv_enlightenments *hve = &vmcb->hv_enlightenments;
>> +
>> +    if (npt_enabled &&
>> +        ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB)
>> Nitpick: we can probably have a 'static inline' for
>>
>>   "npt_enabled && ms_hyperv.nested_features &
>> HV_X64_NESTED_ENLIGHTENED_TLB"
>>
>> e.g. 'hv_svm_enlightened_tlbflush()'
> Makes sense, will do.
On a second thought, this function itself is small and just does this
one check.
So, might not make sense to add one more function. I shall rather change
this
function to be an inline.

Thanks,
Vineeth