2021-04-07 21:06:23

by Vineeth Pillai

Subject: [PATCH 0/7] Hyper-V nested virt enlightenments for SVM

This patch series enables the Hyper-V nested virtualization enlightenments
for SVM. It is very similar to the VMX enlightenments, except that there is
no equivalent of the enlightened VMCS: for SVM, the VMCB is already an
in-memory data structure. The supported enlightenments are:

Enlightened TLB Flush: when this is enabled, ASID invalidations invalidate
only gva -> hpa entries. To flush entries derived from the NPT, the
Hyper-V-provided hypercalls (HvFlushGuestPhysicalAddressSpace or
HvFlushGuestPhysicalAddressList) must be used.
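
As a rough sketch (assuming the existing hyperv_flush_guest_mapping()
helper, which issues HvFlushGuestPhysicalAddressSpace; 'ncr3' below is just
an illustrative variable holding the guest's nCR3), flushing the
NPT-derived entries for a guest boils down to:

	/*
	 * Sketch only: ask the L0 hypervisor to flush all TLB entries
	 * derived from this guest's NPT. The nCR3 value identifies the
	 * second-level address space; low configuration bits are stripped.
	 */
	ret = hyperv_flush_guest_mapping(ncr3 & PAGE_MASK);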

Enlightened MSR bitmap (TLFS 16.5.3): "When enabled, L0 hypervisor does
not monitor the MSR bitmaps for changes. Instead, the L1 hypervisor must
invalidate the corresponding clean field after making changes to one of
the MSR bitmaps."
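
In VMCB terms, "invalidate the corresponding clean field" simply means
clearing the new Hyper-V clean bit whenever KVM rewrites the MSR
permission bitmap. A minimal sketch (the actual hook is added in patch 6;
'svm' is the usual vcpu_svm pointer):

	/* After modifying the MSR permission bitmap for this vCPU: */
	vmcb_mark_dirty(svm->vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS);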

Direct Virtual Flush (TLFS 16.8): "The hypervisor exposes hypercalls
(HvFlushVirtualAddressSpace, HvFlushVirtualAddressSpaceEx,
HvFlushVirtualAddressList, and HvFlushVirtualAddressListEx) that allow
operating systems to more efficiently manage the virtual TLB. The L1
hypervisor can choose to allow its guest to use those hypercalls and
delegate the responsibility to handle them to the L0 hypervisor. This
requires the use of a partition assist page."
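
Roughly speaking, the L1 hypervisor (KVM) has to hand L0 a partition
assist page and identify the L2 VM in the Hyper-V fields of the VMCB. A
condensed sketch of the per-VMCB setup done in patch 7 ('hve' points to
the hv_enlightenments area of the VMCB, 'hv_pa_pg' is the allocated
partition assist page):

	hve->partition_assist_page = __pa(hv_pa_pg);
	hve->hv_vm_id = (unsigned long)vcpu->kvm;
	hve->hv_enlightenments_control.nested_flush_hypercall = 1;
	vmcb_mark_dirty(to_svm(vcpu)->vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS);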

L2 Windows boot time was measured with and without the patch series. Time
was measured from power-on to the login screen and averaged over five
consecutive trials:
Without the patch: 42 seconds
With the patch: 29 seconds
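
All of these enlightenments are gated on the feature bits advertised by
the L0 hypervisor in the nested-features CPUID leaf. Condensed from the
patches below, the gating looks roughly like ('hve' again points to the
Hyper-V enlightenments area of the VMCB):

	if (ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB)
		hve->hv_enlightenments_control.enlightened_npt_tlb = 1;
	if (ms_hyperv.nested_features & HV_X64_NESTED_MSR_BITMAP)
		hve->hv_enlightenments_control.msr_bitmap = 1;
	if (ms_hyperv.nested_features & HV_X64_NESTED_DIRECT_FLUSH)
		svm_x86_ops.enable_direct_tlbflush = hv_enable_direct_tlbflush;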
--

Vineeth Pillai (7):
hyperv: Detect Nested virtualization support for SVM
hyperv: SVM enlightened TLB flush support flag
KVM: x86: hyper-v: Move the remote TLB flush logic out of vmx
KVM: SVM: hyper-v: Nested enlightenments in VMCB
KVM: SVM: hyper-v: Remote TLB flush for SVM
KVM: SVM: hyper-v: Enlightened MSR-Bitmap support
KVM: SVM: hyper-v: Direct Virtual Flush support

arch/x86/include/asm/hyperv-tlfs.h | 9 +++
arch/x86/include/asm/kvm_host.h | 15 ++++
arch/x86/include/asm/svm.h | 24 ++++++-
arch/x86/kernel/cpu/mshyperv.c | 10 ++-
arch/x86/kvm/hyperv.c | 89 +++++++++++++++++++++++
arch/x86/kvm/hyperv.h | 12 ++++
arch/x86/kvm/svm/svm.c | 110 +++++++++++++++++++++++++++++
arch/x86/kvm/svm/svm.h | 30 +++++++-
arch/x86/kvm/vmx/vmx.c | 97 ++-----------------------
arch/x86/kvm/vmx/vmx.h | 10 ---
10 files changed, 302 insertions(+), 104 deletions(-)

--
2.25.1


2021-04-07 21:07:48

by Vineeth Pillai

Subject: [PATCH 4/7] KVM: SVM: hyper-v: Nested enlightenments in VMCB

Add Hyper-V specific fields to the VMCB to support SVM enlightenments,
along with a small refactoring of the VMCB clean-bits handling.

Signed-off-by: Vineeth Pillai <[email protected]>
---
arch/x86/include/asm/svm.h | 24 +++++++++++++++++++++++-
arch/x86/kvm/svm/svm.h | 30 ++++++++++++++++++++++++++++--
2 files changed, 51 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 1c561945b426..3586d7523ce8 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -322,9 +322,31 @@ static inline void __unused_size_checks(void)
BUILD_BUG_ON(sizeof(struct ghcb) != EXPECTED_GHCB_SIZE);
}

+
+#if IS_ENABLED(CONFIG_HYPERV)
+struct __packed hv_enlightenments {
+ struct __packed hv_enlightenments_control {
+ u32 nested_flush_hypercall:1;
+ u32 msr_bitmap:1;
+ u32 enlightened_npt_tlb:1;
+ u32 reserved:29;
+ } hv_enlightenments_control;
+ u32 hv_vp_id;
+ u64 hv_vm_id;
+ u64 partition_assist_page;
+ u64 reserved;
+};
+#define VMCB_CONTROL_END 992 // 32 bytes for Hyper-V
+#else
+#define VMCB_CONTROL_END 1024
+#endif
+
struct vmcb {
struct vmcb_control_area control;
- u8 reserved_control[1024 - sizeof(struct vmcb_control_area)];
+ u8 reserved_control[VMCB_CONTROL_END - sizeof(struct vmcb_control_area)];
+#if IS_ENABLED(CONFIG_HYPERV)
+ struct hv_enlightenments hv_enlightenments;
+#endif
struct vmcb_save_area save;
} __packed;

diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 39e071fdab0c..842c8764a68c 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -33,6 +33,11 @@ static const u32 host_save_user_msrs[] = {
extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
extern bool npt_enabled;

+/*
+ * Clean bits in VMCB.
+ * VMCB_ALL_CLEAN_MASK should also be updated
+ * if this enum is modified.
+ */
enum {
VMCB_INTERCEPTS, /* Intercept vectors, TSC offset,
pause filter count */
@@ -50,9 +55,25 @@ enum {
* AVIC PHYSICAL_TABLE pointer,
* AVIC LOGICAL_TABLE pointer
*/
- VMCB_DIRTY_MAX,
+#if IS_ENABLED(CONFIG_HYPERV)
+ VMCB_HV_NESTED_ENLIGHTENMENTS = 31,
+#endif
};

+#define __CLEAN_MASK ( \
+ (1U << VMCB_INTERCEPTS) | (1U << VMCB_PERM_MAP) | \
+ (1U << VMCB_ASID) | (1U << VMCB_INTR) | \
+ (1U << VMCB_NPT) | (1U << VMCB_CR) | (1U << VMCB_DR) | \
+ (1U << VMCB_DT) | (1U << VMCB_SEG) | (1U << VMCB_CR2) | \
+ (1U << VMCB_LBR) | (1U << VMCB_AVIC) \
+ )
+
+#if IS_ENABLED(CONFIG_HYPERV)
+#define VMCB_ALL_CLEAN_MASK (__CLEAN_MASK | (1U << VMCB_HV_NESTED_ENLIGHTENMENTS))
+#else
+#define VMCB_ALL_CLEAN_MASK __CLEAN_MASK
+#endif
+
/* TPR and CR2 are always written before VMRUN */
#define VMCB_ALWAYS_DIRTY_MASK ((1U << VMCB_INTR) | (1U << VMCB_CR2))

@@ -230,7 +251,7 @@ static inline void vmcb_mark_all_dirty(struct vmcb *vmcb)

static inline void vmcb_mark_all_clean(struct vmcb *vmcb)
{
- vmcb->control.clean = ((1 << VMCB_DIRTY_MAX) - 1)
+ vmcb->control.clean = VMCB_ALL_CLEAN_MASK
& ~VMCB_ALWAYS_DIRTY_MASK;
}

@@ -239,6 +260,11 @@ static inline void vmcb_mark_dirty(struct vmcb *vmcb, int bit)
vmcb->control.clean &= ~(1 << bit);
}

+static inline bool vmcb_is_clean(struct vmcb *vmcb, int bit)
+{
+ return (vmcb->control.clean & (1 << bit));
+}
+
static inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu)
{
return container_of(vcpu, struct vcpu_svm, vcpu);
--
2.25.1

2021-04-07 21:07:53

by Vineeth Pillai

Subject: [PATCH 7/7] KVM: SVM: hyper-v: Direct Virtual Flush support

From Hyper-V TLFS:
"The hypervisor exposes hypercalls (HvFlushVirtualAddressSpace,
HvFlushVirtualAddressSpaceEx, HvFlushVirtualAddressList, and
HvFlushVirtualAddressListEx) that allow operating systems to more
efficiently manage the virtual TLB. The L1 hypervisor can choose to
allow its guest to use those hypercalls and delegate the responsibility
to handle them to the L0 hypervisor. This requires the use of a
partition assist page."

Add the Direct Virtual Flush support for SVM.

Related VMX changes:
commit 6f6a657c9998 ("KVM/Hyper-V/VMX: Add direct tlb flush support")

Signed-off-by: Vineeth Pillai <[email protected]>
---
arch/x86/kvm/svm/svm.c | 48 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 48 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 3562a247b7e8..c6d3f3a7c986 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -440,6 +440,32 @@ static void svm_init_osvw(struct kvm_vcpu *vcpu)
vcpu->arch.osvw.status |= 1;
}

+#if IS_ENABLED(CONFIG_HYPERV)
+static int hv_enable_direct_tlbflush(struct kvm_vcpu *vcpu)
+{
+ struct hv_enlightenments *hve;
+ struct hv_partition_assist_pg **p_hv_pa_pg =
+ &to_kvm_hv(vcpu->kvm)->hv_pa_pg;
+
+ if (!*p_hv_pa_pg)
+ *p_hv_pa_pg = kzalloc(PAGE_SIZE, GFP_KERNEL);
+
+ if (!*p_hv_pa_pg)
+ return -ENOMEM;
+
+ hve = (struct hv_enlightenments *)&to_svm(vcpu)->vmcb->hv_enlightenments;
+
+ hve->partition_assist_page = __pa(*p_hv_pa_pg);
+ hve->hv_vm_id = (unsigned long)vcpu->kvm;
+ if (!hve->hv_enlightenments_control.nested_flush_hypercall) {
+ hve->hv_enlightenments_control.nested_flush_hypercall = 1;
+ vmcb_mark_dirty(to_svm(vcpu)->vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS);
+ }
+
+ return 0;
+}
+#endif
+
static int has_svm(void)
{
const char *msg;
@@ -1034,6 +1060,21 @@ static __init int svm_hardware_setup(void)
svm_x86_ops.tlb_remote_flush_with_range =
kvm_hv_remote_flush_tlb_with_range;
}
+
+ if (ms_hyperv.nested_features & HV_X64_NESTED_DIRECT_FLUSH) {
+ pr_info("kvm: Hyper-V Direct TLB Flush enabled\n");
+ for_each_online_cpu(cpu) {
+ struct hv_vp_assist_page *vp_ap =
+ hv_get_vp_assist_page(cpu);
+
+ if (!vp_ap)
+ continue;
+
+ vp_ap->nested_control.features.directhypercall = 1;
+ }
+ svm_x86_ops.enable_direct_tlbflush =
+ hv_enable_direct_tlbflush;
+ }
#endif

if (nrips) {
@@ -3913,6 +3954,13 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
}
svm->vmcb->save.cr2 = vcpu->arch.cr2;

+#if IS_ENABLED(CONFIG_HYPERV)
+ if (svm->vmcb->hv_enlightenments.hv_vp_id != to_hv_vcpu(vcpu)->vp_index) {
+ svm->vmcb->hv_enlightenments.hv_vp_id = to_hv_vcpu(vcpu)->vp_index;
+ vmcb_mark_dirty(svm->vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS);
+ }
+#endif
+
/*
* Run with all-zero DR6 unless needed, so that we can get the exact cause
* of a #DB.
--
2.25.1

2021-04-07 21:08:48

by Vineeth Pillai

Subject: [PATCH 3/7] KVM: x86: hyper-v: Move the remote TLB flush logic out of vmx

Currently the remote TLB flush logic is specific to VMX.
Move it to a common place so that SVM can use it as well.

Signed-off-by: Vineeth Pillai <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 15 +++++
arch/x86/kvm/hyperv.c | 89 ++++++++++++++++++++++++++++++
arch/x86/kvm/hyperv.h | 12 ++++
arch/x86/kvm/vmx/vmx.c | 97 +++------------------------------
arch/x86/kvm/vmx/vmx.h | 10 ----
5 files changed, 123 insertions(+), 100 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 877a4025d8da..336716124b7e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -530,6 +530,12 @@ struct kvm_vcpu_hv {
struct kvm_vcpu_hv_stimer stimer[HV_SYNIC_STIMER_COUNT];
DECLARE_BITMAP(stimer_pending_bitmap, HV_SYNIC_STIMER_COUNT);
cpumask_t tlb_flush;
+ /*
+ * Two Dimensional paging CR3
+ * EPTP for Intel
+ * nCR3 for AMD
+ */
+ u64 tdp_pointer;
};

/* Xen HVM per vcpu emulation context */
@@ -884,6 +890,12 @@ struct kvm_hv_syndbg {
u64 options;
};

+enum tdp_pointers_status {
+ TDP_POINTERS_CHECK = 0,
+ TDP_POINTERS_MATCH = 1,
+ TDP_POINTERS_MISMATCH = 2
+};
+
/* Hyper-V emulation context */
struct kvm_hv {
struct mutex hv_lock;
@@ -908,6 +920,9 @@ struct kvm_hv {

struct hv_partition_assist_pg *hv_pa_pg;
struct kvm_hv_syndbg hv_syndbg;
+
+ enum tdp_pointers_status tdp_pointers_match;
+ spinlock_t tdp_pointer_lock;
};

struct msr_bitmap_range {
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 58fa8c029867..c5bec598bf28 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -32,6 +32,7 @@
#include <linux/eventfd.h>

#include <asm/apicdef.h>
+#include <asm/mshyperv.h>
#include <trace/events/kvm.h>

#include "trace.h"
@@ -913,6 +914,8 @@ static int kvm_hv_vcpu_init(struct kvm_vcpu *vcpu)
for (i = 0; i < ARRAY_SIZE(hv_vcpu->stimer); i++)
stimer_init(&hv_vcpu->stimer[i], i);

+ hv_vcpu->tdp_pointer = INVALID_PAGE;
+
hv_vcpu->vp_index = kvm_vcpu_get_idx(vcpu);

return 0;
@@ -1960,6 +1963,7 @@ void kvm_hv_init_vm(struct kvm *kvm)
{
struct kvm_hv *hv = to_kvm_hv(kvm);

+ spin_lock_init(&hv->tdp_pointer_lock);
mutex_init(&hv->hv_lock);
idr_init(&hv->conn_to_evt);
}
@@ -2180,3 +2184,88 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,

return 0;
}
+
+/* check_tdp_pointer() should be under protection of tdp_pointer_lock. */
+static void check_tdp_pointer_match(struct kvm *kvm)
+{
+ u64 tdp_pointer = INVALID_PAGE;
+ bool valid_tdp = false;
+ struct kvm_vcpu *vcpu;
+ int i;
+
+ kvm_for_each_vcpu(i, vcpu, kvm) {
+ if (!valid_tdp) {
+ tdp_pointer = to_hv_vcpu(vcpu)->tdp_pointer;
+ valid_tdp = true;
+ continue;
+ }
+
+ if (tdp_pointer != to_hv_vcpu(vcpu)->tdp_pointer) {
+ to_kvm_hv(kvm)->tdp_pointers_match
+ = TDP_POINTERS_MISMATCH;
+ return;
+ }
+ }
+
+ to_kvm_hv(kvm)->tdp_pointers_match = TDP_POINTERS_MATCH;
+}
+
+static int kvm_fill_hv_flush_list_func(struct hv_guest_mapping_flush_list *flush,
+ void *data)
+{
+ struct kvm_tlb_range *range = data;
+
+ return hyperv_fill_flush_guest_mapping_list(flush, range->start_gfn,
+ range->pages);
+}
+
+static inline int __hv_remote_flush_tlb_with_range(struct kvm *kvm,
+ struct kvm_vcpu *vcpu, struct kvm_tlb_range *range)
+{
+ u64 tdp_pointer = to_hv_vcpu(vcpu)->tdp_pointer;
+
+ /*
+ * FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE hypercall needs address
+ * of the base of EPT PML4 table, strip off EPT configuration
+ * information.
+ */
+ if (range)
+ return hyperv_flush_guest_mapping_range(tdp_pointer & PAGE_MASK,
+ kvm_fill_hv_flush_list_func, (void *)range);
+ else
+ return hyperv_flush_guest_mapping(tdp_pointer & PAGE_MASK);
+}
+
+int kvm_hv_remote_flush_tlb_with_range(struct kvm *kvm,
+ struct kvm_tlb_range *range)
+{
+ struct kvm_vcpu *vcpu;
+ int ret = 0, i;
+
+ spin_lock(&to_kvm_hv(kvm)->tdp_pointer_lock);
+
+ if (to_kvm_hv(kvm)->tdp_pointers_match == TDP_POINTERS_CHECK)
+ check_tdp_pointer_match(kvm);
+
+ if (to_kvm_hv(kvm)->tdp_pointers_match != TDP_POINTERS_MATCH) {
+ kvm_for_each_vcpu(i, vcpu, kvm) {
+ /* If tdp_pointer is invalid pointer, bypass flush request. */
+ if (VALID_PAGE(to_hv_vcpu(vcpu)->tdp_pointer))
+ ret |= __hv_remote_flush_tlb_with_range(
+ kvm, vcpu, range);
+ }
+ } else {
+ ret = __hv_remote_flush_tlb_with_range(kvm,
+ kvm_get_vcpu(kvm, 0), range);
+ }
+
+ spin_unlock(&to_kvm_hv(kvm)->tdp_pointer_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(kvm_hv_remote_flush_tlb_with_range);
+
+int kvm_hv_remote_flush_tlb(struct kvm *kvm)
+{
+ return kvm_hv_remote_flush_tlb_with_range(kvm, NULL);
+}
+EXPORT_SYMBOL_GPL(kvm_hv_remote_flush_tlb);
diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index e951af1fcb2c..225ede22a815 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -141,4 +141,16 @@ int kvm_vm_ioctl_hv_eventfd(struct kvm *kvm, struct kvm_hyperv_eventfd *args);
int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
struct kvm_cpuid_entry2 __user *entries);

+static inline void kvm_update_arch_tdp_pointer(struct kvm *kvm,
+ struct kvm_vcpu *vcpu, u64 tdp_pointer)
+{
+ spin_lock(&to_kvm_hv(kvm)->tdp_pointer_lock);
+ to_hv_vcpu(vcpu)->tdp_pointer = tdp_pointer;
+ to_kvm_hv(kvm)->tdp_pointers_match = TDP_POINTERS_CHECK;
+ spin_unlock(&to_kvm_hv(kvm)->tdp_pointer_lock);
+}
+
+int kvm_hv_remote_flush_tlb(struct kvm *kvm);
+int kvm_hv_remote_flush_tlb_with_range(struct kvm *kvm,
+ struct kvm_tlb_range *range);
#endif
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 50810d471462..67f607319eb7 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -62,6 +62,7 @@
#include "vmcs12.h"
#include "vmx.h"
#include "x86.h"
+#include "hyperv.h"

MODULE_AUTHOR("Qumranet");
MODULE_LICENSE("GPL");
@@ -472,83 +473,6 @@ static const u32 vmx_uret_msrs_list[] = {
static bool __read_mostly enlightened_vmcs = true;
module_param(enlightened_vmcs, bool, 0444);

-/* check_ept_pointer() should be under protection of ept_pointer_lock. */
-static void check_ept_pointer_match(struct kvm *kvm)
-{
- struct kvm_vcpu *vcpu;
- u64 tmp_eptp = INVALID_PAGE;
- int i;
-
- kvm_for_each_vcpu(i, vcpu, kvm) {
- if (!VALID_PAGE(tmp_eptp)) {
- tmp_eptp = to_vmx(vcpu)->ept_pointer;
- } else if (tmp_eptp != to_vmx(vcpu)->ept_pointer) {
- to_kvm_vmx(kvm)->ept_pointers_match
- = EPT_POINTERS_MISMATCH;
- return;
- }
- }
-
- to_kvm_vmx(kvm)->ept_pointers_match = EPT_POINTERS_MATCH;
-}
-
-static int kvm_fill_hv_flush_list_func(struct hv_guest_mapping_flush_list *flush,
- void *data)
-{
- struct kvm_tlb_range *range = data;
-
- return hyperv_fill_flush_guest_mapping_list(flush, range->start_gfn,
- range->pages);
-}
-
-static inline int __hv_remote_flush_tlb_with_range(struct kvm *kvm,
- struct kvm_vcpu *vcpu, struct kvm_tlb_range *range)
-{
- u64 ept_pointer = to_vmx(vcpu)->ept_pointer;
-
- /*
- * FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE hypercall needs address
- * of the base of EPT PML4 table, strip off EPT configuration
- * information.
- */
- if (range)
- return hyperv_flush_guest_mapping_range(ept_pointer & PAGE_MASK,
- kvm_fill_hv_flush_list_func, (void *)range);
- else
- return hyperv_flush_guest_mapping(ept_pointer & PAGE_MASK);
-}
-
-static int hv_remote_flush_tlb_with_range(struct kvm *kvm,
- struct kvm_tlb_range *range)
-{
- struct kvm_vcpu *vcpu;
- int ret = 0, i;
-
- spin_lock(&to_kvm_vmx(kvm)->ept_pointer_lock);
-
- if (to_kvm_vmx(kvm)->ept_pointers_match == EPT_POINTERS_CHECK)
- check_ept_pointer_match(kvm);
-
- if (to_kvm_vmx(kvm)->ept_pointers_match != EPT_POINTERS_MATCH) {
- kvm_for_each_vcpu(i, vcpu, kvm) {
- /* If ept_pointer is invalid pointer, bypass flush request. */
- if (VALID_PAGE(to_vmx(vcpu)->ept_pointer))
- ret |= __hv_remote_flush_tlb_with_range(
- kvm, vcpu, range);
- }
- } else {
- ret = __hv_remote_flush_tlb_with_range(kvm,
- kvm_get_vcpu(kvm, 0), range);
- }
-
- spin_unlock(&to_kvm_vmx(kvm)->ept_pointer_lock);
- return ret;
-}
-static int hv_remote_flush_tlb(struct kvm *kvm)
-{
- return hv_remote_flush_tlb_with_range(kvm, NULL);
-}
-
static int hv_enable_direct_tlbflush(struct kvm_vcpu *vcpu)
{
struct hv_enlightened_vmcs *evmcs;
@@ -3115,13 +3039,10 @@ static void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long pgd,
eptp = construct_eptp(vcpu, pgd, pgd_level);
vmcs_write64(EPT_POINTER, eptp);

- if (kvm_x86_ops.tlb_remote_flush) {
- spin_lock(&to_kvm_vmx(kvm)->ept_pointer_lock);
- to_vmx(vcpu)->ept_pointer = eptp;
- to_kvm_vmx(kvm)->ept_pointers_match
- = EPT_POINTERS_CHECK;
- spin_unlock(&to_kvm_vmx(kvm)->ept_pointer_lock);
- }
+#if IS_ENABLED(CONFIG_HYPERV)
+ if (kvm_x86_ops.tlb_remote_flush)
+ kvm_update_arch_tdp_pointer(kvm, vcpu, eptp);
+#endif

if (!enable_unrestricted_guest && !is_paging(vcpu))
guest_cr3 = to_kvm_vmx(kvm)->ept_identity_map_addr;
@@ -6989,8 +6910,6 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
vmx->pi_desc.nv = POSTED_INTR_VECTOR;
vmx->pi_desc.sn = 1;

- vmx->ept_pointer = INVALID_PAGE;
-
return 0;

free_vmcs:
@@ -7007,8 +6926,6 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)

static int vmx_vm_init(struct kvm *kvm)
{
- spin_lock_init(&to_kvm_vmx(kvm)->ept_pointer_lock);
-
if (!ple_gap)
kvm->arch.pause_in_guest = true;

@@ -7818,9 +7735,9 @@ static __init int hardware_setup(void)
#if IS_ENABLED(CONFIG_HYPERV)
if (ms_hyperv.nested_features & HV_X64_NESTED_GUEST_MAPPING_FLUSH
&& enable_ept) {
- vmx_x86_ops.tlb_remote_flush = hv_remote_flush_tlb;
+ vmx_x86_ops.tlb_remote_flush = kvm_hv_remote_flush_tlb;
vmx_x86_ops.tlb_remote_flush_with_range =
- hv_remote_flush_tlb_with_range;
+ kvm_hv_remote_flush_tlb_with_range;
}
#endif

diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 89da5e1251f1..d2e2ab46f5bb 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -325,7 +325,6 @@ struct vcpu_vmx {
*/
u64 msr_ia32_feature_control;
u64 msr_ia32_feature_control_valid_bits;
- u64 ept_pointer;

struct pt_desc pt_desc;
struct lbr_desc lbr_desc;
@@ -338,21 +337,12 @@ struct vcpu_vmx {
} shadow_msr_intercept;
};

-enum ept_pointers_status {
- EPT_POINTERS_CHECK = 0,
- EPT_POINTERS_MATCH = 1,
- EPT_POINTERS_MISMATCH = 2
-};
-
struct kvm_vmx {
struct kvm kvm;

unsigned int tss_addr;
bool ept_identity_pagetable_done;
gpa_t ept_identity_map_addr;
-
- enum ept_pointers_status ept_pointers_match;
- spinlock_t ept_pointer_lock;
};

bool nested_vmx_allowed(struct kvm_vcpu *vcpu);
--
2.25.1

2021-04-07 22:37:27

by Vineeth Pillai

Subject: [PATCH 6/7] KVM: SVM: hyper-v: Enlightened MSR-Bitmap support

Enlightened MSR-Bitmap as per TLFS:

"The L1 hypervisor may collaborate with the L0 hypervisor to make MSR
accesses more efficient. It can enable enlightened MSR bitmaps by setting
the corresponding field in the enlightened VMCS to 1. When enabled, L0
hypervisor does not monitor the MSR bitmaps for changes. Instead, the L1
hypervisor must invalidate the corresponding clean field after making
changes to one of the MSR bitmaps."

Enable this for SVM.

Related VMX changes:
commit ceef7d10dfb6 ("KVM: x86: VMX: hyper-v: Enlightened MSR-Bitmap support")

Signed-off-by: Vineeth Pillai <[email protected]>
---
arch/x86/kvm/svm/svm.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 6287cab61f15..3562a247b7e8 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -646,6 +646,27 @@ static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
return !!test_bit(bit_write, &tmp);
}

+#if IS_ENABLED(CONFIG_HYPERV)
+static inline void hv_vmcb_dirty_nested_enlightenments(struct kvm_vcpu *vcpu)
+{
+ struct vmcb *vmcb = to_svm(vcpu)->vmcb;
+
+ /*
+ * vmcb can be NULL if called during early vcpu init.
+ * And its okay not to mark vmcb dirty during vcpu init
+ * as we mark it dirty unconditionally towards end of vcpu
+ * init phase.
+ */
+ if (vmcb && vmcb_is_clean(vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS) &&
+ vmcb->hv_enlightenments.hv_enlightenments_control.msr_bitmap)
+ vmcb_mark_dirty(vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS);
+}
+#else
+static inline void hv_vmcb_dirty_nested_enlightenments(struct kvm_vcpu *vcpu)
+{
+}
+#endif
+
static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,
u32 msr, int read, int write)
{
@@ -677,6 +698,9 @@ static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,
write ? clear_bit(bit_write, &tmp) : set_bit(bit_write, &tmp);

msrpm[offset] = tmp;
+
+ hv_vmcb_dirty_nested_enlightenments(vcpu);
+
}

void set_msr_interception(struct kvm_vcpu *vcpu, u32 *msrpm, u32 msr,
@@ -1135,6 +1159,9 @@ static void hv_init_vmcb(struct vmcb *vmcb)
if (npt_enabled &&
ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB)
hve->hv_enlightenments_control.enlightened_npt_tlb = 1;
+
+ if (ms_hyperv.nested_features & HV_X64_NESTED_MSR_BITMAP)
+ hve->hv_enlightenments_control.msr_bitmap = 1;
}
#else
static inline void hv_init_vmcb(struct vmcb *vmcb)
--
2.25.1

2021-04-08 11:17:29

by Vitaly Kuznetsov

Subject: Re: [PATCH 3/7] KVM: x86: hyper-v: Move the remote TLB flush logic out of vmx

Vineeth Pillai <[email protected]> writes:

> Currently the remote TLB flush logic is specific to VMX.
> Move it to a common place so that SVM can use it as well.
>
> Signed-off-by: Vineeth Pillai <[email protected]>
> ---
> arch/x86/include/asm/kvm_host.h | 15 +++++
> arch/x86/kvm/hyperv.c | 89 ++++++++++++++++++++++++++++++
> arch/x86/kvm/hyperv.h | 12 ++++
> arch/x86/kvm/vmx/vmx.c | 97 +++------------------------------
> arch/x86/kvm/vmx/vmx.h | 10 ----
> 5 files changed, 123 insertions(+), 100 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 877a4025d8da..336716124b7e 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -530,6 +530,12 @@ struct kvm_vcpu_hv {
> struct kvm_vcpu_hv_stimer stimer[HV_SYNIC_STIMER_COUNT];
> DECLARE_BITMAP(stimer_pending_bitmap, HV_SYNIC_STIMER_COUNT);
> cpumask_t tlb_flush;
> + /*
> + * Two Dimensional paging CR3
> + * EPTP for Intel
> + * nCR3 for AMD
> + */
> + u64 tdp_pointer;
> };

'struct kvm_vcpu_hv' is only allocated when we emulate Hyper-V in KVM
(run Windows/Hyper-V guests on top of KVM). Remote TLB flush is used
when we run KVM on Hyper-V and this is a very different beast. Let's not
mix these things together. I understand that some unification is needed
to bring the AMD specific feature but let's do it differently.

E.g. 'ept_pointer' and friends from 'struct kvm_vmx' can just go to
'struct kvm_vcpu_arch' (in case they really need to be unified).

>
> /* Xen HVM per vcpu emulation context */
> @@ -884,6 +890,12 @@ struct kvm_hv_syndbg {
> u64 options;
> };
>
> +enum tdp_pointers_status {
> + TDP_POINTERS_CHECK = 0,
> + TDP_POINTERS_MATCH = 1,
> + TDP_POINTERS_MISMATCH = 2
> +};
> +
> /* Hyper-V emulation context */
> struct kvm_hv {
> struct mutex hv_lock;
> @@ -908,6 +920,9 @@ struct kvm_hv {
>
> struct hv_partition_assist_pg *hv_pa_pg;
> struct kvm_hv_syndbg hv_syndbg;
> +
> + enum tdp_pointers_status tdp_pointers_match;
> + spinlock_t tdp_pointer_lock;
> };
>
> struct msr_bitmap_range {
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 58fa8c029867..c5bec598bf28 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -32,6 +32,7 @@
> #include <linux/eventfd.h>
>
> #include <asm/apicdef.h>
> +#include <asm/mshyperv.h>
> #include <trace/events/kvm.h>
>
> #include "trace.h"
> @@ -913,6 +914,8 @@ static int kvm_hv_vcpu_init(struct kvm_vcpu *vcpu)
> for (i = 0; i < ARRAY_SIZE(hv_vcpu->stimer); i++)
> stimer_init(&hv_vcpu->stimer[i], i);
>
> + hv_vcpu->tdp_pointer = INVALID_PAGE;
> +
> hv_vcpu->vp_index = kvm_vcpu_get_idx(vcpu);
>
> return 0;
> @@ -1960,6 +1963,7 @@ void kvm_hv_init_vm(struct kvm *kvm)
> {
> struct kvm_hv *hv = to_kvm_hv(kvm);
>
> + spin_lock_init(&hv->tdp_pointer_lock);
> mutex_init(&hv->hv_lock);
> idr_init(&hv->conn_to_evt);
> }
> @@ -2180,3 +2184,88 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>
> return 0;
> }
> +
> +/* check_tdp_pointer() should be under protection of tdp_pointer_lock. */
> +static void check_tdp_pointer_match(struct kvm *kvm)
> +{
> + u64 tdp_pointer = INVALID_PAGE;
> + bool valid_tdp = false;
> + struct kvm_vcpu *vcpu;
> + int i;
> +
> + kvm_for_each_vcpu(i, vcpu, kvm) {
> + if (!valid_tdp) {
> + tdp_pointer = to_hv_vcpu(vcpu)->tdp_pointer;
> + valid_tdp = true;
> + continue;
> + }
> +
> + if (tdp_pointer != to_hv_vcpu(vcpu)->tdp_pointer) {
> + to_kvm_hv(kvm)->tdp_pointers_match
> + = TDP_POINTERS_MISMATCH;
> + return;
> + }
> + }
> +
> + to_kvm_hv(kvm)->tdp_pointers_match = TDP_POINTERS_MATCH;
> +}
> +
> +static int kvm_fill_hv_flush_list_func(struct hv_guest_mapping_flush_list *flush,
> + void *data)
> +{
> + struct kvm_tlb_range *range = data;
> +
> + return hyperv_fill_flush_guest_mapping_list(flush, range->start_gfn,
> + range->pages);
> +}
> +
> +static inline int __hv_remote_flush_tlb_with_range(struct kvm *kvm,
> + struct kvm_vcpu *vcpu, struct kvm_tlb_range *range)
> +{
> + u64 tdp_pointer = to_hv_vcpu(vcpu)->tdp_pointer;
> +
> + /*
> + * FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE hypercall needs address
> + * of the base of EPT PML4 table, strip off EPT configuration
> + * information.
> + */
> + if (range)
> + return hyperv_flush_guest_mapping_range(tdp_pointer & PAGE_MASK,
> + kvm_fill_hv_flush_list_func, (void *)range);
> + else
> + return hyperv_flush_guest_mapping(tdp_pointer & PAGE_MASK);
> +}
> +
> +int kvm_hv_remote_flush_tlb_with_range(struct kvm *kvm,
> + struct kvm_tlb_range *range)
> +{
> + struct kvm_vcpu *vcpu;
> + int ret = 0, i;
> +
> + spin_lock(&to_kvm_hv(kvm)->tdp_pointer_lock);
> +
> + if (to_kvm_hv(kvm)->tdp_pointers_match == TDP_POINTERS_CHECK)
> + check_tdp_pointer_match(kvm);
> +
> + if (to_kvm_hv(kvm)->tdp_pointers_match != TDP_POINTERS_MATCH) {
> + kvm_for_each_vcpu(i, vcpu, kvm) {
> + /* If tdp_pointer is invalid pointer, bypass flush request. */
> + if (VALID_PAGE(to_hv_vcpu(vcpu)->tdp_pointer))
> + ret |= __hv_remote_flush_tlb_with_range(
> + kvm, vcpu, range);
> + }
> + } else {
> + ret = __hv_remote_flush_tlb_with_range(kvm,
> + kvm_get_vcpu(kvm, 0), range);
> + }
> +
> + spin_unlock(&to_kvm_hv(kvm)->tdp_pointer_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(kvm_hv_remote_flush_tlb_with_range);
> +
> +int kvm_hv_remote_flush_tlb(struct kvm *kvm)
> +{
> + return kvm_hv_remote_flush_tlb_with_range(kvm, NULL);
> +}
> +EXPORT_SYMBOL_GPL(kvm_hv_remote_flush_tlb);
> diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
> index e951af1fcb2c..225ede22a815 100644
> --- a/arch/x86/kvm/hyperv.h
> +++ b/arch/x86/kvm/hyperv.h
> @@ -141,4 +141,16 @@ int kvm_vm_ioctl_hv_eventfd(struct kvm *kvm, struct kvm_hyperv_eventfd *args);
> int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
> struct kvm_cpuid_entry2 __user *entries);
>
> +static inline void kvm_update_arch_tdp_pointer(struct kvm *kvm,
> + struct kvm_vcpu *vcpu, u64 tdp_pointer)
> +{
> + spin_lock(&to_kvm_hv(kvm)->tdp_pointer_lock);
> + to_hv_vcpu(vcpu)->tdp_pointer = tdp_pointer;
> + to_kvm_hv(kvm)->tdp_pointers_match = TDP_POINTERS_CHECK;
> + spin_unlock(&to_kvm_hv(kvm)->tdp_pointer_lock);
> +}
> +
> +int kvm_hv_remote_flush_tlb(struct kvm *kvm);
> +int kvm_hv_remote_flush_tlb_with_range(struct kvm *kvm,
> + struct kvm_tlb_range *range);
> #endif
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 50810d471462..67f607319eb7 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -62,6 +62,7 @@
> #include "vmcs12.h"
> #include "vmx.h"
> #include "x86.h"
> +#include "hyperv.h"
>
> MODULE_AUTHOR("Qumranet");
> MODULE_LICENSE("GPL");
> @@ -472,83 +473,6 @@ static const u32 vmx_uret_msrs_list[] = {
> static bool __read_mostly enlightened_vmcs = true;
> module_param(enlightened_vmcs, bool, 0444);
>
> -/* check_ept_pointer() should be under protection of ept_pointer_lock. */
> -static void check_ept_pointer_match(struct kvm *kvm)
> -{
> - struct kvm_vcpu *vcpu;
> - u64 tmp_eptp = INVALID_PAGE;
> - int i;
> -
> - kvm_for_each_vcpu(i, vcpu, kvm) {
> - if (!VALID_PAGE(tmp_eptp)) {
> - tmp_eptp = to_vmx(vcpu)->ept_pointer;
> - } else if (tmp_eptp != to_vmx(vcpu)->ept_pointer) {
> - to_kvm_vmx(kvm)->ept_pointers_match
> - = EPT_POINTERS_MISMATCH;
> - return;
> - }
> - }
> -
> - to_kvm_vmx(kvm)->ept_pointers_match = EPT_POINTERS_MATCH;
> -}
> -
> -static int kvm_fill_hv_flush_list_func(struct hv_guest_mapping_flush_list *flush,
> - void *data)
> -{
> - struct kvm_tlb_range *range = data;
> -
> - return hyperv_fill_flush_guest_mapping_list(flush, range->start_gfn,
> - range->pages);
> -}
> -
> -static inline int __hv_remote_flush_tlb_with_range(struct kvm *kvm,
> - struct kvm_vcpu *vcpu, struct kvm_tlb_range *range)
> -{
> - u64 ept_pointer = to_vmx(vcpu)->ept_pointer;
> -
> - /*
> - * FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE hypercall needs address
> - * of the base of EPT PML4 table, strip off EPT configuration
> - * information.
> - */
> - if (range)
> - return hyperv_flush_guest_mapping_range(ept_pointer & PAGE_MASK,
> - kvm_fill_hv_flush_list_func, (void *)range);
> - else
> - return hyperv_flush_guest_mapping(ept_pointer & PAGE_MASK);
> -}
> -
> -static int hv_remote_flush_tlb_with_range(struct kvm *kvm,
> - struct kvm_tlb_range *range)
> -{
> - struct kvm_vcpu *vcpu;
> - int ret = 0, i;
> -
> - spin_lock(&to_kvm_vmx(kvm)->ept_pointer_lock);
> -
> - if (to_kvm_vmx(kvm)->ept_pointers_match == EPT_POINTERS_CHECK)
> - check_ept_pointer_match(kvm);
> -
> - if (to_kvm_vmx(kvm)->ept_pointers_match != EPT_POINTERS_MATCH) {
> - kvm_for_each_vcpu(i, vcpu, kvm) {
> - /* If ept_pointer is invalid pointer, bypass flush request. */
> - if (VALID_PAGE(to_vmx(vcpu)->ept_pointer))
> - ret |= __hv_remote_flush_tlb_with_range(
> - kvm, vcpu, range);
> - }
> - } else {
> - ret = __hv_remote_flush_tlb_with_range(kvm,
> - kvm_get_vcpu(kvm, 0), range);
> - }
> -
> - spin_unlock(&to_kvm_vmx(kvm)->ept_pointer_lock);
> - return ret;
> -}
> -static int hv_remote_flush_tlb(struct kvm *kvm)
> -{
> - return hv_remote_flush_tlb_with_range(kvm, NULL);
> -}
> -
> static int hv_enable_direct_tlbflush(struct kvm_vcpu *vcpu)
> {
> struct hv_enlightened_vmcs *evmcs;
> @@ -3115,13 +3039,10 @@ static void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long pgd,
> eptp = construct_eptp(vcpu, pgd, pgd_level);
> vmcs_write64(EPT_POINTER, eptp);
>
> - if (kvm_x86_ops.tlb_remote_flush) {
> - spin_lock(&to_kvm_vmx(kvm)->ept_pointer_lock);
> - to_vmx(vcpu)->ept_pointer = eptp;
> - to_kvm_vmx(kvm)->ept_pointers_match
> - = EPT_POINTERS_CHECK;
> - spin_unlock(&to_kvm_vmx(kvm)->ept_pointer_lock);
> - }
> +#if IS_ENABLED(CONFIG_HYPERV)
> + if (kvm_x86_ops.tlb_remote_flush)
> + kvm_update_arch_tdp_pointer(kvm, vcpu, eptp);
> +#endif
>
> if (!enable_unrestricted_guest && !is_paging(vcpu))
> guest_cr3 = to_kvm_vmx(kvm)->ept_identity_map_addr;
> @@ -6989,8 +6910,6 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
> vmx->pi_desc.nv = POSTED_INTR_VECTOR;
> vmx->pi_desc.sn = 1;
>
> - vmx->ept_pointer = INVALID_PAGE;
> -
> return 0;
>
> free_vmcs:
> @@ -7007,8 +6926,6 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
>
> static int vmx_vm_init(struct kvm *kvm)
> {
> - spin_lock_init(&to_kvm_vmx(kvm)->ept_pointer_lock);
> -
> if (!ple_gap)
> kvm->arch.pause_in_guest = true;
>
> @@ -7818,9 +7735,9 @@ static __init int hardware_setup(void)
> #if IS_ENABLED(CONFIG_HYPERV)
> if (ms_hyperv.nested_features & HV_X64_NESTED_GUEST_MAPPING_FLUSH
> && enable_ept) {
> - vmx_x86_ops.tlb_remote_flush = hv_remote_flush_tlb;
> + vmx_x86_ops.tlb_remote_flush = kvm_hv_remote_flush_tlb;
> vmx_x86_ops.tlb_remote_flush_with_range =
> - hv_remote_flush_tlb_with_range;
> + kvm_hv_remote_flush_tlb_with_range;
> }
> #endif
>
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 89da5e1251f1..d2e2ab46f5bb 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -325,7 +325,6 @@ struct vcpu_vmx {
> */
> u64 msr_ia32_feature_control;
> u64 msr_ia32_feature_control_valid_bits;
> - u64 ept_pointer;
>
> struct pt_desc pt_desc;
> struct lbr_desc lbr_desc;
> @@ -338,21 +337,12 @@ struct vcpu_vmx {
> } shadow_msr_intercept;
> };
>
> -enum ept_pointers_status {
> - EPT_POINTERS_CHECK = 0,
> - EPT_POINTERS_MATCH = 1,
> - EPT_POINTERS_MISMATCH = 2
> -};
> -
> struct kvm_vmx {
> struct kvm kvm;
>
> unsigned int tss_addr;
> bool ept_identity_pagetable_done;
> gpa_t ept_identity_map_addr;
> -
> - enum ept_pointers_status ept_pointers_match;
> - spinlock_t ept_pointer_lock;
> };
>
> bool nested_vmx_allowed(struct kvm_vcpu *vcpu);

--
Vitaly

2021-04-08 11:23:09

by Vitaly Kuznetsov

Subject: Re: [PATCH 6/7] KVM: SVM: hyper-v: Enlightened MSR-Bitmap support

Vineeth Pillai <[email protected]> writes:

> Enlightened MSR-Bitmap as per TLFS:
>
> "The L1 hypervisor may collaborate with the L0 hypervisor to make MSR
> accesses more efficient. It can enable enlightened MSR bitmaps by setting
> the corresponding field in the enlightened VMCS to 1. When enabled, L0
> hypervisor does not monitor the MSR bitmaps for changes. Instead, the L1
> hypervisor must invalidate the corresponding clean field after making
> changes to one of the MSR bitmaps."
>
> Enable this for SVM.
>
> Related VMX changes:
> commit ceef7d10dfb6 ("KVM: x86: VMX: hyper-v: Enlightened MSR-Bitmap support")
>
> Signed-off-by: Vineeth Pillai <[email protected]>
> ---
> arch/x86/kvm/svm/svm.c | 27 +++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 6287cab61f15..3562a247b7e8 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -646,6 +646,27 @@ static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
> return !!test_bit(bit_write, &tmp);
> }
>
> +#if IS_ENABLED(CONFIG_HYPERV)
> +static inline void hv_vmcb_dirty_nested_enlightenments(struct kvm_vcpu *vcpu)
> +{
> + struct vmcb *vmcb = to_svm(vcpu)->vmcb;
> +
> + /*
> + * vmcb can be NULL if called during early vcpu init.
> + * And its okay not to mark vmcb dirty during vcpu init
> + * as we mark it dirty unconditionally towards end of vcpu
> + * init phase.
> + */
> + if (vmcb && vmcb_is_clean(vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS) &&
> + vmcb->hv_enlightenments.hv_enlightenments_control.msr_bitmap)
> + vmcb_mark_dirty(vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS);

vmcb_is_clean() check seems to be superfluous, vmcb_mark_dirty() does no
harm if the bit was already cleared.

> +}
> +#else
> +static inline void hv_vmcb_dirty_nested_enlightenments(struct kvm_vcpu *vcpu)
> +{
> +}
> +#endif
> +
> static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,
> u32 msr, int read, int write)
> {
> @@ -677,6 +698,9 @@ static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,
> write ? clear_bit(bit_write, &tmp) : set_bit(bit_write, &tmp);
>
> msrpm[offset] = tmp;
> +
> + hv_vmcb_dirty_nested_enlightenments(vcpu);
> +
> }
>
> void set_msr_interception(struct kvm_vcpu *vcpu, u32 *msrpm, u32 msr,
> @@ -1135,6 +1159,9 @@ static void hv_init_vmcb(struct vmcb *vmcb)
> if (npt_enabled &&
> ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB)
> hve->hv_enlightenments_control.enlightened_npt_tlb = 1;
> +
> + if (ms_hyperv.nested_features & HV_X64_NESTED_MSR_BITMAP)
> + hve->hv_enlightenments_control.msr_bitmap = 1;
> }
> #else
> static inline void hv_init_vmcb(struct vmcb *vmcb)

--
Vitaly

2021-04-08 11:27:36

by Vitaly Kuznetsov

Subject: Re: [PATCH 7/7] KVM: SVM: hyper-v: Direct Virtual Flush support

Vineeth Pillai <[email protected]> writes:

> From Hyper-V TLFS:
> "The hypervisor exposes hypercalls (HvFlushVirtualAddressSpace,
> HvFlushVirtualAddressSpaceEx, HvFlushVirtualAddressList, and
> HvFlushVirtualAddressListEx) that allow operating systems to more
> efficiently manage the virtual TLB. The L1 hypervisor can choose to
> allow its guest to use those hypercalls and delegate the responsibility
> to handle them to the L0 hypervisor. This requires the use of a
> partition assist page."
>
> Add the Direct Virtual Flush support for SVM.
>
> Related VMX changes:
> commit 6f6a657c9998 ("KVM/Hyper-V/VMX: Add direct tlb flush support")
>
> Signed-off-by: Vineeth Pillai <[email protected]>
> ---
> arch/x86/kvm/svm/svm.c | 48 ++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 48 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 3562a247b7e8..c6d3f3a7c986 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -440,6 +440,32 @@ static void svm_init_osvw(struct kvm_vcpu *vcpu)
> vcpu->arch.osvw.status |= 1;
> }
>
> +#if IS_ENABLED(CONFIG_HYPERV)
> +static int hv_enable_direct_tlbflush(struct kvm_vcpu *vcpu)
> +{
> + struct hv_enlightenments *hve;
> + struct hv_partition_assist_pg **p_hv_pa_pg =
> + &to_kvm_hv(vcpu->kvm)->hv_pa_pg;
> +
> + if (!*p_hv_pa_pg)
> + *p_hv_pa_pg = kzalloc(PAGE_SIZE, GFP_KERNEL);
> +
> + if (!*p_hv_pa_pg)
> + return -ENOMEM;
> +
> + hve = (struct hv_enlightenments *)&to_svm(vcpu)->vmcb->hv_enlightenments;
> +
> + hve->partition_assist_page = __pa(*p_hv_pa_pg);
> + hve->hv_vm_id = (unsigned long)vcpu->kvm;
> + if (!hve->hv_enlightenments_control.nested_flush_hypercall) {
> + hve->hv_enlightenments_control.nested_flush_hypercall = 1;
> + vmcb_mark_dirty(to_svm(vcpu)->vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS);
> + }
> +
> + return 0;
> +}
> +#endif
> +
> static int has_svm(void)
> {
> const char *msg;
> @@ -1034,6 +1060,21 @@ static __init int svm_hardware_setup(void)
> svm_x86_ops.tlb_remote_flush_with_range =
> kvm_hv_remote_flush_tlb_with_range;
> }
> +
> + if (ms_hyperv.nested_features & HV_X64_NESTED_DIRECT_FLUSH) {
> + pr_info("kvm: Hyper-V Direct TLB Flush enabled\n");
> + for_each_online_cpu(cpu) {
> + struct hv_vp_assist_page *vp_ap =
> + hv_get_vp_assist_page(cpu);
> +
> + if (!vp_ap)
> + continue;
> +
> + vp_ap->nested_control.features.directhypercall = 1;
> + }
> + svm_x86_ops.enable_direct_tlbflush =
> + hv_enable_direct_tlbflush;
> + }
> #endif
>
> if (nrips) {
> @@ -3913,6 +3954,13 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
> }
> svm->vmcb->save.cr2 = vcpu->arch.cr2;
>
> +#if IS_ENABLED(CONFIG_HYPERV)
> + if (svm->vmcb->hv_enlightenments.hv_vp_id != to_hv_vcpu(vcpu)->vp_index) {

This looks wrong (see my previous comment about mixing KVM-on-Hyper-V
and Windows/Hyper-V-on-KVM). 'to_hv_vcpu(vcpu)->vp_index' is
'Windows/Hyper-V-on-KVM' thingy, it does not exist when we run without
any Hyper-V enlightenments exposed (e.g. when we run Linux as our
guest).

> + svm->vmcb->hv_enlightenments.hv_vp_id = to_hv_vcpu(vcpu)->vp_index;
> + vmcb_mark_dirty(svm->vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS);
> + }
> +#endif
> +
> /*
> * Run with all-zero DR6 unless needed, so that we can get the exact cause
> * of a #DB.

--
Vitaly

2021-04-08 13:21:28

by Vineeth Pillai

Subject: Re: [PATCH 3/7] KVM: x86: hyper-v: Move the remote TLB flush logic out of vmx


On 4/8/21 7:14 AM, Vitaly Kuznetsov wrote:
>> + /*
>> + * Two Dimensional paging CR3
>> + * EPTP for Intel
>> + * nCR3 for AMD
>> + */
>> + u64 tdp_pointer;
>> };
> 'struct kvm_vcpu_hv' is only allocated when we emulate Hyper-V in KVM
> (run Windows/Hyper-V guests on top of KVM). Remote TLB flush is used
> when we run KVM on Hyper-V and this is a very different beast. Let's not
> mix these things together. I understand that some unification is needed
> to bring the AMD specific feature but let's do it differently.
>
> E.g. 'ept_pointer' and friends from 'struct kvm_vmx' can just go to
> 'struct kvm_vcpu_arch' (in case they really need to be unified).
Ahh yes, thanks for catching this. Will fix this in the next version.

Thanks,
Vineeth

2021-04-08 15:46:14

by Paolo Bonzini

Subject: Re: [PATCH 4/7] KVM: SVM: hyper-v: Nested enlightenments in VMCB

On 07/04/21 16:41, Vineeth Pillai wrote:
> +#define VMCB_ALL_CLEAN_MASK (__CLEAN_MASK | (1U << VMCB_HV_NESTED_ENLIGHTENMENTS))
> +#else
> +#define VMCB_ALL_CLEAN_MASK __CLEAN_MASK
> +#endif

I think this should depend on whether KVM is running on top of Hyper-V;
not on whether KVM is *compiled* with Hyper-V support.

So you should turn VMCB_ALL_CLEAN_MASK into a __read_mostly variable.

Paolo

> /* TPR and CR2 are always written before VMRUN */
> #define VMCB_ALWAYS_DIRTY_MASK ((1U << VMCB_INTR) | (1U << VMCB_CR2))
>
> @@ -230,7 +251,7 @@ static inline void vmcb_mark_all_dirty(struct vmcb *vmcb)
>
> static inline void vmcb_mark_all_clean(struct vmcb *vmcb)
> {
> - vmcb->control.clean = ((1 << VMCB_DIRTY_MAX) - 1)
> + vmcb->control.clean = VMCB_ALL_CLEAN_MASK
> & ~VMCB_ALWAYS_DIRTY_MASK;
> }

2021-04-08 15:48:07

by Paolo Bonzini

Subject: Re: [PATCH 6/7] KVM: SVM: hyper-v: Enlightened MSR-Bitmap support

On 07/04/21 16:41, Vineeth Pillai wrote:
>
> +#if IS_ENABLED(CONFIG_HYPERV)
> +static inline void hv_vmcb_dirty_nested_enlightenments(struct kvm_vcpu *vcpu)
> +{
> + struct vmcb *vmcb = to_svm(vcpu)->vmcb;
> +
> + /*
> + * vmcb can be NULL if called during early vcpu init.
> + * And its okay not to mark vmcb dirty during vcpu init
> + * as we mark it dirty unconditionally towards end of vcpu
> + * init phase.
> + */
> + if (vmcb && vmcb_is_clean(vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS) &&
> + vmcb->hv_enlightenments.hv_enlightenments_control.msr_bitmap)
> + vmcb_mark_dirty(vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS);
> +}

In addition to what Vitaly said, can svm->vmcb really be NULL? If so, it
might be better to reorder initializations and skip the NULL check.

Paolo

2021-04-09 12:26:53

by Vineeth Pillai

Subject: Re: [PATCH 4/7] KVM: SVM: hyper-v: Nested enlightenments in VMCB


On 4/8/21 11:44 AM, Paolo Bonzini wrote:
> On 07/04/21 16:41, Vineeth Pillai wrote:
>> +#define VMCB_ALL_CLEAN_MASK (__CLEAN_MASK | (1U <<
>> VMCB_HV_NESTED_ENLIGHTENMENTS))
>> +#else
>> +#define VMCB_ALL_CLEAN_MASK __CLEAN_MASK
>> +#endif
>
> I think this should depend on whether KVM is running on top of
> Hyper-V; not on whether KVM is *compiled* with Hyper-V support.
>
> So you should turn VMCB_ALL_CLEAN_MASK into a __read_mostly variable.
Will do.

Thanks,
Vineeth