2018-05-16 15:27:06

by Vitaly Kuznetsov

Subject: [PATCH v4 0/8] KVM: x86: hyperv: PV TLB flush for Windows guests

Changes since v3 [Radim Krcmar]:
- PATCH2 fixing 'HV_GENERIC_SET_SPARCE_4K' typo added.
- PATCH5 introducing kvm_make_vcpus_request_mask() API added.
- Fix undefined behavior for hv->vp_index >= 64.
- Merge kvm_hv_flush_tlb() and kvm_hv_flush_tlb_ex()
- For -ex case preload all banks with a single kvm_read_guest().

Description:

This is both a new feature and a bugfix.

Bugfix description:

It was found that Windows 2016 guests on KVM crash when they have > 64
vCPUs, a non-flat topology (>1 core/thread per socket; with > 64 sockets
Windows simply ignores vCPUs above 64) and Hyper-V enlightenments
(any of them) are enabled. The most common error reported is "PAGE FAULT
IN NONPAGED AREA" but I saw other messages too. Apparently, Windows
doesn't expect to run on a Hyper-V server without PV TLB flush support,
as there are no such Hyper-V servers out there (AFAIR, WS2016 is the
only Windows server version supporting > 64 vCPUs).

Adding PV TLB flush support to KVM helps: Windows 2016 guests now boot
normally (I tried '-smp 128,sockets=64,cores=1,threads=2' and
'-smp 128,sockets=8,cores=16,threads=1', but other topologies should work
too).

Feature description:

PV TLB flush helps a lot when running overcommitted. KVM recently gained
support for it, but it is only available to Linux guests. Windows guests
use the emulated Hyper-V interface, so PV TLB flush needs to be added
there as well.

I tested a WS2016 guest with 128 vCPUs running on a 12 pCPU server. The
test ran 65 threads, each doing 50 mmap()/munmap() cycles of 16384 pages
with a tiny random nanosleep in between (I used Cygwin; it would be great
if someone could point me to a good Windows-native TLB thrashing test).

The average results are:
Before:
real 0m22.464s
user 0m0.990s
sys 1m26.3276s

After:
real 0m19.304s
user 0m0.908s
sys 0m36.249s

When running without overcommit, the results of the same test are very
close, so the feature can be enabled by default.
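
For reference, a minimal sketch of the kind of test used (an editor's
hypothetical POSIX reconstruction, not the exact Cygwin test; the thread,
cycle and page counts come from the description above):

#include <pthread.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <time.h>

#define NTHREADS 65
#define NCYCLES  50
#define NPAGES   16384
#define PAGE_SZ  4096

static void *thrash(void *arg)
{
	size_t len = (size_t)NPAGES * PAGE_SZ;
	struct timespec ts = { 0, 0 };
	size_t off;
	char *p;
	int i;

	for (i = 0; i < NCYCLES; i++) {
		p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (p == MAP_FAILED)
			continue;
		/* touch every page so munmap() has real TLB entries to shoot down */
		for (off = 0; off < len; off += PAGE_SZ)
			p[off] = 1;
		munmap(p, len);
		/* tiny random nanosleep in between */
		ts.tv_nsec = rand() % 100000;
		nanosleep(&ts, NULL);
	}
	return NULL;
}

int main(void)
{
	pthread_t t[NTHREADS];
	int i;

	for (i = 0; i < NTHREADS; i++)
		pthread_create(&t[i], NULL, thrash, NULL);
	for (i = 0; i < NTHREADS; i++)
		pthread_join(t[i], NULL);
	return 0;
}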

Implementation details:

The implementation is very simplistic and straightforward. We ignore the
'address space' argument of the hypercalls (as there is no good way to
figure out what's currently in CR3 of a running vCPU, because generally we
don't VMEXIT on guest CR3 writes) and do a full TLB flush on the specified
vCPUs. In case said vCPUs are not running, the TLB flush will be performed
upon guest entry.

QEMU (and other userspaces) need to enable the CPUID feature bits to make
Windows aware that the feature is supported. I'll post the QEMU enablement
patch separately.
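
(For illustration only: assuming the QEMU enablement lands as an
'hv-tlbflush' cpu flag, the feature would be turned on with something
like '-cpu host,hv_vpindex,hv_tlbflush'; the exact flag name is defined
by the separate QEMU patch.)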

Patches are based on the current kvm/queue branch.

Vitaly Kuznetsov (8):
x86/hyper-v: move struct hv_flush_pcpu{,ex} definitions to common
header
x86/hyperv: fix typo in 'HV_GENERIC_SET_SPARCE_4K' definition
KVM: x86: hyperv: use defines when parsing hypercall parameters
KVM: x86: hyperv: do rep check for each hypercall separately
KVM: introduce kvm_make_vcpus_request_mask() API
KVM: x86: hyperv: simplistic HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,SPACE}
implementation
KVM: x86: hyperv: simplistic
HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,SPACE}_EX implementation
KVM: x86: hyperv: declare KVM_CAP_HYPERV_TLBFLUSH capability

Documentation/virtual/kvm/api.txt | 9 ++
arch/x86/hyperv/mmu.c | 42 +++------
arch/x86/include/asm/hyperv-tlfs.h | 22 ++++-
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/hyperv.c | 171 ++++++++++++++++++++++++++++++++++---
arch/x86/kvm/trace.h | 51 +++++++++++
arch/x86/kvm/x86.c | 1 +
include/linux/kvm_host.h | 3 +
include/uapi/linux/kvm.h | 1 +
virt/kvm/kvm_main.c | 34 ++++++--
10 files changed, 282 insertions(+), 53 deletions(-)

--
2.14.3



2018-05-16 15:22:31

by Vitaly Kuznetsov

Subject: [PATCH v4 8/8] KVM: x86: hyperv: declare KVM_CAP_HYPERV_TLBFLUSH capability

We need a new capability to indicate support for the newly added
HvFlushVirtualAddress{List,Space}{,Ex} hypercalls. Upon seeing this
capability, userspace is supposed to announce PV TLB flush features
by setting the appropriate CPUID bits (if needed).

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
Documentation/virtual/kvm/api.txt | 9 +++++++++
arch/x86/kvm/x86.c | 1 +
include/uapi/linux/kvm.h | 1 +
3 files changed, 11 insertions(+)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 758bf403a169..c563da4244da 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -4603,3 +4603,12 @@ Architectures: s390
This capability indicates that kvm will implement the interfaces to handle
reset, migration and nested KVM for branch prediction blocking. The stfle
facility 82 should not be provided to the guest without this capability.
+
+8.14 KVM_CAP_HYPERV_TLBFLUSH
+
+Architectures: x86
+
+This capability indicates that KVM supports paravirtualized Hyper-V TLB Flush
+hypercalls:
+HvFlushVirtualAddressSpace, HvFlushVirtualAddressSpaceEx,
+HvFlushVirtualAddressList, HvFlushVirtualAddressListEx.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 87e480521550..d523b124656a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2874,6 +2874,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_HYPERV_SYNIC2:
case KVM_CAP_HYPERV_VP_INDEX:
case KVM_CAP_HYPERV_EVENTFD:
+ case KVM_CAP_HYPERV_TLBFLUSH:
case KVM_CAP_PCI_SEGMENT:
case KVM_CAP_DEBUGREGS:
case KVM_CAP_X86_ROBUST_SINGLESTEP:
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index b02c41e53d56..b252ceb3965c 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -948,6 +948,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_S390_BPB 152
#define KVM_CAP_GET_MSR_FEATURES 153
#define KVM_CAP_HYPERV_EVENTFD 154
+#define KVM_CAP_HYPERV_TLBFLUSH 155

#ifdef KVM_CAP_IRQ_ROUTING

--
2.14.3
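
For illustration, a minimal sketch of how userspace could probe for the
new capability before setting the CPUID bits (an editor's hypothetical
example, not part of the patch):

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR);

	if (kvm < 0)
		return 1;

	/* KVM_CHECK_EXTENSION returns a positive value if supported */
	if (ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_HYPERV_TLBFLUSH) > 0)
		printf("PV TLB flush CPUID bits can be announced\n");

	return 0;
}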


2018-05-16 15:22:57

by Vitaly Kuznetsov

Subject: [PATCH v4 4/8] KVM: x86: hyperv: do rep check for each hypercall separately

Prepare to support TLB flush hypercalls, some of which are REP hypercalls:
do the rep check for each hypercall separately. Also, return
HV_STATUS_INVALID_HYPERCALL_INPUT instead of HV_STATUS_INVALID_HYPERCALL_CODE
for malformed rep parameters, as it seems more appropriate.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/hyperv.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index dcfeae2deafa..edb1ac44d628 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1311,7 +1311,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
{
u64 param, ingpa, outgpa, ret = HV_STATUS_SUCCESS;
uint16_t code, rep_idx, rep_cnt;
- bool fast, longmode;
+ bool fast, longmode, rep;

/*
* hypercall generates UD from non zero cpl and real mode
@@ -1344,28 +1344,31 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
fast = !!(param & HV_HYPERCALL_FAST_BIT);
rep_cnt = (param >> HV_HYPERCALL_REP_COMP_OFFSET) & 0xfff;
rep_idx = (param >> HV_HYPERCALL_REP_START_OFFSET) & 0xfff;
+ rep = !!(rep_cnt || rep_idx);

trace_kvm_hv_hypercall(code, fast, rep_cnt, rep_idx, ingpa, outgpa);

- /* Hypercall continuation is not supported yet */
- if (rep_cnt || rep_idx) {
- ret = HV_STATUS_INVALID_HYPERCALL_CODE;
- goto set_result;
- }
-
switch (code) {
case HVCALL_NOTIFY_LONG_SPIN_WAIT:
+ if (unlikely(rep)) {
+ ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
+ break;
+ }
kvm_vcpu_on_spin(vcpu, true);
break;
case HVCALL_SIGNAL_EVENT:
+ if (unlikely(rep)) {
+ ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
+ break;
+ }
ret = kvm_hvcall_signal_event(vcpu, fast, ingpa);
if (ret != HV_STATUS_INVALID_PORT_ID)
break;
/* maybe userspace knows this conn_id: fall through */
case HVCALL_POST_MESSAGE:
/* don't bother userspace if it has no way to handle it */
- if (!vcpu_to_synic(vcpu)->active) {
- ret = HV_STATUS_INVALID_HYPERCALL_CODE;
+ if (unlikely(rep || !vcpu_to_synic(vcpu)->active)) {
+ ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
break;
}
vcpu->run->exit_reason = KVM_EXIT_HYPERV;
--
2.14.3


2018-05-16 15:22:59

by Vitaly Kuznetsov

Subject: [PATCH v4 7/8] KVM: x86: hyperv: simplistic HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,SPACE}_EX implementation

Implement the HvFlushVirtualAddress{List,Space}Ex hypercalls in the same
way we've implemented their non-EX counterparts.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/hyperv.c | 110 ++++++++++++++++++++++++++++++++++++++++++++------
arch/x86/kvm/trace.h | 27 +++++++++++++
2 files changed, 125 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 0d916606519d..b298391f0f93 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1242,31 +1242,102 @@ int kvm_hv_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
return kvm_hv_get_msr(vcpu, msr, pdata);
}

+static __always_inline int get_sparse_bank_no(u64 valid_bank_mask, int bank_no)
+{
+ int i = 0, j;
+
+ if (!(valid_bank_mask & BIT_ULL(bank_no)))
+ return -1;
+
+ for (j = 0; j < bank_no; j++)
+ if (valid_bank_mask & BIT_ULL(j))
+ i++;
+
+ return i;
+}
+
static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
- u16 rep_cnt)
+ u16 rep_cnt, bool ex)
{
struct kvm *kvm = current_vcpu->kvm;
struct kvm_vcpu_hv *hv_current = &current_vcpu->arch.hyperv;
+ struct hv_tlb_flush_ex flush_ex;
struct hv_tlb_flush flush;
struct kvm_vcpu *vcpu;
unsigned long vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)] = {0};
- int i;
+ unsigned long valid_bank_mask;
+ u64 sparse_banks[64];
+ int sparse_banks_len, i;
+ bool all_cpus;

- if (unlikely(kvm_read_guest(kvm, ingpa, &flush, sizeof(flush))))
- return HV_STATUS_INVALID_HYPERCALL_INPUT;
+ if (!ex) {
+ if (unlikely(kvm_read_guest(kvm, ingpa, &flush, sizeof(flush))))
+ return HV_STATUS_INVALID_HYPERCALL_INPUT;

- trace_kvm_hv_flush_tlb(flush.processor_mask, flush.address_space,
- flush.flags);
+ trace_kvm_hv_flush_tlb(flush.processor_mask,
+ flush.address_space, flush.flags);
+
+ sparse_banks[0] = flush.processor_mask;
+ all_cpus = flush.flags & HV_FLUSH_ALL_PROCESSORS;
+ } else {
+ if (unlikely(kvm_read_guest(kvm, ingpa, &flush_ex,
+ sizeof(flush_ex))))
+ return HV_STATUS_INVALID_HYPERCALL_INPUT;
+
+ trace_kvm_hv_flush_tlb_ex(flush_ex.hv_vp_set.valid_bank_mask,
+ flush_ex.hv_vp_set.format,
+ flush_ex.address_space,
+ flush_ex.flags);
+
+ valid_bank_mask = flush_ex.hv_vp_set.valid_bank_mask;
+ all_cpus = flush_ex.hv_vp_set.format !=
+ HV_GENERIC_SET_SPARSE_4K;
+
+ sparse_banks_len = bitmap_weight(&valid_bank_mask, 64) *
+ sizeof(sparse_banks[0]);
+
+ if (!sparse_banks_len && !all_cpus)
+ goto ret_success;
+
+ if (!all_cpus &&
+ kvm_read_guest(kvm,
+ ingpa + offsetof(struct hv_tlb_flush_ex,
+ hv_vp_set.bank_contents),
+ sparse_banks,
+ sparse_banks_len))
+ return HV_STATUS_INVALID_HYPERCALL_INPUT;
+ }

cpumask_clear(&hv_current->tlb_lush);

kvm_for_each_vcpu(i, vcpu, kvm) {
struct kvm_vcpu_hv *hv = &vcpu->arch.hyperv;
+ int bank = hv->vp_index / 64, sbank = 0;
+
+ if (!all_cpus) {
+ /* Banks >= 64 can't be represented */
+ if (bank >= 64)
+ continue;
+
+ /* Non-ex hypercalls can only address first 64 vCPUs */
+ if (!ex && bank)
+ continue;
+
+ if (ex) {
+ /*
+ * Check if the bank of this vCPU is in the sparse
+ * set and get the sparse bank number.
+ */
+ sbank = get_sparse_bank_no(valid_bank_mask,
+ bank);
+
+ if (sbank < 0)
+ continue;
+ }

- if (!(flush.flags & HV_FLUSH_ALL_PROCESSORS) &&
- (hv->vp_index >= 64 ||
- !(flush.processor_mask & BIT_ULL(hv->vp_index))))
- continue;
+ if (!(sparse_banks[sbank] & BIT_ULL(hv->vp_index % 64)))
+ continue;
+ }

/*
* vcpu->arch.cr3 may not be up-to-date for running vCPUs so we
@@ -1280,6 +1351,7 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
KVM_REQ_TLB_FLUSH | KVM_REQUEST_NO_WAKEUP,
vcpu_bitmap, &hv_current->tlb_lush);

+ret_success:
/* We always do full TLB flush, set rep_done = rep_cnt. */
return (u64)HV_STATUS_SUCCESS |
((u64)rep_cnt << HV_HYPERCALL_REP_COMP_OFFSET);
@@ -1427,14 +1499,28 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
break;
}
- ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt);
+ ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt, false);
break;
case HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE:
if (unlikely(fast || rep)) {
ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
break;
}
- ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt);
+ ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt, false);
+ break;
+ case HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX:
+ if (unlikely(fast || !rep_cnt || rep_idx)) {
+ ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
+ break;
+ }
+ ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt, true);
+ break;
+ case HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX:
+ if (unlikely(fast || rep)) {
+ ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
+ break;
+ }
+ ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt, true);
break;
default:
ret = HV_STATUS_INVALID_HYPERCALL_CODE;
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 47a4fd758743..0f997683404f 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -1391,6 +1391,33 @@ TRACE_EVENT(kvm_hv_flush_tlb,
__entry->processor_mask, __entry->address_space,
__entry->flags)
);
+
+/*
+ * Tracepoint for kvm_hv_flush_tlb_ex.
+ */
+TRACE_EVENT(kvm_hv_flush_tlb_ex,
+ TP_PROTO(u64 valid_bank_mask, u64 format, u64 address_space, u64 flags),
+ TP_ARGS(valid_bank_mask, format, address_space, flags),
+
+ TP_STRUCT__entry(
+ __field(u64, valid_bank_mask)
+ __field(u64, format)
+ __field(u64, address_space)
+ __field(u64, flags)
+ ),
+
+ TP_fast_assign(
+ __entry->valid_bank_mask = valid_bank_mask;
+ __entry->format = format;
+ __entry->address_space = address_space;
+ __entry->flags = flags;
+ ),
+
+ TP_printk("valid_bank_mask 0x%llx format 0x%llx "
+ "address_space 0x%llx flags 0x%llx",
+ __entry->valid_bank_mask, __entry->format,
+ __entry->address_space, __entry->flags)
+);
#endif /* _TRACE_KVM_H */

#undef TRACE_INCLUDE_PATH
--
2.14.3
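
To illustrate get_sparse_bank_no() from the patch above, a quick worked
example (editor's note): with valid_bank_mask = 0x0a, banks 1 and 3 are
present and bank_contents[] holds two entries, so

	get_sparse_bank_no(0x0a, 1) ==  0   /* first valid bank      */
	get_sparse_bank_no(0x0a, 3) ==  1   /* second valid bank     */
	get_sparse_bank_no(0x0a, 2) == -1   /* bank 2 not in the set */

For a valid bank this is equivalent to
hweight64(valid_bank_mask & (BIT_ULL(bank_no) - 1)).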


2018-05-16 15:24:01

by Vitaly Kuznetsov

Subject: [PATCH v4 6/8] KVM: x86: hyperv: simplistic HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,SPACE} implementation

Implement HvFlushVirtualAddress{List,Space} hypercalls in a simplistic way:
do a full TLB flush with KVM_REQ_TLB_FLUSH and kick the vCPUs which are
currently IN_GUEST_MODE.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/hyperv.c | 58 ++++++++++++++++++++++++++++++++++++++++-
arch/x86/kvm/trace.h | 24 +++++++++++++++++
3 files changed, 82 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 8cb846162694..a79254cc12de 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -476,6 +476,7 @@ struct kvm_vcpu_hv {
struct kvm_hyperv_exit exit;
struct kvm_vcpu_hv_stimer stimer[HV_SYNIC_STIMER_COUNT];
DECLARE_BITMAP(stimer_pending_bitmap, HV_SYNIC_STIMER_COUNT);
+ cpumask_t tlb_lush;
};

struct kvm_vcpu_arch {
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index edb1ac44d628..0d916606519d 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1242,6 +1242,49 @@ int kvm_hv_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
return kvm_hv_get_msr(vcpu, msr, pdata);
}

+static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
+ u16 rep_cnt)
+{
+ struct kvm *kvm = current_vcpu->kvm;
+ struct kvm_vcpu_hv *hv_current = &current_vcpu->arch.hyperv;
+ struct hv_tlb_flush flush;
+ struct kvm_vcpu *vcpu;
+ unsigned long vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)] = {0};
+ int i;
+
+ if (unlikely(kvm_read_guest(kvm, ingpa, &flush, sizeof(flush))))
+ return HV_STATUS_INVALID_HYPERCALL_INPUT;
+
+ trace_kvm_hv_flush_tlb(flush.processor_mask, flush.address_space,
+ flush.flags);
+
+ cpumask_clear(&hv_current->tlb_lush);
+
+ kvm_for_each_vcpu(i, vcpu, kvm) {
+ struct kvm_vcpu_hv *hv = &vcpu->arch.hyperv;
+
+ if (!(flush.flags & HV_FLUSH_ALL_PROCESSORS) &&
+ (hv->vp_index >= 64 ||
+ !(flush.processor_mask & BIT_ULL(hv->vp_index))))
+ continue;
+
+ /*
+ * vcpu->arch.cr3 may not be up-to-date for running vCPUs so we
+ * can't analyze it here, flush TLB regardless of the specified
+ * address space.
+ */
+ __set_bit(i, vcpu_bitmap);
+ }
+
+ kvm_make_vcpus_request_mask(kvm,
+ KVM_REQ_TLB_FLUSH | KVM_REQUEST_NO_WAKEUP,
+ vcpu_bitmap, &hv_current->tlb_lush);
+
+ /* We always do full TLB flush, set rep_done = rep_cnt. */
+ return (u64)HV_STATUS_SUCCESS |
+ ((u64)rep_cnt << HV_HYPERCALL_REP_COMP_OFFSET);
+}
+
bool kvm_hv_hypercall_enabled(struct kvm *kvm)
{
return READ_ONCE(kvm->arch.hyperv.hv_hypercall) & HV_X64_MSR_HYPERCALL_ENABLE;
@@ -1379,12 +1422,25 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
vcpu->arch.complete_userspace_io =
kvm_hv_hypercall_complete_userspace;
return 0;
+ case HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST:
+ if (unlikely(fast || !rep_cnt || rep_idx)) {
+ ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
+ break;
+ }
+ ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt);
+ break;
+ case HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE:
+ if (unlikely(fast || rep)) {
+ ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
+ break;
+ }
+ ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt);
+ break;
default:
ret = HV_STATUS_INVALID_HYPERCALL_CODE;
break;
}

-set_result:
kvm_hv_hypercall_set_result(vcpu, ret);
return 1;
}
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 9807c314c478..47a4fd758743 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -1367,6 +1367,30 @@ TRACE_EVENT(kvm_hv_timer_state,
__entry->vcpu_id,
__entry->hv_timer_in_use)
);
+
+/*
+ * Tracepoint for kvm_hv_flush_tlb.
+ */
+TRACE_EVENT(kvm_hv_flush_tlb,
+ TP_PROTO(u64 processor_mask, u64 address_space, u64 flags),
+ TP_ARGS(processor_mask, address_space, flags),
+
+ TP_STRUCT__entry(
+ __field(u64, processor_mask)
+ __field(u64, address_space)
+ __field(u64, flags)
+ ),
+
+ TP_fast_assign(
+ __entry->processor_mask = processor_mask;
+ __entry->address_space = address_space;
+ __entry->flags = flags;
+ ),
+
+ TP_printk("processor_mask 0x%llx address_space 0x%llx flags 0x%llx",
+ __entry->processor_mask, __entry->address_space,
+ __entry->flags)
+);
#endif /* _TRACE_KVM_H */

#undef TRACE_INCLUDE_PATH
--
2.14.3
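
For reference, the return value built at the end of kvm_hv_flush_tlb()
follows the TLFS hypercall result layout: bits 15:0 carry the status and
bits 43:32 the number of reps completed. Since we always do a full flush,
a successful HvFlushVirtualAddressList call with rep_cnt = 3 reports all
three reps done:

	u64 ret = (u64)HV_STATUS_SUCCESS |	/* status = 0 */
		  ((u64)3 << HV_HYPERCALL_REP_COMP_OFFSET);
	/* ret == 0x0000000300000000 */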


2018-05-16 15:24:27

by Vitaly Kuznetsov

Subject: [PATCH v4 3/8] KVM: x86: hyperv: use defines when parsing hypercall parameters

Avoid open-coding offsets for hypercall input parameters; we already
have defines for them.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/hyperv.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 5708e951a5c6..dcfeae2deafa 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1341,9 +1341,9 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
#endif

code = param & 0xffff;
- fast = (param >> 16) & 0x1;
- rep_cnt = (param >> 32) & 0xfff;
- rep_idx = (param >> 48) & 0xfff;
+ fast = !!(param & HV_HYPERCALL_FAST_BIT);
+ rep_cnt = (param >> HV_HYPERCALL_REP_COMP_OFFSET) & 0xfff;
+ rep_idx = (param >> HV_HYPERCALL_REP_START_OFFSET) & 0xfff;

trace_kvm_hv_hypercall(code, fast, rep_cnt, rep_idx, ingpa, outgpa);

--
2.14.3
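
For reference, these defines describe the hypercall input value layout
from the Hyper-V TLFS: bits 15:0 are the call code, bit 16 the 'fast'
flag, bits 43:32 the rep count and bits 59:48 the rep start index. A
quick worked example (editor's sketch; HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST
is 0x0003 per the TLFS):

	/* in kvm_hv_hypercall(), after reading 'param': */
	u64 param = 0x0003ULL | (5ULL << HV_HYPERCALL_REP_COMP_OFFSET);

	code    = param & 0xffff;                                   /* 0x0003 */
	fast    = !!(param & HV_HYPERCALL_FAST_BIT);                /* false  */
	rep_cnt = (param >> HV_HYPERCALL_REP_COMP_OFFSET) & 0xfff;  /* 5      */
	rep_idx = (param >> HV_HYPERCALL_REP_START_OFFSET) & 0xfff; /* 0      */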


2018-05-16 15:25:25

by Vitaly Kuznetsov

Subject: [PATCH v4 5/8] KVM: introduce kvm_make_vcpus_request_mask() API

Hyper-V style PV TLB flush hypercalls implementation will use this API.
To avoid memory allocation in the CONFIG_CPUMASK_OFFSTACK case, add a
cpumask_var_t argument.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
include/linux/kvm_host.h | 3 +++
virt/kvm/kvm_main.c | 34 ++++++++++++++++++++++++++--------
2 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 6d6e79c59e68..14e710d639c7 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -730,6 +730,9 @@ void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);

void kvm_flush_remote_tlbs(struct kvm *kvm);
void kvm_reload_remote_mmus(struct kvm *kvm);
+
+bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
+ unsigned long *vcpu_bitmap, cpumask_var_t tmp);
bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req);

long kvm_arch_dev_ioctl(struct file *filp,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c7b2e927f699..b125d94307d2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -203,29 +203,47 @@ static inline bool kvm_kick_many_cpus(const struct cpumask *cpus, bool wait)
return true;
}

-bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req)
+bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
+ unsigned long *vcpu_bitmap, cpumask_var_t tmp)
{
int i, cpu, me;
- cpumask_var_t cpus;
- bool called;
struct kvm_vcpu *vcpu;
-
- zalloc_cpumask_var(&cpus, GFP_ATOMIC);
+ bool called;

me = get_cpu();
+
kvm_for_each_vcpu(i, vcpu, kvm) {
+ if (!test_bit(i, vcpu_bitmap))
+ continue;
+
kvm_make_request(req, vcpu);
cpu = vcpu->cpu;

if (!(req & KVM_REQUEST_NO_WAKEUP) && kvm_vcpu_wake_up(vcpu))
continue;

- if (cpus != NULL && cpu != -1 && cpu != me &&
+ if (tmp != NULL && cpu != -1 && cpu != me &&
kvm_request_needs_ipi(vcpu, req))
- __cpumask_set_cpu(cpu, cpus);
+ __cpumask_set_cpu(cpu, tmp);
}
- called = kvm_kick_many_cpus(cpus, !!(req & KVM_REQUEST_WAIT));
+
+ called = kvm_kick_many_cpus(tmp, !!(req & KVM_REQUEST_WAIT));
put_cpu();
+
+ return called;
+}
+
+bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req)
+{
+ cpumask_var_t cpus;
+ bool called;
+ static unsigned long vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)]
+ = {[0 ... BITS_TO_LONGS(KVM_MAX_VCPUS)-1] = ULONG_MAX};
+
+ zalloc_cpumask_var(&cpus, GFP_ATOMIC);
+
+ called = kvm_make_vcpus_request_mask(kvm, req, vcpu_bitmap, cpus);
+
free_cpumask_var(cpus);
return called;
}
--
2.14.3
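
A minimal usage sketch of the new API (editor's illustration; the Hyper-V
TLB flush patches in this series use it this way, with the scratch
cpumask kept in a long-lived per-vCPU structure):

	unsigned long vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)] = {0};
	cpumask_t scratch;	/* long-lived in the real callers */

	cpumask_clear(&scratch);
	__set_bit(2, vcpu_bitmap);	/* request a TLB flush on vCPU 2 only */
	kvm_make_vcpus_request_mask(kvm,
				    KVM_REQ_TLB_FLUSH | KVM_REQUEST_NO_WAKEUP,
				    vcpu_bitmap, &scratch);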


2018-05-16 15:26:47

by Vitaly Kuznetsov

Subject: [PATCH v4 1/8] x86/hyper-v: move struct hv_flush_pcpu{,ex} definitions to common header

Hyper-V TLB flush hypercalls definitions will be required for KVM, so move
them to hyperv-tlfs.h. The structures also need to be renamed, as the
'_pcpu' suffix is irrelevant for a general-purpose definition.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/hyperv/mmu.c | 40 ++++++++++----------------------------
arch/x86/include/asm/hyperv-tlfs.h | 20 +++++++++++++++++++
2 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c
index 56c9ebac946f..528a1f34df96 100644
--- a/arch/x86/hyperv/mmu.c
+++ b/arch/x86/hyperv/mmu.c
@@ -13,32 +13,12 @@
#define CREATE_TRACE_POINTS
#include <asm/trace/hyperv.h>

-/* HvFlushVirtualAddressSpace, HvFlushVirtualAddressList hypercalls */
-struct hv_flush_pcpu {
- u64 address_space;
- u64 flags;
- u64 processor_mask;
- u64 gva_list[];
-};
-
-/* HvFlushVirtualAddressSpaceEx, HvFlushVirtualAddressListEx hypercalls */
-struct hv_flush_pcpu_ex {
- u64 address_space;
- u64 flags;
- struct {
- u64 format;
- u64 valid_bank_mask;
- u64 bank_contents[];
- } hv_vp_set;
- u64 gva_list[];
-};
-
/* Each gva in gva_list encodes up to 4096 pages to flush */
#define HV_TLB_FLUSH_UNIT (4096 * PAGE_SIZE)

-static struct hv_flush_pcpu __percpu **pcpu_flush;
+static struct hv_tlb_flush * __percpu *pcpu_flush;

-static struct hv_flush_pcpu_ex __percpu **pcpu_flush_ex;
+static struct hv_tlb_flush_ex * __percpu *pcpu_flush_ex;

/*
* Fills in gva_list starting from offset. Returns the number of items added.
@@ -71,7 +51,7 @@ static inline int fill_gva_list(u64 gva_list[], int offset,
}

/* Return the number of banks in the resulting vp_set */
-static inline int cpumask_to_vp_set(struct hv_flush_pcpu_ex *flush,
+static inline int cpumask_to_vp_set(struct hv_tlb_flush_ex *flush,
const struct cpumask *cpus)
{
int cpu, vcpu, vcpu_bank, vcpu_offset, nr_bank = 1;
@@ -81,7 +61,7 @@ static inline int cpumask_to_vp_set(struct hv_flush_pcpu_ex *flush,
return 0;

/*
- * Clear all banks up to the maximum possible bank as hv_flush_pcpu_ex
+ * Clear all banks up to the maximum possible bank as hv_tlb_flush_ex
* structs are not cleared between calls, we risk flushing unneeded
* vCPUs otherwise.
*/
@@ -109,8 +89,8 @@ static void hyperv_flush_tlb_others(const struct cpumask *cpus,
const struct flush_tlb_info *info)
{
int cpu, vcpu, gva_n, max_gvas;
- struct hv_flush_pcpu **flush_pcpu;
- struct hv_flush_pcpu *flush;
+ struct hv_tlb_flush **flush_pcpu;
+ struct hv_tlb_flush *flush;
u64 status = U64_MAX;
unsigned long flags;

@@ -196,8 +176,8 @@ static void hyperv_flush_tlb_others_ex(const struct cpumask *cpus,
const struct flush_tlb_info *info)
{
int nr_bank = 0, max_gvas, gva_n;
- struct hv_flush_pcpu_ex **flush_pcpu;
- struct hv_flush_pcpu_ex *flush;
+ struct hv_tlb_flush_ex **flush_pcpu;
+ struct hv_tlb_flush_ex *flush;
u64 status = U64_MAX;
unsigned long flags;

@@ -303,7 +283,7 @@ void hyper_alloc_mmu(void)
return;

if (!(ms_hyperv.hints & HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED))
- pcpu_flush = alloc_percpu(struct hv_flush_pcpu *);
+ pcpu_flush = alloc_percpu(struct hv_tlb_flush *);
else
- pcpu_flush_ex = alloc_percpu(struct hv_flush_pcpu_ex *);
+ pcpu_flush_ex = alloc_percpu(struct hv_tlb_flush_ex *);
}
diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index a8897615354e..3d4ce3935a62 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -713,4 +713,24 @@ struct hv_enlightened_vmcs {
#define HV_STIMER_AUTOENABLE (1ULL << 3)
#define HV_STIMER_SINT(config) (__u8)(((config) >> 16) & 0x0F)

+/* HvFlushVirtualAddressSpace, HvFlushVirtualAddressList hypercalls */
+struct hv_tlb_flush {
+ u64 address_space;
+ u64 flags;
+ u64 processor_mask;
+ u64 gva_list[];
+};
+
+/* HvFlushVirtualAddressSpaceEx, HvFlushVirtualAddressListEx hypercalls */
+struct hv_tlb_flush_ex {
+ u64 address_space;
+ u64 flags;
+ struct {
+ u64 format;
+ u64 valid_bank_mask;
+ u64 bank_contents[];
+ } hv_vp_set;
+ u64 gva_list[];
+};
+
#endif
--
2.14.3


2018-05-16 15:26:59

by Vitaly Kuznetsov

Subject: [PATCH v4 2/8] x86/hyperv: fix typo in 'HV_GENERIC_SET_SPARCE_4K' definition

It should really be HV_GENERIC_SET_SPARSE_4K.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/hyperv/mmu.c | 2 +-
arch/x86/include/asm/hyperv-tlfs.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c
index 528a1f34df96..6cd3a08bb449 100644
--- a/arch/x86/hyperv/mmu.c
+++ b/arch/x86/hyperv/mmu.c
@@ -219,7 +219,7 @@ static void hyperv_flush_tlb_others_ex(const struct cpumask *cpus,
flush->hv_vp_set.valid_bank_mask = 0;

if (!cpumask_equal(cpus, cpu_present_mask)) {
- flush->hv_vp_set.format = HV_GENERIC_SET_SPARCE_4K;
+ flush->hv_vp_set.format = HV_GENERIC_SET_SPARSE_4K;
nr_bank = cpumask_to_vp_set(flush, cpus);
}

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 3d4ce3935a62..c69891e721b6 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -363,7 +363,7 @@ struct hv_tsc_emulation_status {
#define HV_FLUSH_USE_EXTENDED_RANGE_FORMAT BIT(3)

enum HV_GENERIC_SET_FORMAT {
- HV_GENERIC_SET_SPARCE_4K,
+ HV_GENERIC_SET_SPARSE_4K,
HV_GENERIC_SET_ALL,
};

--
2.14.3


2018-05-16 15:54:06

by KY Srinivasan

Subject: RE: [PATCH v4 1/8] x86/hyper-v: move struct hv_flush_pcpu{,ex} definitions to common header



> -----Original Message-----
> From: Vitaly Kuznetsov <[email protected]>
> Sent: Wednesday, May 16, 2018 8:21 AM
> To: [email protected]
> Cc: [email protected]; Paolo Bonzini <[email protected]>; Radim Krčmář
> <[email protected]>; Roman Kagan <[email protected]>; KY
> Srinivasan <[email protected]>; Haiyang Zhang
> <[email protected]>; Stephen Hemminger
> <[email protected]>; Michael Kelley (EOSG)
> <[email protected]>; Mohammed Gamal
> <[email protected]>; Cathy Avery <[email protected]>; linux-
> [email protected]
> Subject: [PATCH v4 1/8] x86/hyper-v: move struct hv_flush_pcpu{,ex}
> definitions to common header
>
> Hyper-V TLB flush hypercalls definitions will be required for KVM, so move
> them to hyperv-tlfs.h. The structures also need to be renamed, as the
> '_pcpu' suffix is irrelevant for a general-purpose definition.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> arch/x86/hyperv/mmu.c | 40 ++++++++++----------------------------
> arch/x86/include/asm/hyperv-tlfs.h | 20 +++++++++++++++++++
> 2 files changed, 30 insertions(+), 30 deletions(-)

Vitaly,

We should coordinate on this; I have patches in flight that conflict with
the changes here.

Regards,

K. Y


2018-05-17 07:07:37

by Vitaly Kuznetsov

Subject: Re: [PATCH v4 1/8] x86/hyper-v: move struct hv_flush_pcpu{,ex} definitions to common header

KY Srinivasan <[email protected]> writes:

>> -----Original Message-----
>> From: Vitaly Kuznetsov <[email protected]>
>> Sent: Wednesday, May 16, 2018 8:21 AM
>> To: [email protected]
>> Cc: [email protected]; Paolo Bonzini <[email protected]>; Radim Krčmář
>> <[email protected]>; Roman Kagan <[email protected]>; KY
>> Srinivasan <[email protected]>; Haiyang Zhang
>> <[email protected]>; Stephen Hemminger
>> <[email protected]>; Michael Kelley (EOSG)
>> <[email protected]>; Mohammed Gamal
>> <[email protected]>; Cathy Avery <[email protected]>; linux-
>> [email protected]
>> Subject: [PATCH v4 1/8] x86/hyper-v: move struct hv_flush_pcpu{,ex}
>> definitions to common header
>>
>> Hyper-V TLB flush hypercalls definitions will be required for KVM, so move
>> them to hyperv-tlfs.h. The structures also need to be renamed, as the
>> '_pcpu' suffix is irrelevant for a general-purpose definition.
>>
>> Signed-off-by: Vitaly Kuznetsov <[email protected]>
>> ---
>> arch/x86/hyperv/mmu.c | 40 ++++++++++----------------------------
>> arch/x86/include/asm/hyperv-tlfs.h | 20 +++++++++++++++++++
>> 2 files changed, 30 insertions(+), 30 deletions(-)
>
> Vitaly,
>
> We should coordinate on this; I have patches in flight that conflict with
> the changes here.

I see you also fixed the 'HV_GENERIC_SET_SPARCE_4K' typo. I don't think
it's a big deal, as we fix it in the same way :-)

I see you also altered the hv_flush_pcpu_ex definition but kept it in
mmu.c. I think my patches should be applied on top of yours; I only need
to move it to the hyperv-tlfs.h header and get rid of the _pcpu suffix.

I hope Thomas will merge your patches very soon. In case Paolo/Radim
decide that my series is ready too we'll ask their guidance on how to
resolve the conflict (topic branch?). It should be relatively easy.

Thanks,

--
Vitaly

2018-05-17 19:35:36

by KY Srinivasan

Subject: RE: [PATCH v4 1/8] x86/hyper-v: move struct hv_flush_pcpu{,ex} definitions to common header



> -----Original Message-----
> From: Vitaly Kuznetsov <[email protected]>
> Sent: Thursday, May 17, 2018 12:06 AM
> To: KY Srinivasan <[email protected]>
> Cc: [email protected]; [email protected]; Paolo Bonzini
> <[email protected]>; Radim Krčmář <[email protected]>; Roman
> Kagan <[email protected]>; Haiyang Zhang <[email protected]>;
> Stephen Hemminger <[email protected]>; Michael Kelley (EOSG)
> <[email protected]>; Mohammed Gamal
> <[email protected]>; Cathy Avery <[email protected]>; linux-
> [email protected]
> Subject: Re: [PATCH v4 1/8] x86/hyper-v: move struct hv_flush_pcpu{,ex}
> definitions to common header
>
> KY Srinivasan <[email protected]> writes:
>
> >> -----Original Message-----
> >> From: Vitaly Kuznetsov <[email protected]>
> >> Sent: Wednesday, May 16, 2018 8:21 AM
> >> To: [email protected]
> >> Cc: [email protected]; Paolo Bonzini <[email protected]>; Radim
> >> Krčmář
> >> <[email protected]>; Roman Kagan <[email protected]>; KY
> >> Srinivasan <[email protected]>; Haiyang Zhang
> >> <[email protected]>; Stephen Hemminger
> >> <[email protected]>; Michael Kelley (EOSG)
> >> <[email protected]>; Mohammed Gamal
> >> <[email protected]>; Cathy Avery <[email protected]>; linux-
> >> [email protected]
> >> Subject: [PATCH v4 1/8] x86/hyper-v: move struct hv_flush_pcpu{,ex}
> >> definitions to common header
> >>
> >> Hyper-V TLB flush hypercalls definitions will be required for KVM, so move
> >> them to hyperv-tlfs.h. The structures also need to be renamed, as the
> >> '_pcpu' suffix is irrelevant for a general-purpose definition.
> >>
> >> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> >> ---
> >> arch/x86/hyperv/mmu.c | 40 ++++++++++----------------------------
> >> arch/x86/include/asm/hyperv-tlfs.h | 20 +++++++++++++++++++
> >> 2 files changed, 30 insertions(+), 30 deletions(-)
> >
> > Vitaly,
> >
> > We should coordinate on this; I have patches in flight that conflict with
> > the changes here.
>
> I see you also fixed the 'HV_GENERIC_SET_SPARCE_4K' typo. I don't think
> it's a big deal, as we fix it in the same way :-)
>
> I see you also altered the hv_flush_pcpu_ex definition but kept it in
> mmu.c. I think my patches should be applied on top of yours; I only need
> to move it to the hyperv-tlfs.h header and get rid of the _pcpu suffix.
>
> I hope Thomas will merge your patches very soon. In case Paolo/Radim
> decide that my series is ready too we'll ask their guidance on how to
> resolve the conflict (topic branch?). It should be relatively easy.

I agree; waiting for Thomas to merge.

K. Y
>
> Thanks,
>
> --
> Vitaly

2018-05-18 07:00:52

by Wanpeng Li

Subject: Re: [PATCH v4 0/8] KVM: x86: hyperv: PV TLB flush for Windows guests

Hi Vitaly,
2018-05-16 23:21 GMT+08:00 Vitaly Kuznetsov <[email protected]>:
> Changes since v3 [Radim Krcmar]:
> - PATCH2 fixing 'HV_GENERIC_SET_SPARCE_4K' typo added.
> - PATCH5 introducing kvm_make_vcpus_request_mask() API added.
> - Fix undefined behavior for hv->vp_index >= 64.
> - Merge kvm_hv_flush_tlb() and kvm_hv_flush_tlb_ex()
> - For -ex case preload all banks with a single kvm_read_guest().
>
> Description:
>
> This is both a new feature and a bugfix.
>
> Bugfix description:
>
> It was found that Windows 2016 guests on KVM crash when they have > 64
> vCPUs, a non-flat topology (>1 core/thread per socket; with > 64 sockets
> Windows simply ignores vCPUs above 64) and Hyper-V enlightenments

We tried the command line below; the Windows 2016 guest logs in
successfully and all 80 vCPUs can be observed in the guest w/o the
patchset. Why do you mention the crash and the ignored vCPUs?

/usr/local/bin/qemu-system-x86_64 -machine pc-i440fx-rhel7.3.0 -m
8192 -smp 80,sockets=2,cores=40,threads=1 -device
ide-drive,bus=ide.0,drive=test -drive
id=test,if=none,file=/instanceimage/359b18ab-05bb-460d-9b53-89505bca68ed/359b18ab-05bb-460d-9b53-89505bca68ed_vda_1.qcow2
-net nic,model=virtio -net user -monitor stdio -usb -usbdevice tablet
--enable-kvm --cpu host -vnc 0.0.0.0:2

Regards,
Wanpeng Li

2018-05-18 11:00:49

by Vitaly Kuznetsov

Subject: Re: [PATCH v4 0/8] KVM: x86: hyperv: PV TLB flush for Windows guests

Wanpeng Li <[email protected]> writes:

> Hi Vitaly,
> 2018-05-16 23:21 GMT+08:00 Vitaly Kuznetsov <[email protected]>:
>> Changes since v3 [Radim Krcmar]:
>> - PATCH2 fixing 'HV_GENERIC_SET_SPARCE_4K' typo added.
>> - PATCH5 introducing kvm_make_vcpus_request_mask() API added.
>> - Fix undefined behavior for hv->vp_index >= 64.
>> - Merge kvm_hv_flush_tlb() and kvm_hv_flush_tlb_ex()
>> - For -ex case preload all banks with a single kvm_read_guest().
>>
>> Description:
>>
>> This is both a new feature and a bugfix.
>>
>> Bugfix description:
>>
>> It was found that Windows 2016 guests on KVM crash when they have > 64
>> vCPUs, a non-flat topology (>1 core/thread per socket; with > 64 sockets
>> Windows simply ignores vCPUs above 64) and Hyper-V enlightenments
>
> We tried the command line below; the Windows 2016 guest logs in
> successfully and all 80 vCPUs can be observed in the guest w/o the
> patchset. Why do you mention the crash and the ignored vCPUs?
>
> /usr/local/bin/qemu-system-x86_64 -machine pc-i440fx-rhel7.3.0 -m
> 8192 -smp 80,sockets=2,cores=40,threads=1 -device
> ide-drive,bus=ide.0,drive=test -drive
> id=test,if=none,file=/instanceimage/359b18ab-05bb-460d-9b53-89505bca68ed/359b18ab-05bb-460d-9b53-89505bca68ed_vda_1.qcow2
> -net nic,model=virtio -net user -monitor stdio -usb -usbdevice tablet
> --enable-kvm --cpu host -vnc 0.0.0.0:2

The crash happens when you manifest yourself as Hyper-V; you can do this
by adding any 'hv-*' feature (e.g. try '-cpu host,hv_vpindex').

--
Vitaly

2018-05-18 11:20:08

by Vitaly Kuznetsov

Subject: Re: [PATCH v4 0/8] KVM: x86: hyperv: PV TLB flush for Windows guests

Vitaly Kuznetsov <[email protected]> writes:

> Wanpeng Li <[email protected]> writes:
>
>> Hi Vitaly,
>> 2018-05-16 23:21 GMT+08:00 Vitaly Kuznetsov <[email protected]>:
>>> Changes since v3 [Radim Krcmar]:
>>> - PATCH2 fixing 'HV_GENERIC_SET_SPARCE_4K' typo added.
>>> - PATCH5 introducing kvm_make_vcpus_request_mask() API added.
>>> - Fix undefined behavior for hv->vp_index >= 64.
>>> - Merge kvm_hv_flush_tlb() and kvm_hv_flush_tlb_ex()
>>> - For -ex case preload all banks with a single kvm_read_guest().
>>>
>>> Description:
>>>
>>> This is both a new feature and a bugfix.
>>>
>>> Bugfix description:
>>>
>>> It was found that Windows 2016 guests on KVM crash when they have > 64
>>> vCPUs, a non-flat topology (>1 core/thread per socket; with > 64 sockets
>>> Windows simply ignores vCPUs above 64) and Hyper-V enlightenments
>>
>> We tried the command line below; the Windows 2016 guest logs in
>> successfully and all 80 vCPUs can be observed in the guest w/o the
>> patchset. Why do you mention the crash and the ignored vCPUs?
>>
>> /usr/local/bin/qemu-system-x86_64 -machine pc-i440fx-rhel7.3.0 -m
>> 8192 -smp 80,sockets=2,cores=40,threads=1 -device
>> ide-drive,bus=ide.0,drive=test -drive
>> id=test,if=none,file=/instanceimage/359b18ab-05bb-460d-9b53-89505bca68ed/359b18ab-05bb-460d-9b53-89505bca68ed_vda_1.qcow2
>> -net nic,model=virtio -net user -monitor stdio -usb -usbdevice tablet
>> --enable-kvm --cpu host -vnc 0.0.0.0:2
>
> The crash happens when you manifest yourself as Hyper-V; you can do this
> by adding any 'hv-*' feature (e.g. try '-cpu host,hv_vpindex').

Oh, and the 'ignore' happens when you pass more than 64 sockets
(something like "-smp 128,sockets=128,cores=1,threads=1") -- and this
happens regardless of Hyper-V enlightenments. But I guess it's just
because Windows doesn't support more than 64 sockets.

--
Vitaly

2018-05-18 11:56:51

by Wanpeng Li

Subject: Re: [PATCH v4 0/8] KVM: x86: hyperv: PV TLB flush for Windows guests

2018-05-18 19:19 GMT+08:00 Vitaly Kuznetsov <[email protected]>:
> Vitaly Kuznetsov <[email protected]> writes:
>
>> Wanpeng Li <[email protected]> writes:
>>
>>> Hi Vitaly,
>>> 2018-05-16 23:21 GMT+08:00 Vitaly Kuznetsov <[email protected]>:
>>>> Changes since v3 [Radim Krcmar]:
>>>> - PATCH2 fixing 'HV_GENERIC_SET_SPARCE_4K' typo added.
>>>> - PATCH5 introducing kvm_make_vcpus_request_mask() API added.
>>>> - Fix undefined behavior for hv->vp_index >= 64.
>>>> - Merge kvm_hv_flush_tlb() and kvm_hv_flush_tlb_ex()
>>>> - For -ex case preload all banks with a single kvm_read_guest().
>>>>
>>>> Description:
>>>>
>>>> This is both a new feature and a bugfix.
>>>>
>>>> Bugfix description:
>>>>
>>>> It was found that Windows 2016 guests on KVM crash when they have > 64
>>>> vCPUs, a non-flat topology (>1 core/thread per socket; with > 64 sockets
>>>> Windows simply ignores vCPUs above 64) and Hyper-V enlightenments
>>>
>>> We tried the command line below; the Windows 2016 guest logs in
>>> successfully and all 80 vCPUs can be observed in the guest w/o the
>>> patchset. Why do you mention the crash and the ignored vCPUs?
>>>
>>> /usr/local/bin/qemu-system-x86_64 -machine pc-i440fx-rhel7.3.0 -m
>>> 8192 -smp 80,sockets=2,cores=40,threads=1 -device
>>> ide-drive,bus=ide.0,drive=test -drive
>>> id=test,if=none,file=/instanceimage/359b18ab-05bb-460d-9b53-89505bca68ed/359b18ab-05bb-460d-9b53-89505bca68ed_vda_1.qcow2
>>> -net nic,model=virtio -net user -monitor stdio -usb -usbdevice tablet
>>> --enable-kvm --cpu host -vnc 0.0.0.0:2
>>
>> The crash happens when you manifest yourself as Hyper-V; you can do this
>> by adding any 'hv-*' feature (e.g. try '-cpu host,hv_vpindex').
>
> Oh, and the 'ignore' happens when you pass more than 64 sockets
> (something like "-smp 128,sockets=128,cores=1,threads=1") -- and this
> happens regardless of Hyper-V enlightenments. But I guess it's just
> because Windows doesn't support more than 64 sockets.

Is there an option in the guest to avoid checking pvtlb support in hyperv?

Regards,
Wanpeng Li

2018-05-18 12:43:15

by Vitaly Kuznetsov

Subject: Re: [PATCH v4 0/8] KVM: x86: hyperv: PV TLB flush for Windows guests

Wanpeng Li <[email protected]> writes:

> 2018-05-18 19:19 GMT+08:00 Vitaly Kuznetsov <[email protected]>:
>> Vitaly Kuznetsov <[email protected]> writes:
>>
>>> Wanpeng Li <[email protected]> writes:
>>>
>>>> Hi Vitaly,
>>>> 2018-05-16 23:21 GMT+08:00 Vitaly Kuznetsov <[email protected]>:
>>>>> Changes since v3 [Radim Krcmar]:
>>>>> - PATCH2 fixing 'HV_GENERIC_SET_SPARCE_4K' typo added.
>>>>> - PATCH5 introducing kvm_make_vcpus_request_mask() API added.
>>>>> - Fix undefined behavior for hv->vp_index >= 64.
>>>>> - Merge kvm_hv_flush_tlb() and kvm_hv_flush_tlb_ex()
>>>>> - For -ex case preload all banks with a single kvm_read_guest().
>>>>>
>>>>> Description:
>>>>>
>>>>> This is both a new feature and a bugfix.
>>>>>
>>>>> Bugfix description:
>>>>>
>>>>> It was found that Windows 2016 guests on KVM crash when they have > 64
>>>>> vCPUs, a non-flat topology (>1 core/thread per socket; with > 64 sockets
>>>>> Windows simply ignores vCPUs above 64) and Hyper-V enlightenments
>>>>
>>>> We tried the command line below; the Windows 2016 guest logs in
>>>> successfully and all 80 vCPUs can be observed in the guest w/o the
>>>> patchset. Why do you mention the crash and the ignored vCPUs?
>>>>
>>>> /usr/local/bin/qemu-system-x86_64 -machine pc-i440fx-rhel7.3.0 -m
>>>> 8192 -smp 80,sockets=2,cores=40,threads=1 -device
>>>> ide-drive,bus=ide.0,drive=test -drive
>>>> id=test,if=none,file=/instanceimage/359b18ab-05bb-460d-9b53-89505bca68ed/359b18ab-05bb-460d-9b53-89505bca68ed_vda_1.qcow2
>>>> -net nic,model=virtio -net user -monitor stdio -usb -usbdevice tablet
>>>> --enable-kvm --cpu host -vnc 0.0.0.0:2
>>>
>>> The crash happens when you manifest yourself as Hyper-V; you can do this
>>> by adding any 'hv-*' feature (e.g. try '-cpu host,hv_vpindex').
>>
>> Oh, and the 'ignore' happens when you pass more than 64 sockets
>> (something like "-smp 128,sockets=128,cores=1,threads=1") -- and this
>> happens regardless of Hyper-V enlightenments. But I guess it's just
>> because Windows doesn't support more than 64 sockets.
>
> Is there an option in the guest to avoid checking pvtlb support in hyperv?
>

You mean to tell Windows to not use PV TLB flush when it's available? I
have no idea. My guess would be that it's left up to the hypervisor: if
the feature is available Windows will use it.

--
Vitaly

2018-05-18 12:57:39

by Wanpeng Li

Subject: Re: [PATCH v4 0/8] KVM: x86: hyperv: PV TLB flush for Windows guests

2018-05-18 20:42 GMT+08:00 Vitaly Kuznetsov <[email protected]>:
> Wanpeng Li <[email protected]> writes:
>
>> 2018-05-18 19:19 GMT+08:00 Vitaly Kuznetsov <[email protected]>:
>>> Vitaly Kuznetsov <[email protected]> writes:
>>>
>>>> Wanpeng Li <[email protected]> writes:
>>>>
>>>>> Hi Vitaly,
>>>>> 2018-05-16 23:21 GMT+08:00 Vitaly Kuznetsov <[email protected]>:
>>>>>> Changes since v3 [Radim Krcmar]:
>>>>>> - PATCH2 fixing 'HV_GENERIC_SET_SPARCE_4K' typo added.
>>>>>> - PATCH5 introducing kvm_make_vcpus_request_mask() API added.
>>>>>> - Fix undefined behavior for hv->vp_index >= 64.
>>>>>> - Merge kvm_hv_flush_tlb() and kvm_hv_flush_tlb_ex()
>>>>>> - For -ex case preload all banks with a single kvm_read_guest().
>>>>>>
>>>>>> Description:
>>>>>>
>>>>>> This is both a new feature and a bugfix.
>>>>>>
>>>>>> Bugfix description:
>>>>>>
>>>>>> It was found that Windows 2016 guests on KVM crash when they have > 64
>>>>>> vCPUs, a non-flat topology (>1 core/thread per socket; with > 64 sockets
>>>>>> Windows simply ignores vCPUs above 64) and Hyper-V enlightenments
>>>>>
>>>>> We tried the command line below; the Windows 2016 guest logs in
>>>>> successfully and all 80 vCPUs can be observed in the guest w/o the
>>>>> patchset. Why do you mention the crash and the ignored vCPUs?
>>>>>
>>>>> /usr/local/bin/qemu-system-x86_64 -machine pc-i440fx-rhel7.3.0 -m
>>>>> 8192 -smp 80,sockets=2,cores=40,threads=1 -device
>>>>> ide-drive,bus=ide.0,drive=test -drive
>>>>> id=test,if=none,file=/instanceimage/359b18ab-05bb-460d-9b53-89505bca68ed/359b18ab-05bb-460d-9b53-89505bca68ed_vda_1.qcow2
>>>>> -net nic,model=virtio -net user -monitor stdio -usb -usbdevice tablet
>>>>> --enable-kvm --cpu host -vnc 0.0.0.0:2
>>>>
>>>> The crash happens when you manifest yourself as Hyper-V; you can do this
>>>> by adding any 'hv-*' feature (e.g. try '-cpu host,hv_vpindex').
>>>
>>> Oh, and the 'ignore' happens when you pass more than 64 sockets
>>> (something like "-smp 128,sockets=128,cores=1,threads=1") -- and this
>>> happens regardless of Hyper-V enlightenments. But I guess it's just
>>> because Windows doesn't support more than 64 sockets.
>>
>> Is there an option in the guest to avoid checking pvtlb support in hyperv?
>>
>
> You mean to tell Windows to not use PV TLB flush when it's available? I
> have no idea. My guess would be that it's left up to the hypervisor: if
> the feature is available Windows will use it.

I mean a way to work around the Windows guest crash, since there is no
PV TLB flush enabled in production environments currently.

Regards,
Wanpeng Li

2018-05-18 13:19:48

by Vitaly Kuznetsov

Subject: Re: [PATCH v4 0/8] KVM: x86: hyperv: PV TLB flush for Windows guests

Wanpeng Li <[email protected]> writes:

> 2018-05-18 20:42 GMT+08:00 Vitaly Kuznetsov <[email protected]>:
>> Wanpeng Li <[email protected]> writes:
>>
>>> 2018-05-18 19:19 GMT+08:00 Vitaly Kuznetsov <[email protected]>:
>>>> Vitaly Kuznetsov <[email protected]> writes:
>>>>
>>>>> Wanpeng Li <[email protected]> writes:
>>>>>
>>>>>> Hi Vitaly,
>>>>>> 2018-05-16 23:21 GMT+08:00 Vitaly Kuznetsov <[email protected]>:
>>>>>>> Changes since v3 [Radim Krcmar]:
>>>>>>> - PATCH2 fixing 'HV_GENERIC_SET_SPARCE_4K' typo added.
>>>>>>> - PATCH5 introducing kvm_make_vcpus_request_mask() API added.
>>>>>>> - Fix undefined behavior for hv->vp_index >= 64.
>>>>>>> - Merge kvm_hv_flush_tlb() and kvm_hv_flush_tlb_ex()
>>>>>>> - For -ex case preload all banks with a single kvm_read_guest().
>>>>>>>
>>>>>>> Description:
>>>>>>>
>>>>>>> This is both a new feature and a bugfix.
>>>>>>>
>>>>>>> Bugfix description:
>>>>>>>
>>>>>>> It was found that Windows 2016 guests on KVM crash when they have > 64
>>>>>>> vCPUs, a non-flat topology (>1 core/thread per socket; with > 64 sockets
>>>>>>> Windows simply ignores vCPUs above 64) and Hyper-V enlightenments
>>>>>>
>>>>>> We tried the command line below; the Windows 2016 guest logs in
>>>>>> successfully and all 80 vCPUs can be observed in the guest w/o the
>>>>>> patchset. Why do you mention the crash and the ignored vCPUs?
>>>>>>
>>>>>> /usr/local/bin/qemu-system-x86_64 -machine pc-i440fx-rhel7.3.0 -m
>>>>>> 8192 -smp 80,sockets=2,cores=40,threads=1 -device
>>>>>> ide-drive,bus=ide.0,drive=test -drive
>>>>>> id=test,if=none,file=/instanceimage/359b18ab-05bb-460d-9b53-89505bca68ed/359b18ab-05bb-460d-9b53-89505bca68ed_vda_1.qcow2
>>>>>> -net nic,model=virtio -net user -monitor stdio -usb -usbdevice tablet
>>>>>> --enable-kvm --cpu host -vnc 0.0.0.0:2
>>>>>
>>>>> The crash happens when you manifest yourself as Hyper-V; you can do this
>>>>> by adding any 'hv-*' feature (e.g. try '-cpu host,hv_vpindex').
>>>>
>>>> Oh, and the 'ignore' happens when you pass more than 64 sockets
>>>> (something like "-smp 128,sockets=128,cores=1,threads=1") -- and this
>>>> happens regardless of Hyper-V enlightenments. But I guess it's just
>>>> because Windows doesn't support more than 64 sockets.
>>>
>>> Is there an option in the guest to avoid checking pvtlb support in hyperv?
>>>
>>
>> You mean to tell Windows to not use PV TLB flush when it's available? I
>> have no idea. My guess would be that it's left up to the hypervisor: if
>> the feature is available Windows will use it.
>
> I mean a way to work around the Windows guest crash, since there is no
> PV TLB flush enabled in production environments currently.
>

Unfortunately, I don't know of such an option; all real Hyper-V servers
support it, so I think nobody ever tested how Windows behaves without
it. I did, and oops, it crashes...

--
Vitaly

2018-05-19 12:17:46

by Thomas Gleixner

Subject: Re: [PATCH v4 1/8] x86/hyper-v: move struct hv_flush_pcpu{,ex} definitions to common header

On Thu, 17 May 2018, Vitaly Kuznetsov wrote:
> KY Srinivasan <[email protected]> writes:
> > We should coordinate on this; I have patches in flight that conflict with
> > the changes here.
>
> I see you also fixed the 'HV_GENERIC_SET_SPARCE_4K' typo. I don't think
> it's a big deal, as we fix it in the same way :-)
>
> I see you also altered the hv_flush_pcpu_ex definition but kept it in
> mmu.c. I think my patches should be applied on top of yours; I only need
> to move it to the hyperv-tlfs.h header and get rid of the _pcpu suffix.
>
> I hope Thomas will merge your patches very soon. In case Paolo/Radim
> decide that my series is ready too we'll ask their guidance on how to
> resolve the conflict (topic branch?). It should be relatively easy.

I've applied KY's series to tip/x86/hyperv. It has no other changes in
there. So it could be pulled into KVM if necessary w/o pulling other stuff.

Thanks,

tglx

2018-05-26 14:31:43

by Radim Krčmář

Subject: Re: [PATCH v4 0/8] KVM: x86: hyperv: PV TLB flush for Windows guests

2018-05-16 17:21+0200, Vitaly Kuznetsov:
> Changes since v3 [Radim Krcmar]:
> - PATCH2 fixing 'HV_GENERIC_SET_SPARCE_4K' typo added.
> - PATCH5 introducing kvm_make_vcpus_request_mask() API added.
> - Fix undefined behavior for hv->vp_index >= 64.
> - Merge kvm_hv_flush_tlb() and kvm_hv_flush_tlb_ex()
> - For -ex case preload all banks with a single kvm_read_guest().

I've pulled tip/hyperv into kvm/queue and applied on top, thanks.

2018-05-26 14:50:26

by Radim Krčmář

Subject: Re: [PATCH v4 5/8] KVM: introduce kvm_make_vcpus_request_mask() API

2018-05-16 17:21+0200, Vitaly Kuznetsov:
> Hyper-V style PV TLB flush hypercalls implementation will use this API.
> To avoid memory allocation in the CONFIG_CPUMASK_OFFSTACK case, add a
> cpumask_var_t argument.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> -bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req)
> +bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
> + unsigned long *vcpu_bitmap, cpumask_var_t tmp)
> {
> int i, cpu, me;
> - cpumask_var_t cpus;
> - bool called;
> struct kvm_vcpu *vcpu;
> -
> - zalloc_cpumask_var(&cpus, GFP_ATOMIC);
> + bool called;
>
> me = get_cpu();
> +

Two optimizations come to mind: First is to use for_each_set_bit instead
of kvm_for_each_vcpu to improve the sparse case.

> kvm_for_each_vcpu(i, vcpu, kvm) {
> + if (!test_bit(i, vcpu_bitmap))

And the second is to pass vcpu_bitmap = NULL instead of building the
bitmap with all VCPUs. Doesn't look too good in the end, though:

#define kvm_for_each_vcpu_bitmap(idx, vcpup, kvm, bitmap, len) \
for (idx = (bitmap ? find_first_bit(bitmap, len) : 0); \
idx < len && (vcpup = kvm_get_vcpu(kvm, idx)) != NULL; \
idx = (bitmap ? find_next_bit(bitmap, len, idx + 1) : idx + 1))
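
For reference, the first suggestion would turn the loop in
kvm_make_vcpus_request_mask() into something like this (editor's sketch,
assuming the bitmap is KVM_MAX_VCPUS bits long):

	for_each_set_bit(i, vcpu_bitmap, KVM_MAX_VCPUS) {
		vcpu = kvm_get_vcpu(kvm, i);
		if (!vcpu)
			continue;
		kvm_make_request(req, vcpu);
		/* ... wakeup/IPI logic as in the original loop ... */
	}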

2018-06-11 11:59:02

by Vitaly Kuznetsov

Subject: Re: [PATCH v4 5/8] KVM: introduce kvm_make_vcpus_request_mask() API

Radim Krčmář <[email protected]> writes:

> 2018-05-16 17:21+0200, Vitaly Kuznetsov:
>> Hyper-V style PV TLB flush hypercalls implementation will use this API.
>> To avoid memory allocation in the CONFIG_CPUMASK_OFFSTACK case, add a
>> cpumask_var_t argument.
>>
>> Signed-off-by: Vitaly Kuznetsov <[email protected]>
>> ---
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> -bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req)
>> +bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
>> + unsigned long *vcpu_bitmap, cpumask_var_t tmp)
>> {
>> int i, cpu, me;
>> - cpumask_var_t cpus;
>> - bool called;
>> struct kvm_vcpu *vcpu;
>> -
>> - zalloc_cpumask_var(&cpus, GFP_ATOMIC);
>> + bool called;
>>
>> me = get_cpu();
>> +
>
> Two optimizations come to mind: First is to use for_each_set_bit instead
> of kvm_for_each_vcpu to improve the sparse case.
>

I think I had such a version at some point, but then for some reason
decided I was re-implementing kvm_for_each_vcpu for no particular
reason.

>> kvm_for_each_vcpu(i, vcpu, kvm) {
>> + if (!test_bit(i, vcpu_bitmap))
>
> And the second is to pass vcpu_bitmap = NULL instead of building the
> bitmap with all VCPUs. Doesn't look too good in the end, though:
>
> #define kvm_for_each_vcpu_bitmap(idx, vcpup, kvm, bitmap, len) \
> for (idx = (bitmap ? find_first_bit(bitmap, len) : 0); \
> idx < len && (vcpup = kvm_get_vcpu(kvm, idx)) != NULL; \
> idx = (bitmap ? find_next_bit(bitmap, len, idx + 1) : idx + 1))

I'll take a try, thanks!

--
Vitaly