2018-09-26 17:04:59

by Vitaly Kuznetsov

Subject: [PATCH v6 0/7] KVM: x86: hyperv: PV IPI support for Windows guests

Changes since v5:
- New 'hybrid' approach to VP indexes: introduce 'num_mismatched_vp_indexes'
and use it for optimization in both PV IPI and TLB flush [Paolo Bonzini,
Roman Kagan].
- Rebase, KVM_CAP_HYPERV_SEND_IPI is now '160'.
- Patches 3-5 are new in this version.
- Drop "x86/hyper-v: rename ipi_arg_{ex,non_ex} structures" patch as it is
already merged in kvm/queue.

Original description:
Using a hypercall for sending IPIs is faster because it allows specifying
any number of vCPUs (even > 64 with a sparse CPU set); the whole procedure
takes only one VMEXIT.

As with PV TLB flush, this allows Windows guests with > 64 vCPUs to boot
on KVM when Hyper-V extensions are enabled.
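
For reference, the 'sparse CPU set' both hypercalls use follows the TLFS
VP set layout; a rough sketch, mirroring the kernel's hv_vpset-style
definitions (shown for illustration only):

struct hv_vpset_sketch {
	u64 format;		/* HV_GENERIC_SET_SPARSE_4K or HV_GENERIC_SET_ALL */
	u64 valid_bank_mask;	/* bit N set => vCPU "bank" N*64..N*64+63 present */
	u64 bank_contents[];	/* one u64 of per-vCPU bits for each set bit
				 * above, packed in ascending bank order */
};

VP index n is addressed by bank n / 64, bit n % 64, which is how more than
64 vCPUs fit into a single hypercall.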

Vitaly Kuznetsov (7):
KVM: x86: hyperv: enforce vp_index < KVM_MAX_VCPUS
KVM: x86: hyperv: optimize 'all cpus' case in kvm_hv_flush_tlb()
KVM: x86: hyperv: consistently use 'hv_vcpu' for 'struct kvm_vcpu_hv'
variables
KVM: x86: hyperv: keep track of mismatched VP indexes
KVM: x86: hyperv: valid_bank_mask should be 'u64'
KVM: x86: hyperv: optimize kvm_hv_flush_tlb() for vp_index == vcpu_idx
case
KVM: x86: hyperv: implement PV IPI send hypercalls

Documentation/virtual/kvm/api.txt | 7 +
arch/x86/include/asm/kvm_host.h | 3 +
arch/x86/kvm/hyperv.c | 268 +++++++++++++++++++++++-------
arch/x86/kvm/trace.h | 42 +++++
arch/x86/kvm/x86.c | 1 +
include/uapi/linux/kvm.h | 1 +
virt/kvm/kvm_main.c | 6 +-
7 files changed, 265 insertions(+), 63 deletions(-)

--
2.17.1



2018-09-26 17:03:40

by Vitaly Kuznetsov

Subject: [PATCH v6 2/7] KVM: x86: hyperv: optimize 'all cpus' case in kvm_hv_flush_tlb()

We can use 'NULL' to represent the 'all cpus' case in
kvm_make_vcpus_request_mask() and avoid building a vCPU mask with
all vCPUs set.
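
The resulting contract, sketched as a doc comment (illustration only, not
part of the patch):

/*
 * kvm_make_vcpus_request_mask - make a request on selected vCPUs
 * @vcpu_bitmap: bitmap of vcpu indexes to target, or NULL for all vCPUs
 *
 * With a NULL @vcpu_bitmap every vCPU gets the request, so callers such
 * as kvm_make_all_cpus_request() no longer need an all-ones static bitmap.
 */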

Suggested-by: Radim Krčmář <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
Reviewed-by: Roman Kagan <[email protected]>
---
arch/x86/kvm/hyperv.c | 42 +++++++++++++++++++++++-------------------
virt/kvm/kvm_main.c | 6 ++----
2 files changed, 25 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 0cd597b0f754..b45ce136be2f 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1325,35 +1325,39 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,

cpumask_clear(&hv_current->tlb_lush);

+ if (all_cpus) {
+ kvm_make_vcpus_request_mask(kvm,
+ KVM_REQ_TLB_FLUSH | KVM_REQUEST_NO_WAKEUP,
+ NULL, &hv_current->tlb_lush);
+ goto ret_success;
+ }
+
kvm_for_each_vcpu(i, vcpu, kvm) {
struct kvm_vcpu_hv *hv = &vcpu->arch.hyperv;
int bank = hv->vp_index / 64, sbank = 0;

- if (!all_cpus) {
- /* Banks >64 can't be represented */
- if (bank >= 64)
- continue;
-
- /* Non-ex hypercalls can only address first 64 vCPUs */
- if (!ex && bank)
- continue;
+ /* Banks >64 can't be represented */
+ if (bank >= 64)
+ continue;

- if (ex) {
- /*
- * Check is the bank of this vCPU is in sparse
- * set and get the sparse bank number.
- */
- sbank = get_sparse_bank_no(valid_bank_mask,
- bank);
+ /* Non-ex hypercalls can only address first 64 vCPUs */
+ if (!ex && bank)
+ continue;

- if (sbank < 0)
- continue;
- }
+ if (ex) {
+ /*
+ * Check is the bank of this vCPU is in sparse
+ * set and get the sparse bank number.
+ */
+ sbank = get_sparse_bank_no(valid_bank_mask, bank);

- if (!(sparse_banks[sbank] & BIT_ULL(hv->vp_index % 64)))
+ if (sbank < 0)
continue;
}

+ if (!(sparse_banks[sbank] & BIT_ULL(hv->vp_index % 64)))
+ continue;
+
/*
* vcpu->arch.cr3 may not be up-to-date for running vCPUs so we
* can't analyze it here, flush TLB regardless of the specified
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f986e31fa68c..587e1a0a8715 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -219,7 +219,7 @@ bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
me = get_cpu();

kvm_for_each_vcpu(i, vcpu, kvm) {
- if (!test_bit(i, vcpu_bitmap))
+ if (vcpu_bitmap && !test_bit(i, vcpu_bitmap))
continue;

kvm_make_request(req, vcpu);
@@ -243,12 +243,10 @@ bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req)
{
cpumask_var_t cpus;
bool called;
- static unsigned long vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)]
- = {[0 ... BITS_TO_LONGS(KVM_MAX_VCPUS)-1] = ULONG_MAX};

zalloc_cpumask_var(&cpus, GFP_ATOMIC);

- called = kvm_make_vcpus_request_mask(kvm, req, vcpu_bitmap, cpus);
+ called = kvm_make_vcpus_request_mask(kvm, req, NULL, cpus);

free_cpumask_var(cpus);
return called;
--
2.17.1


2018-09-26 17:03:47

by Vitaly Kuznetsov

Subject: [PATCH v6 3/7] KVM: x86: hyperv: consistently use 'hv_vcpu' for 'struct kvm_vcpu_hv' variables

Rename 'hv' to 'hv_vcpu' in kvm_hv_set_msr/kvm_hv_get_msr(); 'hv' is
'reserved' for 'struct kvm_hv' variables across the file.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/hyperv.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index b45ce136be2f..c8764faf783b 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1042,20 +1042,20 @@ static u64 current_task_runtime_100ns(void)

static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
{
- struct kvm_vcpu_hv *hv = &vcpu->arch.hyperv;
+ struct kvm_vcpu_hv *hv_vcpu = &vcpu->arch.hyperv;

switch (msr) {
case HV_X64_MSR_VP_INDEX:
if (!host || (u32)data >= KVM_MAX_VCPUS)
return 1;
- hv->vp_index = (u32)data;
+ hv_vcpu->vp_index = (u32)data;
break;
case HV_X64_MSR_VP_ASSIST_PAGE: {
u64 gfn;
unsigned long addr;

if (!(data & HV_X64_MSR_VP_ASSIST_PAGE_ENABLE)) {
- hv->hv_vapic = data;
+ hv_vcpu->hv_vapic = data;
if (kvm_lapic_enable_pv_eoi(vcpu, 0))
return 1;
break;
@@ -1066,7 +1066,7 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
return 1;
if (__clear_user((void __user *)addr, PAGE_SIZE))
return 1;
- hv->hv_vapic = data;
+ hv_vcpu->hv_vapic = data;
kvm_vcpu_mark_page_dirty(vcpu, gfn);
if (kvm_lapic_enable_pv_eoi(vcpu,
gfn_to_gpa(gfn) | KVM_MSR_ENABLED))
@@ -1082,7 +1082,7 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
case HV_X64_MSR_VP_RUNTIME:
if (!host)
return 1;
- hv->runtime_offset = data - current_task_runtime_100ns();
+ hv_vcpu->runtime_offset = data - current_task_runtime_100ns();
break;
case HV_X64_MSR_SCONTROL:
case HV_X64_MSR_SVERSION:
@@ -1174,11 +1174,11 @@ static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
bool host)
{
u64 data = 0;
- struct kvm_vcpu_hv *hv = &vcpu->arch.hyperv;
+ struct kvm_vcpu_hv *hv_vcpu = &vcpu->arch.hyperv;

switch (msr) {
case HV_X64_MSR_VP_INDEX:
- data = hv->vp_index;
+ data = hv_vcpu->vp_index;
break;
case HV_X64_MSR_EOI:
return kvm_hv_vapic_msr_read(vcpu, APIC_EOI, pdata);
@@ -1187,10 +1187,10 @@ static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
case HV_X64_MSR_TPR:
return kvm_hv_vapic_msr_read(vcpu, APIC_TASKPRI, pdata);
case HV_X64_MSR_VP_ASSIST_PAGE:
- data = hv->hv_vapic;
+ data = hv_vcpu->hv_vapic;
break;
case HV_X64_MSR_VP_RUNTIME:
- data = current_task_runtime_100ns() + hv->runtime_offset;
+ data = current_task_runtime_100ns() + hv_vcpu->runtime_offset;
break;
case HV_X64_MSR_SCONTROL:
case HV_X64_MSR_SVERSION:
--
2.17.1


2018-09-26 17:03:49

by Vitaly Kuznetsov

Subject: [PATCH v6 4/7] KVM: x86: hyperv: keep track of mismatched VP indexes

In the most common case the VP index of a vcpu matches its vcpu index.
Userspace is, however, free to set any mapping it wishes, and we need to
account for that when we need to find a vCPU with a particular VP index. To
keep search algorithms optimal in both cases, introduce a
'num_mismatched_vp_indexes' counter showing how many vCPUs have a
mismatching VP index. When the counter is zero we can assume
vp_index == vcpu_idx.
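
A minimal sketch of the fast path this enables (hypothetical helper name,
not part of this patch):

static struct kvm_vcpu *lookup_vcpu_by_vp_index(struct kvm *kvm, u32 vp_index)
{
	struct kvm_hv *hv = &kvm->arch.hyperv;

	/* No vCPU was remapped: vp_index == vcpu_idx, direct lookup */
	if (likely(!atomic_read(&hv->num_mismatched_vp_indexes)))
		return kvm_get_vcpu(kvm, vp_index);

	/* Otherwise scan all vCPUs for the requested VP index */
	return get_vcpu_by_vpidx(kvm, vp_index);
}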

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 3 +++
arch/x86/kvm/hyperv.c | 26 +++++++++++++++++++++++---
2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 09b2e3e2cf1b..711f79f1b5e6 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -781,6 +781,9 @@ struct kvm_hv {
u64 hv_reenlightenment_control;
u64 hv_tsc_emulation_control;
u64 hv_tsc_emulation_status;
+
+ /* How many vCPUs have VP index != vCPU index */
+ atomic_t num_mismatched_vp_indexes;
};

enum kvm_irqchip_mode {
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index c8764faf783b..6a19c8e3c432 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1045,11 +1045,31 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
struct kvm_vcpu_hv *hv_vcpu = &vcpu->arch.hyperv;

switch (msr) {
- case HV_X64_MSR_VP_INDEX:
- if (!host || (u32)data >= KVM_MAX_VCPUS)
+ case HV_X64_MSR_VP_INDEX: {
+ struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
+ int vcpu_idx = kvm_vcpu_get_idx(vcpu);
+ u32 new_vp_index = (u32)data;
+
+ if (!host || new_vp_index >= KVM_MAX_VCPUS)
return 1;
- hv_vcpu->vp_index = (u32)data;
+
+ if (new_vp_index == hv_vcpu->vp_index)
+ return 0;
+
+ /*
+ * VP index is changing, increment num_mismatched_vp_indexes in
+ * case it was equal to vcpu_idx before; on the other hand, if
+ * the new VP index matches vcpu_idx num_mismatched_vp_indexes
+ * needs to be decremented.
+ */
+ if (hv_vcpu->vp_index == vcpu_idx)
+ atomic_inc(&hv->num_mismatched_vp_indexes);
+ else if (new_vp_index == vcpu_idx)
+ atomic_dec(&hv->num_mismatched_vp_indexes);
+
+ hv_vcpu->vp_index = new_vp_index;
break;
+ }
case HV_X64_MSR_VP_ASSIST_PAGE: {
u64 gfn;
unsigned long addr;
--
2.17.1


2018-09-26 17:04:12

by Vitaly Kuznetsov

Subject: [PATCH v6 5/7] KVM: x86: hyperv: valid_bank_mask should be 'u64'

This probably doesn't matter much in practice (KVM_MAX_VCPUS is much lower
than the 4096 vCPUs a full 64-bank sparse set can address), but
valid_bank_mask is really u64 and not unsigned long.
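
A small example of why the distinction matters once banks beyond the first
are in use (a sketch; the cast relies on x86 being little-endian):

	u64 valid_bank_mask = BIT_ULL(0) | BIT_ULL(63);	/* banks 0 and 63 */

	/*
	 * On a 32-bit kernel 'unsigned long' is 32 bits wide, so the mask
	 * must be viewed as a two-word bitmap to see all 64 bits:
	 */
	int nbanks = bitmap_weight((unsigned long *)&valid_bank_mask, 64); /* 2 */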

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/hyperv.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 6a19c8e3c432..eeb12eacd525 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1300,7 +1300,7 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
struct hv_tlb_flush flush;
struct kvm_vcpu *vcpu;
unsigned long vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)] = {0};
- unsigned long valid_bank_mask = 0;
+ u64 valid_bank_mask = 0;
u64 sparse_banks[64];
int sparse_banks_len, i;
bool all_cpus;
@@ -1328,7 +1328,8 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
all_cpus = flush_ex.hv_vp_set.format !=
HV_GENERIC_SET_SPARSE_4K;

- sparse_banks_len = bitmap_weight(&valid_bank_mask, 64) *
+ sparse_banks_len =
+ bitmap_weight((unsigned long *)&valid_bank_mask, 64) *
sizeof(sparse_banks[0]);

if (!sparse_banks_len && !all_cpus)
--
2.17.1


2018-09-26 17:04:23

by Vitaly Kuznetsov

Subject: [PATCH v6 1/7] KVM: x86: hyperv: enforce vp_index < KVM_MAX_VCPUS

Hyper-V TLFS (5.0b) states:

> Virtual processors are identified by using an index (VP index). The
> maximum number of virtual processors per partition supported by the
> current implementation of the hypervisor can be obtained through CPUID
> leaf 0x40000005. A virtual processor index must be less than the
> maximum number of virtual processors per partition.

Forbid userspace from setting VP_INDEX to KVM_MAX_VCPUS or above.
get_vcpu_by_vpidx() can now be optimized to bail early when the supplied
vpidx is >= KVM_MAX_VCPUS.
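
For illustration, the limit the TLFS refers to is what a guest reads from
CPUID leaf 0x40000005 (implementation limits); a minimal guest-side sketch:

	/* EAX of leaf 0x40000005 advertises the maximum VP count (sketch) */
	static unsigned int hv_max_vp_count(void)
	{
		return cpuid_eax(0x40000005);
	}

The new host-side check keeps VP_INDEX consistent with that limit.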

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Reviewed-by: Roman Kagan <[email protected]>
---
arch/x86/kvm/hyperv.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 01d209ab5481..0cd597b0f754 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -132,8 +132,10 @@ static struct kvm_vcpu *get_vcpu_by_vpidx(struct kvm *kvm, u32 vpidx)
struct kvm_vcpu *vcpu = NULL;
int i;

- if (vpidx < KVM_MAX_VCPUS)
- vcpu = kvm_get_vcpu(kvm, vpidx);
+ if (vpidx >= KVM_MAX_VCPUS)
+ return NULL;
+
+ vcpu = kvm_get_vcpu(kvm, vpidx);
if (vcpu && vcpu_to_hv_vcpu(vcpu)->vp_index == vpidx)
return vcpu;
kvm_for_each_vcpu(i, vcpu, kvm)
@@ -1044,7 +1046,7 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)

switch (msr) {
case HV_X64_MSR_VP_INDEX:
- if (!host)
+ if (!host || (u32)data >= KVM_MAX_VCPUS)
return 1;
hv->vp_index = (u32)data;
break;
--
2.17.1


2018-09-26 17:04:37

by Vitaly Kuznetsov

Subject: [PATCH v6 7/7] KVM: x86: hyperv: implement PV IPI send hypercalls

Using a hypercall for sending IPIs is faster because it allows specifying
any number of vCPUs (even > 64 with a sparse CPU set); the whole procedure
takes only one VMEXIT.

The current Hyper-V TLFS (v5.0b) claims that the HvCallSendSyntheticClusterIpi
hypercall can't be 'fast' (passing parameters through registers), but
apparently this is not true: Windows always uses it as 'fast', so we need
to support that.
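
For reference, the non-ex input block is only 128 bits, which is why the
'fast' form fits into the two hypercall parameter registers; a sketch of
the layout (field names as in the kernel's hyperv-tlfs definitions):

	struct hv_send_ipi {
		u32 vector;
		u32 reserved;
		u64 cpu_mask;
	};

In the 'fast' variant the same bits arrive as vector/reserved in the first
parameter (ingpa) and cpu_mask in the second (outgpa), hence the
'ingpa >> 32 != 0' reserved-bits check in the handler below.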

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
Documentation/virtual/kvm/api.txt | 7 ++
arch/x86/kvm/hyperv.c | 115 ++++++++++++++++++++++++++++++
arch/x86/kvm/trace.h | 42 +++++++++++
arch/x86/kvm/x86.c | 1 +
include/uapi/linux/kvm.h | 1 +
5 files changed, 166 insertions(+)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 647f94128a85..1659b75d577d 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -4772,3 +4772,10 @@ CPU when the exception is taken. If this virtual SError is taken to EL1 using
AArch64, this value will be reported in the ISS field of ESR_ELx.

See KVM_CAP_VCPU_EVENTS for more details.
+8.20 KVM_CAP_HYPERV_SEND_IPI
+
+Architectures: x86
+
+This capability indicates that KVM supports paravirtualized Hyper-V IPI send
+hypercalls:
+HvCallSendSyntheticClusterIpi, HvCallSendSyntheticClusterIpiEx.
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index cc0535a078f7..4b4a6d015ade 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1405,6 +1405,107 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
((u64)rep_cnt << HV_HYPERCALL_REP_COMP_OFFSET);
}

+static u64 kvm_hv_send_ipi(struct kvm_vcpu *current_vcpu, u64 ingpa, u64 outgpa,
+ bool ex, bool fast)
+{
+ struct kvm *kvm = current_vcpu->kvm;
+ struct kvm_hv *hv = &kvm->arch.hyperv;
+ struct hv_send_ipi_ex send_ipi_ex;
+ struct hv_send_ipi send_ipi;
+ struct kvm_vcpu *vcpu;
+ unsigned long valid_bank_mask;
+ u64 sparse_banks[64];
+ int sparse_banks_len, bank, i, sbank;
+ struct kvm_lapic_irq irq = {.delivery_mode = APIC_DM_FIXED};
+ bool all_cpus;
+
+ if (!ex) {
+ if (!fast) {
+ if (unlikely(kvm_read_guest(kvm, ingpa, &send_ipi,
+ sizeof(send_ipi))))
+ return HV_STATUS_INVALID_HYPERCALL_INPUT;
+ sparse_banks[0] = send_ipi.cpu_mask;
+ irq.vector = send_ipi.vector;
+ } else {
+ /* 'reserved' part of hv_send_ipi should be 0 */
+ if (unlikely(ingpa >> 32 != 0))
+ return HV_STATUS_INVALID_HYPERCALL_INPUT;
+ sparse_banks[0] = outgpa;
+ irq.vector = (u32)ingpa;
+ }
+ all_cpus = false;
+ valid_bank_mask = BIT_ULL(0);
+
+ trace_kvm_hv_send_ipi(irq.vector, sparse_banks[0]);
+ } else {
+ if (unlikely(kvm_read_guest(kvm, ingpa, &send_ipi_ex,
+ sizeof(send_ipi_ex))))
+ return HV_STATUS_INVALID_HYPERCALL_INPUT;
+
+ trace_kvm_hv_send_ipi_ex(send_ipi_ex.vector,
+ send_ipi_ex.vp_set.format,
+ send_ipi_ex.vp_set.valid_bank_mask);
+
+ irq.vector = send_ipi_ex.vector;
+ valid_bank_mask = send_ipi_ex.vp_set.valid_bank_mask;
+ sparse_banks_len = bitmap_weight(&valid_bank_mask, 64) *
+ sizeof(sparse_banks[0]);
+
+ all_cpus = send_ipi_ex.vp_set.format == HV_GENERIC_SET_ALL;
+
+ if (!sparse_banks_len)
+ goto ret_success;
+
+ if (!all_cpus &&
+ kvm_read_guest(kvm,
+ ingpa + offsetof(struct hv_send_ipi_ex,
+ vp_set.bank_contents),
+ sparse_banks,
+ sparse_banks_len))
+ return HV_STATUS_INVALID_HYPERCALL_INPUT;
+ }
+
+ if ((irq.vector < HV_IPI_LOW_VECTOR) ||
+ (irq.vector > HV_IPI_HIGH_VECTOR))
+ return HV_STATUS_INVALID_HYPERCALL_INPUT;
+
+ if (all_cpus || atomic_read(&hv->num_mismatched_vp_indexes)) {
+ kvm_for_each_vcpu(i, vcpu, kvm) {
+ if (all_cpus || hv_vcpu_in_sparse_set(
+ &vcpu->arch.hyperv, sparse_banks,
+ valid_bank_mask)) {
+ /* We fail only when APIC is disabled */
+ kvm_apic_set_irq(vcpu, &irq, NULL);
+ }
+ }
+ goto ret_success;
+ }
+
+ /*
+ * num_mismatched_vp_indexes is zero so every vcpu has
+ * vp_index == vcpu_idx.
+ */
+ sbank = 0;
+ for_each_set_bit(bank, (unsigned long *)&valid_bank_mask, 64) {
+ for_each_set_bit(i, (unsigned long *)&sparse_banks[sbank], 64) {
+ u32 vp_index = bank * 64 + i;
+ struct kvm_vcpu *vcpu =
+ get_vcpu_by_vpidx(kvm, vp_index);
+
+ /* Unknown vCPU specified */
+ if (!vcpu)
+ continue;
+
+ /* We fail only when APIC is disabled */
+ kvm_apic_set_irq(vcpu, &irq, NULL);
+ }
+ sbank++;
+ }
+
+ret_success:
+ return HV_STATUS_SUCCESS;
+}
+
bool kvm_hv_hypercall_enabled(struct kvm *kvm)
{
return READ_ONCE(kvm->arch.hyperv.hv_hypercall) & HV_X64_MSR_HYPERCALL_ENABLE;
@@ -1574,6 +1675,20 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
}
ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt, true);
break;
+ case HVCALL_SEND_IPI:
+ if (unlikely(rep)) {
+ ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
+ break;
+ }
+ ret = kvm_hv_send_ipi(vcpu, ingpa, outgpa, false, fast);
+ break;
+ case HVCALL_SEND_IPI_EX:
+ if (unlikely(fast || rep)) {
+ ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
+ break;
+ }
+ ret = kvm_hv_send_ipi(vcpu, ingpa, outgpa, true, false);
+ break;
default:
ret = HV_STATUS_INVALID_HYPERCALL_CODE;
break;
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 0f997683404f..0659465a745c 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -1418,6 +1418,48 @@ TRACE_EVENT(kvm_hv_flush_tlb_ex,
__entry->valid_bank_mask, __entry->format,
__entry->address_space, __entry->flags)
);
+
+/*
+ * Tracepoints for kvm_hv_send_ipi.
+ */
+TRACE_EVENT(kvm_hv_send_ipi,
+ TP_PROTO(u32 vector, u64 processor_mask),
+ TP_ARGS(vector, processor_mask),
+
+ TP_STRUCT__entry(
+ __field(u32, vector)
+ __field(u64, processor_mask)
+ ),
+
+ TP_fast_assign(
+ __entry->vector = vector;
+ __entry->processor_mask = processor_mask;
+ ),
+
+ TP_printk("vector %x processor_mask 0x%llx",
+ __entry->vector, __entry->processor_mask)
+);
+
+TRACE_EVENT(kvm_hv_send_ipi_ex,
+ TP_PROTO(u32 vector, u64 format, u64 valid_bank_mask),
+ TP_ARGS(vector, format, valid_bank_mask),
+
+ TP_STRUCT__entry(
+ __field(u32, vector)
+ __field(u64, format)
+ __field(u64, valid_bank_mask)
+ ),
+
+ TP_fast_assign(
+ __entry->vector = vector;
+ __entry->format = format;
+ __entry->valid_bank_mask = valid_bank_mask;
+ ),
+
+ TP_printk("vector %x format %llx valid_bank_mask 0x%llx",
+ __entry->vector, __entry->format,
+ __entry->valid_bank_mask)
+);
#endif /* _TRACE_KVM_H */

#undef TRACE_INCLUDE_PATH
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index edbf00ec56b3..63badc7efbea 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2912,6 +2912,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_HYPERV_VP_INDEX:
case KVM_CAP_HYPERV_EVENTFD:
case KVM_CAP_HYPERV_TLBFLUSH:
+ case KVM_CAP_HYPERV_SEND_IPI:
case KVM_CAP_PCI_SEGMENT:
case KVM_CAP_DEBUGREGS:
case KVM_CAP_X86_ROBUST_SINGLESTEP:
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 251be353f950..043f1e58b6e5 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -953,6 +953,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_NESTED_STATE 157
#define KVM_CAP_ARM_INJECT_SERROR_ESR 158
#define KVM_CAP_MSR_PLATFORM_INFO 159
+#define KVM_CAP_HYPERV_SEND_IPI 160

#ifdef KVM_CAP_IRQ_ROUTING

--
2.17.1


2018-09-26 17:05:59

by Vitaly Kuznetsov

Subject: [PATCH v6 6/7] KVM: x86: hyperv: optimize kvm_hv_flush_tlb() for vp_index == vcpu_idx case

VP index almost always matches the vcpu index, and when it does it's faster
to walk the sparse set instead of all vcpus.
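
A worked example of the bank arithmetic used below (illustration only):
VP index 130 sits in bank 130 / 64 = 2. With valid_bank_mask = 0x5 (banks
0 and 2 present), the sparse bank number is the count of valid banks below
bank 2, i.e. sbank = bitmap_weight(mask, 2) = 1, so the vCPU is tested
against sparse_banks[1] & BIT_ULL(130 % 64), i.e. bit 2 of the second
listed bank.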

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/hyperv.c | 96 +++++++++++++++++++++++--------------------
1 file changed, 52 insertions(+), 44 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index eeb12eacd525..cc0535a078f7 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1277,32 +1277,37 @@ int kvm_hv_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata, bool host)
return kvm_hv_get_msr(vcpu, msr, pdata, host);
}

-static __always_inline int get_sparse_bank_no(u64 valid_bank_mask, int bank_no)
+static __always_inline bool hv_vcpu_in_sparse_set(struct kvm_vcpu_hv *hv_vcpu,
+ u64 sparse_banks[],
+ u64 valid_bank_mask)
{
- int i = 0, j;
+ int bank = hv_vcpu->vp_index / 64, sbank;

- if (!(valid_bank_mask & BIT_ULL(bank_no)))
- return -1;
+ if (bank >= 64)
+ return false;

- for (j = 0; j < bank_no; j++)
- if (valid_bank_mask & BIT_ULL(j))
- i++;
+ if (!(valid_bank_mask & BIT_ULL(bank)))
+ return false;

- return i;
+ /* Sparse bank number equals to the number of set bits before it */
+ sbank = bitmap_weight((unsigned long *)&valid_bank_mask, bank);
+
+ return !!(sparse_banks[sbank] & BIT_ULL(hv_vcpu->vp_index % 64));
}

static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
u16 rep_cnt, bool ex)
{
struct kvm *kvm = current_vcpu->kvm;
- struct kvm_vcpu_hv *hv_current = &current_vcpu->arch.hyperv;
+ struct kvm_hv *hv = &kvm->arch.hyperv;
+ struct kvm_vcpu_hv *hv_vcpu = &current_vcpu->arch.hyperv;
struct hv_tlb_flush_ex flush_ex;
struct hv_tlb_flush flush;
struct kvm_vcpu *vcpu;
unsigned long vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)] = {0};
- u64 valid_bank_mask = 0;
+ u64 valid_bank_mask;
u64 sparse_banks[64];
- int sparse_banks_len, i;
+ int sparse_banks_len, i, bank, sbank;
bool all_cpus;

if (!ex) {
@@ -1312,6 +1317,7 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
trace_kvm_hv_flush_tlb(flush.processor_mask,
flush.address_space, flush.flags);

+ valid_bank_mask = BIT_ULL(0);
sparse_banks[0] = flush.processor_mask;
all_cpus = flush.flags & HV_FLUSH_ALL_PROCESSORS;
} else {
@@ -1344,52 +1350,54 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
return HV_STATUS_INVALID_HYPERCALL_INPUT;
}

- cpumask_clear(&hv_current->tlb_lush);
+ /*
+ * vcpu->arch.cr3 may not be up-to-date for running vCPUs so we can't
+ * analyze it here, flush TLB regardless of the specified address space.
+ */
+ cpumask_clear(&hv_vcpu->tlb_lush);

if (all_cpus) {
kvm_make_vcpus_request_mask(kvm,
KVM_REQ_TLB_FLUSH | KVM_REQUEST_NO_WAKEUP,
- NULL, &hv_current->tlb_lush);
+ NULL, &hv_vcpu->tlb_lush);
goto ret_success;
}

- kvm_for_each_vcpu(i, vcpu, kvm) {
- struct kvm_vcpu_hv *hv = &vcpu->arch.hyperv;
- int bank = hv->vp_index / 64, sbank = 0;
-
- /* Banks >64 can't be represented */
- if (bank >= 64)
- continue;
-
- /* Non-ex hypercalls can only address first 64 vCPUs */
- if (!ex && bank)
- continue;
-
- if (ex) {
- /*
- * Check is the bank of this vCPU is in sparse
- * set and get the sparse bank number.
- */
- sbank = get_sparse_bank_no(valid_bank_mask, bank);
-
- if (sbank < 0)
- continue;
+ if (atomic_read(&hv->num_mismatched_vp_indexes)) {
+ kvm_for_each_vcpu(i, vcpu, kvm) {
+ if (hv_vcpu_in_sparse_set(&vcpu->arch.hyperv,
+ sparse_banks,
+ valid_bank_mask))
+ __set_bit(i, vcpu_bitmap);
}
+ goto flush_request;
+ }

- if (!(sparse_banks[sbank] & BIT_ULL(hv->vp_index % 64)))
- continue;
-
- /*
- * vcpu->arch.cr3 may not be up-to-date for running vCPUs so we
- * can't analyze it here, flush TLB regardless of the specified
- * address space.
- */
- __set_bit(i, vcpu_bitmap);
+ /*
+ * num_mismatched_vp_indexes is zero so every vcpu has
+ * vp_index == vcpu_idx.
+ */
+ sbank = 0;
+ for_each_set_bit(bank, (unsigned long *)&valid_bank_mask,
+ BITS_PER_LONG) {
+ for_each_set_bit(i,
+ (unsigned long *)&sparse_banks[sbank],
+ BITS_PER_LONG) {
+ u32 vp_index = bank * 64 + i;
+
+ /* A non-existent vCPU was specified */
+ if (vp_index >= KVM_MAX_VCPUS)
+ return HV_STATUS_INVALID_HYPERCALL_INPUT;
+
+ __set_bit(vp_index, vcpu_bitmap);
+ }
+ sbank++;
}

+flush_request:
kvm_make_vcpus_request_mask(kvm,
KVM_REQ_TLB_FLUSH | KVM_REQUEST_NO_WAKEUP,
- vcpu_bitmap, &hv_current->tlb_lush);
+ vcpu_bitmap, &hv_vcpu->tlb_lush);

ret_success:
/* We always do full TLB flush, set rep_done = rep_cnt. */
--
2.17.1


2018-09-27 07:50:15

by Roman Kagan

Subject: Re: [PATCH v6 3/7] KVM: x86: hyperv: consistently use 'hv_vcpu' for 'struct kvm_vcpu_hv' variables

On Wed, Sep 26, 2018 at 07:02:55PM +0200, Vitaly Kuznetsov wrote:
> Rename 'hv' to 'hv_vcpu' in kvm_hv_set_msr/kvm_hv_get_msr(); 'hv' is
> 'reserved' for 'struct kvm_hv' variables across the file.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> arch/x86/kvm/hyperv.c | 18 +++++++++---------
> 1 file changed, 9 insertions(+), 9 deletions(-)

Reviewed-by: Roman Kagan <[email protected]>

2018-09-27 07:59:54

by Roman Kagan

Subject: Re: [PATCH v6 4/7] KVM: x86: hyperv: keep track of mismatched VP indexes

On Wed, Sep 26, 2018 at 07:02:56PM +0200, Vitaly Kuznetsov wrote:
> In most common cases VP index of a vcpu matches its vcpu index. Userspace
> is, however, free to set any mapping it wishes and we need to account for
> that when we need to find a vCPU with a particular VP index. To keep search
> algorithms optimal in both cases introduce 'num_mismatched_vp_indexes'
> counter showing how many vCPUs with mismatching VP index we have. In case
> the counter is zero we can assume vp_index == vcpu_idx.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> arch/x86/include/asm/kvm_host.h | 3 +++
> arch/x86/kvm/hyperv.c | 26 +++++++++++++++++++++++---
> 2 files changed, 26 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 09b2e3e2cf1b..711f79f1b5e6 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -781,6 +781,9 @@ struct kvm_hv {
> u64 hv_reenlightenment_control;
> u64 hv_tsc_emulation_control;
> u64 hv_tsc_emulation_status;
> +
> + /* How many vCPUs have VP index != vCPU index */
> + atomic_t num_mismatched_vp_indexes;
> };
>
> enum kvm_irqchip_mode {
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index c8764faf783b..6a19c8e3c432 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -1045,11 +1045,31 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
> struct kvm_vcpu_hv *hv_vcpu = &vcpu->arch.hyperv;
>
> switch (msr) {
> - case HV_X64_MSR_VP_INDEX:
> - if (!host || (u32)data >= KVM_MAX_VCPUS)
> + case HV_X64_MSR_VP_INDEX: {
> + struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
> + int vcpu_idx = kvm_vcpu_get_idx(vcpu);
> + u32 new_vp_index = (u32)data;
> +
> + if (!host || new_vp_index >= KVM_MAX_VCPUS)
> return 1;
> - hv_vcpu->vp_index = (u32)data;
> +
> + if (new_vp_index == hv_vcpu->vp_index)
> + return 0;
> +
> + /*
> + * VP index is changing, increment num_mismatched_vp_indexes in
> + * case it was equal to vcpu_idx before; on the other hand, if
> + * the new VP index matches vcpu_idx num_mismatched_vp_indexes
> + * needs to be decremented.

It may be worth mentioning that the initial balance is provided by
kvm_hv_vcpu_postcreate setting vp_index = vcpu_idx.

> + */
> + if (hv_vcpu->vp_index == vcpu_idx)
> + atomic_inc(&hv->num_mismatched_vp_indexes);
> + else if (new_vp_index == vcpu_idx)
> + atomic_dec(&hv->num_mismatched_vp_indexes);

> +
> + hv_vcpu->vp_index = new_vp_index;
> break;
> + }
> case HV_X64_MSR_VP_ASSIST_PAGE: {
> u64 gfn;
> unsigned long addr;

Reviewed-by: Roman Kagan <[email protected]>

2018-09-27 08:02:28

by Roman Kagan

Subject: Re: [PATCH v6 5/7] KVM: x86: hyperv: valid_bank_mask should be 'u64'

On Wed, Sep 26, 2018 at 07:02:57PM +0200, Vitaly Kuznetsov wrote:
> This probably doesn't matter much (KVM_MAX_VCPUS is much lower nowadays)
> but valid_bank_mask is really u64 and not unsigned long.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> arch/x86/kvm/hyperv.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)

Reviewed-by: Roman Kagan <[email protected]>

2018-09-27 09:19:00

by Vitaly Kuznetsov

Subject: Re: [PATCH v6 4/7] KVM: x86: hyperv: keep track of mismatched VP indexes

Roman Kagan <[email protected]> writes:

> On Wed, Sep 26, 2018 at 07:02:56PM +0200, Vitaly Kuznetsov wrote:
>> In most common cases VP index of a vcpu matches its vcpu index. Userspace
>> is, however, free to set any mapping it wishes and we need to account for
>> that when we need to find a vCPU with a particular VP index. To keep search
>> algorithms optimal in both cases introduce 'num_mismatched_vp_indexes'
>> counter showing how many vCPUs with mismatching VP index we have. In case
>> the counter is zero we can assume vp_index == vcpu_idx.
>>
>> Signed-off-by: Vitaly Kuznetsov <[email protected]>
>> ---
>> arch/x86/include/asm/kvm_host.h | 3 +++
>> arch/x86/kvm/hyperv.c | 26 +++++++++++++++++++++++---
>> 2 files changed, 26 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 09b2e3e2cf1b..711f79f1b5e6 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -781,6 +781,9 @@ struct kvm_hv {
>> u64 hv_reenlightenment_control;
>> u64 hv_tsc_emulation_control;
>> u64 hv_tsc_emulation_status;
>> +
>> + /* How many vCPUs have VP index != vCPU index */
>> + atomic_t num_mismatched_vp_indexes;
>> };
>>
>> enum kvm_irqchip_mode {
>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>> index c8764faf783b..6a19c8e3c432 100644
>> --- a/arch/x86/kvm/hyperv.c
>> +++ b/arch/x86/kvm/hyperv.c
>> @@ -1045,11 +1045,31 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
>> struct kvm_vcpu_hv *hv_vcpu = &vcpu->arch.hyperv;
>>
>> switch (msr) {
>> - case HV_X64_MSR_VP_INDEX:
>> - if (!host || (u32)data >= KVM_MAX_VCPUS)
>> + case HV_X64_MSR_VP_INDEX: {
>> + struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
>> + int vcpu_idx = kvm_vcpu_get_idx(vcpu);
>> + u32 new_vp_index = (u32)data;
>> +
>> + if (!host || new_vp_index >= KVM_MAX_VCPUS)
>> return 1;
>> - hv_vcpu->vp_index = (u32)data;
>> +
>> + if (new_vp_index == hv_vcpu->vp_index)
>> + return 0;
>> +
>> + /*
>> + * VP index is changing, increment num_mismatched_vp_indexes in
>> + * case it was equal to vcpu_idx before; on the other hand, if
>> + * the new VP index matches vcpu_idx num_mismatched_vp_indexes
>> + * needs to be decremented.
>
> It may be worth mentioning that the initial balance is provided by
> kvm_hv_vcpu_postcreate setting vp_index = vcpu_idx.
>

Of course, yes, will update the comment in case I'll be re-submitting.

>> + */
>> + if (hv_vcpu->vp_index == vcpu_idx)
>> + atomic_inc(&hv->num_mismatched_vp_indexes);
>> + else if (new_vp_index == vcpu_idx)
>> + atomic_dec(&hv->num_mismatched_vp_indexes);
>
>> +
>> + hv_vcpu->vp_index = new_vp_index;
>> break;
>> + }
>> case HV_X64_MSR_VP_ASSIST_PAGE: {
>> u64 gfn;
>> unsigned long addr;
>
> Reviewed-by: Roman Kagan <[email protected]>

Thanks!

--
Vitaly

2018-09-27 09:44:25

by Roman Kagan

Subject: Re: [PATCH v6 6/7] KVM: x86: hyperv: optimize kvm_hv_flush_tlb() for vp_index == vcpu_idx case

On Wed, Sep 26, 2018 at 07:02:58PM +0200, Vitaly Kuznetsov wrote:
> VP inedx almost always matches VCPU and when it does it's faster to walk
> the sparse set instead of all vcpus.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> arch/x86/kvm/hyperv.c | 96 +++++++++++++++++++++++--------------------
> 1 file changed, 52 insertions(+), 44 deletions(-)
>
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index eeb12eacd525..cc0535a078f7 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -1277,32 +1277,37 @@ int kvm_hv_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata, bool host)
> return kvm_hv_get_msr(vcpu, msr, pdata, host);
> }
>
> -static __always_inline int get_sparse_bank_no(u64 valid_bank_mask, int bank_no)
> +static __always_inline bool hv_vcpu_in_sparse_set(struct kvm_vcpu_hv *hv_vcpu,
> + u64 sparse_banks[],
> + u64 valid_bank_mask)
> {
> - int i = 0, j;
> + int bank = hv_vcpu->vp_index / 64, sbank;
>
> - if (!(valid_bank_mask & BIT_ULL(bank_no)))
> - return -1;
> + if (bank >= 64)
> + return false;
>
> - for (j = 0; j < bank_no; j++)
> - if (valid_bank_mask & BIT_ULL(j))
> - i++;
> + if (!(valid_bank_mask & BIT_ULL(bank)))
> + return false;
>
> - return i;
> + /* Sparse bank number equals to the number of set bits before it */
> + sbank = bitmap_weight((unsigned long *)&valid_bank_mask, bank);
> +
> + return !!(sparse_banks[sbank] & BIT_ULL(hv_vcpu->vp_index % 64));
> }
>
> static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
> u16 rep_cnt, bool ex)
> {
> struct kvm *kvm = current_vcpu->kvm;
> - struct kvm_vcpu_hv *hv_current = &current_vcpu->arch.hyperv;
> + struct kvm_hv *hv = &kvm->arch.hyperv;
> + struct kvm_vcpu_hv *hv_vcpu = &current_vcpu->arch.hyperv;
> struct hv_tlb_flush_ex flush_ex;
> struct hv_tlb_flush flush;
> struct kvm_vcpu *vcpu;
> unsigned long vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)] = {0};
> - u64 valid_bank_mask = 0;
> + u64 valid_bank_mask;
> u64 sparse_banks[64];
> - int sparse_banks_len, i;
> + int sparse_banks_len, i, bank, sbank;
> bool all_cpus;
>
> if (!ex) {
> @@ -1312,6 +1317,7 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
> trace_kvm_hv_flush_tlb(flush.processor_mask,
> flush.address_space, flush.flags);
>
> + valid_bank_mask = BIT_ULL(0);
> sparse_banks[0] = flush.processor_mask;
> all_cpus = flush.flags & HV_FLUSH_ALL_PROCESSORS;
> } else {
> @@ -1344,52 +1350,54 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
> return HV_STATUS_INVALID_HYPERCALL_INPUT;
> }
>
> - cpumask_clear(&hv_current->tlb_lush);
> + /*
> + * vcpu->arch.cr3 may not be up-to-date for running vCPUs so we can't
> + * analyze it here, flush TLB regardless of the specified address space.
> + */
> + cpumask_clear(&hv_vcpu->tlb_lush);

Maybe squash this hv_current -> hv_vcpu renaming into patch 3?
(And yes this "lush" is funny, too ;)

>
> if (all_cpus) {
> kvm_make_vcpus_request_mask(kvm,
> KVM_REQ_TLB_FLUSH | KVM_REQUEST_NO_WAKEUP,
> - NULL, &hv_current->tlb_lush);
> + NULL, &hv_vcpu->tlb_lush);
> goto ret_success;
> }
>
> - kvm_for_each_vcpu(i, vcpu, kvm) {
> - struct kvm_vcpu_hv *hv = &vcpu->arch.hyperv;
> - int bank = hv->vp_index / 64, sbank = 0;
> -
> - /* Banks >64 can't be represented */
> - if (bank >= 64)
> - continue;
> -
> - /* Non-ex hypercalls can only address first 64 vCPUs */
> - if (!ex && bank)
> - continue;
> -
> - if (ex) {
> - /*
> - * Check is the bank of this vCPU is in sparse
> - * set and get the sparse bank number.
> - */
> - sbank = get_sparse_bank_no(valid_bank_mask, bank);
> -
> - if (sbank < 0)
> - continue;
> + if (atomic_read(&hv->num_mismatched_vp_indexes)) {
> + kvm_for_each_vcpu(i, vcpu, kvm) {
> + if (hv_vcpu_in_sparse_set(&vcpu->arch.hyperv,
> + sparse_banks,
> + valid_bank_mask))
> + __set_bit(i, vcpu_bitmap);
> }
> + goto flush_request;
> + }
>
> - if (!(sparse_banks[sbank] & BIT_ULL(hv->vp_index % 64)))
> - continue;
> -
> - /*
> - * vcpu->arch.cr3 may not be up-to-date for running vCPUs so we
> - * can't analyze it here, flush TLB regardless of the specified
> - * address space.
> - */
> - __set_bit(i, vcpu_bitmap);
> + /*
> + * num_mismatched_vp_indexes is zero so every vcpu has
> + * vp_index == vcpu_idx.
> + */
> + sbank = 0;
> + for_each_set_bit(bank, (unsigned long *)&valid_bank_mask,
> + BITS_PER_LONG) {

s/BITS_PER_LONG/64/

> + for_each_set_bit(i,
> + (unsigned long *)&sparse_banks[sbank],
> + BITS_PER_LONG) {

ditto

> + u32 vp_index = bank * 64 + i;
> +
> + /* A non-existent vCPU was specified */
> + if (vp_index >= KVM_MAX_VCPUS)
> + return HV_STATUS_INVALID_HYPERCALL_INPUT;
> +
> + __set_bit(vp_index, vcpu_bitmap);
> + }
> + sbank++;
> }

I wonder if copying the bank as a whole would make it easier to follow
(and somewhat more efficient):

sbank = 0;
for_each_set_bit(bank, (unsigned long *)&valid_bank_mask, 64)
((u64 *)vcpu_bitmap)[bank] = sparse_banks[sbank++];

Also it seems equally efficient but slightly easier to read if
vcpu_bitmap is filled first regardless of ->num_mismatched_vp_indexes,
and then either passed directly to kvm_make_vcpus_request_mask if
num_mismatched_vp_indexes == 0, or converted into the real vcpu mask
using the regular test_bit otherwise.

So eventually it all would look like

...
u64 vp_bitmap[KVM_MAX_VCPUS / 64] = {0};
DECLARE_BITMAP(vcpu_bitmap, KVM_MAX_VCPUS) = {0};
unsigned long *vcpu_mask;
...
if (all_cpus) {
vcpu_mask = NULL;
goto flush_request;
}

sbank = 0;
for_each_set_bit(bank, (unsigned long *)&valid_bank_mask, 64)
vp_bitmap[bank] = sparse_banks[sbank++];

if (likely(!atomic_read(&hv->num_mismatched_vp_indexes))) {
/* for all vcpus vp_index == vcpu_idx */
vcpu_mask = vp_bitmap;
goto flush_request;
}

kvm_for_each_vcpu(i, vcpu, kvm)
if (test_bit(vcpu_to_hv_vcpu(vcpu)->vp_index, vp_bitmap))
__set_bit(i, vcpu_bitmap);
vcpu_mask = vcpu_bitmap;
...

>
> +flush_request:
> kvm_make_vcpus_request_mask(kvm,
> KVM_REQ_TLB_FLUSH | KVM_REQUEST_NO_WAKEUP,
> - vcpu_bitmap, &hv_current->tlb_lush);
> + vcpu_bitmap, &hv_vcpu->tlb_lush);
>
> ret_success:
> /* We always do full TLB flush, set rep_done = rep_cnt. */

Roman.

2018-09-27 11:09:19

by Roman Kagan

Subject: Re: [PATCH v6 7/7] KVM: x86: hyperv: implement PV IPI send hypercalls

On Wed, Sep 26, 2018 at 07:02:59PM +0200, Vitaly Kuznetsov wrote:
> Using hypercall for sending IPIs is faster because this allows to specify
> any number of vCPUs (even > 64 with sparse CPU set), the whole procedure
> will take only one VMEXIT.
>
> Current Hyper-V TLFS (v5.0b) claims that HvCallSendSyntheticClusterIpi
> hypercall can't be 'fast' (passing parameters through registers) but
> apparently this is not true, Windows always uses it as 'fast' so we need
> to support that.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> Documentation/virtual/kvm/api.txt | 7 ++
> arch/x86/kvm/hyperv.c | 115 ++++++++++++++++++++++++++++++
> arch/x86/kvm/trace.h | 42 +++++++++++
> arch/x86/kvm/x86.c | 1 +
> include/uapi/linux/kvm.h | 1 +
> 5 files changed, 166 insertions(+)
>
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 647f94128a85..1659b75d577d 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -4772,3 +4772,10 @@ CPU when the exception is taken. If this virtual SError is taken to EL1 using
> AArch64, this value will be reported in the ISS field of ESR_ELx.
>
> See KVM_CAP_VCPU_EVENTS for more details.
> +8.20 KVM_CAP_HYPERV_SEND_IPI
> +
> +Architectures: x86
> +
> +This capability indicates that KVM supports paravirtualized Hyper-V IPI send
> +hypercalls:
> +HvCallSendSyntheticClusterIpi, HvCallSendSyntheticClusterIpiEx.
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index cc0535a078f7..4b4a6d015ade 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -1405,6 +1405,107 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
> ((u64)rep_cnt << HV_HYPERCALL_REP_COMP_OFFSET);
> }
>
> +static u64 kvm_hv_send_ipi(struct kvm_vcpu *current_vcpu, u64 ingpa, u64 outgpa,
> + bool ex, bool fast)
> +{
> + struct kvm *kvm = current_vcpu->kvm;
> + struct kvm_hv *hv = &kvm->arch.hyperv;
> + struct hv_send_ipi_ex send_ipi_ex;
> + struct hv_send_ipi send_ipi;
> + struct kvm_vcpu *vcpu;
> + unsigned long valid_bank_mask;
> + u64 sparse_banks[64];
> + int sparse_banks_len, bank, i, sbank;
> + struct kvm_lapic_irq irq = {.delivery_mode = APIC_DM_FIXED};
> + bool all_cpus;
> +
> + if (!ex) {
> + if (!fast) {
> + if (unlikely(kvm_read_guest(kvm, ingpa, &send_ipi,
> + sizeof(send_ipi))))
> + return HV_STATUS_INVALID_HYPERCALL_INPUT;
> + sparse_banks[0] = send_ipi.cpu_mask;
> + irq.vector = send_ipi.vector;
> + } else {
> + /* 'reserved' part of hv_send_ipi should be 0 */
> + if (unlikely(ingpa >> 32 != 0))
> + return HV_STATUS_INVALID_HYPERCALL_INPUT;
> + sparse_banks[0] = outgpa;
> + irq.vector = (u32)ingpa;
> + }
> + all_cpus = false;
> + valid_bank_mask = BIT_ULL(0);
> +
> + trace_kvm_hv_send_ipi(irq.vector, sparse_banks[0]);
> + } else {
> + if (unlikely(kvm_read_guest(kvm, ingpa, &send_ipi_ex,
> + sizeof(send_ipi_ex))))
> + return HV_STATUS_INVALID_HYPERCALL_INPUT;
> +
> + trace_kvm_hv_send_ipi_ex(send_ipi_ex.vector,
> + send_ipi_ex.vp_set.format,
> + send_ipi_ex.vp_set.valid_bank_mask);
> +
> + irq.vector = send_ipi_ex.vector;
> + valid_bank_mask = send_ipi_ex.vp_set.valid_bank_mask;
> + sparse_banks_len = bitmap_weight(&valid_bank_mask, 64) *
> + sizeof(sparse_banks[0]);
> +
> + all_cpus = send_ipi_ex.vp_set.format == HV_GENERIC_SET_ALL;
> +
> + if (!sparse_banks_len)
> + goto ret_success;
> +
> + if (!all_cpus &&
> + kvm_read_guest(kvm,
> + ingpa + offsetof(struct hv_send_ipi_ex,
> + vp_set.bank_contents),
> + sparse_banks,
> + sparse_banks_len))
> + return HV_STATUS_INVALID_HYPERCALL_INPUT;
> + }
> +
> + if ((irq.vector < HV_IPI_LOW_VECTOR) ||
> + (irq.vector > HV_IPI_HIGH_VECTOR))
> + return HV_STATUS_INVALID_HYPERCALL_INPUT;
> +
> + if (all_cpus || atomic_read(&hv->num_mismatched_vp_indexes)) {
> + kvm_for_each_vcpu(i, vcpu, kvm) {
> + if (all_cpus || hv_vcpu_in_sparse_set(
> + &vcpu->arch.hyperv, sparse_banks,
> + valid_bank_mask)) {
> + /* We fail only when APIC is disabled */
> + kvm_apic_set_irq(vcpu, &irq, NULL);
> + }
> + }
> + goto ret_success;
> + }
> +
> + /*
> + * num_mismatched_vp_indexes is zero so every vcpu has
> + * vp_index == vcpu_idx.
> + */
> + sbank = 0;
> + for_each_set_bit(bank, (unsigned long *)&valid_bank_mask, 64) {
> + for_each_set_bit(i, (unsigned long *)&sparse_banks[sbank], 64) {
> + u32 vp_index = bank * 64 + i;
> + struct kvm_vcpu *vcpu =
> + get_vcpu_by_vpidx(kvm, vp_index);
> +
> + /* Unknown vCPU specified */
> + if (!vcpu)
> + continue;
> +
> + /* We fail only when APIC is disabled */
> + kvm_apic_set_irq(vcpu, &irq, NULL);
> + }
> + sbank++;
> + }
> +
> +ret_success:
> + return HV_STATUS_SUCCESS;
> +}
> +

I must say that now it looks even more tempting to follow the same
pattern as your kvm_hv_flush_tlb: define a function that would call
kvm_apic_set_irq() on all vcpus in a mask (optimizing the all-set case
with a NULL mask), and make kvm_hv_send_ipi perform the same hv_vp_set
-> vcpu_mask transformation followed by calling into that function.
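
A rough sketch of such a helper (hypothetical name, reusing the
NULL-means-all convention of kvm_make_vcpus_request_mask):

	static void kvm_hv_send_ipi_to_many(struct kvm *kvm,
					    struct kvm_lapic_irq *irq,
					    unsigned long *vcpu_mask)
	{
		struct kvm_vcpu *vcpu;
		int i;

		kvm_for_each_vcpu(i, vcpu, kvm) {
			if (vcpu_mask && !test_bit(i, vcpu_mask))
				continue;

			/* We fail only when APIC is disabled */
			kvm_apic_set_irq(vcpu, irq, NULL);
		}
	}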

Roman.

2018-10-01 15:49:44

by Paolo Bonzini

Subject: Re: [PATCH v6 4/7] KVM: x86: hyperv: keep track of mismatched VP indexes

On 27/09/2018 11:17, Vitaly Kuznetsov wrote:
> Roman Kagan <[email protected]> writes:
>
>> On Wed, Sep 26, 2018 at 07:02:56PM +0200, Vitaly Kuznetsov wrote:
>>> In most common cases VP index of a vcpu matches its vcpu index. Userspace
>>> is, however, free to set any mapping it wishes and we need to account for
>>> that when we need to find a vCPU with a particular VP index. To keep search
>>> algorithms optimal in both cases introduce 'num_mismatched_vp_indexes'
>>> counter showing how many vCPUs with mismatching VP index we have. In case
>>> the counter is zero we can assume vp_index == vcpu_idx.
>>>
>>> Signed-off-by: Vitaly Kuznetsov <[email protected]>
>>> ---
>>> arch/x86/include/asm/kvm_host.h | 3 +++
>>> arch/x86/kvm/hyperv.c | 26 +++++++++++++++++++++++---
>>> 2 files changed, 26 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>>> index 09b2e3e2cf1b..711f79f1b5e6 100644
>>> --- a/arch/x86/include/asm/kvm_host.h
>>> +++ b/arch/x86/include/asm/kvm_host.h
>>> @@ -781,6 +781,9 @@ struct kvm_hv {
>>> u64 hv_reenlightenment_control;
>>> u64 hv_tsc_emulation_control;
>>> u64 hv_tsc_emulation_status;
>>> +
>>> + /* How many vCPUs have VP index != vCPU index */
>>> + atomic_t num_mismatched_vp_indexes;
>>> };
>>>
>>> enum kvm_irqchip_mode {
>>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>>> index c8764faf783b..6a19c8e3c432 100644
>>> --- a/arch/x86/kvm/hyperv.c
>>> +++ b/arch/x86/kvm/hyperv.c
>>> @@ -1045,11 +1045,31 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
>>> struct kvm_vcpu_hv *hv_vcpu = &vcpu->arch.hyperv;
>>>
>>> switch (msr) {
>>> - case HV_X64_MSR_VP_INDEX:
>>> - if (!host || (u32)data >= KVM_MAX_VCPUS)
>>> + case HV_X64_MSR_VP_INDEX: {
>>> + struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
>>> + int vcpu_idx = kvm_vcpu_get_idx(vcpu);
>>> + u32 new_vp_index = (u32)data;
>>> +
>>> + if (!host || new_vp_index >= KVM_MAX_VCPUS)
>>> return 1;
>>> - hv_vcpu->vp_index = (u32)data;
>>> +
>>> + if (new_vp_index == hv_vcpu->vp_index)
>>> + return 0;
>>> +
>>> + /*
>>> + * VP index is changing, increment num_mismatched_vp_indexes in
>>> + * case it was equal to vcpu_idx before; on the other hand, if
>>> + * the new VP index matches vcpu_idx num_mismatched_vp_indexes
>>> + * needs to be decremented.
>>
>> It may be worth mentioning that the initial balance is provided by
>> kvm_hv_vcpu_postcreate setting vp_index = vcpu_idx.
>>
>
> Of course, yes, will update the comment in case I'll be re-submitting.

/*
* VP index is initialized to hv_vcpu->vp_index by
* kvm_hv_vcpu_postcreate so they initially match. Now the
* VP index is changing, adjust num_mismatched_vp_indexes if
* it now matches or no longer matches vcpu_idx.
*/

?

Paolo

2018-10-01 15:55:10

by Roman Kagan

Subject: Re: [PATCH v6 4/7] KVM: x86: hyperv: keep track of mismatched VP indexes

On Mon, Oct 01, 2018 at 05:48:54PM +0200, Paolo Bonzini wrote:
> On 27/09/2018 11:17, Vitaly Kuznetsov wrote:
> > Roman Kagan <[email protected]> writes:
> >
> >> On Wed, Sep 26, 2018 at 07:02:56PM +0200, Vitaly Kuznetsov wrote:
> >>> In most common cases VP index of a vcpu matches its vcpu index. Userspace
> >>> is, however, free to set any mapping it wishes and we need to account for
> >>> that when we need to find a vCPU with a particular VP index. To keep search
> >>> algorithms optimal in both cases introduce 'num_mismatched_vp_indexes'
> >>> counter showing how many vCPUs with mismatching VP index we have. In case
> >>> the counter is zero we can assume vp_index == vcpu_idx.
> >>>
> >>> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> >>> ---
> >>> arch/x86/include/asm/kvm_host.h | 3 +++
> >>> arch/x86/kvm/hyperv.c | 26 +++++++++++++++++++++++---
> >>> 2 files changed, 26 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> >>> index 09b2e3e2cf1b..711f79f1b5e6 100644
> >>> --- a/arch/x86/include/asm/kvm_host.h
> >>> +++ b/arch/x86/include/asm/kvm_host.h
> >>> @@ -781,6 +781,9 @@ struct kvm_hv {
> >>> u64 hv_reenlightenment_control;
> >>> u64 hv_tsc_emulation_control;
> >>> u64 hv_tsc_emulation_status;
> >>> +
> >>> + /* How many vCPUs have VP index != vCPU index */
> >>> + atomic_t num_mismatched_vp_indexes;
> >>> };
> >>>
> >>> enum kvm_irqchip_mode {
> >>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> >>> index c8764faf783b..6a19c8e3c432 100644
> >>> --- a/arch/x86/kvm/hyperv.c
> >>> +++ b/arch/x86/kvm/hyperv.c
> >>> @@ -1045,11 +1045,31 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
> >>> struct kvm_vcpu_hv *hv_vcpu = &vcpu->arch.hyperv;
> >>>
> >>> switch (msr) {
> >>> - case HV_X64_MSR_VP_INDEX:
> >>> - if (!host || (u32)data >= KVM_MAX_VCPUS)
> >>> + case HV_X64_MSR_VP_INDEX: {
> >>> + struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
> >>> + int vcpu_idx = kvm_vcpu_get_idx(vcpu);
> >>> + u32 new_vp_index = (u32)data;
> >>> +
> >>> + if (!host || new_vp_index >= KVM_MAX_VCPUS)
> >>> return 1;
> >>> - hv_vcpu->vp_index = (u32)data;
> >>> +
> >>> + if (new_vp_index == hv_vcpu->vp_index)
> >>> + return 0;
> >>> +
> >>> + /*
> >>> + * VP index is changing, increment num_mismatched_vp_indexes in
> >>> + * case it was equal to vcpu_idx before; on the other hand, if
> >>> + * the new VP index matches vcpu_idx num_mismatched_vp_indexes
> >>> + * needs to be decremented.
> >>
> >> It may be worth mentioning that the initial balance is provided by
> >> kvm_hv_vcpu_postcreate setting vp_index = vcpu_idx.
> >>
> >
> > Of course, yes, will update the comment in case I'll be re-submitting.
>
> /*
> * VP index is initialized to hv_vcpu->vp_index by
> * kvm_hv_vcpu_postcreate so they initially match. Now the
> * VP index is changing, adjust num_mismatched_vp_indexes if
> * it now matches or no longer matches vcpu_idx.
> */
>
> ?

To my taste - perfect :)

Roman.

2018-10-01 15:58:03

by Roman Kagan

Subject: Re: [PATCH v6 4/7] KVM: x86: hyperv: keep track of mismatched VP indexes

On Mon, Oct 01, 2018 at 03:54:26PM +0000, Roman Kagan wrote:
> On Mon, Oct 01, 2018 at 05:48:54PM +0200, Paolo Bonzini wrote:
> > On 27/09/2018 11:17, Vitaly Kuznetsov wrote:
> > > Roman Kagan <[email protected]> writes:
> > >
> > >> On Wed, Sep 26, 2018 at 07:02:56PM +0200, Vitaly Kuznetsov wrote:
> > >>> In most common cases VP index of a vcpu matches its vcpu index. Userspace
> > >>> is, however, free to set any mapping it wishes and we need to account for
> > >>> that when we need to find a vCPU with a particular VP index. To keep search
> > >>> algorithms optimal in both cases introduce 'num_mismatched_vp_indexes'
> > >>> counter showing how many vCPUs with mismatching VP index we have. In case
> > >>> the counter is zero we can assume vp_index == vcpu_idx.
> > >>>
> > >>> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> > >>> ---
> > >>> arch/x86/include/asm/kvm_host.h | 3 +++
> > >>> arch/x86/kvm/hyperv.c | 26 +++++++++++++++++++++++---
> > >>> 2 files changed, 26 insertions(+), 3 deletions(-)
> > >>>
> > >>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > >>> index 09b2e3e2cf1b..711f79f1b5e6 100644
> > >>> --- a/arch/x86/include/asm/kvm_host.h
> > >>> +++ b/arch/x86/include/asm/kvm_host.h
> > >>> @@ -781,6 +781,9 @@ struct kvm_hv {
> > >>> u64 hv_reenlightenment_control;
> > >>> u64 hv_tsc_emulation_control;
> > >>> u64 hv_tsc_emulation_status;
> > >>> +
> > >>> + /* How many vCPUs have VP index != vCPU index */
> > >>> + atomic_t num_mismatched_vp_indexes;
> > >>> };
> > >>>
> > >>> enum kvm_irqchip_mode {
> > >>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> > >>> index c8764faf783b..6a19c8e3c432 100644
> > >>> --- a/arch/x86/kvm/hyperv.c
> > >>> +++ b/arch/x86/kvm/hyperv.c
> > >>> @@ -1045,11 +1045,31 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
> > >>> struct kvm_vcpu_hv *hv_vcpu = &vcpu->arch.hyperv;
> > >>>
> > >>> switch (msr) {
> > >>> - case HV_X64_MSR_VP_INDEX:
> > >>> - if (!host || (u32)data >= KVM_MAX_VCPUS)
> > >>> + case HV_X64_MSR_VP_INDEX: {
> > >>> + struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
> > >>> + int vcpu_idx = kvm_vcpu_get_idx(vcpu);
> > >>> + u32 new_vp_index = (u32)data;
> > >>> +
> > >>> + if (!host || new_vp_index >= KVM_MAX_VCPUS)
> > >>> return 1;
> > >>> - hv_vcpu->vp_index = (u32)data;
> > >>> +
> > >>> + if (new_vp_index == hv_vcpu->vp_index)
> > >>> + return 0;
> > >>> +
> > >>> + /*
> > >>> + * VP index is changing, increment num_mismatched_vp_indexes in
> > >>> + * case it was equal to vcpu_idx before; on the other hand, if
> > >>> + * the new VP index matches vcpu_idx num_mismatched_vp_indexes
> > >>> + * needs to be decremented.
> > >>
> > >> It may be worth mentioning that the initial balance is provided by
> > >> kvm_hv_vcpu_postcreate setting vp_index = vcpu_idx.
> > >>
> > >
> > > Of course, yes, will update the comment in case I'll be re-submitting.
> >
> > /*
> > * VP index is initialized to hv_vcpu->vp_index by
^^^^^^^^^^^^^^^^^
vcpu_idx

> > * kvm_hv_vcpu_postcreate so they initially match. Now the
> > * VP index is changing, adjust num_mismatched_vp_indexes if
> > * it now matches or no longer matches vcpu_idx.
> > */
> >
> > ?
>
> To my taste - perfect :)

Well, almost :)

Roman.

2018-10-01 16:02:13

by Paolo Bonzini

Subject: Re: [PATCH v6 7/7] KVM: x86: hyperv: implement PV IPI send hypercalls

On 27/09/2018 13:07, Roman Kagan wrote:
> On Wed, Sep 26, 2018 at 07:02:59PM +0200, Vitaly Kuznetsov wrote:
>> Using hypercall for sending IPIs is faster because this allows to specify
>> any number of vCPUs (even > 64 with sparse CPU set), the whole procedure
>> will take only one VMEXIT.
>>
>> Current Hyper-V TLFS (v5.0b) claims that HvCallSendSyntheticClusterIpi
>> hypercall can't be 'fast' (passing parameters through registers) but
>> apparently this is not true, Windows always uses it as 'fast' so we need
>> to support that.
>>
>> Signed-off-by: Vitaly Kuznetsov <[email protected]>
>> ---
>> Documentation/virtual/kvm/api.txt | 7 ++
>> arch/x86/kvm/hyperv.c | 115 ++++++++++++++++++++++++++++++
>> arch/x86/kvm/trace.h | 42 +++++++++++
>> arch/x86/kvm/x86.c | 1 +
>> include/uapi/linux/kvm.h | 1 +
>> 5 files changed, 166 insertions(+)
>>
>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
>> index 647f94128a85..1659b75d577d 100644
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -4772,3 +4772,10 @@ CPU when the exception is taken. If this virtual SError is taken to EL1 using
>> AArch64, this value will be reported in the ISS field of ESR_ELx.
>>
>> See KVM_CAP_VCPU_EVENTS for more details.
>> +8.20 KVM_CAP_HYPERV_SEND_IPI
>> +
>> +Architectures: x86
>> +
>> +This capability indicates that KVM supports paravirtualized Hyper-V IPI send
>> +hypercalls:
>> +HvCallSendSyntheticClusterIpi, HvCallSendSyntheticClusterIpiEx.
>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>> index cc0535a078f7..4b4a6d015ade 100644
>> --- a/arch/x86/kvm/hyperv.c
>> +++ b/arch/x86/kvm/hyperv.c
>> @@ -1405,6 +1405,107 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
>> ((u64)rep_cnt << HV_HYPERCALL_REP_COMP_OFFSET);
>> }
>>
>> +static u64 kvm_hv_send_ipi(struct kvm_vcpu *current_vcpu, u64 ingpa, u64 outgpa,
>> + bool ex, bool fast)
>> +{
>> + struct kvm *kvm = current_vcpu->kvm;
>> + struct kvm_hv *hv = &kvm->arch.hyperv;
>> + struct hv_send_ipi_ex send_ipi_ex;
>> + struct hv_send_ipi send_ipi;
>> + struct kvm_vcpu *vcpu;
>> + unsigned long valid_bank_mask;
>> + u64 sparse_banks[64];
>> + int sparse_banks_len, bank, i, sbank;
>> + struct kvm_lapic_irq irq = {.delivery_mode = APIC_DM_FIXED};
>> + bool all_cpus;
>> +
>> + if (!ex) {
>> + if (!fast) {
>> + if (unlikely(kvm_read_guest(kvm, ingpa, &send_ipi,
>> + sizeof(send_ipi))))
>> + return HV_STATUS_INVALID_HYPERCALL_INPUT;
>> + sparse_banks[0] = send_ipi.cpu_mask;
>> + irq.vector = send_ipi.vector;
>> + } else {
>> + /* 'reserved' part of hv_send_ipi should be 0 */
>> + if (unlikely(ingpa >> 32 != 0))
>> + return HV_STATUS_INVALID_HYPERCALL_INPUT;
>> + sparse_banks[0] = outgpa;
>> + irq.vector = (u32)ingpa;
>> + }
>> + all_cpus = false;
>> + valid_bank_mask = BIT_ULL(0);
>> +
>> + trace_kvm_hv_send_ipi(irq.vector, sparse_banks[0]);
>> + } else {
>> + if (unlikely(kvm_read_guest(kvm, ingpa, &send_ipi_ex,
>> + sizeof(send_ipi_ex))))
>> + return HV_STATUS_INVALID_HYPERCALL_INPUT;
>> +
>> + trace_kvm_hv_send_ipi_ex(send_ipi_ex.vector,
>> + send_ipi_ex.vp_set.format,
>> + send_ipi_ex.vp_set.valid_bank_mask);
>> +
>> + irq.vector = send_ipi_ex.vector;
>> + valid_bank_mask = send_ipi_ex.vp_set.valid_bank_mask;
>> + sparse_banks_len = bitmap_weight(&valid_bank_mask, 64) *
>> + sizeof(sparse_banks[0]);
>> +
>> + all_cpus = send_ipi_ex.vp_set.format == HV_GENERIC_SET_ALL;
>> +
>> + if (!sparse_banks_len)
>> + goto ret_success;
>> +
>> + if (!all_cpus &&
>> + kvm_read_guest(kvm,
>> + ingpa + offsetof(struct hv_send_ipi_ex,
>> + vp_set.bank_contents),
>> + sparse_banks,
>> + sparse_banks_len))
>> + return HV_STATUS_INVALID_HYPERCALL_INPUT;
>> + }
>> +
>> + if ((irq.vector < HV_IPI_LOW_VECTOR) ||
>> + (irq.vector > HV_IPI_HIGH_VECTOR))
>> + return HV_STATUS_INVALID_HYPERCALL_INPUT;
>> +
>> + if (all_cpus || atomic_read(&hv->num_mismatched_vp_indexes)) {
>> + kvm_for_each_vcpu(i, vcpu, kvm) {
>> + if (all_cpus || hv_vcpu_in_sparse_set(
>> + &vcpu->arch.hyperv, sparse_banks,
>> + valid_bank_mask)) {
>> + /* We fail only when APIC is disabled */
>> + kvm_apic_set_irq(vcpu, &irq, NULL);
>> + }
>> + }
>> + goto ret_success;
>> + }
>> +
>> + /*
>> + * num_mismatched_vp_indexes is zero so every vcpu has
>> + * vp_index == vcpu_idx.
>> + */
>> + sbank = 0;
>> + for_each_set_bit(bank, (unsigned long *)&valid_bank_mask, 64) {
>> + for_each_set_bit(i, (unsigned long *)&sparse_banks[sbank], 64) {
>> + u32 vp_index = bank * 64 + i;
>> + struct kvm_vcpu *vcpu =
>> + get_vcpu_by_vpidx(kvm, vp_index);
>> +
>> + /* Unknown vCPU specified */
>> + if (!vcpu)
>> + continue;
>> +
>> + /* We fail only when APIC is disabled */
>> + kvm_apic_set_irq(vcpu, &irq, NULL);
>> + }
>> + sbank++;
>> + }
>> +
>> +ret_success:
>> + return HV_STATUS_SUCCESS;
>> +}
>> +
>
> I must say that now it looks even more tempting to follow the same
> pattern as your kvm_hv_flush_tlb: define a function that would call
> kvm_apic_set_irq() on all vcpus in a mask (optimizing the all-set case
> with a NULL mask), and make kvm_hv_send_ipi perform the same hv_vp_set
> -> vcpu_mask transformation followed by calling into that function.


It would perhaps be cleaner, but really kvm_apic_set_irq is as efficient
as it can be, since it takes the destination vcpu directly.

The code duplication for walking the sparse set is a bit ugly, perhaps
that could be changed to use an iterator macro.
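
For instance (a sketch only; it pairs each bank with its running sparse
index so callers don't maintain sbank by hand):

	#define for_each_sparse_bank(bank, sbank, mask)			\
		for ((sbank) = 0,					\
		     (bank) = find_first_bit((unsigned long *)&(mask), 64); \
		     (bank) < 64;					\
		     (sbank)++,						\
		     (bank) = find_next_bit((unsigned long *)&(mask), 64, \
					    (bank) + 1))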

Paolo

2018-10-01 16:21:19

by Vitaly Kuznetsov

Subject: Re: [PATCH v6 7/7] KVM: x86: hyperv: implement PV IPI send hypercalls

Paolo Bonzini <[email protected]> writes:

> On 27/09/2018 13:07, Roman Kagan wrote:
...
>>
>> I must say that now it looks even more tempting to follow the same
>> pattern as your kvm_hv_flush_tlb: define a function that would call
>> kvm_apic_set_irq() on all vcpus in a mask (optimizing the all-set case
>> with a NULL mask), and make kvm_hv_send_ipi perform the same hv_vp_set
>> -> vcpu_mask transformation followed by calling into that function.
>
>
> It would perhaps be cleaner, but really kvm_apic_set_irq is as efficient
> as it can be, since it takes the destination vcpu directly.
>
> The code duplication for walking the sparse set is a bit ugly, perhaps
> that could be changed to use an iterator macro.

I actually like Roman's suggestion on how to re-write kvm_hv_flush_tlb()
and I also agree that it would be easier for future readers if we write
kvm_hv_send_ipi() in a similar way. Actually, I already have v7 in my
stash, will be sending it out shortly.

--
Vitaly

2018-10-01 16:22:13

by Paolo Bonzini

Subject: Re: [PATCH v6 7/7] KVM: x86: hyperv: implement PV IPI send hypercalls

On 01/10/2018 18:20, Vitaly Kuznetsov wrote:
> Paolo Bonzini <[email protected]> writes:
>
>> On 27/09/2018 13:07, Roman Kagan wrote:
> ...
>>>
>>> I must say that now it looks even more tempting to follow the same
>>> pattern as your kvm_hv_flush_tlb: define a function that would call
>>> kvm_apic_set_irq() on all vcpus in a mask (optimizing the all-set case
>>> with a NULL mask), and make kvm_hv_send_ipi perform the same hv_vp_set
>>> -> vcpu_mask transformation followed by calling into that function.
>>
>>
>> It would perhaps be cleaner, but really kvm_apic_set_irq is as efficient
>> as it can be, since it takes the destination vcpu directly.
>>
>> The code duplication for walking the sparse set is a bit ugly, perhaps
>> that could be changed to use an iterator macro.
>
> I actually like Roman's suggestion on how to re-write kvm_hv_flush_tlb()
> and I also agree that it would be easier for future readers if we write
> kvm_hv_send_ipi() in a similar way. Actually, I already have v7 in my
> stash, will be sending it out shortly.

Just send follow-ups now, please. I already have a long enough queue. :)

Paolo

2018-10-01 16:42:49

by Vitaly Kuznetsov

Subject: Re: [PATCH v6 7/7] KVM: x86: hyperv: implement PV IPI send hypercalls

Paolo Bonzini <[email protected]> writes:

> On 01/10/2018 18:20, Vitaly Kuznetsov wrote:
>> Paolo Bonzini <[email protected]> writes:
>>
>>> On 27/09/2018 13:07, Roman Kagan wrote:
>> ...
>>>>
>>>> I must say that now it looks even more tempting to follow the same
>>>> pattern as your kvm_hv_flush_tlb: define a function that would call
>>>> kvm_apic_set_irq() on all vcpus in a mask (optimizing the all-set case
>>>> with a NULL mask), and make kvm_hv_send_ipi perform the same hv_vp_set
>>>> -> vcpu_mask transformation followed by calling into that function.
>>>
>>>
>>> It would perhaps be cleaner, but really kvm_apic_set_irq is as efficient
>>> as it can be, since it takes the destination vcpu directly.
>>>
>>> The code duplication for walking the sparse set is a bit ugly, perhaps
>>> that could be changed to use an iterator macro.
>>
>> I actually like Roman's suggestion on how to re-write kvm_hv_flush_tlb()
>> and I also agree that it would be easier for future readers if we write
>> kvm_hv_send_ipi() in a similar way. Actually, I already have v7 in my
>> stash, will be sending it out shortly.
>
> Just send follow ups now, please. I already have enough long queue. :)

Oh yea, you should :-)

No problem at all, I'll convert v7 into a follow-up series when
kvm/queue is pushed.

Thanks,

--
Vitaly