2018-07-23 06:41:17

by Wanpeng Li

Subject: [PATCH v5 0/6] KVM: X86: Implement Exit-less IPIs support

Use a hypercall to send IPIs with a single vmexit, instead of one vmexit
per destination in xAPIC/x2APIC physical mode and one vmexit per cluster
in x2APIC cluster mode. An Intel guest can enter x2APIC cluster mode when
interrupt remapping is enabled in QEMU; however, the latest AMD EPYC still
supports only xAPIC mode, which benefits greatly from Exit-less IPIs. This
patchset lets a guest send multicast IPIs, with at most 128 destinations
per hypercall in 64-bit mode and 64 vCPUs per hypercall in 32-bit mode.
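
A minimal sketch of how a receiver could expand the hypercall arguments
into destination APIC IDs, assuming the 64-bit layout from the hypercall
documentation in patch 4/6 (bit 0 of the low word is APIC ID min, bit 1
is min+1, and the high word continues at min+64). decode_destinations and
its parameters are hypothetical names for illustration, not code from
this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Expand a 128-bit destination bitmap (two 64-bit words) into APIC IDs.
 * Hypothetical helper: bit i of the combined bitmap maps to APIC ID min + i.
 * Returns the number of IDs written to ids[].
 */
static size_t decode_destinations(uint64_t low, uint64_t high, int min,
                                  int *ids, size_t cap)
{
    size_t n = 0;

    for (int i = 0; i < 128; i++) {
        /* Bits 0..63 live in the low word, bits 64..127 in the high word. */
        uint64_t word = i < 64 ? low : high;

        if (word & (UINT64_C(1) << (i % 64))) {
            assert(n < cap);   /* caller must provide enough room */
            ids[n++] = min + i;
        }
    }
    return n;
}
```

With low = 0x5, high = 0x1 and min = 8, the destinations are APIC IDs 8,
10 and 72.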

Hardware: Xeon Skylake 2.5GHz, 2 sockets, 40 cores, 80 threads; the VM
has 80 vCPUs. IPI microbenchmark (https://lkml.org/lkml/2017/12/19/141):

x2apic cluster mode, vanilla

Dry-run: 0, 2392199 ns
Self-IPI: 6907514, 15027589 ns
Normal IPI: 223910476, 251301666 ns
Broadcast IPI: 0, 9282161150 ns
Broadcast lock: 0, 8812934104 ns

x2apic cluster mode, pv-ipi

Dry-run: 0, 2449341 ns
Self-IPI: 6720360, 15028732 ns
Normal IPI: 228643307, 255708477 ns
Broadcast IPI: 0, 7572293590 ns => 22% performance boost
Broadcast lock: 0, 8316124651 ns

x2apic physical mode, vanilla

Dry-run: 0, 3135933 ns
Self-IPI: 8572670, 17901757 ns
Normal IPI: 226444334, 255421709 ns
Broadcast IPI: 0, 19845070887 ns
Broadcast lock: 0, 19827383656 ns

x2apic physical mode, pv-ipi

Dry-run: 0, 2446381 ns
Self-IPI: 6788217, 15021056 ns
Normal IPI: 219454441, 249583458 ns
Broadcast IPI: 0, 7806540019 ns => 154% performance boost
Broadcast lock: 0, 9143618799 ns
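
The percentages quoted for Broadcast IPI appear to be the ratio of the
vanilla time to the pv-ipi time, minus one. A small sketch of that
arithmetic, assuming this is how the figures were derived (boost_percent
is a hypothetical helper, not part of the benchmark):

```c
#include <assert.h>

/*
 * Relative improvement of the pv-ipi run over the vanilla run, in percent.
 * Assumption: "boost" means (vanilla / pv - 1) * 100.
 */
static double boost_percent(double vanilla_ns, double pv_ns)
{
    assert(pv_ns > 0.0);
    return (vanilla_ns / pv_ns - 1.0) * 100.0;
}
```

For the Broadcast IPI rows above, boost_percent(9282161150, 7572293590)
is roughly 22.6 and boost_percent(19845070887, 7806540019) is roughly
154.2, consistent with the quoted 22% and 154%.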

v4 -> v5:
* update hypercall layout description
* fix PV IPIs send hypercall loops

v3 -> v4:
* offset algorithm w/ __uint128_t to scale to higher APIC IDs
* remove num_possible_cpus limit
* pass op_64_bit to check bitmap size
* better describe hypercall layout

v2 -> v3:
* rename ipi_mask_done to irq_restore_exit; __send_ipi_mask returns int
instead of bool
* fix build errors reported by 0day
* split patches, no functional change

v1 -> v2:
* fall back to the original apic hooks for sparse APIC IDs > 128 or any other errors
* have two bitmask arguments so that one hypercall handles 128 vCPUs
* fix KVM_FEATURE_PV_SEND_IPI doc
* document hypercall
* fix NMI selftest failures
* fix build errors reported by 0day

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>

Wanpeng Li (6):
KVM: X86: Add kvm hypervisor init time platform setup callback
KVM: X86: Implement PV IPIs in linux guest
KVM: X86: Fallback to original apic hooks when bad happens
KVM: X86: Implement PV IPIs send hypercall
KVM: X86: Add NMI support to PV IPIs
KVM: X86: Expose PV_SEND_IPI CPUID feature bit to guest

Documentation/virtual/kvm/cpuid.txt | 4 ++
Documentation/virtual/kvm/hypercalls.txt | 20 ++++++
arch/x86/include/uapi/asm/kvm_para.h | 1 +
arch/x86/kernel/kvm.c | 111 +++++++++++++++++++++++++++++++
arch/x86/kvm/cpuid.c | 3 +-
arch/x86/kvm/x86.c | 43 ++++++++++++
include/uapi/linux/kvm_para.h | 1 +
7 files changed, 182 insertions(+), 1 deletion(-)

--
2.7.4



2018-07-23 06:41:22

by Wanpeng Li

Subject: [PATCH v5 6/6] KVM: X86: Expose PV_SEND_IPI CPUID feature bit to guest

From: Wanpeng Li <[email protected]>

Expose the PV_SEND_IPI feature bit to the guest; the guest can check this
feature bit before using paravirtualized send IPIs.
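
As a rough illustration of the guest-side check, the sketch below tests
bit 11 in a 32-bit feature word such as the EAX value of KVM's feature
CPUID leaf. kvm_feature_present is a hypothetical stand-in for
kvm_para_has_feature(); reading the CPUID leaf itself is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KVM_FEATURE_PV_SEND_IPI 11   /* bit number from this series */

/* Hypothetical helper: is the given feature bit set in the feature word? */
static bool kvm_feature_present(uint32_t features_eax, unsigned int bit)
{
    assert(bit < 32);
    return (features_eax >> bit) & 1u;
}
```

A guest would install the PV IPI hooks only when
kvm_feature_present(eax, KVM_FEATURE_PV_SEND_IPI) is true.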

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
Documentation/virtual/kvm/cpuid.txt | 4 ++++
arch/x86/kvm/cpuid.c | 3 ++-
2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
index ab022dc..97ca194 100644
--- a/Documentation/virtual/kvm/cpuid.txt
+++ b/Documentation/virtual/kvm/cpuid.txt
@@ -62,6 +62,10 @@ KVM_FEATURE_ASYNC_PF_VMEXIT || 10 || paravirtualized async PF VM exit
|| || can be enabled by setting bit 2
|| || when writing to msr 0x4b564d02
------------------------------------------------------------------------------
+KVM_FEATURE_PV_SEND_IPI || 11 || guest checks this feature bit
+ || || before using paravirtualized
+ || || send IPIs.
+------------------------------------------------------------------------------
KVM_FEATURE_CLOCKSOURCE_STABLE_BIT || 24 || host will warn if no guest-side
|| || per-cpu warps are expected in
|| || kvmclock.
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 7e042e3..7bcfa61 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -621,7 +621,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
(1 << KVM_FEATURE_CLOCKSOURCE_STABLE_BIT) |
(1 << KVM_FEATURE_PV_UNHALT) |
(1 << KVM_FEATURE_PV_TLB_FLUSH) |
- (1 << KVM_FEATURE_ASYNC_PF_VMEXIT);
+ (1 << KVM_FEATURE_ASYNC_PF_VMEXIT) |
+ (1 << KVM_FEATURE_PV_SEND_IPI);

if (sched_info_on())
entry->eax |= (1 << KVM_FEATURE_STEAL_TIME);
--
2.7.4


2018-07-23 06:41:30

by Wanpeng Li

Subject: [PATCH v5 5/6] KVM: X86: Add NMI support to PV IPIs

From: Wanpeng Li <[email protected]>

The NMI delivery mode of ICR is used to deliver an NMI to the processor,
and the vector information is ignored.
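
A self-contained sketch of the encoding described above. The constants
are copied here for illustration and should be checked against the
kernel's APIC headers; encode_icr, decode_icr and struct decoded_irq are
hypothetical names. Note that this decoder keys on the delivery mode,
whereas the host code in this patch switches on the vector field.

```c
#include <assert.h>
#include <stdint.h>

#define APIC_VECTOR_MASK 0x000FFu
#define APIC_MODE_MASK   0x00700u
#define APIC_DM_FIXED    0x00000u
#define APIC_DM_NMI      0x00400u
#define NMI_VECTOR       0x02

struct decoded_irq {
    uint32_t delivery_mode;
    uint32_t vector;
};

/* Guest side: build the ICR value passed as the fourth hypercall argument. */
static uint32_t encode_icr(int vector)
{
    assert(vector >= 0 && vector <= 0xFF);
    if (vector == NMI_VECTOR)
        return APIC_DM_NMI;             /* vector is ignored for NMI */
    return APIC_DM_FIXED | (uint32_t)vector;
}

/* Host side: recover delivery mode and vector from the ICR value. */
static struct decoded_irq decode_icr(uint32_t icr)
{
    struct decoded_irq irq = { 0, 0 };

    irq.delivery_mode = icr & APIC_MODE_MASK;
    if (irq.delivery_mode != APIC_DM_NMI)
        irq.vector = icr & APIC_VECTOR_MASK;
    return irq;
}
```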

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
arch/x86/kernel/kvm.c | 15 ++++++++++++---
arch/x86/kvm/x86.c | 16 +++++++++++-----
2 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 57eb4a2..3456531 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -458,7 +458,7 @@ static void __init sev_map_percpu_data(void)
static int __send_ipi_mask(const struct cpumask *mask, int vector)
{
unsigned long flags;
- int cpu, apic_id, min = 0, max = 0, ret = 0;
+ int cpu, apic_id, min = 0, max = 0, ret = 0, icr = 0;
#ifdef CONFIG_X86_64
__uint128_t ipi_bitmap = 0;
int cluster_size = 128;
@@ -472,6 +472,15 @@ static int __send_ipi_mask(const struct cpumask *mask, int vector)

local_irq_save(flags);

+ switch (vector) {
+ default:
+ icr = APIC_DM_FIXED | vector;
+ break;
+ case NMI_VECTOR:
+ icr = APIC_DM_NMI;
+ break;
+ }
+
for_each_cpu(cpu, mask) {
apic_id = per_cpu(x86_cpu_to_apicid, cpu);
if (!ipi_bitmap) {
@@ -483,7 +492,7 @@ static int __send_ipi_mask(const struct cpumask *mask, int vector)
max = apic_id < max ? max : apic_id;
} else {
ret = kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
- (unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, vector);
+ (unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, icr);
min = max = apic_id;
ipi_bitmap = 0;
}
@@ -492,7 +501,7 @@ static int __send_ipi_mask(const struct cpumask *mask, int vector)

if (ipi_bitmap) {
ret = kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
- (unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, vector);
+ (unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, icr);
}

local_irq_restore(flags);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a43a29f..c118040 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6695,17 +6695,23 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid)
* Return 0 if successfully added and 1 if discarded.
*/
static int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
- unsigned long ipi_bitmap_high, int min, int vector, int op_64_bit)
+ unsigned long ipi_bitmap_high, int min, unsigned long icr, int op_64_bit)
{
int i;
struct kvm_apic_map *map;
struct kvm_vcpu *vcpu;
- struct kvm_lapic_irq irq = {
- .delivery_mode = APIC_DM_FIXED,
- .vector = vector,
- };
+ struct kvm_lapic_irq irq = {0};
int cluster_size = op_64_bit ? 64 : 32;

+ switch (icr & APIC_VECTOR_MASK) {
+ default:
+ irq.vector = icr & APIC_VECTOR_MASK;
+ break;
+ case NMI_VECTOR:
+ break;
+ }
+ irq.delivery_mode = icr & APIC_MODE_MASK;
+
rcu_read_lock();
map = rcu_dereference(kvm->arch.apic_map);

--
2.7.4


2018-07-23 06:41:38

by Wanpeng Li

Subject: [PATCH v5 4/6] KVM: X86: Implement PV IPIs send hypercall

From: Wanpeng Li <[email protected]>

Use a hypercall to send IPIs with a single vmexit, instead of one vmexit
per destination in xAPIC/x2APIC physical mode and one vmexit per cluster
in x2APIC cluster mode. An Intel guest can enter x2APIC cluster mode when
interrupt remapping is enabled in QEMU; however, the latest AMD EPYC still
supports only xAPIC mode, which benefits greatly from Exit-less IPIs. This
patchset lets a guest send multicast IPIs, with at most 128 destinations
per hypercall in 64-bit mode and 64 vCPUs per hypercall in 32-bit mode.

Hardware: Xeon Skylake 2.5GHz, 2 sockets, 40 cores, 80 threads; the VM
has 80 vCPUs. IPI microbenchmark (https://lkml.org/lkml/2017/12/19/141):

x2apic cluster mode, vanilla

Dry-run: 0, 2392199 ns
Self-IPI: 6907514, 15027589 ns
Normal IPI: 223910476, 251301666 ns
Broadcast IPI: 0, 9282161150 ns
Broadcast lock: 0, 8812934104 ns

x2apic cluster mode, pv-ipi

Dry-run: 0, 2449341 ns
Self-IPI: 6720360, 15028732 ns
Normal IPI: 228643307, 255708477 ns
Broadcast IPI: 0, 7572293590 ns => 22% performance boost
Broadcast lock: 0, 8316124651 ns

x2apic physical mode, vanilla

Dry-run: 0, 3135933 ns
Self-IPI: 8572670, 17901757 ns
Normal IPI: 226444334, 255421709 ns
Broadcast IPI: 0, 19845070887 ns
Broadcast lock: 0, 19827383656 ns

x2apic physical mode, pv-ipi

Dry-run: 0, 2446381 ns
Self-IPI: 6788217, 15021056 ns
Normal IPI: 219454441, 249583458 ns
Broadcast IPI: 0, 7806540019 ns => 154% performance boost
Broadcast lock: 0, 9143618799 ns

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
Documentation/virtual/kvm/hypercalls.txt | 20 +++++++++++++++++
arch/x86/kvm/x86.c | 37 ++++++++++++++++++++++++++++++++
2 files changed, 57 insertions(+)

diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
index a890529..9895123 100644
--- a/Documentation/virtual/kvm/hypercalls.txt
+++ b/Documentation/virtual/kvm/hypercalls.txt
@@ -121,3 +121,23 @@ compute the CLOCK_REALTIME for its clock, at the same instant.

Returns KVM_EOPNOTSUPP if the host does not use TSC clocksource,
or if clock type is different than KVM_CLOCK_PAIRING_WALLCLOCK.
+
+6. KVM_HC_SEND_IPI
+------------------------
+Architecture: x86
+Status: active
+Purpose: Hypercall used to send IPIs.
+
+a0: lower part of the bitmap of destination APIC IDs
+a1: higher part of the bitmap of destination APIC IDs
+a2: the lowest APIC ID in bitmap
+a3: APIC ICR
+
+The hypercall lets a guest send multicast IPIs, with at most 128
+destinations per hypercall in 64-bit mode and 64 vCPUs per
+hypercall in 32-bit mode. The destinations are represented by a
+bitmap contained in the first two arguments (a0 and a1). Bit 0 of
+a0 corresponds to the APIC ID in the third argument (a2), bit 1
+corresponds to the APIC ID a2+1, and so on.
+
+Returns 0 if successfully delivery the IPIs and 1 if discarded.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2b812b3..a43a29f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6691,6 +6691,40 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid)
kvm_irq_delivery_to_apic(kvm, NULL, &lapic_irq, NULL);
}

+/*
+ * Return 0 if successfully added and 1 if discarded.
+ */
+static int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
+ unsigned long ipi_bitmap_high, int min, int vector, int op_64_bit)
+{
+ int i;
+ struct kvm_apic_map *map;
+ struct kvm_vcpu *vcpu;
+ struct kvm_lapic_irq irq = {
+ .delivery_mode = APIC_DM_FIXED,
+ .vector = vector,
+ };
+ int cluster_size = op_64_bit ? 64 : 32;
+
+ rcu_read_lock();
+ map = rcu_dereference(kvm->arch.apic_map);
+
+ for_each_set_bit(i, &ipi_bitmap_low, cluster_size) {
+ vcpu = map->phys_map[min + i]->vcpu;
+ if (!kvm_apic_set_irq(vcpu, &irq, NULL))
+ return 1;
+ }
+
+ for_each_set_bit(i, &ipi_bitmap_high, cluster_size) {
+ vcpu = map->phys_map[min + i + cluster_size]->vcpu;
+ if (!kvm_apic_set_irq(vcpu, &irq, NULL))
+ return 1;
+ }
+
+ rcu_read_unlock();
+ return 0;
+}
+
void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
{
vcpu->arch.apicv_active = false;
@@ -6739,6 +6773,9 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
case KVM_HC_CLOCK_PAIRING:
ret = kvm_pv_clock_pairing(vcpu, a0, a1);
break;
+ case KVM_HC_SEND_IPI:
+ ret = kvm_pv_send_ipi(vcpu->kvm, a0, a1, a2, a3, op_64_bit);
+ break;
#endif
default:
ret = -KVM_ENOSYS;
--
2.7.4


2018-07-23 06:42:14

by Wanpeng Li

Subject: [PATCH v5 3/6] KVM: X86: Fallback to original apic hooks when bad happens

From: Wanpeng Li <[email protected]>

Fall back to the original apic hooks in the unlikely event that KVM fails
to add the pending IRQ to the lapic.
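
The fallback pattern can be sketched outside the kernel as a table of
function pointers that is copied before the paravirtual hooks are
installed. Everything here (struct ipi_ops, pv_send, the counters) is a
hypothetical stand-in for struct apic and the hypercall path, intended
only to show the control flow.

```c
#include <assert.h>

struct ipi_ops {
    void (*send_all)(int vector);
};

static int native_calls;            /* counts uses of the native path */
static int pv_should_fail;          /* simulates a failing hypercall */

static void native_send_all(int vector)
{
    (void)vector;
    native_calls++;
}

/* Stand-in for the PV path: returns nonzero on failure. */
static int pv_send(int vector)
{
    (void)vector;
    return pv_should_fail;
}

static struct ipi_ops ops = { native_send_all };
static struct ipi_ops orig_ops;     /* saved copy of the original hooks */

static void pv_send_all(int vector)
{
    if (pv_send(vector))
        orig_ops.send_all(vector);  /* fall back to the saved hook */
}

static void setup_pv_ipi(void)
{
    /* Installing twice would make the fallback call itself. */
    assert(ops.send_all != pv_send_all);
    orig_ops = ops;
    ops.send_all = pv_send_all;
}
```

After setup_pv_ipi(), calls through ops.send_all reach the native hook
only when the PV path reports a failure.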

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
arch/x86/kernel/kvm.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index eed6046..57eb4a2 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -47,6 +47,7 @@
#include <asm/hypervisor.h>
#include <asm/kvm_guest.h>

+static struct apic orig_apic;
static int kvmapf = 1;

static int __init parse_no_kvmapf(char *arg)
@@ -454,10 +455,10 @@ static void __init sev_map_percpu_data(void)
}

#ifdef CONFIG_SMP
-static void __send_ipi_mask(const struct cpumask *mask, int vector)
+static int __send_ipi_mask(const struct cpumask *mask, int vector)
{
unsigned long flags;
- int cpu, apic_id, min = 0, max = 0;
+ int cpu, apic_id, min = 0, max = 0, ret = 0;
#ifdef CONFIG_X86_64
__uint128_t ipi_bitmap = 0;
int cluster_size = 128;
@@ -467,7 +468,7 @@ static void __send_ipi_mask(const struct cpumask *mask, int vector)
#endif

if (cpumask_empty(mask))
- return;
+ return 0;

local_irq_save(flags);

@@ -481,7 +482,7 @@ static void __send_ipi_mask(const struct cpumask *mask, int vector)
} else if (apic_id < min + cluster_size) {
max = apic_id < max ? max : apic_id;
} else {
- kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
+ ret = kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
(unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, vector);
min = max = apic_id;
ipi_bitmap = 0;
@@ -490,11 +491,12 @@ static void __send_ipi_mask(const struct cpumask *mask, int vector)
}

if (ipi_bitmap) {
- kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
+ ret = kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
(unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, vector);
}

local_irq_restore(flags);
+ return ret;
}

static void kvm_send_ipi_mask(const struct cpumask *mask, int vector)
@@ -511,7 +513,8 @@ static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int vector)
cpumask_copy(&new_mask, mask);
cpumask_clear_cpu(this_cpu, &new_mask);
local_mask = &new_mask;
- __send_ipi_mask(local_mask, vector);
+ if (__send_ipi_mask(local_mask, vector))
+ orig_apic.send_IPI_mask_allbutself(mask, vector);
}

static void kvm_send_ipi_allbutself(int vector)
@@ -521,7 +524,8 @@ static void kvm_send_ipi_allbutself(int vector)

static void kvm_send_ipi_all(int vector)
{
- __send_ipi_mask(cpu_online_mask, vector);
+ if (__send_ipi_mask(cpu_online_mask, vector))
+ orig_apic.send_IPI_all(vector);
}

/*
@@ -529,6 +533,8 @@ static void kvm_send_ipi_all(int vector)
*/
static void kvm_setup_pv_ipi(void)
{
+ orig_apic = *apic;
+
apic->send_IPI_mask = kvm_send_ipi_mask;
apic->send_IPI_mask_allbutself = kvm_send_ipi_mask_allbutself;
apic->send_IPI_allbutself = kvm_send_ipi_allbutself;
--
2.7.4


2018-07-23 06:42:55

by Wanpeng Li

Subject: [PATCH v5 2/6] KVM: X86: Implement PV IPIs in linux guest

From: Wanpeng Li <[email protected]>

Implement paravirtual apic hooks to enable PV IPIs.

apic->send_IPI_mask
apic->send_IPI_mask_allbutself
apic->send_IPI_allbutself
apic->send_IPI_all

This patch lets a guest send multicast IPIs, with at most 128 destinations
per hypercall in 64-bit mode and 64 vCPUs per hypercall in 32-bit mode.
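
The batching idea can be sketched in portable C as follows. This is a
simplified illustration, not the patch itself: it uses a single 64-bit
word as the window (the patch uses 128 bits on x86-64), records batches
in an array instead of issuing hypercalls, and adds an explicit
id >= min check. batch_apic_ids and struct ipi_batch are hypothetical
names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WINDOW 64   /* assumption: one 64-bit word per batch */

struct ipi_batch {
    int min;            /* lowest APIC ID in the batch */
    uint64_t bitmap;    /* bit i set => APIC ID min + i is a destination */
};

/* Group non-negative APIC IDs into batches; returns the batch count. */
static size_t batch_apic_ids(const int *ids, size_t n,
                             struct ipi_batch *out, size_t cap)
{
    size_t used = 0;
    int min = 0, max = 0;
    uint64_t bitmap = 0;

    for (size_t i = 0; i < n; i++) {
        int id = ids[i];

        if (!bitmap) {
            min = max = id;                     /* start a new window */
        } else if (id < min && max - id < WINDOW) {
            bitmap <<= (min - id);              /* slide the window down */
            min = id;
        } else if (id >= min && id < min + WINDOW) {
            if (id > max)
                max = id;                       /* fits in current window */
        } else {
            assert(used < cap);
            out[used].min = min;                /* flush, then restart */
            out[used].bitmap = bitmap;
            used++;
            min = max = id;
            bitmap = 0;
        }
        bitmap |= UINT64_C(1) << (id - min);
    }

    if (bitmap) {
        assert(used < cap);
        out[used].min = min;
        out[used].bitmap = bitmap;
        used++;
    }
    return used;
}
```

For the IDs 3, 1, 70, 200 this produces three batches: min 1 with bits 0
and 2 set, then min 70 and min 200 with only bit 0 set.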

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
arch/x86/include/uapi/asm/kvm_para.h | 1 +
arch/x86/kernel/kvm.c | 86 ++++++++++++++++++++++++++++++++++++
include/uapi/linux/kvm_para.h | 1 +
3 files changed, 88 insertions(+)

diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 0ede697..19980ec 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -28,6 +28,7 @@
#define KVM_FEATURE_PV_UNHALT 7
#define KVM_FEATURE_PV_TLB_FLUSH 9
#define KVM_FEATURE_ASYNC_PF_VMEXIT 10
+#define KVM_FEATURE_PV_SEND_IPI 11

#define KVM_HINTS_REALTIME 0

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 591bcf2..eed6046 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -454,6 +454,88 @@ static void __init sev_map_percpu_data(void)
}

#ifdef CONFIG_SMP
+static void __send_ipi_mask(const struct cpumask *mask, int vector)
+{
+ unsigned long flags;
+ int cpu, apic_id, min = 0, max = 0;
+#ifdef CONFIG_X86_64
+ __uint128_t ipi_bitmap = 0;
+ int cluster_size = 128;
+#else
+ u64 ipi_bitmap = 0;
+ int cluster_size = 64;
+#endif
+
+ if (cpumask_empty(mask))
+ return;
+
+ local_irq_save(flags);
+
+ for_each_cpu(cpu, mask) {
+ apic_id = per_cpu(x86_cpu_to_apicid, cpu);
+ if (!ipi_bitmap) {
+ min = max = apic_id;
+ } else if (apic_id < min && max - apic_id < cluster_size) {
+ ipi_bitmap <<= min - apic_id;
+ min = apic_id;
+ } else if (apic_id < min + cluster_size) {
+ max = apic_id < max ? max : apic_id;
+ } else {
+ kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
+ (unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, vector);
+ min = max = apic_id;
+ ipi_bitmap = 0;
+ }
+ __set_bit(apic_id - min, (unsigned long *)&ipi_bitmap);
+ }
+
+ if (ipi_bitmap) {
+ kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
+ (unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, vector);
+ }
+
+ local_irq_restore(flags);
+}
+
+static void kvm_send_ipi_mask(const struct cpumask *mask, int vector)
+{
+ __send_ipi_mask(mask, vector);
+}
+
+static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int vector)
+{
+ unsigned int this_cpu = smp_processor_id();
+ struct cpumask new_mask;
+ const struct cpumask *local_mask;
+
+ cpumask_copy(&new_mask, mask);
+ cpumask_clear_cpu(this_cpu, &new_mask);
+ local_mask = &new_mask;
+ __send_ipi_mask(local_mask, vector);
+}
+
+static void kvm_send_ipi_allbutself(int vector)
+{
+ kvm_send_ipi_mask_allbutself(cpu_online_mask, vector);
+}
+
+static void kvm_send_ipi_all(int vector)
+{
+ __send_ipi_mask(cpu_online_mask, vector);
+}
+
+/*
+ * Set the IPI entry points
+ */
+static void kvm_setup_pv_ipi(void)
+{
+ apic->send_IPI_mask = kvm_send_ipi_mask;
+ apic->send_IPI_mask_allbutself = kvm_send_ipi_mask_allbutself;
+ apic->send_IPI_allbutself = kvm_send_ipi_allbutself;
+ apic->send_IPI_all = kvm_send_ipi_all;
+ pr_info("KVM setup pv IPIs\n");
+}
+
static void __init kvm_smp_prepare_cpus(unsigned int max_cpus)
{
native_smp_prepare_cpus(max_cpus);
@@ -626,6 +708,10 @@ static uint32_t __init kvm_detect(void)

static void __init kvm_apic_init(void)
{
+#if defined(CONFIG_SMP)
+ if (kvm_para_has_feature(KVM_FEATURE_PV_SEND_IPI))
+ kvm_setup_pv_ipi();
+#endif
}

static void __init kvm_init_platform(void)
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index dcf629d..a98217d 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -26,6 +26,7 @@
#define KVM_HC_MIPS_EXIT_VM 7
#define KVM_HC_MIPS_CONSOLE_OUTPUT 8
#define KVM_HC_CLOCK_PAIRING 9
+#define KVM_HC_SEND_IPI 10

/*
* hypercalls use architecture specific
--
2.7.4


2018-07-23 06:42:58

by Wanpeng Li

Subject: [PATCH v5 1/6] KVM: X86: Add kvm hypervisor init time platform setup callback

From: Wanpeng Li <[email protected]>

Add a KVM hypervisor init-time platform setup callback, which
will be used to replace the native apic hooks with paravirtual
hooks.

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
arch/x86/kernel/kvm.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 5b2300b..591bcf2 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -624,12 +624,22 @@ static uint32_t __init kvm_detect(void)
return kvm_cpuid_base();
}

+static void __init kvm_apic_init(void)
+{
+}
+
+static void __init kvm_init_platform(void)
+{
+ x86_platform.apic_post_init = kvm_apic_init;
+}
+
const __initconst struct hypervisor_x86 x86_hyper_kvm = {
.name = "KVM",
.detect = kvm_detect,
.type = X86_HYPER_KVM,
.init.guest_late_init = kvm_guest_init,
.init.x2apic_available = kvm_para_available,
+ .init.init_platform = kvm_init_platform,
};

static __init int activate_jump_labels(void)
--
2.7.4


2018-08-02 13:03:49

by Paolo Bonzini

Subject: Re: [PATCH v5 3/6] KVM: X86: Fallback to original apic hooks when bad happens

On 23/07/2018 08:39, Wanpeng Li wrote:
> From: Wanpeng Li <[email protected]>
>
> Fallback to original apic hooks when unlikely kvm fails to add the
> pending IRQ to lapic.
>
> Cc: Paolo Bonzini <[email protected]>
> Cc: Radim Krčmář <[email protected]>
> Cc: Vitaly Kuznetsov <[email protected]>
> Signed-off-by: Wanpeng Li <[email protected]>
> ---
> arch/x86/kernel/kvm.c | 20 +++++++++++++-------
> 1 file changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index eed6046..57eb4a2 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -47,6 +47,7 @@
> #include <asm/hypervisor.h>
> #include <asm/kvm_guest.h>
>
> +static struct apic orig_apic;
> static int kvmapf = 1;
>
> static int __init parse_no_kvmapf(char *arg)
> @@ -454,10 +455,10 @@ static void __init sev_map_percpu_data(void)
> }
>
> #ifdef CONFIG_SMP
> -static void __send_ipi_mask(const struct cpumask *mask, int vector)
> +static int __send_ipi_mask(const struct cpumask *mask, int vector)
> {
> unsigned long flags;
> - int cpu, apic_id, min = 0, max = 0;
> + int cpu, apic_id, min = 0, max = 0, ret = 0;
> #ifdef CONFIG_X86_64
> __uint128_t ipi_bitmap = 0;
> int cluster_size = 128;
> @@ -467,7 +468,7 @@ static void __send_ipi_mask(const struct cpumask *mask, int vector)
> #endif
>
> if (cpumask_empty(mask))
> - return;
> + return 0;
>
> local_irq_save(flags);
>
> @@ -481,7 +482,7 @@ static void __send_ipi_mask(const struct cpumask *mask, int vector)
> } else if (apic_id < min + cluster_size) {
> max = apic_id < max ? max : apic_id;
> } else {
> - kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
> + ret = kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
> (unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, vector);
> min = max = apic_id;
> ipi_bitmap = 0;
> @@ -490,11 +491,12 @@ static void __send_ipi_mask(const struct cpumask *mask, int vector)
> }
>
> if (ipi_bitmap) {
> - kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
> + ret = kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
> (unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, vector);
> }
>
> local_irq_restore(flags);
> + return ret;
> }
>
> static void kvm_send_ipi_mask(const struct cpumask *mask, int vector)
> @@ -511,7 +513,8 @@ static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int vector)
> cpumask_copy(&new_mask, mask);
> cpumask_clear_cpu(this_cpu, &new_mask);
> local_mask = &new_mask;
> - __send_ipi_mask(local_mask, vector);
> + if (__send_ipi_mask(local_mask, vector))
> + orig_apic.send_IPI_mask_allbutself(mask, vector);
> }
>
> static void kvm_send_ipi_allbutself(int vector)
> @@ -521,7 +524,8 @@ static void kvm_send_ipi_allbutself(int vector)
>
> static void kvm_send_ipi_all(int vector)
> {
> - __send_ipi_mask(cpu_online_mask, vector);
> + if (__send_ipi_mask(cpu_online_mask, vector))
> + orig_apic.send_IPI_all(vector);
> }
>
> /*
> @@ -529,6 +533,8 @@ static void kvm_send_ipi_all(int vector)
> */
> static void kvm_setup_pv_ipi(void)
> {
> + orig_apic = *apic;
> +
> apic->send_IPI_mask = kvm_send_ipi_mask;
> apic->send_IPI_mask_allbutself = kvm_send_ipi_mask_allbutself;
> apic->send_IPI_allbutself = kvm_send_ipi_allbutself;
>

Is this actually needed?

Paolo

2018-08-02 13:06:12

by Paolo Bonzini

Subject: Re: [PATCH v5 4/6] KVM: X86: Implement PV IPIs send hypercall

On 23/07/2018 08:39, Wanpeng Li wrote:
> +Returns 0 if successfully delivery the IPIs and 1 if discarded.

I'm changing this to

"Returns the number of CPUs to which the IPIs were delivered successfully"

with an obvious change to x86.c.

Paolo

2018-08-03 04:11:07

by Wanpeng Li

Subject: Re: [PATCH v5 4/6] KVM: X86: Implement PV IPIs send hypercall

On Thu, 2 Aug 2018 at 21:04, Paolo Bonzini <[email protected]> wrote:
>
> On 23/07/2018 08:39, Wanpeng Li wrote:
> > +Returns 0 if successfully delivery the IPIs and 1 if discarded.
>
> I'm changing this to
>
> "Returns the number of CPUs to which the IPIs were delivered successfully"
>
> with an obvious change to x86.c.

Thanks Paolo!

Regards,
Wanpeng Li