2019-05-28 07:58:09

by Wanpeng Li

Subject: [PATCH RESEND v2 0/3] KVM: Yield to IPI target if necessary

The idea comes from Xen: when sending a call-function IPI to many
vCPUs, yield if any of the IPI target vCPUs was preempted. A 17%
performance increase in the ebizzy benchmark can be observed in an
over-subscribed environment (with kvm-pv-tlb-flush disabled, so that
TLB flushes exercise the call-function IPI-many path, since
call-function IPIs are not easy to trigger from a userspace workload).

v1 -> v2:
* check map is not NULL
* check map->phys_map[dest_id] is not NULL
* make kvm_sched_yield static
* change dest_id to unsigned long

Wanpeng Li (3):
KVM: X86: Implement PV sched yield in linux guest
KVM: X86: Implement PV sched yield hypercall
KVM: X86: Expose PV_SCHED_YIELD CPUID feature bit to guest

Documentation/virtual/kvm/cpuid.txt | 4 ++++
Documentation/virtual/kvm/hypercalls.txt | 11 +++++++++++
arch/x86/include/uapi/asm/kvm_para.h | 1 +
arch/x86/kernel/kvm.c | 21 +++++++++++++++++++++
arch/x86/kvm/cpuid.c | 3 ++-
arch/x86/kvm/x86.c | 26 ++++++++++++++++++++++++++
include/uapi/linux/kvm_para.h | 1 +
7 files changed, 66 insertions(+), 1 deletion(-)

--
2.7.4


2019-05-28 07:58:10

by Wanpeng Li

Subject: [PATCH RESEND v2 1/3] KVM: X86: Implement PV sched yield in linux guest

From: Wanpeng Li <[email protected]>

When sending a call-function IPI to many vCPUs, yield if any of the
IPI target vCPUs was preempted. We simply yield to the first preempted
target vCPU we find, since the state of the target vCPUs can change
underneath us and we want to avoid race conditions.

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
Documentation/virtual/kvm/hypercalls.txt | 11 +++++++++++
arch/x86/include/uapi/asm/kvm_para.h | 1 +
arch/x86/kernel/kvm.c | 21 +++++++++++++++++++++
include/uapi/linux/kvm_para.h | 1 +
4 files changed, 34 insertions(+)

diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
index da24c13..da21065 100644
--- a/Documentation/virtual/kvm/hypercalls.txt
+++ b/Documentation/virtual/kvm/hypercalls.txt
@@ -141,3 +141,14 @@ a0 corresponds to the APIC ID in the third argument (a2), bit 1
corresponds to the APIC ID a2+1, and so on.

Returns the number of CPUs to which the IPIs were delivered successfully.
+
+7. KVM_HC_SCHED_YIELD
+------------------------
+Architecture: x86
+Status: active
+Purpose: Hypercall used to yield if the IPI target vCPU is preempted
+
+a0: destination APIC ID
+
+Usage example: When sending a call-function IPI-many to vCPUs, yield if
+any of the IPI target vCPUs was preempted.
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 19980ec..d0bf77c 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -29,6 +29,7 @@
#define KVM_FEATURE_PV_TLB_FLUSH 9
#define KVM_FEATURE_ASYNC_PF_VMEXIT 10
#define KVM_FEATURE_PV_SEND_IPI 11
+#define KVM_FEATURE_PV_SCHED_YIELD 12

#define KVM_HINTS_REALTIME 0

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 3f0cc82..54400c2 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -540,6 +540,21 @@ static void kvm_setup_pv_ipi(void)
pr_info("KVM setup pv IPIs\n");
}

+static void kvm_smp_send_call_func_ipi(const struct cpumask *mask)
+{
+ int cpu;
+
+ native_send_call_func_ipi(mask);
+
+ /* Make sure other vCPUs get a chance to run if they need to. */
+ for_each_cpu(cpu, mask) {
+ if (vcpu_is_preempted(cpu)) {
+ kvm_hypercall1(KVM_HC_SCHED_YIELD, per_cpu(x86_cpu_to_apicid, cpu));
+ break;
+ }
+ }
+}
+
static void __init kvm_smp_prepare_cpus(unsigned int max_cpus)
{
native_smp_prepare_cpus(max_cpus);
@@ -651,6 +666,12 @@ static void __init kvm_guest_init(void)
#ifdef CONFIG_SMP
smp_ops.smp_prepare_cpus = kvm_smp_prepare_cpus;
smp_ops.smp_prepare_boot_cpu = kvm_smp_prepare_boot_cpu;
+ if (kvm_para_has_feature(KVM_FEATURE_PV_SCHED_YIELD) &&
+ !kvm_para_has_hint(KVM_HINTS_REALTIME) &&
+ kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
+ smp_ops.send_call_func_ipi = kvm_smp_send_call_func_ipi;
+ pr_info("KVM setup pv sched yield\n");
+ }
if (cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, "x86/kvm:online",
kvm_cpu_online, kvm_cpu_down_prepare) < 0)
pr_err("kvm_guest: Failed to install cpu hotplug callbacks\n");
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index 6c0ce49..8b86609 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -28,6 +28,7 @@
#define KVM_HC_MIPS_CONSOLE_OUTPUT 8
#define KVM_HC_CLOCK_PAIRING 9
#define KVM_HC_SEND_IPI 10
+#define KVM_HC_SCHED_YIELD 11

/*
* hypercalls use architecture specific
--
2.7.4

2019-05-28 07:58:55

by Wanpeng Li

Subject: [PATCH RESEND v2 3/3] KVM: X86: Expose PV_SCHED_YIELD CPUID feature bit to guest

From: Wanpeng Li <[email protected]>

Expose the PV_SCHED_YIELD feature bit to the guest; the guest can check
this feature bit before using paravirtualized sched yield.

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
Documentation/virtual/kvm/cpuid.txt | 4 ++++
arch/x86/kvm/cpuid.c | 3 ++-
2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
index 97ca194..1c39683 100644
--- a/Documentation/virtual/kvm/cpuid.txt
+++ b/Documentation/virtual/kvm/cpuid.txt
@@ -66,6 +66,10 @@ KVM_FEATURE_PV_SEND_IPI || 11 || guest checks this feature bit
|| || before using paravirtualized
|| || send IPIs.
------------------------------------------------------------------------------
+KVM_FEATURE_PV_SCHED_YIELD || 12 || guest checks this feature bit
+ || || before using paravirtualized
+ || || sched yield.
+------------------------------------------------------------------------------
KVM_FEATURE_CLOCKSOURCE_STABLE_BIT || 24 || host will warn if no guest-side
|| || per-cpu warps are expected in
|| || kvmclock.
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index e18a9f9..c018fc8 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -643,7 +643,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
(1 << KVM_FEATURE_PV_UNHALT) |
(1 << KVM_FEATURE_PV_TLB_FLUSH) |
(1 << KVM_FEATURE_ASYNC_PF_VMEXIT) |
- (1 << KVM_FEATURE_PV_SEND_IPI);
+ (1 << KVM_FEATURE_PV_SEND_IPI) |
+ (1 << KVM_FEATURE_PV_SCHED_YIELD);

if (sched_info_on())
entry->eax |= (1 << KVM_FEATURE_STEAL_TIME);
--
2.7.4

2019-05-28 07:59:29

by Wanpeng Li

Subject: [PATCH RESEND v2 2/3] KVM: X86: Implement PV sched yield hypercall

From: Wanpeng Li <[email protected]>

The target vCPUs are in a runnable state after vcpu_kick and are
therefore suitable yield targets. This patch implements the sched
yield hypercall.

A 17% performance increase in the ebizzy benchmark can be observed in
an over-subscribed environment (with kvm-pv-tlb-flush disabled, so that
TLB flushes exercise the call-function IPI-many path, since
call-function IPIs are not easy to trigger from a userspace workload).

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
arch/x86/kvm/x86.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e7e57de..f73af3b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7172,6 +7172,28 @@ void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
}

+static void kvm_sched_yield(struct kvm *kvm, unsigned long dest_id)
+{
+ struct kvm_vcpu *target = NULL;
+ struct kvm_apic_map *map;
+
+ rcu_read_lock();
+ map = rcu_dereference(kvm->arch.apic_map);
+
+ if (unlikely(!map))
+ goto out;
+
+ /* Bounds-check the guest-supplied APIC ID and the map entry itself. */
+ if (dest_id <= map->max_apic_id && map->phys_map[dest_id])
+ target = map->phys_map[dest_id]->vcpu;
+
+out:
+ rcu_read_unlock();
+
+ if (target)
+ kvm_vcpu_yield_to(target);
+}
+
int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
{
unsigned long nr, a0, a1, a2, a3, ret;
@@ -7218,6 +7240,10 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
case KVM_HC_SEND_IPI:
ret = kvm_pv_send_ipi(vcpu->kvm, a0, a1, a2, a3, op_64_bit);
break;
+ case KVM_HC_SCHED_YIELD:
+ kvm_sched_yield(vcpu->kvm, a0);
+ ret = 0;
+ break;
default:
ret = -KVM_ENOSYS;
break;
--
2.7.4
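
For context, the host-side yield relies on kvm_vcpu_yield_to(), an
existing KVM helper that this series does not modify. Paraphrased from
kernels of this era (details may differ between versions), it resolves
the target vCPU's task and performs a directed yield to it:

int kvm_vcpu_yield_to(struct kvm_vcpu *target)
{
	struct pid *pid;
	struct task_struct *task = NULL;
	int ret = 0;

	rcu_read_lock();
	pid = rcu_dereference(target->pid);
	if (pid)
		task = get_pid_task(pid, PIDTYPE_PID);
	rcu_read_unlock();
	if (!task)
		return ret;

	/* Directed yield: hand our timeslice to the target vCPU's task. */
	ret = yield_to(task, 1);
	put_task_struct(task);

	return ret;
}

This is the same directed-yield primitive that kvm_vcpu_on_spin() uses
for pause-loop exiting, which is why the commit message stresses that
the target vCPUs are runnable after vcpu_kick: yield_to() can only
boost a runnable task.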