This is a simple optimization for kvm_vcpu_on_spin; the main idea is
described in patch 1's commit message.
I did some tests based on the RFC version; the results show that it
improves performance slightly.
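In short, kvm_vcpu_on_spin() learns whether the spinning vcpu was in
kernel mode and, if so, skips yield candidates that were preempted in
user mode. A condensed sketch of the new filter (the full hunk is in
patch 1 below; elided parts are marked with comments):

    /* Sketch, condensed from the virt/kvm/kvm_main.c hunk in patch 1. */
    void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool me_in_kern)
    {
            /* ... existing setup and candidate scan ... */
            kvm_for_each_vcpu(i, vcpu, kvm) {
                    /* ... existing skips: self, not runnable, ... */
                    /* a kernel-mode spinner only yields to kernel-mode vcpus */
                    if (me_in_kern && !kvm_arch_vcpu_in_kernel(vcpu))
                            continue;
                    if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
                            continue;
                    /* ... existing kvm_vcpu_yield_to(vcpu) and bookkeeping ... */
            }
    }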
== Geekbench-3.4.1 ==
VM1: 8 vCPUs, 4G RAM; vcpu(0...7) are 1:1 pinned to pcpu(6...11,18,19),
     running Geekbench-3.4.1 for *10 runs*
VM2/VM3/VM4: configuration is the same as VM1;
     each vcpu's usage (as seen by top in the guest) is stressed to 40%
The comparison of each testcase's score
(higher is better):

                    before      after    improvement
Integer
  single            1176.7     1179.0       0.2%
  multi             3459.5     3426.5      -0.9%
Float
  single            1150.5     1150.9       0.0%
  multi             3364.5     3391.9       0.8%
Memory (Stream)
  single            1768.7     1773.1       0.2%
  multi             2511.6     2557.2       1.8%
Overall
  single            1284.2     1286.2       0.2%
  multi             3231.4     3238.4       0.2%
== kernbench-0.42 ==
VM1: 8 vCPUs, 12G RAM; vcpu(0...7) are 1:1 pinned to pcpu(6...11,18,19),
     running "kernbench -n 10"
VM2/VM3/VM4: configuration is the same as VM1;
     each vcpu's usage (as seen by top in the guest) is stressed to 40%
The comparison of 'Elapsed Time'
(lower is better):

                    before      after    improvement
load -j4            12.762     12.751       0.1%
load -j32            9.743      8.955       8.1%
load -j              9.688      9.229       4.7%
Physical Machine:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 45
Model name: Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz
Stepping: 7
CPU MHz: 2799.902
BogoMIPS: 5004.67
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 15360K
NUMA node0 CPU(s): 0-5,12-17
NUMA node1 CPU(s): 6-11,18-23
---
Changes since V1:
- split the implementation of s390 & arm. [David]
- refactor the implementations according to the suggestions. [Paolo]
Changes since RFC:
- only cache the result for x86. [David & Cornelia & Paolo]
- add performance numbers. [David]
- implement arm/s390. [Christoffer & David]
- refactor the implementations. [me]
---
Longpeng(Mike) (4):
KVM: add spinlock optimization framework
KVM: X86: implement the logic for spinlock optimization
KVM: s390: implements the kvm_arch_vcpu_in_kernel()
KVM: arm: implements the kvm_arch_vcpu_in_kernel()
arch/arm/kvm/handle_exit.c | 2 +-
arch/arm64/kvm/handle_exit.c | 2 +-
arch/mips/kvm/mips.c | 6 ++++++
arch/powerpc/kvm/powerpc.c | 6 ++++++
arch/s390/kvm/diag.c | 2 +-
arch/s390/kvm/kvm-s390.c | 6 ++++++
arch/x86/include/asm/kvm_host.h | 5 +++++
arch/x86/kvm/hyperv.c | 2 +-
arch/x86/kvm/svm.c | 10 +++++++++-
arch/x86/kvm/vmx.c | 16 +++++++++++++++-
arch/x86/kvm/x86.c | 11 +++++++++++
include/linux/kvm_host.h | 3 ++-
virt/kvm/arm/arm.c | 5 +++++
virt/kvm/kvm_main.c | 4 +++-
14 files changed, 72 insertions(+), 8 deletions(-)
--
1.8.3.1
This implements kvm_arch_vcpu_in_kernel() for ARM.
Signed-off-by: Longpeng(Mike) <[email protected]>
---
virt/kvm/arm/arm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 862f820..b9f68e4 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -418,7 +418,7 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
{
- return false;
+ return vcpu_mode_priv(vcpu);
}
/* Just ensure a guest exit from a particular CPU */
--
1.8.3.1
1. Implement kvm_arch_vcpu_in_kernel(). Because get_cpl requires
vcpu_load, we must cache the result (whether the vcpu was preempted
while its cpl=0) in kvm_vcpu_arch.
2. Add a ->spin_in_kernel hook, because we can get an extra benefit on VMX.
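A condensed sketch of the resulting x86 flow (the full hunks are in the
diff below): the CPL is sampled once while the vcpu is still loaded, and
only the cached flag is read afterwards.

    /* Sketch, condensed from the arch/x86/kvm/x86.c hunks below. */
    void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
    {
            /* the vcpu is still loaded here, so get_cpl() is legal */
            if (vcpu->preempted)
                    vcpu->arch.preempted_in_kernel = !kvm_x86_ops->get_cpl(vcpu);
            /* ... rest of the function is unchanged ... */
    }

    bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
    {
            /* no vcpu_load needed: just report the cached flag */
            return vcpu->arch.preempted_in_kernel;
    }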
Signed-off-by: Longpeng(Mike) <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 5 +++++
arch/x86/kvm/hyperv.c | 2 +-
arch/x86/kvm/svm.c | 8 +++++++-
arch/x86/kvm/vmx.c | 16 +++++++++++++++-
arch/x86/kvm/x86.c | 7 ++++++-
5 files changed, 34 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 87ac4fb..d2b2d57 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -688,6 +688,9 @@ struct kvm_vcpu_arch {
/* GPA available (AMD only) */
bool gpa_available;
+
+ /* be preempted when it's in kernel-mode(cpl=0) */
+ bool preempted_in_kernel;
};
struct kvm_lpage_info {
@@ -1057,6 +1060,8 @@ struct kvm_x86_ops {
void (*cancel_hv_timer)(struct kvm_vcpu *vcpu);
void (*setup_mce)(struct kvm_vcpu *vcpu);
+
+ bool (*spin_in_kernel)(struct kvm_vcpu *vcpu);
};
struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index cd0e6e6..dec5e8a 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1268,7 +1268,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
switch (code) {
case HVCALL_NOTIFY_LONG_SPIN_WAIT:
- kvm_vcpu_on_spin(vcpu, kvm_arch_vcpu_in_kernel(vcpu));
+ kvm_vcpu_on_spin(vcpu, kvm_x86_ops->spin_in_kernel(vcpu));
break;
case HVCALL_POST_MESSAGE:
case HVCALL_SIGNAL_EVENT:
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index e6ed24e..ccb6df7 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3751,7 +3751,7 @@ static int pause_interception(struct vcpu_svm *svm)
{
struct kvm_vcpu *vcpu = &(svm->vcpu);
- kvm_vcpu_on_spin(vcpu, kvm_arch_vcpu_in_kernel(vcpu));
+ kvm_vcpu_on_spin(vcpu, kvm_x86_ops->spin_in_kernel(vcpu));
return 1;
}
@@ -5364,6 +5364,11 @@ static void svm_setup_mce(struct kvm_vcpu *vcpu)
vcpu->arch.mcg_cap &= 0x1ff;
}
+static bool svm_spin_in_kernel(struct kvm_vcpu *vcpu)
+{
+ return svm_get_cpl(vcpu) == 0;
+}
+
static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
.cpu_has_kvm_support = has_svm,
.disabled_by_bios = is_disabled,
@@ -5476,6 +5481,7 @@ static void svm_setup_mce(struct kvm_vcpu *vcpu)
.deliver_posted_interrupt = svm_deliver_avic_intr,
.update_pi_irte = svm_update_pi_irte,
.setup_mce = svm_setup_mce,
+ .spin_in_kernel = svm_spin_in_kernel,
};
static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9d6223a..297a158 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6761,7 +6761,8 @@ static int handle_pause(struct kvm_vcpu *vcpu)
if (ple_gap)
grow_ple_window(vcpu);
- kvm_vcpu_on_spin(vcpu, kvm_arch_vcpu_in_kernel(vcpu));
+ /* See comments in vmx_spin_in_kernel() */
+ kvm_vcpu_on_spin(vcpu, true);
return kvm_skip_emulated_instruction(vcpu);
}
@@ -11636,6 +11637,17 @@ static void vmx_setup_mce(struct kvm_vcpu *vcpu)
~FEATURE_CONTROL_LMCE;
}
+static bool vmx_spin_in_kernel(struct kvm_vcpu *vcpu)
+{
+ /*
+ * Intel sdm vol3 ch-25.1.3 says: The “PAUSE-loop exiting”
+ * VM-execution control is ignored if CPL > 0. OTOH, KVM
+ * never set PAUSE_EXITING and just set PLE if supported,
+ * so the vcpu must be CPL=0 if it gets a PAUSE exit.
+ */
+ return true;
+}
+
static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
.cpu_has_kvm_support = cpu_has_kvm_support,
.disabled_by_bios = vmx_disabled_by_bios,
@@ -11763,6 +11775,8 @@ static void vmx_setup_mce(struct kvm_vcpu *vcpu)
#endif
.setup_mce = vmx_setup_mce,
+
+ .spin_in_kernel = vmx_spin_in_kernel,
};
static int __init vmx_init(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4430be6..28299b9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2881,6 +2881,10 @@ static void kvm_steal_time_set_preempted(struct kvm_vcpu *vcpu)
void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
{
int idx;
+
+ if (vcpu->preempted)
+ vcpu->arch.preempted_in_kernel = !kvm_x86_ops->get_cpl(vcpu);
+
/*
* Disable page faults because we're in atomic context here.
* kvm_write_guest_offset_cached() would call might_fault()
@@ -7992,6 +7996,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
kvm_pmu_init(vcpu);
vcpu->arch.pending_external_vector = -1;
+ vcpu->arch.preempted_in_kernel = false;
kvm_hv_vcpu_init(vcpu);
@@ -8441,7 +8446,7 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
{
- return false;
+ return vcpu->arch.preempted_in_kernel;
}
EXPORT_SYMBOL_GPL(kvm_arch_vcpu_in_kernel);
--
1.8.3.1
If a vcpu exits because it is spinning on a spinlock in user mode, then
the spinlock holder may have been preempted in either user mode or
kernel mode. But if the vcpu exits while in kernel mode, then the holder
must have been preempted in kernel mode, so we should prefer a vcpu in
kernel mode as the most eligible candidate.
This introduces kvm_arch_vcpu_in_kernel() to decide whether a vcpu is in
kernel mode when it is preempted or takes a spinlock exit.
Signed-off-by: Longpeng(Mike) <[email protected]>
---
arch/arm/kvm/handle_exit.c | 2 +-
arch/arm64/kvm/handle_exit.c | 2 +-
arch/mips/kvm/mips.c | 6 ++++++
arch/powerpc/kvm/powerpc.c | 6 ++++++
arch/s390/kvm/diag.c | 2 +-
arch/s390/kvm/kvm-s390.c | 6 ++++++
arch/x86/kvm/hyperv.c | 2 +-
arch/x86/kvm/svm.c | 4 +++-
arch/x86/kvm/vmx.c | 2 +-
arch/x86/kvm/x86.c | 6 ++++++
include/linux/kvm_host.h | 3 ++-
virt/kvm/arm/arm.c | 5 +++++
virt/kvm/kvm_main.c | 4 +++-
13 files changed, 42 insertions(+), 8 deletions(-)
diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
index 54442e3..a7ea5db 100644
--- a/arch/arm/kvm/handle_exit.c
+++ b/arch/arm/kvm/handle_exit.c
@@ -67,7 +67,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
if (kvm_vcpu_get_hsr(vcpu) & HSR_WFI_IS_WFE) {
trace_kvm_wfx(*vcpu_pc(vcpu), true);
vcpu->stat.wfe_exit_stat++;
- kvm_vcpu_on_spin(vcpu);
+ kvm_vcpu_on_spin(vcpu, kvm_arch_vcpu_in_kernel(vcpu));
} else {
trace_kvm_wfx(*vcpu_pc(vcpu), false);
vcpu->stat.wfi_exit_stat++;
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 17d8a16..d6c8cb6 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -84,7 +84,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
if (kvm_vcpu_get_hsr(vcpu) & ESR_ELx_WFx_ISS_WFE) {
trace_kvm_wfx_arm64(*vcpu_pc(vcpu), true);
vcpu->stat.wfe_exit_stat++;
- kvm_vcpu_on_spin(vcpu);
+ kvm_vcpu_on_spin(vcpu, kvm_arch_vcpu_in_kernel(vcpu));
} else {
trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
vcpu->stat.wfi_exit_stat++;
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index d4b2ad1..70208be 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -98,6 +98,12 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
return !!(vcpu->arch.pending_exceptions);
}
+bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
+{
+ return false;
+}
+EXPORT_SYMBOL_GPL(kvm_arch_vcpu_in_kernel);
+
int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
{
return 1;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 1a75c0b..6184c45 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -58,6 +58,12 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
return !!(v->arch.pending_exceptions) || kvm_request_pending(v);
}
+bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
+{
+ return false;
+}
+EXPORT_SYMBOL_GPL(kvm_arch_vcpu_in_kernel);
+
int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
{
return 1;
diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
index ce865bd..4ea8c38 100644
--- a/arch/s390/kvm/diag.c
+++ b/arch/s390/kvm/diag.c
@@ -150,7 +150,7 @@ static int __diag_time_slice_end(struct kvm_vcpu *vcpu)
{
VCPU_EVENT(vcpu, 5, "%s", "diag time slice end");
vcpu->stat.diagnose_44++;
- kvm_vcpu_on_spin(vcpu);
+ kvm_vcpu_on_spin(vcpu, kvm_arch_vcpu_in_kernel(vcpu));
return 0;
}
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index af09d34..0b0c689 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2447,6 +2447,12 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
return kvm_s390_vcpu_has_irq(vcpu, 0);
}
+bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
+{
+ return false;
+}
+EXPORT_SYMBOL_GPL(kvm_arch_vcpu_in_kernel);
+
void kvm_s390_vcpu_block(struct kvm_vcpu *vcpu)
{
atomic_or(PROG_BLOCK_SIE, &vcpu->arch.sie_block->prog20);
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 337b6d2..cd0e6e6 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1268,7 +1268,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
switch (code) {
case HVCALL_NOTIFY_LONG_SPIN_WAIT:
- kvm_vcpu_on_spin(vcpu);
+ kvm_vcpu_on_spin(vcpu, kvm_arch_vcpu_in_kernel(vcpu));
break;
case HVCALL_POST_MESSAGE:
case HVCALL_SIGNAL_EVENT:
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 1107626..e6ed24e 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3749,7 +3749,9 @@ static int interrupt_window_interception(struct vcpu_svm *svm)
static int pause_interception(struct vcpu_svm *svm)
{
- kvm_vcpu_on_spin(&(svm->vcpu));
+ struct kvm_vcpu *vcpu = &(svm->vcpu);
+
+ kvm_vcpu_on_spin(vcpu, kvm_arch_vcpu_in_kernel(vcpu));
return 1;
}
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9b21b12..9d6223a 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6761,7 +6761,7 @@ static int handle_pause(struct kvm_vcpu *vcpu)
if (ple_gap)
grow_ple_window(vcpu);
- kvm_vcpu_on_spin(vcpu);
+ kvm_vcpu_on_spin(vcpu, kvm_arch_vcpu_in_kernel(vcpu));
return kvm_skip_emulated_instruction(vcpu);
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d734aa8..4430be6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8439,6 +8439,12 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
return kvm_vcpu_running(vcpu) || kvm_vcpu_has_events(vcpu);
}
+bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
+{
+ return false;
+}
+EXPORT_SYMBOL_GPL(kvm_arch_vcpu_in_kernel);
+
int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
{
return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 21a6fd6..91460aa 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -720,7 +720,7 @@ int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void *data,
bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu);
void kvm_vcpu_kick(struct kvm_vcpu *vcpu);
int kvm_vcpu_yield_to(struct kvm_vcpu *target);
-void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu);
+void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu, bool me_in_kern);
void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
@@ -800,6 +800,7 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
void kvm_arch_hardware_unsetup(void);
void kvm_arch_check_processor_compat(void *rtn);
int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
+bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu);
int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
#ifndef __KVM_HAVE_ARCH_VM_ALLOC
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index a39a1e1..862f820 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -416,6 +416,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
&& !v->arch.power_off && !v->arch.pause);
}
+bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
+{
+ return false;
+}
+
/* Just ensure a guest exit from a particular CPU */
static void exit_vm_noop(void *info)
{
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 15252d7..e7720d2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2317,7 +2317,7 @@ static bool kvm_vcpu_eligible_for_directed_yield(struct kvm_vcpu *vcpu)
#endif
}
-void kvm_vcpu_on_spin(struct kvm_vcpu *me)
+void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool me_in_kern)
{
struct kvm *kvm = me->kvm;
struct kvm_vcpu *vcpu;
@@ -2348,6 +2348,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
continue;
if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
continue;
+ if (me_in_kern && !kvm_arch_vcpu_in_kernel(vcpu))
+ continue;
if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
continue;
--
1.8.3.1
This implements kvm_arch_vcpu_in_kernel() for s390.
Signed-off-by: Longpeng(Mike) <[email protected]>
---
arch/s390/kvm/kvm-s390.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 0b0c689..e46177b 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2449,7 +2449,7 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
{
- return false;
+ return !(vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE);
}
EXPORT_SYMBOL_GPL(kvm_arch_vcpu_in_kernel);
--
1.8.3.1
On 08/08/2017 06:05, Longpeng(Mike) wrote:
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index cd0e6e6..dec5e8a 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -1268,7 +1268,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
>
> switch (code) {
> case HVCALL_NOTIFY_LONG_SPIN_WAIT:
> - kvm_vcpu_on_spin(vcpu, kvm_arch_vcpu_in_kernel(vcpu));
> + kvm_vcpu_on_spin(vcpu, kvm_x86_ops->spin_in_kernel(vcpu));
> break;
> case HVCALL_POST_MESSAGE:
> case HVCALL_SIGNAL_EVENT:
This can be true as well. I can change this on commit.
Paolo
On 08/08/2017 06:05, Longpeng(Mike) wrote:
> return 1;
> diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
> index ce865bd..4ea8c38 100644
> --- a/arch/s390/kvm/diag.c
> +++ b/arch/s390/kvm/diag.c
> @@ -150,7 +150,7 @@ static int __diag_time_slice_end(struct kvm_vcpu *vcpu)
> {
> VCPU_EVENT(vcpu, 5, "%s", "diag time slice end");
> vcpu->stat.diagnose_44++;
> - kvm_vcpu_on_spin(vcpu);
> + kvm_vcpu_on_spin(vcpu, kvm_arch_vcpu_in_kernel(vcpu));
> return 0;
> }
>
IIUC, diag is a privileged instruction, so this can also be "true".
Paolo
On Tue, 8 Aug 2017 12:05:34 +0800
"Longpeng(Mike)" <[email protected]> wrote:
> This implements the kvm_arch_vcpu_in_kernel() for s390.
>
> Signed-off-by: Longpeng(Mike) <[email protected]>
> ---
> arch/s390/kvm/kvm-s390.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 0b0c689..e46177b 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -2449,7 +2449,7 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
>
> bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
> {
> - return false;
> + return !(vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE);
> }
> EXPORT_SYMBOL_GPL(kvm_arch_vcpu_in_kernel);
>
Acked-by: Cornelia Huck <[email protected]>
On Tue, Aug 08, 2017 at 12:05:35PM +0800, Longpeng(Mike) wrote:
> This implements the kvm_arch_vcpu_in_kernel() for ARM.
>
> Signed-off-by: Longpeng(Mike) <[email protected]>
> ---
> virt/kvm/arm/arm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index 862f820..b9f68e4 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -418,7 +418,7 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
>
> bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
> {
> - return false;
> + return vcpu_mode_priv(vcpu);
> }
>
> /* Just ensure a guest exit from a particular CPU */
> --
> 1.8.3.1
>
>
I'm not taking any position on whether this series makes sense overall
or not (it's a bit concerning to do this without having measured a
benefit), but for the arm/arm64 change:
Acked-by: Christoffer Dall <[email protected]>
On Tue, Aug 8, 2017 at 9:39 AM, Christoffer Dall <[email protected]> wrote:
> On Tue, Aug 08, 2017 at 12:05:35PM +0800, Longpeng(Mike) wrote:
>> This implements the kvm_arch_vcpu_in_kernel() for ARM.
>>
>> Signed-off-by: Longpeng(Mike) <[email protected]>
>> ---
>> virt/kvm/arm/arm.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>> index 862f820..b9f68e4 100644
>> --- a/virt/kvm/arm/arm.c
>> +++ b/virt/kvm/arm/arm.c
>> @@ -418,7 +418,7 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
>>
>> bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
>> {
>> - return false;
>> + return vcpu_mode_priv(vcpu);
>> }
>>
>> /* Just ensure a guest exit from a particular CPU */
>> --
>> 1.8.3.1
>>
>>
> I'm not taking any position on whether this series makes sense overall
> or not (it's a bit concerning to do this without having measured a
> benefit), but for the arm/arm64 change:
oh, sorry, strike that, I didn't notice you had added performance
results in the cover letter.
Thanks,
-Christoffer
>
> Acked-by: Christoffer Dall <[email protected]>
On Tue, 8 Aug 2017 12:05:31 +0800
"Longpeng(Mike)" <[email protected]> wrote:
> This is a simple optimization for kvm_vcpu_on_spin, the
> main idea is described in patch-1's commit msg.
I think this generally looks good now.
>
> I did some tests base on the RFC version, the result shows
> that it can improves the performance slightly.
Did you re-run tests on this version?
I would also like to see some s390 numbers; unfortunately I only have a
z/VM environment and any performance numbers would be nearly useless
there. Maybe somebody within IBM with a better setup can run a quick
test?
On Tue, 8 Aug 2017 09:34:14 +0200
Paolo Bonzini <[email protected]> wrote:
> On 08/08/2017 06:05, Longpeng(Mike) wrote:
> > return 1;
> > diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
> > index ce865bd..4ea8c38 100644
> > --- a/arch/s390/kvm/diag.c
> > +++ b/arch/s390/kvm/diag.c
> > @@ -150,7 +150,7 @@ static int __diag_time_slice_end(struct kvm_vcpu *vcpu)
> > {
> > VCPU_EVENT(vcpu, 5, "%s", "diag time slice end");
> > vcpu->stat.diagnose_44++;
> > - kvm_vcpu_on_spin(vcpu);
> > + kvm_vcpu_on_spin(vcpu, kvm_arch_vcpu_in_kernel(vcpu));
> > return 0;
> > }
> >
>
> IIUC, diag is a privileged instruction so this an also be "true".
>
> Paolo
Yes, indeed.
On 2017/8/8 15:41, Cornelia Huck wrote:
> On Tue, 8 Aug 2017 12:05:31 +0800
> "Longpeng(Mike)" <[email protected]> wrote:
>
>> This is a simple optimization for kvm_vcpu_on_spin, the
>> main idea is described in patch-1's commit msg.
>
> I think this generally looks good now.
>
>>
>> I did some tests base on the RFC version, the result shows
>> that it can improves the performance slightly.
>
> Did you re-run tests on this version?
Hi Cornelia,
I didn't re-run the tests on V2. But the major difference between the
RFC and V2 is that V2 only caches the result for x86 (s390/arm don't
need it), and V2 saves an expensive operation (440-1400 cycles on my
test machine) for x86/VMX.
So I think V2's performance is at least the same as the RFC's, or even
slightly better. :)
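(For reference, the two hooks in question, condensed from patch 2: on
SVM the CPL is a cheap read from the VMCB save area, while on VMX a
PAUSE exit with PLE enabled already implies CPL=0, so the VMREAD can be
skipped entirely.)

    /* Sketch, condensed from arch/x86/kvm/{svm,vmx}.c in patch 2. */
    static bool svm_spin_in_kernel(struct kvm_vcpu *vcpu)
    {
            return svm_get_cpl(vcpu) == 0;  /* reads the VMCB save area */
    }

    static bool vmx_spin_in_kernel(struct kvm_vcpu *vcpu)
    {
            /* PLE exits are taken only at CPL=0, so no VMREAD is needed */
            return true;
    }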
>
> I would also like to see some s390 numbers; unfortunately I only have a
> z/VM environment and any performance numbers would be nearly useless
> there. Maybe somebody within IBM with a better setup can run a quick
> test?
>
> .
>
--
Regards,
Longpeng(Mike)
On 2017/8/8 15:30, Paolo Bonzini wrote:
> On 08/08/2017 06:05, Longpeng(Mike) wrote:
>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>> index cd0e6e6..dec5e8a 100644
>> --- a/arch/x86/kvm/hyperv.c
>> +++ b/arch/x86/kvm/hyperv.c
>> @@ -1268,7 +1268,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
>>
>> switch (code) {
>> case HVCALL_NOTIFY_LONG_SPIN_WAIT:
>> - kvm_vcpu_on_spin(vcpu, kvm_arch_vcpu_in_kernel(vcpu));
>> + kvm_vcpu_on_spin(vcpu, kvm_x86_ops->spin_in_kernel(vcpu));
>> break;
>> case HVCALL_POST_MESSAGE:
>> case HVCALL_SIGNAL_EVENT:
>
> This can be true as well. I can change this on commit.
>
Thanks,
I hope you could help me fix the same problem in patch 1 (s390) too.
> Paolo
>
> .
>
--
Regards,
Longpeng(Mike)
On 08.08.2017 06:05, Longpeng(Mike) wrote:
> This implements the kvm_arch_vcpu_in_kernel() for s390.
>
> Signed-off-by: Longpeng(Mike) <[email protected]>
> ---
> arch/s390/kvm/kvm-s390.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 0b0c689..e46177b 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -2449,7 +2449,7 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
>
> bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
> {
> - return false;
> + return !(vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE);
> }
> EXPORT_SYMBOL_GPL(kvm_arch_vcpu_in_kernel);
>
>
Reviewed-by: David Hildenbrand <[email protected]>
--
Thanks,
David
On 08/08/2017 10:31, Longpeng (Mike) wrote:
>
>
> On 2017/8/8 15:30, Paolo Bonzini wrote:
>
>> On 08/08/2017 06:05, Longpeng(Mike) wrote:
>>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>>> index cd0e6e6..dec5e8a 100644
>>> --- a/arch/x86/kvm/hyperv.c
>>> +++ b/arch/x86/kvm/hyperv.c
>>> @@ -1268,7 +1268,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
>>>
>>> switch (code) {
>>> case HVCALL_NOTIFY_LONG_SPIN_WAIT:
>>> - kvm_vcpu_on_spin(vcpu, kvm_arch_vcpu_in_kernel(vcpu));
>>> + kvm_vcpu_on_spin(vcpu, kvm_x86_ops->spin_in_kernel(vcpu));
>>> break;
>>> case HVCALL_POST_MESSAGE:
>>> case HVCALL_SIGNAL_EVENT:
>>
>> This can be true as well. I can change this on commit.
>
> Thanks,
> hope you could help me to fix the same problem in patch-1(s390) too.
Yes. Another possibility is to always pass false in patch 1 to
kvm_vcpu_on_spin. Then the parameter can be adjusted in patches 3 and 4
(passing true for s390 and vcpu_mode_priv(vcpu) for ARM).
Paolo
> +bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
> +{
> + return false;
> +}
why don't we need an EXPORT_SYMBOL here?
> +
> /* Just ensure a guest exit from a particular CPU */
> static void exit_vm_noop(void *info)
> {
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 15252d7..e7720d2 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2317,7 +2317,7 @@ static bool kvm_vcpu_eligible_for_directed_yield(struct kvm_vcpu *vcpu)
> #endif
> }
>
> -void kvm_vcpu_on_spin(struct kvm_vcpu *me)
> +void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool me_in_kern)
> {
> struct kvm *kvm = me->kvm;
> struct kvm_vcpu *vcpu;
> @@ -2348,6 +2348,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
> continue;
> if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
> continue;
> + if (me_in_kern && !kvm_arch_vcpu_in_kernel(vcpu))
> + continue;
hm, does this patch compile? (me_in_kern)
I would even move this to another patch.
Maybe even split into
a) introducing kvm_arch_vcpu_in_kernel() for all archs
b) modifying kvm_vcpu_on_spin(), passing the result from
kvm_arch_vcpu_in_kernel()
c) filling kvm_arch_vcpu_in_kernel() with life for different archs
(multiple patches)
d) pimping kvm_vcpu_on_spin()
> if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
> continue;
>
>
--
Thanks,
David
On 08.08.2017 10:42, David Hildenbrand wrote:
>
>> +bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
>> +{
>> + return false;
>> +}
>
> why don't we need an EXPORT_SYMBOL here?
>
>> +
>> /* Just ensure a guest exit from a particular CPU */
>> static void exit_vm_noop(void *info)
>> {
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index 15252d7..e7720d2 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -2317,7 +2317,7 @@ static bool kvm_vcpu_eligible_for_directed_yield(struct kvm_vcpu *vcpu)
>> #endif
>> }
>>
>> -void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>> +void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool me_in_kern)
>> {
>> struct kvm *kvm = me->kvm;
>> struct kvm_vcpu *vcpu;
>> @@ -2348,6 +2348,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>> continue;
>> if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
>> continue;
>> + if (me_in_kern && !kvm_arch_vcpu_in_kernel(vcpu))
>> + continue;
>
>
> hm, does this patch compile? (me_in_kern)
Pardon me, I missed the parameter, so ignore this comment. The comment
regarding splitting up below still holds :)
>
> I would even move this to an other patch.
>
> Maybe even split into
>
> a) introducing kvm_arch_vcpu_in_kernel() for all archs
> b) modifying kvm_vcpu_on_spin(), passing the result from
> kvm_arch_vcpu_in_kernel()
> c) filling kvm_arch_vcpu_in_kernel() with life for different archs
> (multiple patches)
> d) pimping kvm_vcpu_on_spin()
>
>> if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
>> continue;
>>
>>
>
>
--
Thanks,
David
On 08/08/2017 10:42, David Hildenbrand wrote:
>
>> +bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
>> +{
>> + return false;
>> +}
>
> why don't we need an EXPORT_SYMBOL here?
Is it used outside the KVM module? I think no architecture actually needs
to export it.
>> -void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>> +void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool me_in_kern)
>> {
>> struct kvm *kvm = me->kvm;
>> struct kvm_vcpu *vcpu;
>> @@ -2348,6 +2348,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>> continue;
>> if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
>> continue;
>> + if (me_in_kern && !kvm_arch_vcpu_in_kernel(vcpu))
>> + continue;
>
>
> hm, does this patch compile? (me_in_kern)
Why not? :) This is what I have:
From d62a40d49f44ff7e789a15416316ef1cba93fa85 Mon Sep 17 00:00:00 2001
From: "Longpeng(Mike)" <[email protected]>
Date: Tue, 8 Aug 2017 12:05:32 +0800
Subject: [PATCH 1/4] KVM: add spinlock optimization framework
If a vcpu exits due to request a user mode spinlock, then
the spinlock-holder may be preempted in user mode or kernel mode.
(Note that not all architectures trap spin loops in user mode,
only AMD x86 and ARM/ARM64 currently do).
But if a vcpu exits in kernel mode, then the holder must be
preempted in kernel mode, so we should choose a vcpu in kernel mode
as a more likely candidate for the lock holder.
This introduces kvm_arch_vcpu_in_kernel() to decide whether the
vcpu is in kernel-mode when it's preempted. kvm_vcpu_on_spin's
new argument says the same of the spinning VCPU.
Signed-off-by: Longpeng(Mike) <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/arm/kvm/handle_exit.c | 2 +-
arch/arm64/kvm/handle_exit.c | 2 +-
arch/mips/kvm/mips.c | 5 +++++
arch/powerpc/kvm/powerpc.c | 5 +++++
arch/s390/kvm/diag.c | 2 +-
arch/s390/kvm/kvm-s390.c | 5 +++++
arch/x86/kvm/hyperv.c | 2 +-
arch/x86/kvm/svm.c | 2 +-
arch/x86/kvm/vmx.c | 2 +-
arch/x86/kvm/x86.c | 5 +++++
include/linux/kvm_host.h | 3 ++-
virt/kvm/arm/arm.c | 5 +++++
virt/kvm/kvm_main.c | 4 +++-
13 files changed, 36 insertions(+), 8 deletions(-)
diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
index 54442e375354..196122bb6968 100644
--- a/arch/arm/kvm/handle_exit.c
+++ b/arch/arm/kvm/handle_exit.c
@@ -67,7 +67,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
if (kvm_vcpu_get_hsr(vcpu) & HSR_WFI_IS_WFE) {
trace_kvm_wfx(*vcpu_pc(vcpu), true);
vcpu->stat.wfe_exit_stat++;
- kvm_vcpu_on_spin(vcpu);
+ kvm_vcpu_on_spin(vcpu, false);
} else {
trace_kvm_wfx(*vcpu_pc(vcpu), false);
vcpu->stat.wfi_exit_stat++;
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 17d8a1677a0b..da57622cacca 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -84,7 +84,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
if (kvm_vcpu_get_hsr(vcpu) & ESR_ELx_WFx_ISS_WFE) {
trace_kvm_wfx_arm64(*vcpu_pc(vcpu), true);
vcpu->stat.wfe_exit_stat++;
- kvm_vcpu_on_spin(vcpu);
+ kvm_vcpu_on_spin(vcpu, false);
} else {
trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
vcpu->stat.wfi_exit_stat++;
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index d4b2ad18eef2..70208bed5a15 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -98,6 +98,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
return !!(vcpu->arch.pending_exceptions);
}
+bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
+{
+ return false;
+}
+
int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
{
return 1;
}
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 1a75c0b5f4ca..6184c45015f3 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -58,6 +58,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
return !!(v->arch.pending_exceptions) || kvm_request_pending(v);
}
+bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
+{
+ return false;
+}
+
int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
{
return 1;
diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
index ce865bd4f81d..6182edebea3d 100644
--- a/arch/s390/kvm/diag.c
+++ b/arch/s390/kvm/diag.c
@@ -150,7 +150,7 @@ static int __diag_time_slice_end(struct kvm_vcpu *vcpu)
{
VCPU_EVENT(vcpu, 5, "%s", "diag time slice end");
vcpu->stat.diagnose_44++;
- kvm_vcpu_on_spin(vcpu);
+ kvm_vcpu_on_spin(vcpu, false);
return 0;
}
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index af09d3437631..0b0c689f1d9a 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2447,6 +2447,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
return kvm_s390_vcpu_has_irq(vcpu, 0);
}
+bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
+{
+ return false;
+}
+
void kvm_s390_vcpu_block(struct kvm_vcpu *vcpu)
{
atomic_or(PROG_BLOCK_SIE, &vcpu->arch.sie_block->prog20);
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index bf9992300efa..5243d54f73ab 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1274,7 +1274,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
switch (code) {
case HVCALL_NOTIFY_LONG_SPIN_WAIT:
- kvm_vcpu_on_spin(vcpu);
+ kvm_vcpu_on_spin(vcpu, false);
break;
case HVCALL_POST_MESSAGE:
case HVCALL_SIGNAL_EVENT:
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 2432bb952a30..0cc486fd9871 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3749,7 +3749,7 @@ static int interrupt_window_interception(struct vcpu_svm *svm)
static int pause_interception(struct vcpu_svm *svm)
{
- kvm_vcpu_on_spin(&(svm->vcpu));
+ kvm_vcpu_on_spin(&svm->vcpu, false);
return 1;
}
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 2c0f5287fb78..fef784c22190 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6781,7 +6781,7 @@ static int handle_pause(struct kvm_vcpu *vcpu)
if (ple_gap)
grow_ple_window(vcpu);
- kvm_vcpu_on_spin(vcpu);
+ kvm_vcpu_on_spin(vcpu, false);
return kvm_skip_emulated_instruction(vcpu);
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 33fd6b6419ef..aba9d038d09e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8432,6 +8432,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
return kvm_vcpu_running(vcpu) || kvm_vcpu_has_events(vcpu);
}
+bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
+{
+ return false;
+}
+
int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
{
return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 28112d7917c1..6882538eda32 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -720,7 +720,7 @@ int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void *data,
bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu);
void kvm_vcpu_kick(struct kvm_vcpu *vcpu);
int kvm_vcpu_yield_to(struct kvm_vcpu *target);
-void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu);
+void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu, bool usermode_vcpu_not_eligible);
void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
@@ -800,6 +800,7 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
void kvm_arch_hardware_unsetup(void);
void kvm_arch_check_processor_compat(void *rtn);
int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
+bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu);
int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
#ifndef __KVM_HAVE_ARCH_VM_ALLOC
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index a39a1e161e63..862f820d06d4 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -416,6 +416,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
&& !v->arch.power_off && !v->arch.pause);
}
+bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
+{
+ return false;
+}
+
/* Just ensure a guest exit from a particular CPU */
static void exit_vm_noop(void *info)
{
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 15252d723b54..e17c40d986f3 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2317,7 +2317,7 @@ static bool kvm_vcpu_eligible_for_directed_yield(struct kvm_vcpu *vcpu)
#endif
}
-void kvm_vcpu_on_spin(struct kvm_vcpu *me)
+void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
{
struct kvm *kvm = me->kvm;
struct kvm_vcpu *vcpu;
@@ -2348,6 +2348,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
continue;
if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
continue;
+ if (yield_to_kernel_mode && !kvm_arch_vcpu_in_kernel(vcpu))
+ continue;
if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
continue;
--
1.8.3.1
From 325c00511ab67d7e46e887ecea0d8b6b91005948 Mon Sep 17 00:00:00 2001
From: "Longpeng(Mike)" <[email protected]>
Date: Tue, 8 Aug 2017 12:05:33 +0800
Subject: [PATCH 2/4] KVM: X86: implement the logic for spinlock optimization
get_cpl requires vcpu_load, so we must cache the result (whether the
vcpu was preempted when its cpl=0) in kvm_vcpu_arch.
Signed-off-by: Longpeng(Mike) <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 3 +++
arch/x86/kvm/hyperv.c | 2 +-
arch/x86/kvm/svm.c | 5 ++++-
arch/x86/kvm/vmx.c | 8 +++++++-
arch/x86/kvm/x86.c | 7 ++++++-
5 files changed, 21 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 87ac4fba6d8e..1679aabcabe5 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -688,6 +688,9 @@ struct kvm_vcpu_arch {
/* GPA available (AMD only) */
bool gpa_available;
+
+ /* be preempted when it's in kernel-mode(cpl=0) */
+ bool preempted_in_kernel;
};
struct kvm_lpage_info {
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 5243d54f73ab..dc97f2544b6f 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1274,7 +1274,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
switch (code) {
case HVCALL_NOTIFY_LONG_SPIN_WAIT:
- kvm_vcpu_on_spin(vcpu, false);
+ kvm_vcpu_on_spin(vcpu, true);
break;
case HVCALL_POST_MESSAGE:
case HVCALL_SIGNAL_EVENT:
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 0cc486fd9871..fc027f2e87d1 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3749,7 +3749,10 @@ static int interrupt_window_interception(struct vcpu_svm *svm)
static int pause_interception(struct vcpu_svm *svm)
{
- kvm_vcpu_on_spin(&svm->vcpu, false);
+ struct kvm_vcpu *vcpu = &svm->vcpu;
+ bool in_kernel = (svm_get_cpl(vcpu) == 0);
+
+ kvm_vcpu_on_spin(vcpu, in_kernel);
return 1;
}
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index fef784c22190..46d08b389e36 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6781,7 +6781,13 @@ static int handle_pause(struct kvm_vcpu *vcpu)
if (ple_gap)
grow_ple_window(vcpu);
- kvm_vcpu_on_spin(vcpu, false);
+ /*
+ * Intel sdm vol3 ch-25.1.3 says: The "PAUSE-loop exiting"
+ * VM-execution control is ignored if CPL > 0. OTOH, KVM
+ * never set PAUSE_EXITING and just only uses PLE,
+ * so the vcpu must be CPL=0 if it gets a PAUSE exit.
+ */
+ kvm_vcpu_on_spin(vcpu, true);
return kvm_skip_emulated_instruction(vcpu);
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index aba9d038d09e..bee9ad0ed4cd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2873,6 +2873,10 @@ static void kvm_steal_time_set_preempted(struct kvm_vcpu *vcpu)
void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
{
int idx;
+
+ if (vcpu->preempted)
+ vcpu->arch.preempted_in_kernel = !kvm_x86_ops->get_cpl(vcpu);
+
/*
* Disable page faults because we're in atomic context here.
* kvm_write_guest_offset_cached() would call might_fault()
@@ -7985,6 +7989,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
kvm_pmu_init(vcpu);
vcpu->arch.pending_external_vector = -1;
+ vcpu->arch.preempted_in_kernel = false;
kvm_hv_vcpu_init(vcpu);
@@ -8434,7 +8439,7 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
{
- return false;
+ return vcpu->arch.preempted_in_kernel;
}
int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
--
1.8.3.1
From c83cf5cf8d972ea37b55db7e059767f5e3af55d0 Mon Sep 17 00:00:00 2001
From: "Longpeng(Mike)" <[email protected]>
Date: Tue, 8 Aug 2017 12:05:34 +0800
Subject: [PATCH 3/4] KVM: s390: implements the kvm_arch_vcpu_in_kernel()
This implements kvm_arch_vcpu_in_kernel() for s390. DIAG is a privileged
operation, so it cannot be called from problem state (user mode).
Signed-off-by: Longpeng(Mike) <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/s390/kvm/diag.c | 2 +-
arch/s390/kvm/kvm-s390.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
index 6182edebea3d..5ee90020382d 100644
--- a/arch/s390/kvm/diag.c
+++ b/arch/s390/kvm/diag.c
@@ -150,7 +150,7 @@ static int __diag_time_slice_end(struct kvm_vcpu *vcpu)
{
VCPU_EVENT(vcpu, 5, "%s", "diag time slice end");
vcpu->stat.diagnose_44++;
- kvm_vcpu_on_spin(vcpu, false);
+ kvm_vcpu_on_spin(vcpu, true);
return 0;
}
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 0b0c689f1d9a..e46177bde829 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2449,7 +2449,7 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
{
- return false;
+ return !(vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE);
}
void kvm_s390_vcpu_block(struct kvm_vcpu *vcpu)
--
1.8.3.1
From 3d7c9a55166d7f0d955774360bca72a13b4ee96c Mon Sep 17 00:00:00 2001
From: "Longpeng(Mike)" <[email protected]>
Date: Tue, 8 Aug 2017 12:05:35 +0800
Subject: [PATCH 4/4] KVM: arm: implements the kvm_arch_vcpu_in_kernel()
This implements the kvm_arch_vcpu_in_kernel() for ARM, and adjusts
the calls to kvm_vcpu_on_spin().
Signed-off-by: Longpeng(Mike) <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/arm/kvm/handle_exit.c | 2 +-
arch/arm64/kvm/handle_exit.c | 2 +-
virt/kvm/arm/arm.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
index 196122bb6968..cf8bf6bf87c4 100644
--- a/arch/arm/kvm/handle_exit.c
+++ b/arch/arm/kvm/handle_exit.c
@@ -67,7 +67,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
if (kvm_vcpu_get_hsr(vcpu) & HSR_WFI_IS_WFE) {
trace_kvm_wfx(*vcpu_pc(vcpu), true);
vcpu->stat.wfe_exit_stat++;
- kvm_vcpu_on_spin(vcpu, false);
+ kvm_vcpu_on_spin(vcpu, vcpu_mode_priv(vcpu));
} else {
trace_kvm_wfx(*vcpu_pc(vcpu), false);
vcpu->stat.wfi_exit_stat++;
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index da57622cacca..7debb74843a0 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -84,7 +84,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
if (kvm_vcpu_get_hsr(vcpu) & ESR_ELx_WFx_ISS_WFE) {
trace_kvm_wfx_arm64(*vcpu_pc(vcpu), true);
vcpu->stat.wfe_exit_stat++;
- kvm_vcpu_on_spin(vcpu, false);
+ kvm_vcpu_on_spin(vcpu, vcpu_mode_priv(vcpu));
} else {
trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
vcpu->stat.wfi_exit_stat++;
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 862f820d06d4..b9f68e4add71 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -418,7 +418,7 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
{
- return false;
+ return vcpu_mode_priv(vcpu);
}
/* Just ensure a guest exit from a particular CPU */
--
1.8.3.1
On 2017/8/8 17:00, Paolo Bonzini wrote:
> On 08/08/2017 10:42, David Hildenbrand wrote:
>>
>>> +bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
>>> +{
>>> + return false;
>>> +}
>>
>> why don't we need an EXPORT_SYMBOL here?
>
> Is it used outside the KVM module? I think no architecture actually needs
> to export it.
>
Hi Paolo & David,
In my original approach, I called kvm_arch_vcpu_in_kernel() in
handle_pause(); without EXPORT_SYMBOL the build reports:
ERROR: "kvm_arch_vcpu_in_kernel" [arch/x86/kvm/kvm-intel.ko] undefined!
ERROR: "kvm_arch_vcpu_in_kernel" [arch/x86/kvm/kvm-amd.ko] undefined!
But Paolo's approach is significantly better; it's a work of art. Thanks a lot.
--
Regards,
Longpeng(Mike)
>>> -void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>>> +void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool me_in_kern)
>>> {
>>> struct kvm *kvm = me->kvm;
>>> struct kvm_vcpu *vcpu;
>>> @@ -2348,6 +2348,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>>> continue;
>>> if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
>>> continue;
>>> + if (me_in_kern && !kvm_arch_vcpu_in_kernel(vcpu))
>>> + continue;
>>
>>
>> hm, does this patch compile? (me_in_kern)
>
> Why not? :) This is what I have:
>
>>From d62a40d49f44ff7e789a15416316ef1cba93fa85 Mon Sep 17 00:00:00 2001
> From: "Longpeng(Mike)" <[email protected]>
> Date: Tue, 8 Aug 2017 12:05:32 +0800
> Subject: [PATCH 1/4] KVM: add spinlock optimization framework
>
> If a vcpu exits due to request a user mode spinlock, then
> the spinlock-holder may be preempted in user mode or kernel mode.
> (Note that not all architectures trap spin loops in user mode,
> only AMD x86 and ARM/ARM64 currently do).
>
> But if a vcpu exits in kernel mode, then the holder must be
> preempted in kernel mode, so we should choose a vcpu in kernel mode
> as a more likely candidate for the lock holder.
>
> This introduces kvm_arch_vcpu_in_kernel() to decide whether the
> vcpu is in kernel-mode when it's preempted. kvm_vcpu_on_spin's
> new argument says the same of the spinning VCPU.
>
> Signed-off-by: Longpeng(Mike) <[email protected]>
> Signed-off-by: Paolo Bonzini <[email protected]>
> ---
> arch/arm/kvm/handle_exit.c | 2 +-
> arch/arm64/kvm/handle_exit.c | 2 +-
> arch/mips/kvm/mips.c | 5 +++++
> arch/powerpc/kvm/powerpc.c | 5 +++++
> arch/s390/kvm/diag.c | 2 +-
> arch/s390/kvm/kvm-s390.c | 5 +++++
> arch/x86/kvm/hyperv.c | 2 +-
> arch/x86/kvm/svm.c | 2 +-
> arch/x86/kvm/vmx.c | 2 +-
> arch/x86/kvm/x86.c | 5 +++++
> include/linux/kvm_host.h | 3 ++-
> virt/kvm/arm/arm.c | 5 +++++
> virt/kvm/kvm_main.c | 4 +++-
> 13 files changed, 36 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
> index 54442e375354..196122bb6968 100644
> --- a/arch/arm/kvm/handle_exit.c
> +++ b/arch/arm/kvm/handle_exit.c
> @@ -67,7 +67,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
> if (kvm_vcpu_get_hsr(vcpu) & HSR_WFI_IS_WFE) {
> trace_kvm_wfx(*vcpu_pc(vcpu), true);
> vcpu->stat.wfe_exit_stat++;
> - kvm_vcpu_on_spin(vcpu);
> + kvm_vcpu_on_spin(vcpu, false);
> } else {
> trace_kvm_wfx(*vcpu_pc(vcpu), false);
> vcpu->stat.wfi_exit_stat++;
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 17d8a1677a0b..da57622cacca 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -84,7 +84,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
> if (kvm_vcpu_get_hsr(vcpu) & ESR_ELx_WFx_ISS_WFE) {
> trace_kvm_wfx_arm64(*vcpu_pc(vcpu), true);
> vcpu->stat.wfe_exit_stat++;
> - kvm_vcpu_on_spin(vcpu);
> + kvm_vcpu_on_spin(vcpu, false);
> } else {
> trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
> vcpu->stat.wfi_exit_stat++;
> diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
> index d4b2ad18eef2..70208bed5a15 100644
> --- a/arch/mips/kvm/mips.c
> +++ b/arch/mips/kvm/mips.c
> @@ -98,6 +98,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
> return !!(vcpu->arch.pending_exceptions);
> }
>
> +bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
> +{
> + return false;
> +}
> +
> int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
> {
> return 1;
> }
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index 1a75c0b5f4ca..6184c45015f3 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -58,6 +58,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
> return !!(v->arch.pending_exceptions) || kvm_request_pending(v);
> }
>
> +bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
> +{
> + return false;
> +}
> +
> int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
> {
> return 1;
> diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
> index ce865bd4f81d..6182edebea3d 100644
> --- a/arch/s390/kvm/diag.c
> +++ b/arch/s390/kvm/diag.c
> @@ -150,7 +150,7 @@ static int __diag_time_slice_end(struct kvm_vcpu *vcpu)
> {
> VCPU_EVENT(vcpu, 5, "%s", "diag time slice end");
> vcpu->stat.diagnose_44++;
> - kvm_vcpu_on_spin(vcpu);
> + kvm_vcpu_on_spin(vcpu, false);
> return 0;
> }
>
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index af09d3437631..0b0c689f1d9a 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -2447,6 +2447,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
> return kvm_s390_vcpu_has_irq(vcpu, 0);
> }
>
> +bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
> +{
> + return false;
> +}
> +
> void kvm_s390_vcpu_block(struct kvm_vcpu *vcpu)
> {
> atomic_or(PROG_BLOCK_SIE, &vcpu->arch.sie_block->prog20);
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index bf9992300efa..5243d54f73ab 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -1274,7 +1274,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
>
> switch (code) {
> case HVCALL_NOTIFY_LONG_SPIN_WAIT:
> - kvm_vcpu_on_spin(vcpu);
> + kvm_vcpu_on_spin(vcpu, false);
> break;
> case HVCALL_POST_MESSAGE:
> case HVCALL_SIGNAL_EVENT:
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 2432bb952a30..0cc486fd9871 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -3749,7 +3749,7 @@ static int interrupt_window_interception(struct vcpu_svm *svm)
>
> static int pause_interception(struct vcpu_svm *svm)
> {
> - kvm_vcpu_on_spin(&(svm->vcpu));
> + kvm_vcpu_on_spin(&svm->vcpu, false);
> return 1;
> }
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 2c0f5287fb78..fef784c22190 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -6781,7 +6781,7 @@ static int handle_pause(struct kvm_vcpu *vcpu)
> if (ple_gap)
> grow_ple_window(vcpu);
>
> - kvm_vcpu_on_spin(vcpu);
> + kvm_vcpu_on_spin(vcpu, false);
> return kvm_skip_emulated_instruction(vcpu);
> }
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 33fd6b6419ef..aba9d038d09e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8432,6 +8432,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
> return kvm_vcpu_running(vcpu) || kvm_vcpu_has_events(vcpu);
> }
>
> +bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
> +{
> + return false;
> +}
> +
> int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
> {
> return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 28112d7917c1..6882538eda32 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -720,7 +720,7 @@ int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void *data,
> bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu);
> void kvm_vcpu_kick(struct kvm_vcpu *vcpu);
> int kvm_vcpu_yield_to(struct kvm_vcpu *target);
> -void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu);
> +void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu, bool usermode_vcpu_not_eligible);
> void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
> void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
>
> @@ -800,6 +800,7 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
> void kvm_arch_hardware_unsetup(void);
> void kvm_arch_check_processor_compat(void *rtn);
> int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
> +bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu);
> int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
>
> #ifndef __KVM_HAVE_ARCH_VM_ALLOC
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index a39a1e161e63..862f820d06d4 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -416,6 +416,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
> && !v->arch.power_off && !v->arch.pause);
> }
>
> +bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
> +{
> + return false;
> +}
> +
> /* Just ensure a guest exit from a particular CPU */
> static void exit_vm_noop(void *info)
> {
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 15252d723b54..e17c40d986f3 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2317,7 +2317,7 @@ static bool kvm_vcpu_eligible_for_directed_yield(struct kvm_vcpu *vcpu)
> #endif
> }
>
> -void kvm_vcpu_on_spin(struct kvm_vcpu *me)
> +void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
> {
> struct kvm *kvm = me->kvm;
> struct kvm_vcpu *vcpu;
> @@ -2348,6 +2348,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
> continue;
> if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
> continue;
> + if (yield_to_kernel_mode && !kvm_arch_vcpu_in_kernel(vcpu))
> + continue;
> if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
> continue;
>
--
Regards,
Longpeng(Mike)
On 08.08.2017 06:05, Longpeng(Mike) wrote:
> This is a simple optimization for kvm_vcpu_on_spin, the
> main idea is described in patch-1's commit msg.
>
> I did some tests base on the RFC version, the result shows
> that it can improves the performance slightly.
>
> == Geekbench-3.4.1 ==
> VM1: 8U,4G, vcpu(0...7) is 1:1 pinned to pcpu(6...11,18,19)
> running Geekbench-3.4.1 *10 truns*
> VM2/VM3/VM4: configure is the same as VM1
> stress each vcpu usage(seed by top in guest) to 40%
>
> The comparison of each testcase's score:
> (higher is better)
> before after improve
> Inter
> single 1176.7 1179.0 0.2%
> multi 3459.5 3426.5 -0.9%
> Float
> single 1150.5 1150.9 0.0%
> multi 3364.5 3391.9 0.8%
> Memory(stream)
> single 1768.7 1773.1 0.2%
> multi 2511.6 2557.2 1.8%
> Overall
> single 1284.2 1286.2 0.2%
> multi 3231.4 3238.4 0.2%
>
>
> == kernbench-0.42 ==
> VM1: 8U,12G, vcpu(0...7) is 1:1 pinned to pcpu(6...11,18,19)
> running "kernbench -n 10"
> VM2/VM3/VM4: configure is the same as VM1
> stress each vcpu usage(seed by top in guest) to 40%
>
> The comparison of 'Elapsed Time':
> (sooner is better)
> before after improve
> load -j4 12.762 12.751 0.1%
> load -j32 9.743 8.955 8.1%
> load -j 9.688 9.229 4.7%
>
>
> Physical Machine:
> Architecture: x86_64
> CPU op-mode(s): 32-bit, 64-bit
> Byte Order: Little Endian
> CPU(s): 24
> On-line CPU(s) list: 0-23
> Thread(s) per core: 2
> Core(s) per socket: 6
> Socket(s): 2
> NUMA node(s): 2
> Vendor ID: GenuineIntel
> CPU family: 6
> Model: 45
> Model name: Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz
> Stepping: 7
> CPU MHz: 2799.902
> BogoMIPS: 5004.67
> Virtualization: VT-x
> L1d cache: 32K
> L1i cache: 32K
> L2 cache: 256K
> L3 cache: 15360K
> NUMA node0 CPU(s): 0-5,12-17
> NUMA node1 CPU(s): 6-11,18-23
>
> ---
> Changes since V1:
> - split the implementation of s390 & arm. [David]
> - refactor the impls according to the suggestion. [Paolo]
>
> Changes since RFC:
> - only cache result for X86. [David & Cornlia & Paolo]
> - add performance numbers. [David]
> - impls arm/s390. [Christoffer & David]
> - refactor the impls. [me]
>
> ---
> Longpeng(Mike) (4):
> KVM: add spinlock optimization framework
> KVM: X86: implement the logic for spinlock optimization
> KVM: s390: implements the kvm_arch_vcpu_in_kernel()
> KVM: arm: implements the kvm_arch_vcpu_in_kernel()
>
> arch/arm/kvm/handle_exit.c | 2 +-
> arch/arm64/kvm/handle_exit.c | 2 +-
> arch/mips/kvm/mips.c | 6 ++++++
> arch/powerpc/kvm/powerpc.c | 6 ++++++
> arch/s390/kvm/diag.c | 2 +-
> arch/s390/kvm/kvm-s390.c | 6 ++++++
> arch/x86/include/asm/kvm_host.h | 5 +++++
> arch/x86/kvm/hyperv.c | 2 +-
> arch/x86/kvm/svm.c | 10 +++++++++-
> arch/x86/kvm/vmx.c | 16 +++++++++++++++-
> arch/x86/kvm/x86.c | 11 +++++++++++
> include/linux/kvm_host.h | 3 ++-
> virt/kvm/arm/arm.c | 5 +++++
> virt/kvm/kvm_main.c | 4 +++-
> 14 files changed, 72 insertions(+), 8 deletions(-)
>
I am curious: is there any architecture that allows kvm_vcpu_on_spin(vcpu)
to be triggered while _not_ in kernel mode?
I would have guessed that user space should never be allowed to make
CPU-wide decisions (giving up the CPU to the hypervisor).
E.g. s390x diag can only be executed from kernel space. VMX PAUSE is
only valid from kernel space.
I.o.w. do we need a parameter to kvm_vcpu_on_spin(vcpu) at all, or is
"me_in_kernel" basically always true?
--
Thanks,
David
On Tue, Aug 8, 2017 at 1:25 PM, David Hildenbrand <[email protected]> wrote:
> On 08.08.2017 06:05, Longpeng(Mike) wrote:
>> [...]
>
> I am curious, is there any architecture that allows to trigger
> kvm_vcpu_on_spin(vcpu); while _not_ in kernel mode?
>
> I would have guessed that user space should never be allowed to make cpu
> wide decisions (giving up the CPU to the hypervisor).
>
> E.g. s390x diag can only be executed from kernel space. VMX PAUSE is
> only valid from kernel space.
>
> I.o.w. do we need a parameter to kvm_vcpu_on_spin(vcpu); at all, or is
> "me_in_kernel" basically always true?
>
ARM can be configured to not trap WFE executed in userspace, so
kvm_vcpu_on_spin() can be triggered while the guest is _not_ in kernel mode.
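
A rough sketch of how the arm/arm64 WFE handler then feeds that into
kvm_vcpu_on_spin() (simplified, names approximate, not verbatim from the
series); because the WFE trap can be taken for a WFE executed in guest user
mode, the guest mode has to be checked explicitly, e.g. via vcpu_mode_priv():

static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
	if (kvm_vcpu_get_hsr(vcpu) & HSR_WFI_IS_WFE) {
		/* WFE: may have been executed in guest user mode. */
		vcpu->stat.wfe_exit_stat++;
		kvm_vcpu_on_spin(vcpu, vcpu_mode_priv(vcpu));
	} else {
		/* WFI: block until an interrupt arrives. */
		vcpu->stat.wfi_exit_stat++;
		kvm_vcpu_block(vcpu);
	}

	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
	return 1;
}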
Thanks,
-Christoffer
On 2017/8/8 19:25, David Hildenbrand wrote:
> On 08.08.2017 06:05, Longpeng(Mike) wrote:
>> [...]
>
> I am curious, is there any architecture that allows to trigger
> kvm_vcpu_on_spin(vcpu); while _not_ in kernel mode?
IIUC, X86/SVM will trap to the host on the PAUSE insn regardless of whether the
vcpu is in kernel mode or user mode.
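
So the SVM exit handler has to look up the CPL itself before calling
kvm_vcpu_on_spin(); roughly (a sketch, not verbatim from the series):

static int pause_interception(struct vcpu_svm *svm)
{
	struct kvm_vcpu *vcpu = &svm->vcpu;
	/* The SVM PAUSE intercept fires at any CPL, so check it here. */
	bool in_kernel = svm_get_cpl(vcpu) == 0;

	kvm_vcpu_on_spin(vcpu, in_kernel);
	return 1;
}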
>
> I would have guessed that user space should never be allowed to make cpu
> wide decisions (giving up the CPU to the hypervisor).
>
> E.g. s390x diag can only be executed from kernel space. VMX PAUSE is
> only valid from kernel space.
X86/VMX has both "PAUSE exiting" and "PAUSE-loop exiting" (PLE). KVM only uses
PLE, which is, as you said, "only valid from kernel space".
However, plain "PAUSE exiting" can cause a user-mode vcpu to exit too.
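
In other words, KVM enables SECONDARY_EXEC_PAUSE_LOOP_EXITING but not
CPU_BASED_PAUSE_EXITING, and a PAUSE-loop exit is only taken at CPL 0, so the
VMX handler can report "in kernel" without reading the CPL at all; roughly (a
sketch, not verbatim from the series):

static int handle_pause(struct kvm_vcpu *vcpu)
{
	if (ple_gap)
		grow_ple_window(vcpu);

	/* A PAUSE-loop exit implies the guest was at CPL 0. */
	kvm_vcpu_on_spin(vcpu, true);
	return kvm_skip_emulated_instruction(vcpu);
}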
>
> I.o.w. do we need a parameter to kvm_vcpu_on_spin(vcpu); at all, or is
> "me_in_kernel" basically always true?
>
--
Regards,
Longpeng(Mike)
On 08.08.2017 13:49, Longpeng (Mike) wrote:
>
>
> On 2017/8/8 19:25, David Hildenbrand wrote:
>
>> On 08.08.2017 06:05, Longpeng(Mike) wrote:
>>> [...]
>>
>> I am curious, is there any architecture that allows to trigger
>> kvm_vcpu_on_spin(vcpu); while _not_ in kernel mode?
>
>
> IIUC, X86/SVM will trap to host due to PAUSE insn no matter the vcpu is in
> kernel-mode or user-mode.
>
>>
>> I would have guessed that user space should never be allowed to make cpu
>> wide decisions (giving up the CPU to the hypervisor).
>>
>> E.g. s390x diag can only be executed from kernel space. VMX PAUSE is
>> only valid from kernel space.
>
>
> X86/VMX has "PAUSE exiting" and "PAUSE-loop exiting"(PLE). KVM only uses PLE,
> this is as you said "only valid from kernel space"
>
> However, the "PAUSE exiting" can cause user-mode vcpu exit too.
Thanks Longpeng and Christoffer!
--
Thanks,
David
On 08/08/2017 04:14 AM, Longpeng (Mike) wrote:
>
>
> On 2017/8/8 15:41, Cornelia Huck wrote:
>
>> On Tue, 8 Aug 2017 12:05:31 +0800
>> "Longpeng(Mike)" <[email protected]> wrote:
>>
>>> This is a simple optimization for kvm_vcpu_on_spin, the
>>> main idea is described in patch-1's commit msg.
>>
>> I think this generally looks good now.
>>
>>>
>>> I did some tests base on the RFC version, the result shows
>>> that it can improves the performance slightly.
>>
>> Did you re-run tests on this version?
>
>
> Hi Cornelia,
>
> I didn't re-run tests on V2. But the major difference between RFC and V2
> is that V2 only cache result for X86 (s390/arm needn't) and V2 saves a
> expensive operation ( 440-1400 cycles on my test machine ) for X86/VMX.
>
> So I think V2's performance is at least the same as RFC or even slightly
> better. :)
>
>>
>> I would also like to see some s390 numbers; unfortunately I only have a
>> z/VM environment and any performance numbers would be nearly useless
>> there. Maybe somebody within IBM with a better setup can run a quick
>> test?
Won't swear I didn't screw something up, but here's some quick numbers.
Host was 4.12.0 with and without this series, running QEMU 2.10.0-rc0.
Created 4 guests, each with 4 CPU (unpinned) and 4GB RAM. VM1 did full
kernel compiles with kernbench, which took averages of 5 runs of
different job sizes (I threw away the "-j 1" numbers). VM2-VM4 ran cpu
burners on 2 of their 4 cpus.
Numbers from VM1 kernbench output, and the delta between runs:
load -j 3          before    after     delta
Elapsed Time       183.178   182.58    -0.598
User Time          534.19    531.52    -2.67
System Time        32.538    33.37     0.832
Percent CPU        308.8     309       0.2
Context Switches   98484.6   99001     516.4
Sleeps             227347    228752    1405

load -j 16         before    after     delta
Elapsed Time       153.352   147.59    -5.762
User Time          545.829   533.41    -12.419
System Time        34.289    34.85     0.561
Percent CPU        347.6     348       0.4
Context Switches   160518    159120    -1398
Sleeps             240740    240536    -204
- Eric
On 2017/8/10 21:18, Eric Farman wrote:
>
>
> On 08/08/2017 04:14 AM, Longpeng (Mike) wrote:
>> [...]
>
> Won't swear I didn't screw something up, but here's some quick numbers. Host was
> 4.12.0 with and without this series, running QEMU 2.10.0-rc0. Created 4 guests,
> each with 4 CPU (unpinned) and 4GB RAM. VM1 did full kernel compiles with
> kernbench, which took averages of 5 runs of different job sizes (I threw away
> the "-j 1" numbers). VM2-VM4 ran cpu burners on 2 of their 4 cpus.
>
> Numbers from VM1 kernbench output, and the delta between runs:
>
> load -j 3 before after delta
> Elapsed Time 183.178 182.58 -0.598
> User Time 534.19 531.52 -2.67
> System Time 32.538 33.37 0.832
> Percent CPU 308.8 309 0.2
> Context Switches 98484.6 99001 516.4
> Sleeps 227347 228752 1405
>
> load -j 16 before after delta
> Elapsed Time 153.352 147.59 -5.762
> User Time 545.829 533.41 -12.419
> System Time 34.289 34.85 0.561
> Percent CPU 347.6 348 0.4
> Context Switches 160518 159120 -1398
> Sleeps 240740 240536 -204
>
Thanks, Eric!
The `Elapsed Time` is smaller with this series; that is consistent with my
numbers in the cover letter.
>
> - Eric
--
Regards,
Longpeng(Mike)
On Thu, 10 Aug 2017 09:18:09 -0400
Eric Farman <[email protected]> wrote:
> On 08/08/2017 04:14 AM, Longpeng (Mike) wrote:
> > [...]
>
> Won't swear I didn't screw something up, but here's some quick numbers.
> Host was 4.12.0 with and without this series, running QEMU 2.10.0-rc0.
> Created 4 guests, each with 4 CPU (unpinned) and 4GB RAM. VM1 did full
> kernel compiles with kernbench, which took averages of 5 runs of
> different job sizes (I threw away the "-j 1" numbers). VM2-VM4 ran cpu
> burners on 2 of their 4 cpus.
>
> Numbers from VM1 kernbench output, and the delta between runs:
>
> load -j 3 before after delta
> Elapsed Time 183.178 182.58 -0.598
> User Time 534.19 531.52 -2.67
> System Time 32.538 33.37 0.832
> Percent CPU 308.8 309 0.2
> Context Switches 98484.6 99001 516.4
> Sleeps 227347 228752 1405
>
> load -j 16 before after delta
> Elapsed Time 153.352 147.59 -5.762
> User Time 545.829 533.41 -12.419
> System Time 34.289 34.85 0.561
> Percent CPU 347.6 348 0.4
> Context Switches 160518 159120 -1398
> Sleeps 240740 240536 -204
Thanks a lot, Eric!
The decreases in elapsed time look nice, and we probably should not
care about the increases reported.