2018-03-12 11:55:37

by Wanpeng Li

Subject: [PATCH v2 0/3] Provides userspace with per-VM capability to improve latency

Provide userspace with a per-VM capability (KVM_CAP_X86_DISABLE_EXITS) to
stop intercepting MWAIT/HLT/PAUSE, in order to improve latency in some
workloads.
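
A minimal userspace sketch of how a VMM might use the capability, under
stated assumptions: the capability number and bit values are the ones
from this series (guarded with #ifndef in case the installed headers
already carry them), pick_disable_exits() and disable_exits() are
hypothetical helper names, and vm_fd is assumed to come from
KVM_CREATE_VM. Since the exec controls are computed at vCPU setup, the
call presumably has to happen before any vCPU is created.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Values from this series; guarded in case the headers already have them. */
#ifndef KVM_CAP_X86_DISABLE_EXITS
#define KVM_CAP_X86_DISABLE_EXITS 143
#endif
#ifndef KVM_X86_DISABLE_EXITS_MWAIT
#define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0)
#endif
#ifndef KVM_X86_DISABLE_EXITS_HLT
#define KVM_X86_DISABLE_EXITS_HLT (1 << 1)
#endif
#ifndef KVM_X86_DISABLE_EXITS_PAUSE
#define KVM_X86_DISABLE_EXITS_PAUSE (1 << 2)
#endif

/*
 * Hypothetical helper: keep only the exits that KVM_CHECK_EXTENSION
 * reported as supported. A result <= 0 means "capability not available".
 */
uint64_t pick_disable_exits(int supported, uint64_t wanted)
{
	if (supported <= 0)
		return 0;
	return wanted & (uint64_t)supported;
}

/*
 * Hypothetical helper: returns 0 on success, -1 if nothing could be
 * disabled or the ioctl failed. vm_fd is assumed to be a VM file
 * descriptor obtained from KVM_CREATE_VM.
 */
int disable_exits(int vm_fd, uint64_t wanted)
{
	struct kvm_enable_cap cap;
	int supported = ioctl(vm_fd, KVM_CHECK_EXTENSION,
			      KVM_CAP_X86_DISABLE_EXITS);
	uint64_t mask = pick_disable_exits(supported, wanted);

	if (!mask)
		return -1;

	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_X86_DISABLE_EXITS;
	cap.args[0] = mask;
	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}
```

Masking the request with the reported bits avoids the -EINVAL path for
bits the running kernel does not know about.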

The patchset implements the original proposal from Radim.
https://www.spinics.net/lists/kvm/msg146879.html

Thanks also to Jan H. Schönherr for his attempt last year.

v1 -> v2:
* remove the statement about blindly setting KVM_ENABLE_CAP from the doc
* move the PV_UNHALT-related statement to 2/3
* rename kvm_mwait_can_in_guest to kvm_can_mwait_in_guest
* remove the unconditional setting of INTERCEPT_HLT in svm
* call vmx_clear_hlt() from pre_enter_smm()
* add a check to kvm_update_cpuid() that forbids KVM_FEATURE_PV_UNHALT
when halt exits are disabled

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Jan H. Schönherr <[email protected]>

Wanpeng Li (3):
KVM: X86: Provides userspace with a capability to not intercept MWAIT
KVM: X86: Provides userspace with a capability to not intercept HLT
KVM: X86: Provides userspace with a capability to not intercept PAUSE

Documentation/virtual/kvm/api.txt | 24 ++++++++++++-------
arch/x86/include/asm/kvm_host.h | 4 ++++
arch/x86/kvm/cpuid.c | 5 ++++
arch/x86/kvm/svm.c | 9 ++++---
arch/x86/kvm/vmx.c | 50 ++++++++++++++++++++++++++++++++-------
arch/x86/kvm/x86.c | 29 +++++++++++++++++++----
arch/x86/kvm/x86.h | 24 +++++++++++++++----
include/uapi/linux/kvm.h | 2 +-
tools/include/uapi/linux/kvm.h | 2 +-
9 files changed, 118 insertions(+), 31 deletions(-)

--
2.7.4



2018-03-12 11:55:37

by Wanpeng Li

Subject: [PATCH v2 2/3] KVM: X86: Provides userspace with a capability to not intercept HLT

From: Wanpeng Li <[email protected]>

If host CPUs are dedicated to a VM, we can avoid VM exits on HLT.
This patch adds the per-VM non-HLT-exiting capability.
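
The interaction with KVM_FEATURE_PV_UNHALT can be sketched as a
standalone model of the check this patch adds to kvm_update_cpuid().
This is only an illustration, not the kernel code: filter_pv_unhalt()
is a hypothetical name, and the bit index 7 is the value
KVM_FEATURE_PV_UNHALT has in the x86 kvm_para.h uapi header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit index within EAX of the KVM_CPUID_FEATURES leaf. */
#define KVM_FEATURE_PV_UNHALT 7

/*
 * Hypothetical model: when HLT exits are disabled for the VM, hide
 * PV_UNHALT from the guest; otherwise leave EAX untouched.
 */
uint32_t filter_pv_unhalt(uint32_t eax, bool hlt_in_guest)
{
	if (hlt_in_guest)
		eax &= ~(UINT32_C(1) << KVM_FEATURE_PV_UNHALT);
	return eax;
}
```

The hunk below clears the bit rather than failing the CPUID update, so a
VMM that sets both still gets a working, if differently configured,
guest.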

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Jan H. Schönherr <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
Documentation/virtual/kvm/api.txt | 3 ++-
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/cpuid.c | 5 +++++
arch/x86/kvm/svm.c | 4 +++-
arch/x86/kvm/vmx.c | 24 ++++++++++++++++++++++++
arch/x86/kvm/x86.c | 3 +++
arch/x86/kvm/x86.h | 9 ++++++++-
7 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 76e5a15..b46494d 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -4367,10 +4367,11 @@ Returns: 0 on success, -EINVAL when args[0] contains invalid exits
Valid exits in args[0] are

#define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0)
+#define KVM_X86_DISABLE_EXITS_HLT (1 << 1)

Enabling this capability on a VM provides userspace with a way to no
longer intercepts some instructions for improved latency in some
-workloads.
+workloads. Do not enable KVM_FEATURE_PV_UNHALT if you disable HLT exits.

8. Other capabilities.
----------------------
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e107171..1a79065 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -812,6 +812,7 @@ struct kvm_arch {
gpa_t wall_clock;

bool mwait_in_guest;
+ bool hlt_in_guest;

bool ept_identity_pagetable_done;
gpa_t ept_identity_map_addr;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index e2d3050..82055b9 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -135,6 +135,11 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
return -EINVAL;
}

+ best = kvm_find_cpuid_entry(vcpu, KVM_CPUID_FEATURES, 0);
+ if (kvm_hlt_in_guest(vcpu->kvm) && best &&
+ (best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
+ best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
+
/* Update physical-address width */
vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
kvm_mmu_reset_context(vcpu);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 321b3fd..0b2e7af 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1372,7 +1372,6 @@ static void init_vmcb(struct vcpu_svm *svm)
set_intercept(svm, INTERCEPT_RDPMC);
set_intercept(svm, INTERCEPT_CPUID);
set_intercept(svm, INTERCEPT_INVD);
- set_intercept(svm, INTERCEPT_HLT);
set_intercept(svm, INTERCEPT_INVLPG);
set_intercept(svm, INTERCEPT_INVLPGA);
set_intercept(svm, INTERCEPT_IOIO_PROT);
@@ -1395,6 +1394,9 @@ static void init_vmcb(struct vcpu_svm *svm)
set_intercept(svm, INTERCEPT_MWAIT);
}

+ if (!kvm_hlt_in_guest(svm->vcpu.kvm))
+ set_intercept(svm, INTERCEPT_HLT);
+
control->iopm_base_pa = __sme_set(iopm_base);
control->msrpm_base_pa = __sme_set(__pa(svm->msrpm));
control->int_ctl = V_INTR_MASKING_MASK;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 2302ae2..fa0c5e1 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2543,6 +2543,19 @@ static int nested_vmx_check_exception(struct kvm_vcpu *vcpu, unsigned long *exit
return 0;
}

+static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
+{
+ /*
+ * Ensure that we clear the HLT state in the VMCS. We don't need to
+ * explicitly skip the instruction because if the HLT state is set,
+ * then the instruction is already executing and RIP has already been
+ * advanced.
+ */
+ if (kvm_hlt_in_guest(vcpu->kvm) &&
+ vmcs_read32(GUEST_ACTIVITY_STATE) == GUEST_ACTIVITY_HLT)
+ vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE);
+}
+
static void vmx_queue_exception(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -2573,6 +2586,8 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu)
intr_info |= INTR_TYPE_HARD_EXCEPTION;

vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr_info);
+
+ vmx_clear_hlt(vcpu);
}

static bool vmx_rdtscp_supported(void)
@@ -5532,6 +5547,8 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
if (kvm_mwait_in_guest(vmx->vcpu.kvm))
exec_control &= ~(CPU_BASED_MWAIT_EXITING |
CPU_BASED_MONITOR_EXITING);
+ if (kvm_hlt_in_guest(vmx->vcpu.kvm))
+ exec_control &= ~CPU_BASED_HLT_EXITING;
return exec_control;
}

@@ -5893,6 +5910,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
update_exception_bitmap(vcpu);

vpid_sync_context(vmx->vpid);
+ if (init_event)
+ vmx_clear_hlt(vcpu);
}

/*
@@ -5963,6 +5982,8 @@ static void vmx_inject_irq(struct kvm_vcpu *vcpu)
} else
intr |= INTR_TYPE_EXT_INTR;
vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr);
+
+ vmx_clear_hlt(vcpu);
}

static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
@@ -5993,6 +6014,8 @@ static void vmx_inject_nmi(struct kvm_vcpu *vcpu)

vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK | NMI_VECTOR);
+
+ vmx_clear_hlt(vcpu);
}

static bool vmx_get_nmi_mask(struct kvm_vcpu *vcpu)
@@ -12314,6 +12337,7 @@ static int vmx_pre_enter_smm(struct kvm_vcpu *vcpu, char *smstate)

vmx->nested.smm.vmxon = vmx->nested.vmxon;
vmx->nested.vmxon = false;
+ vmx_clear_hlt(vcpu);
return 0;
}

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5fae476..73255e6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2874,6 +2874,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = KVM_CLOCK_TSC_STABLE;
break;
case KVM_CAP_X86_DISABLE_EXITS:
+ r |= KVM_X86_DISABLE_EXITS_HTL;
if(kvm_can_mwait_in_guest())
r |= KVM_X86_DISABLE_EXITS_MWAIT;
break;
@@ -4228,6 +4229,8 @@ static int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
if ((cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT) &&
kvm_can_mwait_in_guest())
kvm->arch.mwait_in_guest = true;
+ if (cap->args[0] & KVM_X86_DISABLE_EXITS_HTL)
+ kvm->arch.hlt_in_guest = true;
r = 0;
break;
default:
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index cd1215e..d4ddb00 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -263,11 +263,18 @@ static inline u64 nsec_to_cycles(struct kvm_vcpu *vcpu, u64 nsec)
})

#define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0)
-#define KVM_X86_DISABLE_VALID_EXITS (KVM_X86_DISABLE_EXITS_MWAIT)
+#define KVM_X86_DISABLE_EXITS_HTL (1 << 1)
+#define KVM_X86_DISABLE_VALID_EXITS (KVM_X86_DISABLE_EXITS_MWAIT | \
+ KVM_X86_DISABLE_EXITS_HTL)

static inline bool kvm_mwait_in_guest(struct kvm *kvm)
{
return kvm->arch.mwait_in_guest;
}

+static inline bool kvm_hlt_in_guest(struct kvm *kvm)
+{
+ return kvm->arch.hlt_in_guest;
+}
+
#endif
--
2.7.4


2018-03-12 11:55:58

by Wanpeng Li

Subject: [PATCH v2 1/3] KVM: X86: Provides userspace with a capability to not intercept MWAIT

From: Wanpeng Li <[email protected]>

Allowing a guest to execute MWAIT without interception enables a guest
to put a (physical) CPU into a power saving state, where it takes
longer to return from than what may be desired by the host.

Don't give a guest that power over a host by default. (Especially,
since nothing prevents a guest from using MWAIT even when it is not
advertised via CPUID.)
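
The opt-in logic can be modeled outside the kernel as follows. This is a
sketch of the KVM_CAP_X86_DISABLE_EXITS case this patch adds to
kvm_vm_ioctl_enable_cap(), not the kernel code itself: struct vm_model
and enable_disable_exits() are hypothetical names, and host_can_mwait
stands in for kvm_can_mwait_in_guest().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define DISABLE_EXITS_MWAIT (1u << 0)
#define DISABLE_VALID_EXITS (DISABLE_EXITS_MWAIT)

/* Hypothetical stand-in for the per-VM state in struct kvm_arch. */
struct vm_model {
	bool mwait_in_guest;
};

/*
 * Unknown bits are rejected with -EINVAL. A MWAIT request on a host
 * that cannot safely pass MWAIT through is accepted but has no effect.
 */
int enable_disable_exits(struct vm_model *vm, uint64_t args0,
			 bool host_can_mwait)
{
	if (args0 & ~(uint64_t)DISABLE_VALID_EXITS)
		return -EINVAL;

	if ((args0 & DISABLE_EXITS_MWAIT) && host_can_mwait)
		vm->mwait_in_guest = true;

	return 0;
}
```

The default stays "intercept": a VM only gets MWAIT passthrough if
userspace asks for it and the host supports it.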

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Jan H. Schönherr <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
Documentation/virtual/kvm/api.txt | 23 ++++++++++++++---------
arch/x86/include/asm/kvm_host.h | 2 ++
arch/x86/kvm/svm.c | 2 +-
arch/x86/kvm/vmx.c | 9 +++++----
arch/x86/kvm/x86.c | 24 ++++++++++++++++++++----
arch/x86/kvm/x86.h | 10 +++++-----
include/uapi/linux/kvm.h | 2 +-
tools/include/uapi/linux/kvm.h | 2 +-
8 files changed, 49 insertions(+), 25 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 98de506..76e5a15 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -4358,6 +4358,20 @@ enables QEMU to build error log and branch to guest kernel registered
machine check handling routine. Without this capability KVM will
branch to guests' 0x200 interrupt vector.

+7.13 KVM_CAP_X86_DISABLE_EXITS
+
+Architectures: x86
+Parameters: args[0] defines which exits are disabled
+Returns: 0 on success, -EINVAL when args[0] contains invalid exits
+
+Valid exits in args[0] are
+
+#define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0)
+
+Enabling this capability on a VM provides userspace with a way to no
+longer intercepts some instructions for improved latency in some
+workloads.
+
8. Other capabilities.
----------------------

@@ -4470,15 +4484,6 @@ reserved.
Both registers and addresses are 64-bits wide.
It will be possible to run 64-bit or 32-bit guest code.

-8.8 KVM_CAP_X86_GUEST_MWAIT
-
-Architectures: x86
-
-This capability indicates that guest using memory monotoring instructions
-(MWAIT/MWAITX) to stop the virtual CPU will not cause a VM exit. As such time
-spent while virtual CPU is halted in this way will then be accounted for as
-guest running time on the host (as opposed to e.g. HLT).
-
8.9 KVM_CAP_ARM_USER_IRQ

Architectures: arm, arm64
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0395c35..e107171 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -811,6 +811,8 @@ struct kvm_arch {

gpa_t wall_clock;

+ bool mwait_in_guest;
+
bool ept_identity_pagetable_done;
gpa_t ept_identity_map_addr;

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index be9c839..321b3fd 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1390,7 +1390,7 @@ static void init_vmcb(struct vcpu_svm *svm)
set_intercept(svm, INTERCEPT_XSETBV);
set_intercept(svm, INTERCEPT_RSM);

- if (!kvm_mwait_in_guest()) {
+ if (!kvm_mwait_in_guest(svm->vcpu.kvm)) {
set_intercept(svm, INTERCEPT_MONITOR);
set_intercept(svm, INTERCEPT_MWAIT);
}
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6cefd7b..2302ae2 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3733,13 +3733,11 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
CPU_BASED_UNCOND_IO_EXITING |
CPU_BASED_MOV_DR_EXITING |
CPU_BASED_USE_TSC_OFFSETING |
+ CPU_BASED_MWAIT_EXITING |
+ CPU_BASED_MONITOR_EXITING |
CPU_BASED_INVLPG_EXITING |
CPU_BASED_RDPMC_EXITING;

- if (!kvm_mwait_in_guest())
- min |= CPU_BASED_MWAIT_EXITING |
- CPU_BASED_MONITOR_EXITING;
-
opt = CPU_BASED_TPR_SHADOW |
CPU_BASED_USE_MSR_BITMAPS |
CPU_BASED_ACTIVATE_SECONDARY_CONTROLS;
@@ -5531,6 +5529,9 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
exec_control |= CPU_BASED_CR3_STORE_EXITING |
CPU_BASED_CR3_LOAD_EXITING |
CPU_BASED_INVLPG_EXITING;
+ if (kvm_mwait_in_guest(vmx->vcpu.kvm))
+ exec_control &= ~(CPU_BASED_MWAIT_EXITING |
+ CPU_BASED_MONITOR_EXITING);
return exec_control;
}

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 36ef3d8..5fae476 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2809,9 +2809,15 @@ static int msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs __user *user_msrs,
return r;
}

+static inline bool kvm_can_mwait_in_guest(void)
+{
+ return boot_cpu_has(X86_FEATURE_MWAIT) &&
+ !boot_cpu_has_bug(X86_BUG_MONITOR);
+}
+
int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
{
- int r;
+ int r = 0;

switch (ext) {
case KVM_CAP_IRQCHIP:
@@ -2867,8 +2873,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_ADJUST_CLOCK:
r = KVM_CLOCK_TSC_STABLE;
break;
- case KVM_CAP_X86_GUEST_MWAIT:
- r = kvm_mwait_in_guest();
+ case KVM_CAP_X86_DISABLE_EXITS:
+ if(kvm_can_mwait_in_guest())
+ r |= KVM_X86_DISABLE_EXITS_MWAIT;
break;
case KVM_CAP_X86_SMM:
/* SMBASE is usually relocated above 1M on modern chipsets,
@@ -2909,7 +2916,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = KVM_X2APIC_API_VALID_FLAGS;
break;
default:
- r = 0;
break;
}
return r;
@@ -4214,6 +4220,16 @@ static int kvm_vm_ioctl_enable_cap(struct kvm *kvm,

r = 0;
break;
+ case KVM_CAP_X86_DISABLE_EXITS:
+ r = -EINVAL;
+ if (cap->args[0] & ~KVM_X86_DISABLE_VALID_EXITS)
+ break;
+
+ if ((cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT) &&
+ kvm_can_mwait_in_guest())
+ kvm->arch.mwait_in_guest = true;
+ r = 0;
+ break;
default:
r = -EINVAL;
break;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index b91215d..cd1215e 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -2,8 +2,6 @@
#ifndef ARCH_X86_KVM_X86_H
#define ARCH_X86_KVM_X86_H

-#include <asm/processor.h>
-#include <asm/mwait.h>
#include <linux/kvm_host.h>
#include <asm/pvclock.h>
#include "kvm_cache_regs.h"
@@ -264,10 +262,12 @@ static inline u64 nsec_to_cycles(struct kvm_vcpu *vcpu, u64 nsec)
__rem; \
})

-static inline bool kvm_mwait_in_guest(void)
+#define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0)
+#define KVM_X86_DISABLE_VALID_EXITS (KVM_X86_DISABLE_EXITS_MWAIT)
+
+static inline bool kvm_mwait_in_guest(struct kvm *kvm)
{
- return boot_cpu_has(X86_FEATURE_MWAIT) &&
- !boot_cpu_has_bug(X86_BUG_MONITOR);
+ return kvm->arch.mwait_in_guest;
}

#endif
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 088c2c9..1065006 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -929,7 +929,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_S390_GS 140
#define KVM_CAP_S390_AIS 141
#define KVM_CAP_SPAPR_TCE_VFIO 142
-#define KVM_CAP_X86_GUEST_MWAIT 143
+#define KVM_CAP_X86_DISABLE_EXITS 143
#define KVM_CAP_ARM_USER_IRQ 144
#define KVM_CAP_S390_CMMA_MIGRATION 145
#define KVM_CAP_PPC_FWNMI 146
diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
index 0fb5ef9..b13c257 100644
--- a/tools/include/uapi/linux/kvm.h
+++ b/tools/include/uapi/linux/kvm.h
@@ -924,7 +924,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_S390_GS 140
#define KVM_CAP_S390_AIS 141
#define KVM_CAP_SPAPR_TCE_VFIO 142
-#define KVM_CAP_X86_GUEST_MWAIT 143
+#define KVM_CAP_X86_DISABLE_EXITS 143
#define KVM_CAP_ARM_USER_IRQ 144
#define KVM_CAP_S390_CMMA_MIGRATION 145
#define KVM_CAP_PPC_FWNMI 146
--
2.7.4


2018-03-12 11:56:49

by Wanpeng Li

Subject: [PATCH v2 3/3] KVM: X86: Provides userspace with a capability to not intercept PAUSE

From: Wanpeng Li <[email protected]>

Allow disabling pause loop exiting/pause filtering on a per-VM basis.

If some VMs have dedicated host CPUs, they won't be negatively affected
due to needlessly intercepted PAUSE instructions.
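
On VMX the resulting per-VM state can be sketched as below. This is a
model of how vmx_vm_init() and the enable-cap path in this patch
combine, not the kernel code: pause_in_guest() here is a hypothetical
standalone function, with ple_gap passed in as a parameter instead of
being read from the module parameter.

```c
#include <stdbool.h>
#include <stdint.h>

#define DISABLE_EXITS_PAUSE (1u << 2)

/*
 * Hypothetical model: PAUSE runs unintercepted if PLE is globally off
 * (ple_gap == 0) or if userspace asked for it on this VM.
 */
bool pause_in_guest(unsigned int ple_gap, uint64_t args0)
{
	bool in_guest = (ple_gap == 0);	/* default set at VM init */

	if (args0 & DISABLE_EXITS_PAUSE)
		in_guest = true;	/* per-VM opt-in */
	return in_guest;
}
```

The capability can only turn interception off; there is no way to turn
it back on for a VM when ple_gap is 0.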

Thanks to Jan H. Schönherr for the initial patch.

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Jan H. Schönherr <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/svm.c | 3 ++-
arch/x86/kvm/vmx.c | 17 +++++++++++++----
arch/x86/kvm/x86.c | 4 +++-
arch/x86/kvm/x86.h | 9 ++++++++-
5 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1a79065..d73ca26 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -813,6 +813,7 @@ struct kvm_arch {

bool mwait_in_guest;
bool hlt_in_guest;
+ bool pause_in_guest;

bool ept_identity_pagetable_done;
gpa_t ept_identity_map_addr;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 0b2e7af..ddc705c 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1452,7 +1452,8 @@ static void init_vmcb(struct vcpu_svm *svm)
svm->nested.vmcb = 0;
svm->vcpu.arch.hflags = 0;

- if (boot_cpu_has(X86_FEATURE_PAUSEFILTER)) {
+ if (boot_cpu_has(X86_FEATURE_PAUSEFILTER) &&
+ !kvm_pause_in_guest(svm->vcpu.kvm)) {
control->pause_filter_count = 3000;
set_intercept(svm, INTERCEPT_PAUSE);
}
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index fa0c5e1..400f9d1 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -5582,7 +5582,7 @@ static void vmx_compute_secondary_exec_control(struct vcpu_vmx *vmx)
}
if (!enable_unrestricted_guest)
exec_control &= ~SECONDARY_EXEC_UNRESTRICTED_GUEST;
- if (!ple_gap)
+ if (kvm_pause_in_guest(vmx->vcpu.kvm))
exec_control &= ~SECONDARY_EXEC_PAUSE_LOOP_EXITING;
if (!kvm_vcpu_apicv_active(vcpu))
exec_control &= ~(SECONDARY_EXEC_APIC_REGISTER_VIRT |
@@ -5745,7 +5745,7 @@ static void vmx_vcpu_setup(struct vcpu_vmx *vmx)
vmcs_write64(POSTED_INTR_DESC_ADDR, __pa((&vmx->pi_desc)));
}

- if (ple_gap) {
+ if (!kvm_pause_in_guest(vmx->vcpu.kvm)) {
vmcs_write32(PLE_GAP, ple_gap);
vmx->ple_window = ple_window;
vmx->ple_window_dirty = true;
@@ -7182,7 +7182,7 @@ static __exit void hardware_unsetup(void)
*/
static int handle_pause(struct kvm_vcpu *vcpu)
{
- if (ple_gap)
+ if (!kvm_pause_in_guest(vcpu->kvm))
grow_ple_window(vcpu);

/*
@@ -9878,6 +9878,13 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
return ERR_PTR(err);
}

+static int vmx_vm_init(struct kvm *kvm)
+{
+ if (!ple_gap)
+ kvm->arch.pause_in_guest = true;
+ return 0;
+}
+
static void __init vmx_check_processor_compat(void *rtn)
{
struct vmcs_config vmcs_conf;
@@ -12019,7 +12026,7 @@ static void vmx_cancel_hv_timer(struct kvm_vcpu *vcpu)

static void vmx_sched_in(struct kvm_vcpu *vcpu, int cpu)
{
- if (ple_gap)
+ if (!kvm_pause_in_guest(vcpu->kvm))
shrink_ple_window(vcpu);
}

@@ -12379,6 +12386,8 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
.cpu_has_accelerated_tpr = report_flexpriority,
.cpu_has_high_real_mode_segbase = vmx_has_high_real_mode_segbase,

+ .vm_init = vmx_vm_init,
+
.vcpu_create = vmx_create_vcpu,
.vcpu_free = vmx_free_vcpu,
.vcpu_reset = vmx_vcpu_reset,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 73255e6..8060f27 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2874,7 +2874,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = KVM_CLOCK_TSC_STABLE;
break;
case KVM_CAP_X86_DISABLE_EXITS:
- r |= KVM_X86_DISABLE_EXITS_HTL;
+ r |= KVM_X86_DISABLE_EXITS_HTL | KVM_X86_DISABLE_EXITS_PAUSE;
if(kvm_can_mwait_in_guest())
r |= KVM_X86_DISABLE_EXITS_MWAIT;
break;
@@ -4231,6 +4231,8 @@ static int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
kvm->arch.mwait_in_guest = true;
if (cap->args[0] & KVM_X86_DISABLE_EXITS_HTL)
kvm->arch.hlt_in_guest = true;
+ if (cap->args[0] & KVM_X86_DISABLE_EXITS_PAUSE)
+ kvm->arch.pause_in_guest = true;
r = 0;
break;
default:
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index d4ddb00..658ea9a 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -264,8 +264,10 @@ static inline u64 nsec_to_cycles(struct kvm_vcpu *vcpu, u64 nsec)

#define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0)
#define KVM_X86_DISABLE_EXITS_HTL (1 << 1)
+#define KVM_X86_DISABLE_EXITS_PAUSE (1 << 2)
#define KVM_X86_DISABLE_VALID_EXITS (KVM_X86_DISABLE_EXITS_MWAIT | \
- KVM_X86_DISABLE_EXITS_HTL)
+ KVM_X86_DISABLE_EXITS_HTL | \
+ KVM_X86_DISABLE_EXITS_PAUSE)

static inline bool kvm_mwait_in_guest(struct kvm *kvm)
{
@@ -277,4 +279,9 @@ static inline bool kvm_hlt_in_guest(struct kvm *kvm)
return kvm->arch.hlt_in_guest;
}

+static inline bool kvm_pause_in_guest(struct kvm *kvm)
+{
+ return kvm->arch.pause_in_guest;
+}
+
#endif
--
2.7.4


2018-03-13 18:22:45

by Jim Mattson

Subject: Re: [PATCH v2 1/3] KVM: X86: Provides userspace with a capability to not intercept MWAIT

Is there a need for a new API for yielding MONITOR/MWAIT to the guest?
Why not just tie this to the guest CPUID.01H:ECX[MWAIT] being set?
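
For reference, the CPUID-based alternative would key off bit 3 of
CPUID.01H:ECX (MONITOR/MWAIT). A minimal sketch of that test, where
guest_cpuid_has_mwait() is a hypothetical helper that takes the ECX
value userspace configured for leaf 1:

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:ECX bit 3 advertises MONITOR/MWAIT. */
#define CPUID_01_ECX_MONITOR (UINT32_C(1) << 3)

/* Hypothetical helper: true if the guest's leaf 1 ECX advertises MWAIT. */
bool guest_cpuid_has_mwait(uint32_t leaf1_ecx)
{
	return (leaf1_ecx & CPUID_01_ECX_MONITOR) != 0;
}
```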

On Mon, Mar 12, 2018 at 4:53 AM, Wanpeng Li <[email protected]> wrote:
> From: Wanpeng Li <[email protected]>
>
> Allowing a guest to execute MWAIT without interception enables a guest
> to put a (physical) CPU into a power saving state, where it takes
> longer to return from than what may be desired by the host.
>
> Don't give a guest that power over a host by default. (Especially,
> since nothing prevents a guest from using MWAIT even when it is not
> advertised via CPUID.)

2018-03-13 23:44:30

by Wanpeng Li

Subject: Re: [PATCH v2 1/3] KVM: X86: Provides userspace with a capability to not intercept MWAIT

Hi Jim,
2018-03-14 2:21 GMT+08:00 Jim Mattson <[email protected]>:
> Is there a need for a new API for yielding MONITOR/MWAIT to the guest?
> Why not just tie this to the guest CPUID.01H:ECX[MWAIT] being set?

The API will also be used for HLT/PAUSE. Please refer to Paolo's
original proposal, though I couldn't find a link to Paolo's reply
directly. https://marc.info/?l=kvm&m=151182818103804&w=2

Regards,
Wanpeng Li

>> +}
>> +
>> int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>> {
>> - int r;
>> + int r = 0;
>>
>> switch (ext) {
>> case KVM_CAP_IRQCHIP:
>> @@ -2867,8 +2873,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>> case KVM_CAP_ADJUST_CLOCK:
>> r = KVM_CLOCK_TSC_STABLE;
>> break;
>> - case KVM_CAP_X86_GUEST_MWAIT:
>> - r = kvm_mwait_in_guest();
>> + case KVM_CAP_X86_DISABLE_EXITS:
>> + if (kvm_can_mwait_in_guest())
>> + r |= KVM_X86_DISABLE_EXITS_MWAIT;
>> break;
>> case KVM_CAP_X86_SMM:
>> /* SMBASE is usually relocated above 1M on modern chipsets,
>> @@ -2909,7 +2916,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>> r = KVM_X2APIC_API_VALID_FLAGS;
>> break;
>> default:
>> - r = 0;
>> break;
>> }
>> return r;
>> @@ -4214,6 +4220,16 @@ static int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>>
>> r = 0;
>> break;
>> + case KVM_CAP_X86_DISABLE_EXITS:
>> + r = -EINVAL;
>> + if (cap->args[0] & ~KVM_X86_DISABLE_VALID_EXITS)
>> + break;
>> +
>> + if ((cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT) &&
>> + kvm_can_mwait_in_guest())
>> + kvm->arch.mwait_in_guest = true;
>> + r = 0;
>> + break;
>> default:
>> r = -EINVAL;
>> break;
>> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
>> index b91215d..cd1215e 100644
>> --- a/arch/x86/kvm/x86.h
>> +++ b/arch/x86/kvm/x86.h
>> @@ -2,8 +2,6 @@
>> #ifndef ARCH_X86_KVM_X86_H
>> #define ARCH_X86_KVM_X86_H
>>
>> -#include <asm/processor.h>
>> -#include <asm/mwait.h>
>> #include <linux/kvm_host.h>
>> #include <asm/pvclock.h>
>> #include "kvm_cache_regs.h"
>> @@ -264,10 +262,12 @@ static inline u64 nsec_to_cycles(struct kvm_vcpu *vcpu, u64 nsec)
>> __rem; \
>> })
>>
>> -static inline bool kvm_mwait_in_guest(void)
>> +#define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0)
>> +#define KVM_X86_DISABLE_VALID_EXITS (KVM_X86_DISABLE_EXITS_MWAIT)
>> +
>> +static inline bool kvm_mwait_in_guest(struct kvm *kvm)
>> {
>> - return boot_cpu_has(X86_FEATURE_MWAIT) &&
>> - !boot_cpu_has_bug(X86_BUG_MONITOR);
>> + return kvm->arch.mwait_in_guest;
>> }
>>
>> #endif
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index 088c2c9..1065006 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -929,7 +929,7 @@ struct kvm_ppc_resize_hpt {
>> #define KVM_CAP_S390_GS 140
>> #define KVM_CAP_S390_AIS 141
>> #define KVM_CAP_SPAPR_TCE_VFIO 142
>> -#define KVM_CAP_X86_GUEST_MWAIT 143
>> +#define KVM_CAP_X86_DISABLE_EXITS 143
>> #define KVM_CAP_ARM_USER_IRQ 144
>> #define KVM_CAP_S390_CMMA_MIGRATION 145
>> #define KVM_CAP_PPC_FWNMI 146
>> diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
>> index 0fb5ef9..b13c257 100644
>> --- a/tools/include/uapi/linux/kvm.h
>> +++ b/tools/include/uapi/linux/kvm.h
>> @@ -924,7 +924,7 @@ struct kvm_ppc_resize_hpt {
>> #define KVM_CAP_S390_GS 140
>> #define KVM_CAP_S390_AIS 141
>> #define KVM_CAP_SPAPR_TCE_VFIO 142
>> -#define KVM_CAP_X86_GUEST_MWAIT 143
>> +#define KVM_CAP_X86_DISABLE_EXITS 143
>> #define KVM_CAP_ARM_USER_IRQ 144
>> #define KVM_CAP_S390_CMMA_MIGRATION 145
>> #define KVM_CAP_PPC_FWNMI 146
>> --
>> 2.7.4
>>
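
For readers skimming the quoted diff, the behaviour of the two x86.c hunks
can be restated as a small userspace model. This is only a sketch, not kernel
code: struct vm_model, model_check_extension() and model_enable_cap() are
hypothetical names, and the host_can_mwait parameter stands in for
kvm_can_mwait_in_guest(). The two #defines are copied from the x86.h hunk.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Copied from the x86.h hunk of the quoted patch. */
#define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0)
#define KVM_X86_DISABLE_VALID_EXITS (KVM_X86_DISABLE_EXITS_MWAIT)

/* Hypothetical stand-in for struct kvm: only the new per-VM flag. */
struct vm_model {
	bool mwait_in_guest;
};

/*
 * Models the KVM_CHECK_EXTENSION case: the return value is a bitmask of
 * the exits that userspace is allowed to disable on this host.
 */
int model_check_extension(bool host_can_mwait)
{
	int r = 0;

	if (host_can_mwait)
		r |= KVM_X86_DISABLE_EXITS_MWAIT;
	return r;
}

/*
 * Models the KVM_ENABLE_CAP case: unknown bits are rejected, and the
 * per-VM flag is set only when the host can run MWAIT in the guest.
 */
int model_enable_cap(struct vm_model *vm, uint64_t arg0, bool host_can_mwait)
{
	if (arg0 & ~(uint64_t)KVM_X86_DISABLE_VALID_EXITS)
		return -EINVAL;

	if ((arg0 & KVM_X86_DISABLE_EXITS_MWAIT) && host_can_mwait)
		vm->mwait_in_guest = true;
	return 0;
}
```

The model makes one asymmetry easy to see: an unknown bit in args[0] fails
with -EINVAL, while requesting MWAIT on a host that cannot support it
returns 0 and simply leaves the flag clear.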

2018-03-15 01:51:08

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] Provides userspace with per-VM capability to improve latency

2018-03-15 0:11 GMT+08:00 Paolo Bonzini <[email protected]>:
> On 12/03/2018 12:53, Wanpeng Li wrote:
>> Provides userspace with a per-VM capability (KVM_CAP_X86_DISABLE_EXITS) to
>> not intercept MWAIT/HLT/PAUSE in order to improve latency in some
>> workloads.
>>
>> The patchset implements the original proposal from Radim.
>> https://www.spinics.net/lists/kvm/msg146879.html
>>
>> In addition, thanks to Jan H. Schönherr's attempt last year.
>>
>> v1 -> v2:
>> * remove blindly setting KVM_ENABLE_CAP statement in doc
>> * move PV_UNHALT associated statement to 2/3
>> * rename kvm_mwait_can_in_guest to kvm_can_mwait_in_guest
>> * remove unconditionally set INTERCEPT HLT in svm
>> * call vmx_clear_hlt() from pre_enter_smm()
>> * add a check to kvm_update_cpuid() that forbids KVM_FEATURE_PV_UNHALT
>> when halt exits are disabled
>>
>> Cc: Paolo Bonzini <[email protected]>
>> Cc: Radim Krčmář <[email protected]>
>> Cc: Jan H. Schönherr <[email protected]>
>>
>> Wanpeng Li (3):
>> KVM: X86: Provides userspace with a capability to not intercept MWAIT
>> KVM: X86: Provides userspace with a capability to not intercept HLT
>> KVM: X86: Provides userspace with a capability to not intercept PAUSE
>>
>> Documentation/virtual/kvm/api.txt | 24 ++++++++++++-------
>> arch/x86/include/asm/kvm_host.h | 4 ++++
>> arch/x86/kvm/cpuid.c | 5 ++++
>> arch/x86/kvm/svm.c | 9 ++++---
>> arch/x86/kvm/vmx.c | 50 ++++++++++++++++++++++++++++++++-------
>> arch/x86/kvm/x86.c | 29 +++++++++++++++++++----
>> arch/x86/kvm/x86.h | 24 +++++++++++++++----
>> include/uapi/linux/kvm.h | 2 +-
>> tools/include/uapi/linux/kvm.h | 2 +-
>> 9 files changed, 118 insertions(+), 31 deletions(-)
>>
>
> Queued, thanks.

Thanks Paolo. :)

> Do you have QEMU patches to automatically enable this
> together with HINTS_DEDICATED?

I will cook the patches. :)
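
A minimal sketch of what the userspace side might look like, assuming a VMM
that already holds a VM file descriptor. This is not the QEMU patch; the
helper names pick_disable_exits() and vm_disable_exits() are hypothetical.
The fallback #defines use the values from this series (capability number 143,
MWAIT bit 0) in case the installed headers predate it. Because the patch
reads the per-VM flag when a vCPU's controls are set up, the call would
presumably have to happen before any vCPU is created.

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Fallbacks taken from this series, for headers that lack them. */
#ifndef KVM_CAP_X86_DISABLE_EXITS
#define KVM_CAP_X86_DISABLE_EXITS 143
#endif
#ifndef KVM_X86_DISABLE_EXITS_MWAIT
#define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0)
#endif

/* Keep only the wanted exits that KVM_CHECK_EXTENSION reported. */
unsigned int pick_disable_exits(int supported, unsigned int wanted)
{
	if (supported <= 0)
		return 0;	/* error or capability absent */
	return (unsigned int)supported & wanted;
}

/*
 * Returns the mask that was enabled (0 if nothing could be enabled),
 * or a negative errno value if KVM_ENABLE_CAP itself failed.
 */
int vm_disable_exits(int vm_fd, unsigned int wanted)
{
	struct kvm_enable_cap cap;
	unsigned int mask;
	int supported;

	supported = ioctl(vm_fd, KVM_CHECK_EXTENSION,
			  (unsigned long)KVM_CAP_X86_DISABLE_EXITS);
	mask = pick_disable_exits(supported, wanted);
	if (mask == 0)
		return 0;

	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_X86_DISABLE_EXITS;
	cap.args[0] = mask;
	if (ioctl(vm_fd, KVM_ENABLE_CAP, &cap) < 0)
		return -errno;
	return (int)mask;
}
```

Masking the request with the reported bits first avoids the -EINVAL path
for exits the running kernel does not know about.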

Regards,
Wanpeng Li

2018-05-23 11:23:55

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] Provides userspace with per-VM capability to improve latency

2018-03-12 19:53 GMT+08:00 Wanpeng Li <[email protected]>:
> Provides userspace with a per-VM capability (KVM_CAP_X86_DISABLE_EXITS) to
> not intercept MWAIT/HLT/PAUSE in order to improve latency in some
> workloads.

When running cyclictest in the guest, with vCPU pinning on the host and
cyclictest pinned in the guest, the average latency can be reduced by 40%.

Regards,
Wanpeng Li
