In-Reply-To: <1520855584-10079-2-git-send-email-wanpengli@tencent.com>
References: <1520855584-10079-1-git-send-email-wanpengli@tencent.com> <1520855584-10079-2-git-send-email-wanpengli@tencent.com>
From: Jim Mattson
Date: Tue, 13 Mar 2018 11:21:25 -0700
Subject: Re: [PATCH v2 1/3] KVM: X86: Provides userspace with a capability to not intercept MWAIT
To: Wanpeng Li
Cc: LKML, kvm list, Paolo Bonzini, Radim Krčmář, Jan H. Schönherr

Is there a need for a new API for yielding MONITOR/MWAIT to the guest?
Why not just tie this to the guest CPUID.01H:ECX[MWAIT] being set?

On Mon, Mar 12, 2018 at 4:53 AM, Wanpeng Li wrote:
> From: Wanpeng Li
>
> Allowing a guest to execute MWAIT without interception enables a guest
> to put a (physical) CPU into a power saving state, where it takes
> longer to return from than what may be desired by the host.
>
> Don't give a guest that power over a host by default. (Especially,
> since nothing prevents a guest from using MWAIT even when it is not
> advertised via CPUID.)
>
> Cc: Paolo Bonzini
> Cc: Radim Krčmář
> Cc: Jan H. Schönherr
> Signed-off-by: Wanpeng Li
> ---
>  Documentation/virtual/kvm/api.txt | 23 ++++++++++++++---------
>  arch/x86/include/asm/kvm_host.h   |  2 ++
>  arch/x86/kvm/svm.c                |  2 +-
>  arch/x86/kvm/vmx.c                |  9 +++++----
>  arch/x86/kvm/x86.c                | 24 ++++++++++++++++++++----
>  arch/x86/kvm/x86.h                | 10 +++++-----
>  include/uapi/linux/kvm.h          |  2 +-
>  tools/include/uapi/linux/kvm.h    |  2 +-
>  8 files changed, 49 insertions(+), 25 deletions(-)
>
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 98de506..76e5a15 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -4358,6 +4358,20 @@ enables QEMU to build error log and branch to guest kernel registered
>  machine check handling routine. Without this capability KVM will
>  branch to guests' 0x200 interrupt vector.
>
> +7.13 KVM_CAP_X86_DISABLE_EXITS
> +
> +Architectures: x86
> +Parameters: args[0] defines which exits are disabled
> +Returns: 0 on success, -EINVAL when args[0] contains invalid exits
> +
> +Valid exits in args[0] are
> +
> +#define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0)
> +
> +Enabling this capability on a VM provides userspace with a way to no
> +longer intercepts some instructions for improved latency in some
> +workloads.
> +
>  8. Other capabilities.
>  ----------------------
>
> @@ -4470,15 +4484,6 @@ reserved.
>  Both registers and addresses are 64-bits wide.
>  It will be possible to run 64-bit or 32-bit guest code.
>
> -8.8 KVM_CAP_X86_GUEST_MWAIT
> -
> -Architectures: x86
> -
> -This capability indicates that guest using memory monotoring instructions
> -(MWAIT/MWAITX) to stop the virtual CPU will not cause a VM exit. As such time
> -spent while virtual CPU is halted in this way will then be accounted for as
> -guest running time on the host (as opposed to e.g. HLT).
> -
>  8.9 KVM_CAP_ARM_USER_IRQ
>
>  Architectures: arm, arm64
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 0395c35..e107171 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -811,6 +811,8 @@ struct kvm_arch {
>
>          gpa_t wall_clock;
>
> +        bool mwait_in_guest;
> +
>          bool ept_identity_pagetable_done;
>          gpa_t ept_identity_map_addr;
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index be9c839..321b3fd 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -1390,7 +1390,7 @@ static void init_vmcb(struct vcpu_svm *svm)
>          set_intercept(svm, INTERCEPT_XSETBV);
>          set_intercept(svm, INTERCEPT_RSM);
>
> -        if (!kvm_mwait_in_guest()) {
> +        if (!kvm_mwait_in_guest(svm->vcpu.kvm)) {
>                  set_intercept(svm, INTERCEPT_MONITOR);
>                  set_intercept(svm, INTERCEPT_MWAIT);
>          }
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 6cefd7b..2302ae2 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -3733,13 +3733,11 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
>                CPU_BASED_UNCOND_IO_EXITING |
>                CPU_BASED_MOV_DR_EXITING |
>                CPU_BASED_USE_TSC_OFFSETING |
> +              CPU_BASED_MWAIT_EXITING |
> +              CPU_BASED_MONITOR_EXITING |
>                CPU_BASED_INVLPG_EXITING |
>                CPU_BASED_RDPMC_EXITING;
>
> -        if (!kvm_mwait_in_guest())
> -                min |= CPU_BASED_MWAIT_EXITING |
> -                        CPU_BASED_MONITOR_EXITING;
> -
>          opt = CPU_BASED_TPR_SHADOW |
>                CPU_BASED_USE_MSR_BITMAPS |
>                CPU_BASED_ACTIVATE_SECONDARY_CONTROLS;
> @@ -5531,6 +5529,9 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
>                  exec_control |= CPU_BASED_CR3_STORE_EXITING |
>                                  CPU_BASED_CR3_LOAD_EXITING |
>                                  CPU_BASED_INVLPG_EXITING;
> +        if (kvm_mwait_in_guest(vmx->vcpu.kvm))
> +                exec_control &= ~(CPU_BASED_MWAIT_EXITING |
> +                                CPU_BASED_MONITOR_EXITING);
>          return exec_control;
>  }
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 36ef3d8..5fae476 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2809,9 +2809,15 @@ static int msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs __user *user_msrs,
>          return r;
>  }
>
> +static inline bool kvm_can_mwait_in_guest(void)
> +{
> +        return boot_cpu_has(X86_FEATURE_MWAIT) &&
> +                !boot_cpu_has_bug(X86_BUG_MONITOR);
> +}
> +
>  int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  {
> -        int r;
> +        int r = 0;
>
>          switch (ext) {
>          case KVM_CAP_IRQCHIP:
> @@ -2867,8 +2873,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>          case KVM_CAP_ADJUST_CLOCK:
>                  r = KVM_CLOCK_TSC_STABLE;
>                  break;
> -        case KVM_CAP_X86_GUEST_MWAIT:
> -                r = kvm_mwait_in_guest();
> +        case KVM_CAP_X86_DISABLE_EXITS:
> +                if(kvm_can_mwait_in_guest())
> +                        r |= KVM_X86_DISABLE_EXITS_MWAIT;
>                  break;
>          case KVM_CAP_X86_SMM:
>                  /* SMBASE is usually relocated above 1M on modern chipsets,
> @@ -2909,7 +2916,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>                  r = KVM_X2APIC_API_VALID_FLAGS;
>                  break;
>          default:
> -                r = 0;
>                  break;
>          }
>          return r;
> @@ -4214,6 +4220,16 @@ static int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>
>                  r = 0;
>                  break;
> +        case KVM_CAP_X86_DISABLE_EXITS:
> +                r = -EINVAL;
> +                if (cap->args[0] & ~KVM_X86_DISABLE_VALID_EXITS)
> +                        break;
> +
> +                if ((cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT) &&
> +                        kvm_can_mwait_in_guest())
> +                        kvm->arch.mwait_in_guest = true;
> +                r = 0;
> +                break;
>          default:
>                  r = -EINVAL;
>                  break;
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index b91215d..cd1215e 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -2,8 +2,6 @@
>  #ifndef ARCH_X86_KVM_X86_H
>  #define ARCH_X86_KVM_X86_H
>
> -#include
> -#include
>  #include
>  #include
>  #include "kvm_cache_regs.h"
> @@ -264,10 +262,12 @@ static inline u64 nsec_to_cycles(struct kvm_vcpu *vcpu, u64 nsec)
>          __rem; \
>  })
>
> -static inline bool kvm_mwait_in_guest(void)
> +#define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0)
> +#define KVM_X86_DISABLE_VALID_EXITS (KVM_X86_DISABLE_EXITS_MWAIT)
> +
> +static inline bool kvm_mwait_in_guest(struct kvm *kvm)
>  {
> -        return boot_cpu_has(X86_FEATURE_MWAIT) &&
> -                !boot_cpu_has_bug(X86_BUG_MONITOR);
> +        return kvm->arch.mwait_in_guest;
>  }
>
>  #endif
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 088c2c9..1065006 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -929,7 +929,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_S390_GS 140
>  #define KVM_CAP_S390_AIS 141
>  #define KVM_CAP_SPAPR_TCE_VFIO 142
> -#define KVM_CAP_X86_GUEST_MWAIT 143
> +#define KVM_CAP_X86_DISABLE_EXITS 143
>  #define KVM_CAP_ARM_USER_IRQ 144
>  #define KVM_CAP_S390_CMMA_MIGRATION 145
>  #define KVM_CAP_PPC_FWNMI 146
> diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
> index 0fb5ef9..b13c257 100644
> --- a/tools/include/uapi/linux/kvm.h
> +++ b/tools/include/uapi/linux/kvm.h
> @@ -924,7 +924,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_S390_GS 140
>  #define KVM_CAP_S390_AIS 141
>  #define KVM_CAP_SPAPR_TCE_VFIO 142
> -#define KVM_CAP_X86_GUEST_MWAIT 143
> +#define KVM_CAP_X86_DISABLE_EXITS 143
>  #define KVM_CAP_ARM_USER_IRQ 144
>  #define KVM_CAP_S390_CMMA_MIGRATION 145
>  #define KVM_CAP_PPC_FWNMI 146
> --
> 2.7.4
>
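
For anyone following the thread, the args[0] handling in the quoted kvm_vm_ioctl_enable_cap() hunk can be modelled in plain userspace C roughly as below. This is only a sketch under stated assumptions, not kernel code: struct vm_model, model_enable_disable_exits() and the host_can_mwait parameter are hypothetical stand-ins for struct kvm, the KVM_CAP_X86_DISABLE_EXITS case, and kvm_can_mwait_in_guest(). Only the two KVM_X86_DISABLE_* macros are copied from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Copied from the arch/x86/kvm/x86.h hunk of the quoted patch. */
#define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0)
#define KVM_X86_DISABLE_VALID_EXITS (KVM_X86_DISABLE_EXITS_MWAIT)

/* Hypothetical stand-in for the per-VM flag added to struct kvm_arch. */
struct vm_model {
	bool mwait_in_guest;
};

/*
 * Hypothetical model of the KVM_CAP_X86_DISABLE_EXITS case.
 * args0 plays the role of cap->args[0]; host_can_mwait plays the
 * role of kvm_can_mwait_in_guest().
 * Returns 0 on success, -EINVAL if args0 contains unknown bits.
 */
static int model_enable_disable_exits(struct vm_model *vm, uint64_t args0,
				      bool host_can_mwait)
{
	assert(vm != NULL);

	/* Reject any bit outside the set of exits the patch defines. */
	if (args0 & ~(uint64_t)KVM_X86_DISABLE_VALID_EXITS)
		return -EINVAL;

	/* The per-VM flag is set only if requested and the host allows it. */
	if ((args0 & KVM_X86_DISABLE_EXITS_MWAIT) && host_can_mwait)
		vm->mwait_in_guest = true;

	return 0;
}
```

One property the model makes visible: as in the patch, asking for the MWAIT bit on a host where kvm_can_mwait_in_guest() would be false still returns 0, just without setting the flag, so userspace would need the check-extension result to know whether the request took effect.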