From: Wanpeng Li
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Radim Krčmář, Jan H. Schönherr
Subject: [PATCH v2 2/3] KVM: X86: Provides userspace with a capability to not intercept HLT
Date: Mon, 12 Mar 2018 04:53:03 -0700
Message-Id: <1520855584-10079-3-git-send-email-wanpengli@tencent.com>
In-Reply-To: <1520855584-10079-1-git-send-email-wanpengli@tencent.com>
References: <1520855584-10079-1-git-send-email-wanpengli@tencent.com>

From: Wanpeng Li

If host CPUs are dedicated to a VM, we can avoid VM exits on HLT.
This patch adds the per-VM non-HLT-exiting capability.

Cc: Paolo Bonzini
Cc: Radim Krčmář
Cc: Jan H. Schönherr
Signed-off-by: Wanpeng Li
---
 Documentation/virtual/kvm/api.txt |  3 ++-
 arch/x86/include/asm/kvm_host.h   |  1 +
 arch/x86/kvm/cpuid.c              |  5 +++++
 arch/x86/kvm/svm.c                |  4 +++-
 arch/x86/kvm/vmx.c                | 24 ++++++++++++++++++++++++
 arch/x86/kvm/x86.c                |  3 +++
 arch/x86/kvm/x86.h                |  9 ++++++++-
 7 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 76e5a15..b46494d 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -4367,10 +4367,11 @@ Returns: 0 on success, -EINVAL when args[0] contains invalid exits
 Valid exits in args[0] are
 
 #define KVM_X86_DISABLE_EXITS_MWAIT            (1 << 0)
+#define KVM_X86_DISABLE_EXITS_HLT              (1 << 1)
 
 Enabling this capability on a VM provides userspace with a way to no
 longer intercepts some instructions for improved latency in some
-workloads.
+workloads.  Do not enable KVM_FEATURE_PV_UNHALT if you disable HLT exits.
 
 8. Other capabilities.
 ----------------------
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e107171..1a79065 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -812,6 +812,7 @@ struct kvm_arch {
 	gpa_t wall_clock;
 
 	bool mwait_in_guest;
+	bool hlt_in_guest;
 
 	bool ept_identity_pagetable_done;
 	gpa_t ept_identity_map_addr;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index e2d3050..82055b9 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -135,6 +135,11 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
 			return -EINVAL;
 	}
 
+	best = kvm_find_cpuid_entry(vcpu, KVM_CPUID_FEATURES, 0);
+	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
+		(best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
+		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
+
 	/* Update physical-address width */
 	vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
 	kvm_mmu_reset_context(vcpu);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 321b3fd..0b2e7af 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1372,7 +1372,6 @@ static void init_vmcb(struct vcpu_svm *svm)
 	set_intercept(svm, INTERCEPT_RDPMC);
 	set_intercept(svm, INTERCEPT_CPUID);
 	set_intercept(svm, INTERCEPT_INVD);
-	set_intercept(svm, INTERCEPT_HLT);
 	set_intercept(svm, INTERCEPT_INVLPG);
 	set_intercept(svm, INTERCEPT_INVLPGA);
 	set_intercept(svm, INTERCEPT_IOIO_PROT);
@@ -1395,6 +1394,9 @@ static void init_vmcb(struct vcpu_svm *svm)
 		set_intercept(svm, INTERCEPT_MWAIT);
 	}
 
+	if (!kvm_hlt_in_guest(svm->vcpu.kvm))
+		set_intercept(svm, INTERCEPT_HLT);
+
 	control->iopm_base_pa = __sme_set(iopm_base);
 	control->msrpm_base_pa = __sme_set(__pa(svm->msrpm));
 	control->int_ctl = V_INTR_MASKING_MASK;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 2302ae2..fa0c5e1 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2543,6 +2543,19 @@ static int nested_vmx_check_exception(struct kvm_vcpu *vcpu, unsigned long *exit
 	return 0;
 }
 
+static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
+{
+	/*
+	 * Ensure that we clear the HLT state in the VMCS.  We don't need to
+	 * explicitly skip the instruction because if the HLT state is set,
+	 * then the instruction is already executing and RIP has already been
+	 * advanced.
+	 */
+	if (kvm_hlt_in_guest(vcpu->kvm) &&
+			vmcs_read32(GUEST_ACTIVITY_STATE) == GUEST_ACTIVITY_HLT)
+		vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE);
+}
+
 static void vmx_queue_exception(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -2573,6 +2586,8 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu)
 		intr_info |= INTR_TYPE_HARD_EXCEPTION;
 
 	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr_info);
+
+	vmx_clear_hlt(vcpu);
 }
 
 static bool vmx_rdtscp_supported(void)
@@ -5532,6 +5547,8 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 	if (kvm_mwait_in_guest(vmx->vcpu.kvm))
 		exec_control &= ~(CPU_BASED_MWAIT_EXITING |
 				CPU_BASED_MONITOR_EXITING);
+	if (kvm_hlt_in_guest(vmx->vcpu.kvm))
+		exec_control &= ~CPU_BASED_HLT_EXITING;
 	return exec_control;
 }
 
@@ -5893,6 +5910,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	update_exception_bitmap(vcpu);
 
 	vpid_sync_context(vmx->vpid);
+	if (init_event)
+		vmx_clear_hlt(vcpu);
 }
 
 /*
@@ -5963,6 +5982,8 @@ static void vmx_inject_irq(struct kvm_vcpu *vcpu)
 	} else
 		intr |= INTR_TYPE_EXT_INTR;
 	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr);
+
+	vmx_clear_hlt(vcpu);
 }
 
 static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
@@ -5993,6 +6014,8 @@ static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
 
 	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
 			INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK | NMI_VECTOR);
+
+	vmx_clear_hlt(vcpu);
 }
 
 static bool vmx_get_nmi_mask(struct kvm_vcpu *vcpu)
@@ -12314,6 +12337,7 @@ static int vmx_pre_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
 
 	vmx->nested.smm.vmxon = vmx->nested.vmxon;
 	vmx->nested.vmxon = false;
+	vmx_clear_hlt(vcpu);
 	return 0;
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5fae476..73255e6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2874,6 +2874,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		r = KVM_CLOCK_TSC_STABLE;
 		break;
 	case KVM_CAP_X86_DISABLE_EXITS:
+		r |= KVM_X86_DISABLE_EXITS_HLT;
 		if(kvm_can_mwait_in_guest())
 			r |= KVM_X86_DISABLE_EXITS_MWAIT;
 		break;
@@ -4228,6 +4229,8 @@ static int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		if ((cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT) &&
 			kvm_can_mwait_in_guest())
 			kvm->arch.mwait_in_guest = true;
+		if (cap->args[0] & KVM_X86_DISABLE_EXITS_HLT)
+			kvm->arch.hlt_in_guest = true;
 		r = 0;
 		break;
 	default:
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index cd1215e..d4ddb00 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -263,11 +263,18 @@ static inline u64 nsec_to_cycles(struct kvm_vcpu *vcpu, u64 nsec)
 })
 
 #define KVM_X86_DISABLE_EXITS_MWAIT          (1 << 0)
-#define KVM_X86_DISABLE_VALID_EXITS          (KVM_X86_DISABLE_EXITS_MWAIT)
+#define KVM_X86_DISABLE_EXITS_HLT            (1 << 1)
+#define KVM_X86_DISABLE_VALID_EXITS          (KVM_X86_DISABLE_EXITS_MWAIT | \
+                                              KVM_X86_DISABLE_EXITS_HLT)
 
 static inline bool kvm_mwait_in_guest(struct kvm *kvm)
 {
 	return kvm->arch.mwait_in_guest;
 }
 
+static inline bool kvm_hlt_in_guest(struct kvm *kvm)
+{
+	return kvm->arch.hlt_in_guest;
+}
+
 #endif
-- 
2.7.4
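
Note on the capability semantics: the handling of args[0] in this patch can be modelled outside the kernel roughly as below. This is only a sketch for reasoning about the bitmask. The names struct vm_flags, disable_exits_supported() and disable_exits_enable() are hypothetical and are not kernel symbols, and the -EINVAL check on bits outside the valid mask is assumed from the api.txt text ("-EINVAL when args[0] contains invalid exits") rather than taken from a hunk shown here.

```c
/*
 * Standalone model of the KVM_CAP_X86_DISABLE_EXITS argument handling.
 * Illustrative only: every name below is hypothetical, not a kernel API.
 */
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define DISABLE_EXITS_MWAIT (1ULL << 0)
#define DISABLE_EXITS_HLT   (1ULL << 1)
#define DISABLE_VALID_EXITS (DISABLE_EXITS_MWAIT | DISABLE_EXITS_HLT)

/* Stand-in for the two per-VM booleans kept in struct kvm_arch. */
struct vm_flags {
	bool mwait_in_guest;
	bool hlt_in_guest;
};

/*
 * Bits a capability query would report: HLT is always offered,
 * MWAIT only when the host can safely run MWAIT in the guest.
 */
static uint64_t disable_exits_supported(bool can_mwait)
{
	uint64_t r = DISABLE_EXITS_HLT;

	if (can_mwait)
		r |= DISABLE_EXITS_MWAIT;
	return r;
}

/* Returns 0 on success, -EINVAL when args0 has bits outside the valid mask. */
static int disable_exits_enable(struct vm_flags *vm, uint64_t args0,
				bool can_mwait)
{
	if (args0 & ~DISABLE_VALID_EXITS)
		return -EINVAL;	/* assumed check, see note above */

	/* An MWAIT request is silently ignored if the host cannot honour it. */
	if ((args0 & DISABLE_EXITS_MWAIT) && can_mwait)
		vm->mwait_in_guest = true;
	if (args0 & DISABLE_EXITS_HLT)
		vm->hlt_in_guest = true;
	return 0;
}
```

In this model a request for HLT alone succeeds on any host, while a request that includes an undefined bit fails without setting either flag.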