From: KarimAllah Ahmed
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: KarimAllah Ahmed, Paolo Bonzini, Radim Krčmář, Thomas Gleixner,
    Ingo Molnar, "H. Peter Anvin", x86@kernel.org
Subject: [PATCH] X86/VMX: Disable VMX preemption timer if MWAIT is not intercepted
Date: Tue, 10 Apr 2018 10:50:11 +0200
Message-Id: <1523350211-5747-1-git-send-email-karahmed@amazon.de>

The VMX-preemption timer is used by KVM as a way to set deadlines for
the guest (i.e. timer emulation). That was safe until very recently,
when the KVM_X86_DISABLE_EXITS_MWAIT capability, which disables
intercepting MWAIT, was introduced. According to Intel SDM 25.5.1:

"""
The VMX-preemption timer operates in the C-states C0, C1, and C2; it
also operates in the shutdown and wait-for-SIPI states. If the timer
counts down to zero in any state other than the wait-for SIPI state,
the logical processor transitions to the C0 C-state and causes a VM
exit; the timer does not cause a VM exit if it counts down to zero in
the wait-for-SIPI state. The timer is not decremented in C-states
deeper than C2.
"""

Once the guest issues MWAIT with a C-state deeper than C2, the
preemption timer will never wake it up again, since it has stopped
ticking!
Usually this is masked by other activity in the system that wakes the
core from the deep C-state (and causes a VM exit), for example the
host tick or incoming interrupts.

So disable the VMX-preemption timer if MWAIT is exposed to the guest!

Cc: Paolo Bonzini
Cc: Radim Krčmář
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: H. Peter Anvin
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: KarimAllah Ahmed
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/lapic.c            |  3 ++-
 arch/x86/kvm/vmx.c              | 11 +++++++++--
 3 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 97448f1..5d9da9c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1090,6 +1090,7 @@ struct kvm_x86_ops {
 			      uint32_t guest_irq, bool set);
 	void (*apicv_post_state_restore)(struct kvm_vcpu *vcpu);
 
+	bool (*has_hv_timer)(struct kvm_vcpu *vcpu);
 	int (*set_hv_timer)(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc);
 	void (*cancel_hv_timer)(struct kvm_vcpu *vcpu);
 
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index a071dc1..9fb50e6 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1561,7 +1561,8 @@ static bool start_hv_timer(struct kvm_lapic *apic)
 	int r;
 
 	WARN_ON(preemptible());
-	if (!kvm_x86_ops->set_hv_timer)
+	if (!kvm_x86_ops->has_hv_timer ||
+	    !kvm_x86_ops->has_hv_timer(apic->vcpu))
 		return false;
 
 	if (!apic_lvtt_period(apic) && atomic_read(&ktimer->pending))
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index d2e54e7..d99a823 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -7112,7 +7112,7 @@ static __init int hardware_setup(void)
 		cpu_preemption_timer_multi =
 			 vmx_msr & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
 	} else {
-		kvm_x86_ops->set_hv_timer = NULL;
+		kvm_x86_ops->has_hv_timer = NULL;
 		kvm_x86_ops->cancel_hv_timer = NULL;
 	}
 
@@ -11901,6 +11901,11 @@ static inline int u64_shl_div_u64(u64 a, unsigned int shift,
 	return 0;
 }
 
+static bool vmx_has_hv_timer(struct kvm_vcpu *vcpu)
+{
+	return !kvm_pause_in_guest(vcpu->kvm);
+}
+
 static int vmx_set_hv_timer(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -12136,7 +12141,8 @@ static void pi_post_block(struct kvm_vcpu *vcpu)
 
 static void vmx_post_block(struct kvm_vcpu *vcpu)
 {
-	if (kvm_x86_ops->set_hv_timer)
+	if (kvm_x86_ops->has_hv_timer &&
+	    kvm_x86_ops->has_hv_timer(vcpu))
 		kvm_lapic_switch_to_hv_timer(vcpu);
 
 	pi_post_block(vcpu);
@@ -12592,6 +12598,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
 	.update_pi_irte = vmx_update_pi_irte,
 
 #ifdef CONFIG_X86_64
+	.has_hv_timer = vmx_has_hv_timer,
 	.set_hv_timer = vmx_set_hv_timer,
 	.cancel_hv_timer = vmx_cancel_hv_timer,
 #endif
-- 
2.7.4
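
For readers following the lapic.c and vmx.c hunks, the two-step check
(is the callback present at all, and does it allow the timer for this
vCPU) can be modelled outside the kernel. This is only a sketch under
stated assumptions: struct vm, struct vcpu, struct x86_ops,
sketch_has_hv_timer() and can_use_hv_timer() are hypothetical stand-ins
and not kernel definitions, and the mwait_in_guest flag is modelled on
the commit message (the patch itself tests kvm_pause_in_guest()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the real ones. */
struct vm {
	bool mwait_in_guest;	/* assumed flag: guest MWAIT is not intercepted */
};

struct vcpu {
	struct vm *vm;
};

struct x86_ops {
	/* NULL models a CPU without a usable VMX-preemption timer. */
	bool (*has_hv_timer)(struct vcpu *vcpu);
};

/*
 * Per-VM policy: refuse the preemption timer when the guest can put the
 * core into a deep C-state by itself, because the timer stops counting
 * there and would never fire.
 */
static bool sketch_has_hv_timer(struct vcpu *vcpu)
{
	assert(vcpu != NULL && vcpu->vm != NULL);
	return !vcpu->vm->mwait_in_guest;
}

/*
 * Same shape as the checks in the patch: first make sure the callback
 * exists, then ask it about this particular vCPU.
 */
static bool can_use_hv_timer(const struct x86_ops *ops, struct vcpu *vcpu)
{
	return ops->has_hv_timer && ops->has_hv_timer(vcpu);
}
```

In this model a NULL callback and a callback that returns false lead to
the same outcome for the caller: the preemption timer is not used for
that vCPU.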