Received: by 10.192.165.156 with SMTP id m28csp1388595imm; Fri, 13 Apr 2018 20:12:41 -0700 (PDT) X-Google-Smtp-Source: AIpwx49ppI7OSZaPPi99Qp7++21wnCUt5KdkAFadDDQkJBUqS5yuKPiepuhp44PR0M3IhzxNmjJB X-Received: by 2002:a17:902:47aa:: with SMTP id r39-v6mr7374696pld.59.1523675561419; Fri, 13 Apr 2018 20:12:41 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1523675561; cv=none; d=google.com; s=arc-20160816; b=X1BjCwg8tRB+BMKQwlMjFMCW69jUearZVlOgPYRCBdnp17wHHJINhezOYOrHN/UuTe 4LhvNNEBUzwG8E6df+gXlwbtYK85yUFdtkzsPHYTxJe01+e0riZJNGiJw8dtfaeFWkDV bFfKCB402akz1vTFoRMk+LhOmKB34qDCI5d3yBy+XMTvvdqLfib/hjaFQBivZXdsW4Xb tGeDt3pv5O6T/NgM0aZnFp76d6iHnP2bxeecn6IgukmnE32MOTFbAeu89m5wj4D1s9jC RQylkJQnf4StD0ATj59sex27JT9wQX1+8gpT2wtkdJjpcDIk+Yzdo2nJ3Z4a0taHS+q0 ke1g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature :arc-authentication-results; bh=0YUS/iW7eTnXzHBaL0jcgxg/mUaqzEScjgG4477ap/M=; b=EdoavRo0nIN8+FYWGTZJjMu+IFLrqFD39SFKmApRbXx9Gf1iSAmKo3/geBkyiuivJs Po4s6uwQlurqRe1wV3DyVwUGqrW3SpUPzrs329SdEtHT9Fx+QttRnMSr9FtyeUt5D+5r JzE7ljSP4/yvRy/2v0WOMY6dkIpwXx/h3r0oh/N62YFxmINi4J7wrwNqLwK85G+M7WLP 39bYOUfrjDCEH2kdx/2qp5bL3eNl0CDi+hnaDb8OhDqsXrA8TV6LqmysxocqO3gFMgCW V1JEDRgUD4YdrrQ1LN9ksRQzLjbnXj7IdWJ7YvVhk/fC87rXsDs6A7Y5yQ2RgKGcCCDA 8pZQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@amazon.de header.s=amazon201209 header.b=ZksLovjm; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amazon.de Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id y21si6099143pfi.195.2018.04.13.20.12.25; Fri, 13 Apr 2018 20:12:41 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@amazon.de header.s=amazon201209 header.b=ZksLovjm; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amazon.de Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751624AbeDNDLI (ORCPT + 99 others); Fri, 13 Apr 2018 23:11:08 -0400 Received: from smtp-fw-9102.amazon.com ([207.171.184.29]:56216 "EHLO smtp-fw-9102.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751108AbeDNDLG (ORCPT ); Fri, 13 Apr 2018 23:11:06 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.de; i=@amazon.de; q=dns/txt; s=amazon201209; t=1523675466; x=1555211466; h=from:to:cc:subject:date:message-id:mime-version: content-transfer-encoding; bh=0YUS/iW7eTnXzHBaL0jcgxg/mUaqzEScjgG4477ap/M=; b=ZksLovjmqYgL9EyDWNlY7jFDA5+YnSBj7YPYL+0POMEES+VizBXNQ1Db XXJeJnCvna5IOQat1CLAxQxE4ovcw/A+tTDROdbSqBbZQH0cJIsJwh0M/ TIf6L5nLVcmO7XeqF03+nvLOXgX6MZvLdC2A3XU5I1+iJYQNGxbbyWdmH c=; X-IronPort-AV: E=Sophos;i="5.48,447,1517875200"; d="scan'208";a="606841893" Received: from sea3-co-svc-lb6-vlan3.sea.amazon.com (HELO email-inbound-relay-2c-397e131e.us-west-2.amazon.com) ([10.47.22.38]) by smtp-border-fw-out-9102.sea19.amazon.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 14 Apr 2018 03:11:04 +0000 Received: from u54e1ad5160425a4b64ea.ant.amazon.com (pdx2-ws-svc-lb17-vlan2.amazon.com [10.247.140.66]) by email-inbound-relay-2c-397e131e.us-west-2.amazon.com (8.14.7/8.14.7) with ESMTP id w3E3B0V5056271 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 14 Apr 2018 03:11:01 GMT Received: from u54e1ad5160425a4b64ea.ant.amazon.com (localhost [127.0.0.1]) by u54e1ad5160425a4b64ea.ant.amazon.com (8.15.2/8.15.2/Debian-3) with ESMTP id w3E3AxUK027434; Sat, 14 Apr 2018 05:10:59 +0200 Received: (from karahmed@localhost) by u54e1ad5160425a4b64ea.ant.amazon.com (8.15.2/8.15.2/Submit) id w3E3AwrI027433; Sat, 14 Apr 2018 05:10:58 +0200 From: KarimAllah Ahmed To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: KarimAllah Ahmed , Jim Mattson , Paolo Bonzini , =?UTF-8?q?Radim=20Kr=C4=8Dm=C3=A1=C5=99?= Subject: [PATCH v4] X86/KVM: Properly update 'tsc_offset' to represent the running guest Date: Sat, 14 Apr 2018 05:10:52 +0200 Message-Id: <1523675452-27271-1-git-send-email-karahmed@amazon.de> X-Mailer: git-send-email 2.7.4 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Update 'tsc_offset' on vmentry/vmexit of L2 guests to ensure that it always captures the TSC_OFFSET of the running guest whether it is the L1 or L2 guest. Cc: Jim Mattson Cc: Paolo Bonzini Cc: Radim Krčmář Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Suggested-by: Paolo Bonzini Signed-off-by: KarimAllah Ahmed [AMD changes, fix update_ia32_tsc_adjust_msr. - Paolo] Signed-off-by: Paolo Bonzini --- v3 -> v4: - Restore L01 tsc_offset on enter_vmx_non_root_mode failures. - Move tsc_offset update for L02 later in nested_vmx_run. v2 -> v3: - Add AMD bits as well. - Fix update_ia32_tsc_adjust_msr. v1 -> v2: - Rewrote the patch to always update tsc_offset to represent the current guest (pbonzini@) --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/svm.c | 17 ++++++++++++++++- arch/x86/kvm/vmx.c | 29 ++++++++++++++++++++++++----- arch/x86/kvm/x86.c | 6 ++++-- 4 files changed, 45 insertions(+), 8 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 7a200f6..a40a32e 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1016,6 +1016,7 @@ struct kvm_x86_ops { bool (*has_wbinvd_exit)(void); + u64 (*read_l1_tsc_offset)(struct kvm_vcpu *vcpu); void (*write_tsc_offset)(struct kvm_vcpu *vcpu, u64 offset); void (*get_exit_info)(struct kvm_vcpu *vcpu, u64 *info1, u64 *info2); diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index b58787d..1f00c18 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1423,12 +1423,23 @@ static void init_sys_seg(struct vmcb_seg *seg, uint32_t type) seg->base = 0; } +static u64 svm_read_l1_tsc_offset(struct kvm_vcpu *vcpu) +{ + struct vcpu_svm *svm = to_svm(vcpu); + + if (is_guest_mode(vcpu)) + return svm->nested.hsave->control.tsc_offset; + + return vcpu->arch.tsc_offset; +} + static void svm_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset) { struct vcpu_svm *svm = to_svm(vcpu); u64 g_tsc_offset = 0; if (is_guest_mode(vcpu)) { + /* Write L1's TSC offset. */ g_tsc_offset = svm->vmcb->control.tsc_offset - svm->nested.hsave->control.tsc_offset; svm->nested.hsave->control.tsc_offset = offset; @@ -3322,6 +3333,7 @@ static int nested_svm_vmexit(struct vcpu_svm *svm) /* Restore the original control entries */ copy_vmcb_control_area(vmcb, hsave); + vcpu->arch.tsc_offset = svm->vmcb->control.tsc_offset; kvm_clear_exception_queue(&svm->vcpu); kvm_clear_interrupt_queue(&svm->vcpu); @@ -3482,10 +3494,12 @@ static void enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb_gpa, /* We don't want to see VMMCALLs from a nested guest */ clr_intercept(svm, INTERCEPT_VMMCALL); + vcpu->arch.tsc_offset += nested_vmcb->control.tsc_offset; + svm->vmcb->control.tsc_offset = vcpu->arch.tsc_offset; + svm->vmcb->control.virt_ext = nested_vmcb->control.virt_ext; svm->vmcb->control.int_vector = nested_vmcb->control.int_vector; svm->vmcb->control.int_state = nested_vmcb->control.int_state; - svm->vmcb->control.tsc_offset += nested_vmcb->control.tsc_offset; svm->vmcb->control.event_inj = nested_vmcb->control.event_inj; svm->vmcb->control.event_inj_err = nested_vmcb->control.event_inj_err; @@ -7102,6 +7116,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = { .has_wbinvd_exit = svm_has_wbinvd_exit, + .read_l1_tsc_offset = svm_read_l1_tsc_offset, .write_tsc_offset = svm_write_tsc_offset, .set_tdp_cr3 = set_tdp_cr3, diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index b6942de..05ba3c6 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -2885,6 +2885,17 @@ static void setup_msrs(struct vcpu_vmx *vmx) vmx_update_msr_bitmap(&vmx->vcpu); } +static u64 vmx_read_l1_tsc_offset(struct kvm_vcpu *vcpu) +{ + struct vmcs12 *vmcs12 = get_vmcs12(vcpu); + + if (is_guest_mode(vcpu) && + (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING)) + return vcpu->arch.tsc_offset - vmcs12->tsc_offset; + + return vcpu->arch.tsc_offset; +} + /* * reads and returns guest's timestamp counter "register" * guest_tsc = (host_tsc * tsc multiplier) >> 48 + tsc_offset @@ -11112,11 +11123,8 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, vmcs_write64(GUEST_IA32_PAT, vmx->vcpu.arch.pat); } - if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING) - vmcs_write64(TSC_OFFSET, - vcpu->arch.tsc_offset + vmcs12->tsc_offset); - else - vmcs_write64(TSC_OFFSET, vcpu->arch.tsc_offset); + vmcs_write64(TSC_OFFSET, vcpu->arch.tsc_offset); + if (kvm_has_tsc_control) decache_tsc_multiplier(vmx); @@ -11366,6 +11374,8 @@ static int enter_vmx_non_root_mode(struct kvm_vcpu *vcpu, bool from_vmentry) vmx_segment_cache_clear(vmx); if (prepare_vmcs02(vcpu, vmcs12, from_vmentry, &exit_qual)) { + if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING) + vcpu->arch.tsc_offset -= vmcs12->tsc_offset; leave_guest_mode(vcpu); vmx_switch_vmcs(vcpu, &vmx->vmcs01); nested_vmx_entry_failure(vcpu, vmcs12, @@ -11377,6 +11387,8 @@ static int enter_vmx_non_root_mode(struct kvm_vcpu *vcpu, bool from_vmentry) vmcs12->vm_entry_msr_load_addr, vmcs12->vm_entry_msr_load_count); if (msr_entry_idx) { + if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING) + vcpu->arch.tsc_offset -= vmcs12->tsc_offset; leave_guest_mode(vcpu); vmx_switch_vmcs(vcpu, &vmx->vmcs01); nested_vmx_entry_failure(vcpu, vmcs12, @@ -11461,6 +11473,9 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch) return 1; } + if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING) + vcpu->arch.tsc_offset += vmcs12->tsc_offset; + /* * We're finally done with prerequisite checking, and can start with * the nested entry. @@ -11964,6 +11979,9 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason, leave_guest_mode(vcpu); + if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING) + vcpu->arch.tsc_offset -= vmcs12->tsc_offset; + if (likely(!vmx->fail)) { if (exit_reason == -1) sync_vmcs12(vcpu, vmcs12); @@ -12801,6 +12819,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = { .has_wbinvd_exit = cpu_has_vmx_wbinvd_exit, + .read_l1_tsc_offset = vmx_read_l1_tsc_offset, .write_tsc_offset = vmx_write_tsc_offset, .set_tdp_cr3 = vmx_set_cr3, diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 7846a8a..2f83cb2 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1490,7 +1490,7 @@ static void kvm_track_tsc_matching(struct kvm_vcpu *vcpu) static void update_ia32_tsc_adjust_msr(struct kvm_vcpu *vcpu, s64 offset) { - u64 curr_offset = vcpu->arch.tsc_offset; + u64 curr_offset = kvm_x86_ops->read_l1_tsc_offset(vcpu); vcpu->arch.ia32_tsc_adjust_msr += offset - curr_offset; } @@ -1532,7 +1532,9 @@ static u64 kvm_compute_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc) u64 kvm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc) { - return vcpu->arch.tsc_offset + kvm_scale_tsc(vcpu, host_tsc); + u64 tsc_offset = kvm_x86_ops->read_l1_tsc_offset(vcpu); + + return tsc_offset + kvm_scale_tsc(vcpu, host_tsc); } EXPORT_SYMBOL_GPL(kvm_read_l1_tsc); -- 2.7.4