From: Vitaly Kuznetsov
To: Roman Kagan
Cc: kvm@vger.kernel.org, x86@kernel.org, Paolo Bonzini, Radim Krčmář, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", "K. Y. Srinivasan", Haiyang Zhang, Stephen Hemminger, "Michael Kelley (EOSG)", Andy Lutomirski, Mohammed Gamal, Cathy Avery, linux-kernel@vger.kernel.org, devel@linuxdriverproject.org
Subject: Re: [PATCH 6/6] x86/kvm: support Hyper-V reenlightenment
References: <20171208105000.25116-1-vkuznets@redhat.com> <20171208105000.25116-7-vkuznets@redhat.com> <20171208173909.GA4777@rkaganb.sw.ru> <877ettk2mx.fsf@vitty.brq.redhat.com>
Date: Tue, 12 Dec 2017 09:17:37 +0100
In-Reply-To: <877ettk2mx.fsf@vitty.brq.redhat.com> (Vitaly Kuznetsov's message of "Mon, 11 Dec 2017 10:57:58 +0100")
Message-ID: <87zi6o4axq.fsf@vitty.brq.redhat.com>

Vitaly Kuznetsov writes:

> Roman Kagan writes:
>
>> On Fri, Dec 08, 2017 at 11:50:00AM +0100, Vitaly Kuznetsov wrote:
>>> When we run nested KVM on Hyper-V guests we need to update masterclocks for
>>> all guests when L1 migrates to a host with different TSC frequency.
>>> Implement the procedure in the following way:
>>> - Pause all guests.
>>> - Tell our host (Hyper-V) to stop emulating TSC accesses.
>>> - Update our gtod copy, recompute clocks.
>>> - Unpause all guests.
>>>
>>> This is somewhat similar to cpufreq but we have two important differences:
>>> we can only disable TSC emulation globally (on all CPUs) and we don't know
>>> the new TSC frequency until we turn the emulation off so we can't
>>> 'prepare' ourselves to the event.
>>>
>>> Signed-off-by: Vitaly Kuznetsov
>>> ---
>>>  arch/x86/kvm/x86.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 45 insertions(+)
>>>
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 96e04a0cb921..04d90712ffd2 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -68,6 +68,7 @@
>>>  #include
>>>  #include
>>>  #include
>>> +#include
>>>
>>>  #define CREATE_TRACE_POINTS
>>>  #include "trace.h"
>>> @@ -5946,6 +5947,43 @@ static void tsc_khz_changed(void *data)
>>>  	__this_cpu_write(cpu_tsc_khz, khz);
>>>  }
>>>
>>> +void kvm_hyperv_tsc_notifier(void)
>>> +{
>>> +#ifdef CONFIG_X86_64
>>> +	struct kvm *kvm;
>>> +	struct kvm_vcpu *vcpu;
>>> +	int cpu;
>>> +
>>> +	spin_lock(&kvm_lock);
>>> +	list_for_each_entry(kvm, &vm_list, vm_list)
>>> +		kvm_make_mclock_inprogress_request(kvm);
>>> +
>>> +	hyperv_stop_tsc_emulation();
>>> +
>>> +	/* TSC frequency always matches when on Hyper-V */
>>> +	for_each_present_cpu(cpu)
>>> +		per_cpu(cpu_tsc_khz, cpu) = tsc_khz;
>>> +	kvm_max_guest_tsc_khz = tsc_khz;
>>> +
>>> +	list_for_each_entry(kvm, &vm_list, vm_list) {
>>> +		struct kvm_arch *ka = &kvm->arch;
>>> +
>>> +		spin_lock(&ka->pvclock_gtod_sync_lock);
>>> +
>>> +		pvclock_update_vm_gtod_copy(kvm);
>>> +
>>> +		kvm_for_each_vcpu(cpu, vcpu, kvm)
>>> +			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
>>> +
>>> +		kvm_for_each_vcpu(cpu, vcpu, kvm)
>>> +			kvm_clear_request(KVM_REQ_MCLOCK_INPROGRESS, vcpu);
>>> +
>>> +		spin_unlock(&ka->pvclock_gtod_sync_lock);
>>> +	}
>>> +	spin_unlock(&kvm_lock);
>>
>> Can't you skip all this if the tsc frequency hasn't changed (which
>> should probably be the case when the CPU supports tsc frequency
>> scaling)?
>>
>
> The thing is that we don't know if it changed or not: only after
> disabling TSC emulation we'll be able to read the new one from the host
> and we need to do this with all VMs paused.

(having second thoughts here)

While we don't know whether the TSC frequency has changed, we can check
the emulation status before calling the callback and, if TSC accesses
are not emulated, omit the call. However, it seems that the Hyper-V host
(as of WS2016) turns emulation on regardless of whether TSC scaling is
present.

I'll add an emulation status check before issuing the callback in v2.
The change will go into PATCH3.

-- 
Vitaly
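
To make the idea concrete, the check could be sketched roughly as below. This is a user-space illustration only, not the v2 patch: struct tsc_emulation_status, set_reenlightenment_cb(), reenlightenment_notify() and count_callback() are hypothetical names, and the status is passed in as a plain struct rather than read from the host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the emulation status reported by the host. */
struct tsc_emulation_status {
	bool in_progress;	/* true while the host emulates TSC accesses */
};

typedef void (*reenlightenment_cb_t)(void);

/* Callback registered by the consumer (KVM in the patch under discussion). */
static reenlightenment_cb_t reenlightenment_cb;

/* Counts invocations so the behaviour can be observed in this sketch. */
static int callback_invocations;

static void count_callback(void)
{
	callback_invocations++;
}

static void set_reenlightenment_cb(reenlightenment_cb_t cb)
{
	reenlightenment_cb = cb;
}

/*
 * Issue the callback only when one is registered and TSC accesses are
 * currently emulated. Returns true if the callback was invoked.
 */
static bool reenlightenment_notify(const struct tsc_emulation_status *status)
{
	assert(status != NULL);

	if (!reenlightenment_cb || !status->in_progress)
		return false;

	reenlightenment_cb();
	return true;
}
```

With count_callback() registered, a notification whose status has in_progress cleared leaves callback_invocations untouched, so the pause/update/unpause sequence is skipped. Given the WS2016 behaviour mentioned above, the check may rarely trigger in practice.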