From: Thomas Gleixner
To: Maxim Levitsky, Oliver Upton
Cc: kvm list, "H. Peter Anvin", Paolo Bonzini, Jonathan Corbet, Jim Mattson, Wanpeng Li, "open list\:KERNEL SELFTEST FRAMEWORK", Vitaly Kuznetsov, Marcelo Tosatti, Sean Christopherson, open list, Ingo Molnar, "maintainer\:X86 ARCHITECTURE \(32-BIT AND 64-BIT\)", Joerg Roedel, Borislav Petkov, Shuah Khan, Andrew Jones, "open list\:DOCUMENTATION"
Subject: Re: [PATCH v2 1/3] KVM: x86: implement KVM_{GET|SET}_TSC_STATE
References: <20201203171118.372391-1-mlevitsk@redhat.com> <20201203171118.372391-2-mlevitsk@redhat.com>
Date: Tue, 08 Dec 2020 17:40:54 +0100
Message-ID: <87lfe82quh.fsf@nanos.tec.linutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Dec 08 2020 at 13:13, Maxim Levitsky wrote:
> On Mon, 2020-12-07 at 11:29 -0600, Oliver Upton wrote:
>>
>> How would a VMM maintain the phase relationship between guest TSCs
>> using these ioctls?
>
> By using the nanosecond timestamp.
>
> While I did made it optional in the V2 it was done for the sole sake of being
> able to set TSC on (re)boot to 0 from qemu, and for cases when qemu migrates
> from a VM where the feature is not enabled.
> In this case the tsc is set to the given value exactly, just like you
> can do today with KVM_SET_MSRS.
> In all other cases the nanosecond timestamp will be given.
>
> When the userspace uses the nanosecond timestamp, the phase relationship
> would not only be maintained but be exact, even if TSC reads were not
> synchronized and even if their restore on the target wasn't synchronized as well.
>
> Here is an example:
>
> Let's assume that TSC on source/target is synchronized, and that the guest TSC
> is synchronized as well.
>
> Let's call the guest TSC frequency F (guest TSC increments by F each second)
>
> We do KVM_GET_TSC_STATE on vcpu0 and receive (t0,tsc0).
> We do KVM_GET_TSC_STATE on vcpu1 after 1 second passed (exaggerated)
> and receive (t0 + 1s, tsc0 + F)

Why?

You freeze the VM and store the realtime timestamp of doing that. At
that point, assuming a fully synchronized host system, the only
interesting thing to store is the guest offset, which is the same on all
vCPUs and is already known.

So on restore the only thing which needs to be adjusted is the guest
wide offset.

     newoffset = oldoffset + (now - tfreeze)

Then set newoffset for all vCPUs. Anything else is complexity for no
value and bound to fall apart in hard to debug ways.

The offset is still the same for all vCPUs whether you can restore them
in the same nanosecond or whether you need 3 minutes for each one. It
does not matter because when you restore vCPU1 3 minutes after vCPU0
then TSC has advanced 3 minutes as well. It's still correct from the
guest POV.

Even if you support TSCADJUST and let the guest write to it, that does
not change the per guest offset at all. TSCADJUST is per [v]CPU and adds
on top:

    tscvcpu = tsc_host + guest_offset + TSC_ADJUST

Scaling is just orthogonal and does not change any of this.

Thanks,

        tglx
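
For illustration only, the two formulas above can be played through in a
small Python model. This is a rough sketch and not KVM code. All names
(VCpu, ns_to_cycles, restored_offset, guest_tsc) are hypothetical. The
conversion of the elapsed nanoseconds into TSC cycles at a fixed
frequency is an assumption which the formula above leaves implicit, and
the sign convention simply mirrors the formula as written.

```python
from dataclasses import dataclass

NSEC_PER_SEC = 1_000_000_000


@dataclass
class VCpu:
    # Hypothetical model of one vCPU: only its TSC_ADJUST value, in cycles.
    tsc_adjust: int = 0


def ns_to_cycles(ns: int, tsc_hz: int) -> int:
    # Assumption: a constant TSC frequency, so elapsed time maps linearly
    # to cycles. Integer math avoids floating point rounding.
    return ns * tsc_hz // NSEC_PER_SEC


def restored_offset(old_offset: int, t_freeze_ns: int, now_ns: int,
                    tsc_hz: int) -> int:
    # newoffset = oldoffset + (now - tfreeze), with the elapsed realtime
    # expressed in TSC cycles (the unit conversion is an assumption).
    return old_offset + ns_to_cycles(now_ns - t_freeze_ns, tsc_hz)


def guest_tsc(tsc_host: int, guest_offset: int, vcpu: VCpu) -> int:
    # tscvcpu = tsc_host + guest_offset + TSC_ADJUST
    return tsc_host + guest_offset + vcpu.tsc_adjust


if __name__ == "__main__":
    hz = 2_000_000_000  # assumed 2 GHz TSC
    # The guest wide offset is computed once and then set on every vCPU,
    # regardless of when each vCPU actually gets restored.
    offset = restored_offset(old_offset=1000,
                             t_freeze_ns=5 * NSEC_PER_SEC,
                             now_ns=8 * NSEC_PER_SEC,
                             tsc_hz=hz)
    vcpu0, vcpu1 = VCpu(), VCpu(tsc_adjust=50)
    host_now = 123_456_789  # arbitrary host TSC reading
    # Both vCPUs share the offset; they differ only by TSC_ADJUST.
    print(guest_tsc(host_now, offset, vcpu1) -
          guest_tsc(host_now, offset, vcpu0))
```

Because the offset is a single guest wide value, the order and timing of
the per-vCPU restore never enters the calculation. The only per-vCPU
term in the model is TSC_ADJUST.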