Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753058AbcKPSKH (ORCPT ); Wed, 16 Nov 2016 13:10:07 -0500 Received: from mx1.redhat.com ([209.132.183.28]:54784 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752210AbcKPSKG (ORCPT ); Wed, 16 Nov 2016 13:10:06 -0500 Date: Wed, 16 Nov 2016 19:10:02 +0100 From: Radim =?utf-8?B?S3LEjW3DocWZ?= To: Paolo Bonzini Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, mtosatti@redhat.com Subject: Re: [PATCH v2] KVM: x86: do not go through vcpu in __get_kvmclock_ns Message-ID: <20161116181001.GB7678@potion> References: <1479145881-20828-1-git-send-email-pbonzini@redhat.com> <20161116161030.GA7678@potion> <1216909500.13168461.1479317268749.JavaMail.zimbra@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1216909500.13168461.1479317268749.JavaMail.zimbra@redhat.com> X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Wed, 16 Nov 2016 18:10:05 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2216 Lines: 48 2016-11-16 12:27-0500, Paolo Bonzini: >> > + if (!ka->use_master_clock) >> > + return ktime_get_boot_ns() + ka->kvmclock_offset; >> > >> > - return ns; >> > + hv_clock.tsc_timestamp = ka->master_cycle_now; >> > + hv_clock.system_time = ka->master_kernel_ns + ka->kvmclock_offset; >> > + kvm_get_time_scale(NSEC_PER_SEC, __this_cpu_read(cpu_tsc_khz) * 1000LL, >> > + &hv_clock.tsc_shift, >> > + &hv_clock.tsc_to_system_mul); >> >> Doesn't this result in a minor drift with scaled clock, because the >> guest can be combining two systems that approximate frequency? > > You mean instead of doing read_l1_tsc? Yes, when we avoid the scaling and use host TSC directly. >> 1) tsc_shift and tsc_to_system_mul for kvmclock scaling >> 2) hardware TSC scaling ratio >> >> If we are on a 7654321 kHz TSC and TSC-ratio scale to 1234567 kHz and >> then tsc_shift+tsc_to_system_mul kvmclock-scale to 1000000 kHz, we >> should be using multipliers of >> 0.161290204578564186163606151349022336533834941074459772460... and >> 0.810000591300431649315104000025920018921613812778083328000..., >> to achieve that. Those multipliers cannot be precisely expressed in >> what we have (shifts and 64/32 bit multipliers with intermediate values >> only up to 128 bits), so performing the scaling will result in slightly >> incorrect frequency. >> >> The result of combining two operations that alter the freqency is quite >> unlikely to cancel out and produce the same result as an operation that >> uses a different shift+multiplier to scale in one step, so I think that >> we aren't getting the same time as the guest with TSC-scaling is seeing. > > I think you get pretty good precision, since 30 fractional bits are more > or less equivalent to nanosecond precision. For example, cutting the two > ratios above to 30 fractional bits I get respectively 173184038/2^30 > and 869731512/2^30. Multiplying them gives 140279173/2^30 which matches > exactly the fixed point representation of 1000000/7654321. > > Since the TSC scaling frequency has a larger precision (32 or 48 bits), > you should get at most 1 ulp error, which is not bad. True, it would take many years to accumulate into a second. Thanks.