Message-ID: <1499899504.2865.44.camel@kernel.crashing.org>
Subject: Re: [PATCH] powerpc/time: use get_tb instead of get_vtb in running_clock
From: Benjamin Herrenschmidt
To: Jia He, linuxppc-dev@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org, Paul Mackerras, Michael Ellerman, Ingo Molnar, fweisbec@gmail.com, Thomas Gleixner, John Stultz, Stanislaw Gruszka, Ivan Mikhaylov, cyrilbur@gmail.com
Date: Thu, 13 Jul 2017 08:45:04 +1000
In-Reply-To: <1499871670-24671-1-git-send-email-hejianet@gmail.com>
References: <1499871670-24671-1-git-send-email-hejianet@gmail.com>

On Wed, 2017-07-12 at 23:01 +0800, Jia He wrote:
> Virtual time base(vtb) is a register which increases only in guest.
> Any exit from guest to host will stop the vtb(saved and restored by kvm).
> But if there is an IO causes guest exits to host, the guest's watchdog
> (watchdog_timer_fn -> is_softlockup -> get_timestamp -> running_clock)
> needs to also include the time elapsed in host. get_vtb is not correct in
> this case.
>
> Also, the TB_OFFSET is well saved and restored by qemu after commit [1].
> So we can use get_tb here.

That completely defeats the purpose here... This was done specifically
to exploit the VTB which doesn't count in hypervisor mode.
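To make the difference concrete, here is a toy user-space model. It is only
a sketch and not kernel code: every sim_* name, the tick counts and the
threshold are made up for illustration, and the lockup check is a heavily
simplified stand-in for is_softlockup(). It assumes TB and VTB tick at the
same rate and that VTB stops while the hypervisor runs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: two counters ticking at the same rate. */
struct sim_cpu {
	uint64_t tb;	/* timebase: always advances */
	uint64_t vtb;	/* virtual timebase: advances only in the guest */
};

/* Guest is on the CPU: both counters advance. */
static void sim_run_guest(struct sim_cpu *c, uint64_t ticks)
{
	c->tb += ticks;
	c->vtb += ticks;
}

/* Hypervisor is on the CPU: only the timebase advances. */
static void sim_run_host(struct sim_cpu *c, uint64_t ticks)
{
	c->tb += ticks;
}

/* Simplified stand-in for is_softlockup(): too long since last touch. */
static bool sim_is_softlockup(uint64_t now, uint64_t touched, uint64_t thresh)
{
	return now - touched > thresh;
}

/*
 * The guest runs briefly, is off the CPU for a long time, then runs
 * briefly again. Report whether a TB-based and a VTB-based clock would
 * each trip the (made-up) threshold of 20 ticks.
 */
static void sim_scenario(bool *tb_fires, bool *vtb_fires)
{
	struct sim_cpu c = { 0, 0 };
	uint64_t touch_tb = c.tb;
	uint64_t touch_vtb = c.vtb;

	sim_run_guest(&c, 5);
	sim_run_host(&c, 100);	/* guest is not running here */
	sim_run_guest(&c, 5);

	*tb_fires = sim_is_softlockup(c.tb, touch_tb, 20);
	*vtb_fires = sim_is_softlockup(c.vtb, touch_vtb, 20);
}
```

With these made-up numbers TB ends at 110 and VTB at 10, so only the
TB-based check exceeds the threshold, even though the guest itself ran
for just 10 ticks.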
>
> [1] http://git.qemu.org/?p=qemu.git;a=commit;h=42043e4f1
>
> Signed-off-by: Jia He
> ---
>  arch/powerpc/kernel/time.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
> index fe6f3a2..c542dd3 100644
> --- a/arch/powerpc/kernel/time.c
> +++ b/arch/powerpc/kernel/time.c
> @@ -695,16 +695,15 @@ notrace unsigned long long sched_clock(void)
>  unsigned long long running_clock(void)
>  {
>  	/*
> -	 * Don't read the VTB as a host since KVM does not switch in host
> -	 * timebase into the VTB when it takes a guest off the CPU, reading the
> -	 * VTB would result in reading 'last switched out' guest VTB.
> +	 * Use get_tb instead of get_vtb for guest since the TB_OFFSET has been
> +	 * well saved/restored when qemu does suspend/resume.
>  	 *
>  	 * Host kernels are often compiled with CONFIG_PPC_PSERIES checked, it
>  	 * would be unsafe to rely only on the #ifdef above.
>  	 */
>  	if (firmware_has_feature(FW_FEATURE_LPAR) &&
>  	    cpu_has_feature(CPU_FTR_ARCH_207S))
> -		return mulhdu(get_vtb() - boot_tb, tb_to_ns_scale) << tb_to_ns_shift;
> +		return mulhdu(get_tb() - boot_tb, tb_to_ns_scale) << tb_to_ns_shift;
>
>  	/*
>  	 * This is a next best approximation without a VTB.