From: David Woodhouse
To: kvm@vger.kernel.org
Cc: Paolo Bonzini, Jonathan Corbet, Sean Christopherson, Thomas Gleixner,
    Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org,
    "H. Peter Anvin", Paul Durrant, Shuah Khan, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
    Oliver Upton, Marcelo Tosatti, jalliste@amazon.co.uk, sveith@amazon.de,
    zide.chen@intel.com, Dongli Zhang
Subject: [RFC PATCH v2] Cleaning up the KVM clock mess
Date: Sat, 27 Apr 2024 12:04:57 +0100
Message-ID: <20240427111929.9600-1-dwmw2@infradead.org>

Clean up the KVM clock mess somewhat so that it is either based on the
guest TSC ("master clock" mode), or on the host CLOCK_MONOTONIC_RAW in
cases where the TSC isn't usable.

Eliminate the third variant where it was based directly on the *host*
TSC, due to bugs in e.g. __get_kvmclock().

Kill off the last vestiges of the KVM clock being based on
CLOCK_MONOTONIC instead of CLOCK_MONOTONIC_RAW and thus being subject
to NTP skew.

Fix up migration support to allow the KVM clock to be saved/restored as
an arithmetic function of the guest TSC, since that's what it actually
is in the *common* case so it can be migrated precisely.
Or at least to within ±1 ns which is good enough, as discussed in
https://lore.kernel.org/kvm/c8dca08bf848e663f192de6705bf04aa3966e856.camel@infradead.org

In v2 of this series, TSC synchronization is improved and simplified a
bit too, and we allow masterclock mode to be used even when the guest
TSCs are out of sync, as long as they're running at the same *rate*.
The different *offset* shouldn't matter.

And the kvm_get_time_scale() function annoyed me by being entirely
opaque, so I studied it until my brain hurt and then added some
comments.

In v2 I also dropped the commits which were removing the periodic clock
syncs. Those are going to be needed still but *only* for
non-masterclock mode, which I'll do next. Along with ensuring that a
masterclock update while already in masterclock mode doesn't jump the
clock, and just does the same as KVM_SET_CLOCK_GUEST does to preserve
it.

Needs a *lot* more testing. I think I'm almost done refactoring the
code, so should focus on building up the tests next.

(I do still hate that we're abusing KVM_GET_CLOCK just to get the tuple
of {host_tsc, CLOCK_REALTIME} without even *caring* about the eponymous
KVM clock. Especially as this information is (a) fundamentally what the
vDSO gettimeofday() exposes to us anyway, (b) using CLOCK_REALTIME not
TAI, (c) not available on other platforms, for example for migrating
the Arm arch counter.)

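
For readers who want the "arithmetic function of the guest TSC" spelled
out, here is a minimal sketch of the pvclock-style calculation and of
one way to pick a shift/multiplier pair for a given TSC frequency. The
field names follow the pvclock ABI's struct pvclock_vcpu_time_info, but
the struct and both functions are hypothetical illustrations.
sketch_time_scale() in particular is an assumption about how such a
scale could be derived; it is not the kernel's kvm_get_time_scale().
It also assumes a compiler with unsigned __int128 (GCC or Clang).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cut-down view of the pvclock fields used for the arithmetic. */
struct pvclock_sketch {
	uint64_t tsc_timestamp;     /* guest TSC at the reference point */
	uint64_t system_time;       /* KVM clock (ns) at the reference point */
	uint32_t tsc_to_system_mul; /* 32-bit fixed-point multiplier */
	int8_t   tsc_shift;         /* pre-shift applied to the TSC delta */
};

/* KVM clock in ns for a guest TSC reading: a pure function of the TSC. */
uint64_t pvclock_sketch_ns(const struct pvclock_sketch *pv, uint64_t guest_tsc)
{
	uint64_t delta = guest_tsc - pv->tsc_timestamp;

	/* Negative shift means shift right; positive means shift left. */
	if (pv->tsc_shift >= 0)
		delta <<= pv->tsc_shift;
	else
		delta >>= -pv->tsc_shift;

	/* 64x32 multiply, then drop the 32 fractional bits (rounds down). */
	return pv->system_time +
	       (uint64_t)(((unsigned __int128)delta * pv->tsc_to_system_mul) >> 32);
}

/*
 * Illustrative only: choose shift and mul so that
 *   2^shift * mul / 2^32  ~=  1e9 / tsc_hz
 * with mul normalised into [2^31, 2^32) to keep as many bits as possible.
 */
void sketch_time_scale(uint64_t tsc_hz, int8_t *shift, uint32_t *mul)
{
	const unsigned __int128 num = (unsigned __int128)1000000000ULL << 32;
	unsigned __int128 m;
	int s = 0;

	assert(tsc_hz != 0);

	m = num / tsc_hz;
	while (m > UINT32_MAX) {        /* slow TSC: multiplier too large */
		s++;
		m = (num >> s) / tsc_hz;
	}
	while (m < (1u << 31)) {        /* fast TSC: gain precision */
		s--;
		m = (num << -s) / tsc_hz;
	}

	*shift = (int8_t)s;
	*mul = (uint32_t)m;
}
```

With a 3 GHz TSC this sketch picks shift -1 and multiplier 2863311530,
and 3000000000 ticks from a zero reference point convert to 999999999
ns rather than exactly one second. That truncation is the kind of 1 ns
discrepancy referred to above.
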
David Woodhouse (13):
  KVM: x86/xen: Do not corrupt KVM clock in kvm_xen_shared_info_init()
  KVM: x86: Improve accuracy of KVM clock when TSC scaling is in force
  KVM: x86: Explicitly disable TSC scaling without CONSTANT_TSC
  KVM: x86: Add KVM_VCPU_TSC_SCALE and fix the documentation on TSC migration
  KVM: x86: Avoid NTP frequency skew for KVM clock on 32-bit host
  KVM: x86: Fix KVM clock precision in __get_kvmclock()
  KVM: x86: Fix software TSC upscaling in kvm_update_guest_time()
  KVM: x86: Simplify and comment kvm_get_time_scale()
  KVM: x86: Remove implicit rdtsc() from kvm_compute_l1_tsc_offset()
  KVM: x86: Improve synchronization in kvm_synchronize_tsc()
  KVM: x86: Kill cur_tsc_{nsec,offset,write} fields
  KVM: x86: Allow KVM master clock mode when TSCs are offset from each other
  KVM: x86: Factor out kvm_use_master_clock()

Jack Allister (2):
  KVM: x86: Add KVM_[GS]ET_CLOCK_GUEST for accurate KVM clock migration
  KVM: selftests: Add KVM/PV clock selftest to prove timer correction

 Documentation/virt/kvm/api.rst                    |  37 ++
 Documentation/virt/kvm/devices/vcpu.rst           | 115 +++-
 arch/x86/include/asm/kvm_host.h                   |  15 +-
 arch/x86/include/uapi/asm/kvm.h                   |   6 +
 arch/x86/kvm/svm/svm.c                            |   3 +-
 arch/x86/kvm/vmx/vmx.c                            |   2 +-
 arch/x86/kvm/x86.c                                | 687 +++++++++++++++-------
 arch/x86/kvm/xen.c                                |   4 +-
 include/uapi/linux/kvm.h                          |   3 +
 tools/testing/selftests/kvm/Makefile              |   1 +
 tools/testing/selftests/kvm/x86_64/pvclock_test.c | 192 ++++++
 11 files changed, 822 insertions(+), 243 deletions(-)