Received: by 2002:a05:7412:3784:b0:e2:908c:2ebd with SMTP id jk4csp1606394rdb; Mon, 2 Oct 2023 15:15:46 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFlkPc1xWdeaVdp/V3QSHTTnP0Z0gY0f1Z3WHqli5YQ+mqo/npKteN70VVBIRrcnozpl2tM X-Received: by 2002:aca:1917:0:b0:3a7:37ae:4a47 with SMTP id l23-20020aca1917000000b003a737ae4a47mr12251637oii.12.1696284946420; Mon, 02 Oct 2023 15:15:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1696284946; cv=none; d=google.com; s=arc-20160816; b=cV8wL6GtOqFXNoEeNhNTm3nC5zyzqn7l/tCMfsbUn6EBapdcGCslIsCGR9xboS9mzt xMezn/s7gHnbNr2BYxQKyEn3tydSYny61GTRiZuT3ErDXTSP+ugL6yLYiSIBXmPyVJq1 5Fp2LSZXc6I1SM7Md8/YOVvJqL9L918aOETSEhXnVVOOPBWomM/HwXoK9hKfK9F0wfm5 s//o81vStX2gv7L0UerYY/7TJyFYg522Cg2gZpaXcy1BPdCre+JajNe+6wL5fp67QbBx ZpeNcx836IPv9YIP1WAt2pp6hH50D6t0wK6QU9JPMvXAFRu1F9drtT812Rlb7htcHirK myhA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=IiAsR0Ba9+p25e+G6xW6VNHUT7Mv8K0srKGN2fQeRBE=; fh=bI2JkY8g6cGS8rh1ULarhgG+X58l2aslFgMGQNFY9Mo=; b=ERbipmU4RWBhZGDFoAl/sDwGdaQNhLmPy/C/qkdtjoAdIsu/m+th3eELkNxYajKyQe /IanWmvsZQ9n0xsPg7njVr4MaOfnOivEC2i8Psa+ox3OyW9WF52Px4RG5/QkWK+S4GRr alFt7WVc42xa2gS2UpNxZU++cw94BTPb2tisWJ9lhUhljaLaFEkJ64TjLO4iD66Vz9I4 V8c/HZixay5gkJMghH3XOO1q76MItfaEZZfNtu3ujFmlQ2AnPUPYBqj1X38NapYyFaD/ X82i8fNNE4zdSkvsuebA9zmXm+An9Quwfd0krn7ft6LoQMT+dhbi7nfqoraijRRrgNgi k32g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=CnVuIaX3; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.32 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from agentk.vger.email (agentk.vger.email. [23.128.96.32]) by mx.google.com with ESMTPS id v24-20020a17090a899800b00274b7365250si8524765pjn.165.2023.10.02.15.15.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Oct 2023 15:15:46 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.32 as permitted sender) client-ip=23.128.96.32; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=CnVuIaX3; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.32 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by agentk.vger.email (Postfix) with ESMTP id 9FD01805A586; Mon, 2 Oct 2023 11:19:17 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at agentk.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238766AbjJBSTA (ORCPT + 99 others); Mon, 2 Oct 2023 14:19:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38472 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238759AbjJBSS5 (ORCPT ); Mon, 2 Oct 2023 14:18:57 -0400 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 192E0C6 for ; Mon, 2 Oct 2023 11:18:53 -0700 (PDT) Received: by mail-yb1-xb49.google.com with SMTP id 3f1490d57ef6-d814634fe4bso48562276.1 for ; Mon, 02 Oct 2023 11:18:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1696270732; x=1696875532; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=IiAsR0Ba9+p25e+G6xW6VNHUT7Mv8K0srKGN2fQeRBE=; b=CnVuIaX3D2J3q8zR3Ebf44tsagA0ZszhHVk1mp6tb2+h+oA2V44bf+lYlFAEnNr2VA 2aOh8dL9tBtLIqStpvOOHAVQV6k8KM/FExG4rQxR+VHMJ6+57yTPZNnww8lZAbQd4n85 9JisomqPvYp7Y5ngWyfBtdFlWjRT19ZjmrxdXJVWjM5Kc1S9VQnUQtnV30I7ng5oIjin xqK2RqbOytqOmki9nWMZn21ITd6fUZ89cv2JmkxJrTilYNStX4qGYLLp2qrQBHDwpFiU t4WWZ+8q2E1kzkGzvxT53PiI+BngZqb4mOCWR0pms2khCfrneRZC6m0DQj9DqZefgS18 jWZg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1696270732; x=1696875532; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=IiAsR0Ba9+p25e+G6xW6VNHUT7Mv8K0srKGN2fQeRBE=; b=kwaxdircsK2lbn9WcJV3KMDAyV2JaEpq4NtnNib+JhTVWG945X1BK+ZNJQz7ItqXlz Sqdwmz/ZRb1fGmdCPaWQCRe7woyBEP5sEmNRK5u71hEYqw+t2xQ0xdazpgdZpdKmYX4G ds23FIL3jZaBlaZg3JFgnrm14A8+jJJ1E0paTA9yRp26g8E81XeOHi1YQQs77iRFYYrG U4D0xCO9+93fdN/lUy1WXMFV1PVn6hwX3kr09Q2BglBMDC3y7+nmdU3cvlTnpSPcW7Md bFBo4+AK5TNy/g6IX7uXaIKXWc6eKGuOOl8ptJF+xcQ0n4cqjHNpvzgMPHIAJbVoG8MG A2Fg== X-Gm-Message-State: AOJu0Yzn/WLkfCLtX9hYdmrIu/nGitaZTZEwPi0k4Seab/3maDZKnpaR XCq+5CHG4wo7fHTsGTvTjeoDwXEHe4U= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a25:ad1e:0:b0:d7e:752f:baee with SMTP id y30-20020a25ad1e000000b00d7e752fbaeemr170399ybi.10.1696270732253; Mon, 02 Oct 2023 11:18:52 -0700 (PDT) Date: Mon, 2 Oct 2023 11:18:50 -0700 In-Reply-To: Mime-Version: 1.0 References: <20230926230649.67852-1-dongli.zhang@oracle.com> <377d9706-cc10-dfb8-5326-96c83c47338d@oracle.com> <36f3dbb1-61d7-e90a-02cf-9f151a1a3d35@oracle.com> <884aa233ef46d5209b2d1c92ce992f50a76bd656.camel@infradead.org> Message-ID: Subject: Re: [PATCH RFC 1/1] KVM: x86: add param to update master clock periodically From: Sean Christopherson To: Dongli Zhang Cc: David Woodhouse , Joe Jin , x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, Peter Zijlstra Content-Type: text/plain; charset="us-ascii" X-Spam-Status: No, score=-8.4 required=5.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on agentk.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (agentk.vger.email [0.0.0.0]); Mon, 02 Oct 2023 11:19:17 -0700 (PDT) +PeterZ Thomas and Peter, We're trying to address an issue where KVM's paravirt kvmclock drifts from the host's TSC-based monotonic raw clock because of historical reasons (at least, AFAICT), even when the TSC is constant. Due to some dubious KVM behavior, KVM may sometimes re-sync kvmclock against the host's monotonic raw clock, which causes non-trivial jumps in time from the guest's perspective. Linux-as-a-guest demotes all paravirt clock sources when the TSC is constant and nonstop, and so the goofy KVM behavior isn't likely to affect the guest's clocksource, but the guest's sched_clock() implementation keeps using the paravirt clock. Irrespective of if/how we fix the KVM host-side mess, using a paravirt clock for the scheduler when using a constant, nonstop TSC for the clocksource seems at best inefficient, and at worst unnecessarily complex and risky. Is there any reason not to prefer native_sched_clock() over whatever paravirt clock is present when the TSC is the preferred clocksource? Assuming the desirable thing to do is to use native_sched_clock() in this scenario, do we need a separate rating system, or can we simply tie the sched clock selection to the clocksource selection, e.g. override the paravirt stuff if the TSC clock has higher priority and is chosen? Some more details below (and in the rest of the thread). Thanks! On Mon, Oct 02, 2023, Dongli Zhang wrote: > Hi Sean and David, > > On 10/2/23 09:37, Sean Christopherson wrote: > > However, why does any of this matter if the host has a constant TSC? If that's > > the case, a sane setup will expose a constant TSC to the guest and the guest will > > use the TSC instead of kvmclock for the guest clocksource. > > > > Dongli, is this for long-lived "legacy" guests that were created on hosts without > > a constant TSC? If not, then why is kvmclock being used? Or heaven forbid, are > > you running on hardware without a constant TSC? :-) > > This is for test guests, and the host has all of below: > > tsc, rdtscp, constant_tsc, nonstop_tsc, tsc_deadline_timer, tsc_adjust > > A clocksource is used for two things. > > > 1. current_clocksource. Yes, I agree we should always use tsc on modern hardware. > > Do we need to update the documentation to always suggest TSC when it is > constant, as I believe many users still prefer pv clock than tsc? > > Thanks to tsc ratio scaling, the live migration will not impact tsc. > > >From the source code, the rating of kvm-clock is still higher than tsc. > > BTW., how about to decrease the rating if guest detects constant tsc? > > 166 struct clocksource kvm_clock = { > 167 .name = "kvm-clock", > 168 .read = kvm_clock_get_cycles, > 169 .rating = 400, > 170 .mask = CLOCKSOURCE_MASK(64), > 171 .flags = CLOCK_SOURCE_IS_CONTINUOUS, > 172 .enable = kvm_cs_enable, > 173 }; > > 1196 static struct clocksource clocksource_tsc = { > 1197 .name = "tsc", > 1198 .rating = 300, > 1199 .read = read_tsc, That's already done in kvmclock_init(). if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) && boot_cpu_has(X86_FEATURE_NONSTOP_TSC) && !check_tsc_unstable()) kvm_clock.rating = 299; See also: https://lore.kernel.org/all/ZOjF2DMBgW%2FzVvL3@google.com > 2. The sched_clock. > > The scheduling is impacted if there is big drift. ... > Unfortunately, the "no-kvmclock" kernel parameter disables all pv clock > operations (not only sched_clock), e.g., after line 300. ... > Should I introduce a new param to disable no-kvm-sched-clock only, or to > introduce a new param to allow the selection of sched_clock? I don't think we want a KVM-specific knob, because every flavor of paravirt guest would need to do the same thing. And unless there's a good reason to use a paravirt clock, this really shouldn't be something the guest admin needs to opt into using.