From: Like Xu
To: Paolo Bonzini
Cc: Oliver Upton, Sean Christopherson, kvm@vger.kernel.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH v3] KVM: x86/tsc: Don't sync user changes to TSC with KVM-initiated change
Date: Mon, 31 Jul 2023 16:07:58 +0800
Message-ID: <20230731080758.29482-1-likexu@tencent.com>
X-Mailer: git-send-email 2.41.0
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Mailing-List:
 linux-kernel@vger.kernel.org

From: Like Xu

Add kvm->arch.user_changed_tsc to avoid synchronizing user changes to
the TSC with the KVM-initiated change in kvm_arch_vcpu_postcreate() by
conditioning this mess on userspace having written the TSC at least
once already.

Here lies UAPI baggage: user-initiated TSC write with a small delta
(1 second) of virtual cycle time against real time is interpreted as an
attempt to synchronize the CPU. In such a scenario, the vcpu's tsc_offset
is not configured as expected, resulting in significant guest service
response latency, which is observed in our production environment.

Reported-by: Yong He
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217423
Suggested-by: Sean Christopherson
Suggested-by: Oliver Upton
Original-by: Oliver Upton
Tested-by: Like Xu
Signed-off-by: Like Xu
---
V2 -> V3 Changelog:
- Use the kvm->arch.user_changed_tsc proposal; (Oliver & Paolo)
V2: https://lore.kernel.org/kvm/20230724073516.45394-1-likexu@tencent.com/

 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/x86.c              | 23 ++++++++++++++++-------
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3bc146dfd38d..e8d423ef1474 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1303,6 +1303,7 @@ struct kvm_arch {
 	u64 cur_tsc_offset;
 	u64 cur_tsc_generation;
 	int nr_vcpus_matched_tsc;
+	bool user_changed_tsc;
 
 	u32 default_tsc_khz;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 278dbd37dab2..eeaf4ad9174d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2713,7 +2713,7 @@ static void __kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 offset, u64 tsc,
 	kvm_track_tsc_matching(vcpu);
 }
 
-static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data)
+static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data, bool user_initiated)
 {
 	struct kvm *kvm = vcpu->kvm;
 	u64 offset, ns, elapsed;
@@ -2734,20 +2734,29 @@ static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data)
 			 * kvm_clock stable after CPU hotplug
 			 */
 			synchronizing = true;
-		} else {
+		} else if (kvm->arch.user_changed_tsc) {
 			u64 tsc_exp = kvm->arch.last_tsc_write +
 						nsec_to_cycles(vcpu, elapsed);
 			u64 tsc_hz = vcpu->arch.virtual_tsc_khz * 1000LL;
 			/*
-			 * Special case: TSC write with a small delta (1 second)
-			 * of virtual cycle time against real time is
-			 * interpreted as an attempt to synchronize the CPU.
+			 * Here lies UAPI baggage: user-initiated TSC write with
+			 * a small delta (1 second) of virtual cycle time
+			 * against real time is interpreted as an attempt to
+			 * synchronize the CPU.
+			 *
+			 * Don't synchronize user changes to the TSC with the
+			 * KVM-initiated change in kvm_arch_vcpu_postcreate()
+			 * by conditioning this mess on userspace having
+			 * written the TSC at least once already.
 			 */
 			synchronizing = data < tsc_exp + tsc_hz &&
 					data + tsc_hz > tsc_exp;
 		}
 	}
 
+	if (user_initiated)
+		kvm->arch.user_changed_tsc = true;
+
 	/*
 	 * For a reliable TSC, we can match TSC offsets, and for an unstable
 	 * TSC, we add elapsed time in this computation. We could let the
@@ -3776,7 +3785,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		break;
 	case MSR_IA32_TSC:
 		if (msr_info->host_initiated) {
-			kvm_synchronize_tsc(vcpu, data);
+			kvm_synchronize_tsc(vcpu, data, true);
 		} else {
 			u64 adj = kvm_compute_l1_tsc_offset(vcpu, data) - vcpu->arch.l1_tsc_offset;
 			adjust_tsc_offset_guest(vcpu, adj);
@@ -11950,7 +11959,7 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
 	if (mutex_lock_killable(&vcpu->mutex))
 		return;
 	vcpu_load(vcpu);
-	kvm_synchronize_tsc(vcpu, 0);
+	kvm_synchronize_tsc(vcpu, 0, false);
 	vcpu_put(vcpu);
 
 	/* poll control enabled by default */

base-commit: 5a7591176c47cce363c1eed704241e5d1c42c5a6
-- 
2.41.0
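
For readers who want to poke at the heuristic outside the kernel, the following is a minimal userspace sketch of the decision this patch changes. It is not the kernel code: `struct vm_tsc_model` and `tsc_write_synchronizes()` are hypothetical names, the elapsed time is passed in directly as guest cycles instead of going through nsec_to_cycles(), and the update of `last_tsc_write` (done elsewhere in the kernel) is folded into the same function for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-VM fields involved in the decision. */
struct vm_tsc_model {
	uint64_t last_tsc_write;	/* value of the previous TSC write */
	bool user_changed_tsc;		/* userspace has written the TSC before */
};

/*
 * Returns true if a write of `data` would be treated as an attempt to
 * synchronize with the previous write.
 *
 * elapsed_cycles: guest cycles elapsed since the previous write (assumed
 *                 to be precomputed by the caller in this model)
 * tsc_hz:         guest TSC frequency, i.e. one second worth of cycles
 */
static bool tsc_write_synchronizes(struct vm_tsc_model *m, uint64_t data,
				   uint64_t elapsed_cycles, uint64_t tsc_hz,
				   bool user_initiated)
{
	bool synchronizing = false;

	assert(m);

	if (data == 0) {
		/* A write of zero is always treated as a sync request. */
		synchronizing = true;
	} else if (m->user_changed_tsc) {
		/* Only apply the 1-second window once userspace has written. */
		uint64_t tsc_exp = m->last_tsc_write + elapsed_cycles;

		synchronizing = data < tsc_exp + tsc_hz &&
				data + tsc_hz > tsc_exp;
	}

	if (user_initiated)
		m->user_changed_tsc = true;

	m->last_tsc_write = data;
	return synchronizing;
}
```

In this model, with a 1 GHz guest TSC, the KVM-initiated write of 0 at vCPU creation synchronizes and leaves `user_changed_tsc` clear. A first userspace write of 500000000 cycles is then not matched against that zero write, even though it falls inside the one-second window. A second userspace write of 500000100 is matched, because the flag is now set.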