Date: Fri, 13 Oct 2023 15:43:53 -0700
In-Reply-To: <20231006011255.4163884-1-vannapurve@google.com>
Subject: Re: [PATCH] x86/tdx: Override the tsc calibration for TDX VMs
From: Sean Christopherson
To: Vishal Annapurve
Cc: "Kirill A. Shutemov", Thomas Gleixner, Ingo Molnar, Borislav Petkov,
    Dave Hansen, Peter Zijlstra, Jun Nakajima, Isaku Yamahata, Erdem Aktas,
    Sagi Shahar, Nikolay Borisov, "Jason A. Donenfeld",
    Kuppuswamy Sathyanarayanan, "H. Peter Anvin", x86@kernel.org,
    linux-kernel@vger.kernel.org

On Fri, Oct 06, 2023, Vishal Annapurve wrote:
> TSC calibration for native execution gets the TSC frequency from CPUID,
> but also ends up setting lapic_timer_period. When using oneshot mode
> with lapic timer, predefined value of lapic_timer_period causes lapic
> timer calibration to be skipped with wrong multipliers set for lapic
> timer.
>
> To avoid this issue, override the TSC calibration step for TDX VMs to
> just calculate the TSC frequency using cpuid values.

This is a hack to work around a KVM TDX bug. Per Intel's SDM:

  The APIC timer frequency will be the processor’s bus clock or core
  crystal clock frequency (when TSC/core crystal clock ratio is enumerated
  in CPUID leaf 0x15) divided by the value specified in the divide
  configuration register.

TDX hardcodes the core crystal frequency to 25MHz, whereas KVM hardcodes the
APIC bus frequency to 1GHz. Upstream KVM's *current* behavior is fine, because
KVM doesn't advertise support for CPUID 0x15, i.e. doesn't announce to host
userspace that it's safe to expose CPUID 0x15 to the guest. Because TDX makes
exposing CPUID 0x15 mandatory, KVM needs to be taught to correctly emulate the
guest's APIC bus frequency, a.k.a. the TDX guest core crystal frequency of
25MHz.

I.e. tmict_to_ns() needs to replace APIC_BUS_CYCLE_NS with some math that makes
the guest's APIC timer actually run at 25MHz given whatever the host APIC bus
runs at.

  static inline u64 tmict_to_ns(struct kvm_lapic *apic, u32 tmict)
  {
	return (u64)tmict * APIC_BUS_CYCLE_NS * (u64)apic->divide_count;
  }

The existing guest code "works" because the calibration code effectively
discovers the host APIC bus frequency. If we really want to force calibration,
then the best way to do that would be to add a command line option to do
exactly that, not hack around a KVM TDX bug.

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 15f97c0abc9d..ce1cec6b3c18 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -723,7 +723,8 @@ unsigned long native_calibrate_tsc(void)
 	 * lapic_timer_period here to avoid having to calibrate the APIC
 	 * timer later.
 	 */
-	lapic_timer_period = crystal_khz * 1000 / HZ;
+	if (!force_lapic_timer_calibration)
+		lapic_timer_period = crystal_khz * 1000 / HZ;
 #endif
 
 	return crystal_khz * ebx_numerator / eax_denominator;

But I would be very leery of forcing calibration, as effectively calibrating to
the *host* core crystal frequency will cause the guest APIC timer to be wrong
if the VM is migrated to a host with a different core crystal frequency.
Relying on CPUID 0x15, if it's available, avoids that problem because it puts
the onus on the hypervisor to account for the new host's frequency when
emulating the APIC timer. That mess exists today, but deliberately ignoring the
mechanism that allows the host to fix the mess would be asinine IMO.

Even better would be for GCE to just enumerate support for TSC deadline
already, because KVM already does the right thing to convert guest TSC
frequency to host TSC frequency. KVM TDX would still need to add full support
for CPUID 0x15, but at least any problems with using one-shot mode would be
unlikely to impact real guests.
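
To put rough numbers on the mismatch described above, here is a minimal user-space sketch, not kernel or KVM code. It assumes HZ=1000 and a divide count of 1 purely for illustration (Linux programs a different divisor in practice), and `tmict_to_ns_scaled()` plus its `guest_bus_hz` parameter are hypothetical names, not an existing KVM interface. It only shows the shape of the math that would replace the fixed APIC_BUS_CYCLE_NS.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC      1000000000ULL
/* KVM's hardcoded assumption today: 1GHz APIC bus, i.e. 1ns per bus cycle. */
#define APIC_BUS_CYCLE_NS 1ULL

/* Guest side: same formula as native_calibrate_tsc() uses for the period. */
static uint64_t lapic_timer_period(uint64_t crystal_khz, uint64_t hz)
{
	assert(hz != 0);
	return crystal_khz * 1000 / hz;
}

/* Host side, mirroring the existing tmict_to_ns() math (fixed 1ns cycle). */
static uint64_t tmict_to_ns_fixed(uint32_t tmict, uint32_t divide_count)
{
	return (uint64_t)tmict * APIC_BUS_CYCLE_NS * (uint64_t)divide_count;
}

/*
 * Hypothetical variant: scale by the APIC bus frequency the guest was told
 * about. Whole seconds and the remainder are converted separately so the
 * intermediate product stays within 64 bits for realistic (MHz to GHz)
 * frequencies.
 */
static uint64_t tmict_to_ns_scaled(uint32_t tmict, uint32_t divide_count,
				   uint64_t guest_bus_hz)
{
	uint64_t cycles = (uint64_t)tmict * (uint64_t)divide_count;

	assert(guest_bus_hz != 0);
	return (cycles / guest_bus_hz) * NSEC_PER_SEC +
	       (cycles % guest_bus_hz) * NSEC_PER_SEC / guest_bus_hz;
}
```

With a 25MHz crystal and the assumed HZ=1000, the guest programs 25000 cycles for what it believes is a 1ms tick. The fixed 1ns conversion turns that into 25000ns, so the timer fires 40x too early, while scaling by 25MHz yields the intended 1000000ns.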