Date: Tue, 31 Oct 2023 15:39:57 -0700
In-Reply-To: <20231031115748.622578-1-paul@xen.org>
References: <20231031115748.622578-1-paul@xen.org>
Subject: Re: [PATCH v2] KVM x86/xen: add an override for PVCLOCK_TSC_STABLE_BIT
From: Sean Christopherson
To: Paul Durrant
Cc: Paolo Bonzini, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin",
	David Woodhouse, kvm@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org

On Tue, Oct 31, 2023, Paul Durrant wrote:
> From: Paul Durrant
>
> Unless explicitly told to do so (by passing 'clocksource=tsc' and
> 'tsc=stable:socket', and then jumping through some hoops concerning
> potential CPU hotplug) Xen will never use TSC as its clocksource.
> Hence, by default, a Xen guest will not see PVCLOCK_TSC_STABLE_BIT set
> in either the primary or secondary pvclock memory areas. This has
> led to bugs in some guest kernels which only become evident if
> PVCLOCK_TSC_STABLE_BIT *is* set in the pvclocks. Hence, to support
> such guests, give the VMM a new Xen HVM config flag to tell KVM to
> forcibly clear the bit in the Xen pvclocks.
>
> Signed-off-by: Paul Durrant
> ---
>  Documentation/virt/kvm/api.rst |  6 ++++++
>  arch/x86/kvm/x86.c             | 28 +++++++++++++++++++++++-----
>  arch/x86/kvm/xen.c             |  3 ++-
>  include/uapi/linux/kvm.h       |  1 +
>  4 files changed, 32 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 21a7578142a1..9752a01270df 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -8252,6 +8252,7 @@ PVHVM guests. Valid flags are::
>    #define KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL		(1 << 4)
>    #define KVM_XEN_HVM_CONFIG_EVTCHN_SEND		(1 << 5)
>    #define KVM_XEN_HVM_CONFIG_RUNSTATE_UPDATE_FLAG	(1 << 6)
> +  #define KVM_XEN_HVM_CONFIG_PVCLOCK_TSC_UNSTABLE	(1 << 7)
>
>  The KVM_XEN_HVM_CONFIG_HYPERCALL_MSR flag indicates that the KVM_XEN_HVM_CONFIG
>  ioctl is available, for the guest to set its hypercall page.
> @@ -8295,6 +8296,11 @@ behave more correctly, not using the XEN_RUNSTATE_UPDATE flag until/unless
>  specifically enabled (by the guest making the hypercall, causing the VMM
>  to enable the KVM_XEN_ATTR_TYPE_RUNSTATE_UPDATE_FLAG attribute).
>
> +The KVM_XEN_HVM_CONFIG_PVCLOCK_TSC_UNSTABLE flag indicates that KVM supports
> +clearing the PVCLOCK_TSC_STABLE_BIT flag in Xen pvclock sources. This will be
> +done when the KVM_CAP_XEN_HVM ioctl sets the
> +KVM_XEN_HVM_CONFIG_PVCLOCK_TSC_UNSTABLE flag.
> +
>  8.31 KVM_CAP_PPC_MULTITCE
>  -------------------------
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 41cce5031126..6abad6dacf07 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3096,7 +3096,8 @@ u64 get_kvmclock_ns(struct kvm *kvm)
>
>  static void kvm_setup_guest_pvclock(struct kvm_vcpu *v,
>  				    struct gfn_to_pfn_cache *gpc,
> -				    unsigned int offset)
> +				    unsigned int offset,
> +				    bool force_tsc_unstable)
>  {
>  	struct kvm_vcpu_arch *vcpu = &v->arch;
>  	struct pvclock_vcpu_time_info *guest_hv_clock;
> @@ -3122,6 +3123,10 @@ static void kvm_setup_guest_pvclock(struct kvm_vcpu *v,
>  	 */
>
>  	guest_hv_clock->version = vcpu->hv_clock.version = (guest_hv_clock->version + 1) | 1;
> +
> +	if (force_tsc_unstable)
> +		guest_hv_clock->flags &= ~PVCLOCK_TSC_STABLE_BIT;

I don't see how this works.  This clears the bit in the guest copy, then clobbers
all of guest_hv_clock with a memcpy().

	if (force_tsc_unstable)
		guest_hv_clock->flags &= ~PVCLOCK_TSC_STABLE_BIT;

	smp_wmb();

	/* retain PVCLOCK_GUEST_STOPPED if set in guest copy */
	vcpu->hv_clock.flags |= (guest_hv_clock->flags & PVCLOCK_GUEST_STOPPED);

	if (vcpu->pvclock_set_guest_stopped_request) {
		vcpu->hv_clock.flags |= PVCLOCK_GUEST_STOPPED;
		vcpu->pvclock_set_guest_stopped_request = false;
	}

	memcpy(guest_hv_clock, &vcpu->hv_clock, sizeof(*guest_hv_clock)); <= sets PVCLOCK_TSC_STABLE_BIT again, no?
	smp_wmb();

Any reason not to make this a generic "capability" instead of a Xen specific flag?
E.g. I assume these problematic guests would mishandle PVCLOCK_TSC_STABLE_BIT if
it showed up in kvmclock, but they don't use kvmclock so it's not a problem in
practice.

I doubt there's a real need or use case, but it'd require less churn and IMO is
simpler overall, e.g.

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e3eb608b6692..731b201bfd5a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3225,7 +3225,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 	/* If the host uses TSC clocksource, then it is stable */
 	pvclock_flags = 0;
-	if (use_master_clock)
+	if (use_master_clock && !vcpu->kvm.force_tsc_unstable)
 		pvclock_flags |= PVCLOCK_TSC_STABLE_BIT;
 
 	vcpu->hv_clock.flags = pvclock_flags;

I also assume this is a "set and forget" thing?  I.e. KVM can require the flag to
be set before any vCPUs are created.
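
For anyone who wants to poke at the ordering concern outside the kernel, here is
a minimal user-space model of the two approaches discussed above. It is only a
sketch under stated assumptions: struct model_pvclock, the MODEL_* bits and both
publish_*() helpers are hypothetical stand-ins made up for illustration, not the
real pvclock_vcpu_time_info layout or any KVM function, and the memory barriers
and version handling are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the pvclock flag bits (illustration only). */
#define MODEL_TSC_STABLE_BIT	(1u << 0)
#define MODEL_GUEST_STOPPED	(1u << 1)

/* Hypothetical, heavily trimmed stand-in for the pvclock time info. */
struct model_pvclock {
	uint32_t version;
	uint8_t flags;
};

/*
 * Models the ordering in the v2 patch: the bit is cleared in the guest
 * copy, then the whole guest copy is overwritten from the host copy.
 * Returns the flags the guest ends up seeing.
 */
uint8_t publish_clear_before_copy(struct model_pvclock *guest,
				  struct model_pvclock *host,
				  bool force_tsc_unstable)
{
	assert(guest && host);

	if (force_tsc_unstable)
		guest->flags &= (uint8_t)~MODEL_TSC_STABLE_BIT;

	/* retain GUEST_STOPPED if set in guest copy */
	host->flags |= guest->flags & MODEL_GUEST_STOPPED;

	/* Whatever the host copy holds wins, including the stable bit. */
	memcpy(guest, host, sizeof(*guest));
	return guest->flags;
}

/*
 * Models the alternative: never set the bit in the host copy when the
 * override is active, so the later memcpy() cannot bring it back.
 */
uint8_t publish_mask_at_source(struct model_pvclock *guest,
			       struct model_pvclock *host,
			       bool use_master_clock,
			       bool force_tsc_unstable)
{
	uint8_t pvclock_flags = 0;

	assert(guest && host);

	if (use_master_clock && !force_tsc_unstable)
		pvclock_flags |= MODEL_TSC_STABLE_BIT;

	host->flags = pvclock_flags;
	host->flags |= guest->flags & MODEL_GUEST_STOPPED;

	memcpy(guest, host, sizeof(*guest));
	return guest->flags;
}
```

With the host copy already carrying the stable bit, the first helper hands the
bit straight back to the guest even when the override is requested, while the
second one keeps it clear because the bit never reaches the host copy.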