Date: Fri, 13 Oct 2023 12:02:47 -0700
Subject: Re: [PATCH RFC 1/1] KVM: x86: add param to update master clock periodically
From: Sean Christopherson
To: David Woodhouse
Cc: Dongli Zhang, Joe Jin, x86@kernel.org, kvm@vger.kernel.org,
    linux-kernel@vger.kernel.org, pbonzini@redhat.com, tglx@linutronix.de,
    mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com

On Fri, Oct 13, 2023, David Woodhouse wrote:
> On Fri, 2023-10-13 at 11:07 -0700, Sean Christopherson wrote:
> > I generally support the idea, but I think it needs to be an opt-in from userspace.
> > Essentially a "I pinky swear to give all vCPUs the same TSC frequency, to not
> > suspend the host, and to not run software/firmware that writes IA32_TSC_ADJUST".
> > AFAICT, there are too many edge cases and assumptions about userspace for KVM to
> > safely couple kvmclock to guest TSC by default.
>
> I think IA32_TSC_ADJUST is OK, isn't it? There is a "real" TSC value
> and if vCPUs adjust themselves forward and backwards from that, it's
> just handled as a delta.

I meant the host writing IA32_TSC_ADJUST.  E.g. if a host SMM handler mucks with
TSC offsets to try and hide the time spent in the SMM handler, then the platform
owner gets to keep the pieces.

> And we solved 'give all vCPUs the same TSC frequency' by making that
> KVM-wide.
>
> Maybe suspending and resuming the host can be treated like live
> migration, where you know the host TSC is different so you have to make
> do with a delta based on CLOCK_TAI.
>
> But while I'm picking on the edge cases and suggesting that we *can*
> cope with some of them, I do agree with your suggestion that "let
> kvmclock run by itself without being clamped back to
> CLOCK_MONOTONIC_RAW" should be an opt *in* feature.

Yeah, I'm of the mind that just because we can cope with some edge cases doesn't
mean we should.  At this point, kvmclock really should be considered deprecated
on modern hardware.  I.e. it needs to be supported for older VMs, but shouldn't be
advertised/used when creating entirely new VMs.

Hence my desire to go with a low effort solution for getting kvmclock to play nice
with modern hardware.

> > > [1] Yes, I believe "back" does happen. I have test failures in my queue
> > > to look at, where guests see the "Xen" clock going backwards.
> >
> > Yeah, I assume "back" can happen based purely on the weirdness of the pvclock math.
> >
> > What if we add a module param to disable KVM's TSC synchronization craziness
> > entirely?  If we first clean up the periodic sync mess, then it seems like it'd
> > be relatively straightforward to kill off all of the synchronization, including
> > the synchronization of kvmclock to the host's TSC-based CLOCK_MONOTONIC_RAW.
> >
> > Not intended to be a functional patch...
>
> Will stare harder at the actual patch when it isn't Friday night.
>
> In the meantime, I do think a KVM cap that the VMM opts into is better
> than a module param?

Hmm, yeah, I think a capability would be cleaner overall.  Then KVM could return
-EINVAL instead of silently forcing synchronization if the platform conditions
aren't met, e.g. if the TSC isn't constant or if the host timekeeping isn't
using TSC.

The interaction with kvmclock_periodic_sync might be a bit awkward, but that's
easy enough to solve with a wrapper.
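
As a rough illustration of the opt-in check described above, here is a standalone sketch in plain C. It is not KVM code: the struct, its fields, and the function name are hypothetical, and it only models the idea of returning -EINVAL when the platform conditions aren't met rather than silently falling back to forced synchronization.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical snapshot of the host conditions named in the thread.
 * Neither the struct nor its fields exist in KVM; they are assumptions
 * made purely for illustration.
 */
struct host_clock_state {
	bool tsc_is_constant;       /* host TSC ticks at a constant rate */
	bool timekeeping_uses_tsc;  /* host timekeeping is based on the TSC */
};

/*
 * Hypothetical handler for a VMM opting in to a kvmclock that is not
 * clamped back to the host clock.  Returns 0 and sets *enabled on
 * success, or -EINVAL (leaving *enabled untouched) if either platform
 * condition is not satisfied.
 */
static int enable_free_running_kvmclock(const struct host_clock_state *host,
					bool *enabled)
{
	if (!host->tsc_is_constant || !host->timekeeping_uses_tsc)
		return -EINVAL;

	*enabled = true;
	return 0;
}
```

The point of failing the request outright is that the VMM learns immediately that its "pinky swear" can't be honored on this host, instead of getting the old behavior without being told.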