Date: Fri, 15 Dec 2023 09:07:25 +0000
From: "Durrant, Paul"
Reply-To: paul@xen.org
To: David Woodhouse, kvm@vger.kernel.org, linux-kernel
Cc: Paul Durrant, Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin"
Subject: Re: [PATCH v3] KVM: x86/xen: improve accuracy of Xen timers
X-Mailing-List: linux-kernel@vger.kernel.org

On 14/12/2023 16:54, David Woodhouse wrote:
> From: David Woodhouse
>
> A test program such as http://david.woodhou.se/timerlat.c confirms user
> reports that timers are increasingly inaccurate as the lifetime of a
> guest increases. Reporting the actual delay observed when asking for
> 100µs of sleep, it starts off OK on a newly-launched guest but gets
> worse over time, giving incorrect sleep times:
>
> root@ip-10-0-193-21:~# ./timerlat -c -n 5
> 00000000 latency 103243/100000 (3.2430%)
> 00000001 latency 103243/100000 (3.2430%)
> 00000002 latency 103242/100000 (3.2420%)
> 00000003 latency 103245/100000 (3.2450%)
> 00000004 latency 103245/100000 (3.2450%)
>
> The biggest problem is that get_kvmclock_ns() returns inaccurate values
> when the guest TSC is scaled. The guest sees a TSC value scaled from the
> host TSC by a mul/shift conversion (hopefully done in hardware). The
> guest then converts that guest TSC value into nanoseconds using the
> mul/shift conversion given to it by the KVM pvclock information.
>
> But get_kvmclock_ns() performs only a single conversion directly from
> host TSC to nanoseconds, giving a different result. A test program at
> http://david.woodhou.se/tsdrift.c demonstrates the cumulative error
> over a day.
>
> It's non-trivial to fix get_kvmclock_ns(), although I'll come back to
> that. The actual guest hv_clock is per-CPU, and *theoretically* each
> vCPU could be running at a *different* frequency. But this patch is
> needed anyway because...
>
> The other issue with Xen timers was that the code would snapshot the
> host CLOCK_MONOTONIC at some point in time, and then... after a few
> interrupts may have occurred, some preemption perhaps... would also read
> the guest's kvmclock. Then it would proceed under the false assumption
> that those two happened at the *same* time. Any time which *actually*
> elapsed between reading the two clocks was introduced as inaccuracies
> in the time at which the timer fired.
>
> Fix it to use a variant of kvm_get_time_and_clockread(), which reads the
> host TSC just *once*, then use the returned TSC value to calculate the
> kvmclock (making sure to do that the way the guest would instead of
> making the same mistake get_kvmclock_ns() does).
>
> Sadly, hrtimers based on CLOCK_MONOTONIC_RAW are not supported, so Xen
> timers still have to use CLOCK_MONOTONIC. In practice the difference
> between the two won't matter over the timescales involved, as the
> *absolute* values don't matter; just the delta.
>
> This does mean a new variant of kvm_get_time_and_clockread() is needed;
> called kvm_get_monotonic_and_clockread() because that's what it does.
>
> Fixes: 536395260582 ("KVM: x86/xen: handle PV timers oneshot mode")
> Signed-off-by: David Woodhouse
> ---
> v3:
>  • Rebase and repost.
>
> v2:
>  • Fall back to get_kvmclock_ns() if vcpu->arch.hv_clock isn't set up
>    yet, with a big comment explaining why that's actually OK.
>  • Fix do_monotonic() *not* to add the boot time offset.
>  • Rename do_monotonic_raw() → do_kvmclock_base() and add a comment
>    to make it clear that it *does* add the boot time offset. That
>    was just left as a bear trap for the unwary developer, wasn't it?
>
>  arch/x86/kvm/x86.c |  61 +++++++++++++++++++++--
>  arch/x86/kvm/x86.h |   1 +
>  arch/x86/kvm/xen.c | 121 ++++++++++++++++++++++++++++++++++-----------
>  3 files changed, 149 insertions(+), 34 deletions(-)
>

Reviewed-by: Paul Durrant
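
For anyone following along, the mul/shift mismatch described in the commit message can be reproduced with plain integer arithmetic. The following is only a toy sketch, not the kernel code and not tsdrift.c: the helper names (time_scale, scale_delta, simulate), the 48 fractional bits in the TSC scaling ratio, and the 3 GHz host / 2 GHz guest frequencies are all assumptions chosen for illustration. It compares the two-stage conversion the guest performs (host TSC to guest TSC, then guest TSC to nanoseconds) with a single host-TSC-to-nanoseconds conversion.

```python
"""Toy model of kvmclock drift under TSC scaling (illustrative only)."""

NSEC_PER_SEC = 1_000_000_000
TSC_RATIO_FRAC_BITS = 48  # assumption: fractional bits in the scaling ratio


def time_scale(scaled_hz, base_hz):
    """Return (shift, mul) so that scale_delta() approximates
    ticks * scaled_hz / base_hz, with mul normalised to [2**31, 2**32).

    Hypothetical helper written for this sketch; it is not a copy of the
    kernel's own routine.
    """
    shift = 0
    num = scaled_hz << 32
    mul = num // base_hz
    while mul >= 1 << 32:       # ratio too large: pre-shift ticks left
        shift += 1
        mul = num // (base_hz << shift)
    while mul < 1 << 31:        # ratio small: pre-shift ticks right
        shift -= 1
        mul = (num << -shift) // base_hz
    return shift, mul


def scale_delta(ticks, shift, mul):
    """pvclock-style conversion: shift first, then multiply by a 32-bit
    fixed-point fraction and drop the low 32 bits."""
    if shift < 0:
        ticks >>= -shift
    else:
        ticks <<= shift
    return (ticks * mul) >> 32


def simulate(host_hz, guest_hz, seconds):
    """Return (ns_guest, ns_host) after `seconds` of host TSC ticks."""
    host_tsc = host_hz * seconds

    # Stage 1 (done for the guest): scale host TSC to guest TSC.
    ratio = (guest_hz << TSC_RATIO_FRAC_BITS) // host_hz
    guest_tsc = (host_tsc * ratio) >> TSC_RATIO_FRAC_BITS

    # Stage 2 (done by the guest): guest TSC to nanoseconds.
    ns_guest = scale_delta(guest_tsc, *time_scale(NSEC_PER_SEC, guest_hz))

    # Single-stage shortcut: host TSC straight to nanoseconds.
    ns_host = scale_delta(host_tsc, *time_scale(NSEC_PER_SEC, host_hz))
    return ns_guest, ns_host


if __name__ == "__main__":
    ns_guest, ns_host = simulate(3_000_000_000, 2_000_000_000, 86_400)
    print("guest view :", ns_guest)
    print("host view  :", ns_host)
    print("difference :", ns_guest - ns_host, "ns")
```

With these made-up frequencies the two paths disagree by about 20µs after one simulated day, purely from the different rounding of the two multipliers. The real magnitude depends on the actual host and guest TSC frequencies, so the number here should not be read as a measurement of the reported bug.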