From: John Stultz
Date: Tue, 23 Oct 2018 11:31:00 -0700
Subject: Re: TSC to Mono-raw Drift
To: Thomas Gleixner
Cc: Christopher Hall, "H. Peter Anvin", linux-rt-users, jesus.sanchez-palencia@intel.com, Gavin Hindman, liam.r.girdwood@intel.com, Peter Zijlstra, LKML, Miroslav Lichvar
References: <20181015160945.5993-1-christopher.s.hall@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Oct 19, 2018 at 3:36 PM, John Stultz wrote:
> On Fri, Oct 19, 2018 at 1:50 PM, Thomas Gleixner wrote:
>> John,
>>
>> On Fri, 19 Oct 2018, John Stultz wrote:
>>> On Fri, Oct 19, 2018 at 11:57 AM, Thomas Gleixner wrote:
>>> > I don't think you need complex oscillation for that. The error is constant
>>> > and small enough that it is a fractional nanoseconds thing with an interval
>>> > <= 1s. So you can just add that in a regular interval. Due to it being
>>> > small you can't observe time jumping I think.
>>>
>>> Well, from the examples the trouble is we seem to be a bit fast,
>>> rather than slow.
>>> So we'll have to reduce mult by one, and rework the calculations, but
>>> maybe something like this (correcting the raw_interval value) would
>>> work.
>>
>> Shouldn't be rocket science. It's a one off calculation of adjustment value
>> and maybe the period at which the correction happens.
>>
>>> But this also sort of breaks the fundamental argument that the raw clock
>>> is a simple mult/shift transformation of the underlying clocksource
>>> counter. It's not the accuracy of the clock but the consistency that
>>> was key.
>>>
>>> The counter argument is that the raw clock is abstracting the
>>> underlying hardware so folks who would have used the TSC directly can
>>> now use the raw clock and have a generic abstracted hardware-counter
>>> interface. So userland shouldn't really be worried about the
>>> occasional injections made since they shouldn't be trying to
>>> re-generate the abstraction from the hardware themselves. <--
>>> Remember this point as we move to the next comment:)
>>>
>>> > The end-result is 'correct' as much correct it is in relation to real
>>> > nanoseconds. :)
>>> >
>>> >> I guess I'd want to understand more of the use here and the need to
>>> >> tie the raw clock back to the hardware counter it abstracts.
>>> >
>>> > The problem there is ART which is distributed to PCIe devices and ART time
>>> > stamps are exposed in various ways. ART has a fixed ratio vs. TSC so there
>>> > is a reasonable expectation that MONOTONIC_RAW is accurate.
>>>
>>> Which is maybe sort of my issue here. The raw clock provided an
>>> abstraction away from the hardware for generic usage, but then it's
>>> being re-used with other un-abstracted hardware references. So unless
>>> they use the same method of transformation, there will be problems (of
>>> varying degree).
>>
>> OTOH. If people use the CPUID provided frequency information and the TSC
>> from userspace then they get different results which is contrary to the
>> goal of providing them an abstracted way of doing it.
>
> But that's my point. If they are pulling time values from the hardware
> directly that's unabstracted. I'm not sure it's smart to be comparing
> the abstracted and unabstracted time stamps if you're worried about
> precision.
> They are sort of two separate (though similar) time domains.
>
>>> We might be able to reduce the degree in this case, but I worry the
>>> extra complexity may only cause problems for others.
>>
>> Is it really that complex to add a fixed correction value periodically?
>>
>> I don't think so and it should just work for any clocksource which is
>> exposed this way. Famous last words .....
>
> I'm not saying that the code is overly complex (at least compared to
> the rest of the timekeeping code :), but just how the accumulation is
> done is less-trivial. So if someone else is trying to mimic the
> abstracted time with unabstracted hardware values (again, not
> something I recommend, but that's sort of the usage case pushing
> this), they need to use a similar method that is slightly more
> complicated (or use slower math). It's all subtle stuff, but this makes
> something that was relatively very simple (by design) a bit harder to
> explain.

Adding Miroslav as he's always thoughtful on these sorts of issues.

I spent a little bit of time thinking this out. Unfortunately I don't
think it's a simple matter of calculating the granularity error on the
raw clock and adding it in each interval. The other trouble spot is
that the adjusted clocks (monotonic/realtime) are adjusted off of that
raw clock. So they would need to have that error added as well,
otherwise the raw and an otherwise non-adjusted monotonic clock would
drift.

However, to be correct, the NTP adjustments made would have to be made
to both the base interval + error, which mucks the math up a fair bit.

Maybe Miroslav will have a better idea here, but otherwise I'll stew
on this a bit more and see what I can come up with.

thanks
-john
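
For anyone following the arithmetic, here is a minimal Python sketch of
the granularity error discussed above and of the kind of fixed
per-second correction being proposed. It is an illustration only, not
kernel code: the 3 GHz counter frequency and the shift of 24 are made-up
values, the function names are hypothetical, and rounding mult to the
nearest integer is an assumption. It also leaves out the complication
raised above, namely that the NTP-adjusted clocks are derived from the
raw clock.

```python
from fractions import Fraction

NSEC_PER_SEC = 1_000_000_000


def calc_mult(freq_hz: int, shift: int) -> int:
    """Integer multiplier for ns = (cycles * mult) >> shift, rounded to nearest."""
    return ((NSEC_PER_SEC << shift) + freq_hz // 2) // freq_hz


def drift_ns_per_sec(freq_hz: int, mult: int, shift: int) -> Fraction:
    """Exact ns gained (+) or lost (-) per true second because mult is an integer."""
    return Fraction(freq_hz * mult, 1 << shift) - NSEC_PER_SEC


def correction_per_sec(freq_hz: int, mult: int, shift: int) -> int:
    """Fixed amount, in units of 2**-shift ns, to add once per second of cycles."""
    return (NSEC_PER_SEC << shift) - freq_hz * mult


# Hypothetical counter parameters, chosen only for illustration.
FREQ_HZ = 3_000_000_000
SHIFT = 24

mult = calc_mult(FREQ_HZ, SHIFT)
drift = drift_ns_per_sec(FREQ_HZ, mult, SHIFT)
corr = correction_per_sec(FREQ_HZ, mult, SHIFT)

# One second of cycles, accumulated in shifted ns, without and with the correction.
uncorrected_ns = (FREQ_HZ * mult) >> SHIFT
corrected_ns = (FREQ_HZ * mult + corr) >> SHIFT

print(f"mult={mult} drift={float(drift):+.3f} ns/s")
print(f"uncorrected={uncorrected_ns} ns corrected={corrected_ns} ns")
```

If the rounded mult makes the clock run fast, as in the examples in this
thread, corr comes out negative; reducing mult by one makes the clock
slow instead, so the correction becomes a positive amount to add.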