Date: Mon, 7 Dec 2020 16:59:03 -0400
From: Jason Gunthorpe
To: Thomas Gleixner
Cc: LKML, Alexandre Belloni, Miroslav Lichvar, John Stultz, Prarit Bhargava, Alessandro Zummo, linux-rtc@vger.kernel.org, Peter Zijlstra
Subject: Re: [patch 5/8] ntp: Make the RTC synchronization more reliable
Message-ID: <20201207205903.GK5487@ziepe.ca>
References: <20201206214613.444124194@linutronix.de> <20201206220542.062910520@linutronix.de>
In-Reply-To: <20201206220542.062910520@linutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Dec 06, 2020 at 10:46:18PM +0100, Thomas Gleixner wrote:
> Miroslav reported that the periodic RTC synchronization in the NTP code
> fails more often than not to hit the specified update window.
>
> The reason is that the code uses delayed_work to schedule the update which
> needs to be in thread context as the underlying RTC might be connected via
> a slow bus, e.g. I2C. In the update function it verifies whether the
> current time is correct vs. the requirements of the underlying RTC.
>
> But delayed_work is using the timer wheel for scheduling which is
> inaccurate by design. Depending on the distance to the expiry the wheel
> gets less granular to allow batching and to avoid the cascading of the
> original timer wheel. See 500462a9de65 ("timers: Switch to a non-cascading
> wheel") and the code for further details.
>
> The code already deals with this by splitting the 660 seconds period into a
> long 659 seconds timer and then retrying with a smaller delta.
>
> But looking at the actual granularities of the timer wheel (which depend on
> the HZ configuration) the 659 seconds timer ends up in an outer wheel level
> and is affected by a worst case granularity of:
>
>   HZ          Granularity
>   1000        32s
>    250        16s
>    100        40s
>
> So the initial timer can already be off by max 12.5% which is not a big
> issue as the period of the sync is defined as ~11 minutes.
>
> The fine grained second attempt schedules to the desired update point with
> a timer expiring less than a second from now. Depending on the actual delta
> and the HZ setting even the second attempt can end up in outer wheel levels
> which have a large enough granularity to make the correctness check fail.
>
> As this is a fundamental property of the timer wheel there is no way to
> make this more accurate short of iterating in one-jiffy steps towards the
> update point.
>
> Switch it to an hrtimer instead which schedules the actual update work. The
> hrtimer will expire precisely (max 1 jiffy delay when high resolution
> timers are not available). The actual scheduling delay of the work is the
> same as before.
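
The granularity table quoted above can be reproduced with a small userspace sketch. It assumes the wheel geometry from kernel/time/timer.c (LVL_CLK_SHIFT = 3, 64 buckets per level) and a fixed depth of 9 levels, which is a simplification. The helper names lvl_start(), lvl_gran(), wheel_level() and wheel_gran_ms() are made up for this illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Wheel geometry, assumed to match kernel/time/timer.c. */
#define LVL_CLK_SHIFT 3                     /* each level is 8x coarser */
#define LVL_BITS      6                     /* 64 buckets per level */
#define LVL_SIZE      (UINT64_C(1) << LVL_BITS)
#define LVL_DEPTH     9                     /* simplification: fixed depth */

/* Smallest delta (in jiffies) that is placed in level n, for n >= 1. */
static uint64_t lvl_start(unsigned int n)
{
	return (LVL_SIZE - 1) << ((n - 1) * LVL_CLK_SHIFT);
}

/* Width of one bucket in level n, in jiffies. */
static uint64_t lvl_gran(unsigned int n)
{
	return UINT64_C(1) << (n * LVL_CLK_SHIFT);
}

/* Wheel level that a timer expiring delta jiffies from now ends up in. */
static unsigned int wheel_level(uint64_t delta_jiffies)
{
	unsigned int lvl = 0;

	while (lvl + 1 < LVL_DEPTH && delta_jiffies >= lvl_start(lvl + 1))
		lvl++;
	return lvl;
}

/* Worst case granularity in milliseconds for a timeout of delta_ms. */
static uint64_t wheel_gran_ms(uint64_t hz, uint64_t delta_ms)
{
	assert(hz > 0);
	return lvl_gran(wheel_level(delta_ms * hz / 1000)) * 1000 / hz;
}
```

With these assumptions wheel_gran_ms(1000, 659000), wheel_gran_ms(250, 659000) and wheel_gran_ms(100, 659000) give 32768, 16384 and 40960 ms, which matches the 32s/16s/40s in the table. For a hypothetical 900 ms retry at HZ=1000 the sketch still reports a 64 ms bucket, which illustrates why even the short second attempt can miss a tight update window.
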
>
> The update is triggered from do_adjtimex() which is a bit racy but not much
> more racy than it was before:
>
> 	if (ntp_synced())
> 		queue_delayed_work(system_power_efficient_wq, &sync_work, 0);
>
> which is racy when the work is currently executed and has not managed to
> reschedule itself.
>
> This becomes now:
>
> 	if (ntp_synced() && !hrtimer_is_queued(&sync_hrtimer))
> 		queue_work(system_power_efficient_wq, &sync_work);
>
> which is racy when the hrtimer has expired and the work is currently
> executed and has not yet managed to rearm the hrtimer.
>
> Not a big problem as it just schedules work for nothing.
>
> The new implementation has a safeguard in place to catch the case where
> the hrtimer is queued on entry to the work function and avoids an extra
> update attempt of the RTC that way.
>
> Reported-by: Miroslav Lichvar
> Signed-off-by: Thomas Gleixner
> Cc: John Stultz
> Cc: Prarit Bhargava
> Cc: Jason Gunthorpe
> ---
>  include/linux/timex.h      |    1
>  kernel/time/ntp.c          |   90 ++++++++++++++++++++++++---------------------
>  kernel/time/ntp_internal.h |    7 +++
>  3 files changed, 55 insertions(+), 43 deletions(-)

Reviewed-by: Jason Gunthorpe

Jason