From: Luming Yu
Date: Mon, 1 Nov 2021 05:59:52 -0400
Subject: Re: [PATCH] Clocksource: Avoid misjudgment of clocksource
To: paulmck@kernel.org
Cc: John Stultz, yanghui, Thomas Gleixner, Stephen Boyd, lkml, Shaohua Li
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Oct 22, 2021 at 7:36 PM Paul E. McKenney wrote:
>
> On Thu, Oct 21, 2021 at 05:37:24AM -0400, Luming Yu wrote:
> > On Wed, Oct 20, 2021 at 1:49 PM Paul E. McKenney wrote:
> > >
> > > On Wed, Oct 20, 2021 at 06:09:58AM -0400, Luming Yu wrote:
> > > > On Tue, Oct 19, 2021 at 1:04 AM John Stultz wrote:
> > > > >
> > > > > On Mon, Oct 18, 2021 at 9:14 PM yanghui wrote:
> > > > > > On 2021/10/19 at 12:14 AM, John Stultz wrote:
> > > > > > > On Tue, Oct 12, 2021 at 1:06 AM brookxu wrote:
> > > > > > >> John Stultz wrote on 2021/10/12 13:29:
> > > > > > >>> On Mon, Oct 11, 2021 at 10:23 PM brookxu wrote:
> > > > > > >>>> John Stultz wrote on 2021/10/12 12:52 PM:
> > > > > > >>>>> On Sat, Oct 9, 2021 at 7:04 AM brookxu wrote:
> > > > > > >>>> If we record the watchdog's start_time in clocksource_start_watchdog(), and then
> > > > > > >>>> when we verify cycles in clocksource_watchdog(), check whether the clocksource
> > > > > > >>>> watchdog is blocked. Due to MSB verification, if the blocked time is greater than
> > > > > > >>>> half of the watchdog timer max_cycles, then we can safely ignore the current
> > > > > > >>>> verification? Do you think this idea is okay?
> > > > > > >>>
> > > > > > >>> I can't say I totally understand the idea. Maybe could you clarify with a patch?
> > > > > > >>>
> > > > > > >>
> > > > > > >> Sorry, it looks almost as follows:
> > > > > > >>
> > > > > > >> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> > > > > > >> index b8a14d2..87f3b67 100644
> > > > > > >> --- a/kernel/time/clocksource.c
> > > > > > >> +++ b/kernel/time/clocksource.c
> > > > > > >> @@ -119,6 +119,7 @@
> > > > > > >>  static DECLARE_WORK(watchdog_work, clocksource_watchdog_work);
> > > > > > >>  static DEFINE_SPINLOCK(watchdog_lock);
> > > > > > >>  static int watchdog_running;
> > > > > > >> +static unsigned long watchdog_start_time;
> > > > > > >>  static atomic_t watchdog_reset_pending;
> > > > > > >>
> > > > > > >>  static inline void clocksource_watchdog_lock(unsigned long *flags)
> > > > > > >> @@ -356,6 +357,7 @@ static void clocksource_watchdog(struct timer_list *unused)
> > > > > > >>  	int next_cpu, reset_pending;
> > > > > > >>  	int64_t wd_nsec, cs_nsec;
> > > > > > >>  	struct clocksource *cs;
> > > > > > >> +	unsigned long max_jiffies;
> > > > > > >>  	u32 md;
> > > > > > >>
> > > > > > >>  	spin_lock(&watchdog_lock);
> > > > > > >> @@ -402,6 +404,10 @@ static void clocksource_watchdog(struct timer_list *unused)
> > > > > > >>  		if (atomic_read(&watchdog_reset_pending))
> > > > > > >>  			continue;
> > > > > > >>
> > > > > > >> +		max_jiffies = nsecs_to_jiffies(cs->max_idle_ns);
> > > > > > >> +		if (time_is_before_jiffies(watchdog_start_time + max_jiffies))
> > > > > > >> +			continue;
> > > > > > >> +
> > > > > > >
> > > > > > > Sorry, what is the benefit of using jiffies here? Jiffies are
> > > > > > > updated by counting the number of tick intervals on the current
> > > > > > > clocksource.
> > > > > > >
> > > > > > > This seems like circular logic, where we're trying to judge the
> > > > > > > current clocksource by using something we derived from the current
> > > > > > > clocksource.
> > > > > > > That's why the watchdog clocksource is important, as it's supposed to
> > > > > > > be a separate counter that is more reliable (but likely slower) then
> > > > > > > the preferred clocksource.
> > > > > > >
> > > > > > > So I'm not really sure how this helps.
> > > > > > >
> > > > > > > The earlier patch by yanghui at least used the watchdog interval to
> > > > > > > decide if the watchdog timer had expired late. Which seemed
> > > > > > > reasonable, but I thought it might be helpful to add some sort of a
> > > > > > > counter so if the case is happening repeatedly (timers constantly
> > > > > > > being delayed) we have a better signal that the watchdog and current
> > > > > > > clocksource are out of sync. Because again, timers are fired based on
> > > > > >
> > > > > > I think only have a signal ls not enough. we need to prevent
> > > > > > clocksource from being incorrectly switched.
> > > > >
> > > > > Right, but we also have to ensure that we also properly disqualify
> > > > > clocksources that are misbehaving.
> > > > >
> > > > > In the case that the current clocksource is running very slow (imagine
> > > > > old TSCs that lowered freq with cpufreq), then system time slows down,
> > > > > so timers fire late.
> > > > > So it would constantly seem like the irqs are being delayed, so with
> > > > > your logic we would not disqualify a clearly malfunctioning
> > > > > clocksource..
> > > > >
> > > > > > The Timer callback function clocksource_watchdog() is executed in the
> > > > > > context of softirq(run_timer_softirq()). So if softirq is disabled for
> > > > > > long time(One situation is long time softlockup), clocksource_watchdog()
> > > > > > will be delay executed.
> > > > >
> > > > > Yes. The reality is that timers are often spuriously delayed. We don't
> > > > > want a short burst of timer misbehavior to disqualify a good
> > > > > clocksource.
> > > > >
> > > > > But the problem is that this situation and the one above (with the
> > > > > freq changing TSC), will look exactly the same.
> > > > >
> > > > > So having a situation where if the watchdog clocksource thinks too
> > > > > much time has passed between watchdog timers, we can skip judgement,
> > > > > assuming its a spurious delay. But I think we need to keep a counter
> > > > > so that if this happens 3-5 times in a row, we stop ignoring the
> > > > > misbehavior and judge the current clocksource, as it may be running
> > > > > slowly.
> > > > >
> > > > > > >
> > > > > > I think it will be better to add this to my patch:
> > > > > >  /*
> > > > > >   * Interval: 0.5sec.
> > > > > > - * MaxInterval: 1s.
> > > > > > + * MaxInterval: 20s.
> > > > > >   */
> > > > > >  #define WATCHDOG_INTERVAL (HZ >> 1)
> > > > > > -#define WATCHDOG_MAX_INTERVAL_NS (NSEC_PER_SEC)
> > > > > > +#define WATCHDOG_MAX_INTERVAL_NS (20 * NSEC_PER_SEC)
> > > > > >
> > > > >
> > > > > Some watchdog counters wrap within 20 seconds, so I don't think this
> > > > > is a good idea.
> > > > >
> > > > > The other proposal to calculate the error rate, rather than a fixed
> > > > > error boundary might be useful too, as if the current clocksource and
> > > > > watchdog are close, a long timer delay won't disqualify them if we
> > > > > scale the error bounds to be within an given error rate.
> > > >
> > > > In most of tsc unstable trouble shooting on modern servers we experienced,
> > > > it usually ends up in a false alarm triggered by the clock source
> > > > watchdog for tsc.
> > > >
> > > > I think Paul has a proposal to make a clock source watchdog to be more
> > > > intelligent.
> > > > Its job is to find a real problem instead of causing a problem.
> > >
> > > And that proposal is now in mainline:
> >
> > Great! :-)
> > >
> > > 22a223833716 clocksource: Print deviation in nanoseconds when a clocksource becomes unstable
> > > 1253b9b87e42 clocksource: Provide kernel module to test clocksource watchdog
> > > 2e27e793e280 clocksource: Reduce clocksource-skew threshold
> > > fa218f1cce6b clocksource: Limit number of CPUs checked for clock synchronization
> > > 7560c02bdffb clocksource: Check per-CPU clock synchronization when marked unstable
> > > db3a34e17433 clocksource: Retry clock read if long delays detected
> > >
> > > The strategy is to disqualify a clock comparison if the reads took too
> > > long, and to retry the comparison. If four consecutive comparison take
> > > too long, clock skew is reported. The number of consecutive comparisons
> > > may be adjusted by the usual kernel boot parameter.
> > >
> > > > so disabling it for known good-tsc might be a reasonable good idea
> > > > that can save manpower for other
> > > > more valuable problems to solve, or at least make it statistically a
> > > > problem less chance to happen.
> > >
> > > One additional piece that is still in prototype state in -rcu is to give
> > > clocksources some opportunity to resynchronize if there are too many
> > > slow comparisons. This is intended to handle cases where clocks often
> >
> > if there is such tsc-sync algorithm existing in software, it really
> > can help system software engineers
> > to solve some rare power good signals synchronization problem caused
> > by bios that caused
> > boot time tsc sync check failure that usually would consume huge
> > debugging engine for bringing up qualified linux system.
> >
> > Less depending on platform quirks should be good thing to linux for
> > tsc && rcu support.
>
> Good point, I have procrastinated long enough.
>
> How about like this?

Sorry, I meant a better algorithm that uses the TSC adjust register,
like the one tried in arch/x86/kernel/tsc_sync.c.

>
> 							Thanx, Paul
>
> ------------------------------------------------------------------------
>
> commit 9ec2a03bbf4bee3d9fbc02a402dee36efafc5a2d
> Author: Paul E. McKenney
> Date:   Thu May 27 11:03:28 2021 -0700
>
>     clocksource: Forgive repeated long-latency watchdog clocksource reads
>
>     Currently, the clocksource watchdog reacts to repeated long-latency
>     clocksource reads by marking that clocksource unstable on the theory that
>     these long-latency reads are a sign of a serious problem.  And this theory

Maybe we need to use another core's TSC as the reference clock instead of
HPET, which, to my knowledge, is where the problem happens.
If we rule out HPET and other slow clock devices as obviously wrong
choices of reference clock for the TSC, I guess there will be less chance
that we (in kernel code) get bothered by other latency problems perceived
in the clocksource watchdog.

>     does in fact have real-world support in the form of firmware issues [1].
>
>     However, it is also possible to trigger this using stress-ng on what
>     the stress-ng man page terms "poorly designed hardware" [2].  And it
>     is not necessarily a bad thing for the kernel to diagnose cases where
>     heavy memory-contention workloads are being run on hardware that is not
>     designed for this sort of use.
>
>     Nevertheless, it is quite possible that real-world use will result in
>     some situation requiring that high-stress workloads run on hardware
>     not designed to accommodate them, and also requiring that the kernel
>     refrain from marking clocksources unstable.
>
>     Therefore, react to persistent long-latency reads by leaving the
>     clocksource alone, but using the old 62.5-millisecond skew-detection
>     threshold.  In addition, the offending clocksource is marked for
>     re-initialization, which both restarts that clocksource with a clean bill
>     of health and avoids false-positive skew reports on later watchdog checks.
>     Once marked for re-initialization, the clocksource is not subjected to
>     further watchdog checking until a subsequent successful read from that
>     clocksource that is free of excessive delays.
>
>     However, if clocksource.max_cswd_coarse_reads consecutive clocksource read
>     attempts result in long latencies, a warning (splat) will be emitted.
>     This kernel boot parameter defaults to 100, and this warning can be
>     disabled by setting it to zero or to a negative value.
>
>     [ paulmck: Apply feedback from Chao Gao ]
>
>     Link: https://lore.kernel.org/lkml/20210513155515.GB23902@xsang-OptiPlex-9020/ # [1]
>     Link: https://lore.kernel.org/lkml/20210521083322.GG25531@xsang-OptiPlex-9020/ # [2]
>     Link: https://lore.kernel.org/lkml/20210521084405.GH25531@xsang-OptiPlex-9020/
>     Link: https://lore.kernel.org/lkml/20210511233403.GA2896757@paulmck-ThinkPad-P17-Gen-1/
>     Tested-by: Chao Gao
>     Signed-off-by: Paul E. McKenney
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 316027c3aadc..61d2436ae9df 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -600,6 +600,14 @@
>  			loops can be debugged more effectively on production
>  			systems.
>
> +	clocksource.max_cswd_coarse_reads= [KNL]
> +			Number of consecutive clocksource_watchdog()
> +			coarse reads (that is, clocksource reads that
> +			were unduly delayed) that are permitted before
> +			the kernel complains (gently).  Set to a value
> +			less than or equal to zero to suppress these
> +			complaints.
> +
>  	clocksource.max_cswd_read_retries= [KNL]
>  			Number of clocksource_watchdog() retries due to
>  			external delays before the clock will be marked
> diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
> index 1d42d4b17327..3e925d9ffc31 100644
> --- a/include/linux/clocksource.h
> +++ b/include/linux/clocksource.h
> @@ -110,6 +110,7 @@ struct clocksource {
>  	int			rating;
>  	enum clocksource_ids	id;
>  	enum vdso_clock_mode	vdso_clock_mode;
> +	unsigned int		n_coarse_reads;
>  	unsigned long		flags;
>
>  	int			(*enable)(struct clocksource *cs);
> @@ -291,6 +292,7 @@ static inline void timer_probe(void) {}
>  #define TIMER_ACPI_DECLARE(name, table_id, fn)		\
>  	ACPI_DECLARE_PROBE_ENTRY(timer, name, table_id, 0, NULL, 0, fn)
>
> +extern int max_cswd_coarse_reads;
>  extern ulong max_cswd_read_retries;
>  void clocksource_verify_percpu(struct clocksource *cs);
>
> diff --git a/kernel/time/clocksource-wdtest.c b/kernel/time/clocksource-wdtest.c
> index df922f49d171..7e82500c400b 100644
> --- a/kernel/time/clocksource-wdtest.c
> +++ b/kernel/time/clocksource-wdtest.c
> @@ -145,13 +145,12 @@ static int wdtest_func(void *arg)
>  		else if (i <= max_cswd_read_retries)
>  			s = ", expect message";
>  		else
> -			s = ", expect clock skew";
> +			s = ", expect coarse-grained clock skew check and re-initialization";
>  		pr_info("--- Watchdog with %dx error injection, %lu retries%s.\n", i, max_cswd_read_retries, s);
>  		WRITE_ONCE(wdtest_ktime_read_ndelays, i);
>  		schedule_timeout_uninterruptible(2 * HZ);
>  		WARN_ON_ONCE(READ_ONCE(wdtest_ktime_read_ndelays));
> -		WARN_ON_ONCE((i <= max_cswd_read_retries) !=
> -			     !(clocksource_wdtest_ktime.flags & CLOCK_SOURCE_UNSTABLE));
> +		WARN_ON_ONCE(clocksource_wdtest_ktime.flags & CLOCK_SOURCE_UNSTABLE);
>  		wdtest_ktime_clocksource_reset();
>  	}
>
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index b8a14d2fb5ba..796a127aabb9 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -199,6 +199,9 @@ void clocksource_mark_unstable(struct clocksource *cs)
>  	spin_unlock_irqrestore(&watchdog_lock, flags);
>  }
>
> +int max_cswd_coarse_reads = 100;
> +module_param(max_cswd_coarse_reads, int, 0644);
> +EXPORT_SYMBOL_GPL(max_cswd_coarse_reads);
>  ulong max_cswd_read_retries = 3;
>  module_param(max_cswd_read_retries, ulong, 0644);
>  EXPORT_SYMBOL_GPL(max_cswd_read_retries);
> @@ -226,13 +229,22 @@ static bool cs_watchdog_read(struct clocksource *cs, u64 *csnow, u64 *wdnow)
>  				pr_warn("timekeeping watchdog on CPU%d: %s retried %d times before success\n",
>  					smp_processor_id(), watchdog->name, nretries);
>  			}
> -			return true;
> +			cs->n_coarse_reads = 0;
> +			return false;
>  		}
> +		WARN_ONCE(max_cswd_coarse_reads > 0 &&
> +			  !(++cs->n_coarse_reads % max_cswd_coarse_reads),
> +			  "timekeeping watchdog on CPU%d: %s %u consecutive coarse-grained reads\n", smp_processor_id(), watchdog->name, cs->n_coarse_reads);
>  	}
>
> -	pr_warn("timekeeping watchdog on CPU%d: %s read-back delay of %lldns, attempt %d, marking unstable\n",
> -		smp_processor_id(), watchdog->name, wd_delay, nretries);
> -	return false;
> +	if ((cs->flags & CLOCK_SOURCE_WATCHDOG) && !atomic_read(&watchdog_reset_pending)) {
> +		pr_warn("timekeeping watchdog on CPU%d: %s read-back delay of %lldns, attempt %d, coarse-grained skew check followed by re-initialization\n",
> +			smp_processor_id(), watchdog->name, wd_delay, nretries);
> +	} else {
> +		pr_warn("timekeeping watchdog on CPU%d: %s read-back delay of %lldns, attempt %d, awaiting re-initialization\n",
> +			smp_processor_id(), watchdog->name, wd_delay, nretries);
> +	}
> +	return true;
>  }
>
>  static u64 csnow_mid;
> @@ -356,6 +368,7 @@ static void clocksource_watchdog(struct timer_list *unused)
>  	int next_cpu, reset_pending;
>  	int64_t wd_nsec, cs_nsec;
>  	struct clocksource *cs;
> +	bool coarse;
>  	u32 md;
>
>  	spin_lock(&watchdog_lock);
> @@ -373,16 +386,13 @@ static void clocksource_watchdog(struct timer_list *unused)
>  			continue;
>  		}
>
> -		if (!cs_watchdog_read(cs, &csnow, &wdnow)) {
> -			/* Clock readout unreliable, so give it up. */
> -			__clocksource_unstable(cs);
> -			continue;
> -		}
> +		coarse = cs_watchdog_read(cs, &csnow, &wdnow);
>
>  		/* Clocksource initialized ? */
>  		if (!(cs->flags & CLOCK_SOURCE_WATCHDOG) ||
>  		    atomic_read(&watchdog_reset_pending)) {
> -			cs->flags |= CLOCK_SOURCE_WATCHDOG;
> +			if (!coarse)
> +				cs->flags |= CLOCK_SOURCE_WATCHDOG;
>  			cs->wd_last = wdnow;
>  			cs->cs_last = csnow;
>  			continue;
> @@ -403,7 +413,13 @@ static void clocksource_watchdog(struct timer_list *unused)
>  			continue;
>
>  		/* Check the deviation from the watchdog clocksource. */
> -		md = cs->uncertainty_margin + watchdog->uncertainty_margin;
> +		if (coarse) {
> +			md = 62500 * NSEC_PER_USEC;
> +			cs->flags &= ~CLOCK_SOURCE_WATCHDOG;
> +			pr_warn("timekeeping watchdog on CPU%d: %s coarse-grained %lu.%03lu ms clock-skew check followed by re-initialization\n", smp_processor_id(), watchdog->name, md / NSEC_PER_MSEC, md % NSEC_PER_MSEC / NSEC_PER_USEC);
> +		} else {
> +			md = cs->uncertainty_margin + watchdog->uncertainty_margin;
> +		}
>  		if (abs(cs_nsec - wd_nsec) > md) {
>  			pr_warn("timekeeping watchdog on CPU%d: Marking clocksource '%s' as unstable because the skew is too large:\n",
>  				smp_processor_id(), cs->name);