Received: by 2002:a05:6a10:5bc5:0:0:0:0 with SMTP id os5csp3160539pxb; Mon, 18 Oct 2021 09:19:31 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxxT412Iqx0R2MK3Ha6BCNUg9UIcuQ1zcCTlhQds1fufscwGD6tGszvkJXvFJtgrxWCGu61 X-Received: by 2002:a50:e08a:: with SMTP id f10mr44986148edl.319.1634573971641; Mon, 18 Oct 2021 09:19:31 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1634573971; cv=none; d=google.com; s=arc-20160816; b=gxbzXSE/2LftNU+00oNvDEYQf6cy9L8o3urM0MFDvYoar9xAJ3zOBDPqfZbyx04Wfc XybYOmE/pTxM0aOQo+3AWpyjoPZ6PnX6FAva6S/AzvUPnZ4eJpp0ussfn2vvfmKh4vcc zoDhNmxKMorNo6lZEvuG58cd9NKYinK9ZAWK6LG1GM2u2FP0UeVzawUMeT58k6o9sle5 J9lozP84cDs38qVzxf76drPNAgzx/HtKnIKY7xl/I1T5Y7IasioC28u1rU72FKafBgUo AdU1c7u1E2SFkr9of/SVAtaiEhckvgYDiuJdt+aCU6GMAYqOUyoYWniMs6LeV6xnB+B+ gKVg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=39kEco6YUPZQeLiBTH5juEbmjZF1/s82FjRHzL/nA70=; b=x4eYTOSpnM5qxRtYIIXQgvQ8IGSBo9zaW3hn7UgD3JLZliCAoxbKuhwnFivbtr2+UU ekrFy2H6H17Ns79keeSfwutYLLhB9eqIPQuJJKBVb9G5cjESjrrNSGE5DVQZ7+HwCgz8 yBVDQ5eejlC9t5j6qEySkGqXu5FqlsbzHqriGNST6rUgLeHbmafDfssQYoWiWFE8ZQWT Uk19+RltLIUXeCrc/exlfReEt+eO0DbzkiXKkEWwLvO5zqTe9KKH3FxD223PWRrh88XY tL9eMaoEwslMl1ZRtJPtp0R4k9Hj425dIkhFol5kJJmqGpG/GgGCk5i14xcW3lKpdSJw oYMg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=mRjscxU0; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id g2si20037155edz.500.2021.10.18.09.19.06; Mon, 18 Oct 2021 09:19:31 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=mRjscxU0; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233425AbhJRQQ6 (ORCPT + 99 others); Mon, 18 Oct 2021 12:16:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45342 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232118AbhJRQQ5 (ORCPT ); Mon, 18 Oct 2021 12:16:57 -0400 Received: from mail-lj1-x22d.google.com (mail-lj1-x22d.google.com [IPv6:2a00:1450:4864:20::22d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 16232C06161C for ; Mon, 18 Oct 2021 09:14:46 -0700 (PDT) Received: by mail-lj1-x22d.google.com with SMTP id w23so699387lje.7 for ; Mon, 18 Oct 2021 09:14:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=39kEco6YUPZQeLiBTH5juEbmjZF1/s82FjRHzL/nA70=; b=mRjscxU0/V04/PRAzPqrZxXCuXysK06p8VtMGNGNM36W7cwxiTNFl2lrpM3vF7WM4O K4l8lwBnmDjCBkHBEGhMy34EqfJ1CfAXRoiKpkcswJOKjiMbJi2PIp6Rp0jdaxVTBkEB AvvIhiq0Sba4hVXoKnpPDy4cDRzWeikNjo5wImCPa6YZnSaDOD+gYWDVQV0XTAc0ORq1 I1kU87aPvdR6eW8xr+sve+P07wy6wNfCaNlFpC6/YH0sLt9idjigmXncnNl6LyZMCsP9 C0vyRZFxO0rgz6WO9WHp4icR5cUHVoDa/32J469Rn4pfZS2LTTNsW1IiwMqjhfZSDYyK dwrA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=39kEco6YUPZQeLiBTH5juEbmjZF1/s82FjRHzL/nA70=; b=OUvfKKbS93JuYfS1IUQVVnVfxg1rxkU6+ulJaOIRdJ4CUDI8KtMO1u0yTLGeI9y5AA nCnjDUkb02+jLAlNt/bL0LpuO8oRByaetcvScrg/CLqcTZP+cjyHnyj7M3jNvdEK/rZN 1F7tBRNNRK+wnqqY2EesJJkd1gMOAwXqR8TUiXolkQL5lttkLuPzqYCpGeQciaHwQaf0 0tX5jB7JPRssAdWaICW8QPm7cRSgjsluK+mOMtNS/iOxjc875OazhbXZXMnuRKq/BEkh MK0jtgIChaFXXrukNJyu9r5PmMCP0gf9CbhoayPK7qX0Gi85aHcYyx3VdOTbi1X2wQn1 aJyA== X-Gm-Message-State: AOAM530xck/FVFqOA/Efd0tDqPH/AUcvo+CIipO/DVolZodsxG75aMDf m1YGwKb2tRtF5jo6zCB/VIX5RuXoW6mv7a8sJF6XDQ== X-Received: by 2002:a2e:9616:: with SMTP id v22mr715661ljh.25.1634573684439; Mon, 18 Oct 2021 09:14:44 -0700 (PDT) MIME-Version: 1.0 References: <20211008080305.13401-1-yanghui.def@bytedance.com> <6b715fb7-9850-04f3-4ab8-1a2a8a2cdfbf@gmail.com> <95c1a031-6751-f90f-d003-b74fbec0e9d8@gmail.com> In-Reply-To: <95c1a031-6751-f90f-d003-b74fbec0e9d8@gmail.com> From: John Stultz Date: Mon, 18 Oct 2021 09:14:33 -0700 Message-ID: Subject: Re: [PATCH] Clocksource: Avoid misjudgment of clocksource To: brookxu Cc: yanghui , Thomas Gleixner , Stephen Boyd , lkml Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Oct 12, 2021 at 1:06 AM brookxu wrote: > John Stultz wrote on 2021/10/12 13:29: > > On Mon, Oct 11, 2021 at 10:23 PM brookxu wrote: > >> John Stultz wrote on 2021/10/12 12:52 =E4=B8=8B=E5=8D=88: > >>> On Sat, Oct 9, 2021 at 7:04 AM brookxu wrote: > >> If we record the watchdog's start_time in clocksource_start_watchdog()= , and then > >> when we verify cycles in clocksource_watchdog(), check whether the clo= cksource > >> watchdog is blocked. Due to MSB verification, if the blocked time is g= reater than > >> half of the watchdog timer max_cycles, then we can safely ignore the c= urrent > >> verification? Do you think this idea is okay? > > > > I can't say I totally understand the idea. Maybe could you clarify with= a patch? > > > > Sorry, it looks almost as follows: > > diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c > index b8a14d2..87f3b67 100644 > --- a/kernel/time/clocksource.c > +++ b/kernel/time/clocksource.c > @@ -119,6 +119,7 @@ > static DECLARE_WORK(watchdog_work, clocksource_watchdog_work); > static DEFINE_SPINLOCK(watchdog_lock); > static int watchdog_running; > +static unsigned long watchdog_start_time; > static atomic_t watchdog_reset_pending; > > static inline void clocksource_watchdog_lock(unsigned long *flags) > @@ -356,6 +357,7 @@ static void clocksource_watchdog(struct timer_list *u= nused) > int next_cpu, reset_pending; > int64_t wd_nsec, cs_nsec; > struct clocksource *cs; > + unsigned long max_jiffies; > u32 md; > > spin_lock(&watchdog_lock); > @@ -402,6 +404,10 @@ static void clocksource_watchdog(struct timer_list *= unused) > if (atomic_read(&watchdog_reset_pending)) > continue; > > + max_jiffies =3D nsecs_to_jiffies(cs->max_idle_ns); > + if (time_is_before_jiffies(watchdog_start_time + max_jiff= ies)) > + continue; > + Sorry, what is the benefit of using jiffies here? Jiffies are updated by counting the number of tick intervals on the current clocksource. This seems like circular logic, where we're trying to judge the current clocksource by using something we derived from the current clocksource. That's why the watchdog clocksource is important, as it's supposed to be a separate counter that is more reliable (but likely slower) then the preferred clocksource. So I'm not really sure how this helps. The earlier patch by yanghui at least used the watchdog interval to decide if the watchdog timer had expired late. Which seemed reasonable, but I thought it might be helpful to add some sort of a counter so if the case is happening repeatedly (timers constantly being delayed) we have a better signal that the watchdog and current clocksource are out of sync. Because again, timers are fired based on the current clocksource. So constant delays likely mean things are wrong. thanks -john thanks -john