From: "Paul E. McKenney"
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, john.stultz@linaro.org, sboyd@kernel.org,
 corbet@lwn.net, Mark.Rutland@arm.com, maz@kernel.org, kernel-team@fb.com,
 neeraju@codeaurora.org, ak@linux.intel.com, feng.tang@intel.com,
 zhengjun.xing@intel.com, "Paul E. McKenney", Xing Zhengjun, Chris Mason
Subject: [PATCH v12 clocksource 2/5] clocksource: Check per-CPU clock synchronization when marked unstable
Date: Fri, 30 Apr 2021 17:32:44 -0700
Message-Id: <20210501003247.2448287-2-paulmck@kernel.org>
In-Reply-To: <20210501003204.GA2447938@paulmck-ThinkPad-P17-Gen-1>
References: <20210501003204.GA2447938@paulmck-ThinkPad-P17-Gen-1>
X-Mailing-List: linux-kernel@vger.kernel.org

Some sorts of per-CPU clock sources have a history of going out of
synchronization with each other.  However, this problem has purportedly
been solved in the past ten years.  Except that it is all too possible
that the problem has instead simply been made less likely, which might
mean that some of the occasional "Marking clocksource 'tsc' as unstable"
messages might be due to desynchronization.  How would anyone know?

Therefore apply CPU-to-CPU synchronization checking to newly unstable
clocksources that are marked with the new CLOCK_SOURCE_VERIFY_PERCPU flag.
Lists of desynchronized CPUs are printed, with the caveat that if it
is the reporting CPU that is itself desynchronized, it will appear that
all the other clocks are wrong.  Just like in real life.

Link: https://lore.kernel.org/lkml/202104291438.PuHsxRkl-lkp@intel.com/
Link: https://lore.kernel.org/lkml/20210429140440.GT975577@paulmck-ThinkPad-P17-Gen-1
Link: https://lore.kernel.org/lkml/20210425224540.GA1312438@paulmck-ThinkPad-P17-Gen-1/
Link: https://lore.kernel.org/lkml/20210420064934.GE31773@xsang-OptiPlex-9020/
Link: https://lore.kernel.org/lkml/20210106004013.GA11179@paulmck-ThinkPad-P72/
Link: https://lore.kernel.org/lkml/20210414043435.GA2812539@paulmck-ThinkPad-P17-Gen-1/
Link: https://lore.kernel.org/lkml/20210419045155.GA596058@paulmck-ThinkPad-P17-Gen-1/
Cc: John Stultz
Cc: Thomas Gleixner
Cc: Stephen Boyd
Cc: Jonathan Corbet
Cc: Mark Rutland
Cc: Marc Zyngier
Cc: Andi Kleen
Cc: Feng Tang
Cc: Xing Zhengjun
Reported-by: Chris Mason
[ paulmck: Add "static" to clocksource_verify_one_cpu() per kernel test robot feedback. ]
[ paulmck: Apply Thomas Gleixner feedback. ]
Signed-off-by: Paul E. McKenney
---
 arch/x86/kernel/tsc.c       |  3 +-
 include/linux/clocksource.h |  2 +-
 kernel/time/clocksource.c   | 60 +++++++++++++++++++++++++++++++++++++
 3 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index f70dffc2771f..56289170753c 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1151,7 +1151,8 @@ static struct clocksource clocksource_tsc = {
 	.mask			= CLOCKSOURCE_MASK(64),
 	.flags			= CLOCK_SOURCE_IS_CONTINUOUS |
 				  CLOCK_SOURCE_VALID_FOR_HRES |
-				  CLOCK_SOURCE_MUST_VERIFY,
+				  CLOCK_SOURCE_MUST_VERIFY |
+				  CLOCK_SOURCE_VERIFY_PERCPU,
 	.vdso_clock_mode	= VDSO_CLOCKMODE_TSC,
 	.enable			= tsc_cs_enable,
 	.resume			= tsc_resume,
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 86d143db6523..83a3ebff7456 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -131,7 +131,7 @@ struct clocksource {
 #define CLOCK_SOURCE_UNSTABLE		0x40
 #define CLOCK_SOURCE_SUSPEND_NONSTOP	0x80
 #define CLOCK_SOURCE_RESELECT		0x100
-
+#define CLOCK_SOURCE_VERIFY_PERCPU	0x200
 /* simplify initialization of mask field */
 #define CLOCKSOURCE_MASK(bits) GENMASK_ULL((bits) - 1, 0)
 
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 157530ae73ac..5ba978a5f45d 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -223,6 +223,60 @@ static bool cs_watchdog_read(struct clocksource *cs, u64 *csnow, u64 *wdnow)
 	return false;
 }
 
+static u64 csnow_mid;
+static cpumask_t cpus_ahead;
+static cpumask_t cpus_behind;
+
+static void clocksource_verify_one_cpu(void *csin)
+{
+	struct clocksource *cs = (struct clocksource *)csin;
+
+	csnow_mid = cs->read(cs);
+}
+
+static void clocksource_verify_percpu(struct clocksource *cs)
+{
+	int64_t cs_nsec, cs_nsec_max = 0, cs_nsec_min = LLONG_MAX;
+	u64 csnow_begin, csnow_end;
+	int cpu, testcpu;
+	s64 delta;
+
+	cpumask_clear(&cpus_ahead);
+	cpumask_clear(&cpus_behind);
+	preempt_disable();
+	testcpu = smp_processor_id();
+	pr_warn("Checking clocksource %s synchronization from CPU %d.\n", cs->name, testcpu);
+	for_each_online_cpu(cpu) {
+		if (cpu == testcpu)
+			continue;
+		csnow_begin = cs->read(cs);
+		smp_call_function_single(cpu, clocksource_verify_one_cpu, cs, 1);
+		csnow_end = cs->read(cs);
+		delta = (s64)((csnow_mid - csnow_begin) & cs->mask);
+		if (delta < 0)
+			cpumask_set_cpu(cpu, &cpus_behind);
+		delta = (csnow_end - csnow_mid) & cs->mask;
+		if (delta < 0)
+			cpumask_set_cpu(cpu, &cpus_ahead);
+		delta = clocksource_delta(csnow_end, csnow_begin, cs->mask);
+		cs_nsec = clocksource_cyc2ns(delta, cs->mult, cs->shift);
+		if (cs_nsec > cs_nsec_max)
+			cs_nsec_max = cs_nsec;
+		if (cs_nsec < cs_nsec_min)
+			cs_nsec_min = cs_nsec;
+	}
+	preempt_enable();
+	if (!cpumask_empty(&cpus_ahead))
+		pr_warn(" CPUs %*pbl ahead of CPU %d for clocksource %s.\n",
+			cpumask_pr_args(&cpus_ahead), testcpu, cs->name);
+	if (!cpumask_empty(&cpus_behind))
+		pr_warn(" CPUs %*pbl behind CPU %d for clocksource %s.\n",
+			cpumask_pr_args(&cpus_behind), testcpu, cs->name);
+	if (!cpumask_empty(&cpus_ahead) || !cpumask_empty(&cpus_behind))
+		pr_warn(" CPU %d check durations %lldns - %lldns for clocksource %s.\n",
+			testcpu, cs_nsec_min, cs_nsec_max, cs->name);
+}
+
 static void clocksource_watchdog(struct timer_list *unused)
 {
 	u64 csnow, wdnow, cslast, wdlast, delta;
@@ -447,6 +501,12 @@ static int __clocksource_watchdog_kthread(void)
 	unsigned long flags;
 	int select = 0;
 
+	/* Do any required per-CPU skew verification. */
+	if (curr_clocksource &&
+	    curr_clocksource->flags & CLOCK_SOURCE_UNSTABLE &&
+	    curr_clocksource->flags & CLOCK_SOURCE_VERIFY_PERCPU)
+		clocksource_verify_percpu(curr_clocksource);
+
 	spin_lock_irqsave(&watchdog_lock, flags);
 	list_for_each_entry_safe(cs, tmp, &watchdog_list, wd_list) {
 		if (cs->flags & CLOCK_SOURCE_UNSTABLE) {
-- 
2.31.1.189.g2e36527f23