From: "Paul E.
 McKenney"
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, john.stultz@linaro.org, sboyd@kernel.org,
 corbet@lwn.net, Mark.Rutland@arm.com, maz@kernel.org, kernel-team@fb.com,
 neeraju@codeaurora.org, ak@linux.intel.com, feng.tang@intel.com,
 zhengjun.xing@intel.com, "Paul E. McKenney", Chris Mason
Subject: [PATCH v10 clocksource 3/7] clocksource: Check per-CPU clock synchronization when marked unstable
Date: Sun, 25 Apr 2021 15:47:04 -0700
Message-Id: <20210425224709.1312655-3-paulmck@kernel.org>
X-Mailer: git-send-email 2.31.1.189.g2e36527f23
In-Reply-To: <20210425224540.GA1312438@paulmck-ThinkPad-P17-Gen-1>
References: <20210425224540.GA1312438@paulmck-ThinkPad-P17-Gen-1>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Some sorts of per-CPU clock sources have a history of going out of
synchronization with each other. However, this problem has purportedly
been solved in the past ten years. Except that it is all too possible
that the problem has instead simply been made less likely, which might
mean that some of the occasional "Marking clocksource 'tsc' as unstable"
messages might be due to desynchronization. How would anyone know?

Therefore apply CPU-to-CPU synchronization checking to newly unstable
clocksources that are marked with the new CLOCK_SOURCE_VERIFY_PERCPU
flag. Lists of desynchronized CPUs are printed, with the caveat that if
it is the reporting CPU that is itself desynchronized, it will appear
that all the other clocks are wrong. Just like in real life.

Cc: John Stultz
Cc: Thomas Gleixner
Cc: Stephen Boyd
Cc: Jonathan Corbet
Cc: Mark Rutland
Cc: Marc Zyngier
Cc: Andi Kleen
Reported-by: Chris Mason
[ paulmck: Add "static" to clocksource_verify_one_cpu() per kernel test robot feedback. ]
[ paulmck: Apply Thomas Gleixner feedback. ]
Signed-off-by: Paul E. McKenney
---
 arch/x86/kernel/tsc.c       |  3 +-
 include/linux/clocksource.h |  2 +-
 kernel/time/clocksource.c   | 60 +++++++++++++++++++++++++++++++++++++
 3 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index f70dffc2771f..56289170753c 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1151,7 +1151,8 @@ static struct clocksource clocksource_tsc = {
 	.mask = CLOCKSOURCE_MASK(64),
 	.flags = CLOCK_SOURCE_IS_CONTINUOUS |
 		 CLOCK_SOURCE_VALID_FOR_HRES |
-		 CLOCK_SOURCE_MUST_VERIFY,
+		 CLOCK_SOURCE_MUST_VERIFY |
+		 CLOCK_SOURCE_VERIFY_PERCPU,
 	.vdso_clock_mode = VDSO_CLOCKMODE_TSC,
 	.enable = tsc_cs_enable,
 	.resume = tsc_resume,
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 86d143db6523..83a3ebff7456 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -131,7 +131,7 @@ struct clocksource {
 #define CLOCK_SOURCE_UNSTABLE 0x40
 #define CLOCK_SOURCE_SUSPEND_NONSTOP 0x80
 #define CLOCK_SOURCE_RESELECT 0x100
-
+#define CLOCK_SOURCE_VERIFY_PERCPU 0x200
 /* simplify initialization of mask field */
 #define CLOCKSOURCE_MASK(bits) GENMASK_ULL((bits) - 1, 0)
 
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 94bfdb53f2f4..a8d73e1f9431 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -245,6 +245,60 @@ static bool cs_watchdog_read(struct clocksource *cs, u64 *csnow, u64 *wdnow)
 	return false;
 }
 
+static u64 csnow_mid;
+static cpumask_t cpus_ahead;
+static cpumask_t cpus_behind;
+
+static void clocksource_verify_one_cpu(void *csin)
+{
+	struct clocksource *cs = (struct clocksource *)csin;
+
+	csnow_mid = cs->read(cs);
+}
+
+static void clocksource_verify_percpu(struct clocksource *cs)
+{
+	int64_t cs_nsec, cs_nsec_max = 0, cs_nsec_min = LLONG_MAX;
+	u64 csnow_begin, csnow_end;
+	int cpu, testcpu;
+	s64 delta;
+
+	cpumask_clear(&cpus_ahead);
+	cpumask_clear(&cpus_behind);
+	preempt_disable();
+	testcpu = smp_processor_id();
+	pr_warn("Checking clocksource %s synchronization from CPU %d.\n", cs->name, testcpu);
+	for_each_online_cpu(cpu) {
+		if (cpu == testcpu)
+			continue;
+		csnow_begin = cs->read(cs);
+		smp_call_function_single(cpu, clocksource_verify_one_cpu, cs, 1);
+		csnow_end = cs->read(cs);
+		delta = (s64)((csnow_mid - csnow_begin) & cs->mask);
+		if (delta < 0)
+			cpumask_set_cpu(cpu, &cpus_behind);
+		delta = (csnow_end - csnow_mid) & cs->mask;
+		if (delta < 0)
+			cpumask_set_cpu(cpu, &cpus_ahead);
+		delta = clocksource_delta(csnow_end, csnow_begin, cs->mask);
+		cs_nsec = clocksource_cyc2ns(delta, cs->mult, cs->shift);
+		if (cs_nsec > cs_nsec_max)
+			cs_nsec_max = cs_nsec;
+		if (cs_nsec < cs_nsec_min)
+			cs_nsec_min = cs_nsec;
+	}
+	preempt_enable();
+	if (!cpumask_empty(&cpus_ahead))
+		pr_warn(" CPUs %*pbl ahead of CPU %d for clocksource %s.\n",
+			cpumask_pr_args(&cpus_ahead), testcpu, cs->name);
+	if (!cpumask_empty(&cpus_behind))
+		pr_warn(" CPUs %*pbl behind CPU %d for clocksource %s.\n",
+			cpumask_pr_args(&cpus_behind), testcpu, cs->name);
+	if (!cpumask_empty(&cpus_ahead) || !cpumask_empty(&cpus_behind))
+		pr_warn(" CPU %d check durations %lldns - %lldns for clocksource %s.\n",
+			testcpu, cs_nsec_min, cs_nsec_max, cs->name);
+}
+
 static void clocksource_watchdog(struct timer_list *unused)
 {
 	u64 csnow, wdnow, cslast, wdlast, delta;
@@ -469,6 +523,12 @@ static int __clocksource_watchdog_kthread(void)
 	unsigned long flags;
 	int select = 0;
 
+	/* Do any required per-CPU skew verification. */
+	if (curr_clocksource &&
+	    curr_clocksource->flags & CLOCK_SOURCE_UNSTABLE &&
+	    curr_clocksource->flags & CLOCK_SOURCE_VERIFY_PERCPU)
+		clocksource_verify_percpu(curr_clocksource);
+
 	spin_lock_irqsave(&watchdog_lock, flags);
 	list_for_each_entry_safe(cs, tmp, &watchdog_list, wd_list) {
 		if (cs->flags & CLOCK_SOURCE_UNSTABLE) {
-- 
2.31.1.189.g2e36527f23
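
The ahead/behind classification described in the commit message can be tried out in userspace. The following is only a rough sketch, not part of the patch: classify_skew(), the skew enum, and classify_skew_selftest() are hypothetical names, and the sketch assumes a full 64-bit counter mask such as the CLOCKSOURCE_MASK(64) that the TSC uses. With a narrower mask the masked difference can never be negative, so this sketch does not cover that case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result flags for this sketch; not part of the patch. */
enum skew { SKEW_NONE = 0, SKEW_BEHIND = 1, SKEW_AHEAD = 2 };

/*
 * Classify one remote reading (mid) taken between two local readings
 * (begin, end), mirroring the signed-delta tests in the patch.
 * Assumption: the counter uses all 64 bits, as the TSC does.
 * Note: converting an out-of-range uint64_t to int64_t relies on the
 * two's-complement behavior of mainstream compilers.
 */
static int classify_skew(uint64_t begin, uint64_t mid, uint64_t end)
{
	const uint64_t mask = UINT64_MAX;	/* stands in for CLOCKSOURCE_MASK(64) */
	int result = SKEW_NONE;
	int64_t delta;

	/* Remote value precedes the local "begin": remote CPU is behind. */
	delta = (int64_t)((mid - begin) & mask);
	if (delta < 0)
		result |= SKEW_BEHIND;

	/* Remote value exceeds the local "end": remote CPU is ahead. */
	delta = (int64_t)((end - mid) & mask);
	if (delta < 0)
		result |= SKEW_AHEAD;

	return result;
}

/* Exercise the in-sync, behind, ahead, and counter-wraparound cases. */
static void classify_skew_selftest(void)
{
	assert(classify_skew(100, 150, 200) == SKEW_NONE);
	assert(classify_skew(100, 90, 200) == SKEW_BEHIND);
	assert(classify_skew(100, 250, 200) == SKEW_AHEAD);
	/* Counter wraps between begin and mid, yet ordering is preserved. */
	assert(classify_skew(UINT64_MAX - 5, 2, 10) == SKEW_NONE);
}
```

As the commit message notes, a result of this kind is relative to the CPU doing the checking: if the checking CPU is the one that has drifted, every other CPU will be classified as ahead or behind.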