From: "Paul E. McKenney"
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, john.stultz@linaro.org, sboyd@kernel.org,
	corbet@lwn.net, Mark.Rutland@arm.com, maz@kernel.org, kernel-team@fb.com,
	neeraju@codeaurora.org, ak@linux.intel.com, feng.tang@intel.com,
	zhengjun.xing@intel.com, "Paul E. McKenney", Xing Zhengjun
Subject: [PATCH v11 clocksource 5/6] clocksource: Limit number of CPUs checked for clock synchronization
Date: Wed, 28 Apr 2021 18:30:36 -0700
Message-Id: <20210429013037.3958717-5-paulmck@kernel.org>
X-Mailer: git-send-email 2.31.1.189.g2e36527f23
In-Reply-To: <20210429012909.GA3958584@paulmck-ThinkPad-P17-Gen-1>
References: <20210429012909.GA3958584@paulmck-ThinkPad-P17-Gen-1>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Currently, if skew is detected on a clock marked CLOCK_SOURCE_VERIFY_PERCPU,
that clock is checked on all CPUs. This is thorough, but might not be
what you want on a system with a few tens of CPUs, let alone a few hundred
of them.

Therefore, by default check only up to eight randomly chosen CPUs.
Also provide a new clocksource.verify_n_cpus kernel boot parameter.
A value of -1 says to check all of the CPUs, and a non-negative value says
to randomly select that number of CPUs, without concern about selecting
the same CPU multiple times. However, make use of a cpumask so that a
given CPU will be checked at most once.

Link: https://lore.kernel.org/lkml/20210425224540.GA1312438@paulmck-ThinkPad-P17-Gen-1/
Link: https://lore.kernel.org/lkml/20210420064934.GE31773@xsang-OptiPlex-9020/
Link: https://lore.kernel.org/lkml/20210106004013.GA11179@paulmck-ThinkPad-P72/
Link: https://lore.kernel.org/lkml/20210414043435.GA2812539@paulmck-ThinkPad-P17-Gen-1/
Link: https://lore.kernel.org/lkml/20210419045155.GA596058@paulmck-ThinkPad-P17-Gen-1/
Suggested-by: Thomas Gleixner # For verify_n_cpus=1.
Cc: John Stultz
Cc: Thomas Gleixner
Cc: Stephen Boyd
Cc: Jonathan Corbet
Cc: Mark Rutland
Cc: Marc Zyngier
Cc: Andi Kleen
Cc: Feng Tang
Cc: Xing Zhengjun
Signed-off-by: Paul E. McKenney
---
 .../admin-guide/kernel-parameters.txt         | 10 +++
 kernel/time/clocksource.c                     | 74 ++++++++++++++++++-
 2 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index babcc9e80fba..4058a74df9ab 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -626,6 +626,16 @@
 			unstable. Defaults to three retries, that is,
 			four attempts to read the clock under test.
 
+	clocksource.verify_n_cpus= [KNL]
+			Limit the number of CPUs checked for clocksources
+			marked with CLOCK_SOURCE_VERIFY_PERCPU that
+			are marked unstable due to excessive skew.
+			A negative value says to check all CPUs, while
+			zero says not to check any. Values larger than
+			nr_cpu_ids are silently truncated to nr_cpu_ids.
+			The actual CPUs are chosen randomly, with
+			no replacement if the same CPU is chosen twice.
+
 	clearcpuid=BITNUM[,BITNUM...] [X86]
 			Disable CPUID feature X for the kernel. See
 			arch/x86/include/asm/cpufeatures.h for the valid bit
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 584433448226..f71f375df544 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -15,6 +15,8 @@
 #include
 #include
 #include
+#include
+#include
 
 #include "tick-internal.h"
 #include "timekeeping_internal.h"
@@ -200,6 +202,8 @@ static int inject_delay_shift_percpu = -1;
 module_param(inject_delay_shift_percpu, int, 0644);
 static ulong max_read_retries = 3;
 module_param(max_read_retries, ulong, 0644);
+static int verify_n_cpus = 8;
+module_param(verify_n_cpus, int, 0644);
 
 static void clocksource_watchdog_inject_delay(void)
 {
@@ -250,6 +254,55 @@ static bool cs_watchdog_read(struct clocksource *cs, u64 *csnow, u64 *wdnow)
 static u64 csnow_mid;
 static cpumask_t cpus_ahead;
 static cpumask_t cpus_behind;
+static cpumask_t cpus_chosen;
+
+static void clocksource_verify_choose_cpus(void)
+{
+	int cpu, i, n = verify_n_cpus;
+
+	if (n < 0) {
+		/* Check all of the CPUs. */
+		cpumask_copy(&cpus_chosen, cpu_online_mask);
+		cpumask_clear_cpu(smp_processor_id(), &cpus_chosen);
+		return;
+	}
+
+	/* If no checking desired, or no other CPU to check, leave. */
+	cpumask_clear(&cpus_chosen);
+	if (n == 0 || num_online_cpus() <= 1)
+		return;
+
+	/* Make sure to select at least one CPU other than the current CPU. */
+	cpu = cpumask_next(-1, cpu_online_mask);
+	if (cpu == smp_processor_id())
+		cpu = cpumask_next(cpu, cpu_online_mask);
+	if (WARN_ON_ONCE(cpu >= nr_cpu_ids))
+		return;
+	cpumask_set_cpu(cpu, &cpus_chosen);
+
+	/* Force a sane value for the boot parameter. */
+	if (n > nr_cpu_ids)
+		n = nr_cpu_ids;
+
+	/*
+	 * Randomly select the specified number of CPUs. If the same
+	 * CPU is selected multiple times, that CPU is checked only once,
+	 * and no replacement CPU is selected. This gracefully handles
+	 * situations where verify_n_cpus is greater than the number of
+	 * CPUs that are currently online.
+	 */
+	for (i = 1; i < n; i++) {
+		cpu = prandom_u32() % nr_cpu_ids;
+		cpu = cpumask_next(cpu - 1, cpu_online_mask);
+		if (cpu >= nr_cpu_ids)
+			cpu = cpumask_next(-1, cpu_online_mask);
+		if (!WARN_ON_ONCE(cpu >= nr_cpu_ids))
+			cpumask_set_cpu(cpu, &cpus_chosen);
+	}
+
+	/* Don't verify ourselves. */
+	cpumask_clear_cpu(smp_processor_id(), &cpus_chosen);
+}
 
 static void clocksource_verify_one_cpu(void *csin)
 {
@@ -271,12 +324,22 @@ static void clocksource_verify_percpu(struct clocksource *cs)
 	int cpu, testcpu;
 	s64 delta;
 
+	if (verify_n_cpus == 0)
+		return;
 	cpumask_clear(&cpus_ahead);
 	cpumask_clear(&cpus_behind);
+	get_online_cpus();
 	preempt_disable();
+	clocksource_verify_choose_cpus();
+	if (cpumask_weight(&cpus_chosen) == 0) {
+		preempt_enable();
+		put_online_cpus();
+		pr_warn("Not enough CPUs to check clocksource '%s'.\n", cs->name);
+		return;
+	}
 	testcpu = smp_processor_id();
-	pr_warn("Checking clocksource %s synchronization from CPU %d.\n", cs->name, testcpu);
-	for_each_online_cpu(cpu) {
+	pr_warn("Checking clocksource %s synchronization from CPU %d to CPUs %*pbl.\n", cs->name, testcpu, cpumask_pr_args(&cpus_chosen));
+	for_each_cpu(cpu, &cpus_chosen) {
 		if (cpu == testcpu)
 			continue;
 		csnow_begin = cs->read(cs);
@@ -296,6 +359,7 @@ static void clocksource_verify_percpu(struct clocksource *cs)
 			cs_nsec_min = cs_nsec;
 	}
 	preempt_enable();
+	put_online_cpus();
 	if (!cpumask_empty(&cpus_ahead))
 		pr_warn(" CPUs %*pbl ahead of CPU %d for clocksource %s.\n",
 			cpumask_pr_args(&cpus_ahead), testcpu, cs->name);
@@ -366,6 +430,12 @@ static void clocksource_watchdog(struct timer_list *unused)
 				watchdog->name, wdnow, wdlast, watchdog->mask);
 			pr_warn(" '%s' cs_now: %llx cs_last: %llx mask: %llx\n",
 				cs->name, csnow, cslast, cs->mask);
+			if (curr_clocksource == cs)
+				pr_warn(" '%s' is current clocksource.\n", cs->name);
+			else if (curr_clocksource)
+				pr_warn(" '%s' (not '%s') is current clocksource.\n", curr_clocksource->name, cs->name);
+			else
+				pr_warn(" No current clocksource.\n");
 			__clocksource_unstable(cs);
 			continue;
 		}
-- 
2.31.1.189.g2e36527f23
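
For readers who want to try the selection policy outside the kernel, the
following is a rough userspace sketch of what clocksource_verify_choose_cpus()
does. It is not kernel code. NR_CPU_IDS, next_online(), count_set() and
choose_cpus() are hypothetical stand-ins invented for this illustration:
plain bool arrays replace cpumask_t, rand() replaces prandom_u32(), and the
calling CPU is passed in as "self" instead of coming from smp_processor_id().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_CPU_IDS 16	/* hypothetical stand-in for nr_cpu_ids */

/* First online CPU with index > prev, or NR_CPU_IDS if none (like cpumask_next()). */
static int next_online(int prev, const bool *online)
{
	for (int cpu = prev + 1; cpu < NR_CPU_IDS; cpu++)
		if (online[cpu])
			return cpu;
	return NR_CPU_IDS;
}

/* Number of set entries in a mask (like cpumask_weight()). */
static int count_set(const bool *mask)
{
	int n = 0;

	for (int cpu = 0; cpu < NR_CPU_IDS; cpu++)
		n += mask[cpu];
	return n;
}

/*
 * Fill chosen[] with the CPUs to check from CPU "self" and return how many
 * were chosen.  n < 0 selects every other online CPU, n == 0 selects none,
 * and n > 0 makes up to n picks, with duplicate picks collapsing into one.
 */
int choose_cpus(int n, int self, const bool *online, bool *chosen)
{
	int cpu, i;

	assert(self >= 0 && self < NR_CPU_IDS);
	for (i = 0; i < NR_CPU_IDS; i++)
		chosen[i] = false;

	if (n < 0) {
		/* Check all of the CPUs except ourselves. */
		for (i = 0; i < NR_CPU_IDS; i++)
			chosen[i] = online[i];
		chosen[self] = false;
		return count_set(chosen);
	}

	/* No checking desired, or no other CPU to check. */
	if (n == 0 || count_set(online) <= 1)
		return 0;

	/* Guarantee at least one CPU other than the current one. */
	cpu = next_online(-1, online);
	if (cpu == self)
		cpu = next_online(cpu, online);
	if (cpu >= NR_CPU_IDS)
		return 0;
	chosen[cpu] = true;

	/* Clamp oversized requests. */
	if (n > NR_CPU_IDS)
		n = NR_CPU_IDS;

	/* Remaining picks are random; a repeated pick gets no replacement. */
	for (i = 1; i < n; i++) {
		cpu = rand() % NR_CPU_IDS;
		cpu = next_online(cpu - 1, online);
		if (cpu >= NR_CPU_IDS)
			cpu = next_online(-1, online);	/* wrap around */
		if (cpu < NR_CPU_IDS)
			chosen[cpu] = true;
	}

	/* Don't verify ourselves. */
	chosen[self] = false;
	return count_set(chosen);
}
```

Because the first pick is deterministic and never the calling CPU, a positive
n always yields at least one CPU to check when another CPU is online, and at
most n of them. On a real system the corresponding knob is the
clocksource.verify_n_cpus boot parameter described in the patch above.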