2023-02-10 19:36:48

by Paul E. McKenney

Subject: [GIT PULL v2 clocksource] Clocksource watchdog commits for v6.3

Hello, Thomas,

The following changes since commit 1b929c02afd37871d5afb9d498426f83432e71c2:

Linux 6.2-rc1 (2022-12-25 13:41:39 -0800)

are available in the Git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git tags/clocksource.2023.02.06b

for you to fetch changes up to 0051293c533017e2a860e0a0a33517bc40240fff:

clocksource: Enable TSC watchdog checking of HPET and PMTMR only when requested (2023-02-06 16:38:30 -0800)

This adds commit 0051293c5330 ("clocksource: Enable TSC watchdog checking
of HPET and PMTMR only when requested") to the previous pull request as
discussed here:

https://lore.kernel.org/lkml/20230131012440.GA1251465@paulmck-ThinkPad-P17-Gen-1/

----------------------------------------------------------------
Clocksource watchdog commits for v6.3

This pull request contains the following:

o Improvements to clocksource-watchdog console messages.

o Loosening of the clocksource-watchdog skew criteria to match
those of NTP (500 parts per million, relaxed from 400 parts
per million). If it is good enough for NTP, it is good enough
for the clocksource watchdog.

o Suspend clocksource-watchdog checking temporarily when high
memory latencies are detected. This avoids the false-positive
clock-skew events that have been seen on production systems
running memory-intensive workloads.

o On systems where the TSC is deemed trustworthy, use it as the
watchdog timesource, but only when specifically requested using
the tsc=watchdog kernel boot parameter. This permits clock-skew
events to be detected, but avoids forcing workloads to use the
slow HPET and ACPI PM timers. On the one hand, these two timers are
slow enough to cause systems to be needlessly marked bad. On the
other hand, real skew does sometimes happen on production systems
running production workloads, and sometimes it is the fault of the
TSC, or at least of the firmware that told the kernel to program
the TSC with the wrong frequency.

o Add a tsc=recalibrate kernel boot parameter to allow the kernel
to diagnose cases where the TSC hardware works fine, but was told
by firmware to tick at the wrong frequency. Such cases are rare,
but they really have happened on production systems.
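
As a rough illustration of the 500-parts-per-million criterion above,
here is a minimal userspace sketch in C. It is not the kernel's code:
the check in kernel/time/clocksource.c is structured differently, and
the names MAX_SKEW_PPM and skew_exceeds_limit() are hypothetical,
chosen only for this example. It assumes both intervals are already
expressed in nanoseconds and are short (on the order of a second), so
the multiplications cannot overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold for this sketch: 500 parts per million. */
#define MAX_SKEW_PPM 500

/*
 * Return true if the interval measured by the clocksource under test
 * (cs_ns) differs from the interval measured by the watchdog (wd_ns)
 * by more than MAX_SKEW_PPM of the watchdog interval.
 *
 * Assumes short intervals: delta * 1000000 must fit in 64 bits.
 */
static bool skew_exceeds_limit(uint64_t wd_ns, uint64_t cs_ns)
{
	uint64_t delta = wd_ns > cs_ns ? wd_ns - cs_ns : cs_ns - wd_ns;

	assert(wd_ns > 0);

	/* delta / wd_ns > ppm / 1e6, rearranged to avoid division. */
	return delta * 1000000ULL > wd_ns * (uint64_t)MAX_SKEW_PPM;
}
```

For a half-second interval (500000000 ns), this sketch tolerates a
difference of up to 250 microseconds, and a 400-parts-per-million
limit would have tolerated 200 microseconds.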

----------------------------------------------------------------
Feng Tang (2):
      clocksource: Suspend the watchdog temporarily when high read latency detected
      x86/tsc: Add option to force frequency recalibration with HW timer

Paul E. McKenney (5):
      clocksource: Loosen clocksource watchdog constraints
      clocksource: Improve read-back-delay message
      clocksource: Improve "skew is too large" messages
      clocksource: Verify HPET and PMTMR when TSC unverified
      clocksource: Enable TSC watchdog checking of HPET and PMTMR only when requested

Yunying Sun (1):
      clocksource: Print clocksource name when clocksource is tested unstable

 Documentation/admin-guide/kernel-parameters.txt | 10 ++++
 arch/x86/include/asm/time.h                     |  1 +
 arch/x86/kernel/hpet.c                          |  2 +
 arch/x86/kernel/tsc.c                           | 55 +++++++++++++++++--
 drivers/clocksource/acpi_pm.c                   |  6 ++-
 kernel/time/Kconfig                             |  6 ++-
 kernel/time/clocksource.c                       | 72 +++++++++++++++++--------
 7 files changed, 123 insertions(+), 29 deletions(-)


Subject: [tip: timers/core] Merge tag 'clocksource.2023.02.06b' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into timers/core

The following commit has been merged into the timers/core branch of tip:

Commit-ID: ab407a1919d2676ddc5761ed459d4cc5c7be18ed
Gitweb: https://git.kernel.org/tip/ab407a1919d2676ddc5761ed459d4cc5c7be18ed
Author: Thomas Gleixner <[email protected]>
AuthorDate: Mon, 13 Feb 2023 19:28:48 +01:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Mon, 13 Feb 2023 19:28:48 +01:00

Merge tag 'clocksource.2023.02.06b' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into timers/core

Pull clocksource watchdog changes from Paul McKenney:

o Improvements to clocksource-watchdog console messages.

o Loosening of the clocksource-watchdog skew criteria to match
those of NTP (500 parts per million, relaxed from 400 parts
per million). If it is good enough for NTP, it is good enough
for the clocksource watchdog.

o Suspend clocksource-watchdog checking temporarily when high
memory latencies are detected. This avoids the false-positive
clock-skew events that have been seen on production systems
running memory-intensive workloads.

o On systems where the TSC is deemed trustworthy, use it as the
watchdog timesource, but only when specifically requested using
the tsc=watchdog kernel boot parameter. This permits clock-skew
events to be detected, but avoids forcing workloads to use the
slow HPET and ACPI PM timers. On the one hand, these two timers are
slow enough to cause systems to be needlessly marked bad. On the
other hand, real skew does sometimes happen on production systems
running production workloads, and sometimes it is the fault of the
TSC, or at least of the firmware that told the kernel to program
the TSC with the wrong frequency.

o Add a tsc=recalibrate kernel boot parameter to allow the kernel
to diagnose cases where the TSC hardware works fine, but was told
by firmware to tick at the wrong frequency. Such cases are rare,
but they really have happened on production systems.

Link: https://lore.kernel.org/r/20230210193640.GA3325193@paulmck-ThinkPad-P17-Gen-1
---