2021-05-01 00:33:19

by Paul E. McKenney

Subject: [PATCH v12 clocksource 0/5] Do not mark clocks unstable due to delays for v5.13

Hello!

If there is a sufficient delay between reading the watchdog clock and the
clock under test, the clock under test will be marked unstable through no
fault of its own. This series checks for this, doing limited retries
to get a good set of clock reads. If the clock is marked unstable
and is also marked as per-CPU, cross-CPU synchronization is checked.
This series also provides a clocksource-watchdog-test kernel module that
tests this new ability of distinguishing delay-induced clock skew from
true clock skew.

Note that "sufficient delay" can be provided by SMIs, NMIs, and of course
vCPU preemption.

The patches in this series are as follows:

1. Retry clock read if long delays detected.

2. Check per-CPU clock synchronization when marked unstable.

3. Limit number of CPUs checked for clock synchronization.

4. Reduce clocksource-skew threshold for TSC.

5. Provide kernel module to test clocksource watchdog.

Changes since v11, based on feedback from Thomas Gleixner:

o Remove the fault-injection code from clocksource.c.

o Create a kernel/time/clocksource-wdtest.c kernel module that
creates its own clocksource structures and injects delays
as part of their ->read() functions.

o Make this kernel module splat upon error, for example, when
a clocksource is not marked unstable but should have been.

o Apply a couple more "Link:" fields to all patches.

Changes since v9:

o Forgive tsc_early drift, based on feedback from Feng Tang; Xing,
Zhengjun; and Thomas Gleixner.

o Improve CPU selection for clock-synchronization checking.

Link: https://lore.kernel.org/lkml/20210419045155.GA596058@paulmck-ThinkPad-P17-Gen-1/

Changes since v8, based on Thomas Gleixner feedback:

o Reduced clock-skew threshold to 200us and delay limit to 50us.

o Split out a cs_watchdog_read() function.

o Removed the pointless CLOCK_SOURCE_VERIFY_PERCPU from kvm_clock.

o Initialized cs_nsec_max and cs_nsec_min to avoid first-time checks.

Link: https://lore.kernel.org/lkml/20210414043435.GA2812539@paulmck-ThinkPad-P17-Gen-1/

Changes since v7, based on Thomas Gleixner feedback:

o Fix embarrassing git-format-patch operator error.

o Merge pairwise clock-desynchronization checking into the checking
of per-CPU clock synchronization when marked unstable.

o Do selective per-CPU checking rather than blindly checking all
CPUs. Provide a clocksource.verify_n_cpus kernel boot parameter
to control this behavior, with the value -1 choosing the old
check-all-CPUs behavior. The default is to randomly check 8 CPUs.

o Fix the clock-desynchronization checking to avoid a potential
use-after-free error for dynamically allocated clocksource
structures.

o Remove redundant "wdagain_nsec < 0" check from clocksource_watchdog()
clocksource skew checking.

o Update commit logs and do code-style updates.

Link: https://lore.kernel.org/lkml/20210106004013.GA11179@paulmck-ThinkPad-P72/

Changes since v5:

o Rebased to v5.12-rc5.

Changes since v4:

o Rebased to v5.12-rc1.

Changes since v3:

o Rebased to v5.11.

o Apply Randy Dunlap feedback.

Changes since v2:

o Rebased to v5.11-rc6.

o Updated Cc: list.

Changes since v1:

o Applied feedback from Rik van Riel.

o Rebased to v5.11-rc3.

o Stripped "RFC" from the subject lines.

Thanx, Paul

------------------------------------------------------------------------

Documentation/admin-guide/kernel-parameters.txt | 16 +
arch/x86/kernel/tsc.c | 1
b/Documentation/admin-guide/kernel-parameters.txt | 6
b/arch/x86/kernel/tsc.c | 3
b/include/linux/clocksource.h | 2
b/kernel/time/Makefile | 1
b/kernel/time/clocksource-wdtest.c | 202 ++++++++++++++++++++++
b/kernel/time/clocksource.c | 52 +++++
b/kernel/time/jiffies.c | 15 -
b/lib/Kconfig.debug | 12 +
include/linux/clocksource.h | 6
kernel/time/clocksource.c | 193 +++++++++++++++++++--
12 files changed, 477 insertions(+), 32 deletions(-)


2021-05-01 00:34:01

by Paul E. McKenney

Subject: [PATCH v12 clocksource 4/5] clocksource: Reduce clocksource-skew threshold for TSC

Currently, WATCHDOG_THRESHOLD is set to detect a 62.5-millisecond skew in
a 500-millisecond WATCHDOG_INTERVAL. This requires that clocks be skewed
by more than 12.5% in order to be marked unstable. Except that a clock
that is skewed by that much is probably destroying unsuspecting software
right and left. And given that there are now checks for false-positive
skews due to delays between reading the two clocks, it should be possible
to greatly decrease WATCHDOG_THRESHOLD, at least for fine-grained clocks
such as TSC.

Therefore, add a new uncertainty_margin field to the clocksource
structure that contains the maximum uncertainty in nanoseconds for
the corresponding clock. This field may be initialized manually, as
it is for clocksource_tsc_early and clocksource_jiffies, the latter
of which is copied to refined_jiffies. If the field is not initialized
manually, it will be computed at clock-registration time as the period
of the clock in question, based on the scale and freq parameters to
the __clocksource_update_freq_scale() function. If either of those
two parameters is zero, the tens-of-milliseconds WATCHDOG_THRESHOLD
is used as a cowardly alternative to dividing by zero. No matter how
the uncertainty_margin field is calculated, it is bounded below by
twice WATCHDOG_MAX_SKEW, that is, by 100 microseconds.

Note that manually initialized uncertainty_margin fields are not adjusted,
but there is a WARN_ON_ONCE() that triggers if any such field is less than
twice WATCHDOG_MAX_SKEW. This WARN_ON_ONCE() is intended to discourage
production use of the one-nanosecond uncertainty_margin values that are
used to test the clock-skew code itself.

The actual clock-skew check uses the sum of the uncertainty_margin fields
of the two clocksource structures being compared. Integer overflow is
avoided because the largest computed value of the uncertainty_margin
fields is one billion (10^9), and double that value fits into an
unsigned int. However, if someone manually specifies (say) UINT_MAX,
they will get what they deserve.

Note that the refined_jiffies uncertainty_margin field is initialized to
TICK_NSEC, which means that skew checks involving this clocksource will
be sufficiently forgiving. In a similar vein, the clocksource_tsc_early
uncertainty_margin field is initialized to 32*NSEC_PER_MSEC, which
replicates the current behavior and allows custom setting if needed
in order to address the rare skews detected for this clocksource in
current mainline.

Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/lkml/20210429140440.GT975577@paulmck-ThinkPad-P17-Gen-1
Link: https://lore.kernel.org/lkml/20210425224540.GA1312438@paulmck-ThinkPad-P17-Gen-1/
Link: https://lore.kernel.org/lkml/20210420064934.GE31773@xsang-OptiPlex-9020/
Link: https://lore.kernel.org/lkml/20210106004013.GA11179@paulmck-ThinkPad-P72/
Link: https://lore.kernel.org/lkml/20210414043435.GA2812539@paulmck-ThinkPad-P17-Gen-1/
Link: https://lore.kernel.org/lkml/20210419045155.GA596058@paulmck-ThinkPad-P17-Gen-1/
Suggested-by: Thomas Gleixner <[email protected]>
Cc: John Stultz <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Cc: Feng Tang <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
arch/x86/kernel/tsc.c | 1 +
include/linux/clocksource.h | 3 +++
kernel/time/clocksource.c | 48 +++++++++++++++++++++++++++++--------
kernel/time/jiffies.c | 15 ++++++------
4 files changed, 50 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 56289170753c..6e11c9619437 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1127,6 +1127,7 @@ static int tsc_cs_enable(struct clocksource *cs)
static struct clocksource clocksource_tsc_early = {
.name = "tsc-early",
.rating = 299,
+ .uncertainty_margin = 32 * NSEC_PER_MSEC,
.read = read_tsc,
.mask = CLOCKSOURCE_MASK(64),
.flags = CLOCK_SOURCE_IS_CONTINUOUS |
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 83a3ebff7456..8f87c1a6f323 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -42,6 +42,8 @@ struct module;
* @shift: Cycle to nanosecond divisor (power of two)
* @max_idle_ns: Maximum idle time permitted by the clocksource (nsecs)
* @maxadj: Maximum adjustment value to mult (~11%)
+ * @uncertainty_margin: Maximum uncertainty in nanoseconds per half second.
+ * Zero says to use default WATCHDOG_THRESHOLD.
* @archdata: Optional arch-specific data
* @max_cycles: Maximum safe cycle value which won't overflow on
* multiplication
@@ -93,6 +95,7 @@ struct clocksource {
u32 shift;
u64 max_idle_ns;
u32 maxadj;
+ u32 uncertainty_margin;
#ifdef CONFIG_ARCH_CLOCKSOURCE_DATA
struct arch_clocksource_data archdata;
#endif
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 8a7d704aa46b..a6de23800b65 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -95,6 +95,20 @@ static char override_name[CS_NAME_LEN];
static int finished_booting;
static u64 suspend_start;

+/*
+ * Threshold: 0.0312s, when doubled: 0.0625s.
+ * Also a default for cs->uncertainty_margin when registering clocks.
+ */
+#define WATCHDOG_THRESHOLD (NSEC_PER_SEC >> 5)
+
+/*
+ * Maximum permissible delay between two readouts of the watchdog
+ * clocksource surrounding a read of the clocksource being validated.
+ * This delay could be due to SMIs, NMIs, or to VCPU preemptions. Used as
+ * a lower bound for cs->uncertainty_margin values when registering clocks.
+ */
+#define WATCHDOG_MAX_SKEW (50 * NSEC_PER_USEC)
+
#ifdef CONFIG_CLOCKSOURCE_WATCHDOG
static void clocksource_watchdog_work(struct work_struct *work);
static void clocksource_select(void);
@@ -121,17 +135,9 @@ static int clocksource_watchdog_kthread(void *data);
static void __clocksource_change_rating(struct clocksource *cs, int rating);

/*
- * Interval: 0.5sec Threshold: 0.0625s
+ * Interval: 0.5sec.
*/
#define WATCHDOG_INTERVAL (HZ >> 1)
-#define WATCHDOG_THRESHOLD (NSEC_PER_SEC >> 4)
-
-/*
- * Maximum permissible delay between two readouts of the watchdog
- * clocksource surrounding a read of the clocksource being validated.
- * This delay could be due to SMIs, NMIs, or to VCPU preemptions.
- */
-#define WATCHDOG_MAX_SKEW (100 * NSEC_PER_USEC)

static void clocksource_watchdog_work(struct work_struct *work)
{
@@ -347,6 +353,7 @@ static void clocksource_watchdog(struct timer_list *unused)
int next_cpu, reset_pending;
int64_t wd_nsec, cs_nsec;
struct clocksource *cs;
+ u32 md;

spin_lock(&watchdog_lock);
if (!watchdog_running)
@@ -393,7 +400,8 @@ static void clocksource_watchdog(struct timer_list *unused)
continue;

/* Check the deviation from the watchdog clocksource. */
- if (abs(cs_nsec - wd_nsec) > WATCHDOG_THRESHOLD) {
+ md = cs->uncertainty_margin + watchdog->uncertainty_margin;
+ if (abs(cs_nsec - wd_nsec) > md) {
pr_warn("timekeeping watchdog on CPU%d: Marking clocksource '%s' as unstable because the skew is too large:\n",
smp_processor_id(), cs->name);
pr_warn(" '%s' wd_now: %llx wd_last: %llx mask: %llx\n",
@@ -1046,6 +1054,26 @@ void __clocksource_update_freq_scale(struct clocksource *cs, u32 scale, u32 freq
clocks_calc_mult_shift(&cs->mult, &cs->shift, freq,
NSEC_PER_SEC / scale, sec * scale);
}
+
+ /*
+ * If the uncertainty margin is not specified, calculate it.
+ * If both scale and freq are non-zero, calculate the clock
+ * period, but bound below at 2*WATCHDOG_MAX_SKEW. However,
+ * if either of scale or freq is zero, be very conservative and
+ * take the tens-of-milliseconds WATCHDOG_THRESHOLD value for the
+ * uncertainty margin. Allow stupidly small uncertainty margins
+ * to be specified by the caller for testing purposes, but warn
+ * to discourage production use of this capability.
+ */
+ if (scale && freq && !cs->uncertainty_margin) {
+ cs->uncertainty_margin = NSEC_PER_SEC / (scale * freq);
+ if (cs->uncertainty_margin < 2 * WATCHDOG_MAX_SKEW)
+ cs->uncertainty_margin = 2 * WATCHDOG_MAX_SKEW;
+ } else if (!cs->uncertainty_margin) {
+ cs->uncertainty_margin = WATCHDOG_THRESHOLD;
+ }
+ WARN_ON_ONCE(cs->uncertainty_margin < 2 * WATCHDOG_MAX_SKEW);
+
/*
* Ensure clocksources that have large 'mult' values don't overflow
* when adjusted.
diff --git a/kernel/time/jiffies.c b/kernel/time/jiffies.c
index a5cffe2a1770..165b85bcdf29 100644
--- a/kernel/time/jiffies.c
+++ b/kernel/time/jiffies.c
@@ -49,13 +49,14 @@ static u64 jiffies_read(struct clocksource *cs)
* for "tick-less" systems.
*/
static struct clocksource clocksource_jiffies = {
- .name = "jiffies",
- .rating = 1, /* lowest valid rating*/
- .read = jiffies_read,
- .mask = CLOCKSOURCE_MASK(32),
- .mult = TICK_NSEC << JIFFIES_SHIFT, /* details above */
- .shift = JIFFIES_SHIFT,
- .max_cycles = 10,
+ .name = "jiffies",
+ .rating = 1, /* lowest valid rating*/
+ .uncertainty_margin = TICK_NSEC,
+ .read = jiffies_read,
+ .mask = CLOCKSOURCE_MASK(32),
+ .mult = TICK_NSEC << JIFFIES_SHIFT, /* details above */
+ .shift = JIFFIES_SHIFT,
+ .max_cycles = 10,
};

__cacheline_aligned_in_smp DEFINE_RAW_SPINLOCK(jiffies_lock);
--
2.31.1.189.g2e36527f23

2021-05-01 00:34:19

by Paul E. McKenney

Subject: [PATCH v12 clocksource 2/5] clocksource: Check per-CPU clock synchronization when marked unstable

Some sorts of per-CPU clock sources have a history of going out of
synchronization with each other. However, this problem has purportedly
been solved in the past ten years. Except that it is all too possible
that the problem has instead simply been made less likely, which might
mean that some of the occasional "Marking clocksource 'tsc' as unstable"
messages might be due to desynchronization. How would anyone know?

Therefore, apply CPU-to-CPU synchronization checking to newly unstable
clocksources that are marked with the new CLOCK_SOURCE_VERIFY_PERCPU flag.
Lists of desynchronized CPUs are printed, with the caveat that if it
is the reporting CPU that is itself desynchronized, it will appear that
all the other clocks are wrong. Just like in real life.

Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/lkml/20210429140440.GT975577@paulmck-ThinkPad-P17-Gen-1
Link: https://lore.kernel.org/lkml/20210425224540.GA1312438@paulmck-ThinkPad-P17-Gen-1/
Link: https://lore.kernel.org/lkml/20210420064934.GE31773@xsang-OptiPlex-9020/
Link: https://lore.kernel.org/lkml/20210106004013.GA11179@paulmck-ThinkPad-P72/
Link: https://lore.kernel.org/lkml/20210414043435.GA2812539@paulmck-ThinkPad-P17-Gen-1/
Link: https://lore.kernel.org/lkml/20210419045155.GA596058@paulmck-ThinkPad-P17-Gen-1/
Cc: John Stultz <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Feng Tang <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Reported-by: Chris Mason <[email protected]>
[ paulmck: Add "static" to clocksource_verify_one_cpu() per kernel test robot feedback. ]
[ paulmck: Apply Thomas Gleixner feedback. ]
Signed-off-by: Paul E. McKenney <[email protected]>
---
arch/x86/kernel/tsc.c | 3 +-
include/linux/clocksource.h | 2 +-
kernel/time/clocksource.c | 60 +++++++++++++++++++++++++++++++++++++
3 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index f70dffc2771f..56289170753c 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1151,7 +1151,8 @@ static struct clocksource clocksource_tsc = {
.mask = CLOCKSOURCE_MASK(64),
.flags = CLOCK_SOURCE_IS_CONTINUOUS |
CLOCK_SOURCE_VALID_FOR_HRES |
- CLOCK_SOURCE_MUST_VERIFY,
+ CLOCK_SOURCE_MUST_VERIFY |
+ CLOCK_SOURCE_VERIFY_PERCPU,
.vdso_clock_mode = VDSO_CLOCKMODE_TSC,
.enable = tsc_cs_enable,
.resume = tsc_resume,
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 86d143db6523..83a3ebff7456 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -131,7 +131,7 @@ struct clocksource {
#define CLOCK_SOURCE_UNSTABLE 0x40
#define CLOCK_SOURCE_SUSPEND_NONSTOP 0x80
#define CLOCK_SOURCE_RESELECT 0x100
-
+#define CLOCK_SOURCE_VERIFY_PERCPU 0x200
/* simplify initialization of mask field */
#define CLOCKSOURCE_MASK(bits) GENMASK_ULL((bits) - 1, 0)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 157530ae73ac..5ba978a5f45d 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -223,6 +223,60 @@ static bool cs_watchdog_read(struct clocksource *cs, u64 *csnow, u64 *wdnow)
return false;
}

+static u64 csnow_mid;
+static cpumask_t cpus_ahead;
+static cpumask_t cpus_behind;
+
+static void clocksource_verify_one_cpu(void *csin)
+{
+ struct clocksource *cs = (struct clocksource *)csin;
+
+ csnow_mid = cs->read(cs);
+}
+
+static void clocksource_verify_percpu(struct clocksource *cs)
+{
+ int64_t cs_nsec, cs_nsec_max = 0, cs_nsec_min = LLONG_MAX;
+ u64 csnow_begin, csnow_end;
+ int cpu, testcpu;
+ s64 delta;
+
+ cpumask_clear(&cpus_ahead);
+ cpumask_clear(&cpus_behind);
+ preempt_disable();
+ testcpu = smp_processor_id();
+ pr_warn("Checking clocksource %s synchronization from CPU %d.\n", cs->name, testcpu);
+ for_each_online_cpu(cpu) {
+ if (cpu == testcpu)
+ continue;
+ csnow_begin = cs->read(cs);
+ smp_call_function_single(cpu, clocksource_verify_one_cpu, cs, 1);
+ csnow_end = cs->read(cs);
+ delta = (s64)((csnow_mid - csnow_begin) & cs->mask);
+ if (delta < 0)
+ cpumask_set_cpu(cpu, &cpus_behind);
+ delta = (csnow_end - csnow_mid) & cs->mask;
+ if (delta < 0)
+ cpumask_set_cpu(cpu, &cpus_ahead);
+ delta = clocksource_delta(csnow_end, csnow_begin, cs->mask);
+ cs_nsec = clocksource_cyc2ns(delta, cs->mult, cs->shift);
+ if (cs_nsec > cs_nsec_max)
+ cs_nsec_max = cs_nsec;
+ if (cs_nsec < cs_nsec_min)
+ cs_nsec_min = cs_nsec;
+ }
+ preempt_enable();
+ if (!cpumask_empty(&cpus_ahead))
+ pr_warn(" CPUs %*pbl ahead of CPU %d for clocksource %s.\n",
+ cpumask_pr_args(&cpus_ahead), testcpu, cs->name);
+ if (!cpumask_empty(&cpus_behind))
+ pr_warn(" CPUs %*pbl behind CPU %d for clocksource %s.\n",
+ cpumask_pr_args(&cpus_behind), testcpu, cs->name);
+ if (!cpumask_empty(&cpus_ahead) || !cpumask_empty(&cpus_behind))
+ pr_warn(" CPU %d check durations %lldns - %lldns for clocksource %s.\n",
+ testcpu, cs_nsec_min, cs_nsec_max, cs->name);
+}
+
static void clocksource_watchdog(struct timer_list *unused)
{
u64 csnow, wdnow, cslast, wdlast, delta;
@@ -447,6 +501,12 @@ static int __clocksource_watchdog_kthread(void)
unsigned long flags;
int select = 0;

+ /* Do any required per-CPU skew verification. */
+ if (curr_clocksource &&
+ curr_clocksource->flags & CLOCK_SOURCE_UNSTABLE &&
+ curr_clocksource->flags & CLOCK_SOURCE_VERIFY_PERCPU)
+ clocksource_verify_percpu(curr_clocksource);
+
spin_lock_irqsave(&watchdog_lock, flags);
list_for_each_entry_safe(cs, tmp, &watchdog_list, wd_list) {
if (cs->flags & CLOCK_SOURCE_UNSTABLE) {
--
2.31.1.189.g2e36527f23

2021-05-01 00:36:00

by Paul E. McKenney

Subject: [PATCH v12 clocksource 3/5] clocksource: Limit number of CPUs checked for clock synchronization

Currently, if skew is detected on a clock marked CLOCK_SOURCE_VERIFY_PERCPU,
that clock is checked on all CPUs. This is thorough, but might not be
what you want on a system with a few tens of CPUs, let alone a few hundred
of them.

Therefore, by default check only up to eight randomly chosen CPUs.
Also provide a new clocksource.verify_n_cpus kernel boot parameter.
A value of -1 says to check all of the CPUs, and a non-negative value says
to randomly select that number of CPUs, without concern about selecting
the same CPU multiple times. However, make use of a cpumask so that a
given CPU will be checked at most once.

Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/lkml/20210429140440.GT975577@paulmck-ThinkPad-P17-Gen-1
Link: https://lore.kernel.org/lkml/20210425224540.GA1312438@paulmck-ThinkPad-P17-Gen-1/
Link: https://lore.kernel.org/lkml/20210420064934.GE31773@xsang-OptiPlex-9020/
Link: https://lore.kernel.org/lkml/20210106004013.GA11179@paulmck-ThinkPad-P72/
Link: https://lore.kernel.org/lkml/20210414043435.GA2812539@paulmck-ThinkPad-P17-Gen-1/
Link: https://lore.kernel.org/lkml/20210419045155.GA596058@paulmck-ThinkPad-P17-Gen-1/
Suggested-by: Thomas Gleixner <[email protected]> # For verify_n_cpus=1.
Cc: John Stultz <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Feng Tang <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
.../admin-guide/kernel-parameters.txt | 10 +++
kernel/time/clocksource.c | 74 ++++++++++++++++++-
2 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 4ab93f2612a2..485023c99244 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -589,6 +589,16 @@
unstable. Defaults to three retries, that is,
four attempts to read the clock under test.

+ clocksource.verify_n_cpus= [KNL]
+ Limit the number of CPUs checked for clocksources
+ marked with CLOCK_SOURCE_VERIFY_PERCPU that
+ are marked unstable due to excessive skew.
+ A negative value says to check all CPUs, while
+ zero says not to check any. Values larger than
+ nr_cpu_ids are silently truncated to nr_cpu_ids.
+ The actual CPUs are chosen randomly, with
+ no replacement if the same CPU is chosen twice.
+
clearcpuid=BITNUM[,BITNUM...] [X86]
Disable CPUID feature X for the kernel. See
arch/x86/include/asm/cpufeatures.h for the valid bit
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 5ba978a5f45d..8a7d704aa46b 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -14,6 +14,8 @@
#include <linux/sched.h> /* for spin_unlock_irq() using preempt_count() m68k */
#include <linux/tick.h>
#include <linux/kthread.h>
+#include <linux/prandom.h>
+#include <linux/cpu.h>

#include "tick-internal.h"
#include "timekeeping_internal.h"
@@ -193,6 +195,8 @@ void clocksource_mark_unstable(struct clocksource *cs)

static ulong max_read_retries = 3;
module_param(max_read_retries, ulong, 0644);
+static int verify_n_cpus = 8;
+module_param(verify_n_cpus, int, 0644);

static bool cs_watchdog_read(struct clocksource *cs, u64 *csnow, u64 *wdnow)
{
@@ -226,6 +230,55 @@ static bool cs_watchdog_read(struct clocksource *cs, u64 *csnow, u64 *wdnow)
static u64 csnow_mid;
static cpumask_t cpus_ahead;
static cpumask_t cpus_behind;
+static cpumask_t cpus_chosen;
+
+static void clocksource_verify_choose_cpus(void)
+{
+ int cpu, i, n = verify_n_cpus;
+
+ if (n < 0) {
+ /* Check all of the CPUs. */
+ cpumask_copy(&cpus_chosen, cpu_online_mask);
+ cpumask_clear_cpu(smp_processor_id(), &cpus_chosen);
+ return;
+ }
+
+ /* If no checking desired, or no other CPU to check, leave. */
+ cpumask_clear(&cpus_chosen);
+ if (n == 0 || num_online_cpus() <= 1)
+ return;
+
+ /* Make sure to select at least one CPU other than the current CPU. */
+ cpu = cpumask_next(-1, cpu_online_mask);
+ if (cpu == smp_processor_id())
+ cpu = cpumask_next(cpu, cpu_online_mask);
+ if (WARN_ON_ONCE(cpu >= nr_cpu_ids))
+ return;
+ cpumask_set_cpu(cpu, &cpus_chosen);
+
+ /* Force a sane value for the boot parameter. */
+ if (n > nr_cpu_ids)
+ n = nr_cpu_ids;
+
+ /*
+ * Randomly select the specified number of CPUs. If the same
+ * CPU is selected multiple times, that CPU is checked only once,
+ * and no replacement CPU is selected. This gracefully handles
+ * situations where verify_n_cpus is greater than the number of
+ * CPUs that are currently online.
+ */
+ for (i = 1; i < n; i++) {
+ cpu = prandom_u32() % nr_cpu_ids;
+ cpu = cpumask_next(cpu - 1, cpu_online_mask);
+ if (cpu >= nr_cpu_ids)
+ cpu = cpumask_next(-1, cpu_online_mask);
+ if (!WARN_ON_ONCE(cpu >= nr_cpu_ids))
+ cpumask_set_cpu(cpu, &cpus_chosen);
+ }
+
+ /* Don't verify ourselves. */
+ cpumask_clear_cpu(smp_processor_id(), &cpus_chosen);
+}

static void clocksource_verify_one_cpu(void *csin)
{
@@ -241,12 +294,22 @@ static void clocksource_verify_percpu(struct clocksource *cs)
int cpu, testcpu;
s64 delta;

+ if (verify_n_cpus == 0)
+ return;
cpumask_clear(&cpus_ahead);
cpumask_clear(&cpus_behind);
+ get_online_cpus();
preempt_disable();
+ clocksource_verify_choose_cpus();
+ if (cpumask_weight(&cpus_chosen) == 0) {
+ preempt_enable();
+ put_online_cpus();
+ pr_warn("Not enough CPUs to check clocksource '%s'.\n", cs->name);
+ return;
+ }
testcpu = smp_processor_id();
- pr_warn("Checking clocksource %s synchronization from CPU %d.\n", cs->name, testcpu);
- for_each_online_cpu(cpu) {
+ pr_warn("Checking clocksource %s synchronization from CPU %d to CPUs %*pbl.\n", cs->name, testcpu, cpumask_pr_args(&cpus_chosen));
+ for_each_cpu(cpu, &cpus_chosen) {
if (cpu == testcpu)
continue;
csnow_begin = cs->read(cs);
@@ -266,6 +329,7 @@ static void clocksource_verify_percpu(struct clocksource *cs)
cs_nsec_min = cs_nsec;
}
preempt_enable();
+ put_online_cpus();
if (!cpumask_empty(&cpus_ahead))
pr_warn(" CPUs %*pbl ahead of CPU %d for clocksource %s.\n",
cpumask_pr_args(&cpus_ahead), testcpu, cs->name);
@@ -336,6 +400,12 @@ static void clocksource_watchdog(struct timer_list *unused)
watchdog->name, wdnow, wdlast, watchdog->mask);
pr_warn(" '%s' cs_now: %llx cs_last: %llx mask: %llx\n",
cs->name, csnow, cslast, cs->mask);
+ if (curr_clocksource == cs)
+ pr_warn(" '%s' is current clocksource.\n", cs->name);
+ else if (curr_clocksource)
+ pr_warn(" '%s' (not '%s') is current clocksource.\n", curr_clocksource->name, cs->name);
+ else
+ pr_warn(" No current clocksource.\n");
__clocksource_unstable(cs);
continue;
}
--
2.31.1.189.g2e36527f23

2021-05-01 00:37:12

by Paul E. McKenney

Subject: [PATCH v12 clocksource 1/5] clocksource: Retry clock read if long delays detected

When the clocksource watchdog marks a clock as unstable, this might
be due to that clock being unstable or it might be due to delays that
happen to occur between the reads of the two clocks. Yes, interrupts are
disabled across those two reads, but there is no shortage of things that
can delay interrupts-disabled regions of code ranging from SMI handlers
to vCPU preemption. It would be good to have some indication as to why
the clock was marked unstable.

Therefore, re-read the watchdog clock on either side of the read
from the clock under test. If the watchdog clock shows an excessive
time delta between its pair of reads, the reads are retried. The
maximum number of retries is specified by a new kernel boot parameter
clocksource.max_read_retries, which defaults to three, that is, up to four
reads, one initial and up to three retries. If more than one retry was
required, a message is printed on the console (the occasional single retry
is expected behavior, especially in guest OSes). If the maximum number
of retries is exceeded, the clock under test will be marked unstable.
However, the probability of this happening due to various sorts of
delays is quite small. In addition, the reason (clock-read delays)
for the unstable marking will be apparent.

Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/lkml/20210429140440.GT975577@paulmck-ThinkPad-P17-Gen-1
Link: https://lore.kernel.org/lkml/20210425224540.GA1312438@paulmck-ThinkPad-P17-Gen-1/
Link: https://lore.kernel.org/lkml/20210420064934.GE31773@xsang-OptiPlex-9020/
Link: https://lore.kernel.org/lkml/20210106004013.GA11179@paulmck-ThinkPad-P72/
Link: https://lore.kernel.org/lkml/20210414043435.GA2812539@paulmck-ThinkPad-P17-Gen-1/
Link: https://lore.kernel.org/lkml/20210419045155.GA596058@paulmck-ThinkPad-P17-Gen-1/
Cc: John Stultz <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Acked-by: Feng Tang <[email protected]>
Reported-by: Chris Mason <[email protected]>
[ paulmck: Per-clocksource retries per Neeraj Upadhyay feedback. ]
[ paulmck: Don't reset injectfail per Neeraj Upadhyay feedback. ]
[ paulmck: Apply Thomas Gleixner feedback. ]
Signed-off-by: Paul E. McKenney <[email protected]>
---
.../admin-guide/kernel-parameters.txt | 6 +++
kernel/time/clocksource.c | 52 ++++++++++++++++---
2 files changed, 52 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 04545725f187..4ab93f2612a2 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -583,6 +583,12 @@
loops can be debugged more effectively on production
systems.

+ clocksource.max_read_retries= [KNL]
+ Number of clocksource_watchdog() retries due to
+ external delays before the clock will be marked
+ unstable. Defaults to three retries, that is,
+ four attempts to read the clock under test.
+
clearcpuid=BITNUM[,BITNUM...] [X86]
Disable CPUID feature X for the kernel. See
arch/x86/include/asm/cpufeatures.h for the valid bit
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index cce484a2cc7c..157530ae73ac 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -124,6 +124,13 @@ static void __clocksource_change_rating(struct clocksource *cs, int rating);
#define WATCHDOG_INTERVAL (HZ >> 1)
#define WATCHDOG_THRESHOLD (NSEC_PER_SEC >> 4)

+/*
+ * Maximum permissible delay between two readouts of the watchdog
+ * clocksource surrounding a read of the clocksource being validated.
+ * This delay could be due to SMIs, NMIs, or to VCPU preemptions.
+ */
+#define WATCHDOG_MAX_SKEW (100 * NSEC_PER_USEC)
+
static void clocksource_watchdog_work(struct work_struct *work)
{
/*
@@ -184,12 +191,44 @@ void clocksource_mark_unstable(struct clocksource *cs)
spin_unlock_irqrestore(&watchdog_lock, flags);
}

+static ulong max_read_retries = 3;
+module_param(max_read_retries, ulong, 0644);
+
+static bool cs_watchdog_read(struct clocksource *cs, u64 *csnow, u64 *wdnow)
+{
+ unsigned int nretries;
+ u64 wd_end, wd_delta;
+ int64_t wd_delay;
+
+ for (nretries = 0; nretries <= max_read_retries; nretries++) {
+ local_irq_disable();
+ *wdnow = watchdog->read(watchdog);
+ *csnow = cs->read(cs);
+ wd_end = watchdog->read(watchdog);
+ local_irq_enable();
+
+ wd_delta = clocksource_delta(wd_end, *wdnow, watchdog->mask);
+ wd_delay = clocksource_cyc2ns(wd_delta, watchdog->mult, watchdog->shift);
+ if (wd_delay <= WATCHDOG_MAX_SKEW) {
+ if (nretries > 1 || nretries >= max_read_retries) {
+ pr_warn("timekeeping watchdog on CPU%d: %s retried %d times before success\n",
+ smp_processor_id(), watchdog->name, nretries);
+ }
+ return true;
+ }
+ }
+
+ pr_warn("timekeeping watchdog on CPU%d: %s read-back delay of %lldns, attempt %d, marking unstable\n",
+ smp_processor_id(), watchdog->name, wd_delay, nretries);
+ return false;
+}
+
static void clocksource_watchdog(struct timer_list *unused)
{
- struct clocksource *cs;
u64 csnow, wdnow, cslast, wdlast, delta;
- int64_t wd_nsec, cs_nsec;
int next_cpu, reset_pending;
+ int64_t wd_nsec, cs_nsec;
+ struct clocksource *cs;

spin_lock(&watchdog_lock);
if (!watchdog_running)
@@ -206,10 +245,11 @@ static void clocksource_watchdog(struct timer_list *unused)
continue;
}

- local_irq_disable();
- csnow = cs->read(cs);
- wdnow = watchdog->read(watchdog);
- local_irq_enable();
+ if (!cs_watchdog_read(cs, &csnow, &wdnow)) {
+ /* Clock readout unreliable, so give it up. */
+ __clocksource_unstable(cs);
+ continue;
+ }

/* Clocksource initialized ? */
if (!(cs->flags & CLOCK_SOURCE_WATCHDOG) ||
--
2.31.1.189.g2e36527f23

2021-05-01 00:37:56

by Paul E. McKenney

Subject: [PATCH v12 clocksource 5/5] clocksource: Provide kernel module to test clocksource watchdog

When the clocksource watchdog marks a clock as unstable, this might
be due to that clock being unstable or it might be due to delays that
happen to occur between the reads of the two clocks. It would be good
to have a way of testing the clocksource watchdog's ability to
distinguish between these two causes of clock skew and instability.

Therefore, provide a new clocksource-wdtest module selected by a new
TEST_CLOCKSOURCE_WATCHDOG Kconfig option. This module has a single module
parameter named "holdoff" that provides the number of seconds of delay
before testing should start, which defaults to zero when built as a module
and to 10 seconds when built directly into the kernel. Very large systems
that boot slowly may need to increase the value of this module parameter.

This module uses hand-crafted clocksource structures to do its testing,
thus avoiding messing up timing for the rest of the kernel and for user
applications. This module first verifies that the ->uncertainty_margin
fields of the clocksource structures are set sanely. It then tests the
delay-detection capability of the clocksource watchdog, increasing the
number of consecutive delays injected, first provoking console messages
complaining about the delays and finally forcing a clock-skew event.
Unexpected test results cause at least one WARN_ON_ONCE() console splat.
If there are no splats, the test has passed. Finally, it fuzzes the
value returned from a clocksource to test the clocksource watchdog's
ability to detect time skew.

This module checks the state of its clocksource after each test, and
uses WARN_ON_ONCE() to emit a console splat if there are any failures.
This should enable all types of test frameworks to detect any such
failures.

This facility is intended for diagnostic use only, and should be avoided
on production systems.

Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/lkml/20210429140440.GT975577@paulmck-ThinkPad-P17-Gen-1
Link: https://lore.kernel.org/lkml/20210425224540.GA1312438@paulmck-ThinkPad-P17-Gen-1/
Link: https://lore.kernel.org/lkml/20210420064934.GE31773@xsang-OptiPlex-9020/
Link: https://lore.kernel.org/lkml/20210106004013.GA11179@paulmck-ThinkPad-P72/
Link: https://lore.kernel.org/lkml/20210414043435.GA2812539@paulmck-ThinkPad-P17-Gen-1/
Link: https://lore.kernel.org/lkml/20210419045155.GA596058@paulmck-ThinkPad-P17-Gen-1/
Suggested-by: Thomas Gleixner <[email protected]>
Cc: John Stultz <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Feng Tang <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Reported-by: Chris Mason <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
.../admin-guide/kernel-parameters.txt | 6 +
include/linux/clocksource.h | 3 +
kernel/time/Makefile | 1 +
kernel/time/clocksource-wdtest.c | 202 ++++++++++++++++++
kernel/time/clocksource.c | 11 +-
lib/Kconfig.debug | 12 ++
6 files changed, 230 insertions(+), 5 deletions(-)
create mode 100644 kernel/time/clocksource-wdtest.c

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 485023c99244..79733e7c04a9 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -599,6 +599,12 @@
The actual CPUs are chosen randomly, with
no replacement if the same CPU is chosen twice.

+ clocksource-wdtest.holdoff= [KNL]
+ Set the time in seconds that the clocksource
+ watchdog test waits before commencing its tests.
+ Defaults to zero when built as a module and to
+ 10 seconds when built into the kernel.
+
clearcpuid=BITNUM[,BITNUM...] [X86]
Disable CPUID feature X for the kernel. See
arch/x86/include/asm/cpufeatures.h for the valid bit
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 8f87c1a6f323..83b3c6639c39 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -285,4 +285,7 @@ static inline void timer_probe(void) {}
#define TIMER_ACPI_DECLARE(name, table_id, fn) \
ACPI_DECLARE_PROBE_ENTRY(timer, name, table_id, 0, NULL, 0, fn)

+extern ulong max_cswd_read_retries;
+void clocksource_verify_percpu(struct clocksource *cs);
+
#endif /* _LINUX_CLOCKSOURCE_H */
diff --git a/kernel/time/Makefile b/kernel/time/Makefile
index 1fb1c1ef6a19..1ed85b25b096 100644
--- a/kernel/time/Makefile
+++ b/kernel/time/Makefile
@@ -21,3 +21,4 @@ obj-$(CONFIG_HAVE_GENERIC_VDSO) += vsyscall.o
obj-$(CONFIG_DEBUG_FS) += timekeeping_debug.o
obj-$(CONFIG_TEST_UDELAY) += test_udelay.o
obj-$(CONFIG_TIME_NS) += namespace.o
+obj-$(CONFIG_TEST_CLOCKSOURCE_WATCHDOG) += clocksource-wdtest.o
diff --git a/kernel/time/clocksource-wdtest.c b/kernel/time/clocksource-wdtest.c
new file mode 100644
index 000000000000..01df12395c0e
--- /dev/null
+++ b/kernel/time/clocksource-wdtest.c
@@ -0,0 +1,202 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Unit test for the clocksource watchdog.
+ *
+ * Copyright (C) 2021 Facebook, Inc.
+ *
+ * Author: Paul E. McKenney <[email protected]>
+ */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/device.h>
+#include <linux/clocksource.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/sched.h> /* for spin_unlock_irq() using preempt_count() m68k */
+#include <linux/tick.h>
+#include <linux/kthread.h>
+#include <linux/delay.h>
+#include <linux/prandom.h>
+#include <linux/cpu.h>
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Paul E. McKenney <[email protected]>");
+
+static int holdoff = IS_BUILTIN(CONFIG_TEST_CLOCKSOURCE_WATCHDOG) ? 10 : 0;
+module_param(holdoff, int, 0444);
+MODULE_PARM_DESC(holdoff, "Time to wait to start test (s).");
+
+/* Watchdog kthread's task_struct pointer for debug purposes. */
+static struct task_struct *wdtest_task;
+
+static u64 wdtest_jiffies_read(struct clocksource *cs)
+{
+ return (u64)jiffies;
+}
+
+/* Assume HZ > 100. */
+#define JIFFIES_SHIFT 8
+
+static struct clocksource clocksource_wdtest_jiffies = {
+ .name = "wdtest-jiffies",
+ .rating = 1, /* lowest valid rating */
+ .uncertainty_margin = TICK_NSEC,
+ .read = wdtest_jiffies_read,
+ .mask = CLOCKSOURCE_MASK(32),
+ .flags = CLOCK_SOURCE_MUST_VERIFY,
+ .mult = TICK_NSEC << JIFFIES_SHIFT, /* details above */
+ .shift = JIFFIES_SHIFT,
+ .max_cycles = 10,
+};
+
+static int wdtest_ktime_read_ndelays;
+static bool wdtest_ktime_read_fuzz;
+
+static u64 wdtest_ktime_read(struct clocksource *cs)
+{
+ int wkrn = READ_ONCE(wdtest_ktime_read_ndelays);
+ static int sign = 1;
+ u64 ret;
+
+ if (wkrn) {
+ udelay(cs->uncertainty_margin / 250);
+ WRITE_ONCE(wdtest_ktime_read_ndelays, wkrn - 1);
+ }
+ ret = ktime_get_real_fast_ns();
+ if (READ_ONCE(wdtest_ktime_read_fuzz)) {
+ sign = -sign;
+ ret = ret + sign * 100 * NSEC_PER_MSEC;
+ }
+ return ret;
+}
+
+static void wdtest_ktime_cs_mark_unstable(struct clocksource *cs)
+{
+ pr_info("--- Marking %s unstable due to clocksource watchdog.\n", cs->name);
+}
+
+#define KTIME_FLAGS (CLOCK_SOURCE_IS_CONTINUOUS | \
+ CLOCK_SOURCE_VALID_FOR_HRES | \
+ CLOCK_SOURCE_MUST_VERIFY | \
+ CLOCK_SOURCE_VERIFY_PERCPU)
+
+static struct clocksource clocksource_wdtest_ktime = {
+ .name = "wdtest-ktime",
+ .rating = 300,
+ .read = wdtest_ktime_read,
+ .mask = CLOCKSOURCE_MASK(64),
+ .flags = KTIME_FLAGS,
+ .mark_unstable = wdtest_ktime_cs_mark_unstable,
+ .list = LIST_HEAD_INIT(clocksource_wdtest_ktime.list),
+};
+
+/* Reset the clocksource if needed. */
+static void wdtest_ktime_clocksource_reset(void)
+{
+ if (clocksource_wdtest_ktime.flags & CLOCK_SOURCE_UNSTABLE) {
+ clocksource_unregister(&clocksource_wdtest_ktime);
+ clocksource_wdtest_ktime.flags = KTIME_FLAGS;
+ schedule_timeout_uninterruptible(HZ / 10);
+ clocksource_register_khz(&clocksource_wdtest_ktime, 1000 * 1000);
+ }
+}
+
+/* Run the specified series of watchdog tests. */
+static int wdtest_func(void *arg)
+{
+ unsigned long j1, j2;
+ char *s;
+ int i;
+
+ schedule_timeout_uninterruptible(holdoff * HZ);
+
+ /*
+ * Verify that jiffies-like clocksources get the manually
+ * specified uncertainty margin.
+ */
+ pr_info("--- Verify jiffies-like uncertainty margin.\n");
+ __clocksource_register(&clocksource_wdtest_jiffies);
+ WARN_ON_ONCE(clocksource_wdtest_jiffies.uncertainty_margin != TICK_NSEC);
+
+ j1 = clocksource_wdtest_jiffies.read(&clocksource_wdtest_jiffies);
+ schedule_timeout_uninterruptible(HZ);
+ j2 = clocksource_wdtest_jiffies.read(&clocksource_wdtest_jiffies);
+ WARN_ON_ONCE(j1 == j2);
+
+ clocksource_unregister(&clocksource_wdtest_jiffies);
+
+ /*
+ * Verify that tsc-like clocksources are assigned a reasonable
+ * uncertainty margin.
+ */
+ pr_info("--- Verify tsc-like uncertainty margin.\n");
+ clocksource_register_khz(&clocksource_wdtest_ktime, 1000 * 1000);
+ WARN_ON_ONCE(clocksource_wdtest_ktime.uncertainty_margin < NSEC_PER_USEC);
+
+ j1 = clocksource_wdtest_ktime.read(&clocksource_wdtest_ktime);
+ udelay(1);
+ j2 = clocksource_wdtest_ktime.read(&clocksource_wdtest_ktime);
+ pr_info("--- tsc-like times: %lu - %lu = %lu.\n", j2, j1, j2 - j1);
+ WARN_ON_ONCE(time_before(j2, j1 + NSEC_PER_USEC));
+
+ /* Verify tsc-like stability with various numbers of errors injected. */
+ for (i = 0; i <= max_cswd_read_retries + 1; i++) {
+ if (i <= 1 && i < max_cswd_read_retries)
+ s = "";
+ else if (i <= max_cswd_read_retries)
+ s = ", expect message";
+ else
+ s = ", expect clock skew";
+ pr_info("--- Watchdog with %dx error injection, %lu retries%s.\n", i, max_cswd_read_retries, s);
+ WRITE_ONCE(wdtest_ktime_read_ndelays, i);
+ schedule_timeout_uninterruptible(2 * HZ);
+ WARN_ON_ONCE(READ_ONCE(wdtest_ktime_read_ndelays));
+ WARN_ON_ONCE((i <= max_cswd_read_retries) !=
+ !(clocksource_wdtest_ktime.flags & CLOCK_SOURCE_UNSTABLE));
+ wdtest_ktime_clocksource_reset();
+ }
+
+ /* Verify tsc-like stability with clock-value-fuzz error injection. */
+ pr_info("--- Watchdog clock-value-fuzz error injection, expect clock skew and per-CPU mismatches.\n");
+ WRITE_ONCE(wdtest_ktime_read_fuzz, true);
+ schedule_timeout_uninterruptible(2 * HZ);
+ WARN_ON_ONCE(!(clocksource_wdtest_ktime.flags & CLOCK_SOURCE_UNSTABLE));
+ clocksource_verify_percpu(&clocksource_wdtest_ktime);
+ WRITE_ONCE(wdtest_ktime_read_fuzz, false);
+
+ clocksource_unregister(&clocksource_wdtest_ktime);
+
+ pr_info("--- Done with test.\n");
+ return 0;
+}
+
+static void wdtest_print_module_parms(void)
+{
+ pr_alert("--- holdoff=%d\n", holdoff);
+}
+
+/* Cleanup function. */
+static void clocksource_wdtest_cleanup(void)
+{
+}
+
+static int __init clocksource_wdtest_init(void)
+{
+ int ret = 0;
+
+ wdtest_print_module_parms();
+
+ /* Create watchdog-test task. */
+ wdtest_task = kthread_run(wdtest_func, NULL, "wdtest");
+ if (IS_ERR(wdtest_task)) {
+ ret = PTR_ERR(wdtest_task);
+ pr_warn("%s: Failed to create wdtest kthread.\n", __func__);
+ wdtest_task = NULL;
+ return ret;
+ }
+
+ return 0;
+}
+
+module_init(clocksource_wdtest_init);
+module_exit(clocksource_wdtest_cleanup);
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index a6de23800b65..f9466dcd4bf3 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -199,8 +199,9 @@ void clocksource_mark_unstable(struct clocksource *cs)
spin_unlock_irqrestore(&watchdog_lock, flags);
}

-static ulong max_read_retries = 3;
-module_param(max_read_retries, ulong, 0644);
+ulong max_cswd_read_retries = 3;
+module_param(max_cswd_read_retries, ulong, 0644);
+EXPORT_SYMBOL_GPL(max_cswd_read_retries);
static int verify_n_cpus = 8;
module_param(verify_n_cpus, int, 0644);

@@ -210,7 +211,7 @@ static bool cs_watchdog_read(struct clocksource *cs, u64 *csnow, u64 *wdnow)
u64 wd_end, wd_delta;
int64_t wd_delay;

- for (nretries = 0; nretries <= max_read_retries; nretries++) {
+ for (nretries = 0; nretries <= max_cswd_read_retries; nretries++) {
local_irq_disable();
*wdnow = watchdog->read(watchdog);
*csnow = cs->read(cs);
@@ -220,7 +221,7 @@ static bool cs_watchdog_read(struct clocksource *cs, u64 *csnow, u64 *wdnow)
wd_delta = clocksource_delta(wd_end, *wdnow, watchdog->mask);
wd_delay = clocksource_cyc2ns(wd_delta, watchdog->mult, watchdog->shift);
if (wd_delay <= WATCHDOG_MAX_SKEW) {
- if (nretries > 1 || nretries >= max_read_retries) {
+ if (nretries > 1 || nretries >= max_cswd_read_retries) {
pr_warn("timekeeping watchdog on CPU%d: %s retried %d times before success\n",
smp_processor_id(), watchdog->name, nretries);
}
@@ -293,7 +294,7 @@ static void clocksource_verify_one_cpu(void *csin)
csnow_mid = cs->read(cs);
}

-static void clocksource_verify_percpu(struct clocksource *cs)
+void clocksource_verify_percpu(struct clocksource *cs)
{
int64_t cs_nsec, cs_nsec_max = 0, cs_nsec_min = LLONG_MAX;
u64 csnow_begin, csnow_end;
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 2779c29d9981..344735b8b151 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2513,6 +2513,18 @@ config TEST_FPU

If unsure, say N.

+config TEST_CLOCKSOURCE_WATCHDOG
+ tristate "Test clocksource watchdog in kernel space"
+ depends on CLOCKSOURCE_WATCHDOG
+ help
+ Enable this option to create a kernel module that will trigger
+ a test of the clocksource watchdog. This module may be loaded
+ via modprobe or insmod in which case it will run upon being
+ loaded, or it may be built in, in which case it will run
+ shortly after boot.
+
+ If unsure, say N.
+
endif # RUNTIME_TESTING_MENU

config MEMTEST
--
2.31.1.189.g2e36527f23
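[Editorial note: the mult/shift fields in clocksource_wdtest_jiffies above encode the standard clocksource fixed-point conversion ns = (cycles * mult) >> shift. Choosing mult = TICK_NSEC << JIFFIES_SHIFT and shift = JIFFIES_SHIFT makes one jiffy convert to exactly TICK_NSEC nanoseconds. A minimal user-space sketch, assuming HZ=250 for illustration:]

```c
#include <stdint.h>

#define HZ            250                /* assumed config value */
#define TICK_NSEC     (1000000000 / HZ)  /* 4,000,000 ns per jiffy */
#define JIFFIES_SHIFT 8
#define WDTEST_MULT   ((uint32_t)TICK_NSEC << JIFFIES_SHIFT)

/* Same fixed-point conversion as the kernel's clocksource_cyc2ns(). */
static uint64_t cyc2ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (cycles * mult) >> shift;
}
```

With these values, cyc2ns(1, ...) yields TICK_NSEC and cyc2ns(HZ, ...) yields one full second, so the watchdog's delta-to-nanoseconds arithmetic is exact for jiffies-like clocks.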

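[Editorial note: the fuzz path in wdtest_ktime_read() above alternates the sign of a 100ms offset, so consecutive reads disagree by roughly 200ms, far beyond any plausible skew threshold, guaranteeing that the watchdog fires. A user-space sketch of that injection technique; the names fuzzed_read() and base are hypothetical, with base standing in for ktime_get_real_fast_ns():]

```c
#include <stdint.h>
#include <stdlib.h>

#define NSEC_PER_MSEC 1000000LL

static int sign = 1;
static int64_t base; /* stand-in for ktime_get_real_fast_ns() */

/*
 * Alternate the sign of a large offset on each read so that time
 * appears to jump back and forth, which a skew check must detect.
 */
static int64_t fuzzed_read(void)
{
	base += NSEC_PER_MSEC; /* pretend 1ms elapses per read */
	sign = -sign;
	return base + sign * 100 * NSEC_PER_MSEC;
}
```

Two successive calls differ by about 201ms here, which is why the test at the end of wdtest_func() can unconditionally expect CLOCK_SOURCE_UNSTABLE to be set.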
2021-05-01 09:54:33

by kernel test robot

Subject: Re: [PATCH v12 clocksource 5/5] clocksource: Provide kernel module to test clocksource watchdog

Hi Paul,

I love your patch! Yet something to improve:

[auto build test ERROR on tip/timers/core]
[also build test ERROR on tip/x86/core linux/master linus/master v5.12]
[cannot apply to next-20210430]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Paul-E-McKenney/Do-not-mark-clocks-unstable-due-to-delays-for-v5-13/20210501-083404
base: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 2d036dfa5f10df9782f5278fc591d79d283c1fad
config: mips-allmodconfig (attached as .config)
compiler: mips-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/95bd62102b4d0bedef8d834a09697cd6daa96f25
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Paul-E-McKenney/Do-not-mark-clocks-unstable-due-to-delays-for-v5-13/20210501-083404
git checkout 95bd62102b4d0bedef8d834a09697cd6daa96f25
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross W=1 ARCH=mips

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "clocksource_verify_percpu" [kernel/time/clocksource-wdtest.ko] undefined!

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]



2021-05-01 16:05:34

by Paul E. McKenney

Subject: Re: [PATCH v12 clocksource 5/5] clocksource: Provide kernel module to test clocksource watchdog

On Sat, May 01, 2021 at 05:49:51PM +0800, kernel test robot wrote:
> Hi Paul,
>
> I love your patch! Yet something to improve:
>
> [auto build test ERROR on tip/timers/core]
> [also build test ERROR on tip/x86/core linux/master linus/master v5.12]
> [cannot apply to next-20210430]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url: https://github.com/0day-ci/linux/commits/Paul-E-McKenney/Do-not-mark-clocks-unstable-due-to-delays-for-v5-13/20210501-083404
> base: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 2d036dfa5f10df9782f5278fc591d79d283c1fad
> config: mips-allmodconfig (attached as .config)
> compiler: mips-linux-gcc (GCC) 9.3.0
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # https://github.com/0day-ci/linux/commit/95bd62102b4d0bedef8d834a09697cd6daa96f25
> git remote add linux-review https://github.com/0day-ci/linux
> git fetch --no-tags linux-review Paul-E-McKenney/Do-not-mark-clocks-unstable-due-to-delays-for-v5-13/20210501-083404
> git checkout 95bd62102b4d0bedef8d834a09697cd6daa96f25
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross W=1 ARCH=mips
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <[email protected]>
>
> All errors (new ones prefixed by >>, old ones prefixed by <<):
>
> >> ERROR: modpost: "clocksource_verify_percpu" [kernel/time/clocksource-wdtest.ko] undefined!

Good show, you beat my test suite that includes allmodconfig. ;-)

Does the fixup patch below help? (It works here.)

Thanx, Paul

------------------------------------------------------------------------

commit 70e662d3cfe370490f310b81aaedf47d5d7cf55e
Author: Paul E. McKenney <[email protected]>
Date: Sat May 1 08:57:53 2021 -0700

squash! clocksource: Provide kernel module to test clocksource watchdog

[ paulmck: Export clocksource_verify_percpu per kernel test robot. ]
Signed-off-by: Paul E. McKenney <[email protected]>

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index f9466dcd4bf3..9ac76fbe00fb 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -347,6 +347,7 @@ void clocksource_verify_percpu(struct clocksource *cs)
pr_warn(" CPU %d check durations %lldns - %lldns for clocksource %s.\n",
testcpu, cs_nsec_min, cs_nsec_max, cs->name);
}
+EXPORT_SYMBOL_GPL(clocksource_verify_percpu);

static void clocksource_watchdog(struct timer_list *unused)
{

2021-05-13 22:35:14

by kernel test robot

Subject: [clocksource] 388450c708: netperf.Throughput_tps -65.1% regression



Greetings,

FYI, we noticed a -65.1% regression of netperf.Throughput_tps due to commit:


commit: 388450c7081ded73432e2b7148c1bb9a0b039963 ("[PATCH v12 clocksource 4/5] clocksource: Reduce clocksource-skew threshold for TSC")
url: https://github.com/0day-ci/linux/commits/Paul-E-McKenney/Do-not-mark-clocks-unstable-due-to-delays-for-v5-13/20210501-083404
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 2d036dfa5f10df9782f5278fc591d79d283c1fad

in testcase: netperf
on test machine: 96 threads 2 sockets Ice Lake with 256G memory
with following parameters:

ip: ipv4
runtime: 300s
nr_threads: 25%
cluster: cs-localhost
test: UDP_RR
cpufreq_governor: performance
ucode: 0xb000280

test-description: Netperf is a benchmark that can be used to measure various aspects of networking performance.
test-url: http://www.netperf.org/netperf/



If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>


Also, as Feng Tang checked, this is an "unstable clocksource" case.
Attached dmesg FYI.


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
bin/lkp run generated-yaml-file

=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase/ucode:
cs-localhost/gcc-9/performance/ipv4/x86_64-rhel-8.3/25%/debian-10.4-x86_64-20200603.cgz/300s/lkp-icl-2sp1/UDP_RR/netperf/0xb000280

commit:
a48ef5b9b6 ("clocksource: Limit number of CPUs checked for clock synchronization")
388450c708 ("clocksource: Reduce clocksource-skew threshold for TSC")

a48ef5b9b65fab26 388450c7081ded73432e2b7148c
---------------- ---------------------------
%stddev %change %stddev
\ | \
2552064 -65.1% 891438 netperf.Throughput_total_tps
106336 -65.1% 37143 netperf.Throughput_tps
9493 -48.5% 4889 netperf.time.involuntary_context_switches
8378 ? 2% -8.5% 7666 ? 2% netperf.time.minor_page_faults
1113 +14.5% 1274 netperf.time.percent_of_cpu_this_job_got
3114 +16.5% 3628 netperf.time.system_time
238.81 -11.8% 210.71 netperf.time.user_time
7.656e+08 -65.1% 2.674e+08 netperf.time.voluntary_context_switches
7.656e+08 -65.1% 2.674e+08 netperf.workload
639.13 -14.2% 548.45 pmeter.Average_Active_Power
25129 +18.2% 29715 uptime.idle
62530 -27.8% 45143 slabinfo.radix_tree_node.active_objs
1116 -27.8% 805.67 slabinfo.radix_tree_node.active_slabs
62530 -27.8% 45143 slabinfo.radix_tree_node.num_objs
1116 -27.8% 805.67 slabinfo.radix_tree_node.num_slabs
1.527e+09 -98.8% 17975152 ? 5% turbostat.C1
20.76 -20.1 0.62 ? 7% turbostat.C1%
18935941 ? 13% +2701.4% 5.305e+08 turbostat.C1E
65.67 ? 2% +7.9% 70.83 ? 3% turbostat.PkgTmp
6.049e+09 -97.0% 1.799e+08 ? 7% cpuidle.C1.time
1.527e+09 -98.8% 17975177 ? 5% cpuidle.C1.usage
18936233 ? 13% +2701.3% 5.305e+08 cpuidle.C1E.usage
26807255 +66.2% 44545592 ? 17% cpuidle.POLL.time
3856520 ? 3% -56.2% 1688057 ? 17% cpuidle.POLL.usage
74.02 +15.6 89.60 mpstat.cpu.all.idle%
0.62 ? 2% -0.6 0.00 mpstat.cpu.all.irq%
6.57 -4.3 2.23 mpstat.cpu.all.soft%
17.04 -9.7 7.31 mpstat.cpu.all.sys%
1.74 -0.9 0.86 mpstat.cpu.all.usr%
66034 ? 11% +5978.5% 4013902 ? 49% numa-meminfo.node1.Active
65933 ? 11% +5987.8% 4013878 ? 49% numa-meminfo.node1.Active(anon)
6475043 ? 61% -91.9% 521267 ? 41% numa-meminfo.node1.Inactive
6474945 ? 61% -91.9% 521242 ? 41% numa-meminfo.node1.Inactive(anon)
2795743 ? 50% -77.9% 617984 ? 39% numa-meminfo.node1.Mapped
16495 ? 11% +5996.2% 1005570 ? 49% numa-vmstat.node1.nr_active_anon
1619161 ? 61% -92.0% 129300 ? 41% numa-vmstat.node1.nr_inactive_anon
698820 ? 50% -78.0% 153526 ? 40% numa-vmstat.node1.nr_mapped
16495 ? 11% +5996.2% 1005570 ? 49% numa-vmstat.node1.nr_zone_active_anon
1619160 ? 61% -92.0% 129300 ? 41% numa-vmstat.node1.nr_zone_inactive_anon
72.00 +20.8% 87.00 vmstat.cpu.id
12674271 -34.8% 8264276 vmstat.memory.cache
28.83 -13.3% 25.00 vmstat.procs.r
10041753 -65.0% 3517904 vmstat.system.cs
204196 +9.3% 223115 vmstat.system.in
79614 +8119.8% 6544149 meminfo.Active
79459 +8135.6% 6543994 meminfo.Active(anon)
12515837 -35.1% 8122403 meminfo.Cached
11946542 -37.0% 7529199 meminfo.Committed_AS
11747198 -92.6% 867161 ? 6% meminfo.Inactive
11747048 -92.6% 867014 ? 6% meminfo.Inactive(anon)
4315200 ? 6% -77.7% 963241 ? 4% meminfo.Mapped
14185177 -32.1% 9630648 meminfo.Memused
14936 ? 3% -44.1% 8345 meminfo.PageTables
11547928 -38.0% 7154494 meminfo.Shmem
14306545 -30.9% 9884359 meminfo.max_used_kB
19809 +8135.2% 1631378 proc-vmstat.nr_active_anon
69703 -7.8% 64270 proc-vmstat.nr_anon_pages
6203411 +1.8% 6316647 proc-vmstat.nr_dirty_background_threshold
12421991 +1.8% 12648739 proc-vmstat.nr_dirty_threshold
3120664 -35.0% 2026997 proc-vmstat.nr_file_pages
62410280 +1.8% 63544296 proc-vmstat.nr_free_pages
2928510 -92.6% 217781 ? 6% proc-vmstat.nr_inactive_anon
1070932 ? 7% -77.2% 244114 ? 3% proc-vmstat.nr_mapped
3722 ? 3% -43.7% 2097 proc-vmstat.nr_page_table_pages
2878686 -38.0% 1785019 proc-vmstat.nr_shmem
28786 -8.7% 26291 proc-vmstat.nr_slab_reclaimable
19809 +8135.2% 1631378 proc-vmstat.nr_zone_active_anon
2928510 -92.6% 217781 ? 6% proc-vmstat.nr_zone_inactive_anon
605165 ? 4% -70.0% 181791 ? 2% proc-vmstat.numa_hint_faults
500428 ? 3% -82.6% 86975 ? 6% proc-vmstat.numa_hint_faults_local
8324661 -34.0% 5497994 proc-vmstat.numa_hit
8238072 -34.3% 5411430 proc-vmstat.numa_local
835983 ? 7% -60.4% 330746 ? 4% proc-vmstat.numa_pte_updates
21697 +6821.8% 1501840 ? 6% proc-vmstat.pgactivate
8398800 -33.8% 5558275 proc-vmstat.pgalloc_normal
1964293 -25.0% 1474197 proc-vmstat.pgfault
875949 +13.2% 991262 proc-vmstat.pgfree
78682 ? 15% +185.8% 224886 ? 7% sched_debug.cfs_rq:/.MIN_vruntime.avg
1076987 ? 2% +24.5% 1340633 sched_debug.cfs_rq:/.MIN_vruntime.max
260325 ? 7% +69.4% 441009 ? 2% sched_debug.cfs_rq:/.MIN_vruntime.stddev
78682 ? 15% +185.8% 224886 ? 7% sched_debug.cfs_rq:/.max_vruntime.avg
1076987 ? 2% +24.5% 1340633 sched_debug.cfs_rq:/.max_vruntime.max
260325 ? 7% +69.4% 441009 ? 2% sched_debug.cfs_rq:/.max_vruntime.stddev
948642 +12.9% 1070711 sched_debug.cfs_rq:/.min_vruntime.avg
604932 ? 4% +13.7% 687739 ? 4% sched_debug.cfs_rq:/.min_vruntime.min
109825 ? 7% +45.9% 160254 ? 5% sched_debug.cfs_rq:/.min_vruntime.stddev
0.38 ? 5% -11.8% 0.33 ? 6% sched_debug.cfs_rq:/.nr_running.avg
844.78 ? 3% -14.2% 725.17 ? 8% sched_debug.cfs_rq:/.runnable_avg.max
-3091 +2594.8% -83301 sched_debug.cfs_rq:/.spread0.min
109825 ? 7% +45.9% 160258 ? 5% sched_debug.cfs_rq:/.spread0.stddev
843.81 ? 3% -14.8% 719.14 ? 8% sched_debug.cfs_rq:/.util_avg.max
189.98 ? 6% -15.9% 159.82 ? 8% sched_debug.cfs_rq:/.util_est_enqueued.avg
795.50 ? 2% -20.5% 632.78 ? 2% sched_debug.cfs_rq:/.util_est_enqueued.max
2800 ? 2% +284.1% 10757 ? 47% sched_debug.cpu.avg_idle.min
1206 ? 5% -55.2% 539.86 ? 4% sched_debug.cpu.clock_task.stddev
711.73 ? 6% -44.5% 395.18 ? 2% sched_debug.cpu.curr->pid.avg
1274 ? 2% -15.1% 1081 sched_debug.cpu.curr->pid.stddev
0.31 ? 3% -12.2% 0.27 ? 3% sched_debug.cpu.nr_running.avg
15846045 -65.2% 5507786 sched_debug.cpu.nr_switches.avg
20528236 ? 4% -63.6% 7475974 sched_debug.cpu.nr_switches.max
9220868 ? 5% -63.1% 3406063 ? 6% sched_debug.cpu.nr_switches.min
1692698 ? 6% -49.9% 848477 ? 5% sched_debug.cpu.nr_switches.stddev
1.00 -100.0% 0.00 sched_debug.sched_clock_stable()
17.42 -1.3% 17.19 perf-stat.i.MPKI
1.729e+10 -60.3% 6.871e+09 perf-stat.i.branch-instructions
1.75 +0.2 1.92 perf-stat.i.branch-miss-rate%
2.942e+08 -56.6% 1.276e+08 perf-stat.i.branch-misses
0.92 ? 7% +0.7 1.65 ? 3% perf-stat.i.cache-miss-rate%
12221846 ? 4% -22.7% 9443692 ? 3% perf-stat.i.cache-misses
1.473e+09 -61.4% 5.683e+08 perf-stat.i.cache-references
10129998 -65.0% 3546333 perf-stat.i.context-switches
1.46 +133.2% 3.41 perf-stat.i.cpi
1.243e+11 -9.3% 1.128e+11 perf-stat.i.cpu-cycles
274.12 -7.8% 252.77 ? 2% perf-stat.i.cpu-migrations
0.03 ? 11% -0.0 0.02 ? 6% perf-stat.i.dTLB-load-miss-rate%
7714060 ? 3% -80.1% 1532509 ? 6% perf-stat.i.dTLB-load-misses
2.493e+10 -62.3% 9.411e+09 perf-stat.i.dTLB-loads
413851 ? 7% -53.3% 193085 ? 5% perf-stat.i.dTLB-store-misses
1.517e+10 -63.6% 5.519e+09 perf-stat.i.dTLB-stores
8.66e+10 -60.7% 3.407e+10 perf-stat.i.instructions
0.69 -56.6% 0.30 perf-stat.i.ipc
1.30 -9.3% 1.17 perf-stat.i.metric.GHz
613.38 -62.0% 233.08 perf-stat.i.metric.M/sec
6360 -25.0% 4772 perf-stat.i.minor-faults
62.26 ? 9% +33.5 95.78 perf-stat.i.node-load-miss-rate%
641062 ? 41% +106.7% 1325277 ? 2% perf-stat.i.node-load-misses
1270854 ? 15% -93.6% 81705 ? 21% perf-stat.i.node-loads
85.77 -33.6 52.21 perf-stat.i.node-store-miss-rate%
2320340 -54.6% 1052549 perf-stat.i.node-store-misses
1605640 +10.1% 1767572 perf-stat.i.node-stores
6361 -25.0% 4772 perf-stat.i.page-faults
17.01 -1.9% 16.68 perf-stat.overall.MPKI
1.70 +0.2 1.86 perf-stat.overall.branch-miss-rate%
0.83 ? 5% +0.8 1.66 ? 3% perf-stat.overall.cache-miss-rate%
1.44 +130.5% 3.31 perf-stat.overall.cpi
10192 ? 4% +17.2% 11943 ? 2% perf-stat.overall.cycles-between-cache-misses
0.03 ? 3% -0.0 0.02 ? 6% perf-stat.overall.dTLB-load-miss-rate%
0.00 ? 7% +0.0 0.00 ? 5% perf-stat.overall.dTLB-store-miss-rate%
0.70 -56.6% 0.30 perf-stat.overall.ipc
33.04 ? 35% +61.2 94.21 perf-stat.overall.node-load-miss-rate%
59.09 -21.8 37.32 perf-stat.overall.node-store-miss-rate%
34154 +12.3% 38365 perf-stat.overall.path-length
1.723e+10 -60.3% 6.847e+09 perf-stat.ps.branch-instructions
2.932e+08 -56.6% 1.272e+08 perf-stat.ps.branch-misses
12182782 ? 4% -22.7% 9416463 ? 3% perf-stat.ps.cache-misses
1.468e+09 -61.4% 5.664e+08 perf-stat.ps.cache-references
10094934 -65.0% 3533893 perf-stat.ps.context-switches
1.239e+11 -9.3% 1.124e+11 perf-stat.ps.cpu-cycles
273.37 -7.8% 252.06 ? 2% perf-stat.ps.cpu-migrations
7683633 ? 3% -80.1% 1527379 ? 6% perf-stat.ps.dTLB-load-misses
2.485e+10 -62.3% 9.379e+09 perf-stat.ps.dTLB-loads
412328 ? 7% -53.3% 192438 ? 5% perf-stat.ps.dTLB-store-misses
1.512e+10 -63.6% 5.5e+09 perf-stat.ps.dTLB-stores
8.631e+10 -60.7% 3.395e+10 perf-stat.ps.instructions
6338 -25.0% 4754 perf-stat.ps.minor-faults
638936 ? 41% +106.7% 1320628 ? 2% perf-stat.ps.node-load-misses
1266002 ? 15% -93.6% 81460 ? 21% perf-stat.ps.node-loads
2312502 -54.6% 1049001 perf-stat.ps.node-store-misses
1601288 +10.0% 1761928 perf-stat.ps.node-stores
6338 -25.0% 4755 perf-stat.ps.page-faults
2.615e+13 -60.8% 1.026e+13 perf-stat.total.instructions
4863693 ? 3% -63.3% 1784617 ? 13% softirqs.CPU0.NET_RX
8390807 ? 6% -64.3% 2996025 ? 8% softirqs.CPU1.NET_RX
7739940 -62.3% 2916844 ? 10% softirqs.CPU10.NET_RX
7995905 -63.8% 2894634 ? 6% softirqs.CPU11.NET_RX
7672503 ? 6% -64.7% 2710697 ? 5% softirqs.CPU12.NET_RX
7845055 ? 5% -63.6% 2858075 ? 4% softirqs.CPU13.NET_RX
8007447 ? 2% -63.6% 2914695 ? 8% softirqs.CPU14.NET_RX
7921355 ? 8% -63.6% 2885268 ? 10% softirqs.CPU15.NET_RX
8134485 ? 5% -64.7% 2873713 ? 3% softirqs.CPU16.NET_RX
8369520 ± 4% -65.7% 2869899 ± 9% softirqs.CPU17.NET_RX
8431236 ± 5% -64.6% 2983107 ± 3% softirqs.CPU18.NET_RX
8194774 ± 4% -64.1% 2942680 ± 5% softirqs.CPU19.NET_RX
8131381 ± 4% -61.6% 3126035 ± 4% softirqs.CPU2.NET_RX
8635834 ± 5% -65.0% 3020495 ± 3% softirqs.CPU20.NET_RX
8660660 ± 6% -65.5% 2991711 ± 5% softirqs.CPU21.NET_RX
8842820 ± 3% -67.0% 2915579 ± 8% softirqs.CPU22.NET_RX
8703612 ± 5% -66.7% 2896298 ± 6% softirqs.CPU23.NET_RX
7667353 ± 4% -60.9% 2995334 ± 7% softirqs.CPU24.NET_RX
8202704 ± 6% -61.3% 3176485 ± 3% softirqs.CPU25.NET_RX
8035819 ± 3% -59.0% 3294811 ± 6% softirqs.CPU26.NET_RX
7667060 ± 2% -56.2% 3355323 ± 7% softirqs.CPU27.NET_RX
7818298 ± 4% -55.7% 3460870 ± 4% softirqs.CPU28.NET_RX
8209299 ± 6% -58.9% 3372827 ± 4% softirqs.CPU29.NET_RX
7826514 ± 14% -60.1% 3119356 ± 3% softirqs.CPU3.NET_RX
7905831 ± 5% -56.3% 3457821 ± 5% softirqs.CPU30.NET_RX
7920713 ± 5% -58.6% 3278887 ± 5% softirqs.CPU31.NET_RX
8192169 ± 3% -59.5% 3316165 ± 5% softirqs.CPU32.NET_RX
7941276 ± 4% -59.7% 3200765 ± 5% softirqs.CPU33.NET_RX
7504068 ± 14% -57.8% 3166178 ± 4% softirqs.CPU34.NET_RX
7870330 ± 3% -59.7% 3170329 ± 6% softirqs.CPU35.NET_RX
7943349 ± 3% -61.4% 3064142 ± 4% softirqs.CPU36.NET_RX
7937130 ± 3% -62.0% 3013094 ± 7% softirqs.CPU37.NET_RX
8267476 ± 3% -62.1% 3133563 ± 4% softirqs.CPU38.NET_RX
8028779 ± 7% -62.3% 3024788 ± 6% softirqs.CPU39.NET_RX
8219986 ± 4% -62.4% 3093371 ± 5% softirqs.CPU4.NET_RX
8113648 ± 6% -62.5% 3040010 ± 7% softirqs.CPU40.NET_RX
8356796 ± 4% -63.1% 3081411 ± 5% softirqs.CPU41.NET_RX
8278770 ± 5% -62.1% 3138752 ± 5% softirqs.CPU42.NET_RX
8260552 ± 7% -63.0% 3058037 ± 4% softirqs.CPU43.NET_RX
7882157 ± 2% -59.8% 3172162 ± 6% softirqs.CPU44.NET_RX
8316423 ± 2% -60.1% 3320176 ± 8% softirqs.CPU45.NET_RX
8444938 ± 3% -62.5% 3169460 ± 3% softirqs.CPU46.NET_RX
7083563 ± 20% -54.6% 3216099 softirqs.CPU47.NET_RX
10454503 ± 5% -71.4% 2986371 ± 6% softirqs.CPU48.NET_RX
8629502 ± 5% -67.8% 2778512 ± 4% softirqs.CPU49.NET_RX
8049381 ± 5% -59.7% 3240792 ± 5% softirqs.CPU5.NET_RX
8874688 ± 4% -70.7% 2597980 ± 5% softirqs.CPU50.NET_RX
7903829 ± 16% -65.5% 2730206 ± 6% softirqs.CPU51.NET_RX
8468110 ± 3% -70.3% 2512198 ± 6% softirqs.CPU52.NET_RX
8696922 ± 5% -72.3% 2405739 ± 7% softirqs.CPU53.NET_RX
8425470 -69.4% 2574360 ± 8% softirqs.CPU54.NET_RX
7831673 ± 4% -68.6% 2461209 ± 7% softirqs.CPU55.NET_RX
8735067 ± 11% -71.3% 2511232 ± 10% softirqs.CPU56.NET_RX
8383685 ± 4% -70.2% 2496584 ± 11% softirqs.CPU57.NET_RX
8306674 ± 3% -69.9% 2501269 ± 9% softirqs.CPU58.NET_RX
8197548 ± 4% -68.8% 2558582 ± 3% softirqs.CPU59.NET_RX
8024359 ± 2% -60.5% 3171087 ± 8% softirqs.CPU6.NET_RX
8271211 ± 8% -69.0% 2564369 ± 5% softirqs.CPU60.NET_RX
8082682 ± 7% -69.7% 2446964 ± 3% softirqs.CPU61.NET_RX
8119390 ± 2% -70.6% 2389317 ± 5% softirqs.CPU62.NET_RX
7646737 ± 8% -67.3% 2503721 ± 8% softirqs.CPU63.NET_RX
7949118 ± 6% -68.7% 2490606 ± 7% softirqs.CPU64.NET_RX
7744965 ± 3% -68.7% 2421934 ± 9% softirqs.CPU65.NET_RX
7472381 ± 8% -69.8% 2258480 ± 5% softirqs.CPU66.NET_RX
7651658 ± 9% -69.0% 2368252 ± 11% softirqs.CPU67.NET_RX
7442612 ± 5% -70.4% 2202676 ± 10% softirqs.CPU68.NET_RX
7015192 ± 6% -68.1% 2239285 ± 12% softirqs.CPU69.NET_RX
8345794 ± 5% -61.7% 3193281 ± 7% softirqs.CPU7.NET_RX
6965622 ± 5% -67.7% 2248121 ± 10% softirqs.CPU70.NET_RX
6968466 ± 8% -68.9% 2165129 ± 4% softirqs.CPU71.NET_RX
9305958 ± 3% -66.3% 3138664 ± 4% softirqs.CPU72.NET_RX
8453497 ± 6% -65.8% 2891626 ± 5% softirqs.CPU73.NET_RX
8801156 ± 4% -69.6% 2678589 ± 5% softirqs.CPU74.NET_RX
8525358 ± 5% -69.9% 2563539 ± 8% softirqs.CPU75.NET_RX
8427135 ± 6% -69.6% 2562708 ± 5% softirqs.CPU76.NET_RX
8091793 ± 6% -67.8% 2604572 ± 12% softirqs.CPU77.NET_RX
8222394 ± 4% -69.6% 2503244 ± 14% softirqs.CPU78.NET_RX
8021023 ± 8% -67.0% 2646990 ± 10% softirqs.CPU79.NET_RX
7123789 ± 7% -57.4% 3034199 ± 7% softirqs.CPU8.NET_RX
7907282 ± 3% -68.3% 2503273 ± 14% softirqs.CPU80.NET_RX
8064970 ± 8% -68.3% 2555326 ± 12% softirqs.CPU81.NET_RX
7408809 ± 16% -65.4% 2561466 ± 2% softirqs.CPU82.NET_RX
8144151 ± 3% -68.4% 2575811 ± 9% softirqs.CPU83.NET_RX
7689141 ± 6% -66.7% 2563415 ± 9% softirqs.CPU84.NET_RX
7712230 ± 4% -65.6% 2649338 ± 8% softirqs.CPU85.NET_RX
7673765 ± 4% -68.2% 2440968 ± 5% softirqs.CPU86.NET_RX
7847121 ± 4% -68.7% 2455763 ± 12% softirqs.CPU87.NET_RX
7598116 ± 5% -65.8% 2599653 ± 11% softirqs.CPU88.NET_RX
7823019 ± 3% -66.9% 2587190 ± 2% softirqs.CPU89.NET_RX
7964223 ± 3% -62.8% 2962506 ± 8% softirqs.CPU9.NET_RX
7170381 ± 4% -64.6% 2539672 ± 6% softirqs.CPU90.NET_RX
7402249 ± 6% -65.3% 2568897 ± 4% softirqs.CPU91.NET_RX
7427317 ± 3% -69.0% 2301245 ± 8% softirqs.CPU92.NET_RX
7488111 ± 2% -70.0% 2246435 ± 9% softirqs.CPU93.NET_RX
7161998 ± 6% -67.9% 2301183 ± 3% softirqs.CPU94.NET_RX
5089290 ± 29% -62.2% 1925573 ± 4% softirqs.CPU95.NET_RX
7.656e+08 -65.1% 2.675e+08 softirqs.NET_RX
286032 ± 86% -86.5% 38694 ± 34% interrupts.CAL:Function_call_interrupts
3451 ± 89% -70.6% 1014 ± 41% interrupts.CPU0.CAL:Function_call_interrupts
19257 ± 5% +206.8% 59079 ± 10% interrupts.CPU0.RES:Rescheduling_interrupts
3514 ± 60% -74.1% 910.33 ± 73% interrupts.CPU1.CAL:Function_call_interrupts
127.67 ± 17% -35.2% 82.67 ± 20% interrupts.CPU1.IWI:IRQ_work_interrupts
30556 ± 6% +227.1% 99939 ± 8% interrupts.CPU1.RES:Rescheduling_interrupts
2842 ± 86% -88.7% 322.50 ± 43% interrupts.CPU10.CAL:Function_call_interrupts
27500 ± 2% +252.8% 97010 ± 10% interrupts.CPU10.RES:Rescheduling_interrupts
2835 ± 88% -85.6% 408.50 ± 29% interrupts.CPU11.CAL:Function_call_interrupts
30707 ± 9% +212.2% 95866 ± 6% interrupts.CPU11.RES:Rescheduling_interrupts
3005 ± 90% -86.6% 401.33 ± 33% interrupts.CPU12.CAL:Function_call_interrupts
29040 ± 8% +211.8% 90551 ± 6% interrupts.CPU12.RES:Rescheduling_interrupts
2349 ± 48% -85.3% 344.83 ± 40% interrupts.CPU13.CAL:Function_call_interrupts
28673 ± 8% +233.7% 95693 ± 5% interrupts.CPU13.RES:Rescheduling_interrupts
2601 ± 83% -88.1% 309.50 ± 45% interrupts.CPU14.CAL:Function_call_interrupts
7782 ± 15% -20.8% 6163 ± 22% interrupts.CPU14.NMI:Non-maskable_interrupts
7782 ± 15% -20.8% 6163 ± 22% interrupts.CPU14.PMI:Performance_monitoring_interrupts
28922 ± 2% +235.7% 97099 ± 9% interrupts.CPU14.RES:Rescheduling_interrupts
2824 ± 69% -88.8% 317.67 ± 43% interrupts.CPU15.CAL:Function_call_interrupts
28498 ± 8% +235.2% 95519 ± 11% interrupts.CPU15.RES:Rescheduling_interrupts
2684 ± 74% -88.7% 302.67 ± 50% interrupts.CPU16.CAL:Function_call_interrupts
29427 ± 5% +223.2% 95105 ± 4% interrupts.CPU16.RES:Rescheduling_interrupts
3160 ± 82% -90.0% 316.00 ± 45% interrupts.CPU17.CAL:Function_call_interrupts
30523 ± 3% +211.1% 94965 ± 9% interrupts.CPU17.RES:Rescheduling_interrupts
3704 ±125% -91.0% 333.00 ± 47% interrupts.CPU18.CAL:Function_call_interrupts
30460 ± 4% +223.1% 98410 ± 4% interrupts.CPU18.RES:Rescheduling_interrupts
3204 ±103% -90.0% 320.00 ± 42% interrupts.CPU19.CAL:Function_call_interrupts
29430 ± 5% +229.3% 96915 ± 8% interrupts.CPU19.RES:Rescheduling_interrupts
29776 ± 4% +251.9% 104776 ± 5% interrupts.CPU2.RES:Rescheduling_interrupts
3451 ±110% -90.0% 345.50 ± 42% interrupts.CPU20.CAL:Function_call_interrupts
31388 ± 5% +218.8% 100052 ± 4% interrupts.CPU20.RES:Rescheduling_interrupts
3259 ± 80% -90.9% 298.00 ± 48% interrupts.CPU21.CAL:Function_call_interrupts
31581 ± 6% +215.3% 99561 ± 6% interrupts.CPU21.RES:Rescheduling_interrupts
3710 ±107% -91.7% 308.00 ± 43% interrupts.CPU22.CAL:Function_call_interrupts
152.83 ± 7% -26.7% 112.00 ± 15% interrupts.CPU22.IWI:IRQ_work_interrupts
32670 ± 3% +198.3% 97453 ± 10% interrupts.CPU22.RES:Rescheduling_interrupts
2706 ± 60% -87.4% 340.17 ± 42% interrupts.CPU23.CAL:Function_call_interrupts
32641 ± 5% +195.4% 96416 ± 7% interrupts.CPU23.RES:Rescheduling_interrupts
3098 ± 82% -87.9% 374.50 ± 39% interrupts.CPU24.CAL:Function_call_interrupts
27419 ± 6% +234.9% 91830 ± 7% interrupts.CPU24.RES:Rescheduling_interrupts
2911 ± 77% -86.4% 397.33 ± 42% interrupts.CPU25.CAL:Function_call_interrupts
28869 ± 9% +240.3% 98234 ± 3% interrupts.CPU25.RES:Rescheduling_interrupts
2802 ± 92% -87.7% 345.00 ± 55% interrupts.CPU26.CAL:Function_call_interrupts
28342 ± 3% +262.3% 102693 ± 6% interrupts.CPU26.RES:Rescheduling_interrupts
2979 ±102% -88.2% 350.83 ± 40% interrupts.CPU27.CAL:Function_call_interrupts
26142 ± 2% +298.6% 104201 ± 6% interrupts.CPU27.RES:Rescheduling_interrupts
2376 ± 64% -81.2% 447.67 ± 60% interrupts.CPU28.CAL:Function_call_interrupts
26581 ± 3% +301.9% 106824 ± 5% interrupts.CPU28.RES:Rescheduling_interrupts
3740 ±137% -91.3% 325.00 ± 47% interrupts.CPU29.CAL:Function_call_interrupts
4645 ± 29% +57.0% 7294 ± 13% interrupts.CPU29.NMI:Non-maskable_interrupts
4645 ± 29% +57.0% 7294 ± 13% interrupts.CPU29.PMI:Performance_monitoring_interrupts
28734 ± 6% +259.7% 103359 ± 3% interrupts.CPU29.RES:Rescheduling_interrupts
3000 ± 77% -88.3% 351.17 ± 48% interrupts.CPU3.CAL:Function_call_interrupts
28213 ± 14% +267.0% 103531 ± 4% interrupts.CPU3.RES:Rescheduling_interrupts
3232 ±125% -90.0% 324.83 ± 43% interrupts.CPU30.CAL:Function_call_interrupts
26938 ± 6% +296.6% 106834 ± 6% interrupts.CPU30.RES:Rescheduling_interrupts
2962 ±100% -89.0% 325.50 ± 45% interrupts.CPU31.CAL:Function_call_interrupts
28530 ± 12% +254.2% 101045 ± 6% interrupts.CPU31.RES:Rescheduling_interrupts
2685 ±103% -86.8% 354.33 ± 35% interrupts.CPU32.CAL:Function_call_interrupts
27297 ± 6% +276.2% 102682 ± 3% interrupts.CPU32.RES:Rescheduling_interrupts
2440 ± 95% -84.2% 385.00 ± 40% interrupts.CPU33.CAL:Function_call_interrupts
27354 ± 2% +264.1% 99598 ± 5% interrupts.CPU33.RES:Rescheduling_interrupts
3145 ±117% -88.4% 365.67 ± 40% interrupts.CPU34.CAL:Function_call_interrupts
25952 ± 15% +274.8% 97271 ± 6% interrupts.CPU34.RES:Rescheduling_interrupts
3064 ±113% -88.8% 342.17 ± 42% interrupts.CPU35.CAL:Function_call_interrupts
27847 ± 4% +251.2% 97813 ± 7% interrupts.CPU35.RES:Rescheduling_interrupts
1981 ± 35% -81.9% 358.67 ± 42% interrupts.CPU36.CAL:Function_call_interrupts
28654 ± 3% +231.0% 94855 ± 4% interrupts.CPU36.RES:Rescheduling_interrupts
2818 ±101% -87.9% 341.83 ± 43% interrupts.CPU37.CAL:Function_call_interrupts
28630 ± 4% +224.5% 92893 ± 7% interrupts.CPU37.RES:Rescheduling_interrupts
3472 ±127% -85.7% 497.67 ± 67% interrupts.CPU38.CAL:Function_call_interrupts
28809 ± 4% +236.9% 97073 ± 7% interrupts.CPU38.RES:Rescheduling_interrupts
25.67 ±152% -96.8% 0.83 ±107% interrupts.CPU38.TLB:TLB_shootdowns
2522 ± 93% -85.9% 356.83 ± 39% interrupts.CPU39.CAL:Function_call_interrupts
132.00 ± 19% -43.7% 74.33 ± 31% interrupts.CPU39.IWI:IRQ_work_interrupts
28626 ± 9% +228.5% 94035 ± 7% interrupts.CPU39.RES:Rescheduling_interrupts
2024 ± 20% -73.2% 542.17 ± 79% interrupts.CPU4.CAL:Function_call_interrupts
29504 ± 4% +249.3% 103064 ± 4% interrupts.CPU4.RES:Rescheduling_interrupts
3371 ±111% -85.3% 496.67 ± 65% interrupts.CPU40.CAL:Function_call_interrupts
29374 ± 7% +222.0% 94573 ± 5% interrupts.CPU40.RES:Rescheduling_interrupts
3558 ±114% -90.3% 345.67 ± 42% interrupts.CPU41.CAL:Function_call_interrupts
29469 ± 5% +219.9% 94267 ± 6% interrupts.CPU41.RES:Rescheduling_interrupts
3039 ± 92% -88.5% 349.83 ± 42% interrupts.CPU42.CAL:Function_call_interrupts
30316 ± 6% +219.6% 96895 ± 6% interrupts.CPU42.RES:Rescheduling_interrupts
2564 ± 86% -86.0% 359.33 ± 36% interrupts.CPU43.CAL:Function_call_interrupts
29890 ± 10% +215.3% 94254 ± 6% interrupts.CPU43.RES:Rescheduling_interrupts
3165 ±108% -83.5% 521.00 ± 62% interrupts.CPU44.CAL:Function_call_interrupts
29111 ± 2% +240.2% 99026 ± 8% interrupts.CPU44.RES:Rescheduling_interrupts
3290 ±105% -87.9% 398.83 ± 32% interrupts.CPU45.CAL:Function_call_interrupts
123.83 ± 13% -44.0% 69.33 ± 39% interrupts.CPU45.IWI:IRQ_work_interrupts
30439 ± 3% +241.1% 103830 ± 10% interrupts.CPU45.RES:Rescheduling_interrupts
2685 ± 60% -85.7% 385.17 ± 29% interrupts.CPU46.CAL:Function_call_interrupts
30271 ± 4% +226.9% 98955 ± 5% interrupts.CPU46.RES:Rescheduling_interrupts
2460 ± 61% -83.6% 402.83 ± 39% interrupts.CPU47.CAL:Function_call_interrupts
24896 ± 17% +309.5% 101944 ± 2% interrupts.CPU47.RES:Rescheduling_interrupts
4131 ± 56% -91.7% 342.17 ± 71% interrupts.CPU48.CAL:Function_call_interrupts
229.33 ± 10% -67.8% 73.83 ± 29% interrupts.CPU48.IWI:IRQ_work_interrupts
32291 ± 6% +213.0% 101059 ± 8% interrupts.CPU48.RES:Rescheduling_interrupts
2975 ± 65% -82.3% 527.83 ±117% interrupts.CPU49.CAL:Function_call_interrupts
25835 ± 3% +256.4% 92091 ± 4% interrupts.CPU49.RES:Rescheduling_interrupts
4078 ±120% -92.4% 308.50 ± 51% interrupts.CPU5.CAL:Function_call_interrupts
29451 ± 7% +265.0% 107495 ± 7% interrupts.CPU5.RES:Rescheduling_interrupts
2088 ± 16% -83.9% 335.67 ± 39% interrupts.CPU50.CAL:Function_call_interrupts
183.00 ± 8% -44.4% 101.67 ± 31% interrupts.CPU50.IWI:IRQ_work_interrupts
25789 ± 4% +233.4% 85992 ± 6% interrupts.CPU50.RES:Rescheduling_interrupts
2140 ± 30% -81.7% 392.17 ± 54% interrupts.CPU51.CAL:Function_call_interrupts
182.17 ± 21% -49.0% 92.83 ± 44% interrupts.CPU51.IWI:IRQ_work_interrupts
24006 ± 13% +280.3% 91303 ± 7% interrupts.CPU51.RES:Rescheduling_interrupts
2632 ± 59% -78.1% 577.50 ± 91% interrupts.CPU52.CAL:Function_call_interrupts
165.00 ± 16% -32.7% 111.00 ± 37% interrupts.CPU52.IWI:IRQ_work_interrupts
25342 ± 6% +231.4% 83975 ± 7% interrupts.CPU52.RES:Rescheduling_interrupts
3081 ± 83% -89.7% 316.83 ± 51% interrupts.CPU53.CAL:Function_call_interrupts
214.67 ± 19% -60.8% 84.17 ± 37% interrupts.CPU53.IWI:IRQ_work_interrupts
25636 ± 6% +214.3% 80579 ± 7% interrupts.CPU53.RES:Rescheduling_interrupts
3515 ± 92% -87.2% 451.33 ± 64% interrupts.CPU54.CAL:Function_call_interrupts
187.67 ± 8% -49.6% 94.50 ± 35% interrupts.CPU54.IWI:IRQ_work_interrupts
26324 ± 12% +225.6% 85723 ± 5% interrupts.CPU54.RES:Rescheduling_interrupts
2106 ± 52% -81.7% 386.33 ± 36% interrupts.CPU55.CAL:Function_call_interrupts
178.33 ± 26% -55.6% 79.17 ± 26% interrupts.CPU55.IWI:IRQ_work_interrupts
23170 ± 3% +254.3% 82097 ± 8% interrupts.CPU55.RES:Rescheduling_interrupts
2958 ± 63% -84.0% 473.67 ± 70% interrupts.CPU56.CAL:Function_call_interrupts
211.50 ± 11% -63.4% 77.33 ± 55% interrupts.CPU56.IWI:IRQ_work_interrupts
25318 ± 11% +230.2% 83588 ± 10% interrupts.CPU56.RES:Rescheduling_interrupts
3311 ± 78% -90.6% 312.33 ± 42% interrupts.CPU57.CAL:Function_call_interrupts
177.17 ± 21% -61.3% 68.50 ± 61% interrupts.CPU57.IWI:IRQ_work_interrupts
23748 ± 3% +247.5% 82534 ± 10% interrupts.CPU57.RES:Rescheduling_interrupts
2040 ± 22% -82.7% 352.17 ± 55% interrupts.CPU58.CAL:Function_call_interrupts
189.33 ± 12% -54.3% 86.50 ± 36% interrupts.CPU58.IWI:IRQ_work_interrupts
24089 ± 5% +246.4% 83445 ± 10% interrupts.CPU58.RES:Rescheduling_interrupts
3202 ± 65% -89.4% 340.00 ± 51% interrupts.CPU59.CAL:Function_call_interrupts
170.83 ± 22% -38.0% 106.00 ± 36% interrupts.CPU59.IWI:IRQ_work_interrupts
23171 ± 7% +268.2% 85320 ± 3% interrupts.CPU59.RES:Rescheduling_interrupts
3395 ± 82% -90.3% 330.83 ± 44% interrupts.CPU6.CAL:Function_call_interrupts
28920 ± 2% +262.1% 104724 ± 10% interrupts.CPU6.RES:Rescheduling_interrupts
2986 ± 91% -88.4% 345.67 ± 35% interrupts.CPU60.CAL:Function_call_interrupts
182.00 ± 10% -61.6% 69.83 ± 37% interrupts.CPU60.IWI:IRQ_work_interrupts
24300 ± 11% +252.2% 85587 ± 5% interrupts.CPU60.RES:Rescheduling_interrupts
2271 ± 53% -85.6% 326.33 ± 44% interrupts.CPU61.CAL:Function_call_interrupts
173.83 ± 23% -49.1% 88.50 ± 22% interrupts.CPU61.IWI:IRQ_work_interrupts
23463 ± 4% +245.5% 81067 ± 3% interrupts.CPU61.RES:Rescheduling_interrupts
2977 ± 89% -84.4% 464.67 ± 46% interrupts.CPU62.CAL:Function_call_interrupts
179.33 ± 13% -49.6% 90.33 ± 31% interrupts.CPU62.IWI:IRQ_work_interrupts
24321 ± 5% +226.2% 79323 ± 5% interrupts.CPU62.RES:Rescheduling_interrupts
2538 ± 55% -86.0% 355.83 ± 46% interrupts.CPU63.CAL:Function_call_interrupts
24810 ± 14% +235.4% 83214 ± 6% interrupts.CPU63.RES:Rescheduling_interrupts
2666 ± 64% -83.0% 452.67 ± 97% interrupts.CPU64.CAL:Function_call_interrupts
198.33 ± 7% -60.2% 79.00 ± 29% interrupts.CPU64.IWI:IRQ_work_interrupts
23899 ± 4% +245.8% 82634 ± 8% interrupts.CPU64.RES:Rescheduling_interrupts
4035 ±110% -92.0% 322.83 ± 45% interrupts.CPU65.CAL:Function_call_interrupts
188.17 ± 16% -49.8% 94.50 ± 30% interrupts.CPU65.IWI:IRQ_work_interrupts
23838 ± 6% +236.2% 80137 ± 9% interrupts.CPU65.RES:Rescheduling_interrupts
3209 ± 99% -88.6% 364.50 ± 43% interrupts.CPU66.CAL:Function_call_interrupts
22598 ± 7% +231.5% 74917 ± 6% interrupts.CPU66.RES:Rescheduling_interrupts
4930 ±117% -92.0% 392.00 ± 58% interrupts.CPU67.CAL:Function_call_interrupts
24093 ± 13% +224.9% 78279 ± 10% interrupts.CPU67.RES:Rescheduling_interrupts
4320 ±133% -89.1% 471.33 ± 70% interrupts.CPU68.CAL:Function_call_interrupts
172.83 ± 12% -60.5% 68.33 ± 59% interrupts.CPU68.IWI:IRQ_work_interrupts
22871 ± 8% +221.3% 73481 ± 10% interrupts.CPU68.RES:Rescheduling_interrupts
2078 ± 27% -81.9% 376.83 ± 51% interrupts.CPU69.CAL:Function_call_interrupts
148.33 ± 32% -52.4% 70.67 ± 64% interrupts.CPU69.IWI:IRQ_work_interrupts
22078 ± 10% +239.7% 75006 ± 11% interrupts.CPU69.RES:Rescheduling_interrupts
1881 ± 31% -82.5% 329.17 ± 44% interrupts.CPU7.CAL:Function_call_interrupts
29726 ± 6% +255.6% 105711 ± 7% interrupts.CPU7.RES:Rescheduling_interrupts
2814 ± 81% -86.1% 391.33 ± 38% interrupts.CPU70.CAL:Function_call_interrupts
143.67 ± 11% -50.3% 71.33 ± 36% interrupts.CPU70.IWI:IRQ_work_interrupts
22063 ± 9% +239.2% 74844 ± 9% interrupts.CPU70.RES:Rescheduling_interrupts
1918 ± 14% -79.6% 391.83 ± 57% interrupts.CPU71.CAL:Function_call_interrupts
23347 ± 12% +210.1% 72393 ± 5% interrupts.CPU71.RES:Rescheduling_interrupts
3504 ± 88% -90.0% 351.33 ± 44% interrupts.CPU72.CAL:Function_call_interrupts
200.17 ± 17% -60.5% 79.00 ± 50% interrupts.CPU72.IWI:IRQ_work_interrupts
28435 ± 5% +236.2% 95606 ± 4% interrupts.CPU72.RES:Rescheduling_interrupts
3326 ± 88% -89.7% 342.33 ± 42% interrupts.CPU73.CAL:Function_call_interrupts
205.33 ± 6% -58.1% 86.00 ± 24% interrupts.CPU73.IWI:IRQ_work_interrupts
26276 ± 8% +236.3% 88373 ± 5% interrupts.CPU73.RES:Rescheduling_interrupts
3511 ± 96% -88.8% 393.33 ± 38% interrupts.CPU74.CAL:Function_call_interrupts
198.00 ± 18% -54.8% 89.50 ± 30% interrupts.CPU74.IWI:IRQ_work_interrupts
26063 ± 5% +216.5% 82482 ± 4% interrupts.CPU74.RES:Rescheduling_interrupts
3334 ± 96% -85.6% 479.33 ± 51% interrupts.CPU75.CAL:Function_call_interrupts
209.33 ± 10% -53.0% 98.33 ± 21% interrupts.CPU75.IWI:IRQ_work_interrupts
24945 ± 6% +218.2% 79367 ± 10% interrupts.CPU75.RES:Rescheduling_interrupts
2969 ±103% -86.5% 401.17 ± 33% interrupts.CPU76.CAL:Function_call_interrupts
187.17 ± 13% -58.6% 77.50 ± 25% interrupts.CPU76.IWI:IRQ_work_interrupts
24388 ± 4% +220.9% 78254 ± 5% interrupts.CPU76.RES:Rescheduling_interrupts
3323 ± 97% -87.8% 406.67 ± 40% interrupts.CPU77.CAL:Function_call_interrupts
189.17 ± 11% -54.0% 87.00 ± 42% interrupts.CPU77.IWI:IRQ_work_interrupts
6865 ± 8% -34.2% 4515 ± 26% interrupts.CPU77.NMI:Non-maskable_interrupts
6865 ± 8% -34.2% 4515 ± 26% interrupts.CPU77.PMI:Performance_monitoring_interrupts
24997 ± 16% +222.7% 80678 ± 13% interrupts.CPU77.RES:Rescheduling_interrupts
2755 ± 75% -85.4% 401.67 ± 36% interrupts.CPU78.CAL:Function_call_interrupts
189.83 ± 13% -53.9% 87.50 ± 19% interrupts.CPU78.IWI:IRQ_work_interrupts
23071 ± 9% +232.0% 76599 ± 15% interrupts.CPU78.RES:Rescheduling_interrupts
2612 ± 83% -85.6% 376.00 ± 35% interrupts.CPU79.CAL:Function_call_interrupts
195.17 ± 21% -55.0% 87.83 ± 35% interrupts.CPU79.IWI:IRQ_work_interrupts
23168 ± 3% +250.4% 81181 ± 11% interrupts.CPU79.RES:Rescheduling_interrupts
3874 ±125% -90.6% 362.50 ± 43% interrupts.CPU8.CAL:Function_call_interrupts
25471 ± 10% +293.0% 100101 ± 7% interrupts.CPU8.RES:Rescheduling_interrupts
4028 ±117% -90.9% 367.33 ± 44% interrupts.CPU80.CAL:Function_call_interrupts
6621 ± 13% -35.9% 4241 ± 37% interrupts.CPU80.NMI:Non-maskable_interrupts
6621 ± 13% -35.9% 4241 ± 37% interrupts.CPU80.PMI:Performance_monitoring_interrupts
21879 ± 3% +251.2% 76830 ± 15% interrupts.CPU80.RES:Rescheduling_interrupts
2563 ± 60% -85.6% 370.17 ± 33% interrupts.CPU81.CAL:Function_call_interrupts
191.83 ± 26% -73.4% 51.00 ± 47% interrupts.CPU81.IWI:IRQ_work_interrupts
22362 ± 9% +252.6% 78849 ± 13% interrupts.CPU81.RES:Rescheduling_interrupts
2382 ± 75% -84.0% 382.00 ± 40% interrupts.CPU82.CAL:Function_call_interrupts
172.67 ± 17% -54.1% 79.33 ± 52% interrupts.CPU82.IWI:IRQ_work_interrupts
6795 ± 18% -54.0% 3123 ± 30% interrupts.CPU82.NMI:Non-maskable_interrupts
6795 ± 18% -54.0% 3123 ± 30% interrupts.CPU82.PMI:Performance_monitoring_interrupts
21096 ± 19% +273.9% 78881 ± 2% interrupts.CPU82.RES:Rescheduling_interrupts
3244 ± 85% -88.3% 379.83 ± 36% interrupts.CPU83.CAL:Function_call_interrupts
186.83 ± 18% -63.8% 67.67 ± 38% interrupts.CPU83.IWI:IRQ_work_interrupts
22935 ± 6% +244.3% 78970 ± 11% interrupts.CPU83.RES:Rescheduling_interrupts
2873 ± 97% -88.0% 344.50 ± 43% interrupts.CPU84.CAL:Function_call_interrupts
189.50 ± 16% -55.1% 85.17 ± 20% interrupts.CPU84.IWI:IRQ_work_interrupts
22009 ± 6% +260.3% 79292 ± 10% interrupts.CPU84.RES:Rescheduling_interrupts
2744 ± 73% -84.0% 438.67 ± 41% interrupts.CPU85.CAL:Function_call_interrupts
198.83 ± 14% -53.1% 93.17 ± 40% interrupts.CPU85.IWI:IRQ_work_interrupts
22470 ± 7% +264.9% 81997 ± 11% interrupts.CPU85.RES:Rescheduling_interrupts
3106 ± 88% -87.8% 378.67 ± 38% interrupts.CPU86.CAL:Function_call_interrupts
198.00 ± 12% -45.6% 107.67 ± 26% interrupts.CPU86.IWI:IRQ_work_interrupts
21961 ± 6% +243.7% 75483 ± 7% interrupts.CPU86.RES:Rescheduling_interrupts
2896 ± 81% -88.1% 344.17 ± 39% interrupts.CPU87.CAL:Function_call_interrupts
175.67 ± 11% -45.4% 96.00 ± 37% interrupts.CPU87.IWI:IRQ_work_interrupts
23199 ± 6% +227.3% 75922 ± 13% interrupts.CPU87.RES:Rescheduling_interrupts
2375 ± 29% -86.0% 332.00 ± 40% interrupts.CPU88.CAL:Function_call_interrupts
184.50 ± 20% -61.7% 70.67 ± 39% interrupts.CPU88.IWI:IRQ_work_interrupts
23301 ± 8% +241.3% 79534 ± 13% interrupts.CPU88.RES:Rescheduling_interrupts
3240 ± 98% -89.5% 341.83 ± 45% interrupts.CPU89.CAL:Function_call_interrupts
180.00 ± 9% -63.4% 65.83 ± 37% interrupts.CPU89.IWI:IRQ_work_interrupts
22757 ± 5% +246.3% 78811 ± 3% interrupts.CPU89.RES:Rescheduling_interrupts
2188 ± 51% -81.7% 401.00 ± 40% interrupts.CPU9.CAL:Function_call_interrupts
27998 ± 5% +249.9% 97965 ± 8% interrupts.CPU9.RES:Rescheduling_interrupts
2449 ± 65% -84.9% 368.83 ± 42% interrupts.CPU90.CAL:Function_call_interrupts
157.00 ± 26% -52.4% 74.67 ± 21% interrupts.CPU90.IWI:IRQ_work_interrupts
21244 ± 8% +264.3% 77400 ± 4% interrupts.CPU90.RES:Rescheduling_interrupts
3782 ±109% -89.3% 403.33 ± 40% interrupts.CPU91.CAL:Function_call_interrupts
174.50 ± 15% -53.4% 81.33 ± 19% interrupts.CPU91.IWI:IRQ_work_interrupts
22532 ± 14% +249.5% 78751 ± 5% interrupts.CPU91.RES:Rescheduling_interrupts
2528 ± 70% -85.5% 366.83 ± 36% interrupts.CPU92.CAL:Function_call_interrupts
180.17 ± 18% -40.7% 106.83 ± 24% interrupts.CPU92.IWI:IRQ_work_interrupts
23823 ± 15% +198.1% 71009 ± 8% interrupts.CPU92.RES:Rescheduling_interrupts
3253 ±116% -84.2% 512.67 ± 48% interrupts.CPU93.CAL:Function_call_interrupts
173.33 ± 17% -47.3% 91.33 ± 32% interrupts.CPU93.IWI:IRQ_work_interrupts
22272 ± 8% +211.7% 69420 ± 9% interrupts.CPU93.RES:Rescheduling_interrupts
2250 ± 60% -84.7% 343.50 ± 15% interrupts.CPU94.CAL:Function_call_interrupts
160.00 ± 16% -62.6% 59.83 ± 38% interrupts.CPU94.IWI:IRQ_work_interrupts
21853 ± 10% +224.7% 70960 ± 4% interrupts.CPU94.RES:Rescheduling_interrupts
1984 ± 75% -82.7% 342.83 ± 42% interrupts.CPU95.CAL:Function_call_interrupts
15688 ± 29% +281.0% 59770 ± 5% interrupts.CPU95.RES:Rescheduling_interrupts
14515 -38.0% 8992 interrupts.IWI:IRQ_work_interrupts
2517985 +240.2% 8565028 ? 2% interrupts.RES:Rescheduling_interrupts
53.22 -30.5 22.69 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
51.71 -29.6 22.06 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
33.23 -18.6 14.60 perf-profile.calltrace.cycles-pp.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
32.99 -18.5 14.49 perf-profile.calltrace.cycles-pp.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
31.92 -17.9 14.00 perf-profile.calltrace.cycles-pp.sock_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
31.31 -17.6 13.73 perf-profile.calltrace.cycles-pp.udp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64
27.57 -15.3 12.28 perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
22.93 -12.6 10.29 perf-profile.calltrace.cycles-pp.udp_send_skb.udp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto
22.55 -12.4 10.14 perf-profile.calltrace.cycles-pp.ip_send_skb.udp_send_skb.udp_sendmsg.sock_sendmsg.__sys_sendto
22.14 -12.2 9.97 perf-profile.calltrace.cycles-pp.ip_output.ip_send_skb.udp_send_skb.udp_sendmsg.sock_sendmsg
21.27 -11.7 9.60 perf-profile.calltrace.cycles-pp.ip_finish_output2.ip_output.ip_send_skb.udp_send_skb.udp_sendmsg
18.48 -11.1 7.38 perf-profile.calltrace.cycles-pp.__local_bh_enable_ip.ip_finish_output2.ip_output.ip_send_skb.udp_send_skb
18.30 -11.0 7.30 perf-profile.calltrace.cycles-pp.do_softirq.__local_bh_enable_ip.ip_finish_output2.ip_output.ip_send_skb
18.06 -10.9 7.20 perf-profile.calltrace.cycles-pp.__softirqentry_text_start.do_softirq.__local_bh_enable_ip.ip_finish_output2.ip_output
17.97 -10.8 7.22 perf-profile.calltrace.cycles-pp.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe
17.79 -10.6 7.14 perf-profile.calltrace.cycles-pp.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe
17.01 -10.1 6.91 perf-profile.calltrace.cycles-pp.net_rx_action.__softirqentry_text_start.do_softirq.__local_bh_enable_ip.ip_finish_output2
16.70 -9.9 6.79 perf-profile.calltrace.cycles-pp.__napi_poll.net_rx_action.__softirqentry_text_start.do_softirq.__local_bh_enable_ip
16.60 -9.9 6.75 perf-profile.calltrace.cycles-pp.process_backlog.__napi_poll.net_rx_action.__softirqentry_text_start.do_softirq
16.11 -9.7 6.43 perf-profile.calltrace.cycles-pp.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe
16.03 -9.6 6.40 perf-profile.calltrace.cycles-pp.udp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64
15.05 -8.7 6.32 perf-profile.calltrace.cycles-pp.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action.__softirqentry_text_start
14.75 -8.6 6.20 perf-profile.calltrace.cycles-pp.ip_rcv.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action
14.40 -8.3 6.05 perf-profile.calltrace.cycles-pp.ip_local_deliver.ip_rcv.__netif_receive_skb_one_core.process_backlog.__napi_poll
14.34 -8.3 6.03 perf-profile.calltrace.cycles-pp.ip_local_deliver_finish.ip_local_deliver.ip_rcv.__netif_receive_skb_one_core.process_backlog
14.29 -8.3 6.01 perf-profile.calltrace.cycles-pp.ip_protocol_deliver_rcu.ip_local_deliver_finish.ip_local_deliver.ip_rcv.__netif_receive_skb_one_core
14.19 -8.2 5.97 perf-profile.calltrace.cycles-pp.__udp4_lib_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.ip_local_deliver.ip_rcv
13.54 -7.9 5.68 perf-profile.calltrace.cycles-pp.udp_unicast_rcv_skb.__udp4_lib_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.ip_local_deliver
13.44 -7.8 5.64 perf-profile.calltrace.cycles-pp.udp_queue_rcv_one_skb.udp_unicast_rcv_skb.__udp4_lib_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish
12.20 -7.1 5.14 perf-profile.calltrace.cycles-pp.__udp_enqueue_schedule_skb.udp_queue_rcv_one_skb.udp_unicast_rcv_skb.__udp4_lib_rcv.ip_protocol_deliver_rcu
11.60 -7.0 4.55 perf-profile.calltrace.cycles-pp.__skb_recv_udp.udp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
11.57 -6.7 4.87 perf-profile.calltrace.cycles-pp.sock_def_readable.__udp_enqueue_schedule_skb.udp_queue_rcv_one_skb.udp_unicast_rcv_skb.__udp4_lib_rcv
11.13 -6.4 4.70 perf-profile.calltrace.cycles-pp.__wake_up_common_lock.sock_def_readable.__udp_enqueue_schedule_skb.udp_queue_rcv_one_skb.udp_unicast_rcv_skb
10.27 -6.0 4.28 perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.sock_def_readable.__udp_enqueue_schedule_skb.udp_queue_rcv_one_skb
9.82 -5.7 4.10 perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.sock_def_readable.__udp_enqueue_schedule_skb
9.68 -5.6 4.04 perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.sock_def_readable
9.09 -5.5 3.55 perf-profile.calltrace.cycles-pp.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg.inet_recvmsg.__sys_recvfrom
7.80 -4.5 3.28 perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
7.20 -4.4 2.79 perf-profile.calltrace.cycles-pp.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg.inet_recvmsg
7.55 -4.4 3.17 perf-profile.calltrace.cycles-pp.__schedule.schedule_idle.do_idle.cpu_startup_entry.start_secondary
6.73 -4.1 2.60 perf-profile.calltrace.cycles-pp.schedule.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg
6.50 -4.0 2.50 perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp
6.24 -3.7 2.52 perf-profile.calltrace.cycles-pp.ip_make_skb.udp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto
3.91 -2.4 1.51 perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule_idle.do_idle.cpu_startup_entry
3.99 -2.4 1.59 perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
3.90 -2.3 1.57 perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common
3.78 -2.2 1.54 perf-profile.calltrace.cycles-pp.__ip_append_data.ip_make_skb.udp_sendmsg.sock_sendmsg.__sys_sendto
3.35 -2.1 1.22 perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.schedule_timeout.__skb_wait_for_more_packets
3.14 -1.9 1.20 perf-profile.calltrace.cycles-pp.set_next_entity.pick_next_task_fair.__schedule.schedule_idle.do_idle
2.99 -1.9 1.08 perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__schedule.schedule.schedule_timeout
2.79 -1.7 1.12 perf-profile.calltrace.cycles-pp.sock_alloc_send_pskb.__ip_append_data.ip_make_skb.udp_sendmsg.sock_sendmsg
2.82 -1.6 1.18 perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
2.27 -1.4 0.92 perf-profile.calltrace.cycles-pp.alloc_skb_with_frags.sock_alloc_send_pskb.__ip_append_data.ip_make_skb.udp_sendmsg
2.25 ± 3% -1.4 0.90 ± 3% perf-profile.calltrace.cycles-pp.__ip_make_skb.ip_make_skb.udp_sendmsg.sock_sendmsg.__sys_sendto
2.18 -1.3 0.88 perf-profile.calltrace.cycles-pp.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.__ip_append_data.ip_make_skb
1.94 ± 3% -1.2 0.76 ± 3% perf-profile.calltrace.cycles-pp.__ip_select_ident.__ip_make_skb.ip_make_skb.udp_sendmsg.sock_sendmsg
2.01 -1.0 0.96 ± 3% perf-profile.calltrace.cycles-pp.menu_select.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
1.70 ± 4% -1.0 0.66 ± 4% perf-profile.calltrace.cycles-pp.ip_idents_reserve.__ip_select_ident.__ip_make_skb.ip_make_skb.udp_sendmsg
1.64 -1.0 0.66 perf-profile.calltrace.cycles-pp._copy_to_iter.udp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
1.47 -0.9 0.57 ± 2% perf-profile.calltrace.cycles-pp.__consume_stateless_skb.udp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
1.34 -0.8 0.54 perf-profile.calltrace.cycles-pp.update_load_avg.set_next_entity.pick_next_task_fair.__schedule.schedule_idle
1.41 -0.7 0.69 ± 2% perf-profile.calltrace.cycles-pp.update_rq_clock.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
1.27 ± 4% -0.7 0.57 perf-profile.calltrace.cycles-pp.ip_route_output_flow.udp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto
1.23 ± 4% -0.7 0.54 ± 2% perf-profile.calltrace.cycles-pp.ip_route_output_key_hash.ip_route_output_flow.udp_sendmsg.sock_sendmsg.__sys_sendto
2.44 ± 4% -0.4 2.08 ± 3% perf-profile.calltrace.cycles-pp.__dev_queue_xmit.ip_finish_output2.ip_output.ip_send_skb.udp_send_skb
0.00 +0.6 0.56 ± 4% perf-profile.calltrace.cycles-pp.read_hpet.ktime_get.sched_clock_tick.scheduler_tick.update_process_times
0.00 +0.6 0.56 ± 2% perf-profile.calltrace.cycles-pp.read_hpet.ktime_get.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt
0.00 +0.6 0.58 ± 3% perf-profile.calltrace.cycles-pp.ktime_get.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt
0.00 +0.6 0.59 ± 6% perf-profile.calltrace.cycles-pp.ktime_get.sched_clock_tick.scheduler_tick.update_process_times.tick_sched_handle
0.00 +0.6 0.60 ± 5% perf-profile.calltrace.cycles-pp.sched_clock_tick.scheduler_tick.update_process_times.tick_sched_handle.tick_sched_timer
29.82 +0.6 30.44 perf-profile.calltrace.cycles-pp.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
0.00 +0.6 0.63 ± 2% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.tick_nohz_idle_exit
0.00 +0.6 0.64 ± 2% perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.tick_nohz_idle_exit.do_idle
0.00 +0.6 0.64 perf-profile.calltrace.cycles-pp.sched_clock_cpu.update_rq_clock.try_to_wake_up.autoremove_wake_function.__wake_up_common
0.00 +0.7 0.66 ± 5% perf-profile.calltrace.cycles-pp.scheduler_tick.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues
0.00 +0.7 0.66 ± 2% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.tick_nohz_idle_exit.do_idle.cpu_startup_entry
29.65 +0.7 30.32 perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary
0.00 +0.7 0.67 ± 2% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.tick_nohz_idle_exit.do_idle.cpu_startup_entry.start_secondary
0.00 +0.7 0.70 ± 5% perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt
0.00 +0.7 0.71 ± 5% perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt
0.00 +0.7 0.74 ± 2% perf-profile.calltrace.cycles-pp.read_hpet.ktime_get.tick_nohz_irq_exit.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.00 +0.8 0.75 ± 2% perf-profile.calltrace.cycles-pp.ktime_get.tick_nohz_irq_exit.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
0.00 +0.8 0.76 ± 2% perf-profile.calltrace.cycles-pp.tick_nohz_irq_exit.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
0.00 +1.0 1.00 perf-profile.calltrace.cycles-pp.read_hpet.ktime_get.tick_irq_enter.irq_enter_rcu.sysvec_apic_timer_interrupt
0.00 +1.0 1.01 perf-profile.calltrace.cycles-pp.ktime_get.tick_irq_enter.irq_enter_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.19 ?141% +1.0 1.21 ? 5% perf-profile.calltrace.cycles-pp.netif_rx_internal.netif_rx.loopback_xmit.dev_hard_start_xmit.__dev_queue_xmit
0.20 ?141% +1.0 1.22 ? 6% perf-profile.calltrace.cycles-pp.netif_rx.loopback_xmit.dev_hard_start_xmit.__dev_queue_xmit.ip_finish_output2
0.00 +1.0 1.04 ? 6% perf-profile.calltrace.cycles-pp.read_hpet.ktime_get_with_offset.netif_rx_internal.netif_rx.loopback_xmit
0.00 +1.0 1.04 perf-profile.calltrace.cycles-pp.read_hpet.ktime_get_update_offsets_now.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt
0.00 +1.1 1.07 ? 6% perf-profile.calltrace.cycles-pp.ktime_get_with_offset.netif_rx_internal.netif_rx.loopback_xmit.dev_hard_start_xmit
0.00 +1.1 1.07 perf-profile.calltrace.cycles-pp.read_hpet.ktime_get.clockevents_program_event.hrtimer_interrupt.__sysvec_apic_timer_interrupt
0.00 +1.1 1.07 perf-profile.calltrace.cycles-pp.ktime_get_update_offsets_now.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.00 +1.1 1.08 perf-profile.calltrace.cycles-pp.read_hpet.ktime_get.sched_clock_tick.sched_clock_idle_wakeup_event.tick_irq_enter
0.00 +1.1 1.09 perf-profile.calltrace.cycles-pp.ktime_get.sched_clock_tick.sched_clock_idle_wakeup_event.tick_irq_enter.irq_enter_rcu
0.00 +1.1 1.10 perf-profile.calltrace.cycles-pp.sched_clock_tick.sched_clock_idle_wakeup_event.tick_irq_enter.irq_enter_rcu.sysvec_apic_timer_interrupt
0.00 +1.1 1.11 perf-profile.calltrace.cycles-pp.sched_clock_idle_wakeup_event.tick_irq_enter.irq_enter_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.00 +1.1 1.12 ? 2% perf-profile.calltrace.cycles-pp.ktime_get.clockevents_program_event.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt
0.00 +1.2 1.16 ? 2% perf-profile.calltrace.cycles-pp.clockevents_program_event.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.00 +1.3 1.30 ? 4% perf-profile.calltrace.cycles-pp.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt
0.00 +1.8 1.78 ? 2% perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.00 +2.1 2.14 perf-profile.calltrace.cycles-pp.tick_irq_enter.irq_enter_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
0.00 +2.1 2.14 perf-profile.calltrace.cycles-pp.irq_enter_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
0.00 +4.0 4.05 perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
0.00 +4.1 4.08 perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
0.60 ? 2% +6.5 7.14 perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.do_idle
0.67 ? 2% +6.6 7.23 perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
0.00 +8.4 8.43 perf-profile.calltrace.cycles-pp.read_hpet.ktime_get.tick_nohz_idle_enter.do_idle.cpu_startup_entry
0.00 +8.5 8.53 perf-profile.calltrace.cycles-pp.ktime_get.tick_nohz_idle_enter.do_idle.cpu_startup_entry.start_secondary
0.34 ? 70% +8.7 9.08 perf-profile.calltrace.cycles-pp.tick_nohz_idle_enter.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
0.00 +9.2 9.21 perf-profile.calltrace.cycles-pp.read_hpet.ktime_get.sched_clock_tick.sched_clock_idle_wakeup_event.cpuidle_enter_state
0.00 +9.3 9.32 perf-profile.calltrace.cycles-pp.ktime_get.sched_clock_tick.sched_clock_idle_wakeup_event.cpuidle_enter_state.cpuidle_enter
0.00 +9.4 9.40 perf-profile.calltrace.cycles-pp.sched_clock_tick.sched_clock_idle_wakeup_event.cpuidle_enter_state.cpuidle_enter.do_idle
0.00 +9.7 9.70 perf-profile.calltrace.cycles-pp.sched_clock_idle_wakeup_event.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
0.00 +14.6 14.60 perf-profile.calltrace.cycles-pp.read_hpet.ktime_get.tick_nohz_idle_exit.do_idle.cpu_startup_entry
0.00 +14.7 14.68 perf-profile.calltrace.cycles-pp.read_hpet.ktime_get.sched_clock_tick.sched_clock_idle_wakeup_event.tick_nohz_idle_exit
0.00 +14.7 14.69 perf-profile.calltrace.cycles-pp.ktime_get.tick_nohz_idle_exit.do_idle.cpu_startup_entry.start_secondary
0.00 +14.8 14.76 perf-profile.calltrace.cycles-pp.ktime_get.sched_clock_tick.sched_clock_idle_wakeup_event.tick_nohz_idle_exit.do_idle
0.00 +14.8 14.84 perf-profile.calltrace.cycles-pp.sched_clock_tick.sched_clock_idle_wakeup_event.tick_nohz_idle_exit.do_idle.cpu_startup_entry
0.00 +15.1 15.06 perf-profile.calltrace.cycles-pp.sched_clock_idle_wakeup_event.tick_nohz_idle_exit.do_idle.cpu_startup_entry.start_secondary
1.07 +29.7 30.73 perf-profile.calltrace.cycles-pp.tick_nohz_idle_exit.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
43.21 +32.1 75.30 perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
43.16 +32.1 75.27 perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
42.98 +32.2 75.20 perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
43.48 +32.4 75.89 perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
53.30 -30.5 22.75 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
51.78 -29.7 22.12 perf-profile.children.cycles-pp.do_syscall_64
33.28 -18.7 14.62 perf-profile.children.cycles-pp.__x64_sys_sendto
33.02 -18.5 14.50 perf-profile.children.cycles-pp.__sys_sendto
31.95 -17.9 14.01 perf-profile.children.cycles-pp.sock_sendmsg
31.39 -17.6 13.76 perf-profile.children.cycles-pp.udp_sendmsg
27.74 -15.4 12.38 perf-profile.children.cycles-pp.intel_idle
22.95 -12.7 10.29 perf-profile.children.cycles-pp.udp_send_skb
22.57 -12.4 10.15 perf-profile.children.cycles-pp.ip_send_skb
22.16 -12.2 9.98 perf-profile.children.cycles-pp.ip_output
21.29 -11.7 9.61 perf-profile.children.cycles-pp.ip_finish_output2
18.57 -11.2 7.42 perf-profile.children.cycles-pp.__local_bh_enable_ip
18.33 -11.0 7.31 perf-profile.children.cycles-pp.do_softirq
18.24 -10.9 7.36 perf-profile.children.cycles-pp.__softirqentry_text_start
18.02 -10.8 7.25 perf-profile.children.cycles-pp.__x64_sys_recvfrom
17.82 -10.7 7.16 perf-profile.children.cycles-pp.__sys_recvfrom
17.05 -10.1 6.93 perf-profile.children.cycles-pp.net_rx_action
16.72 -9.9 6.79 perf-profile.children.cycles-pp.__napi_poll
16.63 -9.9 6.76 perf-profile.children.cycles-pp.process_backlog
16.11 -9.7 6.44 perf-profile.children.cycles-pp.inet_recvmsg
16.06 -9.7 6.41 perf-profile.children.cycles-pp.udp_recvmsg
15.07 -8.7 6.33 perf-profile.children.cycles-pp.__netif_receive_skb_one_core
14.77 -8.6 6.21 perf-profile.children.cycles-pp.ip_rcv
14.25 -8.5 5.77 perf-profile.children.cycles-pp.__schedule
14.41 -8.4 6.06 perf-profile.children.cycles-pp.ip_local_deliver
14.36 -8.3 6.04 perf-profile.children.cycles-pp.ip_local_deliver_finish
14.31 -8.3 6.02 perf-profile.children.cycles-pp.ip_protocol_deliver_rcu
14.20 -8.2 5.97 perf-profile.children.cycles-pp.__udp4_lib_rcv
13.55 -7.9 5.69 perf-profile.children.cycles-pp.udp_unicast_rcv_skb
13.47 -7.8 5.65 perf-profile.children.cycles-pp.udp_queue_rcv_one_skb
11.66 -7.1 4.58 perf-profile.children.cycles-pp.__skb_recv_udp
12.21 -7.1 5.15 perf-profile.children.cycles-pp.__udp_enqueue_schedule_skb
11.58 -6.7 4.88 perf-profile.children.cycles-pp.sock_def_readable
11.16 -6.4 4.71 perf-profile.children.cycles-pp.__wake_up_common_lock
10.28 -6.0 4.28 perf-profile.children.cycles-pp.__wake_up_common
9.84 -5.7 4.11 perf-profile.children.cycles-pp.autoremove_wake_function
9.74 -5.7 4.07 perf-profile.children.cycles-pp.try_to_wake_up
9.11 -5.5 3.56 perf-profile.children.cycles-pp.__skb_wait_for_more_packets
7.87 -4.6 3.32 perf-profile.children.cycles-pp.schedule_idle
7.22 -4.4 2.80 perf-profile.children.cycles-pp.schedule_timeout
6.74 -4.1 2.61 perf-profile.children.cycles-pp.schedule
6.26 -3.7 2.52 perf-profile.children.cycles-pp.ip_make_skb
4.24 -2.6 1.66 perf-profile.children.cycles-pp.pick_next_task_fair
4.01 -2.4 1.60 perf-profile.children.cycles-pp.ttwu_do_activate
3.92 -2.3 1.57 perf-profile.children.cycles-pp.enqueue_task_fair
3.80 -2.3 1.54 perf-profile.children.cycles-pp.__ip_append_data
3.38 -2.1 1.23 ? 2% perf-profile.children.cycles-pp.dequeue_task_fair
3.21 -2.0 1.24 perf-profile.children.cycles-pp.set_next_entity
3.06 -1.9 1.11 ? 2% perf-profile.children.cycles-pp.dequeue_entity
3.23 -1.9 1.35 perf-profile.children.cycles-pp.update_load_avg
2.89 -1.9 1.02 perf-profile.children.cycles-pp._raw_spin_lock
2.91 -1.7 1.22 perf-profile.children.cycles-pp.enqueue_entity
2.80 -1.7 1.13 perf-profile.children.cycles-pp.sock_alloc_send_pskb
2.47 -1.5 1.01 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
2.28 ? 3% -1.4 0.91 ? 3% perf-profile.children.cycles-pp.__ip_make_skb
2.28 -1.4 0.92 perf-profile.children.cycles-pp.alloc_skb_with_frags
2.38 -1.3 1.05 perf-profile.children.cycles-pp.update_rq_clock
2.22 -1.3 0.89 perf-profile.children.cycles-pp.__alloc_skb
1.95 ? 4% -1.2 0.76 ? 3% perf-profile.children.cycles-pp.__ip_select_ident
2.06 -1.1 0.98 ? 2% perf-profile.children.cycles-pp.menu_select
1.64 -1.1 0.58 perf-profile.children.cycles-pp.update_curr
1.70 ? 4% -1.0 0.66 ? 3% perf-profile.children.cycles-pp.ip_idents_reserve
1.66 -1.0 0.67 perf-profile.children.cycles-pp._copy_to_iter
1.47 -0.9 0.57 ? 2% perf-profile.children.cycles-pp.__consume_stateless_skb
1.28 ? 4% -0.7 0.57 perf-profile.children.cycles-pp.ip_route_output_flow
1.20 -0.7 0.48 perf-profile.children.cycles-pp.copy_user_generic_unrolled
1.25 ? 4% -0.7 0.55 ? 2% perf-profile.children.cycles-pp.ip_route_output_key_hash
1.09 -0.6 0.44 ? 2% perf-profile.children.cycles-pp.copyout
1.08 -0.6 0.43 ? 3% perf-profile.children.cycles-pp.__switch_to
1.08 -0.6 0.46 ? 2% perf-profile.children.cycles-pp.sock_wfree
0.98 -0.6 0.38 ? 2% perf-profile.children.cycles-pp.__slab_free
1.02 -0.6 0.42 perf-profile.children.cycles-pp.select_task_rq_fair
1.05 ? 5% -0.6 0.47 ? 2% perf-profile.children.cycles-pp.ip_route_output_key_hash_rcu
1.07 -0.6 0.49 ? 4% perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
0.92 ? 2% -0.6 0.37 ? 2% perf-profile.children.cycles-pp.__ksize
0.88 -0.5 0.34 ? 2% perf-profile.children.cycles-pp.prepare_to_wait_exclusive
0.91 -0.5 0.38 ? 2% perf-profile.children.cycles-pp.__update_load_avg_se
0.89 -0.5 0.38 ? 3% perf-profile.children.cycles-pp.move_addr_to_user
0.83 -0.5 0.36 ? 3% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.78 -0.5 0.32 ? 2% perf-profile.children.cycles-pp.sock_def_write_space
0.75 -0.5 0.30 perf-profile.children.cycles-pp.__might_fault
0.78 -0.4 0.33 ? 2% perf-profile.children.cycles-pp.ttwu_do_wakeup
0.73 -0.4 0.29 ? 2% perf-profile.children.cycles-pp.udp_rmem_release
0.70 -0.4 0.28 perf-profile.children.cycles-pp.___might_sleep
0.69 -0.4 0.28 ? 3% perf-profile.children.cycles-pp.update_ts_time_stats
0.67 -0.4 0.26 ? 2% perf-profile.children.cycles-pp.nr_iowait_cpu
0.66 ? 3% -0.4 0.25 ? 2% perf-profile.children.cycles-pp.switch_mm_irqs_off
0.72 -0.4 0.31 ? 3% perf-profile.children.cycles-pp.check_preempt_curr
0.66 ? 2% -0.4 0.26 ? 2% perf-profile.children.cycles-pp.syscall_return_via_sysret
0.64 ? 2% -0.4 0.25 perf-profile.children.cycles-pp.skb_release_data
2.47 ? 4% -0.4 2.08 ? 3% perf-profile.children.cycles-pp.__dev_queue_xmit
0.62 -0.4 0.25 perf-profile.children.cycles-pp.___perf_sw_event
0.68 ? 8% -0.4 0.31 ? 4% perf-profile.children.cycles-pp.fib_table_lookup
0.95 -0.4 0.58 perf-profile.children.cycles-pp.sched_clock
0.66 -0.4 0.29 ? 2% perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
0.68 -0.4 0.31 ? 4% perf-profile.children.cycles-pp.__check_object_size
0.60 ? 3% -0.4 0.23 ? 3% perf-profile.children.cycles-pp.pick_next_entity
0.66 -0.4 0.29 ? 3% perf-profile.children.cycles-pp.ip_generic_getfrag
0.63 ? 2% -0.4 0.27 perf-profile.children.cycles-pp.__cgroup_bpf_run_filter_skb
0.61 ? 2% -0.4 0.26 perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
0.57 ? 2% -0.3 0.22 perf-profile.children.cycles-pp._raw_spin_lock_bh
0.65 ? 2% -0.3 0.30 ? 3% perf-profile.children.cycles-pp.tick_nohz_next_event
0.59 -0.3 0.25 perf-profile.children.cycles-pp.__switch_to_asm
0.56 -0.3 0.23 ? 2% perf-profile.children.cycles-pp.resched_curr
0.57 ? 2% -0.3 0.24 ? 3% perf-profile.children.cycles-pp.kmalloc_reserve
0.54 ? 2% -0.3 0.22 perf-profile.children.cycles-pp.__kmalloc_node_track_caller
0.85 -0.3 0.54 perf-profile.children.cycles-pp.native_sched_clock
0.53 -0.3 0.23 ? 2% perf-profile.children.cycles-pp.rcu_idle_exit
0.50 ? 2% -0.3 0.21 ? 3% perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
0.50 ? 3% -0.3 0.23 ? 4% perf-profile.children.cycles-pp.sockfd_lookup_light
0.46 -0.3 0.20 ? 3% perf-profile.children.cycles-pp.__udp4_lib_lookup
0.44 ? 2% -0.3 0.18 ? 2% perf-profile.children.cycles-pp.move_addr_to_kernel
0.46 ? 2% -0.3 0.20 ? 4% perf-profile.children.cycles-pp.ip_finish_output
0.38 -0.3 0.12 ? 3% perf-profile.children.cycles-pp.reweight_entity
0.43 ? 2% -0.3 0.18 ? 4% perf-profile.children.cycles-pp.kmem_cache_free
0.42 ? 4% -0.3 0.17 ? 3% perf-profile.children.cycles-pp.kmem_cache_alloc_node
0.40 ? 2% -0.3 0.15 perf-profile.children.cycles-pp.available_idle_cpu
0.42 -0.3 0.17 ? 3% perf-profile.children.cycles-pp.__rdgsbase_inactive
0.42 -0.2 0.17 ? 3% perf-profile.children.cycles-pp.skb_set_owner_w
0.41 -0.2 0.17 ? 2% perf-profile.children.cycles-pp.aa_sk_perm
0.40 ? 2% -0.2 0.17 ? 5% perf-profile.children.cycles-pp._copy_from_iter_full
0.43 ? 2% -0.2 0.20 ? 4% perf-profile.children.cycles-pp.get_next_timer_interrupt
0.39 -0.2 0.16 ? 4% perf-profile.children.cycles-pp.__wrgsbase_inactive
0.38 ? 2% -0.2 0.15 ? 5% perf-profile.children.cycles-pp.update_cfs_group
0.39 ? 2% -0.2 0.17 ? 3% perf-profile.children.cycles-pp.perf_tp_event
0.40 ? 4% -0.2 0.17 ? 4% perf-profile.children.cycles-pp.__fget_light
0.37 -0.2 0.15 ? 6% perf-profile.children.cycles-pp.sk_filter_trim_cap
0.36 ? 3% -0.2 0.15 ? 6% perf-profile.children.cycles-pp.cpus_share_cache
0.34 -0.2 0.14 ? 3% perf-profile.children.cycles-pp.__entry_text_start
0.34 -0.2 0.14 ? 4% perf-profile.children.cycles-pp.sock_recvmsg
0.32 ? 2% -0.2 0.13 ? 2% perf-profile.children.cycles-pp.__might_sleep
0.32 -0.2 0.14 ? 3% perf-profile.children.cycles-pp._copy_from_user
0.30 ? 4% -0.2 0.12 ? 6% perf-profile.children.cycles-pp.update_min_vruntime
0.30 -0.2 0.12 ? 3% perf-profile.children.cycles-pp.security_socket_recvmsg
0.32 ? 3% -0.2 0.15 ? 5% perf-profile.children.cycles-pp.hrtimer_next_event_without
0.32 ? 2% -0.2 0.15 ? 3% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
0.31 -0.2 0.14 ? 2% perf-profile.children.cycles-pp.security_socket_sendmsg
0.28 ? 3% -0.2 0.11 ? 5% perf-profile.children.cycles-pp.rb_next
0.30 -0.2 0.13 ? 3% perf-profile.children.cycles-pp._copy_to_user
0.29 ? 4% -0.2 0.12 ? 4% perf-profile.children.cycles-pp.enqueue_to_backlog
0.28 ? 5% -0.2 0.12 ? 5% perf-profile.children.cycles-pp.validate_xmit_skb
0.26 -0.2 0.10 ? 4% perf-profile.children.cycles-pp.copy_fpregs_to_fpstate
0.24 ? 3% -0.2 0.09 ? 4% perf-profile.children.cycles-pp.rcu_read_unlock_strict
0.26 ? 2% -0.1 0.12 ? 8% perf-profile.children.cycles-pp.udp4_lib_lookup2
0.27 ? 4% -0.1 0.12 ? 3% perf-profile.children.cycles-pp.__x86_indirect_thunk_rax
0.23 ? 3% -0.1 0.09 ? 5% perf-profile.children.cycles-pp.ip_local_out
0.23 ? 2% -0.1 0.10 ? 4% perf-profile.children.cycles-pp.rcu_eqs_exit
0.21 ? 3% -0.1 0.09 ? 4% perf-profile.children.cycles-pp.siphash_3u32
0.39 ? 2% -0.1 0.27 ? 2% perf-profile.children.cycles-pp.finish_task_switch
0.20 ? 4% -0.1 0.08 ? 5% perf-profile.children.cycles-pp.copyin
0.20 ? 2% -0.1 0.09 ? 5% perf-profile.children.cycles-pp.__ip_local_out
0.20 ? 3% -0.1 0.08 ? 5% perf-profile.children.cycles-pp.__netif_receive_skb_core
0.22 ? 4% -0.1 0.10 ? 3% perf-profile.children.cycles-pp.hrtimer_get_next_event
0.20 ? 4% -0.1 0.08 ? 5% perf-profile.children.cycles-pp.migrate_disable
0.18 ? 2% -0.1 0.07 ? 5% perf-profile.children.cycles-pp.place_entity
0.20 ? 7% -0.1 0.09 ? 11% perf-profile.children.cycles-pp.__calc_delta
0.19 ? 3% -0.1 0.08 ? 10% perf-profile.children.cycles-pp.migrate_enable
0.18 ? 3% -0.1 0.07 ? 8% perf-profile.children.cycles-pp.call_cpuidle
0.17 ? 3% -0.1 0.06 ? 7% perf-profile.children.cycles-pp.put_prev_task_fair
0.18 ? 4% -0.1 0.08 ? 6% perf-profile.children.cycles-pp.__slab_alloc
0.17 ? 5% -0.1 0.07 ? 8% perf-profile.children.cycles-pp.___slab_alloc
0.17 ? 2% -0.1 0.07 ? 7% perf-profile.children.cycles-pp.__put_user_nocheck_4
0.15 ? 4% -0.1 0.06 ? 8% perf-profile.children.cycles-pp.menu_reflect
0.16 ? 2% -0.1 0.07 ? 7% perf-profile.children.cycles-pp.kfree
0.17 ? 13% -0.1 0.07 ? 11% perf-profile.children.cycles-pp.fib_lookup_good_nhc
0.17 ? 3% -0.1 0.07 ? 8% perf-profile.children.cycles-pp.newidle_balance
0.19 ? 3% -0.1 0.10 ? 3% perf-profile.children.cycles-pp.cpuidle_governor_latency_req
0.12 ? 4% -0.1 0.03 ? 99% perf-profile.children.cycles-pp.__build_skb_around
0.12 ? 3% -0.1 0.03 ? 70% perf-profile.children.cycles-pp.perf_trace_sched_switch
0.15 ? 3% -0.1 0.06 ? 8% perf-profile.children.cycles-pp.ip_send_check
0.11 ? 3% -0.1 0.03 ? 99% perf-profile.children.cycles-pp.pick_next_task_idle
0.15 ? 3% -0.1 0.06 ? 6% perf-profile.children.cycles-pp.ip_rcv_core
0.15 ? 2% -0.1 0.06 ? 7% perf-profile.children.cycles-pp.dst_release
0.15 ? 5% -0.1 0.06 ? 7% perf-profile.children.cycles-pp.__get_user_4
0.14 ? 2% -0.1 0.06 perf-profile.children.cycles-pp.ipv4_pktinfo_prepare
0.17 ? 2% -0.1 0.09 ? 5% perf-profile.children.cycles-pp.switch_fpu_return
0.13 ? 8% -0.1 0.06 ? 9% perf-profile.children.cycles-pp.netif_skb_features
0.13 ? 6% -0.1 0.05 ? 9% perf-profile.children.cycles-pp.ip_rcv_finish
0.13 ? 5% -0.1 0.05 ? 8% perf-profile.children.cycles-pp.__list_del_entry_valid
0.13 -0.1 0.06 ? 8% perf-profile.children.cycles-pp.__ip_finish_output
0.11 ? 4% -0.1 0.04 ? 44% perf-profile.children.cycles-pp.rcu_eqs_enter
0.10 ? 3% -0.1 0.03 ? 99% perf-profile.children.cycles-pp.__list_add_valid
0.10 ? 4% -0.1 0.03 ? 70% perf-profile.children.cycles-pp.__skb_try_recv_from_queue
0.12 ? 4% -0.1 0.05 ? 9% perf-profile.children.cycles-pp.__virt_addr_valid
0.14 ? 4% -0.1 0.07 ? 12% perf-profile.children.cycles-pp.__hrtimer_next_event_base
0.11 ? 4% -0.1 0.05 ? 7% perf-profile.children.cycles-pp.__cond_resched
0.13 ? 2% -0.1 0.07 ? 5% perf-profile.children.cycles-pp.__check_heap_object
0.11 ? 3% -0.1 0.05 ? 8% perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.10 ? 6% -0.1 0.05 perf-profile.children.cycles-pp.rb_erase
0.12 ? 6% -0.0 0.07 ? 5% perf-profile.children.cycles-pp.syscall_enter_from_user_mode
0.09 ? 5% -0.0 0.06 ? 11% perf-profile.children.cycles-pp.import_single_range
0.11 ? 5% -0.0 0.09 ? 5% perf-profile.children.cycles-pp.rebalance_domains
0.07 ? 5% -0.0 0.05 perf-profile.children.cycles-pp._raw_spin_trylock
0.17 ? 4% -0.0 0.15 ? 2% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
1.07 +0.2 1.23 perf-profile.children.cycles-pp.sched_clock_cpu
0.17 ? 4% +0.2 0.38 ? 14% perf-profile.children.cycles-pp.poll_idle
0.26 ? 28% +0.3 0.60 ? 24% perf-profile.children.cycles-pp.start_kernel
0.03 ? 70% +0.6 0.62 ? 2% perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
30.02 +0.7 30.72 perf-profile.children.cycles-pp.cpuidle_enter
29.95 +0.7 30.70 perf-profile.children.cycles-pp.cpuidle_enter_state
0.46 ? 22% +0.8 1.22 ? 6% perf-profile.children.cycles-pp.netif_rx
0.43 ? 24% +0.8 1.22 ? 6% perf-profile.children.cycles-pp.netif_rx_internal
0.15 ? 7% +0.8 0.96 ? 4% perf-profile.children.cycles-pp.update_process_times
0.09 ? 7% +0.8 0.91 ? 5% perf-profile.children.cycles-pp.scheduler_tick
0.16 ? 7% +0.8 0.97 ? 4% perf-profile.children.cycles-pp.tick_sched_handle
0.00 +0.9 0.86 ? 2% perf-profile.children.cycles-pp.tick_nohz_irq_exit
0.09 ?141% +1.0 1.07 ? 6% perf-profile.children.cycles-pp.ktime_get_with_offset
0.00 +1.3 1.29 perf-profile.children.cycles-pp.sched_clock_local
0.00 +1.5 1.47 perf-profile.children.cycles-pp.ktime_get_update_offsets_now
0.14 ? 12% +1.5 1.63 perf-profile.children.cycles-pp.clockevents_program_event
0.18 ? 7% +1.6 1.80 ? 3% perf-profile.children.cycles-pp.tick_sched_timer
0.28 ? 5% +2.2 2.49 ? 2% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.07 ? 12% +2.3 2.36 perf-profile.children.cycles-pp.tick_irq_enter
0.08 ? 14% +2.3 2.37 perf-profile.children.cycles-pp.irq_enter_rcu
0.49 ? 3% +5.2 5.64 perf-profile.children.cycles-pp.hrtimer_interrupt
0.53 ? 3% +5.2 5.68 perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.83 ? 2% +8.3 9.12 perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.95 ? 2% +8.3 9.25 perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.52 +8.6 9.16 perf-profile.children.cycles-pp.tick_nohz_idle_enter
0.00 +26.2 26.19 perf-profile.children.cycles-pp.sched_clock_idle_wakeup_event
0.00 +26.5 26.47 perf-profile.children.cycles-pp.sched_clock_tick
1.09 +29.9 30.96 perf-profile.children.cycles-pp.tick_nohz_idle_exit
43.21 +32.1 75.30 perf-profile.children.cycles-pp.start_secondary
43.48 +32.4 75.89 perf-profile.children.cycles-pp.secondary_startup_64_no_verify
43.48 +32.4 75.89 perf-profile.children.cycles-pp.cpu_startup_entry
43.37 +32.5 75.85 perf-profile.children.cycles-pp.do_idle
0.84 ? 2% +53.8 54.62 perf-profile.children.cycles-pp.ktime_get
0.00 +56.7 56.71 perf-profile.children.cycles-pp.read_hpet
27.74 -15.4 12.37 perf-profile.self.cycles-pp.intel_idle
2.82 -1.8 0.99 perf-profile.self.cycles-pp._raw_spin_lock
1.88 -1.6 0.27 ? 3% perf-profile.self.cycles-pp.update_rq_clock
2.78 -1.5 1.26 perf-profile.self.cycles-pp.__schedule
2.44 -1.4 1.00 perf-profile.self.cycles-pp._raw_spin_lock_irqsave
1.70 ? 4% -1.0 0.65 ? 4% perf-profile.self.cycles-pp.ip_idents_reserve
1.68 -1.0 0.69 perf-profile.self.cycles-pp.update_load_avg
1.51 -1.0 0.56 perf-profile.self.cycles-pp.set_next_entity
1.17 -0.7 0.43 ? 2% perf-profile.self.cycles-pp.do_idle
1.15 -0.7 0.46 perf-profile.self.cycles-pp.copy_user_generic_unrolled
1.01 ? 2% -0.7 0.35 ? 2% perf-profile.self.cycles-pp.enqueue_task_fair
1.00 -0.6 0.40 ? 4% perf-profile.self.cycles-pp.__switch_to
0.97 -0.6 0.37 ? 2% perf-profile.self.cycles-pp.__slab_free
1.04 -0.6 0.44 perf-profile.self.cycles-pp.enqueue_entity
0.96 ? 2% -0.6 0.39 ? 2% perf-profile.self.cycles-pp.__skb_wait_for_more_packets
0.91 ? 2% -0.5 0.36 ? 2% perf-profile.self.cycles-pp.__ksize
0.87 -0.5 0.37 perf-profile.self.cycles-pp.__update_load_avg_se
0.77 -0.5 0.29 ? 2% perf-profile.self.cycles-pp.update_curr
0.77 -0.5 0.31 ? 2% perf-profile.self.cycles-pp.sock_def_write_space
0.77 ? 2% -0.5 0.31 ? 3% perf-profile.self.cycles-pp.udp_sendmsg
0.72 -0.4 0.28 ? 2% perf-profile.self.cycles-pp.udp_rmem_release
0.71 ? 2% -0.4 0.28 perf-profile.self.cycles-pp.udp_queue_rcv_one_skb
0.71 -0.4 0.28 ? 3% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.68 ? 2% -0.4 0.27 ? 2% perf-profile.self.cycles-pp.try_to_wake_up
0.66 -0.4 0.26 perf-profile.self.cycles-pp.nr_iowait_cpu
0.66 -0.4 0.26 perf-profile.self.cycles-pp.___might_sleep
0.66 ? 3% -0.4 0.25 ? 2% perf-profile.self.cycles-pp.switch_mm_irqs_off
0.66 -0.4 0.26 ? 2% perf-profile.self.cycles-pp.syscall_return_via_sysret
0.64 ? 2% -0.4 0.24 ? 2% perf-profile.self.cycles-pp.skb_release_data
0.76 ? 2% -0.4 0.37 ? 2% perf-profile.self.cycles-pp.menu_select
0.57 ? 3% -0.4 0.21 ? 3% perf-profile.self.cycles-pp.pick_next_entity
0.62 -0.4 0.26 ? 2% perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
0.56 ? 2% -0.3 0.21 ? 2% perf-profile.self.cycles-pp._raw_spin_lock_bh
0.56 -0.3 0.22 ? 2% perf-profile.self.cycles-pp.resched_curr
0.57 -0.3 0.24 perf-profile.self.cycles-pp.__switch_to_asm
0.54 -0.3 0.21 ? 2% perf-profile.self.cycles-pp.___perf_sw_event
0.52 ? 2% -0.3 0.21 ? 4% perf-profile.self.cycles-pp.__udp_enqueue_schedule_skb
0.47 -0.3 0.16 ? 2% perf-profile.self.cycles-pp.process_backlog
0.82 -0.3 0.52 ? 2% perf-profile.self.cycles-pp.native_sched_clock
0.48 ? 2% -0.3 0.19 ? 3% perf-profile.self.cycles-pp.schedule_timeout
0.52 ? 10% -0.3 0.23 ? 7% perf-profile.self.cycles-pp.fib_table_lookup
0.47 ? 2% -0.3 0.19 ? 5% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
0.38 ? 2% -0.3 0.13 ? 6% perf-profile.self.cycles-pp.dequeue_entity
0.41 -0.2 0.16 ? 4% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.41 ? 2% -0.2 0.16 ? 2% perf-profile.self.cycles-pp.skb_set_owner_w
0.41 ? 3% -0.2 0.17 ? 3% perf-profile.self.cycles-pp.sock_def_readable
0.36 ? 2% -0.2 0.12 ? 4% perf-profile.self.cycles-pp.reweight_entity
0.41 -0.2 0.16 ? 2% perf-profile.self.cycles-pp.__rdgsbase_inactive
0.39 -0.2 0.14 ? 3% perf-profile.self.cycles-pp.available_idle_cpu
0.40 ? 3% -0.2 0.16 ? 4% perf-profile.self.cycles-pp.__skb_recv_udp
0.39 -0.2 0.15 ? 3% perf-profile.self.cycles-pp.__wake_up_common
0.47 ? 4% -0.2 0.25 ? 2% perf-profile.self.cycles-pp.__softirqentry_text_start
0.46 -0.2 0.24 ? 2% perf-profile.self.cycles-pp.udp_recvmsg
0.38 -0.2 0.16 ? 4% perf-profile.self.cycles-pp.__wrgsbase_inactive
0.38 ? 4% -0.2 0.16 ? 5% perf-profile.self.cycles-pp.__fget_light
0.35 ? 3% -0.2 0.14 ? 7% perf-profile.self.cycles-pp.update_cfs_group
0.35 ? 2% -0.2 0.15 ? 6% perf-profile.self.cycles-pp.cpus_share_cache
0.32 -0.2 0.11 ? 4% perf-profile.self.cycles-pp.dequeue_task_fair
0.34 -0.2 0.14 ? 3% perf-profile.self.cycles-pp.__entry_text_start
0.35 ? 2% -0.2 0.15 ? 3% perf-profile.self.cycles-pp.__check_object_size
0.32 ? 3% -0.2 0.12 ? 3% perf-profile.self.cycles-pp.udp_send_skb
0.32 ? 3% -0.2 0.12 perf-profile.self.cycles-pp.__ip_append_data
0.34 ? 2% -0.2 0.15 ? 3% perf-profile.self.cycles-pp.__sys_sendto
0.48 -0.2 0.29 ? 2% perf-profile.self.cycles-pp.cpuidle_enter_state
0.33 -0.2 0.14 ? 2% perf-profile.self.cycles-pp.finish_task_switch
0.33 ? 8% -0.2 0.14 ? 5% perf-profile.self.cycles-pp.ip_finish_output2
0.33 ? 2% -0.2 0.15 ? 6% perf-profile.self.cycles-pp.ip_route_output_key_hash_rcu
0.30 ? 3% -0.2 0.12 ? 4% perf-profile.self.cycles-pp.perf_tp_event
0.29 -0.2 0.11 ? 4% perf-profile.self.cycles-pp.net_rx_action
0.31 ? 2% -0.2 0.13 ? 5% perf-profile.self.cycles-pp.do_syscall_64
0.30 ? 3% -0.2 0.13 ? 5% perf-profile.self.cycles-pp.rcu_idle_exit
0.28 ? 4% -0.2 0.11 ? 6% perf-profile.self.cycles-pp.update_min_vruntime
0.27 ? 3% -0.2 0.10 ? 4% perf-profile.self.cycles-pp.rb_next
0.27 ? 3% -0.2 0.11 ? 4% perf-profile.self.cycles-pp.__might_sleep
0.29 ? 3% -0.2 0.13 ? 3% perf-profile.self.cycles-pp.sock_wfree
0.26 -0.2 0.10 ? 3% perf-profile.self.cycles-pp.copy_fpregs_to_fpstate
0.26 -0.2 0.11 ? 5% perf-profile.self.cycles-pp.__x64_sys_sendto
0.25 ? 2% -0.2 0.10 ? 6% perf-profile.self.cycles-pp.aa_sk_perm
0.24 ? 6% -0.1 0.09 ? 4% perf-profile.self.cycles-pp.kmem_cache_alloc_node
0.24 ? 4% -0.1 0.10 ? 4% perf-profile.self.cycles-pp.__kmalloc_node_track_caller
0.24 -0.1 0.09 perf-profile.self.cycles-pp.ip_output
0.26 ? 2% -0.1 0.11 ? 3% perf-profile.self.cycles-pp.__ip_make_skb
0.24 ? 2% -0.1 0.10 ? 4% perf-profile.self.cycles-pp.__alloc_skb
0.26 -0.1 0.12 ? 3% perf-profile.self.cycles-pp.select_task_rq_fair
0.25 ? 3% -0.1 0.11 ? 8% perf-profile.self.cycles-pp.udp4_lib_lookup2
0.23 ? 3% -0.1 0.09 ? 4% perf-profile.self.cycles-pp.__local_bh_enable_ip
0.23 ? 9% -0.1 0.10 ? 6% perf-profile.self.cycles-pp.__dev_queue_xmit
0.21 -0.1 0.08 ? 5% perf-profile.self.cycles-pp.siphash_3u32
0.21 -0.1 0.09 ? 5% perf-profile.self.cycles-pp.__udp4_lib_lookup
0.22 ? 3% -0.1 0.10 ? 3% perf-profile.self.cycles-pp.__cgroup_bpf_run_filter_skb
0.21 ? 3% -0.1 0.09 perf-profile.self.cycles-pp.sock_sendmsg
0.21 ? 9% -0.1 0.09 ? 7% perf-profile.self.cycles-pp.do_softirq
0.20 ? 3% -0.1 0.09 ? 4% perf-profile.self.cycles-pp.__x64_sys_recvfrom
0.20 ? 2% -0.1 0.08 ? 4% perf-profile.self.cycles-pp.__sys_recvfrom
0.19 ? 3% -0.1 0.08 perf-profile.self.cycles-pp.__netif_receive_skb_core
0.19 ? 8% -0.1 0.08 ? 9% perf-profile.self.cycles-pp.__calc_delta
0.19 ? 4% -0.1 0.08 ? 6% perf-profile.self.cycles-pp.migrate_disable
0.19 ? 3% -0.1 0.08 ? 4% perf-profile.self.cycles-pp.ip_route_output_key_hash
0.18 ? 3% -0.1 0.07 ? 5% perf-profile.self.cycles-pp.prepare_to_wait_exclusive
0.21 ? 2% -0.1 0.10 ? 3% perf-profile.self.cycles-pp.pick_next_task_fair
0.18 ? 4% -0.1 0.08 ? 6% perf-profile.self.cycles-pp.migrate_enable
0.19 ? 3% -0.1 0.08 ? 7% perf-profile.self.cycles-pp.__udp4_lib_rcv
0.18 ? 2% -0.1 0.07 ? 10% perf-profile.self.cycles-pp.call_cpuidle
0.17 ? 3% -0.1 0.06 ? 6% perf-profile.self.cycles-pp.place_entity
0.19 ? 3% -0.1 0.09 ? 5% perf-profile.self.cycles-pp.schedule_idle
0.17 ? 2% -0.1 0.07 perf-profile.self.cycles-pp.schedule
0.17 ? 5% -0.1 0.07 ? 8% perf-profile.self.cycles-pp.___slab_alloc
0.17 ± 3% -0.1 0.07 ± 7% perf-profile.self.cycles-pp.__put_user_nocheck_4
0.16 ± 4% -0.1 0.07 ± 7% perf-profile.self.cycles-pp.cpu_startup_entry
0.17 ± 2% -0.1 0.08 ± 6% perf-profile.self.cycles-pp.ip_send_skb
0.15 ± 3% -0.1 0.06 perf-profile.self.cycles-pp.move_addr_to_user
0.15 ± 3% -0.1 0.06 perf-profile.self.cycles-pp.kfree
0.16 ± 15% -0.1 0.07 ± 11% perf-profile.self.cycles-pp.fib_lookup_good_nhc
0.15 ± 8% -0.1 0.06 perf-profile.self.cycles-pp.loopback_xmit
0.14 ± 4% -0.1 0.06 ± 11% perf-profile.self.cycles-pp.newidle_balance
0.14 ± 2% -0.1 0.05 ± 9% perf-profile.self.cycles-pp.ip_send_check
0.14 ± 4% -0.1 0.06 ± 6% perf-profile.self.cycles-pp.ip_rcv_core
0.16 ± 2% -0.1 0.08 ± 4% perf-profile.self.cycles-pp.switch_fpu_return
0.15 ± 7% -0.1 0.07 ± 5% perf-profile.self.cycles-pp.enqueue_to_backlog
0.10 ± 4% -0.1 0.03 ± 99% perf-profile.self.cycles-pp.__list_del_entry_valid
0.14 ± 5% -0.1 0.06 ± 11% perf-profile.self.cycles-pp.__get_user_4
0.11 ± 3% -0.1 0.03 ± 70% perf-profile.self.cycles-pp.perf_trace_sched_wakeup_template
0.15 ± 4% -0.1 0.08 ± 9% perf-profile.self.cycles-pp.check_preempt_curr
0.14 ± 5% -0.1 0.06 ± 6% perf-profile.self.cycles-pp.validate_xmit_skb
0.13 ± 5% -0.1 0.05 perf-profile.self.cycles-pp._copy_to_iter
0.14 ± 5% -0.1 0.06 ± 6% perf-profile.self.cycles-pp.kmem_cache_free
0.13 ± 2% -0.1 0.05 ± 8% perf-profile.self.cycles-pp.dst_release
0.11 ± 4% -0.1 0.04 ± 44% perf-profile.self.cycles-pp.tick_nohz_next_event
0.11 ± 4% -0.1 0.04 ± 45% perf-profile.self.cycles-pp.__virt_addr_valid
0.12 ± 6% -0.1 0.05 perf-profile.self.cycles-pp.__x86_indirect_thunk_rax
0.11 ± 4% -0.1 0.05 perf-profile.self.cycles-pp.rcu_eqs_exit
0.12 ± 5% -0.1 0.06 ± 17% perf-profile.self.cycles-pp.poll_idle
0.10 ± 4% -0.1 0.04 ± 44% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
0.11 ± 3% -0.1 0.06 ± 6% perf-profile.self.cycles-pp.__check_heap_object
0.10 ± 3% -0.1 0.05 ± 7% perf-profile.self.cycles-pp.__hrtimer_next_event_base
0.10 ± 7% -0.0 0.06 ± 8% perf-profile.self.cycles-pp.syscall_enter_from_user_mode
0.08 -0.0 0.03 ± 70% perf-profile.self.cycles-pp.sockfd_lookup_light
0.08 ± 4% -0.0 0.04 ± 45% perf-profile.self.cycles-pp.import_single_range
0.07 ± 5% -0.0 0.05 perf-profile.self.cycles-pp._raw_spin_trylock
0.09 ± 4% -0.0 0.07 perf-profile.self.cycles-pp.tick_nohz_idle_exit
0.15 ± 5% +0.3 0.45 perf-profile.self.cycles-pp.sched_clock_cpu
0.00 +0.9 0.87 perf-profile.self.cycles-pp.sched_clock_local
0.00 +56.7 56.70 perf-profile.self.cycles-pp.read_hpet



netperf.Throughput_tps

120000 +------------------------------------------------------------------+
| .+ |
110000 |.+ +.+.+.+.++. .+.+ .+.+.++.+.+.+.++.+.+.+.+.++.+.+.+.+ |
100000 |-+ +.+.+ +.+ |
| |
90000 |-+ |
80000 |-+ |
| |
70000 |-+ |
60000 |-+ |
| |
50000 |-+ |
40000 |-+ |
| O OO O O O OO O O O O OO O O O OO O O O OO O O O O OO O O O OO O |
30000 +------------------------------------------------------------------+


netperf.Throughput_total_tps

2.8e+06 +-----------------------------------------------------------------+
2.6e+06 |-+.+ |
|.+ +.+.+.++.+.+.+.++.+.+.+.++.+.+.++.+.+.+.++.+.+.+.++.+.+.+ |
2.4e+06 |-+ |
2.2e+06 |-+ |
| |
2e+06 |-+ |
1.8e+06 |-+ |
1.6e+06 |-+ |
| |
1.4e+06 |-+ |
1.2e+06 |-+ |
| |
1e+06 |-O OO O O OO O O OO O O O OO O O OO O O O OO O O O OO O O OO O |
800000 +-----------------------------------------------------------------+


netperf.workload

8e+08 +-------------------------------------------------------------------+
|.+ ++.+.+.+.+.++.+.+.+.+.++.+.+.+.+.++.+.+.+.+.++.+.+.+.+.++ |
7e+08 |-+ |
| |
| |
6e+08 |-+ |
| |
5e+08 |-+ |
| |
4e+08 |-+ |
| |
| |
3e+08 |-O O OO O O O O O O O O O OO O O O O OO O O O O OO O O O O OO O O |
| O |
2e+08 +-------------------------------------------------------------------+


pmeter.Average_Active_Power

660 +---------------------------------------------------------------------+
|. +. |
640 |-+.+.+.++. .+. .+. .+. .+.+.+.+.+.+.+ .+. .+.+.+.+.+. : + |
| + +.+ +.+.++ + + + + |
620 |-+ |
| |
600 |-+ |
| |
580 |-+ |
| |
560 |-+ O O |
| O O O OO O O O O O O O O O O O O O O O OO O O O |
540 |-+ O O O O O O O O O |
| |
520 +---------------------------------------------------------------------+


netperf.time.user_time

250 +---------------------------------------------------------------------+
| +. .+.+. .+.++.+.+ |
240 |.+ +. : +.+ +.+ : .+.+.+. .+. |
| + + + : ++.+ +.+ ++.+ |
| + +.+. .+.+. + |
230 |-+ O + + |
| |
220 |-+ |
| |
210 |-+ O O O O O O O O O O O O O O |
| O O O O O O O |
| O OO O O O O O O O O |
200 |-O |
| O O |
190 +---------------------------------------------------------------------+


netperf.time.system_time

3700 +--------------------------------------------------------------------+
| O OO O O O O O OO O O O O OO O O O O O O O O |
3600 |-+ O O O O O O |
| O O O O |
3500 |-O O |
| |
3400 |-+ |
| |
3300 |-+ |
| |
3200 |-+ |
|. +.+.+.+. .+.++.+.+.+. |
3100 |-+. .+ +.+.+.++.+.+.+.+ +.+.++.+.+.+.+.+.+.++ |
| + |
3000 +--------------------------------------------------------------------+


netperf.time.percent_of_cpu_this_job_got

1300 +--------------------------------------------------------------------+
| O OO O O O O O OO O O O O O OO O O O O O O O O O |
| O O O O O |
1250 |-+ O O |
| O O O |
| |
1200 |-+ |
| |
1150 |-+ |
| |
|.+ ++.+.+.+.+.+.+.++.+.+.+.+.+.++.+.+.+.+.+.++.+.+.+. .+.+.++ |
1100 |-+: : + |
| : : |
| + |
1050 +--------------------------------------------------------------------+


netperf.time.voluntary_context_switches

8e+08 +-------------------------------------------------------------------+
|.+.+.++.+.+.+.+.++.+.+.+.+.++.+.+.+.+.++.+.+.+.+.++.+.+.+.+.++ |
7e+08 |-+ |
| |
| |
6e+08 |-+ |
| |
5e+08 |-+ |
| |
4e+08 |-+ |
| |
| |
3e+08 |-O O OO O O O O O O O O OO O O O O OO O O O O OO O O O O OO O O |
| OO |
2e+08 +-------------------------------------------------------------------+


netperf.time.involuntary_context_switches

11000 +-------------------------------------------------------------------+
| |
10000 |.+ .+.+.+ .+.+.++. .+. .+ |
| +. .++ + .++.+.+ + +.+ +.+.+.+.+.++.+.+.+.+.++ |
9000 |-+ + + |
| |
8000 |-+ |
| |
7000 |-+ |
| |
6000 |-O OO O |
| O O O O O OO O O O O |
5000 |-+ OO O O O O O OO O O O OO O O O OO O O |
| |
4000 +-------------------------------------------------------------------+


[*] bisect-good sample
[O] bisect-bad sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (112.41 kB)
config-5.12.0-rc3-00029-g388450c7081d (175.54 kB)
job-script (8.57 kB)
job.yaml (5.69 kB)
reproduce (1.47 kB)
dmesg.xz (29.40 kB)

2021-05-14 14:16:26

by Feng Tang

Subject: Re: [clocksource] 388450c708: netperf.Throughput_tps -65.1% regression

Hi Paul,

On Thu, May 13, 2021 at 10:07:07AM -0700, Paul E. McKenney wrote:
> On Thu, May 13, 2021 at 11:55:15PM +0800, kernel test robot wrote:
> >
> >
> > Greeting,
> >
> > FYI, we noticed a -65.1% regression of netperf.Throughput_tps due to commit:
> >
> >
> > commit: 388450c7081ded73432e2b7148c1bb9a0b039963 ("[PATCH v12 clocksource 4/5] clocksource: Reduce clocksource-skew threshold for TSC")
> > url: https://github.com/0day-ci/linux/commits/Paul-E-McKenney/Do-not-mark-clocks-unstable-due-to-delays-for-v5-13/20210501-083404
> > base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 2d036dfa5f10df9782f5278fc591d79d283c1fad
> >
> > in testcase: netperf
> > on test machine: 96 threads 2 sockets Ice Lake with 256G memory
> > with following parameters:
> >
> > ip: ipv4
> > runtime: 300s
> > nr_threads: 25%
> > cluster: cs-localhost
> > test: UDP_RR
> > cpufreq_governor: performance
> > ucode: 0xb000280
> >
> > test-description: Netperf is a benchmark that can be used to measure various aspects of networking performance.
> > test-url: http://www.netperf.org/netperf/
> >
> >
> >
> > If you fix the issue, kindly add following tag
> > Reported-by: kernel test robot <[email protected]>
> >
> >
> > also as Feng Tang checked, this is an "unstable clocksource" case.
> > attached dmesg FYI.
>
> Agreed, given the clock-skew event and the resulting switch to HPET,
> performance regressions are expected behavior.
>
> That dmesg output does demonstrate the value of Feng Tang's patch!
>
> I don't see how to obtain the values of ->mult and ->shift that would
> allow me to compute the delta. So if you don't tell me otherwise, I
> will assume that the skew itself was expected on this hardware, perhaps
> somehow due to the tpm_tis_status warning immediately preceding the
> clock-skew event. If my assumption is incorrect, please let me know.

I ran the case with the debug patch applied; the info is:

[ 13.796429] clocksource: timekeeping watchdog on CPU19: Marking clocksource 'tsc' as unstable because the skew is too large:
[ 13.797413] clocksource: 'hpet' wd_nsec: 505192062 wd_now: 10657158 wd_last: fac6f97 mask: ffffffff
[ 13.797413] clocksource: 'tsc' cs_nsec: 504008008 cs_now: 3445570292aa5 cs_last: 344551f0cad6f mask: ffffffffffffffff
[ 13.797413] clocksource: 'tsc' is current clocksource.
[ 13.797413] tsc: Marking TSC unstable due to clocksource watchdog
[ 13.844513] clocksource: Checking clocksource tsc synchronization from CPU 50 to CPUs 0-1,12,22,32-33,60,65.
[ 13.855080] clocksource: Switched to clocksource hpet

So the delta is 1184 us (505192062 - 504008008), and I agree with
you that it is probably related to the tpm_tis_status warning.
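
The arithmetic can be checked directly from the two nanosecond readings
in the dmesg output above (a back-of-the-envelope sketch, not kernel
code):

```python
# Skew between the watchdog ('hpet') and the clock under test ('tsc'),
# computed from the nanosecond intervals in the dmesg lines above.
wd_nsec = 505192062   # 'hpet' interval over the watchdog window, in ns
cs_nsec = 504008008   # 'tsc' interval over the same window, in ns

delta_ns = wd_nsec - cs_nsec
print(delta_ns, "ns, i.e. about", delta_ns // 1000, "us")
# -> 1184054 ns, i.e. about 1184 us
```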

But this re-triggers my old concern: are the margins calculated
for tsc and hpet too small?

With the current algorithm, the 'uncertainty_margin' is calculated
from the clocksource frequency, and since the tsc/hpet/acpi_pm timers
all run at multiples of MHz or GHz, they end up with a margin of
100 us. That works on normal systems. But out in the wild, there
can be glitches due to immature HW components, their firmware or
drivers etc, just like this case.
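
To illustrate where the 100 us figure comes from, here is a sketch of
the margin calculation as I understand it from this series: one nominal
tick period, clamped below by twice the 50 us per-read delay limit (the
helper name and example frequencies here are mine, for illustration):

```python
NSEC_PER_SEC = 1_000_000_000
NSEC_PER_USEC = 1_000
WATCHDOG_MAX_SKEW = 50 * NSEC_PER_USEC  # 50 us per-read delay limit

def uncertainty_margin(freq_hz):
    # One nominal tick period in ns, but never less than 2 * 50 us.
    return max(NSEC_PER_SEC // freq_hz, 2 * WATCHDOG_MAX_SKEW)

# Any MHz-or-faster clocksource (tsc, hpet, acpi_pm) hits the 100 us floor:
print(uncertainty_margin(24_000_000))     # ~24 MHz hpet
print(uncertainty_margin(2_000_000_000))  # ~2 GHz tsc
```

So a single disturbance on the order of the 1184 us seen above is more
than ten times that floor.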

Thanks,
Feng


> Thanx, Paul
>

2021-05-14 20:10:26

by Paul E. McKenney

Subject: Re: [clocksource] 388450c708: netperf.Throughput_tps -65.1% regression

On Fri, May 14, 2021 at 03:43:14PM +0800, Feng Tang wrote:
> Hi Paul,
>
> On Thu, May 13, 2021 at 10:07:07AM -0700, Paul E. McKenney wrote:
> > On Thu, May 13, 2021 at 11:55:15PM +0800, kernel test robot wrote:
> > > [ ... 0day report details snipped ... ]
> >
> > Agreed, given the clock-skew event and the resulting switch to HPET,
> > performance regressions are expected behavior.
> >
> > That dmesg output does demonstrate the value of Feng Tang's patch!
> >
> > I don't see how to obtain the values of ->mult and ->shift that would
> > allow me to compute the delta. So if you don't tell me otherwise, I
> > will assume that the skew itself was expected on this hardware, perhaps
> > somehow due to the tpm_tis_status warning immediately preceding the
> > clock-skew event. If my assumption is incorrect, please let me know.
>
> I run the case with the debug patch applied, the info is:
>
> [ 13.796429] clocksource: timekeeping watchdog on CPU19: Marking clocksource 'tsc' as unstable because the skew is too large:
> [ 13.797413] clocksource: 'hpet' wd_nesc: 505192062 wd_now: 10657158 wd_last: fac6f97 mask: ffffffff
> [ 13.797413] clocksource: 'tsc' cs_nsec: 504008008 cs_now: 3445570292aa5 cs_last: 344551f0cad6f mask: ffffffffffffffff
> [ 13.797413] clocksource: 'tsc' is current clocksource.
> [ 13.797413] tsc: Marking TSC unstable due to clocksource watchdog
> [ 13.844513] clocksource: Checking clocksource tsc synchronization from CPU 50 to CPUs 0-1,12,22,32-33,60,65.
> [ 13.855080] clocksource: Switched to clocksource hpet
>
> So the delta is 1184 us (505192062 - 504008008), and I agree with
> you that it should be related with the tpm_tis_status warning stuff.
>
> But this re-trigger my old concerns, that if the margins calculated
> for tsc, hpet are too small?

If the error really did disturb either tsc or hpet, then we really
do not have a false positive, and nothing should change (aside from
perhaps documenting that TPM issues can disturb the clocks, or better
yet treating that perturbation as a separate bug that should be fixed).
But if this is yet another way to get a confused measurement, then it
would be better to work out a way to reject the confusion and keep the
tighter margins. I cannot think right off of a way that this could
cause measurement confusion, but you never know.

So any thoughts on exactly how the tpm_tis_status warning might have
resulted in the skew?

> With current math algorithm, the 'uncertainty_margin' is
> calculated against the frequency, and those tsc/hpet/acpi_pm
> timer is multiple of MHz or GHz, which gives them to have margin of
> 100 us. It works with normal systems. But in the wild world, there
> could be some sparkles due to some immature HW components, their
> firmwares or drivers etc, just like this case.

Isn't diagnosing issues from immature hardware, firmware, and drivers
actually a benefit? It would after all be quite unfortunate if some issue
that was visible only due to clock skew were to escape into production.

Thanx, Paul

2021-05-16 10:20:29

by Feng Tang

Subject: Re: [clocksource] 388450c708: netperf.Throughput_tps -65.1% regression

On Fri, May 14, 2021 at 10:49:08AM -0700, Paul E. McKenney wrote:
> On Fri, May 14, 2021 at 03:43:14PM +0800, Feng Tang wrote:
> > Hi Paul,
> >
> > On Thu, May 13, 2021 at 10:07:07AM -0700, Paul E. McKenney wrote:
> > > On Thu, May 13, 2021 at 11:55:15PM +0800, kernel test robot wrote:
> > > > [ ... 0day report details snipped ... ]
> > >
> > > Agreed, given the clock-skew event and the resulting switch to HPET,
> > > performance regressions are expected behavior.
> > >
> > > That dmesg output does demonstrate the value of Feng Tang's patch!
> > >
> > > I don't see how to obtain the values of ->mult and ->shift that would
> > > allow me to compute the delta. So if you don't tell me otherwise, I
> > > will assume that the skew itself was expected on this hardware, perhaps
> > > somehow due to the tpm_tis_status warning immediately preceding the
> > > clock-skew event. If my assumption is incorrect, please let me know.
> >
> > I run the case with the debug patch applied, the info is:
> >
> > [ ... dmesg output snipped ... ]
> >
> > So the delta is 1184 us (505192062 - 504008008), and I agree with
> > you that it should be related with the tpm_tis_status warning stuff.
> >
> > But this re-trigger my old concerns, that if the margins calculated
> > for tsc, hpet are too small?
>
> If the error really did disturb either tsc or hpet, then we really
> do not have a false positive, and nothing should change (aside from
> perhaps documenting that TPM issues can disturb the clocks, or better
> yet treating that perturbation as a separate bug that should be fixed).
> But if this is yet another way to get a confused measurement, then it
> would be better to work out a way to reject the confusion and keep the
> tighter margins. I cannot think right off of a way that this could
> cause measurement confusion, but you never know.

I have no doubt about the correctness of the measuring method, but am
just afraid that some platforms which used to 'just work' will now be caught :)

> So any thoughts on exactly how the tpm_tis_status warning might have
> resulted in the skew?

The tpm error message has been reported before, and a google search
turns up some similar errors; we'll do some further checking.

> > With current math algorithm, the 'uncertainty_margin' is
> > calculated against the frequency, and those tsc/hpet/acpi_pm
> > timer is multiple of MHz or GHz, which gives them to have margin of
> > 100 us. It works with normal systems. But in the wild world, there
> > could be some sparkles due to some immature HW components, their
> > firmwares or drivers etc, just like this case.
>
> Isn't diagnosing issues from immature hardware, firmware, and drivers
> actually a benefit? It would after all be quite unfortunate if some issue
> that was visible only due to clock skew were to escape into production.

Yes, it will surely expose some malfunctioning cases which haven't
been caught before.

Thanks,
Feng


> Thanx, Paul

2021-05-17 06:18:08

by Paul E. McKenney

Subject: Re: [clocksource] 388450c708: netperf.Throughput_tps -65.1% regression

On Sun, May 16, 2021 at 02:34:19PM +0800, Feng Tang wrote:
> On Fri, May 14, 2021 at 10:49:08AM -0700, Paul E. McKenney wrote:
> > On Fri, May 14, 2021 at 03:43:14PM +0800, Feng Tang wrote:
> > > Hi Paul,
> > >
> > > On Thu, May 13, 2021 at 10:07:07AM -0700, Paul E. McKenney wrote:
> > > > On Thu, May 13, 2021 at 11:55:15PM +0800, kernel test robot wrote:
> > > > > [ ... 0day report details snipped ... ]
> > > >
> > > > Agreed, given the clock-skew event and the resulting switch to HPET,
> > > > performance regressions are expected behavior.
> > > >
> > > > That dmesg output does demonstrate the value of Feng Tang's patch!
> > > >
> > > > I don't see how to obtain the values of ->mult and ->shift that would
> > > > allow me to compute the delta. So if you don't tell me otherwise, I
> > > > will assume that the skew itself was expected on this hardware, perhaps
> > > > somehow due to the tpm_tis_status warning immediately preceding the
> > > > clock-skew event. If my assumption is incorrect, please let me know.
> > >
> > > I run the case with the debug patch applied, the info is:
> > >
> > > [ ... dmesg output snipped ... ]
> > >
> > > So the delta is 1184 us (505192062 - 504008008), and I agree with
> > > you that it should be related with the tpm_tis_status warning stuff.
> > >
> > > But this re-trigger my old concerns, that if the margins calculated
> > > for tsc, hpet are too small?
> >
> > If the error really did disturb either tsc or hpet, then we really
> > do not have a false positive, and nothing should change (aside from
> > perhaps documenting that TPM issues can disturb the clocks, or better
> > yet treating that perturbation as a separate bug that should be fixed).
> > But if this is yet another way to get a confused measurement, then it
> > would be better to work out a way to reject the confusion and keep the
> > tighter margins. I cannot think right off of a way that this could
> > cause measurement confusion, but you never know.
>
> I have no doubt in the correctness of the measuring method, but was
> just afraid some platforms which use to 'just work' will be caught :)

Quite possibly. On the other hand, there might also be some platforms
on which applications have been behaving in subtle and strange ways,
and this series might be very helpful in such cases.

> > So any thoughts on exactly how the tpm_tis_status warning might have
> > resulted in the skew?
>
> The tpm error message has been reported before, and from google there
> were some similar errors, we'll do some further check.

Thank you!

> > > With current math algorithm, the 'uncertainty_margin' is
> > > calculated against the frequency, and those tsc/hpet/acpi_pm
> > > timer is multiple of MHz or GHz, which gives them to have margin of
> > > 100 us. It works with normal systems. But in the wild world, there
> > > could be some sparkles due to some immature HW components, their
> > > firmwares or drivers etc, just like this case.
> >
> > Isn't diagnosing issues from immature hardware, firmware, and drivers
> > actually a benefit? It would after all be quite unfortunate if some issue
> > that was visible only due to clock skew were to escape into production.
>
> Yes, it surely will expose some mal-functional cases which haven't
> been caught before.

Here is hoping!

Thanx, Paul

2021-05-19 19:19:08

by Feng Tang

Subject: Re: [clocksource] 388450c708: netperf.Throughput_tps -65.1% regression

On Sun, May 16, 2021 at 02:34:19PM +0800, Feng Tang wrote:
> On Fri, May 14, 2021 at 10:49:08AM -0700, Paul E. McKenney wrote:
> > On Fri, May 14, 2021 at 03:43:14PM +0800, Feng Tang wrote:
> > > Hi Paul,
> > >
> > > On Thu, May 13, 2021 at 10:07:07AM -0700, Paul E. McKenney wrote:
> > > > On Thu, May 13, 2021 at 11:55:15PM +0800, kernel test robot wrote:
> > > > > [ ... 0day report details snipped ... ]
> > > >
> > > > Agreed, given the clock-skew event and the resulting switch to HPET,
> > > > performance regressions are expected behavior.
> > > >
> > > > That dmesg output does demonstrate the value of Feng Tang's patch!
> > > >
> > > > I don't see how to obtain the values of ->mult and ->shift that would
> > > > allow me to compute the delta. So if you don't tell me otherwise, I
> > > > will assume that the skew itself was expected on this hardware, perhaps
> > > > somehow due to the tpm_tis_status warning immediately preceding the
> > > > clock-skew event. If my assumption is incorrect, please let me know.
> > >
> > > I run the case with the debug patch applied, the info is:
> > >
> > > [ ... dmesg output snipped ... ]
> > >
> > > So the delta is 1184 us (505192062 - 504008008), and I agree with
> > > you that it should be related with the tpm_tis_status warning stuff.
> > >
> > > But this re-trigger my old concerns, that if the margins calculated
> > > for tsc, hpet are too small?
> >
> > If the error really did disturb either tsc or hpet, then we really
> > do not have a false positive, and nothing should change (aside from
> > perhaps documenting that TPM issues can disturb the clocks, or better
> > yet treating that perturbation as a separate bug that should be fixed).
> > But if this is yet another way to get a confused measurement, then it
> > would be better to work out a way to reject the confusion and keep the
> > tighter margins. I cannot think right off of a way that this could
> > cause measurement confusion, but you never know.
>
> I have no doubt in the correctness of the measuring method, but was
> just afraid some platforms which use to 'just work' will be caught :)
>
> > So any thoughts on exactly how the tpm_tis_status warning might have
> > resulted in the skew?
>
> The tpm error message has been reported before, and from google there
> were some similar errors, we'll do some further check.

Some update on this: further debugging shows it is not related to the
TPM module, as the 'unstable' event still happens even with the TPM
module disabled in the kernel.

We ran this case on another test box of the same type but with the
latest BIOS and microcode; there the tsc frequency is correctly
calculated and the 'unstable' error cannot be reproduced. We will
check how to upgrade the test box in 0day.

Thanks,
Feng

2021-05-19 20:20:46

by Paul E. McKenney

Subject: Re: [clocksource] 388450c708: netperf.Throughput_tps -65.1% regression

On Wed, May 19, 2021 at 02:09:02PM +0800, Feng Tang wrote:
> On Sun, May 16, 2021 at 02:34:19PM +0800, Feng Tang wrote:
> > On Fri, May 14, 2021 at 10:49:08AM -0700, Paul E. McKenney wrote:
> > > On Fri, May 14, 2021 at 03:43:14PM +0800, Feng Tang wrote:
> > > > Hi Paul,
> > > >
> > > > On Thu, May 13, 2021 at 10:07:07AM -0700, Paul E. McKenney wrote:
> > > > > On Thu, May 13, 2021 at 11:55:15PM +0800, kernel test robot wrote:
> > > > > > [ ... 0day report details snipped ... ]
> > > > >
> > > > > Agreed, given the clock-skew event and the resulting switch to HPET,
> > > > > performance regressions are expected behavior.
> > > > >
> > > > > That dmesg output does demonstrate the value of Feng Tang's patch!
> > > > >
> > > > > I don't see how to obtain the values of ->mult and ->shift that would
> > > > > allow me to compute the delta. So if you don't tell me otherwise, I
> > > > > will assume that the skew itself was expected on this hardware, perhaps
> > > > > somehow due to the tpm_tis_status warning immediately preceding the
> > > > > clock-skew event. If my assumption is incorrect, please let me know.
> > > >
> > > > I run the case with the debug patch applied, the info is:
> > > >
> > > > [ ... dmesg output snipped ... ]
> > > >
> > > > So the delta is 1184 us (505192062 - 504008008), and I agree with
> > > > you that it should be related with the tpm_tis_status warning stuff.
> > > >
> > > > But this re-trigger my old concerns, that if the margins calculated
> > > > for tsc, hpet are too small?
> > >
> > > If the error really did disturb either tsc or hpet, then we really
> > > do not have a false positive, and nothing should change (aside from
> > > perhaps documenting that TPM issues can disturb the clocks, or better
> > > yet treating that perturbation as a separate bug that should be fixed).
> > > But if this is yet another way to get a confused measurement, then it
> > > would be better to work out a way to reject the confusion and keep the
> > > tighter margins. I cannot think right off of a way that this could
> > > cause measurement confusion, but you never know.
> >
> > I have no doubt about the correctness of the measuring method, but was
> > just afraid some platforms which used to 'just work' will be caught :)
> >
> > > So any thoughts on exactly how the tpm_tis_status warning might have
> > > resulted in the skew?
> >
> > The tpm error message has been reported before, and a Google search
> > turned up some similar errors; we'll do some further checking.
>
> Some update on this: further debugging shows it is not related to the
> TPM module, as the 'unstable' still happens even if we disable the TPM
> module in the kernel.
>
> We ran this case on another test box of the same type but with the
> latest BIOS and microcode; the tsc freq is correctly calculated and
> the 'unstable' error can't be reproduced. We will check how to
> upgrade the test box in 0day.

So this patch series might have located a real BIOS or firmware bug,
then? ;-)

Thanx, Paul

2021-05-20 00:54:36

by Feng Tang

[permalink] [raw]
Subject: Re: [clocksource] 388450c708: netperf.Throughput_tps -65.1% regression

On Wed, May 19, 2021 at 11:05:21AM -0700, Paul E. McKenney wrote:
> On Wed, May 19, 2021 at 02:09:02PM +0800, Feng Tang wrote:
> > On Sun, May 16, 2021 at 02:34:19PM +0800, Feng Tang wrote:
> > > On Fri, May 14, 2021 at 10:49:08AM -0700, Paul E. McKenney wrote:
> > > > On Fri, May 14, 2021 at 03:43:14PM +0800, Feng Tang wrote:
> > > > > Hi Paul,
> > > > >
> > > > > On Thu, May 13, 2021 at 10:07:07AM -0700, Paul E. McKenney wrote:
> > > > > > On Thu, May 13, 2021 at 11:55:15PM +0800, kernel test robot wrote:
> > > > > > >
> > > > > > >
> > > > > > > Greeting,
> > > > > > >
> > > > > > > FYI, we noticed a -65.1% regression of netperf.Throughput_tps due to commit:
> > > > > > >
> > > > > > >
> > > > > > > commit: 388450c7081ded73432e2b7148c1bb9a0b039963 ("[PATCH v12 clocksource 4/5] clocksource: Reduce clocksource-skew threshold for TSC")
> > > > > > > url: https://github.com/0day-ci/linux/commits/Paul-E-McKenney/Do-not-mark-clocks-unstable-due-to-delays-for-v5-13/20210501-083404
> > > > > > > base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 2d036dfa5f10df9782f5278fc591d79d283c1fad
> > > > > > >
> > > > > > > in testcase: netperf
> > > > > > > on test machine: 96 threads 2 sockets Ice Lake with 256G memory
> > > > > > > with following parameters:
> > > > > > >
> > > > > > > ip: ipv4
> > > > > > > runtime: 300s
> > > > > > > nr_threads: 25%
> > > > > > > cluster: cs-localhost
> > > > > > > test: UDP_RR
> > > > > > > cpufreq_governor: performance
> > > > > > > ucode: 0xb000280
> > > > > > >
> > > > > > > test-description: Netperf is a benchmark that can be used to measure various aspects of networking performance.
> > > > > > > test-url: http://www.netperf.org/netperf/
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > If you fix the issue, kindly add the following tag
> > > > > > > Reported-by: kernel test robot <[email protected]>
> > > > > > >
> > > > > > >
> > > > > > > also as Feng Tang checked, this is an "unstable clocksource" case.
> > > > > > > attached dmesg FYI.
> > > > > >
> > > > > > Agreed, given the clock-skew event and the resulting switch to HPET,
> > > > > > performance regressions are expected behavior.
> > > > > >
> > > > > > That dmesg output does demonstrate the value of Feng Tang's patch!
> > > > > >
> > > > > > I don't see how to obtain the values of ->mult and ->shift that would
> > > > > > allow me to compute the delta. So if you don't tell me otherwise, I
> > > > > > will assume that the skew itself was expected on this hardware, perhaps
> > > > > > somehow due to the tpm_tis_status warning immediately preceding the
> > > > > > clock-skew event. If my assumption is incorrect, please let me know.
> > > > >
> > > > > I ran the case with the debug patch applied; the info is:
> > > > >
> > > > > [ 13.796429] clocksource: timekeeping watchdog on CPU19: Marking clocksource 'tsc' as unstable because the skew is too large:
> > > > > [ 13.797413] clocksource: 'hpet' wd_nesc: 505192062 wd_now: 10657158 wd_last: fac6f97 mask: ffffffff
> > > > > [ 13.797413] clocksource: 'tsc' cs_nsec: 504008008 cs_now: 3445570292aa5 cs_last: 344551f0cad6f mask: ffffffffffffffff
> > > > > [ 13.797413] clocksource: 'tsc' is current clocksource.
> > > > > [ 13.797413] tsc: Marking TSC unstable due to clocksource watchdog
> > > > > [ 13.844513] clocksource: Checking clocksource tsc synchronization from CPU 50 to CPUs 0-1,12,22,32-33,60,65.
> > > > > [ 13.855080] clocksource: Switched to clocksource hpet
> > > > >
> > > > > So the delta is 1184 us (505192062 - 504008008), and I agree with
> > > > > you that it should be related to the tpm_tis_status warning.
> > > > >
> > > > > But this re-triggers my old concern: are the margins calculated
> > > > > for tsc and hpet too small?
> > > >
> > > > If the error really did disturb either tsc or hpet, then we really
> > > > do not have a false positive, and nothing should change (aside from
> > > > perhaps documenting that TPM issues can disturb the clocks, or better
> > > > yet treating that perturbation as a separate bug that should be fixed).
> > > > But if this is yet another way to get a confused measurement, then it
> > > > would be better to work out a way to reject the confusion and keep the
> > > > tighter margins. I cannot think right off of a way that this could
> > > > cause measurement confusion, but you never know.
> > >
> > > I have no doubt about the correctness of the measuring method, but was
> > > just afraid some platforms which used to 'just work' will be caught :)
> > >
> > > > So any thoughts on exactly how the tpm_tis_status warning might have
> > > > resulted in the skew?
> > >
> > > The tpm error message has been reported before, and a Google search
> > > turned up some similar errors; we'll do some further checking.
> >
> > Some update on this: further debugging shows it is not related to the
> > TPM module, as the 'unstable' still happens even if we disable the TPM
> > module in the kernel.
> >
> > We ran this case on another test box of the same type but with the
> > latest BIOS and microcode; the tsc freq is correctly calculated and
> > the 'unstable' error can't be reproduced. We will check how to
> > upgrade the test box in 0day.
>
> So this patch series might have located a real BIOS or firmware bug,
> then? ;-)

Yes, it did a good job in exposing a real-world bug! (The lucky
thing is that the bug has a fix :))

Thanks,
Feng

> Thanx, Paul

2021-05-20 04:56:27

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [clocksource] 388450c708: netperf.Throughput_tps -65.1% regression

On Thu, May 20, 2021 at 08:52:10AM +0800, Feng Tang wrote:
> On Wed, May 19, 2021 at 11:05:21AM -0700, Paul E. McKenney wrote:
> > On Wed, May 19, 2021 at 02:09:02PM +0800, Feng Tang wrote:
> > > On Sun, May 16, 2021 at 02:34:19PM +0800, Feng Tang wrote:
> > > > On Fri, May 14, 2021 at 10:49:08AM -0700, Paul E. McKenney wrote:
> > > > > On Fri, May 14, 2021 at 03:43:14PM +0800, Feng Tang wrote:
> > > > > > Hi Paul,
> > > > > >
> > > > > > On Thu, May 13, 2021 at 10:07:07AM -0700, Paul E. McKenney wrote:
> > > > > > > On Thu, May 13, 2021 at 11:55:15PM +0800, kernel test robot wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > Greeting,
> > > > > > > >
> > > > > > > > FYI, we noticed a -65.1% regression of netperf.Throughput_tps due to commit:
> > > > > > > >
> > > > > > > >
> > > > > > > > commit: 388450c7081ded73432e2b7148c1bb9a0b039963 ("[PATCH v12 clocksource 4/5] clocksource: Reduce clocksource-skew threshold for TSC")
> > > > > > > > url: https://github.com/0day-ci/linux/commits/Paul-E-McKenney/Do-not-mark-clocks-unstable-due-to-delays-for-v5-13/20210501-083404
> > > > > > > > base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 2d036dfa5f10df9782f5278fc591d79d283c1fad
> > > > > > > >
> > > > > > > > in testcase: netperf
> > > > > > > > on test machine: 96 threads 2 sockets Ice Lake with 256G memory
> > > > > > > > with following parameters:
> > > > > > > >
> > > > > > > > ip: ipv4
> > > > > > > > runtime: 300s
> > > > > > > > nr_threads: 25%
> > > > > > > > cluster: cs-localhost
> > > > > > > > test: UDP_RR
> > > > > > > > cpufreq_governor: performance
> > > > > > > > ucode: 0xb000280
> > > > > > > >
> > > > > > > > test-description: Netperf is a benchmark that can be used to measure various aspects of networking performance.
> > > > > > > > test-url: http://www.netperf.org/netperf/
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > If you fix the issue, kindly add the following tag
> > > > > > > > Reported-by: kernel test robot <[email protected]>
> > > > > > > >
> > > > > > > >
> > > > > > > > also as Feng Tang checked, this is an "unstable clocksource" case.
> > > > > > > > attached dmesg FYI.
> > > > > > >
> > > > > > > Agreed, given the clock-skew event and the resulting switch to HPET,
> > > > > > > performance regressions are expected behavior.
> > > > > > >
> > > > > > > That dmesg output does demonstrate the value of Feng Tang's patch!
> > > > > > >
> > > > > > > I don't see how to obtain the values of ->mult and ->shift that would
> > > > > > > allow me to compute the delta. So if you don't tell me otherwise, I
> > > > > > > will assume that the skew itself was expected on this hardware, perhaps
> > > > > > > somehow due to the tpm_tis_status warning immediately preceding the
> > > > > > > clock-skew event. If my assumption is incorrect, please let me know.
> > > > > >
> > > > > > I ran the case with the debug patch applied; the info is:
> > > > > >
> > > > > > [ 13.796429] clocksource: timekeeping watchdog on CPU19: Marking clocksource 'tsc' as unstable because the skew is too large:
> > > > > > [ 13.797413] clocksource: 'hpet' wd_nesc: 505192062 wd_now: 10657158 wd_last: fac6f97 mask: ffffffff
> > > > > > [ 13.797413] clocksource: 'tsc' cs_nsec: 504008008 cs_now: 3445570292aa5 cs_last: 344551f0cad6f mask: ffffffffffffffff
> > > > > > [ 13.797413] clocksource: 'tsc' is current clocksource.
> > > > > > [ 13.797413] tsc: Marking TSC unstable due to clocksource watchdog
> > > > > > [ 13.844513] clocksource: Checking clocksource tsc synchronization from CPU 50 to CPUs 0-1,12,22,32-33,60,65.
> > > > > > [ 13.855080] clocksource: Switched to clocksource hpet
> > > > > >
> > > > > > So the delta is 1184 us (505192062 - 504008008), and I agree with
> > > > > > you that it should be related to the tpm_tis_status warning.
> > > > > >
> > > > > > But this re-triggers my old concern: are the margins calculated
> > > > > > for tsc and hpet too small?
> > > > >
> > > > > If the error really did disturb either tsc or hpet, then we really
> > > > > do not have a false positive, and nothing should change (aside from
> > > > > perhaps documenting that TPM issues can disturb the clocks, or better
> > > > > yet treating that perturbation as a separate bug that should be fixed).
> > > > > But if this is yet another way to get a confused measurement, then it
> > > > > would be better to work out a way to reject the confusion and keep the
> > > > > tighter margins. I cannot think right off of a way that this could
> > > > > cause measurement confusion, but you never know.
> > > >
> > > > I have no doubt about the correctness of the measuring method, but was
> > > > just afraid some platforms which used to 'just work' will be caught :)
> > > >
> > > > > So any thoughts on exactly how the tpm_tis_status warning might have
> > > > > resulted in the skew?
> > > >
> > > > The tpm error message has been reported before, and a Google search
> > > > turned up some similar errors; we'll do some further checking.
> > >
> > > Some update on this: further debugging shows it is not related to the
> > > TPM module, as the 'unstable' still happens even if we disable the TPM
> > > module in the kernel.
> > >
> > > We ran this case on another test box of the same type but with the
> > > latest BIOS and microcode; the tsc freq is correctly calculated and
> > > the 'unstable' error can't be reproduced. We will check how to
> > > upgrade the test box in 0day.
> >
> > So this patch series might have located a real BIOS or firmware bug,
> > then? ;-)
>
> Yes, it did a good job in exposing a real-world bug! (The lucky
> thing is that the bug has a fix :))

Even better! ;-)

Thanx, Paul