Often the latency observed in a CPU is not caused by the work being done
in the CPU itself, but by work done on another CPU that causes the
hardware to stall all CPUs. In this case, it is interesting to know
what is happening on ALL CPUs, and the best way to do this is via
crash dump analysis.
Add the PANIC_ON_STOP option to osnoise/timerlat tracers. The default
behavior is having this option off. When enabled by the user, the system
will panic after hitting a stop tracing condition.
This option was motivated by a real scenario that Juri Lelli and I
were debugging.
Cc: Juri Lelli <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
---
kernel/trace/trace_osnoise.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 3f10dd1f2f1c..801eba0b5cf8 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -54,10 +54,11 @@
enum osnoise_options_index {
OSN_DEFAULTS = 0,
OSN_WORKLOAD,
+ OSN_PANIC_ON_STOP,
OSN_MAX
};
-static const char * const osnoise_options_str[OSN_MAX] = { "DEFAULTS", "OSNOISE_WORKLOAD" };
+static const char * const osnoise_options_str[OSN_MAX] = { "DEFAULTS", "OSNOISE_WORKLOAD", "PANIC_ON_STOP" };
#define OSN_DEFAULT_OPTIONS 0x2
unsigned long osnoise_options = OSN_DEFAULT_OPTIONS;
@@ -1270,6 +1271,9 @@ static __always_inline void osnoise_stop_tracing(void)
trace_array_printk_buf(tr->array_buffer.buffer, _THIS_IP_,
"stop tracing hit on cpu %d\n", smp_processor_id());
+ if (test_bit(OSN_PANIC_ON_STOP, &osnoise_options))
+ panic("tracer hit stop condition on CPU %d\n", smp_processor_id());
+
tracer_tracing_off(tr);
}
rcu_read_unlock();
--
2.32.0