Subject: [PATCH V4 0/3] Add osnoise/options options

After adding the osnoise/options file, a set of on/off options
came to mind, most of them based on discussions with Juri and Clark
while debugging problems.

The PANIC_ON_STOP option makes the system panic when the tracer
stops, so that a vmcore is generated and the latency can be analyzed
from the crash dump.

The OSNOISE_PREEMPT_DISABLE and OSNOISE_IRQ_DISABLE options refine
the type of noise that the osnoise tracer detects, allowing the
tool to measure only IRQ- and hardware-related noise, or only NMI-
and hardware-related noise, respectively.

Each patch has a description of the options and the last patch
documents them in the osnoise documentation file.
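
For readers following along, here is a minimal userspace sketch of how
the on/off options map to bits in a mask. The enum and string table
mirror the ones in this series, and 0x2 is the default from the patch
(only OSNOISE_WORKLOAD set). The helper osn_apply() is hypothetical: it
is not the kernel's parser, and rejecting NO_DEFAULTS is an assumption
of this sketch.

```c
#include <assert.h>
#include <string.h>

/* Mirrors the enum in the series: bit N of the mask is option N. */
enum osn_index {
	OSN_DEFAULTS = 0,
	OSN_WORKLOAD,
	OSN_PANIC_ON_STOP,
	OSN_PREEMPT_DISABLE,
	OSN_IRQ_DISABLE,
	OSN_MAX
};

static const char * const osn_str[OSN_MAX] = {
	"DEFAULTS",
	"OSNOISE_WORKLOAD",
	"PANIC_ON_STOP",
	"OSNOISE_PREEMPT_DISABLE",
	"OSNOISE_IRQ_DISABLE",
};

/* Same value as in the patch: only bit 1 (OSNOISE_WORKLOAD) is set. */
#define OSN_DEFAULT_OPTIONS 0x2UL

/*
 * Hypothetical helper (not the kernel implementation): apply one
 * option word to *mask. A "NO_" prefix clears the option, DEFAULTS
 * resets the mask. Returns 0 on success, -1 for an unknown word.
 */
static int osn_apply(unsigned long *mask, const char *word)
{
	int enable = 1;
	int i;

	assert(mask && word);

	if (strncmp(word, "NO_", 3) == 0) {
		enable = 0;
		word += 3;
	}

	for (i = 0; i < OSN_MAX; i++) {
		if (strcmp(word, osn_str[i]) != 0)
			continue;

		if (i == OSN_DEFAULTS) {
			/* Assumption of this sketch: NO_DEFAULTS is rejected. */
			if (!enable)
				return -1;
			*mask = OSN_DEFAULT_OPTIONS;
		} else if (enable) {
			*mask |= 1UL << i;
		} else {
			*mask &= ~(1UL << i);
		}
		return 0;
	}

	return -1;
}
```

Starting from the default mask, applying "PANIC_ON_STOP" sets bit 2 and
applying "NO_OSNOISE_WORKLOAD" clears bit 1; "DEFAULTS" brings the mask
back to 0x2.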

[1] https://lore.kernel.org/r/[email protected]/

Changes from V3:
- Fix documentation (Bagas Sanjaya)
- Optimize the preempt disable option (Steven Rostedt)
Changes from v2:
- rebased on top of linux-trace.git/ftrace/core
- removed the patches already added to the ftrace/core
Changes from v1:
- Changed the cover letter topic
- Add Acked-by Masami to the first patch
- Add the PANIC_ON_STOP option
- Add the OSNOISE_PREEMPT_DISABLE and OSNOISE_IRQ_DISABLE options
- Improved the documentation

Daniel Bristot de Oliveira (3):
tracing/osnoise: Add PANIC_ON_STOP option
tracing/osnoise: Add preempt and/or irq disabled options
Documentation/osnoise: Add osnoise/options documentation

Documentation/trace/osnoise-tracer.rst | 20 +++++++++-
kernel/trace/trace_osnoise.c | 52 +++++++++++++++++++++++---
2 files changed, 65 insertions(+), 7 deletions(-)

--
2.32.0


Subject: [PATCH V4 3/3] Documentation/osnoise: Add osnoise/options documentation

Add the documentation about the osnoise/options file, the options,
and some additional explanation about the OSNOISE_WORKLOAD option.

Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Bagas Sanjaya <[email protected]>
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
---
Documentation/trace/osnoise-tracer.rst | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/Documentation/trace/osnoise-tracer.rst b/Documentation/trace/osnoise-tracer.rst
index 3c675ed82b27..f2008e317223 100644
--- a/Documentation/trace/osnoise-tracer.rst
+++ b/Documentation/trace/osnoise-tracer.rst
@@ -92,8 +92,8 @@ Note that the example above shows a high number of HW noise samples.
The reason being is that this sample was taken on a virtual machine,
and the host interference is detected as a hardware interference.

-Tracer options
----------------------
+Tracer Configuration
+--------------------

The tracer has a set of options inside the osnoise directory, they are:

@@ -115,6 +115,22 @@ The tracer has a set of options inside the osnoise directory, they are:
NO_OSNOISE_WORKLOAD disables the OSNOISE_WORKLOAD option. The
special DEAFAULTS option resets all options to the default value.

+Tracer Options
+--------------
+
+The osnoise/options file exposes a set of on/off configuration options for
+the osnoise tracer. These options are:
+
+ - DEFAULTS: reset the options to the default value.
+ - OSNOISE_WORKLOAD: do not dispatch osnoise workload (see dedicated
+ section below).
+ - PANIC_ON_STOP: call panic() if the tracer stops. This option serves to
+ capture a vmcore.
+ - OSNOISE_PREEMPT_DISABLE: disable preemption while running the osnoise
+ workload, allowing only IRQ and hardware-related noise.
+ - OSNOISE_IRQ_DISABLE: disable IRQs while running the osnoise workload,
+ allowing only NMIs and hardware-related noise, like hwlat tracer.
+
Additional Tracing
------------------

--
2.32.0

Subject: [PATCH V4 2/3] tracing/osnoise: Add preempt and/or irq disabled options

The osnoise workload runs with preemption and IRQs enabled in such
a way as to allow all sorts of noise to disturb osnoise's execution.
The hwlat tracer has a similar workload but runs with IRQs disabled,
allowing only NMIs and the hardware to generate noise.
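
A rough userspace sketch of how the two options combine, based on the
logic in the diff below: preemption is only disabled explicitly when
IRQs stay enabled, and the workload only yields when neither option is
in effect. The names osn_pick_mode(), osn_visible_noise() and the
NOISE_* flags are made up for illustration, and treating NMIs as
visible in every mode is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative noise classes; these flags are not kernel identifiers. */
enum {
	NOISE_THREAD = 1 << 0,
	NOISE_IRQ    = 1 << 1,
	NOISE_NMI    = 1 << 2,
	NOISE_HW     = 1 << 3,
};

struct osn_mode {
	bool irq_off;		/* workload runs with IRQs disabled */
	bool preempt_off;	/* workload runs with preemption disabled */
	bool may_resched;	/* workload still yields to other threads */
};

/* Hypothetical helper: derive the effective mode from the two options. */
static struct osn_mode osn_pick_mode(bool opt_preempt_disable,
				     bool opt_irq_disable)
{
	struct osn_mode m;

	m.irq_off = opt_irq_disable;
	/*
	 * Disabling preemption is only required if IRQs stay enabled
	 * and the option is set.
	 */
	m.preempt_off = !opt_irq_disable && opt_preempt_disable;
	/* Only yield when neither option is in effect. */
	m.may_resched = !m.irq_off && !m.preempt_off;

	return m;
}

/* Which noise classes can still disturb the workload in this mode. */
static unsigned int osn_visible_noise(struct osn_mode m)
{
	unsigned int mask = NOISE_NMI | NOISE_HW;

	/* osn_pick_mode() never sets both at the same time. */
	assert(!(m.irq_off && m.preempt_off));

	if (!m.irq_off)
		mask |= NOISE_IRQ;
	if (m.may_resched)
		mask |= NOISE_THREAD;

	return mask;
}
```

With OSNOISE_IRQ_DISABLE set, the sketch reports only NMI and hardware
noise, which is the hwlat-like behavior described above. Setting both
options gives the same result, since the preempt option is then
redundant.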

While thinking about adding an options file to the hwlat tracer to
allow the system to panic, along with other features I was planning
to add, such as a tracepoint at each noise detection, it occurred to
me that it is easier to make osnoise also do hardware latency
detection than to make hwlat "feature compatible" with osnoise.

Other points are:
- osnoise already has an independent cpu file.
- osnoise has a more intuitive interface, e.g., runtime/period vs.
window/width (and people often need help remembering what the
latter mean).
- osnoise has tracepoints.
- osnoise has stop options.
- osnoise has the options file itself.

Moreover, the user-space side (in rtla) is simplified by reusing the
existing osnoise code.

Finally, people have been asking me about using osnoise for hw latency
detection, and I have to explain that it was sufficient but not
necessary. These options make it sufficient and necessary.

Adding a Suggested-by Clark, as he often asked me about this
possibility.

Suggested-by: Clark Williams <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
---
kernel/trace/trace_osnoise.c | 48 ++++++++++++++++++++++++++++++++----
1 file changed, 43 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 801eba0b5cf8..0ec8bb54180f 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -55,10 +55,17 @@ enum osnoise_options_index {
OSN_DEFAULTS = 0,
OSN_WORKLOAD,
OSN_PANIC_ON_STOP,
+ OSN_PREEMPT_DISABLE,
+ OSN_IRQ_DISABLE,
OSN_MAX
};

-static const char * const osnoise_options_str[OSN_MAX] = { "DEFAULTS", "OSNOISE_WORKLOAD", "PANIC_ON_STOP" };
+static const char * const osnoise_options_str[OSN_MAX] = {
+ "DEFAULTS",
+ "OSNOISE_WORKLOAD",
+ "PANIC_ON_STOP",
+ "OSNOISE_PREEMPT_DISABLE",
+ "OSNOISE_IRQ_DISABLE" };

#define OSN_DEFAULT_OPTIONS 0x2
unsigned long osnoise_options = OSN_DEFAULT_OPTIONS;
@@ -1308,6 +1315,7 @@ static void notify_new_max_latency(u64 latency)
*/
static int run_osnoise(void)
{
+ bool irq_disable = test_bit(OSN_IRQ_DISABLE, &osnoise_options);
struct osnoise_variables *osn_var = this_cpu_osn_var();
u64 start, sample, last_sample;
u64 last_int_count, int_count;
@@ -1315,11 +1323,18 @@ static int run_osnoise(void)
s64 total, last_total = 0;
struct osnoise_sample s;
unsigned int threshold;
+ bool preempt_disable;
u64 runtime, stop_in;
u64 sum_noise = 0;
int hw_count = 0;
int ret = -1;

+ /*
+ * Disabling preemption is only required if IRQs are enabled,
+ * and the options is set on.
+ */
+ preempt_disable = !irq_disable && test_bit(OSN_PREEMPT_DISABLE, &osnoise_options);
+
/*
* Considers the current thread as the workload.
*/
@@ -1335,6 +1350,15 @@ static int run_osnoise(void)
*/
threshold = tracing_thresh ? : 5000;

+ /*
+ * Apply PREEMPT and IRQ disabled options.
+ */
+ if (irq_disable)
+ local_irq_disable();
+
+ if (preempt_disable)
+ preempt_disable();
+
/*
* Make sure NMIs see sampling first
*/
@@ -1422,16 +1446,21 @@ static int run_osnoise(void)
* cond_resched()
*/
if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
- local_irq_disable();
+ if (!irq_disable)
+ local_irq_disable();
+
rcu_momentary_dyntick_idle();
- local_irq_enable();
+
+ if (!irq_disable)
+ local_irq_enable();
}

/*
* For the non-preemptive kernel config: let threads runs, if
- * they so wish.
+ * they so wish, unless set not to do so.
*/
- cond_resched();
+ if (!irq_disable && !preempt_disable)
+ cond_resched();

last_sample = sample;
last_int_count = int_count;
@@ -1450,6 +1479,15 @@ static int run_osnoise(void)
*/
barrier();

+ /*
+ * Return to the preemptive state.
+ */
+ if (preempt_disable)
+ preempt_enable();
+
+ if (irq_disable)
+ local_irq_enable();
+
/*
* Save noise info.
*/
--
2.32.0

Subject: [PATCH V4 1/3] tracing/osnoise: Add PANIC_ON_STOP option

Often the latency observed on a CPU is not caused by the work being done
on that CPU itself, but by work done on another CPU that causes the
hardware to stall all CPUs. In this case, it is useful to know
what is happening on ALL CPUs, and the best way to do this is via
crash dump analysis.

Add the PANIC_ON_STOP option to the osnoise/timerlat tracers. The option
is off by default. When enabled by the user, the system will panic after
hitting a stop tracing condition.
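
A minimal userspace sketch of that decision, assuming the bit layout
from the enum in this patch (PANIC_ON_STOP is bit 2). The names
test_option(), on_stop_condition() and enum stop_action are
hypothetical stand-ins, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit index taken from the enum in the patch. */
#define OSN_PANIC_ON_STOP 2

/* Hypothetical outcome of hitting a stop tracing condition. */
enum stop_action {
	STOP_TRACING_OFF,	/* default: just turn tracing off */
	STOP_PANIC,		/* PANIC_ON_STOP set: panic to get a vmcore */
};

/* Userspace stand-in for test_bit() on a single unsigned long. */
static bool test_option(unsigned long options, unsigned int bit)
{
	assert(bit < 8 * sizeof(options));
	return (options >> bit) & 1UL;
}

/* Decide what happens when the tracer hits a stop condition. */
static enum stop_action on_stop_condition(unsigned long options)
{
	if (test_option(options, OSN_PANIC_ON_STOP))
		return STOP_PANIC;

	return STOP_TRACING_OFF;
}
```

With the default mask (0x2) the sketch only turns tracing off; once bit
2 is set it picks the panic path.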

This option was motivated by a real scenario that Juri Lelli and I
were debugging.

Cc: Juri Lelli <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
---
kernel/trace/trace_osnoise.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 3f10dd1f2f1c..801eba0b5cf8 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -54,10 +54,11 @@
enum osnoise_options_index {
OSN_DEFAULTS = 0,
OSN_WORKLOAD,
+ OSN_PANIC_ON_STOP,
OSN_MAX
};

-static const char * const osnoise_options_str[OSN_MAX] = { "DEFAULTS", "OSNOISE_WORKLOAD" };
+static const char * const osnoise_options_str[OSN_MAX] = { "DEFAULTS", "OSNOISE_WORKLOAD", "PANIC_ON_STOP" };

#define OSN_DEFAULT_OPTIONS 0x2
unsigned long osnoise_options = OSN_DEFAULT_OPTIONS;
@@ -1270,6 +1271,9 @@ static __always_inline void osnoise_stop_tracing(void)
trace_array_printk_buf(tr->array_buffer.buffer, _THIS_IP_,
"stop tracing hit on cpu %d\n", smp_processor_id());

+ if (test_bit(OSN_PANIC_ON_STOP, &osnoise_options))
+ panic("tracer hit stop condition on CPU %d\n", smp_processor_id());
+
tracer_tracing_off(tr);
}
rcu_read_unlock();
--
2.32.0

2022-12-01 04:33:09

by Bagas Sanjaya

Subject: Re: [PATCH V4 3/3] Documentation/osnoise: Add osnoise/options documentation

On Wed, Nov 30, 2022 at 07:35:42PM +0100, Daniel Bristot de Oliveira wrote:
> diff --git a/Documentation/trace/osnoise-tracer.rst b/Documentation/trace/osnoise-tracer.rst
> index 3c675ed82b27..f2008e317223 100644
> --- a/Documentation/trace/osnoise-tracer.rst
> +++ b/Documentation/trace/osnoise-tracer.rst
> @@ -92,8 +92,8 @@ Note that the example above shows a high number of HW noise samples.
> The reason being is that this sample was taken on a virtual machine,
> and the host interference is detected as a hardware interference.
>
> -Tracer options
> ----------------------
> +Tracer Configuration
> +--------------------
>
> The tracer has a set of options inside the osnoise directory, they are:
>
> @@ -115,6 +115,22 @@ The tracer has a set of options inside the osnoise directory, they are:
> NO_OSNOISE_WORKLOAD disables the OSNOISE_WORKLOAD option. The
> special DEAFAULTS option resets all options to the default value.
>
> +Tracer Options
> +--------------
> +
> +The osnoise/options file exposes a set of on/off configuration options for
> +the osnoise tracer. These options are:
> +
> + - DEFAULTS: reset the options to the default value.
> + - OSNOISE_WORKLOAD: do not dispatch osnoise workload (see dedicated
> + section below).
> + - PANIC_ON_STOP: call panic() if the tracer stops. This option serves to
> + capture a vmcore.
> + - OSNOISE_PREEMPT_DISABLE: disable preemption while running the osnoise
> + workload, allowing only IRQ and hardware-related noise.
> + - OSNOISE_IRQ_DISABLE: disable IRQs while running the osnoise workload,
> + allowing only NMIs and hardware-related noise, like hwlat tracer.
> +
> Additional Tracing
> ------------------
>

The doc LGTM, thanks!

Reviewed-by: Bagas Sanjaya <[email protected]>

--
An old man doll... just what I always wanted! - Clara



2022-12-09 20:48:25

by Steven Rostedt

Subject: Re: [PATCH V4 2/3] tracing/osnoise: Add preempt and/or irq disabled options

On Wed, 30 Nov 2022 19:19:10 +0100
Daniel Bristot de Oliveira <[email protected]> wrote:


Hi Daniel,

As I was adding this series, I noticed an issue that needs to be fixed.

> static int run_osnoise(void)
> {
> + bool irq_disable = test_bit(OSN_IRQ_DISABLE, &osnoise_options);
> struct osnoise_variables *osn_var = this_cpu_osn_var();
> u64 start, sample, last_sample;
> u64 last_int_count, int_count;
> @@ -1315,11 +1323,18 @@ static int run_osnoise(void)
> s64 total, last_total = 0;
> struct osnoise_sample s;
> unsigned int threshold;
> + bool preempt_disable;

Let's use a different name for the above variable.

> u64 runtime, stop_in;
> u64 sum_noise = 0;
> int hw_count = 0;
> int ret = -1;
>
> + /*
> + * Disabling preemption is only required if IRQs are enabled,
> + * and the options is set on.
> + */
> + preempt_disable = !irq_disable && test_bit(OSN_PREEMPT_DISABLE, &osnoise_options);
> +
> /*
> * Considers the current thread as the workload.
> */
> @@ -1335,6 +1350,15 @@ static int run_osnoise(void)
> */
> threshold = tracing_thresh ? : 5000;
>
> + /*
> + * Apply PREEMPT and IRQ disabled options.
> + */
> + if (irq_disable)
> + local_irq_disable();
> +
> + if (preempt_disable)
> + preempt_disable();
> +

The only reason the above works is because preempt_disable() is a macro.
If it was a function, then it would likely fail to build (as you are
overriding the name with a bool variable).

-- Steve
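
To illustrate the point above, here is a small userspace sketch. The
preempt_disable()/preempt_enable() macros are stand-ins that only count
nesting (they are not the kernel definitions), and run_sample() is a
made-up function. It builds because a function-like macro is expanded
only when its name is followed by "(", so the bare identifier remains
an ordinary variable.

```c
#include <assert.h>
#include <stdbool.h>

static int preempt_count;

/* Userspace stand-ins for the kernel macros: they only count nesting. */
#define preempt_disable()	do { preempt_count++; } while (0)
#define preempt_enable()	do { preempt_count--; } while (0)

/* Returns the nesting depth seen inside the sampled section. */
static int run_sample(bool option_set)
{
	/*
	 * Legal only because preempt_disable() is a function-like macro:
	 * without a following "(", the name below is not expanded and
	 * simply declares a local variable.
	 */
	bool preempt_disable = option_set;
	int depth;

	if (preempt_disable)
		preempt_disable();

	depth = preempt_count;

	if (preempt_disable)
		preempt_enable();

	/* The disable/enable pair above is balanced. */
	assert(preempt_count == 0);

	return depth;
}
```

If preempt_disable() were a real function instead, the local bool would
shadow it and the call would be rejected by the compiler, since a bool
is not callable. Renaming the variable (for example to a hypothetical
disable_preemption) removes the dependency on the macro.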

Subject: Re: [PATCH V4 2/3] tracing/osnoise: Add preempt and/or irq disabled options

On 12/9/22 21:35, Steven Rostedt wrote:
>> + if (preempt_disable)
>> + preempt_disable();
>> +
> The only reason the above works is because preempt_disable() is a macro.
> If it was a function, then it would likely fail to build (as you are
> overriding the name with a bool variable).

oops.

Sending a new version changing the variable name.

-- Daniel