When the hardware interrupt processing function is executed, the interrupt and
preemption of current cpu are disabled. As a result, the task is suspended.
The execution of the hardware processing function takes a long time
(for example 5 ms), will affect the task scheduling performance.
This patches provides the 'perf irq' command to trace and calculate the time
consumed of the hardware irq function.
[verse]
'perf irq' {record|timeconsume|script}
There are several variants of 'perf irq':
'perf irq record <command>' to record the irq handler events
of an arbitrary workload.
'perf irq script' to see a detailed trace of the workload that
was recorded (aliased to 'perf script' for now).
'perf irq timeconsume' to calculate the time consumed by each
hardware interrupt processing function.
Example usage:
perf irq record -- sleep 1
perf irq timeconsume
By default it shows the individual irq events, including the irq name,
cpu(execute the hardware interrupt processing function), time consumed,
entry time and exit time for the each hardware irq:
-------------------------------------------------------------------------------------------------------------------------------------------
Irq name | CPU | Time consume us | Handler entry time | Handler exit time
-------------------------------------------------------------------------------------------------------------------------------------------
enp2s0f2-tx-0 | [0006] | 0.000001 s | 6631263.313329 s | 6631263.313330 s
-------------------------------------------------------------------------------------------------------------------------------------------
Irq name | CPU | Time consume us | Handler entry time | Handler exit time
-------------------------------------------------------------------------------------------------------------------------------------------
megasas | [0013] | 0.000003 s | 6631263.209564 s | 6631263.209567 s
-------------------------------------------------------------------------------------------------------------------------------------------
Irq name | CPU | Time consume us | Handler entry time | Handler exit time
-------------------------------------------------------------------------------------------------------------------------------------------
acpi | [0016] | 0.000018 s | 6631263.085787 s | 6631263.085805 s
Bixuan Cui (2):
perf tools: add 'perf irq' to measure the hardware interrupts
perf tools: Add documentation for 'perf irq' command
tools/perf/Build | 1 +
tools/perf/Documentation/perf-irq.txt | 58 ++++++
tools/perf/builtin-irq.c | 288 ++++++++++++++++++++++++++
tools/perf/builtin.h | 1 +
tools/perf/command-list.txt | 1 +
tools/perf/perf.c | 1 +
6 files changed, 350 insertions(+)
create mode 100644 tools/perf/Documentation/perf-irq.txt
create mode 100644 tools/perf/builtin-irq.c
--
2.17.1
Add 'perf irq' to trace/measure the hardware interrupts.
Now three functions are provided:
1. 'perf irq record <command>' to record the irq handler events.
2. 'perf irq script' to see a detailed trace of the workload that
was recorded.
3. 'perf irq timeconsume' to calculate the time consumed by each
hardware interrupt processing function.
Signed-off-by: Bixuan Cui <[email protected]>
---
tools/perf/Build | 1 +
tools/perf/builtin-irq.c | 288 +++++++++++++++++++++++++++++++++++++++
tools/perf/builtin.h | 1 +
tools/perf/perf.c | 1 +
4 files changed, 291 insertions(+)
create mode 100644 tools/perf/builtin-irq.c
diff --git a/tools/perf/Build b/tools/perf/Build
index 5f392dbb88fc..d52a1e1d6d8a 100644
--- a/tools/perf/Build
+++ b/tools/perf/Build
@@ -24,6 +24,7 @@ perf-y += builtin-mem.o
perf-y += builtin-data.o
perf-y += builtin-version.o
perf-y += builtin-c2c.o
+perf-y += builtin-irq.o
perf-$(CONFIG_TRACE) += builtin-trace.o
perf-$(CONFIG_LIBELF) += builtin-probe.o
diff --git a/tools/perf/builtin-irq.c b/tools/perf/builtin-irq.c
new file mode 100644
index 000000000000..3a73e698dedf
--- /dev/null
+++ b/tools/perf/builtin-irq.c
@@ -0,0 +1,288 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "builtin.h"
+#include "perf.h"
+#include "perf-sys.h"
+
+#include "util/cpumap.h"
+#include "util/evlist.h"
+#include "util/evsel.h"
+#include "util/evsel_fprintf.h"
+#include "util/symbol.h"
+#include "util/thread.h"
+#include "util/header.h"
+#include "util/session.h"
+#include "util/tool.h"
+#include "util/cloexec.h"
+#include "util/thread_map.h"
+#include "util/color.h"
+#include "util/stat.h"
+#include "util/string2.h"
+#include "util/callchain.h"
+#include "util/time-utils.h"
+
+#include <subcmd/pager.h>
+#include <subcmd/parse-options.h>
+#include "util/trace-event.h"
+
+#include "util/debug.h"
+#include "util/event.h"
+
+#include <linux/kernel.h>
+#include <linux/log2.h>
+#include <linux/zalloc.h>
+#include <sys/prctl.h>
+#include <sys/resource.h>
+#include <inttypes.h>
+
+#include <errno.h>
+#include <semaphore.h>
+#include <pthread.h>
+#include <math.h>
+#include <api/fs/fs.h>
+#include <perf/cpumap.h>
+#include <linux/time64.h>
+#include <linux/err.h>
+
+#include <linux/ctype.h>
+
+#define IRQ_NAME_LEN 20
+#define MAX_CPUS 4096
+
+static const char *cpu_list;
+static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
+
+struct perf_irq;
+
+struct perf_irq {
+ struct perf_tool tool;
+ bool force;
+
+ u32 irq_entry_irq;
+ char irq_name[IRQ_NAME_LEN];
+ u32 cpu;
+ u64 irq_entry_time;
+ u32 irq_entry_pid;
+ u32 irq_exit_irq;
+ u64 irq_exit_time;
+ u32 irq_exit_pid;
+};
+
+typedef int (*irq_handler)(struct perf_tool *tool,
+ union perf_event *event,
+ struct evsel *evsel,
+ struct perf_sample *sample,
+ struct machine *machine);
+
+static int perf_timeconsume_process_sample(struct perf_tool *tool,
+ union perf_event *event,
+ struct perf_sample *sample,
+ struct evsel *evsel,
+ struct machine *machine)
+{
+ int err = 0;
+
+ if (evsel->handler != NULL) {
+ irq_handler f = evsel->handler;
+ err = f(tool, event, evsel, sample, machine);
+ }
+
+ return err;
+}
+
+static void output_timeconsume(struct perf_irq *irq)
+{
+ int ret, i;
+ char irq_entry_time[30], irq_exit_time[30], irq_diff[30];
+
+ /* The entry and exit of the hardware irq function
+ * exist at the same time. Check it by irq and pid.
+ */
+ if (irq->irq_entry_pid != irq->irq_exit_pid ||
+ irq->irq_entry_irq != irq->irq_exit_irq)
+ return;
+
+ timestamp__scnprintf_usec(irq->irq_entry_time,
+ irq_entry_time, sizeof(irq_entry_time));
+ timestamp__scnprintf_usec(irq->irq_exit_time,
+ irq_exit_time, sizeof(irq_exit_time));
+ timestamp__scnprintf_usec(irq->irq_exit_time - irq->irq_entry_time,
+ irq_diff, sizeof(irq_diff));
+
+ printf(" -------------------------------------------------------------------------------------------------------------------------------------------\n");
+ printf(" Irq name | CPU | Time consume us | Handler entry time | Handler exit time \n");
+ printf(" -------------------------------------------------------------------------------------------------------------------------------------------\n");
+
+ ret = printf(" %s ", irq->irq_name);
+ for (i = 0; i < IRQ_NAME_LEN - ret; i++)
+ printf(" ");
+
+ printf("| [%04d] | %13s s | %16s s | %16s s\n",
+ irq->cpu, irq_diff, irq_entry_time, irq_exit_time);
+ printf("\n");
+}
+
+static int timeconsume_irq_handler_entry_event(struct perf_tool *tool,
+ union perf_event *event __maybe_unused,
+ struct evsel *evsel,
+ struct perf_sample *sample,
+ struct machine *machine __maybe_unused)
+{
+ int err = 0;
+ struct perf_irq *irq = container_of(tool, struct perf_irq, tool);
+
+ const char *name = evsel__strval(evsel, sample, "name");
+
+ irq->irq_entry_pid = evsel__intval(evsel, sample, "pid");
+ irq->irq_entry_irq = evsel__intval(evsel, sample, "irq");
+ irq->irq_entry_time = sample->time;
+ strncpy(irq->irq_name, name, IRQ_NAME_LEN);
+
+ return err;
+}
+
+static int timeconsume_irq_handler_exit_event(struct perf_tool *tool,
+ union perf_event *event __maybe_unused,
+ struct evsel *evsel,
+ struct perf_sample *sample,
+ struct machine *machine __maybe_unused)
+{
+ int err = 0;
+ struct perf_irq *irq = container_of(tool, struct perf_irq, tool);
+
+ irq->irq_exit_pid = evsel__intval(evsel, sample, "pid");
+ irq->irq_exit_irq = evsel__intval(evsel, sample, "irq");
+ irq->irq_exit_time = sample->time;
+ irq->cpu = sample->cpu;
+
+ if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
+ return err;
+
+ output_timeconsume(irq);
+
+ return err;
+}
+
+static int perf_irq__read_events(struct perf_irq *irq)
+{
+ struct evsel_str_handler handlers[] = {
+ { "irq:irq_handler_entry", timeconsume_irq_handler_entry_event, },
+ { "irq:irq_handler_exit", timeconsume_irq_handler_exit_event, },
+ };
+ struct perf_session *session;
+ struct perf_data data = {
+ .path = input_name,
+ .mode = PERF_DATA_MODE_READ,
+ .force = irq->force,
+ };
+ int rc = -1;
+
+ irq->tool.sample = perf_timeconsume_process_sample;
+
+ session = perf_session__new(&data, false, &irq->tool);
+ if (IS_ERR(session)) {
+ pr_debug("Error creating perf session");
+ return PTR_ERR(session);
+ }
+
+ if (cpu_list) {
+ rc = perf_session__cpu_bitmap(session, cpu_list, cpu_bitmap);
+ if (rc < 0)
+ goto out_delete;
+ }
+
+ if (perf_session__set_tracepoints_handlers(session, handlers))
+ goto out_delete;
+
+ rc = perf_session__process_events(session);
+ if (rc) {
+ pr_err("Failed to process events, error %d", rc);
+ goto out_delete;
+ }
+
+out_delete:
+ perf_session__delete(session);
+ return rc;
+}
+
+static int irq_timeconsume(struct perf_irq *irq)
+{
+ if (perf_irq__read_events(irq))
+ return -1;
+
+ return 0;
+}
+
+static int __cmd_record(int argc, const char **argv)
+{
+ unsigned int rec_argc, i, j;
+ const char **rec_argv;
+ const char * const record_args[] = {
+ "record",
+ "-a",
+ "-R",
+ "-m", "1024",
+ "-c", "1",
+ "-e", "irq:irq_handler_entry",
+ "-e", "irq:irq_handler_exit",
+ };
+
+ rec_argc = ARRAY_SIZE(record_args) + argc - 1;
+ rec_argv = calloc(rec_argc + 1, sizeof(char *));
+ if (rec_argv == NULL)
+ return -ENOMEM;
+
+ for (i = 0; i < ARRAY_SIZE(record_args); i++)
+ rec_argv[i] = strdup(record_args[i]);
+
+ for (j = 1; j < (unsigned int)argc; j++, i++)
+ rec_argv[i] = argv[j];
+
+ BUG_ON(i != rec_argc);
+
+ return cmd_record(i, rec_argv);
+}
+
+int cmd_irq(int argc, const char **argv)
+{
+ struct perf_irq irq = {
+ .force = false,
+ };
+
+ const struct option irq_options[] = {
+ OPT_END()
+ };
+ const struct option timeconsume_options[] = {
+ OPT_STRING('C', "cpu", &cpu_list, "cpu", "list of cpus to profile"),
+ OPT_PARENT(irq_options)
+ };
+
+ const char * const timeconsume_usage[] = {
+ "perf irq timeconsume [<options>]",
+ NULL
+ };
+ const char *const irq_subcommands[] = { "record", "timeconsume", "script",
+ NULL };
+ const char *irq_usage[] = {
+ NULL,
+ NULL
+ };
+
+ argc = parse_options_subcommand(argc, argv, irq_options, irq_subcommands,
+ irq_usage, PARSE_OPT_STOP_AT_NON_OPTION);
+ if (!argc)
+ usage_with_options(irq_usage, irq_options);
+
+ if (!strcmp(argv[0], "script")) {
+ return cmd_script(argc, argv);
+ } else if (!strncmp(argv[0], "record", 6)) {
+ return __cmd_record(argc, argv);
+ } else if (!strncmp(argv[0], "timeconsume", 11)) {
+ if (argc > 1) {
+ argc = parse_options(argc, argv, timeconsume_options, timeconsume_usage, 0);
+ if (argc)
+ usage_with_options(timeconsume_usage, timeconsume_options);
+ }
+ return irq_timeconsume(&irq);
+ }
+ return 0;
+}
diff --git a/tools/perf/builtin.h b/tools/perf/builtin.h
index 14a2db622a7b..fd92dfc89370 100644
--- a/tools/perf/builtin.h
+++ b/tools/perf/builtin.h
@@ -37,6 +37,7 @@ int cmd_inject(int argc, const char **argv);
int cmd_mem(int argc, const char **argv);
int cmd_data(int argc, const char **argv);
int cmd_ftrace(int argc, const char **argv);
+int cmd_irq(int argc, const char **argv);
int find_scripts(char **scripts_array, char **scripts_path_array, int num,
int pathlen);
diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 27f94b0bb874..df9cb344c2d0 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -88,6 +88,7 @@ static struct cmd_struct commands[] = {
{ "mem", cmd_mem, 0 },
{ "data", cmd_data, 0 },
{ "ftrace", cmd_ftrace, 0 },
+ { "irq", cmd_irq, 0 },
};
struct pager_config {
--
2.17.1
Add documentation for 'perf irq' command.
Signed-off-by: Bixuan Cui <[email protected]>
---
tools/perf/Documentation/perf-irq.txt | 58 +++++++++++++++++++++++++++
tools/perf/command-list.txt | 1 +
2 files changed, 59 insertions(+)
create mode 100644 tools/perf/Documentation/perf-irq.txt
diff --git a/tools/perf/Documentation/perf-irq.txt b/tools/perf/Documentation/perf-irq.txt
new file mode 100644
index 000000000000..8c0e388dad59
--- /dev/null
+++ b/tools/perf/Documentation/perf-irq.txt
@@ -0,0 +1,58 @@
+perf-irq(1)
+=============
+
+NAME
+----
+perf-irq - Tool to trace/measure hardware interrupts
+
+SYNOPSIS
+--------
+[verse]
+'perf irq' {record|timeconsume|script}
+
+DESCRIPTION
+-----------
+There are several variants of 'perf irq':
+
+ 'perf irq record <command>' to record the irq handler events
+ of an arbitrary workload.
+
+ 'perf irq script' to see a detailed trace of the workload that
+ was recorded (aliased to 'perf script' for now).
+
+ 'perf irq timeconsume' to calculate the time consumed by each
+ hardware interrupt processing function.
+
+ Example usage:
+ perf irq record -- sleep 1
+ perf irq timeconsume
+
+ By default it shows the individual irq events, including the irq name,
+ cpu(execute the hardware interrupt processing function), time consumed,
+ entry time and exit time for the each hardware irq:
+
+ -------------------------------------------------------------------------------------------------------------------------------------------
+ Irq name | CPU | Time consume us | Handler entry time | Handler exit time
+ -------------------------------------------------------------------------------------------------------------------------------------------
+ enp2s0f2-tx-0 | [0006] | 0.000001 s | 6631263.313329 s | 6631263.313330 s
+
+ -------------------------------------------------------------------------------------------------------------------------------------------
+ Irq name | CPU | Time consume us | Handler entry time | Handler exit time
+ -------------------------------------------------------------------------------------------------------------------------------------------
+ megasas | [0013] | 0.000003 s | 6631263.209564 s | 6631263.209567 s
+
+ -------------------------------------------------------------------------------------------------------------------------------------------
+ Irq name | CPU | Time consume us | Handler entry time | Handler exit time
+ -------------------------------------------------------------------------------------------------------------------------------------------
+ acpi | [0016] | 0.000018 s | 6631263.085787 s | 6631263.085805 s
+
+
+OPTIONS for 'perf irq'
+----------------------------
+
+--cpus::
+ Show just entries with activities for the given CPUs.
+
+SEE ALSO
+--------
+linkperf:perf-record[1]
diff --git a/tools/perf/command-list.txt b/tools/perf/command-list.txt
index bc6c585f74fc..c5224ea3ac71 100644
--- a/tools/perf/command-list.txt
+++ b/tools/perf/command-list.txt
@@ -26,6 +26,7 @@ perf-report mainporcelain common
perf-sched mainporcelain common
perf-script mainporcelain common
perf-stat mainporcelain common
+perf-irq mainporcelain common
perf-test mainporcelain common
perf-timechart mainporcelain common
perf-top mainporcelain common
--
2.17.1
Hi Bixuan,
On 12.01.2021 15:55, Bixuan Cui wrote:
> Add 'perf irq' to trace/measure the hardware interrupts.
>
> Now three functions are provided:
> 1. 'perf irq record <command>' to record the irq handler events.
> 2. 'perf irq script' to see a detailed trace of the workload that
> was recorded.
> 3. 'perf irq timeconsume' to calculate the time consumed by each
> hardware interrupt processing function.
>
> Signed-off-by: Bixuan Cui <[email protected]>
Thanks for the patches. There is still something that could be improved.
> ---
> tools/perf/Build | 1 +
> tools/perf/builtin-irq.c | 288 +++++++++++++++++++++++++++++++++++++++
> tools/perf/builtin.h | 1 +
> tools/perf/perf.c | 1 +
> 4 files changed, 291 insertions(+)
> create mode 100644 tools/perf/builtin-irq.c
>
<SNIP>
> +
> +static int __cmd_record(int argc, const char **argv)
> +{
> + unsigned int rec_argc, i, j;
> + const char **rec_argv;
> + const char * const record_args[] = {
> + "record",
> + "-a",
I see it works also like this:
sudo perf record -p PID -c 1 -e irq:irq_handler_entry,irq:irq_handler_exit
sudo perf record -R -c 1 -e irq:irq_handler_entry,irq:irq_handler_exit -- find /
This -a option jointly with -p option could be made configurable from
the command line for perf irq mode.
> + "-R",
> + "-m", "1024",
Do you see data losses with default buffer size of 512KB
when capturing trace in your specific use case?
If not then this -m could be avoided or made configurable
if you still need it.
Regards,
Alexei
Hi,
On 12.01.2021 15:55, Bixuan Cui wrote:
> Add documentation for 'perf irq' command.
>
> Signed-off-by: Bixuan Cui <[email protected]>
> ---
> tools/perf/Documentation/perf-irq.txt | 58 +++++++++++++++++++++++++++
> tools/perf/command-list.txt | 1 +
> 2 files changed, 59 insertions(+)
> create mode 100644 tools/perf/Documentation/perf-irq.txt
>
> diff --git a/tools/perf/Documentation/perf-irq.txt b/tools/perf/Documentation/perf-irq.txt
> new file mode 100644
> index 000000000000..8c0e388dad59
> --- /dev/null
> +++ b/tools/perf/Documentation/perf-irq.txt
> @@ -0,0 +1,58 @@
> +perf-irq(1)
> +=============
> +
> +NAME
> +----
> +perf-irq - Tool to trace/measure hardware interrupts
> +
> +SYNOPSIS
> +--------
> +[verse]
> +'perf irq' {record|timeconsume|script}
> +
> +DESCRIPTION
> +-----------
> +There are several variants of 'perf irq':
> +
> + 'perf irq record <command>' to record the irq handler events
> + of an arbitrary workload.
> +
> + 'perf irq script' to see a detailed trace of the workload that
> + was recorded (aliased to 'perf script' for now).
> +
> + 'perf irq timeconsume' to calculate the time consumed by each
> + hardware interrupt processing function.
> +
> + Example usage:
> + perf irq record -- sleep 1
> + perf irq timeconsume
This timeconsume mode looks more like: perf irq report <options>
> +
> + By default it shows the individual irq events, including the irq name,
> + cpu(execute the hardware interrupt processing function), time consumed,
> + entry time and exit time for the each hardware irq:
> +
> + -------------------------------------------------------------------------------------------------------------------------------------------
> + Irq name | CPU | Time consume us | Handler entry time | Handler exit time
> + -------------------------------------------------------------------------------------------------------------------------------------------
> + enp2s0f2-tx-0 | [0006] | 0.000001 s | 6631263.313329 s | 6631263.313330 s
> +
> + -------------------------------------------------------------------------------------------------------------------------------------------
> + Irq name | CPU | Time consume us | Handler entry time | Handler exit time
> + -------------------------------------------------------------------------------------------------------------------------------------------
> + megasas | [0013] | 0.000003 s | 6631263.209564 s | 6631263.209567 s
> +
> + -------------------------------------------------------------------------------------------------------------------------------------------
> + Irq name | CPU | Time consume us | Handler entry time | Handler exit time
> + -------------------------------------------------------------------------------------------------------------------------------------------
> + acpi | [0016] | 0.000018 s | 6631263.085787 s | 6631263.085805 s
> +
> +
> +OPTIONS for 'perf irq'
> +----------------------------
> +
> +--cpus::
> + Show just entries with activities for the given CPUs.
I am getting this:
tools/perf/perf irq --cpus=0 record -- find /
Error: unknown option `cpus=0'
Usage: perf irq [<options>] {record|timeconsume|script}
Regards,
Alexei
On 2021/1/13 3:50, Alexei Budankov wrote:
> Hi Bixuan,
>
> On 12.01.2021 15:55, Bixuan Cui wrote:
>> Add 'perf irq' to trace/measure the hardware interrupts.
>>
>> Now three functions are provided:
>> 1. 'perf irq record <command>' to record the irq handler events.
>> 2. 'perf irq script' to see a detailed trace of the workload that
>> was recorded.
>> 3. 'perf irq timeconsume' to calculate the time consumed by each
>> hardware interrupt processing function.
>>
>> Signed-off-by: Bixuan Cui <[email protected]>
> Thanks for the patches. There is still something that could be improved.
>
>> ---
>> tools/perf/Build | 1 +
>> tools/perf/builtin-irq.c | 288 +++++++++++++++++++++++++++++++++++++++
>> tools/perf/builtin.h | 1 +
>> tools/perf/perf.c | 1 +
>> 4 files changed, 291 insertions(+)
>> create mode 100644 tools/perf/builtin-irq.c
>>
> <SNIP>
>
>> +
>> +static int __cmd_record(int argc, const char **argv)
>> +{
>> + unsigned int rec_argc, i, j;
>> + const char **rec_argv;
>> + const char * const record_args[] = {
>> + "record",
>> + "-a",
> I see it works also like this:
>
> sudo perf record -p PID -c 1 -e irq:irq_handler_entry,irq:irq_handler_exit
> sudo perf record -R -c 1 -e irq:irq_handler_entry,irq:irq_handler_exit -- find /
>
> This -a option jointly with -p option could be made configurable from
> the command line for perf irq mode.
That's true. We can add a series of commands for 'perf irq',such as record, script and report.
So I kept the 'perf irq record'.
>
>> + "-R",
>> + "-m", "1024",
> Do you see data losses with default buffer size of 512KB
> when capturing trace in your specific use case?
>
> If not then this -m could be avoided or made configurable
> if you still need it.
Thank you for your advice, I will delete it.