2020-05-25 20:03:40

by Alexey Budankov

[permalink] [raw]
Subject: [PATCH v4 00/10] perf: support enable and disable commands in stat and record modes


Changes in v4:
- made checking of ctlfd state unconditional in record trace streaming loop
- introduced static poll fds to keep evlist__filter_pollfd() unaffected
- handled ret code of evlist__initialize_ctlfd() where need
- renamed and structured handle_events() function
- applied anonymous structs where needed

v3: https://lore.kernel.org/lkml/[email protected]/

Changes in v3:
- renamed functions and types from perf_evlist_ to evlist_ to avoid
clash with libperf code;
- extended commands to be strings of variable length consisting of
command name and also possibly including command specific data;
- merged docs update with the code changes;
- updated docs for -D,--delay=-1 option for stat and record modes;

v2: https://lore.kernel.org/lkml/[email protected]/

Changes in v2:
- renamed resume and pause commands to enable and disable ones, renamed
CTL_CMD_RESUME and CTL_CMD_PAUSE to CTL_CMD_ENABLE and CTL_CMD_DISABLE
to fit to the appropriate ioctls and avoid mixing up with PAUSE_OUTPUT
ioctl;
- factored out event handling loop into a handle_events() for stat mode;
- separated -D,--delay=-1 into separate patches for stat and record modes;

v1: https://lore.kernel.org/lkml/[email protected]/

repo: tip of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core

The patch set implements handling of 'start disabled', 'enable' and 'disable'
external control commands which can be provided for stat and record modes
of the tool from an external controlling process. 'start disabled' command
can be used to postpone enabling of events in the beginning of a monitoring
session. 'enable' and 'disable' commands can be used to enable and disable
events correspondingly any time after the start of the session.

The 'start disabled', 'enable' and 'disable' external control commands can be
used to focus measurement on specially selected time intervals of workload
execution. Focused measurement reduces tool intrusion and influence on
workload behavior, reduces distortion and amount of collected and stored
data, mitigates data accuracy loss because measurement and data capturing
happen only during intervals of interest.

A controlling process can be a bash shell script [1], native executable or
any other language program that can directly work with file descriptors,
e.g. pipes [2], and spawn a process, specially the tool one.

-D,--delay <val> option is extended with -1 value to skip events enabling
in the beginning of a monitoring session ('start disabled' command).
--ctl-fd and --ctl-fd-ack command line options are introduced to provide the
tool with a pair of file descriptors to listen to control commands and reply
to the controlling process on the completion of received commands.

The tool reads control command message from ctl-fd descriptor, handles the
command and optionally replies acknowledgement message to fd-ack descriptor,
if it is specified on the command line. 'enable' command is recognized as
'enable' string message and 'disable' command is recognized as 'disable'
string message both received from ctl-fd descriptor. Completion message is
'ack\n' and sent to fd-ack descriptor.

Example bash script demonstrating simple use case follows:

#!/bin/bash

ctl_dir=/tmp/

ctl_fifo=${ctl_dir}perf_ctl.fifo
test -p ${ctl_fifo} && unlink ${ctl_fifo}
mkfifo ${ctl_fifo} && exec {ctl_fd}<>${ctl_fifo}

ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
mkfifo ${ctl_ack_fifo} && exec {ctl_fd_ack}<>${ctl_ack_fifo}

perf stat -D -1 -e cpu-cycles -a -I 1000 \
--ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
-- sleep 40 &
perf_pid=$!

sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e2 && echo "enabled(${e2})"
sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d2 && echo "disabled(${d2})"

exec {ctl_fd_ack}>&- && unlink ${ctl_ack_fifo}
exec {ctl_fd}>&- && unlink ${ctl_fifo}

wait -n ${perf_pid}
exit $?


Script output:

[root@host dir] example
Events disabled
# time counts unit events
1.001101062 <not counted> cpu-cycles
2.002994944 <not counted> cpu-cycles
3.004864340 <not counted> cpu-cycles
4.006727177 <not counted> cpu-cycles
Events enabled
enabled(ack)
4.993808464 3,124,246 cpu-cycles
5.008597004 3,325,624 cpu-cycles
6.010387483 83,472,992 cpu-cycles
7.012266598 55,877,621 cpu-cycles
8.014175695 97,892,729 cpu-cycles
9.016056093 68,461,242 cpu-cycles
10.017937507 55,449,643 cpu-cycles
11.019830154 68,938,167 cpu-cycles
12.021719952 55,164,101 cpu-cycles
13.023627550 70,535,720 cpu-cycles
14.025580995 53,240,125 cpu-cycles
disabled(ack)
14.997518260 53,558,068 cpu-cycles
Events disabled
15.027216416 <not counted> cpu-cycles
16.029052729 <not counted> cpu-cycles
17.030904762 <not counted> cpu-cycles
18.032073424 <not counted> cpu-cycles
19.033805074 <not counted> cpu-cycles
Events enabled
enabled(ack)
20.001279097 3,021,022 cpu-cycles
20.035044381 6,434,367 cpu-cycles
21.036923813 89,358,251 cpu-cycles
22.038825169 72,516,351 cpu-cycles
# time counts unit events
23.040715596 55,046,157 cpu-cycles
24.042643757 78,128,649 cpu-cycles
25.044558535 61,052,428 cpu-cycles
26.046452785 62,142,806 cpu-cycles
27.048353021 74,477,971 cpu-cycles
28.050241286 61,001,623 cpu-cycles
29.052149961 61,653,502 cpu-cycles
disabled(ack)
30.004980264 82,729,640 cpu-cycles
Events disabled
30.053516176 <not counted> cpu-cycles
31.055348366 <not counted> cpu-cycles
32.057202097 <not counted> cpu-cycles
33.059040702 <not counted> cpu-cycles
34.060843288 <not counted> cpu-cycles
35.000888624 <not counted> cpu-cycles
[root@host dir]#

[1] http://man7.org/linux/man-pages/man1/bash.1.html
[2] http://man7.org/linux/man-pages/man2/pipe.2.html

---
Alexey Budankov (10):
tools/libperf: introduce static poll file descriptors
perf evlist: introduce control file descriptors
perf evlist: implement control command handling functions
perf stat: factor out event handling loop into a function
perf stat: extend -D,--delay option with -1 value
perf stat: implement control commands handling
perf stat: introduce --ctl-fd[-ack] options
perf record: extend -D,--delay option with -1 value
perf record: implement control commands handling
perf record: introduce --ctl-fd[-ack] options

tools/lib/api/fd/array.c | 42 +++++-
tools/lib/api/fd/array.h | 7 +
tools/lib/perf/evlist.c | 11 ++
tools/lib/perf/include/internal/evlist.h | 2 +
tools/perf/Documentation/perf-record.txt | 44 ++++++-
tools/perf/Documentation/perf-stat.txt | 45 ++++++-
tools/perf/builtin-record.c | 38 +++++-
tools/perf/builtin-stat.c | 155 +++++++++++++++++------
tools/perf/builtin-trace.c | 2 +-
tools/perf/util/evlist.c | 131 +++++++++++++++++++
tools/perf/util/evlist.h | 25 ++++
tools/perf/util/record.h | 4 +-
tools/perf/util/stat.h | 4 +-
13 files changed, 459 insertions(+), 51 deletions(-)

--
2.24.1


2020-05-25 20:04:07

by Alexey Budankov

[permalink] [raw]
Subject: [PATCH v4 04/10] perf stat: factor out event handling loop into a function


Factor out event handling loop into dispatch_events() function.

Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-stat.c | 93 ++++++++++++++++++++++++---------------
1 file changed, 58 insertions(+), 35 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index c43ba6080691..ef2682a87491 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -421,6 +421,24 @@ static void process_interval(void)
print_counters(&rs, 0, NULL);
}

+static bool print_interval(unsigned int interval, int *times)
+{
+ if (interval) {
+ process_interval();
+ if (interval_count && !(--(*times)))
+ return true;
+ }
+ return false;
+}
+
+static bool process_timeout(bool timeout, unsigned int interval, int *times)
+{
+ if (timeout)
+ return true;
+ else
+ return print_interval(interval, times);
+}
+
static void enable_counters(void)
{
if (stat_config.initial_delay)
@@ -486,6 +504,42 @@ static bool is_target_alive(struct target *_target,
return false;
}

+static int dispatch_events(pid_t pid, struct perf_stat_config *config)
+{
+ pid_t child = 0;
+ bool stop = false;
+ struct timespec time_to_sleep;
+ int sleep_time, status = 0;
+ int times = config->times;
+
+ if (config->interval)
+ sleep_time = config->interval;
+ else if (config->timeout)
+ sleep_time = config->timeout;
+ else
+ sleep_time = 1000;
+
+ time_to_sleep.tv_sec = sleep_time / MSEC_PER_SEC;
+ time_to_sleep.tv_nsec = (sleep_time % MSEC_PER_SEC) * NSEC_PER_MSEC;
+
+ do {
+ if (pid != -1) {
+ child = waitpid(pid, &status, WNOHANG);
+ } else {
+ if (!stop)
+ stop = !is_target_alive(&target, evsel_list->core.threads);
+ }
+
+ if (child || stop || done)
+ break;
+
+ nanosleep(&time_to_sleep, NULL);
+ stop = process_timeout(config->timeout, config->interval, &times);
+ } while (1);
+
+ return status;
+}
+
enum counter_recovery {
COUNTER_SKIP,
COUNTER_RETRY,
@@ -544,12 +598,10 @@ static enum counter_recovery stat_handle_error(struct evsel *counter)
static int __run_perf_stat(int argc, const char **argv, int run_idx)
{
int interval = stat_config.interval;
- int times = stat_config.times;
int timeout = stat_config.timeout;
char msg[BUFSIZ];
unsigned long long t0, t1;
struct evsel *counter;
- struct timespec ts;
size_t l;
int status = 0;
const bool forks = (argc > 0);
@@ -558,17 +610,6 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
int i, cpu;
bool second_pass = false;

- if (interval) {
- ts.tv_sec = interval / USEC_PER_MSEC;
- ts.tv_nsec = (interval % USEC_PER_MSEC) * NSEC_PER_MSEC;
- } else if (timeout) {
- ts.tv_sec = timeout / USEC_PER_MSEC;
- ts.tv_nsec = (timeout % USEC_PER_MSEC) * NSEC_PER_MSEC;
- } else {
- ts.tv_sec = 1;
- ts.tv_nsec = 0;
- }
-
if (forks) {
if (perf_evlist__prepare_workload(evsel_list, &target, argv, is_pipe,
workload_exec_failed_signal) < 0) {
@@ -725,16 +766,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
perf_evlist__start_workload(evsel_list);
enable_counters();

- if (interval || timeout) {
- while (!waitpid(child_pid, &status, WNOHANG)) {
- nanosleep(&ts, NULL);
- if (timeout)
- break;
- process_interval();
- if (interval_count && !(--times))
- break;
- }
- }
+ if (interval || timeout)
+ dispatch_events(child_pid, &stat_config);
+
if (child_pid != -1) {
if (timeout)
kill(child_pid, SIGTERM);
@@ -751,18 +785,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
psignal(WTERMSIG(status), argv[0]);
} else {
enable_counters();
- while (!done) {
- nanosleep(&ts, NULL);
- if (!is_target_alive(&target, evsel_list->core.threads))
- break;
- if (timeout)
- break;
- if (interval) {
- process_interval();
- if (interval_count && !(--times))
- break;
- }
- }
+ dispatch_events(-1, &stat_config);
}

disable_counters();
--
2.24.1

2020-05-25 20:04:34

by Alexey Budankov

[permalink] [raw]
Subject: [PATCH v4 07/10] perf stat: introduce --ctl-fd[-ack] options


Introduce --ctl-fd[-ack] options to pass open file descriptors numbers
from command line. Extend perf-stat.txt file with --ctl-fd[-ack] options
description. Document possible usage model introduced by --ctl-fd[-ack]
options by providing example bash shell script.

Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/Documentation/perf-stat.txt | 40 ++++++++++++++++++++++++++
tools/perf/builtin-stat.c | 11 +++++++
tools/perf/util/stat.h | 2 ++
3 files changed, 53 insertions(+)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index b4d2434584f5..4b15373af800 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -164,6 +164,46 @@ with it. --append may be used here. Examples:
3>results perf stat --log-fd 3 -- $cmd
3>>results perf stat --log-fd 3 --append -- $cmd

+--ctl-fd::
+--ctl-fd-ack::
+
+Listen on ctl-fd descriptor for command to control measurement ('enable': enable events,
+'disable': disable events). Optionally send control command completion ('ack') to fd-ack
+descriptor to synchronize with the controlling process. Example of bash shell script
+to enable and disable events during measurements:
+
+#!/bin/bash
+
+ctl_dir=/tmp/
+
+ctl_fifo=${ctl_dir}perf_ctl.fifo
+test -p ${ctl_fifo} && unlink ${ctl_fifo}
+mkfifo ${ctl_fifo}
+exec {ctl_fd}<>${ctl_fifo}
+
+ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
+test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
+mkfifo ${ctl_ack_fifo}
+exec {ctl_fd_ack}<>${ctl_ack_fifo}
+
+perf stat -D -1 -e cpu-cycles -a -I 1000 \
+ --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
+ -- sleep 30 &
+perf_pid=$!
+
+sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
+sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
+
+exec {ctl_fd_ack}>&-
+unlink ${ctl_ack_fifo}
+
+exec {ctl_fd}>&-
+unlink ${ctl_fifo}
+
+wait -n ${perf_pid}
+exit $?
+
+
--pre::
--post::
Pre and post measurement hooks, e.g.:
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index c06c2b701a15..69b0e4f498c6 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -187,6 +187,8 @@ static struct perf_stat_config stat_config = {
.metric_only_len = METRIC_ONLY_LEN,
.walltime_nsecs_stats = &walltime_nsecs_stats,
.big_num = true,
+ .ctl_fd = -1,
+ .ctl_fd_ack = -1
};

static inline void diff_timespec(struct timespec *r, struct timespec *a,
@@ -1065,6 +1067,10 @@ static struct option stat_options[] = {
"Use with 'percore' event qualifier to show the event "
"counts of one hardware thread by sum up total hardware "
"threads of same physical core"),
+ OPT_INTEGER(0, "ctl-fd", &stat_config.ctl_fd,
+ "Listen on fd descriptor for command to control measurement ('enable': enable events, 'disable': disable events)"),
+ OPT_INTEGER(0, "ctl-fd-ack", &stat_config.ctl_fd_ack,
+ "Send control command completion ('ack') to fd ack descriptor"),
OPT_END()
};

@@ -2232,6 +2238,9 @@ int cmd_stat(int argc, const char **argv)
signal(SIGALRM, skip_signal);
signal(SIGABRT, skip_signal);

+ if (evlist__initialize_ctlfd(evsel_list, stat_config.ctl_fd, stat_config.ctl_fd_ack))
+ goto out;
+
status = 0;
for (run_idx = 0; forever || run_idx < stat_config.run_count; run_idx++) {
if (stat_config.run_count != 1 && verbose > 0)
@@ -2251,6 +2260,8 @@ int cmd_stat(int argc, const char **argv)
if (!forever && status != -1 && (!interval || stat_config.summary))
print_counters(NULL, argc, argv);

+ evlist__finalize_ctlfd(evsel_list);
+
if (STAT_RECORD) {
/*
* We synthesize the kernel mmap record just so that older tools
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index b00d64f0ca26..65f42a04d9ef 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -131,6 +131,8 @@ struct perf_stat_config {
struct perf_cpu_map *cpus_aggr_map;
u64 *walltime_run;
struct rblist metric_events;
+ int ctl_fd;
+ int ctl_fd_ack;
};

void update_stats(struct stats *stats, u64 val);
--
2.24.1

2020-05-25 22:07:34

by Alexey Budankov

[permalink] [raw]
Subject: [PATCH v4 09/10] perf record: implement control commands handling


Implement handling of 'enable' and 'disable' control commands
coming from control file descriptor.

Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-record.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 8c3ec29e7e80..1ff3b7a77283 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1526,6 +1526,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
bool disabled = false, draining = false;
int fd;
float ratio = 0;
+ enum evlist_ctl_cmd cmd = EVLIST_CTL_CMD_UNSUPPORTED;

atexit(record__sig_exit);
signal(SIGCHLD, sig_handler);
@@ -1842,6 +1843,21 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
alarm(rec->switch_output.time);
}

+ if (evlist__ctlfd_process(rec->evlist, &cmd) > 0) {
+ switch (cmd) {
+ case EVLIST_CTL_CMD_ENABLE:
+ pr_info(EVLIST_ENABLED_MSG);
+ break;
+ case EVLIST_CTL_CMD_DISABLE:
+ pr_info(EVLIST_DISABLED_MSG);
+ break;
+ case EVLIST_CTL_CMD_ACK:
+ case EVLIST_CTL_CMD_UNSUPPORTED:
+ default:
+ break;
+ }
+ }
+
if (hits == rec->samples) {
if (done || draining)
break;
--
2.24.1

2020-05-27 15:25:55

by Alexey Budankov

[permalink] [raw]
Subject: Re: [PATCH v4 00/10] perf: support enable and disable commands in stat and record modes


Making sure it is not sneaked out of your attention.

Thanks,
Alexey

On 25.05.2020 17:11, Alexey Budankov wrote:
>
> Changes in v4:
> - made checking of ctlfd state unconditional in record trace streaming loop
> - introduced static poll fds to keep evlist__filter_pollfd() unaffected
> - handled ret code of evlist__initialize_ctlfd() where need
> - renamed and structured handle_events() function
> - applied anonymous structs where needed
>
> v3: https://lore.kernel.org/lkml/[email protected]/
>
> Changes in v3:
> - renamed functions and types from perf_evlist_ to evlist_ to avoid
> clash with libperf code;
> - extended commands to be strings of variable length consisting of
> command name and also possibly including command specific data;
> - merged docs update with the code changes;
> - updated docs for -D,--delay=-1 option for stat and record modes;
>
> v2: https://lore.kernel.org/lkml/[email protected]/
>
> Changes in v2:
> - renamed resume and pause commands to enable and disable ones, renamed
> CTL_CMD_RESUME and CTL_CMD_PAUSE to CTL_CMD_ENABLE and CTL_CMD_DISABLE
> to fit to the appropriate ioctls and avoid mixing up with PAUSE_OUTPUT
> ioctl;
> - factored out event handling loop into a handle_events() for stat mode;
> - separated -D,--delay=-1 into separate patches for stat and record modes;
>
> v1: https://lore.kernel.org/lkml/[email protected]/
>
> repo: tip of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core
>
> The patch set implements handling of 'start disabled', 'enable' and 'disable'
> external control commands which can be provided for stat and record modes
> of the tool from an external controlling process. 'start disabled' command
> can be used to postpone enabling of events in the beginning of a monitoring
> session. 'enable' and 'disable' commands can be used to enable and disable
> events correspondingly any time after the start of the session.
>
> The 'start disabled', 'enable' and 'disable' external control commands can be
> used to focus measurement on specially selected time intervals of workload
> execution. Focused measurement reduces tool intrusion and influence on
> workload behavior, reduces distortion and amount of collected and stored
> data, mitigates data accuracy loss because measurement and data capturing
> happen only during intervals of interest.
>
> A controlling process can be a bash shell script [1], native executable or
> any other language program that can directly work with file descriptors,
> e.g. pipes [2], and spawn a process, specially the tool one.
>
> -D,--delay <val> option is extended with -1 value to skip events enabling
> in the beginning of a monitoring session ('start disabled' command).
> --ctl-fd and --ctl-fd-ack command line options are introduced to provide the
> tool with a pair of file descriptors to listen to control commands and reply
> to the controlling process on the completion of received commands.
>
> The tool reads control command message from ctl-fd descriptor, handles the
> command and optionally replies acknowledgement message to fd-ack descriptor,
> if it is specified on the command line. 'enable' command is recognized as
> 'enable' string message and 'disable' command is recognized as 'disable'
> string message both received from ctl-fd descriptor. Completion message is
> 'ack\n' and sent to fd-ack descriptor.
>
> Example bash script demonstrating simple use case follows:
>
> #!/bin/bash
>
> ctl_dir=/tmp/
>
> ctl_fifo=${ctl_dir}perf_ctl.fifo
> test -p ${ctl_fifo} && unlink ${ctl_fifo}
> mkfifo ${ctl_fifo} && exec {ctl_fd}<>${ctl_fifo}
>
> ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
> test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
> mkfifo ${ctl_ack_fifo} && exec {ctl_fd_ack}<>${ctl_ack_fifo}
>
> perf stat -D -1 -e cpu-cycles -a -I 1000 \
> --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
> -- sleep 40 &
> perf_pid=$!
>
> sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
> sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
> sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e2 && echo "enabled(${e2})"
> sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d2 && echo "disabled(${d2})"
>
> exec {ctl_fd_ack}>&- && unlink ${ctl_ack_fifo}
> exec {ctl_fd}>&- && unlink ${ctl_fifo}
>
> wait -n ${perf_pid}
> exit $?
>
>
> Script output:
>
> [root@host dir] example
> Events disabled
> # time counts unit events
> 1.001101062 <not counted> cpu-cycles
> 2.002994944 <not counted> cpu-cycles
> 3.004864340 <not counted> cpu-cycles
> 4.006727177 <not counted> cpu-cycles
> Events enabled
> enabled(ack)
> 4.993808464 3,124,246 cpu-cycles
> 5.008597004 3,325,624 cpu-cycles
> 6.010387483 83,472,992 cpu-cycles
> 7.012266598 55,877,621 cpu-cycles
> 8.014175695 97,892,729 cpu-cycles
> 9.016056093 68,461,242 cpu-cycles
> 10.017937507 55,449,643 cpu-cycles
> 11.019830154 68,938,167 cpu-cycles
> 12.021719952 55,164,101 cpu-cycles
> 13.023627550 70,535,720 cpu-cycles
> 14.025580995 53,240,125 cpu-cycles
> disabled(ack)
> 14.997518260 53,558,068 cpu-cycles
> Events disabled
> 15.027216416 <not counted> cpu-cycles
> 16.029052729 <not counted> cpu-cycles
> 17.030904762 <not counted> cpu-cycles
> 18.032073424 <not counted> cpu-cycles
> 19.033805074 <not counted> cpu-cycles
> Events enabled
> enabled(ack)
> 20.001279097 3,021,022 cpu-cycles
> 20.035044381 6,434,367 cpu-cycles
> 21.036923813 89,358,251 cpu-cycles
> 22.038825169 72,516,351 cpu-cycles
> # time counts unit events
> 23.040715596 55,046,157 cpu-cycles
> 24.042643757 78,128,649 cpu-cycles
> 25.044558535 61,052,428 cpu-cycles
> 26.046452785 62,142,806 cpu-cycles
> 27.048353021 74,477,971 cpu-cycles
> 28.050241286 61,001,623 cpu-cycles
> 29.052149961 61,653,502 cpu-cycles
> disabled(ack)
> 30.004980264 82,729,640 cpu-cycles
> Events disabled
> 30.053516176 <not counted> cpu-cycles
> 31.055348366 <not counted> cpu-cycles
> 32.057202097 <not counted> cpu-cycles
> 33.059040702 <not counted> cpu-cycles
> 34.060843288 <not counted> cpu-cycles
> 35.000888624 <not counted> cpu-cycles
> [root@host dir]#
>
> [1] http://man7.org/linux/man-pages/man1/bash.1.html
> [2] http://man7.org/linux/man-pages/man2/pipe.2.html
>
> ---
> Alexey Budankov (10):
> tools/libperf: introduce static poll file descriptors
> perf evlist: introduce control file descriptors
> perf evlist: implement control command handling functions
> perf stat: factor out event handling loop into a function
> perf stat: extend -D,--delay option with -1 value
> perf stat: implement control commands handling
> perf stat: introduce --ctl-fd[-ack] options
> perf record: extend -D,--delay option with -1 value
> perf record: implement control commands handling
> perf record: introduce --ctl-fd[-ack] options
>
> tools/lib/api/fd/array.c | 42 +++++-
> tools/lib/api/fd/array.h | 7 +
> tools/lib/perf/evlist.c | 11 ++
> tools/lib/perf/include/internal/evlist.h | 2 +
> tools/perf/Documentation/perf-record.txt | 44 ++++++-
> tools/perf/Documentation/perf-stat.txt | 45 ++++++-
> tools/perf/builtin-record.c | 38 +++++-
> tools/perf/builtin-stat.c | 155 +++++++++++++++++------
> tools/perf/builtin-trace.c | 2 +-
> tools/perf/util/evlist.c | 131 +++++++++++++++++++
> tools/perf/util/evlist.h | 25 ++++
> tools/perf/util/record.h | 4 +-
> tools/perf/util/stat.h | 4 +-
> 13 files changed, 459 insertions(+), 51 deletions(-)
>

2020-05-27 16:59:50

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCH v4 00/10] perf: support enable and disable commands in stat and record modes

On Wed, May 27, 2020 at 12:27:31PM +0300, Alexey Budankov wrote:
>
> Making sure it is not sneaked out of your attention.

yep, I know about it, will get to it this week

jirka

>
> Thanks,
> Alexey
>
> On 25.05.2020 17:11, Alexey Budankov wrote:
> >
> > Changes in v4:
> > - made checking of ctlfd state unconditional in record trace streaming loop
> > - introduced static poll fds to keep evlist__filter_pollfd() unaffected
> > - handled ret code of evlist__initialize_ctlfd() where need
> > - renamed and structured handle_events() function
> > - applied anonymous structs where needed
> >
> > v3: https://lore.kernel.org/lkml/[email protected]/
> >
> > Changes in v3:
> > - renamed functions and types from perf_evlist_ to evlist_ to avoid
> > clash with libperf code;
> > - extended commands to be strings of variable length consisting of
> > command name and also possibly including command specific data;
> > - merged docs update with the code changes;
> > - updated docs for -D,--delay=-1 option for stat and record modes;
> >
> > v2: https://lore.kernel.org/lkml/[email protected]/
> >
> > Changes in v2:
> > - renamed resume and pause commands to enable and disable ones, renamed
> > CTL_CMD_RESUME and CTL_CMD_PAUSE to CTL_CMD_ENABLE and CTL_CMD_DISABLE
> > to fit to the appropriate ioctls and avoid mixing up with PAUSE_OUTPUT
> > ioctl;
> > - factored out event handling loop into a handle_events() for stat mode;
> > - separated -D,--delay=-1 into separate patches for stat and record modes;
> >
> > v1: https://lore.kernel.org/lkml/[email protected]/
> >
> > repo: tip of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core
> >
> > The patch set implements handling of 'start disabled', 'enable' and 'disable'
> > external control commands which can be provided for stat and record modes
> > of the tool from an external controlling process. 'start disabled' command
> > can be used to postpone enabling of events in the beginning of a monitoring
> > session. 'enable' and 'disable' commands can be used to enable and disable
> > events correspondingly any time after the start of the session.
> >
> > The 'start disabled', 'enable' and 'disable' external control commands can be
> > used to focus measurement on specially selected time intervals of workload
> > execution. Focused measurement reduces tool intrusion and influence on
> > workload behavior, reduces distortion and amount of collected and stored
> > data, mitigates data accuracy loss because measurement and data capturing
> > happen only during intervals of interest.
> >
> > A controlling process can be a bash shell script [1], native executable or
> > any other language program that can directly work with file descriptors,
> > e.g. pipes [2], and spawn a process, specially the tool one.
> >
> > -D,--delay <val> option is extended with -1 value to skip events enabling
> > in the beginning of a monitoring session ('start disabled' command).
> > --ctl-fd and --ctl-fd-ack command line options are introduced to provide the
> > tool with a pair of file descriptors to listen to control commands and reply
> > to the controlling process on the completion of received commands.
> >
> > The tool reads control command message from ctl-fd descriptor, handles the
> > command and optionally replies acknowledgement message to fd-ack descriptor,
> > if it is specified on the command line. 'enable' command is recognized as
> > 'enable' string message and 'disable' command is recognized as 'disable'
> > string message both received from ctl-fd descriptor. Completion message is
> > 'ack\n' and sent to fd-ack descriptor.
> >
> > Example bash script demonstrating simple use case follows:
> >
> > #!/bin/bash
> >
> > ctl_dir=/tmp/
> >
> > ctl_fifo=${ctl_dir}perf_ctl.fifo
> > test -p ${ctl_fifo} && unlink ${ctl_fifo}
> > mkfifo ${ctl_fifo} && exec {ctl_fd}<>${ctl_fifo}
> >
> > ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
> > test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
> > mkfifo ${ctl_ack_fifo} && exec {ctl_fd_ack}<>${ctl_ack_fifo}
> >
> > perf stat -D -1 -e cpu-cycles -a -I 1000 \
> > --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
> > -- sleep 40 &
> > perf_pid=$!
> >
> > sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
> > sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
> > sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e2 && echo "enabled(${e2})"
> > sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d2 && echo "disabled(${d2})"
> >
> > exec {ctl_fd_ack}>&- && unlink ${ctl_ack_fifo}
> > exec {ctl_fd}>&- && unlink ${ctl_fifo}
> >
> > wait -n ${perf_pid}
> > exit $?
> >
> >
> > Script output:
> >
> > [root@host dir] example
> > Events disabled
> > # time counts unit events
> > 1.001101062 <not counted> cpu-cycles
> > 2.002994944 <not counted> cpu-cycles
> > 3.004864340 <not counted> cpu-cycles
> > 4.006727177 <not counted> cpu-cycles
> > Events enabled
> > enabled(ack)
> > 4.993808464 3,124,246 cpu-cycles
> > 5.008597004 3,325,624 cpu-cycles
> > 6.010387483 83,472,992 cpu-cycles
> > 7.012266598 55,877,621 cpu-cycles
> > 8.014175695 97,892,729 cpu-cycles
> > 9.016056093 68,461,242 cpu-cycles
> > 10.017937507 55,449,643 cpu-cycles
> > 11.019830154 68,938,167 cpu-cycles
> > 12.021719952 55,164,101 cpu-cycles
> > 13.023627550 70,535,720 cpu-cycles
> > 14.025580995 53,240,125 cpu-cycles
> > disabled(ack)
> > 14.997518260 53,558,068 cpu-cycles
> > Events disabled
> > 15.027216416 <not counted> cpu-cycles
> > 16.029052729 <not counted> cpu-cycles
> > 17.030904762 <not counted> cpu-cycles
> > 18.032073424 <not counted> cpu-cycles
> > 19.033805074 <not counted> cpu-cycles
> > Events enabled
> > enabled(ack)
> > 20.001279097 3,021,022 cpu-cycles
> > 20.035044381 6,434,367 cpu-cycles
> > 21.036923813 89,358,251 cpu-cycles
> > 22.038825169 72,516,351 cpu-cycles
> > # time counts unit events
> > 23.040715596 55,046,157 cpu-cycles
> > 24.042643757 78,128,649 cpu-cycles
> > 25.044558535 61,052,428 cpu-cycles
> > 26.046452785 62,142,806 cpu-cycles
> > 27.048353021 74,477,971 cpu-cycles
> > 28.050241286 61,001,623 cpu-cycles
> > 29.052149961 61,653,502 cpu-cycles
> > disabled(ack)
> > 30.004980264 82,729,640 cpu-cycles
> > Events disabled
> > 30.053516176 <not counted> cpu-cycles
> > 31.055348366 <not counted> cpu-cycles
> > 32.057202097 <not counted> cpu-cycles
> > 33.059040702 <not counted> cpu-cycles
> > 34.060843288 <not counted> cpu-cycles
> > 35.000888624 <not counted> cpu-cycles
> > [root@host dir]#
> >
> > [1] http://man7.org/linux/man-pages/man1/bash.1.html
> > [2] http://man7.org/linux/man-pages/man2/pipe.2.html
> >
> > ---
> > Alexey Budankov (10):
> > tools/libperf: introduce static poll file descriptors
> > perf evlist: introduce control file descriptors
> > perf evlist: implement control command handling functions
> > perf stat: factor out event handling loop into a function
> > perf stat: extend -D,--delay option with -1 value
> > perf stat: implement control commands handling
> > perf stat: introduce --ctl-fd[-ack] options
> > perf record: extend -D,--delay option with -1 value
> > perf record: implement control commands handling
> > perf record: introduce --ctl-fd[-ack] options
> >
> > tools/lib/api/fd/array.c | 42 +++++-
> > tools/lib/api/fd/array.h | 7 +
> > tools/lib/perf/evlist.c | 11 ++
> > tools/lib/perf/include/internal/evlist.h | 2 +
> > tools/perf/Documentation/perf-record.txt | 44 ++++++-
> > tools/perf/Documentation/perf-stat.txt | 45 ++++++-
> > tools/perf/builtin-record.c | 38 +++++-
> > tools/perf/builtin-stat.c | 155 +++++++++++++++++------
> > tools/perf/builtin-trace.c | 2 +-
> > tools/perf/util/evlist.c | 131 +++++++++++++++++++
> > tools/perf/util/evlist.h | 25 ++++
> > tools/perf/util/record.h | 4 +-
> > tools/perf/util/stat.h | 4 +-
> > 13 files changed, 459 insertions(+), 51 deletions(-)
> >
>

2020-05-31 18:21:49

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCH v4 04/10] perf stat: factor out event handling loop into a function

On Mon, May 25, 2020 at 05:19:45PM +0300, Alexey Budankov wrote:

SNIP

> @@ -544,12 +598,10 @@ static enum counter_recovery stat_handle_error(struct evsel *counter)
> static int __run_perf_stat(int argc, const char **argv, int run_idx)
> {
> int interval = stat_config.interval;
> - int times = stat_config.times;
> int timeout = stat_config.timeout;
> char msg[BUFSIZ];
> unsigned long long t0, t1;
> struct evsel *counter;
> - struct timespec ts;
> size_t l;
> int status = 0;
> const bool forks = (argc > 0);
> @@ -558,17 +610,6 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
> int i, cpu;
> bool second_pass = false;
>
> - if (interval) {
> - ts.tv_sec = interval / USEC_PER_MSEC;
> - ts.tv_nsec = (interval % USEC_PER_MSEC) * NSEC_PER_MSEC;
> - } else if (timeout) {
> - ts.tv_sec = timeout / USEC_PER_MSEC;
> - ts.tv_nsec = (timeout % USEC_PER_MSEC) * NSEC_PER_MSEC;
> - } else {
> - ts.tv_sec = 1;
> - ts.tv_nsec = 0;
> - }
> -
> if (forks) {
> if (perf_evlist__prepare_workload(evsel_list, &target, argv, is_pipe,
> workload_exec_failed_signal) < 0) {
> @@ -725,16 +766,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
> perf_evlist__start_workload(evsel_list);
> enable_counters();
>
> - if (interval || timeout) {
> - while (!waitpid(child_pid, &status, WNOHANG)) {
> - nanosleep(&ts, NULL);
> - if (timeout)
> - break;
> - process_interval();
> - if (interval_count && !(--times))
> - break;
> - }
> - }
> + if (interval || timeout)
> + dispatch_events(child_pid, &stat_config);
> +
> if (child_pid != -1) {
> if (timeout)
> kill(child_pid, SIGTERM);
> @@ -751,18 +785,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
> psignal(WTERMSIG(status), argv[0]);
> } else {
> enable_counters();
> - while (!done) {
> - nanosleep(&ts, NULL);
> - if (!is_target_alive(&target, evsel_list->core.threads))
> - break;
> - if (timeout)
> - break;
> - if (interval) {
> - process_interval();
> - if (interval_count && !(--times))
> - break;
> - }
> - }
> + dispatch_events(-1, &stat_config);

hum, from the discussion we had on v3 I expected more smaller patches
with easy changes, so the change is more transparent and easy to review

as I said before this part really makes me worried and needs to be as clear
as possible.. please introdce the new function first and replace the factored
places separately, also more verbose changelog would help ;-)

thanks,
jirka


> }
>
> disable_counters();
> --
> 2.24.1
>

2020-06-01 07:41:59

by Alexey Budankov

[permalink] [raw]
Subject: Re: [PATCH v4 04/10] perf stat: factor out event handling loop into a function


On 31.05.2020 21:19, Jiri Olsa wrote:
> On Mon, May 25, 2020 at 05:19:45PM +0300, Alexey Budankov wrote:
>
> SNIP
>
>> @@ -544,12 +598,10 @@ static enum counter_recovery stat_handle_error(struct evsel *counter)
>> static int __run_perf_stat(int argc, const char **argv, int run_idx)
>> {
>> int interval = stat_config.interval;
>> - int times = stat_config.times;
>> int timeout = stat_config.timeout;
>> char msg[BUFSIZ];
>> unsigned long long t0, t1;
>> struct evsel *counter;
>> - struct timespec ts;
>> size_t l;
>> int status = 0;
>> const bool forks = (argc > 0);
>> @@ -558,17 +610,6 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>> int i, cpu;
>> bool second_pass = false;
>>
>> - if (interval) {
>> - ts.tv_sec = interval / USEC_PER_MSEC;
>> - ts.tv_nsec = (interval % USEC_PER_MSEC) * NSEC_PER_MSEC;
>> - } else if (timeout) {
>> - ts.tv_sec = timeout / USEC_PER_MSEC;
>> - ts.tv_nsec = (timeout % USEC_PER_MSEC) * NSEC_PER_MSEC;
>> - } else {
>> - ts.tv_sec = 1;
>> - ts.tv_nsec = 0;
>> - }
>> -
>> if (forks) {
>> if (perf_evlist__prepare_workload(evsel_list, &target, argv, is_pipe,
>> workload_exec_failed_signal) < 0) {
>> @@ -725,16 +766,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>> perf_evlist__start_workload(evsel_list);
>> enable_counters();
>>
>> - if (interval || timeout) {
>> - while (!waitpid(child_pid, &status, WNOHANG)) {
>> - nanosleep(&ts, NULL);
>> - if (timeout)
>> - break;
>> - process_interval();
>> - if (interval_count && !(--times))
>> - break;
>> - }
>> - }
>> + if (interval || timeout)
>> + dispatch_events(child_pid, &stat_config);
>> +
>> if (child_pid != -1) {
>> if (timeout)
>> kill(child_pid, SIGTERM);
>> @@ -751,18 +785,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>> psignal(WTERMSIG(status), argv[0]);
>> } else {
>> enable_counters();
>> - while (!done) {
>> - nanosleep(&ts, NULL);
>> - if (!is_target_alive(&target, evsel_list->core.threads))
>> - break;
>> - if (timeout)
>> - break;
>> - if (interval) {
>> - process_interval();
>> - if (interval_count && !(--times))
>> - break;
>> - }
>> - }
>> + dispatch_events(-1, &stat_config);
>
> hum, from the discussion we had on v3 I expected more smaller patches
> with easy changes, so the change is more transparent and easy to review
>
> as I said before this part really makes me worried and needs to be as clear
> as possible.. please introdce the new function first and replace the factored
> places separately, also more verbose changelog would help ;-)

Ok. Will try to reshape the patch that way.

~Alexey

2020-06-01 16:44:06

by Alexey Budankov

[permalink] [raw]
Subject: Re: [PATCH v4 04/10] perf stat: factor out event handling loop into a function


On 01.06.2020 10:38, Alexey Budankov wrote:
>
> On 31.05.2020 21:19, Jiri Olsa wrote:
>> On Mon, May 25, 2020 at 05:19:45PM +0300, Alexey Budankov wrote:
>>
>> SNIP
>>
>>> @@ -544,12 +598,10 @@ static enum counter_recovery stat_handle_error(struct evsel *counter)
<SNIP>
>>> static int __run_perf_stat(int argc, const char **argv, int run_idx)
>>> + dispatch_events(-1, &stat_config);
>>
>> hum, from the discussion we had on v3 I expected more smaller patches
>> with easy changes, so the change is more transparent and easy to review
>>
>> as I said before this part really makes me worried and needs to be as clear
>> as possible.. please introdce the new function first and replace the factored
>> places separately, also more verbose changelog would help ;-)
>
> Ok. Will try to reshape the patch that way.

Please see v5.
It puts this refactoring part into several smaller consecutive changes to make
review and possible bisecting activity easier. Let me know if some other parts
of the patch set also require similar breakdown.

~Alexey