Changes in v3:
- renamed functions and types from perf_evlist_ to evlist_ to avoid
clash with libperf code;
- extended commands to be strings of variable length consisting of
command name and also possibly including command specific data;
- merged docs update with the code changes;
- updated docs for -D,--delay=-1 option for stat and record modes;
v2: https://lore.kernel.org/lkml/[email protected]/
Changes in v2:
- renamed resume and pause commands to enable and disable ones, renamed
CTL_CMD_RESUME and CTL_CMD_PAUSE to CTL_CMD_ENABLE and CTL_CMD_DISABLE
to fit to the appropriate ioctls and avoid mixing up with PAUSE_OUTPUT
ioctl;
- factored out event handling loop into a handle_events() for stat mode;
- separated -D,--delay=-1 into separate patches for stat and record modes;
v1: https://lore.kernel.org/lkml/[email protected]/
repo: tip of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core
The patch set implements handling of 'start disabled', 'enable' and 'disable'
external control commands which can be provided for stat and record modes
of the tool from an external controlling process. 'start disabled' command
can be used to postpone enabling of events in the beginning of a monitoring
session. 'enable' and 'disable' commands can be used to enable and disable
events correspondingly any time after the start of the session.
The 'start disabled', 'enable' and 'disable' external control commands can be
used to focus measurement on specially selected time intervals of workload
execution. Focused measurement reduces tool intrusion and influence on
workload behavior, reduces distortion and amount of collected and stored
data, mitigates data accuracy loss because measurement and data capturing
happen only during intervals of interest.
A controlling process can be a bash shell script [1], native executable or
any other language program that can directly work with file descriptors,
e.g. pipes [2], and spawn a process, specially the tool one.
-D,--delay <val> option is extended with -1 value to skip events enabling
in the beginning of a monitoring session ('start disabled' command).
--ctl-fd and --ctl-fd-ack command line options are introduced to provide the
tool with a pair of file descriptors to listen to control commands and reply
to the controlling process on the completion of received commands.
The tool reads control command message from ctl-fd descriptor, handles the
command and optionally replies acknowledgement message to fd-ack descriptor,
if it is specified on the command line. 'enable' command is recognized as
'enable' string message and 'disable' command is recognized as 'disable'
string message both received from ctl-fd descriptor. Completion message is
'ack\n' and sent to fd-ack descriptor.
Example bash script demonstrating simple use case follows:
#!/bin/bash
ctl_dir=/tmp/
ctl_fifo=${ctl_dir}perf_ctl.fifo
test -p ${ctl_fifo} && unlink ${ctl_fifo}
mkfifo ${ctl_fifo} && exec {ctl_fd}<>${ctl_fifo}
ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
mkfifo ${ctl_ack_fifo} && exec {ctl_fd_ack}<>${ctl_ack_fifo}
perf stat -D -1 -e cpu-cycles -a -I 1000 \
--ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
-- sleep 40 &
perf_pid=$!
sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e2 && echo "enabled(${e2})"
sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d2 && echo "disabled(${d2})"
exec {ctl_fd_ack}>&- && unlink ${ctl_ack_fifo}
exec {ctl_fd}>&- && unlink ${ctl_fifo}
wait -n ${perf_pid}
exit $?
Script output:
[root@host dir] example
Events disabled
# time counts unit events
1.001101062 <not counted> cpu-cycles
2.002994944 <not counted> cpu-cycles
3.004864340 <not counted> cpu-cycles
4.006727177 <not counted> cpu-cycles
Events enabled
enabled(ack)
4.993808464 3,124,246 cpu-cycles
5.008597004 3,325,624 cpu-cycles
6.010387483 83,472,992 cpu-cycles
7.012266598 55,877,621 cpu-cycles
8.014175695 97,892,729 cpu-cycles
9.016056093 68,461,242 cpu-cycles
10.017937507 55,449,643 cpu-cycles
11.019830154 68,938,167 cpu-cycles
12.021719952 55,164,101 cpu-cycles
13.023627550 70,535,720 cpu-cycles
14.025580995 53,240,125 cpu-cycles
disabled(ack)
14.997518260 53,558,068 cpu-cycles
Events disabled
15.027216416 <not counted> cpu-cycles
16.029052729 <not counted> cpu-cycles
17.030904762 <not counted> cpu-cycles
18.032073424 <not counted> cpu-cycles
19.033805074 <not counted> cpu-cycles
Events enabled
enabled(ack)
20.001279097 3,021,022 cpu-cycles
20.035044381 6,434,367 cpu-cycles
21.036923813 89,358,251 cpu-cycles
22.038825169 72,516,351 cpu-cycles
# time counts unit events
23.040715596 55,046,157 cpu-cycles
24.042643757 78,128,649 cpu-cycles
25.044558535 61,052,428 cpu-cycles
26.046452785 62,142,806 cpu-cycles
27.048353021 74,477,971 cpu-cycles
28.050241286 61,001,623 cpu-cycles
29.052149961 61,653,502 cpu-cycles
disabled(ack)
30.004980264 82,729,640 cpu-cycles
Events disabled
30.053516176 <not counted> cpu-cycles
31.055348366 <not counted> cpu-cycles
32.057202097 <not counted> cpu-cycles
33.059040702 <not counted> cpu-cycles
34.060843288 <not counted> cpu-cycles
35.000888624 <not counted> cpu-cycles
[root@host dir]#
[1] http://man7.org/linux/man-pages/man1/bash.1.html
[2] http://man7.org/linux/man-pages/man2/pipe.2.html
---
Alexey Budankov (9):
perf evlist: introduce control file descriptors
perf evlist: implement control command handling functions
perf stat: factor out event handling loop into a function
perf stat: extend -D,--delay option with -1 value
perf stat: implement control commands handling
perf stat: introduce --ctl-fd[-ack] options
perf record: extend -D,--delay option with -1 value
perf record: implement control commands handling
perf record: introduce --ctl-fd[-ack] options
tools/perf/Documentation/perf-record.txt | 44 +++++++-
tools/perf/Documentation/perf-stat.txt | 45 +++++++-
tools/perf/builtin-record.c | 39 ++++++-
tools/perf/builtin-stat.c | 137 ++++++++++++++++-------
tools/perf/builtin-trace.c | 2 +-
tools/perf/util/evlist.c | 131 ++++++++++++++++++++++
tools/perf/util/evlist.h | 24 ++++
tools/perf/util/record.h | 4 +-
tools/perf/util/stat.h | 4 +-
9 files changed, 379 insertions(+), 51 deletions(-)
Define and initialize control file descriptors.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/util/evlist.c | 3 +++
tools/perf/util/evlist.h | 3 +++
2 files changed, 6 insertions(+)
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 2a9de6491700..aa61619fa304 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -63,6 +63,9 @@ void evlist__init(struct evlist *evlist, struct perf_cpu_map *cpus,
perf_evlist__set_maps(&evlist->core, cpus, threads);
evlist->workload.pid = -1;
evlist->bkw_mmap_state = BKW_MMAP_NOTREADY;
+ evlist->ctl_fd = -1;
+ evlist->ctl_fd_ack = -1;
+ evlist->ctl_fd_pos = -1;
}
struct evlist *evlist__new(void)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index b6f325dfb4d2..62f259d89b41 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -74,6 +74,9 @@ struct evlist {
pthread_t th;
volatile int done;
} thread;
+ int ctl_fd;
+ int ctl_fd_ack;
+ int ctl_fd_pos;
};
struct evsel_str_handler {
--
2.24.1
Factor out event handling loop into handle_events() function.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-stat.c | 85 +++++++++++++++++++++++----------------
1 file changed, 50 insertions(+), 35 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index e0c1ad23c768..9775b0905146 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -371,6 +371,16 @@ static void process_interval(void)
print_counters(&rs, 0, NULL);
}
+static bool print_interval_and_stop(struct perf_stat_config *config, int *times)
+{
+ if (config->interval) {
+ process_interval();
+ if (interval_count && !(--(*times)))
+ return true;
+ }
+ return false;
+}
+
static void enable_counters(void)
{
if (stat_config.initial_delay)
@@ -436,6 +446,42 @@ static bool is_target_alive(struct target *_target,
return false;
}
+static int handle_events(pid_t pid, struct perf_stat_config *config)
+{
+ pid_t child = 0;
+ bool res, stop = false;
+ struct timespec time_to_sleep;
+ int sleep_time, status = 0, times = config->times;
+
+ if (config->interval)
+ sleep_time = config->interval;
+ else if (config->timeout)
+ sleep_time = config->timeout;
+ else
+ sleep_time = 1000;
+
+ time_to_sleep.tv_sec = sleep_time / MSEC_PER_SEC;
+ time_to_sleep.tv_nsec = (sleep_time % MSEC_PER_SEC) * NSEC_PER_MSEC;
+
+ do {
+ if (pid != -1)
+ child = waitpid(pid, &status, WNOHANG);
+ if (child || stop || done)
+ break;
+ nanosleep(&time_to_sleep, NULL);
+ if (pid == -1)
+ stop = !is_target_alive(&target, evsel_list->core.threads);
+ if (config->timeout) {
+ stop = !stop ? true : stop;
+ } else {
+ res = print_interval_and_stop(config, ×);
+ stop = !stop ? res : stop;
+ }
+ } while (1);
+
+ return status;
+}
+
enum counter_recovery {
COUNTER_SKIP,
COUNTER_RETRY,
@@ -494,12 +540,10 @@ static enum counter_recovery stat_handle_error(struct evsel *counter)
static int __run_perf_stat(int argc, const char **argv, int run_idx)
{
int interval = stat_config.interval;
- int times = stat_config.times;
int timeout = stat_config.timeout;
char msg[BUFSIZ];
unsigned long long t0, t1;
struct evsel *counter;
- struct timespec ts;
size_t l;
int status = 0;
const bool forks = (argc > 0);
@@ -508,17 +552,6 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
int i, cpu;
bool second_pass = false;
- if (interval) {
- ts.tv_sec = interval / USEC_PER_MSEC;
- ts.tv_nsec = (interval % USEC_PER_MSEC) * NSEC_PER_MSEC;
- } else if (timeout) {
- ts.tv_sec = timeout / USEC_PER_MSEC;
- ts.tv_nsec = (timeout % USEC_PER_MSEC) * NSEC_PER_MSEC;
- } else {
- ts.tv_sec = 1;
- ts.tv_nsec = 0;
- }
-
if (forks) {
if (perf_evlist__prepare_workload(evsel_list, &target, argv, is_pipe,
workload_exec_failed_signal) < 0) {
@@ -675,16 +708,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
perf_evlist__start_workload(evsel_list);
enable_counters();
- if (interval || timeout) {
- while (!waitpid(child_pid, &status, WNOHANG)) {
- nanosleep(&ts, NULL);
- if (timeout)
- break;
- process_interval();
- if (interval_count && !(--times))
- break;
- }
- }
+ if (interval || timeout)
+ handle_events(child_pid, &stat_config);
+
if (child_pid != -1) {
if (timeout)
kill(child_pid, SIGTERM);
@@ -701,18 +727,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
psignal(WTERMSIG(status), argv[0]);
} else {
enable_counters();
- while (!done) {
- nanosleep(&ts, NULL);
- if (!is_target_alive(&target, evsel_list->core.threads))
- break;
- if (timeout)
- break;
- if (interval) {
- process_interval();
- if (interval_count && !(--times))
- break;
- }
- }
+ handle_events(-1, &stat_config);
}
disable_counters();
--
2.24.1
Extend -D,--delay option with -1 value to start monitoring with
events disabled to be enabled later by enable command provided
via control file descriptor.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/Documentation/perf-stat.txt | 5 +++--
tools/perf/builtin-stat.c | 18 ++++++++++++++----
tools/perf/util/evlist.h | 4 ++++
tools/perf/util/stat.h | 2 +-
4 files changed, 22 insertions(+), 7 deletions(-)
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 3fb5028aef08..3b91b30d7672 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -226,8 +226,9 @@ mode, use --per-node in addition to -a. (system-wide).
-D msecs::
--delay msecs::
-After starting the program, wait msecs before measuring. This is useful to
-filter out the startup phase of the program, which is often very different.
+After starting the program, wait msecs before measuring (-1: start with events
+disabled"). This is useful to filter out the startup phase of the program,
+which is often very different.
-T::
--transaction::
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9775b0905146..73404b723592 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -383,16 +383,26 @@ static bool print_interval_and_stop(struct perf_stat_config *config, int *times)
static void enable_counters(void)
{
- if (stat_config.initial_delay)
+ if (stat_config.initial_delay < 0) {
+ pr_info(EVLIST_DISABLED_MSG);
+ return;
+ }
+
+ if (stat_config.initial_delay > 0) {
+ pr_info(EVLIST_DISABLED_MSG);
usleep(stat_config.initial_delay * USEC_PER_MSEC);
+ }
/*
* We need to enable counters only if:
* - we don't have tracee (attaching to task or cpu)
* - we have initial delay configured
*/
- if (!target__none(&target) || stat_config.initial_delay)
+ if (!target__none(&target) || stat_config.initial_delay) {
evlist__enable(evsel_list);
+ if (stat_config.initial_delay > 0)
+ pr_info(EVLIST_ENABLED_MSG);
+ }
}
static void disable_counters(void)
@@ -929,8 +939,8 @@ static struct option stat_options[] = {
"aggregate counts per thread", AGGR_THREAD),
OPT_SET_UINT(0, "per-node", &stat_config.aggr_mode,
"aggregate counts per numa node", AGGR_NODE),
- OPT_UINTEGER('D', "delay", &stat_config.initial_delay,
- "ms to wait before starting measurement after program start"),
+ OPT_INTEGER('D', "delay", &stat_config.initial_delay,
+ "ms to wait before starting measurement after program start (-1: start with events disabled"),
OPT_CALLBACK_NOOPT(0, "metric-only", &stat_config.metric_only, NULL,
"Only print computed metrics. No raw values", enable_metric_only),
OPT_BOOLEAN(0, "topdown", &topdown_run,
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 0554f8d2a65b..8a72612b8b3e 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -375,4 +375,8 @@ int evlist__initialize_ctlfd(struct evlist *evlist, int ctl_fd, int ctl_fd_ack);
int evlist__finalize_ctlfd(struct evlist *evlist);
int evlist__ctlfd_process(struct evlist *evlist, enum evlist_ctl_cmd *cmd);
+#define EVLIST_ENABLED_MSG "Events enabled\n"
+#define EVLIST_DISABLED_MSG "Events disabled\n"
+
+
#endif /* __PERF_EVLIST_H */
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index b4fdfaa7f2c0..027b9dcd902f 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -113,7 +113,7 @@ struct perf_stat_config {
FILE *output;
unsigned int interval;
unsigned int timeout;
- unsigned int initial_delay;
+ int initial_delay;
unsigned int unit_width;
unsigned int metric_only_len;
int times;
--
2.24.1
Introduce --ctl-fd[-ack] options to pass open file descriptors numbers
from command line. Extend perf-stat.txt file with --ctl-fd[-ack] options
description. Document possible usage model introduced by --ctl-fd[-ack]
options by providing example bash shell script.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/Documentation/perf-stat.txt | 40 ++++++++++++++++++++++++++
tools/perf/builtin-stat.c | 10 +++++++
tools/perf/util/stat.h | 2 ++
3 files changed, 52 insertions(+)
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 3b91b30d7672..7f7a0019fbfc 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -164,6 +164,46 @@ with it. --append may be used here. Examples:
3>results perf stat --log-fd 3 -- $cmd
3>>results perf stat --log-fd 3 --append -- $cmd
+--ctl-fd::
+--ctl-fd-ack::
+
+Listen on ctl-fd descriptor for command to control measurement ('enable': enable events,
+'disable': disable events). Optionally send control command completion ('ack') to fd-ack
+descriptor to synchronize with the controlling process. Example of bash shell script
+to enable and disable events during measurements:
+
+#!/bin/bash
+
+ctl_dir=/tmp/
+
+ctl_fifo=${ctl_dir}perf_ctl.fifo
+test -p ${ctl_fifo} && unlink ${ctl_fifo}
+mkfifo ${ctl_fifo}
+exec {ctl_fd}<>${ctl_fifo}
+
+ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
+test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
+mkfifo ${ctl_ack_fifo}
+exec {ctl_fd_ack}<>${ctl_ack_fifo}
+
+perf stat -D -1 -e cpu-cycles -a -I 1000 \
+ --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
+ -- sleep 30 &
+perf_pid=$!
+
+sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
+sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
+
+exec {ctl_fd_ack}>&-
+unlink ${ctl_ack_fifo}
+
+exec {ctl_fd}>&-
+unlink ${ctl_fifo}
+
+wait -n ${perf_pid}
+exit $?
+
+
--pre::
--post::
Pre and post measurement hooks, e.g.:
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index abea82a1ba24..88055aaf670f 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -187,6 +187,8 @@ static struct perf_stat_config stat_config = {
.metric_only_len = METRIC_ONLY_LEN,
.walltime_nsecs_stats = &walltime_nsecs_stats,
.big_num = true,
+ .ctl_fd = -1,
+ .ctl_fd_ack = -1
};
static inline void diff_timespec(struct timespec *r, struct timespec *a,
@@ -984,6 +986,10 @@ static struct option stat_options[] = {
"Use with 'percore' event qualifier to show the event "
"counts of one hardware thread by sum up total hardware "
"threads of same physical core"),
+ OPT_INTEGER(0, "ctl-fd", &stat_config.ctl_fd,
+ "Listen on fd descriptor for command to control measurement ('enable': enable events, 'disable': disable events)"),
+ OPT_INTEGER(0, "ctl-fd-ack", &stat_config.ctl_fd_ack,
+ "Send control command completion ('ack') to fd ack descriptor"),
OPT_END()
};
@@ -2180,6 +2186,8 @@ int cmd_stat(int argc, const char **argv)
signal(SIGALRM, skip_signal);
signal(SIGABRT, skip_signal);
+ evlist__initialize_ctlfd(evsel_list, stat_config.ctl_fd, stat_config.ctl_fd_ack);
+
status = 0;
for (run_idx = 0; forever || run_idx < stat_config.run_count; run_idx++) {
if (stat_config.run_count != 1 && verbose > 0)
@@ -2199,6 +2207,8 @@ int cmd_stat(int argc, const char **argv)
if (!forever && status != -1 && !interval)
print_counters(NULL, argc, argv);
+ evlist__finalize_ctlfd(evsel_list);
+
if (STAT_RECORD) {
/*
* We synthesize the kernel mmap record just so that older tools
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 027b9dcd902f..0b0fa3a2cde2 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -130,6 +130,8 @@ struct perf_stat_config {
struct perf_cpu_map *cpus_aggr_map;
u64 *walltime_run;
struct rblist metric_events;
+ int ctl_fd;
+ int ctl_fd_ack;
};
void update_stats(struct stats *stats, u64 val);
--
2.24.1
Introduce --ctl-fd[-ack] options to pass open file descriptors numbers
from command line. Extend perf-record.txt file with --ctl-fd[-ack]
options description. Document possible usage model introduced by
--ctl-fd[-ack] options by providing example bash shell script.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/Documentation/perf-record.txt | 39 ++++++++++++++++++++++++
tools/perf/builtin-record.c | 9 ++++++
tools/perf/util/record.h | 2 ++
3 files changed, 50 insertions(+)
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index c2c4ce7ccee2..5c012cfe68a4 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -614,6 +614,45 @@ appended unit character - B/K/M/G
The number of threads to run when synthesizing events for existing processes.
By default, the number of threads equals 1.
+--ctl-fd::
+--ctl-fd-ack::
+Listen on ctl-fd descriptor for command to control measurement ('enable': enable events,
+'disable': disable events. Optionally send control command completion ('ack') to fd-ack
+descriptor to synchronize with the controlling process. Example of bash shell script
+to enable and disable events during measurements:
+
+#!/bin/bash
+
+ctl_dir=/tmp/
+
+ctl_fifo=${ctl_dir}perf_ctl.fifo
+test -p ${ctl_fifo} && unlink ${ctl_fifo}
+mkfifo ${ctl_fifo}
+exec {ctl_fd}<>${ctl_fifo}
+
+ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
+test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
+mkfifo ${ctl_ack_fifo}
+exec {ctl_fd_ack}<>${ctl_ack_fifo}
+
+perf record -D -1 -e cpu-cycles -a \
+ --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
+ -- sleep 30 &
+perf_pid=$!
+
+sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
+sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
+
+exec {ctl_fd_ack}>&-
+unlink ${ctl_ack_fifo}
+
+exec {ctl_fd}>&-
+unlink ${ctl_fifo}
+
+wait -n ${perf_pid}
+exit $?
+
+
SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-list[1], linkperf:perf-intel-pt[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 72f388623364..218cfaafaf10 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1713,6 +1713,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
perf_evlist__start_workload(rec->evlist);
}
+ evlist__initialize_ctlfd(rec->evlist, opts->ctl_fd, opts->ctl_fd_ack);
+
if (opts->initial_delay) {
pr_info(EVLIST_DISABLED_MSG);
if (opts->initial_delay > 0) {
@@ -1859,6 +1861,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
record__synthesize_workload(rec, true);
out_child:
+ evlist__finalize_ctlfd(rec->evlist);
record__mmap_read_all(rec, true);
record__aio_mmap_read_sync(rec);
@@ -2340,6 +2343,8 @@ static struct record record = {
},
.mmap_flush = MMAP_FLUSH_DEFAULT,
.nr_threads_synthesize = 1,
+ .ctl_fd = -1,
+ .ctl_fd_ack = -1,
},
.tool = {
.sample = process_sample_event,
@@ -2535,6 +2540,10 @@ static struct option __record_options[] = {
OPT_UINTEGER(0, "num-thread-synthesize",
&record.opts.nr_threads_synthesize,
"number of threads to run for event synthesis"),
+ OPT_INTEGER(0, "ctl-fd", &record.opts.ctl_fd,
+ "Listen on fd descriptor for command to control measurement ('enable': enable events, 'disable': disable events)"),
+ OPT_INTEGER(0, "ctl-fd-ack", &record.opts.ctl_fd_ack,
+ "Send control command completion ('ack') to fd ack descriptor"),
OPT_END()
};
diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
index 96a73bbd8cd4..da18aeca3623 100644
--- a/tools/perf/util/record.h
+++ b/tools/perf/util/record.h
@@ -69,6 +69,8 @@ struct record_opts {
int mmap_flush;
unsigned int comp_level;
unsigned int nr_threads_synthesize;
+ int ctl_fd;
+ int ctl_fd_ack;
};
extern const char * const *record_usage;
--
2.24.1
Extend -D,--delay option with -1 to start collection
with events disabled to be enabled later by enable
command provided via control file descriptor.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/Documentation/perf-record.txt | 5 +++--
tools/perf/builtin-record.c | 12 ++++++++----
tools/perf/builtin-trace.c | 2 +-
tools/perf/util/record.h | 2 +-
4 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 561ef55743e2..c2c4ce7ccee2 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -407,8 +407,9 @@ if combined with -a or -C options.
-D::
--delay=::
-After starting the program, wait msecs before measuring. This is useful to
-filter out the startup phase of the program, which is often very different.
+After starting the program, wait msecs before measuring (-1: start with events
+disabled). This is useful to filter out the startup phase of the program, which
+is often very different.
-I::
--intr-regs::
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 4d4502b7fea0..50dc2fe626e5 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1713,8 +1713,12 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
}
if (opts->initial_delay) {
- usleep(opts->initial_delay * USEC_PER_MSEC);
- evlist__enable(rec->evlist);
+ pr_info(EVLIST_DISABLED_MSG);
+ if (opts->initial_delay > 0) {
+ usleep(opts->initial_delay * USEC_PER_MSEC);
+ evlist__enable(rec->evlist);
+ pr_info(EVLIST_ENABLED_MSG);
+ }
}
trigger_ready(&auxtrace_snapshot_trigger);
@@ -2422,8 +2426,8 @@ static struct option __record_options[] = {
OPT_CALLBACK('G', "cgroup", &record.evlist, "name",
"monitor event in cgroup name only",
parse_cgroups),
- OPT_UINTEGER('D', "delay", &record.opts.initial_delay,
- "ms to wait before starting measurement after program start"),
+ OPT_INTEGER('D', "delay", &record.opts.initial_delay,
+ "ms to wait before starting measurement after program start (-1: start with events disabled )"),
OPT_BOOLEAN(0, "kcore", &record.opts.kcore, "copy /proc/kcore"),
OPT_STRING('u', "uid", &record.opts.target.uid_str, "user",
"user to profile"),
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 56bcf1ab19f8..54a5c0264528 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -4773,7 +4773,7 @@ int cmd_trace(int argc, const char **argv)
"per thread proc mmap processing timeout in ms"),
OPT_CALLBACK('G', "cgroup", &trace, "name", "monitor event in cgroup name only",
trace__parse_cgroups),
- OPT_UINTEGER('D', "delay", &trace.opts.initial_delay,
+ OPT_INTEGER('D', "delay", &trace.opts.initial_delay,
"ms to wait before starting measurement after program "
"start"),
OPTS_EVSWITCH(&trace.evswitch),
diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
index 923565c3b155..96a73bbd8cd4 100644
--- a/tools/perf/util/record.h
+++ b/tools/perf/util/record.h
@@ -60,7 +60,7 @@ struct record_opts {
const char *auxtrace_snapshot_opts;
const char *auxtrace_sample_opts;
bool sample_transaction;
- unsigned initial_delay;
+ int initial_delay;
bool use_clockid;
clockid_t clockid;
u64 clockid_res_ns;
--
2.24.1
Implement handling of 'enable' and 'disable' control commands
coming from control file descriptor.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-record.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 50dc2fe626e5..72f388623364 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1505,6 +1505,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
bool disabled = false, draining = false;
int fd;
float ratio = 0;
+ enum evlist_ctl_cmd cmd = EVLIST_CTL_CMD_UNSUPPORTED;
atexit(record__sig_exit);
signal(SIGCHLD, sig_handler);
@@ -1802,8 +1803,23 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
* Propagate error, only if there's any. Ignore positive
* number of returned events and interrupt error.
*/
- if (err > 0 || (err < 0 && errno == EINTR))
+ if (err > 0 || (err < 0 && errno == EINTR)) {
err = 0;
+ if (evlist__ctlfd_process(rec->evlist, &cmd) > 0) {
+ switch (cmd) {
+ case EVLIST_CTL_CMD_ENABLE:
+ pr_info(EVLIST_ENABLED_MSG);
+ break;
+ case EVLIST_CTL_CMD_DISABLE:
+ pr_info(EVLIST_DISABLED_MSG);
+ break;
+ case EVLIST_CTL_CMD_ACK:
+ case EVLIST_CTL_CMD_UNSUPPORTED:
+ default:
+ break;
+ }
+ }
+ }
waking++;
if (evlist__filter_pollfd(rec->evlist, POLLERR | POLLHUP) == 0)
--
2.24.1
Implement functions of initialization, finalization and processing
of control commands coming from control file descriptors.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/util/evlist.c | 128 +++++++++++++++++++++++++++++++++++++++
tools/perf/util/evlist.h | 17 ++++++
2 files changed, 145 insertions(+)
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index aa61619fa304..ea14794e7389 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1707,3 +1707,131 @@ struct evsel *perf_evlist__reset_weak_group(struct evlist *evsel_list,
}
return leader;
}
+
+int evlist__initialize_ctlfd(struct evlist *evlist, int ctl_fd, int ctl_fd_ack)
+{
+ if (ctl_fd == -1) {
+ pr_debug("Control descriptor is not initialized\n");
+ return 0;
+ }
+
+ evlist->ctl_fd_pos = perf_evlist__add_pollfd(&evlist->core, ctl_fd, NULL, POLLIN);
+ if (evlist->ctl_fd_pos < 0) {
+ evlist->ctl_fd_pos = -1;
+ pr_err("Failed to add ctl fd entry: %m\n");
+ return -1;
+ }
+
+ evlist->ctl_fd = ctl_fd;
+ evlist->ctl_fd_ack = ctl_fd_ack;
+
+ return 0;
+}
+
+int evlist__finalize_ctlfd(struct evlist *evlist)
+{
+ if (evlist->ctl_fd_pos == -1)
+ return 0;
+
+ evlist->core.pollfd.entries[evlist->ctl_fd_pos].fd = -1;
+ evlist->ctl_fd_pos = -1;
+ evlist->ctl_fd_ack = -1;
+ evlist->ctl_fd = -1;
+
+ return 0;
+}
+
+static int evlist__ctlfd_recv(struct evlist *evlist, enum evlist_ctl_cmd *cmd,
+ char *cmd_data, size_t data_size)
+{
+ int err;
+ char c;
+ size_t bytes_read = 0;
+
+ memset(cmd_data, 0, data_size--);
+
+ do {
+ err = read(evlist->ctl_fd, &c, 1);
+ if (err > 0) {
+ if (c == '\n' || c == '\0')
+ break;
+ cmd_data[bytes_read++] = c;
+ if (bytes_read == data_size)
+ break;
+ } else {
+ if (err == -1)
+ pr_err("Failed to read from ctlfd %d: %m\n", evlist->ctl_fd);
+ break;
+ }
+ } while (1);
+
+ pr_debug("Message from ctl_fd: \"%s%s\"\n", cmd_data,
+ bytes_read == data_size ? "" : c == '\n' ? "\\n" : "\\0");
+
+ if (err > 0) {
+ if (!strncmp(cmd_data, EVLIST_CTL_CMD_ENABLE_TAG,
+ strlen(EVLIST_CTL_CMD_ENABLE_TAG))) {
+ *cmd = EVLIST_CTL_CMD_ENABLE;
+ } else if (!strncmp(cmd_data, EVLIST_CTL_CMD_DISABLE_TAG,
+ strlen(EVLIST_CTL_CMD_DISABLE_TAG))) {
+ *cmd = EVLIST_CTL_CMD_DISABLE;
+ }
+ }
+
+ return err;
+}
+
+static int evlist__ctlfd_ack(struct evlist *evlist)
+{
+ int err;
+
+ if (evlist->ctl_fd_ack == -1)
+ return 0;
+
+ err = write(evlist->ctl_fd_ack, EVLIST_CTL_CMD_ACK_TAG,
+ sizeof(EVLIST_CTL_CMD_ACK_TAG));
+ if (err == -1)
+ pr_err("failed to write to ctl_ack_fd %d: %m\n", evlist->ctl_fd_ack);
+
+ return err;
+}
+
+int evlist__ctlfd_process(struct evlist *evlist, enum evlist_ctl_cmd *cmd)
+{
+ int err = 0;
+ char cmd_data[EVLIST_CTL_CMD_MAX_LEN];
+ int ctlfd_pos = evlist->ctl_fd_pos;
+ struct pollfd *entries = evlist->core.pollfd.entries;
+
+ if (!entries[ctlfd_pos].revents)
+ return 0;
+
+ if (entries[ctlfd_pos].revents & POLLIN) {
+ err = evlist__ctlfd_recv(evlist, cmd, cmd_data,
+ EVLIST_CTL_CMD_MAX_LEN);
+ if (err > 0) {
+ switch (*cmd) {
+ case EVLIST_CTL_CMD_ENABLE:
+ evlist__enable(evlist);
+ break;
+ case EVLIST_CTL_CMD_DISABLE:
+ evlist__disable(evlist);
+ break;
+ case EVLIST_CTL_CMD_ACK:
+ case EVLIST_CTL_CMD_UNSUPPORTED:
+ default:
+ pr_debug("ctlfd: unsupported %d\n", *cmd);
+ break;
+ }
+ if (!(*cmd == EVLIST_CTL_CMD_ACK || *cmd == EVLIST_CTL_CMD_UNSUPPORTED))
+ evlist__ctlfd_ack(evlist);
+ }
+ }
+
+ if (entries[ctlfd_pos].revents & (POLLHUP | POLLERR))
+ evlist__finalize_ctlfd(evlist);
+ else
+ entries[ctlfd_pos].revents = 0;
+
+ return err;
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 62f259d89b41..0554f8d2a65b 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -358,4 +358,21 @@ void perf_evlist__force_leader(struct evlist *evlist);
struct evsel *perf_evlist__reset_weak_group(struct evlist *evlist,
struct evsel *evsel,
bool close);
+#define EVLIST_CTL_CMD_ENABLE_TAG "enable"
+#define EVLIST_CTL_CMD_DISABLE_TAG "disable"
+#define EVLIST_CTL_CMD_ACK_TAG "ack\n"
+
+#define EVLIST_CTL_CMD_MAX_LEN 64
+
+enum evlist_ctl_cmd {
+ EVLIST_CTL_CMD_UNSUPPORTED = 0,
+ EVLIST_CTL_CMD_ENABLE,
+ EVLIST_CTL_CMD_DISABLE,
+ EVLIST_CTL_CMD_ACK
+};
+
+int evlist__initialize_ctlfd(struct evlist *evlist, int ctl_fd, int ctl_fd_ack);
+int evlist__finalize_ctlfd(struct evlist *evlist);
+int evlist__ctlfd_process(struct evlist *evlist, enum evlist_ctl_cmd *cmd);
+
#endif /* __PERF_EVLIST_H */
--
2.24.1
Implement handling of 'enable' and 'disable' control commands
coming from control file descriptor.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-stat.c | 48 +++++++++++++++++++++++++++++----------
1 file changed, 36 insertions(+), 12 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 73404b723592..abea82a1ba24 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -460,8 +460,9 @@ static int handle_events(pid_t pid, struct perf_stat_config *config)
{
pid_t child = 0;
bool res, stop = false;
- struct timespec time_to_sleep;
- int sleep_time, status = 0, times = config->times;
+ int time_to_sleep, sleep_time, status = 0, times = config->times;
+ enum evlist_ctl_cmd cmd = EVLIST_CTL_CMD_UNSUPPORTED;
+ struct timespec time_start, time_stop, time_diff;
if (config->interval)
sleep_time = config->interval;
@@ -470,22 +471,45 @@ static int handle_events(pid_t pid, struct perf_stat_config *config)
else
sleep_time = 1000;
- time_to_sleep.tv_sec = sleep_time / MSEC_PER_SEC;
- time_to_sleep.tv_nsec = (sleep_time % MSEC_PER_SEC) * NSEC_PER_MSEC;
+ time_to_sleep = sleep_time;
do {
if (pid != -1)
child = waitpid(pid, &status, WNOHANG);
if (child || stop || done)
break;
- nanosleep(&time_to_sleep, NULL);
- if (pid == -1)
- stop = !is_target_alive(&target, evsel_list->core.threads);
- if (config->timeout) {
- stop = !stop ? true : stop;
- } else {
- res = print_interval_and_stop(config, ×);
- stop = !stop ? res : stop;
+ clock_gettime(CLOCK_MONOTONIC, &time_start);
+ if (!(evlist__poll(evsel_list, time_to_sleep) > 0)) { /* poll timeout or EINTR */
+ if (pid == -1)
+ stop = !is_target_alive(&target, evsel_list->core.threads);
+ if (config->timeout) {
+ stop = !stop ? true : stop;
+ } else {
+ res = print_interval_and_stop(config, ×);
+ stop = !stop ? res : stop;
+ }
+ time_to_sleep = sleep_time;
+ } else { /* fd revent */
+ if (evlist__ctlfd_process(evsel_list, &cmd) > 0) {
+ switch (cmd) {
+ case EVLIST_CTL_CMD_ENABLE:
+ pr_info(EVLIST_ENABLED_MSG);
+ stop = print_interval_and_stop(config, ×);
+ break;
+ case EVLIST_CTL_CMD_DISABLE:
+ stop = print_interval_and_stop(config, ×);
+ pr_info(EVLIST_DISABLED_MSG);
+ break;
+ case EVLIST_CTL_CMD_ACK:
+ case EVLIST_CTL_CMD_UNSUPPORTED:
+ default:
+ break;
+ }
+ }
+ clock_gettime(CLOCK_MONOTONIC, &time_stop);
+ diff_timespec(&time_diff, &time_stop, &time_start);
+ time_to_sleep -= time_diff.tv_sec * MSEC_PER_SEC +
+ time_diff.tv_nsec / NSEC_PER_MSEC;
}
} while (1);
--
2.24.1
Hi,
Is there anything else that could be done from my side to move this forward?
Thanks,
Alexey
On 13.05.2020 10:53, Alexey Budankov wrote:
>
> Changes in v3:
> - renamed functions and types from perf_evlist_ to evlist_ to avoid
> clash with libperf code;
> - extended commands to be strings of variable length consisting of
> command name and also possibly including command specific data;
> - merged docs update with the code changes;
> - updated docs for -D,--delay=-1 option for stat and record modes;
>
> v2: https://lore.kernel.org/lkml/[email protected]/
>
> Changes in v2:
> - renamed resume and pause commands to enable and disable ones, renamed
> CTL_CMD_RESUME and CTL_CMD_PAUSE to CTL_CMD_ENABLE and CTL_CMD_DISABLE
> to fit to the appropriate ioctls and avoid mixing up with PAUSE_OUTPUT
> ioctl;
> - factored out event handling loop into a handle_events() for stat mode;
> - separated -D,--delay=-1 into separate patches for stat and record modes;
>
> v1: https://lore.kernel.org/lkml/[email protected]/
>
> repo: tip of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core
>
> The patch set implements handling of 'start disabled', 'enable' and 'disable'
> external control commands which can be provided for stat and record modes
> of the tool from an external controlling process. 'start disabled' command
> can be used to postpone enabling of events in the beginning of a monitoring
> session. 'enable' and 'disable' commands can be used to enable and disable
> events correspondingly any time after the start of the session.
>
> The 'start disabled', 'enable' and 'disable' external control commands can be
> used to focus measurement on specially selected time intervals of workload
> execution. Focused measurement reduces tool intrusion and influence on
> workload behavior, reduces distortion and amount of collected and stored
> data, mitigates data accuracy loss because measurement and data capturing
> happen only during intervals of interest.
>
> A controlling process can be a bash shell script [1], native executable or
> any other language program that can directly work with file descriptors,
> e.g. pipes [2], and spawn a process, specially the tool one.
>
> -D,--delay <val> option is extended with -1 value to skip events enabling
> in the beginning of a monitoring session ('start disabled' command).
> --ctl-fd and --ctl-fd-ack command line options are introduced to provide the
> tool with a pair of file descriptors to listen to control commands and reply
> to the controlling process on the completion of received commands.
>
> The tool reads control command message from ctl-fd descriptor, handles the
> command and optionally replies acknowledgement message to fd-ack descriptor,
> if it is specified on the command line. 'enable' command is recognized as
> 'enable' string message and 'disable' command is recognized as 'disable'
> string message both received from ctl-fd descriptor. Completion message is
> 'ack\n' and sent to fd-ack descriptor.
>
> Example bash script demonstrating simple use case follows:
>
> #!/bin/bash
>
> ctl_dir=/tmp/
>
> ctl_fifo=${ctl_dir}perf_ctl.fifo
> test -p ${ctl_fifo} && unlink ${ctl_fifo}
> mkfifo ${ctl_fifo} && exec {ctl_fd}<>${ctl_fifo}
>
> ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
> test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
> mkfifo ${ctl_ack_fifo} && exec {ctl_fd_ack}<>${ctl_ack_fifo}
>
> perf stat -D -1 -e cpu-cycles -a -I 1000 \
> --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
> -- sleep 40 &
> perf_pid=$!
>
> sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
> sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
> sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e2 && echo "enabled(${e2})"
> sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d2 && echo "disabled(${d2})"
>
> exec {ctl_fd_ack}>&- && unlink ${ctl_ack_fifo}
> exec {ctl_fd}>&- && unlink ${ctl_fifo}
>
> wait -n ${perf_pid}
> exit $?
>
>
> Script output:
>
> [root@host dir] example
> Events disabled
> # time counts unit events
> 1.001101062 <not counted> cpu-cycles
> 2.002994944 <not counted> cpu-cycles
> 3.004864340 <not counted> cpu-cycles
> 4.006727177 <not counted> cpu-cycles
> Events enabled
> enabled(ack)
> 4.993808464 3,124,246 cpu-cycles
> 5.008597004 3,325,624 cpu-cycles
> 6.010387483 83,472,992 cpu-cycles
> 7.012266598 55,877,621 cpu-cycles
> 8.014175695 97,892,729 cpu-cycles
> 9.016056093 68,461,242 cpu-cycles
> 10.017937507 55,449,643 cpu-cycles
> 11.019830154 68,938,167 cpu-cycles
> 12.021719952 55,164,101 cpu-cycles
> 13.023627550 70,535,720 cpu-cycles
> 14.025580995 53,240,125 cpu-cycles
> disabled(ack)
> 14.997518260 53,558,068 cpu-cycles
> Events disabled
> 15.027216416 <not counted> cpu-cycles
> 16.029052729 <not counted> cpu-cycles
> 17.030904762 <not counted> cpu-cycles
> 18.032073424 <not counted> cpu-cycles
> 19.033805074 <not counted> cpu-cycles
> Events enabled
> enabled(ack)
> 20.001279097 3,021,022 cpu-cycles
> 20.035044381 6,434,367 cpu-cycles
> 21.036923813 89,358,251 cpu-cycles
> 22.038825169 72,516,351 cpu-cycles
> # time counts unit events
> 23.040715596 55,046,157 cpu-cycles
> 24.042643757 78,128,649 cpu-cycles
> 25.044558535 61,052,428 cpu-cycles
> 26.046452785 62,142,806 cpu-cycles
> 27.048353021 74,477,971 cpu-cycles
> 28.050241286 61,001,623 cpu-cycles
> 29.052149961 61,653,502 cpu-cycles
> disabled(ack)
> 30.004980264 82,729,640 cpu-cycles
> Events disabled
> 30.053516176 <not counted> cpu-cycles
> 31.055348366 <not counted> cpu-cycles
> 32.057202097 <not counted> cpu-cycles
> 33.059040702 <not counted> cpu-cycles
> 34.060843288 <not counted> cpu-cycles
> 35.000888624 <not counted> cpu-cycles
> [root@host dir]#
>
> [1] http://man7.org/linux/man-pages/man1/bash.1.html
> [2] http://man7.org/linux/man-pages/man2/pipe.2.html
>
> ---
> Alexey Budankov (9):
> perf evlist: introduce control file descriptors
> perf evlist: implement control command handling functions
> perf stat: factor out event handling loop into a function
> perf stat: extend -D,--delay option with -1 value
> perf stat: implement control commands handling
> perf stat: introduce --ctl-fd[-ack] options
> perf record: extend -D,--delay option with -1 value
> perf record: implement control commands handling
> perf record: introduce --ctl-fd[-ack] options
>
> tools/perf/Documentation/perf-record.txt | 44 +++++++-
> tools/perf/Documentation/perf-stat.txt | 45 +++++++-
> tools/perf/builtin-record.c | 39 ++++++-
> tools/perf/builtin-stat.c | 137 ++++++++++++++++-------
> tools/perf/builtin-trace.c | 2 +-
> tools/perf/util/evlist.c | 131 ++++++++++++++++++++++
> tools/perf/util/evlist.h | 24 ++++
> tools/perf/util/record.h | 4 +-
> tools/perf/util/stat.h | 4 +-
> 9 files changed, 379 insertions(+), 51 deletions(-)
>
On Mon, May 18, 2020 at 11:08:43AM +0300, Alexey Budankov wrote:
> Hi,
>
> Is there anything else that could be done from my side to move this forward?
sorry I did not get to this yet.. will check
jirka
On Wed, May 13, 2020 at 11:00:02AM +0300, Alexey Budankov wrote:
SNIP
> +
> +static int evlist__ctlfd_recv(struct evlist *evlist, enum evlist_ctl_cmd *cmd,
> + char *cmd_data, size_t data_size)
> +{
> + int err;
> + char c;
> + size_t bytes_read = 0;
> +
> + memset(cmd_data, 0, data_size--);
> +
> + do {
> + err = read(evlist->ctl_fd, &c, 1);
> + if (err > 0) {
> + if (c == '\n' || c == '\0')
> + break;
> + cmd_data[bytes_read++] = c;
> + if (bytes_read == data_size)
> + break;
> + } else {
> + if (err == -1)
> + pr_err("Failed to read from ctlfd %d: %m\n", evlist->ctl_fd);
> + break;
> + }
> + } while (1);
> +
> + pr_debug("Message from ctl_fd: \"%s%s\"\n", cmd_data,
> + bytes_read == data_size ? "" : c == '\n' ? "\\n" : "\\0");
> +
> + if (err > 0) {
> + if (!strncmp(cmd_data, EVLIST_CTL_CMD_ENABLE_TAG,
> + strlen(EVLIST_CTL_CMD_ENABLE_TAG))) {
> + *cmd = EVLIST_CTL_CMD_ENABLE;
> + } else if (!strncmp(cmd_data, EVLIST_CTL_CMD_DISABLE_TAG,
> + strlen(EVLIST_CTL_CMD_DISABLE_TAG))) {
> + *cmd = EVLIST_CTL_CMD_DISABLE;
> + }
are there more comands comming?
jirka
On Wed, May 13, 2020 at 11:03:03AM +0300, Alexey Budankov wrote:
>
> Introduce --ctl-fd[-ack] options to pass open file descriptors numbers
> from command line. Extend perf-stat.txt file with --ctl-fd[-ack] options
> description. Document possible usage model introduced by --ctl-fd[-ack]
> options by providing example bash shell script.
>
> Signed-off-by: Alexey Budankov <[email protected]>
> ---
> tools/perf/Documentation/perf-stat.txt | 40 ++++++++++++++++++++++++++
> tools/perf/builtin-stat.c | 10 +++++++
> tools/perf/util/stat.h | 2 ++
> 3 files changed, 52 insertions(+)
>
> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
> index 3b91b30d7672..7f7a0019fbfc 100644
> --- a/tools/perf/Documentation/perf-stat.txt
> +++ b/tools/perf/Documentation/perf-stat.txt
> @@ -164,6 +164,46 @@ with it. --append may be used here. Examples:
> 3>results perf stat --log-fd 3 -- $cmd
> 3>>results perf stat --log-fd 3 --append -- $cmd
>
> +--ctl-fd::
> +--ctl-fd-ack::
> +
> +Listen on ctl-fd descriptor for command to control measurement ('enable': enable events,
> +'disable': disable events). Optionally send control command completion ('ack') to fd-ack
> +descriptor to synchronize with the controlling process. Example of bash shell script
> +to enable and disable events during measurements:
> +
> +#!/bin/bash
> +
> +ctl_dir=/tmp/
> +
> +ctl_fifo=${ctl_dir}perf_ctl.fifo
> +test -p ${ctl_fifo} && unlink ${ctl_fifo}
> +mkfifo ${ctl_fifo}
> +exec {ctl_fd}<>${ctl_fifo}
> +
> +ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
> +test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
> +mkfifo ${ctl_ack_fifo}
> +exec {ctl_fd_ack}<>${ctl_ack_fifo}
> +
> +perf stat -D -1 -e cpu-cycles -a -I 1000 \
> + --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
> + -- sleep 30 &
> +perf_pid=$!
> +
> +sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
> +sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
> +
> +exec {ctl_fd_ack}>&-
> +unlink ${ctl_ack_fifo}
> +
> +exec {ctl_fd}>&-
> +unlink ${ctl_fifo}
> +
> +wait -n ${perf_pid}
> +exit $?
> +
> +
> --pre::
> --post::
> Pre and post measurement hooks, e.g.:
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index abea82a1ba24..88055aaf670f 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -187,6 +187,8 @@ static struct perf_stat_config stat_config = {
> .metric_only_len = METRIC_ONLY_LEN,
> .walltime_nsecs_stats = &walltime_nsecs_stats,
> .big_num = true,
> + .ctl_fd = -1,
> + .ctl_fd_ack = -1
> };
>
> static inline void diff_timespec(struct timespec *r, struct timespec *a,
> @@ -984,6 +986,10 @@ static struct option stat_options[] = {
> "Use with 'percore' event qualifier to show the event "
> "counts of one hardware thread by sum up total hardware "
> "threads of same physical core"),
> + OPT_INTEGER(0, "ctl-fd", &stat_config.ctl_fd,
> + "Listen on fd descriptor for command to control measurement ('enable': enable events, 'disable': disable events)"),
> + OPT_INTEGER(0, "ctl-fd-ack", &stat_config.ctl_fd_ack,
> + "Send control command completion ('ack') to fd ack descriptor"),
> OPT_END()
> };
>
> @@ -2180,6 +2186,8 @@ int cmd_stat(int argc, const char **argv)
> signal(SIGALRM, skip_signal);
> signal(SIGABRT, skip_signal);
>
> + evlist__initialize_ctlfd(evsel_list, stat_config.ctl_fd, stat_config.ctl_fd_ack);
please check the return value
jirka
> +
> status = 0;
> for (run_idx = 0; forever || run_idx < stat_config.run_count; run_idx++) {
> if (stat_config.run_count != 1 && verbose > 0)
> @@ -2199,6 +2207,8 @@ int cmd_stat(int argc, const char **argv)
> if (!forever && status != -1 && !interval)
> print_counters(NULL, argc, argv);
>
> + evlist__finalize_ctlfd(evsel_list);
> +
> if (STAT_RECORD) {
> /*
> * We synthesize the kernel mmap record just so that older tools
> diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
> index 027b9dcd902f..0b0fa3a2cde2 100644
> --- a/tools/perf/util/stat.h
> +++ b/tools/perf/util/stat.h
> @@ -130,6 +130,8 @@ struct perf_stat_config {
> struct perf_cpu_map *cpus_aggr_map;
> u64 *walltime_run;
> struct rblist metric_events;
> + int ctl_fd;
> + int ctl_fd_ack;
> };
>
> void update_stats(struct stats *stats, u64 val);
> --
> 2.24.1
>
>
On Wed, May 13, 2020 at 10:59:00AM +0300, Alexey Budankov wrote:
>
> Define and initialize control file descriptors.
>
> Signed-off-by: Alexey Budankov <[email protected]>
> ---
> tools/perf/util/evlist.c | 3 +++
> tools/perf/util/evlist.h | 3 +++
> 2 files changed, 6 insertions(+)
>
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 2a9de6491700..aa61619fa304 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -63,6 +63,9 @@ void evlist__init(struct evlist *evlist, struct perf_cpu_map *cpus,
> perf_evlist__set_maps(&evlist->core, cpus, threads);
> evlist->workload.pid = -1;
> evlist->bkw_mmap_state = BKW_MMAP_NOTREADY;
> + evlist->ctl_fd = -1;
> + evlist->ctl_fd_ack = -1;
> + evlist->ctl_fd_pos = -1;
> }
>
> struct evlist *evlist__new(void)
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index b6f325dfb4d2..62f259d89b41 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -74,6 +74,9 @@ struct evlist {
> pthread_t th;
> volatile int done;
> } thread;
> + int ctl_fd;
> + int ctl_fd_ack;
> + int ctl_fd_pos;
we are using the anonymous structs to keep related
parts together like for workload and thread
could you use it in there as well?
jirka
> };
>
> struct evsel_str_handler {
> --
> 2.24.1
>
On Wed, May 13, 2020 at 11:04:25AM +0300, Alexey Budankov wrote:
>
> Implement handling of 'enable' and 'disable' control commands
> coming from control file descriptor.
>
> Signed-off-by: Alexey Budankov <[email protected]>
> ---
> tools/perf/builtin-record.c | 18 +++++++++++++++++-
> 1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 50dc2fe626e5..72f388623364 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -1505,6 +1505,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
> bool disabled = false, draining = false;
> int fd;
> float ratio = 0;
> + enum evlist_ctl_cmd cmd = EVLIST_CTL_CMD_UNSUPPORTED;
>
> atexit(record__sig_exit);
> signal(SIGCHLD, sig_handler);
> @@ -1802,8 +1803,23 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
> * Propagate error, only if there's any. Ignore positive
> * number of returned events and interrupt error.
> */
> - if (err > 0 || (err < 0 && errno == EINTR))
> + if (err > 0 || (err < 0 && errno == EINTR)) {
> err = 0;
> + if (evlist__ctlfd_process(rec->evlist, &cmd) > 0) {
> + switch (cmd) {
> + case EVLIST_CTL_CMD_ENABLE:
> + pr_info(EVLIST_ENABLED_MSG);
> + break;
> + case EVLIST_CTL_CMD_DISABLE:
> + pr_info(EVLIST_DISABLED_MSG);
> + break;
> + case EVLIST_CTL_CMD_ACK:
> + case EVLIST_CTL_CMD_UNSUPPORTED:
> + default:
> + break;
> + }
> + }
> + }
so this will be processed only when:
if (hits == rec->samples) {
what if there's always somethign in the buffer? will this stall?
> waking++;
>
> if (evlist__filter_pollfd(rec->evlist, POLLERR | POLLHUP) == 0)
evlist__filter_pollfd will trigger draining if all the maps are closed,
but there's one more fd in pollfd now, will this prevent draining?
I wonder this would fit better to the sideband thread (util/sideband_evlist.c)
so we don't disturb the main thread with another check
jirka
> --
> 2.24.1
>
>
On Wed, May 13, 2020 at 11:00:47AM +0300, Alexey Budankov wrote:
>
> Factor out event handling loop into handle_events() function.
>
> Signed-off-by: Alexey Budankov <[email protected]>
> ---
> tools/perf/builtin-stat.c | 85 +++++++++++++++++++++++----------------
> 1 file changed, 50 insertions(+), 35 deletions(-)
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index e0c1ad23c768..9775b0905146 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -371,6 +371,16 @@ static void process_interval(void)
> print_counters(&rs, 0, NULL);
> }
>
> +static bool print_interval_and_stop(struct perf_stat_config *config, int *times)
> +{
> + if (config->interval) {
> + process_interval();
> + if (interval_count && !(--(*times)))
> + return true;
> + }
> + return false;
> +}
> +
> static void enable_counters(void)
> {
> if (stat_config.initial_delay)
> @@ -436,6 +446,42 @@ static bool is_target_alive(struct target *_target,
> return false;
> }
>
> +static int handle_events(pid_t pid, struct perf_stat_config *config)
> +{
> + pid_t child = 0;
> + bool res, stop = false;
> + struct timespec time_to_sleep;
> + int sleep_time, status = 0, times = config->times;
> +
> + if (config->interval)
> + sleep_time = config->interval;
> + else if (config->timeout)
> + sleep_time = config->timeout;
> + else
> + sleep_time = 1000;
> +
> + time_to_sleep.tv_sec = sleep_time / MSEC_PER_SEC;
> + time_to_sleep.tv_nsec = (sleep_time % MSEC_PER_SEC) * NSEC_PER_MSEC;
> +
> + do {
> + if (pid != -1)
> + child = waitpid(pid, &status, WNOHANG);
> + if (child || stop || done)
> + break;
> + nanosleep(&time_to_sleep, NULL);
> + if (pid == -1)
> + stop = !is_target_alive(&target, evsel_list->core.threads);
> + if (config->timeout) {
> + stop = !stop ? true : stop;
> + } else {
> + res = print_interval_and_stop(config, ×);
> + stop = !stop ? res : stop;
> + }
> + } while (1);
> +
> + return status;
> +}
> +
> enum counter_recovery {
> COUNTER_SKIP,
> COUNTER_RETRY,
> @@ -494,12 +540,10 @@ static enum counter_recovery stat_handle_error(struct evsel *counter)
> static int __run_perf_stat(int argc, const char **argv, int run_idx)
> {
> int interval = stat_config.interval;
> - int times = stat_config.times;
> int timeout = stat_config.timeout;
> char msg[BUFSIZ];
> unsigned long long t0, t1;
> struct evsel *counter;
> - struct timespec ts;
> size_t l;
> int status = 0;
> const bool forks = (argc > 0);
> @@ -508,17 +552,6 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
> int i, cpu;
> bool second_pass = false;
>
> - if (interval) {
> - ts.tv_sec = interval / USEC_PER_MSEC;
> - ts.tv_nsec = (interval % USEC_PER_MSEC) * NSEC_PER_MSEC;
> - } else if (timeout) {
> - ts.tv_sec = timeout / USEC_PER_MSEC;
> - ts.tv_nsec = (timeout % USEC_PER_MSEC) * NSEC_PER_MSEC;
> - } else {
> - ts.tv_sec = 1;
> - ts.tv_nsec = 0;
> - }
> -
> if (forks) {
> if (perf_evlist__prepare_workload(evsel_list, &target, argv, is_pipe,
> workload_exec_failed_signal) < 0) {
> @@ -675,16 +708,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
> perf_evlist__start_workload(evsel_list);
> enable_counters();
>
> - if (interval || timeout) {
> - while (!waitpid(child_pid, &status, WNOHANG)) {
> - nanosleep(&ts, NULL);
> - if (timeout)
> - break;
> - process_interval();
> - if (interval_count && !(--times))
> - break;
> - }
> - }
> + if (interval || timeout)
> + handle_events(child_pid, &stat_config);
> +
> if (child_pid != -1) {
> if (timeout)
> kill(child_pid, SIGTERM);
> @@ -701,18 +727,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
> psignal(WTERMSIG(status), argv[0]);
> } else {
> enable_counters();
> - while (!done) {
> - nanosleep(&ts, NULL);
> - if (!is_target_alive(&target, evsel_list->core.threads))
> - break;
> - if (timeout)
> - break;
> - if (interval) {
> - process_interval();
> - if (interval_count && !(--times))
> - break;
> - }
> - }
> + handle_events(-1, &stat_config);
this makes me worried.. I'm not sure if it's good idea
to squash these 2 looops into one, because they are already
complex as they are.. and one of you following patches is
making it even more complex
wouldn't it be better if you just add single call into
each of them.. that would poll on your fd and process the
commands if needed?
jirka
On Wed, May 13, 2020 at 11:05:08AM +0300, Alexey Budankov wrote:
>
> Introduce --ctl-fd[-ack] options to pass open file descriptors numbers
> from command line. Extend perf-record.txt file with --ctl-fd[-ack]
> options description. Document possible usage model introduced by
> --ctl-fd[-ack] options by providing example bash shell script.
>
> Signed-off-by: Alexey Budankov <[email protected]>
> ---
> tools/perf/Documentation/perf-record.txt | 39 ++++++++++++++++++++++++
> tools/perf/builtin-record.c | 9 ++++++
> tools/perf/util/record.h | 2 ++
> 3 files changed, 50 insertions(+)
>
> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
> index c2c4ce7ccee2..5c012cfe68a4 100644
> --- a/tools/perf/Documentation/perf-record.txt
> +++ b/tools/perf/Documentation/perf-record.txt
> @@ -614,6 +614,45 @@ appended unit character - B/K/M/G
> The number of threads to run when synthesizing events for existing processes.
> By default, the number of threads equals 1.
>
> +--ctl-fd::
> +--ctl-fd-ack::
> +Listen on ctl-fd descriptor for command to control measurement ('enable': enable events,
> +'disable': disable events. Optionally send control command completion ('ack') to fd-ack
> +descriptor to synchronize with the controlling process. Example of bash shell script
> +to enable and disable events during measurements:
> +
> +#!/bin/bash
> +
> +ctl_dir=/tmp/
> +
> +ctl_fifo=${ctl_dir}perf_ctl.fifo
> +test -p ${ctl_fifo} && unlink ${ctl_fifo}
> +mkfifo ${ctl_fifo}
> +exec {ctl_fd}<>${ctl_fifo}
> +
> +ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
> +test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
> +mkfifo ${ctl_ack_fifo}
> +exec {ctl_fd_ack}<>${ctl_ack_fifo}
> +
> +perf record -D -1 -e cpu-cycles -a \
> + --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
> + -- sleep 30 &
> +perf_pid=$!
> +
> +sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
> +sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
> +
> +exec {ctl_fd_ack}>&-
> +unlink ${ctl_ack_fifo}
> +
> +exec {ctl_fd}>&-
> +unlink ${ctl_fifo}
> +
> +wait -n ${perf_pid}
> +exit $?
> +
> +
> SEE ALSO
> --------
> linkperf:perf-stat[1], linkperf:perf-list[1], linkperf:perf-intel-pt[1]
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 72f388623364..218cfaafaf10 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -1713,6 +1713,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
> perf_evlist__start_workload(rec->evlist);
> }
>
> + evlist__initialize_ctlfd(rec->evlist, opts->ctl_fd, opts->ctl_fd_ack);
please check return value in here
jirka
> +
> if (opts->initial_delay) {
> pr_info(EVLIST_DISABLED_MSG);
> if (opts->initial_delay > 0) {
> @@ -1859,6 +1861,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
> record__synthesize_workload(rec, true);
>
> out_child:
> + evlist__finalize_ctlfd(rec->evlist);
> record__mmap_read_all(rec, true);
> record__aio_mmap_read_sync(rec);
>
> @@ -2340,6 +2343,8 @@ static struct record record = {
> },
> .mmap_flush = MMAP_FLUSH_DEFAULT,
> .nr_threads_synthesize = 1,
> + .ctl_fd = -1,
> + .ctl_fd_ack = -1,
> },
> .tool = {
> .sample = process_sample_event,
> @@ -2535,6 +2540,10 @@ static struct option __record_options[] = {
> OPT_UINTEGER(0, "num-thread-synthesize",
> &record.opts.nr_threads_synthesize,
> "number of threads to run for event synthesis"),
> + OPT_INTEGER(0, "ctl-fd", &record.opts.ctl_fd,
> + "Listen on fd descriptor for command to control measurement ('enable': enable events, 'disable': disable events)"),
> + OPT_INTEGER(0, "ctl-fd-ack", &record.opts.ctl_fd_ack,
> + "Send control command completion ('ack') to fd ack descriptor"),
> OPT_END()
> };
>
> diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
> index 96a73bbd8cd4..da18aeca3623 100644
> --- a/tools/perf/util/record.h
> +++ b/tools/perf/util/record.h
> @@ -69,6 +69,8 @@ struct record_opts {
> int mmap_flush;
> unsigned int comp_level;
> unsigned int nr_threads_synthesize;
> + int ctl_fd;
> + int ctl_fd_ack;
> };
>
> extern const char * const *record_usage;
> --
> 2.24.1
>
>
On 20.05.2020 15:38, Jiri Olsa wrote:
> On Wed, May 13, 2020 at 10:59:00AM +0300, Alexey Budankov wrote:
>>
>> Define and initialize control file descriptors.
>>
>> Signed-off-by: Alexey Budankov <[email protected]>
>> ---
>> tools/perf/util/evlist.c | 3 +++
>> tools/perf/util/evlist.h | 3 +++
>> 2 files changed, 6 insertions(+)
>>
>> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
>> index 2a9de6491700..aa61619fa304 100644
>> --- a/tools/perf/util/evlist.c
>> +++ b/tools/perf/util/evlist.c
>> @@ -63,6 +63,9 @@ void evlist__init(struct evlist *evlist, struct perf_cpu_map *cpus,
>> perf_evlist__set_maps(&evlist->core, cpus, threads);
>> evlist->workload.pid = -1;
>> evlist->bkw_mmap_state = BKW_MMAP_NOTREADY;
>> + evlist->ctl_fd = -1;
>> + evlist->ctl_fd_ack = -1;
>> + evlist->ctl_fd_pos = -1;
>> }
>>
>> struct evlist *evlist__new(void)
>> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
>> index b6f325dfb4d2..62f259d89b41 100644
>> --- a/tools/perf/util/evlist.h
>> +++ b/tools/perf/util/evlist.h
>> @@ -74,6 +74,9 @@ struct evlist {
>> pthread_t th;
>> volatile int done;
>> } thread;
>> + int ctl_fd;
>> + int ctl_fd_ack;
>> + int ctl_fd_pos;
>
> we are using the anonymous structs to keep related
> parts together like for workload and thread
>
> could you use it in there as well?
Accepted in v4.
~Alexey
>
> jirka
>
>> };
>>
>> struct evsel_str_handler {
>> --
>> 2.24.1
>>
>
On 20.05.2020 15:38, Jiri Olsa wrote:
> On Wed, May 13, 2020 at 11:00:02AM +0300, Alexey Budankov wrote:
>
> SNIP
>
>> +
>> +static int evlist__ctlfd_recv(struct evlist *evlist, enum evlist_ctl_cmd *cmd,
>> + char *cmd_data, size_t data_size)
>> +{
>> + int err;
>> + char c;
>> + size_t bytes_read = 0;
>> +
>> + memset(cmd_data, 0, data_size--);
>> +
>> + do {
>> + err = read(evlist->ctl_fd, &c, 1);
>> + if (err > 0) {
>> + if (c == '\n' || c == '\0')
>> + break;
>> + cmd_data[bytes_read++] = c;
>> + if (bytes_read == data_size)
>> + break;
>> + } else {
>> + if (err == -1)
>> + pr_err("Failed to read from ctlfd %d: %m\n", evlist->ctl_fd);
>> + break;
>> + }
>> + } while (1);
>> +
>> + pr_debug("Message from ctl_fd: \"%s%s\"\n", cmd_data,
>> + bytes_read == data_size ? "" : c == '\n' ? "\\n" : "\\0");
>> +
>> + if (err > 0) {
>> + if (!strncmp(cmd_data, EVLIST_CTL_CMD_ENABLE_TAG,
>> + strlen(EVLIST_CTL_CMD_ENABLE_TAG))) {
>> + *cmd = EVLIST_CTL_CMD_ENABLE;
>> + } else if (!strncmp(cmd_data, EVLIST_CTL_CMD_DISABLE_TAG,
>> + strlen(EVLIST_CTL_CMD_DISABLE_TAG))) {
>> + *cmd = EVLIST_CTL_CMD_DISABLE;
>> + }
>
> are there more comands comming?
Currently no.
~Alexey
On 20.05.2020 15:38, Jiri Olsa wrote:
> On Wed, May 13, 2020 at 11:03:03AM +0300, Alexey Budankov wrote:
>>
>> Introduce --ctl-fd[-ack] options to pass open file descriptors numbers
>> from command line. Extend perf-stat.txt file with --ctl-fd[-ack] options
>> description. Document possible usage model introduced by --ctl-fd[-ack]
>> options by providing example bash shell script.
>>
>> Signed-off-by: Alexey Budankov <[email protected]>
>> ---
>> tools/perf/Documentation/perf-stat.txt | 40 ++++++++++++++++++++++++++
>> tools/perf/builtin-stat.c | 10 +++++++
>> tools/perf/util/stat.h | 2 ++
>> 3 files changed, 52 insertions(+)
>>
>> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
>> index 3b91b30d7672..7f7a0019fbfc 100644
>> --- a/tools/perf/Documentation/perf-stat.txt
>> +++ b/tools/perf/Documentation/perf-stat.txt
>> @@ -164,6 +164,46 @@ with it. --append may be used here. Examples:
>> 3>results perf stat --log-fd 3 -- $cmd
>> 3>>results perf stat --log-fd 3 --append -- $cmd
>>
>> +--ctl-fd::
>> +--ctl-fd-ack::
>> +
>> +Listen on ctl-fd descriptor for command to control measurement ('enable': enable events,
>> +'disable': disable events). Optionally send control command completion ('ack') to fd-ack
>> +descriptor to synchronize with the controlling process. Example of bash shell script
>> +to enable and disable events during measurements:
>> +
>> +#!/bin/bash
>> +
>> +ctl_dir=/tmp/
>> +
>> +ctl_fifo=${ctl_dir}perf_ctl.fifo
>> +test -p ${ctl_fifo} && unlink ${ctl_fifo}
>> +mkfifo ${ctl_fifo}
>> +exec {ctl_fd}<>${ctl_fifo}
>> +
>> +ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
>> +test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
>> +mkfifo ${ctl_ack_fifo}
>> +exec {ctl_fd_ack}<>${ctl_ack_fifo}
>> +
>> +perf stat -D -1 -e cpu-cycles -a -I 1000 \
>> + --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
>> + -- sleep 30 &
>> +perf_pid=$!
>> +
>> +sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
>> +sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
>> +
>> +exec {ctl_fd_ack}>&-
>> +unlink ${ctl_ack_fifo}
>> +
>> +exec {ctl_fd}>&-
>> +unlink ${ctl_fifo}
>> +
>> +wait -n ${perf_pid}
>> +exit $?
>> +
>> +
>> --pre::
>> --post::
>> Pre and post measurement hooks, e.g.:
>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>> index abea82a1ba24..88055aaf670f 100644
>> --- a/tools/perf/builtin-stat.c
>> +++ b/tools/perf/builtin-stat.c
>> @@ -187,6 +187,8 @@ static struct perf_stat_config stat_config = {
>> .metric_only_len = METRIC_ONLY_LEN,
>> .walltime_nsecs_stats = &walltime_nsecs_stats,
>> .big_num = true,
>> + .ctl_fd = -1,
>> + .ctl_fd_ack = -1
>> };
>>
>> static inline void diff_timespec(struct timespec *r, struct timespec *a,
>> @@ -984,6 +986,10 @@ static struct option stat_options[] = {
>> "Use with 'percore' event qualifier to show the event "
>> "counts of one hardware thread by sum up total hardware "
>> "threads of same physical core"),
>> + OPT_INTEGER(0, "ctl-fd", &stat_config.ctl_fd,
>> + "Listen on fd descriptor for command to control measurement ('enable': enable events, 'disable': disable events)"),
>> + OPT_INTEGER(0, "ctl-fd-ack", &stat_config.ctl_fd_ack,
>> + "Send control command completion ('ack') to fd ack descriptor"),
>> OPT_END()
>> };
>>
>> @@ -2180,6 +2186,8 @@ int cmd_stat(int argc, const char **argv)
>> signal(SIGALRM, skip_signal);
>> signal(SIGABRT, skip_signal);
>>
>> + evlist__initialize_ctlfd(evsel_list, stat_config.ctl_fd, stat_config.ctl_fd_ack);
>
> please check the return value
Accepted in v4.
~Alexey
On 20.05.2020 15:38, Jiri Olsa wrote:
> On Wed, May 13, 2020 at 11:00:47AM +0300, Alexey Budankov wrote:
>>
>> Factor out event handling loop into handle_events() function.
>>
>> Signed-off-by: Alexey Budankov <[email protected]>
>> ---
>> tools/perf/builtin-stat.c | 85 +++++++++++++++++++++++----------------
>> 1 file changed, 50 insertions(+), 35 deletions(-)
>>
>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>> index e0c1ad23c768..9775b0905146 100644
>> --- a/tools/perf/builtin-stat.c
>> +++ b/tools/perf/builtin-stat.c
>> @@ -371,6 +371,16 @@ static void process_interval(void)
>> print_counters(&rs, 0, NULL);
>> }
>>
>> +static bool print_interval_and_stop(struct perf_stat_config *config, int *times)
>> +{
>> + if (config->interval) {
>> + process_interval();
>> + if (interval_count && !(--(*times)))
>> + return true;
>> + }
>> + return false;
>> +}
>> +
>> static void enable_counters(void)
>> {
>> if (stat_config.initial_delay)
>> @@ -436,6 +446,42 @@ static bool is_target_alive(struct target *_target,
>> return false;
>> }
>>
>> +static int handle_events(pid_t pid, struct perf_stat_config *config)
>> +{
>> + pid_t child = 0;
>> + bool res, stop = false;
>> + struct timespec time_to_sleep;
>> + int sleep_time, status = 0, times = config->times;
>> +
>> + if (config->interval)
>> + sleep_time = config->interval;
>> + else if (config->timeout)
>> + sleep_time = config->timeout;
>> + else
>> + sleep_time = 1000;
>> +
>> + time_to_sleep.tv_sec = sleep_time / MSEC_PER_SEC;
>> + time_to_sleep.tv_nsec = (sleep_time % MSEC_PER_SEC) * NSEC_PER_MSEC;
>> +
>> + do {
>> + if (pid != -1)
>> + child = waitpid(pid, &status, WNOHANG);
>> + if (child || stop || done)
>> + break;
>> + nanosleep(&time_to_sleep, NULL);
>> + if (pid == -1)
>> + stop = !is_target_alive(&target, evsel_list->core.threads);
>> + if (config->timeout) {
>> + stop = !stop ? true : stop;
>> + } else {
>> + res = print_interval_and_stop(config, ×);
>> + stop = !stop ? res : stop;
>> + }
>> + } while (1);
>> +
>> + return status;
>> +}
>> +
>> enum counter_recovery {
>> COUNTER_SKIP,
>> COUNTER_RETRY,
>> @@ -494,12 +540,10 @@ static enum counter_recovery stat_handle_error(struct evsel *counter)
>> static int __run_perf_stat(int argc, const char **argv, int run_idx)
>> {
>> int interval = stat_config.interval;
>> - int times = stat_config.times;
>> int timeout = stat_config.timeout;
>> char msg[BUFSIZ];
>> unsigned long long t0, t1;
>> struct evsel *counter;
>> - struct timespec ts;
>> size_t l;
>> int status = 0;
>> const bool forks = (argc > 0);
>> @@ -508,17 +552,6 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>> int i, cpu;
>> bool second_pass = false;
>>
>> - if (interval) {
>> - ts.tv_sec = interval / USEC_PER_MSEC;
>> - ts.tv_nsec = (interval % USEC_PER_MSEC) * NSEC_PER_MSEC;
>> - } else if (timeout) {
>> - ts.tv_sec = timeout / USEC_PER_MSEC;
>> - ts.tv_nsec = (timeout % USEC_PER_MSEC) * NSEC_PER_MSEC;
>> - } else {
>> - ts.tv_sec = 1;
>> - ts.tv_nsec = 0;
>> - }
>> -
>> if (forks) {
>> if (perf_evlist__prepare_workload(evsel_list, &target, argv, is_pipe,
>> workload_exec_failed_signal) < 0) {
>> @@ -675,16 +708,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>> perf_evlist__start_workload(evsel_list);
>> enable_counters();
>>
>> - if (interval || timeout) {
>> - while (!waitpid(child_pid, &status, WNOHANG)) {
>> - nanosleep(&ts, NULL);
>> - if (timeout)
>> - break;
>> - process_interval();
>> - if (interval_count && !(--times))
>> - break;
>> - }
>> - }
>> + if (interval || timeout)
>> + handle_events(child_pid, &stat_config);
>> +
>> if (child_pid != -1) {
>> if (timeout)
>> kill(child_pid, SIGTERM);
>> @@ -701,18 +727,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>> psignal(WTERMSIG(status), argv[0]);
>> } else {
>> enable_counters();
>> - while (!done) {
>> - nanosleep(&ts, NULL);
>> - if (!is_target_alive(&target, evsel_list->core.threads))
>> - break;
>> - if (timeout)
>> - break;
>> - if (interval) {
>> - process_interval();
>> - if (interval_count && !(--times))
>> - break;
>> - }
>> - }
>> + handle_events(-1, &stat_config);
>
> this makes me worried.. I'm not sure if it's good idea
> to squash these 2 looops into one, because they are already
> complex as they are.. and one of you following patches is
> making it even more complex
Loops bodies are mostly identical. The only difference is in events
they wait for and API used for that. Adding of more events will
complicate further. The code is duplicated, thus needs refactoring.
If the following patch complicates lets organize the patch it into
several smaller functions.
>
> wouldn't it be better if you just add single call into
> each of them.. that would poll on your fd and process the
> commands if needed?
That's of course possible, but doesn't manage existing complexity
at the first place - __run_perf_stat().
Let's still have handle_events() as a general dispatcher and implement
handlers for different events as separate functions?
~Alexey
On 20.05.2020 15:38, Jiri Olsa wrote:
> On Wed, May 13, 2020 at 11:05:08AM +0300, Alexey Budankov wrote:
>>
>> Introduce --ctl-fd[-ack] options to pass open file descriptors numbers
>> from command line. Extend perf-record.txt file with --ctl-fd[-ack]
>> options description. Document possible usage model introduced by
>> --ctl-fd[-ack] options by providing example bash shell script.
>>
>> Signed-off-by: Alexey Budankov <[email protected]>
>> ---
>> tools/perf/Documentation/perf-record.txt | 39 ++++++++++++++++++++++++
>> tools/perf/builtin-record.c | 9 ++++++
>> tools/perf/util/record.h | 2 ++
>> 3 files changed, 50 insertions(+)
>>
>> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
>> index c2c4ce7ccee2..5c012cfe68a4 100644
>> --- a/tools/perf/Documentation/perf-record.txt
>> +++ b/tools/perf/Documentation/perf-record.txt
>> @@ -614,6 +614,45 @@ appended unit character - B/K/M/G
>> The number of threads to run when synthesizing events for existing processes.
>> By default, the number of threads equals 1.
>>
>> +--ctl-fd::
>> +--ctl-fd-ack::
>> +Listen on ctl-fd descriptor for command to control measurement ('enable': enable events,
>> +'disable': disable events. Optionally send control command completion ('ack') to fd-ack
>> +descriptor to synchronize with the controlling process. Example of bash shell script
>> +to enable and disable events during measurements:
>> +
>> +#!/bin/bash
>> +
>> +ctl_dir=/tmp/
>> +
>> +ctl_fifo=${ctl_dir}perf_ctl.fifo
>> +test -p ${ctl_fifo} && unlink ${ctl_fifo}
>> +mkfifo ${ctl_fifo}
>> +exec {ctl_fd}<>${ctl_fifo}
>> +
>> +ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
>> +test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
>> +mkfifo ${ctl_ack_fifo}
>> +exec {ctl_fd_ack}<>${ctl_ack_fifo}
>> +
>> +perf record -D -1 -e cpu-cycles -a \
>> + --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
>> + -- sleep 30 &
>> +perf_pid=$!
>> +
>> +sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
>> +sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
>> +
>> +exec {ctl_fd_ack}>&-
>> +unlink ${ctl_ack_fifo}
>> +
>> +exec {ctl_fd}>&-
>> +unlink ${ctl_fifo}
>> +
>> +wait -n ${perf_pid}
>> +exit $?
>> +
>> +
>> SEE ALSO
>> --------
>> linkperf:perf-stat[1], linkperf:perf-list[1], linkperf:perf-intel-pt[1]
>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>> index 72f388623364..218cfaafaf10 100644
>> --- a/tools/perf/builtin-record.c
>> +++ b/tools/perf/builtin-record.c
>> @@ -1713,6 +1713,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>> perf_evlist__start_workload(rec->evlist);
>> }
>>
>> + evlist__initialize_ctlfd(rec->evlist, opts->ctl_fd, opts->ctl_fd_ack);
>
> please check return value in here
Accepted in v4.
~Alexey
On 20.05.2020 15:38, Jiri Olsa wrote:
> On Wed, May 13, 2020 at 11:04:25AM +0300, Alexey Budankov wrote:
>>
>> Implement handling of 'enable' and 'disable' control commands
>> coming from control file descriptor.
>>
>> Signed-off-by: Alexey Budankov <[email protected]>
>> ---
>> tools/perf/builtin-record.c | 18 +++++++++++++++++-
>> 1 file changed, 17 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>> index 50dc2fe626e5..72f388623364 100644
>> --- a/tools/perf/builtin-record.c
>> +++ b/tools/perf/builtin-record.c
>> @@ -1505,6 +1505,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>> bool disabled = false, draining = false;
>> int fd;
>> float ratio = 0;
>> + enum evlist_ctl_cmd cmd = EVLIST_CTL_CMD_UNSUPPORTED;
>>
>> atexit(record__sig_exit);
>> signal(SIGCHLD, sig_handler);
>> @@ -1802,8 +1803,23 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>> * Propagate error, only if there's any. Ignore positive
>> * number of returned events and interrupt error.
>> */
>> - if (err > 0 || (err < 0 && errno == EINTR))
>> + if (err > 0 || (err < 0 && errno == EINTR)) {
>> err = 0;
>> + if (evlist__ctlfd_process(rec->evlist, &cmd) > 0) {
>> + switch (cmd) {
>> + case EVLIST_CTL_CMD_ENABLE:
>> + pr_info(EVLIST_ENABLED_MSG);
>> + break;
>> + case EVLIST_CTL_CMD_DISABLE:
>> + pr_info(EVLIST_DISABLED_MSG);
>> + break;
>> + case EVLIST_CTL_CMD_ACK:
>> + case EVLIST_CTL_CMD_UNSUPPORTED:
>> + default:
>> + break;
>> + }
>> + }
>> + }
>
> so this will be processed only when:
>
> if (hits == rec->samples) {
>
> what if there's always somethign in the buffer? will this stall?
>
>> waking++;
>>
>> if (evlist__filter_pollfd(rec->evlist, POLLERR | POLLHUP) == 0)
>
> evlist__filter_pollfd will trigger draining if all the maps are closed,
> but there's one more fd in pollfd now, will this prevent draining?
Good catch. This needs redesign to notice pending commands during
active trace streaming. The same is regarding draining.
>
> I wonder this would fit better to the sideband thread (util/sideband_evlist.c)
> so we don't disturb the main thread with another check
~Alexey
On Wed, May 20, 2020 at 06:17:40PM +0300, Alexey Budankov wrote:
SNIP
> >> @@ -675,16 +708,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
> >> perf_evlist__start_workload(evsel_list);
> >> enable_counters();
> >>
> >> - if (interval || timeout) {
> >> - while (!waitpid(child_pid, &status, WNOHANG)) {
> >> - nanosleep(&ts, NULL);
> >> - if (timeout)
> >> - break;
> >> - process_interval();
> >> - if (interval_count && !(--times))
> >> - break;
> >> - }
> >> - }
> >> + if (interval || timeout)
> >> + handle_events(child_pid, &stat_config);
> >> +
> >> if (child_pid != -1) {
> >> if (timeout)
> >> kill(child_pid, SIGTERM);
> >> @@ -701,18 +727,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
> >> psignal(WTERMSIG(status), argv[0]);
> >> } else {
> >> enable_counters();
> >> - while (!done) {
> >> - nanosleep(&ts, NULL);
> >> - if (!is_target_alive(&target, evsel_list->core.threads))
> >> - break;
> >> - if (timeout)
> >> - break;
> >> - if (interval) {
> >> - process_interval();
> >> - if (interval_count && !(--times))
> >> - break;
> >> - }
> >> - }
> >> + handle_events(-1, &stat_config);
> >
> > this makes me worried.. I'm not sure if it's good idea
> > to squash these 2 looops into one, because they are already
> > complex as they are.. and one of you following patches is
> > making it even more complex
>
> Loops bodies are mostly identical. The only difference is in events
> they wait for and API used for that. Adding of more events will
> complicate further. The code is duplicated, thus needs refactoring.
> If the following patch complicates lets organize the patch it into
> several smaller functions.
yea, that might help
jirka
>
> >
> > wouldn't it be better if you just add single call into
> > each of them.. that would poll on your fd and process the
> > commands if needed?
>
> That's of course possible, but doesn't manage existing complexity
> at the first place - __run_perf_stat().
>
> Let's still have handle_events() as a general dispatcher and implement
> handlers for different events as separate functions?
>
> ~Alexey
>