2020-03-27 08:36:08

by Alexey Budankov

[permalink] [raw]
Subject: [PATCH v1 0/8] perf: support resume and pause commands in stat and record modes


The patch set implements handling of 'start paused', 'resume' and 'pause'
external control commands which can be provided for stat and record modes
of the tool from an external controlling process. 'start paused' command
can be used to postpone enabling of events in the beginning of a monitoring
session. 'resume' and 'pause' commands can be used to enable and disable
events correspondingly any time after the start of the session.

The 'start paused', resume and 'pause' external control commands can be
used to focus measurement on specially selected time intervals of workload
execution. Focused measurement reduces tool intrusion and influence on
workload behavior, reduces distortion and amount of collected and stored
data, mitigates data accuracy loss because measurement and data capturing
happen only during intervals of interest.

A controlling process can be a bash shell script [1], native executable or
any other language program that can directly work with file descriptors,
e.g. pipes [2], and spawn a process, specially the tool one.

-D,--delay <val> option is extended with -1 value to skip events enabling
in the beginning of a monitoring session ('start paused' command). --ctl-fd
and --ctl-fd-ack command line options are introduced to provide the tool
with a pair of file descriptors to listen to 'resume' and 'pause' commands
and reply to an external controlling process on the completion of received
commands processing.

The tool reads two byte control command message from ctl-fd descriptor,
handles the command and optionally replies two bytes acknowledgement message
to fd-ack descriptor, if it is specified on the command line. 'resume' command
is recognized as 'r' character message and 'pause' command is recognized as
'p' character message both received from ctl-fd descriptor. Completion message
is 'a''\n' and sent to fd-ack descriptor.

Bash script demonstrating simple use case follows:

#!/bin/bash

ctl_dir=/tmp/

ctl_fifo=${ctl_dir}perf_ctl.fifo
test -p ${ctl_fifo} && unlink ${ctl_fifo}
mkfifo ${ctl_fifo} && exec {ctl_fd}<>${ctl_fifo}

ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
mkfifo ${ctl_ack_fifo} && exec {ctl_fd_ack}<>${ctl_ack_fifo}

perf stat -D -1 -e cpu-cycles -a -I 1000 \
--ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
-- sleep 40 &
perf_pid=$!

sleep 5 && echo 'r' >&${ctl_fd} && read -u ${ctl_fd_ack} r1 && echo "resumed(${r1})"
sleep 10 && echo 'p' >&${ctl_fd} && read -u ${ctl_fd_ack} p1 && echo "paused(${p1})"
sleep 5 && echo 'r' >&${ctl_fd} && read -u ${ctl_fd_ack} r2 && echo "resumed(${r2})"
sleep 10 && echo 'p' >&${ctl_fd} && read -u ${ctl_fd_ack} p2 && echo "paused(${p2})"

exec {ctl_fd_ack}>&- && unlink ${ctl_ack_fifo}
exec {ctl_fd}>&- && unlink ${ctl_fifo}

wait -n ${perf_pid}
exit $?


Script output:

[root@host acme] example
Monitoring paused
# time counts unit events
1.001101062 <not counted> cpu-cycles
2.002994944 <not counted> cpu-cycles
3.004864340 <not counted> cpu-cycles
4.006727177 <not counted> cpu-cycles
Monitoring resumed
resumed(a)
4.993808464 3,124,246 cpu-cycles
5.008597004 3,325,624 cpu-cycles
6.010387483 83,472,992 cpu-cycles
7.012266598 55,877,621 cpu-cycles
8.014175695 97,892,729 cpu-cycles
9.016056093 68,461,242 cpu-cycles
10.017937507 55,449,643 cpu-cycles
11.019830154 68,938,167 cpu-cycles
12.021719952 55,164,101 cpu-cycles
13.023627550 70,535,720 cpu-cycles
14.025580995 53,240,125 cpu-cycles
paused(a)
14.997518260 53,558,068 cpu-cycles
Monitoring paused
15.027216416 <not counted> cpu-cycles
16.029052729 <not counted> cpu-cycles
17.030904762 <not counted> cpu-cycles
18.032073424 <not counted> cpu-cycles
19.033805074 <not counted> cpu-cycles
Monitoring resumed
resumed(a)
20.001279097 3,021,022 cpu-cycles
20.035044381 6,434,367 cpu-cycles
21.036923813 89,358,251 cpu-cycles
22.038825169 72,516,351 cpu-cycles
# time counts unit events
23.040715596 55,046,157 cpu-cycles
24.042643757 78,128,649 cpu-cycles
25.044558535 61,052,428 cpu-cycles
26.046452785 62,142,806 cpu-cycles
27.048353021 74,477,971 cpu-cycles
28.050241286 61,001,623 cpu-cycles
29.052149961 61,653,502 cpu-cycles
paused(a)
30.004980264 82,729,640 cpu-cycles
Monitoring paused
30.053516176 <not counted> cpu-cycles
31.055348366 <not counted> cpu-cycles
32.057202097 <not counted> cpu-cycles
33.059040702 <not counted> cpu-cycles
34.060843288 <not counted> cpu-cycles
35.000888624 <not counted> cpu-cycles
[root@host acme]#

repo: git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core
sha1: fe2413eefd7f21aade7eca7272332a1845c11837

[1] http://man7.org/linux/man-pages/man1/bash.1.html
[2] http://man7.org/linux/man-pages/man2/pipe.2.html

---
Alexey Budankov (8):
perf evlist: introduce control file descriptors
perf evlist: implement control command handling functions
perf stat: introduce control descriptors and --ctl-fd[-ack] options
perf stat: implement resume and pause control commands handling
perf docs: extend stat mode docs with info on --ctl-fd[-ack] options
perf record: introduce control descriptors and --ctl-fd[-ack] options
perf record: implement resume and pause control commands handling
perf docs: extend record mode docs with info on --ctl-fd[-ack] options

tools/perf/Documentation/perf-record.txt | 37 ++++++
tools/perf/Documentation/perf-stat.txt | 38 +++++++
tools/perf/builtin-record.c | 39 ++++++-
tools/perf/builtin-stat.c | 138 ++++++++++++++++-------
tools/perf/builtin-trace.c | 2 +-
tools/perf/util/evlist.c | 103 +++++++++++++++++
tools/perf/util/evlist.h | 18 +++
tools/perf/util/record.h | 4 +-
tools/perf/util/stat.h | 4 +-
9 files changed, 335 insertions(+), 48 deletions(-)


2020-03-27 08:46:52

by Alexey Budankov

[permalink] [raw]
Subject: [PATCH v1 1/8] perf evlist: introduce control file descriptors


Define and initialize control file descriptors.

Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/util/evlist.c | 3 +++
tools/perf/util/evlist.h | 3 +++
2 files changed, 6 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 1548237b6558..1afd87cfa027 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -62,6 +62,9 @@ void evlist__init(struct evlist *evlist, struct perf_cpu_map *cpus,
perf_evlist__set_maps(&evlist->core, cpus, threads);
evlist->workload.pid = -1;
evlist->bkw_mmap_state = BKW_MMAP_NOTREADY;
+ evlist->ctl_fd = -1;
+ evlist->ctl_fd_ack = -1;
+ evlist->ctl_fd_pos = -1;
}

struct evlist *evlist__new(void)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index f5bd5c386df1..ac3dd895ef8f 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -74,6 +74,9 @@ struct evlist {
pthread_t th;
volatile int done;
} thread;
+ int ctl_fd;
+ int ctl_fd_ack;
+ int ctl_fd_pos;
};

struct evsel_str_handler {
--
2.24.1

2020-03-27 08:47:49

by Alexey Budankov

[permalink] [raw]
Subject: [PATCH v1 2/8] perf evlist: implement control command handling functions


Implement functions of initialization, finalization and processing
of control commands coming from control file descriptors.

Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/util/evlist.c | 100 +++++++++++++++++++++++++++++++++++++++
tools/perf/util/evlist.h | 12 +++++
2 files changed, 112 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 1afd87cfa027..56b01a22963b 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1834,3 +1834,103 @@ void perf_evlist__stop_sb_thread(struct evlist *evlist)
pthread_join(evlist->thread.th, NULL);
evlist__delete(evlist);
}
+
+int perf_evlist__initialize_ctlfd(struct evlist *evlist, int ctl_fd, int ctl_fd_ack)
+{
+ if (ctl_fd == -1) {
+ pr_debug("Control descriptor is not initialized\n");
+ return 0;
+ }
+
+ evlist->ctl_fd_pos = perf_evlist__add_pollfd(&evlist->core, ctl_fd, NULL, POLLIN);
+ if (evlist->ctl_fd_pos < 0) {
+ evlist->ctl_fd_pos = -1;
+ pr_err("Failed to add ctl fd entry: %m\n");
+ return -1;
+ }
+
+ evlist->ctl_fd = ctl_fd;
+ evlist->ctl_fd_ack = ctl_fd_ack;
+
+ return 0;
+}
+
+int perf_evlist__finalize_ctlfd(struct evlist *evlist)
+{
+ if (evlist->ctl_fd_pos == -1)
+ return 0;
+
+ evlist->core.pollfd.entries[evlist->ctl_fd_pos].fd = -1;
+ evlist->ctl_fd_pos = -1;
+ evlist->ctl_fd_ack = -1;
+ evlist->ctl_fd = -1;
+
+ return 0;
+}
+
+static int perf_evlist__ctlfd_recv(struct evlist *evlist, enum evlist_ctl_cmd *cmd)
+{
+ int err;
+ char buf[2];
+
+ err = read(evlist->ctl_fd, &buf, sizeof(buf));
+ if (err > 0)
+ *cmd = buf[0];
+ else if (err == -1)
+ pr_err("Failed to read from ctlfd %d: %m\n", evlist->ctl_fd);
+
+ return err;
+}
+
+static int perf_evlist__ctlfd_ack(struct evlist *evlist)
+{
+ int err;
+ char buf[2] = {CTL_CMD_ACK, '\n'};
+
+ if (evlist->ctl_fd_ack == -1)
+ return 0;
+
+ err = write(evlist->ctl_fd_ack, buf, sizeof(buf));
+ if (err == -1)
+ pr_err("failed to write to ctl_ack_fd %d: %m\n", evlist->ctl_fd_ack);
+
+ return err;
+}
+
+int perf_evlist__ctlfd_process(struct evlist *evlist, enum evlist_ctl_cmd *cmd)
+{
+ int err = 0;
+ int ctlfd_pos = evlist->ctl_fd_pos;
+ struct pollfd *entries = evlist->core.pollfd.entries;
+
+ if (!entries[ctlfd_pos].revents)
+ return 0;
+
+ if (entries[ctlfd_pos].revents & POLLIN) {
+ err = perf_evlist__ctlfd_recv(evlist, cmd);
+ if (err > 0) {
+ switch (*cmd) {
+ case CTL_CMD_RESUME:
+ evlist__enable(evlist);
+ break;
+ case CTL_CMD_PAUSE:
+ evlist__disable(evlist);
+ break;
+ case CTL_CMD_ACK:
+ case CTL_CMD_UNSUPPORTED:
+ default:
+ pr_debug("ctlfd: unsupported %d\n", *cmd);
+ break;
+ }
+ if (!(*cmd == CTL_CMD_ACK || *cmd == CTL_CMD_UNSUPPORTED))
+ perf_evlist__ctlfd_ack(evlist);
+ }
+ }
+
+ if (entries[ctlfd_pos].revents & (POLLHUP | POLLERR))
+ perf_evlist__finalize_ctlfd(evlist);
+ else
+ entries[ctlfd_pos].revents = 0;
+
+ return err;
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index ac3dd895ef8f..a94b2993fafc 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -361,4 +361,16 @@ void perf_evlist__force_leader(struct evlist *evlist);
struct evsel *perf_evlist__reset_weak_group(struct evlist *evlist,
struct evsel *evsel,
bool close);
+
+enum evlist_ctl_cmd {
+ CTL_CMD_UNSUPPORTED = 0,
+ CTL_CMD_RESUME = 'r',
+ CTL_CMD_PAUSE = 'p',
+ CTL_CMD_ACK = 'a'
+};
+
+int perf_evlist__initialize_ctlfd(struct evlist *evlist, int ctl_fd, int ctl_fd_ack);
+int perf_evlist__finalize_ctlfd(struct evlist *evlist);
+int perf_evlist__ctlfd_process(struct evlist *evlist, enum evlist_ctl_cmd *cmd);
+
#endif /* __PERF_EVLIST_H */
--
2.24.1


2020-03-27 08:49:00

by Alexey Budankov

[permalink] [raw]
Subject: [PATCH v1 3/8] perf stat: introduce control descriptors and --ctl-fd[-ack] options


Introduce control file descriptors and --ctl-fd[-ack] options to pass
control descriptors from command line. Extend --delay option with -1
value to start collection in paused mode to be resumed later by resume
command provided via control file descriptor.

Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-stat.c | 28 ++++++++++++++++++++++++----
tools/perf/util/evlist.h | 3 +++
tools/perf/util/stat.h | 4 +++-
3 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index ec053dc1e35c..0a1b79fd3f48 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -187,6 +187,8 @@ static struct perf_stat_config stat_config = {
.metric_only_len = METRIC_ONLY_LEN,
.walltime_nsecs_stats = &walltime_nsecs_stats,
.big_num = true,
+ .ctl_fd = -1,
+ .ctl_fd_ack = -1
};

static inline void diff_timespec(struct timespec *r, struct timespec *a,
@@ -373,16 +375,26 @@ static void process_interval(void)

static void enable_counters(void)
{
- if (stat_config.initial_delay)
+ if (stat_config.initial_delay < 0) {
+ pr_info(PERF_EVLIST__PAUSED_MSG);
+ return;
+ }
+
+ if (stat_config.initial_delay > 0) {
+ pr_info(PERF_EVLIST__PAUSED_MSG);
usleep(stat_config.initial_delay * USEC_PER_MSEC);
+ }

/*
* We need to enable counters only if:
* - we don't have tracee (attaching to task or cpu)
* - we have initial delay configured
*/
- if (!target__none(&target) || stat_config.initial_delay)
+ if (!target__none(&target) || stat_config.initial_delay) {
evlist__enable(evsel_list);
+ if (stat_config.initial_delay > 0)
+ pr_info(PERF_EVLIST__RESUMED_MSG);
+ }
}

static void disable_counters(void)
@@ -912,8 +924,8 @@ static struct option stat_options[] = {
"aggregate counts per thread", AGGR_THREAD),
OPT_SET_UINT(0, "per-node", &stat_config.aggr_mode,
"aggregate counts per numa node", AGGR_NODE),
- OPT_UINTEGER('D', "delay", &stat_config.initial_delay,
- "ms to wait before starting measurement after program start"),
+ OPT_INTEGER('D', "delay", &stat_config.initial_delay,
+ "ms to wait before starting measurement after program start (-1: start paused"),
OPT_CALLBACK_NOOPT(0, "metric-only", &stat_config.metric_only, NULL,
"Only print computed metrics. No raw values", enable_metric_only),
OPT_BOOLEAN(0, "topdown", &topdown_run,
@@ -933,6 +945,10 @@ static struct option stat_options[] = {
"Use with 'percore' event qualifier to show the event "
"counts of one hardware thread by sum up total hardware "
"threads of same physical core"),
+ OPT_INTEGER(0, "ctl-fd", &stat_config.ctl_fd,
+ "Listen on fd descriptor for command to control measurement ('r': resume, 'p': pause)"),
+ OPT_INTEGER(0, "ctl-fd-ack", &stat_config.ctl_fd_ack,
+ "Send control command completion ('a') to fd ack descriptor"),
OPT_END()
};

@@ -2129,6 +2145,8 @@ int cmd_stat(int argc, const char **argv)
signal(SIGALRM, skip_signal);
signal(SIGABRT, skip_signal);

+ perf_evlist__initialize_ctlfd(evsel_list, stat_config.ctl_fd, stat_config.ctl_fd_ack);
+
status = 0;
for (run_idx = 0; forever || run_idx < stat_config.run_count; run_idx++) {
if (stat_config.run_count != 1 && verbose > 0)
@@ -2148,6 +2166,8 @@ int cmd_stat(int argc, const char **argv)
if (!forever && status != -1 && !interval)
print_counters(NULL, argc, argv);

+ perf_evlist__finalize_ctlfd(evsel_list);
+
if (STAT_RECORD) {
/*
* We synthesize the kernel mmap record just so that older tools
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index a94b2993fafc..c91368483074 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -369,6 +369,9 @@ enum evlist_ctl_cmd {
CTL_CMD_ACK = 'a'
};

+#define PERF_EVLIST__RESUMED_MSG "Monitoring resumed\n"
+#define PERF_EVLIST__PAUSED_MSG "Monitoring paused\n"
+
int perf_evlist__initialize_ctlfd(struct evlist *evlist, int ctl_fd, int ctl_fd_ack);
int perf_evlist__finalize_ctlfd(struct evlist *evlist);
int perf_evlist__ctlfd_process(struct evlist *evlist, enum evlist_ctl_cmd *cmd);
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index b4fdfaa7f2c0..0b0fa3a2cde2 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -113,7 +113,7 @@ struct perf_stat_config {
FILE *output;
unsigned int interval;
unsigned int timeout;
- unsigned int initial_delay;
+ int initial_delay;
unsigned int unit_width;
unsigned int metric_only_len;
int times;
@@ -130,6 +130,8 @@ struct perf_stat_config {
struct perf_cpu_map *cpus_aggr_map;
u64 *walltime_run;
struct rblist metric_events;
+ int ctl_fd;
+ int ctl_fd_ack;
};

void update_stats(struct stats *stats, u64 val);
--
2.24.1


2020-03-27 08:49:08

by Alexey Budankov

[permalink] [raw]
Subject: [PATCH v1 4/8] perf stat: implement resume and pause control commands handling


Implement handle_events() function to handle multiple events
coming from different sources during the measurement in various
modes. Events can come from workload being monitored, signals can
asynchronously arrive, control file descriptors can deliver resume
and pause commands from external processes.

Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-stat.c | 110 +++++++++++++++++++++++++-------------
1 file changed, 74 insertions(+), 36 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 0a1b79fd3f48..58a1b2ff90f7 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -373,6 +373,16 @@ static void process_interval(void)
print_counters(&rs, 0, NULL);
}

+static bool print_interval_and_stop(struct perf_stat_config *config, int *times)
+{
+ if (config->interval) {
+ process_interval();
+ if (interval_count && !(--(*times)))
+ return true;
+ }
+ return false;
+}
+
static void enable_counters(void)
{
if (stat_config.initial_delay < 0) {
@@ -448,6 +458,66 @@ static bool is_target_alive(struct target *_target,
return false;
}

+static int handle_events(pid_t pid, struct perf_stat_config *config)
+{
+ pid_t child = 0;
+ bool res, stop = false;
+ int sleep_time, time_to_sleep, status = 0, times = config->times;
+ enum evlist_ctl_cmd cmd = CTL_CMD_UNSUPPORTED;
+ struct timespec time_start, time_stop, time_diff;
+
+ if (config->interval)
+ sleep_time = config->interval;
+ else if (config->timeout)
+ sleep_time = config->timeout;
+ else
+ sleep_time = 1000;
+
+ time_to_sleep = sleep_time;
+
+ do {
+ if (pid != -1)
+ child = wait4(pid, &status, WNOHANG, &(config->ru_data));
+ if (child || stop || done)
+ break;
+ clock_gettime(CLOCK_MONOTONIC, &time_start);
+ if (evlist__poll(evsel_list, time_to_sleep) > 0) { /* fd revent */
+ if (perf_evlist__ctlfd_process(evsel_list, &cmd) > 0) {
+ switch (cmd) {
+ case CTL_CMD_RESUME:
+ pr_info(PERF_EVLIST__RESUMED_MSG);
+ stop = print_interval_and_stop(config, &times);
+ break;
+ case CTL_CMD_PAUSE:
+ stop = print_interval_and_stop(config, &times);
+ pr_info(PERF_EVLIST__PAUSED_MSG);
+ break;
+ case CTL_CMD_ACK:
+ case CTL_CMD_UNSUPPORTED:
+ default:
+ break;
+ }
+ }
+ clock_gettime(CLOCK_MONOTONIC, &time_stop);
+ diff_timespec(&time_diff, &time_stop, &time_start);
+ time_to_sleep -= time_diff.tv_sec * MSEC_PER_SEC +
+ time_diff.tv_nsec / NSEC_PER_MSEC;
+ } else { /* poll timeout or EINTR */
+ if (pid == -1)
+ stop = !is_target_alive(&target, evsel_list->core.threads);
+ if (config->timeout) {
+ stop = !stop ? true : stop;
+ } else {
+ res = print_interval_and_stop(config, &times);
+ stop = !stop ? res : stop;
+ }
+ time_to_sleep = sleep_time;
+ }
+ } while (1);
+
+ return status;
+}
+
enum counter_recovery {
COUNTER_SKIP,
COUNTER_RETRY,
@@ -507,12 +577,10 @@ static enum counter_recovery stat_handle_error(struct evsel *counter)
static int __run_perf_stat(int argc, const char **argv, int run_idx)
{
int interval = stat_config.interval;
- int times = stat_config.times;
int timeout = stat_config.timeout;
char msg[BUFSIZ];
unsigned long long t0, t1;
struct evsel *counter;
- struct timespec ts;
size_t l;
int status = 0;
const bool forks = (argc > 0);
@@ -521,17 +589,6 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
int i, cpu;
bool second_pass = false;

- if (interval) {
- ts.tv_sec = interval / USEC_PER_MSEC;
- ts.tv_nsec = (interval % USEC_PER_MSEC) * NSEC_PER_MSEC;
- } else if (timeout) {
- ts.tv_sec = timeout / USEC_PER_MSEC;
- ts.tv_nsec = (timeout % USEC_PER_MSEC) * NSEC_PER_MSEC;
- } else {
- ts.tv_sec = 1;
- ts.tv_nsec = 0;
- }
-
if (forks) {
if (perf_evlist__prepare_workload(evsel_list, &target, argv, is_pipe,
workload_exec_failed_signal) < 0) {
@@ -688,18 +745,10 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
perf_evlist__start_workload(evsel_list);
enable_counters();

- if (interval || timeout) {
- while (!waitpid(child_pid, &status, WNOHANG)) {
- nanosleep(&ts, NULL);
- if (timeout)
- break;
- process_interval();
- if (interval_count && !(--times))
- break;
- }
- }
- if (child_pid != -1)
+ if (stat_config.ctl_fd == -1 && !interval && !timeout)
wait4(child_pid, &status, 0, &stat_config.ru_data);
+ else
+ status = handle_events(child_pid, &stat_config);

if (workload_exec_errno) {
const char *emsg = str_error_r(workload_exec_errno, msg, sizeof(msg));
@@ -711,18 +760,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
psignal(WTERMSIG(status), argv[0]);
} else {
enable_counters();
- while (!done) {
- nanosleep(&ts, NULL);
- if (!is_target_alive(&target, evsel_list->core.threads))
- break;
- if (timeout)
- break;
- if (interval) {
- process_interval();
- if (interval_count && !(--times))
- break;
- }
- }
+ handle_events(-1, &stat_config);
}

disable_counters();
--
2.24.1


2020-03-27 08:50:28

by Alexey Budankov

[permalink] [raw]
Subject: [PATCH v1 6/8] perf record: introduce control descriptors and --ctl-fd[-ack] options


Introduce control file descriptors and --ctl-fd[-ack] options to pass
control descriptors from command line. Extend --delay option with -1
value to start collection in paused mode to be resumed later by resume
command provided via control file descriptor.

Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-record.c | 18 ++++++++++++++----
tools/perf/builtin-trace.c | 2 +-
tools/perf/util/record.h | 4 +++-
3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 4c301466101b..f99751943b40 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1621,8 +1621,12 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
}

if (opts->initial_delay) {
- usleep(opts->initial_delay * USEC_PER_MSEC);
- evlist__enable(rec->evlist);
+ pr_info(PERF_EVLIST__PAUSED_MSG);
+ if (opts->initial_delay > 0) {
+ usleep(opts->initial_delay * USEC_PER_MSEC);
+ evlist__enable(rec->evlist);
+ pr_info(PERF_EVLIST__RESUMED_MSG);
+ }
}

trigger_ready(&auxtrace_snapshot_trigger);
@@ -2218,6 +2222,8 @@ static struct record record = {
.default_per_cpu = true,
},
.mmap_flush = MMAP_FLUSH_DEFAULT,
+ .ctl_fd = -1,
+ .ctl_fd_ack = -1,
},
.tool = {
.sample = process_sample_event,
@@ -2320,8 +2326,8 @@ static struct option __record_options[] = {
OPT_CALLBACK('G', "cgroup", &record.evlist, "name",
"monitor event in cgroup name only",
parse_cgroups),
- OPT_UINTEGER('D', "delay", &record.opts.initial_delay,
- "ms to wait before starting measurement after program start"),
+ OPT_INTEGER('D', "delay", &record.opts.initial_delay,
+ "ms to wait before starting measurement after program start (-1: start paused)"),
OPT_BOOLEAN(0, "kcore", &record.opts.kcore, "copy /proc/kcore"),
OPT_STRING('u', "uid", &record.opts.target.uid_str, "user",
"user to profile"),
@@ -2405,6 +2411,10 @@ static struct option __record_options[] = {
#endif
OPT_CALLBACK(0, "max-size", &record.output_max_size,
"size", "Limit the maximum size of the output file", parse_output_max_size),
+ OPT_INTEGER(0, "ctl-fd", &record.opts.ctl_fd,
+ "Listen on fd descriptor for command to control measurement ('r': resume, 'p': pause)"),
+ OPT_INTEGER(0, "ctl-fd-ack", &record.opts.ctl_fd_ack,
+ "Send control command completion ('a') to fd ack descriptor"),
OPT_END()
};

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 01d542007c8b..4088d099f8bd 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -4778,7 +4778,7 @@ int cmd_trace(int argc, const char **argv)
"per thread proc mmap processing timeout in ms"),
OPT_CALLBACK('G', "cgroup", &trace, "name", "monitor event in cgroup name only",
trace__parse_cgroups),
- OPT_UINTEGER('D', "delay", &trace.opts.initial_delay,
+ OPT_INTEGER('D', "delay", &trace.opts.initial_delay,
"ms to wait before starting measurement after program "
"start"),
OPTS_EVSWITCH(&trace.evswitch),
diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
index 5421fd2ad383..138f914f4ea9 100644
--- a/tools/perf/util/record.h
+++ b/tools/perf/util/record.h
@@ -59,7 +59,7 @@ struct record_opts {
const char *auxtrace_snapshot_opts;
const char *auxtrace_sample_opts;
bool sample_transaction;
- unsigned initial_delay;
+ int initial_delay;
bool use_clockid;
clockid_t clockid;
u64 clockid_res_ns;
@@ -67,6 +67,8 @@ struct record_opts {
int affinity;
int mmap_flush;
unsigned int comp_level;
+ int ctl_fd;
+ int ctl_fd_ack;
};

extern const char * const *record_usage;
--
2.24.1


2020-03-27 08:50:53

by Alexey Budankov

[permalink] [raw]
Subject: [PATCH v1 5/8] perf docs: extend stat mode docs with info on --ctl-fd[-ack] options


Extend perf-stat.txt file with --ctl-fd[-ack] options description.
Document possible usage model introduced by --ctl-fd[-ack] options
by providing example bash shell script.

Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/Documentation/perf-stat.txt | 38 ++++++++++++++++++++++++++
1 file changed, 38 insertions(+)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 4d56586b2fb9..a8a279ff9cb2 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -164,6 +164,44 @@ with it. --append may be used here. Examples:
3>results perf stat --log-fd 3 -- $cmd
3>>results perf stat --log-fd 3 --append -- $cmd

+--ctl-fd::
+--ctl-fd-ack::
+
+Listen on ctl-fd descriptor for command to control measurement ('r': resume, 'p': pause).
+Optionally send control command completion ('a') to fd-ack descriptor to synchronize with
+the controlling process. Example of bash shell script to resume and pause measurements:
+#!/bin/bash
+
+ctl_dir=/tmp/
+
+ctl_fifo=${ctl_dir}perf_ctl.fifo
+test -p ${ctl_fifo} && unlink ${ctl_fifo}
+mkfifo ${ctl_fifo}
+exec {ctl_fd}<>${ctl_fifo}
+
+ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
+test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
+mkfifo ${ctl_ack_fifo}
+exec {ctl_fd_ack}<>${ctl_ack_fifo}
+
+perf stat -D -1 -e cpu-cycles -a -I 1000 \
+ --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
+ -- sleep 30 &
+perf_pid=$!
+
+sleep 5 && echo 'r' >&${ctl_fd} && read -u ${ctl_fd_ack} r1 && echo "resumed(${r1})"
+sleep 10 && echo 'p' >&${ctl_fd} && read -u ${ctl_fd_ack} p1 && echo "paused(${p1})"
+
+exec {ctl_fd_ack}>&-
+unlink ${ctl_ack_fifo}
+
+exec {ctl_fd}>&-
+unlink ${ctl_fifo}
+
+wait -n ${perf_pid}
+exit $?
+
+
--pre::
--post::
Pre and post measurement hooks, e.g.:
--
2.24.1

2020-03-27 08:51:08

by Alexey Budankov

[permalink] [raw]
Subject: [PATCH v1 7/8] perf record: implement resume and pause control commands handling


Implement handling of events coming from control file descriptor.

Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-record.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index f99751943b40..ae6f7d08e472 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1418,6 +1418,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
struct evlist *sb_evlist = NULL;
int fd;
float ratio = 0;
+ enum evlist_ctl_cmd cmd = CTL_CMD_UNSUPPORTED;

atexit(record__sig_exit);
signal(SIGCHLD, sig_handler);
@@ -1620,6 +1621,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
perf_evlist__start_workload(rec->evlist);
}

+ perf_evlist__initialize_ctlfd(rec->evlist, opts->ctl_fd, opts->ctl_fd_ack);
+
if (opts->initial_delay) {
pr_info(PERF_EVLIST__PAUSED_MSG);
if (opts->initial_delay > 0) {
@@ -1710,8 +1713,23 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
* Propagate error, only if there's any. Ignore positive
* number of returned events and interrupt error.
*/
- if (err > 0 || (err < 0 && errno == EINTR))
+ if (err > 0 || (err < 0 && errno == EINTR)) {
err = 0;
+ if (perf_evlist__ctlfd_process(rec->evlist, &cmd) > 0) {
+ switch (cmd) {
+ case CTL_CMD_RESUME:
+ pr_info(PERF_EVLIST__RESUMED_MSG);
+ break;
+ case CTL_CMD_PAUSE:
+ pr_info(PERF_EVLIST__PAUSED_MSG);
+ break;
+ case CTL_CMD_ACK:
+ case CTL_CMD_UNSUPPORTED:
+ default:
+ break;
+ }
+ }
+ }
waking++;

if (evlist__filter_pollfd(rec->evlist, POLLERR | POLLHUP) == 0)
@@ -1751,6 +1769,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
record__synthesize_workload(rec, true);

out_child:
+ perf_evlist__finalize_ctlfd(rec->evlist);
record__mmap_read_all(rec, true);
record__aio_mmap_read_sync(rec);

--
2.24.1


2020-03-27 08:53:01

by Alexey Budankov

[permalink] [raw]
Subject: [PATCH v1 8/8] perf docs: extend record mode docs with info on --ctl-fd[-ack] options


Extend perf-record.txt file with --ctl-fd[-ack] options description.
Document possible usage model introduced by --ctl-fd[-ack] options
by providing example bash shell script.

Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/Documentation/perf-record.txt | 37 ++++++++++++++++++++++++
1 file changed, 37 insertions(+)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 7f4db7592467..939e8eb864fe 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -587,6 +587,43 @@ Make a copy of /proc/kcore and place it into a directory with the perf data file
Limit the sample data max size, <size> is expected to be a number with
appended unit character - B/K/M/G

+--ctl-fd::
+--ctl-fd-ack::
+Listen on ctl-fd descriptor for command to control measurement ('r': resume,
+'p': pause). Optionally send control command completion ('a') to fd-ack descriptor
+to synchronize with the controlling process. Example of bash shell script
+to resume and pause measurements:
+#!/bin/bash
+
+ctl_dir=/tmp/
+
+ctl_fifo=${ctl_dir}perf_ctl.fifo
+test -p ${ctl_fifo} && unlink ${ctl_fifo}
+mkfifo ${ctl_fifo}
+exec {ctl_fd}<>${ctl_fifo}
+
+ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
+test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
+mkfifo ${ctl_ack_fifo}
+exec {ctl_fd_ack}<>${ctl_ack_fifo}
+
+perf record -D -1 -e cpu-cycles -a \
+ --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
+ -- sleep 30 &
+perf_pid=$!
+
+sleep 5 && echo 'r' >&${ctl_fd} && read -u ${ctl_fd_ack} r1 && echo "resumed(${r1})"
+sleep 10 && echo 'p' >&${ctl_fd} && read -u ${ctl_fd_ack} p1 && echo "paused(${p1})"
+
+exec {ctl_fd_ack}>&-
+unlink ${ctl_ack_fifo}
+
+exec {ctl_fd}>&-
+unlink ${ctl_fifo}
+
+wait -n ${perf_pid}
+exit $?
+
SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-list[1], linkperf:perf-intel-pt[1]
--
2.24.1

2020-04-01 15:08:22

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCH v1 0/8] perf: support resume and pause commands in stat and record modes

On Fri, Mar 27, 2020 at 11:34:54AM +0300, Alexey Budankov wrote:
>
> The patch set implements handling of 'start paused', 'resume' and 'pause'
> external control commands which can be provided for stat and record modes
> of the tool from an external controlling process. 'start paused' command
> can be used to postpone enabling of events in the beginning of a monitoring
> session. 'resume' and 'pause' commands can be used to enable and disable
> events correspondingly any time after the start of the session.
>
> The 'start paused', resume and 'pause' external control commands can be
> used to focus measurement on specially selected time intervals of workload
> execution. Focused measurement reduces tool intrusion and influence on
> workload behavior, reduces distortion and amount of collected and stored
> data, mitigates data accuracy loss because measurement and data capturing
> happen only during intervals of interest.
>
> A controlling process can be a bash shell script [1], native executable or
> any other language program that can directly work with file descriptors,
> e.g. pipes [2], and spawn a process, specially the tool one.
>
> -D,--delay <val> option is extended with -1 value to skip events enabling
> in the beginning of a monitoring session ('start paused' command). --ctl-fd
> and --ctl-fd-ack command line options are introduced to provide the tool
> with a pair of file descriptors to listen to 'resume' and 'pause' commands
> and reply to an external controlling process on the completion of received
> commands processing.
>
> The tool reads two byte control command message from ctl-fd descriptor,
> handles the command and optionally replies two bytes acknowledgement message
> to fd-ack descriptor, if it is specified on the command line. 'resume' command
> is recognized as 'r' character message and 'pause' command is recognized as
> 'p' character message both received from ctl-fd descriptor. Completion message
> is 'a''\n' and sent to fd-ack descriptor.
>
> Bash script demonstrating simple use case follows:
>
> #!/bin/bash
>
> ctl_dir=/tmp/
>
> ctl_fifo=${ctl_dir}perf_ctl.fifo
> test -p ${ctl_fifo} && unlink ${ctl_fifo}
> mkfifo ${ctl_fifo} && exec {ctl_fd}<>${ctl_fifo}
>
> ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
> test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
> mkfifo ${ctl_ack_fifo} && exec {ctl_fd_ack}<>${ctl_ack_fifo}
>
> perf stat -D -1 -e cpu-cycles -a -I 1000 \
> --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
> -- sleep 40 &

hi,
is fifo the best choice? do you need it for plug perf in somewhere?
what's your use case for this?

fifos seem complicated because you need to create 2 of them, would
unix socket be better maybe? and do we really need that ack fd?

also you could pass just path and perf could create fifos from them

# perf stat --control-fifo /tmp/...

or to get really creazy, we could add option that would make perf
to listen on socket or whatever and we would control it via another
perf command ;-)

# perf stat --control ....
control socket: /tmp/xxx

# perf stat control -s /tmp/xxx disable
# perf stat control -s /tmp/xxx enable

but ATM I can't see too much use for this, so would be great to
know your usecase ;-)

thanks,
jirka

2020-04-01 16:11:15

by Alexey Budankov

[permalink] [raw]
Subject: Re: [PATCH v1 0/8] perf: support resume and pause commands in stat and record modes

Hi Jiri,

On 01.04.2020 17:01, Jiri Olsa wrote:
> On Fri, Mar 27, 2020 at 11:34:54AM +0300, Alexey Budankov wrote:
>>
>> The patch set implements handling of 'start paused', 'resume' and 'pause'
>> external control commands which can be provided for stat and record modes
>> of the tool from an external controlling process. 'start paused' command
>> can be used to postpone enabling of events in the beginning of a monitoring
>> session. 'resume' and 'pause' commands can be used to enable and disable
>> events correspondingly any time after the start of the session.
>>
>> The 'start paused', resume and 'pause' external control commands can be
>> used to focus measurement on specially selected time intervals of workload
>> execution. Focused measurement reduces tool intrusion and influence on
>> workload behavior, reduces distortion and amount of collected and stored
>> data, mitigates data accuracy loss because measurement and data capturing
>> happen only during intervals of interest.
>>
>> A controlling process can be a bash shell script [1], native executable or
>> any other language program that can directly work with file descriptors,
>> e.g. pipes [2], and spawn a process, specially the tool one.
>>
>> -D,--delay <val> option is extended with -1 value to skip events enabling
>> in the beginning of a monitoring session ('start paused' command). --ctl-fd
>> and --ctl-fd-ack command line options are introduced to provide the tool
>> with a pair of file descriptors to listen to 'resume' and 'pause' commands
>> and reply to an external controlling process on the completion of received
>> commands processing.
>>
>> The tool reads two byte control command message from ctl-fd descriptor,
>> handles the command and optionally replies two bytes acknowledgement message
>> to fd-ack descriptor, if it is specified on the command line. 'resume' command
>> is recognized as 'r' character message and 'pause' command is recognized as
>> 'p' character message both received from ctl-fd descriptor. Completion message
>> is 'a''\n' and sent to fd-ack descriptor.
>>
>> Bash script demonstrating simple use case follows:
>>
>> #!/bin/bash
>>
>> ctl_dir=/tmp/
>>
>> ctl_fifo=${ctl_dir}perf_ctl.fifo
>> test -p ${ctl_fifo} && unlink ${ctl_fifo}
>> mkfifo ${ctl_fifo} && exec {ctl_fd}<>${ctl_fifo}
>>
>> ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
>> test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
>> mkfifo ${ctl_ack_fifo} && exec {ctl_fd_ack}<>${ctl_ack_fifo}
>>
>> perf stat -D -1 -e cpu-cycles -a -I 1000 \
>> --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
>> -- sleep 40 &
>
> hi,
> is fifo the best choice? do you need it for plug perf in somewhere?
> what's your use case for this?

fifo is just an example to demonstrate the simplest usage model
and evaluate the feature using basic shell environment.

Our use case is to fork/exec perf binary from a controlling c++ process
and the process creates, duplicates and passes open fds' numbers into
the forked perf process using --ctl-fd, --ctl-fd-ack command line arguments.

>
> fifos seem complicated because you need to create 2 of them, would

The patch set allows omitting ack fd on the command line and in this case
perf tool will just receive and process commands from ctl fd without confirmation.
So it will also work without specifying --ctl-fd-ack option like this:

#!/bin/bash

ctl_dir=/tmp/

ctl_fifo=${ctl_dir}perf_ctl.fifo
test -p ${ctl_fifo} && unlink ${ctl_fifo}
mkfifo ${ctl_fifo} && exec {ctl_fd}<>${ctl_fifo}

perf stat -D -1 -e cpu-cycles -a -I 1000 --ctl-fd ${ctl_fd} -- sleep 40 &
perf_pid=$!

sleep 5 && echo 'r' >&${ctl_fd} && read -u ${ctl_fd_ack} r1 && echo "resumed(${r1})"
sleep 10 && echo 'p' >&${ctl_fd} && read -u ${ctl_fd_ack} p1 && echo "paused(${p1})"

exec {ctl_fd}>&- && unlink ${ctl_fifo}

wait -n ${perf_pid}
exit $?

> unix socket be better maybe? and do we really need that ack fd?

Tool reads from already open fd whose number is provided via --ctl-fd option.
Controlling process can create, open or pass objects of various types
(anon pipe, fifo, unix or TCP socket, something else) as fds in the forked
perf process so unix sockets are likely already supported too.

For our use case ack fd is really required to make sure counters are really
paused, resumed and synchronize in the controlling process on changed counters
state otherwise there will be races in the process's code.

>
> also you could pass just path and perf could create fifos from them
>
> # perf stat --control-fifo /tmp/...
>
> or to get really creazy, we could add option that would make perf
> to listen on socket or whatever and we would control it via another
> perf command ;-)
>
> # perf stat --control ....
> control socket: /tmp/xxx
>
> # perf stat control -s /tmp/xxx disable
> # perf stat control -s /tmp/xxx enable
>
> but ATM I can't see too much use for this, so would be great to
> know your usecase ;-)

Mentioned use cases and design approaches do make sense and have been
considered prior the patch set implementation. These use cases can be
supported on demand on top of the changes provided by the patch set.
Extending stat and record modes with a pair of open fd numbers and its
processing is currently enough for our use cases.

Thanks,
Alexey

>
> thanks,
> jirka
>

2020-04-02 14:21:44

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCH v1 2/8] perf evlist: implement control command handling functions

On Fri, Mar 27, 2020 at 11:46:43AM +0300, Alexey Budankov wrote:

SNIP

> +
> +int perf_evlist__ctlfd_process(struct evlist *evlist, enum evlist_ctl_cmd *cmd)
> +{
> + int err = 0;
> + int ctlfd_pos = evlist->ctl_fd_pos;
> + struct pollfd *entries = evlist->core.pollfd.entries;
> +
> + if (!entries[ctlfd_pos].revents)
> + return 0;
> +
> + if (entries[ctlfd_pos].revents & POLLIN) {
> + err = perf_evlist__ctlfd_recv(evlist, cmd);
> + if (err > 0) {
> + switch (*cmd) {
> + case CTL_CMD_RESUME:
> + evlist__enable(evlist);
> + break;
> + case CTL_CMD_PAUSE:
> + evlist__disable(evlist);
> + break;

would CTL_CMD_ENABLE, CTL_CMD_DISABLE be better fit in here?

especialy because we have the 'pause' ioctl for sampling,
which I was thinking initialy you are using for record,

and it's still might be better fit for sampling than disable, no?

jirka

> + case CTL_CMD_ACK:
> + case CTL_CMD_UNSUPPORTED:
> + default:
> + pr_debug("ctlfd: unsupported %d\n", *cmd);
> + break;
> + }
> + if (!(*cmd == CTL_CMD_ACK || *cmd == CTL_CMD_UNSUPPORTED))
> + perf_evlist__ctlfd_ack(evlist);
> + }
> + }
> +

SNIP

2020-04-02 14:21:50

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCH v1 4/8] perf stat: implement resume and pause control commands handling

On Fri, Mar 27, 2020 at 11:48:31AM +0300, Alexey Budankov wrote:

SNIP

>
> - if (interval) {
> - ts.tv_sec = interval / USEC_PER_MSEC;
> - ts.tv_nsec = (interval % USEC_PER_MSEC) * NSEC_PER_MSEC;
> - } else if (timeout) {
> - ts.tv_sec = timeout / USEC_PER_MSEC;
> - ts.tv_nsec = (timeout % USEC_PER_MSEC) * NSEC_PER_MSEC;
> - } else {
> - ts.tv_sec = 1;
> - ts.tv_nsec = 0;
> - }
> -
> if (forks) {
> if (perf_evlist__prepare_workload(evsel_list, &target, argv, is_pipe,
> workload_exec_failed_signal) < 0) {
> @@ -688,18 +745,10 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
> perf_evlist__start_workload(evsel_list);
> enable_counters();
>
> - if (interval || timeout) {
> - while (!waitpid(child_pid, &status, WNOHANG)) {
> - nanosleep(&ts, NULL);
> - if (timeout)
> - break;
> - process_interval();
> - if (interval_count && !(--times))
> - break;
> - }
> - }
> - if (child_pid != -1)
> + if (stat_config.ctl_fd == -1 && !interval && !timeout)
> wait4(child_pid, &status, 0, &stat_config.ru_data);
> + else
> + status = handle_events(child_pid, &stat_config);
>
> if (workload_exec_errno) {
> const char *emsg = str_error_r(workload_exec_errno, msg, sizeof(msg));
> @@ -711,18 +760,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
> psignal(WTERMSIG(status), argv[0]);
> } else {
> enable_counters();
> - while (!done) {
> - nanosleep(&ts, NULL);
> - if (!is_target_alive(&target, evsel_list->core.threads))
> - break;
> - if (timeout)
> - break;
> - if (interval) {
> - process_interval();
> - if (interval_count && !(--times))
> - break;
> - }
> - }

could you please separate the change to have:
- factor out the above too loops to the new function
- adding control fds handling to that function

so it's more obvious we don't break anything

thanks,
jirka

> + handle_events(-1, &stat_config);
> }
>
> disable_counters();
> --
> 2.24.1
>
>

2020-04-02 14:23:20

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCH v1 3/8] perf stat: introduce control descriptors and --ctl-fd[-ack] options

On Fri, Mar 27, 2020 at 11:47:40AM +0300, Alexey Budankov wrote:
>
> Introduce control file descriptors and --ctl-fd[-ack] options to pass
> control descriptors from command line. Extend --delay option with -1
> value to start collection in paused mode to be resumed later by resume
> command provided via control file descriptor.

could you please separate those 2 changes? also for record change

thanks,
jirka

>
> Signed-off-by: Alexey Budankov <[email protected]>
> ---
> tools/perf/builtin-stat.c | 28 ++++++++++++++++++++++++----
> tools/perf/util/evlist.h | 3 +++
> tools/perf/util/stat.h | 4 +++-
> 3 files changed, 30 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index ec053dc1e35c..0a1b79fd3f48 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -187,6 +187,8 @@ static struct perf_stat_config stat_config = {
> .metric_only_len = METRIC_ONLY_LEN,
> .walltime_nsecs_stats = &walltime_nsecs_stats,
> .big_num = true,
> + .ctl_fd = -1,
> + .ctl_fd_ack = -1
> };
>
> static inline void diff_timespec(struct timespec *r, struct timespec *a,
> @@ -373,16 +375,26 @@ static void process_interval(void)
>
> static void enable_counters(void)
> {
> - if (stat_config.initial_delay)
> + if (stat_config.initial_delay < 0) {
> + pr_info(PERF_EVLIST__PAUSED_MSG);
> + return;
> + }
> +
> + if (stat_config.initial_delay > 0) {
> + pr_info(PERF_EVLIST__PAUSED_MSG);
> usleep(stat_config.initial_delay * USEC_PER_MSEC);
> + }
>
> /*
> * We need to enable counters only if:
> * - we don't have tracee (attaching to task or cpu)
> * - we have initial delay configured
> */
> - if (!target__none(&target) || stat_config.initial_delay)
> + if (!target__none(&target) || stat_config.initial_delay) {
> evlist__enable(evsel_list);
> + if (stat_config.initial_delay > 0)
> + pr_info(PERF_EVLIST__RESUMED_MSG);
> + }
> }
>
> static void disable_counters(void)
> @@ -912,8 +924,8 @@ static struct option stat_options[] = {
> "aggregate counts per thread", AGGR_THREAD),
> OPT_SET_UINT(0, "per-node", &stat_config.aggr_mode,
> "aggregate counts per numa node", AGGR_NODE),
> - OPT_UINTEGER('D', "delay", &stat_config.initial_delay,
> - "ms to wait before starting measurement after program start"),
> + OPT_INTEGER('D', "delay", &stat_config.initial_delay,
> + "ms to wait before starting measurement after program start (-1: start paused"),
> OPT_CALLBACK_NOOPT(0, "metric-only", &stat_config.metric_only, NULL,
> "Only print computed metrics. No raw values", enable_metric_only),
> OPT_BOOLEAN(0, "topdown", &topdown_run,
> @@ -933,6 +945,10 @@ static struct option stat_options[] = {
> "Use with 'percore' event qualifier to show the event "
> "counts of one hardware thread by sum up total hardware "
> "threads of same physical core"),
> + OPT_INTEGER(0, "ctl-fd", &stat_config.ctl_fd,
> + "Listen on fd descriptor for command to control measurement ('r': resume, 'p': pause)"),
> + OPT_INTEGER(0, "ctl-fd-ack", &stat_config.ctl_fd_ack,
> + "Send control command completion ('a') to fd ack descriptor"),
> OPT_END()
> };
>
> @@ -2129,6 +2145,8 @@ int cmd_stat(int argc, const char **argv)
> signal(SIGALRM, skip_signal);
> signal(SIGABRT, skip_signal);
>
> + perf_evlist__initialize_ctlfd(evsel_list, stat_config.ctl_fd, stat_config.ctl_fd_ack);
> +
> status = 0;
> for (run_idx = 0; forever || run_idx < stat_config.run_count; run_idx++) {
> if (stat_config.run_count != 1 && verbose > 0)
> @@ -2148,6 +2166,8 @@ int cmd_stat(int argc, const char **argv)
> if (!forever && status != -1 && !interval)
> print_counters(NULL, argc, argv);
>
> + perf_evlist__finalize_ctlfd(evsel_list);
> +
> if (STAT_RECORD) {
> /*
> * We synthesize the kernel mmap record just so that older tools
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index a94b2993fafc..c91368483074 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -369,6 +369,9 @@ enum evlist_ctl_cmd {
> CTL_CMD_ACK = 'a'
> };
>
> +#define PERF_EVLIST__RESUMED_MSG "Monitoring resumed\n"
> +#define PERF_EVLIST__PAUSED_MSG "Monitoring paused\n"
> +
> int perf_evlist__initialize_ctlfd(struct evlist *evlist, int ctl_fd, int ctl_fd_ack);
> int perf_evlist__finalize_ctlfd(struct evlist *evlist);
> int perf_evlist__ctlfd_process(struct evlist *evlist, enum evlist_ctl_cmd *cmd);
> diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
> index b4fdfaa7f2c0..0b0fa3a2cde2 100644
> --- a/tools/perf/util/stat.h
> +++ b/tools/perf/util/stat.h
> @@ -113,7 +113,7 @@ struct perf_stat_config {
> FILE *output;
> unsigned int interval;
> unsigned int timeout;
> - unsigned int initial_delay;
> + int initial_delay;
> unsigned int unit_width;
> unsigned int metric_only_len;
> int times;
> @@ -130,6 +130,8 @@ struct perf_stat_config {
> struct perf_cpu_map *cpus_aggr_map;
> u64 *walltime_run;
> struct rblist metric_events;
> + int ctl_fd;
> + int ctl_fd_ack;
> };
>
> void update_stats(struct stats *stats, u64 val);
> --
> 2.24.1
>
>

2020-04-02 15:07:20

by Alexey Budankov

[permalink] [raw]
Subject: Re: [PATCH v1 3/8] perf stat: introduce control descriptors and --ctl-fd[-ack] options


On 02.04.2020 17:17, Jiri Olsa wrote:
> On Fri, Mar 27, 2020 at 11:47:40AM +0300, Alexey Budankov wrote:
>>
>> Introduce control file descriptors and --ctl-fd[-ack] options to pass
>> control descriptors from command line. Extend --delay option with -1
>> value to start collection in paused mode to be resumed later by resume
>> command provided via control file descriptor.
>
> could you please separate those 2 changes? also for record change

Accepted, as for stat as for record mode.

~Alexey

2020-04-02 15:08:58

by Alexey Budankov

[permalink] [raw]
Subject: Re: [PATCH v1 4/8] perf stat: implement resume and pause control commands handling


On 02.04.2020 17:17, Jiri Olsa wrote:
> On Fri, Mar 27, 2020 at 11:48:31AM +0300, Alexey Budankov wrote:
>
> SNIP
>
>>
>> - if (interval) {
>> - ts.tv_sec = interval / USEC_PER_MSEC;
>> - ts.tv_nsec = (interval % USEC_PER_MSEC) * NSEC_PER_MSEC;
>> - } else if (timeout) {
>> - ts.tv_sec = timeout / USEC_PER_MSEC;
>> - ts.tv_nsec = (timeout % USEC_PER_MSEC) * NSEC_PER_MSEC;
>> - } else {
>> - ts.tv_sec = 1;
>> - ts.tv_nsec = 0;
>> - }
>> -
>> if (forks) {
>> if (perf_evlist__prepare_workload(evsel_list, &target, argv, is_pipe,
>> workload_exec_failed_signal) < 0) {
>> @@ -688,18 +745,10 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>> perf_evlist__start_workload(evsel_list);
>> enable_counters();
>>
>> - if (interval || timeout) {
>> - while (!waitpid(child_pid, &status, WNOHANG)) {
>> - nanosleep(&ts, NULL);
>> - if (timeout)
>> - break;
>> - process_interval();
>> - if (interval_count && !(--times))
>> - break;
>> - }
>> - }
>> - if (child_pid != -1)
>> + if (stat_config.ctl_fd == -1 && !interval && !timeout)
>> wait4(child_pid, &status, 0, &stat_config.ru_data);
>> + else
>> + status = handle_events(child_pid, &stat_config);
>>
>> if (workload_exec_errno) {
>> const char *emsg = str_error_r(workload_exec_errno, msg, sizeof(msg));
>> @@ -711,18 +760,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>> psignal(WTERMSIG(status), argv[0]);
>> } else {
>> enable_counters();
>> - while (!done) {
>> - nanosleep(&ts, NULL);
>> - if (!is_target_alive(&target, evsel_list->core.threads))
>> - break;
>> - if (timeout)
>> - break;
>> - if (interval) {
>> - process_interval();
>> - if (interval_count && !(--times))
>> - break;
>> - }
>> - }
>
> could you please separate the change to have:
> - factor out the above too loops to the new function
> - adding control fds handling to that function
>
> so it's more obvious we don't break anything

Accepted.

~Alexey

2020-04-02 15:22:02

by Alexey Budankov

[permalink] [raw]
Subject: Re: [PATCH v1 2/8] perf evlist: implement control command handling functions


On 02.04.2020 17:17, Jiri Olsa wrote:
> On Fri, Mar 27, 2020 at 11:46:43AM +0300, Alexey Budankov wrote:
>
> SNIP
>
>> +
>> +int perf_evlist__ctlfd_process(struct evlist *evlist, enum evlist_ctl_cmd *cmd)
>> +{
>> + int err = 0;
>> + int ctlfd_pos = evlist->ctl_fd_pos;
>> + struct pollfd *entries = evlist->core.pollfd.entries;
>> +
>> + if (!entries[ctlfd_pos].revents)
>> + return 0;
>> +
>> + if (entries[ctlfd_pos].revents & POLLIN) {
>> + err = perf_evlist__ctlfd_recv(evlist, cmd);
>> + if (err > 0) {
>> + switch (*cmd) {
>> + case CTL_CMD_RESUME:
>> + evlist__enable(evlist);
>> + break;
>> + case CTL_CMD_PAUSE:
>> + evlist__disable(evlist);
>> + break;
>
> would CTL_CMD_ENABLE, CTL_CMD_DISABLE be better fit in here?

Makes sense. Let's have it named like the corresponding ioctls, for clarity.

>
> especialy because we have the 'pause' ioctl for sampling,
> which I was thinking initialy you are using for record,
>
> and it's still might be better fit for sampling than disable, no?

PAUSE_OUTPUT ioctl doesn't stop NMIs but it should be avoided
in order not to affect workload execution during paused intervals.

PERF_EVENT_IOC_PAUSE_OUTPUT (since Linux 4.7)
This allows pausing and resuming the event's ring-buffer. A
paused ring-buffer does not prevent generation of samples, but
simply discards them. The discarded samples are considered
lost, and cause a PERF_RECORD_LOST sample to be generated when
possible. An overflow signal may still be triggered by the
discarded sample even though the ring-buffer remains empty.

The argument is an unsigned 32-bit integer. A nonzero value
pauses the ring-buffer, while a zero value resumes the ring-
buffer.

~Alexey