Changes in v6:
- split re-factoring of events handling loops for stat mode
into smaller incremental parts
- added parts missing at v5
- corrected v5 runtime issues
v5: https://lore.kernel.org/lkml/[email protected]/
Changes in v5:
- split re-factoring of events handling loops for stat mode
into smaller incremental parts
v4: https://lore.kernel.org/lkml/[email protected]/
Changes in v4:
- made checking of ctlfd state unconditional in record trace streaming loop
- introduced static poll fds to keep evlist__filter_pollfd() unaffected
- handled ret code of evlist__initialize_ctlfd() where need
- renamed and structured handle_events() function
- applied anonymous structs where needed
v3: https://lore.kernel.org/lkml/[email protected]/
Changes in v3:
- renamed functions and types from perf_evlist_ to evlist_ to avoid
clash with libperf code;
- extended commands to be strings of variable length consisting of
command name and also possibly including command specific data;
- merged docs update with the code changes;
- updated docs for -D,--delay=-1 option for stat and record modes;
v2: https://lore.kernel.org/lkml/[email protected]/
Changes in v2:
- renamed resume and pause commands to enable and disable ones, renamed
CTL_CMD_RESUME and CTL_CMD_PAUSE to CTL_CMD_ENABLE and CTL_CMD_DISABLE
to fit to the appropriate ioctls and avoid mixing up with PAUSE_OUTPUT
ioctl;
- factored out event handling loop into a handle_events() for stat mode;
- separated -D,--delay=-1 into separate patches for stat and record modes;
v1: https://lore.kernel.org/lkml/[email protected]/
repo: tip of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core
The patch set implements handling of 'start disabled', 'enable' and 'disable'
external control commands which can be provided for stat and record modes
of the tool from an external controlling process. 'start disabled' command
can be used to postpone enabling of events in the beginning of a monitoring
session. 'enable' and 'disable' commands can be used to enable and disable
events correspondingly any time after the start of the session.
The 'start disabled', 'enable' and 'disable' external control commands can be
used to focus measurement on specially selected time intervals of workload
execution. Focused measurement reduces tool intrusion and influence on
workload behavior, reduces distortion and amount of collected and stored
data, mitigates data accuracy loss because measurement and data capturing
happen only during intervals of interest.
A controlling process can be a bash shell script [1], native executable or
any other language program that can directly work with file descriptors,
e.g. pipes [2], and spawn a process, specially the tool one.
-D,--delay <val> option is extended with -1 value to skip events enabling
in the beginning of a monitoring session ('start disabled' command).
--ctl-fd and --ctl-fd-ack command line options are introduced to provide the
tool with a pair of file descriptors to listen to control commands and reply
to the controlling process on the completion of received commands.
The tool reads control command message from ctl-fd descriptor, handles the
command and optionally replies acknowledgement message to fd-ack descriptor,
if it is specified on the command line. 'enable' command is recognized as
'enable' string message and 'disable' command is recognized as 'disable'
string message both received from ctl-fd descriptor. Completion message is
'ack\n' and sent to fd-ack descriptor.
Example bash script demonstrating simple use case follows:
#!/bin/bash
ctl_dir=/tmp/
ctl_fifo=${ctl_dir}perf_ctl.fifo
test -p ${ctl_fifo} && unlink ${ctl_fifo}
mkfifo ${ctl_fifo} && exec {ctl_fd}<>${ctl_fifo}
ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
mkfifo ${ctl_ack_fifo} && exec {ctl_fd_ack}<>${ctl_ack_fifo}
perf stat -D -1 -e cpu-cycles -a -I 1000 \
--ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
-- sleep 40 &
perf_pid=$!
sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e2 && echo "enabled(${e2})"
sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d2 && echo "disabled(${d2})"
exec {ctl_fd_ack}>&- && unlink ${ctl_ack_fifo}
exec {ctl_fd}>&- && unlink ${ctl_fifo}
wait -n ${perf_pid}
exit $?
Script output:
[root@host dir] example
Events disabled
# time counts unit events
1.001101062 <not counted> cpu-cycles
2.002994944 <not counted> cpu-cycles
3.004864340 <not counted> cpu-cycles
4.006727177 <not counted> cpu-cycles
Events enabled
enabled(ack)
4.993808464 3,124,246 cpu-cycles
5.008597004 3,325,624 cpu-cycles
6.010387483 83,472,992 cpu-cycles
7.012266598 55,877,621 cpu-cycles
8.014175695 97,892,729 cpu-cycles
9.016056093 68,461,242 cpu-cycles
10.017937507 55,449,643 cpu-cycles
11.019830154 68,938,167 cpu-cycles
12.021719952 55,164,101 cpu-cycles
13.023627550 70,535,720 cpu-cycles
14.025580995 53,240,125 cpu-cycles
disabled(ack)
14.997518260 53,558,068 cpu-cycles
Events disabled
15.027216416 <not counted> cpu-cycles
16.029052729 <not counted> cpu-cycles
17.030904762 <not counted> cpu-cycles
18.032073424 <not counted> cpu-cycles
19.033805074 <not counted> cpu-cycles
Events enabled
enabled(ack)
20.001279097 3,021,022 cpu-cycles
20.035044381 6,434,367 cpu-cycles
21.036923813 89,358,251 cpu-cycles
22.038825169 72,516,351 cpu-cycles
# time counts unit events
23.040715596 55,046,157 cpu-cycles
24.042643757 78,128,649 cpu-cycles
25.044558535 61,052,428 cpu-cycles
26.046452785 62,142,806 cpu-cycles
27.048353021 74,477,971 cpu-cycles
28.050241286 61,001,623 cpu-cycles
29.052149961 61,653,502 cpu-cycles
disabled(ack)
30.004980264 82,729,640 cpu-cycles
Events disabled
30.053516176 <not counted> cpu-cycles
31.055348366 <not counted> cpu-cycles
32.057202097 <not counted> cpu-cycles
33.059040702 <not counted> cpu-cycles
34.060843288 <not counted> cpu-cycles
35.000888624 <not counted> cpu-cycles
[root@host dir]#
[1] http://man7.org/linux/man-pages/man1/bash.1.html
[2] http://man7.org/linux/man-pages/man2/pipe.2.html
---
Alexey Budankov (13):
tools/libperf: introduce notion of static polled file descriptors
perf evlist: introduce control file descriptors
perf evlist: implement control command handling functions
perf stat: factor out body of event handling loop for system wide
perf stat: move target check to loop control statement
perf stat: factor out body of event handling loop for fork case
perf stat: factor out event handling loop into dispatch_events()
perf stat: extend -D,--delay option with -1 value
perf stat: implement control commands handling
perf stat: introduce --ctl-fd[-ack] options
perf record: extend -D,--delay option with -1 value
perf record: implement control commands handling
perf record: introduce --ctl-fd[-ack] options
tools/lib/api/fd/array.c | 42 ++++++-
tools/lib/api/fd/array.h | 7 ++
tools/lib/perf/evlist.c | 11 ++
tools/lib/perf/include/internal/evlist.h | 2 +
tools/perf/Documentation/perf-record.txt | 5 +-
tools/perf/Documentation/perf-stat.txt | 45 ++++++-
tools/perf/builtin-record.c | 38 +++++-
tools/perf/builtin-stat.c | 149 +++++++++++++++++------
tools/perf/builtin-trace.c | 2 +-
tools/perf/util/evlist.c | 131 ++++++++++++++++++++
tools/perf/util/evlist.h | 25 ++++
tools/perf/util/record.h | 4 +-
tools/perf/util/stat.h | 4 +-
13 files changed, 415 insertions(+), 50 deletions(-)
--
2.24.1
Define and initialize control file descriptors.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/util/evlist.c | 3 +++
tools/perf/util/evlist.h | 5 +++++
2 files changed, 8 insertions(+)
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 173b4f0e0e6e..47541b5cab46 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -63,6 +63,9 @@ void evlist__init(struct evlist *evlist, struct perf_cpu_map *cpus,
perf_evlist__set_maps(&evlist->core, cpus, threads);
evlist->workload.pid = -1;
evlist->bkw_mmap_state = BKW_MMAP_NOTREADY;
+ evlist->ctl_fd.fd = -1;
+ evlist->ctl_fd.ack = -1;
+ evlist->ctl_fd.pos = -1;
}
struct evlist *evlist__new(void)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index b6f325dfb4d2..0d8b361f1c8e 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -74,6 +74,11 @@ struct evlist {
pthread_t th;
volatile int done;
} thread;
+ struct {
+ int fd;
+ int ack;
+ int pos;
+ } ctl_fd;
};
struct evsel_str_handler {
--
2.24.1
Implement functions of initialization, finalization and processing
of control command messages coming from control file descriptors.
Allocate control file descriptor as a static descriptor in struct
fdarray of struct perf_evlist object by perf_evlist__add_pollfd_stat().
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/util/evlist.c | 128 +++++++++++++++++++++++++++++++++++++++
tools/perf/util/evlist.h | 17 ++++++
2 files changed, 145 insertions(+)
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 47541b5cab46..fbd98f741af9 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1718,3 +1718,131 @@ struct evsel *perf_evlist__reset_weak_group(struct evlist *evsel_list,
}
return leader;
}
+
+int evlist__initialize_ctlfd(struct evlist *evlist, int fd, int ack)
+{
+ if (fd == -1) {
+ pr_debug("Control descriptor is not initialized\n");
+ return 0;
+ }
+
+ evlist->ctl_fd.pos = perf_evlist__add_pollfd_stat(&evlist->core, fd, POLLIN);
+ if (evlist->ctl_fd.pos < 0) {
+ evlist->ctl_fd.pos = -1;
+ pr_err("Failed to add ctl fd entry: %m\n");
+ return -1;
+ }
+
+ evlist->ctl_fd.fd = fd;
+ evlist->ctl_fd.ack = ack;
+
+ return 0;
+}
+
+int evlist__finalize_ctlfd(struct evlist *evlist)
+{
+ if (evlist->ctl_fd.pos == -1)
+ return 0;
+
+ evlist->core.pollfd.stat_entries[evlist->ctl_fd.pos].fd = -1;
+ evlist->ctl_fd.pos = -1;
+ evlist->ctl_fd.ack = -1;
+ evlist->ctl_fd.fd = -1;
+
+ return 0;
+}
+
+static int evlist__ctlfd_recv(struct evlist *evlist, enum evlist_ctl_cmd *cmd,
+ char *cmd_data, size_t data_size)
+{
+ int err;
+ char c;
+ size_t bytes_read = 0;
+
+ memset(cmd_data, 0, data_size--);
+
+ do {
+ err = read(evlist->ctl_fd.fd, &c, 1);
+ if (err > 0) {
+ if (c == '\n' || c == '\0')
+ break;
+ cmd_data[bytes_read++] = c;
+ if (bytes_read == data_size)
+ break;
+ } else {
+ if (err == -1)
+ pr_err("Failed to read from ctlfd %d: %m\n", evlist->ctl_fd.fd);
+ break;
+ }
+ } while (1);
+
+ pr_debug("Message from ctl_fd: \"%s%s\"\n", cmd_data,
+ bytes_read == data_size ? "" : c == '\n' ? "\\n" : "\\0");
+
+ if (err > 0) {
+ if (!strncmp(cmd_data, EVLIST_CTL_CMD_ENABLE_TAG,
+ strlen(EVLIST_CTL_CMD_ENABLE_TAG))) {
+ *cmd = EVLIST_CTL_CMD_ENABLE;
+ } else if (!strncmp(cmd_data, EVLIST_CTL_CMD_DISABLE_TAG,
+ strlen(EVLIST_CTL_CMD_DISABLE_TAG))) {
+ *cmd = EVLIST_CTL_CMD_DISABLE;
+ }
+ }
+
+ return err;
+}
+
+static int evlist__ctlfd_ack(struct evlist *evlist)
+{
+ int err;
+
+ if (evlist->ctl_fd.ack == -1)
+ return 0;
+
+ err = write(evlist->ctl_fd.ack, EVLIST_CTL_CMD_ACK_TAG,
+ sizeof(EVLIST_CTL_CMD_ACK_TAG));
+ if (err == -1)
+ pr_err("failed to write to ctl_ack_fd %d: %m\n", evlist->ctl_fd.ack);
+
+ return err;
+}
+
+int evlist__ctlfd_process(struct evlist *evlist, enum evlist_ctl_cmd *cmd)
+{
+ int err = 0;
+ char cmd_data[EVLIST_CTL_CMD_MAX_LEN];
+ int ctlfd_pos = evlist->ctl_fd.pos;
+ struct pollfd *stat_entries = evlist->core.pollfd.stat_entries;
+
+ if (ctlfd_pos == -1 || !stat_entries[ctlfd_pos].revents)
+ return 0;
+
+ if (stat_entries[ctlfd_pos].revents & POLLIN) {
+ err = evlist__ctlfd_recv(evlist, cmd, cmd_data,
+ EVLIST_CTL_CMD_MAX_LEN);
+ if (err > 0) {
+ switch (*cmd) {
+ case EVLIST_CTL_CMD_ENABLE:
+ evlist__enable(evlist);
+ break;
+ case EVLIST_CTL_CMD_DISABLE:
+ evlist__disable(evlist);
+ break;
+ case EVLIST_CTL_CMD_ACK:
+ case EVLIST_CTL_CMD_UNSUPPORTED:
+ default:
+ pr_debug("ctlfd: unsupported %d\n", *cmd);
+ break;
+ }
+ if (!(*cmd == EVLIST_CTL_CMD_ACK || *cmd == EVLIST_CTL_CMD_UNSUPPORTED))
+ evlist__ctlfd_ack(evlist);
+ }
+ }
+
+ if (stat_entries[ctlfd_pos].revents & (POLLHUP | POLLERR))
+ evlist__finalize_ctlfd(evlist);
+ else
+ stat_entries[ctlfd_pos].revents = 0;
+
+ return err;
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 0d8b361f1c8e..bccf0a970371 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -360,4 +360,21 @@ void perf_evlist__force_leader(struct evlist *evlist);
struct evsel *perf_evlist__reset_weak_group(struct evlist *evlist,
struct evsel *evsel,
bool close);
+#define EVLIST_CTL_CMD_ENABLE_TAG "enable"
+#define EVLIST_CTL_CMD_DISABLE_TAG "disable"
+#define EVLIST_CTL_CMD_ACK_TAG "ack\n"
+
+#define EVLIST_CTL_CMD_MAX_LEN 64
+
+enum evlist_ctl_cmd {
+ EVLIST_CTL_CMD_UNSUPPORTED = 0,
+ EVLIST_CTL_CMD_ENABLE,
+ EVLIST_CTL_CMD_DISABLE,
+ EVLIST_CTL_CMD_ACK
+};
+
+int evlist__initialize_ctlfd(struct evlist *evlist, int ctl_fd, int ctl_fd_ack);
+int evlist__finalize_ctlfd(struct evlist *evlist);
+int evlist__ctlfd_process(struct evlist *evlist, enum evlist_ctl_cmd *cmd);
+
#endif /* __PERF_EVLIST_H */
--
2.24.1
Implement adding of file descriptors by fdarray__add_stat() to
fix-sized (currently 1) stat_entries array located at struct fdarray.
Append added file descriptors to the array used by poll() syscall
during fdarray__poll() call. Copy poll() result of the added
descriptors from the array back to the storage for separate analysis.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/lib/api/fd/array.c | 42 +++++++++++++++++++++++-
tools/lib/api/fd/array.h | 7 ++++
tools/lib/perf/evlist.c | 11 +++++++
tools/lib/perf/include/internal/evlist.h | 2 ++
4 files changed, 61 insertions(+), 1 deletion(-)
diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
index 58d44d5eee31..b0027f2169c7 100644
--- a/tools/lib/api/fd/array.c
+++ b/tools/lib/api/fd/array.c
@@ -11,10 +11,16 @@
void fdarray__init(struct fdarray *fda, int nr_autogrow)
{
+ int i;
+
fda->entries = NULL;
fda->priv = NULL;
fda->nr = fda->nr_alloc = 0;
fda->nr_autogrow = nr_autogrow;
+
+ fda->nr_stat = 0;
+ for (i = 0; i < FDARRAY__STAT_ENTRIES_MAX; i++)
+ fda->stat_entries[i].fd = -1;
}
int fdarray__grow(struct fdarray *fda, int nr)
@@ -83,6 +89,20 @@ int fdarray__add(struct fdarray *fda, int fd, short revents)
return pos;
}
+int fdarray__add_stat(struct fdarray *fda, int fd, short revents)
+{
+ int pos = fda->nr_stat;
+
+ if (pos >= FDARRAY__STAT_ENTRIES_MAX)
+ return -1;
+
+ fda->stat_entries[pos].fd = fd;
+ fda->stat_entries[pos].events = revents;
+ fda->nr_stat++;
+
+ return pos;
+}
+
int fdarray__filter(struct fdarray *fda, short revents,
void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
void *arg)
@@ -113,7 +133,27 @@ int fdarray__filter(struct fdarray *fda, short revents,
int fdarray__poll(struct fdarray *fda, int timeout)
{
- return poll(fda->entries, fda->nr, timeout);
+ int nr, i, pos, res;
+
+ nr = fda->nr;
+
+ for (i = 0; i < fda->nr_stat; i++) {
+ if (fda->stat_entries[i].fd != -1) {
+ pos = fdarray__add(fda, fda->stat_entries[i].fd,
+ fda->stat_entries[i].events);
+ if (pos >= 0)
+ fda->priv[pos].idx = i;
+ }
+ }
+
+ res = poll(fda->entries, fda->nr, timeout);
+
+ for (i = nr; i < fda->nr; i++)
+ fda->stat_entries[fda->priv[i].idx] = fda->entries[i];
+
+ fda->nr = nr;
+
+ return res;
}
int fdarray__fprintf(struct fdarray *fda, FILE *fp)
diff --git a/tools/lib/api/fd/array.h b/tools/lib/api/fd/array.h
index b39557d1a88f..9bca72e80b09 100644
--- a/tools/lib/api/fd/array.h
+++ b/tools/lib/api/fd/array.h
@@ -3,6 +3,7 @@
#define __API_FD_ARRAY__
#include <stdio.h>
+#include <poll.h>
struct pollfd;
@@ -16,6 +17,9 @@ struct pollfd;
* I.e. using 'fda->priv[N].idx = * value' where N < fda->nr is ok,
* but doing 'fda->priv = malloc(M)' is not allowed.
*/
+
+#define FDARRAY__STAT_ENTRIES_MAX 1
+
struct fdarray {
int nr;
int nr_alloc;
@@ -25,6 +29,8 @@ struct fdarray {
int idx;
void *ptr;
} *priv;
+ int nr_stat;
+ struct pollfd stat_entries[FDARRAY__STAT_ENTRIES_MAX];
};
void fdarray__init(struct fdarray *fda, int nr_autogrow);
@@ -34,6 +40,7 @@ struct fdarray *fdarray__new(int nr_alloc, int nr_autogrow);
void fdarray__delete(struct fdarray *fda);
int fdarray__add(struct fdarray *fda, int fd, short revents);
+int fdarray__add_stat(struct fdarray *fda, int fd, short revents);
int fdarray__poll(struct fdarray *fda, int timeout);
int fdarray__filter(struct fdarray *fda, short revents,
void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 6a875a0f01bb..e68e4c08e7c2 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -317,6 +317,17 @@ int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd,
return pos;
}
+int perf_evlist__add_pollfd_stat(struct perf_evlist *evlist, int fd,
+ short revent)
+{
+ int pos = fdarray__add_stat(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
+
+ if (pos >= 0)
+ fcntl(fd, F_SETFL, O_NONBLOCK);
+
+ return pos;
+}
+
static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd,
void *arg __maybe_unused)
{
diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
index 74dc8c3f0b66..2b3b4518c05e 100644
--- a/tools/lib/perf/include/internal/evlist.h
+++ b/tools/lib/perf/include/internal/evlist.h
@@ -46,6 +46,8 @@ struct perf_evlist_mmap_ops {
int perf_evlist__alloc_pollfd(struct perf_evlist *evlist);
int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd,
void *ptr, short revent);
+int perf_evlist__add_pollfd_stat(struct perf_evlist *evlist, int fd,
+ short revent);
int perf_evlist__mmap_ops(struct perf_evlist *evlist,
struct perf_evlist_mmap_ops *ops,
--
2.24.1
Introduce process_timeout() and process_interval() functions that
factor out body of event handling loop for attach and system wide
monitoring use cases.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-stat.c | 28 ++++++++++++++++++++--------
1 file changed, 20 insertions(+), 8 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index b2b79aa161dd..6dd28c714f4a 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -422,6 +422,23 @@ static void process_interval(void)
print_counters(&rs, 0, NULL);
}
+static bool print_interval(unsigned int interval, int *times)
+{
+ if (interval) {
+ process_interval();
+ if (interval_count && !(--(*times)))
+ return true;
+ }
+ return false;
+}
+
+static bool process_timeout(int timeout, unsigned int interval, int *times)
+{
+ if (timeout)
+ return true;
+ return print_interval(interval, times);
+}
+
static void enable_counters(void)
{
if (stat_config.initial_delay)
@@ -558,6 +575,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
struct affinity affinity;
int i, cpu;
bool second_pass = false;
+ bool stop = false;
if (interval) {
ts.tv_sec = interval / USEC_PER_MSEC;
@@ -752,17 +770,11 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
psignal(WTERMSIG(status), argv[0]);
} else {
enable_counters();
- while (!done) {
+ while (!done && !stop) {
nanosleep(&ts, NULL);
if (!is_target_alive(&target, evsel_list->core.threads))
break;
- if (timeout)
- break;
- if (interval) {
- process_interval();
- if (interval_count && !(--times))
- break;
- }
+ stop = process_timeout(timeout, interval, ×);
}
}
--
2.24.1
Check for target existence in loop control statement jointly with 'stop'
indicator based on command line values and external asynchronous 'done'
signal.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-stat.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 6dd28c714f4a..9c090eaae1e1 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -770,10 +770,8 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
psignal(WTERMSIG(status), argv[0]);
} else {
enable_counters();
- while (!done && !stop) {
+ while (!done && !stop && is_target_alive(&target, evsel_list->core.threads)) {
nanosleep(&ts, NULL);
- if (!is_target_alive(&target, evsel_list->core.threads))
- break;
stop = process_timeout(timeout, interval, ×);
}
}
--
2.24.1
Consolidate event dispatching loops for fork, attach and system
wide monitoring use cases into common dispatch_events() function.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-stat.c | 35 ++++++++++++++++++++++++-----------
1 file changed, 24 insertions(+), 11 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index edd6fdcd1c72..f2d1b8c1481d 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -504,6 +504,27 @@ static bool is_target_alive(struct target *_target,
return false;
}
+static int dispatch_events(bool forks, int timeout, int interval, int *times, struct timespec *ts)
+{
+ bool stop = false;
+ int child = 0, status = 0;
+
+ while (1) {
+ if (forks)
+ child = waitpid(child_pid, &status, WNOHANG);
+ else
+ child = !is_target_alive(&target, evsel_list->core.threads) ? 1 : 0;
+
+ if (done || stop || child)
+ break;
+
+ nanosleep(ts, NULL);
+ stop = process_timeout(timeout, interval, times);
+ }
+
+ return status;
+}
+
enum counter_recovery {
COUNTER_SKIP,
COUNTER_RETRY,
@@ -575,7 +596,6 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
struct affinity affinity;
int i, cpu;
bool second_pass = false;
- bool stop = false;
if (interval) {
ts.tv_sec = interval / USEC_PER_MSEC;
@@ -744,12 +764,8 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
perf_evlist__start_workload(evsel_list);
enable_counters();
- if (interval || timeout) {
- while (!stop && !waitpid(child_pid, &status, WNOHANG)) {
- nanosleep(&ts, NULL);
- stop = process_timeout(timeout, interval, ×);
- }
- }
+ if (interval || timeout)
+ status = dispatch_events(forks, timeout, interval, ×, &ts);
if (child_pid != -1) {
if (timeout)
kill(child_pid, SIGTERM);
@@ -766,10 +782,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
psignal(WTERMSIG(status), argv[0]);
} else {
enable_counters();
- while (!done && !stop && is_target_alive(&target, evsel_list->core.threads)) {
- nanosleep(&ts, NULL);
- stop = process_timeout(timeout, interval, ×);
- }
+ dispatch_events(forks, timeout, interval, ×, &ts);
}
disable_counters();
--
2.24.1
Factor out body of event handling loop for fork case reusing
process_timeout() and process_interval() functions.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-stat.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9c090eaae1e1..edd6fdcd1c72 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -745,13 +745,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
enable_counters();
if (interval || timeout) {
- while (!waitpid(child_pid, &status, WNOHANG)) {
+ while (!stop && !waitpid(child_pid, &status, WNOHANG)) {
nanosleep(&ts, NULL);
- if (timeout)
- break;
- process_interval();
- if (interval_count && !(--times))
- break;
+ stop = process_timeout(timeout, interval, ×);
}
}
if (child_pid != -1) {
--
2.24.1
Extend -D,--delay option with -1 value to start monitoring with
events disabled to be enabled later by enable command provided
via control file descriptor.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/Documentation/perf-stat.txt | 5 +++--
tools/perf/builtin-stat.c | 18 ++++++++++++++----
tools/perf/util/evlist.h | 3 +++
tools/perf/util/stat.h | 2 +-
4 files changed, 21 insertions(+), 7 deletions(-)
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index b029ee728a0b..9f32f6cd558d 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -238,8 +238,9 @@ mode, use --per-node in addition to -a. (system-wide).
-D msecs::
--delay msecs::
-After starting the program, wait msecs before measuring. This is useful to
-filter out the startup phase of the program, which is often very different.
+After starting the program, wait msecs before measuring (-1: start with events
+disabled). This is useful to filter out the startup phase of the program,
+which is often very different.
-T::
--transaction::
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f2d1b8c1481d..8cf248bb9e78 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -441,16 +441,26 @@ static bool process_timeout(int timeout, unsigned int interval, int *times)
static void enable_counters(void)
{
- if (stat_config.initial_delay)
+ if (stat_config.initial_delay < 0) {
+ pr_info(EVLIST_DISABLED_MSG);
+ return;
+ }
+
+ if (stat_config.initial_delay > 0) {
+ pr_info(EVLIST_DISABLED_MSG);
usleep(stat_config.initial_delay * USEC_PER_MSEC);
+ }
/*
* We need to enable counters only if:
* - we don't have tracee (attaching to task or cpu)
* - we have initial delay configured
*/
- if (!target__none(&target) || stat_config.initial_delay)
+ if (!target__none(&target) || stat_config.initial_delay) {
evlist__enable(evsel_list);
+ if (stat_config.initial_delay > 0)
+ pr_info(EVLIST_ENABLED_MSG);
+ }
}
static void disable_counters(void)
@@ -1007,8 +1017,8 @@ static struct option stat_options[] = {
"aggregate counts per thread", AGGR_THREAD),
OPT_SET_UINT(0, "per-node", &stat_config.aggr_mode,
"aggregate counts per numa node", AGGR_NODE),
- OPT_UINTEGER('D', "delay", &stat_config.initial_delay,
- "ms to wait before starting measurement after program start"),
+ OPT_INTEGER('D', "delay", &stat_config.initial_delay,
+ "ms to wait before starting measurement after program start (-1: start with events disabled)"),
OPT_CALLBACK_NOOPT(0, "metric-only", &stat_config.metric_only, NULL,
"Only print computed metrics. No raw values", enable_metric_only),
OPT_BOOLEAN(0, "metric-no-group", &stat_config.metric_no_group,
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index bccf0a970371..7c3726a685f5 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -377,4 +377,7 @@ int evlist__initialize_ctlfd(struct evlist *evlist, int ctl_fd, int ctl_fd_ack);
int evlist__finalize_ctlfd(struct evlist *evlist);
int evlist__ctlfd_process(struct evlist *evlist, enum evlist_ctl_cmd *cmd);
+#define EVLIST_ENABLED_MSG "Events enabled\n"
+#define EVLIST_DISABLED_MSG "Events disabled\n"
+
#endif /* __PERF_EVLIST_H */
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index f75ae679eb28..626421ef35c2 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -116,7 +116,7 @@ struct perf_stat_config {
FILE *output;
unsigned int interval;
unsigned int timeout;
- unsigned int initial_delay;
+ int initial_delay;
unsigned int unit_width;
unsigned int metric_only_len;
int times;
--
2.24.1
Implement handling of 'enable' and 'disable' control commands
coming from control file descriptor. process_evlist() function
checks for events on static fd and makes required operations.
If poll event splits initiated timeout interval then the reminder
is calculated and still waited in the following poll() syscall.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-stat.c | 67 +++++++++++++++++++++++++++++----------
1 file changed, 50 insertions(+), 17 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 8cf248bb9e78..4a197b5eee23 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -439,6 +439,31 @@ static bool process_timeout(int timeout, unsigned int interval, int *times)
return print_interval(interval, times);
}
+static bool process_evlist(struct evlist *evlist, unsigned int interval, int *times)
+{
+ bool stop = false;
+ enum evlist_ctl_cmd cmd = EVLIST_CTL_CMD_UNSUPPORTED;
+
+ if (evlist__ctlfd_process(evlist, &cmd) > 0) {
+ switch (cmd) {
+ case EVLIST_CTL_CMD_ENABLE:
+ pr_info(EVLIST_ENABLED_MSG);
+ stop = print_interval(interval, times);
+ break;
+ case EVLIST_CTL_CMD_DISABLE:
+ stop = print_interval(interval, times);
+ pr_info(EVLIST_DISABLED_MSG);
+ break;
+ case EVLIST_CTL_CMD_ACK:
+ case EVLIST_CTL_CMD_UNSUPPORTED:
+ default:
+ break;
+ }
+ }
+
+ return stop;
+}
+
static void enable_counters(void)
{
if (stat_config.initial_delay < 0) {
@@ -514,10 +539,21 @@ static bool is_target_alive(struct target *_target,
return false;
}
-static int dispatch_events(bool forks, int timeout, int interval, int *times, struct timespec *ts)
+static int dispatch_events(bool forks, int timeout, int interval, int *times)
{
bool stop = false;
int child = 0, status = 0;
+ int time_to_sleep, sleep_time;
+ struct timespec time_start, time_stop, time_diff;
+
+ if (interval)
+ sleep_time = interval;
+ else if (timeout)
+ sleep_time = timeout;
+ else
+ sleep_time = 1000;
+
+ time_to_sleep = sleep_time;
while (1) {
if (forks)
@@ -528,8 +564,17 @@ static int dispatch_events(bool forks, int timeout, int interval, int *times, st
if (done || stop || child)
break;
- nanosleep(ts, NULL);
- stop = process_timeout(timeout, interval, times);
+ clock_gettime(CLOCK_MONOTONIC, &time_start);
+ if (!(evlist__poll(evsel_list, time_to_sleep) > 0)) { /* poll timeout or EINTR */
+ stop = process_timeout(timeout, interval, times);
+ time_to_sleep = sleep_time;
+ } else { /* fd revent */
+ stop = process_evlist(evsel_list, interval, times);
+ clock_gettime(CLOCK_MONOTONIC, &time_stop);
+ diff_timespec(&time_diff, &time_stop, &time_start);
+ time_to_sleep -= time_diff.tv_sec * MSEC_PER_SEC +
+ time_diff.tv_nsec / NSEC_PER_MSEC;
+ }
}
return status;
@@ -598,7 +643,6 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
char msg[BUFSIZ];
unsigned long long t0, t1;
struct evsel *counter;
- struct timespec ts;
size_t l;
int status = 0;
const bool forks = (argc > 0);
@@ -607,17 +651,6 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
int i, cpu;
bool second_pass = false;
- if (interval) {
- ts.tv_sec = interval / USEC_PER_MSEC;
- ts.tv_nsec = (interval % USEC_PER_MSEC) * NSEC_PER_MSEC;
- } else if (timeout) {
- ts.tv_sec = timeout / USEC_PER_MSEC;
- ts.tv_nsec = (timeout % USEC_PER_MSEC) * NSEC_PER_MSEC;
- } else {
- ts.tv_sec = 1;
- ts.tv_nsec = 0;
- }
-
if (forks) {
if (perf_evlist__prepare_workload(evsel_list, &target, argv, is_pipe,
workload_exec_failed_signal) < 0) {
@@ -775,7 +808,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
enable_counters();
if (interval || timeout)
- status = dispatch_events(forks, timeout, interval, ×, &ts);
+ status = dispatch_events(forks, timeout, interval, ×);
if (child_pid != -1) {
if (timeout)
kill(child_pid, SIGTERM);
@@ -792,7 +825,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
psignal(WTERMSIG(status), argv[0]);
} else {
enable_counters();
- dispatch_events(forks, timeout, interval, ×, &ts);
+ dispatch_events(forks, timeout, interval, ×);
}
disable_counters();
--
2.24.1
Introduce --ctl-fd[-ack] options to pass open file descriptors numbers
from command line. Extend perf-stat.txt file with --ctl-fd[-ack] options
description. Document possible usage model introduced by --ctl-fd[-ack]
options by providing example bash shell script.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/Documentation/perf-stat.txt | 40 ++++++++++++++++++++++++++
tools/perf/builtin-stat.c | 11 +++++++
tools/perf/util/stat.h | 2 ++
3 files changed, 53 insertions(+)
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 9f32f6cd558d..c4a5d31062ef 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -176,6 +176,46 @@ with it. --append may be used here. Examples:
3>results perf stat --log-fd 3 -- $cmd
3>>results perf stat --log-fd 3 --append -- $cmd
+--ctl-fd::
+--ctl-fd-ack::
+
+Listen on ctl-fd descriptor for command to control measurement ('enable': enable events,
+'disable': disable events). Optionally send control command completion ('ack') to fd-ack
+descriptor to synchronize with the controlling process. Example of bash shell script
+to enable and disable events during measurements:
+
+#!/bin/bash
+
+ctl_dir=/tmp/
+
+ctl_fifo=${ctl_dir}perf_ctl.fifo
+test -p ${ctl_fifo} && unlink ${ctl_fifo}
+mkfifo ${ctl_fifo}
+exec {ctl_fd}<>${ctl_fifo}
+
+ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
+test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
+mkfifo ${ctl_ack_fifo}
+exec {ctl_fd_ack}<>${ctl_ack_fifo}
+
+perf stat -D -1 -e cpu-cycles -a -I 1000 \
+ --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
+ -- sleep 30 &
+perf_pid=$!
+
+sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
+sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
+
+exec {ctl_fd_ack}>&-
+unlink ${ctl_ack_fifo}
+
+exec {ctl_fd}>&-
+unlink ${ctl_fifo}
+
+wait -n ${perf_pid}
+exit $?
+
+
--pre::
--post::
Pre and post measurement hooks, e.g.:
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 4a197b5eee23..a878e106ead0 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -188,6 +188,8 @@ static struct perf_stat_config stat_config = {
.metric_only_len = METRIC_ONLY_LEN,
.walltime_nsecs_stats = &walltime_nsecs_stats,
.big_num = true,
+ .ctl_fd = -1,
+ .ctl_fd_ack = -1
};
static inline void diff_timespec(struct timespec *r, struct timespec *a,
@@ -1080,6 +1082,10 @@ static struct option stat_options[] = {
"libpfm4 event selector. use 'perf list' to list available events",
parse_libpfm_events_option),
#endif
+ OPT_INTEGER(0, "ctl-fd", &stat_config.ctl_fd,
+ "Listen on fd descriptor for command to control measurement ('enable': enable events, 'disable': disable events)"),
+ OPT_INTEGER(0, "ctl-fd-ack", &stat_config.ctl_fd_ack,
+ "Send control command completion ('ack') to fd ack descriptor"),
OPT_END()
};
@@ -2249,6 +2255,9 @@ int cmd_stat(int argc, const char **argv)
signal(SIGALRM, skip_signal);
signal(SIGABRT, skip_signal);
+ if (evlist__initialize_ctlfd(evsel_list, stat_config.ctl_fd, stat_config.ctl_fd_ack))
+ goto out;
+
status = 0;
for (run_idx = 0; forever || run_idx < stat_config.run_count; run_idx++) {
if (stat_config.run_count != 1 && verbose > 0)
@@ -2268,6 +2277,8 @@ int cmd_stat(int argc, const char **argv)
if (!forever && status != -1 && (!interval || stat_config.summary))
print_counters(NULL, argc, argv);
+ evlist__finalize_ctlfd(evsel_list);
+
if (STAT_RECORD) {
/*
* We synthesize the kernel mmap record just so that older tools
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 626421ef35c2..06f0baabe775 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -133,6 +133,8 @@ struct perf_stat_config {
struct perf_cpu_map *cpus_aggr_map;
u64 *walltime_run;
struct rblist metric_events;
+ int ctl_fd;
+ int ctl_fd_ack;
};
void perf_stat__set_big_num(int set);
--
2.24.1
Extend -D,--delay option with -1 to start collection with events
disabled to be enabled later by enable command provided via control
file descriptor.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/Documentation/perf-record.txt | 5 +++--
tools/perf/builtin-record.c | 12 ++++++++----
tools/perf/builtin-trace.c | 2 +-
tools/perf/util/record.h | 2 +-
4 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index fa8a5fcd27ab..a84376605805 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -407,8 +407,9 @@ if combined with -a or -C options.
-D::
--delay=::
-After starting the program, wait msecs before measuring. This is useful to
-filter out the startup phase of the program, which is often very different.
+After starting the program, wait msecs before measuring (-1: start with events
+disabled). This is useful to filter out the startup phase of the program, which
+is often very different.
-I::
--intr-regs::
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index e108d90ae2ed..d0b29a1070a0 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1749,8 +1749,12 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
}
if (opts->initial_delay) {
- usleep(opts->initial_delay * USEC_PER_MSEC);
- evlist__enable(rec->evlist);
+ pr_info(EVLIST_DISABLED_MSG);
+ if (opts->initial_delay > 0) {
+ usleep(opts->initial_delay * USEC_PER_MSEC);
+ evlist__enable(rec->evlist);
+ pr_info(EVLIST_ENABLED_MSG);
+ }
}
trigger_ready(&auxtrace_snapshot_trigger);
@@ -2462,8 +2466,8 @@ static struct option __record_options[] = {
OPT_CALLBACK('G', "cgroup", &record.evlist, "name",
"monitor event in cgroup name only",
parse_cgroups),
- OPT_UINTEGER('D', "delay", &record.opts.initial_delay,
- "ms to wait before starting measurement after program start"),
+ OPT_INTEGER('D', "delay", &record.opts.initial_delay,
+ "ms to wait before starting measurement after program start (-1: start with events disabled)"),
OPT_BOOLEAN(0, "kcore", &record.opts.kcore, "copy /proc/kcore"),
OPT_STRING('u', "uid", &record.opts.target.uid_str, "user",
"user to profile"),
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 4cbb64edc998..290149b1b3b5 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -4813,7 +4813,7 @@ int cmd_trace(int argc, const char **argv)
"per thread proc mmap processing timeout in ms"),
OPT_CALLBACK('G', "cgroup", &trace, "name", "monitor event in cgroup name only",
trace__parse_cgroups),
- OPT_UINTEGER('D', "delay", &trace.opts.initial_delay,
+ OPT_INTEGER('D', "delay", &trace.opts.initial_delay,
"ms to wait before starting measurement after program "
"start"),
OPTS_EVSWITCH(&trace.evswitch),
diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
index 39d1de4b2a36..da138dcb4d34 100644
--- a/tools/perf/util/record.h
+++ b/tools/perf/util/record.h
@@ -61,7 +61,7 @@ struct record_opts {
const char *auxtrace_snapshot_opts;
const char *auxtrace_sample_opts;
bool sample_transaction;
- unsigned initial_delay;
+ int initial_delay;
bool use_clockid;
clockid_t clockid;
u64 clockid_res_ns;
--
2.24.1
Implement handling of 'enable' and 'disable' control commands
coming from control file descriptor.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-record.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index d0b29a1070a0..0394e068dde8 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1527,6 +1527,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
bool disabled = false, draining = false;
int fd;
float ratio = 0;
+ enum evlist_ctl_cmd cmd = EVLIST_CTL_CMD_UNSUPPORTED;
atexit(record__sig_exit);
signal(SIGCHLD, sig_handler);
@@ -1830,6 +1831,21 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
alarm(rec->switch_output.time);
}
+ if (evlist__ctlfd_process(rec->evlist, &cmd) > 0) {
+ switch (cmd) {
+ case EVLIST_CTL_CMD_ENABLE:
+ pr_info(EVLIST_ENABLED_MSG);
+ break;
+ case EVLIST_CTL_CMD_DISABLE:
+ pr_info(EVLIST_DISABLED_MSG);
+ break;
+ case EVLIST_CTL_CMD_ACK:
+ case EVLIST_CTL_CMD_UNSUPPORTED:
+ default:
+ break;
+ }
+ }
+
if (hits == rec->samples) {
if (done || draining)
break;
--
2.24.1
Introduce --ctl-fd[-ack] options to pass open file descriptors numbers
from command line. Extend perf-record.txt file with --ctl-fd[-ack]
options description. Document possible usage model introduced by
--ctl-fd[-ack] options by providing example bash shell script.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-record.c | 10 ++++++++++
tools/perf/util/record.h | 2 ++
2 files changed, 12 insertions(+)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 0394e068dde8..8494ce964738 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1749,6 +1749,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
perf_evlist__start_workload(rec->evlist);
}
+ if (evlist__initialize_ctlfd(rec->evlist, opts->ctl_fd, opts->ctl_fd_ack))
+ goto out_child;
+
if (opts->initial_delay) {
pr_info(EVLIST_DISABLED_MSG);
if (opts->initial_delay > 0) {
@@ -1895,6 +1898,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
record__synthesize_workload(rec, true);
out_child:
+ evlist__finalize_ctlfd(rec->evlist);
record__mmap_read_all(rec, true);
record__aio_mmap_read_sync(rec);
@@ -2380,6 +2384,8 @@ static struct record record = {
},
.mmap_flush = MMAP_FLUSH_DEFAULT,
.nr_threads_synthesize = 1,
+ .ctl_fd = -1,
+ .ctl_fd_ack = -1,
},
.tool = {
.sample = process_sample_event,
@@ -2581,6 +2587,10 @@ static struct option __record_options[] = {
"libpfm4 event selector. use 'perf list' to list available events",
parse_libpfm_events_option),
#endif
+ OPT_INTEGER(0, "ctl-fd", &record.opts.ctl_fd,
+ "Listen on fd descriptor for command to control measurement ('enable': enable events, 'disable': disable events)"),
+ OPT_INTEGER(0, "ctl-fd-ack", &record.opts.ctl_fd_ack,
+ "Send control command completion ('ack') to fd ack descriptor"),
OPT_END()
};
diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
index da138dcb4d34..4cb72a478af1 100644
--- a/tools/perf/util/record.h
+++ b/tools/perf/util/record.h
@@ -70,6 +70,8 @@ struct record_opts {
int mmap_flush;
unsigned int comp_level;
unsigned int nr_threads_synthesize;
+ int ctl_fd;
+ int ctl_fd_ack;
};
extern const char * const *record_usage;
--
2.24.1
On 1/06/20 11:05 pm, Alexey Budankov wrote:
>
> Implement adding of file descriptors by fdarray__add_stat() to
> fix-sized (currently 1) stat_entries array located at struct fdarray.
> Append added file descriptors to the array used by poll() syscall
> during fdarray__poll() call. Copy poll() result of the added
> descriptors from the array back to the storage for separate analysis.
Why not instead call evlist__add_pollfd() before other fds are added, so
the fda->entries[] position is always fixed. Then this patch is not needed.
>
> Signed-off-by: Alexey Budankov <[email protected]>
> ---
> tools/lib/api/fd/array.c | 42 +++++++++++++++++++++++-
> tools/lib/api/fd/array.h | 7 ++++
> tools/lib/perf/evlist.c | 11 +++++++
> tools/lib/perf/include/internal/evlist.h | 2 ++
> 4 files changed, 61 insertions(+), 1 deletion(-)
>
> diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
> index 58d44d5eee31..b0027f2169c7 100644
> --- a/tools/lib/api/fd/array.c
> +++ b/tools/lib/api/fd/array.c
> @@ -11,10 +11,16 @@
>
> void fdarray__init(struct fdarray *fda, int nr_autogrow)
> {
> + int i;
> +
> fda->entries = NULL;
> fda->priv = NULL;
> fda->nr = fda->nr_alloc = 0;
> fda->nr_autogrow = nr_autogrow;
> +
> + fda->nr_stat = 0;
> + for (i = 0; i < FDARRAY__STAT_ENTRIES_MAX; i++)
> + fda->stat_entries[i].fd = -1;
> }
>
> int fdarray__grow(struct fdarray *fda, int nr)
> @@ -83,6 +89,20 @@ int fdarray__add(struct fdarray *fda, int fd, short revents)
> return pos;
> }
>
> +int fdarray__add_stat(struct fdarray *fda, int fd, short revents)
> +{
> + int pos = fda->nr_stat;
> +
> + if (pos >= FDARRAY__STAT_ENTRIES_MAX)
> + return -1;
> +
> + fda->stat_entries[pos].fd = fd;
> + fda->stat_entries[pos].events = revents;
> + fda->nr_stat++;
> +
> + return pos;
> +}
> +
> int fdarray__filter(struct fdarray *fda, short revents,
> void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
> void *arg)
> @@ -113,7 +133,27 @@ int fdarray__filter(struct fdarray *fda, short revents,
>
> int fdarray__poll(struct fdarray *fda, int timeout)
> {
> - return poll(fda->entries, fda->nr, timeout);
> + int nr, i, pos, res;
> +
> + nr = fda->nr;
> +
> + for (i = 0; i < fda->nr_stat; i++) {
> + if (fda->stat_entries[i].fd != -1) {
> + pos = fdarray__add(fda, fda->stat_entries[i].fd,
> + fda->stat_entries[i].events);
> + if (pos >= 0)
> + fda->priv[pos].idx = i;
> + }
> + }
> +
> + res = poll(fda->entries, fda->nr, timeout);
> +
> + for (i = nr; i < fda->nr; i++)
> + fda->stat_entries[fda->priv[i].idx] = fda->entries[i];
> +
> + fda->nr = nr;
> +
> + return res;
> }
>
> int fdarray__fprintf(struct fdarray *fda, FILE *fp)
> diff --git a/tools/lib/api/fd/array.h b/tools/lib/api/fd/array.h
> index b39557d1a88f..9bca72e80b09 100644
> --- a/tools/lib/api/fd/array.h
> +++ b/tools/lib/api/fd/array.h
> @@ -3,6 +3,7 @@
> #define __API_FD_ARRAY__
>
> #include <stdio.h>
> +#include <poll.h>
>
> struct pollfd;
>
> @@ -16,6 +17,9 @@ struct pollfd;
> * I.e. using 'fda->priv[N].idx = * value' where N < fda->nr is ok,
> * but doing 'fda->priv = malloc(M)' is not allowed.
> */
> +
> +#define FDARRAY__STAT_ENTRIES_MAX 1
> +
> struct fdarray {
> int nr;
> int nr_alloc;
> @@ -25,6 +29,8 @@ struct fdarray {
> int idx;
> void *ptr;
> } *priv;
> + int nr_stat;
> + struct pollfd stat_entries[FDARRAY__STAT_ENTRIES_MAX];
> };
>
> void fdarray__init(struct fdarray *fda, int nr_autogrow);
> @@ -34,6 +40,7 @@ struct fdarray *fdarray__new(int nr_alloc, int nr_autogrow);
> void fdarray__delete(struct fdarray *fda);
>
> int fdarray__add(struct fdarray *fda, int fd, short revents);
> +int fdarray__add_stat(struct fdarray *fda, int fd, short revents);
> int fdarray__poll(struct fdarray *fda, int timeout);
> int fdarray__filter(struct fdarray *fda, short revents,
> void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 6a875a0f01bb..e68e4c08e7c2 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -317,6 +317,17 @@ int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd,
> return pos;
> }
>
> +int perf_evlist__add_pollfd_stat(struct perf_evlist *evlist, int fd,
> + short revent)
> +{
> + int pos = fdarray__add_stat(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
> +
> + if (pos >= 0)
> + fcntl(fd, F_SETFL, O_NONBLOCK);
> +
> + return pos;
> +}
> +
> static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd,
> void *arg __maybe_unused)
> {
> diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
> index 74dc8c3f0b66..2b3b4518c05e 100644
> --- a/tools/lib/perf/include/internal/evlist.h
> +++ b/tools/lib/perf/include/internal/evlist.h
> @@ -46,6 +46,8 @@ struct perf_evlist_mmap_ops {
> int perf_evlist__alloc_pollfd(struct perf_evlist *evlist);
> int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd,
> void *ptr, short revent);
> +int perf_evlist__add_pollfd_stat(struct perf_evlist *evlist, int fd,
> + short revent);
>
> int perf_evlist__mmap_ops(struct perf_evlist *evlist,
> struct perf_evlist_mmap_ops *ops,
>
Hi,
On 03.06.2020 14:38, Adrian Hunter wrote:
> On 1/06/20 11:05 pm, Alexey Budankov wrote:
>>
>> Implement adding of file descriptors by fdarray__add_stat() to
>> fix-sized (currently 1) stat_entries array located at struct fdarray.
>> Append added file descriptors to the array used by poll() syscall
>> during fdarray__poll() call. Copy poll() result of the added
>> descriptors from the array back to the storage for separate analysis.
>
> Why not instead call evlist__add_pollfd() before other fds are added, so
> the fda->entries[] position is always fixed. Then this patch is not needed.
It then will block event consumption loop, at least in record mode, due to
change sin initial assumptions behind fdarray__filter(). So extension of the
API with 'static' fds looks safer w.r.t. possible functional regressions at
the same time extending the API with ability to atomically wait for (poll())
not only event fds but also any other fds during monitoring.
Feel free to ask more.
~Alexey
On 3/06/20 3:01 pm, Alexey Budankov wrote:
> Hi,
>
> On 03.06.2020 14:38, Adrian Hunter wrote:
>> On 1/06/20 11:05 pm, Alexey Budankov wrote:
>>>
>>> Implement adding of file descriptors by fdarray__add_stat() to
>>> fix-sized (currently 1) stat_entries array located at struct fdarray.
>>> Append added file descriptors to the array used by poll() syscall
>>> during fdarray__poll() call. Copy poll() result of the added
>>> descriptors from the array back to the storage for separate analysis.
>>
>> Why not instead call evlist__add_pollfd() before other fds are added, so
>> the fda->entries[] position is always fixed. Then this patch is not needed.
>
> It then will block event consumption loop, at least in record mode, due to
> change sin initial assumptions behind fdarray__filter(). So extension of the
> API with 'static' fds looks safer w.r.t. possible functional regressions at
> the same time extending the API with ability to atomically wait for (poll())
> not only event fds but also any other fds during monitoring.
So make fdarray__filter() return the number of filterable fds remaining.
On 1/06/20 11:20 pm, Alexey Budankov wrote:
>
> Introduce --ctl-fd[-ack] options to pass open file descriptors numbers
> from command line. Extend perf-record.txt file with --ctl-fd[-ack]
perf-record.txt change seems to be missing.
> options description. Document possible usage model introduced by
> --ctl-fd[-ack] options by providing example bash shell script.
>
> Signed-off-by: Alexey Budankov <[email protected]>
> ---
> tools/perf/builtin-record.c | 10 ++++++++++
> tools/perf/util/record.h | 2 ++
> 2 files changed, 12 insertions(+)
>
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 0394e068dde8..8494ce964738 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -1749,6 +1749,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
> perf_evlist__start_workload(rec->evlist);
> }
>
> + if (evlist__initialize_ctlfd(rec->evlist, opts->ctl_fd, opts->ctl_fd_ack))
> + goto out_child;
> +
> if (opts->initial_delay) {
> pr_info(EVLIST_DISABLED_MSG);
> if (opts->initial_delay > 0) {
> @@ -1895,6 +1898,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
> record__synthesize_workload(rec, true);
>
> out_child:
> + evlist__finalize_ctlfd(rec->evlist);
> record__mmap_read_all(rec, true);
> record__aio_mmap_read_sync(rec);
>
> @@ -2380,6 +2384,8 @@ static struct record record = {
> },
> .mmap_flush = MMAP_FLUSH_DEFAULT,
> .nr_threads_synthesize = 1,
> + .ctl_fd = -1,
> + .ctl_fd_ack = -1,
> },
> .tool = {
> .sample = process_sample_event,
> @@ -2581,6 +2587,10 @@ static struct option __record_options[] = {
> "libpfm4 event selector. use 'perf list' to list available events",
> parse_libpfm_events_option),
> #endif
> + OPT_INTEGER(0, "ctl-fd", &record.opts.ctl_fd,
> + "Listen on fd descriptor for command to control measurement ('enable': enable events, 'disable': disable events)"),
> + OPT_INTEGER(0, "ctl-fd-ack", &record.opts.ctl_fd_ack,
> + "Send control command completion ('ack') to fd ack descriptor"),
> OPT_END()
> };
>
> diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
> index da138dcb4d34..4cb72a478af1 100644
> --- a/tools/perf/util/record.h
> +++ b/tools/perf/util/record.h
> @@ -70,6 +70,8 @@ struct record_opts {
> int mmap_flush;
> unsigned int comp_level;
> unsigned int nr_threads_synthesize;
> + int ctl_fd;
> + int ctl_fd_ack;
> };
>
> extern const char * const *record_usage;
>
> -----Original Message-----
> From: Hunter, Adrian <[email protected]>
> Sent: Wednesday, June 3, 2020 3:24 PM
> To: Alexey Budankov <[email protected]>
> Cc: Arnaldo Carvalho de Melo <[email protected]>; Jiri Olsa
> <[email protected]>; Namhyung Kim <[email protected]>; Alexander
> Shishkin <[email protected]>; Peter Zijlstra
> <[email protected]>; Ingo Molnar <[email protected]>; Andi Kleen
> <[email protected]>; linux-kernel <[email protected]>
> Subject: Re: [PATCH v6 01/13] tools/libperf: introduce notion of static polled
> file descriptors
>
> On 3/06/20 3:01 pm, Alexey Budankov wrote:
> > Hi,
> >
> > On 03.06.2020 14:38, Adrian Hunter wrote:
> >> On 1/06/20 11:05 pm, Alexey Budankov wrote:
> >>>
> >>> Implement adding of file descriptors by fdarray__add_stat() to
> >>> fix-sized (currently 1) stat_entries array located at struct fdarray.
> >>> Append added file descriptors to the array used by poll() syscall
> >>> during fdarray__poll() call. Copy poll() result of the added
> >>> descriptors from the array back to the storage for separate analysis.
> >>
> >> Why not instead call evlist__add_pollfd() before other fds are added,
> >> so the fda->entries[] position is always fixed. Then this patch is not
> needed.
> >
> > It then will block event consumption loop, at least in record mode,
> > due to change sin initial assumptions behind fdarray__filter(). So
> > extension of the API with 'static' fds looks safer w.r.t. possible
> > functional regressions at the same time extending the API with ability
> > to atomically wait for (poll()) not only event fds but also any other fds
> during monitoring.
>
> So make fdarray__filter() return the number of filterable fds remaining.
>
Or perhaps simpler, compare the return value to the number of fds that are known not to be filterable
On 03.06.2020 15:24, Adrian Hunter wrote:
> On 1/06/20 11:20 pm, Alexey Budankov wrote:
>>
>> Introduce --ctl-fd[-ack] options to pass open file descriptors numbers
>> from command line. Extend perf-record.txt file with --ctl-fd[-ack]
>
> perf-record.txt change seems to be missing.
Good catch, thanks!
Will add it to v7.
~Alexey
On 03.06.2020 15:30, Hunter, Adrian wrote:
>
>
>> -----Original Message-----
>> From: Hunter, Adrian <[email protected]>
>> Sent: Wednesday, June 3, 2020 3:24 PM
>> To: Alexey Budankov <[email protected]>
>> Cc: Arnaldo Carvalho de Melo <[email protected]>; Jiri Olsa
>> <[email protected]>; Namhyung Kim <[email protected]>; Alexander
>> Shishkin <[email protected]>; Peter Zijlstra
>> <[email protected]>; Ingo Molnar <[email protected]>; Andi Kleen
>> <[email protected]>; linux-kernel <[email protected]>
>> Subject: Re: [PATCH v6 01/13] tools/libperf: introduce notion of static polled
>> file descriptors
>>
>> On 3/06/20 3:01 pm, Alexey Budankov wrote:
>>> Hi,
>>>
>>> On 03.06.2020 14:38, Adrian Hunter wrote:
>>>> On 1/06/20 11:05 pm, Alexey Budankov wrote:
>>>>>
>>>>> Implement adding of file descriptors by fdarray__add_stat() to
>>>>> fix-sized (currently 1) stat_entries array located at struct fdarray.
>>>>> Append added file descriptors to the array used by poll() syscall
>>>>> during fdarray__poll() call. Copy poll() result of the added
>>>>> descriptors from the array back to the storage for separate analysis.
>>>>
>>>> Why not instead call evlist__add_pollfd() before other fds are added,
>>>> so the fda->entries[] position is always fixed. Then this patch is not
>> needed.
>>>
>>> It then will block event consumption loop, at least in record mode,
>>> due to change sin initial assumptions behind fdarray__filter(). So
>>> extension of the API with 'static' fds looks safer w.r.t. possible
>>> functional regressions at the same time extending the API with ability
>>> to atomically wait for (poll()) not only event fds but also any other fds
>> during monitoring.
>>
>> So make fdarray__filter() return the number of filterable fds remaining.
>>
>
>
> Or perhaps simpler, compare the return value to the number of fds that are known not to be filterable
Well, various implementations are for sure possible but the proposed design
avoids implicit code assumptions and dependency on API calls order as in API
implementation as in client code and that is why it's been proposed this way.
On 3/06/20 3:52 pm, Alexey Budankov wrote:
>
> On 03.06.2020 15:30, Hunter, Adrian wrote:
>>
>>
>>> -----Original Message-----
>>> From: Hunter, Adrian <[email protected]>
>>> Sent: Wednesday, June 3, 2020 3:24 PM
>>> To: Alexey Budankov <[email protected]>
>>> Cc: Arnaldo Carvalho de Melo <[email protected]>; Jiri Olsa
>>> <[email protected]>; Namhyung Kim <[email protected]>; Alexander
>>> Shishkin <[email protected]>; Peter Zijlstra
>>> <[email protected]>; Ingo Molnar <[email protected]>; Andi Kleen
>>> <[email protected]>; linux-kernel <[email protected]>
>>> Subject: Re: [PATCH v6 01/13] tools/libperf: introduce notion of static polled
>>> file descriptors
>>>
>>> On 3/06/20 3:01 pm, Alexey Budankov wrote:
>>>> Hi,
>>>>
>>>> On 03.06.2020 14:38, Adrian Hunter wrote:
>>>>> On 1/06/20 11:05 pm, Alexey Budankov wrote:
>>>>>>
>>>>>> Implement adding of file descriptors by fdarray__add_stat() to
>>>>>> fix-sized (currently 1) stat_entries array located at struct fdarray.
>>>>>> Append added file descriptors to the array used by poll() syscall
>>>>>> during fdarray__poll() call. Copy poll() result of the added
>>>>>> descriptors from the array back to the storage for separate analysis.
>>>>>
>>>>> Why not instead call evlist__add_pollfd() before other fds are added,
>>>>> so the fda->entries[] position is always fixed. Then this patch is not
>>> needed.
>>>>
>>>> It then will block event consumption loop, at least in record mode,
>>>> due to change sin initial assumptions behind fdarray__filter(). So
>>>> extension of the API with 'static' fds looks safer w.r.t. possible
>>>> functional regressions at the same time extending the API with ability
>>>> to atomically wait for (poll()) not only event fds but also any other fds
>>> during monitoring.
>>>
>>> So make fdarray__filter() return the number of filterable fds remaining.
>>>
>>
>>
>> Or perhaps simpler, compare the return value to the number of fds that are known not to be filterable
>
> Well, various implementations are for sure possible but the proposed design
> avoids implicit code assumptions and dependency on API calls order as in API
> implementation as in client code and that is why it's been proposed this way.
>
The dependencies are confined one function, namely __cmd_record(), it is
simpler, and avoids the inelegant changes to fdarray__poll().
On 03.06.2020 18:53, Adrian Hunter wrote:
> On 3/06/20 3:52 pm, Alexey Budankov wrote:
>>
>> On 03.06.2020 15:30, Hunter, Adrian wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Hunter, Adrian <[email protected]>
>>>> Sent: Wednesday, June 3, 2020 3:24 PM
>>>> To: Alexey Budankov <[email protected]>
>>>> Cc: Arnaldo Carvalho de Melo <[email protected]>; Jiri Olsa
>>>> <[email protected]>; Namhyung Kim <[email protected]>; Alexander
>>>> Shishkin <[email protected]>; Peter Zijlstra
>>>> <[email protected]>; Ingo Molnar <[email protected]>; Andi Kleen
>>>> <[email protected]>; linux-kernel <[email protected]>
>>>> Subject: Re: [PATCH v6 01/13] tools/libperf: introduce notion of static polled
>>>> file descriptors
>>>>
>>>> On 3/06/20 3:01 pm, Alexey Budankov wrote:
>>>>> Hi,
>>>>>
>>>>> On 03.06.2020 14:38, Adrian Hunter wrote:
>>>>>> On 1/06/20 11:05 pm, Alexey Budankov wrote:
>>>>>>>
>>>>>>> Implement adding of file descriptors by fdarray__add_stat() to
>>>>>>> fix-sized (currently 1) stat_entries array located at struct fdarray.
>>>>>>> Append added file descriptors to the array used by poll() syscall
>>>>>>> during fdarray__poll() call. Copy poll() result of the added
>>>>>>> descriptors from the array back to the storage for separate analysis.
>>>>>>
>>>>>> Why not instead call evlist__add_pollfd() before other fds are added,
>>>>>> so the fda->entries[] position is always fixed. Then this patch is not
>>>> needed.
>>>>>
>>>>> It then will block event consumption loop, at least in record mode,
>>>>> due to change sin initial assumptions behind fdarray__filter(). So
>>>>> extension of the API with 'static' fds looks safer w.r.t. possible
>>>>> functional regressions at the same time extending the API with ability
>>>>> to atomically wait for (poll()) not only event fds but also any other fds
>>>> during monitoring.
>>>>
>>>> So make fdarray__filter() return the number of filterable fds remaining.
>>>>
>>>
>>>
>>> Or perhaps simpler, compare the return value to the number of fds that are known not to be filterable
>>
>> Well, various implementations are for sure possible but the proposed design
>> avoids implicit code assumptions and dependency on API calls order as in API
>> implementation as in client code and that is why it's been proposed this way.
>>
>
> The dependencies are confined one function, namely __cmd_record(), it is
> simpler, and avoids the inelegant changes to fdarray__poll().
As far the changes don't cause regressions and serve their purpose well
they are acceptable.