2015-11-05 14:41:17

by Jiri Olsa

Subject: [PATCHv6 00/25] perf stat: Add scripting support

hi,
sending another version of stat scripting.

v6 changes:
- several patches from v4 already taken
- perf stat record can now place 'record' keyword
anywhere within stat options
- placed STAT feature checking earlier into record
patches so commands processing perf.data recognize
stat data and skip sample_type checking
- rebased on Arnaldo's perf/stat
- added Tested-by: Kan Liang <[email protected]>

v5 changes:
- several patches from v4 already taken
- using u16 for cpu number in cpu_map_event
- renamed PERF_RECORD_HEADER_ATTR_UPDATE to PERF_RECORD_EVENT_UPDATE
- moved low hanging fruit patches to the start of the patchset
- patchset tested by Kan Liang, thanks!

v4 changes:
- added attr update event for event's cpumask
- forbid aggregation on task workloads
- some minor reorders and changelog fixes

v3 changes:
- added attr update event to handle unit,scale,name for event
it fixed the uncore_imc_1/cas_count_read/ record/report
- perf report -D now displays stat related events
- some minor and changelog fixes

v2 changes:
- rebased to latest Arnaldo's perf/core
- patches 1 to 11 already merged in
- added --per-core/--per-socket/-A options for perf stat report
command to allow custom aggregation in stat report, please
check new examples below
- a couple of changelog changes

The initial attempt defined its own formula language and allowed
triggering a user's script at the end of the stat command:
http://marc.info/?l=linux-kernel&m=136742146322273&w=2

This patchset abandons the idea of a new formula language
and instead adds support to:
- store stat data in the perf.data file
- process stat events with python scripts

Basically it allows storing stat data in perf.data and
post-processing it with python scripts, in a similar way
to what we do for sampling data.

The stat data are stored in new stat, stat-round and stat-config user events:
stat - stored for each read syscall of the counter
stat-round - stored for each interval or at the end of the command invocation
stat-config - stores all the config information needed to process the data,
so the report tool can restore the same output as record

The python script can now define 'stat__<eventname>_<modifier>' functions
to get stat events data and 'stat__interval' to get stat-round data.

See CPI script example in scripts/python/stat-cpi.py.
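
As a rough illustration of the handler naming described above, here is a
minimal sketch of a CPI script. It is not the stat-cpi.py from this
patchset. The handler argument order (cpu, thread, time, val, ena, run)
and the single 'time' argument of stat__interval are assumptions made
for this sketch, and store() and format_cpi() are hypothetical helpers.
The sample values come from the interval example later in this mail.

```python
# Minimal sketch of a stat handler script (not the real stat-cpi.py).
# ASSUMPTION: perf calls stat__<eventname>_<modifier>(cpu, thread, time,
# val, ena, run) for each counter reading and stat__interval(time) for
# each stat-round event; the argument order is assumed, not confirmed here.

counts = {}  # (time, event, cpu, thread) -> counter value
keys = []    # (cpu, thread) pairs seen so far, in arrival order


def store(time, event, cpu, thread, val):
    """Hypothetical helper: remember one counter reading."""
    counts[(time, event, cpu, thread)] = val
    if (cpu, thread) not in keys:
        keys.append((cpu, thread))


def stat__cycles_u(cpu, thread, time, val, ena, run):
    store(time, "cycles", cpu, thread, val)


def stat__instructions_u(cpu, thread, time, val, ena, run):
    store(time, "instructions", cpu, thread, val)


def format_cpi(time, cpu, thread):
    """Hypothetical helper: format cycles-per-instruction for one key."""
    cyc = counts.get((time, "cycles", cpu, thread), 0)
    ins = counts.get((time, "instructions", cpu, thread), 0)
    cpi = cyc / float(ins) if ins else 0.0
    return "%f: cpu %d, thread %d -> cpi %f (%d/%d)" % (
        time / 1e9, cpu, thread, cpi, cyc, ins)


def stat__interval(time):
    # Called once per interval, after all readings for it were delivered.
    for cpu, thread in keys:
        print(format_cpi(time, cpu, thread))


if __name__ == "__main__":
    # Stand-in for perf: replay one interval by hand.
    t = 1000265471  # timestamp in nanoseconds
    stat__cycles_u(-1, -1, t, 462311482, 0, 0)
    stat__instructions_u(-1, -1, t, 590037440, 0, 0)
    stat__interval(t)
```

Under perf the handlers would be invoked by 'perf script -s <script>';
the __main__ part only replays one interval by hand so the sketch can be
run standalone.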

Also available in:
git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
perf/stat_script

thanks,
jirka

Examples:

- To record stat data for a command workload:

$ perf stat record kill
...

Performance counter stats for 'kill':

0.372007 task-clock (msec) # 0.613 CPUs utilized
3 context-switches # 0.008 M/sec
0 cpu-migrations # 0.000 K/sec
62 page-faults # 0.167 M/sec
1,129,973 cycles # 3.038 GHz
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
813,313 instructions # 0.72 insns per cycle
166,161 branches # 446.661 M/sec
8,747 branch-misses # 5.26% of all branches

0.000607287 seconds time elapsed

- To report perf stat data:

$ perf stat report

Performance counter stats for '/home/jolsa/bin/perf stat record kill':

0.372007 task-clock (msec) # inf CPUs utilized
3 context-switches # 0.008 M/sec
0 cpu-migrations # 0.000 K/sec
62 page-faults # 0.167 M/sec
1,129,973 cycles # 3.038 GHz
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
813,313 instructions # 0.72 insns per cycle
166,161 branches # 446.661 M/sec
8,747 branch-misses # 5.26% of all branches

0.000000000 seconds time elapsed

- To store system-wide interval stat data:

$ perf stat -e cycles:u,instructions:u -a -I 1000 record
# time counts unit events
1.000265471 462,311,482 cycles:u (100.00%)
1.000265471 590,037,440 instructions:u
2.000483453 722,532,336 cycles:u (100.00%)
2.000483453 848,678,197 instructions:u
3.000759876 75,990,880 cycles:u (100.00%)
3.000759876 86,187,813 instructions:u
^C 3.213960893 85,329,533 cycles:u (100.00%)
3.213960893 135,954,296 instructions:u

- To report perf stat data:

$ perf stat report
# time counts unit events
1.000265471 462,311,482 cycles:u (100.00%)
1.000265471 590,037,440 instructions:u
2.000483453 722,532,336 cycles:u (100.00%)
2.000483453 848,678,197 instructions:u
3.000759876 75,990,880 cycles:u (100.00%)
3.000759876 86,187,813 instructions:u
3.213960893 85,329,533 cycles:u (100.00%)
3.213960893 135,954,296 instructions:u

- To run stat-cpi.py script over perf.data:

$ perf script -s scripts/python/stat-cpi.py
1.000265: cpu -1, thread -1 -> cpi 0.783529 (462311482/590037440)
2.000483: cpu -1, thread -1 -> cpi 0.851362 (722532336/848678197)
3.000760: cpu -1, thread -1 -> cpi 0.881689 (75990880/86187813)
3.213961: cpu -1, thread -1 -> cpi 0.627634 (85329533/135954296)

- To pipe data from stat to stat-cpi script:

$ perf stat -e cycles:u,instructions:u -A -C 0 -I 1000 record | perf script -s scripts/python/stat-cpi.py
1.000192: cpu 0, thread -1 -> cpi 0.739535 (23921908/32347236)
2.000376: cpu 0, thread -1 -> cpi 1.663482 (2519340/1514498)
3.000621: cpu 0, thread -1 -> cpi 1.396308 (16162767/11575362)
4.000700: cpu 0, thread -1 -> cpi 1.092246 (20077258/18381624)
5.000867: cpu 0, thread -1 -> cpi 0.473816 (45157586/95306156)
6.001034: cpu 0, thread -1 -> cpi 0.532792 (43701668/82023818)
7.001195: cpu 0, thread -1 -> cpi 1.122059 (29890042/26638561)

- Raw script stat data output:

$ perf stat -e cycles:u,instructions:u -A -C 0 -I 1000 record | perf --no-pager script
CPU THREAD VAL ENA RUN TIME EVENT
0 -1 12302059 1000811347 1000810712 1000198821 cycles:u
0 -1 2565362 1000823218 1000823218 1000198821 instructions:u
0 -1 14453353 1000812704 1000812704 2000382283 cycles:u
0 -1 4600932 1000799342 1000799342 2000382283 instructions:u
0 -1 15245106 1000774425 1000774425 3000538255 cycles:u
0 -1 2624324 1000769310 1000769310 3000538255 instructions:u
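
The raw columns above can be post-processed by hand as well. The sketch
below parses one raw line and scales the value by enabled/running time.
It assumes ENA and RUN are the enabled and running times of the counter
and that a multiplexed count is scaled as val * ena / run, which is the
usual perf convention but is not spelled out in this mail. The helper
names parse_raw_line and scaled_value are made up for the example.

```python
# Sketch: parse one line of the raw 'perf script' stat output and scale
# the counter value for multiplexing.
# ASSUMPTION: ENA/RUN are time enabled/running and scaling is
# val * ena / run (rounded to nearest); helper names are hypothetical.


def parse_raw_line(line):
    """Split a 'CPU THREAD VAL ENA RUN TIME EVENT' line into a dict."""
    cpu, thread, val, ena, run, time, event = line.split()
    return {
        "cpu": int(cpu),
        "thread": int(thread),
        "val": int(val),
        "ena": int(ena),
        "run": int(run),
        "time": int(time),
        "event": event,
    }


def scaled_value(val, ena, run):
    """Scale val by ena/run using integer math; 0 if it never ran."""
    if run == 0:
        return 0
    return (val * ena + run // 2) // run


if __name__ == "__main__":
    rec = parse_raw_line(
        "0 -1 12302059 1000811347 1000810712 1000198821 cycles:u")
    print("%s cpu %d: raw %d scaled %d" % (
        rec["event"], rec["cpu"], rec["val"],
        scaled_value(rec["val"], rec["ena"], rec["run"])))
```

When ENA equals RUN, as in the instructions:u line above, the scaled
value is the same as the raw one.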

- To display different aggregation in report:

$ perf stat -e cycles -a -I 1000 record sleep 3
# time counts unit events
1.000223609 703,427,617 cycles
2.000443651 609,975,307 cycles
3.000569616 668,479,597 cycles
3.000735323 1,155,816 cycles

$ perf stat report
# time counts unit events
1.000223609 703,427,617 cycles
2.000443651 609,975,307 cycles
3.000569616 668,479,597 cycles
3.000735323 1,155,816 cycles

$ perf stat report --per-core
# time core cpus counts unit events
1.000223609 S0-C0 2 327,612,412 cycles
1.000223609 S0-C1 2 375,815,205 cycles
2.000443651 S0-C0 2 287,462,177 cycles
2.000443651 S0-C1 2 322,513,130 cycles
3.000569616 S0-C0 2 271,571,908 cycles
3.000569616 S0-C1 2 396,907,689 cycles
3.000735323 S0-C0 2 694,977 cycles
3.000735323 S0-C1 2 460,839 cycles

$ perf stat report --per-socket
# time socket cpus counts unit events
1.000223609 S0 4 703,427,617 cycles
2.000443651 S0 4 609,975,307 cycles
3.000569616 S0 4 668,479,597 cycles
3.000735323 S0 4 1,155,816 cycles

$ perf stat report -A
# time CPU counts unit events
1.000223609 CPU0 205,431,505 cycles
1.000223609 CPU1 122,180,907 cycles
1.000223609 CPU2 176,649,682 cycles
1.000223609 CPU3 199,165,523 cycles
2.000443651 CPU0 148,447,922 cycles
2.000443651 CPU1 139,014,255 cycles
2.000443651 CPU2 204,436,559 cycles
2.000443651 CPU3 118,076,571 cycles
3.000569616 CPU0 149,788,954 cycles
3.000569616 CPU1 121,782,954 cycles
3.000569616 CPU2 247,277,700 cycles
3.000569616 CPU3 149,629,989 cycles
3.000735323 CPU0 269,675 cycles
3.000735323 CPU1 425,302 cycles
3.000735323 CPU2 364,169 cycles
3.000735323 CPU3 96,670 cycles
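
The relation between the three report modes above can be checked with a
few lines of python. This is only a sketch: the cpu to (socket, core)
mapping is inferred from the numbers in the example (CPU0+CPU1 add up to
S0-C0, CPU2+CPU3 to S0-C1), not read from the perf.data topology, and
aggregate() is a hypothetical helper.

```python
from collections import defaultdict

# Per-CPU counts of the first interval, copied from 'perf stat report -A'.
per_cpu = {0: 205431505, 1: 122180907, 2: 176649682, 3: 199165523}

# ASSUMPTION: cpu -> (socket, core) mapping inferred from the example
# numbers, not taken from the recorded topology.
cpu_topology = {0: (0, 0), 1: (0, 0), 2: (0, 1), 3: (0, 1)}


def aggregate(counts, key):
    """Hypothetical helper: sum per-CPU counts grouped by key(cpu)."""
    totals = defaultdict(int)
    for cpu, val in counts.items():
        totals[key(cpu)] += val
    return dict(totals)


per_core = aggregate(per_cpu, lambda cpu: cpu_topology[cpu])
per_socket = aggregate(per_cpu, lambda cpu: cpu_topology[cpu][0])

if __name__ == "__main__":
    for (socket, core), val in sorted(per_core.items()):
        print("S%d-C%d %d" % (socket, core, val))
    for socket, val in sorted(per_socket.items()):
        print("S%d %d" % (socket, val))
```

The sums match the --per-core and --per-socket lines for the
1.000223609 interval.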


Cc: Andi Kleen <[email protected]>
Cc: Ulrich Drepper <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Don Zickus <[email protected]>
Tested-by: Kan Liang <[email protected]>
---
Jiri Olsa (25):
perf stat: Make stat options global
perf stat record: Add record command
perf stat record: Initialize record features
perf stat record: Synthesize stat record data
perf stat record: Store events IDs in perf data file
perf stat record: Add pipe support for record command
perf stat record: Write stat events on record
perf stat record: Write stat round events on record
perf stat record: Do not allow record with multiple runs mode
perf stat record: Synthesize event update events
perf stat report: Add report command
perf stat report: Process cpu/threads maps
perf stat report: Process stat config event
perf stat report: Add support to initialize aggr_map from file
perf stat report: Process stat and stat round events
perf stat report: Process event update events
perf stat report: Move csv_sep initialization before report command
perf stat report: Allow to override aggr_mode
perf script: Process cpu/threads maps
perf script: Process stat config event
perf script: Add process_stat/process_stat_interval scripting interface
perf script: Add stat default handlers
perf script: Display stat events by default
perf script: Add python support for stat events
perf script: Add stat-cpi.py script

tools/perf/Documentation/perf-stat.txt | 34 ++++
tools/perf/builtin-script.c | 139 +++++++++++++++
tools/perf/builtin-stat.c | 742 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------
tools/perf/scripts/python/stat-cpi.py | 74 ++++++++
tools/perf/util/evlist.c | 6 +-
tools/perf/util/evlist.h | 3 +
tools/perf/util/scripting-engines/trace-event-python.c | 114 +++++++++++-
tools/perf/util/session.c | 3 +
tools/perf/util/trace-event.h | 4 +
9 files changed, 1021 insertions(+), 98 deletions(-)
create mode 100644 tools/perf/scripts/python/stat-cpi.py


2015-11-05 14:55:52

by Jiri Olsa

Subject: [PATCH 01/25] perf stat: Make stat options global

So they can be used by the perf stat record command
in the following patch.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/builtin-stat.c | 163 +++++++++++++++++++++++-----------------------
1 file changed, 82 insertions(+), 81 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index b74ee0f2e714..e77880b5094d 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -122,6 +122,9 @@ static bool forever = false;
static struct timespec ref_time;
static struct cpu_map *aggr_map;
static aggr_get_id_t aggr_get_id;
+static bool append_file;
+static const char *output_name;
+static int output_fd;

static volatile int done = 0;

@@ -927,6 +930,67 @@ static int stat__set_big_num(const struct option *opt __maybe_unused,
return 0;
}

+static const struct option stat_options[] = {
+ OPT_BOOLEAN('T', "transaction", &transaction_run,
+ "hardware transaction statistics"),
+ OPT_CALLBACK('e', "event", &evsel_list, "event",
+ "event selector. use 'perf list' to list available events",
+ parse_events_option),
+ OPT_CALLBACK(0, "filter", &evsel_list, "filter",
+ "event filter", parse_filter),
+ OPT_BOOLEAN('i', "no-inherit", &no_inherit,
+ "child tasks do not inherit counters"),
+ OPT_STRING('p', "pid", &target.pid, "pid",
+ "stat events on existing process id"),
+ OPT_STRING('t', "tid", &target.tid, "tid",
+ "stat events on existing thread id"),
+ OPT_BOOLEAN('a', "all-cpus", &target.system_wide,
+ "system-wide collection from all CPUs"),
+ OPT_BOOLEAN('g', "group", &group,
+ "put the counters into a counter group"),
+ OPT_BOOLEAN('c', "scale", &stat_config.scale, "scale/normalize counters"),
+ OPT_INCR('v', "verbose", &verbose,
+ "be more verbose (show counter open errors, etc)"),
+ OPT_INTEGER('r', "repeat", &run_count,
+ "repeat command and print average + stddev (max: 100, forever: 0)"),
+ OPT_BOOLEAN('n', "null", &null_run,
+ "null run - dont start any counters"),
+ OPT_INCR('d', "detailed", &detailed_run,
+ "detailed run - start a lot of events"),
+ OPT_BOOLEAN('S', "sync", &sync_run,
+ "call sync() before starting a run"),
+ OPT_CALLBACK_NOOPT('B', "big-num", NULL, NULL,
+ "print large numbers with thousands\' separators",
+ stat__set_big_num),
+ OPT_STRING('C', "cpu", &target.cpu_list, "cpu",
+ "list of cpus to monitor in system-wide"),
+ OPT_SET_UINT('A', "no-aggr", &stat_config.aggr_mode,
+ "disable CPU count aggregation", AGGR_NONE),
+ OPT_STRING('x', "field-separator", &csv_sep, "separator",
+ "print counts with custom separator"),
+ OPT_CALLBACK('G', "cgroup", &evsel_list, "name",
+ "monitor event in cgroup name only", parse_cgroups),
+ OPT_STRING('o', "output", &output_name, "file", "output file name"),
+ OPT_BOOLEAN(0, "append", &append_file, "append to the output file"),
+ OPT_INTEGER(0, "log-fd", &output_fd,
+ "log output to fd, instead of stderr"),
+ OPT_STRING(0, "pre", &pre_cmd, "command",
+ "command to run prior to the measured command"),
+ OPT_STRING(0, "post", &post_cmd, "command",
+ "command to run after to the measured command"),
+ OPT_UINTEGER('I', "interval-print", &stat_config.interval,
+ "print counts at regular interval in ms (>= 10)"),
+ OPT_SET_UINT(0, "per-socket", &stat_config.aggr_mode,
+ "aggregate counts per processor socket", AGGR_SOCKET),
+ OPT_SET_UINT(0, "per-core", &stat_config.aggr_mode,
+ "aggregate counts per physical processor core", AGGR_CORE),
+ OPT_SET_UINT(0, "per-thread", &stat_config.aggr_mode,
+ "aggregate counts per thread", AGGR_THREAD),
+ OPT_UINTEGER('D', "delay", &initial_delay,
+ "ms to wait before starting measurement after program start"),
+ OPT_END()
+};
+
static int perf_stat__get_socket(struct cpu_map *map, int cpu)
{
return cpu_map__get_socket(map, cpu, NULL);
@@ -1174,69 +1238,6 @@ static int add_default_attributes(void)

int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
{
- bool append_file = false;
- int output_fd = 0;
- const char *output_name = NULL;
- const struct option options[] = {
- OPT_BOOLEAN('T', "transaction", &transaction_run,
- "hardware transaction statistics"),
- OPT_CALLBACK('e', "event", &evsel_list, "event",
- "event selector. use 'perf list' to list available events",
- parse_events_option),
- OPT_CALLBACK(0, "filter", &evsel_list, "filter",
- "event filter", parse_filter),
- OPT_BOOLEAN('i', "no-inherit", &no_inherit,
- "child tasks do not inherit counters"),
- OPT_STRING('p', "pid", &target.pid, "pid",
- "stat events on existing process id"),
- OPT_STRING('t', "tid", &target.tid, "tid",
- "stat events on existing thread id"),
- OPT_BOOLEAN('a', "all-cpus", &target.system_wide,
- "system-wide collection from all CPUs"),
- OPT_BOOLEAN('g', "group", &group,
- "put the counters into a counter group"),
- OPT_BOOLEAN('c', "scale", &stat_config.scale, "scale/normalize counters"),
- OPT_INCR('v', "verbose", &verbose,
- "be more verbose (show counter open errors, etc)"),
- OPT_INTEGER('r', "repeat", &run_count,
- "repeat command and print average + stddev (max: 100, forever: 0)"),
- OPT_BOOLEAN('n', "null", &null_run,
- "null run - dont start any counters"),
- OPT_INCR('d', "detailed", &detailed_run,
- "detailed run - start a lot of events"),
- OPT_BOOLEAN('S', "sync", &sync_run,
- "call sync() before starting a run"),
- OPT_CALLBACK_NOOPT('B', "big-num", NULL, NULL,
- "print large numbers with thousands\' separators",
- stat__set_big_num),
- OPT_STRING('C', "cpu", &target.cpu_list, "cpu",
- "list of cpus to monitor in system-wide"),
- OPT_SET_UINT('A', "no-aggr", &stat_config.aggr_mode,
- "disable CPU count aggregation", AGGR_NONE),
- OPT_STRING('x', "field-separator", &csv_sep, "separator",
- "print counts with custom separator"),
- OPT_CALLBACK('G', "cgroup", &evsel_list, "name",
- "monitor event in cgroup name only", parse_cgroups),
- OPT_STRING('o', "output", &output_name, "file", "output file name"),
- OPT_BOOLEAN(0, "append", &append_file, "append to the output file"),
- OPT_INTEGER(0, "log-fd", &output_fd,
- "log output to fd, instead of stderr"),
- OPT_STRING(0, "pre", &pre_cmd, "command",
- "command to run prior to the measured command"),
- OPT_STRING(0, "post", &post_cmd, "command",
- "command to run after to the measured command"),
- OPT_UINTEGER('I', "interval-print", &stat_config.interval,
- "print counts at regular interval in ms (>= 10)"),
- OPT_SET_UINT(0, "per-socket", &stat_config.aggr_mode,
- "aggregate counts per processor socket", AGGR_SOCKET),
- OPT_SET_UINT(0, "per-core", &stat_config.aggr_mode,
- "aggregate counts per physical processor core", AGGR_CORE),
- OPT_SET_UINT(0, "per-thread", &stat_config.aggr_mode,
- "aggregate counts per thread", AGGR_THREAD),
- OPT_UINTEGER('D', "delay", &initial_delay,
- "ms to wait before starting measurement after program start"),
- OPT_END()
- };
const char * const stat_usage[] = {
"perf stat [<options>] [<command>]",
NULL
@@ -1252,7 +1253,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
if (evsel_list == NULL)
return -ENOMEM;

- argc = parse_options(argc, argv, options, stat_usage,
+ argc = parse_options(argc, argv, stat_options, stat_usage,
PARSE_OPT_STOP_AT_NON_OPTION);

interval = stat_config.interval;
@@ -1262,14 +1263,14 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)

if (output_name && output_fd) {
fprintf(stderr, "cannot use both --output and --log-fd\n");
- parse_options_usage(stat_usage, options, "o", 1);
- parse_options_usage(NULL, options, "log-fd", 0);
+ parse_options_usage(stat_usage, stat_options, "o", 1);
+ parse_options_usage(NULL, stat_options, "log-fd", 0);
goto out;
}

if (output_fd < 0) {
fprintf(stderr, "argument to --log-fd must be a > 0\n");
- parse_options_usage(stat_usage, options, "log-fd", 0);
+ parse_options_usage(stat_usage, stat_options, "log-fd", 0);
goto out;
}

@@ -1309,8 +1310,8 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
/* User explicitly passed -B? */
if (big_num_opt == 1) {
fprintf(stderr, "-B option not supported with -x\n");
- parse_options_usage(stat_usage, options, "B", 1);
- parse_options_usage(NULL, options, "x", 1);
+ parse_options_usage(stat_usage, stat_options, "B", 1);
+ parse_options_usage(NULL, stat_options, "x", 1);
goto out;
} else /* Nope, so disable big number formatting */
big_num = false;
@@ -1318,11 +1319,11 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
big_num = false;

if (!argc && target__none(&target))
- usage_with_options(stat_usage, options);
+ usage_with_options(stat_usage, stat_options);

if (run_count < 0) {
pr_err("Run count must be a positive number\n");
- parse_options_usage(stat_usage, options, "r", 1);
+ parse_options_usage(stat_usage, stat_options, "r", 1);
goto out;
} else if (run_count == 0) {
forever = true;
@@ -1332,8 +1333,8 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
if ((stat_config.aggr_mode == AGGR_THREAD) && !target__has_task(&target)) {
fprintf(stderr, "The --per-thread option is only available "
"when monitoring via -p -t options.\n");
- parse_options_usage(NULL, options, "p", 1);
- parse_options_usage(NULL, options, "t", 1);
+ parse_options_usage(NULL, stat_options, "p", 1);
+ parse_options_usage(NULL, stat_options, "t", 1);
goto out;
}

@@ -1347,9 +1348,9 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
fprintf(stderr, "both cgroup and no-aggregation "
"modes only available in system-wide mode\n");

- parse_options_usage(stat_usage, options, "G", 1);
- parse_options_usage(NULL, options, "A", 1);
- parse_options_usage(NULL, options, "a", 1);
+ parse_options_usage(stat_usage, stat_options, "G", 1);
+ parse_options_usage(NULL, stat_options, "A", 1);
+ parse_options_usage(NULL, stat_options, "a", 1);
goto out;
}

@@ -1361,12 +1362,12 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
if (perf_evlist__create_maps(evsel_list, &target) < 0) {
if (target__has_task(&target)) {
pr_err("Problems finding threads of monitor\n");
- parse_options_usage(stat_usage, options, "p", 1);
- parse_options_usage(NULL, options, "t", 1);
+ parse_options_usage(stat_usage, stat_options, "p", 1);
+ parse_options_usage(NULL, stat_options, "t", 1);
} else if (target__has_cpu(&target)) {
perror("failed to parse CPUs map");
- parse_options_usage(stat_usage, options, "C", 1);
- parse_options_usage(NULL, options, "a", 1);
+ parse_options_usage(stat_usage, stat_options, "C", 1);
+ parse_options_usage(NULL, stat_options, "a", 1);
}
goto out;
}
@@ -1381,7 +1382,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
if (interval && interval < 100) {
if (interval < 10) {
pr_err("print interval must be >= 10ms\n");
- parse_options_usage(stat_usage, options, "I", 1);
+ parse_options_usage(stat_usage, stat_options, "I", 1);
goto out;
} else
pr_warning("print interval < 100ms. "
--
2.4.3

2015-11-05 14:41:21

by Jiri Olsa

Subject: [PATCH 02/25] perf stat record: Add record command

Add 'perf stat record' command support. It creates a simple
(header-only) perf.data file at the moment.

The record command can be specified anywhere among the stat
options. All stat command options are valid for the stat record
command, with the exception of the '-o' option: if specified for
the record command, it denotes the perf data file name.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/Documentation/perf-stat.txt | 12 ++++++
tools/perf/builtin-stat.c | 78 ++++++++++++++++++++++++++++++++--
2 files changed, 87 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 4e074a660826..70eee1c2c444 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -10,6 +10,7 @@ SYNOPSIS
[verse]
'perf stat' [-e <EVENT> | --event=EVENT] [-a] <command>
'perf stat' [-e <EVENT> | --event=EVENT] [-a] -- <command> [<options>]
+'perf stat' [-e <EVENT> | --event=EVENT] [-a] record [-o file] -- <command> [<options>]

DESCRIPTION
-----------
@@ -22,6 +23,8 @@ OPTIONS
<command>...::
Any command you can specify in a shell.

+record::
+ See STAT RECORD.

-e::
--event=::
@@ -159,6 +162,15 @@ filter out the startup phase of the program, which is often very different.

Print statistics of transactional execution if supported.

+STAT RECORD
+-----------
+Stores stat data into perf data file.
+
+-o file::
+--output file::
+Output file name.
+
+
EXAMPLES
--------

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index e77880b5094d..04123835fd81 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -59,6 +59,7 @@
#include "util/thread.h"
#include "util/thread_map.h"
#include "util/counts.h"
+#include "util/session.h"

#include <stdlib.h>
#include <sys/prctl.h>
@@ -126,6 +127,16 @@ static bool append_file;
static const char *output_name;
static int output_fd;

+struct perf_stat {
+ bool record;
+ struct perf_data_file file;
+ struct perf_session *session;
+ u64 bytes_written;
+};
+
+static struct perf_stat perf_stat;
+#define STAT_RECORD perf_stat.record
+
static volatile int done = 0;

static struct perf_stat_config stat_config = {
@@ -344,6 +355,15 @@ static int __run_perf_stat(int argc, const char **argv)
return -1;
}

+ if (STAT_RECORD) {
+ int err, fd = perf_data_file__fd(&perf_stat.file);
+
+ err = perf_session__write_header(perf_stat.session, evsel_list,
+ fd, false);
+ if (err < 0)
+ return err;
+ }
+
/*
* Enable counters and exec the command:
*/
@@ -1236,6 +1256,38 @@ static int add_default_attributes(void)
return perf_evlist__add_default_attrs(evsel_list, very_very_detailed_attrs);
}

+static const char * const recort_usage[] = {
+ "perf stat record [<options>]",
+ NULL,
+};
+
+static int __cmd_record(int argc, const char **argv)
+{
+ struct perf_session *session;
+ struct perf_data_file *file = &perf_stat.file;
+
+ argc = parse_options(argc, argv, stat_options, recort_usage,
+ PARSE_OPT_STOP_AT_NON_OPTION);
+
+ if (output_name)
+ file->path = output_name;
+
+ session = perf_session__new(file, false, NULL);
+ if (session == NULL) {
+ pr_err("Perf session creation failed.\n");
+ return -1;
+ }
+
+ /* No pipe support ATM */
+ if (perf_stat.file.is_pipe)
+ return -EINVAL;
+
+ session->evlist = evsel_list;
+ perf_stat.session = session;
+ perf_stat.record = true;
+ return argc;
+}
+
int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
{
const char * const stat_usage[] = {
@@ -1246,6 +1298,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
const char *mode;
FILE *output = stderr;
unsigned int interval;
+ const char * const stat_subcommands[] = { "record" };

setlocale(LC_ALL, "");

@@ -1253,12 +1306,22 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
if (evsel_list == NULL)
return -ENOMEM;

- argc = parse_options(argc, argv, stat_options, stat_usage,
- PARSE_OPT_STOP_AT_NON_OPTION);
+ argc = parse_options_subcommand(argc, argv, stat_options, stat_subcommands,
+ (const char **) stat_usage,
+ PARSE_OPT_STOP_AT_NON_OPTION);
+
+ if (argc && !strncmp(argv[0], "rec", 3)) {
+ argc = __cmd_record(argc, argv);
+ if (argc < 0)
+ return -1;
+ }

interval = stat_config.interval;

- if (output_name && strcmp(output_name, "-"))
+ /*
+ * For record command the -o is already taken care of.
+ */
+ if (!STAT_RECORD && output_name && strcmp(output_name, "-"))
output = NULL;

if (output_name && output_fd) {
@@ -1425,6 +1488,15 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
if (!forever && status != -1 && !interval)
print_counters(NULL, argc, argv);

+ if (STAT_RECORD) {
+ int fd = perf_data_file__fd(&perf_stat.file);
+
+ perf_stat.session->header.data_size += perf_stat.bytes_written;
+ perf_session__write_header(perf_stat.session, evsel_list, fd, true);
+
+ perf_session__delete(perf_stat.session);
+ }
+
perf_evlist__free_stats(evsel_list);
out:
perf_evlist__delete(evsel_list);
--
2.4.3

2015-11-05 14:55:27

by Jiri Olsa

Subject: [PATCH 03/25] perf stat record: Initialize record features

Disable all non-stat related features.

Also, as we now enable the STAT feature in the data file,
add code instructing session open to skip sample type
checking for stat data files.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/builtin-stat.c | 15 +++++++++++++++
tools/perf/util/session.c | 3 +++
2 files changed, 18 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 04123835fd81..2abf45d67ff2 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1261,6 +1261,19 @@ static const char * const recort_usage[] = {
NULL,
};

+static void init_features(struct perf_session *session)
+{
+ int feat;
+
+ for (feat = HEADER_FIRST_FEATURE; feat < HEADER_LAST_FEATURE; feat++)
+ perf_header__set_feat(&session->header, feat);
+
+ perf_header__clear_feat(&session->header, HEADER_BUILD_ID);
+ perf_header__clear_feat(&session->header, HEADER_TRACING_DATA);
+ perf_header__clear_feat(&session->header, HEADER_BRANCH_STACK);
+ perf_header__clear_feat(&session->header, HEADER_AUXTRACE);
+}
+
static int __cmd_record(int argc, const char **argv)
{
struct perf_session *session;
@@ -1282,6 +1295,8 @@ static int __cmd_record(int argc, const char **argv)
if (perf_stat.file.is_pipe)
return -EINVAL;

+ init_features(session);
+
session->evlist = evsel_list;
perf_stat.session = session;
perf_stat.record = true;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index cc3bda2fc78b..dab5277a71c4 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -37,6 +37,9 @@ static int perf_session__open(struct perf_session *session)
if (perf_data_file__is_pipe(file))
return 0;

+ if (perf_header__has_feat(&session->header, HEADER_STAT))
+ return 0;
+
if (!perf_evlist__valid_sample_type(session->evlist)) {
pr_err("non matching sample_type");
return -1;
--
2.4.3

2015-11-05 14:41:25

by Jiri Olsa

Subject: [PATCH 04/25] perf stat record: Synthesize stat record data

Synthesizing needed stat record data for report/script:
- cpu/thread maps
- stat config

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/builtin-stat.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 47 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 2abf45d67ff2..8c24e88afd3c 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -196,6 +196,20 @@ static inline int nsec_counter(struct perf_evsel *evsel)
return 0;
}

+static int process_synthesized_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_sample *sample __maybe_unused,
+ struct machine *machine __maybe_unused)
+{
+ if (perf_data_file__write(&perf_stat.file, event, event->header.size) < 0) {
+ pr_err("failed to write perf data, error: %m\n");
+ return -1;
+ }
+
+ perf_stat.bytes_written += event->header.size;
+ return 0;
+}
+
/*
* Read out the results of a single counter:
* do not aggregate counts across CPUs in system-wide mode
@@ -282,6 +296,35 @@ static void workload_exec_failed_signal(int signo __maybe_unused, siginfo_t *inf
workload_exec_errno = info->si_value.sival_int;
}

+static int perf_stat_synthesize_config(void)
+{
+ int err;
+
+ err = perf_event__synthesize_thread_map2(NULL, evsel_list->threads,
+ process_synthesized_event,
+ NULL);
+ if (err < 0) {
+ pr_err("Couldn't synthesize thread map.\n");
+ return err;
+ }
+
+ err = perf_event__synthesize_cpu_map(NULL, evsel_list->cpus,
+ process_synthesized_event, NULL);
+ if (err < 0) {
+ pr_err("Couldn't synthesize cpu map.\n");
+ return err;
+ }
+
+ err = perf_event__synthesize_stat_config(NULL, &stat_config,
+ process_synthesized_event, NULL);
+ if (err < 0) {
+ pr_err("Couldn't synthesize config.\n");
+ return err;
+ }
+
+ return 0;
+}
+
static int __run_perf_stat(int argc, const char **argv)
{
int interval = stat_config.interval;
@@ -362,6 +405,10 @@ static int __run_perf_stat(int argc, const char **argv)
fd, false);
if (err < 0)
return err;
+
+ err = perf_stat_synthesize_config();
+ if (err < 0)
+ return err;
}

/*
--
2.4.3

2015-11-05 14:41:27

by Jiri Olsa

Subject: [PATCH 05/25] perf stat record: Store events IDs in perf data file

Store event IDs in the evlist object so they get stored
in the perf.data file.

Also make perf_evlist__id_add_fd global.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/builtin-stat.c | 35 +++++++++++++++++++++++++++++++++++
tools/perf/util/evlist.c | 6 +++---
tools/perf/util/evlist.h | 3 +++
3 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 8c24e88afd3c..2a15e491138b 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -325,6 +325,38 @@ static int perf_stat_synthesize_config(void)
return 0;
}

+#define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
+
+static int __store_counter_ids(struct perf_evsel *counter,
+ struct cpu_map *cpus,
+ struct thread_map *threads)
+{
+ int cpu, thread;
+
+ for (cpu = 0; cpu < cpus->nr; cpu++) {
+ for (thread = 0; thread < threads->nr; thread++) {
+ int fd = FD(counter, cpu, thread);
+
+ if (perf_evlist__id_add_fd(evsel_list, counter,
+ cpu, thread, fd) < 0)
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
+static int store_counter_ids(struct perf_evsel *counter)
+{
+ struct cpu_map *cpus = counter->cpus;
+ struct thread_map *threads = counter->threads;
+
+ if (perf_evsel__alloc_id(counter, cpus->nr, threads->nr))
+ return -ENOMEM;
+
+ return __store_counter_ids(counter, cpus, threads);
+}
+
static int __run_perf_stat(int argc, const char **argv)
{
int interval = stat_config.interval;
@@ -389,6 +421,9 @@ static int __run_perf_stat(int argc, const char **argv)
l = strlen(counter->unit);
if (l > unit_width)
unit_width = l;
+
+ if (STAT_RECORD && store_counter_ids(counter))
+ return -1;
}

if (perf_evlist__apply_filters(evsel_list, &counter)) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index d1392194a9a9..8069a6a588df 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -550,9 +550,9 @@ void perf_evlist__id_add(struct perf_evlist *evlist, struct perf_evsel *evsel,
evsel->id[evsel->ids++] = id;
}

-static int perf_evlist__id_add_fd(struct perf_evlist *evlist,
- struct perf_evsel *evsel,
- int cpu, int thread, int fd)
+int perf_evlist__id_add_fd(struct perf_evlist *evlist,
+ struct perf_evsel *evsel,
+ int cpu, int thread, int fd)
{
u64 read_data[4] = { 0, };
int id_idx = 1; /* The first entry is the counter value */
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index a459fe71b452..139a50038097 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -97,6 +97,9 @@ perf_evlist__find_tracepoint_by_name(struct perf_evlist *evlist,

void perf_evlist__id_add(struct perf_evlist *evlist, struct perf_evsel *evsel,
int cpu, int thread, u64 id);
+int perf_evlist__id_add_fd(struct perf_evlist *evlist,
+ struct perf_evsel *evsel,
+ int cpu, int thread, int fd);

int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd);
int perf_evlist__alloc_pollfd(struct perf_evlist *evlist);
--
2.4.3

2015-11-05 14:41:29

by Jiri Olsa

Subject: [PATCH 06/25] perf stat record: Add pipe support for record command

Allow storing stat record data into a pipe, so report
tools (report/script) can read data directly from
record.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/builtin-stat.c | 39 ++++++++++++++++++++++++++++-----------
1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 2a15e491138b..c8c0acb0c2cd 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -296,10 +296,19 @@ static void workload_exec_failed_signal(int signo __maybe_unused, siginfo_t *inf
workload_exec_errno = info->si_value.sival_int;
}

-static int perf_stat_synthesize_config(void)
+static int perf_stat_synthesize_config(bool is_pipe)
{
int err;

+ if (is_pipe) {
+ err = perf_event__synthesize_attrs(NULL, perf_stat.session,
+ process_synthesized_event);
+ if (err < 0) {
+ pr_err("Couldn't synthesize attrs.\n");
+ return err;
+ }
+ }
+
err = perf_event__synthesize_thread_map2(NULL, evsel_list->threads,
process_synthesized_event,
NULL);
@@ -367,6 +376,7 @@ static int __run_perf_stat(int argc, const char **argv)
size_t l;
int status = 0;
const bool forks = (argc > 0);
+ bool is_pipe = STAT_RECORD ? perf_stat.file.is_pipe : false;

if (interval) {
ts.tv_sec = interval / 1000;
@@ -377,7 +387,7 @@ static int __run_perf_stat(int argc, const char **argv)
}

if (forks) {
- if (perf_evlist__prepare_workload(evsel_list, &target, argv, false,
+ if (perf_evlist__prepare_workload(evsel_list, &target, argv, is_pipe,
workload_exec_failed_signal) < 0) {
perror("failed to prepare workload");
return -1;
@@ -436,12 +446,17 @@ static int __run_perf_stat(int argc, const char **argv)
if (STAT_RECORD) {
int err, fd = perf_data_file__fd(&perf_stat.file);

- err = perf_session__write_header(perf_stat.session, evsel_list,
- fd, false);
+ if (is_pipe) {
+ err = perf_header__write_pipe(perf_data_file__fd(&perf_stat.file));
+ } else {
+ err = perf_session__write_header(perf_stat.session, evsel_list,
+ fd, false);
+ }
+
if (err < 0)
return err;

- err = perf_stat_synthesize_config();
+ err = perf_stat_synthesize_config(is_pipe);
if (err < 0)
return err;
}
@@ -949,6 +964,10 @@ static void print_counters(struct timespec *ts, int argc, const char **argv)
struct perf_evsel *counter;
char buf[64], *prefix = NULL;

+ /* Do not print anything if we record to the pipe. */
+ if (STAT_RECORD && perf_stat.file.is_pipe)
+ return;
+
if (interval)
print_interval(prefix = buf, ts);
else
@@ -1373,10 +1392,6 @@ static int __cmd_record(int argc, const char **argv)
return -1;
}

- /* No pipe support ATM */
- if (perf_stat.file.is_pipe)
- return -EINVAL;
-
init_features(session);

session->evlist = evsel_list;
@@ -1588,8 +1603,10 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
if (STAT_RECORD) {
int fd = perf_data_file__fd(&perf_stat.file);

- perf_stat.session->header.data_size += perf_stat.bytes_written;
- perf_session__write_header(perf_stat.session, evsel_list, fd, true);
+ if (!perf_stat.file.is_pipe) {
+ perf_stat.session->header.data_size += perf_stat.bytes_written;
+ perf_session__write_header(perf_stat.session, evsel_list, fd, true);
+ }

perf_session__delete(perf_stat.session);
}
--
2.4.3

2015-11-05 14:41:34

by Jiri Olsa

Subject: [PATCH 07/25] perf stat record: Write stat events on record

Writing stat events on 'perf stat record' at the time
we read counter values from the kernel.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/builtin-stat.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index c8c0acb0c2cd..1567c5337941 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -210,6 +210,18 @@ static int process_synthesized_event(struct perf_tool *tool __maybe_unused,
return 0;
}

+#define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
+
+static int
+perf_evsel__write_stat_event(struct perf_evsel *counter, u32 cpu, u32 thread,
+ struct perf_counts_values *count)
+{
+ struct perf_sample_id *sid = SID(counter, cpu, thread);
+
+ return perf_event__synthesize_stat(NULL, cpu, thread, sid->id, count,
+ process_synthesized_event, NULL);
+}
+
/*
* Read out the results of a single counter:
* do not aggregate counts across CPUs in system-wide mode
@@ -233,6 +245,13 @@ static int read_counter(struct perf_evsel *counter)
count = perf_counts(counter->counts, cpu, thread);
if (perf_evsel__read(counter, cpu, thread, count))
return -1;
+
+ if (STAT_RECORD) {
+ if (perf_evsel__write_stat_event(counter, cpu, thread, count)) {
+ pr_err("failed to write stat event\n");
+ return -1;
+ }
+ }
}
}

--
2.4.3

2015-11-05 14:54:22

by Jiri Olsa

Subject: [PATCH 08/25] perf stat record: Write stat round events on record

Writing stat round events on 'perf stat record' for
each interval round. In non-interval mode we store
the round event after the last stat event.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
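A note on the WRITE_STAT_ROUND_EVENT macro added below: callers
pass a short INTERVAL/FINAL name and the preprocessor pastes it
onto the enum prefix. A minimal standalone sketch of that token
pasting, assuming a made-up round_event struct and ROUND_TYPE__*
enum in place of the perf ones (the numeric values here are an
assumption as well):

```c
#include <stdint.h>

/* Hypothetical mirror of the round types referenced by the patch. */
enum round_type {
	ROUND_TYPE__INTERVAL = 0,
	ROUND_TYPE__FINAL    = 1,
};

/* Hypothetical event payload: timestamp in nanoseconds plus round type. */
struct round_event {
	uint64_t time;
	uint64_t type;
};

/* Stand-in for perf_event__synthesize_stat_round(): just fills the event. */
static int write_round_event(struct round_event *ev, uint64_t time, uint64_t type)
{
	ev->time = time;
	ev->type = type;
	return 0;
}

/* The short name is pasted onto the enum prefix at preprocessing time. */
#define WRITE_ROUND_EVENT(ev, time, kind) \
	write_round_event(ev, time, ROUND_TYPE__ ## kind)
```

A misspelled kind fails at compile time, since the pasted
identifier does not exist.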
tools/perf/builtin-stat.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 1567c5337941..16f6fe2f3435 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -210,6 +210,16 @@ static int process_synthesized_event(struct perf_tool *tool __maybe_unused,
return 0;
}

+static int write_stat_round_event(u64 time, u64 type)
+{
+ return perf_event__synthesize_stat_round(NULL, time, type,
+ process_synthesized_event,
+ NULL);
+}
+
+#define WRITE_STAT_ROUND_EVENT(time, interval) \
+ write_stat_round_event(time, PERF_STAT_ROUND_TYPE__ ## interval)
+
#define SID(e, x, y) xyarray__entry(e->sample_id, x, y)

static int
@@ -285,6 +295,11 @@ static void process_interval(void)
clock_gettime(CLOCK_MONOTONIC, &ts);
diff_timespec(&rs, &ts, &ref_time);

+ if (STAT_RECORD) {
+ if (WRITE_STAT_ROUND_EVENT(rs.tv_sec * NSECS_PER_SEC + rs.tv_nsec, INTERVAL))
+ pr_err("failed to write stat round event\n");
+ }
+
print_counters(&rs, 0, NULL);
}

@@ -1622,6 +1637,11 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
if (STAT_RECORD) {
int fd = perf_data_file__fd(&perf_stat.file);

+ if (!interval) {
+ if (WRITE_STAT_ROUND_EVENT(walltime_nsecs_stats.max, FINAL))
+ pr_err("failed to write stat round event\n");
+ }
+
if (!perf_stat.file.is_pipe) {
perf_stat.session->header.data_size += perf_stat.bytes_written;
perf_session__write_header(perf_stat.session, evsel_list, fd, true);
--
2.4.3

2015-11-05 14:41:41

by Jiri Olsa

Subject: [PATCH 09/25] perf stat record: Do not allow record with multiple runs mode

We currently don't support storing multiple sessions in perf.data,
so we can't allow the -r option in stat record.

$ perf stat -e cycles -r 2 record ls
Cannot use -r option with perf stat record.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/builtin-stat.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 16f6fe2f3435..26188c20f427 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1420,6 +1420,11 @@ static int __cmd_record(int argc, const char **argv)
if (output_name)
file->path = output_name;

+ if (run_count != 1 || forever) {
+ pr_err("Cannot use -r option with perf stat record.\n");
+ return -1;
+ }
+
session = perf_session__new(file, false, NULL);
if (session == NULL) {
pr_err("Perf session creation failed.\n");
--
2.4.3

2015-11-05 14:41:39

by Jiri Olsa

Subject: [PATCH 10/25] perf stat record: Synthesize event update events

Synthesize the other event details not carried within
the attr event - unit, scale, cpus, name.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
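A note on which update events get synthesized below: unit and
scale only when they carry information, cpus only when the event
has its own cpu list, and the name only for pipe output. A minimal
sketch of that decision as a flag mask, assuming a trimmed-down
counter_info struct and UPDATE_* flags that are made up for the
illustration:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down counter: only the fields the checks look at. */
struct counter_info {
	const char *unit;
	double scale;
	bool has_own_cpus;
};

/* Hypothetical flags, one per kind of update event. */
enum update_flag {
	UPDATE_UNIT  = 1 << 0,
	UPDATE_SCALE = 1 << 1,
	UPDATE_CPUS  = 1 << 2,
	UPDATE_NAME  = 1 << 3,
};

/* A unit counts only if the string is present and non-empty. */
static bool has_unit(const struct counter_info *c)
{
	return c->unit && *c->unit;
}

/* A scale of exactly 1 is the default and needs no update event. */
static bool has_scale(const struct counter_info *c)
{
	return c->scale != 1;
}

/* Collect the update events this counter needs for the given output. */
static unsigned int updates_needed(const struct counter_info *c, bool is_pipe)
{
	unsigned int flags = 0;

	if (has_unit(c))
		flags |= UPDATE_UNIT;
	if (has_scale(c))
		flags |= UPDATE_SCALE;
	if (c->has_own_cpus)
		flags |= UPDATE_CPUS;
	if (is_pipe)
		flags |= UPDATE_NAME;

	return flags;
}
```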
tools/perf/builtin-stat.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 59 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 26188c20f427..5d34544e3079 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -330,8 +330,19 @@ static void workload_exec_failed_signal(int signo __maybe_unused, siginfo_t *inf
workload_exec_errno = info->si_value.sival_int;
}

+static bool has_unit(struct perf_evsel *counter)
+{
+ return counter->unit && *counter->unit;
+}
+
+static bool has_scale(struct perf_evsel *counter)
+{
+ return counter->scale != 1;
+}
+
static int perf_stat_synthesize_config(bool is_pipe)
{
+ struct perf_evsel *counter;
int err;

if (is_pipe) {
@@ -343,6 +354,54 @@ static int perf_stat_synthesize_config(bool is_pipe)
}
}

+ /*
+ * Synthesize other events stuff not carried within
+ * attr event - unit, scale, name
+ */
+ evlist__for_each(evsel_list, counter) {
+ if (!counter->supported)
+ continue;
+
+ /*
+ * Synthesize unit and scale only if it's defined.
+ */
+ if (has_unit(counter)) {
+ err = perf_event__synthesize_event_update_unit(NULL, counter, process_synthesized_event);
+ if (err < 0) {
+ pr_err("Couldn't synthesize evsel unit.\n");
+ return err;
+ }
+ }
+
+ if (has_scale(counter)) {
+ err = perf_event__synthesize_event_update_scale(NULL, counter, process_synthesized_event);
+ if (err < 0) {
+ pr_err("Couldn't synthesize evsel scale.\n");
+ return err;
+ }
+ }
+
+ if (counter->own_cpus) {
+ err = perf_event__synthesize_event_update_cpus(NULL, counter, process_synthesized_event);
+ if (err < 0) {
+ pr_err("Couldn't synthesize evsel cpus.\n");
+ return err;
+ }
+ }
+
+ /*
+ * Name is needed only for pipe output,
+ * perf.data carries event names.
+ */
+ if (is_pipe) {
+ err = perf_event__synthesize_event_update_name(NULL, counter, process_synthesized_event);
+ if (err < 0) {
+ pr_err("Couldn't synthesize evsel name.\n");
+ return err;
+ }
+ }
+ }
+
err = perf_event__synthesize_thread_map2(NULL, evsel_list->threads,
process_synthesized_event,
NULL);
--
2.4.3

2015-11-05 14:53:50

by Jiri Olsa

Subject: [PATCH 11/25] perf stat report: Add report command

Adding 'perf stat report' command support. ATM it only
processes attr events and displays nothing.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
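A note on how __cmd_report picks its input when no -i is given:
it reads from stdin if stdin is a FIFO, otherwise from perf.data.
A minimal sketch of that selection, assuming a standalone helper
(default_input_name is a hypothetical name) that takes the fd to
probe as a parameter instead of hardcoding STDIN_FILENO:

```c
#define _POSIX_C_SOURCE 200809L

#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Return the input to read from: an explicit non-empty name wins,
 * then "-" if fd refers to a FIFO, else the default file name.
 */
static const char *default_input_name(const char *requested, int fd)
{
	struct stat st;

	if (requested && strlen(requested))
		return requested;

	if (!fstat(fd, &st) && S_ISFIFO(st.st_mode))
		return "-";

	return "perf.data";
}
```

If fstat fails (a closed descriptor, for example), the helper
falls back to the perf.data name, same as the code below.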
tools/perf/Documentation/perf-stat.txt | 12 +++++++
tools/perf/builtin-stat.c | 61 +++++++++++++++++++++++++++++++---
2 files changed, 69 insertions(+), 4 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 70eee1c2c444..95f492828657 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -11,6 +11,7 @@ SYNOPSIS
'perf stat' [-e <EVENT> | --event=EVENT] [-a] <command>
'perf stat' [-e <EVENT> | --event=EVENT] [-a] -- <command> [<options>]
'perf stat' [-e <EVENT> | --event=EVENT] [-a] record [-o file] -- <command> [<options>]
+'perf stat' report [-i file]

DESCRIPTION
-----------
@@ -26,6 +27,9 @@ OPTIONS
record::
See STAT RECORD.

+report::
+ See STAT REPORT.
+
-e::
--event=::
Select the PMU event. Selection can be:
@@ -170,6 +174,14 @@ Stores stat data into perf data file.
--output file::
Output file name.

+STAT REPORT
+-----------
+Reads and reports stat data from perf data file.
+
+-i file::
+--input file::
+Input file name.
+

EXAMPLES
--------
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 5d34544e3079..e921ad542846 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -60,6 +60,8 @@
#include "util/thread_map.h"
#include "util/counts.h"
#include "util/session.h"
+#include "util/tool.h"
+#include "asm/bug.h"

#include <stdlib.h>
#include <sys/prctl.h>
@@ -132,6 +134,7 @@ struct perf_stat {
struct perf_data_file file;
struct perf_session *session;
u64 bytes_written;
+ struct perf_tool tool;
};

static struct perf_stat perf_stat;
@@ -1020,8 +1023,8 @@ static void print_header(int argc, const char **argv)
else if (target.cpu_list)
fprintf(output, "\'CPU(s) %s", target.cpu_list);
else if (!target__has_task(&target)) {
- fprintf(output, "\'%s", argv[0]);
- for (i = 1; i < argc; i++)
+ fprintf(output, "\'%s", argv ? argv[0] : "pipe");
+ for (i = 1; argv && (i < argc); i++)
fprintf(output, " %s", argv[i]);
} else if (target.pid)
fprintf(output, "process id \'%s", target.pid);
@@ -1498,6 +1501,55 @@ static int __cmd_record(int argc, const char **argv)
return argc;
}

+static const char * const report_usage[] = {
+ "perf stat report [<options>]",
+ NULL,
+};
+
+static struct perf_stat perf_stat = {
+ .tool = {
+ .attr = perf_event__process_attr,
+ },
+};
+
+static int __cmd_report(int argc, const char **argv)
+{
+ struct perf_session *session;
+ const struct option options[] = {
+ OPT_STRING('i', "input", &input_name, "file", "input file name"),
+ OPT_END()
+ };
+ struct stat st;
+ int ret;
+
+ argc = parse_options(argc, argv, options, report_usage, 0);
+
+ if (!input_name || !strlen(input_name)) {
+ if (!fstat(STDIN_FILENO, &st) && S_ISFIFO(st.st_mode))
+ input_name = "-";
+ else
+ input_name = "perf.data";
+ }
+
+ perf_stat.file.path = input_name;
+ perf_stat.file.mode = PERF_DATA_MODE_READ;
+
+ session = perf_session__new(&perf_stat.file, false, &perf_stat.tool);
+ if (session == NULL)
+ return -1;
+
+ perf_stat.session = session;
+ stat_config.output = stderr;
+ evsel_list = session->evlist;
+
+ ret = perf_session__process_events(session);
+ if (ret)
+ return ret;
+
+ perf_session__delete(session);
+ return 0;
+}
+
int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
{
const char * const stat_usage[] = {
@@ -1508,7 +1560,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
const char *mode;
FILE *output = stderr;
unsigned int interval;
- const char * const stat_subcommands[] = { "record" };
+ const char * const stat_subcommands[] = { "record", "report" };

setlocale(LC_ALL, "");

@@ -1524,7 +1576,8 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
argc = __cmd_record(argc, argv);
if (argc < 0)
return -1;
- }
+ } else if (argc && !strncmp(argv[0], "rep", 3))
+ return __cmd_report(argc, argv);

interval = stat_config.interval;

--
2.4.3

2015-11-05 14:52:51

by Jiri Olsa

Subject: [PATCH 12/25] perf stat report: Process cpu/threads maps

Adding processing of cpu/threads maps. Configuring session's
evlist with these maps.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
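A note on set_maps below: the cpu and thread map events can come
in either order, so the evlist is configured only once both maps
are known, and extra map events are ignored. A minimal sketch of
that "wait for both, set up once" logic, assuming plain int arrays
in place of the map objects; report_state, on_cpu_map and
on_thread_map are hypothetical names:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state: each pointer stays NULL until its event arrives. */
struct report_state {
	const int *cpus;
	const int *threads;
	bool maps_allocated;
	int nr_setups;		/* how many times the setup step really ran */
};

/* Run the setup step only once both maps are present. */
static int set_maps(struct report_state *st)
{
	if (!st->cpus || !st->threads)
		return 0;

	if (st->maps_allocated)
		return -EINVAL;

	st->nr_setups++;
	st->maps_allocated = true;
	return 0;
}

/* First thread map wins, later ones are ignored. */
static int on_thread_map(struct report_state *st, const int *threads)
{
	if (st->threads)
		return 0;

	st->threads = threads;
	return set_maps(st);
}

/* First cpu map wins, later ones are ignored. */
static int on_cpu_map(struct report_state *st, const int *cpus)
{
	if (st->cpus)
		return 0;

	st->cpus = cpus;
	return set_maps(st);
}
```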
tools/perf/builtin-stat.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 62 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index e921ad542846..858c6837a1e3 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -135,6 +135,9 @@ struct perf_stat {
struct perf_session *session;
u64 bytes_written;
struct perf_tool tool;
+ bool maps_allocated;
+ struct cpu_map *cpus;
+ struct thread_map *threads;
};

static struct perf_stat perf_stat;
@@ -1501,6 +1504,63 @@ static int __cmd_record(int argc, const char **argv)
return argc;
}

+static int set_maps(struct perf_stat *stat)
+{
+ if (!stat->cpus || !stat->threads)
+ return 0;
+
+ if (WARN_ONCE(stat->maps_allocated, "stats double allocation\n"))
+ return -EINVAL;
+
+ perf_evlist__set_maps(evsel_list, stat->cpus, stat->threads);
+
+ if (perf_evlist__alloc_stats(evsel_list, true))
+ return -ENOMEM;
+
+ stat->maps_allocated = true;
+ return 0;
+}
+
+static
+int process_thread_map_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_session *session __maybe_unused)
+{
+ struct perf_stat *stat = container_of(tool, struct perf_stat, tool);
+
+ if (stat->threads) {
+ pr_warning("Extra thread map event, ignoring.\n");
+ return 0;
+ }
+
+ stat->threads = thread_map__new_event(&event->thread_map);
+ if (!stat->threads)
+ return -ENOMEM;
+
+ return set_maps(stat);
+}
+
+static
+int process_cpu_map_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_session *session __maybe_unused)
+{
+ struct perf_stat *stat = container_of(tool, struct perf_stat, tool);
+ struct cpu_map *cpus;
+
+ if (stat->cpus) {
+ pr_warning("Extra cpu map event, ignoring.\n");
+ return 0;
+ }
+
+ cpus = cpu_map__new_data(&event->cpu_map.data);
+ if (!cpus)
+ return -ENOMEM;
+
+ stat->cpus = cpus;
+ return set_maps(stat);
+}
+
static const char * const report_usage[] = {
"perf stat report [<options>]",
NULL,
@@ -1509,6 +1569,8 @@ static const char * const report_usage[] = {
static struct perf_stat perf_stat = {
.tool = {
.attr = perf_event__process_attr,
+ .thread_map = process_thread_map_event,
+ .cpu_map = process_cpu_map_event,
},
};

--
2.4.3

2015-11-05 14:41:45

by Jiri Olsa

Subject: [PATCH 13/25] perf stat report: Process stat config event

Adding processing of the stat config event and initializing
the stat_config object from it.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/builtin-stat.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 858c6837a1e3..e50b5c7ba4f5 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1504,6 +1504,15 @@ static int __cmd_record(int argc, const char **argv)
return argc;
}

+static
+int process_stat_config_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_session *session __maybe_unused)
+{
+ perf_event__read_stat_config(&stat_config, &event->stat_config);
+ return 0;
+}
+
static int set_maps(struct perf_stat *stat)
{
if (!stat->cpus || !stat->threads)
@@ -1571,6 +1580,7 @@ static struct perf_stat perf_stat = {
.attr = perf_event__process_attr,
.thread_map = process_thread_map_event,
.cpu_map = process_cpu_map_event,
+ .stat_config = process_stat_config_event,
},
};

--
2.4.3

2015-11-05 14:41:47

by Jiri Olsa

Subject: [PATCH 14/25] perf stat report: Add support to initialize aggr_map from file

Using perf.data's perf_env data to initialize
aggregate config.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
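A note on the core id built in perf_env__get_core below: core_id
is relative to its socket, so the socket number goes into the
upper 16 bits to get an id that is unique across sockets. A
minimal sketch of that packing; the decode helpers are
hypothetical and only there to show the layout:

```c
/* Pack socket and per-socket core id into one global id, as in the patch. */
static int encode_core(int socket, int core)
{
	return (socket << 16) | (core & 0xffff);
}

/* Hypothetical helper: socket number from the upper 16 bits. */
static int decoded_socket(int id)
{
	return id >> 16;
}

/* Hypothetical helper: per-socket core id from the lower 16 bits. */
static int decoded_core(int id)
{
	return id & 0xffff;
}
```

With this layout core 3 on socket 0 and core 3 on socket 1 map to
different ids (3 and 0x10003).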
tools/perf/builtin-stat.c | 103 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 103 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index e50b5c7ba4f5..cf9ea08c66df 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1297,6 +1297,101 @@ static int perf_stat_init_aggr_mode(void)
return cpus_aggr_map ? 0 : -ENOMEM;
}

+static inline int perf_env__get_cpu(struct perf_env *env, struct cpu_map *map, int idx)
+{
+ int cpu;
+
+ if (idx >= map->nr)
+ return -1;
+
+ cpu = map->map[idx];
+
+ if (cpu >= env->nr_cpus_online)
+ return -1;
+
+ return cpu;
+}
+
+static int perf_env__get_socket(struct cpu_map *map, int idx, void *data)
+{
+ struct perf_env *env = data;
+ int cpu = perf_env__get_cpu(env, map, idx);
+
+ return cpu == -1 ? -1 : env->cpu[cpu].socket_id;
+}
+
+static int perf_env__get_core(struct cpu_map *map, int idx, void *data)
+{
+ struct perf_env *env = data;
+ int core = -1, cpu = perf_env__get_cpu(env, map, idx);
+
+ if (cpu != -1) {
+ int socket = env->cpu[cpu].socket_id;
+
+ /*
+ * Encode socket in upper 16 bits
+ * core_id is relative to socket, and
+ * we need a global id. So we combine
+ * socket + core id.
+ */
+ core = (socket << 16) | (env->cpu[cpu].core_id & 0xffff);
+ }
+
+ return core;
+}
+
+static int perf_env__build_socket_map(struct perf_env *env, struct cpu_map *cpus,
+ struct cpu_map **sockp)
+{
+ return cpu_map__build_map(cpus, sockp, perf_env__get_socket, env);
+}
+
+static int perf_env__build_core_map(struct perf_env *env, struct cpu_map *cpus,
+ struct cpu_map **corep)
+{
+ return cpu_map__build_map(cpus, corep, perf_env__get_core, env);
+}
+
+static int perf_stat__get_socket_file(struct cpu_map *map, int idx)
+{
+ return perf_env__get_socket(map, idx, &perf_stat.session->header.env);
+}
+
+static int perf_stat__get_core_file(struct cpu_map *map, int idx)
+{
+ return perf_env__get_core(map, idx, &perf_stat.session->header.env);
+}
+
+static int perf_stat_init_aggr_mode_file(struct perf_stat *stat)
+{
+ struct perf_env *env = &stat->session->header.env;
+
+ switch (stat_config.aggr_mode) {
+ case AGGR_SOCKET:
+ if (perf_env__build_socket_map(env, evsel_list->cpus, &aggr_map)) {
+ perror("cannot build socket map");
+ return -1;
+ }
+ aggr_get_id = perf_stat__get_socket_file;
+ break;
+ case AGGR_CORE:
+ if (perf_env__build_core_map(env, evsel_list->cpus, &aggr_map)) {
+ perror("cannot build core map");
+ return -1;
+ }
+ aggr_get_id = perf_stat__get_core_file;
+ break;
+ case AGGR_NONE:
+ case AGGR_GLOBAL:
+ case AGGR_THREAD:
+ case AGGR_UNSET:
+ default:
+ break;
+ }
+
+ return 0;
+}
+
/*
* Add default attributes, if there were no attributes specified or
* if -d/--detailed, -d -d or -d -d -d is used:
@@ -1509,7 +1604,15 @@ int process_stat_config_event(struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_session *session __maybe_unused)
{
+ struct perf_stat *stat = container_of(tool, struct perf_stat, tool);
+
perf_event__read_stat_config(&stat_config, &event->stat_config);
+
+ if (perf_stat.file.is_pipe)
+ perf_stat_init_aggr_mode();
+ else
+ perf_stat_init_aggr_mode_file(stat);
+
return 0;
}

--
2.4.3

2015-11-05 14:41:51

by Jiri Olsa

Subject: [PATCH 15/25] perf stat report: Process stat and stat round events

Adding processing of stat and stat round events.

The stat data come in stat events; the generic
function perf_event__process_stat_event stores the data
under the perf_evsel object.

The stat-round events come each interval, or as the
last event in non-interval mode. The function
process_stat_round_event processes the stored data
for each perf_evsel object and prints it out.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
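A note on the round time handling below: the record side stores
the time as a single nanosecond value and the report side splits
it back into a timespec with / and %. A minimal sketch of that
round trip, assuming two hypothetical helpers (timespec_to_ns and
ns_to_timespec are not perf functions):

```c
#define _POSIX_C_SOURCE 200809L

#include <stdint.h>
#include <time.h>

#define NSECS_PER_SEC 1000000000ULL

/* Record side: fold seconds and nanoseconds into one u64 value. */
static uint64_t timespec_to_ns(const struct timespec *ts)
{
	return (uint64_t)ts->tv_sec * NSECS_PER_SEC + (uint64_t)ts->tv_nsec;
}

/* Report side: split the u64 value back into seconds and nanoseconds. */
static struct timespec ns_to_timespec(uint64_t ns)
{
	struct timespec ts;

	ts.tv_sec = (time_t)(ns / NSECS_PER_SEC);
	ts.tv_nsec = (long)(ns % NSECS_PER_SEC);
	return ts;
}
```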
tools/perf/builtin-stat.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index cf9ea08c66df..511787d97d6c 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1599,6 +1599,32 @@ static int __cmd_record(int argc, const char **argv)
return argc;
}

+static int process_stat_round_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_session *session)
+{
+ struct stat_round_event *round = &event->stat_round;
+ struct perf_evsel *counter;
+ struct timespec tsh, *ts = NULL;
+ const char **argv = session->header.env.cmdline_argv;
+ int argc = session->header.env.nr_cmdline;
+
+ evlist__for_each(evsel_list, counter)
+ perf_stat_process_counter(&stat_config, counter);
+
+ if (round->type == PERF_STAT_ROUND_TYPE__FINAL)
+ update_stats(&walltime_nsecs_stats, round->time);
+
+ if (stat_config.interval && round->time) {
+ tsh.tv_sec = round->time / NSECS_PER_SEC;
+ tsh.tv_nsec = round->time % NSECS_PER_SEC;
+ ts = &tsh;
+ }
+
+ print_counters(ts, argc, argv);
+ return 0;
+}
+
static
int process_stat_config_event(struct perf_tool *tool __maybe_unused,
union perf_event *event,
@@ -1684,6 +1710,8 @@ static struct perf_stat perf_stat = {
.thread_map = process_thread_map_event,
.cpu_map = process_cpu_map_event,
.stat_config = process_stat_config_event,
+ .stat = perf_event__process_stat_event,
+ .stat_round = process_stat_round_event,
},
};

--
2.4.3

2015-11-05 14:52:12

by Jiri Olsa

Subject: [PATCH 16/25] perf stat report: Process event update events

Adding processing of event update events, so perf stat report
can store additional info for events - unit, scale, name.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/builtin-stat.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 511787d97d6c..6636d29b3b18 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1707,6 +1707,7 @@ static const char * const report_usage[] = {
static struct perf_stat perf_stat = {
.tool = {
.attr = perf_event__process_attr,
+ .event_update = perf_event__process_event_update,
.thread_map = process_thread_map_event,
.cpu_map = process_cpu_map_event,
.stat_config = process_stat_config_event,
--
2.4.3

2015-11-05 14:41:55

by Jiri Olsa

Subject: [PATCH 17/25] perf stat report: Move csv_sep initialization before report command

Make sure csv_sep is properly initialized before
the report command leg.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
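A note on what the moved block does: a user supplied separator
turns CSV output on, the two-character string "\t" is translated
into a real tab, and without a separator the default one is used.
A minimal sketch of that normalization, assuming a hypothetical
sep_config struct and helper; the single space used for
DEFAULT_SEPARATOR here is an assumption:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed default; the real value lives in builtin-stat.c. */
#define DEFAULT_SEPARATOR " "

/* Hypothetical result of the normalization. */
struct sep_config {
	const char *sep;
	bool csv_output;
};

/* csv_sep is the raw option value, or NULL if the option was not given. */
static struct sep_config resolve_separator(const char *csv_sep)
{
	struct sep_config cfg = { DEFAULT_SEPARATOR, false };

	if (csv_sep) {
		cfg.csv_output = true;
		/* A literal backslash followed by 't' means a tab character. */
		cfg.sep = strcmp(csv_sep, "\\t") ? csv_sep : "\t";
	}

	return cfg;
}
```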
tools/perf/builtin-stat.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 6636d29b3b18..174ffbd02a13 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1776,6 +1776,13 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
(const char **) stat_usage,
PARSE_OPT_STOP_AT_NON_OPTION);

+ if (csv_sep) {
+ csv_output = true;
+ if (!strcmp(csv_sep, "\\t"))
+ csv_sep = "\t";
+ } else
+ csv_sep = DEFAULT_SEPARATOR;
+
if (argc && !strncmp(argv[0], "rec", 3)) {
argc = __cmd_record(argc, argv);
if (argc < 0)
@@ -1826,13 +1833,6 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)

stat_config.output = output;

- if (csv_sep) {
- csv_output = true;
- if (!strcmp(csv_sep, "\\t"))
- csv_sep = "\t";
- } else
- csv_sep = DEFAULT_SEPARATOR;
-
/*
* let the spreadsheet do the pretty-printing
*/
--
2.4.3

2015-11-05 14:51:27

by Jiri Olsa

Subject: [PATCH 18/25] perf stat report: Allow to override aggr_mode

Allow overriding the recorded aggr_mode. It's possible
to use perf stat report like:

$ perf stat report -A
$ perf stat report --per-core
$ perf stat report --per-socket

to customize the aggregation mode regardless of
what was used during the stat record command.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
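A note on how the override is resolved below: the report option
starts as AGGR_UNSET, replaces the recorded mode only when the
user set it, and is skipped for task data where the cpu map is
empty. A minimal sketch of that resolution, assuming a standalone
helper (effective_aggr_mode is a hypothetical name) and an enum
whose member order is an assumption:

```c
#include <stdbool.h>

/* Same member names as used in the patch; the order is assumed. */
enum aggr_mode {
	AGGR_NONE,
	AGGR_GLOBAL,
	AGGR_SOCKET,
	AGGR_CORE,
	AGGR_THREAD,
	AGGR_UNSET,
};

/*
 * recorded:      mode read from the stat config event
 * requested:     mode from the report command line, AGGR_UNSET if none
 * cpu_map_empty: true for task data, where the override is not applied
 */
static enum aggr_mode effective_aggr_mode(enum aggr_mode recorded,
					  enum aggr_mode requested,
					  bool cpu_map_empty)
{
	if (cpu_map_empty)
		return recorded;

	if (requested != AGGR_UNSET)
		return requested;

	return recorded;
}
```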
tools/perf/Documentation/perf-stat.txt | 10 ++++++++++
tools/perf/builtin-stat.c | 17 +++++++++++++++++
2 files changed, 27 insertions(+)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 95f492828657..52ef7a9d50aa 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -182,6 +182,16 @@ Reads and reports stat data from perf data file.
--input file::
Input file name.

+--per-socket::
+Aggregate counts per processor socket for system-wide mode measurements.
+
+--per-core::
+Aggregate counts per physical processor for system-wide mode measurements.
+
+-A::
+--no-aggr::
+Do not aggregate counts across all monitored CPUs.
+

EXAMPLES
--------
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 174ffbd02a13..b549fb343000 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -138,6 +138,7 @@ struct perf_stat {
bool maps_allocated;
struct cpu_map *cpus;
struct thread_map *threads;
+ enum aggr_mode aggr_mode;
};

static struct perf_stat perf_stat;
@@ -1634,6 +1635,15 @@ int process_stat_config_event(struct perf_tool *tool __maybe_unused,

perf_event__read_stat_config(&stat_config, &event->stat_config);

+ if (cpu_map__empty(stat->cpus)) {
+ if (stat->aggr_mode != AGGR_UNSET)
+ pr_warning("warning: processing task data, aggregation mode not set\n");
+ return 0;
+ }
+
+ if (stat->aggr_mode != AGGR_UNSET)
+ stat_config.aggr_mode = stat->aggr_mode;
+
if (perf_stat.file.is_pipe)
perf_stat_init_aggr_mode();
else
@@ -1714,6 +1724,7 @@ static struct perf_stat perf_stat = {
.stat = perf_event__process_stat_event,
.stat_round = process_stat_round_event,
},
+ .aggr_mode = AGGR_UNSET,
};

static int __cmd_report(int argc, const char **argv)
@@ -1721,6 +1732,12 @@ static int __cmd_report(int argc, const char **argv)
struct perf_session *session;
const struct option options[] = {
OPT_STRING('i', "input", &input_name, "file", "input file name"),
+ OPT_SET_UINT(0, "per-socket", &perf_stat.aggr_mode,
+ "aggregate counts per processor socket", AGGR_SOCKET),
+ OPT_SET_UINT(0, "per-core", &perf_stat.aggr_mode,
+ "aggregate counts per physical processor core", AGGR_CORE),
+ OPT_SET_UINT('A', "no-aggr", &perf_stat.aggr_mode,
+ "disable CPU count aggregation", AGGR_NONE),
OPT_END()
};
struct stat st;
--
2.4.3

2015-11-05 14:42:00

by Jiri Olsa

Subject: [PATCH 19/25] perf script: Process cpu/threads maps

Adding processing of cpu/threads maps. Configuring session's
evlist with these maps.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/builtin-script.c | 67 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 67 insertions(+)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 72b5deb4bd79..cc3d8e141df6 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -18,7 +18,11 @@
#include "util/sort.h"
#include "util/data.h"
#include "util/auxtrace.h"
+#include "util/cpumap.h"
+#include "util/thread_map.h"
+#include "util/stat.h"
#include <linux/bitmap.h>
+#include "asm/bug.h"

static char const *script_name;
static char const *generate_script_lang;
@@ -739,6 +743,9 @@ struct perf_script {
bool show_task_events;
bool show_mmap_events;
bool show_switch_events;
+ bool allocated;
+ struct cpu_map *cpus;
+ struct thread_map *threads;
};

static int process_attr(struct perf_tool *tool, union perf_event *event,
@@ -1695,6 +1702,63 @@ static void script__setup_sample_type(struct perf_script *script)
}
}

+static int set_maps(struct perf_script *script)
+{
+ struct perf_evlist *evlist = script->session->evlist;
+
+ if (!script->cpus || !script->threads)
+ return 0;
+
+ if (WARN_ONCE(script->allocated, "stats double allocation\n"))
+ return -EINVAL;
+
+ perf_evlist__set_maps(evlist, script->cpus, script->threads);
+
+ if (perf_evlist__alloc_stats(evlist, true))
+ return -ENOMEM;
+
+ script->allocated = true;
+ return 0;
+}
+
+static
+int process_thread_map_event(struct perf_tool *tool,
+ union perf_event *event,
+ struct perf_session *session __maybe_unused)
+{
+ struct perf_script *script = container_of(tool, struct perf_script, tool);
+
+ if (script->threads) {
+ pr_warning("Extra thread map event, ignoring.\n");
+ return 0;
+ }
+
+ script->threads = thread_map__new_event(&event->thread_map);
+ if (!script->threads)
+ return -ENOMEM;
+
+ return set_maps(script);
+}
+
+static
+int process_cpu_map_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_session *session __maybe_unused)
+{
+ struct perf_script *script = container_of(tool, struct perf_script, tool);
+
+ if (script->cpus) {
+ pr_warning("Extra cpu map event, ignoring.\n");
+ return 0;
+ }
+
+ script->cpus = cpu_map__new_data(&event->cpu_map.data);
+ if (!script->cpus)
+ return -ENOMEM;
+
+ return set_maps(script);
+}
+
int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
{
bool show_full_info = false;
@@ -1723,6 +1787,8 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
.auxtrace_info = perf_event__process_auxtrace_info,
.auxtrace = perf_event__process_auxtrace,
.auxtrace_error = perf_event__process_auxtrace_error,
+ .thread_map = process_thread_map_event,
+ .cpu_map = process_cpu_map_event,
.ordered_events = true,
.ordering_requires_timestamps = true,
},
@@ -2076,6 +2142,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
flush_scripting();

out_delete:
+ perf_evlist__free_stats(session->evlist);
perf_session__delete(session);

if (script_started)
--
2.4.3

2015-11-05 14:50:44

by Jiri Olsa

Subject: [PATCH 20/25] perf script: Process stat config event

Add processing of the stat config event and initialize
the stat_config object.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/builtin-script.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index cc3d8e141df6..97691c13aa6c 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -36,6 +36,7 @@ static bool print_flags;
static bool nanosecs;
static const char *cpu_list;
static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
+static struct perf_stat_config stat_config;

unsigned int scripting_max_stack = PERF_MAX_STACK_DEPTH;

@@ -1702,6 +1703,14 @@ static void script__setup_sample_type(struct perf_script *script)
}
}

+static int process_stat_config_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_session *session __maybe_unused)
+{
+ perf_event__read_stat_config(&stat_config, &event->stat_config);
+ return 0;
+}
+
static int set_maps(struct perf_script *script)
{
struct perf_evlist *evlist = script->session->evlist;
@@ -1787,6 +1796,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
.auxtrace_info = perf_event__process_auxtrace_info,
.auxtrace = perf_event__process_auxtrace,
.auxtrace_error = perf_event__process_auxtrace_error,
+ .stat_config = process_stat_config_event,
.thread_map = process_thread_map_event,
.cpu_map = process_cpu_map_event,
.ordered_events = true,
--
2.4.3

2015-11-05 14:42:05

by Jiri Olsa

Subject: [PATCH 21/25] perf script: Add process_stat/process_stat_interval scripting interface

Python and perl scripting code will define those
callbacks and get stat data.

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/util/trace-event.h | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/tools/perf/util/trace-event.h b/tools/perf/util/trace-event.h
index b85ee55cca0c..0ebc9dab2c7c 100644
--- a/tools/perf/util/trace-event.h
+++ b/tools/perf/util/trace-event.h
@@ -65,6 +65,7 @@ int tracing_data_put(struct tracing_data *tdata);
struct addr_location;

struct perf_session;
+struct perf_stat_config;

struct scripting_ops {
const char *name;
@@ -75,6 +76,9 @@ struct scripting_ops {
struct perf_sample *sample,
struct perf_evsel *evsel,
struct addr_location *al);
+ void (*process_stat) (struct perf_stat_config *config,
+ struct perf_evsel *evsel, u64 time);
+ void (*process_stat_interval) (u64 time);
int (*generate_script) (struct pevent *pevent, const char *outfile);
};

--
2.4.3

2015-11-05 14:43:42

by Jiri Olsa

Subject: [PATCH 22/25] perf script: Add stat default handlers

Implement struct scripting_ops::(process_stat|process_stat_interval)
handlers - calling scripting handlers from stat events handlers.
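
The call order used by the stat-round handler in this patch can be sketched as follows. This is a Python model with hypothetical names (`DefaultOps`, `RecordingOps`, `process_stat_round`), and it leaves out the per-counter processing step (perf_stat_process_counter) that the C code runs before each callback.

```python
class DefaultOps:
    """Hypothetical stand-in for default_scripting_ops: both stat hooks are no-ops."""

    def process_stat(self, config, counter, time):
        pass

    def process_stat_interval(self, time):
        pass


class RecordingOps(DefaultOps):
    """Records each callback so the call order can be inspected."""

    def __init__(self):
        self.calls = []

    def process_stat(self, config, counter, time):
        self.calls.append(("stat", counter, time))

    def process_stat_interval(self, time):
        self.calls.append(("interval", time))


def process_stat_round(ops, config, counters, round_time):
    # One process_stat call per counter, then a single interval call.
    for counter in counters:
        ops.process_stat(config, counter, round_time)
    ops.process_stat_interval(round_time)
    return 0
```

With two counters, the engine sees two process_stat calls followed by one process_stat_interval call, all carrying the time of the round event.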

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/builtin-script.c | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 97691c13aa6c..011161d91b0b 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -209,6 +209,9 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
struct perf_event_attr *attr = &evsel->attr;
bool allow_user_set;

+ if (perf_header__has_feat(&session->header, HEADER_STAT))
+ return 0;
+
allow_user_set = perf_header__has_feat(&session->header,
HEADER_AUXTRACE);

@@ -648,6 +651,14 @@ static void process_event(union perf_event *event, struct perf_sample *sample,
printf("\n");
}

+static void process_stat(struct perf_stat_config *config __maybe_unused,
+ struct perf_evsel *evsel __maybe_unused,
+ u64 time __maybe_unused)
+{
+}
+
+static void process_stat_interval(u64 time __maybe_unused) { }
+
static int default_start_script(const char *script __maybe_unused,
int argc __maybe_unused,
const char **argv __maybe_unused)
@@ -676,6 +687,8 @@ static struct scripting_ops default_scripting_ops = {
.flush_script = default_flush_script,
.stop_script = default_stop_script,
.process_event = process_event,
+ .process_stat = process_stat,
+ .process_stat_interval = process_stat_interval,
.generate_script = default_generate_script,
};

@@ -1703,6 +1716,22 @@ static void script__setup_sample_type(struct perf_script *script)
}
}

+static int process_stat_round_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_session *session)
+{
+ struct stat_round_event *round = &event->stat_round;
+ struct perf_evsel *counter;
+
+ evlist__for_each(session->evlist, counter) {
+ perf_stat_process_counter(&stat_config, counter);
+ scripting_ops->process_stat(&stat_config, counter, round->time);
+ }
+
+ scripting_ops->process_stat_interval(round->time);
+ return 0;
+}
+
static int process_stat_config_event(struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_session *session __maybe_unused)
@@ -1796,6 +1825,8 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
.auxtrace_info = perf_event__process_auxtrace_info,
.auxtrace = perf_event__process_auxtrace,
.auxtrace_error = perf_event__process_auxtrace_error,
+ .stat = perf_event__process_stat_event,
+ .stat_round = process_stat_round_event,
.stat_config = process_stat_config_event,
.thread_map = process_thread_map_event,
.cpu_map = process_cpu_map_event,
--
2.4.3

2015-11-05 14:42:10

by Jiri Olsa

Subject: [PATCH 23/25] perf script: Display stat events by default

If no script is specified for stat data, display
stat events in raw form.

$ perf stat record ls

SNIP

Performance counter stats for 'ls':

0.851585 task-clock (msec) # 0.717 CPUs utilized
0 context-switches # 0.000 K/sec
0 cpu-migrations # 0.000 K/sec
114 page-faults # 0.134 M/sec
2,620,918 cycles # 3.078 GHz
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
2,714,111 instructions # 1.04 insns per cycle
542,434 branches # 636.970 M/sec
15,946 branch-misses # 2.94% of all branches

0.001186954 seconds time elapsed

$ perf script
CPU THREAD VAL ENA RUN TIME EVENT
-1 26185 851585 851585 851585 1186954 task-clock
-1 26185 0 851585 851585 1186954 context-switches
-1 26185 0 851585 851585 1186954 cpu-migrations
-1 26185 114 851585 851585 1186954 page-faults
-1 26185 2620918 853340 853340 1186954 cycles
-1 26185 0 0 0 1186954 stalled-cycles-frontend
-1 26185 0 0 0 1186954 stalled-cycles-backend
-1 26185 2714111 853340 853340 1186954 instructions
-1 26185 542434 853340 853340 1186954 branches
-1 26185 15946 853340 853340 1186954 branch-misses
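
The raw form is easy to post-process. Below is a minimal Python 3 sketch, assuming the seven whitespace-separated columns shown above; `StatRow` and `parse_stat_rows` are hypothetical names and not part of perf. Note that task-clock's VAL and the TIME column are in nanoseconds (851585 corresponds to the 0.851585 msec printed by perf stat).

```python
from collections import namedtuple

StatRow = namedtuple("StatRow", "cpu thread val ena run time event")


def parse_stat_rows(text):
    """Parse 'perf script' raw stat output into StatRow tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7 or fields[0] == "CPU":
            continue  # skip the header and anything that is not a data row
        try:
            numbers = [int(f) for f in fields[:6]]
        except ValueError:
            continue  # not a numeric data row
        rows.append(StatRow(*numbers, fields[6]))
    return rows


sample = """\
CPU THREAD VAL ENA RUN TIME EVENT
-1 26185 851585 851585 851585 1186954 task-clock
-1 26185 2620918 853340 853340 1186954 cycles
-1 26185 2714111 853340 853340 1186954 instructions
"""

by_event = {row.event: row for row in parse_stat_rows(sample)}
ipc = by_event["instructions"].val / by_event["cycles"].val
print("insns per cycle: %.2f" % ipc)
```

The ratio computed from the raw values agrees with the "insns per cycle" figure in the perf stat summary above.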

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/builtin-script.c | 35 +++++++++++++++++++++++++++++++++--
1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 011161d91b0b..d2ee759f99e2 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -652,9 +652,40 @@ static void process_event(union perf_event *event, struct perf_sample *sample,
}

static void process_stat(struct perf_stat_config *config __maybe_unused,
- struct perf_evsel *evsel __maybe_unused,
- u64 time __maybe_unused)
+ struct perf_evsel *counter, u64 time)
{
+ int nthreads = thread_map__nr(counter->threads);
+ int ncpus = perf_evsel__nr_cpus(counter);
+ int cpu, thread;
+ static int header_printed;
+
+ if (counter->system_wide)
+ nthreads = 1;
+
+ if (!header_printed) {
+ printf("%3s %8s %15s %15s %15s %15s %s\n",
+ "CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT");
+ header_printed = 1;
+ }
+
+ for (thread = 0; thread < nthreads; thread++) {
+ for (cpu = 0; cpu < ncpus; cpu++) {
+ struct perf_counts_values *counts;
+
+ counts = perf_counts(counter->counts, cpu, thread);
+
+ printf("%3d %8d %15" PRIu64 " %15" PRIu64 " %15" PRIu64 " %15" PRIu64 " %s\n",
+ counter->cpus->map[cpu],
+ thread_map__pid(counter->threads, thread),
+ counts->val,
+ counts->ena,
+ counts->run,
+ time,
+ perf_evsel__name(counter));
+ }
+ }
+
+ return;
}

static void process_stat_interval(u64 time __maybe_unused) { }
--
2.4.3

2015-11-05 14:42:55

by Jiri Olsa

Subject: [PATCH 24/25] perf script: Add python support for stat events

Add support to get stat events data in perf python scripts.

The python script shall implement the following
new interface to process stat data:

def stat__<event_name>_[<modifier>](cpu, thread, time, val, ena, run):

- is called for every stat event for the given counter;
if the user monitors 'cycles,instructions:u', the following
callbacks should be defined:

def stat__cycles(cpu, thread, time, val, ena, run):
def stat__instructions_u(cpu, thread, time, val, ena, run):

def stat__interval(time):

- is called for every interval with its time;
in non-interval mode it is called after the last
stat event with the total measured time in ns

The rest of the current interface stays untouched.

Please check example CPI metrics script in following patch
with command line examples in changelogs.
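
As a rough illustration of the interface described above, here is a minimal Python 3 sketch of a script for 'cycles,instructions:u'. The `handler_name` helper mirrors the naming rule implemented by get_handler_name() in this patch ("stat__" prefix, every ':' replaced by '_'). The `totals` dictionary and the `simulate` driver are hypothetical and only stand in for perf calling the handlers.

```python
def handler_name(event_name):
    # Same rule as get_handler_name(): prefix plus ':' -> '_'.
    return "stat__" + event_name.replace(":", "_")


totals = {}


def stat__cycles(cpu, thread, time, val, ena, run):
    totals["cycles"] = totals.get("cycles", 0) + val


def stat__instructions_u(cpu, thread, time, val, ena, run):
    totals["instructions:u"] = totals.get("instructions:u", 0) + val


def stat__interval(time):
    # 'time' is in nanoseconds.
    print("%.9f s: %s" % (time / 1e9, sorted(totals.items())))


def simulate(events, end_time):
    """Hypothetical driver: look handlers up by name, skip missing ones."""
    for name, args in events:
        handler = globals().get(handler_name(name))
        if handler is None:
            continue  # the patch only logs a debug message in this case
        handler(*args)
    stat__interval(end_time)
```

An event without a matching handler, for example 'branches' here, is simply skipped, so a script only needs to define callbacks for the counters it cares about.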

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
.../util/scripting-engines/trace-event-python.c | 114 +++++++++++++++++++--
1 file changed, 108 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index a8e825fca42a..8436eb23eb16 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -41,6 +41,9 @@
#include "../thread-stack.h"
#include "../trace-event.h"
#include "../machine.h"
+#include "thread_map.h"
+#include "cpumap.h"
+#include "stat.h"

PyMODINIT_FUNC initperf_trace_context(void);

@@ -859,6 +862,103 @@ static void python_process_event(union perf_event *event,
}
}

+static void get_handler_name(char *str, size_t size,
+ struct perf_evsel *evsel)
+{
+ char *p = str;
+
+ scnprintf(str, size, "stat__%s", perf_evsel__name(evsel));
+
+ while ((p = strchr(p, ':'))) {
+ *p = '_';
+ p++;
+ }
+}
+
+static void
+process_stat(struct perf_evsel *counter, int cpu, int thread, u64 time,
+ struct perf_counts_values *count)
+{
+ PyObject *handler, *t;
+ static char handler_name[256];
+ int n = 0;
+
+ t = PyTuple_New(MAX_FIELDS);
+ if (!t)
+ Py_FatalError("couldn't create Python tuple");
+
+ get_handler_name(handler_name, sizeof(handler_name),
+ counter);
+
+ handler = get_handler(handler_name);
+ if (!handler) {
+ pr_debug("can't find python handler %s\n", handler_name);
+ return;
+ }
+
+ PyTuple_SetItem(t, n++, PyInt_FromLong(cpu));
+ PyTuple_SetItem(t, n++, PyInt_FromLong(thread));
+ PyTuple_SetItem(t, n++, PyLong_FromLong(time));
+ PyTuple_SetItem(t, n++, PyLong_FromLong(count->val));
+ PyTuple_SetItem(t, n++, PyLong_FromLong(count->ena));
+ PyTuple_SetItem(t, n++, PyLong_FromLong(count->run));
+
+ if (_PyTuple_Resize(&t, n) == -1)
+ Py_FatalError("error resizing Python tuple");
+
+ call_object(handler, t, handler_name);
+
+ Py_DECREF(t);
+}
+
+static void python_process_stat(struct perf_stat_config *config,
+ struct perf_evsel *counter, u64 time)
+{
+ struct thread_map *threads = counter->threads;
+ struct cpu_map *cpus = counter->cpus;
+ int cpu, thread;
+
+ if (config->aggr_mode == AGGR_GLOBAL) {
+ process_stat(counter, -1, -1, time,
+ &counter->counts->aggr);
+ return;
+ }
+
+ for (thread = 0; thread < threads->nr; thread++) {
+ for (cpu = 0; cpu < cpus->nr; cpu++) {
+ process_stat(counter, cpus->map[cpu],
+ thread_map__pid(threads, thread), time,
+ perf_counts(counter->counts, cpu, thread));
+ }
+ }
+}
+
+static void python_process_stat_interval(u64 time)
+{
+ PyObject *handler, *t;
+ static const char handler_name[] = "stat__interval";
+ int n = 0;
+
+ t = PyTuple_New(MAX_FIELDS);
+ if (!t)
+ Py_FatalError("couldn't create Python tuple");
+
+ handler = get_handler(handler_name);
+ if (!handler) {
+ pr_debug("can't find python handler %s\n", handler_name);
+ return;
+ }
+
+ PyTuple_SetItem(t, n++, PyLong_FromLong(time));
+
+ if (_PyTuple_Resize(&t, n) == -1)
+ Py_FatalError("error resizing Python tuple");
+
+ call_object(handler, t, handler_name);
+
+ Py_DECREF(t);
+}
+
static int run_start_sub(void)
{
main_module = PyImport_AddModule("__main__");
@@ -1201,10 +1301,12 @@ static int python_generate_script(struct pevent *pevent, const char *outfile)
}

struct scripting_ops python_scripting_ops = {
- .name = "Python",
- .start_script = python_start_script,
- .flush_script = python_flush_script,
- .stop_script = python_stop_script,
- .process_event = python_process_event,
- .generate_script = python_generate_script,
+ .name = "Python",
+ .start_script = python_start_script,
+ .flush_script = python_flush_script,
+ .stop_script = python_stop_script,
+ .process_event = python_process_event,
+ .process_stat = python_process_stat,
+ .process_stat_interval = python_process_stat_interval,
+ .generate_script = python_generate_script,
};
--
2.4.3

2015-11-05 14:42:17

by Jiri Olsa

Subject: [PATCH 25/25] perf script: Add stat-cpi.py script

Add stat-cpi.py as an example of how to do stat scripting.
It computes the CPI (cycles per instruction) metric from
the cycles and instructions events.

The following stat record/report/script combinations can be used:

- get CPI for given workload

$ perf stat -e cycles,instructions record ls

SNIP

Performance counter stats for 'ls':

2,904,431 cycles
3,346,878 instructions # 1.15 insns per cycle

0.001782686 seconds time elapsed

$ perf script -s ./scripts/python/stat-cpi.py
0.001783: cpu -1, thread -1 -> cpi 0.867803 (2904431/3346878)

$ perf stat -e cycles,instructions record ls | perf script -s ./scripts/python/stat-cpi.py

SNIP

0.001730: cpu -1, thread -1 -> cpi 0.869026 (2928292/3369627)

- get CPI systemwide:

$ perf stat -e cycles,instructions -a -I 1000 record sleep 3
# time counts unit events
1.000158618 594,274,711 cycles (100.00%)
1.000158618 441,898,250 instructions
2.000350973 567,649,705 cycles (100.00%)
2.000350973 432,669,206 instructions
3.000559210 561,940,430 cycles (100.00%)
3.000559210 420,403,465 instructions
3.000670798 780,105 cycles (100.00%)
3.000670798 326,516 instructions

$ perf script -s ./scripts/python/stat-cpi.py
1.000159: cpu -1, thread -1 -> cpi 1.344823 (594274711/441898250)
2.000351: cpu -1, thread -1 -> cpi 1.311972 (567649705/432669206)
3.000559: cpu -1, thread -1 -> cpi 1.336669 (561940430/420403465)
3.000671: cpu -1, thread -1 -> cpi 2.389178 (780105/326516)

$ perf stat -e cycles,instructions -a -I 1000 record sleep 3 | perf script -s ./scripts/python/stat-cpi.py
1.000202: cpu -1, thread -1 -> cpi 1.035091 (940778881/908885530)
2.000392: cpu -1, thread -1 -> cpi 1.442600 (627493992/434974455)
3.000545: cpu -1, thread -1 -> cpi 1.353612 (741463930/547766890)
3.000622: cpu -1, thread -1 -> cpi 2.642110 (784083/296764)
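
The arithmetic behind these lines is just cycles divided by instructions, i.e. the inverse of the "insns per cycle" value that perf stat prints. The Python 3 sketch below reproduces the first example; `cpi` and `format_line` are hypothetical helper names, and the timestamp of 1782686 ns is assumed from the elapsed time shown in that run.

```python
def cpi(cycles, instructions):
    """Cycles per instruction; 0 when no instructions were counted."""
    return cycles / float(instructions) if instructions else 0


def format_line(time_ns, cpu, thread, cycles, instructions):
    # Same layout as the print statement in stat-cpi.py.
    return "%15f: cpu %d, thread %d -> cpi %f (%d/%d)" % (
        time_ns / 1e9, cpu, thread,
        cpi(cycles, instructions), cycles, instructions)


line = format_line(1782686, -1, -1, 2904431, 3346878)
print(line)
```

For the 'ls' run, 2904431 / 3346878 gives a CPI of about 0.8678, and its inverse is the 1.15 insns per cycle reported by perf stat.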

Tested-by: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/scripts/python/stat-cpi.py | 74 +++++++++++++++++++++++++++++++++++
1 file changed, 74 insertions(+)
create mode 100644 tools/perf/scripts/python/stat-cpi.py

diff --git a/tools/perf/scripts/python/stat-cpi.py b/tools/perf/scripts/python/stat-cpi.py
new file mode 100644
index 000000000000..eb3936e99862
--- /dev/null
+++ b/tools/perf/scripts/python/stat-cpi.py
@@ -0,0 +1,74 @@
+#!/bin/python
+
+data = {}
+times = []
+threads = []
+cpus = []
+
+def get_key(time, event, cpu, thread):
+ return "%d-%s-%d-%d" % (time, event, cpu, thread)
+
+def store_key(time, cpu, thread):
+ if (time not in times):
+ times.append(time)
+
+ if (cpu not in cpus):
+ cpus.append(cpu)
+
+ if (thread not in threads):
+ threads.append(thread)
+
+def store(time, event, cpu, thread, val, ena, run):
+ #print "event %s cpu %d, thread %d, time %d, val %d, ena %d, run %d" % \
+ # (event, cpu, thread, time, val, ena, run)
+
+ store_key(time, cpu, thread)
+ key = get_key(time, event, cpu, thread)
+ data[key] = [ val, ena, run]
+
+def get(time, event, cpu, thread):
+ key = get_key(time, event, cpu, thread)
+ return data[key][0]
+
+def stat__cycles_k(cpu, thread, time, val, ena, run):
+ store(time, "cycles", cpu, thread, val, ena, run);
+
+def stat__instructions_k(cpu, thread, time, val, ena, run):
+ store(time, "instructions", cpu, thread, val, ena, run);
+
+def stat__cycles_u(cpu, thread, time, val, ena, run):
+ store(time, "cycles", cpu, thread, val, ena, run);
+
+def stat__instructions_u(cpu, thread, time, val, ena, run):
+ store(time, "instructions", cpu, thread, val, ena, run);
+
+def stat__cycles(cpu, thread, time, val, ena, run):
+ store(time, "cycles", cpu, thread, val, ena, run);
+
+def stat__instructions(cpu, thread, time, val, ena, run):
+ store(time, "instructions", cpu, thread, val, ena, run);
+
+def stat__interval(time):
+ for cpu in cpus:
+ for thread in threads:
+ cyc = get(time, "cycles", cpu, thread)
+ ins = get(time, "instructions", cpu, thread)
+ cpi = 0
+
+ if ins != 0:
+ cpi = cyc/float(ins)
+
+ print "%15f: cpu %d, thread %d -> cpi %f (%d/%d)" % (time/(float(1000000000)), cpu, thread, cpi, cyc, ins)
+
+def trace_end():
+ pass
+# for time in times:
+# for cpu in cpus:
+# for thread in threads:
+# cyc = get(time, "cycles", cpu, thread)
+# ins = get(time, "instructions", cpu, thread)
+#
+# if ins != 0:
+# cpi = cyc/float(ins)
+#
+# print "time %.9f, cpu %d, thread %d -> cpi %f" % (time/(float(1000000000)), cpu, thread, cpi)
--
2.4.3

2015-11-05 20:51:09

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 02/25] perf stat record: Add record command

Em Thu, Nov 05, 2015 at 03:40:46PM +0100, Jiri Olsa escreveu:
> Add 'perf stat record' command support. It creates simple
> (header only) perf.data file ATM.
>
> The record command could be specified anywhere among stat
> options. All stat command options are valid for stat record
> command with '-o' option exception. If specified for record
> command it denotes the perf data file name.
>
> Tested-by: Kan Liang <[email protected]>
> Link: http://lkml.kernel.org/n/[email protected]
> Signed-off-by: Jiri Olsa <[email protected]>

Still stopping here:

[acme@zoo linux]$ rm -f perf.data
[acme@zoo linux]$ perf stat record usleep 1

Performance counter stats for 'usleep 1':

0.621181 task-clock (msec) # 0.455 CPUs utilized
1 context-switches # 0.002 M/sec
0 cpu-migrations # 0.000 K/sec
54 page-faults # 0.087 M/sec
917,006 cycles # 1.476 GHz
611,746 stalled-cycles-frontend # 66.71% frontend cycles idle
<not supported> stalled-cycles-backend
654,410 instructions # 0.71 insns per cycle
# 0.93 stalled cycles per insn
132,653 branches # 213.550 M/sec
7,432 branch-misses # 5.60% of all branches

0.001365369 seconds time elapsed

[acme@zoo linux]$ ls -la perf.data
-rw-------. 1 acme acme 1384 Nov 5 17:42 perf.data
[acme@zoo linux]$ perf evlist
WARNING: The perf.data file's data size field is 0 which is unexpected.
Was the 'perf record' command properly terminated?
non matching sample_type[acme@zoo linux]$

--------------------------------

When we pass one event it gets a bit better:

[acme@zoo linux]$ rm -f perf.data
[acme@zoo linux]$ perf stat -e cycles record usleep 1

Performance counter stats for 'usleep 1':

1,056,818 cycles

0.000715850 seconds time elapsed

[acme@zoo linux]$ ls -la perf.data
-rw-------. 1 acme acme 232 Nov 5 17:44 perf.data
[acme@zoo linux]$ perf evlist
WARNING: The perf.data file's data size field is 0 which is unexpected.
Was the 'perf record' command properly terminated?
cycles
[acme@zoo linux]$

-----

In the second case it almost works, modulo that warning.

I think that what we need to achieve is for older tools to be able to, with a
file produced by 'perf stat record', to show this:

[root@zoo ~]# perf report --no-header --stdio
Error:
The perf.data file has no samples!
# To display the perf.data header info, please use --header/--header-only options.
#
[root@zoo ~]#


I.e. the file should look like one that is produced by this command, purposely
to not create any sample:

# perf record -e syscalls:sys_enter_accept usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.018 MB perf.data ]


I applied the first patch and added it to that perf/stat branch.

- Arnaldo

2015-11-06 08:24:05

by Jiri Olsa

Subject: Re: [PATCH 02/25] perf stat record: Add record command

On Thu, Nov 05, 2015 at 05:51:01PM -0300, Arnaldo Carvalho de Melo wrote:

SNIP

> In the second case it almost works, modulo that warning.
>
> I think that what we need to achieve is for older tools to be able to, with a
> file produced by 'perf stat record', to show this:
>
> [root@zoo ~]# perf report --no-header --stdio
> Error:
> The perf.data file has no samples!
> # To display the perf.data header info, please use --header/--header-only options.
> #
> [root@zoo ~]#
>
>
> I.e. the file should look like one that is produced by this command, purposely
> to not create any sample:
>
> # perf record -e syscalls:sys_enter_accept usleep 1
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.018 MB perf.data ]
>
>
> I applied the first patch and added it to that perf/stat branch.

well.. it's either simple patches and step by step
functionality or one big with everything..

[PATCH 02/25] perf stat record: Add record command
- adds record command that creates empty perf.data

[PATCH 03/25] perf stat record: Initialize record features
- adds FEATURES initialization for stat data

[PATCH 04/25] perf stat record: Synthesize stat record data
- adds meta data

[PATCH 05/25] perf stat record: Store events IDs in perf data file
- adds event IDs
...


you get proper warning right after patch 3/25, where
we store STAT feature bit and properly check it when
opening perf.data

I can merge patch 2 and 3 to get the proper warning
from beginning.. but that'd be bigger patch ;-)

thanks,
jirka

2015-11-06 13:33:10

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 02/25] perf stat record: Add record command

Em Fri, Nov 06, 2015 at 09:24:00AM +0100, Jiri Olsa escreveu:
> On Thu, Nov 05, 2015 at 05:51:01PM -0300, Arnaldo Carvalho de Melo wrote:
>
> SNIP
>
> > In the second case it almost works, modulo that warning.
> >
> > I think that what we need to achieve is for older tools to be able to, with a
> > file produced by 'perf stat record', to show this:
> >
> > [root@zoo ~]# perf report --no-header --stdio
> > Error:
> > The perf.data file has no samples!
> > # To display the perf.data header info, please use --header/--header-only options.
> > #
> > [root@zoo ~]#
> >
> >
> > I.e. the file should look like one that is produced by this command, purposely
> > to not create any sample:
> >
> > # perf record -e syscalls:sys_enter_accept usleep 1
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 0.018 MB perf.data ]
> >
> >
> > I applied the first patch and added it to that perf/stat branch.
>
> well.. it's either simple patches and step by step
> functionality or one big with everything..

Humm, no, there are several things we should strive for, and
bisectability is one of the first, it requires smaller, self contained
patches, sure, but it also requires that after applying each patch we
have sane output from the tools.

So, after applying the patch above we get a message that says the file
is corrupted, and more than that, it even forgets to put a newline,
further breaking the output.

> [PATCH 02/25] perf stat record: Add record command
> - adds record command that creates empty perf.data
>
> [PATCH 03/25] perf stat record: Initialize record features
> - adds FEATURES initialization for stat data
>
> [PATCH 04/25] perf stat record: Synthesize stat record data
> - adds meta data
>
> [PATCH 05/25] perf stat record: Store events IDs in perf data file
> - adds event IDs
> ...
>
>
> you get proper warning right after patch 3/25, where
> we store STAT feature bit and properly check it when
> opening perf.data

But that will be with _new_ tools, right? I'm talking about getting sane
output from _older_, unmodified, tools, like I demonstrated.

Anyway, I'll take the time to fix the broken missing newline and will
check those first few patches to see if I have a suggestion for you on
how to group them.

- Arnaldo

> I can merge patch 2 and 3 to get the proper warning
> from beginning.. but that'd be bigger patch ;-)
>
> thanks,
> jirka

2015-11-06 14:13:34

by Jiri Olsa

Subject: Re: [PATCH 02/25] perf stat record: Add record command

On Fri, Nov 06, 2015 at 10:33:03AM -0300, Arnaldo Carvalho de Melo wrote:

SNIP

>
> Humm, no, there are several things we should strive for, and
> bisectability is one of the first, it requires smaller, self contained
> patches, sure, but it also requires that after applying each patch we
> have sane output from the tools.
>
> So, after applying the patch above we get a message that says the file
> is corrupted, and more than that, it even forgets to put a newline,
> further breaking the output.

well, because the 'perf stat record' creates just minimal
perf.data and I'm adding data itself in later commits

>
> > [PATCH 02/25] perf stat record: Add record command
> > - adds record command that creates empty perf.data
> >
> > [PATCH 03/25] perf stat record: Initialize record features
> > - adds FEATURES initialization for stat data
> >
> > [PATCH 04/25] perf stat record: Synthesize stat record data
> > - adds meta data
> >
> > [PATCH 05/25] perf stat record: Store events IDs in perf data file
> > - adds event IDs
> > ...
> >
> >
> > you get proper warning right after patch 3/25, where
> > we store STAT feature bit and properly check it when
> > opening perf.data
>
> But that will be with _new_ tools, right? I'm talking about getting sane
> output from _older_, unmodified, tools, like I demonstrated.

new and old.. there's a change to react to the STAT feature
when opening the perf data file in 03/25 that fixes the
issue.. I moved it there from the 'report' command patch,
so it comes earlier in the patchset

jirka

Subject: [tip:perf/urgent] perf stat: Make stat options global

Commit-ID: e0547311133159bf95f7998726e4e4932d78d8ce
Gitweb: http://git.kernel.org/tip/e0547311133159bf95f7998726e4e4932d78d8ce
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 5 Nov 2015 15:40:45 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 5 Nov 2015 17:54:34 -0300

perf stat: Make stat options global

So they can be used in perf stat record command in following patch.

Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Kan Liang <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 163 +++++++++++++++++++++++-----------------------
1 file changed, 82 insertions(+), 81 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index b74ee0f..e77880b 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -122,6 +122,9 @@ static bool forever = false;
static struct timespec ref_time;
static struct cpu_map *aggr_map;
static aggr_get_id_t aggr_get_id;
+static bool append_file;
+static const char *output_name;
+static int output_fd;

static volatile int done = 0;

@@ -927,6 +930,67 @@ static int stat__set_big_num(const struct option *opt __maybe_unused,
return 0;
}

+static const struct option stat_options[] = {
+ OPT_BOOLEAN('T', "transaction", &transaction_run,
+ "hardware transaction statistics"),
+ OPT_CALLBACK('e', "event", &evsel_list, "event",
+ "event selector. use 'perf list' to list available events",
+ parse_events_option),
+ OPT_CALLBACK(0, "filter", &evsel_list, "filter",
+ "event filter", parse_filter),
+ OPT_BOOLEAN('i', "no-inherit", &no_inherit,
+ "child tasks do not inherit counters"),
+ OPT_STRING('p', "pid", &target.pid, "pid",
+ "stat events on existing process id"),
+ OPT_STRING('t', "tid", &target.tid, "tid",
+ "stat events on existing thread id"),
+ OPT_BOOLEAN('a', "all-cpus", &target.system_wide,
+ "system-wide collection from all CPUs"),
+ OPT_BOOLEAN('g', "group", &group,
+ "put the counters into a counter group"),
+ OPT_BOOLEAN('c', "scale", &stat_config.scale, "scale/normalize counters"),
+ OPT_INCR('v', "verbose", &verbose,
+ "be more verbose (show counter open errors, etc)"),
+ OPT_INTEGER('r', "repeat", &run_count,
+ "repeat command and print average + stddev (max: 100, forever: 0)"),
+ OPT_BOOLEAN('n', "null", &null_run,
+ "null run - dont start any counters"),
+ OPT_INCR('d', "detailed", &detailed_run,
+ "detailed run - start a lot of events"),
+ OPT_BOOLEAN('S', "sync", &sync_run,
+ "call sync() before starting a run"),
+ OPT_CALLBACK_NOOPT('B', "big-num", NULL, NULL,
+ "print large numbers with thousands\' separators",
+ stat__set_big_num),
+ OPT_STRING('C', "cpu", &target.cpu_list, "cpu",
+ "list of cpus to monitor in system-wide"),
+ OPT_SET_UINT('A', "no-aggr", &stat_config.aggr_mode,
+ "disable CPU count aggregation", AGGR_NONE),
+ OPT_STRING('x', "field-separator", &csv_sep, "separator",
+ "print counts with custom separator"),
+ OPT_CALLBACK('G', "cgroup", &evsel_list, "name",
+ "monitor event in cgroup name only", parse_cgroups),
+ OPT_STRING('o', "output", &output_name, "file", "output file name"),
+ OPT_BOOLEAN(0, "append", &append_file, "append to the output file"),
+ OPT_INTEGER(0, "log-fd", &output_fd,
+ "log output to fd, instead of stderr"),
+ OPT_STRING(0, "pre", &pre_cmd, "command",
+ "command to run prior to the measured command"),
+ OPT_STRING(0, "post", &post_cmd, "command",
+ "command to run after to the measured command"),
+ OPT_UINTEGER('I', "interval-print", &stat_config.interval,
+ "print counts at regular interval in ms (>= 10)"),
+ OPT_SET_UINT(0, "per-socket", &stat_config.aggr_mode,
+ "aggregate counts per processor socket", AGGR_SOCKET),
+ OPT_SET_UINT(0, "per-core", &stat_config.aggr_mode,
+ "aggregate counts per physical processor core", AGGR_CORE),
+ OPT_SET_UINT(0, "per-thread", &stat_config.aggr_mode,
+ "aggregate counts per thread", AGGR_THREAD),
+ OPT_UINTEGER('D', "delay", &initial_delay,
+ "ms to wait before starting measurement after program start"),
+ OPT_END()
+};
+
static int perf_stat__get_socket(struct cpu_map *map, int cpu)
{
return cpu_map__get_socket(map, cpu, NULL);
@@ -1174,69 +1238,6 @@ static int add_default_attributes(void)

int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
{
- bool append_file = false;
- int output_fd = 0;
- const char *output_name = NULL;
- const struct option options[] = {
- OPT_BOOLEAN('T', "transaction", &transaction_run,
- "hardware transaction statistics"),
- OPT_CALLBACK('e', "event", &evsel_list, "event",
- "event selector. use 'perf list' to list available events",
- parse_events_option),
- OPT_CALLBACK(0, "filter", &evsel_list, "filter",
- "event filter", parse_filter),
- OPT_BOOLEAN('i', "no-inherit", &no_inherit,
- "child tasks do not inherit counters"),
- OPT_STRING('p', "pid", &target.pid, "pid",
- "stat events on existing process id"),
- OPT_STRING('t', "tid", &target.tid, "tid",
- "stat events on existing thread id"),
- OPT_BOOLEAN('a', "all-cpus", &target.system_wide,
- "system-wide collection from all CPUs"),
- OPT_BOOLEAN('g', "group", &group,
- "put the counters into a counter group"),
- OPT_BOOLEAN('c', "scale", &stat_config.scale, "scale/normalize counters"),
- OPT_INCR('v', "verbose", &verbose,
- "be more verbose (show counter open errors, etc)"),
- OPT_INTEGER('r', "repeat", &run_count,
- "repeat command and print average + stddev (max: 100, forever: 0)"),
- OPT_BOOLEAN('n', "null", &null_run,
- "null run - dont start any counters"),
- OPT_INCR('d', "detailed", &detailed_run,
- "detailed run - start a lot of events"),
- OPT_BOOLEAN('S', "sync", &sync_run,
- "call sync() before starting a run"),
- OPT_CALLBACK_NOOPT('B', "big-num", NULL, NULL,
- "print large numbers with thousands\' separators",
- stat__set_big_num),
- OPT_STRING('C', "cpu", &target.cpu_list, "cpu",
- "list of cpus to monitor in system-wide"),
- OPT_SET_UINT('A', "no-aggr", &stat_config.aggr_mode,
- "disable CPU count aggregation", AGGR_NONE),
- OPT_STRING('x', "field-separator", &csv_sep, "separator",
- "print counts with custom separator"),
- OPT_CALLBACK('G', "cgroup", &evsel_list, "name",
- "monitor event in cgroup name only", parse_cgroups),
- OPT_STRING('o', "output", &output_name, "file", "output file name"),
- OPT_BOOLEAN(0, "append", &append_file, "append to the output file"),
- OPT_INTEGER(0, "log-fd", &output_fd,
- "log output to fd, instead of stderr"),
- OPT_STRING(0, "pre", &pre_cmd, "command",
- "command to run prior to the measured command"),
- OPT_STRING(0, "post", &post_cmd, "command",
- "command to run after to the measured command"),
- OPT_UINTEGER('I', "interval-print", &stat_config.interval,
- "print counts at regular interval in ms (>= 10)"),
- OPT_SET_UINT(0, "per-socket", &stat_config.aggr_mode,
- "aggregate counts per processor socket", AGGR_SOCKET),
- OPT_SET_UINT(0, "per-core", &stat_config.aggr_mode,
- "aggregate counts per physical processor core", AGGR_CORE),
- OPT_SET_UINT(0, "per-thread", &stat_config.aggr_mode,
- "aggregate counts per thread", AGGR_THREAD),
- OPT_UINTEGER('D', "delay", &initial_delay,
- "ms to wait before starting measurement after program start"),
- OPT_END()
- };
const char * const stat_usage[] = {
"perf stat [<options>] [<command>]",
NULL
@@ -1252,7 +1253,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
if (evsel_list == NULL)
return -ENOMEM;

- argc = parse_options(argc, argv, options, stat_usage,
+ argc = parse_options(argc, argv, stat_options, stat_usage,
PARSE_OPT_STOP_AT_NON_OPTION);

interval = stat_config.interval;
@@ -1262,14 +1263,14 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)

if (output_name && output_fd) {
fprintf(stderr, "cannot use both --output and --log-fd\n");
- parse_options_usage(stat_usage, options, "o", 1);
- parse_options_usage(NULL, options, "log-fd", 0);
+ parse_options_usage(stat_usage, stat_options, "o", 1);
+ parse_options_usage(NULL, stat_options, "log-fd", 0);
goto out;
}

if (output_fd < 0) {
fprintf(stderr, "argument to --log-fd must be a > 0\n");
- parse_options_usage(stat_usage, options, "log-fd", 0);
+ parse_options_usage(stat_usage, stat_options, "log-fd", 0);
goto out;
}

@@ -1309,8 +1310,8 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
/* User explicitly passed -B? */
if (big_num_opt == 1) {
fprintf(stderr, "-B option not supported with -x\n");
- parse_options_usage(stat_usage, options, "B", 1);
- parse_options_usage(NULL, options, "x", 1);
+ parse_options_usage(stat_usage, stat_options, "B", 1);
+ parse_options_usage(NULL, stat_options, "x", 1);
goto out;
} else /* Nope, so disable big number formatting */
big_num = false;
@@ -1318,11 +1319,11 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
big_num = false;

if (!argc && target__none(&target))
- usage_with_options(stat_usage, options);
+ usage_with_options(stat_usage, stat_options);

if (run_count < 0) {
pr_err("Run count must be a positive number\n");
- parse_options_usage(stat_usage, options, "r", 1);
+ parse_options_usage(stat_usage, stat_options, "r", 1);
goto out;
} else if (run_count == 0) {
forever = true;
@@ -1332,8 +1333,8 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
if ((stat_config.aggr_mode == AGGR_THREAD) && !target__has_task(&target)) {
fprintf(stderr, "The --per-thread option is only available "
"when monitoring via -p -t options.\n");
- parse_options_usage(NULL, options, "p", 1);
- parse_options_usage(NULL, options, "t", 1);
+ parse_options_usage(NULL, stat_options, "p", 1);
+ parse_options_usage(NULL, stat_options, "t", 1);
goto out;
}

@@ -1347,9 +1348,9 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
fprintf(stderr, "both cgroup and no-aggregation "
"modes only available in system-wide mode\n");

- parse_options_usage(stat_usage, options, "G", 1);
- parse_options_usage(NULL, options, "A", 1);
- parse_options_usage(NULL, options, "a", 1);
+ parse_options_usage(stat_usage, stat_options, "G", 1);
+ parse_options_usage(NULL, stat_options, "A", 1);
+ parse_options_usage(NULL, stat_options, "a", 1);
goto out;
}

@@ -1361,12 +1362,12 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
if (perf_evlist__create_maps(evsel_list, &target) < 0) {
if (target__has_task(&target)) {
pr_err("Problems finding threads of monitor\n");
- parse_options_usage(stat_usage, options, "p", 1);
- parse_options_usage(NULL, options, "t", 1);
+ parse_options_usage(stat_usage, stat_options, "p", 1);
+ parse_options_usage(NULL, stat_options, "t", 1);
} else if (target__has_cpu(&target)) {
perror("failed to parse CPUs map");
- parse_options_usage(stat_usage, options, "C", 1);
- parse_options_usage(NULL, options, "a", 1);
+ parse_options_usage(stat_usage, stat_options, "C", 1);
+ parse_options_usage(NULL, stat_options, "a", 1);
}
goto out;
}
@@ -1381,7 +1382,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
if (interval && interval < 100) {
if (interval < 10) {
pr_err("print interval must be >= 10ms\n");
- parse_options_usage(stat_usage, options, "I", 1);
+ parse_options_usage(stat_usage, stat_options, "I", 1);
goto out;
} else
pr_warning("print interval < 100ms. "

2015-12-02 13:51:20

by Liang, Kan

Subject: RE: [PATCHv6 00/25] perf stat: Add scripting support

Hi Arnaldo and Jirka,

Any update about the status of this patchset?

Thanks,
Kan


2015-12-02 13:59:28

by Jiri Olsa

Subject: Re: [PATCHv6 00/25] perf stat: Add scripting support

On Wed, Dec 02, 2015 at 01:51:13PM +0000, Liang, Kan wrote:
> Hi Arnaldo and Jirka,
>
> Any update about the status of this patchset?

there's a first batch waiting in Arnaldo's perf/stat,
it'll get in soon, hopefully ;-)

I'll rebase the rest on top of it once it's in
and resend..

jirka


2015-12-17 18:57:14

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 17/25] perf stat report: Move csv_sep initialization before report command

Em Thu, Nov 05, 2015 at 03:41:01PM +0100, Jiri Olsa escreveu:
> So we have csv_sep properly initialized before
> report command leg.

I moved this to before "perf stat report: Process stat and stat round
events" so that what you wrote above makes sense, i.e. after this patch
nothing is produced by 'perf stat report' and right after the stat and
stat round one is applied I get:

[acme@ssdandy linux]$ perf stat report

Performance counter stats for '/home/acme/bin/perf stat record usleep 1':

0.411636 task-clock (msec) # 0.571 CPUs utilized
2 context-switches # 0.005 M/sec
0 cpu-migrations # 0.000 K/sec
149 page-faults # 0.362 M/sec
1,291,807 cycles # 3.138 GHz
959,632 stalled-cycles-frontend # 74.29% frontend cycles idle
703,170 stalled-cycles-backend # 54.43% backend cycles idle
757,538 instructions # 0.59 insns per cycle
# 1.27 stalled cycles per insn
133,293 branches # 323.813 M/sec
<not counted> branch-misses (0.00%)

0.000720394 seconds time elapsed

[acme@ssdandy linux]$

And not this ugly thing:

[acme@ssdandy linux]$ perf stat report

Performance counter stats for '/home/acme/bin/perf stat record usleep 1':

0.411636(null) (null)task-clock (msec) # 0.571 CPUs utilized
2(null) (null)context-switches # 0.005 M/sec
0(null) (null)cpu-migrations # 0.000 K/sec
149(null) (null)page-faults # 0.362 M/sec
1,291,807(null) (null)cycles # 3.138 GHz
959,632(null) (null)stalled-cycles-frontend # 74.29% frontend cycles idle
703,170(null) (null)stalled-cycles-backend # 54.43% backend cycles idle
757,538(null) (null)instructions # 0.59 insns per cycle
# 1.27 stalled cycles per insn
133,293(null) (null)branches # 323.813 M/sec
<not counted>(null) (null)branch-misses (0.00%)

0.000720394 seconds time elapsed

[acme@ssdandy linux]$

- Arnaldo

> Tested-by: Kan Liang <[email protected]>
> Link: http://lkml.kernel.org/n/[email protected]
> Signed-off-by: Jiri Olsa <[email protected]>
> ---
> tools/perf/builtin-stat.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 6636d29b3b18..174ffbd02a13 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -1776,6 +1776,13 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
> (const char **) stat_usage,
> PARSE_OPT_STOP_AT_NON_OPTION);
>
> + if (csv_sep) {
> + csv_output = true;
> + if (!strcmp(csv_sep, "\\t"))
> + csv_sep = "\t";
> + } else
> + csv_sep = DEFAULT_SEPARATOR;
> +
> if (argc && !strncmp(argv[0], "rec", 3)) {
> argc = __cmd_record(argc, argv);
> if (argc < 0)
> @@ -1826,13 +1833,6 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
>
> stat_config.output = output;
>
> - if (csv_sep) {
> - csv_output = true;
> - if (!strcmp(csv_sep, "\\t"))
> - csv_sep = "\t";
> - } else
> - csv_sep = DEFAULT_SEPARATOR;
> -
> /*
> * let the spreadsheet do the pretty-printing
> */
> --
> 2.4.3
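
The separator handling that the quoted patch moves can be sketched in isolation. This is a minimal sketch, not the perf code: resolve_separator() and struct sep_config are hypothetical names, and the single-space default is an assumption standing in for DEFAULT_SEPARATOR. The "(null)" strings in the ugly output are presumably what glibc prints when a NULL separator reaches a %s conversion, which is why the sketch never returns NULL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed default; stands in for perf's DEFAULT_SEPARATOR. */
#define DEFAULT_SEPARATOR " "

struct sep_config {
	const char *sep;	/* printed between fields, never NULL */
	bool csv_output;	/* true when the user supplied a separator */
};

/* user_sep is the raw option argument, or NULL when the option was absent. */
static struct sep_config resolve_separator(const char *user_sep)
{
	struct sep_config cfg = { DEFAULT_SEPARATOR, false };

	if (user_sep) {
		cfg.csv_output = true;
		/* Accept the two-character sequence \t as a real tab. */
		cfg.sep = strcmp(user_sep, "\\t") ? user_sep : "\t";
	}
	return cfg;
}
```

Calling resolve_separator() once, before any sub-command is dispatched, gives every output path a usable separator, which is the ordering the patch establishes.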

2015-12-17 19:46:45

by Jiri Olsa

Subject: Re: [PATCH 17/25] perf stat report: Move csv_sep initialization before report command

On Thu, Dec 17, 2015 at 03:57:07PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Nov 05, 2015 at 03:41:01PM +0100, Jiri Olsa escreveu:
> > So we have csv_sep properly initialized before
> > report command leg.
>
> I moved this to before "perf stat report: Process stat and stat round
> events" so that what you wrote above makes sense, i.e. after this patch
> nothing is produced by 'perf stat report' and right after the stat and
> stat round one is applied I get:
>
> [acme@ssdandy linux]$ perf stat report
>
> Performance counter stats for '/home/acme/bin/perf stat record usleep 1':
>
> 0.411636 task-clock (msec) # 0.571 CPUs utilized
> 2 context-switches # 0.005 M/sec
> 0 cpu-migrations # 0.000 K/sec
> 149 page-faults # 0.362 M/sec
> 1,291,807 cycles # 3.138 GHz
> 959,632 stalled-cycles-frontend # 74.29% frontend cycles idle
> 703,170 stalled-cycles-backend # 54.43% backend cycles idle
> 757,538 instructions # 0.59 insns per cycle
> # 1.27 stalled cycles per insn
> 133,293 branches # 323.813 M/sec
> <not counted> branch-misses (0.00%)
>
> 0.000720394 seconds time elapsed
>
> [acme@ssdandy linux]$
>
> And not this ugly thing:
>
> [acme@ssdandy linux]$ perf stat report
>
> Performance counter stats for '/home/acme/bin/perf stat record usleep 1':
>
> 0.411636(null) (null)task-clock (msec) # 0.571 CPUs utilized
> 2(null) (null)context-switches # 0.005 M/sec
> 0(null) (null)cpu-migrations # 0.000 K/sec
> 149(null) (null)page-faults # 0.362 M/sec
> 1,291,807(null) (null)cycles # 3.138 GHz
> 959,632(null) (null)stalled-cycles-frontend # 74.29% frontend cycles idle
> 703,170(null) (null)stalled-cycles-backend # 54.43% backend cycles idle
> 757,538(null) (null)instructions # 0.59 insns per cycle
> # 1.27 stalled cycles per insn
> 133,293(null) (null)branches # 323.813 M/sec
> <not counted>(null) (null)branch-misses (0.00%)
>
> 0.000720394 seconds time elapsed
>
> [acme@ssdandy linux]$

sounds good, thanks

jirka

Subject: [tip:perf/core] perf stat record: Initialize record features

Commit-ID: 3ba78bd00e508bf46a6aa2b8e296dc8287ea4c29
Gitweb: http://git.kernel.org/tip/3ba78bd00e508bf46a6aa2b8e296dc8287ea4c29
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 5 Nov 2015 15:40:47 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 17 Dec 2015 15:15:17 -0300

perf stat record: Initialize record features

Disable all non-stat-related features.

Also, as we now enable the STAT feature in the data file, add code to
instruct session open to skip sample type checking for stat data files.
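
The "enable everything, then clear what stat does not need" pattern, together with the early return on the STAT feature, can be sketched on a plain bitmask. All names below (enum feat, struct header, open_check) are hypothetical, and the feature list is much shorter than perf's real HEADER_* set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature ids; perf's real HEADER_* list is longer. */
enum feat {
	FEAT_TRACING_DATA,
	FEAT_BUILD_ID,
	FEAT_BRANCH_STACK,
	FEAT_AUXTRACE,
	FEAT_STAT,
	FEAT_MAX
};

struct header {
	uint64_t feat_bits;	/* one bit per feature */
};

static void set_feat(struct header *h, int f)
{
	assert(f >= 0 && f < FEAT_MAX);
	h->feat_bits |= UINT64_C(1) << f;
}

static void clear_feat(struct header *h, int f)
{
	assert(f >= 0 && f < FEAT_MAX);
	h->feat_bits &= ~(UINT64_C(1) << f);
}

static bool has_feat(const struct header *h, int f)
{
	assert(f >= 0 && f < FEAT_MAX);
	return h->feat_bits & (UINT64_C(1) << f);
}

/* Enable everything, then switch off what a stat-only file does not need. */
static void init_stat_features(struct header *h)
{
	for (int f = 0; f < FEAT_MAX; f++)
		set_feat(h, f);

	clear_feat(h, FEAT_BUILD_ID);
	clear_feat(h, FEAT_TRACING_DATA);
	clear_feat(h, FEAT_BRANCH_STACK);
	clear_feat(h, FEAT_AUXTRACE);
}

/* Returns 0 when the session may be opened, -1 otherwise. */
static int open_check(const struct header *h, bool sample_types_match)
{
	if (has_feat(h, FEAT_STAT))
		return 0;	/* stat data carries no samples to compare */
	return sample_types_match ? 0 : -1;
}
```

With this shape, a file written by the record side passes open_check() even when the sample types disagree, while a file without the STAT bit still gets the usual check.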

Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Tested-by: Kan Liang <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 15 +++++++++++++++
tools/perf/util/session.c | 3 +++
2 files changed, 18 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index af2a3bf..c9c896a 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1310,6 +1310,19 @@ static const char * const recort_usage[] = {
NULL,
};

+static void init_features(struct perf_session *session)
+{
+ int feat;
+
+ for (feat = HEADER_FIRST_FEATURE; feat < HEADER_LAST_FEATURE; feat++)
+ perf_header__set_feat(&session->header, feat);
+
+ perf_header__clear_feat(&session->header, HEADER_BUILD_ID);
+ perf_header__clear_feat(&session->header, HEADER_TRACING_DATA);
+ perf_header__clear_feat(&session->header, HEADER_BRANCH_STACK);
+ perf_header__clear_feat(&session->header, HEADER_AUXTRACE);
+}
+
static int __cmd_record(int argc, const char **argv)
{
struct perf_session *session;
@@ -1331,6 +1344,8 @@ static int __cmd_record(int argc, const char **argv)
if (perf_stat.file.is_pipe)
return -EINVAL;

+ init_features(session);
+
session->evlist = evsel_list;
perf_stat.session = session;
perf_stat.record = true;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index a90c74b..d5636ba 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -37,6 +37,9 @@ static int perf_session__open(struct perf_session *session)
if (perf_data_file__is_pipe(file))
return 0;

+ if (perf_header__has_feat(&session->header, HEADER_STAT))
+ return 0;
+
if (!perf_evlist__valid_sample_type(session->evlist)) {
pr_err("non matching sample_type\n");
return -1;

Subject: [tip:perf/core] perf stat record: Synthesize stat record data

Commit-ID: 8b99b1a4e0b082ea6a277766982dac84483d4d3c
Gitweb: http://git.kernel.org/tip/8b99b1a4e0b082ea6a277766982dac84483d4d3c
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 5 Nov 2015 15:40:48 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 17 Dec 2015 15:15:18 -0300

perf stat record: Synthesize stat record data

Synthesize the stat record data needed by report/script:
- cpu/thread maps
- stat config

Committer note:

New records generated on a perf.data file with this patch:

$ perf report -D | grep PERF_RECORD_
0x568 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 29097
0x590 [0x12]: PERF_RECORD_CPU_MAP nr: 1 cpu: 65535
0x5a2 [0x40]: PERF_RECORD_STAT_CONFIG
$
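
The dump above lists each record as "offset [size]", and the records are laid out back to back: 0x568 + 0x28 = 0x590 and 0x590 + 0x12 = 0x5a2. A small sketch of that arithmetic, with layout_records() being a hypothetical helper rather than anything in perf:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Fill offsets[i] with the file offset of record i, given the offset of
 * the first record and each record's size. Returns the offset just past
 * the last record.
 */
static uint64_t layout_records(uint64_t start, const uint16_t *sizes,
			       size_t n, uint64_t *offsets)
{
	uint64_t off = start;

	assert(sizes && offsets);
	for (size_t i = 0; i < n; i++) {
		offsets[i] = off;
		off += sizes[i];
	}
	return off;
}
```

Feeding it the three sizes from the dump (0x28, 0x12, 0x40) with a start of 0x568 reproduces the offsets shown and gives 0x5e2 as the position where the next record would begin.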

Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Tested-by: Kan Liang <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Adjusted wrt kernel PERF_RECORD_MMAP added when introducing 'perf stat record' ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 56 ++++++++++++++++++++++++++++++++++++-----------
1 file changed, 43 insertions(+), 13 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index c9c896a..45bf4d2 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -217,26 +217,20 @@ static inline int nsec_counter(struct perf_evsel *evsel)
return 0;
}

-static int perf_stat__write(struct perf_stat *stat, void *bf, size_t size)
+static int process_synthesized_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_sample *sample __maybe_unused,
+ struct machine *machine __maybe_unused)
{
- if (perf_data_file__write(stat->session->file, bf, size) < 0) {
+ if (perf_data_file__write(&perf_stat.file, event, event->header.size) < 0) {
pr_err("failed to write perf data, error: %m\n");
return -1;
}

- stat->bytes_written += size;
+ perf_stat.bytes_written += event->header.size;
return 0;
}

-static int process_synthesized_event(struct perf_tool *tool,
- union perf_event *event,
- struct perf_sample *sample __maybe_unused,
- struct machine *machine __maybe_unused)
-{
- struct perf_stat *stat = (void *)tool;
- return perf_stat__write(stat, event, event->header.size);
-}
-
/*
* Read out the results of a single counter:
* do not aggregate counts across CPUs in system-wide mode
@@ -323,6 +317,35 @@ static void workload_exec_failed_signal(int signo __maybe_unused, siginfo_t *inf
workload_exec_errno = info->si_value.sival_int;
}

+static int perf_stat_synthesize_config(void)
+{
+ int err;
+
+ err = perf_event__synthesize_thread_map2(NULL, evsel_list->threads,
+ process_synthesized_event,
+ NULL);
+ if (err < 0) {
+ pr_err("Couldn't synthesize thread map.\n");
+ return err;
+ }
+
+ err = perf_event__synthesize_cpu_map(NULL, evsel_list->cpus,
+ process_synthesized_event, NULL);
+ if (err < 0) {
+ pr_err("Couldn't synthesize thread map.\n");
+ return err;
+ }
+
+ err = perf_event__synthesize_stat_config(NULL, &stat_config,
+ process_synthesized_event, NULL);
+ if (err < 0) {
+ pr_err("Couldn't synthesize config.\n");
+ return err;
+ }
+
+ return 0;
+}
+
static int __run_perf_stat(int argc, const char **argv)
{
int interval = stat_config.interval;
@@ -403,6 +426,10 @@ static int __run_perf_stat(int argc, const char **argv)
fd, false);
if (err < 0)
return err;
+
+ err = perf_stat_synthesize_config();
+ if (err < 0)
+ return err;
}

/*
@@ -1560,7 +1587,10 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
* a saner message about no samples being in the perf.data file.
*
* This also serves to suppress a warning about f_header.data.size == 0
- * in header.c. -acme
+ * in header.c at the moment 'perf stat record' gets introduced, which
+ * is not really needed once we start adding the stat specific PERF_RECORD_
+ * records, but the need to suppress the kptr_restrict messages in older
+ * tools remain -acme
*/
int fd = perf_data_file__fd(&perf_stat.file);
int err = perf_event__synthesize_kernel_mmap((void *)&perf_stat,

Subject: [tip:perf/core] perf evlist: Export id_add_fd()

Commit-ID: 1c59612de0264790698e32eb0368daf3fcba4c65
Gitweb: http://git.kernel.org/tip/1c59612de0264790698e32eb0368daf3fcba4c65
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 5 Nov 2015 15:40:49 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 17 Dec 2015 15:15:19 -0300

perf evlist: Export id_add_fd()

Will be used for storing the event IDs in the evlist object so they get
stored in the perf.data file.

Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Kan Liang <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Split from the patch storing the ids in the perf.data file ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/evlist.c | 6 +++---
tools/perf/util/evlist.h | 3 +++
2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 8c44aad..b9eac0d 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -534,9 +534,9 @@ void perf_evlist__id_add(struct perf_evlist *evlist, struct perf_evsel *evsel,
evsel->id[evsel->ids++] = id;
}

-static int perf_evlist__id_add_fd(struct perf_evlist *evlist,
- struct perf_evsel *evsel,
- int cpu, int thread, int fd)
+int perf_evlist__id_add_fd(struct perf_evlist *evlist,
+ struct perf_evsel *evsel,
+ int cpu, int thread, int fd)
{
u64 read_data[4] = { 0, };
int id_idx = 1; /* The first entry is the counter value */
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index a459fe7..139a500 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -97,6 +97,9 @@ perf_evlist__find_tracepoint_by_name(struct perf_evlist *evlist,

void perf_evlist__id_add(struct perf_evlist *evlist, struct perf_evsel *evsel,
int cpu, int thread, u64 id);
+int perf_evlist__id_add_fd(struct perf_evlist *evlist,
+ struct perf_evsel *evsel,
+ int cpu, int thread, int fd);

int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd);
int perf_evlist__alloc_pollfd(struct perf_evlist *evlist);

Subject: [tip:perf/core] perf stat record: Store events IDs in perf data file

Commit-ID: 2af4646d1041ee590b0032d2b0103fa81aa43174
Gitweb: http://git.kernel.org/tip/2af4646d1041ee590b0032d2b0103fa81aa43174
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 5 Nov 2015 15:40:49 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 17 Dec 2015 15:15:21 -0300

perf stat record: Store events IDs in perf data file

Store event IDs in the evlist object so they get stored in the perf.data file.
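
The per-counter walk over every (cpu, thread) file descriptor can be sketched with a flat row-major array. This is a simplified stand-in for the xyarray used in the patch; struct fd_grid, for_each_fd() and the callback are hypothetical names, and the callback only counts descriptors instead of reading an ID from the kernel.

```c
#include <assert.h>
#include <stddef.h>

struct fd_grid {
	int nr_cpus;
	int nr_threads;
	int *fds;	/* nr_cpus * nr_threads entries, row-major by cpu */
};

static int *fd_grid_entry(struct fd_grid *g, int cpu, int thread)
{
	assert(cpu >= 0 && cpu < g->nr_cpus);
	assert(thread >= 0 && thread < g->nr_threads);
	return &g->fds[cpu * g->nr_threads + thread];
}

typedef int (*add_id_fn)(int cpu, int thread, int fd, void *ctx);

/* Visit every descriptor; stop at the first callback failure. */
static int for_each_fd(struct fd_grid *g, add_id_fn add_id, void *ctx)
{
	for (int cpu = 0; cpu < g->nr_cpus; cpu++) {
		for (int thread = 0; thread < g->nr_threads; thread++) {
			int fd = *fd_grid_entry(g, cpu, thread);

			if (add_id(cpu, thread, fd, ctx) < 0)
				return -1;
		}
	}
	return 0;
}

/* Example callback: count valid descriptors, fail on a closed one. */
static int count_valid_fd(int cpu, int thread, int fd, void *ctx)
{
	(void)cpu;
	(void)thread;
	if (fd < 0)
		return -1;
	++*(int *)ctx;
	return 0;
}
```

Failing fast on the first bad descriptor mirrors the patch, where any error from the ID lookup aborts the run before recording starts.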

Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Kan Liang <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 35 +++++++++++++++++++++++++++++++++++
1 file changed, 35 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 45bf4d2..39d0c30f 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -346,6 +346,38 @@ static int perf_stat_synthesize_config(void)
return 0;
}

+#define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
+
+static int __store_counter_ids(struct perf_evsel *counter,
+ struct cpu_map *cpus,
+ struct thread_map *threads)
+{
+ int cpu, thread;
+
+ for (cpu = 0; cpu < cpus->nr; cpu++) {
+ for (thread = 0; thread < threads->nr; thread++) {
+ int fd = FD(counter, cpu, thread);
+
+ if (perf_evlist__id_add_fd(evsel_list, counter,
+ cpu, thread, fd) < 0)
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
+static int store_counter_ids(struct perf_evsel *counter)
+{
+ struct cpu_map *cpus = counter->cpus;
+ struct thread_map *threads = counter->threads;
+
+ if (perf_evsel__alloc_id(counter, cpus->nr, threads->nr))
+ return -ENOMEM;
+
+ return __store_counter_ids(counter, cpus, threads);
+}
+
static int __run_perf_stat(int argc, const char **argv)
{
int interval = stat_config.interval;
@@ -410,6 +442,9 @@ static int __run_perf_stat(int argc, const char **argv)
l = strlen(counter->unit);
if (l > unit_width)
unit_width = l;
+
+ if (STAT_RECORD && store_counter_ids(counter))
+ return -1;
}

if (perf_evlist__apply_filters(evsel_list, &counter)) {

Subject: [tip:perf/core] perf stat record: Add pipe support for record command

Commit-ID: 664c98d4e1c2ff60627d78d4c8ae81cd2df13783
Gitweb: http://git.kernel.org/tip/664c98d4e1c2ff60627d78d4c8ae81cd2df13783
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 5 Nov 2015 15:40:50 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 17 Dec 2015 15:15:22 -0300

perf stat record: Add pipe support for record command

Allow storing stat record data into a pipe, so report tools
(report/script) can read data directly from record.
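
The difference between the two output modes comes down to whether the header can be revisited once the data size is known. A minimal sketch under that assumption, with struct stat_output and its helpers being hypothetical names that only track bookkeeping and perform no I/O:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct stat_output {
	bool is_pipe;
	uint64_t data_size;	/* value that ends up in the file header */
	uint64_t bytes_written;	/* running total of record bytes */
	int header_writes;	/* how many times a header was emitted */
};

static void output_begin(struct stat_output *o, bool is_pipe)
{
	assert(o);
	o->is_pipe = is_pipe;
	o->data_size = 0;
	o->bytes_written = 0;
	o->header_writes = 1;	/* pipe header or provisional file header */
}

static void output_record(struct stat_output *o, uint16_t size)
{
	assert(o);
	o->bytes_written += size;
}

/* A pipe cannot seek back, so only a regular file gets its header redone. */
static void output_finish(struct stat_output *o)
{
	assert(o);
	if (o->is_pipe)
		return;
	o->data_size += o->bytes_written;
	o->header_writes++;
}
```

In pipe mode the reader therefore has to learn everything from the stream itself, which is why the patch synthesizes attr events up front when is_pipe is set.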

Committer note:

Before this patch:

$ perf stat record -o - usleep 1 | perf report -i -
incompatible file format (rerun with -v to learn more)
$ perf stat record -o - usleep 1 | perf script -i -
incompatible file format (rerun with -v to learn more)
$ ls -la perf.data
ls: cannot access perf.data: No such file or directory
$

After:

$ perf stat record -o - usleep 1 | perf report -i -
# To display the perf.data header info, please use
# --header/--header-only options.
#
Error:
The - file has no samples!
$ perf stat record -o - usleep 1 | perf script -i -
Display of symbols requested but neither sample IP nor sample address
is selected. Hence, no addresses to convert to symbols.
0 [0x80]: failed to process type: 64
$ ls -la perf.data
ls: cannot access perf.data: No such file or directory
$

Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Tested-by: Kan Liang <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 39 ++++++++++++++++++++++++++++-----------
1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 39d0c30f..8a2f9ce 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -317,10 +317,19 @@ static void workload_exec_failed_signal(int signo __maybe_unused, siginfo_t *inf
workload_exec_errno = info->si_value.sival_int;
}

-static int perf_stat_synthesize_config(void)
+static int perf_stat_synthesize_config(bool is_pipe)
{
int err;

+ if (is_pipe) {
+ err = perf_event__synthesize_attrs(NULL, perf_stat.session,
+ process_synthesized_event);
+ if (err < 0) {
+ pr_err("Couldn't synthesize attrs.\n");
+ return err;
+ }
+ }
+
err = perf_event__synthesize_thread_map2(NULL, evsel_list->threads,
process_synthesized_event,
NULL);
@@ -388,6 +397,7 @@ static int __run_perf_stat(int argc, const char **argv)
size_t l;
int status = 0;
const bool forks = (argc > 0);
+ bool is_pipe = STAT_RECORD ? perf_stat.file.is_pipe : false;

if (interval) {
ts.tv_sec = interval / 1000;
@@ -398,7 +408,7 @@ static int __run_perf_stat(int argc, const char **argv)
}

if (forks) {
- if (perf_evlist__prepare_workload(evsel_list, &target, argv, false,
+ if (perf_evlist__prepare_workload(evsel_list, &target, argv, is_pipe,
workload_exec_failed_signal) < 0) {
perror("failed to prepare workload");
return -1;
@@ -457,12 +467,17 @@ static int __run_perf_stat(int argc, const char **argv)
if (STAT_RECORD) {
int err, fd = perf_data_file__fd(&perf_stat.file);

- err = perf_session__write_header(perf_stat.session, evsel_list,
- fd, false);
+ if (is_pipe) {
+ err = perf_header__write_pipe(perf_data_file__fd(&perf_stat.file));
+ } else {
+ err = perf_session__write_header(perf_stat.session, evsel_list,
+ fd, false);
+ }
+
if (err < 0)
return err;

- err = perf_stat_synthesize_config();
+ err = perf_stat_synthesize_config(is_pipe);
if (err < 0)
return err;
}
@@ -970,6 +985,10 @@ static void print_counters(struct timespec *ts, int argc, const char **argv)
struct perf_evsel *counter;
char buf[64], *prefix = NULL;

+ /* Do not print anything if we record to the pipe. */
+ if (STAT_RECORD && perf_stat.file.is_pipe)
+ return;
+
if (interval)
print_interval(prefix = buf, ts);
else
@@ -1402,10 +1421,6 @@ static int __cmd_record(int argc, const char **argv)
return -1;
}

- /* No pipe support ATM */
- if (perf_stat.file.is_pipe)
- return -EINVAL;
-
init_features(session);

session->evlist = evsel_list;
@@ -1636,8 +1651,10 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
"older tools may produce warnings about this file\n.");
}

- perf_stat.session->header.data_size += perf_stat.bytes_written;
- perf_session__write_header(perf_stat.session, evsel_list, fd, true);
+ if (!perf_stat.file.is_pipe) {
+ perf_stat.session->header.data_size += perf_stat.bytes_written;
+ perf_session__write_header(perf_stat.session, evsel_list, fd, true);
+ }

perf_session__delete(perf_stat.session);
}

Subject: [tip:perf/core] perf stat record: Write stat events on record

Commit-ID: 5a6ea81b8f9ce2736759d256ac4d63be65751199
Gitweb: http://git.kernel.org/tip/5a6ea81b8f9ce2736759d256ac4d63be65751199
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 5 Nov 2015 15:40:51 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 17 Dec 2015 16:00:22 -0300

perf stat record: Write stat events on record

Write stat events on 'perf stat record' at the time we read counter
values from the kernel.

Committer note:

After the patch:

$ perf stat record usleep 1

Performance counter stats for 'usleep 1':

0.598006 task-clock (msec) # 0.484 CPUs utilized
1 context-switches # 0.002 M/sec
0 cpu-migrations # 0.000 K/sec
52 page-faults # 0.087 M/sec
882,744 cycles # 1.476 GHz
581,416 stalled-cycles-frontend # 65.86% frontend cycles idle
<not supported> stalled-cycles-backend
636,479 instructions # 0.72 insns per cycle
# 0.91 stalled cycles per insn
129,334 branches # 216.275 M/sec
7,512 branch-misses # 5.81% of all branches

0.001235157 seconds time elapsed

$ oldperf evlist
task-clock
context-switches
cpu-migrations
page-faults
cycles
stalled-cycles-frontend
stalled-cycles-backend
instructions
branches
branch-misses
$ oldperf report --stdio
Error:
The perf.data file has no samples!
# To display the perf.data header info, please use --header/--header-only options.
#
$ perf report -D | grep PERF_RECORD
0x5b0 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 5504
0x5d8 [0x12]: PERF_RECORD_CPU_MAP nr: 1 cpu: 65535
0x5ea [0x40]: PERF_RECORD_STAT_CONFIG
0x62a [0x30]: PERF_RECORD_STAT
0x65a [0x30]: PERF_RECORD_STAT
0x68a [0x30]: PERF_RECORD_STAT
0x6ba [0x30]: PERF_RECORD_STAT
0x6ea [0x30]: PERF_RECORD_STAT
0x71a [0x30]: PERF_RECORD_STAT
0x74a [0x30]: PERF_RECORD_STAT
0x77a [0x30]: PERF_RECORD_STAT
0x7aa [0x30]: PERF_RECORD_STAT
-1 -1 0x7da [0x40]: PERF_RECORD_MMAP -1/0: [0xffffffff81000000(0x1f000000) @ 0xffffffff81000000]: x [kernel.kallsyms]_text
$
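
Each PERF_RECORD_STAT in the dump above occupies 0x30 bytes. One layout that adds up to that size is an 8-byte header followed by a 64-bit ID, 32-bit cpu and thread indexes, and three 64-bit counts. The struct below is an illustrative assumption with hypothetical names, not a copy of perf's definition, and the record type is passed in rather than hard-coded.

```c
#include <assert.h>
#include <stdint.h>

struct record_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total record size, header included */
};

struct stat_record {
	struct record_header header;
	uint64_t id;		/* event ID the values belong to */
	uint32_t cpu;
	uint32_t thread;
	uint64_t val;		/* counter value */
	uint64_t ena;		/* time the event was enabled */
	uint64_t run;		/* time the event was running */
};

_Static_assert(sizeof(struct stat_record) == 0x30,
	       "layout should match the 0x30 size seen in the dump");

static struct stat_record make_stat_record(uint32_t type, uint64_t id,
					   uint32_t cpu, uint32_t thread,
					   uint64_t val, uint64_t ena,
					   uint64_t run)
{
	struct stat_record r = {
		.header = { .type = type, .misc = 0, .size = sizeof(r) },
		.id = id, .cpu = cpu, .thread = thread,
		.val = val, .ena = ena, .run = run,
	};

	assert(r.header.size == sizeof(struct stat_record));
	return r;
}
```

One such record is produced per (counter, cpu, thread) read, which would fit the nine PERF_RECORD_STAT lines in the dump if the unsupported stalled-cycles-backend event is skipped.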

Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Kan Liang <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 8a2f9ce..32aa2ea 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -231,6 +231,18 @@ static int process_synthesized_event(struct perf_tool *tool __maybe_unused,
return 0;
}

+#define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
+
+static int
+perf_evsel__write_stat_event(struct perf_evsel *counter, u32 cpu, u32 thread,
+ struct perf_counts_values *count)
+{
+ struct perf_sample_id *sid = SID(counter, cpu, thread);
+
+ return perf_event__synthesize_stat(NULL, cpu, thread, sid->id, count,
+ process_synthesized_event, NULL);
+}
+
/*
* Read out the results of a single counter:
* do not aggregate counts across CPUs in system-wide mode
@@ -254,6 +266,13 @@ static int read_counter(struct perf_evsel *counter)
count = perf_counts(counter->counts, cpu, thread);
if (perf_evsel__read(counter, cpu, thread, count))
return -1;
+
+ if (STAT_RECORD) {
+ if (perf_evsel__write_stat_event(counter, cpu, thread, count)) {
+ pr_err("failed to write stat event\n");
+ return -1;
+ }
+ }
}
}

Subject: [tip:perf/core] perf stat record: Write stat round events on record

Commit-ID: 7aad0c32bb6aaa39aab596264ddc49d44c8088f3
Gitweb: http://git.kernel.org/tip/7aad0c32bb6aaa39aab596264ddc49d44c8088f3
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 5 Nov 2015 15:40:52 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 17 Dec 2015 16:00:31 -0300

perf stat record: Write stat round events on record

Write a stat round event on 'perf stat record' for each interval round.
In non-interval mode we store the round event after the last stat event.

Committer note:

After the patch:

$ perf report -D | grep PERF_RECORD | grep ROUND
0x852 [0x18]: PERF_RECORD_STAT_ROUND
$
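
The round record in the dump is 0x18 bytes, which fits an 8-byte header plus a 64-bit type and a 64-bit timestamp. The sketch below shows that layout together with the seconds-to-nanoseconds conversion used for interval rounds. The struct, the enum values and the helper names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NSECS_PER_SEC UINT64_C(1000000000)

/* Hypothetical stand-ins for the interval/final round types. */
enum round_type {
	ROUND_TYPE_INTERVAL,
	ROUND_TYPE_FINAL
};

struct record_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

struct round_record {
	struct record_header header;
	uint64_t type;	/* enum round_type */
	uint64_t time;	/* nanoseconds */
};

_Static_assert(sizeof(struct round_record) == 0x18,
	       "layout should match the 0x18 size seen in the dump");

/* Flatten a seconds/nanoseconds pair into a single nanosecond count. */
static uint64_t interval_ns(uint64_t sec, uint64_t nsec)
{
	assert(nsec < NSECS_PER_SEC);
	return sec * NSECS_PER_SEC + nsec;
}

static struct round_record make_round(uint32_t rec_type, enum round_type t,
				      uint64_t time_ns)
{
	struct round_record r = {
		.header = { .type = rec_type, .misc = 0, .size = sizeof(r) },
		.type = (uint64_t)t,
		.time = time_ns,
	};

	return r;
}
```

An interval round would carry interval_ns() of the elapsed time since the reference timestamp, while the single final round carries the total wall time instead.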

Reported-by: Kan Liang <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 32aa2ea..fcece42 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -231,6 +231,16 @@ static int process_synthesized_event(struct perf_tool *tool __maybe_unused,
return 0;
}

+static int write_stat_round_event(u64 time, u64 type)
+{
+ return perf_event__synthesize_stat_round(NULL, time, type,
+ process_synthesized_event,
+ NULL);
+}
+
+#define WRITE_STAT_ROUND_EVENT(time, interval) \
+ write_stat_round_event(time, PERF_STAT_ROUND_TYPE__ ## interval)
+
#define SID(e, x, y) xyarray__entry(e->sample_id, x, y)

static int
@@ -306,6 +316,11 @@ static void process_interval(void)
clock_gettime(CLOCK_MONOTONIC, &ts);
diff_timespec(&rs, &ts, &ref_time);

+ if (STAT_RECORD) {
+ if (WRITE_STAT_ROUND_EVENT(rs.tv_sec * NSECS_PER_SEC + rs.tv_nsec, INTERVAL))
+ pr_err("failed to write stat round event\n");
+ }
+
print_counters(&rs, 0, NULL);
}

@@ -1670,6 +1685,11 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
"older tools may produce warnings about this file\n.");
}

+ if (!interval) {
+ if (WRITE_STAT_ROUND_EVENT(walltime_nsecs_stats.max, FINAL))
+ pr_err("failed to write stat round event\n");
+ }
+
if (!perf_stat.file.is_pipe) {
perf_stat.session->header.data_size += perf_stat.bytes_written;
perf_session__write_header(perf_stat.session, evsel_list, fd, true);

Subject: [tip:perf/core] perf stat record: Do not allow record with multiple runs mode

Commit-ID: e9d6db8e8df42a38f79f264ab58c104e1678b12c
Gitweb: http://git.kernel.org/tip/e9d6db8e8df42a38f79f264ab58c104e1678b12c
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 5 Nov 2015 15:40:53 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 17 Dec 2015 16:00:32 -0300

perf stat record: Do not allow record with multiple runs mode

We currently don't support storing multiple sessions in perf.data,
so we can't allow the -r option in stat record.

$ perf stat -e cycles -r 2 record ls
Cannot use -r option with perf stat record.

Committer note:

Before this patch we would get a perf.data file such as:

$ perf stat -e cycles -r 2 record ls
<SNIP>

Performance counter stats for 'ls' (2 runs):

3,935,236 cycles

0.002353261 seconds time elapsed ( +- 4.76% )

$ perf report -D | grep PERF_RECORD | grep ROUND
0xf0 [0]: failed to process type: 16
Error:
failed to process sample
$

Reported-by: Kan Liang <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index fcece42..10f86a6 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1449,6 +1449,11 @@ static int __cmd_record(int argc, const char **argv)
if (output_name)
file->path = output_name;

+ if (run_count != 1 || forever) {
+ pr_err("Cannot use -r option with perf stat record.\n");
+ return -1;
+ }
+
session = perf_session__new(file, false, NULL);
if (session == NULL) {
pr_err("Perf session creation failed.\n");

Subject: [tip:perf/core] perf stat record: Synthesize event update events

Commit-ID: 7b60a7e3a687481553d2b6ec7e6390a6e82f1849
Gitweb: http://git.kernel.org/tip/7b60a7e3a687481553d2b6ec7e6390a6e82f1849
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 5 Nov 2015 15:40:54 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 17 Dec 2015 16:00:33 -0300

perf stat record: Synthesize event update events

Synthesize other event data not carried within the attr event: unit,
scale, name.
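
Which update records a counter needs can be expressed as a small decision function. This sketch follows the conditions visible in the diff below (unit only when non-empty, scale only when not 1, cpus only when the event has its own CPU list, name only for pipe output); struct counter_info and the UPDATE_* flags are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum {
	UPDATE_UNIT  = 1 << 0,
	UPDATE_SCALE = 1 << 1,
	UPDATE_CPUS  = 1 << 2,
	UPDATE_NAME  = 1 << 3
};

struct counter_info {
	const char *unit;	/* may be NULL or empty */
	double scale;		/* 1 means "no scaling" */
	bool own_cpus;		/* event has its own CPU list */
	bool supported;
};

/* Returns a mask of UPDATE_* flags describing what must be synthesized. */
static unsigned int updates_needed(const struct counter_info *c, bool is_pipe)
{
	unsigned int mask = 0;

	assert(c != NULL);
	if (!c->supported)
		return 0;	/* unsupported counters are skipped entirely */

	if (c->unit && *c->unit)
		mask |= UPDATE_UNIT;
	if (c->scale != 1)
		mask |= UPDATE_SCALE;
	if (c->own_cpus)
		mask |= UPDATE_CPUS;
	if (is_pipe)
		mask |= UPDATE_NAME;	/* a regular file already carries names */

	return mask;
}
```

For a plain event such as cycles written to a regular file the mask is empty, so no update record is needed at all.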

Reported-by: Kan Liang <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 59 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 10f86a6..575e253 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -351,8 +351,19 @@ static void workload_exec_failed_signal(int signo __maybe_unused, siginfo_t *inf
workload_exec_errno = info->si_value.sival_int;
}

+static bool has_unit(struct perf_evsel *counter)
+{
+ return counter->unit && *counter->unit;
+}
+
+static bool has_scale(struct perf_evsel *counter)
+{
+ return counter->scale != 1;
+}
+
static int perf_stat_synthesize_config(bool is_pipe)
{
+ struct perf_evsel *counter;
int err;

if (is_pipe) {
@@ -364,6 +375,54 @@ static int perf_stat_synthesize_config(bool is_pipe)
}
}

+ /*
+ * Synthesize other events stuff not carried within
+ * attr event - unit, scale, name
+ */
+ evlist__for_each(evsel_list, counter) {
+ if (!counter->supported)
+ continue;
+
+ /*
+ * Synthesize unit and scale only if it's defined.
+ */
+ if (has_unit(counter)) {
+ err = perf_event__synthesize_event_update_unit(NULL, counter, process_synthesized_event);
+ if (err < 0) {
+ pr_err("Couldn't synthesize evsel unit.\n");
+ return err;
+ }
+ }
+
+ if (has_scale(counter)) {
+ err = perf_event__synthesize_event_update_scale(NULL, counter, process_synthesized_event);
+ if (err < 0) {
+ pr_err("Couldn't synthesize evsel scale.\n");
+ return err;
+ }
+ }
+
+ if (counter->own_cpus) {
+ err = perf_event__synthesize_event_update_cpus(NULL, counter, process_synthesized_event);
+ if (err < 0) {
+ pr_err("Couldn't synthesize evsel scale.\n");
+ return err;
+ }
+ }
+
+ /*
+ * Name is needed only for pipe output,
+ * perf.data carries event names.
+ */
+ if (is_pipe) {
+ err = perf_event__synthesize_event_update_name(NULL, counter, process_synthesized_event);
+ if (err < 0) {
+ pr_err("Couldn't synthesize evsel name.\n");
+ return err;
+ }
+ }
+ }
+
err = perf_event__synthesize_thread_map2(NULL, evsel_list->threads,
process_synthesized_event,
NULL);

Subject: [tip:perf/core] perf stat report: Add report command

Commit-ID: ba6039b6c8fcc24de7d6ab7b0bada4becaf84a2c
Gitweb: http://git.kernel.org/tip/ba6039b6c8fcc24de7d6ab7b0bada4becaf84a2c
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 5 Nov 2015 15:40:55 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 17 Dec 2015 16:00:34 -0300

perf stat report: Add report command

Add 'perf stat report' command support. ATM it only processes attr
events and displays nothing.
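
The input selection done by the report command can be sketched as a pure function plus a small probe of stdin: an explicit -i argument wins, otherwise "-" is used when stdin is a FIFO and "perf.data" when it is not. The helper names pick_input() and stdin_is_fifo() are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* True when standard input is the read end of a pipe. */
static bool stdin_is_fifo(void)
{
	struct stat st;

	return fstat(STDIN_FILENO, &st) == 0 && S_ISFIFO(st.st_mode);
}

/*
 * requested is the -i argument or NULL; from_fifo is normally the
 * result of stdin_is_fifo(), passed in so the choice stays testable.
 */
static const char *pick_input(const char *requested, bool from_fifo)
{
	if (requested && *requested)
		return requested;
	return from_fifo ? "-" : "perf.data";
}
```

A caller would typically write pick_input(input_name, stdin_is_fifo()), so that piping 'perf stat record -o -' into the report side works without an explicit "-i -".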

Reported-by: Kan Liang <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-stat.txt | 12 +++++++
tools/perf/builtin-stat.c | 61 +++++++++++++++++++++++++++++++---
2 files changed, 69 insertions(+), 4 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 70eee1c..95f4928 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -11,6 +11,7 @@ SYNOPSIS
'perf stat' [-e <EVENT> | --event=EVENT] [-a] <command>
'perf stat' [-e <EVENT> | --event=EVENT] [-a] -- <command> [<options>]
'perf stat' [-e <EVENT> | --event=EVENT] [-a] record [-o file] -- <command> [<options>]
+'perf stat' report [-i file]

DESCRIPTION
-----------
@@ -26,6 +27,9 @@ OPTIONS
record::
See STAT RECORD.

+report::
+ See STAT REPORT.
+
-e::
--event=::
Select the PMU event. Selection can be:
@@ -170,6 +174,14 @@ Stores stat data into perf data file.
--output file::
Output file name.

+STAT REPORT
+-----------
+Reads and reports stat data from perf data file.
+
+-i file::
+--input file::
+Input file name.
+

EXAMPLES
--------
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 575e253..abba49b 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -60,6 +60,8 @@
#include "util/thread_map.h"
#include "util/counts.h"
#include "util/session.h"
+#include "util/tool.h"
+#include "asm/bug.h"

#include <stdlib.h>
#include <sys/prctl.h>
@@ -132,6 +134,7 @@ struct perf_stat {
struct perf_data_file file;
struct perf_session *session;
u64 bytes_written;
+ struct perf_tool tool;
};

static struct perf_stat perf_stat;
@@ -1041,8 +1044,8 @@ static void print_header(int argc, const char **argv)
else if (target.cpu_list)
fprintf(output, "\'CPU(s) %s", target.cpu_list);
else if (!target__has_task(&target)) {
- fprintf(output, "\'%s", argv[0]);
- for (i = 1; i < argc; i++)
+ fprintf(output, "\'%s", argv ? argv[0] : "pipe");
+ for (i = 1; argv && (i < argc); i++)
fprintf(output, " %s", argv[i]);
} else if (target.pid)
fprintf(output, "process id \'%s", target.pid);
@@ -1527,6 +1530,55 @@ static int __cmd_record(int argc, const char **argv)
return argc;
}

+static const char * const report_usage[] = {
+ "perf stat report [<options>]",
+ NULL,
+};
+
+static struct perf_stat perf_stat = {
+ .tool = {
+ .attr = perf_event__process_attr,
+ },
+};
+
+static int __cmd_report(int argc, const char **argv)
+{
+ struct perf_session *session;
+ const struct option options[] = {
+ OPT_STRING('i', "input", &input_name, "file", "input file name"),
+ OPT_END()
+ };
+ struct stat st;
+ int ret;
+
+ argc = parse_options(argc, argv, options, report_usage, 0);
+
+ if (!input_name || !strlen(input_name)) {
+ if (!fstat(STDIN_FILENO, &st) && S_ISFIFO(st.st_mode))
+ input_name = "-";
+ else
+ input_name = "perf.data";
+ }
+
+ perf_stat.file.path = input_name;
+ perf_stat.file.mode = PERF_DATA_MODE_READ;
+
+ session = perf_session__new(&perf_stat.file, false, &perf_stat.tool);
+ if (session == NULL)
+ return -1;
+
+ perf_stat.session = session;
+ stat_config.output = stderr;
+ evsel_list = session->evlist;
+
+ ret = perf_session__process_events(session);
+ if (ret)
+ return ret;
+
+ perf_session__delete(session);
+ return 0;
+}
+
int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
{
const char * const stat_usage[] = {
@@ -1537,7 +1589,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
const char *mode;
FILE *output = stderr;
unsigned int interval;
- const char * const stat_subcommands[] = { "record" };
+ const char * const stat_subcommands[] = { "record", "report" };

setlocale(LC_ALL, "");

@@ -1553,7 +1605,8 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
argc = __cmd_record(argc, argv);
if (argc < 0)
return -1;
- }
+ } else if (argc && !strncmp(argv[0], "rep", 3))
+ return __cmd_report(argc, argv);

interval = stat_config.interval;

Subject: [tip:perf/core] perf stat report: Process cpu/threads maps

Commit-ID: 1975d36e14b3314d1d0c7a428946ec0c27fd6e95
Gitweb: http://git.kernel.org/tip/1975d36e14b3314d1d0c7a428946ec0c27fd6e95
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 5 Nov 2015 15:40:56 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 17 Dec 2015 16:21:03 -0300

perf stat report: Process cpu/threads maps

Adding processing of cpu/threads map events and configuring the
session's evlist with these maps.
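
The two map events can arrive in either order, so the maps are only applied once both are known. A minimal sketch of that "commit when complete" pattern follows; `struct maps_state` and `maps_try_commit` are hypothetical names used for illustration, not identifiers from the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct perf_stat; only the fields the sketch needs. */
struct maps_state {
	const void *cpus;	/* set when the cpu map event arrives */
	const void *threads;	/* set when the thread map event arrives */
	bool allocated;		/* true once both maps were applied */
};

/*
 * Apply the maps only once both are known. Returns 0 while still waiting
 * or after a successful commit, and -EINVAL on a second commit attempt.
 */
static inline int maps_try_commit(struct maps_state *st)
{
	if (!st->cpus || !st->threads)
		return 0;		/* still waiting for the other map */

	if (st->allocated)
		return -EINVAL;		/* maps were already applied */

	/* The real code would hand both maps to the evlist here. */
	st->allocated = true;
	return 0;
}
```

Calling this from both event handlers keeps the ordering logic in one place, which is roughly what set_maps() does in the diff below.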

Reported-by: Kan Liang <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ s/stat/st/g, s/time/tm/g parameters to fix 'already defined' build error with older distros (e.g. RHEL6.7) ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 66 +++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 64 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index abba49b..0a1cfdd 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -135,6 +135,9 @@ struct perf_stat {
struct perf_session *session;
u64 bytes_written;
struct perf_tool tool;
+ bool maps_allocated;
+ struct cpu_map *cpus;
+ struct thread_map *threads;
};

static struct perf_stat perf_stat;
@@ -234,9 +237,9 @@ static int process_synthesized_event(struct perf_tool *tool __maybe_unused,
return 0;
}

-static int write_stat_round_event(u64 time, u64 type)
+static int write_stat_round_event(u64 tm, u64 type)
{
- return perf_event__synthesize_stat_round(NULL, time, type,
+ return perf_event__synthesize_stat_round(NULL, tm, type,
process_synthesized_event,
NULL);
}
@@ -1530,6 +1533,63 @@ static int __cmd_record(int argc, const char **argv)
return argc;
}

+static int set_maps(struct perf_stat *st)
+{
+ if (!st->cpus || !st->threads)
+ return 0;
+
+ if (WARN_ONCE(st->maps_allocated, "stats double allocation\n"))
+ return -EINVAL;
+
+ perf_evlist__set_maps(evsel_list, st->cpus, st->threads);
+
+ if (perf_evlist__alloc_stats(evsel_list, true))
+ return -ENOMEM;
+
+ st->maps_allocated = true;
+ return 0;
+}
+
+static
+int process_thread_map_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_session *session __maybe_unused)
+{
+ struct perf_stat *st = container_of(tool, struct perf_stat, tool);
+
+ if (st->threads) {
+ pr_warning("Extra thread map event, ignoring.\n");
+ return 0;
+ }
+
+ st->threads = thread_map__new_event(&event->thread_map);
+ if (!st->threads)
+ return -ENOMEM;
+
+ return set_maps(st);
+}
+
+static
+int process_cpu_map_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_session *session __maybe_unused)
+{
+ struct perf_stat *st = container_of(tool, struct perf_stat, tool);
+ struct cpu_map *cpus;
+
+ if (st->cpus) {
+ pr_warning("Extra cpu map event, ignoring.\n");
+ return 0;
+ }
+
+ cpus = cpu_map__new_data(&event->cpu_map.data);
+ if (!cpus)
+ return -ENOMEM;
+
+ st->cpus = cpus;
+ return set_maps(st);
+}
+
static const char * const report_usage[] = {
"perf stat report [<options>]",
NULL,
@@ -1538,6 +1598,8 @@ static const char * const report_usage[] = {
static struct perf_stat perf_stat = {
.tool = {
.attr = perf_event__process_attr,
+ .thread_map = process_thread_map_event,
+ .cpu_map = process_cpu_map_event,
},
};

Subject: [tip:perf/core] perf stat report: Process stat config event

Commit-ID: 62ba18ba938a8740ab18e02342b282d7378986f7
Gitweb: http://git.kernel.org/tip/62ba18ba938a8740ab18e02342b282d7378986f7
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 5 Nov 2015 15:40:57 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 17 Dec 2015 16:27:00 -0300

perf stat report: Process stat config event

Adding processing of the stat config event, which initializes the
stat_config object.

Reported-by: Kan Liang <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Renamed 'stat' parameter to 'st' to fix 'already defined' build error with older distros (e.g. RHEL6.7) ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 0a1cfdd..1e5db50 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1533,6 +1533,15 @@ static int __cmd_record(int argc, const char **argv)
return argc;
}

+static
+int process_stat_config_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_session *session __maybe_unused)
+{
+ perf_event__read_stat_config(&stat_config, &event->stat_config);
+ return 0;
+}
+
static int set_maps(struct perf_stat *st)
{
if (!st->cpus || !st->threads)
@@ -1600,6 +1609,7 @@ static struct perf_stat perf_stat = {
.attr = perf_event__process_attr,
.thread_map = process_thread_map_event,
.cpu_map = process_cpu_map_event,
+ .stat_config = process_stat_config_event,
},
};

Subject: [tip:perf/core] perf stat report: Add support to initialize aggr_map from file

Commit-ID: 68d702f7a1202dd39d9fa01b7bea92ba9e5785d9
Gitweb: http://git.kernel.org/tip/68d702f7a1202dd39d9fa01b7bea92ba9e5785d9
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 5 Nov 2015 15:40:58 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 17 Dec 2015 16:28:43 -0300

perf stat report: Add support to initialize aggr_map from file

Using perf.data's perf_env data to initialize aggregate config.
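
Because core ids are relative to their socket, the per-core aggregation in this patch combines the socket id and the core id into one global id. A small sketch of that encoding and its inverse; the helper names are hypothetical, only the bit layout (socket in the upper 16 bits, core in the lower 16) is taken from the patch.

```c
/* Pack a socket id and a socket-relative core id into one global id. */
static inline int encode_core_id(int socket_id, int core_id)
{
	return (socket_id << 16) | (core_id & 0xffff);
}

/* Recover the socket id from a packed id. */
static inline int decode_socket(int global_id)
{
	return global_id >> 16;
}

/* Recover the socket-relative core id from a packed id. */
static inline int decode_core(int global_id)
{
	return global_id & 0xffff;
}
```

With this layout core 3 on socket 0 and core 3 on socket 1 get distinct ids (0x00003 and 0x10003), so they land in separate aggregation buckets.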

Reported-by: Kan Liang <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ s/stat/st/g, s/socket/socket_id/g to fix 'already defined' build error with older distros (e.g. RHEL6.7) ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 103 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 103 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 1e5db50..c780525 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1326,6 +1326,101 @@ static void perf_stat__exit_aggr_mode(void)
cpus_aggr_map = NULL;
}

+static inline int perf_env__get_cpu(struct perf_env *env, struct cpu_map *map, int idx)
+{
+ int cpu;
+
+ if (idx >= map->nr)
+ return -1;
+
+ cpu = map->map[idx];
+
+ if (cpu >= env->nr_cpus_online)
+ return -1;
+
+ return cpu;
+}
+
+static int perf_env__get_socket(struct cpu_map *map, int idx, void *data)
+{
+ struct perf_env *env = data;
+ int cpu = perf_env__get_cpu(env, map, idx);
+
+ return cpu == -1 ? -1 : env->cpu[cpu].socket_id;
+}
+
+static int perf_env__get_core(struct cpu_map *map, int idx, void *data)
+{
+ struct perf_env *env = data;
+ int core = -1, cpu = perf_env__get_cpu(env, map, idx);
+
+ if (cpu != -1) {
+ int socket_id = env->cpu[cpu].socket_id;
+
+ /*
+ * Encode socket in upper 16 bits
+ * core_id is relative to socket, and
+ * we need a global id. So we combine
+ * socket + core id.
+ */
+ core = (socket_id << 16) | (env->cpu[cpu].core_id & 0xffff);
+ }
+
+ return core;
+}
+
+static int perf_env__build_socket_map(struct perf_env *env, struct cpu_map *cpus,
+ struct cpu_map **sockp)
+{
+ return cpu_map__build_map(cpus, sockp, perf_env__get_socket, env);
+}
+
+static int perf_env__build_core_map(struct perf_env *env, struct cpu_map *cpus,
+ struct cpu_map **corep)
+{
+ return cpu_map__build_map(cpus, corep, perf_env__get_core, env);
+}
+
+static int perf_stat__get_socket_file(struct cpu_map *map, int idx)
+{
+ return perf_env__get_socket(map, idx, &perf_stat.session->header.env);
+}
+
+static int perf_stat__get_core_file(struct cpu_map *map, int idx)
+{
+ return perf_env__get_core(map, idx, &perf_stat.session->header.env);
+}
+
+static int perf_stat_init_aggr_mode_file(struct perf_stat *st)
+{
+ struct perf_env *env = &st->session->header.env;
+
+ switch (stat_config.aggr_mode) {
+ case AGGR_SOCKET:
+ if (perf_env__build_socket_map(env, evsel_list->cpus, &aggr_map)) {
+ perror("cannot build socket map");
+ return -1;
+ }
+ aggr_get_id = perf_stat__get_socket_file;
+ break;
+ case AGGR_CORE:
+ if (perf_env__build_core_map(env, evsel_list->cpus, &aggr_map)) {
+ perror("cannot build core map");
+ return -1;
+ }
+ aggr_get_id = perf_stat__get_core_file;
+ break;
+ case AGGR_NONE:
+ case AGGR_GLOBAL:
+ case AGGR_THREAD:
+ case AGGR_UNSET:
+ default:
+ break;
+ }
+
+ return 0;
+}
+
/*
* Add default attributes, if there were no attributes specified or
* if -d/--detailed, -d -d or -d -d -d is used:
@@ -1538,7 +1633,15 @@ int process_stat_config_event(struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_session *session __maybe_unused)
{
+ struct perf_stat *st = container_of(tool, struct perf_stat, tool);
+
perf_event__read_stat_config(&stat_config, &event->stat_config);
+
+ if (perf_stat.file.is_pipe)
+ perf_stat_init_aggr_mode();
+ else
+ perf_stat_init_aggr_mode_file(st);
+
return 0;
}

Subject: [tip:perf/core] perf stat report: Move csv_sep initialization before report command

Commit-ID: 6edb78a2178fd85d07b1a7fbb3629be56b860224
Gitweb: http://git.kernel.org/tip/6edb78a2178fd85d07b1a7fbb3629be56b860224
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 5 Nov 2015 15:41:01 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 17 Dec 2015 16:29:06 -0300

perf stat report: Move csv_sep initialization before report command

Move the csv_sep initialization up, so that csv_sep is properly
initialized before the report command leg runs.
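
The block being moved normalizes the user-supplied separator. A minimal standalone sketch of that logic; `struct sep_config` and `normalize_separator` are hypothetical names, and the single-space value of DEFAULT_SEPARATOR is an assumption here, not something shown in this patch.

```c
#include <stdbool.h>
#include <string.h>

#define DEFAULT_SEPARATOR " "	/* assumed value, for illustration only */

/* Hypothetical container for the two globals the real code sets. */
struct sep_config {
	const char *sep;
	bool csv_output;
};

/*
 * A user-supplied separator turns CSV output on, and the literal
 * two-character string "\t" is translated to a real tab. Without a
 * user-supplied separator the default is used and CSV stays off.
 */
static inline struct sep_config normalize_separator(const char *user_sep)
{
	struct sep_config cfg = { .sep = DEFAULT_SEPARATOR, .csv_output = false };

	if (user_sep) {
		cfg.csv_output = true;
		cfg.sep = strcmp(user_sep, "\\t") == 0 ? "\t" : user_sep;
	}
	return cfg;
}
```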

Reported-by: Kan Liang <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index c780525..f9d4e09 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1776,6 +1776,13 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
(const char **) stat_usage,
PARSE_OPT_STOP_AT_NON_OPTION);

+ if (csv_sep) {
+ csv_output = true;
+ if (!strcmp(csv_sep, "\\t"))
+ csv_sep = "\t";
+ } else
+ csv_sep = DEFAULT_SEPARATOR;
+
if (argc && !strncmp(argv[0], "rec", 3)) {
argc = __cmd_record(argc, argv);
if (argc < 0)
@@ -1826,13 +1833,6 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)

stat_config.output = output;

- if (csv_sep) {
- csv_output = true;
- if (!strcmp(csv_sep, "\\t"))
- csv_sep = "\t";
- } else
- csv_sep = DEFAULT_SEPARATOR;
-
/*
* let the spreadsheet do the pretty-printing
*/

Subject: [tip:perf/core] perf stat report: Process stat and stat round events

Commit-ID: a56f9390aa9d9b1c782c3dbd5ca2c4245eb265fc
Gitweb: http://git.kernel.org/tip/a56f9390aa9d9b1c782c3dbd5ca2c4245eb265fc
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 5 Nov 2015 15:40:59 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 17 Dec 2015 16:29:19 -0300

perf stat report: Process stat and stat round events

Adding processing of stat and stat round events.

The stat data come in stat events; the generic function
perf_event__process_stat_event stores them under the perf_evsel object.

The stat-round events come at each interval, or as the last event in
non-interval mode. The function process_stat_round_event processes the
stored data for each perf_evsel object and prints it out.
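
In interval mode the round's timestamp, a nanosecond count, is split into seconds and nanoseconds before printing. A minimal sketch of that conversion; `ns_to_timespec` is a hypothetical helper name, and NSECS_PER_SEC is defined locally so the snippet stands alone.

```c
#include <stdint.h>
#include <time.h>

#define NSECS_PER_SEC 1000000000ULL

/* Split a nanosecond count into whole seconds and leftover nanoseconds. */
static inline struct timespec ns_to_timespec(uint64_t ns)
{
	struct timespec ts;

	ts.tv_sec = (time_t)(ns / NSECS_PER_SEC);
	ts.tv_nsec = (long)(ns % NSECS_PER_SEC);
	return ts;
}
```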

Committer note:

After this patch:

$ perf stat record usleep 1

Performance counter stats for 'usleep 1':

0.498381 task-clock (msec) # 0.571 CPUs utilized
2 context-switches # 0.004 M/sec
0 cpu-migrations # 0.000 K/sec
149 page-faults # 0.299 M/sec
1,271,635 cycles # 2.552 GHz
928,712 stalled-cycles-frontend # 73.03% frontend cycles idle
663,286 stalled-cycles-backend # 52.16% backend cycles idle
792,614 instructions # 0.62 insns per cycle
# 1.17 stalled cycles per insn
136,850 branches # 274.589 M/sec
<not counted> branch-misses (0.00%)

0.000873419 seconds time elapsed

$
$ perf stat report

Performance counter stats for '/home/acme/bin/perf stat record usleep 1':

0.498381 task-clock (msec) # 0.571 CPUs utilized
2 context-switches # 0.004 M/sec
0 cpu-migrations # 0.000 K/sec
149 page-faults # 0.299 M/sec
1,271,635 cycles # 2.552 GHz
928,712 stalled-cycles-frontend # 73.03% frontend cycles idle
663,286 stalled-cycles-backend # 52.16% backend cycles idle
792,614 instructions # 0.62 insns per cycle
# 1.17 stalled cycles per insn
136,850 branches # 274.589 M/sec
<not counted> branch-misses (0.00%)

0.000873419 seconds time elapsed

$

Reported-by: Kan Liang <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f9d4e09..d27d1b9 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1628,6 +1628,32 @@ static int __cmd_record(int argc, const char **argv)
return argc;
}

+static int process_stat_round_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_session *session)
+{
+ struct stat_round_event *round = &event->stat_round;
+ struct perf_evsel *counter;
+ struct timespec tsh, *ts = NULL;
+ const char **argv = session->header.env.cmdline_argv;
+ int argc = session->header.env.nr_cmdline;
+
+ evlist__for_each(evsel_list, counter)
+ perf_stat_process_counter(&stat_config, counter);
+
+ if (round->type == PERF_STAT_ROUND_TYPE__FINAL)
+ update_stats(&walltime_nsecs_stats, round->time);
+
+ if (stat_config.interval && round->time) {
+ tsh.tv_sec = round->time / NSECS_PER_SEC;
+ tsh.tv_nsec = round->time % NSECS_PER_SEC;
+ ts = &tsh;
+ }
+
+ print_counters(ts, argc, argv);
+ return 0;
+}
+
static
int process_stat_config_event(struct perf_tool *tool __maybe_unused,
union perf_event *event,
@@ -1713,6 +1739,8 @@ static struct perf_stat perf_stat = {
.thread_map = process_thread_map_event,
.cpu_map = process_cpu_map_event,
.stat_config = process_stat_config_event,
+ .stat = perf_event__process_stat_event,
+ .stat_round = process_stat_round_event,
},
};

Subject: [tip:perf/core] perf stat report: Process event update events

Commit-ID: fa6ea7817db3839b58d46649b7834320257e7702
Gitweb: http://git.kernel.org/tip/fa6ea7817db3839b58d46649b7834320257e7702
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 5 Nov 2015 15:41:00 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 17 Dec 2015 16:29:29 -0300

perf stat report: Process event update events

Adding processing of event update events, so perf stat report can store
additional info for events - unit,scale,name.

Committer note:

Before:

# perf stat record -e power/energy-cores/ -a
^C
Performance counter stats for 'system wide':

77.41 Joules power/energy-cores/

1.597176695 seconds time elapsed

# perf stat report

Performance counter stats for '/home/acme/bin/perf stat record -e power/energy-cores/ -a':

332,488,114,176 power/energy-cores/

1.597176695 seconds time elapsed

#

After, using the same perf.data file generated in the "Before" case
above:

# perf stat report

Performance counter stats for '/home/acme/bin/perf stat record -e power/energy-cores/ -a':

77.41 Joules power/energy-cores/

1.597176695 seconds time elapsed

#
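
The difference between the two outputs is consistent with the raw counter value being multiplied by the event's scale once the update event is processed. A quick check of the arithmetic, assuming a scale of 2^-32 Joules per count for power/energy-cores; that scale is an assumption for illustration, it does not appear in this thread. `scaled_count` is a hypothetical helper name.

```c
#include <stdint.h>

/* Apply an event's scale factor to a raw counter value. */
static inline double scaled_count(uint64_t raw, double scale)
{
	return (double)raw * scale;
}
```

With raw = 332488114176 and scale = 1.0 / 4294967296.0 this gives roughly 77.41, matching the "After" output.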

Reported-by: Kan Liang <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index d27d1b9..3ccf5a9 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1736,6 +1736,7 @@ static const char * const report_usage[] = {
static struct perf_stat perf_stat = {
.tool = {
.attr = perf_event__process_attr,
+ .event_update = perf_event__process_event_update,
.thread_map = process_thread_map_event,
.cpu_map = process_cpu_map_event,
.stat_config = process_stat_config_event,

Subject: [tip:perf/core] perf stat report: Allow to override aggr_mode

Commit-ID: 89af4e05c21d68f22e07fe66940ea675615a49ed
Gitweb: http://git.kernel.org/tip/89af4e05c21d68f22e07fe66940ea675615a49ed
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 5 Nov 2015 15:41:02 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 17 Dec 2015 16:30:30 -0300

perf stat report: Allow to override aggr_mode

Allow overriding the recorded aggr_mode. It's now possible to use perf
stat report like:

$ perf stat report -A
$ perf stat report --per-core
$ perf stat report --per-socket

to choose the aggregation mode for the report, regardless of what was
used during the stat record command.
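
The selection rule can be summarized in a few lines. This is a sketch, not the patch's code: `effective_aggr_mode` is a hypothetical helper, and the enum is redeclared locally with an assumed ordering so the snippet stands alone.

```c
#include <stdbool.h>

/* Local copy for illustration; the ordering of the values is an assumption. */
enum aggr_mode {
	AGGR_NONE,
	AGGR_GLOBAL,
	AGGR_SOCKET,
	AGGR_CORE,
	AGGR_THREAD,
	AGGR_UNSET,
};

/*
 * Pick the aggregation mode used by the report:
 *  - task data (no cpus recorded) keeps the recorded mode,
 *  - otherwise a mode requested on the report command line wins,
 *  - otherwise the recorded mode is used.
 */
static inline enum aggr_mode effective_aggr_mode(enum aggr_mode recorded,
						 enum aggr_mode requested,
						 bool has_cpus)
{
	if (!has_cpus)
		return recorded;

	return requested != AGGR_UNSET ? requested : recorded;
}
```

In the patch itself the task-data case also prints a warning when an override was requested; the sketch leaves that out.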

Reported-by: Kan Liang <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Renamed 'stat' parameter to 'st' to fix 'already defined' build error with older distros (e.g. RHEL6.7) ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-stat.txt | 10 ++++++++++
tools/perf/builtin-stat.c | 17 +++++++++++++++++
2 files changed, 27 insertions(+)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 95f4928..52ef7a9 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -182,6 +182,16 @@ Reads and reports stat data from perf data file.
--input file::
Input file name.

+--per-socket::
+Aggregate counts per processor socket for system-wide mode measurements.
+
+--per-core::
+Aggregate counts per physical processor for system-wide mode measurements.
+
+-A::
+--no-aggr::
+Do not aggregate counts across all monitored CPUs.
+

EXAMPLES
--------
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 3ccf5a9..9805e03 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -138,6 +138,7 @@ struct perf_stat {
bool maps_allocated;
struct cpu_map *cpus;
struct thread_map *threads;
+ enum aggr_mode aggr_mode;
};

static struct perf_stat perf_stat;
@@ -1663,6 +1664,15 @@ int process_stat_config_event(struct perf_tool *tool __maybe_unused,

perf_event__read_stat_config(&stat_config, &event->stat_config);

+ if (cpu_map__empty(st->cpus)) {
+ if (st->aggr_mode != AGGR_UNSET)
+ pr_warning("warning: processing task data, aggregation mode not set\n");
+ return 0;
+ }
+
+ if (st->aggr_mode != AGGR_UNSET)
+ stat_config.aggr_mode = st->aggr_mode;
+
if (perf_stat.file.is_pipe)
perf_stat_init_aggr_mode();
else
@@ -1743,6 +1753,7 @@ static struct perf_stat perf_stat = {
.stat = perf_event__process_stat_event,
.stat_round = process_stat_round_event,
},
+ .aggr_mode = AGGR_UNSET,
};

static int __cmd_report(int argc, const char **argv)
@@ -1750,6 +1761,12 @@ static int __cmd_report(int argc, const char **argv)
struct perf_session *session;
const struct option options[] = {
OPT_STRING('i', "input", &input_name, "file", "input file name"),
+ OPT_SET_UINT(0, "per-socket", &perf_stat.aggr_mode,
+ "aggregate counts per processor socket", AGGR_SOCKET),
+ OPT_SET_UINT(0, "per-core", &perf_stat.aggr_mode,
+ "aggregate counts per physical processor core", AGGR_CORE),
+ OPT_SET_UINT('A', "no-aggr", &perf_stat.aggr_mode,
+ "disable CPU count aggregation", AGGR_NONE),
OPT_END()
};
struct stat st;