From: Jiri Olsa
To: Arnaldo Carvalho de Melo
Cc: Andi Kleen, Ulrich Drepper, Will Deacon, Stephane Eranian, Don Zickus,
    lkml, David Ahern, Ingo Molnar, Namhyung Kim, Peter Zijlstra, "Liang, Kan"
Subject: [PATCHv6 00/25] perf stat: Add scripting support
Date: Thu, 5 Nov 2015 15:40:44 +0100
Message-Id: <1446734469-11352-1-git-send-email-jolsa@kernel.org>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

hi,
sending another version of stat scripting.

v6 changes:
  - several patches from v4 already taken
  - perf stat record can now place 'record' keyword
    anywhere within stat options
  - placed STAT feature checking earlier into record
    patches so commands processing perf.data recognize
    stat data and skip sample_type checking
  - rebased on Arnaldo's perf/stat
  - added Tested-by: Kan Liang

v5 changes:
  - several patches from v4 already taken
  - using u16 for cpu number in cpu_map_event
  - renamed PERF_RECORD_HEADER_ATTR_UPDATE to PERF_RECORD_EVENT_UPDATE
  - moved low-hanging fruit patches to the start of the patchset
  - patchset tested by Kan Liang, thanks!

v4 changes:
  - added attr update event for event's cpumask
  - forbid aggregation on task workloads
  - some minor reorders and changelog fixes

v3 changes:
  - added attr update event to handle unit, scale and name for an event;
    this fixed record/report of uncore_imc_1/cas_count_read/
  - perf report -D now displays stat related events
  - some minor and changelog fixes

v2 changes:
  - rebased to latest Arnaldo's perf/core
  - patches 1 to 11 already merged in
  - added --per-core/--per-socket/-A options for perf stat report
    command to allow custom aggregation in stat report, please
    check new examples below
  - a couple of changelog changes

The initial attempt defined its own formula language and allowed
triggering a user's script at the end of the stat command:
  http://marc.info/?l=linux-kernel&m=136742146322273&w=2

This patchset abandons the idea of a new formula language
and rather adds support to:
  - store stat data into perf.data file
  - add python support to process stat events

Basically it allows storing stat data in perf.data and
post-processing it with python scripts, similar to what
we do for sampling data.

The stat data are stored in new stat, stat-round, stat-config user events.
  stat        - stored for each read syscall of the counter
  stat round  - stored for each interval or end of the command invocation
  stat config - stores all the config information needed to process data
                so the report tool can restore the same output as record

The python script can now define 'stat__<eventname>_<modifier>' functions
to get stat events data and 'stat__interval' to get stat-round data.

See the CPI script example in scripts/python/stat-cpi.py.

Also available in:
  git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  perf/stat_script

thanks,
jirka

Examples:

- To record data for command stat workload:

  $ perf stat record kill
  ...

   Performance counter stats for 'kill':

            0.372007      task-clock (msec)         #    0.613 CPUs utilized
                   3      context-switches          #    0.008 M/sec
                   0      cpu-migrations            #    0.000 K/sec
                  62      page-faults               #    0.167 M/sec
           1,129,973      cycles                    #    3.038 GHz
                          stalled-cycles-frontend
                          stalled-cycles-backend
             813,313      instructions              #    0.72  insns per cycle
             166,161      branches                  #  446.661 M/sec
               8,747      branch-misses             #    5.26% of all branches

         0.000607287 seconds time elapsed

- To report perf stat data:

  $ perf stat report

   Performance counter stats for '/home/jolsa/bin/perf stat record kill':

            0.372007      task-clock (msec)         #      inf CPUs utilized
                   3      context-switches          #    0.008 M/sec
                   0      cpu-migrations            #    0.000 K/sec
                  62      page-faults               #    0.167 M/sec
           1,129,973      cycles                    #    3.038 GHz
                          stalled-cycles-frontend
                          stalled-cycles-backend
             813,313      instructions              #    0.72  insns per cycle
             166,161      branches                  #  446.661 M/sec
               8,747      branch-misses             #    5.26% of all branches

         0.000000000 seconds time elapsed

- To store system-wide period stat data:

  $ perf stat -e cycles:u,instructions:u -a -I 1000 record
  #           time             counts unit events
       1.000265471        462,311,482      cycles:u                   (100.00%)
       1.000265471        590,037,440      instructions:u
       2.000483453        722,532,336      cycles:u                   (100.00%)
       2.000483453        848,678,197      instructions:u
       3.000759876         75,990,880      cycles:u                   (100.00%)
       3.000759876         86,187,813      instructions:u
  ^C   3.213960893         85,329,533      cycles:u                   (100.00%)
       3.213960893        135,954,296      instructions:u

- To report perf stat data:

  $ perf stat report
  #           time             counts unit events
       1.000265471        462,311,482      cycles:u                   (100.00%)
       1.000265471        590,037,440      instructions:u
       2.000483453        722,532,336      cycles:u                   (100.00%)
       2.000483453        848,678,197      instructions:u
       3.000759876         75,990,880      cycles:u                   (100.00%)
       3.000759876         86,187,813      instructions:u
       3.213960893         85,329,533      cycles:u                   (100.00%)
       3.213960893        135,954,296      instructions:u

- To run stat-cpi.py script over perf.data:

  $ perf script -s scripts/python/stat-cpi.py
         1.000265: cpu -1, thread -1 -> cpi 0.783529 (462311482/590037440)
         2.000483: cpu -1, thread -1 -> cpi 0.851362 (722532336/848678197)
         3.000760: cpu -1, thread -1 -> cpi 0.881689 (75990880/86187813)
         3.213961: cpu -1, thread -1 -> cpi 0.627634 (85329533/135954296)

- To pipe data from stat to stat-cpi script:

  $ perf stat -e cycles:u,instructions:u -A -C 0 -I 1000 record | perf script -s scripts/python/stat-cpi.py
         1.000192: cpu 0, thread -1 -> cpi 0.739535 (23921908/32347236)
         2.000376: cpu 0, thread -1 -> cpi 1.663482 (2519340/1514498)
         3.000621: cpu 0, thread -1 -> cpi 1.396308 (16162767/11575362)
         4.000700: cpu 0, thread -1 -> cpi 1.092246 (20077258/18381624)
         5.000867: cpu 0, thread -1 -> cpi 0.473816 (45157586/95306156)
         6.001034: cpu 0, thread -1 -> cpi 0.532792 (43701668/82023818)
         7.001195: cpu 0, thread -1 -> cpi 1.122059 (29890042/26638561)

- Raw script stat data output:

  $ perf stat -e cycles:u,instructions:u -A -C 0 -I 1000 record | perf --no-pager script
  CPU   THREAD             VAL             ENA             RUN            TIME EVENT
    0       -1        12302059      1000811347      1000810712      1000198821 cycles:u
    0       -1         2565362      1000823218      1000823218      1000198821 instructions:u
    0       -1        14453353      1000812704      1000812704      2000382283 cycles:u
    0       -1         4600932      1000799342      1000799342      2000382283 instructions:u
    0       -1        15245106      1000774425      1000774425      3000538255 cycles:u
    0       -1         2624324      1000769310      1000769310      3000538255 instructions:u

- To display different aggregation in report:

  $ perf stat -e cycles -a -I 1000 record sleep 3
  #           time             counts unit events
       1.000223609        703,427,617      cycles
       2.000443651        609,975,307      cycles
       3.000569616        668,479,597      cycles
       3.000735323          1,155,816      cycles

  $ perf stat report
  #           time             counts unit events
       1.000223609        703,427,617      cycles
       2.000443651        609,975,307      cycles
       3.000569616        668,479,597      cycles
       3.000735323          1,155,816      cycles

  $ perf stat report --per-core
  #           time core         cpus             counts unit events
       1.000223609 S0-C0           2        327,612,412      cycles
       1.000223609 S0-C1           2        375,815,205      cycles
       2.000443651 S0-C0           2        287,462,177      cycles
       2.000443651 S0-C1           2        322,513,130      cycles
       3.000569616 S0-C0           2        271,571,908      cycles
       3.000569616 S0-C1           2        396,907,689      cycles
       3.000735323 S0-C0           2            694,977      cycles
       3.000735323 S0-C1           2            460,839      cycles

  $ perf stat report --per-socket
  #           time socket cpus             counts unit events
       1.000223609 S0        4        703,427,617      cycles
       2.000443651 S0        4        609,975,307      cycles
       3.000569616 S0        4        668,479,597      cycles
       3.000735323 S0        4          1,155,816      cycles

  $ perf stat report -A
  #           time CPU                counts unit events
       1.000223609 CPU0           205,431,505      cycles
       1.000223609 CPU1           122,180,907      cycles
       1.000223609 CPU2           176,649,682      cycles
       1.000223609 CPU3           199,165,523      cycles
       2.000443651 CPU0           148,447,922      cycles
       2.000443651 CPU1           139,014,255      cycles
       2.000443651 CPU2           204,436,559      cycles
       2.000443651 CPU3           118,076,571      cycles
       3.000569616 CPU0           149,788,954      cycles
       3.000569616 CPU1           121,782,954      cycles
       3.000569616 CPU2           247,277,700      cycles
       3.000569616 CPU3           149,629,989      cycles
       3.000735323 CPU0               269,675      cycles
       3.000735323 CPU1               425,302      cycles
       3.000735323 CPU2               364,169      cycles
       3.000735323 CPU3                96,670      cycles

Cc: Andi Kleen
Cc: Ulrich Drepper
Cc: Will Deacon
Cc: Stephane Eranian
Cc: Don Zickus
Tested-by: Kan Liang
---
Jiri Olsa (25):
  perf stat: Make stat options global
  perf stat record: Add record command
  perf stat record: Initialize record features
  perf stat record: Synthesize stat record data
  perf stat record: Store events IDs in perf data file
  perf stat record: Add pipe support for record command
  perf stat record: Write stat events on record
  perf stat record: Write stat round events on record
  perf stat record: Do not allow record with multiple runs mode
  perf stat record: Synthesize event update events
  perf stat report: Add report command
  perf stat report: Process cpu/threads maps
  perf stat report: Process stat config event
  perf stat report: Add support to initialize aggr_map from file
  perf stat report: Process stat and stat round events
  perf stat report: Process event update events
  perf stat report: Move csv_sep initialization before report command
  perf stat report: Allow to override aggr_mode
  perf script: Process cpu/threads maps
  perf script: Process stat config event
  perf script: Add process_stat/process_stat_interval scripting interface
  perf script: Add stat default handlers
  perf script: Display stat events by default
  perf script: Add python support for stat events
  perf script: Add stat-cpi.py script

 tools/perf/Documentation/perf-stat.txt                 |  34 ++++
 tools/perf/builtin-script.c                            | 139 +++++++++++++++
 tools/perf/builtin-stat.c                              | 742 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------
 tools/perf/scripts/python/stat-cpi.py                  |  74 ++++++++
 tools/perf/util/evlist.c                               |   6 +-
 tools/perf/util/evlist.h                               |   3 +
 tools/perf/util/scripting-engines/trace-event-python.c | 114 +++++++++++-
 tools/perf/util/session.c                              |   3 +
 tools/perf/util/trace-event.h                          |   4 +
 9 files changed, 1021 insertions(+), 98 deletions(-)
 create mode 100644 tools/perf/scripts/python/stat-cpi.py
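
For readers who want a feel for the python side described in the cover text (per-event 'stat__' callbacks plus 'stat__interval' for each stat-round), here is a rough standalone sketch of how a CPI computation could be wired up. It is not the stat-cpi.py shipped in the patchset. The callback argument order (cpu, thread, time, val, ena, run) and the handler names stat__cycles_u / stat__instructions_u are assumptions, and replay() is a hypothetical driver that stands in for perf script so the sketch runs on its own. The counts are taken from the system-wide example above.

```python
# Rough sketch only: not the stat-cpi.py from the patchset.
# Assumed callback signature: (cpu, thread, time, val, ena, run).
from collections import defaultdict

# counts[(time_ns, cpu, thread)][event] = counter value for that round
counts = defaultdict(dict)
results = []  # (time_ns, cpu, thread, cpi) per processed round


def stat__cycles_u(cpu, thread, time, val, ena, run):
    # Assumed handler name for the 'cycles:u' event.
    counts[(time, cpu, thread)]["cycles"] = val


def stat__instructions_u(cpu, thread, time, val, ena, run):
    # Assumed handler name for the 'instructions:u' event.
    counts[(time, cpu, thread)]["instructions"] = val


def stat__interval(time):
    # Called once per stat-round: compute CPI for every cpu/thread seen.
    for key in sorted(k for k in counts if k[0] == time):
        _, cpu, thread = key
        cyc = counts[key].get("cycles", 0)
        ins = counts[key].get("instructions", 0)
        cpi = cyc / float(ins) if ins else 0.0  # avoid division by zero
        results.append((time, cpu, thread, cpi))
        print("%15f: cpu %d, thread %d -> cpi %f (%d/%d)"
              % (time / 1e9, cpu, thread, cpi, cyc, ins))


def replay(rounds):
    # Hypothetical driver standing in for perf script: feeds
    # (time_ns, cycles, instructions) tuples through the callbacks.
    # ena/run are passed as 0 placeholders; the sketch does not use them.
    counts.clear()
    del results[:]
    for time, cyc, ins in rounds:
        stat__cycles_u(-1, -1, time, cyc, 0, 0)
        stat__instructions_u(-1, -1, time, ins, 0, 0)
        stat__interval(time)
    return list(results)


if __name__ == "__main__":
    # First two intervals of the 'perf stat -a -I 1000 record' example.
    replay([(1000265471, 462311482, 590037440),
            (2000483453, 722532336, 848678197)])
```

Keying the stored counts by (time, cpu, thread) keeps the sketch usable for both aggregated output (cpu -1, thread -1) and per-cpu output such as the -A -C 0 example.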